List of Contributors
J.L. Barbur, Applied Vision Research Center, City University, Northampton Square, London EC1V 0HB, UK
G. Berlucchi, Dipartimento di Scienze Neurologiche e della Visione, Sezione Fisiologia Umana, Università di Verona, Strada Le Grazie 8, I-37134 Verona, Italy
C.M. Butter, Department of Psychology, University of Michigan, Ann Arbor, MI 48109-1109, USA
G. Campana, Dipartimento di Psicologia Generale, Università degli Studi di Padova, Via Venezia 8, Padova, Italy and Department of Experimental Psychology, University of Oxford, South Parks Road, OX1 3UD, UK
G.G. Cole, Department of Psychology, Science Laboratories, South Road, Durham DH1 3LE, UK
A. Cowey, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, UK
M. Da Silva Filho, Department of Physiology, Biological Science Center, Federal University of Pará, 66075-900 Belém, Pará, Brazil
P. Dean, Department of Psychology, University of Sheffield, Western Bank, Sheffield, S10 2TP, UK
R. Edwards, School of Psychology, University of St. Andrews, St. Andrews, KY16 9JU, UK
P. Földiák, School of Psychology, University of St. Andrews, St. Andrews, KY16 9JU, UK
M.A. Goodale, Department of Psychology, University of Western Ontario, London, ON N6A 5C2, Canada
C.G. Gross, Department of Psychology, Green Hall, Princeton University, Princeton, NJ 08544, USA
R.L. Gregory, Department of Experimental Psychology, University of Bristol, 8 Woodland Road, Bristol BS8 1TN, UK
C.A. Heywood, Department of Psychology, Science Laboratories, South Road, Durham DH1 3LE, UK
A. Hurlbert, Henry Wellcome Building for Neuroecology, School of Biology, Framlington Place, Newcastle upon Tyne NE2 4HH, UK
C.-H. Juan, Department of Psychology, Vanderbilt University, 301 Wilson Hall, Nashville, TN 37240, USA
R.W. Kentridge, Department of Psychology, Science Laboratories, South Road, Durham DH1 3LE, UK
C. Keysers, School of Psychology, University of St. Andrews, KY16 9JU, UK
B.E. Kilavik, Department of Experimental Ophthalmology, University of Tübingen, D-72076 Tübingen, Germany
J. Kremers, Department of Experimental Ophthalmology, University of Tübingen, D-72076 Tübingen, Germany
B.B. Lee, SUNY Optometry, New York, NY 10036, USA and Max Planck Institute for Biophysical Chemistry, Department of Neurobiology, D-3400 Göttingen, Germany
C.A. Marzi, Department of Neurological and Visual Sciences, University of Verona, 8 Strada Le Grazie, 37134 Verona, Italy
R.D. McIntosh, Cognitive Neuroscience Research Unit, Wolfson Research Institute, University of Durham, Queen's Campus, University Boulevard, Stockton-on-Tees, TS17 6BH, UK
A.D. Milner, Cognitive Neuroscience Research Unit, Wolfson Research Institute, University of Durham Stockton Campus, Stockton-on-Tees, TS17 6BH, UK
A. Minelli, Department of Neurological and Visual Sciences, University of Verona, 8 Strada Le Grazie, 37134 Verona, Italy
T. Moore, Department of Psychology, Green Hall, Princeton University, Princeton, NJ 08544, USA
D.I. Perrett, School of Psychology, University of St. Andrews, St. Andrews, KY16 9JU, UK
V.H. Perry, CNS Inflammation Group, University of Southampton, SO16 7PX Southampton, UK
L. Pessoa, Laboratory of Brain and Cognition, National Institute of Mental Health, NIH, Building 49, Room 1B80, 49 Convent Drive, Bethesda, MD 20892-4415, USA
J. Porrill, Department of Psychology, University of Sheffield, Western Bank, Sheffield, S10 2TP, UK
B.E. Reese, Neuroscience Research Institute, Department of Psychology, University of California at Santa Barbara, Santa Barbara, CA 93106-5060, USA
H.R. Rodman, Department of Psychology and Yerkes RPRC, Emory University, 532 N. Kilgo Circle, Atlanta, GA 30322, USA
E.T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK
C.A. Saito, Department of Physiology, Biological Science Center, Federal University of Pará, 66075-900 Belém, Pará, Brazil
S. Savazzi, Department of Neurological and Visual Sciences, University of Verona, 8 Strada Le Grazie, 37134 Verona, Italy
L.C.L. Silveira, Department of Physiology, Biological Science Center, Federal University of Pará, 66075-900 Belém, Pará, Brazil
S. Soloviev, Department of Neurology, Harvard Medical School, 75 Francis Street, Boston, MA 02215, USA
M.A. Sommer, Laboratory of Sensorimotor Research, National Eye Institute, N.I.H., Building 49, Room 2A50, MSC 4435, 9000 Rockville Pike, Bethesda, MD 20892-4435, USA
P. Stoerig, Heinrich Heine University, Institute of Experimental Psychology, Universitätstrasse 1, 40225 Düsseldorf, Germany
J.V. Stone, Department of Psychology, University of Sheffield, Western Bank, Sheffield, S10 2TP, UK
L.G. Ungerleider, Laboratory of Brain and Cognition, National Institute of Mental Health, NIH, Building 10, Room 4C104, 10 Center Drive, Bethesda, MD 20892-1366, USA
L.M. Vaina, Brain and Vision Research Laboratory, Boston University, Department of Biomedical Engineering, College of Engineering, 44 Cummington Street, Room 315, Boston, MA 02215, USA
V. Walsh, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
L. Weiskrantz, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3UD, UK
D.A. Westwood, School of Health and Human Performance, Dalhousie University, Halifax, NS B3H 3J5, Canada
F. Wilkinson, Centre for Vision Research, CSB, York University and Toronto Western Research Institute, 4700 Keele Street, Toronto, ON M3J 1P3, Canada
K. Wolf, Henry Wellcome Building for Neuroecology, School of Biology, Framlington Place, Newcastle upon Tyne NE2 4HH, UK
R.H. Wurtz, Laboratory of Sensorimotor Research, National Eye Institute, N.I.H., Building 49, Room 2A50, MSC 4435, 9000 Rockville Pike, Bethesda, MD 20892-4435, USA
D. Xiao, School of Psychology, University of St. Andrews, St. Andrews, KY16 9JU, UK
E.S. Yamada, Department of Physiology, Biological Science Center, Federal University of Pará, 66075-900 Belém, Brazil
A. Zeman, Department of Clinical Neurosciences, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
Foreword
The rewards of teaching come delayed, in successes of gifted students, and are immediate in the questions they raise and their challenges to the teacher's knowledge and beliefs. The fresh eyes of students make us look again.

I had the privilege of supervising (as we called it) Alan Cowey at Cambridge in the early 1950s. He went on to do his Ph.D. with Larry Weiskrantz, and I well remember how focused he was on monkey perimetry and measuring their eye movements. This was demanding, difficult work which Alan grasped in both hands, succeeding where previous attempts had largely failed. One can see this now as the basis of Alan's creative career. He was fortunate in his Ph.D. supervisor, and others around, then and somewhat later, especially Nick Humphrey and his monkey Helen.

Alan has well-developed foveal and peripheral mental vision, for he combines lengthy and for most people tedious experiments with philosophical speculations, as on consciousness. He is, indeed, an experimental philosopher, looking for specific experimental results to uncurl question marks. This is the most exciting and rewarding kind of science. This is what made seventeenth-century physics so exciting with its mental and physical tools for its investigations — and now the tools of micro-electrode recording and brain imaging, which with cognitive concepts for interpreting their data allow not only brain but mind to be probed, revealing secrets in our heads.

Although gifted students soon take off on their own adventures, where they start must be helpful — or the opposite. I had begun to think that many perceptual phenomena, especially varieties of illusions, can have adequate explanations from, so to speak, the strategy of the physiology. Thus a General can lose or win a battle by how his forces are deployed. Of course there must be forces, but to see or control their effects, strategies are necessary. Historians may know the strategies top-down or deduce them bottom-up from observations. I hope this approach was not a hurdle for my students to overcome.

I also expressed doubts on the apparent simplicity of extirpation and brain recording experiments, for (having fooled around with radar and communication circuits in the war) it seemed to me that relations between parts of a machine and its functions are far from simple. I used to think of removing components from a radio, asking: How do you know what the component was doing, from the symptoms of its loss? If the radio howls — was it a 'howl suppresser'? This by no means follows (I would say). And how is it possible to localize functions if we don't understand a circuit, or the brain, to know what functions give the output performance? How could one localize an oscillator in a radio if one didn't know it had an oscillator, or how heterodyning works?

Such comments were annoying at the time, especially when seen as attacking brain science aims and dreams. Possibly there was something in this (one defends one's fort by at least pretending to attack potential invaders), but it seemed to me that interpreting data is as important as the data themselves — and errors due to misinterpretations can be greater than observational errors. I would now say that both statistical significance and conceptual significance are necessary for science. Conceptual errors are far more widely misleading, so really important for the teacher–student relation. But I have described philosophers (and no
doubt teachers of psychology) as 'Guardians of Semantic Inertia.' No doubt Alan's intelligent selective attention protected him from my perhaps dubious philosophy, and Larry's great expertise in experimenting led him to question and learn from nature.

At that time two of the now most discussed issues were taboo: attention, and consciousness. Attention was a word simply not allowed in the Quarterly Journal of Experimental Psychology, as it had no, or too little, operational definition. Consciousness was a non-starter for any budding or fully flowered author to submit for consideration. Alan has made highly significant contributions to both. Fancy getting monkeys to tell us whether they are aware of seeing! Wittgenstein would be astonished by this.

The papers in this volume honoring Alan Cowey represent new ways of thinking and experimenting. We as his teachers may or may not have suggested good directions, but it is he who homed in on wonderful questions and exciting answers. This, not by wild and (comfortably) woolly speculation, but by decades of exceptionally careful and detailed hard work. So in his turn Alan has become a force to be reckoned with, an inspiration for students and a take-off platform for new research. As he combines both statistical and conceptual significance he is far from a guardian of inertia: Alan gives momentum for progress into the future.

R.L. Gregory
Department of Experimental Psychology
University of Bristol
Contents
List of Contributors

Foreword by R.L. Gregory (Bristol, UK)

Preface by C. Heywood (Durham and Oxford, UK)

Section I. Visual Pathways

1. Developmental plasticity of photoreceptors
   B.E. Reese (Santa Barbara, CA, USA)

2. Morphology and physiology of primate M- and P-cells
   L.C.L. Silveira, C.A. Saito, B.B. Lee, J. Kremers, M. da Silva Filho, B.E. Kilavik, E.S. Yamada and V.H. Perry (Pará, Brazil; New York, NY, USA, Göttingen and Tübingen, Germany and Southampton, UK)

3. Identifying corollary discharges for movement in the primate brain
   R.H. Wurtz and M.A. Sommer (Bethesda, MD, USA)

4. Visual awareness and the cerebellum: possible role of decorrelation control
   P. Dean, J. Porrill and J.V. Stone (Sheffield, UK)

Section II. Cortical Visual Systems

5. Some effects of cortical and callosal damage on conscious and unconscious processing of visual information and other sensory inputs
   G. Berlucchi (Verona, Italy)

6. Consciousness absent and present: a neurophysiological exploration
   E.T. Rolls (Oxford, UK)

7. Rapid serial visual presentation for the determination of neural selectivity in area STSa
   P. Földiák, D. Xiao, C. Keysers, R. Edwards and D.I. Perrett (St. Andrews, UK)

8. Cortical interactions in vision and awareness: hierarchies in reverse
   C.-H. Juan, G. Campana and V. Walsh (Nashville, TN, USA, Oxford and London, UK and Padova, Italy)

9. Two distinct modes of control for object-directed action
   M.A. Goodale, D.A. Westwood and A.D. Milner (London, ON and Halifax, NS, Canada and Stockton-on-Tees, UK)

Section III. Perception and Attention

10. Color contrast: a contributory mechanism to color constancy
    A. Hurlbert and K. Wolf (Newcastle upon Tyne, UK)

11. The primacy of chromatic edge processing in normal and cerebrally achromatopsic subjects
    R.W. Kentridge, G.G. Cole and C.A. Heywood (Durham, UK)

12. Neuroimaging studies of attention and the processing of emotion-laden stimuli
    L. Pessoa and L.G. Ungerleider (Bethesda, MD, USA)

13. Selective visual attention, visual search and visual awareness
    C.M. Butter (Ann Arbor, MI, USA)

14. First-order and second-order motion: neurological evidence for neuroanatomically distinct systems
    L.M. Vaina and S. Soloviev (Boston, MA, USA)

15. Reaching between obstacles in spatial neglect and visual extinction
    A.D. Milner and R.D. McIntosh (Stockton-on-Tees, UK)

Section IV. Blindsight and Visual Awareness

16. Roots of blindsight
    L. Weiskrantz (Oxford, UK)

17. 'Double-blindsight' revealed through the processing of color and luminance contrast defined motion signals
    J.L. Barbur (London, UK)

18. Stimulus cueing in blindsight
    A. Cowey and P. Stoerig (Oxford, UK and Düsseldorf, Germany)

19. Visually guided behavior after V1 lesions in young and adult monkeys and its relation to blindsight in humans
    C.G. Gross, T. Moore and H.R. Rodman (Princeton, NJ and Atlanta, GA, USA)

20. Is blindsight in normals akin to blindsight following brain damage?
    C.A. Marzi, A. Minelli and S. Savazzi (Verona, Italy)

21. Auras and other hallucinations: windows on the visual brain
    F. Wilkinson (Toronto, ON, Canada)

22. Theories of visual awareness
    A. Zeman (Edinburgh, UK)

Subject Index
SECTION I
Visual Pathways
Progress in Brain Research, Vol. 144
ISSN 0079-6123
Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 1
Developmental plasticity of photoreceptors

Benjamin E. Reese*

Neuroscience Research Institute and Department of Psychology, University of California at Santa Barbara, Santa Barbara, CA 93106-5060, USA

*Corresponding author. Tel.: +1-805-893-2091; Fax: +1-805-893-2005; E-mail: [email protected]

DOI: 10.1016/S0079-6123(03)14400-1
Abstract: During development, retinal ganglion cells undergo conspicuous structural remodeling as they gradually attain their mature morphology and connectivity. Alterations in their dendritic organization and in their axonal projections can also be achieved following early insult to their targets or their afferents. Other retinal cell types are thought not to display this same degree of developmental plasticity. The present review will consider the evidence, drawn largely from recent experimental studies in the carnivore retina, that photoreceptors also undergo structural remodeling, extending their terminals transiently into the inner plexiform layer before retracting to the outer plexiform layer. The determinants of this transient targeting to the inner plexiform layer are considered, and the role of cholinergic amacrine cells is discussed. The factors triggering this retraction are also considered, including the concurrent maturational changes in outer segment formation and in the differentiation of the outer plexiform layer. These results provide new insight into the life history of the photoreceptor cell and its connectivity, and suggest a transient role for the photoreceptors in the circuitry of the inner retina during early development, prior to the onset of phototransduction.
Introduction
Our visual abilities arise from the capacity of photoreceptor cells to transduce a photic stimulus into a neural response and to transmit this message to second-order neurons. Effective transmission of that signal is dependent on processes acting during development that orchestrate the formation of the normal retinal architecture and circuitry. The cell-intrinsic and environmental factors controlling the morphological differentiation of the photoreceptor outer segment and its associated functional maturation are beginning to be understood, but relatively little is known about the developmental mechanisms responsible for establishing the connectivity of these cells. Photoreceptors, like most other retinal cells besides the retinal ganglion cells, are generally believed to differentiate and form synaptic connections in a targeted manner, avoiding the elaborate overgrowth, sculpting, and retraction that has been described for ganglion cell dendrites and axonal projections (Frost et al., 1979; Frost, 1984; Dann et al., 1987, 1988; Ramoa et al., 1987, 1988; Langdon and Frost, 1991; Bodnarenko et al., 1995, 1999). Recent studies have shown, however, that photoreceptors initially project beyond their normal target territory (Johnson et al., 1999), much like retinal ganglion cells that overshoot the superior colliculus and invade the inferior colliculus before retracting to form their normal target innervation within the superior colliculus (Cooper and Cowey, 1990a,b). Likewise, much as these early exuberant retinofugal projections are modulated in response to the time-dependent availability of their normal and alternative targets (Perry and Cowey, 1979, 1982), so the exuberant photoreceptor projection is transiently controlled by the presence of an alternative target during development (Johnson et al., 2001a). This review will consider the major features of photoreceptor development before examining the evidence for this developmental plasticity of the rods and cones within the ferret's retina.
Photoreceptor differentiation

The vertebrate photoreceptor is a uniquely polarized nerve cell, with a bipolar-shaped soma giving rise to an apical specialization for phototransduction and a basally directed process for transmitting the neural response to second-order neurons within the retina. Numerous studies have charted the formation and developmental time course of these two specializations, including the ultrastructural appearance of the outer and inner segments, and the establishment of synaptic ribbons associated with the rod spherules and cone pedicles. In general, such studies have shown that these features of the mature rods and cones are acquired gradually and progressively, with little remodeling or transdifferentiation through intermediate morphologies. For example, studies of photoreceptor differentiation reveal the formation of an inner segment, a cilium and ballooning outer segment, followed by the appearance of membranous disks within the outer segment. Subsequently, both the inner and outer segments increase in length until they achieve their adult size (Olney, 1968; Feeney, 1973; McArdle et al., 1977; Vogel, 1978; Tucker et al., 1979; Morrison, 1983; Usukura and Obata, 1995). Likewise, the formation of connectivity within the outer plexiform layer (OPL) proceeds by the appearance of presynaptic densities, or 'ribbons', in the basally directed processes of the rods and cones presaging synaptic terminals, followed by the apposition and invagination of lateral processes from horizontal cells. These dyadic complexes are subsequently modified by the invagination of bipolar cell dendrites to create the synaptic arrangements characteristic of mature photoreceptor terminals (Olney, 1968; Weidman and Kuwabara, 1968; Blanks et al., 1974; McArdle et al., 1977; Vogel, 1978; Maslim and Stone, 1986; Rapaport, 1989).
Protein trafficking during development

These morphological specializations associated with the apical and basal extensions of photoreceptor cells contain various proteins mediating their visual transduction-related and synaptic functions. In general, these proteins are selectively targeted to the outer segment or the synaptic terminal, respectively, but low levels of protein are occasionally found in cellular compartments for which there is no apparent function. For example, rod opsin protein is detectable not only within the disks of the outer segment but is also found within the plasma membrane of the entire cell (Jan and Revel, 1974; Nir and Papermaster, 1986; Jansen et al., 1987; Usukura and Bok, 1987; Bowes et al., 1988; Hicks et al., 1989; Lewis et al., 1991; Edward et al., 1993). During development, the onset of protein expression is widely assumed to parallel morphological differentiation (Colombaioni and Strettoi, 1993; Timmers et al., 1993), but some studies examining rod opsin expression contradict this view. For instance, rod opsin has been shown to be present in the plasma membrane of the rods well before these cells assemble their outer segments (Hicks and Barnstable, 1987; Bowes et al., 1988; Treisman et al., 1988; Watanabe and Raff, 1990; Saha and Grainger, 1993; Dorn et al., 1995; Jasoni and Reh, 1996). Likewise, the expression of SNARE complex proteins has been regarded as indicative of synaptogenesis (Devoto and Barnstable, 1989; Voigt et al., 1993; Kapfhammer et al., 1994; Dhingra et al., 1997), but there is also evidence that some of these synaptic vesicle proteins and presynaptic membrane proteins are present well before ultrastructurally identifiable synapses can be detected within the retina (Hering and Kröger, 1996; West Greenlee et al., 2001). Antibodies to synaptic vesicle proteins label the entirety of the developing outer nuclear layer (ONL) during the period preceding the emergence of the OPL (Reese et al., 1996); thereafter, these proteins become progressively restricted to the OPL during the period of synapse formation (Greiner and Weidman, 1981). Unfortunately, relatively little is understood about the assembly of synaptic ribbons within photoreceptor terminals; further study of their plasticity in maturity may shed light upon the mechanisms that assemble them during development (Vollrath and Spiwoks-Becker, 1996).
Environmental determinants of differentiation and connectivity

Environmental signals for some of the maturational milestones associated with photoreceptor differentiation are beginning to be defined. For example, the
differentiation of an outer segment containing organized stacks of membranous disks requires close association or contact with cells of the retinal pigment epithelium (Hollyfield and Witkovsky, 1974; Spoerri et al., 1988; Stiemke et al., 1994; Pinzón-Duarte et al., 2000; Bumsted et al., 2001). The outgrowth and targeting of a process from the opposite, basal, pole of the cell body to the future OPL is also presumed to involve interactions with other cells in the local environment, by way of cell-surface or secreted molecules, but little is known about the processes of initial outgrowth and subsequent target recognition. Growth factors released locally by retinal neurons and glia are known to activate receptors expressed on neighboring cells, controlling not only cell survival but also differentiation (Ary-Pires et al., 1997). For example, fibroblast growth-factor (FGF) receptors expressed on photoreceptors are thought to mediate the effects of bFGF on rod fate determination, differentiation, and resistance to injury (Hicks and Courtois, 1992; LaVail et al., 1992; Unoki and LaVail, 1994; Blanquet and Jonet, 1996; Gao and Hollyfield, 1996; Carwile et al., 1998). The glial cell-line-derived neurotrophic factor (GDNF) is also expressed in the retina, where it not only promotes ganglion cell differentiation and survival, but has also been shown to preserve the functional status of photoreceptors in vitro (Norsat et al., 1996; Klocker et al., 1997; Carwile et al., 1998; Yan et al., 1999). The neurotrophins NT-3, NT-4 and brain-derived neurotrophic factor (BDNF), and their receptors trkA, trkB, trkC, and p75 also play important roles in the morphogenesis of the visual system via paracrine mechanisms (von Bartheld, 1998). For example, dopaminergic amacrine cells increase their soma size and innervation density after BDNF application (Cellerino et al., 1998), while ganglion cells modulate their process elongation and arborization in response to BDNF and NT-4 (Bosco et al., 1993; Bosco and Linden, 1999). BDNF may also contribute to photoreceptor development. The avian photoreceptor layer expresses trkB mRNA, although it has gone undetected in the same layer of mammals (Jelsma et al., 1993; Okazawa et al., 1995; Perez and Caminos, 1995). Still, BDNF has been shown to have a clear protective effect on mammalian photoreceptors (LaVail et al., 1992; Unoki and LaVail, 1994; Perez and Caminos, 1995; LaVail et al., 1998),
and trkB knockout mice show delayed rod maturation and defective rod signaling with inner retinal neurons (Rohrer et al., 1999). These actions of BDNF signaling through trkB may be mediated indirectly by other retinal cells, particularly since application of BDNF activates intracellular-signaling pathways in inner retinal neurons and Müller glia but not in photoreceptors (Wahlin et al., 2000, 2001). A substantial fraction of the BDNF in the inner retina is derived from local sources (rather than from the optic tectum via retrograde transport), including the amacrine and ganglion cells, which synthesize and secrete BDNF, as well as express trkB (Zanellato et al., 1993; Rickman and Brecha, 1995; Ugolini et al., 1995; Cohen-Cory et al., 1996; Cellerino and Kohler, 1997; Herzog and von Bartheld, 1998). Given the intimacy shared between Müller glia and photoreceptors in vivo (Robinson and Dreher, 1990), and their role as a preferred substrate for neuritic extension by photoreceptors in vitro (Kljavin and Reh, 1991), any indirect neurotrophic effect upon photoreceptors should therefore be mediated through the Müller glia. While such a plausible neurotrophic action may contribute to the outgrowth and morphological differentiation of photoreceptors during their development, no direct evidence for a role in terminal outgrowth or target recognition has emerged to date.
Transient retinal circuitry

How retinal cells communicate during early development, prior to the establishment of the mature circuitry, has become a major focus of attention recently, as has the question of the functional significance of such precocious communication (Catsicas and Mobbs, 1995; Copenhagen, 1996; Feller, 1999; Wong, 1999). In the developing inner retina, neighboring ganglion and amacrine cells display correlated 'spontaneous' neural activity, well before photoreceptors can respond to light. This activity has been shown to originate at a location on the retina and then propagate as a wave of activity before dissipating, after which another such wave will materialize elsewhere (Meister et al., 1991; Wong et al., 1993; Feller et al., 1996). While the full significance of these waves of activity remains to be
defined, there is increasing evidence that such correlated activity in the discharge patterns of the retinal ganglion cells plays a critical role in the establishment of ocular segregation and ON–OFF lamination within the lateral geniculate nucleus (Cramer and Sur, 1997; Penn et al., 1998; Muir-Robinson et al., 2002; Stellwagen and Shatz, 2002), and may prove to contribute to the formation of retinotopic maps (Eglen, 1999). The mechanisms driving this neural activity are undefined, but recent studies indicate that synaptic and gap-junctional connectivity within the inner plexiform layer (IPL) permits this activity to be propagated across the retina (Penn et al., 1994; Wong et al., 1995; Feller et al., 1996; Zhou, 1998; Singer et al., 2001). Ganglion and amacrine cell dendrites are synaptically connected during this early developmental stage, but the other main constituent of the IPL, the axon terminal of the bipolar cell, normally the driving force of neural activity within the mature IPL, develops postnatally, after this spontaneous activity is already present (Miller et al., 1998). Curiously, cells in the outer retina show receptor-mediated increases in intracellular calcium concentrations during these early developmental stages (Wong, 1995), suggesting that they participate in the perinatal retinal circuitry, but their identity and function remain to be determined. One possibility is that these cells are developing photoreceptors that initially overextend their terminals into the developing IPL.
Photoreceptor affinity for the inner retina

That photoreceptors may have some affinity for inner retinal cells, the amacrine and ganglion cells, is not without precedent. For example, retinas from humans with retinitis pigmentosa contain surviving photoreceptors that extend their terminals into the inner retina, where they contact amacrine cells (Fariss et al., 2000), while dissociated retinal cell cultures have been shown to contain regenerating photoreceptor neurites that preferentially contact amacrine and ganglion cells over their normal target cells (Sherry et al., 1996). Further, in the zebrafish mutant, cannonball, rod photoreceptors project directly into the IPL (Brian Link, personal communication). These
examples are all drawn from anomalous developmental or degenerative conditions, but this relationship between the photoreceptors and inner retinal cells is also present during normal development.
Immature rods and cones project to the inner plexiform layer

In the developing ferret's retina, immature photoreceptors project directly to the inner plexiform layer, well before the OPL has formed (Johnson et al., 1999). By using antibodies to rod opsin, a narrow row of immunoreactive cells occupying the neuroblast layer can be identified on the day of birth, extending apically directed processes to the ventricular surface and basal processes through the neuroblast layer and beyond a layer of postmitotic amacrine cells, reaching the IPL (Fig. 1). These projections typically end in a single terminal expansion, occasionally branching within the IPL. The abundance of these projections to the IPL indicates that their outgrowth is not some rare ectopic event: at least 80% of the rod opsin-immunoreactive cells on postnatal day 1 (P-1) extend such processes (Johnson et al., 1999). Using antibodies to the cone opsins, by contrast, no immunoreactive cone photoreceptors can be detected prior to P-22 (Johnson et al., 2001b). A similar delay in cone opsin expression relative to rod opsin expression has been reported in the monkey and rat retinas (Watanabe and Raff, 1990; Szél et al., 1994; Dorn et al., 1995; Jasoni and Reh, 1996; Bumsted et al., 1997), despite the fact that cones are known to be generated before rods in primates, carnivores, and rodents (Young, 1985; LaVail et al., 1991; Johnson et al., 1999), suggesting that cones differentiate later than rods. Antibodies to recoverin, however (a calcium-binding protein found in adult rods and cones; Dizhoor et al., 1991), label two populations of cells in the outer parts of the neuroblast layer on the day of birth: one population is relatively faintly labeled, and can be double-labeled with antibodies to rod opsin. The other more intensely labeled population consists of immature cone photoreceptors. Like the population of rod opsin-immunoreactive cells, the entirety of these more intensely labeled recoverin-immunoreactive cells is labeled, including apical and basal processes.
Their apical processes typically extend through the future outer-limiting membrane, giving rise to presumptive inner segments (Greiner and Weidman, 1981). They are detected as early as embryonic day 24 (E-24), shortly after the first cones become postmitotic, and their basally directed processes already reach into the IPL by E-30 (Johnson et al., 1999).

Signals for the nascent OPL are already present

By P-1, the IPL is continuous across the entire retina, while the OPL has still yet to form (Reese et al., 1996). Horizontal cells, however, have already been generated (Zimmerman et al., 1988), and already occupy an intermediate position within the developing neuroblast layer, anticipating the future level of the OPL (Greiner and Weidman, 1981; Reese et al., 1996). These horizontal cells have already begun to differentiate laterally oriented, neurofilament-immunoreactive processes (Fig. 1), yet despite the presence of this postsynaptic target for photoreceptor cells, the projections of the latter extend well beyond this level (Johnson et al., 1999). The neuroblast layer must therefore already contain positional information specifying the level of the future OPL, yet the developing photoreceptor terminals do not respond to it.

Fig. 1. Photoreceptors (stippled cells) initially extend their terminals through the neuroblast layer and amacrine cell layer into the IPL. Horizontal cells (diagonal lines) have already been generated and have migrated into the neuroblast layer, anticipating the site of the future OPL, yet despite their presence, the photoreceptors extend beyond them.

Immunopositive cells are not proliferating neuroblasts

The morphology of these bipolar-shaped rod opsin- and recoverin-immunoreactive cells is reminiscent of proliferating neuroepithelial cells (Hinds and Hinds, 1979; Brittis et al., 1995), raising the possibility that these cells are not postmitotic rods and cones but are precursor cells that may have already begun to express proteins characteristic of their eventual progeny. These rod opsin-positive and recoverin-positive cells, however, are situated in a narrow stratum within the neuroblast layer, while proliferating retinal cells, identified with an antibody to the cell cycle-specific nuclear antigen Ki67 (Gerdes et al., 1983; Geller et al., 1995), are distributed across the full thickness of the neuroblast layer (Reese et al., 1996), being most common in the future S- and M-phase zones (Johnson et al., 1999). In fact, the tier occupied by the rod opsin-positive and recoverin-positive cells is relatively Ki67-negative, indicating that most of the cells here are postmitotic. Further, none of the recoverin-positive cells can be double-labeled with antibodies for the Ki67 antigen (Johnson et al., 1999). Finally, injections of the thymidine analog bromodeoxyuridine on P-1 never double-labeled any of the recoverin-positive cells. This latter result rules out the possibility that the rod opsin-positive or recoverin-positive cells are a unique population of precursors that fail to express the Ki67 antigen or show the classic pattern of interkinetic nuclear translocation associated with
the neuroepithelium (Robinson et al., 1985). The rod opsin-positive and recoverin-positive cells must therefore all be postmitotic, presumed to be rod or cone photoreceptors by virtue of their immunoreactivity and their positioning.
Protein-expression patterns in developing photoreceptors

Most proteins associated with the visual transduction cycle and photoreceptor structure are normally detected around the time of outer segment formation (Colombaioni and Strettoi, 1993; Timmers et al., 1993), which in the ferret commences around P-15 (Greiner and Weidman, 1981). Antibodies to these proteins, including β- and γ-transducin, phosducin, phosphodiesterase-γ (PDEγ), rhodopsin kinase, rod cGMP-gated ion channel, and peripherin, do not label photoreceptor cells in the ferret retina until the second or third postnatal week (Johnson et al., 2001b). In some cases, these proteins are compartmentally selective from the earliest stages of detection, being found exclusively within the outer segments (e.g. peripherin, the cGMP-gated cation channel and β-transducin). Others have a similar time of onset, but are found throughout the cell (e.g. γ-transducin, PDEγ, and phosducin), implying that distinct protein-trafficking mechanisms are at work (Fariss et al., 1997). The onset of expression of each of these proteins appears to be synchronized amongst both old and young photoreceptors, whereas the rod opsin and recoverin protein-expression patterns emerge gradually in cells in accord with their neurogenetic gradients, occurring first in a few cells in the central retina, spreading to cells at increasingly peripheral locations, and continuing to be expressed in more and more cells at all retinal loci as these cells are generated (Johnson et al., 2001b; see also Bowes et al., 1988; Treisman et al., 1988; Saha and Grainger, 1993). Because rod opsin and recoverin protein expression follow such different spatio-temporal gradients from those other outer segment-associated proteins, and since in some other species, rod opsin is reported to be expressed at the same time as these other proteins (Timmers et al., 1993; van Ginkel and Hauswirth, 1994), one might question whether the
early immunodetection of rod opsin and recoverin was spurious or artifactual. Yet independent RT-PCR analyses confirm the precocious expression of these two mRNA transcripts on the day of birth, while the mRNAs for those other transduction-related proteins mentioned above could not be detected until P-15, when outer segment assembly begins (Johnson et al., 2001b). There seems little doubt that immature rods and cones activate their rod opsin and recoverin genes and synthesize these proteins well before these cells are capable of generating a response to light. Whether these two proteins play some other precocious role in developing photoreceptors remains to be seen. Recoverin is a known calcium sensor (Ames et al., 1996; Polans et al., 1996), and given the myriad functions of calcium during development (Gu and Spitzer, 1995), it may play some other fundamental role in photoreceptor maturation. As for rod opsin, some other nonvisual function has been implicated by the fact that species of cave-dwelling crayfish, never exposed to light, should lack a functional constraint upon the frequency of mutations within the rod opsin gene, yet they show no difference from their surface-dwelling cousins (Crandall and Hillis, 1997). Regardless of whether the early expression of either of these two proteins plays a transient functional role, the fact that they are present early on and are found throughout the cell enables one to trace the complete morphology of these photoreceptors, including their projection into the IPL.
Maturational events in the OPL may trigger the elimination of this projection

During the first two postnatal weeks, increasing numbers of rods become postmitotic and the number of them projecting to the IPL increases, reaching maximal density on P-15. By P-15, a cell-free OPL has formed across conspicuous stretches of the dorsal retina (Reese et al., 1996), yet the photoreceptor projections continue to extend through this region, reaching the IPL (Fig. 2). Thereafter, however, the frequency of these immunopositive processes reaching the IPL declines rapidly, falling to nearly zero by the end of the third postnatal week (Johnson et al., 1999). The gradual accumulation of these projections
in the IPL, between E-24 and P-15, followed by their sudden elimination during the third postnatal week, suggests that their loss is not linked to the maturational age of each cell; rather, some environmental signal has orchestrated this elimination. This loss of projections to the IPL is coincident with the further maturation of the constituents of the OPL between P-15 and P-22. A calbindin-positive plexus in the OPL gradually develops during this same period of process elimination, arising first from the differentiation of horizontal, and later, bipolar cell dendrites (Reese et al., 1996). This period is also coincident with the formation of ribbon synapses within the OPL (Greiner and Weidman, 1981; Rapaport, 1989). Hence, maturational events in the developing OPL may trigger this elimination (Fig. 3).

Fig. 2. The density of photoreceptors projecting to the IPL steadily increases during the first two postnatal weeks. Horizontal cells (diagonal lines) begin elaborating their horizontally oriented dendrites toward the end of the second postnatal week, giving rise to a cell-sparse outer plexiform layer.

Fig. 3. As the horizontal and then bipolar cells (diagonal lines) continue to mature, giving rise to a continuous plexus of processes within the OPL during the third postnatal week, the photoreceptors retract their terminals from the IPL and form synapses within the OPL. Outer segment assembly is also initiated during this period.
Process retraction, rather than cell death or selective protein trafficking, is responsible for the elimination of these projections

The decline in the number of photoreceptors projecting to the IPL does not appear to be associated with apoptosis of the parent photoreceptor cell in the ONL. Programmed cell death occurs only scarcely in the ONL, evidenced by terminal deoxytransferase dUTP nick-end labeling (TUNEL)
of dying cells (Johnson et al., 1999). The period of naturally occurring cell death in the ONL peaks around P-42, weeks after this period of process elimination. TUNEL+ cells present within the INL during this period of process elimination serve as an internal control, confirming that their absence in the ONL during this period is not due to insufficient sensitivity of the technique. Indeed, a virtually identical time course for the relative frequency of apoptotic profiles in the ONL and INL was reported for the developing cat's retina (Maslim et al., 1997). An alternative explanation for the transience of this projection is that it remains intact but ceases to be immunopositive for rod opsin or recoverin. This explanation, while unlikely, should not be dismissed out of hand, given that protein-trafficking mechanisms within photoreceptor cells become compartmentally selective after outer segments differentiate. Yet such an explanation can be ruled out because crystalline implants of the lipophilic carbocyanine dye, DiI, placed into the IPL of fixed specimens, readily label somata throughout both the INL and ONL on P-15, but no longer do so on P-29 or thereafter, confirming that the cells of the ONL do not maintain a process to this depth within the retina (Johnson et al., 1999). Rather, these processes must be retracted to the OPL, presumably triggered by other maturational changes at that depth within the retina (Fig. 3). A similar overextension, followed by retraction, of cone photoreceptor terminals in the rat retina expressing a glutamate transporter splice variant, and possibly also being glycine immunoreactive, has recently been reported (Pow and Hendrickson, 2000; Reye et al., 2002), and cone photoreceptors in the primate retina have also been observed to extend processes transiently to the IPL (Anita Hendrickson, personal communication). While no obvious phylogenetic correlations can be made from such a limited dataset, a parallel with a hypothesized ancestral light-sensing tissue lacking bipolar cells, in which photoreceptors directly innervate projection neurons, has been noted (Reichenbach and Robinson, 1995), akin to the pineal organ of the fish (Eckström, 1987). Yet in contrast with this example, the present photoreceptor projection to the IPL is more intimately associated with a retinal interneuron, the cholinergic amacrine cell, rather than with retinal ganglion cells.
Photoreceptors target cholinergic amacrine cells

These transient projections are positioned to influence the other constituents within the IPL. Ganglion cells in the ferret retina are generated during the fourth and fifth prenatal weeks (Reese et al., 1994) and differentiate dendritic arbors shortly thereafter, forming cell class-specific morphologies during the first two postnatal weeks (Wingate and Thompson, 1994, 1995) and differentiating separate ON and OFF substrata over the first postnatal month (Bodnarenko et al., 1999; Lohmann and Wong, 2001; Wang et al., 2001). The cholinergic amacrine cells, by contrast, differentiate processes that occupy separate ON and OFF substrata much earlier, during the first postnatal week (Reese et al., 2001). Coincident with these two strata of cholinergic processes in the IPL, rod opsin-positive projections terminate at one of these same two levels during the second postnatal week (Fig. 4). Clearly, these immature photoreceptor projections recognize and respond to features defining the stratification of the developing IPL; they do not simply grow to the inner-limiting membrane (Johnson et al., 2001a).
Photoreceptor processes are immunoreactive for synaptic vesicle proteins

Numerous examples of transient synaptic connectivity exist elsewhere in the CNS, including the visual system. For example, geniculo-cortical axons form transient synaptic connections with subplate cells prior to their establishing connections within the cortical plate (Chun and Shatz, 1988; Friauf and Shatz, 1991; Herrmann et al., 1994), while optic axons establish synapses within ocular domains of the lateral geniculate nucleus and superior colliculus from which they will subsequently retract (Campbell et al., 1984; Campbell and Shatz, 1992). Perhaps photoreceptors similarly form transient synapses with the cholinergic amacrine cell processes in the IPL prior to their retraction, since these cells contain synaptic proteins such as synaptophysin. Synaptophysin, an integral membrane protein of synaptic vesicles (Wiedenmann and Franke, 1985; Sudhof et al., 1987) present in both conventional and ribbon synapses (Catsicas et al., 1992; West Greenlee et al.,
1996), is generally first detectable at the onset of synaptogenesis (Knaus et al., 1986; Devoto and Barnstable, 1989; Voigt et al., 1993; Kapfhammer et al., 1994; Dhingra et al., 1997), or even preceding it (Hering and Kröger, 1996). As early as the day of birth, recoverin-positive cone somata are richly synaptophysin immunoreactive, as are some rod opsin-positive somata. Conspicuously, the terminals are also richly synaptophysin immunoreactive (Johnson et al., 1999). Such synaptophysin-rich profiles extending to the IPL are increasingly frequent by P-15, despite the fact that the OPL has begun to form and to show a dense accumulation of synaptophysin itself (Reese et al., 1996). Similar results were also obtained using antibodies to a second synaptic vesicle protein, synaptotagmin, supporting the interpretation that these photoreceptors are preparing for, or may already be engaged in, synaptogenesis within the IPL (Johnson et al., 1999).

Fig. 4. During the second postnatal week, photoreceptor terminals extend to one of two depths within the IPL, coincident with the stratifying processes of the cholinergic amacrine cells (dark stippled cells at the top). The photoreceptors are also immunoreactive for synaptic vesicle proteins at this stage.

Early ablation of the cholinergic amacrine cells disrupts photoreceptor stratification in the IPL

To confirm that the cholinergic amacrine cells specify the depth at which these photoreceptor projections stratify, cholinergic amacrine cells were ablated using an excitotoxic approach with L-glutamate. A single subcutaneous dose of 4 mg/g of body weight at the end of the first postnatal week was found to kill off virtually all of the cholinergic amacrine cells in the central retina, while leaving the retinal architecture relatively normal, besides a slight reduction in the thickness of the INL and IPL. This excitotoxic cell death occurs rapidly, being near-complete within one day following treatment (Reese et al., 2001). Thus, by killing off the cholinergic amacrine cells at the end of the first postnatal week, any consequence for the stratification of the photoreceptors within the IPL should then be detectable one week later, when that stratification pattern is most pronounced. In fact, one week following such cholinergic ablation, the photoreceptor projection to the IPL was no longer stratified, with terminals now found at various depths within the IPL (Fig. 5), and with a large number extending beyond the IPL into the ganglion cell layer (Johnson et al., 2001a). Unfortunately, this excitotoxicity is not selective for the cholinergic amacrine cells, as the retinal ganglion cells are also reduced by about 50%, and the alpha ganglion cells in particular are nearly completely eliminated (Reese et al., 2001), consistent with their relative glutamatergic excitability (Marc, 1999a,b); other cell types, however, are not affected or are only modestly compromised. To confirm that this change in photoreceptor stratification is not a consequence of the partial ganglion cell elimination produced by their excitotoxic ablation, the optic nerve was transected shortly after birth to kill off all of the ganglion cell population, evidenced by the loss of neurofilament immunoreactivity in the ganglion cell layer and by the loss of the optic fiber layer. While some other cell types undergo a reduction in
density when the retinal ganglion cells are eliminated early on, the cholinergic amacrine cells and their strata have been shown not to be affected by this treatment (Williams et al., 2001). Under these experimental circumstances, no change to the photoreceptor stratification pattern was detected (Fig. 6), confirming that the change in photoreceptor stratification following L-glutamate exposure is not a consequence of the compromised ganglion cell population (Johnson et al., 2001a). While one should not overlook the possibility that some other cell type with processes in the IPL has been altered, and that this is the cause of the disruption of the photoreceptor stratification, none of the other stratifying processes within the IPL has been shown to be eliminated or altered following L-glutamate treatment (Reese et al., 2001). This then suggests that the alterations in photoreceptor stratification should be due to the cholinergic ablation, implying that their normal spatial coincidence indicates a causal relationship. That they do not simply grow indiscriminately beyond the IPL, but rather depend upon a cell type which itself has been shown to participate in transient retinal circuitry (Feller et al., 1996; Zhou, 1998; Wong et al., 2000), further suggests that they possess some transient functional significance.

Fig. 5. Elimination of the cholinergic amacrine cells at the close of the first postnatal week disrupts the normal stratification of the photoreceptor terminals by the end of the second postnatal week.

Fig. 6. Elimination of the retinal ganglion cells during early development, by contrast, does not affect the stratification pattern of the photoreceptor terminals.
Conclusions

The early life history of the photoreceptor turns out to be far more complicated than previously considered.
Contrary to general opinion, these cells begin to differentiate relatively early during development, expressing photoreceptor-specific proteins well in advance of outer segment formation and the onset of phototransduction. The first cone photoreceptors in the ferret may be born as early as E-22, become immunoreactive for recoverin by E-24, and extend terminals into a differentiating IPL by E-30. Rods become postmitotic largely after the period of cone neurogenesis, through the first postnatal week, and as these cells are generated, they too extend terminals into the IPL, culminating in a maximal projection to the IPL by the end of the second postnatal week. Those photoreceptor terminals seek out the stratified processes of cholinergic amacrine cells during the second postnatal week, potentially engaging the latter in a synaptic relationship during this period. Gap junctional communication between the photoreceptors and inner retinal cells may also play a role, as outer retinal cells are connected through radial processes to differentiating inner retinal neurons during early development (Catsicas et al., 1998; Becker et al., 2002). Exactly how the photoreceptors target these cholinergic strata is unclear. Studies blocking cholinergic neurotransmission during this developmental period should clarify whether this neurotransmitter plays any role in this behavior of the photoreceptors. Alternatively, cholinergic amacrine cells and photoreceptors may express cell-surface proteins like cadherins that provide a molecular basis for this affinity (Honjo et al., 2000). Whatever its cause, this relationship is subsequently lost as the photoreceptor terminals are all retracted to the level of the OPL, apparently triggered by maturational events therein, most likely associated with the differentiation of horizontal or bipolar cell dendrites. L-type calcium channels in the photoreceptor terminal have been shown to play a role in their structural remodeling, including retraction, in vitro (Nachman-Clewner et al., 1999), but these effects have only been shown in mature photoreceptors, and their relevance to development is uncertain. Two other temporally related events may contribute to this retraction: the first, less-likely event, is the onset of outer segment assembly, known to be triggered by environmental events associated with the retinal pigment epithelium (Bumsted et al., 2001).
A second possibility is the differentiation of bipolar cell terminals within the IPL, which may ‘dislodge’ the photoreceptor projection much as the normally transient retinofugal projection to the latero-posterior nucleus may be supplanted by the later invasion of other afferents to LP (Perry and Cowey, 1982). This time course of the photoreceptor retraction coincides with the transition of cholinergic-mediated spontaneous inner retinal activity to one of glutamate-mediated activity thought to reflect bipolar differentiation (Miller et al., 1999; Wong et al., 2000). Any role for the photoreceptor terminals in this activity has yet to be defined, but recent pharmacological studies reveal a significant glutamatergic component to this activity even during the earlier ‘cholinergic’ phase (Wong et al., 2000; Zhou and Zhao, 2000). Given the close temporal congruity between this transient projection and the cholinergic phase, coupled with the targeted association of the former with the cholinergic processes, the photoreceptor projection to the IPL may play a functional role in this transient retinal circuitry, providing a glutamatergic drive to the inner retina, initiating focal activity that is subsequently conveyed as waves via cholinergic and gap junctional transmission (Feller et al., 1996; Singer et al., 2001). The temporal relationship between these various anatomical and physiological events, and their relationship to other hallmark features of retinal development in the ferret, are indicated in Fig. 7. The developing retina would seem hardly the place to root out visual awareness, but if one of the wider contributions of this field is to understand visual processing as a prerequisite for the restoration of sight, then a fuller appreciation of retinal development is germane to this goal. While many of the other contributions to this volume highlight the cortical and subcortical roots of our perceptual experience, we should keep in mind the retinal processing that provides the neural blueprint for interpretation by higher visual centers. It all begins with phototransduction and signal transmission at the photoreceptor, and understanding those morphological substrates and functional characteristics of photoreceptors is only enhanced by a knowledge of their development (as is our understanding of the visual pathway in general; Reese and Cowey, 1990a,b), particularly
if we are to develop strategies for the treatment of retinal disease.
Fig. 7. Time-line depicting the major developmental milestones associated with the ferret's retina. Two other temporal landmarks, birth and eye opening, are also indicated along the time axis. (Data are derived from the following studies: 1Reese et al., 1996; 2Reese et al., 1994; 3Johnson et al., 2001b; 4Johnson et al., 1999; 5Greiner and Weidman, 1981; 6Miller et al., 1999; 7Reese et al., 2001; 8Wong and Oakley, 1996; Bodnarenko et al., 1999; Lohmann and Wong, 2001; Wang et al., 2001; 9Cusato et al., 2001; 10Wong et al., 2000.)
Acknowledgments This research was supported by grants from the Santa Barbara Cottage Hospital and the National Science Foundation (IBN 9987643). I thank Pat Johnson, Mary Raven, Kathy Giannotti, Karen Cusato and Ryan Williams for their contributions to the studies described herein, and Andy Huberman, Bob Fariss and Jimmy Zhou for their comments on the manuscript.
References Ames, J.B., Tanaka, T., Stryer, L. and Ikura, M. (1996) Portrait of a myristoyl switch protein. Curr. Opin. Struct. Biol., 6: 432–438. Ary-Pires, R., Nakatani, M., Rehen, S.K. and Linden, R. (1997) Developmentally regulated release of intraretinal
neurotrophic factors in vitro. Int. J. Dev. Neurosci., 15: 239–255. Becker, D.L., Bonness, V., Catsicas, M. and Mobbs, P. (2002) Changing patterns of ganglion cell coupling and connexin expression during chick retinal development. J. Neurobiol., 52: 280–293. Blanks, J.C., Adinolfi, A.M. and Lolley, R.N. (1974) Synaptogenesis in the photoreceptor terminal of the mouse retina. J. Comp. Neurol., 156: 81–93. Blanquet, P.R. and Jonet, L. (1996) Signal-regulated proteins and fibroblast growth factor receptors: comparative immunolocalization in rat retina. Neurosci. Lett., 214: 135–138. Bodnarenko, S.R., Jeyarasasingam, G. and Chalupa, L. (1995) Development and regulation of dendritic stratification in retinal ganglion cells by glutamate-mediated afferent activity. J. Neurosci., 15: 7037–7045. Bodnarenko, S.R., Yeung, G., Thomas, L. and McCarthy, M. (1999) The development of retinal ganglion cell dendritic stratification in ferrets. NeuroReport, 10: 2955–2959. Bosco, A. and Linden, R. (1999) BDNF and NT-4 differentially modulate neurite outgrowth in developing retinal ganglion cells. J. Neurosci. Res., 57: 759–769. Bosco, A., Carri, N.G. and Linden, R. (1993) Neuritogenesis of retinal ganglion cells is differentially promoted by target extract. Brain Res., 632: 303–307.
15 Bowes, C., Van Veen, T. and Farber, D.B. (1988) Opsin, G-protein and 48 kDa protein in normal and rd mouse retinas: developmental expression of mRNAs and proteins and light/dark cycling of mRNAs. Exp. Eye Res., 47: 369–390. Brittis, P.A., Meiri, K., Dent, E. and Silver, J. (1995) The earliest patterns of neuronal differentiation and migration in the mammalian central nervous system. Exp. Neurol., 134: 1–12. Bumsted, K., Jasoni, C., Sze´l, A´. and Hendrickson, A. (1997) Spatial and temporal expression of cone opsins during monkey retinal development. J. Comp. Neurol., 378: 117–134. Bumsted, K.M., Rizzolo, L.J. and Barnstable, C.J. (2001) Defects in the MITF(mi/mi) apical surface are associated with a failure of outer segment elongation. Exp. Eye Res., 73: 383–392. Campbell, G. and Shatz, C.J. (1992) Synapses formed by identified retinogeniculate axons during the segregation of eye input. J. Neurosci., 12: 1847–1858. Campbell, G., So, K.-F. and Lieberman, A.R. (1984) Normal postnatal development of retinogeniculate axons and terminals and identification of inappropriately-located transient synapses: electron microscope studies of horseradish peroxidase-labelled retinal axons in the hamster. Neuroscience, 13: 743–759. Carwile, M.E., Culbert, R.B., Sturdivant, R.L. and Kraft, T.W. (1998) Rod outer segment maintenance is enhanced in the presence of bFGF, CNTF and GDNF. Exp. Eye Res., 66: 791–805. Catsicas, M. and Mobbs, P. (1995) Waves are swell. Curr. Biol., 5: 977–979. Catsicas, S., Catsicas, M., Keyser, K.T., Kartein, H.J., Wilson, M.C. and Milner, R.J. (1992) Differential expression of the presynaptic protein SNAP-25 in mammalian retina. J. Neurosci. Res., 33: 1–9. Catsicas, M., Bonness, V., Becker, D. and Mobbs, P. (1998) Spontaneous Ca2þ transients and their transmission in the developing chick retina. Curr. Biol., 8: 283–286. Cellerino, A. and Kohler, K. (1997) Brain-derived neurotrophic factor/neurotrophin-4 receptor trkB is localized on ganglion cells and dopaminergic amacrine cells in the vertebrate retina. J. Comp. Neurol., 386: 149–160. Cellerino, A., Pinzon-Duarte, G., Carroll, P. and Kohler, K. (1998) Brain-derived neurotrophic factor modulates the development of the dopaminergic network in the rodent retina. J. Neurosci., 18: 3351–3362. Chun, J.J.M. and Shatz, C.J. (1988) Redistribution of synaptic vesicle antigens is correlated with the disappearance of a transient synaptic zone in the developing cerebral cortex. Neuron, 1: 297–310. Cohen-Cory, S., Escandon, E. and Fraser, S.E. (1996) The cellular patterns of BDNF and trkB expression suggest
multiple roles for BDNF during Xenopus visual system development. Dev. Biol., 179: 102–115. Colombaioni, L. and Strettoi, E. (1993) Appearance of cGMPphosphodiesterase immunoreactivity parallels the morphological differentiation of photoreceptor outer segments in the rat retina. Vis. Neurosci., 10: 395–402. Cooper, A.M. and Cowey, A. (1990a) Development and retraction of a crossed retinal projection to the inferior colliculus in neonatal pigmented rats. Neuroscience, 35: 335–344. Cooper, A.M. and Cowey, A. (1990b) Retinal topography of the neonatal crossed aberrant exuberant projection to the inferior colliculus in the pigmented rat. Neuroscience, 35: 345–354. Copenhagen, D.R. (1996) On the crest of an exciting wave. Curr. Biol., 6: 1368–1370. Cramer, K.S. and Sur, M. (1997) Blockade of afferent impulse activity disrupts on/off sublamination in the ferret lateral geniculate nucleus. Dev. Brain Res., 98: 287–290. Crandall, K.A. and Hillis, D.M. (1997) Rhodopsin evolution in the dark. Nature, 387: 667–668. Cusato, K., Stagg, S.B. and Reese, B.E. (2001) Two phases of increased cell death in the inner retina following early elimination of the ganglion cell population. J. Comp. Neurol., 439: 440–449. Dann, J.F., Buhl, E.H. and Peichl, L. (1987) Dendritic maturation in cat retinal ganglion cells: a Lucifer yellow study. Neurosci. Lett., 80: 21–26. Dann, J.F., Buhl, E.H. and Peichl, L. (1988) Postnatal dendritic maturation of alpha and beta ganglion cells in cat retina. J. Neurosci., 8: 1485–1499. Devoto, S.H. and Barnstable, C.J. (1989) Expression of the growth cone specific epitope CDA 1 and the synaptic vesicle protein SVP38 in the developing mammalian cerebral cortex. J. Comp. Neurol., 190: 154–168. Dhingra, N.K., Ramamohan, Y. and Raju, T.R. (1997) Developmental expression of synaptophysin, synapsin I and syntaxin in the rat retina. Dev. Brain Res., 102: 267–273. Dizhoor, A.M., Ray, S., Kumar, S., Niemi, G., Spencer, M., Brolley, D., Walsh, K.A., Philipov, P.P., Hurley, J.B. and Stryer, L. (1991) Recoverin: a calcium sensitive activator of retinal rod guanylate cyclase. Science, 251: 915–918. Dorn, E.M., Hendrickson, L. and Hendrickson, A.E. (1995) The appearance of rod opsin during monkey retinal development. Invest. Ophthalmol. Vis. Sci., 36: 2634–2651. Eckstro¨m, P. (1987) Photoreceptors and CSF-contacting neurons in the pineal organ of a teleost fish have direct axonal connections with the brain: an HRP-electron microscopic study. J. Neurosci., 7: 987–995. Edward, D.P., Lim, K., Sawaguchi, S. and Tso, M.O.M. (1993) An immunohistochemical study of opsin in photoreceptor cells following light-induced retinal degeneration in the rat. Graefes Arch. Clin. Exp. Ophthalmol., 231: 289–294.
16 Eglen, S.J. (1999) The role of retinal waves and synaptic normalization in retinogeniculate development. Phil. Trans. R. Soc. (Lond.), B354: 497–506. Fariss, R.N., Molday, R.S., Fisher, S.K. and Matsumoto, B. (1997) Evidence from normal and degenerating photoreceptors that two outer segment integral membrane proteins have separate transport pathways. J. Comp. Neurol., 387: 148–156. Fariss, R.N., Li, Z.-Y. and Milam, A.H. (2000) Abnormalities in rod photoreceptors, amacrine cells, and horizontal cells in human retinas with retinitis pigmentosa. Am. J. Ophthalmol., 129: 215–223. Feeney, L. (1973) The interphotoreceptor space. I. Postnatal ontogeny in mice and rats. Dev. Biol., 32: 101–114. Feller, M.B. (1999) Spontaneous correlated activity in developing neural circuits. Neuron, 22: 653–656. Feller, M.B., Wellis, D.P., Stellwagen, D., Werblin, F.S. and Shatz, C.J. (1996) Requirement for cholinergic synaptic transmission in the propagation of spontaneous retinal waves. Science, 272: 1182–1187. Friauf, E. and Shatz, C.J. (1991) Changing patterns of synaptic input to subplate and cortical plate during development of visual cortex. J. Neurophysiol., 66: 2059–2071. Frost, D.O. (1984) Axonal growth and target selection during development: retinal projections to the ventrobasal complex and other ‘nonvisual’ structures in neonatal Syrian hamsters. J. Comp. Neurol., 230: 576–592. Frost, D.O., So, K.-F. and Schneider, G.E. (1979) Postnatal development of retinal projections in Syrian Hamsters: a study using autoradiographic and anterograde degeneration techniques. Neuroscience, 4: 1649–1677. Gao, H. and Hollyfield, J.G. (1996) Basic fibroblast growth factor: increased gene expression in inherited and lightinduced photoreceptor degeneration. Exp. Eye Res., 62: 181–189. Geller, S.F., Lewis, G.P., Anderson, D.H. and Fisher, S.K. (1995) Use of the MIB-1 antibody for detecting proliferating cells in the retina. Invest. Ophthalmol. Vis. Sci., 36: 737–744. Gerdes, J., Schwab, U., Lemke, H. and Stein, H. (1983) Production of a mouse monoclonal antibody reactive with a human nuclear antigen associated with cell proliferation. Int. J. Cancer, 31: 13–20. Greiner, J.V. and Weidman, T.A. (1981) Histogenesis of the ferret retina. Exp. Eye Res., 33: 315–332. Gu, X. and Spitzer, N.C. (1995) Distinct aspects of neuronal differentiation encoded by frequency of spontaneous Ca2þ transients. Nature, 375: 784–787. Hering, H. and Kro¨ger, S. (1996) Formation of synaptic specializations in the inner plexiform layer of the developing chick retina. J. Comp. Neurol., 375: 393–405. Herrmann, K., Antonini, A. and Shatz, C.J. (1994) Ultrastructural evidence for synaptic interactions between
thalamocortical axons and subplate neurons. Eur. J. Neurosci., 6: 1729–1742. Herzog, K.-H. and von Bartheld, C.S. (1998) Contributions of the optic tectum and the retina as sources of brain-derived neurotrophic factor for retinal ganglion cells in the chick embryo. J. Neurosci., 18: 2891–2906. Hicks, D. and Barnstable, C. (1987) Different rhodopsin monoclonal antibodies reveal different binding patterns on developing and adult rat retina. J. Histochem. Cytochem., 35: 1317–1328. Hicks, D. and Courtois, Y. (1992) Fibroblast growth factor stimulates photoreceptor differentiation in vitro. J. Neurosci., 12: 2022–2033. Hicks, D., Sparrow, J. and Barnstable, C.J. (1989) Immunoelectron microscopical examination of the surface distribution of opsin in rat rod photoreceptor cells. Exp. Eye Res., 49: 13–29. Hinds, J.W. and Hinds, P.L. (1979) Differentiation of photoreceptors and horizontal cells in the embryonic mouse retina: an electron microscopic, serial section analysis. J. Comp. Neurol., 187: 495–512. Hollyfield, J.G. and Witkovsky, P. (1974) Pigmented retinal epithelium involvement in photoreceptor development and function. J. Exp. Zool., 189: 357–378. Honjo, M., Tanihara, H., Suzuki, S., Tanaka, T., Honda, Y. and Takeichi, M. (2000) Differential expression of cadherin adhesion receptors in neural retina of the postnatal mouse. Invest. Ophthalmol. Vis. Sci., 41: 546–551. Jan, L.Y. and Revel, J.P. (1974) Ultrastructural localization of rhodopsin in the vertebrate retina. J. Cell Biol., 62: 257–273. Jansen, H.G., Sanyal, S., De Grip, W.J. and Schalken, J.J. (1987) Development and degeneration of retina in rds mutant mice: ultraimmunohistochemical localization of opsin. Exp. Eye Res., 44: 347–361. Jasoni, C.L. and Reh, T.A. (1996) Temporal and spatial pattern of MASH-1 expression in the developing rat retina demonstrates progenitor cell heterogeneity. J. Comp. Neurol., 369: 319–327. Jelsma, T.N., Friedman, H.H., Berkelaar, M., Bray, G.M. and Aguayo, A.J. (1993) Different forms of the neurotrophin receptor brkB mRNA predominate in rat retina and optic nerve. J. Neurobiol., 24: 1207–1214. Johnson, P.T., Williams, R.R., Cusato, K. and Reese, B.E. (1999) Rods and cones project to the inner plexiform layer during development. J. Comp. Neurol., 414: 1–12. Johnson, P.T., Raven, M.A. and Reese, B.E. (2001a) Disruption of transient photoreceptor targeting within the inner plexiform layer following early ablation of cholinergic amacrine cells in the ferret. Vis. Neurosci., 18: 741–751. Johnson, P.T., Williams, R.R. and Reese, B.E. (2001b) Developmental patterns of protein expression in photoreceptors implicate distinct environmental vs. cell-intrinsic mechanisms. Vis. Neurosci., 18: 157–168.
17 Kapfhammer, J.P., Christ, F. and Schwab, M.E. (1994) The expression of GAP-43 and synaptophysin in the developing rat retina. Dev. Brain Res., 80: 251–260. Kljavin, I.J. and Reh, T.A. (1991) Mu¨ller cells are a preferred substrate for in vitro neurite extension by rod photoreceptor cells. J. Neurosci., 11: 2985–2994. Klocker, N., Braunling, F., Isenmann, S. and Bahr, M. (1997) In vivo neurotrophic effects of GDNF on axotomized retinal ganglion cells. Neuroreport, 10: 3439–3442. Knaus, P., Betz, H. and Rehm, H. (1986) Expression of synaptophysin during postnatal development of the mouse brain. J. Neurochem., 47: 1302–1304. Langdon, R.B. and Frost, D.O. (1991) Transient retinal axon collaterals to visual and somatosensory thalamus in neonatal hamsters. J. Comp. Neurol., 310: 200–214. LaVail, M.M., Rapaport, D.H. and Rakic, P. (1991) Cytogenesis in the monkey retina. J. Comp. Neurol., 309: 86–114. LaVail, M.M., K.U., Yasumura, D., Matthes, M.T., Yancopoulos, G.D. and Steinberg, R.H. (1992) Multiple growth factors, cytokines, and neurotrophins rescue photoreceptors from the damaging effects of constant light. Proc. Natl. Acad. Sci., 89: 11249–11253. LaVail, M.M., Yasumura, D., Matthes, M.T., LauVillacorta, C., Unoki, K., Sung, C.-H. and Steinberg, R.H. (1998) Protection of mouse photoreceptors by survival factors in retinal degenerations. Invest. Ophthalmol. Vis. Sci., 39: 592–602. Lewis, G.P., Erickson, P.A., Anderson, D.H. and Fisher, S.K. (1991) Opsin distribution and protein incorporation in photoreceptors after experimental retinal detachment. Exp. Eye Res., 53: 629–640. Lohmann, C. and Wong, R.O.L. (2001) Cell-type specific dendritic contacts between retinal ganglion cells during development. J. Neurobiol., 48: 150–162. Marc, R. (1999a) Mapping glutamatergic drive in the vertebrate retina with a channel-permeant organic cation. J. Comp. Neurol., 407: 47–64. Marc, R. (1999b) Kainate activation of horizontal, bipolar, amacrine, and ganglion cells in the rabbit retina. J. Comp. Neurol., 407: 65–76. Maslim, J. and Stone, J. (1986) Synaptogenesis in the retina of the cat. Brain Res., 373: 35–48. Maslim, J., Valter, K., Egensperger, R., Hollander, H. and Stone, J. (1997) Tissue oxygen during a critical developmental period controls the death and survival of photoreceptors. Invest. Ophthalmol. Vis. Sci., 38: 1667–1677. McArdle, C.B., Dowling, J.E. and Masland, R.H. (1977) Development of outer segments and synapses in the rabbit retina. J. Comp. Neurol., 175: 253–273. Meister, M., Wong, R.O.L., Baylor, D.A. and Shatz, C.J. (1991) Synchronous bursts of action potentials in ganglion cells of the developing mammalian retina. Science, 252: 939–943.
Miller, E.D., Wong, W.T. and Wong, R.O.L. (1998) Developmental changes in the neurotransmitter regulation of correlated spontaneous retinal bursting activity. Soc. Neurosci. Abs., 24: 812. Miller, E.D., Tran, M.-N., Wong, G.-K., Oakley, D.M. and Wong, R.O.L. (1999) Morphological differentiation of bipolar cells in the ferret retina. Vis. Neurosci., 16: 1133–1144. Morrison, J.D. (1983) Morphogenesis of photoreceptor outer segments in the developing kitten retina. J. Anat., 136: 521–533. Muir-Robinson, G., Hwang, B.J. and Feller, M.B. (2002) Retinogeniculate axons undergo eye-specific segregation in the absence of eye-specific layers. J. Neurosci., 22: 5259–5264. Nachman-Clewner, M., St. Jules, R. and Townes-Anderson, E. (1999) L-type calcium channels in the photoreceptor ribbon synapse: localization and role in plasticity. J. Comp. Neurol., 415: 1–16. Nir, I. and Papermaster, D.S. (1986) Immunocytochemical localization of opsin in the inner segment and ciliary plasma membrane of photoreceptors in retinas of rds mutant mice. Invest. Ophthalmol. Vis. Sci., 27: 836–840. Norsat, C.A., Tomac, A., Lindqvist, E., Lindskog, S., Humpel, C., Stromberg, I., Ebendal, T., Hoffer, B.J. and Olson, L. (1996) Cellular expression of GDNF mRNA suggests multiple functions inseide and outside the nervous system. Cell Tissue Res., 286: 191–207. Okazawa, H., Damei, M., Imafuku, I. and Kanazawa, I. (1995) Gene regulation of trkB and trkC in the chicken retina by light/darkness exposure. Oncogene, 7: 1813–1818. Olney, J.W. (1968) An electron microscopic study of synapse formation, receptor outer segment development, and other aspects of developing mouse retina. Invest. Ophthalmol., 7: 250–268. Penn, A.A., Wong, R.O.L. and Shatz, C.J. (1994) Neuronal coupling in the developing mammalian retina. J. Neurosci., 14: 3805–3815. Penn, A.A., Riquelme, P.A., Feller, M.B. and Shatz, C.J. (1998) Competition in retinogeniculate patterning driven by spontaneous activity. Science, 279: 2108–2112. Perez, M.T.R. and Caminos, E. (1995) Expression of brainderived neurotrophic factor and of its functional receptor in neonatal and adult retina. Neurosci. Lett., 183: 96–99. Perry, V.H. and Cowey, A. (1979) The effects of unilateral cortical and tectal lesions on retinal ganglion cells in rats. Exp. Brain Res., 35: 85–95. Perry, V.H. and Cowey, A. (1982) A sensitive period for ganglion cell degeneration and the formation of aberrant retino-fugal connections following tectal lesions in rats. Neuroscience, 7: 583–594. Pinzo´n-Duarte, G., Kohler, K., Arango-Gonza´lez, B. and Guenther, E. (2000) Cell differentiation, synaptogenesis, and influence of the retinal pigment epithelium in a rat neonatal organotypic retina culture. Vis. Res., 40: 3455–3465.
18 Polans, A., Baehr, W. and Palczewski, K. (1996) Turned on by Ca2þ! The physiology and pathology of Ca2þ-binding proteins in the retina. TINS, 19: 547–554. Pow, D.V. and Hendrickson, A.E. (2000) Expression of glycine and the glycine transporter Glyt-1 in the developing rat retina. Vis. Neurosci., 17: 1–9. Ramoa, A.S., Campbell, G. and Shatz, C.J. (1987) Transient morphological features of identified ganglion cells in living fetal and neonatal retina. Science, 237: 522–525. Ramoa, A.S., Campbell, G. and Shatz, C.J. (1988) Dendritic growth and remodeling of cat retinal ganglion cells during fetal and postnatal development. J. Neurosci., 8: 4239–4261. Rapaport, D.H. (1989) Quantitative aspects of synaptic ribbon formation in the outer plexiform layer of the developing cat retina. Vis. Neurosci., 3: 21–32. Reese, B.E. and Cowey, A. (1990a) Fibre organization of the monkey’s optic tract: I. Segregation of functionally distinct optic axons. J. Comp. Neurol., 295: 385–400. Reese, B.E. and Cowey, A. (1990b) Fibre organization of the monkey’s optic tract: II. Noncongruent representation of the two half-retinae. J. Comp. Neurol., 295: 401–412. Reese, B.E., Thompson, W.F. and Peduzzi, J.D. (1994) Birthdates of neurons in the retinal ganglion cell layer of the ferret. J. Comp. Neurol., 341: 464–475. Reese, B.E., Johnson, P.T. and Baker, G.E. (1996) Maturational gradients in the retina of the ferret. J. Comp. Neurol., 375: 252–273. Reese, B.E., Raven, M.A., Giannotti, K.A. and Johnson, P.T. (2001) Development of cholinergic amacrine cell stratification in the ferret retina and the effects of early excitotoxic ablation. Vis. Neurosci., 18: 559–570. Reichenbach, A. and Robinson, S.R. (1995) Phylogenetic constraints on retinal organization and development. Prog. Ret. Eye Res., 15: 139–171. Reye, P., Sullivan, R. and Pow, D.V. (2002) Distribution of two splice variants of the glutamate transporter GLT-1 in the developing rat retina. J. Comp. Neurol., 447: 323–330. Rickman, D.W. and Brecha, N.C. (1995) Expression of the proto-oncogene, trk, receptors in the developing rat retina. Vis. Neurosci., 12: 215–222. Robinson, S.R. and Dreher, Z. (1990) Mu¨ller cells in adult rabbit retinae: morphology, distribution and implications for function and development. J. Comp. Neurol., 292: 178–192. Robinson, S.R., Rapaport, D.H. and Stone, J. (1985) Cell division in the developing cat retina occurs in two zones. Dev. Brain Res., 19: 101–109. Rohrer, B., Korenbrot, J.I., LaVail, M.M., Reichardt, L.F. and Xu, B. (1999) Role of neurotrophin receptor TrkB in the maturation of rod photoreceptors and establishment of synaptic transmission to the inner retina. J. Neurosci., 19: 8919–8930. Saha, M.S. and Grainger, R.M. (1993) Early opsin expression in Xenopus embryos precedes photoreceptor differentiation. Mol. Brain Res., 17: 307–318.
Sherry, D.M., St. Jules, R.S. and Townes-Anderson, E. (1996) Morphologic and neurochemical target selectivity of regenerating adult photoreceptors in vitro. J. Comp. Neurol., 376: 476–488. Singer, J.H., Mirotznik, R.R. and Feller, M.B. (2001) Potentiation of L-type calcium channels reveals nonsynaptic mechanisms that correlate spontaneous activity in the developing mammalian retina. J. Neurosci., 21: 8514–8522. Spoerri, P.E., Ulshafer, R.J., Ludwig, H.C., Allen, C.B. and Kelley, K.C. (1988) Photoreceptor cell development in vitro: influence of pigment epithelium conditioned medium on outer segment differentiation. Eur. J. Cell Biol., 46: 362–367. Stellwagen, D. and Shatz, C.J. (2002) An instructive role for retinal waves in the development of retinogeniculate connectivity. Neuron, 33: 357–367. Stiemke, M.M., Landers, R.A., Al-Ubaidi, M.R., Rayborn, M.E. and Hollyfield, J.G. (1994) Photoreceptor outer segment development in Xenopus laevis: influence of the pigment epithelium. Dev. Biol., 162: 169–180. Sudhof, T.C., Lottspeich, F., Greengard, P., Mehl, E. and Jahn, R. (1987) A synaptic vesicle protein with a novel cytoplasmic domain and four transmembrane regions. Science, 238: 1142–1144. Sze´l, A´., van Veen, T. and Ro¨hlich, P. (1994) Retinal cone differentiation. Nature, 370: 336. Timmers, A.M., Newton, B.R. and Hauswirth, W.W. (1993) Synthesis and stability of retinal photoreceptor mRNAs are coordinately regulated during bovine fetal development. Exp. Eye Res., 56: 257–265. Treisman, J.E., Morabito, M.A. and Barnstable, C.J. (1988) Opsin expression in the rat retina is developmentally regulated by transcriptional activation. Mol. Cell Biol., 8: 1570–1579. Tucker, G.S., Hamasaki, D.I., Labbie, A. and Muroff, J. (1979) Anatomic and physiologic development of the photoreceptor of the kitten. Exp. Brain Res., 37: 459–474. Ugolini, G., Cremisi, F. and Maffei, L. (1995) TrkA, trkB and p75NTR mRNA expression is developmentally regulated in rat retina. Brain Res., 704: 121–124. Unoki, K. and LaVail, M.M. (1994) Protection of the rat retina from ischemic injury by brain-derived neurotrophic factor, ciliary neurotrophic factor, and basic fibroblast growth factor. Invest. Ophthalmol. Vis. Sci., 35: 907–915. Usukura, J. and Bok, D. (1987) Changes in the localization and content of opsin during retinal development in the rds mutant mouse: immunocytochemistry and immunoassay. Exp. Eye Res., 45: 501–515. Usukura, J. and Obata, S. (1995) Morphogenesis of photoreceptor outer segments in retinal development. Prog. Ret. Eye Res., 15: 113–125. van Ginkel, P.R. and Hauswirth, W.W. (1994) Parallel regulation of fetal gene expression in different photoreceptor cell types. J. Biol. Chem., 269: 4986–4992.
19 Vogel, M. (1978) Postnatal development of the cat’s retina: a concept of maturation obtained by qualitative and quantitative examinations. Graefes Arch. Ophthal., 208: 93–107. Voigt, T., De Lima, A.D. and Beckmann, M. (1993) Synaptophysin immunohistochemistry reveals inside-out pattern of early synaptogenesis in ferret cerebral cortex. J. Comp. Neurol., 330: 48–64. Vollrath, L. and Spiwoks-Becker, I. (1996) Plasticity of retinal ribbon synapses. Microsc. Res. Tech., 35: 472–487. von Bartheld, C.S. (1998) Neurotrophins in the developing and regenerating visual system. Histol. Histopathol., 13: 437–459. Wahlin, K.J., Campochiaro, P.A., Zack, D.J. and Adler, R. (2000) Neurotrophic factors cause activation of intracellular signaling pathways in Mu¨ller cells and other cells of the inner retina, but not photoreceptors. Invest. Ophthalmol. Vis. Sci., 41: 927–936. Wahlin, K.J., Adler, R., Zack, D.J. and Campochiaro, P.A. (2001) Neurotrophic signaling in normal and degenerating rodent retinas. Exp. Eye Res., 73: 693–701. Wang, G.Y., Liets, L.C. and Chalupa, L.M. (2001) Unique functional properties of on and off pathways in the developing mammalian retina. J. Neurosci., 15: 4310–4317. Watanabe, T. and Raff, M.C. (1990) Rod photoreceptor development in vitro: intrinsic properties of proliferating neuroepithelial cells change as development proceeds in the rat retina. Neuron, 2: 461–467. Weidman, T.A. and Kuwabara, T. (1968) Postnatal development of the rat retina. An electron microscopic study. Arch. Ophthalmol., 79: 470–484. West Greenlee, M.H., Swanson, J.J., Simon, J.J., Elmquist, J.K., Jacobson, C.D. and Sakaguchi, D.S. (1996) Postnatal development and the differential expression of presynaptic terminal-associated proteins in the developing retina of the Brazilian opossum, Monodelphis domestica. Dev. Brain Res., 96: 159–172. West Greenlee, M.H., Roosevelt, C.B. and Sakaguchi, D.S. (2001) Differential localization of SNARE complex proteins SNAP-25, syntaxin, and vamp during development of the mammalian retina. J. Comp. Neurol., 430: 306–320. Wiedenmann, B. and Franke, W.W. (1985) Identification and localization of synaptophysin, an integral membrane glycoprotein of Mr 38,000 characteristic of presynaptic vesicles. Cell, 41: 1017–1028. Williams, R.R., Cusato, K., Raven, M. and Reese, B.E. (2001) Organization of the inner retina following early elimination
of the retinal ganglion cell population: effects on cell numbers and stratification patterns. Vis. Neurosci., 18: 233–244. Wingate, R.J. and Thompson, I.D. (1994) Targeting and activity-related dendritic modification in mammalian retinal ganglion cells. J. Neurosci., 11: 6621–6637. Wingate, R.J. and Thompson, I.D. (1995) Axonal target choice and dendritic development of ferret beta retinal ganglion cells. Eur. J. Neurosci., 7: 723–731. Wong, R.O.L. (1995) Cholinergic regulation of [Ca2þ]i during cell division and differentiation in the mammalian retina. J. Neurosci., 15: 2696–2706. Wong, R.O.L. (1999) Retinal waves and visual system development. Ann. Rev. Neurosci., 22: 29–47. Wong, R.O.L., Meister, M. and Shatz, C.J. (1993) Transient period of correlated bursting activity during development of the mammalian retina. Neuron, 11: 923–938. Wong, R.O.L., Chernjavsky, A., Smith, S.J. and Shatz, C.J. (1995) Early functional neural networks in the developing retina. Nature, 374: 716–718. Wong, R.O.L. and Oakley, D.M. (1996) Changing patterns of spontaneous bursting activity of on and off retinal ganglion cells during development. Neuron, 16:1087–1095. Wong, W.T., Myhr, K.L., Miller, E.D. and Wong, R.O.L. (2000) Developmental changes in the neurotransmitter regulation of correlated spontaneous retinal activity. J. Neurosci., 20: 1–10. Yan, Q., Wang, J., Matheson, C.R. and Urich, J.L. (1999) Glial cell-line derived neurotrophic factor (GDNF) promotes the survival of axotomized retinal ganglion cells in adult rats: comparison to and combination with brain-derived neurotrophic factor. J. Neurobiol., 15: 382–390. Young, R.W. (1985) Cell differentiation in the retina of the mouse. Anat. Rec., 212: 199–205. Zanellato, A., Comelli, M.C., Dal-Toso, R. and Carmignoto, G. (1993) Developing rat retinal ganglion cells express the functional NGF receptor p140trkA. Dev. Biol., 159: 105–113. Zhou, Z.J. (1998) Direct participation of starburst amacrine cells in spontaneous rhythmic activities in the developing mammalian retina. J. Neurosci., 18: 4155–4165. Zhou, Z.J. and Zhao, D. (2000) Coordinated transitions in neurotransmitter systems for the initiation and propagation of spontaneous retinal waves. J. Neurosci., 20: 6570–6577. Zimmerman, R.P., Polley, E.H. and Fortney, R.L. (1988) Cell birthdays and rate of differentiation of ganglion and horizontal cells of the developing cat’s retina. J. Comp. Neurol., 274: 77–90.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 2
Morphology and physiology of primate M- and P-cells
Luiz Carlos L. Silveira1,*, Cézar A. Saito1, Barry B. Lee2, Jan Kremers3, Manoel da Silva Filho1, Bjørg E. Kilavik3, Elizabeth S. Yamada1 and V. Hugh Perry4
1 Department of Physiology, Biological Science Center, Federal University of Pará, 66075-900 Belém, Pará, Brazil
2 SUNY Optometry, New York, NY 10036, USA and Max-Planck Institute for Biophysical Chemistry, Department of Neurobiology, D-3400 Göttingen, Germany
3 Department of Experimental Ophthalmology, University of Tübingen, D-72076 Tübingen, Germany
4 CNS Inflammation Group, University of Southampton, SO16 7PX Southampton, UK
*Corresponding author. Universidade Federal do Pará, Centro de Ciências Biológicas, Departamento de Fisiologia, 66075-900 Belém, Pará, Brazil. Tel.: +5591-99834133; Fax: +5591-2111570; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14400-2
Abstract: Catarrhines and platyrrhines, the so-called Old- and New-World anthropoids, have different cone photopigments. Postreceptoral mechanisms must have coevolved with the receptors to provide trichromatic color vision, and so it is important to compare postreceptoral processes in these two primate groups, both from anatomical and physiological perspectives. The morphology of ganglion cells has been studied in the retina of catarrhines such as the diurnal and trichromatic Macaca, as well as platyrrhines such as the diurnal, di- or trichromatic Cebus, and the nocturnal, monochromatic Aotus. Diurnal platyrrhines, both di- and trichromats, have ganglion cell classes very similar to those found in catarrhines: M (parasol), P (midget), small-field bistratified, and several classes of wide-field ganglion cells. In the fovea of all diurnal anthropoids, P-cell dendritic trees contact single midget bipolars, which contact single cones. The Aotus retina has far fewer cones than diurnal species, but M- and P-cells are similar to those in diurnal primates although of larger size. As in diurnal anthropoids, in the Aotus, the majority of midget bipolar cells, found in the central 2 mm of eccentricity, receive input from a single cone and the sizes of their axon terminals match the sizes of P-cell dendritic fields in the same region. The visual responses of retinal ganglion cells of these species have been studied using single-unit electrophysiological recordings. Recordings from retinal ganglion cells in Cebus and Aotus showed that they have properties very similar to those in the macaque, except that P-cells of mono- and dichromatic animals lack cone opponency. Whatever the original role of the M- and P-cells was, they are likely to have evolved prior to the divergence of catarrhines and platyrrhines. M- and P-cell systems thus appear to be strongly conserved in the various primate species. The reasons for this may lie in the roles of these systems for both achromatic and chromatic vision.
Vision and visual encoding
To specify the difference between vision and hearing in humans, it is commonly said that the sense of vision is primarily devoted to object localization and identification, whereas the sense of hearing is concerned with communication between individuals; a blind person is cut off from the world of things, whereas a deaf person is cut off from the world of people (Evans, 1982). This may be a partisan approach to sensory physiology, but to consider the visual system as a device built for localization and identification might be a good way to start a discussion of how the structure and function of the visual pathways serve the purpose of vision.
Object reflectance modifies the spectral distribution of light generated from natural or artificial
sources to provide the stimulus configurations which reach the eye from different locations in the visual field. Thus, two stimulus properties available for use by the visual system for object localization and identification are the amplitude and frequency constituting such spectral distributions, or the number of photons and photon energy. As stated in the Rushton Principle of Univariance, photoreceptors only 'measure' the number of photoisomerizations occurring in a certain spatiotemporal window and cannot distinguish between photons of different energy once they are absorbed. Photons of a certain energy are absorbed with the highest probability, and this peak of the absorption spectrum depends on the microelectric forces provided by the opsin protein that surrounds the retinaldehyde. A few amino acids among those that constitute the opsin structure are critical in modifying the retinaldehyde's environment, and have been the target for evolutionary changes leading to photopigment diversity. Thus, visual information is encoded at the photoreceptor level following simple physical principles. The electrical activity of the photoreceptor mosaic is directly related to photon catch; the photoreceptor density limits spatial resolution; and the time course of the photoreceptor potential sets the temporal resolution of this representation. What happens further down in the visual pathway? How is the visual world represented at every postreceptoral level, and what are the neural correlates of the physical stimulus properties? The ganglion cells are only two synapses away from phototransduction events and represent the link between retinal circuitry and higher-level brain processes. Thus, the retinal ganglion cell layer is a convenient place to see how postreceptoral neurons first deal with photoreceptor output to construct a neural representation of the visual world, which then is sent to the visual centers in the midbrain, thalamus, and cerebral cortex. Investigation of ganglion cell physiology has provided two ways to approach the problem. One approach, mainly performed in rabbits and lower vertebrates, has postulated the existence of trigger features, special stimulus configurations that would drive specific ganglion cell classes (Barlow, 1961). Early feature extraction implies that a high level of information processing is already attained at the ganglion cell
level, probably involving some form of nonlinear transformation of photoreceptor signals. Another approach describes ganglion cell responses in terms of spatial and temporal properties directly related to fundamental physical parameters of the stimulus (Ku¨ffler, 1953; Enroth-Cugell and Robson, 1966). It assumes relatively simple rules for the translation of intensity parameters of the optical image into the phototransduction cascade and subsequent neuronal activity, mostly by means of linear operations. This approach has been very useful to characterize ganglion cell physiology in mammals, including primates. Primate ganglion cells have been classified and their functional roles in vision have been discussed using the abovementioned approach. In addition, a well-documented correlation between physiology and morphology has been disclosed for the most common ganglion cell classes, the M- and P-cells (Shapley and Perry, 1986; Dacey and Lee, 1994). The physiological properties of the M- and P-relay neurons of the lateral geniculate nucleus (LGN) are generally similar to those of M- and P-retinal ganglion cells, although subtle differences have been found with more detailed comparisons between neurons of these two stations of the visual pathway (e.g. Kaplan et al., 1993). In this chapter we shall focus on results obtained from ganglion cells, but the reader may find elsewhere detailed reviews on the physiology of M- and P-pathways at more central levels in the thalamus and primary visual cortex (e.g. Shapley and Hawken, 1999). This chapter presents an overview on the morphology and physiology of two major classes of primate retinal ganglion cells: the M- and P-cells. Particular emphasis will be given to results obtained from New-World primates which may be diurnal or nocturnal and have different color vision phenotypes. A comparison between the ganglion cells of OldWorld and different New-World species may shed light on M- and P-pathway function and evolution. In addition, details of M- and P-cell anatomy and physiology studied at different eccentricities are presented and differences in morphology, response characteristics in the space and time domains, and the relative contributions of rod and cone signals to their responses, are discussed and related to the diurnal activity rhythm, as well as to color vision phenotype.
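The principle of univariance invoked earlier in this section can be stated compactly; the following formulation is offered only as an illustration and is not taken from the original text. A photoreceptor with spectral sensitivity S(λ) responds to a light of spectral photon distribution I(λ) only through its quantum catch,

\[ Q = \int I(\lambda)\, S(\lambda)\, \mathrm{d}\lambda , \]

so any two stimuli that produce the same Q are indistinguishable to that receptor, whatever the energies of the individual photons absorbed.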
Primate ganglion cell classes
Santiago Ramón y Cajal's extensive description of retinal anatomy includes little about primate ganglion cell morphology (Ramón y Cajal, 1904). Until the 1980s, only a few selected studies had been performed on this subject. Dogiel (1891) used the method of Ehrlich to stain human retinal flat-mounts. He described three ganglion cell classes and illustrated their morphology with drawings of stained retinal patches, showing cells that look very similar to peripheral M- and P-cells. The first detailed descriptions of primate ganglion cells are from the work of Polyak (1941) and Boycott and Dowling (1969), who studied retinal sections stained with the Golgi method. These authors recognized several ganglion cell classes in the retina of macaques and other nonhuman primates and showed that their morphology changes with distance from fovea. Polyak coined the terms parasol and midget that are used today for the two most frequently found cell classes. Later these cells were called, respectively, Pα- and Pβ-cells (Perry and Cowey, 1981) or A- and B-cells (Leventhal et al., 1981). Parasol cells are now usually termed M- (or MC-) cells and midget cells are now called P- (or PC-) cells (Shapley and Perry, 1986). A considerable advance in the study of primate ganglion cell morphology was achieved by combining the use of retinal flat-mount preparations with modern methods of extra- and intracellular injection of neurotracers. Alan Cowey and Hugh Perry, working at the University of Oxford, were among the first to employ such techniques. They labeled macaque ganglion cells by horseradish peroxidase (HRP) retrograde transport from the optic nerve, lateral geniculate nucleus and superior colliculus, and published a series of quantitative accounts on M- and P-cell morphology (Perry and Cowey, 1981, 1984; Perry et al., 1984; Perry and Silveira, 1988). When the tracer was placed in the optic nerve, behind the eyeball, the staining quality they achieved was 'Golgi-like', making it easy to classify the labeled cells and to measure dendritic-field and cell-body sizes at all retinal locations accurately. Moreover, the use of flattened preparations of entire retinas made it simple to measure the distance of labeled cells from the fovea. Using these procedures, they quantified
how the sizes of dendritic fields and cell bodies of macaque M- and P-cells vary as a function of retinal eccentricity. Finally, using retrograde labeling from retinal targets, they showed that M- and P-cells project to the magno- and parvocellular layers of the lateral geniculate nucleus, respectively, confirming the results of Leventhal and colleagues (Leventhal et al., 1981). Since 1981, different research groups have used retrograde transport of HRP or Biocytin, as well as intracellular injection of Lucifer Yellow, HRP or Neurobiotin, to label ganglion cells in retinal flat-mounts of several primate species. M- and P-cells have been identified in all primates studied so far, including humans (Rodieck et al., 1985; Kolb et al., 1992; Dacey and Petersen, 1992), other diurnal catarrhines (Leventhal et al., 1981; Perry and Cowey, 1981; Watanabe and Rodieck, 1989), diurnal platyrrhines (Leventhal et al., 1989; Silveira et al., 1994; Ghosh et al., 1996; Yamada et al., 1996a,b), nocturnal platyrrhines (Silveira et al., 1994; Yamada et al., 1996b, 2001), and prosimians (Yamada et al., 1998). In all primates, M-cells have large cell bodies, thick axons, and large dendritic trees with a radial branching pattern, whereas P-cells have small cell bodies, thin axons, and small dendritic trees with a more bushy and dense branching pattern (Figs. 1–3). As in other mammalian ganglion cell classes, such as cat α- and β-cells, M- and P-cells are divided into two subclasses according to the level of dendritic branching in the inner plexiform layer. The cells of one subclass have dendrites ramifying in the outer half of the inner plexiform layer, and are called outer M- or P-cells, whereas the cells of the other subclass have dendrites in the inner half of the inner plexiform layer, being called inner M- or P-cells. The outer and inner subclasses of M- and P-cells correspond to the electrophysiologically off-center and on-center varieties (Dacey and Lee, 1994). It is of special interest to investigate whether M- and P-cell morphology is the same in primates with different life styles. The primates comprise two Suborders: Anthropoidea and Prosimii (Fleagle, 1988). The anthropoids are divided into two Infraorders, Catarrhini and Platyrrhini, inhabiting the Old and New World, respectively. All the 22 genera of living Old World anthropoids are diurnal and their color vision is trichromatic with little variation among
individuals and species. In humans and other catarrhines, dichromacy or anomalous trichromacy are considered abnormal phenotypes. Thus, it is not surprising that retinal organization has been shown to be very similar in all catarrhines so far studied, with only minor differences between species.
Fig. 1. M- and P-cells from the central retina of Cebus and Aotus (Silveira et al., 1994; Yamada et al., 1996a, 2001). A–B. Cebus M-on cells. C. Aotus M-off cell. D–E. Cebus P-on cells. F. Aotus P-off cell. Ganglion cells were retrogradely labeled by placing Biocytin in the optic nerve, 1–3 mm behind the eyeball. After 18–48 h, the animal was euthanized with a lethal dose of barbiturate and perfused with paraformaldehyde. After perfusion, the eye was removed, the retina dissected and incubated in ABC Vectastain for 12–48 h, and then reacted for peroxidase histochemistry using diaminobenzidine as chromogen. Drawings were made using a drawing tube attached to a binocular microscope. Cebus M- and P-cells are similar to those observed in other diurnal anthropoids, M-cells being larger than P-cells at all retinal locations. Aotus M- and P-cells are larger than their Cebus counterparts at similar eccentricities, but M-cells are still larger than P-cells at all eccentricities. The figure illustrates that foveal Aotus M- and P-cells are slightly larger or about the same size as Cebus M- and P-cells located about 0.5 mm more peripherally. Both in Cebus and Aotus, the central M- and P-cells are very small and the central P-cell dendritic fields have the appropriate size to contact axon terminals of single midget bipolar cells, whose dendrites make contact with single cones. Scale bar = 50 µm.
Fig. 2. P-cells from the peripheral retina of Cebus and Aotus (Silveira et al., 1994; Yamada et al., 1996a, 2001). A. Cebus P-on cell. B. Aotus P-off cell. In the Cebus and Aotus, similarly to other diurnal anthropoids, with increasing eccentricity, P-cells increase in size but maintain their distinctive morphology. In all anthropoids so far studied, P-cells have small to medium-sized cell bodies, thin axons, and small dendritic trees bearing a bushy and dense branching pattern. Note, in this figure, that the two cells have about the same dendritic-field size, but the Cebus P-cell is located 2.3 mm more peripherally than the Aotus P-cell. Scale bar = 50 µm.
The living New-World anthropoids differ from catarrhines in that they comprise diurnal and nocturnal species and a variety of color vision phenotypes (Jacobs, 1998) (Table 1). They represent useful 'animal models' to test hypotheses about the organization of primate visual pathways. Amongst the platyrrhines, there are 17 diurnal genera and one nocturnal genus, Aotus. Moreover, most platyrrhine species contain a mixed population of di- and trichromatic individuals (Mollon et al., 1984). This is due to the presence of just a single gene on the X-chromosome coding for photopigments sensitive to middle or long wavelengths (MWS and LWS photopigments, respectively). As a consequence, all males are dichromats, having the SWS- (short wavelength sensitive) cone, the photopigment of
which is encoded on chromosome 7, together with a single MWS/LWS-cone. Due to a polymorphism of the MWS/LWS-photopigment gene, there are three or more dichromatic phenotypes amongst males. Homozygous females are dichromats whereas heterozygous females are trichromats; the exact proportion of dichromatic and trichromatic females depends on the number and frequency of alleles of the MWS/LWS-photopigment genes. And again, due to gene polymorphism, there are several di- and trichromatic phenotypes amongst females.
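As an illustration of how allele number and frequency set these proportions (a sketch assuming random mating and Hardy–Weinberg proportions, an assumption not spelled out in the chapter): if the X-linked MWS/LWS locus carries n alleles with frequencies p_1, ..., p_n, the expected fraction of heterozygous, and hence trichromatic, females is

\[ f_{\mathrm{trichromatic\ females}} = 1 - \sum_{i=1}^{n} p_i^{2} , \]

so, for example, three equally common alleles would make about two thirds of the females trichromatic, while all males remain dichromatic.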
Fig. 3. M-cells from the peripheral retina of Cebus and Aotus (Silveira et al., 1994; Yamada et al., 1996a, 2001). A. Cebus M-off cell. B. Aotus M-on cell. In the Cebus and Aotus, similarly to other diurnal anthropoids, with increasing eccentricity, M-cells increase in size but maintain their distinctive morphology. In all anthropoids so far studied, M-cells have large cell bodies, thick axons, and large dendritic trees with a radial branching pattern. Aotus M- and P-cells are larger than Cebus M- and P-cells, respectively, at all retinal locations. Note, in this figure, that the Aotus M-cell is still larger than the Cebus M-cell in spite of being located 0.7 mm more centrally. Scale bar = 50 µm.

Table 1. A summary table of the New-World anthropoids. The division into Families and genera follows the recent review by Rylands et al. (2000)

Genus                 Species   Life style   Colour vision    M- and P-cell studies

Family Callitrichidae
Cebuella                  1     Diurnal      –                –
Mico                     14     Diurnal      –                –
Callithrix                6     Diurnal      Polymorphic      5–6, 8
Saguinus                 15     Diurnal      Polymorphic      –
Leontopithecus            4     Diurnal      Polymorphic      –
Callimico                 1     Diurnal      –                –

Family Cebidae
Saimiri                   5     Diurnal      Polymorphic      1
Cebus                     7     Diurnal      Polymorphic      2–4, 7–8, 11–12

Family Aotidae
Aotus                     8     Nocturnal    Monochromatic    2–3, 8–9, 13–14

Family Pitheciidae
Callicebus               19     Diurnal      Polymorphic      –
Pithecia                  5     Diurnal      –                –
Chiropotes                2     Diurnal      –                –
Cacajao                   2     Diurnal      –                –

Family Atelidae
Alouatta                  8     Diurnal      Trichromatic     –
Ateles                    6     Diurnal      Polymorphic      10
Lagothrix                 4     Diurnal      –                –
Oreonax                   1     Diurnal      –                –
Brachyteles               2     Diurnal      –                –

Polymorphic colour vision refers to a normal mixed population of dichromatic and trichromatic individuals (see text for explanation). Morphological studies: 1Leventhal et al. (1989), 2Lima et al. (1996), 3–4Silveira et al. (1994, 1998), 5Ghosh et al. (1996), 6Goodchild et al. (1996), 7–9Yamada et al. (1996a,b, 2001). Electrophysiological studies: 10Hubel and Wiesel (1960), 11–12Lee et al. (1996, 2000), 13Silveira et al. (2000), 14Saito et al. (2001).
As far as our present knowledge goes, there are at least two exceptions to this standard platyrrhine scheme of color vision (Jacobs, 1998). The Aotus is a monochromat, having a single MWS/LWS-photopigment gene on the X-chromosome and a single allele for this gene. In addition, the SWS-photopigment gene on chromosome 7 is
nonfunctional and there are no SWS-cones in the retina. On the other hand, Alouatta exhibits 'routine' trichromacy, having two different MWS/LWS-photopigment genes on the X-chromosome besides the SWS-photopigment gene on chromosome 7, all genes bearing a single allele. We have used two platyrrhine genera to investigate several comparative aspects of retinal organization: the diurnal capuchin-monkey, Cebus, which displays the standard mixture of di- and trichromats (Jacobs and Neitz, 1987); and the nocturnal and monochromatic owl-monkey, Aotus (Wikler and Rakic, 1990; Jacobs et al., 1993, 1996). These two New-World monkeys have similar eye size and retinal area, facilitating direct comparisons between retinal locations both in linear and angular metrics. Ganglion cells were retrogradely labeled by placing Biocytin in the optic nerve, behind the eyeball. Cell morphology was subsequently revealed by incubating the retina in ABC Vectastain followed by peroxidase histochemistry using diaminobenzidine as chromogen (Silveira et al., 1994). The morphology of M- and P-cells in the dichromatic male Cebus is qualitatively and quantitatively similar to that of trichromatic platyrrhines and catarrhines such as marmoset and macaque monkey (Silveira et al., 1994; Yamada et al., 1996a). Both cell classes increase in size with increasing distance from the foveal slope, conserving their distinct branching pattern at all eccentricities (Figs. 1–3). The mean M-cell dendritic-field size is about the same in the temporal, dorsal, and ventral retinal quadrants, increasing from 28 µm at 0.5 mm to 308 µm at 10–12 mm distance from the fovea, whereas in the nasal quadrant it is smaller, increasing from 25 µm at 0.5 mm to 216 µm at 10–12 mm distance from the fovea (Figs. 4 and 5). The mean P-cell dendritic-field diameter depends similarly on eccentricity, but is smaller than the mean M-cell dendritic field throughout the retina, ranging from 8 µm at 0.5 mm to 100 µm at 10–12 mm distance from the fovea in the temporal, dorsal, and ventral quadrants, and from 7 µm at 0.5 mm to 62 µm at 10–12 mm distance from the fovea in the nasal quadrant (Figs. 4 and 5).
Fig. 4. The size of M- and P-cell dendritic fields of Cebus and Aotus as a function of retinal eccentricity (Yamada et al., 1996a,b, 2001). A. Temporal cells. B. Nasal cells. Dendritic-field size was measured in drawings of selected M- and P-cells. The dendritic field was defined as the convex polygon circumscribing the tips of the distal dendrites. Cell eccentricity was corrected for shrinkage using the fovea–optic disk distance as reference. Measurements of dendritic-field area were performed using a bit pad connected to a microcomputer. The results were converted to diameters of circles with equivalent area and plotted as a function of eccentricity. The Aotus M- and P-cells are larger than their Cebus counterparts at similar eccentricities. In the Cebus and Aotus, M-cell dendritic-field sizes are larger than those of P-cells at any given eccentricity along both temporal and nasal retinal regions. The difference between M and P dendritic-field sizes ranges from 2.5- to 3.5-fold and 2.3- to 2.7-fold for the Cebus and Aotus retina, respectively. In the Cebus retina, dendritic-field diameter increases less steeply along the nasal retina, so that at comparable distance from the fovea, nasal M- and P-cells are smaller than temporal ones, more significantly for retinal eccentricities greater than 2 mm. In the Aotus retina, the nasotemporal asymmetry of M and P dendritic fields is less pronounced than that observed in the Cebus retina, and it attains a significant level only in the retinal periphery, at 10 mm from the fovea. Figure reproduced from Yamada et al. (2001) with the kind permission of Elsevier Science.
Fig. 5. Dendritic-field size of central M- and P-cells of Cebus and Aotus as a function of retinal eccentricity (Yamada et al., 1996a,b, 2001). A. Temporal cells. B. Nasal cells. Symbols as in Fig. 4: filled squares, Aotus M-cells; empty diamonds, Cebus M-cells; filled circles, Aotus P-cells; empty triangles, Cebus P-cells. As in other diurnal anthropoids, the dendritic fields of Cebus P-cells do not increase in size up to 1.75 mm and 1.25 mm from the fovea in the nasal and temporal regions, respectively. On the other hand, Aotus P-cells, Cebus M-cells, and Aotus M-cells increase in size steadily with increasing distance from the fovea. Aotus M-cells show the steepest change.
In Aotus, the fovea is absent or rudimentary, and the cone-to-rod ratio is much smaller than in Cebus and other diurnal platyrrhines and catarrhines
(Silveira et al., 1993, 2001a). However, the M- and P-cell morphology is, in most aspects, qualitatively similar to that found in diurnal anthropoids (Figs. 1–3). The main qualitative difference is that Aotus cells have thicker dendrites and lower dendritic-branching density than Cebus cells. This is better illustrated when Aotus and Cebus cells are matched for size (Figs. 1–3). Notwithstanding the qualitative similarities, there are important
quantitative differences. At similar eccentricities, Aotus M- and P-cells are larger than Cebus M- and P-cells (Figs. 4 and 5). In the central retinal region of Aotus, the mean M-cell dendritic-field diameter measures 36 µm, increasing at 10–12 mm of eccentricity to about 317 µm in the nasal and to about 381 µm in the other quadrants, respectively, whereas the mean P-cell dendritic-field diameter ranges from 14 µm in the central region to 131 µm in the nasal and 177 µm in the other quadrants at 10–12 mm of eccentricity (Yamada et al., 2001; see Figs. 4 and 5). In the central retina, the dendritic-field areas of M- and P-cells measured in Aotus are 3.9 and 5.9 times larger, respectively, than those of Cebus. This ratio decreases towards the retinal periphery to 1.9 and 3.5 for M- and P-cells, respectively. The size difference between Aotus and Cebus M- and P-cells is related to the relative density of cones and rods in the retina of these two platyrrhine species (Silveira et al., 1994; Yamada et al., 2001). Figure 6 shows the cone and rod convergence to M- and P-cells as a function of retinal eccentricity. Although there are large differences in dendritic-field area between Aotus and Cebus, the cone convergence to M- and P-cells is of the same magnitude. Similar cone convergences were found for other platyrrhines, such as Callithrix, and for catarrhines, such as humans and macaques (Goodchild et al., 1996). This finding indicates that during development, ganglion cells from different primate species adjust their dendritic size to collect signals from similar numbers of cones. Consistent with this hypothesis is the fact that ganglion cells and cones are generated during the first phase of retinal neurogenesis, while rods appear at a later stage (La Vail et al., 1991; Yamada et al., 2001).
Fig. 6. Photoreceptor convergence to M- and P-cells as a function of retinal eccentricity in the Cebus and Aotus retina (Yamada et al., 2001). A–B. Cone convergence to temporal and nasal cells, respectively. C–D. Rod convergence to temporal and nasal cells, respectively. The number of cones and rods per ganglion cell was obtained by multiplying the ganglion-cell dendritic-field area by the photoreceptor density (Yamada et al., 2001). Cone convergence to M- and P-cells is similar for the Cebus and Aotus. Cebus M-cells: 20 cones/cell at 1 mm from the fovea, 500 cones/cell at the retinal periphery. Aotus M-cells: 17 cones/cell in the fovea, 40–50 cones/cell at 1 mm from the fovea, 340–420 cones/cell at the retinal periphery. Cebus P-cells: 1–2 cones/cell at 1 mm from the fovea, 60 cones/cell at the retinal periphery. Aotus P-cells: 2.5 cones/cell in the fovea, 2.5–3 cones/cell at 1 mm from the fovea, 60–110 cones/cell at the retinal periphery. Rod convergence to M- and P-cells is higher in Aotus. Aotus M-cells: 410 rods/cell in the fovea, 1,600–2,100 rods/cell at 1 mm from the fovea, 13,900–16,200 rods/cell at the retinal periphery. Cebus M-cells: 70–80 rods/cell at 1 mm from the fovea, 4,700–6,500 rods/cell at the retinal periphery. Aotus P-cells: 60 rods/cell in the fovea, 110–130 rods/cell at 1 mm from the fovea, 2,400–4,400 rods/cell at the retinal periphery. Cebus P-cells: 4–7 rods/cell at 1 mm from the fovea, 570–880 rods/cell at the retinal periphery. Figure reproduced from Yamada et al. (2001) with the kind permission of Elsevier Science.
An important characteristic of the P-pathway is the existence of very small P-cells that are connected to single-cone midget bipolar cells in the central 2 mm around the fovea (Polyak, 1941; Boycott and Dowling, 1969; Kolb and DeKorver, 1991). This one-to-one neuronal circuitry is thought to form the basis of red–green color opponency. A single MWS- or LWS-cone provides the signal for the P-cell receptive-field center, whereas the surround is either driven by signals selectively coming from the other cone class (Lee et al., 1998), or by a mixture of
MWS- and LWS-cones. However, recent physiological evidence points to a more complex reality than this simple anatomical picture would suggest (McMahon et al., 2000). To test whether similar pathways are present in dichromats and monochromats, Cebus and Aotus
bipolar cells were stained by placing the lipophilic carbocyanine dye DiI in fixed retinas (Silveira et al., 1998, 2001b). As in trichromatic catarrhines, in the dichromatic male Cebus and the monochromatic Aotus, the majority of midget bipolar cells, found in the central 2 mm of eccentricity, receive input from
a single cone, and the sizes of their axon terminals match the sizes of P-cell dendritic fields in the same region. These findings support the view that central P-cells of platyrrhines, like those of catarrhines (all of which are diurnal and trichromatic), receive input from single midget bipolar cells, which in turn receive input from single MWS/LWS-cones, irrespective of whether the species is diurnal or nocturnal and mono-, di-, or trichromatic. The results are consistent with the idea that a P-pathway with one-to-one connectivity was present in the anthropoid ancestor before the divergence between catarrhines and platyrrhines (Mollon and Jordan, 1988). It will be interesting to ascertain whether M- and P-cells are also present in prosimians. Recently, a genetic investigation of 20 species, representing the major prosimian lineages, indicated that several forms of color vision might be found among them (Tan and Li, 1999). Some species are monochromatic, others dichromatic, and some others are potentially trichromatic, similar to the polymorphic color vision of platyrrhines. Little is known about the retinal ganglion cell morphology and physiology of prosimians. The greater bushbaby Otolemur, a nocturnal and monochromatic prosimian, which has an MWS/LWS-cone and no SWS-cones, also has M- and P-ganglion cells (Yamada et al., 1998). Its central P-cells receive signals from about five cones, a cone-to-P-cell convergence that is higher than that found in diurnal and nocturnal anthropoids but still lower than that of central cat b-cells, which receive input from 30 cones per cell. While little is known of the M- and P-pathways in prosimians, an analysis of axon diameters across the depth of the optic tract has shown that Otolemur has an organization identical to that found in humans and other anthropoids. Specifically, the deeper parts of the optic tract contain purely medium-caliber axons, while the more superficial parts of the tract contain both fine- and coarse-caliber axons (Reese, 1996). While this sheds little light on the morphology of their ganglion cells, it strongly supports the view that the fundamental categories of ganglion cell classes (defined anatomically and embryologically, rather than physiologically) are conserved across primates.
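As a numerical footnote to the convergence data of Fig. 6, the procedure of Yamada et al. (2001) amounts to multiplying dendritic-field area by local photoreceptor density. The short Python sketch below reproduces only that arithmetic; the dendritic-field diameters and densities in it are round, hypothetical numbers of the right order of magnitude, not values taken from the original measurements.

```python
import math

def convergence(dendritic_diameter_um: float, receptor_density_per_mm2: float) -> float:
    """Estimate photoreceptors per ganglion cell as dendritic-field area x density
    (the procedure described for Fig. 6; Yamada et al., 2001)."""
    area_mm2 = math.pi * (dendritic_diameter_um / 2000.0) ** 2  # radius converted from um to mm
    return area_mm2 * receptor_density_per_mm2

# Illustrative (hypothetical) values of the right order of magnitude:
# a central P-cell ~14 um across in a region with ~15,000 cones/mm^2,
# and a peripheral M-cell ~380 um across in a region with ~4,000 cones/mm^2.
print(round(convergence(14, 15_000), 1))   # about 2 cones per central P-cell
print(round(convergence(380, 4_000)))      # roughly 450 cones per peripheral M-cell
```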
The correlation between morphology and physiology

Physiological studies demonstrated that the primate retina has a variety of functional classes of ganglion cells (Hubel and Wiesel, 1960; Gouras, 1968; de Monasterio and Gouras, 1975; de Monasterio et al., 1975a,b; de Monasterio, 1978a,b,c; Kaplan and Shapley, 1986; Lee et al., 1988, 1989a,b,c, 1990, 1994; Purpura et al., 1988, 1990; Kremers et al., 1993; Croner and Kaplan, 1995). The correlation between morphology and physiology remained inferential until the development of an in vitro retino-choroidal preparation, which allows simultaneous light stimulation and intracellular recording and labeling of ganglion cells. Using this procedure, Dacey and Lee (1994) were able to confirm that the phasic, broad-band ganglion cells of electrophysiological studies were the M-cells and that the tonic, red–green color-opponent ganglion cells were the P-cells. In addition, they also confirmed that dendrites of off- and on-center varieties of both cell classes branch in the outer and inner halves of the inner plexiform layer, respectively. Other ganglion cell classes in the primate retina have been described using morphological criteria (Perry and Cowey, 1984; Rodieck and Watanabe, 1993; Kolb et al., 1992). One of them is a small-field bistratified cell and others comprise a heterogeneous group of wide-field cells. Electrophysiologically, there are also several other classes, including blue–yellow opponent cells, very phasic cells, and cells responsive only to moving stimuli (de Monasterio and Gouras, 1975; de Monasterio et al., 1975a; de Monasterio, 1978a). By intracellular labeling of physiologically identified blue-on/yellow-off color-opponent cells, Dacey and Lee (1994) were able to show that they were the small-field bistratified cells previously described by morphological techniques. Other ganglion cell classes are currently being studied with similar techniques in order to establish the correspondence between physiology and morphology. Using tritan stimuli, blue-off/yellow-on color-opponent cells were identified in the macaque LGN (Valberg et al., 1986a). There is recent evidence that such cells correspond to a class of wide-field ganglion cells in the retina (Dacey et al., 2002).
Spatial properties of M- and P-cells

How M- and P-cells share the duties of visual information encoding may be understood through a thorough characterization of their physiological properties. In addition, it is relevant to know how the physiological properties of M- and P-cells from dichromatic and monochromatic, diurnal and nocturnal primates differ from the well-studied M- and P-cells of trichromatic diurnal anthropoids. Physiological characterization is accomplished by recording ganglion-cell action potentials from axons in the optic nerve (e.g. Hubel and Wiesel, 1960) or from the cell soma directly in the retina (e.g. Gouras, 1968). In addition, ganglion cell activity can be indirectly monitored by recording the presynaptic potential from LGN cells (e.g. Kaplan and Shapley, 1986). M- and P-cells are devices that code for a visual intensity dimension in both space and time. What is coded is the absolute level of light intensity and its change in time and space, expressed by its contrast. The responses of M- and P-cells are influenced by spatial and temporal contrast, enabling the identification of edges and temporal changes. However, P-cells have a sustained component in their response, which enables them to respond to some degree to the level of light intensity. This sustained component may not only be important for color and brightness vision but also for certain aspects of form perception (Kremers, 1999). Do these two cell classes differ in the size of the spatial window that they analyze at a given visual field location? Receptive-field shape and size are usually quantified using different techniques, such as thresholds to small spots of light across the receptive field (de Monasterio and Gouras, 1975), measurement of area-threshold curves (Crook et al., 1988), and stimulation of the cell's receptive field with bipartite fields sinusoidally modulated in counterphase (Kremers and Weiss, 1997; Lee et al., 1998). In addition, assuming linearity, it is also possible to measure cell contrast sensitivity as a function of spatial frequency and then calculate receptive field profiles by Fourier transformation of the frequency response (Derrington and Lennie, 1984; Crook et al., 1988; Croner and Kaplan, 1995). The assumption that M- and P-cells respond to visual stimuli as
linear filters is only valid within certain limits, but it is possible to obtain useful values for M- and P-cell properties in the spatial domain from measurements in the spatial frequency domain (see Kremers et al., 2001, for a discussion of some aspects of this problem). As mentioned above, M- and P-cells differ from each other in dendritic field size, M-cells being larger than P-cells. For a given ganglion cell class, the receptive-field center sizes are generally proportional to the sizes of the dendritic fields at corresponding eccentricity (Peichl and Wässle, 1979). In macaques, it has been shown that M- and P-cells have approximately circular receptive fields with a center–surround organization (e.g. de Monasterio and Gouras, 1975; Passaglia et al., 2002). In this regard, they are similar to a variety of ganglion cells found in other mammals, such as cat a- and b-cells. In addition, it has also been shown that the receptive-field center sizes of M- and P-cells increase with eccentricity, except for P-cells in the first 10° of visual field (de Monasterio and Gouras, 1975; Derrington and Lennie, 1984). In this region, P-cells show a large degree of scatter and little change with eccentricity. Despite the anatomical difference in dendritic field diameter, physiological studies have shown a large degree of overlap between the receptive-field center sizes of M- and P-cells, especially in the central retina (Derrington and Lennie, 1984; Lee et al., 1998; Kremers and Weiss, 1997; Kremers et al., 2001; Kilavik et al., 2003). In the first 10° of visual field, the P-cell center sizes are larger than expected if they are solely or largely driven by a single cone. This is partly due to image blur imposed by the eye optics (Lee et al., 1998), but results obtained by measuring P-cell receptive fields while bypassing the eye's optics, using interference fringes generated directly in the retina, are only partially consistent with this explanation (McMahon et al., 2000). Very little is known about ganglion cell receptive-field size in platyrrhines beyond the very first observations in Ateles (Hubel and Wiesel, 1960). Some measurements have been made on the magno- and parvocellular relay neurons of the Saimiri, Callithrix, and Aotus LGN (Usrey and Reid, 2000; Xu et al., 2001; Kremers and Weiss, 1997; Kremers et al., 2001; Kilavik et al., 2003). The results are qualitatively
similar to those obtained in Macaca, but M- and P-cells from Aotus have consistently larger receptive fields than their diurnal counterparts (Usrey and Reid, 2000; Xu et al., 2001).
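The receptive-field reconstruction mentioned above (Fourier transformation of the spatial-frequency response) is commonly carried out with a difference-of-Gaussians (DoG) center–surround model, whose frequency response is itself a difference of Gaussians. The sketch below illustrates only that relationship; the center and surround gains and radii are invented values chosen to give a plausible band-pass shape, not parameters fitted to any of the cells discussed here.

```python
import numpy as np

def dog_frequency_response(sf_cpd, kc, rc, ks, rs):
    """Frequency response of a difference-of-Gaussians receptive field at
    spatial frequency sf_cpd (cycles/deg). kc, ks are center/surround peak
    sensitivities and rc, rs their Gaussian radii (deg). Because the Fourier
    transform of a Gaussian is a Gaussian, the spatial profile and the
    frequency response have the same functional form."""
    center = kc * np.pi * rc**2 * np.exp(-(np.pi * rc * sf_cpd) ** 2)
    surround = ks * np.pi * rs**2 * np.exp(-(np.pi * rs * sf_cpd) ** 2)
    return center - surround

# Hypothetical parameters: a small, strong center and a larger, weaker surround.
sf = np.array([0.1, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
print(np.round(dog_frequency_response(sf, kc=1000.0, rc=0.05, ks=10.0, rs=0.4), 2))
# Band-pass shape: the surround attenuates low spatial frequencies and the
# finite center size attenuates high ones.
```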
Temporal properties of M- and P-cells

Spatial properties of M- and P-cells change dramatically with eccentricity and it has been suggested that there should also be some change in their temporal properties (Silveira, 1996; Silveira and de Mello Jr., 1998). To show this, the temporal response should be measured for both cell classes at all locations of the visual field. However, until recently, only results for M- or P-cells studied within a restricted range of eccentricities have been reported (Lee et al., 1990; Purpura et al., 1990; Lee et al., 1994; Bernadete and Kaplan, 1997a,b, 1999a,b). More recent results indeed strongly indicate that the temporal contrast gain and the critical flicker fusion frequency of M- and P-cells in the retina and the LGN increase with increasing cell size and increasing retinal eccentricity (Solomon et al., 1999, 2002; Kremers et al., 2001; Kilavik et al., 2003). One of the first achievements of electrophysiological recordings performed in the primate retina was the establishment of the phasic/tonic dichotomy for M- and P-cells by probing them with temporal step functions (Gouras, 1968). M-cells discharge transiently, whereas P-cells have a sustained component in their discharges to maintained stimuli (Gouras, 1968; de Monasterio and Gouras, 1975). The difference between them can be estimated by calculating the tonic/phasic index (Purpura et al., 1990). Trichromatic and dichromatic Cebus, as well as monochromatic Aotus, have M- and P-cells that can be distinguished by the tonic/phasic index (Lee et al., 2000; Silveira et al., 2000). Figure 7 illustrates examples taken from a dichromatic Cebus. The M- and P-cell impulse response functions can be estimated with temporal pulses of different durations and intensities (Lee et al., 1994), or with spots, annuli, or gratings the contrast of which is modulated according to m-sequences (Bernadete and Kaplan, 1997a, 1999b). In addition, the temporal frequency response of M- and P-cells can be measured and then converted to the time domain by Fourier transformation (Purpura et al., 1990; Lee et al., 1994; Bernadete and Kaplan, 1999a). At a given eccentricity, M-cells have a shorter temporal response than P-cells (Solomon et al., 2002), and thus M-cells should signal a time event with better precision than P-cells (Silveira, 1996; Silveira and de Mello Jr., 1998).
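The tonic/phasic distinction illustrated in Fig. 7 is quantified by the tonic/phasic index of Purpura et al. (1990), which compares the maintained part of the pulse response with the initial transient. The sketch below shows one simple way such an index could be computed from a PSTH; the response windows, the synthetic PSTHs, and the absence of any baseline correction are assumptions made for illustration and do not reproduce the original procedure.

```python
import numpy as np

def tonic_phasic_index(psth, bin_ms=4.0, stim_on_ms=0.0, stim_off_ms=400.0):
    """Crude tonic/phasic index from a PSTH (impulses/s per bin): mean firing
    late in the 400 ms pulse (sustained) divided by the peak firing shortly
    after stimulus onset (transient). A value near 1 indicates a tonic
    (P-like) response; a value near 0 a phasic (M-like) one. The window
    choices are illustrative, not those of Purpura et al. (1990)."""
    t = np.arange(len(psth)) * bin_ms
    transient = psth[(t >= stim_on_ms) & (t < stim_on_ms + 50.0)].max()
    sustained = psth[(t >= stim_off_ms - 100.0) & (t < stim_off_ms)].mean()
    return sustained / transient if transient > 0 else np.nan

# Toy PSTHs (impulses/s), 4 ms bins over an 800 ms sweep, 400 ms pulse:
t = np.arange(0, 800, 4.0)
phasic = 10 + 150 * np.exp(-t / 30.0) * (t < 400)   # brisk onset burst, little maintained firing
tonic = 10 + 60.0 * (t < 400)                        # firing maintained throughout the pulse
print(round(tonic_phasic_index(phasic), 2), round(tonic_phasic_index(tonic), 2))
```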
Contrast sensitivity of M- and P-cells

Contrast is a critical intensity parameter for the visual system. It is a comparative measure of the luminance of adjacent spatial regions or of successive instants in time. To measure contrast, the visual system at least partially adapts to the environmental mean luminance through mechanisms in the photoreceptors (Yau, 1994), as well as at the postreceptoral level. These postreceptoral mechanisms include compressive nonlinearities of contrast response functions (in the spatial and temporal domains) and high-pass spatial filtering by lateral inhibition (Shapley and Enroth-Cugell, 1984; Barlow and Levick, 1976; Laughlin, 1981). However, light adaptation in P-cells is incomplete, which also permits them to signal steady levels of luminance and chromaticity (Lee et al., 1990). M- and P-cells differ considerably in their luminance contrast sensitivity (Kaplan and Shapley, 1986; Purpura et al., 1988; Lee et al., 1990, 1994; Kremers et al., 1993). This finding was first demonstrated in LGN neurons and later extended to their retinal afferents. M- and P-cell contrast sensitivity can be evaluated by stimulating the cell's receptive field with drifting sine-wave gratings (Kaplan and Shapley, 1986; Purpura et al., 1988), temporal sine-wave modulation (Lee et al., 1990; Kremers et al., 1993), and light pulses of different durations (Lee et al., 1994). The cell response amplitude, expressed in impulses per second, is measured as a function of contrast. M-cells are about eight to ten times more sensitive than P-cells, but their responses saturate at a relatively low contrast level, whereas P-cells are relatively insensitive to contrast but their responses show little saturation when contrast is increased. These differences in response amplitude as a function of contrast between M- and P-cells are observed at different retinal eccentricities, retinal illuminance levels, sizes of spatial targets, durations of
light pulses, and at most of the spatial and temporal frequencies to which both cell classes are sensitive. The differential sensitivity of M- and P-cells to luminance contrast is also observed across species and in different color vision phenotypes of a given species (Lee et al., 1996, 2000; Silveira et al., 2000). M- and P-cells of dichromatic Cebus are very similar in contrast sensitivity to their counterparts in macaques and trichromatic Cebus (Fig. 8); the main difference among them is that P-cells from dichromatic Cebus are color blind (Lee et al., 1996, 2000). M- and P-cells of monochromatic Aotus are less sensitive to contrast than M- and P-cells of diurnal anthropoids at high temporal frequencies, but they still differ from each other, M-cells being more contrast sensitive than P-cells (Silveira et al., 2000). The reflectance of objects in natural scenes creates spatiotemporal patterns ranging over a wide band of contrast values (Laughlin, 1983). Thus it is reasonable to suppose that the M- and P-channels in all primates are tuned to different ranges of contrast, and that their presence is therefore of value for the analysis of visual information (Yamada et al., 1996a; van Hateren et al., 2002).
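The saturating M-cell and quasi-linear P-cell behavior described above (and fitted with Naka–Rushton functions in Fig. 8) can be summarized by the hyperbolic function R(C) = Rmax C / (C + C50). The sketch below uses invented parameters chosen only so that the M-like curve has roughly an order of magnitude higher contrast gain; the numbers are not fits to any of the recordings cited.

```python
import numpy as np

def naka_rushton(contrast, r_max, c50):
    """Naka-Rushton contrast-response function: R(C) = Rmax * C / (C + C50).
    A small C50 gives high contrast gain and early saturation (M-like);
    a large C50 gives lower gain and a nearly linear response (P-like)."""
    return r_max * contrast / (contrast + c50)

contrast = np.array([0.05, 0.1, 0.2, 0.4, 0.8])        # Michelson contrast
m_like = naka_rushton(contrast, r_max=80.0, c50=0.10)   # hypothetical M-cell parameters
p_like = naka_rushton(contrast, r_max=80.0, c50=1.00)   # hypothetical P-cell parameters
print(np.round(m_like, 1))  # rises steeply, then saturates
print(np.round(p_like, 1))  # rises slowly and roughly linearly over this range
```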
The role of M- and P-cells in achromatic and chromatic vision

In 1878, Ewald Hering pointed out that human color vision is not characterized by three fundamental hue sensations but by four: blue, green, yellow, and red. To explain perceptual opponency, he postulated that blue and yellow cause antagonistic effects in one color mechanism, whereas green and red cause opposing effects in another mechanism. Human visual experience results from the interaction of three opponent channels: black–white, blue–yellow, and red–green. Although the long-standing inconsistency between the Young–Helmholtz trichromatic theory (Young, 1802; von Helmholtz, 1867) and the Hering color-opponent theory (Hering, 1878) can be partly resolved if color-opponency reflects postreceptoral neuronal processing of cone signals, it remains puzzling why the Hering unique hues do not precisely correspond to the physiological cone-opponent mechanisms (Mollon and Jordan, 1997).
In contrast to nonmammalian vertebrates, the outer retina of primates and other mammals is not the site of color-opponent processes. Primate horizontal cells exhibit some degree of cone selectivity, but they respond to light of all wavelengths with the same polarity (Dacey et al., 1996). Thus, the search for the initial site of postreceptoral mechanisms of color-opponency has to focus on the selective connections between cones and bipolar cells, in conjunction with interactions between bipolar cells, amacrine cells, and ganglion cells in the inner plexiform layer. The first studies on retinal ganglion cells (Gouras, 1968; de Monasterio and Gouras, 1975; de Monasterio et al., 1975a,b; de Monasterio, 1978a,b) and their LGN relay neurons (De Valois et al., 1958; Wiesel and Hubel, 1966) demonstrated that trichromatic primates, such as macaques, have color-opponent neurons, which can be divided into two broad categories. One group is numerous, comprising cells that respond with one polarity in the red, with opposite polarity in the green, and have a null point in the yellow region of the spectrum. The other group is less numerous, comprising cells that respond with one polarity in the blue, with opposite polarity in the yellow, and have a null point in the green region of the spectrum. The chromatic specificity of retinal ganglion cells can be tested in various ways, for example with circular or annular stimuli whose size, intensity, and wavelength are varied. It was possible to relate the characteristics of receptive-field spatial organization to the spectral response of retinal ganglion cells and to their tonic/phasic temporal properties. Much was already known of the response of LGN neurons to the same kinds of stimuli (De Valois et al., 1958; Wiesel and Hubel, 1966), making it possible to hypothesize about the projection of the main classes of retinal ganglion cells to the different LGN layers. The most numerous ganglion cell class comprises the tonic, red/green color-opponent cells (Fig. 9). Their receptive fields have a center–surround antagonistic organization, the center giving an on- or off-response and the surround responding with opposite polarity. Moreover, the receptive-field centers and surrounds receive signals from different cone classes, either MWS or LWS, resulting in red–green color-opponency. Combining the chromatic and spatial characteristics
of their receptive fields, they can be grouped into four subclasses: center red-on/surround green-off; center green-on/surround red-off; center red-off/surround green-on; and center green-off/surround red-on. Based on the similar physiological properties of tonic, red/green color-opponent ganglion cells and the parvocellular relay neurons of the LGN, and on the labeling of Polyak's midget ganglion cells by HRP injections in the parvocellular LGN layers, it was proposed that the midget cells, the P-cells, correspond to the red/green color-opponent cells (Leventhal et al., 1981; Perry et al., 1984). As mentioned above, this correspondence was later directly demonstrated by in vitro recording of ganglion cells followed by intracellular injection of Neurobiotin (Dacey and Lee, 1994). A comparison between psychophysical detection thresholds and ganglion cell responses to combined chromatic and achromatic modulation, plotted in an MWS/LWS-cone space, has shown that the properties of P-cells were sufficiently linear and homogeneous to support a linear, red–green opponent, chromatic mechanism (Kremers et al., 1992; Lee et al., 1993a). What is the retinal circuitry that makes P-cells red–green color-opponent and how did they evolve from some cell class in a primitive dichromatic ancestor? P-cells receive input from midget bipolar cells which, over most of the retina of macaques and other catarrhines, have single dendritic clusters connected to single MWS- or LWS-cones. In the central 2 mm of the macaque retina (about 10° of visual field), P-cell dendritic fields are very small and connect with single midget-bipolar axon terminals, and thence to single MWS- or LWS-cones. Thus, in this region, P-cell receptive-field centers necessarily receive cone-specific inputs (but see McMahon et al., 2000). P-cells could have cone-specific or cone-mixed surrounds (Lennie et al., 1991). Although there is evidence that the P-cell receptive-field surrounds are specific (Reid and Shapley, 1992; Lee et al., 1998), the issue is not fully resolved and the neuronal circuitry that makes this possible is not known. Human red–green color vision becomes quickly degraded with visual field eccentricity (Mullen and Kingdom, 1996). Between about 10° and 50° of visual field, most of the midget bipolars are connected to single MWS- or LWS-cones. However, at these eccentricities, P-cell dendritic fields are
connected to an increasing number of midget-bipolar axon terminals. Nevertheless, in this region, about 64% of P-cells show overt red–green color-opponency and a degree of red–green color-opponency can be demonstrated in as many as 80% (Martin et al., 2001). These results suggest that during development there may be a mechanism which results in specific connections to midget bipolars that are themselves connected to either MWS- or LWS-cones, thus obtaining selective cone-specific inputs (Martin et al., 2001). Finally, in the far peripheral retina, at 50° of eccentricity or more, the majority of midget bipolars are connected to two or more cones and there is no indication that they select either MWS- or LWS-cones. At this eccentricity, in vitro recordings show that P-cells are not color-opponent (Dacey, 1999). The P-cells can support only the red–green color-opponent channel of primate chromatic vision, since these cells have no SWS-cone input. Another, less numerous, ganglion cell class of the macaque retina comprises blue-on/yellow-off cells without clear center–surround spatial organization of the receptive fields (de Monasterio and Gouras, 1975; de Monasterio et al., 1975a; de Monasterio, 1978c; Lee et al., 1988, 1989a,b). They correspond morphologically to the small-field bistratified cells (Dacey and Lee, 1994) and project to the koniocellular layers of the LGN (Martin et al., 1997). The sizes of their receptive fields are similar to those of M-cells (de Monasterio and Gouras, 1975) and correlate with their dendritic field sizes (Dacey, 1993). Ganglion cells that have an inhibitory SWS-cone input are required to fully support the blue–yellow color-opponent channel of primate chromatic vision. The presence of such cells has been reported (de Monasterio and Gouras, 1975; de Monasterio et al., 1975a; Valberg et al., 1986a; Lee et al., 1989a,b), but their morphology remained elusive until recently, when they were associated with a class of wide-field ganglion cells (Dacey et al., 2002). The achromatic channel of primate vision is supported at least partially by cells that are insensitive to color, since there are many conditions, both physiological and pathological, in which most achromatic stimulus detection persists in the absence of color sensation (Merigan, 1989; Lynch et al., 1992).
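The division of labor between additive (broad-band) and opponent (red–green) cone combinations described in this section can be made concrete with a toy linear cone-weighting model. The code below is purely schematic: the unit weights and cone-contrast values are arbitrary, and real M- and P-cell receptive fields of course also have spatial structure that is ignored here.

```python
def cell_response(l_contrast, m_contrast, w_l, w_m):
    """Linear cone-combination model: response = w_l*L + w_m*M (arbitrary units)."""
    return w_l * l_contrast + w_m * m_contrast

# Stimuli expressed as (LWS-cone contrast, MWS-cone contrast):
luminance = (0.5, 0.5)    # both cone classes modulated together
red_green = (0.5, -0.5)   # cone classes modulated in counterphase

m_like = dict(w_l=+1.0, w_m=+1.0)   # additive cone input (broad-band, M-like)
p_like = dict(w_l=+1.0, w_m=-1.0)   # opponent cone input (red-green, P-like)

for name, cell in [("M-like", m_like), ("P-like", p_like)]:
    print(name,
          "luminance:", cell_response(*luminance, **cell),
          "red-green:", cell_response(*red_green, **cell))
# The M-like (additive) cell responds to the luminance stimulus and nulls for
# the opponent one; the P-like cell does the reverse, which is the logic behind
# comparing chromatic and achromatic responses as in Fig. 9.
```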
The early studies on macaques described a class of retinal ganglion cells and LGN neurons with center–surround receptive fields that received additive signals from MWS- and LWS-cones in both center and surround (Wiesel and Hubel, 1966; Gouras, 1968; de Monasterio and Gouras, 1975; de Monasterio, 1978a,b). These cells give responses of the same polarity, either on or off, to all light wavelengths. Based on the similarity of physiological properties between broad-band ganglion cells and the magnocellular relay neurons of the LGN, as well as on the results of HRP injections in the magnocellular LGN layers, which labeled Polyak's parasol ganglion cells, it was proposed that parasol cells correspond to broad-band cells (Leventhal et al., 1981; Perry et al., 1984). This was later confirmed by ganglion cell intracellular recording and labeling (Dacey and Lee, 1994). Shapley and Perry (1986) proposed that the broad-band, magnocellular-projecting ganglion cells be called M-cells. M-cells have been found to support psychophysical tasks such as heterochromatic flicker photometry (Lee et al., 1988), the minimally distinct border (Kaiser et al., 1990; Valberg et al., 1992), and spatial position in vernier acuity (Lee et al., 1993b, 1995). This is consistent with their important role in achromatic vision. The positional signal from single M-cells is more precise than that from P-cells, especially at lower contrasts, making M-cells the likely candidates to support hyperacuity. Besides M-cells, other ganglion cell classes may also be important for primate achromatic vision. In other highly visual mammals, two or more ganglion cell classes, such as the cat a- and b-cells, appear to share the duties of solving complex achromatic tasks needed for animal behavior. The P-cells are the most numerous ganglion cell class, corresponding to 80% of all ganglion cells in the macaque retina. These cells respond to achromatic stimuli at medium and high contrast levels. In addition, their response does not saturate, or exhibits only a small degree of saturation, when contrast is increased. M-cells, on the other hand, are very sensitive to contrast, but their response saturates at a relatively low contrast level. Thus, it is possible that M- and P-cells code low and high contrast levels, respectively, and work synergistically to support achromatic vision at intermediate levels of contrast. In particular, P-cells may contribute to achromatic
lightness perception (Valberg et al., 1986b) or discrimination of suprathreshold targets (Pokorny and Smith, 1997). Finally, although P-cells are much more sensitive to chromatic than achromatic contrast, in the natural environment the range of achromatic contrasts is much greater than that of chromatic contrasts, and most information in P-cell spike trains is associated with achromatic content (van Hateren et al., 2002). Platyrrhines provide an opportunity to study the contribution of M- and P-cells to the chromatic and achromatic aspects of vision, since, in a single species, there are trichromatic and dichromatic individuals bearing a variety of color vision phenotypes. Anatomical studies have shown that the retina of dichromatic Cebus has similar ganglion cell densities and proportions of M- and P-cells identical to those observed in trichromatic anthropoids (Silveira et al., 1989; Lima et al., 1996; Yamada et al., 1996a). To investigate dichromatic P-cells, ganglion cell responses were recorded in trichromatic and dichromatic Cebus and compared with data obtained from macaques (Lee et al., 1996, 2000). The color vision phenotype of each animal was determined by electrophysiological procedures (see the next section) and confirmed by genetic analysis of blood and liver samples. Despite some differences in quantitative details, results from M- and P-cells of trichromatic Cebus strongly resemble those from trichromatic Macaca. In particular, P-cells respond to chromatic stimuli with vigorous red–green opponency (Fig. 9). More importantly, P-cells from dichromatic Cebus appear to be color-blind versions of those from trichromatic Cebus and Macaca (Figs. 7 and 8). The presence of similar numbers of P-cells in dichromats and trichromats, and the observation that they respond in a very similar manner to stimuli, except for the absence of color-opponency in the dichromatic P-cells, might suggest that P-cells have a role to play in achromatic vision. Furthermore, the presence of P-cells in primates with different life styles, different cone-to-rod ratios, and different numbers of cone classes (Silveira et al., 1994; Yamada et al., 1996a,b, 1998, 2001) may indicate that the original P-cells evolved for the needs of spatiotemporal achromatic vision and became color-coded when evolution provided two MWS/LWS
Fig. 7. Temporal properties of dichromatic Cebus M- and P-cells. Cells were stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated in phase to generate 400 ms square pulses with different Weber contrasts (top row in each set). Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. The bottom rows in each set depict the responses of an M-on cell and a P-off cell as peristimulus time histograms (PSTH) (total recording = 6 s; sweep duration = 800 ms; bin size = 4 ms; vertical bar = 100 impulses/s). The tonic/phasic index (TPI) was calculated according to Purpura et al. (1990) and the values are given above each cell response histogram. The M-cell response is phasic and more sensitive to contrast, whereas the P-cell response is tonic and less sensitive to contrast.
Fig. 8. Temporal contrast sensitivity of dichromatic Cebus M- and P-cells. Cells were stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated in phase to generate sine-wave temporal modulation at 9.76 Hz with different Michelson contrasts. Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. For each stimulus condition a PSTH was recorded, a Fourier analysis was performed, and the first harmonic amplitude was extracted and plotted as a function of stimulus contrast. Naka–Rushton functions were fitted to the data points. As contrast increases, both M- and P-cell responses increase, but they display very different behavior. The M-cell response is very sensitive to contrast but saturates at intermediate and high contrast levels. The P-cell response is relatively insensitive to contrast but exhibits little saturation when contrast is increased.
Fig. 9. Chromatic and achromatic responses of a trichromatic Cebus P-cell. The cell was stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated to generate 400 ms square pulses with different chromatic and achromatic Weber contrasts. Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. The cell response is shown as a PSTH (total recording = 6 s; sweep duration = 800 ms; bin size = 4 ms; vertical bar = 100 impulses/s). The top three rows illustrate the P-cell chromatic response to greenward and redward increments. The response reveals that this P-cell is a center green-on/surround red-off cell. The bottom two rows illustrate the achromatic response to luminance pulses. P-cells from trichromatic female Cebus are generally less sensitive to achromatic than to chromatic contrast. P-cells from dichromatic male or female Cebus are color blind.
cones (Mollon and Jordan, 1988; Kremers, 1999; Kremers et al., 1999). This is in agreement with the idea that the appearance of trichromacy in mammals was a relatively recent phylogenetic event, which probably occurred in a primate ancestor (Tan and Li, 1999) that already possessed a P-pathway. P-cell cone selectivity might be due simply to the 'hit and miss' mechanism proposed by Shapley and Perry (1986), but, as mentioned above, it is probable that some form of reinforcement of synaptic connections, operating in trichromatic animals, strengthens P-cell color-opponency during development (Martin et al., 2001).
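The 'hit and miss' idea can be illustrated with a small Monte Carlo sketch: if receptive-field subregions draw their cones at random from the mosaic, overt opponency survives only in cells whose random draw happens to be dominated by one cone type. Everything in the code below is an assumption made for illustration (a 1:1 L:M mosaic, a 0.8 dominance criterion, the candidate pool sizes); it is not a model taken from Shapley and Perry (1986) or Martin et al. (2001).

```python
import random

random.seed(1)

def fraction_strongly_biased(pool_size, p_l=0.5, bias=0.8, trials=10_000):
    """Fraction of simulated cells whose randomly drawn cone pool is dominated
    (fraction >= bias, or <= 1 - bias) by one cone type. With a single-cone
    pool every cell is pure; larger random pools are rarely dominated by one
    type, so chance ('hit and miss') wiring alone yields less overt opponency."""
    biased = 0
    for _ in range(trials):
        n_l = sum(random.random() < p_l for _ in range(pool_size))
        frac_l = n_l / pool_size
        if frac_l >= bias or frac_l <= 1.0 - bias:
            biased += 1
    return biased / trials

for pool in (1, 3, 6, 12):   # hypothetical numbers of cones feeding a subregion
    print(pool, round(fraction_strongly_biased(pool), 2))
```

Chance wiring of this kind predicts that overt opponency should become rarer as pooling increases, so the 64–80% opponency reported at mid-peripheral eccentricities (Martin et al., 2001) is the kind of observation that motivates the suggestion of selective reinforcement of connections.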
Photoreceptor signals to M- and P-cells

The primate retina, like the retinas of other mammals, possesses several parallel pathways that convey signals from cones to ganglion cells. This is reflected in the existence of multiple classes of cone bipolar cells, including those that connect cones to M- and P-cells (Boycott and Wässle, 1991). On the other hand, rods are connected to the inner retina by means of a single class of rod bipolar cells. A specific amacrine cell class, the AII amacrine cell, transfers rod signals from rod bipolars to cone bipolars, so that from this point onwards rod- and cone-driven signals share the same pathways (Kolb and Famiglietti, 1974). This rod pathway was initially dissected in nonprimate mammals, but more recently several studies have shown that the rod pathway is similarly organized in primates (Grünert and Martin, 1991; Wässle et al., 1995; Dacey, 1999). For primates and other mammals, there is at least one alternative route, which uses gap junctions to feed rod signals directly into cones (Nelson, 1977; Schneeweis and Schnapf, 1995; Sharpe and Stockman, 1999; Verweij et al., 1999). A third possibility has been demonstrated only in rodents and consists of rod contacts with off cone-bipolar cells (Soucy et al., 1998). M- and P-cells convey cone and rod signals to the LGN with different strengths. Although anatomical studies performed in the macaque retina suggest that both cell classes receive significant rod input (Grünert, 1997), ganglion cell recordings in
the retina of the same species have shown that while M-cells receive a strong rod signal, P-cells receive a weak or negligible rod input (Purpura et al., 1988, 1990; Lee et al., 1997). Human psychophysical studies also suggest that this might be the case (Sun et al., 2001). A survey of the literature on photoreceptor density distribution in primates, including recent comparative studies of several platyrrhines (Franco et al., 2000), reveals the presence of three basic patterns. Diurnal primates with small eyes, such as Callithrix and Saguinus, exhibit a high cone-to-rod ratio, whereas those with medium to large eyes, such as Cebus and Macaca, exhibit a lower cone-to-rod ratio. All diurnal primates, independently of eye dimensions, have a well-developed fovea, the size of which is constant across species (Franco et al., 2000). Finally, nocturnal primates, such as Aotus and Otolemur, exhibit a very low cone-to-rod ratio and lack a well-developed fovea. Thus, it is interesting to see how M- and P-cells from primates that have very different cone-to-rod ratios handle photoreceptor signals. Among platyrrhines, there are examples of all three patterns of cone-to-rod ratios and so they represent a good animal model to investigate this question. As previously mentioned, M- and P-cells of Aotus (nocturnal, very low cone-to-rod ratio), Cebus (diurnal, low cone-to-rod ratio), and Callithrix (diurnal, high cone-to-rod ratio) adjust their dendritic field size to keep approximately the same cone convergence to either M- or P-cells for a given eccentricity. Consequently, the rod convergence to M- and P-cells is very different in these three platyrrhines and a physiological difference among them is to be expected. This question has been investigated by recording from ganglion cells in the retina of Aotus and dichromatic Cebus (Saito et al., 2001). As the retinas of both animals have a single MWS/LWS-cone class, their M- and P-cells show no additive or subtractive interactions between MWS- and LWS-cones, making it easier to study the presence of cone and rod signals in their responses. There are no data available for Callithrix ganglion cells, but cone–rod interactions have been studied in this platyrrhine by recording from LGN relay neurons of the M- and P-pathways (Yeh et al., 1995; Weiss et al., 1998).
Two protocols were used to study the contribution of cone and rod signals for the Cebus and Aotus ganglion cell responses. One of them uses a modified heterochromatic flicker photometry (HFP) paradigm: the lights of a 554 nm green LED and a 638 nm red LED were sinusoidally modulated in counterphase and the modulation amplitudes of the two LEDs were varied in relation to each other, while keeping the mean luminance and average chromaticity
constant. The amplitudes and phases of the cell responses changed as a function of the ratio between the modulation amplitudes of the red and green lights present in the stimulus. The response amplitudes were minimal at a certain red/green ratio, which depended on which photoreceptor was driving the cell responses. In addition, there was a large phase shift in the cell responses when the stimulus condition crossed the null point (not illustrated).
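The logic of the null point can be stated compactly: the response null occurs at the red/green modulation ratio for which the two counterphase LEDs produce equal and opposite modulation of whichever photoreceptor drives the cell, so the null position depends only on that photoreceptor's relative sensitivity to the two LEDs. The sketch below illustrates this with made-up sensitivity numbers; the actual templates in Fig. 10 were derived by convolving measured photopigment spectra with the LED emission spectra.

```python
def hfp_null_ratio(sens_red, sens_green):
    """Red/green modulation ratio at which a photoreceptor with the given
    sensitivities to the red and green LEDs is not modulated: at the null,
    red_amplitude * sens_red equals green_amplitude * sens_green (counterphase)."""
    return sens_green / sens_red

def receptor_modulation(red_amp, green_amp, sens_red, sens_green):
    """Net modulation seen by the receptor when the LEDs are in counterphase."""
    return red_amp * sens_red - green_amp * sens_green

# Hypothetical relative sensitivities (not taken from measured spectra):
rod = dict(sens_red=0.05, sens_green=0.60)      # rods see little of the red LED
lws_563 = dict(sens_red=0.55, sens_green=0.80)  # a long-wavelength cone sees both

for name, s in [("rod", rod), ("563 nm cone", lws_563)]:
    null = hfp_null_ratio(**s)
    print(name, "null at red/green amplitude ratio ~", round(null, 2),
          "| residual modulation at that ratio:",
          round(receptor_modulation(null, 1.0, **s), 3))
# A cell driven by rods nulls at a very different red/green ratio from one
# driven by an LWS cone, which is how the dominant photoreceptor is identified.
```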
Fig. 10. The weighting of cone and rod signals present in Cebus and Aotus M-cell responses, measured at 2000 Trolands by heterochromatic flicker photometry (HFP). The cells were recorded in a dichromatic Cebus, having a 563 nm LWS-cone, and a monochromatic Aotus, having a 543 nm MWS-cone. Cells were stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated in counterphase to generate sine-wave temporal modulation at different frequencies. The modulation amplitudes of the two LEDs were varied to find the null point of the cell response. Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. For each stimulus condition a PSTH was recorded, a Fourier analysis was performed, and the first harmonic amplitude and phase were extracted. A. Photopigment templates for Cebus monkeys. B. Photopigment templates for Aotus monkeys. C. The Cebus M-cell response is strongly cone-dominated at all temporal frequencies. D. The Aotus M-cell response is strongly rod-dominated at all temporal frequencies.
Figure 10A–B shows HFP templates predicting how the response amplitudes change with red–green ratio for each rod or MWS/LWS cone photopigment that may be present in the Cebus and Aotus retina (Fig. 10A and B, respectively). The templates were obtained by convolving the photopigment absorption spectra with the LED emission spectra. The retina of a dichromatic Cebus has one of three MWS/LWS photopigments, with absorption spectra peaking at 535 nm, 548 nm, or 563 nm. Figure 10A illustrates the response amplitude templates for these photopigments and also for the rod photopigment, which peaks at 500 nm. The Cebus retina also has an S photopigment, peaking at 440 nm, but M- and P-cells do not receive input from SWS-cones. On the other hand, the Aotus possesses a single cone photopigment peaking at 543 nm, the template of which is illustrated together with the template for the rod photopigment in Fig. 10B. The results of HFP for Cebus and Aotus M-cells are illustrated in the bottom panels of Fig. 10. The results for three temporal frequencies (4.88, 9.76, and 19.53 Hz) and 2000 Trolands of retinal illuminance are shown. Cell responses were recorded and the first harmonic amplitudes and phases were extracted by Fourier analysis. The response amplitudes are plotted in Fig. 10C and D as a function of red–green ratio for Cebus and Aotus M-cells, respectively. Assuming that cell responses reflect a linear addition of cone and rod inputs, a linear vector addition model was used to fit response amplitude and phase data in the complex plane (Weiss et al., 1998). A least-squares method was used to find four free parameters: cone amplitude and phase, and rod amplitude and phase. The predicted amplitudes obtained from the fits of the model to the data are displayed in Fig. 10C–D together with the measured response amplitudes. Overall, the model fitted the data well. The null point found for the Cebus M-cell is close to the template prediction for an animal having a 563 nm cone photopigment, and this result was confirmed by genetic analysis of blood and liver samples (Lee et al., 2000). The fraction of the cone-driven signal, R, present in the cell response was calculated by dividing the cone amplitude by the sum of cone and rod amplitudes. The larger the value of R, the more the cell response is driven by cone signals. The
Cebus M-cell response is heavily cone-dominated, while the Aotus M-cell response is heavily rod-dominated at all temporal frequencies. The cone and rod signal contribution to Cebus and Aotus ganglion cell responses was also studied using the Smith et al. (1992) phase paradigm. In this protocol, the modulation amplitudes of the red and green lights are kept constant. The relative phase of the green to the red LED was varied in 22.5° steps. As with HFP, it is possible to predict the ganglion cell response amplitude and phase for animals of different phenotypes. Figure 11A–B shows the amplitude and phase templates for dichromatic Cebus having different MWS/LWS-photopigments, while Fig. 12A–B shows the templates for the Aotus. In this paradigm, both the response amplitudes and phases change as a function of the relative phase between the LEDs according to the photopigment that is driving the cell responses, but the response phase is a more reliable photopigment signature owing to its relative insensitivity to signal-to-noise ratio (Smith et al., 1992). The results for a Cebus M-cell are illustrated in the middle and bottom panels of Fig. 11. The results for four temporal frequencies (2.44, 9.76, 19.53, and 39.06 Hz) and two illuminance levels (20 and 2000 Trolands) are shown. The model fitted to the data points had four free parameters: cone and rod phases, cone signal fraction (R), and a scalar factor used to displace the fitted curve vertically (Smith et al., 1992; Lee et al., 1997). The results for the amplitudes and phases at 2000 Trolands are displayed in Fig. 11C and D, respectively, whereas those at 20 Trolands are shown in Fig. 11E and F. The model fitted the data well and the phase plots can be used to predict which photopigment is dominating the cell response. At high illuminance, the cell response is heavily dominated at all temporal frequencies by signals from a 563 nm cone photopigment. At low illuminance levels, there are signs of rod intrusion, especially at low temporal frequencies. The presence of a 563 nm photopigment was confirmed by genetic analysis (Lee et al., 2000). The templates and results for an Aotus M-cell are illustrated in Fig. 12. The response phase indicates that cell responses are heavily rod-dominated even at this high illuminance level of 2000 Trolands.
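In both protocols the measured first-harmonic response is treated as the phasor (vector) sum of a cone-driven and a rod-driven component, and the free parameters of that sum are then fitted to the amplitude and phase data. The sketch below implements only the forward model of such a phasor sum under invented amplitudes, phases, and LED weightings; the actual analyses estimated the parameters by least squares (Weiss et al., 1998; Smith et al., 1992; Lee et al., 1997).

```python
import numpy as np

def predicted_response(green_phase_deg, cone_amp, cone_phase_deg, rod_amp, rod_phase_deg,
                       cone_green_weight=0.6, rod_green_weight=0.9):
    """Phasor model of a cell's first-harmonic response when the green LED is
    advanced by green_phase_deg relative to the red LED. Each photoreceptor's
    signal is the weighted sum of its red- and green-LED components; the cell
    response is the complex sum of the cone and rod signals. All weights and
    values are invented for illustration."""
    def phasor(amp, deg):
        return amp * np.exp(1j * np.deg2rad(deg))
    green = phasor(1.0, green_phase_deg)
    red = phasor(1.0, 0.0)
    cone = phasor(cone_amp, cone_phase_deg) * ((1 - cone_green_weight) * red + cone_green_weight * green)
    rod = phasor(rod_amp, rod_phase_deg) * ((1 - rod_green_weight) * red + rod_green_weight * green)
    total = cone + rod
    return abs(total), np.rad2deg(np.angle(total))

# Cone signal fraction R = cone amplitude / (cone + rod amplitude), as in Fig. 13:
cone_amp, rod_amp = 8.0, 2.0
print("R =", cone_amp / (cone_amp + rod_amp))
# The experiments stepped the green LED in 22.5 deg increments; 45 deg is used here for brevity.
for phase in range(0, 361, 45):
    amp, ph = predicted_response(phase, cone_amp, -20.0, rod_amp, -60.0)
    print(phase, round(amp, 2), round(ph, 1))
```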
Fig. 11. The weighting of cone and rod signals present in Cebus M-cell responses measured by the Smith et al. (1992) phase paradigm. This cell was recorded in a dichromatic Cebus, having a 563 nm LWS-cone. The cell was stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated in counterphase to generate sine-wave temporal modulation at different frequencies. The phase of the green LED was varied and the resulting changes in the cell response were measured. Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. For each stimulus condition a PSTH was recorded, a Fourier analysis was performed, and the first harmonic amplitude and phase were extracted. A–B. Amplitude and phase of the photopigment templates for Cebus monkeys. C–D. Response amplitude and phase of the Cebus M-cell under 2000 Trolands of retinal illuminance: the response is strongly cone-dominated at all temporal frequencies. E–F. Response amplitude and phase of the Cebus M-cell under 20 Trolands of retinal illuminance: the response shows some rod intrusion, mostly at low temporal frequencies.
Fig. 12. The weighting of cone and rod signals present in Aotus M-cell responses, measured by the Smith et al. (1992) phase paradigm. This cell was recorded in the nocturnal Aotus, which is a monochromat and has a single 543 nm MWS-cone. The cell was stimulated using two LEDs of 554 nm and 636 nm peak emission, which were modulated in counterphase to generate sine-wave temporal modulation at different frequencies. The phase of the green LED was varied and the resulting changes in the cell response were measured. Maxwellian view was used. Cell response was extracellularly recorded with a tungsten-in-glass microelectrode inserted into the retinal tissue. For each stimulus condition a PSTH was recorded, a Fourier analysis was performed, and the first harmonic amplitude and phase were extracted. A–B. Amplitude and phase of the photopigment templates for Aotus monkeys. C–D. Response amplitude and phase of the Aotus M-cell under 2000 Trolands of retinal illuminance: the response is strongly rod-dominated at all temporal frequencies. As temporal frequency increases there is some sign of cone intrusion.
Figure 13 shows the cone signal fraction for a sample of M- and P-cells recorded from the retinas of dichromatic Cebus and monochromatic Aotus at 2000 Trolands. At this high illuminance level, the Aotus cells are still heavily rod-dominated, whereas Cebus cell responses are mainly cone-dominated. This is consistent with the larger rod convergence to M- and P-cells found anatomically in this nocturnal monkey (Yamada et al., 2001).
Conclusions

In this chapter, we have compared the anatomy and physiology of two main classes of primate retinal ganglion cells, the M- and P-cells. There is compelling evidence that the anatomical and physiological properties of M-cells are very similar in all anthropoids so far studied, both from the Old and the New World. P-cells also have similar properties in these animals, with the exception that they are color blind in
the monochromatic and dichromatic platyrrhines. Moreover, there are consistent differences between closely related diurnal and nocturnal anthropoids concerning the contribution of cone- and rod-driven signals to ganglion cell responses: rods have a much stronger influence on ganglion cell responses in the nocturnal Aotus than in the diurnal Cebus. Whatever the original role of the M- and P-cells, they are likely to have evolved prior to the divergence of catarrhines and platyrrhines. This suggests that they should also be present in prosimians. Very little is known about the retinal ganglion cells of prosimians, but the few studies that have been done in these primates indicate that they do indeed follow a general primate scheme, having M- and P-cells similar to those of anthropoids. Apart from the differences mentioned above, the M- and P-cell systems thus appear to be strongly conserved across the various primate species. The reasons for this may lie in the roles of these systems in both achromatic and chromatic vision.

Fig. 13. Distribution of the cone/rod ratio in samples of Cebus and Aotus ganglion cells measured by the HFP and phase paradigms. The results obtained with the two paradigms are shown separately. The Smith et al. (1992) phase paradigm was used to analyze the responses of 17 Cebus ganglion cells and 28 Aotus ganglion cells; the HFP paradigm was used for 17 Cebus ganglion cells and 19 Aotus ganglion cells. At 9.76 Hz most of the Aotus cells were rod-dominated even at 2000 Trolands.

Acknowledgments

The authors have been supported by FINEP, CNPq, and CAPES. BBL is currently supported by the NEI grant R01-13112. LCLS and ESY are CNPq research fellows. CAS has a CAPES studentship for graduates. JK is supported by a Heisenberg fellowship of the German Research Council. The authors thank Dr. José Augusto P. C. Muniz, Head of the Centro Nacional de Primatas (Ananindeua, State of Pará, Brazil), for his support of primate research. The authors have been supported on different occasions by international travel grants held by CNPq, CAPES, Max Planck Gesellschaft, Deutscher Akademischer Austauschdienst, The Royal Society, and The British Council.
References Barlow, H.B. (1961) Possible principles underlying the transformation of sensory messages. In: Rosenblith, W. (Ed.), Sensory Communication. MIT Press, Cambridge, MA, pp. 217–234. Barlow, H.B. and Levick, W.R. (1976) Threshold setting by the surround of cat retinal ganglion cells. J. Physiol. (Lond.), 259: 737–757. Bernadete, E.A. and Kaplan, E. (1997a) The receptive field of the primate P retinal ganglion cell. I: Linear dynamics. Vis. Neurosci., 14: 169–185. Bernadete, E.A. and Kaplan, E. (1997b) The receptive field of the primate P retinal ganglion cell. II: Nonlinear dynamics. Vis. Neurosci., 14: 187–205. Bernadete, E.A. and Kaplan, E. (1999a) The dynamics of primate M retinal ganglion cells. Vis. Neurosci., 16: 355–368. Bernadete, E.A. and Kaplan, E. (1999b) Dynamics of primate P retinal ganglion cells: responses to chromatic and achromatic stimuli. J. Physiol. (Lond.), 519.3: 775–790. Boycott, B.B. and Dowling, J.E. (1969) Organization of the primate retina: light microscopy. Phil. Trans. R. Soc. Lond. B, 255: 109–184. Boycott, B.B. and Wa¨ssle, H. (1991) Morphological classification of bipolar cells of the primate retina. Eur. J. Neurosci., 3: 1069–1088. Crook, J.M., Lange-Malecki, B., Lee, B.B. and Valberg, A. (1988) Visual resolution of macaque retinal ganglion cells. J. Physiol. (Lond.), 396: 205–224. Croner, L.J. and Kaplan, E. (1995) Receptive fields of P and M ganglion cells across the primate retina. Vision Res., 35: 7–24. Dacey, D.M. (1993) Morphology of a small-field bistratified ganglion cell type in the macaque and human retina. Vis. Neurosci., 10: 1081–1098. Dacey, D.M. (1999) Primate retina: cell types, circuits and colour opponency. Prog. Retin. Eye Res., 18: 737–763. Dacey, D.M. and Lee, B.B. (1994) The ‘blue-on’ opponent pathway in primate retina originates from a distinct bistratified ganglion cell type. Nature, 367: 731–735.
42 Dacey, D.M. and Petersen, M.R. (1992) Dendritic field size and morphology of midget and parasol ganglion cells of the human retina. Proc. Natl. Acad. Sci. USA, 89: 9666–9670. Dacey, D.M., Lee, B.B., Stafford, D.K., Pokorny, J. and Smith, V.C. (1996) Horizontal cells of the primate retina: cone specificity without spectral opponency. Science, 271: 656–659. Dacey, D.M., Peterson, B.B. and Robinson, F.R. (2002) Identification of an S-cone opponent OFF pathway in the macaque monkey retina: morphology, physiology and possible circuitry. ARVO Annual Meeting Abstract Search and Program Planner, Program No. 3796. de Monasterio, F.M. (1978a) Properties of concentrically organized X and Y ganglion cells of macaque retina. J. Neurophysiol., 41: 1394–1417. de Monasterio, F.M. (1978b) Center and surround mechanisms of opponent-colour X and Y ganglion cells of retina of macaques. J. Neurophysiol., 41: 1418–1434. de Monasterio, F.M. (1978c) Properties of ganglion cells with atypical receptive-field organization in retina of macaques. J. Neurophysiol., 41: 1394–1417. de Monasterio, F.M. and Gouras, P. (1975) Functional properties of ganglion cells of the rhesus monkey retina. J. Physiol. (Lond.), 251: 167–195. de Monasterio, F.M., Gouras, P. and Tolhurst, D.J. (1975a) Trichromatic colour opponency in ganglion cells of the rhesus monkey retina. J. Physiol. (Lond.), 251: 197–216. de Monasterio, F.M., Gouras, P. and Tolhurst, D.J. (1975b) Concealed colour opponency in ganglion cells of the rhesus monkey retina. J. Physiol. (Lond.), 251: 217–229. Derrington, A.M. and Lennie, P. (1984) Spatial and temporal contrast sensitivities of neurones in lateral geniculate nucleus of macaque. J. Physiol. (Lond.), 357: 219–240. De Valois, R.L., Smith, C.J., Kitai, S.T. and Karoly, A.J. (1958) Responses of single cells in different layers of the primate lateral geniculate nucleus to monochromatic light. Science, 127: 238–239. Dogiel, A.S. (1891) Ueber die nervo¨sen Elemente in der Retina des Menschen. Archiv fu¨r Mikroskopische Anatomie und Entwicklungsmechanik, 38: 317–344. Enroth-Cugell, C. and Robson, J.G. (1966) The contrast sensitivity of retinal ganglion cells of the cat. J. Physiol. (Lond.), 187: 517–552. Evans, E.F. (1982) Basic physics and psychophysics of sound. In: Barlow, H.B. and Mollon, J.D. (Eds.), The Senses. Cambridge University Press, Cambridge, pp. 239–250. Fleagle, J.G. (1988) Primate Adaptation and Evolution. Academic Press, San Diego. Franco, E.C.S., Finlay, B.L., Silveira, L.C.L., Yamada, E.S. and Crowley, J.C. (2000) Conservation of absolute foveal area in New World primates: a constraint on eye size and conformation. Brain Behav. Evol., 56: 276–286. Ghosh, K.K., Goodchild, A.K., Sefton, A.E. and Martin, P.R. (1996) The morphology of retinal ganglion cells in the New
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 3
Identifying corollary discharges for movement in the primate brain
Robert H. Wurtz* and Marc A. Sommer
Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, MD 20892-4435, USA
Abstract: The brain keeps track of the movements it makes so as to process sensory input accurately and coordinate complex movements gracefully. In this chapter we review the brain’s strategies for keeping track of fast, saccadic eye movements. One way it does this is by monitoring copies of saccadic motor commands, or corollary discharges. It has been difficult to identify corollary discharge signals in the primate brain, although in some studies the influence of corollary discharge, for example on visual processing, has been found. We propose four criteria for identifying corollary discharge signals in primate brain based on our experiences studying a pathway from superior colliculus, in the brainstem, through mediodorsal thalamus to frontal eye field, in the prefrontal cortex. First, the signals must originate from a brain structure involved in generating movements. Second, they must begin just prior to movements and represent spatial attributes of the movements. Third, eliminating the signals should not impair movements in simple tasks not requiring corollary discharge. Fourth, eliminating the signals should, however, disrupt movements in tasks that require corollary discharge, such as a double-step task in which the monkey must keep track of one saccade in order to correctly generate another. Applying these criteria to the pathway from superior colliculus to frontal eye field, we concluded that it does indeed convey corollary discharge signals. The extent to which cerebral cortex actually uses these signals, particularly in the realm of sensory perception, remains unknown pending further studies. Moreover, many other ascending pathways from brainstem to cortex remain to be explored in behaving monkeys, and some of these, too, may carry corollary discharge signals.
Introduction
Generating movements is a key to survival for animals. Food gathering, escape from predators, and reproduction all involve coordinated movements. Generating movements, however, presents two major challenges to the nervous system. The first is in the sensory domain. Many movements cause sensory input identical to that elicited by external events, and consequently animals must be able to distinguish whether they, or another entity, caused the sensory input. A valuable aid in making this distinction is to keep track of movements as they are generated and predict the sensations that will result from them. The second challenge is in the motor domain. As behaviors become more elaborate, the need for internal information about movements becomes more critical. During quick, complex motor sequences such as those produced while fighting a competitor, information about prior actions helps to generate appropriate future ones. For both sensory perception and motor production, therefore, nervous systems need to keep track of the movements they generate.
In this chapter, we consider how the brain might monitor movement information in the primate visual-oculomotor system. We review studies exploring how visual input from the world is distinguished from visual input caused by eye movements, and how primates keep track of their eye movements while they look around rapidly. Based on experience from our own laboratory we also propose criteria for identifying internal records of movement within the primate brain.
*Corresponding author. Building 49, Room 2A50, MSC 4435, NEI, NIH, 9000 Rockville Pike, Bethesda, MD 20892-4435, USA. Tel.: +1-301-496-7170; Fax: +1-301-402-0511; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14400-3
Sources of knowledge about self-movement
Among the most common movements made by primates are eye movements, and how these movements are internally monitored has been the focus of speculation for centuries and quantitative study for decades (Bridgeman, 1995a; Grüsser, 1995; Colby and Goldberg, 1999). As a primate makes rapid or saccadic eye movements to explore the visual scene, the apparent motion of objects in the scene is an artefact of the saccadic eye movement and is not due to actual object movements. How does the brain distinguish this self-induced, illusory object motion from real motion? As might be expected from a biological system that undoubtedly resulted from eons of evolution, there are multiple mechanisms for making this potentially life-and-death distinction. One useful clue is contained in the visual signal from the retinas: when the eyes move the whole visual field moves, whereas when a visual object moves it moves alone. Full-field motion, often referred to as optic flow (Fig. 1A), is a reasonable indicator of eye movement as long as the head and body remain stationary. Optic flow so frequently indicates self-motion that it provides critical information about the heading taken by an animal as it proceeds through its environment (Warren and Hannon, 1990; Wurtz and Duffy, 1997; Duffy, 2000). This clue to self-motion, however, requires a lighted, contoured environment that of course is not always present. A second clue comes from proprioceptors in the eye muscles (Fig. 1A). As the eyes move, proprioceptive input may report eye-muscle contraction to the brain, providing information that apparent visual motion is due to eye movements. The role of the proprioceptors has been investigated for many years (Ruskell, 1999; Donaldson, 2000) and yet their exact contribution remains to be determined. There is growing evidence, however, that the major contribution of proprioception is in long-term calibration of the eye-movement system rather than in monitoring
Fig. 1. The three major sources of information about one’s own eye movements. (A) At left, a source of retinal information is indicated: optic flow, or full-field visual motion caused by a saccade. At right, two sources of extraretinal information are diagrammed. Proprioception, or input to the brain from receptors in the eye muscles, and corollary discharge, a signal within the brain representing the movement command, both accompany a saccade. (B) Time course of the three sources of information. Corollary discharge signals can occur before, during, and after a saccade. Proprioception and optic-flow signals, however, are available only after a saccade, following afferent delays from periphery to the brain.
movements on a saccade-by-saccade basis (Keller and Robinson, 1971; Guthrie et al., 1983; Lewis et al., 2001). These two sources of information are sensory in nature, arising peripherally in either the retinas or the proprioceptors. They provide clues about eye movements through afferent inputs to the brain. A third source of information is from within the brain itself (Fig. 1A), and we refer to it as a corollary discharge. This is also known as an efference copy; for a discussion of the nomenclature, see Bell (1984). A corollary discharge for movement is just that: it is a corollary signal sent to other regions of the brain at the same time that the signal is sent on the pathway
to activate the muscles to generate the movement. The corollary logically could be from any level of the circuit within the brain generating the movement, including the final common path to the eye muscles. The advantages of corollary discharges are that they are generated within the brain itself, making them impervious to disruptions of the peripheral receptors, and that they are available even before the movement begins, whereas sensory information is available only afterward (Fig. 1B). The specific idea of a corollary discharge evolved from the 18th century onward (McCloskey, 1981; Bridgeman, 1995b; Grüsser, 1995), culminating in Hermann von Helmholtz's 19th-century reference to an 'effort of will' as the mechanism compensating for the spurious visual motion caused by one's own eye movements. The most influential papers of the 20th century were published by Sperry (1950) and by von Holst and Mittelstaedt (1950), who examined the behaviors of fish and flies, respectively, after ocular rotation/inversion. In both preparations the animals' abnormal behaviors could be explained most easily by postulating that internal copies of motor commands were monitored by the nervous system. Since then the concept of corollary discharge has been invoked to help explain a wide range of animal behaviors, such as electrolocation in fish (Bell, 1984), song learning in birds (Troyer and Doupe, 2000), and chirping in crickets (Poulet and Hedwig, 2002). In all these behaviors the animals must distinguish the sensory consequences of their own actions from environmentally produced sensations. Psychophysical and lesion studies have demonstrated that corollary discharge signals exist in humans (McCloskey, 1981; Skavenski, 1990; Haarmeier et al., 1997; Thier et al., 2001; Pierrot-Deseilligny et al., 2002). Much current work on human motor control is focusing on how the generation of limb movements, especially during motor learning, relies on corollary discharge signals (or 'forward internal models'; Jordan and Rumelhart, 1992; Frith et al., 2000; Wolpert and Ghahramani, 2000). In principle, neurophysiologists can take at least two approaches to demonstrating the existence of corollary discharge signals in neurons of any sensorimotor system (Fig. 2). The first approach is to identify the effect of the corollary discharge on a neuron's sensory responses. The second is to identify
Fig. 2. Two ways of detecting corollary discharge in the visualoculomotor system. Experimenters usually detect corollary discharges indirectly by demonstrating otherwise inexplicable changes in sensory processing (right). For example, a modified visual signal, such as a visual response that changes just prior to saccade initiation, may suggest that a corollary discharge is present. The more direct approach is to identify the corollary discharge itself (left). To do this, one must establish criteria for determining whether movement-related neuronal activity (as in the example shown with rasters and a spike density function) is a corollary discharge or a movement command. The corollary discharge would interact at a later stage with visual input to produce a modified visual signal. Many types of interactions are possible (MacKay, 1966; Bell, 1984).
the corollary itself, but this raises the question of how to distinguish a corollary discharge signal from a movement command. We consider both of these approaches in turn as they have been applied in the monkey visual-oculomotor system.
Searching for the influence of corollary discharge on visual processing
The classical approach to studying corollary discharge in the primate visual-oculomotor system has been to search not for the corollary itself but instead for the impact of the corollary on visual processing. The logical first place to look for the effect of a corollary discharge was in primary visual cortex, which receives input from the retinas via the lateral geniculate nuclei. The principle was to compare neuronal activity evoked by motion of an object (with the eyes still) with activity evoked by movement of the eyes (with the object still). If the neuron responded differently to the nearly identical object
motion on the retina in the two conditions, then the neuron had to be receiving information that an eye movement was occurring. This meant the neuron’s activity was influenced by corollary discharge signals. This experiment was performed in the awake, trained monkey (Wurtz, 1968), and in fact was the very first recording of visual neurons achieved in an awake, trained monkey. No clear difference was detected in the two conditions, indicating that corollary discharge signals probably have little influence on processing in primary visual cortex. There was, however, evidence that the presence or absence of a visual background influenced the neuronal responses, emphasizing that other lines of information such as optic flow (Fig. 1A) can provide clues as to the cause of visual motion. A corollary discharge associated with pursuit movements also has been sought in primary visual cortex, but none has been found (Ilg and Thier, 1996). Subsequent studies on saccades have reported slight effects of corollary discharge on primary visual-cortex neurons (Bridgeman, 1973; Galletti et al., 1984). These latter results might indicate true corollary discharge influences, but they also may be due instead to significant differences in the motion produced by the saccade versus the stimulus movement generated by the experimenter. Primary visual cortex is not the only recipient of visual signals from the retina in primates; the retina also projects directly to the superficial layers of the superior colliculus (SC), a structure on the roof of the midbrain. Neurons in the SC superficial layers respond to visual stimuli and do not increase their activity before eye movements (in contrast to neurons just below them in the SC intermediate layers that discharge in tight correlation with saccades; Schiller and Koerner, 1971; Wurtz and Goldberg, 1971; Sparks and Hartwich-Young, 1989). The same test for the presence of a corollary discharge was done on these SC superficial-layer neurons as on the primary visual-cortical neurons, but the outcome was substantially different. In contrast to the results in primary visual cortex, many SC superficial-layer neurons showed strong differences in their responses to moving visual stimuli (Robinson and Wurtz, 1976a) depending upon whether the motion was caused by visual stimulus motion with the eyes stationary (Fig. 3A, left) or by a saccade with the visual stimulus stationary (Fig. 3A,
Fig. 3. Identifying the effects of corollary discharge on visual neurons of the SC superficial layers. (A) Example of an SC neuron that may have been influenced by corollary discharge signals. The neuron showed a clear visual response when a spot of light moved across its receptive field while the eye was stationary (see the rasters and histogram of neuronal activity, left panel), but it did not respond when a saccade moved the receptive field across a stationary stimulus at about equal speed (right panel). In fact, background activity was actually suppressed when the eye moved. This was evidence that there was an extraretinal input (corollary discharge or proprioception) to these neurons. H, V, horizontal and vertical components of the eye position; Sp/s, spikes per second. From Robinson and Wurtz (1976b). (B) Demonstration that the effect is due to corollary discharge. The saccade-related suppression of background activity of an SC superficial-layer neuron (left panel) continued when the monkey attempted to move its eyes even though a retrobulbar block prevented movement (middle panel). Because the eye muscles did not contract, there were no proprioceptive signals. The attempted eye movement was indicated by an increase in activity from integrated multiple neuron activity recorded from the oculomotor nucleus (Oc. Nuc.). Activity after the block recovered is shown in the right panel. E.O.G., electrooculogram. From Richmond and Wurtz (1980). Only a few of the rasters contributing to the histograms are shown. The rasters were retouched to compensate for faint dots resulting from digitization.
right). The motion resulting from saccades frequently did not produce the usual increase of activity at all, but instead produced suppression in the background activity.
This difference in visual responses to eye- versus stimulus-generated motion was necessary, but not sufficient, to demonstrate the influence of corollary discharges. There was still the possibility that the effects were due entirely to proprioceptive input. Therefore, a further test was done to determine whether the suppression of activity accompanying the saccade persisted in the absence of proprioception (Richmond and Wurtz, 1980). Proprioception was eliminated by preventing the eye from moving: the eye muscles were numbed with xylocaine. During the block the monkey attempted in vain to move its eyes, as indicated by bursts of activity recorded from the oculomotor nucleus, and corollary discharge signals still should have been generated accordingly. The suppression persisted (Fig. 3B, middle), so it must have been dependent upon corollary discharge. This experiment probably provides the best evidence in the primate visual-oculomotor system for the action of a corollary discharge on early visual processing. The inverse experiment was not done (eliminate the corollary discharge and keep the proprioception), so the possibility remains that proprioception may contribute to some extent; however, corollary discharge alone was sufficient to explain the effect. We noted above that corollary discharge has little, if any, influence on activity in primary visual cortex. However, it does seem to exert an effect later in the visual stream. For example, visual receptive fields of many cerebral cortical neurons suddenly shift to new locations just prior to a saccade; the new locations are those where the receptive field would be just after the saccade (Duhamel et al., 1992a; Colby and Goldberg, 1999). This predictive remapping must use corollary discharge information because it occurs before the eye actually moves. This effect has been seen in the frontal eye field (FEF) of prefrontal cortex (Umeno and Goldberg, 1997) and seems to diminish gradually in extrastriate cortex as one approaches primary visual cortex (Nakamura and Colby, 2002). Recently, a difference between stimulus- and eye-produced motion was found for neurons in extrastriate cortical area MT of the monkey (Thiele et al., 2002). This demonstrates the presence of an extraretinal input that may be a corollary discharge, although influences of proprioception were not explicitly ruled out.
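The logic of these tests reduces to a paired statistical comparison: with nearly identical retinal motion in the two conditions, does the neuron fire differently when the stimulus moved than when the eye moved? A minimal sketch of such a comparison follows; the per-trial spike counts and the choice of a rank-sum test are illustrative assumptions, not values or methods taken from the studies cited above.

```python
# Minimal sketch: test whether a neuron responds differently to retinal motion
# caused by a moving stimulus (eyes still) versus a saccade (stimulus still).
# The spike counts per trial are hypothetical, for illustration only.
import numpy as np
from scipy import stats

stimulus_motion_counts = np.array([18, 22, 15, 20, 19, 23, 17, 21])  # spikes/trial, eyes stationary
saccade_motion_counts = np.array([4, 6, 3, 5, 2, 7, 4, 5])           # spikes/trial, eyes moved

# A nonparametric test avoids assuming normally distributed spike counts.
stat, p = stats.mannwhitneyu(stimulus_motion_counts, saccade_motion_counts,
                             alternative='two-sided')
print(f"U = {stat:.1f}, p = {p:.4f}")
# A reliable difference despite matched retinal motion implies an extraretinal
# input (corollary discharge or proprioception) to the neuron.
```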
Identifying the corollary discharge itself
Demonstration of an effect of corollary discharge has been accomplished many times, not only in the monkey with respect to the modification of visual processing, but also in a large number of vertebrate and invertebrate species. In contrast, the identification of the corollary discharge signal itself has been attempted in only a few studies, among them investigations of the corollary discharge of weak electric signals (the generation of which involves a muscle-like organ) in mormyrid fish (Bell, 1984) and of the corollary discharge of leg movements in cockroach and cricket (Delcomyn, 1977; Poulet and Hedwig, 2002). A critical issue in such experiments is to differentiate the signal that is the corollary from that which is the movement command (Fig. 2, left). For example, in monkeys the saccade-related discharges of SC intermediate-layer neurons could logically be either movement commands or corollaries of the commands. Certainly many are movement commands because low-threshold electrical stimulation or reversible inactivation of the SC intermediate layers elicits or impairs saccade generation, respectively (Robinson, 1972; Hikosaka and Wurtz, 1985). Whether some of the saccade-related discharges in SC are actually corollary discharge signals, however, has been unknown. In our own attempts to investigate corollary discharge signals we developed a list of criteria for identifying them within the complex circuits of the primate brain (Table 1). First, putative corollary discharges should originate from a brain structure known to be involved in the generation of the movement, as indicated by changes in activity preceding the movement and alterations in the movement resulting from activating or inactivating the structure. Second, the signals should occur just prior to the movement and represent spatial
Table 1. Criteria for identifying corollary discharges
1. The signals originate in a motor area
2. The signals precede and spatially represent the movement
3. Eliminating the signals does not impair movements in tasks not requiring corollary discharge
4. Eliminating the signals does impair movements in tasks requiring corollary discharge
parameters of the movement. Third, eliminating the signals should not impair movements in simple tasks not requiring corollary discharge. Fourth, eliminating the signals should, however, disrupt the performance of tasks that require corollary discharge. While we think these criteria should apply to the identification of corollary discharge in systems other than the visual oculomotor and in animals other than the monkey, we make no pretense that these are the only criteria that could be used. Using these criteria, we considered whether neurons in a pathway from SC up to frontal cortex could be regarded as conveying corollary discharges for saccades, as will be discussed next.
Criterion 1: The signals originate from a motor area
We investigated a pathway suspected on anatomical grounds to run from a clearly established brainstem-oculomotor region up to the cerebral cortex. It was thought to originate from SC intermediate-layer neurons that project to relay neurons in the mediodorsal nucleus of the thalamus (MD) that in turn project to the FEF (Fig. 4A). Evidence for the existence of this pathway came from retrograde-labeling and anterograde-degeneration studies (Benevento and Fallon, 1975; Goldman-Rakic and Porrino, 1985) taken together with a transsynaptic retrograde-labeling study using herpes simplex virus (Lynch et al., 1994). To confirm that this pathway existed and was functional, we first attempted to identify and record from MD relay neurons. The activity of thalamic neurons in and around MD during visuosaccadic behavior had been studied only once before in the monkey (Schlag and Schlag-Rey, 1984; Schlag-Rey and Schlag, 1984). While finding MD neurons in the awake monkey is itself an experimental challenge, identifying the small subset of MD neurons that relay signals from SC to FEF would seem even harder. There are, however, electrophysiological methods for identifying MD neurons that project to FEF and receive SC input, namely, antidromic and orthodromic stimulation techniques (Fig. 4B) that we described in detail previously (Sommer and Wurtz, 1998, 2002). Using these techniques, we identified 51 neurons in two monkeys that were clearly MD relay
Fig. 4. Technique for satisfying Criterion 1, ensuring that the signals under study originate in a motor area. (A) Anatomical studies indicated that some neurons in the SC intermediate layers project to mediodorsal thalamus (MD), onto relay neurons that in turn project to the frontal eye field (FEF). The SC intermediate layers also send commands that ultimately cause saccade generation down to the brainstem saccade-generating circuits. Arrows indicate direction of signal flow. (B) Method used to identify the neurons in MD that both receive input from SC and project to FEF. Every MD relay neuron was double-identified: it was both antidromically activated from the FEF (showing that it projected to FEF) and orthodromically activated from the SC (showing that it received input from the SC). Arrows show direction of action potential propagation from the stimulating electrodes.
neurons, in that each one was both antidromically activated from the FEF and orthodromically activated from the SC (Sommer and Wurtz, 2002). They may project additionally to frontal cortical areas other than FEF and they may receive other inputs besides that from the SC, but all of them
at least were positively identified as relay neurons between SC and FEF. After studying the MD relay neurons we then examined the SC neurons that projected up to them (Wurtz and Sommer, 2000). This was done by looking for SC neurons antidromically activated from the locations of previously recorded MD relay neurons. We also identified FEF neurons that seemed to receive the signals flowing in this ascending pathway (Sommer and Wurtz, 1998). This was done by searching for FEF neurons orthodromically activated from the SC. In sum, we recorded from identified neurons all along a pathway originating in the SC, a structure crucial for generating saccades.
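One step in this identification can be made concrete with a short sketch: a neuron is a candidate for antidromic activation if it fires at a short, essentially fixed latency after every stimulation pulse (a collision test would normally follow, and orthodromic activation from the SC is assessed separately). The latency and jitter thresholds and the example latencies below are illustrative assumptions, not the criteria actually used by Sommer and Wurtz.

```python
# Minimal sketch of a fixed-latency screen for antidromic activation.
# Thresholds and example latencies are hypothetical, for illustration only.
import numpy as np

def looks_antidromic(first_spike_latencies_ms, max_jitter_ms=0.3, max_latency_ms=5.0):
    """first_spike_latencies_ms: latency of the first spike after each stimulation pulse."""
    latencies = np.asarray(first_spike_latencies_ms, dtype=float)
    follows_every_pulse = np.all(np.isfinite(latencies))    # no failures to respond
    fixed_latency = latencies.std() <= max_jitter_ms        # negligible jitter
    short_latency = latencies.mean() <= max_latency_ms
    return bool(follows_every_pulse and fixed_latency and short_latency)

# Hypothetical latencies (ms) of an MD neuron's spikes after FEF stimulation:
print(looks_antidromic([2.1, 2.2, 2.1, 2.2, 2.1, 2.2]))  # True: fixed latency, candidate relay neuron
print(looks_antidromic([2.1, 4.8, 3.0, 6.5, 2.2, 5.1]))  # False: variable latency, likely synaptic
```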
Criterion 2: The signals precede and spatially represent the movements
Fig. 5. Evidence satisfying Criterion 2, showing that signals in the pathway precede and spatially represent the movements. (A) Presaccadic bursts of activity recorded from MD relay neurons. Once an MD relay neuron was isolated it was studied by having the monkey perform a delayed saccade task. The monkey looked at a fixation spot, then a target (Visual Stim.) appeared in the periphery, and after a delay period of 500–1000 ms the fixation spot disappeared (Cue to Move), which was the cue to start the eye movement (Saccade) and look at the target. Shown are examples of two major types of MD relay neurons, Visuomovement and Movement Neurons. Neurons of both types had bursts of activity beginning just prior to the saccade. The pie chart shows the percentage of each neuron type in our sample of MD relay neurons (VM, Visuomovement Neurons; M, Movement Neurons; 'Others' include neurons with only visual responses and those with neither visual nor saccadic activity). Presaccadic bursts of activity were present in 74% of
For brevity we will focus on the MD relay neurons, which constitute the crucial node in the pathway. We studied their activity while monkeys made delayed saccades to visual targets (Sommer and Wurtz, 2002) and found that most of them increased their activity just before the saccade (see Fig. 5A). Of 46 neurons tested, 57% were visuomovement neurons (having both a presaccadic burst and a visual response) and 17% were movement neurons (having a presaccadic burst but no visual response). In total, 74% of the neurons increased their activity before the saccade, on average starting their saccade-related burst 66 ms prior to the onset of movement. Note that this presaccadic initiation meant that the activity could not have resulted from proprioceptive input from eye-muscle contraction. We examined the relationship
the neurons (M+VM neurons), as indicated by the bold outline. (B) Representation of saccadic vectors by MD relay neurons. The movement field (gray oval) of an example neuron is shown at left. The neuron exhibited presaccadic bursts of activity only for saccadic vectors made from the origin into this field. The saccadic vector encoded by the peak firing of the neuron (bold arrow) was directed 27° up from horizontal and was 16° in amplitude. This vector was determined by having the monkey make various directions and amplitudes of saccades (right) and fitting the presaccadic firing rate data with Gaussians and spline curves (solid curves), respectively. Dashed lines show mean baseline activity, and dotted lines show 2 SDs above that, which was the criterion level for significance. Ipsi, ipsilateral space; Contra, contralateral space.
between the saccadic activity and the saccadic vector for 29 of the neurons, and 23 of them (79%) had distinct peaks in their movement fields, firing strongest for saccades of a certain amplitude and direction (Fig. 5B). For all tuned neurons the best direction was into the contralateral visual field. Many MD relay neurons, therefore, have activity preceding the saccade and representing the spatial aspects of the saccade. Incidentally, nearly identical results were found for saccade-related bursts of SC neurons projecting up to the MD, consistent with our assumption that the MD relay neurons were driven in large part, if not completely, by SC neurons. These ascending saccadic bursts are excellent candidates to be corollaries of motor commands, because they are qualitatively similar to saccadic bursts exhibited by the general population of SC neurons (Sparks and Hartwich-Young, 1989) and in particular by those SC neurons identified as projecting downstream to saccade-generating circuits (Guitton and Munoz, 1991; Munoz and Guitton, 1991; Munoz et al., 1991).
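The tuning analysis summarized in Fig. 5B can be sketched in a few lines: presaccadic firing rate as a function of saccade direction is fit with a Gaussian, whose peak gives the preferred direction. The firing rates, directions, and fit settings below are hypothetical and serve only to make the procedure explicit; the study itself also fit amplitude tuning (with splines) to define the full movement field.

```python
# Minimal sketch: fit a Gaussian to presaccadic firing rate versus saccade
# direction to estimate a neuron's preferred direction. All values are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(direction_deg, baseline, amplitude, preferred_deg, width_deg):
    return baseline + amplitude * np.exp(-0.5 * ((direction_deg - preferred_deg) / width_deg) ** 2)

directions = np.array([-90., -45., 0., 27., 45., 90., 135., 180.])   # saccade direction (deg)
rates = np.array([12., 30., 85., 140., 120., 40., 15., 10.])         # presaccadic rate (spikes/s)

params, _ = curve_fit(gaussian, directions, rates, p0=[10., 100., 30., 40.])
baseline, amplitude, preferred, width = params
print(f"preferred direction ~ {preferred:.0f} deg, tuning width ~ {width:.0f} deg")
```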
Criterion 3: Eliminating the signals does not impair movements in a simple task not requiring corollary discharge
At this point we know that signals related to impending saccades are sent from SC up to FEF. But might these signals actually be causing saccade generation through some loop involving cerebral cortex and brainstem? To answer this question we capitalized on the presence of the MD relay neurons in the ascending pathway — an experimental gift to the physiologist. By inactivating them we could specifically interrupt transmission from SC to FEF. [Directly inactivating the SC or FEF, instead, would have caused extensive unwanted effects due to perturbing the myriad other networks involving these structures, including the descending, motor-dedicated pathways to the pons; we already know that inactivating either SC or FEF impairs saccade generation itself (Hikosaka and Wurtz, 1985; Sommer and Tehovnik, 1997; Dias and Segraves, 1999).] We inactivated the MD relay neurons using muscimol, a GABA-A agonist. Muscimol inhibits neuron cell bodies, not axons (Lomber, 1999), so it
should suppress MD relay neurons without affecting transthalamic fibers passing nearby. While MD relay neurons were inactivated, we had monkeys make single saccades to visual or remembered targets at several eccentricities and directions. Making a single saccade does not require corollary discharge information. Thus if the ascending pathway's saccade-related signals are corollary discharges, single saccades should not be affected by MD inactivation; however, if the signals instead are needed for making saccades, then single saccades should be impaired by MD inactivation. Figure 6A (left) shows the average trajectories of saccades made to targets at 10° eccentricity and eight directions, before versus during inactivation of MD neurons in one experiment. The monkey still made saccades, and quantification showed that the accuracy and latency of these saccades were not altered by inactivation (Sommer and Wurtz, 2002). Throughout a series of similar experiments, significant changes in the accuracy and latency of single saccades were infrequent and small. To examine saccadic dynamics we plotted peak speed as a function of amplitude (referred to as the main sequence, Fig. 6A, right). There were no clear impairments during inactivation; the logarithmic fits of the values before and during the injection were not significantly different. The significance of this lack of effect during MD inactivation is brought into sharper perspective by considering previous experiments in which the SC was inactivated with muscimol. Figure 6B (left) shows that during an example of SC inactivation, saccades made to the upper right quadrant were shortened and their trajectories altered. In addition, SC inactivation markedly slowed saccades (Fig. 6B, right). Similar effects have been reported for FEF inactivation (not shown; Sommer and Tehovnik, 1997; Dias and Segraves, 1999). Thus, eliminating the saccade-related signals coursing through MD does not eliminate, or even significantly affect, the generation of single saccades in simple tasks. This supports the idea that these signals provide information about saccades but are not critical for generating them. This is in contrast to inactivation of SC or FEF, which can severely impair saccade generation presumably by shutting off descending efferents to brainstem saccade-generating circuits.
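The main-sequence comparison lends itself to a short sketch: fit peak speed as a logarithmic function of amplitude before and during an injection and compare the fitted parameters. The amplitudes and peak speeds below are invented for illustration and are not data from these experiments.

```python
# Minimal sketch of a "main sequence" comparison: peak speed = a + b * ln(amplitude),
# fit separately before and during inactivation. All values are hypothetical.
import numpy as np

def log_fit(amplitude_deg, peak_speed_dps):
    """Least-squares fit of peak_speed = a + b * ln(amplitude)."""
    X = np.column_stack([np.ones_like(amplitude_deg), np.log(amplitude_deg)])
    (a, b), *_ = np.linalg.lstsq(X, peak_speed_dps, rcond=None)
    return a, b

amp = np.array([2., 5., 10., 15., 20.])                    # saccade amplitude (deg)
speed_before = np.array([150., 300., 450., 520., 580.])    # peak speed (deg/s), pre-injection
speed_during = np.array([145., 310., 445., 515., 575.])    # peak speed (deg/s), during injection

print("before:", log_fit(amp, speed_before))
print("during:", log_fit(amp, speed_during))
# Overlapping fits (as for MD inactivation) indicate unimpaired dynamics; SC
# inactivation, by contrast, markedly lowers the curve.
```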
Fig. 6. Evidence satisfying Criterion 3, showing that eliminating the signals does not impair saccades made in a simple task not requiring corollary discharge. (A) Results of inactivating the MD relay neurons while monkeys made single saccades to visual targets. Left, average trajectories of saccades made in one experiment, before versus during the inactivation. Saccades traveled from the center of the screen to each of eight targets at 10° eccentricity. Inactivation did not significantly impair saccades in any direction. Right, graphs summarizing the dynamics of contraversive single saccades. The curves show logarithmic fits. (From data presented in Sommer and Wurtz, 2002.) (B) Analogous saccade data from an experiment in which the SC was inactivated (Hikosaka and Wurtz, 1985; Aizawa and Wurtz, 1998).
Criterion 4: Eliminating the signals disrupts movements in a task requiring corollary discharge
Many tasks can be imagined that require corollary discharge for their execution, for example tasks that require distinguishing sensations caused by self-movement as opposed to external forces or tasks that require generation of fast, complex motor acts. The task we used was the double-step task, in which the monkey had to make successive saccades to two flashed targets (Fig. 7A, left). We selected this task because it is widely used as an assay for the presence
of corollary discharge, particularly in patients with cortical lesions (Duhamel et al., 1992b). Correct execution of the second saccade (the upward saccade) requires knowledge of where the eye lands after the first (horizontal) saccade. Visual feedback indicating where the eye is after the first saccade is not available because the saccades begin after the targets disappear; additionally, the experimental room is usually in total darkness. Proprioception probably does not contribute to successful performance, because it likely has little influence in the online control of saccades (Lewis et al., 2001) and has been
Fig. 7. Evidence satisfying Criterion 4, showing that eliminating the signals does impair saccades made in a task requiring corollary discharge. (A) We used the double-step task, which requires corollary discharge for correct performance. Left, the monkey first looked at a fixation spot (shown in gray, center of screen), which then disappeared as two targets were flashed sequentially (shown in white, T1 and T2). The monkey then made two saccades (arrows) to the target locations. Due to the reaction time of the saccades, all stimuli were gone before the saccades started. With corollary discharge intact, the first saccade would go rightward and the second saccade would go straight up from there. Right, without corollary discharge, the first saccade would go rightward but there would be no internal record of this. Hence the monkey would not know that its eyes are at a new position, and to complete the trial it would be expected to make the second saccade as if it were still looking at the center of the screen, i.e. the second saccade should travel diagonally (dashed arrow). Since the first saccade was in fact made correctly, however, the pattern of saccades should be as shown with the solid arrows. (B) Left, individual saccadic sequences from an example MD inactivation, before (top) and during (bottom) inactivation. Right, means (and SDs) of the initial fixation locations, first-saccade endpoints, and second-saccade endpoints for the same example. The only significant change was that predicted by loss of corollary discharge: there was a shift of second-saccade endpoints in the contraversive (rightward) direction. From Sommer and Wurtz (2002). S1, first saccade; S2, second saccade; n.s.d., not significantly different.
shown to be unnecessary for performing a similar double-step task (Guthrie et al., 1983). We discouraged memorization or preplanning of the sequences by randomizing a variety of sequences
across trials and changing the sequences between experiments. The indicator of a loss of corollary discharge in this task is specific and quantifiable. If inactivation
totally eliminates corollary discharge (Fig. 7A, right), the monkey should make a contraversive first saccade correctly but should not have internal information that it did so. Therefore, if the monkey tries to complete the trial by looking at the second target, it should make a second saccade as if it never made the first, i.e. as if the eyes were still looking at the fixation point; hence the second saccade should be made diagonally (Fig. 7A, right, dashed arrow). But since the first saccade actually was correct (the monkey just did not know this), the second saccade will begin at the endpoint of the first saccade and will land to the right of the second target location (Fig. 7A, right, diagonal arrow). The indicator of lost corollary discharge therefore would be a shift of second-saccade endpoints contraversively (in this example, rightward) during inactivation. No vertical shifts should occur, however, nor any changes in the initial fixation locations or first-saccade endpoints. Figure 7B shows the results from one injection of muscimol into MD (Sommer and Wurtz, 2002). Before inactivation the monkey made saccadic sequences correctly. Because the saccades were made in total darkness, first saccades were shifted upward slightly (Gnadt et al., 1991). Second saccades went nearly straight up, indicating that the corollary discharge was intact. Following inactivation of MD, the second-saccade endpoints shifted contraversively (to the right) as expected if the corollary discharge was impaired. Quantitatively the second-saccade endpoints were shifted 2.5° to the right (P<0.001), but not significantly vertically, during the injection. Neither the initial fixation locations nor the first-saccade endpoints were shifted significantly in either direction. We performed seven muscimol experiments in which there were a total of 22 cases of before versus during saccadic sequence pairs to analyze (Sommer and Wurtz, 2002). In every case the principle for identifying a corollary discharge deficit was the same as in Fig. 7. In 82% of the cases (18/22) there was a contraversive shift in second-saccade endpoints, and the overall mean shift (1.12°) was significantly greater than zero. The contraversive shift in half (11/22) of the cases was individually significant. First-saccade endpoints did not exhibit a significant mean horizontal shift and neither did initial fixation locations. In the vertical direction there were no
mean shifts in any of the data. As controls we randomly interleaved trials in which the targets appeared ipsilaterally. Identical target configurations were used, but flipped across the vertical meridian. In these trials the first saccades were ipsiversive, a direction poorly represented by MD relay neurons. Accordingly, we found no corollary discharge deficits: the mean horizontal shift for second-saccade endpoints was not significantly different from zero. We also considered whether inactivation degraded a monkey's ability to see the second target and/or remember its location. If such deficits occurred, there should have been greater scatter of the second-saccade endpoints during inactivation, due to greater uncertainty about the second target location. This did not occur, however. If there were subtle visual or memory deficits, they did not seem to affect performance in our task. We measured the size of the deficit by finding the percentage of the observed shift relative to that expected if the corollary were completely eliminated. In the example shown in Fig. 7B, the second-saccade endpoints shifted 2.5° horizontally rather than the 10° expected, and so there was a 25% deficit. Calculating this value for each experiment showing a deficit allowed us to gauge the average deficit, and overall there was a 19% impairment. There are several experimental and theoretical factors that might have contributed to the modest size of the deficit. First, we injected at only one MD site at a time, and therefore we may have left a substantial fraction of MD active. Second, we have identified one possible pathway for a corollary from brainstem to cortex, but we do not claim that it is the only pathway. In fact, our results might be interpreted as indicating that there are other such pathways, including those relayed from cerebellum or substantia nigra to the thalamus and then to FEF (Lynch et al., 1994). Finally, it is conceivable that the monkeys exploited proprioceptive input after losing corollary discharge during inactivation. However, due to the dubious usefulness of proprioception in rapid saccadic behavior, as mentioned above, we think that the first two explanations are more likely.
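The deficit measure itself is simple arithmetic, sketched below with the numbers from the example above: a 2.5° observed shift against the 10° shift expected if corollary discharge were completely eliminated gives a 25% deficit.

```python
# Deficit computation used in the double-step analysis: the expected shift of
# second-saccade endpoints equals the first-saccade amplitude if corollary
# discharge is completely lost; the deficit is the observed shift as a
# percentage of that expectation. Values match the Fig. 7B example in the text.
def corollary_discharge_deficit(observed_shift_deg, first_saccade_amplitude_deg):
    expected_shift_deg = first_saccade_amplitude_deg
    return 100.0 * observed_shift_deg / expected_shift_deg

print(corollary_discharge_deficit(2.5, 10.0))   # 25.0 (% deficit)
```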
and spatial parameters of upcoming saccades, their removal does not affect the generation of saccades in simple tasks, and their removal does disrupt saccades in a double-step task that requires corollary discharge.
Conclusion
We have concentrated on the pathway from SC to FEF and how the corollary discharge carried therein contributes to making accurate movement sequences. This is an example of how corollary discharge can be crucial for motor behavior, as has been explicated in computational detail lately with respect to limb movements by Wolpert and Ghahramani (2000). The other major role of corollary discharge is in analysis of sensory input, for example in our ability to perceive a stable visual scene despite the frequent rotations of our retinas due to saccades. One way in which corollary discharge might promote a sense of visual stability is by helping to remap visual receptive fields just prior to saccade initiation in the FEF (Umeno and Goldberg, 1997) and in the lateral intraparietal cortex (LIP) and other extrastriate visual areas (Duhamel et al., 1992a; Colby and Goldberg, 1999). Also, corollary discharge could help neurons, such as those in superficial SC and area MT, discriminate real visual motion from self-induced motion caused by eye movements. Note that corollary discharge signals sent to the FEF from the SC through our ascending pathway could then be disseminated via the FEF's projections to a legion of other cerebral cortical areas, including MT and LIP as reviewed by Schall (1997). Future work therefore should focus on possible sensory functions of the corollary discharge signals sent from SC to FEF. Moreover, it should be recalled that saccade-related bursts of activity, the presumed corollary discharges, were not the only kinds of signals found in the pathway from SC to FEF. The exact roles of the other signal types in this pathway, e.g. the visual responses, still need to be determined. More generally, we would like to emphasize that the pathway we explored is only one of a number of brainstem-to-cortex pathways in primates. Another salient example is the pathway from the SC superficial layers relayed through the pulvinar to
Fig. 8. Two ascending pathways from SC to cerebral cortex. One, the pathway from the SC intermediate layers (SCi) through MD to frontal cortex, carries corollary discharges of saccadic eye movements as established by the criteria set forth in this chapter. The other pathway, from the SC superficial layers (SCs) through pulvinar to parietal and occipital cortex, may be involved in attention, but relatively little is known about it.
extrastriate cortex (Fig. 8; reviewed by Sommer and Wurtz, in press). We have little knowledge of the contribution of this pathway to cortical function, but we do know that inactivation of pulvinar alters a monkey’s performance on a task requiring a shift of attention (Petersen et al., 1987). This SC to pulvinar to extrastriate cortex pathway was the center of intense interest in considering multiple visual pathways to the cortex over 30 years ago (Diamond and Hall, 1969; Schneider, 1969), and it should clearly be revisited using techniques as discussed here.
References

Aizawa, H. and Wurtz, R.H. (1998) Reversible inactivation of monkey superior colliculus: I. Curvature of saccadic trajectory. J. Neurophysiol., 79: 2082–2096. Bell, C.C. (1984) Effects of motor commands on sensory inflow, with examples from electric fish. In: Bolis L and Keynes R.D (Eds.), Comparative Physiology: Sensory Systems. Cambridge University Press, Cambridge, pp. 637–646. Benevento, L.A. and Fallon, J.H. (1975) The ascending projections of the superior colliculus in the rhesus monkey (Macaca mulatta). J. Comp. Neurol., 160: 339–361. Bridgeman, B. (1973) Receptive fields in single cells of monkey visual cortex during visual tracking. Int. J. Neurosci., 6: 141–152. Bridgeman, B. (1995a) A review of the role of efference copy in sensory and oculomotor control systems. Ann. Biomed. Eng., 23: 409–422.
59 Bridgeman, B. (1995b) A review of the role of efference copy in sensory and oculomotor control systems. Ann. Biomed. Eng., 23: 409–422. Colby, C.L. and Goldberg, M.E. (1999) Space and attention in parietal cortex. Annu. Rev. Neurosci., 23: 319–349. Delcomyn, F. (1977) Corollary discharge to cockroach giant interneurones. Nature, 269: 160–162. Diamond, I.T. and Hall, W.C. (1969) Evolution of neocortex. Science, 164: 251–262. Dias, E.C. and Segraves, M.A. (1999) Muscimol-induced inactivation of monkey frontal eye field: effects on visually and memory-guided saccades. J. Neurophys., 81: 2191–2214. Donaldson, I.M.L. (2000) The functions of the proprioceptors of the eye muscles. Phil. Trans. R. Soc. Lond. B., 355: 1685–1754. Duffy, C.J. (2000) Optic flow analysis for self-movement perception. Int. Rev. Neurobiol., 44: 199–218. Duhamel, J.-R., Colby, C.L. and Goldberg, M.E. (1992a) The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255: 90–92. Duhamel, J.-R., Goldberg, M.E., FitzGibbon, E.J., Sirigu, A. and Grafman, J. (1992b) Saccadic dysmetria in a patient with a right frontoparietal lesion: the importance of corollary discharge for accurate spatial behavior. Brain, 115: 1387–1402. Frith, C.D., Blakemore, S.J. and Wolpert, D.M. (2000) Abnormalities in the awareness and control of action. Phil. Trans. R. Soc. Lond. B. Biol. Sci., 355: 1771–1788. Galletti, C., Squatrito, S., Battaglini, P.P. and Maioli, M.G. (1984) ‘Real-motion’ cells in the primary visual cortex of macaque monkeys. Brain Res., 301: 95–110. Gnadt, J.W., Bracewell, R.M. and Andersen, R.A. (1991) Sensorimotor transformation during eye movements to remembered visual targets. Vis. Res., 31: 693–715. Goldman-Rakic, P.S. and Porrino, L.J. (1985) The primate mediodorsal (MD) nucleus and its projection to the frontal lobe. J. Comp. Neurol., 242: 535–560. Gru¨sser, O.J. (1995) On the history of the ideas of efference copy and reafference. Clio. Med., 33: 35–55. Guitton, D. and Munoz, D.P. (1991) Control of orienting gaze shifts by the tectoreticulospinal system in the head-free cat. I. Identification, localization, and effects of behavior on sensory responses. J. Neurophys., 66: 1605–1623. Guthrie, B.L., Porter, J.D. and Sparks, D.L. (1983) Corollary discharge provides accurate eye position information to the oculomotor system. Science, 221: 1193–1195. Haarmeier, T., Thier, P., Repnow, M. and Petersen, D. (1997) False perception of motion in a patient who cannot compensate for eye movements. Nature, 389: 849–852. Hikosaka, O. and Wurtz, R.H. (1985) Modification of saccadic eye movements by GABA-related substances. I. Effect of muscimol and bicuculline in monkey superior colliculus. J. Neurophys., 53: 266–291.
Ilg, U.J. and Thier, P. (1996) Inability of rhesus monkey area V1 to discriminate between self-induced and externally induced retinal image slip. Eur. J. Neurosci., 8: 1156–1166. Jordan, M.I. and Rumelhart, D.E. (1992) Forward models: supervised learning with a distal teacher. Cogn. Sci., 16: 307–354. Keller, E.L. and Robinson, D.A. (1971) Absence of a stretch reflex in extraocular muscle of the monkey. J. Neurophys., 34: 909–919. Lewis, R.F., Zee, D.S., Hayman, M.R. and Tamargo, R.J. (2001) Oculomotor function in the rhesus monkey after deafferentation of the extraocular muscles. Exp. Brain Res., 141: 349–358. Lomber, S.G. (1999) The advantages and limitations of permanent or reversible deactivation techniques in the assessment of neural function. J. Neurosci. Methods, 86: 109–117. Lynch, J.C., Hoover, J.E. and Strick, P.L. (1994) Input to the primate frontal eye field from the substantia nigra, superior colliculus, and dentate nucleus demonstrated by transneuronal transport. Exp. Brain Res., 100: 181–186. MacKay, D. (1966) Cerebral organization and the conscious control of action. In: Eccles J.C (Ed.), Brain and Conscious Experience. Springer, New York, pp. 422–445. McCloskey, D.I. (1981) Corollary discharges: motor commands and perception. In: Brooks V.B (Ed.), Handbook of Physiology. The Nervous System. American Physiological Society, Bethesda, MD. Munoz, D.P. and Guitton, D. (1991) Control of orienting gaze shifts by the tectoreticulospinal system in the head-free cat. II. Sustained discharges during motor preparation and fixation. J. Neurophys., 66: 1624–1641. Munoz, D.P., Guitton, D. and Pe´lisson, D. (1991) Control of orienting gaze shifts by the tectoreticulospinal system in the head-free cat. III. Spatiotemporal characteristics of phasic motor discharges. J. Neurophys., 66: 1642–1666. Nakamura, K. and Colby, C.L. (2002) Updating of the visual representation in monkey striate and extrastriate cortex during saccades. Proc. Natl. Acad. Sci. USA, 99: 4026–4031. Petersen, S.E., Robinson, D.L. and Morris, J.D. (1987) The contribution of the pulvinar to visual spatial attention. Neuropsychologia, 25: 97–105. Pierrot-Deseilligny, C., Ploner, C.J., Muri, R.M., Gaymard, B. and Rivaud-Pechoux, S. (2002) Effects of cortical lesions on saccadic: eye movements in humans. Ann. N.Y. Acad. Sci., 956: 216–229. Poulet, J.F. and Hedwig, B. (2002) A corollary discharge maintains auditory sensitivity during sound production. Nature, 418: 872–876. Richmond, B.J. and Wurtz, R.H. (1980) Vision during saccadic eye movements. II. A corollary discharge to monkey superior colliculus. J. Neurophys., 43: 1156–1167. Robinson, D.A. (1972) Eye movements evoked by collicular stimulation in the alert monkey. Vis. Res., 12: 1795–1808.
60 Robinson, D.L. and Wurtz, R.H. (1976a) Use of an extraretinal signal by monkey superior colliculus neurons to distinguish real from self-induced stimulus movements. J. Neurophys., 39: 852–870. Robinson, D.L. and Wurtz, R.H. (1976b) Use of an extraretinal signal by monkey superior colliculus neurons to distinguish real from self-induced stimulus movement. J. Neurophysiol., 39: 852–870. Ruskell, G.L. (1999) Extraocular muscle proprioceptors and proprioception. Prog. Retin. Eye Res., 18: 269–291. Schall, J.D. (1997) Visuomotor areas of the frontal lobe. In: Rockland K, Kaas J.H and Peters A (Eds.), Cerebral Cortex. Plenum Press, New York, pp. 527–638. Schiller, P.H. and Koerner, F. (1971) Discharge characteristics of single units in superior colliculus of the alert rhesus monkey. J. Neurophys., 34: 920–936. Schlag, J. and Schlag-Rey, M. (1984) Visuomotor functions of central thalamus in the monkey. II. Unit activity related to visual events, targeting and fixation. J. Neurophys., 51: 1175–1195. Schlag-Rey, M. and Schlag, J. (1984) Visuomotor functions of central thalamus in monkey: I. Unit activity related to spontaneous eye movements. J. Neurophys., 51: 1149–1174. Schneider, G.E. (1969) Two visual systems. Brain mechanisms for localization and discrimination are dissociated by tectal and cortical lesions. Science, 163: 895–902. Skavenski, A.A. (1990) Eye movement and visual localization of objects in space. In: Kowler E (Ed.), Reviews of Oculomotor Research: Eye Movements and Their Role in Visual and Cognitive Processes. Elsevier, Amsterdam, pp. 263–287. Sommer, M.A. and Tehovnik, E.J. (1997) Reversible inactivation of macaque frontal eye field. Exp. Brain Res., 116: 229–249. Sommer, M.A. and Wurtz, R.H. (1998) Frontal eye field neurons orthodromically activated from the superior colliculus. J. Neurophys., 80: 3331–3335. Sommer, M.A. and Wurtz, R.H. (2002) A pathway in primate brain for internal monitoring of movements. Science, 296: 1480–1482. Sommer M.A. and Wurtz R.H. (in press) The dialogue between cerebral cortex and superior colliculus: implications for saccadic target selection and corollary discharge.
In: L.M. Chalupa and J.S. Werner (Eds.), The Visual Neurosciences. MIT Press, Cambridge, MA. Sparks, D.L. and Hartwich-Young, R. (1989) The deep layers of the superior colliculus. In: Wurtz R.H and Goldberg M.E (Eds.), The Neurobiology of Saccadic Eye Movements, Reviews of Oculomotor Research, Vol. III. Elsevier, Amsterdam, pp. 213–256. Sperry, R.W. (1950) Neural basis of the spontaneous optokinetic response produced by visual inversion. J. Comp. Physiol. Psychol., 43: 482–489. Thiele, A., Henning, P., Kubischik, M. and Hoffmann, K.P. (2002) Neural mechanisms of saccadic suppression. Science, 295: 2460–2462. Thier, P., Haarmeier, T., Chakraborty, S., Lindner, A. and Tikhonov, A. (2001) Cortical substrates of perceptual stability during eye movements. Neuroimage, 14: S33–S39. Troyer, T.W. and Doupe, A.J. (2000) An associational model of birdsong sensorimotor learning. I. Efference copy and the learning of song syllables. J. Neurophys., 84: 1204–1223. Umeno, M.M. and Goldberg, M.E. (1997) Spatial processing in the monkey frontal eye field. I. Predictive visual responses. J. Neurophys., 78: 1373–1383. von Holst, E. and Mittelstaedt, H. (1950) Das Reafferenzprinzip. Wechselwirkungen zwischen Zentralnervensystem und Peripherie. Naturwissenschaften, 37: 464–476. Warren, W.H., Jr. and Hannon, D.J. (1990) Eye movements and optical flow. J. Opt. Soc. Am. A, 7: 160–169. Wolpert, D.M. and Ghahramani, Z. (2000) Computational principles of movement neuroscience. Nat. Neurosci., 3 Suppl: 1212–1217. Wurtz, R.H. (1968) Visual cortex neurons: response during rapid eye movements. Science, 162: 1148–1150. Wurtz, R.H. and Duffy, C.J. (1997) Relation of MST activity to optic flow and heading. In: Sakata H, Mikami A and Fuster J (Eds.), The Association Cortex — Structure and Function. Harwood Academic Publishers, Amsterdam, pp. 175–190. Wurtz, R.H. and Goldberg, M.E. (1971) Superior colliculus cell responses related to eye movements in awake monkeys. Science, 171: 82–84. Wurtz, R.H. and Sommer, M.A. (2000) Activity in the pathway from superior colliculus to frontal eye field: tectothalamic neurons. Soc. Neurosci. Abstr., 26: 969.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved
CHAPTER 4
Visual awareness and the cerebellum: possible role of decorrelation control Paul Dean*, John Porrill and James V. Stone Department of Psychology, University of Sheffield, Western Bank, Sheffield S10 2TP, UK
Abstract: The two roles in awareness most often suggested for the cerebellum are (i) keeping the details of motor skills away from forebrain computation, and (ii) signaling to the forebrain when a sensory event is not predictable from prior motor commands. However, it is unclear how current models of the cerebellum could carry out these roles. Their architecture, based on the seminal ideas of Marr and Albus, appears to need ‘motor error’ to learn correct motor commands. However, since motor error is the difference between the actual motor command and what the command should have been, it is a signal unavailable to the organism in principle. We propose a possible solution to this problem, termed decorrelation control, in which the cerebellum learns to decorrelate the motor command sent to the muscles from the sensory consequences of motor error. This method was tested in a linear model of oculomotor plant compensation in the vestibulo-ocular reflex. A copy of the eye-movement command was sent as mossy-fiber input to the flocculus, represented as a simple adaptive filter version of the Marr–Albus architecture. The sensory consequences of motor error were retinal slip, delivered as climbing fiber input to the flocculus. A standard anti-Hebbian learning rule was used to decorrelate the two. Simulations of the linearized problem showed the method to be effective and robust for plant compensation. Decorrelation control is thus a candidate algorithm for the basic cerebellar microcircuit, indicating how it could achieve motor learning using only signals available to the system. Such learning might then enable the cerebellum to free up visual awareness, and also, by providing a sensory signal decorrelated from motor command, supply awareness with crucial information about the external world.
Introduction
Those of us fortunate enough to have worked with Alan Cowey in the laboratory are aware of both his practical skills and his helpfulness. But the example set by Alan extends beyond the laboratory. Anyone who has read his account of global stereopsis in rhesus monkeys (Cowey et al., 1975) will know about the cunning those animals use to seize on cues the experimenter did not intend them to employ. They will also be aware of this particular experimenter's ability not to be taken in by plausible though attractive explanations of his subjects' performance, to think of alternative although unwelcome explanations, and to pursue the evidence needed to find the explanation that is correct. This approach in its combination of intellectual honesty and acuity has similarities to that immortalized in the great fictional detective, Sherlock Holmes, and is just as relevant to theoretical investigations of neural function as it is to experiments in the laboratory. And it is with theoretical studies, specifically with computer modeling of cerebellar function, that this present contribution deals.
The cerebellum and visual awareness

A long-standing view of cerebellar function concerns its ability to free the forebrain from the detailed calculation required to generate accurate movement.
*Corresponding author. Tel.: +44-(0)114-222-6521; Fax: +44-(0)114-276-6515; E-mail: p.dean@sheffield.ac.uk DOI: 10.1016/S0079-6123(03)14400-4
An early formulation was by Brindley in 1964: ‘‘the message sent down by the fore-brain in initiating a voluntary movement is often insufficient . . . it needs to be elaborated by the cerebellum in a manner that the cerebellum learns with practice . . . The cerebellum is thus a principal agent in the learning of motor skills.’’ (Brindley, 1964). This idea has been particularly influential in guiding cerebellar modeling: ‘‘. . . the cerebellum becomes rather more than a slave which copies things originally organized by the cerebrum: it becomes an organ in which the cerebrum can set up a sophisticated and interpretative buffer language between itself and muscle. This . . . leaves the cerebrum free to handle movements and situations in a symbolic way without having continually to make the translation.’’ (Marr, 1969) p. 468. From this perspective, the cerebellum fulfils a role similar to that of a certain kind of computer operating system: easy-to-use high-level commands are translated into the requisite machine language. It is the cerebellum that makes the body user-friendly. An intuitive mapping of this idea onto the field of awareness suggests that without a cerebellum, much of our conscious thought would be spent in making sure we did not fall over, in planning how to set one foot in front of another, and in working out how to move our eyes to look at the next target of interest in the visual scene. But since the cerebellum learns to execute such skills automatically, awareness is spared the necessary detailed planning, and is at liberty to focus on our internal representations of the visual world. In reading, for example, the cerebellum allows awareness of the meaning of the text to be unsullied by complex planning of the next saccade. This is not, however, the only suggestion concerning the role of the cerebellum in awareness. A number of workers have been at pains to emphasize that the cerebellum is not only (or even primarily) involved in motor functions, but instead plays a role in the acquisition and analysis of sensory input
(Paulin, 1993; Bower, 1997). For example, the cerebellum may help to clarify whether a given stimulus results from the system’s own movements, or whether instead it is unexpected and hence of external origin (Blakemore et al., 2001; Nixon and Passingham, 2001). Thus, the cerebellum has been implicated in our inability to tickle ourselves (Weiskrantz et al., 1971; Blakemore et al., 2000). Again, mapping these notions loosely onto the field of awareness suggests that the cerebellum might act as a kind of gatekeeper which reduces the salience of stimuli that were in some sense to be expected.
Problems with models of the cerebellum

A minimal requirement for the plausibility of these suggestions about cerebellar roles in awareness is that models of the cerebellum are capable of carrying out the necessary calculations. Unfortunately, it is far from clear that this is in fact the case. As a background to understanding the problems of cerebellar models, it is helpful to recall some very basic features of the anatomy and physiology of cerebellar cortex (Eccles et al., 1967; Kandel et al., 2000).
Background to cerebellar models

Cerebellar cortex has only one type of output cell, namely the Purkinje cell (schematic in Fig. 1), distinguished by its spectacular dendritic field. Purkinje cells receive two types of excitatory inputs, delivered by mossy fiber and climbing fiber afferents to cerebellar cortex. Mossy-fiber synapses contact granule cells, the most numerous neuronal cell type in the entire brain, whose axons ascend to the surface of the cortex then bifurcate to become parallel fibers. Both ascending axons and parallel fibers form excitatory synapses on Purkinje cells, which cause the cell to fire normal (termed 'simple') spikes at tonic rates of about 100 Hz. An individual Purkinje cell will receive input from many thousands of granule cells: in contrast, it is contacted by only one climbing fiber. However, this fiber wraps itself around the dendritic tree of the Purkinje cell, forming multiple synapses that ensure the Purkinje cell fires whenever the climbing fiber does. The 'complex'
spike so produced is longer lasting than the usual simple spikes, but occurs much less frequently (about 1 Hz). Since many current cerebellar models are in effect descendants of the original models of Marr (1969) and Albus (1971), they tend to explain the above features of cerebellar cortex in similar ways (Fig. 2). (1) Decomposition of mossy-fiber inputs. The transformation of mossy-fiber input into parallel fiber
activity is seen as splitting the input signal into simpler components. These simpler components make learning easier. (2) Recombination of parallel fiber signals. Synapses between parallel fibers and Purkinje cells are seen as 'weighting' signal components. The Purkinje cell simple spike output is generated from these weighted components. (3) Weights altered by climbing fiber signals. Climbing fiber input is seen as altering the values of these weights, i.e. the parallel-fiber Purkinje-cell synapses. Climbing fiber input acts as a teaching signal, enabling the cerebellum to be involved in motor learning. This idea can in principle explain both the power of the climbing fiber input (all parallel fiber synapses must be affected) and its relative weakness (very low frequency of complex spikes, so the output of the Purkinje cell is scarcely affected).
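The interpretation sketched in points (1)-(3) and in Fig. 2 amounts to an adaptive linear filter. The following minimal Python sketch shows that structure; the tapped-delay-line decomposition, the number of components and the step size are illustrative assumptions (chosen to match the simulations described later in this chapter), not a claim about how granule cells actually recode mossy-fiber input.

import numpy as np

class MarrAlbusFilter:
    """Adaptive-linear-filter reading of Fig. 2.

    Mossy-fiber input y(t) is decomposed into delayed components y_i(t)
    (the 'parallel fibers'); each component is weighted by w_i (the
    parallel fiber-Purkinje cell synapse) and the weighted components are
    summed to give the Purkinje-cell (simple-spike) output. All signals
    are expressed as differences from their tonic levels.
    """

    def __init__(self, n_components=100, dt=0.02):
        self.dt = dt
        self.w = np.zeros(n_components)     # synaptic weights w_i
        self.y = np.zeros(n_components)     # delayed copies y_i(t) = y(t - i*dt)

    def step(self, y_t):
        """Feed in one mossy-fiber sample and return the filter output."""
        self.y = np.roll(self.y, 1)         # shift the delay line
        self.y[0] = y_t
        return float(self.w @ self.y)       # weighted recombination

The missing ingredient is step (3), the adjustment of the weights by the climbing-fiber signal, which is taken up in the following sections.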
Shortcomings of cerebellar models
Fig. 1. Highly simplified sketch of the neural circuitry of cerebellar cortex, showing only the main excitatory inputs to Purkinje cells.
Fig. 2. Interpretation of simplified cerebellar circuitry in Marr–Albus framework. Mossy-fiber input y(t) is split into components yi(t) that are conveyed by parallel fibers. Each component is weighted by wi which corresponds to the efficacy of the synapse between that parallel fiber and the target Purkinje cell. The weighted components are summed to produce Purkinje cell output. The value of each weight can be altered by climbing-fiber input e(t), which acts as a teaching signal.
Why does this type of model have problems producing the kind of cerebellar behavior required for the interactions with visual awareness described above? As far as signaling unexpected sensory events is concerned, Marr–Albus-type models have tended to concentrate on the motor aspects of cerebellar function (cf. the quotation from Marr above). Possible sensory functions of the cerebellum have to some extent been neglected. However, even within the motor domain, it is not clear whether the Marr–Albus type of model actually works. Marr expressed this problem in general terms: "In my own case, the cerebellar study . . . disappointed me, because even if the theory was correct, it did not enlighten one about the motor system — it did not, for example, tell one how to go about programming a mechanical arm." (Marr, 1982) p. 15. More particularly, a grave disadvantage of some versions of these models is that they appear to require 'motor error' as teaching signal. This is a generic problem of supervised learning algorithms, employed,
for example, with multilayer artificial neural networks. Supervision takes the form of telling the net what the difference was between its output and the correct output. In the case of motor commands, this difference (between the actual motor command and the correct command) is termed motor error. Using motor error as the teaching signal conveyed by climbing fibers allows Marr–Albus models to learn correct motor commands. Unfortunately, a motor-error signal does not exist in practice, because the system cannot know in advance what the correct motor commands should be. Perhaps not surprisingly then, experimental investigations of climbing fiber signals suggest that they are often sensory (concerning, e.g. touch, pain) rather than motor in nature (Simpson et al., 1996). How can the model learn the correct commands with only sensory information as a teaching signal?
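To make the difficulty concrete, the fragment below writes out a standard supervised (delta-rule) update for the filter sketched in the previous section. It is an illustrative sketch only: the function and variable names are hypothetical, and the line computing the teaching signal is precisely the quantity that is unavailable to the organism.

import numpy as np

def supervised_update(w, y_components, correct_command, actual_command, eta=1e-3):
    """Delta-rule update: the teaching signal is motor error.

    Motor error is the difference between the correct motor command and the
    command actually produced; it is exactly what the organism cannot know
    in advance, which is the problem discussed in the text.
    """
    motor_error = correct_command - actual_command   # <- unavailable in practice
    return w + eta * motor_error * y_components

# What climbing fibers appear to carry instead is sensory error, e.g. retinal
# slip: the consequences of an incorrect command after it has passed through
# the plant. The decorrelation rule introduced in the next section uses only
# that available signal.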
Decorrelation control as a possible solution

Decorrelation control has been suggested as a possible algorithm for the cerebellum to solve both the sensory and the motor problems (Dean et al., 2002). It replaces motor error as a climbing fiber signal by 'sensory error', that is the sensory consequences of an incorrect motor response. For example, poor aim in tennis sends the ball in an unintended direction: the difference between actual and intended direction is a form of sensory error. (Motor error would be the difference in command to the arm muscles required to move the racquet in the necessary manner for accuracy.) The crucial point about sensory error is that, in sharp contrast to motor error, it could be available to the system — visually, in the tennis example. But how could sensory error be used in learning? By definition, sensory error is caused by motor error. Values of the relevant sensory variable (e.g. in the tennis case, direction taken by ball in relation to intended direction) will therefore be correlated with preceding motor commands, if those commands are incorrect. If, however, the commands are correct, there will be no correlation between the commands and the sensory variable. In tennis, deviations between intended and actual ball flight might be caused by sudden gusts of wind, but would in that
case be uncorrelated with motor commands. The purpose of decorrelation control is therefore to remove any correlations between motor command and the variable that codes sensory error. Decorrelation control thus requires that some mossy-fiber inputs (Figs. 1 and 2) carry information relating to the motor command, for example an efference copy. It also requires climbing fibers to carry information about the undesirable sensory consequences of motor commands. Finally, it uses the following as a learning rule: (i) If parallel-fiber firing is positively correlated with climbing-fiber firing, reduce the weight of the parallel-fiber synapse with the Purkinje cell (LTD). (ii) If parallel-fiber firing is negatively correlated with climbing-fiber firing, increase the weight of the synapse (LTP). (iii) If parallel-fiber firing is uncorrelated with climbing-fiber firing, do not change the synapse. Although this rule may appear complex, its basic equation is simple:

\Delta w_i = -\beta \, e(t) \, y_i(t)    (1)
The change (Δwi) in the weight (wi) of the synapse between the ith parallel fiber and the target Purkinje cell is proportional (with learning-rate constant β) to the product of the sensory error e(t) (climbing-fiber signal) and the signal in the ith parallel fiber yi(t) (all signals expressed as differences from their tonic levels). The equation is based on Sejnowski's (1977) characterization of anti-Hebbian learning at the parallel-fiber Purkinje-cell synapse as a covariance learning rule. It can be seen that learning will stop (Δwi = 0 on average) if the expected value of the product of the climbing-fiber signal e(t) and the parallel-fiber signal yi(t) becomes zero, that is when there is no correlation between e(t) and yi(t). If the parallel-fiber input represents a component of motor command, learning will cease when that component is decorrelated from sensory error. If the decorrelation-control algorithm were to work, the cerebellum would be able to learn correct motor responses by using an available sensory signal (consequences of motor error), not the unavailable
signal of motor error itself. After learning, the sensory signal would be uncontaminated by the system’s own motor commands, and would therefore signal ‘unexpected’ sensory events. The algorithm would therefore fulfil both the putative roles of the cerebellum in relation to awareness.
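Continuing the filter sketch given earlier, the covariance rule of Eq. (1) is a one-line update (a minimal sketch; the learning-rate value is an arbitrary illustration):

def decorrelation_update(filt, e_t, beta=1e-5):
    """Anti-Hebbian update of Eq. (1): dw_i = -beta * e(t) * y_i(t).

    e_t is the climbing-fiber signal (sensory error, e.g. retinal slip) and
    filt.y holds the current parallel-fiber components y_i(t). Positive
    correlation between the two depresses the synapse (LTD), negative
    correlation potentiates it (LTP), and on average the update vanishes
    once e(t) and y_i(t) are uncorrelated, which is the stopping condition
    described in the text.
    """
    filt.w -= beta * e_t * filt.y

Whether this update actually drives the weights to values that compensate the plant is exactly the computational question addressed in the sections that follow.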
Testing decorrelation control

A model of a neural process needs to pass at least two types of test: (i) Can it carry out the required computation? (ii) Is it consistent with experimental evidence? There has been extensive debate concerning the relation of Marr–Albus-type models to the detailed anatomy and physiology of cerebellar cortex (for reviews, see Llinás and Welsh, 1993; Ito, 2001). The approach taken here is to focus on the first test, namely whether the decorrelation-control algorithm has the required computational power. This approach in effect asks the question: if the basic Marr–Albus ideas are a reasonable simplification of cerebellar physiology, would decorrelation control work? As far as the second kind of test is concerned, enquiry will be limited to the issue of whether the inputs to cerebellar cortex that are required by decorrelation control (see above) are observed experimentally. The computational problem facing the decorrelation-control algorithm is implicit in Eq. (1). Although learning will in fact cease once motor command and sensory error are decorrelated, the question is whether this state of affairs could ever be reached in practice. If in Eq. (1) the term e(t) were to refer to the difference between actual and desired cerebellar output (motor error), the learning rule would (under certain restrictions) be guaranteed to find the values of the weights wi (Fig. 2) that gave the best (least-squares) estimate of cerebellar output. However, the term e(t) in Eq. (1) in fact refers to sensory error, that is the effects of cerebellar output after it has been altered by the mechanical properties of the system under control (summarized by the term 'plant'). Cerebellar cortex does not receive the information, namely motor error, required to guarantee learning (details in Dean et al., 2002). The first test for the decorrelation-control algorithm is thus
whether it is capable of dealing with the kind of plant characteristics that have been observed experimentally.
Oculomotor plant compensation

We chose the oculomotor system to test decorrelation control on the grounds that, compared with the skeletal motor system, its mechanical properties are relatively simple, and because a great deal is now known about the anatomy and physiology of its low-level control circuitry. It appears that the inputs to this circuitry take the form of eye-velocity commands. However, ocular motoneuron output has to act on the eye muscles and orbital tissue (the 'plant' referred to above). The mechanical characteristics of the plant mean that a simple velocity command does not generate the corresponding velocity output (Carpenter, 1988). This can be seen in Fig. 3A which illustrates a very simple approximation to the oculomotor plant. Although the inertia of the globe can be ignored for most purposes, the plant still has elasticity as well as viscosity, represented in Fig. 3A by a single elastic element in parallel with the viscous element. This elasticity distorts the velocity command, as shown in Fig. 3B. Here a brief velocity command, similar to that used to produce saccades, moves the eye rapidly to a new position. But although the velocity command after the brief pulse is zero, the eye nonetheless moves, because the elastic element pulls the eye back to the primary position. Figure 3B shows the resultant exponential drift of eye position, with time constant determined by the relative values of the elasticity and viscosity. In the example illustrated, the time constant is about 200 ms. Prevention of this unwanted drift requires a mechanism for producing the desired velocity output (velocity in = velocity out). This mechanism is sometimes termed 'oculomotor plant compensation', though in the oculomotor literature it is often referred to as 'neural integration' since that is the process required for a first-order plant as illustrated in Fig. 3B. Two important features of oculomotor plant compensation qualify it as a suitable task for testing the decorrelation-control algorithm. First, there is good evidence that oculomotor plant compensation requires the cerebellum. Lesions of the
Fig. 3. (A) Simple model of oculomotor plant, consisting of an elastic element (elasticity k, with dimensions of force per unit distance) in parallel with a viscous element (viscosity b, with dimensions of force per unit velocity). The inertia of the eyeball is ignored. (B) Behavior of plant illustrated in A upon release from a position 1° from the resting position. The time course of the return to the resting position is an exponential decay, with a single time constant given by b/k (in the example shown here, 0.2 s).
cerebellum that include a particular region produce a postsaccadic drift back to the primary position similar in appearance to that shown in Fig. 3B, though with a longer time constant of about 1–2 s (Carpenter, 1972; Robinson, 1974; Zee et al., 1981; Godaux and Vanderkelen, 1984). (We use the term flocculus for this region for simplicity, though the adjacent ventral paraflocculus is also likely to be involved). Secondly, the velocity in–velocity out rule can be regarded as an example of the ‘elaboration’ of an insufficient motor command, the generic cerebellar function proposed by Brindley (1964) in the quotation given above.
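The postsaccadic drift just described is easy to reproduce numerically. The sketch below is a toy Euler simulation of the first-order plant of Fig. 3; it is not the chapter's MATLAB code, the pulse amplitude and duration are arbitrary illustrative values, and only the plant time constant follows the text.

import numpy as np

Tp = 0.2                           # plant time constant b/k (s), as in Fig. 3
dt = 0.001                         # integration step (s)
t = np.arange(0.0, 1.0, dt)

# Brief velocity command, roughly saccade-like: 100 deg/s for 20 ms.
velocity_command = np.where(t < 0.02, 100.0, 0.0)

eye_position = np.zeros_like(t)    # deg
for i in range(1, len(t)):
    # First-order plant: d(position)/dt = -position/Tp + velocity command,
    # i.e. the elastic element pulls the eye back while the command drives it.
    eye_position[i] = eye_position[i - 1] + dt * (
        -eye_position[i - 1] / Tp + velocity_command[i - 1]
    )

# Position just after the pulse, and 0.4 s later: the eye has drifted back
# toward the primary position with time constant Tp (about 200 ms), as in
# Fig. 3B, so holding eccentric gaze needs extra innervation.
print(eye_position[int(0.02 / dt)], eye_position[int(0.42 / dt)])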
Structure of model

The process of learning oculomotor plant compensation requires a source of velocity commands. A suitable source is provided by the vestibulo-ocular reflex (VOR), in which movements of the head send a velocity signal through the brainstem to the eye muscles. The goal of the reflex is to reproduce these velocity commands (with appropriate sign) so that the eyes counter-rotate to maintain stable gaze. If this goal is not achieved, the eyes move relative to the world, and so the whole image moves over the retina, a movement known as 'retinal slip'. Retinal slip is the sensory error corresponding to the
Fig. 4. Simplified model for plant compensation in the vestibulo-ocular reflex. Head velocity x(t) is processed by the filter V, then added to the output c(t) of the decorrelator (cerebellar flocculus) C. The summed signal is then passed to a brainstem controller B. The output of B is a motor command y(t), which acts on the plant P. A copy of y(t) is sent back to the cerebellum C. The effects of y(t) acting on P are added to the head velocity x(t); the difference is detected as retinal slip e(t) and sent to C. If there is no external visual signal acting on the eye, the desired value of e(t) is zero. This will occur when the effects of the eye-movement command y(t) acting on the plant P exactly match those of the head velocity x(t) (from Dean et al., 2002).
motor error in eye-movement commands for gaze stabilization. The structure of the VOR model is shown in Fig. 4, and a more detailed description is given in the Appendix. The general problem of VOR control was
simplified in three ways. First, only the horizontal reflex was considered. Second, it was assumed that each component process within the model was linear. These components are the brainstem (B), the cerebellum (C), the oculomotor plant (P), and a process (V) for transforming head velocity into a neural signal. Third, it was assumed that V was veridical (i.e. V = 1). The model of the cerebellar flocculus C received two inputs. One was a copy of the eye-movement command sent to the extraocular muscles, the other the retinal-slip signal. These are the inputs required by the decorrelation control algorithm, with the command copy as mossy-fiber input to be decorrelated from sensory error as climbing-fiber input. It is important to note the extensive anatomical and physiological evidence supporting the existence of these inputs (Lisberger and Fuchs, 1978; Miles et al., 1980; Stone and Lisberger, 1990; Büttner-Ennever and Horn, 1996; Simpson et al., 1996; Voogd et al., 1996). Moreover, experimental studies of oculomotor plant compensation in primate indicate that the process uses retinal slip, and depends upon the integrity of the flocculus (Optican and Miles, 1985; Optican et al., 1986). The internal structure of the cerebellar flocculus C was modeled as an adaptive linear filter (Widrow and Stearns, 1985), perhaps the simplest possible implementation of the Marr–Albus ideas (Gilbert, 1974; Fujita, 1982). The structure of the adaptive linear filter is as shown in Fig. 2, with the constraints that the decomposition of mossy-fiber inputs into parallel-fiber signals, and the weighted recombination of those signals were both linear processes. In the version of the model described here, the components of the mossy-fiber signal were the original motor-command signal delayed by successive amounts (0.02 s between each component, 100 components). The plant P was a first-order system with a time constant of 0.2 s, as illustrated in Fig. 3. Although this is a simple approximation to the complexities of the real plant, it has nonetheless proved very useful in a range of modeling applications (Robinson, 1981). The brainstem B, intended to represent the medial vestibular nucleus and nucleus prepositus hypoglossi, had two components (details in Appendix). Their characteristics were intended to match those displayed after lesions of the flocculus in primate
(Zee et al., 1981; Rambold et al., 2002). One was a direct pathway with a gain that accurately matched the head-velocity input to the eye-velocity output at high (>1 Hz) frequencies. Thus, the basic gain of the VOR was not stored in the flocculus itself but in the brainstem (Luebke and Robinson, 1994; McElligott et al., 1998; Rambold et al., 2002). The second component was a leaky integrator with time constant 0.5 s, to be consistent with the observation that after cerebellar inactivation the time constant of postsaccadic drift is longer than that obtained for the plant alone (Carpenter, 1972; Robinson, 1974; Zee et al., 1981; Godaux and Vanderkelen, 1984). The performance of the brainstem controller is shown in Fig. 5. The retinal slip found in response to the training stimulus (head-velocity signals with a mixture of frequencies) shows good compensation at high frequencies (Fig. 5A), and indeed the gain of the system above about 1 Hz is close to one (Fig. 5B). After a velocity-pulse input, eye position relaxes back to the primary position with a time constant of about 1 s (Fig. 5C). Finally, because the brainstem controller is insufficient on its own to produce accurate motor commands, there are indeed correlations between components of the motor command and the subsequent sensory error, namely retinal slip (Fig. 5D).
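The untrained behavior just summarized (Fig. 5) can be sketched in a few lines. The simulation below runs the loop of Fig. 4 with the cerebellar output held at zero; the discretization, the sign conventions (retinal image motion taken as the negative of gaze velocity, with the counter-rotation sign folded into the brainstem stage) and the pulse parameters are illustrative assumptions rather than details taken from the published code.

import numpy as np

dt = 0.001
t = np.arange(0.0, 3.0, dt)
Tp, Ti = 0.2, 0.5                      # plant and brainstem-integrator time constants (s)
Gd, Gi = 1.0, 1.0 / Tp                 # direct-path and integrator gains (cf. Eq. A2)

# Head-velocity pulse, equivalent to a head-position step (cf. Fig. 5C).
head_vel = np.where(t < 0.02, 100.0, 0.0)          # deg/s

z = 0.0                                # leaky-integrator state in B
eye_pos = np.zeros_like(t)             # eye-in-head position (deg)
slip = np.zeros_like(t)                # retinal slip (deg/s)
for i in range(1, len(t)):
    z += dt * (-z / Ti + head_vel[i - 1])
    y = -(Gd * head_vel[i - 1] + Gi * z)           # motor command: counter-rotation
    eye_vel = -eye_pos[i - 1] / Tp + y             # first-order plant
    eye_pos[i] = eye_pos[i - 1] + dt * eye_vel
    slip[i] = -(head_vel[i - 1] + eye_vel)         # image motion = -(gaze velocity)

# With the brainstem controller alone the gain is good at high frequencies,
# but the eye cannot hold the eccentric position reached after the pulse:
# eye_pos relaxes back over roughly a second, as in Fig. 5C, and because the
# command is systematically inaccurate it remains correlated with the
# subsequent retinal slip (the situation summarized in Fig. 5D).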
Results of decorrelation control

The effects of training the system just described with the decorrelation-control algorithm are shown in Fig. 6. Retinal slip declined rapidly at first, then more slowly (Fig. 6A), and was still continuing to decline at the end of 1000 trials of training (each trial = 5 s of colored noise head-velocity input). At this point the remaining slip was very slight (Fig. 6B), and the ability of the system to hold eccentric gaze after a velocity pulse was almost perfect (Fig. 6C). Finally, the correlations between motor-command components and sensory error had almost completely disappeared (Fig. 6D). These findings demonstrate that the decorrelation-control algorithm is capable of learning accurate velocity commands, and thus compensating for the oculomotor plant, with the particular modeling assumptions outlined in the section on model
Fig. 5. Performance of the model before training, with a first-order plant P (time constant 0.2 s). The brainstem controller B was a leaky integrator with time constant 0.5 s and accurate high-frequency gain. (A) Head velocity and retinal slip. The colored-noise head-velocity signal (root-mean-square amplitude 1°/s) produced a relatively smooth retinal slip signal. (B) The reason for the smoothing is evident from the Bode plot of VOR gain against frequency of head velocity. For frequencies above about 1 Hz the VOR gain is close to 1.0, because of the properties of the brainstem controller. (C) Eye-position response of system to a head-velocity pulse (equivalent to head-position step, and similar to a saccadic eye-movement command). The eye position returns to its initial value with a time course determined by the characteristics of both the plant and the brainstem controller. (D) The correlations present between delayed versions of the eye-movement command and retinal slip, measured over a period of 500 s (modified from Dean et al., 2002).
structure. The next test for the algorithm is whether it is robust, that is to say whether it can still cope when those assumptions are relaxed. The following assumptions were investigated. (i) There are still uncertainties about the precise characteristics of the brainstem controller B (De Zeeuw et al., 1995). We tested the extreme case of having no brainstem controller at all (i.e. B set to a gain of 1). Although learning was slow, eventual convergence was good and the asymptotic performance for both retinal slip and eccentric gaze resembled that shown in Fig. 6. Thus, the success of the decorrelation-control algorithm does not depend on the precise characteristics of the brainstem controller. (ii) The first-order plant used above is the simplest dynamical system possible. What happens
when decorrelation control is confronted with a more realistic model plant? We approached this question in two ways. First, we replaced the single-element plant of Fig. 3 with a two-element model (details in Appendix), of the kind suggested by behavioral and electrophysiological data (Optican and Miles, 1985; Optican et al., 1986; Fuchs et al., 1988; Stahl, 1992; Goldstein and Reinecke, 1994; Goldstein et al., 2000). This plant shows substantially more complex behavior and requires more sophisticated control, including a ‘slide’ of innervation after a velocity pulse (Optican and Miles, 1985; Goldstein and Reinecke, 1994; Goldstein et al., 2000). Nonetheless, the decorrelation-control algorithm was able to learn to compensate a two-element plant (Fig. 7, details in legend). Secondly, the learning
Fig. 6. Performance of model during and after training, with a first-order plant P (time constant 0.2 s) and a brainstem controller B with a leaky integrator (time constant 0.5 s) and accurate high-frequency gain. (A) Typical decline in retinal-slip amplitude with training. Root-mean-square retinal-slip amplitude, measured over a 5-s training trial as shown in Fig. 5A, plotted on a log scale against number of training trials. (B) Posttraining reduction in retinal slip (note change in scale from Fig. 5A). (C) Eye-position response of system to a head-velocity pulse. The resultant eccentric eye position is maintained. (D) The pretraining correlations between delayed versions of the eye-movement command and retinal slip have almost disappeared (modified from Dean et al., 2002).
properties of the configuration shown in Fig. 4 were analyzed mathematically (Porrill et al., 2003). The analysis revealed that the synaptic weights become more accurate as long as output errors are being made. Thus, the algorithm is guaranteed to learn to compensate for any plant (subject to certain technical limitations). The crucial point is that the system operates in ‘feedback’ mode, i.e. a copy of the motor command is fed back to the cerebellum. This general result is important, not least for the specific case of oculomotor plant compensation where a variety of data suggest that the oculomotor plant may contain at least three viscoelastic elements (Robinson, 1965; Sklavos et al., 2002). The mathematical analysis indicates that the decorrelation-control algorithm is capable of compensating for these more complex plants.
(iii) Concerns have been expressed about the capacity of the climbing-fiber pathway to convey detailed information because the maximum firing rate of an individual fiber is rather low, that is about 10 Hz. However, when the decorrelation-control algorithm was tested with a climbing-fiber signal that conveyed only the direction of retinal slip (not its magnitude), learning was still similar to that illustrated in Fig. 6. The main difference was that final performance needed to be improved slightly by reducing the learning rate (β in Eq. 1) near to convergence. (iv) A further problem with the climbing-fiber pathway is that the retinal-slip signal it delivers to the flocculus is delayed by about 100 ms (Miles, 1991). Such a delay introduces instabilities into the learning process if the training data contain frequencies higher than about 2.5 Hz (see Appendix). These instabilities can be
Fig. 7. The decorrelation-control algorithm used with a second-order plant P and a leaky-integrator brainstem controller B. (A) Learning as measured by reduction in root-mean-square retinal-slip amplitude. Note log scale on both axes. The two curves are for decorrelators with either the ‘delay’ or the ‘spectral’ set of basis functions. The latter were an orthogonal set derived from the principal components of compensated motor commands. The final performance of the trained filter was little affected by the basis functions used. (B) Pre- and posttraining retinal slip in response to a colored-noise head-velocity input. (C) Pre- and post-training Bode gains for the VOR. (D) Pre- and posttraining eye-position response to a head-velocity pulse (from Dean et al., 2002).
avoided by what has been termed an 'eligibility trace', which acts as a delay and smoothing filter to remove high frequencies from the motor-command components (details in Appendix). A variety of behavioral and electrophysiological evidence points to the existence of an eligibility trace (Raymond and Lisberger, 1998; Wang et al., 2000; Kehoe and White, 2002). (v) Finally, very little is known about the way mossy-fiber signals are decomposed into parallel-fiber components. Our use of different delays in the simulation described above is essentially an educated guess. However, by trying different schemes for decomposing signals in the adaptive linear filter, we were able to show that their main influence was on the speed with which the decorrelation-control algorithm learns, rather than its final convergence. Suitable choice of decomposition method could in fact speed learning very
considerably (Fig. 7). Suggestions that the method of decomposition can itself be influenced by learning (implemented, for example, by synaptic plasticity within the mossy fiber–granule cell complex) have been made elsewhere (Schweighofer et al., 2001). To summarize, the above results indicate that in the context of the flocculus and (linearized) oculomotor plant compensation, the decorrelation-control algorithm is an effective and robust method of ensuring that a simple velocity command into the system generates the corresponding velocity output.
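Putting the pieces above together, the sketch below runs the loop of Fig. 4 with the adaptive filter in place and the weights adjusted by Eq. (1). It is a deliberately minimal illustration of the simulation structure, not a reproduction of the published model: the head-velocity input is a smoothed-noise stand-in for the colored noise described in the Appendix, the climbing-fiber delay and eligibility trace are omitted, and the learning rate, noise parameters and sign conventions are illustrative choices that may need tuning.

import numpy as np

rng = np.random.default_rng(1)

dt = 0.02                      # simulation step (s); also the spacing of the delay line
Tp, Ti = 0.2, 0.5              # plant and brainstem-integrator time constants (s)
Gd, Gi = 1.0, 1.0 / Tp         # brainstem gains (cf. Eq. A2)
n_taps = 100                   # 2 s of delayed motor-command copies (mossy-fiber input)
beta = 1e-7                    # learning rate for Eq. (1); illustrative value

w = np.zeros(n_taps)           # parallel fiber -> Purkinje cell weights
taps = np.zeros(n_taps)        # delayed copies of the motor command y(t)

def run_trial(learn=True, trial_len=5.0):
    """One 5-s trial of the Fig. 4 loop; returns root-mean-square retinal slip."""
    global w, taps
    z = theta = head = 0.0     # brainstem integrator, eye position, head velocity
    sum_sq = 0.0
    n = int(trial_len / dt)
    for _ in range(n):
        # Smoothed noise (roughly 1 deg/s rms) standing in for the colored-noise input.
        head += dt * (-head / 0.8 + rng.normal(0.0, 11.0))
        c = float(w @ taps)                    # cerebellar (floccular) output
        drive = head + c                       # vestibular signal plus decorrelator output
        y = -(Gd * drive + Gi * z)             # brainstem B: counter-rotation command
        z += dt * (-z / Ti + drive)
        eye_vel = -theta / Tp + y              # first-order plant P
        theta += dt * eye_vel
        e = -(head + eye_vel)                  # retinal slip: image motion = -(gaze velocity)
        taps = np.roll(taps, 1)
        taps[0] = y                            # efference copy enters the delay line
        if learn:
            w -= beta * e * taps               # decorrelation rule, Eq. (1)
        sum_sq += e * e
    return np.sqrt(sum_sq / n)

for trial in range(500):
    rms_slip = run_trial()
    if trial % 50 == 0:
        print(trial, rms_slip)   # rms slip should tend to decline across trials, cf. Fig. 6A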
Decorrelation control and visual awareness

One of the roles suggested for the cerebellum in relation to awareness is that it carries out the 'elaboration' of simple motor commands issued by the forebrain, thereby freeing the forebrain's computational
resources. But it seemed that in order to learn such elaboration, cerebellar models — at least those based on the ideas of Marr and Albus — required a signal that in principle could not be available, namely motor error. However, the decorrelation control algorithm is a possible solution to this problem, since it requires an available signal of the sensory consequences of motor error, not motor error itself. The results described above indicate that for eye movements decorrelation control used by a simplified Marr–Albus model was effective in learning to compensate for a linearized oculomotor plant, thus enabling higher centers to send only simple velocity commands downstream with consequent easing of their computational load. The second role mentioned above for the cerebellum in visual awareness concerned the provision of sensory information uncontaminated by the organism’s own activity. In the case of oculomotor plant compensation the sensory signal is whole-field retinal image movement (retinal slip), potentially contaminated by inaccurate eye-movement commands. Inasmuch as decorrelation control successfully removes this contamination, any retinal slip
remaining is a genuine external signal. This can be seen in a redrawing of the VOR circuitry (Fig. 4) to emphasize its sensory-processing aspect (Fig. 8). In the redrawn version the retinal slip that would occur if the retina did not move can be considered as a sensory 'target variable'. This has two components: an external signal of interest u, combined with self-produced interference n. What the system is trying to do is move the sensor surface (i.e. the eye) so as to cancel n, leaving behind the 'real' signal u. The eye movement can thus be regarded as an estimate of that interference, n̂, and the resultant retinal slip as an estimate of the real signal, û. The more accurate the eye movement, the better the estimate û (so that if u were zero, for example, there would be no retinal slip at all). Thus, the decorrelation-control algorithm that learns to produce accurate eye movements necessarily produces a good estimate of the signal of interest. Consequently, decorrelation control is a candidate algorithm for securing both of the proposed functions of the cerebellum in visual awareness. Of course, many questions remain. One of the most important concerns movements of parts of the
Fig. 8. Redrawing of the vestibulo-ocular circuitry shown in Fig. 4 to emphasize its sensory-processing aspects. Inputs to the system are: (i) the retinal slip that would occur if the eyes remain stationary is treated as a target variable. As such it consists of an external signal of interest u(t) corrupted by additive interference n(t); and (ii) predictor variables p(t). The task of the system is to extract an estimate û(t) of the signal of interest u(t) from the target variable. It does so by subtracting from the target variable an estimate n̂(t) of the interference, in this case by physically moving the eye. Sensor output is no longer the target variable u(t) + n(t) but the estimate û(t) of the signal of interest u(t). The decorrelator must therefore learn the motor command m(t) which will act on the plant to produce the appropriate interference estimate (from Dean et al., 2002).
body other than the eyes. Unfortunately, control of multijoint movements is more complex than eye-movement control, and less is known about the anatomical details of the projections of cerebellar microzones to and from the relevant premotor circuitry in cortex, brainstem, and spinal cord. However, the mathematical analysis of decorrelation control indicated that it was in principle capable of compensating for very complex plants provided a copy of the motor command was made available to the relevant region of the cerebellum. It is therefore interesting that Eccles (1973) supposed this to be the case for motor cortex itself (the basis of his 'dynamic loop' hypothesis). More recently anatomical investigations using transneuronal transport methods have indicated that a given area of cerebral cortex which projects to cerebellar cortex via the pons receives a projection back from that selfsame region of cerebellar cortex via the thalamus. These "closed-loop circuits may be a fundamental feature of cerebellar interactions with the cerebral cortex" (Middleton and Strick, 2000, p. 240). It is possible therefore that the closed-loop arrangements required by decorrelation control are characteristic not just of eye movements but of movements in general. Further investigation of cerebro-cerebellar connectivity is but one example of the extensive work required to establish decorrelation control (or any other candidate) as the generic cerebellar method. It is of course a form of detective work, the kind of work of which, as this volume attests, Alan Cowey is a master.
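Returning to the sensory-processing reading of Fig. 8 above, the idea that decorrelation leaves behind an estimate of the external signal can be illustrated with a toy numerical example (a static, scalar caricature that is not part of the chapter's model; the signals and the single regression coefficient standing in for the decorrelator are invented for illustration):

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
u = rng.normal(size=n)            # external signal of interest (e.g. real object motion)
cmd = rng.normal(size=n)          # copy of the motor command (the predictor variable)
interference = 0.8 * cmd          # self-produced interference n(t) caused by the command

target = u + interference         # slip that would occur if the eye did not compensate

# 'Decorrelator': remove whatever part of the target can be predicted from the
# command copy (a single least-squares coefficient plays that role here).
n_hat = (target @ cmd / (cmd @ cmd)) * cmd
u_hat = target - n_hat            # residual slip after the compensating eye movement

print(np.corrcoef(u_hat, cmd)[0, 1])   # ~0: residual is decorrelated from the command
print(np.corrcoef(u_hat, u)[0, 1])     # ~1: residual is a clean estimate of u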
Appendix

The model architecture of Fig. 4 was programmed in MATLAB™. P, V, B, and C were treated as linear processes, allowing use of functions in the control system toolbox. The characteristics of the linear processes in initial training were: (i) V was a unit gain. (ii) P was a first-order plant, with the transfer function Hp(s) between eye-in-head velocity eh and motor command y given by Eq. (A1).

H_p(s) = \frac{e_h(s)}{y(s)} = \frac{s}{s + 1/T_p}    (A1)
where s denotes the Laplace complex frequency variable and Tp the time constant of the plant (= 0.2 s). (In subsequent equations with transfer functions, the argument (s) of transfer functions is omitted for simplicity.) (iii) The brainstem B had the transfer function Hb given by:

H_b = G_d + \frac{G_i}{s + 1/T_i}    (A2)
corresponding to a brainstem controller with two paths: (a) a direct path which passed the head-velocity signal to the plant with the correct gain (Gd = 1); and (b) an indirect path in which the head-velocity signal was integrated and passed to the plant also with the correct gain (Gi = 1/Tp = 5). The brainstem integrator was leaky with time constant Ti = 0.5 s. (iv) The input to the adaptive filter C was split into 100 components with delays between components of 0.02 s (2 s total). C was thus effectively a finite impulse-response filter of length 100, with output c(t) given by:
c(t) = \sum_{i=1}^{100} w_i \, y(t - 0.02\,i)    (A3)

where wi was the weight of the ith component yi(t) = y(t − 0.02i). The rule for adjusting the weights was equivalent to that given in Eq. (1) in the text. The value of the learning-rate constant β in that equation was adjusted to give rapid learning without instability. The training input to the system was a head-velocity signal modeled as colored noise with unit power. The power had its peak value at 0.2 Hz, then varied with increasing frequency f as 1/f (as would occur if white-noise head acceleration were integrated to head velocity). For efficiency weight update was implemented in batch mode using 5 s batches of head-velocity data. After training with the basic system described above, a number of variants were investigated. (i) Variants of B: The integrator pathway was removed (Eq. A2, with Gi = 0).
(ii) Variants of P: A second-order version of P was used with transfer function Hp given by:

H_p = \frac{s(s + 1/T_z)}{(s + 1/T_1)(s + 1/T_2)}    (A4)
where T1 = 0.37 s, T2 = 0.057 s, Tz = 0.2 s, taken from Stahl's estimate (Stahl, 1992, p. 361) of the best-fit two-pole one-zero transfer function (for eye position from eye-movement command) to the data of Fuchs et al. (1988). This plant was combined with a leaky undergained integrator (Eq. A2, with Gi = 5.05, Ti = 0.5). (iii) Learning rule: The learning rule was changed from that shown in Eq. (1) to:

\Delta w_i = -\beta \, \mathrm{sign}[e(t)] \, y_i(t)    (A5)

and used to train an adaptive filter C with a first-order plant (Eq. A1) and a leaky undergained brainstem controller (Eq. A2, Gi = 2.5, Ti = 0.5). (iv) Delay: The retinal-slip signal arriving at C was delayed by d = 100 ms. The system was trained with a first-order plant (Eq. A1) and a leaky undergained brainstem controller (Eq. A2, with Gi = 2.5, Ti = 0.5). It was found that the delay caused unstable learning if the input to C contained frequencies above 1/4d (at these frequencies the input becomes >90° out of phase with the retinal-slip signal). The components yi(t) were therefore convolved with an 'eligibility trace' r(t). The equation for the eligibility trace was taken from Eqs. (11) and (12) of Kettner et al. (1997):

r(t) \propto t \, e^{-t/t_{peak}}    (A6)

where tpeak was set to 0.1 s. (v) Basis functions: The different delays used as basis functions for the mossy-fiber input y(t) were subsequently replaced by alternative functions. These included sine waves of different frequencies and decaying exponentials of different time constants, as well as basis functions that were orthogonalized with respect to the motor commands themselves. One method of achieving this was by spectral decomposition, in which the motor outputs for a perfectly compensated first-order plant were subjected to principal component analysis. The 100 eigenvectors derived from the analysis were then used as basis functions. Learning was examined for the second-order plant with leaky undergained brainstem controller (variant 2 above).

Acknowledgments

Support for this work was provided by the Biotechnology and Biological Sciences Research Council (BBSRC). J.V.S. was the recipient of a Wellcome mathematical biology fellowship.
References

Albus, J.S. (1971) A theory of cerebellar function. Math. Biosci., 10: 25–61. Blakemore, S.J., Wolpert, D. and Frith, C. (2000) Why can't you tickle yourself? Neuroreport, 11: R11–R16. Blakemore, S.J., Frith, C.D. and Wolpert, D.M. (2001) The cerebellum is involved in predicting the sensory consequences of action. Neuroreport, 12: 1879–1884. Bower, J.M. (1997) Control of sensory data acquisition. Int. Rev. Neurobiol., 41: 489–513. Brindley, G.S. (1964) The use made by the cerebellum of the information that it receives from sense organs. IBRO Bull., 3: 80. Büttner-Ennever, J.A. and Horn, A.K.E. (1996) Pathways from cell groups of the paramedian tracts to the floccular region. In: Highstein S.M., Cohen B. and Büttner-Ennever J.A. (Eds.), New Directions in Vestibular Research. New York Academy of Sciences, New York, pp. 532–540. Carpenter, R.H.S. (1972) Cerebellectomy and the transfer function of the vestibulo-ocular reflex in the decerebrate cat. Proc. R. Soc. Ser. B, 181: 353–374. Carpenter, R.H.S. (1988) Movements of the Eyes. Pion, London. Cowey, A., Parkinson, A.M. and Warnick, L. (1975) Global stereopsis in rhesus monkeys. Q. J. Exp. Psychol., 27: 93–109. Dean, P., Porrill, J. and Stone, J.V. (2002) Decorrelation control by the cerebellum achieves oculomotor plant compensation in simulated vestibulo-ocular reflex. Proc. R. Soc. Ser. B, 269: 1895–1904. De Zeeuw, C.I., Wylie, D.R., Stahl, J.S. and Simpson, J.I. (1995) Phase relations of Purkinje cells in the rabbit flocculus during compensatory eye movements. J. Neurophys., 74: 2051–2064.
Eccles, J.C. (1973) The Understanding of the Brain. McGraw-Hill, New York. Eccles, J.C., Ito, M. and Szentágothai, J. (1967) The Cerebellum as a Neuronal Machine. Springer-Verlag, Berlin. Fuchs, A.F., Scudder, C.A. and Kaneko, C.R.S. (1988) Discharge patterns and recruitment order of identified motoneurons and internuclear neurons in the monkey abducens nucleus. J. Neurophys., 60: 1874–1895. Fujita, M. (1982) Adaptive filter model of the cerebellum. Biol. Cybern., 45: 195–206. Gilbert, P.F.C. (1974) A theory of memory that explains the function and structure of the cerebellum. Brain Res., 70: 1–8. Godaux, E. and Vanderkelen, B. (1984) Vestibulo-ocular reflex, optokinetic response and their interactions in the cerebellectomized cat. J. Physiol., 346: 155–170. Goldstein, H. and Reinecke, R. (1994) Clinical applications of oculomotor plant models. In: Fuchs A.F., Brandt T., Büttner U. and Zee D. (Eds.), Contemporary Ocular Motor and Vestibular Research: A Tribute to David A. Robinson; International Meeting, Eibsee 1993. Georg Thieme Verlag, Stuttgart, pp. 10–17. Goldstein, H.P., Bockisch, C.J. and Miller, J.M. (2000) Muscle forces underlying saccades. Invest. Ophthal. Vis. Sci., 41: S315. Ito, M. (2001) Cerebellar long-term depression: characterization, signal transduction, and functional roles. Physiol. Rev., 81: 1143–1195. Kandel, E.R., Schwartz, J.H. and Jessell, T.M. (2000) Principles of Neural Science. McGraw-Hill, New York. Kehoe, E.J. and White, N.E. (2002) Extinction revisited: similarities between extinction and reductions in US intensity in classical conditioning of the rabbit's nictitating membrane response. Anim. Learn. Behav., 30: 96–111. Kettner, R.E., Mahamud, S., Leung, H.C., Sitkoff, N., Houk, J.C., Peterson, B.W. and Barto, A.G. (1997) Prediction of complex two-dimensional trajectories by a cerebellar model of smooth pursuit eye movement. J. Neurophys., 77: 2115–2130. Lisberger, S.G. and Fuchs, A.F. (1978) Role of primate flocculus during rapid behavioral modification of vestibuloocular reflex. II. Mossy fiber firing patterns during horizontal head rotation and eye movement. J. Neurophys., 41: 764–777. Llinás, R. and Welsh, J.P. (1993) On the cerebellum and motor learning. Curr. Opin. Neurobiol., 3: 958–965. Luebke, A.E. and Robinson, D.A. (1994) Gain changes of the cat's vestibulo-ocular reflex after flocculus deactivation. Exp. Brain Res., 98: 379–390. Marr, D. (1969) A theory of cerebellar cortex. J. Physiol., 202: 437–470. Marr, D. (1982) Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman, San Francisco.
McElligott, J.G., Beeton, P. and Polk, J. (1998) Effect of cerebellar inactivation by lidocaine microdialysis on the vestibuloocular reflex in goldfish. J. Neurophys., 79: 1286–1294. Middleton, F.A. and Strick, P.L. (2000) Basal ganglia and cerebellar loops: motor and cognitive circuits. Brain Res. Rev., 31: 236–250. Miles, F.A. (1991) The cerebellum. In: Carpenter R.H.S. (Ed.), Eye Movements. MacMillan Press, Basingstoke, pp. 224–243. Miles, F.A., Fuller, J.H., Braitman, D.J. and Dow, B.M. (1980) Long-term adaptive changes in primate vestibuloocular reflex. III. Electrophysiological observations in flocculus of normal monkeys. J. Neurophys., 43: 1437–1476. Nixon, P.D. and Passingham, R.E. (2001) Predicting sensory events: the role of the cerebellum in motor learning. Exp. Brain Res., 138: 251–257. Optican, L.M. and Miles, F.A. (1985) Visually induced adaptive changes in primate saccadic oculomotor control signals. J. Neurophys., 54: 940–958. Optican, L.M., Zee, D.S. and Miles, F.A. (1986) Floccular lesions abolish adaptive control of post-saccadic ocular drift in primates. Exp. Brain Res., 64: 596–598. Paulin, M.G. (1993) The role of the cerebellum in motor control and perception. Brain Behav. Evol., 41: 39–50. Porrill, J., Dean, P. and Stone, J.V. (2003) Recurrent cerebellar architecture solves the motor error problem: application to 3-D VOR. Program No. 882.12. 2003 Abstracts Viewer/Itinerary Planner. Society for Neuroscience, Washington, DC. Rambold, H., Churchland, A., Selig, Y., Jasmin, L. and Lisberger, S.G. (2002) Partial ablations of the flocculus and ventral paraflocculus in monkeys cause linked deficits in smooth pursuit eye movements and adaptive modification of the VOR. J. Neurophys., 87: 912–924. Raymond, J.L. and Lisberger, S.G. (1998) Neural learning rules for the vestibulo-ocular reflex. J. Neurosci., 18: 9112–9129. Robinson, D.A. (1965) The mechanics of human smooth pursuit eye movement. J. Physiol., 180: 569–591. Robinson, D.A. (1974) The effect of cerebellectomy on the cat's vestibulo-ocular integrator. Brain Res., 71: 195–207. Robinson, D.A. (1981) Models of the mechanics of eye movements. In: Zuber B.L. (Ed.), Models of Oculomotor Behaviour. CRC Press, Boca Raton, FL, pp. 21–41. Schweighofer, N., Doya, K. and Lay, F. (2001) Unsupervised learning of granule cell sparse codes enhances cerebellar adaptive control. Neuroscience, 103: 35–50. Sejnowski, T.J. (1977) Storing covariance with nonlinearly interacting neurons. J. Math. Biol., 4: 303–321. Simpson, J.I., Wylie, D.R. and De Zeeuw, C.I. (1996) On climbing fiber signals and their consequence(s). Behav. Brain Sci., 19: 384–398. Sklavos, S., Gandhi, N.J., Sparks, D.L., Porrill, J. and Dean, P. (2002) Mechanics of oculomotor plant estimated from effects of abducens microstimulation. In 2002 Abstract
Viewer, Society for Neuroscience, Washington, DC, Program No. 463.5. Stahl, J.S. (1992) Signal Processing in the Vestibulo-ocular Reflex of the Rabbit. PhD Thesis, New York University. Stone, L.S. and Lisberger, S.G. (1990) Visual responses of Purkinje cells in the cerebellar flocculus during smooth-pursuit eye movements in monkeys. II. Complex spikes. J. Neurophys., 63: 1262–1275. Voogd, J., Gerrits, N.M. and Ruigrok, J.H. (1996) Organization of the vestibulocerebellum. In: Highstein S.M., Cohen B. and Büttner-Ennever J.A. (Eds.), New Directions in Vestibular Research. New York Academy of Sciences, New York, pp. 553–579.
Wang, S.S.-H., Denk, W. and Häusser, M. (2000) Coincidence detection in single dendritic spines mediated by calcium release. Nat. Neurosci., 3: 1266–1273. Weiskrantz, L., Elliott, J. and Darlington, C. (1971) Preliminary observations of tickling oneself. Nature, 230: 598–599. Widrow, B. and Stearns, S.D. (1985) Adaptive Signal Processing. Prentice-Hall Inc., Englewood Cliffs, NJ. Zee, D.S., Yamazaki, A., Butler, P.H. and Gücer, G. (1981) Effects of ablation of flocculus and paraflocculus on eye movements in primate. J. Neurophys., 46: 878–899.
SECTION II
Cortical Visual Systems
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved
CHAPTER 5
Some effects of cortical and callosal damage on conscious and unconscious processing of visual information and other sensory inputs
Giovanni Berlucchi*
Dipartimento di Scienze Neurologiche e della Visione, Università di Verona, I-37134, Verona, Italy
Abstract: Although new methods of investigation from the molecular level to cognition are promoting major advances in the study of the functions of the human brain, the analysis of behavioral and psychological deficits following brain damage is still a major tool for the understanding of cerebral organization. The present paper reviews some aspects of work on functional losses and residual abilities following cortical damage that have made it possible to distinguish conscious and unconscious levels of visual input processing. Attention is given to the possible contribution of residual conscious vision of color to unconscious form analysis in visual agnosia. The paper also reviews findings on temporary and permanent deficits that occur after selective lesions of a prominent input–output system of the cerebral cortex, the corpus callosum, with the aim of assessing the possibility of establishing a functional callosal topography.
The work of Alan Cowey is noted for its achievements on the path to a thorough understanding of the relations between specific neural structures and specific cognitive functions. In the author’s contribution to this Festschrift honoring him, he summarizes the results of studies that his colleagues and he have carried out in the past 3 or 4 years in an attempt to link particular forms of brain damage with loss or preservation of given higher-order neural functions. The two main topics the author will deal with belong to areas of neuroscience to which Alan has made lasting contributions: blindsight and callosal hemispheric interactions. The conceptual link between these two approaches is that they both aim at understanding cognitive and behavioral functions of the cortex as inferred from the functional changes following direct lesions of specific cortical regions or the interruption of specific corticofugal and corticopetal
pathways in the corpus callosum. In the former case one can assess functional losses and functional sparings or recoveries after circumscribed or diffuse damage to the cortex. In the latter case it is possible to observe the behavioral and cognitive consequences of the loss of interhemispheric interactions between given cortical areas, as well as the differential effects exerted on cognition and behavior by cortical centers in each hemisphere that have become functionally autonomous because of their reciprocal disconnection.
Interactions between blindsight and conscious visual awareness
The term blindsight was originally coined to denote the ability of patients with primary visual cortex lesions to emit adequate behavioral reactions to visual inputs from the supposedly blind contralesional part of their visual field, in the face of their proclaimed unawareness of those inputs (Weiskrantz, 1986; Stoerig and Cowey, 1997). The term is now used to refer to a range of behaviors guided by visual
*Corresponding author. Dipartimento di Scienze Neurologiche e della Visione, Sezione Fisiologia Umana, Università di Verona, Strada Le Grazie, 8, I-37134, Verona, Italy. Tel.: +39-045-8027141; Fax: +39-045-580881 E-mail:
[email protected] DOI: 10.1016/S0079-6123(03)14400-5
cues that do not give rise to conscious visual perceptions, as can be observed in patients with various neurological disorders, and even in normal subjects submitted to specific kinds of visual stimulation (Milner, 1995; Milner and Goodale, 1995; Kolb and Braun, 1995; Driver and Mattingley, 1998; Marcel, 1998; Savazzi and Marzi, 2002). Although the phenomena subsumed under the term blindsight may occur in apparently identical fashion in these various cases, the underlying neural mechanisms may be different in different conditions, and our understanding of them is still largely incomplete.
Blindsight may influence conscious vision
Most reports of blindsight have come from studies of patients with unilateral brain damage whose behavioral responses to visual stimuli from a contralesional field affected by hemianopia, extinction or neglect can be compared and contrasted with their responses to visual signals from the normal ipsilesional field. Evidence for blindsight is based not only on overt behavioral responses to otherwise unperceived visual stimuli, but also on the influence that such unperceived stimuli can exert on the processing of other visual stimuli that have access to consciousness. One search for the latter influence has involved the analysis of modulatory effects of light stimuli from the impaired field on speed of reaction to light stimuli from the good field (Marzi et al., 1986; Corbetta et al., 1990). As an example, the latency of detection of a simple flash stimulus in the intact visual field ipsilateral to a complete hemispherectomy can be significantly decreased by the simultaneous presentation of an identical flash stimulus in the opposite hemianopic field, even though the latter stimulus is incapable of eliciting any overt response, whether associated with awareness or unaccompanied by it (Tomaiuolo et al., 1997). In another approach, the discrimination of patterned stimuli in the intact visual field of patients with severe contralateral neglect has been shown to be aided by the previous presentation in the neglected field of stimuli identical to or belonging in the same category as the target, notwithstanding the patients' consistent denial of the occurrence
of the facilitating stimuli. In contrast, no facilitation was obtained with stimuli physically and categorically unrelated to the targets (Berti and Rizzolatti, 1992). Yet another approach was employed by Danckert et al. (1998) with a patient with a hemianopia contralateral to a one-sided occipital lesion who consistently denied seeing letter and color stimuli presented in his hemianopic field. Nevertheless, the patient’s reaction time for verbal identification of letter or color stimuli presented at the fixation point was influenced by flanker letter or color stimuli simultaneously presented in the intact or the hemianopic field. Reaction time was longer when the letter or color flankers differed from the corresponding letter or color targets, as compared to when flankers and targets were the same, or when there were no flankers. The effect was obtained with flankers presented in either visual field, implying that both color and letter information was unconsciously processed in the hemianopic field up to a degree that could interfere with the processing of concurrent central targets. Finally, discrimination of the emotional expression of a half face in the intact visual field was found to be facilitated by the simultaneous presentation of a half face with a congruent expression to the blind field, whereas discrimination of the expression of whole faces in the intact field was interfered with by incongruent facial expressions presented in the blind field (De Gelder et al., 2001). All these results speak strongly in favor of the existence of a variety of interhemispheric effects of unseen inputs from the blind field on the processing of inputs from the intact field that can enter consciousness.
Conscious vision may influence blindsight
A few other studies have been aimed at exploring possible converse effects of conscious visual processes on blindsight phenomena. There is evidence to suggest that blindsight can be enhanced, or even result in conscious experiences, under the influence of visual information processed by intact brain systems. As a counterpart to facilitation of reactivity to stimuli in an intact visual hemifield by stimuli in an impaired hemifield, unconscious detection of light stimuli in an impaired hemifield can be facilitated by concurrent
stimuli in the intact hemifield (Ward and Jackson, 2002). After being presented with unilateral or bilateral visual stimuli, a patient with unilateral damage to the primary visual cortex indicated his detection of right, left or bilateral stimuli by making unspeeded choice key-pressing responses, and also reported his awareness of the stimuli. Although he never reported awareness of stimuli in the blind field, his manual responses evinced an imperfect, but clearly above-chance detection of such stimuli, i.e. blindsight, when such stimuli were presented alone. The finding relevant here was that his detection rate of stimuli in the blind field almost doubled when they were presented along with stimuli in the intact field. The latter stimuli were consistently detected, regardless of the presence or absence of a contralateral stimulus. The same patient was studied by Kentridge et al. (1999) in a spatial two-alternative forced choice discrimination in the contralesional and ipsilesional visual fields. Above-chance performance in the contralesional field could be dissociated from stimulus awareness, and could be significantly improved by cues that signaled the time of occurrence of the stimuli for discrimination. By being presented at the fixation point, such temporal cues had direct access to the intact hemisphere, and thus presumably to conscious processes of attentional control that could influence the decision underlying the blindsight performance.
Conscious vision in a hemianopic field Torjussen (1978) found in three hemianopic patients that unperceived stimuli flashed in the hemianopic field did not produce after-images; yet the patients experienced veridical bilateral after-images when exposed to bilateral complementary stimuli. Conscious experience of the half after-image in the hemianopic field could not be attributed to a completion effect, because no such completion was reported with stimuli restricted to the good field, which gave rise to after-images also strictly localized to that field. These findings were confirmed and extended by Marcel (1998) in two patients with unilateral hemianopia, who not only had conscious experiences of bilateral after-images generated by good Gestalten crossing the midline, but also
consciously saw complete figures with illusory contours partly lying in the hemianopic field. Conscious vision of the Gestalt's part lying in the hemianopic field is probably made possible by complementary visual inputs to the damaged hemisphere from the intact hemisphere. That the intact hemisphere can relay visual inputs to the damaged hemisphere is suggested by the finding that extrastriate visual areas in the latter hemisphere are activated by appropriate stimuli presented in either hemifield (Goebel et al., 2001). Instances of conscious vision of stimuli in a hemianopic field, especially moving stimuli, have also been reported in a patient who lost the primary visual cortex at an age that may have allowed a substantial reorganization of his visual system (Stoerig and Cowey, 1997; Sahraie et al., 1997; Zeki and ffytche, 1998; Stoerig and Barth, 2001).
Blindsight and visual agnosia Patients with severe visual agnosia caused by diffuse cortical damage can exhibit visually guided behaviors that fit the definition of blindsight. Analyses of such cases have emphasized the dissociations, rather than the possible interactions, between a severely impaired conscious vision and the unconscious guidance of action towards visual targets, thus offering a starting point for general hypotheses about an at least partial separation between the neural substrates for perception and those for action (Milner, 1995; Milner and Goodale, 1995). The hallmark of cortical visual agnosia is usually a profound inability to identify and discriminate visual objects, whereas perception of color and visual motion, as well as general visual imagery, can often be preserved to at least some degree (Milner and Goodale, 1995; Servos and Goodale, 1995; Zeki et al., 1999). In one of these cases, Aglioti et al. (1999) have investigated whether preserved color vision, in addition to providing cues that help the patient to arrive at conscious inferences about the nature of visual objects, can by itself bring out blindsight responses to visual shapes. The study was performed on a patient suffering from a dense apperceptive agnosia for visual shapes and objects associated with a bilateral parieto-occipital atrophy. The brain damage and the resulting
visual agnosia were the consequences of a prolonged cardio-respiratory arrest sustained in the course of an endoscopic extraction of a foreign body from the trachea. The patient's most conspicuous deficit consisted in a complete incapacity to identify and discriminate even simple visual shapes and objects, such as single large black letters presented against a white background. When repeatedly tested in the latter task, his performance never deviated from chance, and eventually he asked to be spared such a futile exercise. In contrast, his color perception was nearly normal and he consistently reported a distinct awareness of the color stimuli that he was asked to point to or name. Similarly intact was his visual imagery, as assessed by his good ability to write and draw from memory, in striking contrast with his failure to copy drawings and verbal material, and to read what he had written minutes beforehand. The purpose of our study was to use a modified version of the Stroop test in order to assess whether the patient's fully conscious chromatic vision could bear out some latent, implicit or explicit capacity for the processing of visual shape (Aglioti et al., 1999).
The Stroop test with single letters In the standard version of the Stroop test, normally seeing subjects are comparatively fast in naming the color in which a word is written if the word matches the name of the color, and comparatively slow if the word denotes a competing color. The effect is best accounted for by a parallel distributed processing model whereby the two pathways, one for color processing and the other for word processing, converge onto a shared response mechanism. In the case of incongruency between the information in the color pathway and the word pathway, the well learned and presumably automatic tendency to read the word is bound to interfere with the production of the competing color-naming response (MacLeod, 1991). Recently it has been reported that the Stroop effect is diminished if only one letter selected at random in the presented color name word is colored (Monahan, 2001), but work by Regan (1978) had previously shown that a robust Stroop effect could be obtained with the presentation of appropriately colored single-letter stimuli corresponding to the
initials of color names. We have used a simplified version of Regan’s (1978) task for testing first normal observers, and then the agnosic patient. The test involved repeated discriminations between two colors, red and green, and two letters, a capital R and a capital V. These two letters are the initials of the Italian words ‘rosso’ for ‘red’ and ‘verde’ for ‘green’ (all normal controls as well as the patient were Italian by birth and upbringing). The R letters were 5.2 cm high and 3.5 cm wide; the V letters were 5.5 cm high, and their width was 4 cm at the top and 0.6 cm at the bottom. Four stimuli, a red R, a green V, a green R and a red V, were presented one at a time in random sequences on the screen of a computer. The observer, positioned at a distance of 40 cm from the screen, was instructed to fixate the screen center at the beginning of each trial, but was allowed to move head and eyes when the stimulus was present. The stimulus remained on until the observer’s response. The letter discrimination task required speeded choice key-pressing responses, with one key to the letter R and another key to the letter V, regardless of their color. The color discrimination task similarly required speeded choice key-pressing responses to the red color and the green color, regardless of whether the colors were carried by an R or a V. A forced-response paradigm was used throughout, and no feedback about accuracy and speed of performance was provided. A computer controlled the sequences of stimulus presentation and recorded response accuracy and speed. There were three sessions of 40 trials each for both the color discrimination and the letter discrimination, and the order of the two types of discrimination was counterbalanced across sessions. The succession of stimuli was random with the constraint that in each session there were 20 congruent stimuli (10 red Rs and 10 green Vs) and 20 incongruent stimuli (10 green Rs and 10 red Vs). Eight normal observers matched for age with the patient showed no differences in accuracy between tasks and between congruent and incongruent stimuli. In the color task, the accuracy was 98.2% for congruent stimuli, and 96.1% for incongruent stimuli. In the letter task the accuracy was 97.2% for congruent stimuli and 96.8% for incongruent stimuli. In both tasks, t-tests for matched pairs indicated that the difference in accuracy between congruent and incongruent stimuli
fell far from statistical significance. However, in typical Stroop-effect fashion, the color task yielded RTs that were significantly longer for incongruent stimuli (mean RT 415.3 ms) than congruent stimuli (mean RT 393.5 ms), a significant difference by a t-test for matched pairs. No significant difference was found in the letter task, where mean RT was 424.3 ms for congruent stimuli and 429.2 ms for incongruent stimuli. In confirming a Stroop-like effect based on the initials of color names (Regan, 1978), the findings suggest that the initials can induce an automatic activation of the representations of the corresponding color names, with the result that responses to namecongruent colors are expedited and responses to name-incongruent colors are retarded.
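For readers who wish to retrace the design and the analysis, the following sketch builds one session's constrained random sequence (20 congruent and 20 incongruent stimuli, 10 of each of the four color–letter combinations) and applies a matched-pairs t-test to per-observer mean reaction times. The reaction-time values and function names are simulated placeholders chosen only to echo the reported group means; the design constraints and the matched-pairs test are the only elements taken from the description above.

```python
import random
import numpy as np
from scipy import stats

def make_session(seed=None):
    """One 40-trial session: 10 of each stimulus, in random order.
    Congruent: red R, green V; incongruent: green R, red V."""
    rng = random.Random(seed)
    trials = ([("red", "R")] * 10 + [("green", "V")] * 10 +      # congruent
              [("green", "R")] * 10 + [("red", "V")] * 10)       # incongruent
    rng.shuffle(trials)
    return trials

def is_congruent(color, letter):
    return (color, letter) in {("red", "R"), ("green", "V")}

# Simulated per-observer mean RTs (ms) in the color task for eight observers;
# the offsets roughly echo the reported means of 393.5 ms (congruent) and 415.3 ms (incongruent).
rng = np.random.default_rng(0)
congruent_rt = 393.5 + rng.normal(0.0, 15.0, size=8)
incongruent_rt = congruent_rt + 21.8 + rng.normal(0.0, 10.0, size=8)

t_value, p_value = stats.ttest_rel(incongruent_rt, congruent_rt)   # t-test for matched pairs
print(f"t(7) = {t_value:.2f}, p = {p_value:.4f}")
```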
The simplified Stroop test with the visually agnosic patient When performing the two tasks, the agnosic patient consistently reported that he could perceive the colors but not the letters, and that he felt that his forced responses to the letter stimuli were the result of mere guessing. In the color task there were no statistical differences in either accuracy or RT between congruent and incongruent stimuli. Accuracy was 95.8% with congruent stimuli and 97.5% with incongruent stimuli, and RT was 1.5 s with congruent stimuli and 1.4 s with incongruent stimuli. In contrast, in the letter task accuracy was at chance with the incongruent stimuli (50.8%), as it had been in many previous tests of letter recognition, but it was clearly and significantly above chance with the congruent stimuli. The accuracy advantage for congruent over incongruent stimuli was 23.4% with the letter R and 20% with the letter V. Overall RT of correct responses was significantly longer for incongruent stimuli (mean 8.6 s) than for congruent stimuli (mean 6.8 s). The RT advantage for congruent over incongruent stimuli was 0.9 s for the letter R and 2.7 s for the letter V. RTs for incorrect responses were not significantly different for congruent and incongruent pairings (10.4 s vs. 9.2 s). These results with the agnosic patient differed markedly from those of normal subjects on at least three counts. First, unlike the normal controls, the patient did not exhibit a standard Stroop effect insofar as his performance
on the color discrimination was unaffected by the letter stimuli. This result was to be expected on the basis of the patient’s preserved color perception and his general inability to recognize letter stimuli in many previous clinical and experimental tests. Even assuming a latent implicit potential for processing the letter stimuli, such potential would be preempted from influencing color discrimination by the fast processing of the color stimuli. Second, the normal controls did not show any interference of the color stimuli on accuracy and speed of the letter discrimination, supposedly because reading of a single letter can easily take precedence over any automatic color processing. In contrast, the accuracy of the patient’s responses to letter stimuli showed a clear effect of the congruency or incongruency of such stimuli with the letters to be discriminated. As could be expected from many previous tests, the patient’s performance did not show any evidence of letter discrimination with color-incongruent letter stimuli, but successful discrimination clearly emerged with color-congruent letter stimuli. That such performance reflected a real potential for letter discrimination was supported by the faster response speed to color-congruent than to color-incongruent letter stimuli. Third, throughout the testing the patient consistently reported that he had no conscious awareness of the letter presented on any given trial, so that his better-than-chance ability to discriminate such stimuli and his greater speed of response to color-congruent than color-incongruent letters could legitimately be defined as blindsight.
The role of visual imagery To account for this finding, we proposed that the patient’s preserved visual imagery would allow perceived colors to activate an orthographic representation of the corresponding word name. According to McClelland and Rumelhart’s (1981) computational model of interactive letter and word perception, this word name representation would in turn activate the orthographic representation of the word’s component letters. Viewed as a top-down influence, this activation would act on any existing ability for visual processing of letters by giving an advantage to inputs consistent with the activated letter representation over inputs inconsistent with it. It thus seems possible
that in the case of an input consistent with the activated letter representation, the advantage afforded to that input could bear out a partial residual ability of the patient for the implicit processing of letters. Inconsistency between the input and the activated letter representation, and the resulting absence of a top-down support would preclude the emergence of a successful letter discrimination even with the protracted processing attested by the patient’s very long response times. In the letter discrimination the patient’s reaction time was indeed longer with incongruent pairings, but even in the case of congruent pairings reaction time was much slower than it was in the color discrimination task. This difference strongly suggests that the proposed activation of the color name representation and its initial letter by the color stimuli lagged behind the patient’s color perception that was fully available to his awareness. If so, the patient’s discrimination of letters of which he was consistently unaware qualifies as a form of blindsight that was contingent on normal vision, i.e. vision which is ordinarily conscious, and was thus very different from reported facilitations of object or shape recognition that rely on the direct generation of shape from color or wavelength.
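The top-down account proposed here can be illustrated with a toy signal-detection model: a weak bottom-up letter signal is summed with a bias contributed by the color-activated letter representation, so that the R/V choice rises above chance only when the two agree. All parameter values and the noise model are illustrative assumptions, not quantities taken from McClelland and Rumelhart's (1981) network or from the patient's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 10_000

bottom_up = 0.4    # weak residual letter signal assumed to survive the agnosia
top_down = 0.4     # assumed bias from the color-activated letter representation
noise_sd = 1.0     # trial-to-trial noise

def accuracy(congruent: bool) -> float:
    """Proportion of correct letter choices when the top-down bias points toward
    (congruent) or away from (incongruent) the presented letter."""
    bias = top_down if congruent else -top_down
    evidence = bottom_up + bias + rng.normal(0.0, noise_sd, n_trials)
    return float(np.mean(evidence > 0.0))

# With these values the congruent condition is well above chance while the
# incongruent condition stays near chance, echoing the pattern reported for the patient.
print("congruent accuracy:  ", round(accuracy(True), 3))
print("incongruent accuracy:", round(accuracy(False), 3))
```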
Speculations about the possible neural substrates of the interactions between ‘normal’ vision and blindsight In commenting on the results of Aglioti et al. (1999), Danckert and Goodale (2000) suggested that intact visual functions may aid the processing of visual information transmitted by a damaged system in two ways. Intact visual processes may either enhance the weak signals conveyed by the system, as proposed by Aglioti et al. (1999), or facilitate the access to such signals by another processing system, resulting in blindsight or even conscious experiences. The available evidence is insufficient to decide between these two possibilities, but they may be examined and discussed in the light of a few recent studies that bear on various forms of interaction between blindsight and normal vision. Suzuki and Yamadori (2000) have reported a surprising dissociation between letter reading and awareness of the form of letter stimuli. A Japanese woman with a lesion of the lower bank of
the calcarine fissure in the left hemisphere could read aloud kana and kanji characters and Arabic numerals in the scotomatous part of her right visual field, although she claimed that she perceived the stimuli as simple light flashes and denied any visual awareness of their form. According to the authors, the damage to the visual system reduced signal processing to a degree that was compatible with a reading vocal response, but too low for accessing consciousness. This suggestion bears on Danckert and Goodale's (2000) concept of the reduced strength of the signal within a damaged visual system, insofar as one can envisage different degrees of information processing dysfunctions, ranging from intact signal processing accompanied by awareness in the case of minimal lesions, to signal processing dissociated from awareness, or even complete functional loss in the case of more severe lesions. If it is possible for top-down controls to modulate signal strength in a damaged visual system, it would have been interesting to explore whether Suzuki and Yamadori's (2000) patient could become aware of the stimuli as a result of appropriate signal strengthening effects by attentional or arousing mechanisms. Potential top-down controls that can strengthen the signal in a damaged visual system are the so-called feedback cortico-cortical connections from higher-order visual cortical areas to lower-order ones, or even cortical projections to subcortical visual centers such as the superior colliculus. According to Lamme (2001), unconscious visually guided behaviors can be executed on the basis of entirely feedforward input–output transformations, whereas conscious vision would also require the action of feedback inputs from higher-order cortical areas to the primary visual cortex. Whereas the role of feedback cortico-cortical connections in conscious vision is compatible with data from normal and brain-damaged observers, the primary visual cortex may not be the sole recipient of these connections in the mediation of conscious vision, as suggested by the results of a recent study by Weiskrantz et al. (2002). In a patient with unilateral visual cortex damage, visual stimuli such as gratings, colors, shapes, etc. that could elicit blindsight responses (i.e. correct discriminations associated with denial of stimulus awareness) in the contralesional field could also give rise to consciously experienced negative after-images after being turned off. These delayed subjective experiences might be
accounted for by the time necessary for feedback cortico-cortical connections to act on residual substrates for conscious vision in the damaged hemisphere. In other words, feedback signals from intact higher-order cortical areas in the damaged hemisphere, or perhaps even in the intact hemisphere, might induce a delayed enhancement of information processing in cortical or subcortical centers directly targeted by the original signal. The study by Goebel et al. (2001) did not find activation of the intact hemisphere by visual inputs to the damaged hemisphere, but it is possible that signals thus generated in the intact hemisphere are too weak to be detected by neuroimaging. If such cross-talk between the hemispheres is possible, and interhemispheric feedback signals from the intact to the damaged hemisphere (possibly traveling via the corpus callosum or other interhemispheric connections) were involved in the after-images experienced in the hemianopic field by the patient described by Weiskrantz et al. (2002), then it appears that such signals are apt to preserve veridical spatial relations, because the after-images were localized to the original stimulation site. Incidentally, the above-chance discrimination of color-congruent letters by the patient of Aglioti et al. (1999) cannot have depended on negative after-images, because red stimuli produce green negative after-images, and green stimuli produce red negative after-images, hence a Stroop-like effect based on negative after-images would have paradoxically consisted in a facilitated discrimination of color-incongruent letters.
Channels in the corpus callosum and the multifunctional splenium
For centuries, the corpus callosum posed a problem to the neurosciences: it is a most conspicuous feature of the brain of placental mammals, and especially of the human brain, yet it did not possess an obvious function. The problem of its functional significance was considered largely solved when Sperry and his collaborators demonstrated a major role for the corpus callosum in the interhemispheric transfer of information and in the unification of the independent cognitive domains of the two cerebral hemispheres (Sperry, 1982). Yet other issues about
callosal functions in man remain open. One question of interest to all involved in the search for anatomofunctional correlations is: can the human corpus callosum be seen as an ensemble of ‘channels’, each of which is used for the interhemispheric transmission of specific kinds of signals, from simple sensory messages and motor commands to highly digested information underlying learning, memory, thinking, emotion and so on? The corpus callosum is a cortical commissure, and to the extent that different functions can be attributed to different cortical areas, it seems logical that the callosal connections of an area or a set of areas with a specific function should subserve that same function for the purposes of interhemispheric communication.
Maps in the corpus callosum Anatomical investigations in animals have provided evidence that certain contingents of fibers belonging to different cortical areas are compartmentalized within the corpus callosum, but such compartmentalization seems far from strict and precise. Studies in cats indicate that fibers from discrete parts of the cortex disperse through large portions of the corpus callosum, where they intermix with fibers with different cortical relations and functions (Matsunami et al., 1994). In macaque monkeys, the majority of commissural fibers from a given cortical region tend to occupy a distinct location in the corpus callosum, but overlaps of callosal fibers from different cortical areas have also been noted in the body of the corpus callosum, suggesting that the anatomic segregation of functionally diversified contingents of callosal fibers is by no means complete (Pandya and Seltzer, 1986; Lamantia and Rakic, 1990). Different deficits in interhemispheric communication resulting from surgical sections of different callosal portions in various mammalian species have generally conformed with the expectations based on anatomical knowledge, but the bulk of evidence is largely restricted to impairments in visual interhemispheric transfer and splenial lesions (Berlucchi, 1990). In contrast with animal studies, anatomical evidence about the topographical organization of human corpus callosum is severely limited. Analyses of partial callosal degenerations or atrophies after
neuronal losses in select neocortical areas of the human brain do not clearly indicate that fibers with a specific function, or a specific cortical origin and destination, cross the midline within a circumscribed portion of the corpus callosum. Figs. 1 and 2 show tentative anatomo-functional maps of the human corpus callosum based on observed associations between discrete callosal lesions on one hand, and specific behavioral deficits or scanty direct anatomical knowledge on the other. The systematicity of such proposed associations is at least partly questionable on the basis of various considerations. Here the author will deal with (1) the considerable sparing of interhemispheric communication that is known to obtain following extensive callosal sections that leave the splenium intact, and (2) some notable discrepancies between different reports about the location of the callosal fibers supposedly involved in the performance of verbal dichotic listening tasks.
Fig. 1. The anatomo-functional map of the corpus callosum according to Habib (reproduced with permission from Neurochirurgie, Vol. 44, Suppl. 1, 1998, Masson Editeur).
Fig. 2. The anatomo-functional map of the corpus callosum according to Funnell et al. (2000a).
Interhemispheric communication after surgical sections of the corpus callosum that spare the splenium
Several years ago Gordon et al. (1971) reported the surprising finding that two patients submitted to section of the anterior two-thirds of the corpus callosum for relief of epilepsy did not show any of the interhemispheric disconnection deficits exhibited by epileptic patients with complete callosal sections (Bogen, 1993), or by patients with spontaneous anterior callosal lesions of vascular or tumoral origin. More specifically, the patients with surgical callosal sections sparing the splenium did not present with any of the signs of alexia in the left visual field and anomia for objects felt with the left hand that are so evident in patients with complete callosotomies. One of the callosotomy patients with a preserved splenium could even name olfactory stimuli presented to either nostril, whereas complete callosotomy appears to limit this ability to the left nostril, projecting to the speaking left hemisphere (Sperry, 1982). In emphasizing the remarkable degree to which the small intact posterior sector of the corpus callosum could help attain a near-normal interhemispheric
communication, Gordon et al. (1971) maintained that signs of interhemispheric disconnection observed after vascular or tumoral lesions of the anterior callosum may be due to the association of callosal and extracallosal lesions, an association lacking in cases with clean surgical callosal sections. According to these results, still unchallenged to the author’s knowledge, either the posterior callosum may by itself be able to sustain normal interhemispheric interactions in all or most sectors of brain activity, or it may constitute a major site of a compensatory reordering of commissural mechanisms that prevents the occurrence of interhemispheric disconnection symptoms following anterior callosotomy. While this alternative is still undecided, work from the author’s laboratory has pointed to a similar role of the posterior corpus callosum in ensuring normal interhemispheric interactions in the taste modality.
Taste and the corpus callosum
As shown by conflicting statements in anatomy and physiology textbooks, the lateral organization of the gustatory pathway in man is incompletely understood. A majority of studies support an uncrossed projection from each side of the tongue to the cortex (Norgren, 1990), but reports of an opposite crossed organization continue to appear in the neurological literature (Sánchez-Juan and Combarros, 2001). We studied the lateral organization of the gustatory pathway in eight normal controls, a man with a complete callosal agenesis, three men with a complete section of the corpus callosum, one man with a callosal section sparing the genu and the rostrum, and a man with a callosal section sparing the posterior callosum including the splenium (Aglioti et al., 2000, 2001). Sapid solutions containing one of three basic taste stimuli (sour, bitter, salty) were applied to one or the other side of the tongue, and subjects reported the taste of the stimulus either verbally or by manually pointing to the name of the taste. Since it was known that in the subjects with complete callosotomies and in the posterior callosotomy subject verbal responses to tactile and visual stimuli were possible only with inputs to the left hemisphere, it was felt that verbalization of lateralized taste stimuli could provide information on the lateral organization of the taste pathway. There
were no differences in accuracy and reaction time between the right and left hemitongues of the normal controls, in accord with more precise psychophysical tests (Kroeze, 1979; McMahon et al., 2001). Similar results were obtained with the genetically acallosal observer, in accord with the notion that an inborn lack of the corpus callosum is generally compatible with an effective cross-integration in most functions (Jeeves, 1990). By contrast, the three complete callosotomy subjects and the subject with a sparing of the rostrum and the genu (Fig. 3) showed a significant advantage of the left hemitongue over the right hemitongue for response accuracy, and in one case for speed of response as well, although performance with right stimuli was clearly above chance in all four cases. Finally, and quite surprisingly, the callosotomy subject with an intact splenium (Fig. 3) showed no differences between the two hemitongues. Aglioti et al. (2000, 2001) concluded from these results that: (1) gustatory pathways from the tongue to the cortex are bilaterally distributed, so that taste information from either side of the tongue can reach the left hemisphere in the absence of the corpus callosum; (2) the uncrossed input from the left hemitongue to the left hemisphere is actually more potent functionally than the contralateral input; and (3) in the normal brain, the corpus callosum appears to equalize the effects of the ipsilateral and contralateral gustatory inputs on the left hemisphere, a callosal function also suggested by electrophysiological findings from animal experiments (Kadohisa et al., 2000). These conclusions are in agreement with some old and recent evidence about lateralized taste deficits following unilateral cortical lesions (Motta, 1958; Pritchard et al., 1999; Small et al., 2001). For present purposes, they specifically suggest that it is the posterior part of the corpus callosum, including the splenium, that ensures the functional equivalence between the two sides of the tongue in normal observers. The remarkable capacity of the posterior callosum alone to maintain effective interactions between the hemispheres in vision, touch and olfaction, as shown by Gordon et al. (1971), can therefore be regarded to extend to taste as well. As in the case of olfaction, the evidence for an involvement of the splenium in interhemispheric integration of taste information is puzzling in the light of the known
Fig. 3. Magnetic resonance imaging of the brains of two callosotomy patients studied by Aglioti et al. (2001). Evidence for asymmetries in taste discrimination between the two hemitongues, attributable to callosal disconnection, was found in the patient whose brain appears on the left, but not in the other patient with an intact splenium, shown on the right.
location of the cortical areas for taste in the frontal lobe and the insula (Frey and Petrides, 1999; Small et al., 1999). Since callosal connections of these areas run in the anterior corpus callosum, one would have expected effects on taste perception from anterior rather than posterior lesions.
Left ear suppression and the corpus callosum
The suppression of left ear signals in verbal dichotic listening tasks is a typical symptom of interhemispheric disconnection in patients treated with complete callosal sections for drug-refractory forms of epilepsy (Milner et al., 1968). Left ear suppression is generally attributed to the interruption of callosal fibers that convey verbal auditory information from the left ear/right hemisphere to the left hemisphere for report. Some findings in patients with surgical or spontaneous partial callosal lesions have suggested that the critical interruption affects fibers running in a specific portion of the posterior trunk of the corpus callosum in front of the splenium, but not in the splenium itself. Different conclusions have been drawn from studies of other patients, also with partial surgical or spontaneous callosal sections, where a permanent left ear suppression in verbal dichotic tests has been found to result from lesions involving the splenium in addition to the posterior callosal trunk.
The human corpus callosum is frequently damaged by closed head traumas (Vuilleumier and Assal, 1995), and the specific position of the lesion within the corpus callosum can be precisely visualized in vivo by noninvasive brain imaging methods, so that clinical and laboratory observations can be carried out on single cases with identified callosal lesions. The results from a single case study in our laboratory (Peru et al., 2003) have recently afforded evidence that has a direct bearing on the location of the callosal lesion giving rise to left ear suppression in verbal dichotic listening tasks. A young male patient who had sustained a severe closed head trauma followed by coma underwent a nearly complete functional recovery in the course of several months. Magnetic resonance imaging examinations showed a complete interruption of the posterior third of the body of the corpus callosum with a minimal involvement of the splenium, which appeared substantially preserved (Fig. 4), as also demonstrated by tests of visual interhemispheric transfer. Given the site of the callosal lesion, some previous studies would have predicted a left ear suppression in verbal dichotic listening tasks (Springer and Gazzaniga, 1975; Alexander and Warren, 1988), whereas other studies would have predicted a substantial sparing of left ear signals due to the intactness of the splenium (Sugishita et al., 1995; Pollmann et al., 2002). We employed a dichotic listening test in which each trial consisted in the
Fig. 4. Magnetic resonance imaging of the traumatic callosal lesion in a patient who showed a left ear suppression in a dichotic listening task shortly after the lesion (a) but not in a follow-up retest a few months later (b) (from Peru et al., Neuropsychologia, 41: 634–643, 2003).
simultaneous presentation through headphones of 40 tape-recorded series of four digits, spoken by a male voice. One series was presented to the right ear and the other was presented to the left ear, and the digits occurring simultaneously were always different for the two ears. Immediately after the presentation the patient was to report, in a free order, all the digits he remembered to have heard. When tested 4 months after the trauma, the patient was very accurate in reporting series of four digits presented monoaurally to either ear, but showed an almost complete left-ear suppression in the dichotic listening paradigm. He reported 39 digits out of 40 presented to the right ear, but only one digit out of 40 presented to the left ear, and the difference between the two ears remained virtually the same when he was explicitly required to ignore digits presented to the right ear, and to report only digits presented to the left ear. However, this striking left ear suppression was no longer observable 3 months later, when the patient reported 25 out of 40 digits presented to the left ear, and 29 out of 40 digits presented to the right ear, a performance which is hardly different from that of normal subjects who divide their attention between the ears. We believe that the performance in the later dichotic listening test is compatible with the presence of fibers carrying auditory information in the splenium as well as in presplenial callosal portions. Anatomical findings in
the cat demonstrate a spread of fibers from cortical auditory areas over the entire posterior half of the corpus callosum, where they are interspersed with interhemispheric connections of other cortical areas (Matsunami et al., 1994; Clarke et al., 1995). If that anatomical pattern obtains in the human brain as well, the auditory callosal fibers running in the intact splenium of the patient may have compensated for the initial deficit caused by the injury to the auditory presplenial fibers. Alternatively, the splenial auditory fibers might have been nonfunctional soon after the trauma, and might have recovered their function with the elapsing of time and the waning of the causal factor, for example edema. In addition to suggesting a rather widespread distribution of auditory fibers in splenial and presplenial portions of the corpus callosum, the results from this case argue for the need to analyze effects of partial callosal disconnections over time in order to distinguish temporary from permanent interhemispheric transfer deficits.
Suggestions from partial callosal agenesis
Total callosal agenesis is not associated with major symptoms of functional interhemispheric disconnection, due to compensatory processes that are still largely unknown. By comparison, partial callosal
agenesis may manifest itself in more conspicuous signs of functional interhemispheric disconnection, possibly because the extant callosal connections do not allow a full activation of the mechanisms for functional compensation (Dennis, 1976). In support of this hypothesis, in collaboration with Aglioti et al. (1998), the author reported clear-cut deficits of interhemispheric communication, including left hand anomia, partial left field alexia and poor tactile cross-localization in a subject with a congenital absence of the posterior part of the corpus callosum including the splenium. Such deficits were similar to those exhibited by a subject with a complete surgical section of the corpus callosum, but were lacking in a subject with a total callosal agenesis (Aglioti et al., 1998), as in many other genetically acallosal cases (Jeeves, 1990). It would be interesting to study whether the rare cases with anterior callosal agenesis and apparent preservation of the splenium (e.g. Sener, 1995) are completely free of interhemispheric disconnection symptoms, including those minor ones that are as a rule observable in total callosal agenesis (Jeeves, 1990).
In conclusion, as originally argued by Gordon et al. (1971), the recognized importance of the splenium for interhemispheric communication leaves unsolved the question of what functions are mediated by the large anterior sectors of the corpus callosum. Some of these putative anterior callosal functions have been discussed by Gazzaniga and his coworkers (Gazzaniga, 2000; Funnell et al., 2000a,b), who have also claimed that there is a remarkable functional specificity in callosal information transmission. However, where such remarkable specificity has been demonstrated, the data indicate that it obtains within the splenium rather than within the entire corpus callosum (Funnell et al., 2000b). The author would like to call attention to the scanty available evidence on the anatomical origin of the contingents of fibers that cross in the splenium of the human corpus callosum (De Lacoste et al., 1985). As shown in Fig. 5, large expanses of the posterior cortex in the occipital, parietal and temporal lobes, including and extending beyond areas with recognized visual functions, contribute fibers to the splenium. If these anatomical relations are confirmed with more precise methods, it
Fig. 5. Connections between posterior cortical areas and the splenium according to De Lacoste et al. (1985). Dense degenerating fibers (black) were found in callosal sector V (the splenium) after lesions affecting all cortical sectors indicated by arrows. (A) Lesion in parieto-temporal cortex (tentative degeneration also indicated in stippled sector IV). (B) Lesion in superior parietal cortex. (C) Lesion in occipital cortex. Reproduced with permission from the Journal of Neuropathology and Experimental Neurology.
follows that the splenium is connected with cortical regions that handle not only visual information, but also information from all other sensory modalities. In this vein, the capacity of the splenium to transmit integrated multisensory information between the hemispheres would be consistent with a special role of this part of the corpus callosum in human cognition and general behavioral control.
Conclusion
The present era of neuroscience is witnessing triumphal technical and theoretical achievements. A wide-ranging cooperation between molecular and cellular biology, genetics, embryology, morphology, physiology and pharmacology has resulted in an unprecedented in-depth exploration of the organization of the central and peripheral nervous system. Modern in vivo neuroimaging technologies allow the visualization of the human brain, whether normal or damaged, during complex cognitive and motor tasks. In such an era the time-honored exploration of brain functions based on the study of the effects of brain lesions may appear largely outdated or even totally obsolete. Yet the search for precise relations between cognition and behavior on one hand and nervous structures and mechanisms on the other still has a long way to go, and there continue to be ample opportunities and justifications for studying the effects of experimental brain lesions in animals and of neuropathological damage in man, as attested by the work of Alan Cowey during four decades. Studies of the effects of brain damage are still producing results that not infrequently provide ultimate tests of hypotheses generated by more modern approaches to the knowledge of cerebral organization. To the extent that brain diseases will continue to afflict humankind, a systematic investigation of the related deficits will remain indispensable for understanding the underlying physiopathological mechanisms and for planning rational pharmacological treatments and rehabilitation procedures.
Acknowledgments
Work by the author reported here has been supported by grants from the Ministero dell'Istruzione, dell'Università e della Ricerca Scientifica e Tecnologica, and by the Consiglio Nazionale delle Ricerche. The author is grateful to Marco Veronese for assistance with the illustrations.
References
Aglioti, S., Beltramello, A., Tassinari, G. and Berlucchi, G. (1998) Paradoxically greater interhemispheric transfer deficits in partial than complete callosal agenesis. Neuropsychologia, 36: 1015–1024. Aglioti, S., Bricolo, E., Cantagallo, A. and Berlucchi, G. (1999) Unconscious letter discrimination is enhanced by association with conscious color perception in visual form agnosia. Curr. Biol., 9: 1419–1422. Aglioti, S., Tassinari, G., Corballis, M.C. and Berlucchi, G. (2000) Incomplete gustatory lateralization as shown by analysis of taste discrimination after callosotomy. J. Cogn. Neurosci., 12: 238–245. Aglioti, S.M., Tassinari, G., Fabri, M., Del Pesce, M., Quattrini, A., Manzoni, T. and Berlucchi, G. (2001) Taste laterality in the split brain. Eur. J. Neurosci., 13: 195–200. Alexander, M.P. and Warren, R.L. (1988) Localization of callosal auditory pathways: a CT case study. Neurology, 38: 802–804. Berlucchi, G. (1990) Commissurotomy studies in animals. In: Boller F. and Grafman J. (Eds.), Handbook of Neuropsychology. Vol. 4. Elsevier, Amsterdam, pp. 9–47. Berti, A. and Rizzolatti, G. (1992) Visual processing without awareness: evidence from unilateral neglect. J. Cogn. Neurosci., 4: 347–351. Bogen, J.E. (1993) The callosal syndromes. In: Heilman K.H. and Valenstein E. (Eds.), Clinical Neuropsychology. Oxford University Press, Oxford, pp. 337–407. Clarke, S., de Ribaupierre, F., Bajo, V.M., Rouiller, E.M. and Krafsik, R. (1995) The auditory pathway in cat corpus callosum. Exp. Brain Res., 104: 534–540. Corbetta, M., Marzi, C.A., Tassinari, G. and Aglioti, S. (1990) Effectiveness of different task paradigms in revealing blindsight. Brain, 113: 603–616. Danckert, J. and Goodale, M.A. (2000) Blindsight: A conscious route to unconscious vision. Curr. Biol., 10: R64–R67. Danckert, J., Maruff, P., Kinsella, G., de Graaf, S. and Currie, J. (1998) Investigating form and colour perception in blindsight using an interference task. NeuroReport, 9: 2919–2925. De Gelder, B., Pourtois, G., van Raamsdonk, M., Vroomen, J. and Weiskrantz, L. (2001) Unseen stimuli modulate conscious visual experience: evidence from inter-hemispheric summation. NeuroReport, 12: 385–391. De Lacoste, M.C., Kirkpatrick, J.B. and Ross, E.D. (1985) Topography of the human corpus callosum. J. Neuropathol. Exp. Neurol., 44: 578–591.
92 Dennis, M. (1976) Impaired sensory and motor differentiation with corpus callosum agenesis: a lack of callosal inhibition during ontogeny? Neuropsychologia, 14: 455–469. Driver, J. and Mattingley, J.B. (1998) Parietal neglect and visual awareness. Nature Neurosci., 1: 17–22. Frey, S. and Petrides, M. (1999) Re-examination of the human taste region: a positron emission study. Eur. J. Neurosci., 11: 2985–2988. Funnell, M.G., Corballis, P.M. and Gazzaniga, M.S. (2000a) Cortical and subcortical interhemispheric interactions following partial and complete callosotomy. Arch. Neurol., 57: 185–189. Funnell, M.G., Corballis, P.M. and Gazzaniga, M.S. (2000b) Insights into the functional specificity of the human corpus callosum. Brain, 123: 920–926. Gazzaniga, M.S. (2000) Cerebral specialization and interhemispheric communication. Does the corpus callosum enable the human condition? Brain, 123: 1293–1336. Goebel, R., Muckli, L., Zanella, F.E., Singer, W. and Stoerig, P. (2001) Sustained extrastriate cortical activation without visual awareness revealed by fMRI studies of hemianopic patients. Vision Res., 41: 1459–1474. Gordon, H.W., Bogen, J.E. and Sperry, R.W. (1971) Absence of deconnexion syndrome in two patients with partial section of the neocommissures. Brain, 94: 327–336. Habib, M. (1998) Syndromes de de´connexion calleuse et organisation fonctionelle du corps calleux chez l’adulte. Neurochirurgie, 44: 102–109. Jeeves, M.A. (1990) Agenesis of the corpus callosum. In: Boller F. and Grafman J. (Eds.), Handbook of Neuropsychology. Vol. 4. Elsevier, Amsterdam, pp. 99–114. Kadohisa, M., Shinohara, M. and Ogawa, H. (2000) Effects of reversible block of callosal afferent inputs on the response characteristics of single neurons in the cortical taste area in rats. Exp. Brain Res., 135: 311–318. Kentridge, R.W., Heywood, C.A. and Weiskrantz, L. (1999) Effects of temporal cueing on residual visual discrimination in blindsight. Neuropsychologia, 37: 479–483. Kolb, F.C. and Braun, J. (1995) Blindsight in normal observers. Nature, 377: 336–338. Kroeze, J.H. (1979) Functional equivalence of the two sides of the human tongue. Percept Psychophys., 25: 115–118. Lamantia, A.S. and Rakic, P. (1990) Cytological and quantitative characteristics of four cerebral commissures in the rhesus monkey. J. Comp. Neurol., 291: 520–537. Lamme, V.A.F. (2001) Blindsight: the role of feedforward and feedback corticocortical connections. Acta Psychol., 107: 209–228. MacLeod, C.M. (1991) Half a century of research on the Stroop effect: an integrative review. Psychol. Rev., 109: 163–203. Marcel, A.J. (1998) Blindsight and shape perception: deficit of visual consciousness or of visual function? Brain, 121: 1565–1588.
Marzi, C.A., Tassinari, G., Aglioti, S. and Lutzemberger, L. (1986) Spatial sumation across the vertical meridian in hemianopsics: A test of blindsight. Neuropsychologia, 24: 749–758. Matsunami, K., Kawashima, T., Ueki, S., Fujita, M. and Konishi, T. (1994) Topography of commissural fibers in the corpus callosum of the cat: a study using WGA-HRP method. Neurosci. Res., 20: 137–148. McClelland, J.L and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception. P.1. An account of basic findings. Psychol. Rev., 88: 375–407. McMahon, D.B., Shikata, H. and Breslin, P.A. (2001) Are human taste thresholds similar on the right and left sides of the tongue? Chem. Senses, 26: 875–883. Milner, A.D. (1995) Cerebral correlates of visual awareness. Neuropsychologia, 33: 1117–1130. Milner, A.D. and Goodale, M.A. (1995) The Visual Brain in Action. Oxford University Press, Oxford. Milner, B., Taylor, L. and Sperry, R.W. (1968) Lateralized suppression of dichotically presented digits after commissural section in man. Science, 161: 184–186. Monahan, J.S. (2001) Coloring single Stroop elements: reducing automaticity or slowing color processing? J. Gen. Psychol., 128: 98–112. Motta, G. (1958) I Fattori Centrali delle Disgeusie. Tipografia Luigi Parma, Bologna, pp. 291–513. Norgren, R. (1990) Gustatory system. In: Paxinos G. (Ed.), The Human Nervous System. Academic Press, San Diego, pp. 845–861. Pandya, D.P. and Seltzer, B. (1986) The topography of commissural fibres. In: Lepore, F., Ptito, M. and Jasper, H.H. (Eds.), Two Hemispheres – One Brain: Functions of the Corpus Callosum. New York: Alan R. Liss, pp. 47–73. Peru, A., Beltramello, A., Moro, V., Sattibaldi, L. and Berlucchi, G. (2003) Temporary and permanent signs of interhemispheric disconnection after traumatic brain injury. Neuropsychologia, 41: 634–643. Pollmann, S., Maertens, M., von Cramon D.Y., Lepsien, J. and Hugdahl, K. (2002) Dichotic listening in patients with splenial and nonsplenial callosal lesions. Neuropsychology, 16: 56–64. Pritchard, T.C., Macaluso, D.A. and Eslinger, P.J. (1999) Taste perception in patients with insular cortex lesions. Behav. Neurosci., 113: 663–671. Regan, J. (1978) Involuntary automatic processing in colornaming tasks. Perc. Psychophys., 24: 130–136. Sahraie, A., Weiskrantz, L., Barbur, J.L., Simmons, A., Williams, S.C. and Brammer, M.J. (1997) Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc. Natl. Acad. Sci. USA, 94: 9406–9411.
93 Sa´nchez-Juan, P. and Combarros, O. (2001) Sı´ ndromes lesionales de las vı´ as nerviosas gustativas. Neurologia, 16: 262–271. Savazzi, S. and Marzi, C.A. (2002) Speeding up reaction time with invisible stimuli. Curr. Biol., 12: 403–407. Sener, R.N. (1995) Anterior callosal agenesis in mild, lobar holoprosencephaly. Pediatr. Radiol., 25: 385–386. Servos, P. and Goodale, M.A. (1995) Preserved visual imagery in visual form agnosia. Neuropsychologia, 33: 1383–1394. Small, D.M., Zald, D.H., Jones-Gotman, M., Zatorre, R.J., Pardo, J.V., Frey, S. and Petrides, M. (1999) Human cortical gustatory areas: a review of functional neuroimaging data. NeuroReport, 10: 7–14. Small, D.M., Zatorre, R.J. and Jones-Gotman, M. (2001) Increased intensity perception of aversive taste following right anteromedial temporal lobe removal in humans. Brain, 124: 1566–1575. Sperry, R.W. (1982) Some effects of disconnecting the cerebral hemispheres. Science, 217: 1223–1226. Springer, S.P. and Gazzaniga, M.S. (1975) Dichotic testing of partial and complete split-brain subjects. Neuropsychologia, 13: 341–346. Stoerig, P. and Barth, E. (2001) Low-level phenomenal vision despite unilateral destruction of primary visual cortex. Conscious. Cogn., 10: 574–587. Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 535–559. Sugishita, M., Otomo, K., Yamazaki, K., Shimizu, H., Yoshioka, M. and Shinohara, A. (1995) Dichotic listening
in patients with partial section of the corpus callosum. Brain, 118: 417–427. Suzuki, K. and Yamadori, A. (2000) Intact verbal description of letters with diminished awareness of their forms. J. Neurol. Neurosurg. Psychiat., 68: 782–786. Tomaiuolo, F., Ptito, M., Marzi, C.A., Paus, T. and Ptito, A. (1997) Blindsight in hemispherectomized patients as revealed by spatial summation across the vertical meridian. Brain, 120: 795–803. Torjussen, T. (1978) Visual processing in cortically blind hemifields. Neuropsychologia, 16: 15–21. Vuilleumier, P. and Assal, G. (1995) Le´sions du corps calleux et syndromes de de´connexion interhe´misphe´rique d’origine traumatique. Neurochirurgie, 41: 98–107. Ward, R. and Jackson, S.R. (2002) Visual attention in blindsight: sensitivity in the blind field increased by targets in the sighted field. NeuroReport, 13: 301–304. Weiskrantz, L. (1986) Blindisght: A Case Study and Implications. Oxford University Press, Oxford. Weiskrantz, L., Cowey, A. and Hodinott-Hill, I. (2002) Prime-sight in a blindsight subject. Nature Neurosci., 5: 101–102. Zeki, S. and fftyche, D.H. (1998) The Riddoch syndrome: insights into the neurobiology of conscious vision. Brain, 121: 25–45. Zeki, S., Aglioti, S., McKeefry, D. and Berlucchi, G. (1999) The neurological basis of conscious color perception in a blind patient. Proc. Natl. Acad. Sci. USA, 96: 14124–14129.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 6
Consciousness absent and present: a neurophysiological exploration
Edmund T. Rolls*
University of Oxford, Department of Experimental Psychology, Oxford OX1 3UD, UK
*Corresponding author. Tel.: +44-1865-271348; Fax: +44-1865-310447; Web: www.cns.ox.ac.uk; E-mail: [email protected] DOI: 10.1016/S0079-6123(03)14400-6
Abstract: Backward masking was used to investigate the amount of neuronal activity that occurs in the macaque inferior temporal visual cortex when faces can just be identified. It is shown that the effect of the pattern mask is to interrupt neuronal activity in the inferior temporal visual cortex. This reduces the number of action potentials that occur to a given stimulus, and decreases the information that is available about which stimulus was shown even more markedly, because the variance of the spike counts is increased. When the onset of the mask follows the onset of the test stimulus by 20 ms, each neuron fires for approximately 30 ms, provides on average 0.06 bits of information, and human observers perform at approximately 50% better than chance in forced choice psychophysics, yet say that they are guessing, and frequently report that they are unable to consciously see the face and identify which face it is. At a longer Stimulus Onset Asynchrony of 40 ms, the neurons fire for approximately 50 ms, the amount of information carried by a single neuron is 0.14 bits, and human observers are much more likely to report conscious identification of which face was shown. The results quantify the amount of neuronal firing and information that is present when stimuli can be discriminated but not reported on consciously, and the additional amount of neuronal firing and information that is required for human observers to consciously identify the faces. It is suggested that the threshold for conscious visual perception may be set to be higher than the level at which small but significant information is present in neuronal firing, so that the systems in the brain that implement the type of information processing involved in conscious thoughts are not interrupted by small signals that could be noise in sensory pathways.
Introduction
Damage to the primary (striate) visual cortex can result in blindsight, in which patients report that they do not see stimuli consciously, yet when making forced choices can discriminate some properties of the stimuli such as motion, position, some aspects of form, and even face expression (Weiskrantz et al., 1974; Stoerig and Cowey, 1997; Weiskrantz, 1997, 1998; De Gelder et al., 1999). In normal human subjects, backward masking of visual stimuli, in which another visual stimulus closely follows the short presentation of a test stimulus, reduces the visual perception of the test visual stimulus, and this paradigm has been widely used in psychophysics (Humphreys and Bruce, 1989). In this chapter the author considers how much information is present in neuronal firing in the part of the visual system that represents faces and objects, the inferior temporal visual cortex (Rolls and Deco, 2002), when human subjects can discriminate in forced choice, but cannot consciously perceive, face identity. The author also considers the implications that the neurophysiological findings have for consciousness. The representation of faces and objects is in the inferior temporal visual cortex, as shown by evidence that position-, size- and, for some neurons, even view-invariant representations of objects and faces are provided by neurons in the inferior temporal visual cortex (Rolls, 2000a; Rolls and
Deco, 2002); that this is the last stage of unimodal visual processing in primates; and that lesions of what may be a homologous region in humans, the fusiform gyrus face and object areas (Ishai et al., 1999; Kanwisher et al., 1997), produce face and object identification deficits in the absence of low-level impairments of visual processing such as visual acuity (Rolls and Deco, 2002; Farah, 1990; Farah et al., 1995a,b). The inferior temporal visual cortex is, therefore, an appropriate stage of processing at which to relate quantitative aspects of neuronal processing to the visual perception of faces and objects. We have, therefore, studied the quantitative relationship between neuronal activity in the macaque inferior temporal visual cortex and visual perception (Rolls and Deco, 2002), and in this article the focus is on the relation between inferior temporal visual cortex and conscious visual perception, using the results from combined neurophysiological studies on the inferior temporal
visual cortex and perceptual studies in humans with the paradigm of backward masking of visual stimuli (Rolls et al., 1994, 1999; Rolls and Tovée, 1994). A subsequent study by Kovacs et al. (1995) using a similar backward masking paradigm combined with primate electrophysiology confirmed the results.
Neurophysiology of the backward masking of visual stimuli
Rolls and Tovée (1994) and Rolls et al. (1994) measured the responses of single neurons in the macaque inferior temporal visual cortex during backward visual masking. Neurons that were selective for faces, using distributed encoding (Rolls and Tovée, 1995; Rolls et al., 1997; Treves et al., 1999; Rolls and Deco, 2002), were tested in a visual fixation task with the timing shown in Fig. 1. The visual
Fig. 1. The timing used in the backward masking visual fixation blink task, showing the warning tone, the fixation spot, the visual test stimulus, the pattern mask and the firing rate measurement period on a time axis from −500 to 2000 ms. The Stimulus Onset Asynchrony is the time between the onset of the visual test stimulus and the onset of the pattern mask stimulus. The test stimulus duration was 16 ms. (After Rolls et al., 1994.)
fixation task was used to ensure that the monkey looked at the visual stimuli. The methods used are described by Rolls and Tovée (1994) and Rolls et al. (1994), and a few salient points follow. As shown in Fig. 1, at −100 ms the fixation spot was blinked off so that there was no stimulus on the screen in the 100 ms period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask, so that pattern discrimination with equally intense test and mask stimuli was investigated (see Bruce and Green, 1989). At 0 ms, the 500 ms warning cue tone was switched off and the test visual image was switched on for one 16 ms frame of a raster display. The monitor had a persistence of less than 3 ms, so that no part of
the test image was present at the start of the next frame. Stimulus Onset Asynchrony (S.O.A.) values of 20, 40, 60, 100 or 1000 ms (chosen in a random sequence by the computer) were used. (The Stimulus Onset Asynchrony is the time between the onset of the test stimulus and the onset of the mask.) The duration of the masking stimulus was 300 ms. The stimuli were static visual stimuli subtending 8 degrees in the visual field presented on a video monitor at a distance of 1.0 m. The faces used as test stimuli are illustrated in Fig. 2. The usual masking stimulus (to which the neuron being analysed did not respond) was made up of letters of the alphabet {N,O}, as shown in Fig. 2. The masking pattern consisted of overlapping letters, and this masking pattern was used because it is similar to the mask used in the previous psychophysical experiments (see Rolls et al., 1994). (In some cases the masking stimulus was a face
Fig. 2. Examples of the test images used. The mask is also shown. (After Rolls et al., 1994.)
stimulus that was ineffective for the neuron being recorded.) Figure 3 shows examples of the effects of backward masking on the responses of a single inferior temporal cortex neuron in peristimulus rastergram and time histogram form. The top rastergram/spike density histogram pair shows the responses of the neuron to a single frame of the test stimulus (an effective face stimulus for that neuron). Relative to the prestimulus rate, there was an increase in the firing produced with a latency of approximately 75 ms, and this firing lasted for 200–300 ms, that is for much longer than the 16 ms presentation of the target stimulus. In the next pairs down, the effects of introducing a non-effective face as the masking stimulus with different S.O.A.s are shown. It is shown that the effect of the mask is to limit the duration of the firing produced by the target stimulus. Very similar masking was obtained with the standard N–O pattern mask. Similar experiments were repeated on 42 different cells (Rolls et al., 1994; Rolls and Tovée, 1995), and in all cases the
Fig. 3. Peristimulus rastergrams and smoothed peristimulus spike density histograms based on responses in 8–16 trials to the test face alone (top raster-histogram pair), and to the test face followed by a masking stimulus (which was a face that was ineffective in activating the cell) with different S.O.A. values (S.O.A. = Stimulus Onset Asynchrony). The mask alone did not produce firing in the cell. The target stimulus was shown for 16 ms starting at time 0. (The top trace shows the response to the target stimulus alone, in that with this 1000 ms S.O.A., the mask stimulus was delayed until well after the end of the recording period shown.) (After Rolls and Tovée, 1994.)
temporal aspects of the masking were similar to those shown in Fig. 3. One important conclusion from these results is that the effect of a backward masking stimulus on cortical visual information processing is to limit the duration of neuronal responses, by interrupting neuronal firing. The neuronal firing of inferior temporal cortex neurons often persisted for 200–300 ms after a 16 ms presentation of a stimulus. With a 20 ms Stimulus Onset Asynchrony, the neuronal firing was typically limited to 30 ms. With a 40 ms Stimulus Onset Asynchrony, the neuronal firing was typically limited to 50 ms. This persistence of cortical neuronal firing when a masking stimulus is not present is probably related to cortical recurrent collateral connections which could implement an autoassociative network with attractor and short-term memory properties (see Rolls and Treves, 1998; Rolls and Deco, 2002), because such continuing post-stimulus neuronal firing is not observed in the lateral geniculate nucleus (K. Martin, personal communication).
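As a concrete illustration of the kind of summary quoted above (firing limited to roughly 30 ms at a 20 ms S.O.A. and roughly 50 ms at a 40 ms S.O.A.), the following minimal Python sketch estimates, from trial-by-trial spike times, how long firing stays above the spontaneous rate after stimulus onset. It is not the analysis code used in the original studies; the bin width and the two-times-baseline criterion are illustrative assumptions.

```python
import numpy as np

def response_duration_ms(trial_spike_times_ms, baseline_hz, bin_ms=10, window=(0, 400)):
    """Rough estimate of how long firing exceeds baseline after stimulus onset.

    trial_spike_times_ms: list of 1-D arrays, spike times in ms relative to
                          test-stimulus onset, one array per trial of a
                          given S.O.A. condition
    baseline_hz:          spontaneous rate estimated from the prestimulus period
    """
    edges = np.arange(window[0], window[1] + bin_ms, bin_ms)
    counts = np.zeros(len(edges) - 1)
    for trial in trial_spike_times_ms:
        counts += np.histogram(trial, bins=edges)[0]
    # Convert summed counts to a mean firing rate (spikes/s) per bin.
    rate_hz = counts / len(trial_spike_times_ms) / (bin_ms / 1000.0)

    above = rate_hz > 2.0 * baseline_hz   # arbitrary illustrative criterion
    if not above.any():
        return 0.0
    first, last = np.flatnonzero(above)[[0, -1]]
    return (last - first + 1) * bin_ms
```

Applied separately to the trials from each S.O.A. condition, such a measure yields duration estimates of the kind summarized above.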
Information available in inferior temporal cortex visual neurons during backward masking
Fig. 4. The mean (± sem) across cells of the number of spikes produced in 0–200 ms by the most effective stimulus (max) and the least effective stimulus (min) as a function of Stimulus Onset Asynchrony (S.O.A.), for S.O.A.s of 20, 40, 60 and 100 ms and for the no-mask condition. (After Rolls et al., 1999.)
To fully understand quantitatively the responses of inferior temporal cortex neurons at the threshold for visual perception, Rolls et al. (1997) applied information theoretic methods (see Shannon, 1948; Rolls and Treves, 1998; Rolls and Deco, 2002) to the analysis of the neurophysiological data with backward masking obtained by Rolls et al. (1994) and Rolls and Tovée (1994). One advantage of this analysis is that it shows how well the neurons discriminate between the stimuli under different conditions, by taking into account not only the number of spikes, but also the variability from trial to trial in the number of spikes. Another advantage of this analysis is that it evaluates the extent to which the neurons discriminate between stimuli in bits, which can then be directly compared with evidence about discriminability obtained using different measures, such as human psychophysical performance. The analysis quantifies what can be determined about which of the set of faces was presented from a single trial of neuronal firing. As a preliminary to the information theoretic analysis, the effect of the S.O.A. on the neuronal responses, averaged across the population of 15 neurons for which a sufficient number of trials was available, is shown in Fig. 4. The responses for the most (max) and the least (min) effective stimuli are shown for the period 0–200 ms with respect to stimulus onset. There was little effect (not significant) of the mask on the responses to the least effective stimulus in the set, for which the number of spikes was close to the spontaneous activity. The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described previously (e.g., Rolls et al., 1997; Rolls and Treves, 1998; Rolls and Deco, 2002), and have been used previously to analyse the responses of inferior temporal cortex neurons (Optican and Richmond, 1987; Gawne and Richmond, 1993; Tovée et al., 1993; Tovée and Rolls, 1995; Rolls et al., 1997). In brief, the general procedure was as follows (Rolls et al., 1999). The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after
the stimulus presentation. The firing rates were then quantized into a smaller number of bins d than there were trials for each stimulus. After this response quantization, the experimental joint stimulus–response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Shannon, 1948; Rolls and Deco, 2002):

$$I(S, R) = \sum_{s, r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996), to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus.
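To make the procedure concrete, the following minimal Python sketch computes a plug-in estimate of I(S, R) in bits from single-trial spike counts, quantizing the responses into d bins and subtracting a simple first-order (Miller–Madow-style) bias term. It is an illustration written for this description, not the analysis code of Rolls et al. (1999), and the crude bias term only stands in for the fuller finite-sampling correction of Panzeri and Treves (1996).

```python
import numpy as np

def mutual_information_bits(stim_ids, spike_counts, n_bins=4):
    """Plug-in estimate of I(S; R) in bits from single-trial responses.

    stim_ids:     1-D array of stimulus identities, one entry per trial
    spike_counts: 1-D array of spike counts (or rates), one entry per trial
    n_bins:       number of response bins d (kept smaller than the number
                  of trials per stimulus, as described in the text)
    """
    stim_ids = np.asarray(stim_ids)
    spike_counts = np.asarray(spike_counts, dtype=float)

    # Quantize the responses into d roughly equipopulated bins.
    edges = np.unique(np.quantile(spike_counts, np.linspace(0, 1, n_bins + 1)[1:-1]))
    r = np.digitize(spike_counts, edges)             # bin index 0 .. n_bins-1

    stimuli = np.unique(stim_ids)
    n_trials = len(stim_ids)

    # Experimental joint probability table P(s, r).
    p_sr = np.zeros((len(stimuli), n_bins))
    for i, s in enumerate(stimuli):
        p_sr[i] = np.bincount(r[stim_ids == s], minlength=n_bins) / n_trials
    p_s = p_sr.sum(axis=1, keepdims=True)            # P(s)
    p_r = p_sr.sum(axis=0, keepdims=True)            # P(r)

    nz = p_sr > 0
    info = np.sum(p_sr[nz] * np.log2(p_sr[nz] / (p_s @ p_r)[nz]))

    # First-order bias term (Miller-Madow style), subtracted here as a crude
    # stand-in for the Panzeri-Treves (1996) finite-sampling correction.
    bins_per_stim = (p_sr > 0).sum(axis=1)
    bias = (bins_per_stim.sum() - len(stimuli) - ((p_r > 0).sum() - 1)) / (2 * n_trials * np.log(2))
    return info - bias
```

Equipopulated binning keeps the number of occupied response bins comparable across stimuli; other monotonic quantization schemes could equally be used.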
Figure 5 shows the average across the cells of the cumulated information available in a 200 ms period from stimulus onset from the responses of the 15 neurons as a function of the S.O.A. This emphasizes how, as the S.O.A. is reduced towards 20 ms, the information does reduce rapidly, but that nevertheless at an S.O.A. of 20 ms there is still considerable information about which stimulus was shown. The reduction of the information at different S.O.A.s was highly significant (one-way ANOVA) at P<0.001. It was notable that the information reduced much more than the number of spikes on each trial as the S.O.A. was shortened. The explanation for this is that at short S.O.A.s the neuronal responses become noisy, as shown by Rolls et al. (1999). This emphasizes the value of measuring the information available, and not only the number of spikes (Rolls et al., 1999).
Fig. 5. The average (± sem) across the cells of the cumulated information (in bits) available in a 200 ms period from stimulus onset from the responses of the cells, as a function of the S.O.A. (20, 40, 60 or 100 ms, or no mask). (After Rolls et al., 1999.)
Human psychophysical performance with the same set of stimuli
Rolls et al. (1994) performed human psychophysical experiments with the same set of stimuli and with the same apparatus used for the neurophysiological experiments, so that the neuronal responses could be closely related to how well observers could identify which face had been shown. The monitor provided maximum and minimum luminance of 6 and 0.13 footlamberts, and was adjusted internally for linearity, within an error of no more than 3%. Five different faces were used as stimuli. All the faces were well known to each of the eight observers used in the experiment. In the forced choice paradigm, the observers specified whether the face was normal or rearranged, and identified whose face they thought had been presented. Even if the observers were unsure of their judgement, they were instructed to respond with their best guess. The data were corrected for guessing to aid in comparison between classification and identification. This correction arranged that chance performance would be shown as 0% correct on the graphs, and perfect performance as 100% correct. The mean proportions of correct responses for the identification task (and for the classification of normal versus rearranged) are shown in Fig. 6. The
proportion correct data was submitted to an arcsine transformation (to normalise the data) and a repeated measures ANOVA was performed. This analysis showed statistically significant effects of S.O.A. [F(4,28) = 61.52, P<0.0001]. Forced choice discrimination of face identity was thus better than chance at an S.O.A. of 20 ms. However, at this S.O.A., the subjects were not conscious of seeing the face, or of the identity of the face, and felt that their guessing about which face had been shown was not correct. The subjects did know that something had changed on the screen (and this was not just brightness, as this was constant throughout a trial). Sometimes the subjects had some conscious feeling that a part of a face (such as a mouth) had been shown. However, the subjects were not conscious of seeing a whole face, or of seeing the face of a particular person. At an S.O.A. of 40 ms, the subjects' forced choice performance of face identification was close to 100% (see Fig. 6), and at this S.O.A., the subjects became much more consciously aware of the identity of which face had been shown (Rolls et al., 1994).
Fig. 6. Psychophysical performance of humans with the same stimuli as used in the neurophysiological experiments described here. The subjects were shown 'normal' faces or faces with the parts 'rearranged', and were asked to state which of 5 faces had been shown ('Identification'), and whether the face shown was in the normal or rearranged 'Configuration'. The plots show the proportion correct on the tasks of classification of spatial configuration (i.e., whether the face features were normal or rearranged), and of determination of the identity of the faces for faces in the Normal or Rearranged spatial configuration of face features, as a function of Stimulus Onset Asynchrony (S.O.A.). The data have been corrected for guessing. The means of the proportions correct are shown. The test stimulus was presented for 16 ms. (After Rolls et al., 1994.)
Discussion
The neurophysiological data (Rolls et al., 1994; Rolls and Tovée, 1994), and the results of the information theoretic analysis (Rolls et al., 1999), can now be compared directly with the effects of backward masking in human observers, studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct (with 0% correct corresponding to chance performance) with an S.O.A. of 20 ms, and 97% correct with an S.O.A. of 40 ms (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 ms (Rolls and Tovée, 1994; Rolls et al., 1994). The implication is that 30 ms is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The results based on an analysis of the information encoded in the spike trains at different S.O.A.s support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Fig. 5), with on average 0.06 bits available from each neuron at an S.O.A. of 20 ms. Thus when subjects feel that they are guessing, and are not conscious of seeing whose face has been shown, macaque inferior temporal cortex neurons provide small but significant amounts of information about which face has been shown. When the S.O.A. was increased to 40 ms, the inferior temporal cortex neurons responded for approximately 50 ms, and encoded approximately
0.14 bits of information (in a period of 200 ms, for the subset of face-selective neurons tested, see Rolls et al., 1999). At this S.O.A., not only was face identification 97% correct, but the subjects were much more likely to be able to report consciously seeing a face and/or whose face had been shown. One way in which the conscious perception of the faces was measured quantitatively was by asking subjects to rate the clarity of the faces. This was a subjective assessment and therefore reflected conscious processing, and was made using magnitude estimation. It is shown in Fig. 7 that the subjective clarity of the stimuli was low at 20 ms S.O.A., was higher at 40 ms S.O.A., and was almost complete by 60 ms S.O.A. It is suggested that the threshold for conscious visual perception may be set to be higher than the level at which small but significant sensory information is present so that the systems in the brain that implement the type of information processing involved in conscious thoughts are not interrupted by small signals that could be noise in sensory pathways. Consideration of the nature of this processing, and
Fig. 7. Rating of the subjective clarity of faces as a function of stimulus onset asynchrony (S.O.A.). The mean of the magnitude estimation ratings is shown. (The clarity of the face, when presented without being followed by a masking stimulus, was assigned the number 10. If the observer was not able to see the features of the face, that was to be considered 0. The observers assigned a number from 0 to 10 to represent the perceived subjective clarity of the face. The effect of S.O.A. was statistically significant at P<0.001. After Rolls et al., 1994.)
the reason why it may be useful not to interrupt it unless there is a definite signal that may require the use of the type of processing that can be performed by the conscious processing system, is left to the end of the Discussion because the issues raised necessarily involve hypotheses that are not easy to test. The results of the information analysis (Rolls et al., 1999) emphasise that very considerable information about which stimulus was shown is available in a short epoch of, for example, 50 ms. This confirms the findings of Tovée et al. (1993), Tovée and Rolls (1995) and Heller et al. (1995), and facilitates the rapid read-out of information from the inferior temporal visual cortex, and the use of whatever information is available in the limited period of firing under backward masking conditions. It was notable that the information in the no-mask condition did outlast the end of the stimulus by as much as 200–300 ms, indicating some short-term memory trace property of the neuronal circuitry. This continuing activity could be useful in the learning of invariant representations of objects (Rolls, 1992, 2000a; Wallis and Rolls, 1997; Rolls and Deco, 2002). The results also show that even at the shortest S.O.A. of 20 ms, the information available was on average 0.06 bits. This compares to 0.3 bits with the 16 ms stimulus shown without the mask (Rolls et al., 1999). It also compares to a typical value for such neurons of 0.35–0.5 bits with a 500 ms stimulus presentation (Tovée and Rolls, 1995; Rolls et al., 1997). The results thus show that considerable information (33% of that available without a mask, and approximately 22% of that with a 500 ms stimulus presentation) is available from neuronal responses even under backward masking conditions which allow the neurons to have their main response in 30 ms. Also, we note that the information available from a 16 ms unmasked stimulus (0.3 bits) is a large proportion (approximately 65–75%) of that available from a 500 ms stimulus. These results provide evidence on how rapid the processing of visual information is in a cortical area, and provide a fundamental constraint for understanding how cortical information processing operates (see Rolls and Treves, 1998; Rolls and Deco, 2002). One direct implication of the 30 ms firing with the 20 ms S.O.A. is that this is sufficient time both for a cortical area to perform its computation, and
for the information to be read out from a cortical area, given that psychophysical performance is 50% correct at this S.O.A. Another implication is that the recognition of visual stimuli can be performed using feedforward processing in the multi-stage, hierarchically organized ventral visual system comprising at least V1–V2–V4–inferior temporal visual cortex, in that the typical shortest neuronal response latencies in macaque V1 are approximately 40 ms, and increase by approximately 15–17 ms per stage to produce a value of approximately 90 ms in the inferior temporal visual cortex (Rolls and Deco, 2002; Oram and Perrett, 1992; Dinse and Kruger, 1994; Raiguel et al., 1989; Vogels and Orban, 1994; Nowak and Bullier, 1997). (The fact that considerable information is available in short epochs of, for example, 20 ms of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée and Rolls, 1995; Rolls and Deco, 2002).) Given these timings, it would not be possible in the 20 ms S.O.A. condition for inferior temporal cortex neuronal responses to feed back to influence V1 neuronal responses to the test stimulus before the mask stimulus produced its effects on the V1 neurons. This shows that at least some recognition of visual stimuli is possible without top–down backprojection effects from the inferior temporal visual cortex to early cortical processing areas. The processing time allowed for each cortical area to perform useful computation and for the information to be read out to the next stage of cortical processing is in the order of 15–17 ms, as shown by the neuronal response latency increases from cortical area to cortical area noted above; and less than 30 ms, as shown by the duration of the firing in the 20 ms S.O.A. condition when face identification was 50% better than chance. This is sufficient for recurrent collaterals to operate by feedback within a cortical area to allow them to implement attractor-based processing, as shown by analyses of the speed of settling of such networks, provided that they are implemented with neurons with continuous dynamics (as implemented in models by integrate-and-fire neurons) and with spontaneous firing (Treves, 1993; Battaglia and Treves, 1998; Rolls and Treves, 1998). Indeed, the dynamics of a four-layer hierarchical network with an architecture like that of the ventral
visual system are sufficiently rapid to allow recurrent feedback attractor operations to contribute usefully to information processing in the system (Panzeri et al., 2001; Rolls and Deco, 2002; contrast with Thorpe et al., 1996). The inferior temporal visual cortex is an appropriate stage at which to analyse the identification of objects and faces, and to link to face identification in humans, because the ITC contains an invariant representation of faces and objects (Booth and Rolls, 1998; Rolls, 2000a; Rolls and Deco, 2002), and damage to corresponding areas in humans may produce face and object agnosias (Farah, 1990; Farah et al., 1995a,b). The quantitative analyses described here of neuronal activity in an area of the ventral visual system involved in face and object identification, which show that significant neuronal processing can occur that is sufficient to support forced choice but implicit (unconscious) discrimination in the absence of conscious awareness of the identity of the face, are of interest in relation to studies of blindsight (Weiskrantz et al., 1974; Weiskrantz, 1997, 1998; Stoerig and Cowey, 1997; de Gelder et al., 1999). The issue in blindsight is that conscious reports that a stimulus has been seen cannot usually be made, and yet some forced choice performance is possible with respect to, for example, the motion, position and some aspects of the form of the visual stimuli. It has been argued that the results in blindsight are not due just to reduced visual processing, because some aspects of visual processing are less impaired than others (Weiskrantz, 1997, 1998, 2001; Azzopardi and Cowey, 1997). However, it is suggested that some of the visual capacities that do remain in blindsight reflect processing via visual pathways that are alternatives to the V1 processing stream (Weiskrantz, 1997, 1998, 2001). If some of those pathways are normally involved in implicit processing, this may help to give an account of why some implicit (unconscious) performance is possible in blindsight patients. Further, it has been suggested that ventral visual stream processing is especially involved in consciousness, because it is information about objects and faces that needs to enter a system to plan actions (Milner and Goodale, 1995; Rolls and Deco, 2002); and the planning of actions that involves the operation and correction of flexible
one-time multiple-step plans may be closely related to conscious processing (Rolls, 1999; Rolls and Deco, 2002). In contrast, dorsal stream visual processing may be more closely related to executing an action on an object once the action has been selected, and the details of this action execution can take place implicitly (unconsciously) (Milner and Goodale, 1995; Rolls and Deco, 2002), perhaps because they do not require multiple step syntactic planning (Rolls, 1999). The implication of this discussion is that in blindsight the dissociations between implicit and explicit processing may arise because different visual pathways, some involved in implicit and others in explicit processing, are differentially damaged. In contrast, in the experiments described here, the dissociation between implicit and explicit processing appears to arise from allowing differential processing (short information-poor versus longer) in the same visual processing stream and even area, the inferior temporal visual cortex. Thus the implications of the dissociations described here, and their underlying neuronal basis, may be particularly relevant in a different way to understanding visual information processing and consciousness. One of the implications of blindsight thus seems to be that some visual pathways are more involved in implicit processing, and other pathways in explicit processing. In contrast, the results described here suggest that short and information-poor signals in a sensory system involved in conscious processing do not reach consciousness, and do not interrupt ongoing or engage conscious processing. This evidence described here thus provides interesting and direct evidence that there may be a threshold for activity in a sensory stream that must be exceeded in order to lead to consciousness, even when that activity is sufficient for some types of visual processing such as visual identification at well above chance in an implicit mode. The latter implicit mode processing can be revealed by forced choice tests and by direct measurements of neuronal responses. (Complementary evidence at the purely psychophysical level using backward masking has been obtained by Marcel (1983a,b) and discussed by Weiskrantz (1998, 2001).) Possible reasons for this relatively high threshold for consciousness are considered next. It is suggested that the threshold for conscious visual perception may be set to be higher than the level
at which small but significant sensory information is present so that the systems in the brain that implement the type of information processing involved in conscious thoughts are not interrupted by small signals that could be noise in sensory pathways, and that may not require use of the type of processing that can be performed by the conscious processing system. The exact nature of the information processing that is linked essentially to consciousness is the subject for great debate. My own theory is that phenomenal consciousness is the state that occurs when one is thinking about one’s own syntactic (or more generally linguistic) thoughts (Rolls, 1999, 2000b, 1997). In that the theory is premised on thoughts about thoughts, it is a Higher Order Thought (HOT) theory of consciousness (see also Rosenthal, 1990, 1993), and in that the theory is about linguistic thoughts, the author has termed it a HOLT theory of consciousness (Rolls, 1999, 2000b). It may also be called a HOST theory, as the higher order thoughts are about syntactic thoughts. The computational argument that the author put to specify why the higher order thoughts are computationally useful if they are syntactic is as follows. If a multi-step plan involves a number of different symbols, and requires the relations (e.g., the conditional relations) between the symbols to be specified correctly in each step of the plan, then some form of syntax is needed; for otherwise, the symbols would not be bound together correctly in each step of the plan. If the plan produces an incorrect outcome, then a process that can reflect on and evaluate each step of the plan, and determine which may be the incorrect step, is a way to solve the credit assignment problem (Rolls, 1999). It is argued that it is implausible that such higher order syntactic thought processes would occur without it feeling like something to be the system that implements these processes, especially when the thoughts are grounded in the world (Rolls, 1999). This is only a plausibility argument. It should be noted that the higher order thoughts are linguistic in the sense that they require syntax, but not necessarily, of course, human verbal language. However, to the extent that some form of syntactic processing is closely related to consciousness, it is likely to be that a serial and time-consuming process is needed (with serial processing used to limit the binding problem to whatever syntax the brain can implement). Given the serial nature of this process,
and its use for implementing long-term and/or multistep planning, it is suggested that it may be useful not to interrupt it unless the processing systems that can perform non-syntactic implicit processing to detect stimuli or stimulus change have sufficiently strong evidence that the signal is strong, and is of a type which may be behaviourally significant, such as a moving spot on the horizon, or an emotional expression change on a face.
Acknowledgements
This research was supported by Medical Research Council Programme Grant PG9826105. The author wishes to acknowledge the excellent contributions of many scientific colleagues to the work described here, including P. Azzopardi, D.G. Purcell, S. Panzeri, A.L. Stewart, A. Treves, and M.J. Tovée.
References
Azzopardi, P. and Cowey, A. (1997) Is blindsight like normal, near-threshold vision? Proc. Natl. Acad. Sci. U.S.A., 94: 14190–14194. Battaglia, F. and Treves, A. (1998) Stable and rapid recurrent processing in realistic autoassociative memories. Neural Comput., 10: 431–450. Booth, M.C.A. and Rolls, E.T. (1998) View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex, 8: 510–523. Bruce, V. and Green, P. (1989) Visual Perception: Physiology, Psychology and Ecology. Lawrence Erlbaum, London. De Gelder, B., Vroomen, J., Pourtois, G. and Weiskrantz, L. (1999) Non-conscious recognition of affect in the absence of striate cortex. Neuroreport, 10: 3759–3763. Dinse, H.R. and Kruger, K. (1994) The timing of processing along the visual pathway in the cat. Neuroreport, 5: 893–897. Farah, M.J. (1990) Visual Agnosia. MIT Press, Cambridge, Mass. Farah, M.J., Levinson, K.L. and Klein, K.L. (1995a) Face perception and within-category discrimination in prosopagnosia. Neuropsychologia, 33: 661–674. Farah, M.J., Wilson, K.D., Drain, H.M. and Tanaka, J.R. (1995b) The inverted face inversion effect in prosopagnosia: evidence for mandatory, face-specific perceptual mechanisms. Vision Res., 35: 2089–2093. Gawne, T.J. and Richmond, B.J. (1993) How independent are the messages carried by adjacent inferior temporal cortical neurons? J. Neurosci., 13: 2758–2771.
Heller, J., Hertz, J.A., Kjaer, T.W. and Richmond, B.J. (1995) Information flow and temporal coding in primate pattern vision. J. Comput. Neurosci., 2: 175–193. Humphreys, G.W. and Bruce, V. (1989) Visual Cognition. Erlbaum, Hove. Ishai, A., Ungerleider, L.G., Martin, A., Schouten, J.L. and Haxby, J.V. (1999) Distributed representation of objects in the human ventral visual pathway. Proc. Natl. Acad. Sci. U.S.A., 96: 9379–9384. Kanwisher, N., McDermott, J. and Chun, M.M. (1997) The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci., 17: 4302–4311. Kovacs, G., Vogels, R. and Orban, G.A. (1995) Cortical correlates of pattern backward-masking. Proc. Natl. Acad. Sci. U.S.A., 92: 5587–5591. Marcel, A.J. (1983a) Conscious and unconscious perception: experiments on visual masking and word recognition. Cog. Psychol., 15: 197–237. Marcel, A.J. (1983b) Conscious and unconscious perception: an approach to the relations between phenomenal experience and perceptual processes. Cog. Psychol., 15: 238–300. Milner, A.D. and Goodale, M.A. (1995) The Visual Brain in Action. Oxford University Press, Oxford. Nowak, L.G. and Bullier, J. (1997) The timing of information transfer in the visual system. In: Kaas, J., Rockland, K. and Peters, A. (Eds.), Cerebral Cortex: Extrastriate Cortex in Primates. Plenum Press, New York, pp. 205–241. Optican, L.M. and Richmond, B.J. (1987) Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretic analysis. J. Neurophysiol., 57: 162–178. Oram, M.W. and Perrett, D.I. (1992) Time course of neural responses discriminating different views of the face and head. J. Neurophysiol., 68: 70–84. Panzeri, S. and Treves, A. (1996) Analytical estimates of limited sampling biases in different information measures. Network, 7: 87–107. Panzeri, S., Rolls, E.T., Battaglia, F. and Lavis, R. (2001) Speed of information retrieval in multilayer networks of integrate-and-fire neurons. Network: Computation in Neural Systems, 12: 423–440. Raiguel, S.E., Lagae, L., Gulyas, B. and Orban, G.A. (1989) Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Res., 493: 155–159. Rolls, E.T. (1992) Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philos. Trans. R. Soc. Lond., B. Biol. Sci., 335: 11–21. Rolls, E.T. (1997) Consciousness in Neural Networks? Neural Netw., 10: 1227–1240. Rolls, E.T. (1999) The Brain and Emotion. Oxford University Press, Oxford.
Rolls, E.T. (2000a) Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron, 27: 205–218. Rolls, E.T. (2000b) Précis of The Brain and Emotion. Behav. Brain Sci., 23: 177–233. Rolls, E.T. and Tovée, M.J. (1994) Processing speed in the cerebral cortex, and the neurophysiology of backward masking. Proc. R. Soc. Lond., B. Biol. Sci., 257: 9–15. Rolls, E.T. and Tovée, M.J. (1995) The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J. Neurophysiol., 73: 713–726. Rolls, E.T., Tovée, M.J., Purcell, D.G., Stewart, A.L. and Azzopardi, P. (1994) The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101: 473–484. Rolls, E.T., Treves, A., Tovée, M. and Panzeri, S. (1997) Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. J. Comput. Neurosci., 4: 309–333. Rolls, E.T. and Treves, A. (1998) Neural Networks and Brain Function. Oxford University Press, Oxford. Rolls, E.T., Tovée, M.J. and Panzeri, S. (1999) The neurophysiology of backward visual masking: information analysis. J. Cogn. Neurosci., 11: 335–346. Rolls, E.T. and Deco, G. (2002) Computational Neuroscience of Vision. Oxford University Press, Oxford. Rosenthal, D. (1990) A theory of consciousness. ZIF Report No. 40, Zentrum für Interdisziplinäre Forschung, Bielefeld, Germany. Rosenthal, D.M. (1993) Thinking that one thinks. In: Davies, M. and Humphreys, G.W. (Eds.), Consciousness. Blackwell, Oxford, Ch. 10, pp. 197–223. Shannon, C.E. (1948) A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27: 379–423.
Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 535–559. Thorpe, S., Fize, D. and Marlot, C. (1996) Speed of processing in the human visual system. Nature, 381: 520–522. Tovée, M.J., Rolls, E.T., Treves, A. and Bellis, R.P. (1993) Information encoding and the responses of single neurons in the primate temporal visual cortex. J. Neurophysiol., 70: 640–654. Tovée, M.J. and Rolls, E.T. (1995) Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2: 35–58. Treves, A. (1993) Mean-field analysis of neuronal spike dynamics. Network, 4: 259–284. Treves, A., Panzeri, S., Rolls, E.T., Booth, M. and Wakeman, E.A. (1999) Firing rate distributions and efficiency of information transmission of inferior temporal cortex neurons to natural visual stimuli. Neural Comput., 11: 611–641. Vogels, R. and Orban, G.A. (1994) Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. J. Neurophysiol., 71: 1428–1451. Wallis, G. and Rolls, E.T. (1997) Invariant face and object recognition in the visual system. Progress in Neurobiology, 51: 167–194. Weiskrantz, L. (1997) Consciousness Lost and Found. A Neuropsychological Exploration. Oxford University Press, Oxford. Weiskrantz, L. (1998) Blindsight. A Case Study and Implications, 2nd Edition. Oxford University Press, Oxford. Weiskrantz, L. (2001) Blindsight – putting beta (β) on the back burner. In: De Gelder, B., De Haan, E. and Heywood, C. (Eds.), Out of Mind: Varieties of Unconscious Processes. Oxford University Press, Oxford, Ch. 2, pp. 20–31. Weiskrantz, L., Warrington, E.K., Sanders, M.D. and Marshall, J. (1974) Visual capacity in the hemianopic field following a restricted occipital ablation. Brain, 97: 709–728.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 7
Rapid serial visual presentation for the determination of neural selectivity in area STSa
Peter Földiák*, Dengke Xiao, Christian Keysers, Robin Edwards and David Ian Perrett
School of Psychology, University of St. Andrews, St Andrews, Fife KY16 9JU, UK
*Corresponding author. E-mail: [email protected] DOI: 10.1016/S0079-6123(03)14400-7
Abstract: We show that rapid serial visual presentation (RSVP) in combination with a progressive reduction of the stimulus set is an efficient method for describing the selectivity properties of high-level cortical neurons in single-cell electrophysiological recording experiments. Rapid presentation allows the experimental testing of a significantly larger number of stimuli, which can reduce the subjectivity of the results due to stimulus selection and the lack of sufficient control stimuli. We prove the reliability of the rapid presentation and stimulus reduction methods by repeated experiments and the comparison of different testing conditions. Our results from neurons in area STSa of the macaque temporal cortex provide a well-controlled confirmation for the existence of a population of cells that respond selectively to stimuli containing faces. View tuning properties measured using this method also confirmed earlier results. In addition, we found a population of cells that respond reliably to complex non-face stimuli, though their tuning properties are not obvious.
Introduction
The visual cortex encodes information about the visual world by the activity pattern of a large number of neurons in the visual cortex. In spite of the number of neurons involved in this representation, understanding the response properties of individual single neurons is still a fundamental scientific problem, as it is the stimulus selectivity of the neurons making up the population that determines the encoding and the nature of the representation. The encoding of stimuli by single neurons is also interesting as the properties directly represented by single neurons have substantial influence on what tasks can be performed by the visual system efficiently (Gardner-Medwin and Barlow, 2001).
Individual neurons throughout the vertebrate visual cortex are tuned to several parameters or properties of the stimulus, and they respond only when each of these relevant parameters lies within some relatively narrow range specific to the neuron. Finding effective stimuli for a particular sensory neuron involves, on the one hand, identifying the parameters of the stimulus that are relevant to the neuron under investigation and, on the other hand, finding combinations of values of these parameters that are effective at activating the neuron. This is a non-trivial task even in the relatively well-characterized early visual areas, where the stimulus parameters relevant to the neurons are often assumed to be known. For instance, in primary visual cortex, stimulus location, size, orientation, spatial and temporal frequency, color and stereoscopic disparity are often considered the relevant parameters. The problem, however, becomes much harder in higher visual areas, for instance, in higher areas of the
ventral, ‘pattern recognition’ pathway of the visual system, such as in areas V4 and especially in inferotemporal (IT) cortex. Stimulus selectivity of the neurons in these areas is much less well characterized, as these cells’ preferences involve complex shapes and specific combinations of simpler features. The selective properties in these higher areas are usually impossible to account for by simple stimulus features, such as edge orientation, position or color. The dependence of the neural response on the stimulus parameters is more complicated in these areas, and it is not even clear what the relevant parameters are. Cells in temporal cortex have been found to respond to complex visual patterns and objects, such as hands and geometric shapes (Gross et al., 1972; Perrett et al., 1982; Desimone et al., 1984; Kendrick and Baldwin, 1987; Tanaka et al., 1991; Young and Yamane, 1992; Miyashita, 1993; Tanaka, 1996). The preferred responses have also been shown to depend on and to be modified (possibly even defined) by visual experience (e.g. Miyashita, 1993; Gauthier and Logothetis, 2000; Gilbert et al., 2001). There are serious limitations to existing techniques used to define selectivity of neurons at these intermediate and higher levels of sensory processing. The main practical limitation is the amount of time available in an experiment for the extra-cellular recording of the activity of a single neuron, which is rarely more than an hour due to experimental conditions, physiological factors or, in the case of awake subjects, their motivation to look. The total experimental time limits the number of stimuli that can be presented. This is especially restrictive in case of the conventional presentation method, which presents stimuli for up to a second, leaving longer pauses between the presentation of subsequent stimuli in order to isolate the responses. This limitation raises the serious problem of stimulus selection. The selection of the stimulus set is often influenced by previous experiments, the experimenters’ hypotheses about the function of the neurons under investigation, theoretical arguments and intuitions. This rather subjective process has a fundamental effect on the results as they consist of tuning curves defined over only those stimuli that were selected for the experiment. There have been attempts to find less subjective, theoretically inspired sets of stimuli, such as ‘Fourier descriptors’ of boundary curvature
(Schwartz et al., 1983), 2-D square wave grating or Walsh patterns (Richmond et al., 1987), polar gratings (Gallant et al., 1993, 1996), fractal patterns (Miyashita, 1993), curves and angles (Pasupathy and Connor, 1999), face and body images (Perrett et al., 1982, 1985, 1991). These studies suffer from similar problems in that they find systematic tuning curves to the selected parameters of the stimulus set, but these properties have not been proven to be a fundamental visual ‘alphabet’. The neural responses cannot be predicted to an arbitrary set of stimuli and it is difficult even to establish whether the function of the neurons is related to the tested stimulus set. The failure of these schemes as models of the response of higher visual cells is not immediately apparent in these studies because the stimulus sets have only a small number of ‘control stimuli’ outside these relatively narrowly defined domains. For instance, studies of faces typically use collections of faces and face-like stimuli but only a small set of non-face control objects. The tuning properties observed could then not be interpreted as objective evidence for the cells’ biological function, as the results are affected by the initial selection of the stimulus set used for the study. For a more objective description of the functional properties of high-level visual cells, we need a much larger number and a broader range of stimuli. Even though the selection problem cannot be avoided completely, in this study we attempted to maximize both the number and the diversity of natural stimuli used for the description. One important task was to identify the conditions under which a high amount of information can be gained in a given total experimental time about the neurons’ responses. Our earlier studies (Keysers et al., 2001) indicated that cortical neurons’ selectivity and discrimination power in area STSa of the temporal lobe is maintained across a wide range of stimulus presentation speeds. The continuous and rapid stimulus presentation (RSVP) technique introduced there seemed therefore suitable for describing response properties in STS. To provide a broad range and diversity of stimuli, we used natural and naturalistic photographs and graphics downloaded from various image archives. We aimed not to select images on any specific criterion in order to maximize the diversity of the stimulus set and minimize the bias introduced by our subjective selection.
Thus, with a far more extensive library of visual images, we aimed to define the types of stimuli to which cells in temporal cortex respond. To this end we investigated different ways of isolating effective stimuli, comparing exhaustive search methods, in which the response to every stimulus in the set was measured over multiple trials, with progressive narrowing search methods, in which the most effective stimuli in one round of testing were selected for a subsequent round of testing. In addition to the most effective stimuli, a small number of the least effective stimuli were also selected in order to preserve variability in the neural response and to reduce adaptation.
Methods Physiology The experiments were conducted on an awake monkey (Macaca mulatta), seated in a primate chair and head restrained. We located an area within the cortex of the lower and upper banks of the anterior superior temporal sulcus (STSa, 12–18 mm anterior to the interaural plane) containing cells responsive to faces, and recorded from single neurons using standard extracellular recording methods (see Oram and Perrett, 1992) while the monkey performed a fixation task. Cell positions were confirmed by X-rays of microelectrodes and histology of microlesions and DiI (as used by Snodderly and Gur, 1995). The subject’s eye position was monitored (accuracy 1 deg; IView, SMI, Germany). The subject received fruit juice reward every 500 ms as long as fixation was maintained within 5° of the center of the screen. A 486 PC and Cambridge Electronics CED 1401 interface recorded eye position and spike arrival times and measured stimulus onset times.
Stimuli Images (256 × 320 pixels, 10° × 12.5°) were presented in rapid continuous random sequences, each image lasting 55 ms (18 images per s) or 110 ms, on the center of the screen of a 72 Hz Sony GDM-20D11 monitor connected to an SGI Indigo2 workstation. Onset and duration of the stimuli were monitored using light-sensitive diodes attached to the monitor screen.
The 1704 stimuli (for a sample see Fig. 1) collected from internet image libraries included color and black-and-white images of human and monkey head and body views, animals, natural outdoor scenes, fruits, man-made objects, vehicles, buildings, cartoons, abstract patterns and drawings. To allow the measurement of view-angle tuning of face-selective cells, fifteen percent of the stimuli were prepared to contain clear views of faces on gray backgrounds (‘lab faces’), with view angles every 45° in the horizontal plane commencing with the frontal face image. Two types of testing conditions were employed: ‘exhaustive’ and ‘reductive’. In the exhaustive condition, 600 or 1200 stimuli were presented, each repeated up to 15 times. In reductive testing, an initial set of 1200 or 600 images was presented 1–5 times, then the stimuli were ranked according to the magnitude of the responses they evoked. Stimuli located in the lower middle part of the rank order were dropped, and only the remaining stimuli were presented in subsequent stages. This process was repeated until a relatively small number of stimuli were left. A number of different stimulus reduction schedules were tried, which are plotted in Fig. 2. An example of the stimulus reduction process is shown in Fig. 3.
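To make the narrowing procedure concrete, the following sketch (in Python, used here purely for illustration) implements one possible reduction loop. The estimate_responses callable, the frac_best split between best and worst survivors, and the example stage sizes are assumptions, not the exact schedules used (several schedules were tried; see Fig. 2).

```python
# Hypothetical sketch of the reductive ("progressive narrowing") search.  The
# estimate_responses callable, the frac_best split and the example stage sizes
# are illustrative assumptions; several different schedules were used (Fig. 2).
import numpy as np

def reductive_search(estimate_responses, stage_sizes, frac_best=0.9):
    """Progressively narrow a stimulus set.

    estimate_responses: function taking an array of stimulus indices and returning
        a (noisy) mean response estimate per stimulus from a few presentations.
    stage_sizes: number of stimuli retained at each stage, e.g. [1200, 600, 300, 150, 75].
    frac_best: fraction of survivors taken from the top of the rank order; the rest
        are taken from the bottom, to preserve response variability and limit adaptation.
    """
    stimuli = np.arange(stage_sizes[0])              # indices into the image library
    for next_size in stage_sizes[1:]:
        est = estimate_responses(stimuli)
        order = np.argsort(est)[::-1]                # rank by response, best first
        n_best = int(round(next_size * frac_best))
        n_worst = next_size - n_best
        keep = np.r_[order[:n_best], order[len(order) - n_worst:]]
        stimuli = stimuli[keep]                      # lower-middle ranks are dropped
    return stimuli
```

Keeping a small tail of the least effective stimuli, as in the final line of the loop, mirrors the strategy described above for preserving variability in the neural response.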
Analysis The times of individual neural spikes were recorded, and they were collected separately by stimulus identity. The spike times were recorded relative to the onset of each stimulus (see Keysers et al., 2001). As the different stimuli followed each other without inter-stimulus gaps, the window for counting response spikes for a particular stimulus had to be chosen carefully. An inappropriate window would count spikes caused by unrelated, randomized surrounding stimuli. For the on-line selection of the stimulus set for the next presentation stage, the spikes were counted using a window starting 100 ms after stimulus onset and a width equal to the presentation duration of a single stimulus (55 ms or 110 ms). For off-line analysis, a latency value for each cell was estimated separately, and the spikes were counted again using a window starting at the latency, with a
Fig. 1. 256 random samples from the 1704 images used as the initial stimulus set.
Fig. 2. Reduction schedules for the reductive stimulus testing condition. A range of reduction schedules were tested. For each of the reductive experiments, the number of remaining stimuli for each stage is plotted.
width equal to the presentation duration of a single stimulus. For the analysis of view-angle tuning, only cells’ responses to the clear ‘lab face’ stimuli were used. To determine if a cell responded to faces, a wider response window (from 50 ms after stimulus onset to stimulus end + 100 ms) was used. Responses to all stimuli were then ranked, and the spikes from the top 10 stimuli were accumulated to form collective peri-stimulus histograms with 1-ms bins. A spike density function was computed by smoothing this histogram with a Gaussian function (standard deviation = 6 ms). A baseline was defined as the mean activity measured during a window ending 40 ms after stimulus onset with duration equal
to the duration of a single stimulus. As 40 ms is shorter than the shortest cell latency, this window is affected only by the randomized stimuli surrounding the one under investigation, so this baseline is effectively an average of the responses to the stimuli. The value of the baseline plus 2.58 times the standard deviation of the base period was taken as the threshold above which the response was considered effective. The beginning of a run of twenty-five consecutive supra-threshold bins marks the response onset, and hence the latency. Responses to all stimuli were then recalculated using this latency. Response windows were defined as latency plus stimulus duration plus half of the smoothing window length. Activity during the baseline period was treated as baseline activity. t-tests were performed on the activities of base and response periods at the significance level of 5%. If a cell responded significantly to one or more views of a ‘lab-face’ stimulus, all the lab-face stimulus sets (8 views each) which included at least one significant stimulus were included in the view-tuning analysis. Where the resulting stimuli covered the whole 8-view range, the view tuning was computed. The responses to the same view were averaged across tested stimulus sets. Maximum and minimum values
Fig. 3. An example of reductive testing for one cell (073_5_2). This experiment consisted of five stages (1199, 600, 300, 150, 75 stimuli), represented in the graph by the five vertical columns. Within each column, the stimuli are rank-ordered based on the total cumulative response, including responses from previous stages. The stimuli with highest responses are at the top of the columns, while the least effective stimuli are at the bottom. Line segments between the columns connect the rank positions of the identical surviving stimuli between the stages. The white triangular gaps between the columns correspond to the lower middle ranking stimuli excluded from the next stage.
of the response were calculated and the responses were normalized as (resp - min)/(max - min), mapping them to the range 0 to 1. The view-tuning curves were then aligned so that the highest response occupies the middle position (i.e. is defined as the best view), with the other views shifted accordingly.
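The latency estimation and view-tuning normalization described above can be summarized in the following illustrative sketch. The 1-ms binning, the 6-ms Gaussian smoothing, the 2.58-SD criterion, the 25-bin run length and the normalization formula follow the text; the function names, and the simplification that the histogram starts at stimulus onset (so part of the baseline window is clipped), are assumptions.

```python
# Illustrative reconstruction of the off-line response-window analysis (a sketch,
# not the authors' code).
import numpy as np
from scipy.ndimage import gaussian_filter1d

def estimate_latency(spike_counts, stim_dur_ms, bin_ms=1):
    """spike_counts: 1-D array, peri-stimulus histogram summed over the top 10 stimuli."""
    density = gaussian_filter1d(np.asarray(spike_counts, dtype=float), sigma=6 / bin_ms)
    # Baseline: a window one stimulus-duration long, ending 40 ms after onset
    # (dominated by the randomized stimuli surrounding the one under study).
    start = max(0, (40 - stim_dur_ms) // bin_ms)
    base = density[start:40 // bin_ms]
    threshold = base.mean() + 2.58 * base.std()
    run = 0
    for t, above in enumerate(density > threshold):
        run = run + 1 if above else 0
        if run == 25:                       # 25 consecutive supra-threshold bins
            return (t - 24) * bin_ms        # start of the run = response latency (ms)
    return None                             # no significant response found

def normalize_tuning(view_responses):
    """Map a view-tuning curve to [0, 1] and rotate its peak to the centre ('best view')."""
    r = np.asarray(view_responses, dtype=float)
    r = (r - r.min()) / (r.max() - r.min())
    return np.roll(r, len(r) // 2 - int(np.argmax(r)))
```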
Results Effectiveness of the stimulus set Across the 32 recorded STSa neurons, mean response latency was 75 ms, ranging from 61 to 108 ms. On average, 8% (range 0.75–20%) of stimuli tested produced a statistically significant response. The effectiveness of stimuli in driving the neurons under study is illustrated in Fig. 4 which shows results of the exhaustive testing condition. The maximal response was calculated for each cell, and the histogram of the
Fig. 4. The effectiveness of the stimuli in the exhaustive testing condition (22 cells tested). For each cell tested in the exhaustive condition, a histogram of neural responses is shown. The horizontal axis is the response magnitude expressed as a percentage of the maximal response for the given cell (ten 10% wide bins are used). The vertical axis represents the fraction of stimuli falling in those bins (i.e. the bin labeled ‘10’ corresponds to responses between 0 and 10% of the maximal response of the neuron).
Fig. 5. Response magnitudes (in spikes/s) rank-ordered by response magnitude for three example cells. The small images below the plot show reduced-size versions of the stimuli. The left edge of each image is aligned with the rank order on the horizontal axis of the plot so that the most effective stimuli are visible in the leftmost column of images. (a) and (b) show the responses of two view-tuned cells that are highly selective for faces, while (c) shows the responses of a cell selective against faces.
percent of stimuli for each relative response magnitude bin is plotted. Relative response is the ratio of response to the maximal response of the neuron, and the 0–100% response range was divided into 10 equal width bins. A curve peaked at the low response bins reflects a neuron that is narrowly tuned to the stimulus set (i.e. only a few stimuli are effective). Figure 4 shows that the recorded neurons varied substantially in their breadth of tuning (or ‘lifetime sparseness’). Figure 5 shows the rank ordered responses to the stimuli of three cells. The cell in Fig. 5a shows strong selectivity to faces with especially high responses to the clearest, approximately front-facing ‘lab face’ images of monkey and human faces. The responses to the 103 stimuli in the final stage of stimulus reduction
are shown; they are the result of thirty 55-ms presentations of each stimulus. Remarkably, all of the top 71 of the 103 final stimuli contain faces. Medium-magnitude responses can be seen to face drawings and to partially occluded, image-manipulated or cartoon faces. Figure 5b shows the result of stimulus reduction for another cell with face-selective responses; the 73 images of its final stimulus set are shown (19 presentations each). Figure 5c shows the top 144 stimuli for a cell selective for stimuli other than faces, from an exhaustive presentation experiment with 600 stimuli (five 55-ms presentations each). The effective stimuli contain a wide variety of images.
Face category selectivity A measure of face selectivity is the proportion of images containing faces out of all the images evoking significant responses. Figure 6 shows the histogram of the number of cells showing different amounts of face selectivity. The number of effective stimuli containing faces was divided by the total number of effective stimuli. This ratio was then divided into ten equal bins between 0% (corresponding to no faces in the effective stimulus set) to 100% (corresponding to all effective stimuli containing faces). The distribution of number of cells in these bins is shown in Fig. 6. All stimulus images were rated for the clarity of faces in them by two human subjects on a 10 point scale, with 1 corresponding to no face and 10 corresponding to the full size clear face images with a uniform
Fig. 6. Histogram showing a measure of the cells’ face selectivity. Face selectivity was characterized by the fraction of the effective stimuli that contained clear faces as rated by two human observers.
background. A minimum average rating of 5 was necessary for considering an image a face stimulus. Seven cells responded almost exclusively to faces, while six cells responded almost exclusively to non-face stimuli, and several responded to both faces and non-faces. The histogram is bimodal; most cells are either mainly face cells or non-face cells.
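As a rough sketch, the face-selectivity index plotted in Fig. 6 can be computed as below; the input arrays and names are assumptions, while the threshold of 5 on the 10-point clarity scale follows the text.

```python
# Sketch of the face-selectivity index plotted in Fig. 6 (illustrative only).
import numpy as np

def face_selectivity(effective_ids, face_ratings, rating_threshold=5):
    """Fraction of the effective stimuli that contain a clear face.

    effective_ids: indices of stimuli evoking a statistically significant response.
    face_ratings: mean of the two observers' face-clarity ratings (1-10) per stimulus.
    """
    is_face = np.asarray(face_ratings) >= rating_threshold
    return is_face[np.asarray(effective_ids)].mean()   # 0 = pure non-face cell, 1 = pure face cell
```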
View tuning Previous studies indicated that face cells respond preferentially to a particular head orientation, with some cells preferring frontal views of faces (i.e. facing the experimental animal) and others preferring one of the profiles or the view of the back of the head (Perrett et al., 1985, 1991; Hasselmo et al., 1989). We investigated this view-angle tuning for the face cells with our rapid presentation methods to replicate previous studies using a larger set of stimuli. Figure 7 shows the average tuning curve (with standard error bars) across the cells selectively responsive to the face or other views of the head. Before averaging, the peaks of the individual tuning curves were aligned to reveal the characteristic shape of the tuning curves (see Methods for details). The curve drops to half of its peak value at approximately 60° away from the peak, which is in agreement with previous studies using conventional, slow presentation methods.
Fig. 7. The view tuning of cells selectively responsive to the sight of the head. The magnitude of the responses is plotted as a fraction of the response to the optimal view angle (‘best’). Mean and standard error are shown averaged across the face selective cells. Only the clear, ‘lab face’ stimuli were considered for this analysis.
Reliability of search
Fig. 8. The reliability of the reductive method as shown (a) by two repeated narrowing experiments on the same cell, (b) by comparison of the result of the exhaustive and the reductive testing on the same cells. (a) The fraction of stimuli remaining in the final stage of second narrowing experiment is plotted as a function of the relative response magnitude in the first test. Responses were categorized into five 20%-wide bins, i.e. the bins labeled ‘20’ correspond to 0–20% of the maximal response of the cell in the first test. The two different shades (black and gray) represent two different cells (black: 88_4 [tests 3 vs. 2], gray: 89_4 [tests 5 vs. 4]). (b) The fraction of stimuli surviving the reduction process as a function of the response magnitude. Stimuli were binned based on the responses in the exhaustive condition into ten 10%-wide bins. Four pairs of tests are shown in four shades (from black to gray: cells 91_2 [tests 2 vs. 3], 203_4 [tests 13 vs. 14], 204_1 [tests 3 vs. 4], 204_1 [tests 2 vs. 5]).
In order to assess the reliability of the reductive method, some of the cells were either tested with both an exhaustive test and a reductive test using the same set of stimuli, or tested with two identical reductive procedures in succession. If the methods are reliable, we expected most of the stimuli that gave a high response in the first test to survive the stimulus reduction stages in the second test. We found that the exhaustive and reductive search procedures revealed similar effective stimuli for individual cells. The probability of images being retained in reductive searches was related to their effectiveness: the most effective stimuli had a much greater chance of remaining in the final reduced set of stimuli. Fig. 8a shows the result of the repeated stimulus reduction procedures for two cells (bars for one of the cells are indicated in black, the other cell is shown in gray). The maximal response for each cell was calculated, and the stimuli were binned based on the relative response using five bins of equal width. For both cells, all stimuli that elicited a response greater than 60% of the maximum were found in both experiments (corresponding to the four 100% bars on the right). In the exhaustive vs. reductive pairs (Fig. 8b), any stimulus above 50–60% response in the exhaustive tests was also found with the reduction process. For both conditions there was a general tendency for stimuli with larger responses in the first test to survive the second test. Figure 9 shows the result of three tests on the same neuron selective for stimuli other than faces. The top and middle rows are the most effective stimuli found
Fig. 9. Best nine stimuli for a cell selective for stimuli other than faces. The columns represent the stimulus rank (with the most effective stimulus on the left). The three rows correspond to subsequent, independent experiments on the same cell. The first two rows used the same initial set of 600 images; the two most effective stimuli are identical, and seven of the top nine stimuli are the same in these two experiments. The third row is the result of an experiment on the same cell using a completely distinct set of 600 initial images.
in two consecutive tests using the same initial stimulus set of 600. Seven of the top nine stimuli were identical, indicating that the stimulus reduction process was effective and that the cell had a systematic selectivity. The bottom row shows the most effective stimuli in an experiment using a different set of 600 stimuli. The visual basis for the selectivity apparent in the first and second stimulus sets for this cell may have been the inclusion of high spatial frequency patterns within the test images. The correlation coefficient between the responses of the same cell to the common stimuli (those present in both experiments) was also calculated for all repeated experiments. For the repeated reductive tests, the mean correlation coefficient was r = 0.84 (range 0.78–0.91). For the comparison of exhaustive and reductive tests, the correlation coefficient between the responses to the stimuli in the final reductive stage was calculated; for the four pairs, the mean correlation coefficient was r = 0.69 (range 0.52–0.9).
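The reliability measure reduces to a Pearson correlation over the stimuli common to two tests of the same cell; a minimal sketch, assuming the responses are stored as dictionaries keyed by stimulus identity, is given below.

```python
# Minimal sketch of the test-retest reliability measure (the dictionary input
# format is an assumption, not the authors' data structure).
import numpy as np

def retest_correlation(resp_test1, resp_test2):
    """resp_test1, resp_test2: dicts mapping stimulus id to mean response in each test."""
    common = sorted(set(resp_test1) & set(resp_test2))
    a = np.array([resp_test1[s] for s in common])
    b = np.array([resp_test2[s] for s in common])
    return np.corrcoef(a, b)[0, 1]
```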
Discussion We showed that the rapid stimulus presentation method we introduced earlier for single-cell electrophysiological experiments (Keysers et al., 2001) is an efficient way of describing neural selectivity using a much larger number of stimuli than had been possible previously. A less restricted stimulus set allows the experiment to be better controlled and less subjective. We combined the rapid presentation method with a successive stimulus set reduction process. This process uses the relatively unreliable response estimates obtained from a small number of initial trials on an extensive set of stimuli to retain potentially effective candidate stimuli for further testing and to eliminate those stimuli that are unlikely to be effective. These unreliable and noisy estimates are then improved by further testing trials, and the stimulus reduction is repeated until the response estimates for the final set of effective stimuli are reliable. The finite amount of experimental time available for testing a single neuron means that in conventional testing methods there is an unavoidable trade-off between the number of stimuli and the reliability of the results. The proposed reduction method, however, achieves a reliable response estimate for the effective stimuli from a very large initial set of stimuli.
We used a large library of images containing a wide range of different types of images in our experiments. We found a population of cells that responded selectively to images of human and animal faces. This provides strong confirmation of earlier findings, as these experiments, unlike previous ones, selected the effective stimuli from a set that contained mostly non-face, control images. The view-tuning properties found earlier were also reproduced using the new method (Perrett et al., 1985, 1991; Hasselmo et al., 1989). Several cells responded to images both with and without faces, and we found a population of cells selectively responsive to non-face stimuli. The distribution of tuning shows bimodality, i.e. cells were either predominantly face-selective or non-face selective. This confirms that the dimension of ‘faceness’ was important to the neural coding within the area studied. Had the selectivity for faces evident in some cells been an incidental property produced by random sensitivity to particular visual features present in some of the face images, then the population-level selectivity between face and other stimuli displayed in Fig. 6 would have been flat rather than bimodal. The reductive process was also found effective for cells selective for non-face patterns. These cells showed a reliable and repeatable tuning to the stimuli (for instance, see Fig. 9), but the features that explain or predict the response were not necessarily obvious. Synthetically generated optimized stimuli (Földiák, 2001) may help reveal more about the selectivity and function of these cells in the future.
Abbreviations
IT: inferotemporal cortex
RSVP: rapid serial visual presentation
STSa: superior temporal sulcus, anterior
Acknowledgments This work was supported by a BBSRC (U.K.) grant to D. Perrett and P. Földiák. P. Földiák was also partially supported by the Albert Szent-Györgyi Fellowship (Hungary) and the NSF (USA).
References Desimone, R., Albright, T.D., Gross, C.G. and Bruce, C. (1984) Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci., 8, 2051–2062. Földiák, P. (2001) Stimulus optimisation in primary visual cortex. Neurocomputing, 38–40(1–4): 1217–1222. Gallant, J.L., Braun, J. and Van Essen, D.C. (1993) Selectivity for polar, hyperbolic and Cartesian gratings in macaque visual cortex. Science, 259(5091): 100–103. Gallant, J.L., Connor, C.E., Rakshit, S., Lewis, J.W. and Van Essen, D.C. (1996) Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. J. Neurophysiol., 76(4): 2718–2739. Gardner-Medwin, A.R. and Barlow, H.B. (2001) The limits of counting accuracy in distributed neural representations. Neural Comput., 3(3): 477–504. Gauthier, I. and Logothetis, N.K. (2000) Is face recognition not so unique after all? Cogn. Neuropsych., 17(1–3): 125–142. Gilbert, C.D., Sigman, M. and Crist, R.E. (2001) The neural basis of perceptual learning. Neuron, 31(5): 681–697. Gross, C.G., Rocha-Miranda, C.E. and Bender, D.B. (1972) Visual properties of neurons in inferotemporal cortex of the macaque. J. Neurophysiol., 35, 96–111. Hasselmo, M.E., Rolls, E.T., Baylis, G.C. and Nalwa, V. (1989) Object centred encoding by face-selective neurons in the cortex of the superior temporal sulcus of the monkey. Exp. Brain Res., 75, 417–429. Kendrick, K.M. and Baldwin, B.A. (1987) Cells in temporal cortex of conscious sheep can respond preferentially to the sight of faces. Science, 236, 448–450. Keysers, C., Xiao, D.K., Földiák, P. and Perrett, D.I. (2001) The speed of sight. J. Cogn. Neurosci., 13(1): 90–101. Miyashita, Y. (1993) Inferior temporal cortex — where visual perception meets memory. Ann. Rev. Neurosci., 16, 245–263.
Oram, M.W. and Perrett, D.I. (1992) Time course of neural responses discriminating different views of the face and head. J. Neurophysiol., 68, 70–84. Pasupathy, A. and Connor, C.E. (1999) Responses to contour features in macaque area V4. J. Neurophysiol., 82(5): 2490–2502. Perrett, D.I., Rolls, E.T. and Caan, W. (1982) Visual neurons responsive to faces in the monkey temporal cortex. Exp. Brain Res., 47(3): 329–342. Perrett, D.I., Smith, P.A.J., Potter, D.D., Mistlin, A.J., Head, A.S., Milner, A.D. and Jeeves, M.A. (1985) Visual cells in the temporal cortex sensitive to face view and gaze direction. Proc. Roy. Soc. London B, 223, 293–317. Perrett, D.I., Oram, M.W., Harries, M.H., Bevan, R., Hietanen, J.K. and Benson, P.J. (1991) Viewer-centred and object-centred coding of heads in the macaque temporal cortex. Exp. Brain Res., 86, 159–173. Richmond, B.J., Optican, L.M., Podell, M. and Spitzer, H. (1987) Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. 1. Response characteristics. J. Neurophysiol., 57. Schwartz, E.L., Desimone, R., Albright, T.D. and Gross, C.G. (1983) Shape-recognition and inferior temporal neurons. Proc. Natl. Acad. Sci. USA, 80(18): 5776–5778. Snodderly, D.M. and Gur, M. (1995) Organization of striate cortex of alert, trained monkeys (Macaca fascicularis): Ongoing activity, stimulus selectivity, and widths of receptive field activating regions. J. Neurophysiol., 74, 2100–2125. Tanaka, K. (1996) Inferotemporal cortex and object vision. Ann. Rev. Neurosci., 19, 109–139. Tanaka, K., Saito, H., Fukada, Y. and Moriya, M. (1991) Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J. Neurophysiol., 66, 170–189. Young, M.P. and Yamane, S. (1992) Sparse population coding of faces in the inferotemporal cortex. Science, 256(5061): 1327–1331.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 8
Cortical interactions in vision and awareness: hierarchies in reverse
Chi-Hung Juan1,*, Gianluca Campana2,3 and Vincent Walsh4
1 Department of Psychology, 301 Wilson Hall, Vanderbilt University, Nashville, TN 37240, USA
2 Department of Experimental Psychology, University of Oxford, South Parks Rd, Oxford, OX1 3UD, UK
3 Dipartimento di Psicologia Generale, Università degli Studi di Padova, Via Venezia 8, Padova, Italy
4 Institute of Cognitive Neuroscience, University College London, 17 Queen Sq, London, WC1N 3AR, UK
Abstract: The anatomical connections between visual areas can be organized in ‘feedforward’, ‘feedback’ or ‘horizontal’ laminar patterns. We report here four experiments that test the function of some of the feedback projections in visual cortex. Projections from V5 to V1 have been suggested to be important in visual awareness, and in the first experiment we show this to be the case in the blindsight patient GY. This demonstration is replicated, in principle, in the second experiment and we also show the timing of the V5–V1 interaction to correspond to findings from single unit physiology. In the third experiment we show that V1 is important for stimulus detection in visual search arrays and that the timing of V1 interference with TMS is late (up to 240 ms after the onset of the visual array). Finally we report an experiment showing that the parietal cortex is not involved in visual motion priming, whereas V5 is, suggesting that the parietal cortex does not modulate V5 in this task. We interpret the data in terms of Bullier’s recent physiological recordings and Ahissar and Hochstein’s reverse hierarchy theory of vision.
Introduction
The classical view of visual cortical processing is one of simple receptive fields in area V1 distributing information to higher areas in which cells have larger receptive fields and more complex response properties. A cell in V1, for example, may have a receptive field (RF) of less than one degree of visual angle and give its largest response to a bar oriented within a few degrees of a critical orientation (Hubel and Wiesel, 1977). By comparison, a cell in extrastriate area V4 may have a RF size of up to 30 degrees or more and respond to more complex configurations of shape and color (Heywood and Cowey, 1987). At the next stage in visual cortex, the inferotemporal cortex, one finds neurons that respond selectively to faces, objects and different views of the preferred stimulus (Gross et al., 1972). This bottom-up hierarchical view is still dominant. It is true that we now know more about back projections from higher to lower areas and about parallel processing of visual attributes. However, the interactions between higher and lower areas are usually considered in two contexts: first, in a bottom-up constructivist manner in which the lower areas feed the higher areas with simple information that is then elaborated; second, in a top-down, modulatory manner in which higher areas influence the selectivity of lower areas. The weight of evidence for this view depends partly on the kinds of questions that have been asked of V1 neurons. When illusory contours were considered, and indeed termed
*Corresponding author. Tel.: +1-615-343-7538; Fax: +1-615-343-8449; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14400-8
‘cognitive contours’, it was presumed that the perception was generated in a cognitive, higher area such as inferotemporal cortex. Subsequent studies of V2 (von der Heydt et al., 1984) and later of V1 neurons (Grosof et al., 1993), however, showed that these lower areas contained the necessary architecture and response properties to retrieve the form of illusory contours. Similarly, because attention, in one of its many flexible guises, was considered to be dominated by inferotemporal and parietal cortices, few experiments were aimed at assessing whether a V1 or an extrastriate neuron changed its responses to stimuli depending on behavioral relevance. Those studies that did probe these and other questions, aimed at reassessing the role of these areas in visual processing, consistently found that V1 neurons were indeed sensitive to the context of a visual scene (Zipser et al., 1996; Nothdurft et al., 1999), showed responses that can be described as attentional (Motter, 1993; Somers et al., 1999; Brefczynski and DeYoe, 1999), and responded at times later than the mean latencies for secondary visual areas (Schmolesky et al., 1998; Bullier, 2001). Further, an influential new theory of vision provides principled reasons for reconsidering the role of V1 in vision (Ahissar and Hochstein, 2000). In their reverse hierarchy theory of vision, Ahissar and Hochstein propose that visual processing follows a global-to-local trajectory in which extrastriate neurons with large RFs carry out an initial coarse-grained analysis of the visual field, followed by a more detailed analysis in earlier visual areas. Against this background we have used transcranial magnetic stimulation (TMS) to examine interactions between visual areas. In this paper we consider the interactions between areas V5 and V1 in awareness (Cowey and Walsh, 2000; Pascual-Leone and Walsh, 2001), between extrastriate and striate cortex in visual search (Juan and Walsh, 2003) and between the parietal cortex and extrastriate cortex in visual priming (Campana et al., 2002). Based on the results of these experiments we argue that V1 is necessary for visual awareness, that feedback to V1 is necessary for accurate performance in detecting visual targets in a noisy array, and that extrastriate area V5, rather than the parietal cortex, is the critical site for priming of visual motion. These results are consistent with the reverse hierarchical view of vision.
In the discussion we place these results in a wider context.
Experiment 1. V1 and awareness: neuropsychological evidence Rationale One of the areas of contention in the role of V1 in higher functions concerns its putative role in awareness. One view is that activity in an extrastriate area specialized for processing a given visual attribute is sufficient for awareness of that attribute (Zeki, 1993). An alternative view (Stoerig and Cowey, 1997; Weiskrantz, 1997) is that activity in an extrastriate region only contributes to awareness as a result of interaction with V1. Both these positions have been supported in experiments on the blindsight patient GY, but it is always problematic to compare visual experiments on GY because subtle differences in stimulus conditions can result in large differences in the results. By using TMS one can bypass the geniculostriate visual pathways in GY and induce activity directly into V5. If this activity is sufficient for awareness, GY should report seeing moving visual phosphenes when TMS is applied over his V5.
Methods We compared the effects of TMS over V1 and V5 in GY, in a 61-year-old, peripherally blind patient, PS (Cowey and Walsh, 2000), and in six normal subjects. TMS was applied using a Magstim Super Rapid stimulator with a 70 mm figure-of-eight coil. TMS was delivered either in single-pulse mode or at 4, 8, 10 or 12 Hz for a maximum of 500 ms over V1 or V5. During TMS subjects sat fixating the center of a polar plot drawing board at a distance of 57 cm. PS ‘fixated’ by placing an index finger at the center of the board. Following each pulse or train of pulses the subject reported whether or not a phosphene was seen. If so, they were asked to draw/describe the sensation. Care was taken to ensure that the site was not missed by TMS and, following co-registration of the coil with GY’s structural and functional MRI scans, twenty-five points were stimulated in several attempts to elicit phosphenes.
Results As Fig. 1 shows, when TMS was applied over V1 of the control subjects, of PS, or of the intact hemisphere of GY, phosphenes were elicited and showed the intact cortices to be organized retinotopically. When TMS was applied over V5 of the six control subjects or of PS, moving phosphenes were reported. In GY, however, moving phosphenes could only be elicited by stimulating over the right, intact hemisphere. When TMS was applied over left V5 it was impossible to
elicit the perception of moving phosphenes in the contralateral visual field.
Conclusion From this experiment we inferred that activity in extrastriate area V5 alone was insufficient to produce awareness of motion. However, this experiment did not address directly the question of interactions between V1 and other areas and it is this we turn to next.
Experiment 2. V1 and awareness: neurochronometric evidence Rationale If, from the results of the above experiment, one can conclude that backprojections from V5 to V1 are critical for visual awareness, it follows that this projection must have a time course.
Methods
Fig. 1. Moving phosphenes generated by applying TMS to V5 of a neurologically intact subject (a), a subject with bilateral optic tract damage and therefore no input to the striate cortex (b) and a subject with damage to the striate cortex of the left hemisphere (c). In GY it was not possible to elicit phosphenes by stimulating left V5.
TMS was applied over two visual cortical sites, V5 and V1, using a combination of phosphene generation and disruption methods in eight subjects. We first established that subjects perceived moving phosphenes following TMS over V5 and, following TMS over V1, stationary phosphenes in a region of the visual field that overlapped with the perceived moving phosphenes. Subjects then received suprathreshold TMS over V5 such that they reliably perceived movement, and this was either preceded or followed by TMS over V1 delivered at 80% of phosphene threshold. Thus the V1 TMS was insufficient to mask a V5-generated percept with a V1-generated percept. The V5–V1 TMS asynchrony was between −100 and −10 ms in steps of 5 ms (i.e. V1 TMS preceded V5 TMS) or between +10 and +100 ms in steps of 5 ms (i.e. V1 TMS followed V5 TMS). Subjects were asked to state whether they perceived a phosphene and, if so, whether it was moving, possibly moving (but they were uncertain), present but stationary, or whether there was no phosphene at all.
Results When TMS over V1 occurred prior to TMS over V5 there were no changes in subjects’ reports of phosphene perception. When V1 TMS followed V5 stimulation there were clear changes in perception (Fig. 2). Disruption of V1 processing between 5 and 45 ms after activity induced in area V5 resulted in fewer reports of motion and in some subjects a complete absence of the perception of movement.
Fig. 2. TMS over V1 before V5 has no effect on the percepts induced by V5 TMS. But V5 TMS that is followed by V1 TMS disrupts the perception of motion in the induced phosphene.
Conclusion These data show that the backprojection from V5 to V1 is necessary for visual awareness of activity in V5 (as shown by Experiment 1) and that the time course of the interaction covers a window of between 5 and 45 ms after signals reach V5. In the two experiments described so far, we have ‘bypassed’ V1 either by using subjects without input to V1 or with damage to it (Experiment 1) and also by directly stimulating V5, in effect exciting V5 neurons by what might be considered an abnormal route. This of course leaves open the question of how V1 might operate in more natural circumstances, and in Experiment 3 we address this.
Experiment 3. V1 and complex scene discrimination Rationale The implications of the reverse hierarchy theory of vision (Ahissar and Hochstein, 2000) extend to many
functions other than awareness and one of the most challenging hypotheses is that V1 is important as a look-up table for the fine details of the visual scene (Lee et al., 1998; Lamme and Spekreijse, 2000). If this were the case it should be possible to apply TMS to V1 at later times in processing and to disrupt visual discrimination performance.
Methods Feature and conjunction visual detection tasks were used in the experiments. Subjects reported the presence or absence of a target defined either by a unique color (a blue circle amongst red circles or a red circle amongst blue ones) or by a unique combination of color and orientation (a blue forward slash amongst an array of blue back slashes and red forward slashes). On a trial in which the target was present there were 11 distracters, and on a target-absent trial there were 12 nontargets. The search array subtended 2 × 2 degrees of visual angle. Subjects viewed the stimuli binocularly, 57 cm from the screen. Each trial began with a 500 ms central fixation point followed by the search array and a visual mask. The mask was displayed until subjects made the present/absent response by a key press. Subjects were asked to respond accurately without any time pressure. Performance was manipulated by varying the stimulus–mask onset asynchrony (SOA) in steps of 10 ms in a staircase procedure until performance reached between 62.5 and 87.5% correct. The mean SOA in the feature task was 50 ms (range 30–100 ms) and in the conjunction task 120 ms (range 80–180 ms). Responses were subjected to signal detection analysis (Green and Swets, 1966). In each experiment there was a block of 160 feature trials and a block of 160 conjunction trials. TMS was delivered randomly on half the trials. Three experiments were carried out to test this hypothesis. In Experiment 3a subjects received rTMS at 10 Hz for 500 ms at the onset of the visual stimulus in order to establish whether V1 disruption would have any effect on either of the visual search tasks. In Experiment 3b subjects received rTMS at 10 Hz for 500 ms with the onset of rTMS beginning 100 ms after the onset of the visual array. This was to test whether the first 100 ms, if left unstimulated, would be sufficient for V1 to carry out all the processing required in the feature and conjunction tasks.
In Experiment 3c subjects received double-pulse TMS (dTMS) to try to establish whether the role of V1 in feature search was confined to a particular range of times within the first 100 ms after stimulus onset, and to test whether the role of V1 in conjunction search had any temporal specificity. In the feature search task, which Experiment 3b showed to involve V1 only in the first 100 ms, subjects received dTMS at 0 and 40 ms, 0 and 100 ms and 80 and 120 ms after the onset of the visual array. In the conjunction search task which, as Experiment 3b showed, requires V1 involvement outside the first 100 ms (we could not say, without Experiment 3c, whether the first 100 ms is also important), subjects received dTMS at the same times as in the feature task and also at 140 and 180, 200 and 240, and 260 and 300 ms after stimulus onset. A Magstim Super-Rapid TMS machine with a 70 mm figure-of-eight coil was used. Stimulation intensity was set at 65% of stimulator output. The coil was centered approximately 1 cm above the inion, selected on the basis of phosphene production or scotomas induced in previous experiments. We have shown previously that stimulation over this region principally excites V1 (Cowey and Walsh, 2000; Pascual-Leone and Walsh, 2001; see also Kammer, 1999).
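For reference, a standard way to obtain d′ from the present/absent responses in this kind of yes/no detection task is sketched below. The clipping of extreme hit or false-alarm rates is a common convention and an assumption here; the chapter does not state which correction, if any, was applied.

```python
# Sketch of the signal-detection analysis (Green and Swets, 1966) applied to
# present/absent responses; the rate clipping is an assumed convention.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    n_present = hits + misses
    n_absent = false_alarms + correct_rejections
    # Keep hit and false-alarm rates away from 0 and 1 to avoid infinite z-scores.
    h = min(max(hits / n_present, 0.5 / n_present), 1 - 0.5 / n_present)
    f = min(max(false_alarms / n_absent, 0.5 / n_absent), 1 - 0.5 / n_absent)
    return norm.ppf(h) - norm.ppf(f)        # d' = z(hit rate) - z(false-alarm rate)

# Example: d_prime(70, 10, 20, 60) is roughly 1.8.
```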
Results Experiment 3a Repetitive pulse TMS significantly impaired performance on both the feature task (d′ decreased by 0.8: t(7) = 2.27, P < 0.05, Fig. 3a) and the conjunction task (d′ decreased by 0.8: t(7) = 3.01, P < 0.05, Fig. 3b).
Experiment 3b There was no effect of TMS on the feature search task when the first 100 ms after the onset of the visual array were left unstimulated (t(5) = 0.02, P > 0.05, Fig. 3c). In the conjunction task rTMS reduced d′ by 0.4 (t(5) = 3.07, P < 0.05, Fig. 3d).
Experiment 3c In the feature task, dTMS produced a significant cost in d′ (F(3, 15) = 6.26, P < 0.01, Fig. 3e) when applied
Fig. 3. The effects of TMS over V1 on feature and conjunction d′ scores. (a) Feature search, (b) conjunction search; (c) feature search, (d) conjunction search; (e) feature search, (f) and (g) conjunction search.
at 0 and 100 ms (t(5) = 2.85, P < 0.05) and at 80 and 120 ms (t(5) = 6.05, P < 0.01) after array onset. In the conjunction task dTMS significantly impaired performance (F(3, 15) = 10.98, P < 0.01, Fig. 3f) when applied at any of the three early intervals (0 and 40 ms: t(5) = 3.84, P < 0.05; 0 and 100 ms: t(5) = 5.09, P < 0.01; 80 and 120 ms: t(5) = 14.42, P < 0.01) and at one of the later intervals (200 and 240 ms: t(5) = 1.72, P = 0.053, Fig. 3g).
Conclusion The interference with V1 for 500 ms after visual stimulus onset shows V1 to be required in both visual
feature and conjunction tasks (Experiment 3a). Delaying the onset of rTMS for 100 ms after visual stimulus onset shows that V1 is only required in the first 100 ms for feature search, but is important beyond the first 100 ms in conjunction search (Experiment 3b). Applying dTMS shows V1 to be involved in feature search in the later part of the first 100 ms and in the conjunction task throughout the first 100 ms and again between 200 and 240 ms (Experiment 3c). These findings thus establish that V1 is necessary for normal performance in conjunction visual search at times that suggest a sustained or recurring involvement of V1 as visual analysis proceeds. We interpret this as further evidence supporting the reverse hierarchy view of vision. Hierarchies are a feature of all cortical processing and in addition to exploring the view that V1 may have a critical position in the visual sensory hierarchy we have examined the role of V5 in what might be called a ‘visuocognitive’ hierarchy. It is sometimes argued that extrastriate areas are modulated by the parietal cortex. However, the evidence for this is indirect and in Experiment 4 we tested this assumption directly.
Experiment 4. The visual cortex and priming Rationale In the previous three experiments we explored interactions between V5 and V1, but there are many assumptions about the interactions between areas at different levels of the visual hierarchy that can be challenged. One such assumption concerns the visual abilities of the right posterior parietal cortex which has been associated with visual priming. It has long been contested, however, that extrastriate areas are critical to short-term memory (Martin-Elkins et al., 1989) and also to priming of visual attributes (Bar and Biederman, 1999; Walsh et al., 2000). The theoretical basis for this, like the reverse hierarchy theory, presents a challenge to the orthodoxy that so-called higher functions require so-called higher areas. Tulving and Schacter (1990) have articulated the Perceptual Representation System (PRS) view of memory for visual attributes and Magnussen and Greenlee (1999) have proposed a similar view of what they term perceptual memory based on
psychophysical studies. In this experiment we tested the hypothesis that a secondary visual area with some relative specialization for a visual attribute would be the area likely to represent short-term changes in the likelihood of that attribute being an important component of the visual scene, and therefore important for priming/short-term memory. The competing hypothesis is that the parietal cortex would be the area responsible for short-term priming of visual attributes (Marangolo et al., 1998).
Methods Subjects were presented with four panels of 100% coherent motion. Each panel subtended 1.9° of visual angle and was separated from its neighbours by 0.57°. Seven subjects, five male and two female, aged between 23 and 29 years and all right-handed, participated in the motion priming condition. Five subjects, three male and two female, aged between 25 and 39 years and all right-handed, participated in the color priming control. Five right-handed subjects, three male and two female, aged between 25 and 31 years, participated in the third experiment to control for nonspecific response time (RT) decreases following TMS over a nonvisually related area, in this case the vertex. All subjects were of a sufficient level of scientific education to understand the information given about magnetic stimulation and gave written, informed consent according to the Declaration of Helsinki. The experiment was approved by the local ethics committee, the Oxford Research Ethics Committee.
Stimuli and procedure Stimuli were generated using a PC with a Pentium II 400 MHz processor and presented on a Sony CPD-G200 Trinitron flat screen monitor (800 × 600 pixels, refresh rate 120 Hz) driven by a Rage IIC AGP display adapter. Stimuli were four virtual squares (1.9° × 1.9° of visual angle and black, like the background, 0.31 cd/m2) regularly arranged around the center of the screen and separated by 0.57° from the two adjacent squares. Each square contained 100 spatially random white pixels (57.2 cd/m2), all moving either horizontally or vertically at 2.96°/s in the same
direction within a square. In three of the squares the direction of motion was identical, while in the fourth square it was orthogonal to the other three. This odd direction was the target to be detected by the subject. Each trial began with a white fixation point (0.2°, 57.2 cd/m2) for 500 ms, followed by a blank screen. Stimulus conditions and procedures were identical for a color priming task, designed to test for nonspecific effects of TMS, with the exception that motion was in the same direction (top–bottom) in all four squares, the target was the odd color [either a green target (CIE.288.597) and red distracters (CIE.609.352) or vice versa] and stimuli were presented for 48 ms because good performance on the color task required fewer frames than the motion experiment. Subjects had to select one of four response buttons, using the thumb and first fingers of each hand, to indicate which square contained the target. The rTMS pulses were applied 500 ms after the response (i.e. in the second 500 ms of the intertrial interval) and were followed by the fixation spot for the next trial. A target was present on every trial. To avoid possible effects of spatial priming the target never appeared in the same square on any two consecutive trials, a fact of which the subjects were not informed and which they did not realize during the experiment. This manipulation also ensures that the priming data are not confounded either by spatial priming effects or by repetition of finger responses. The direction of the moving dots and the location of the target were pseudorandomised between trials. In the motion experiment, subjects were given 320 trials in blocks of 40, and received 80 TMS trials per stimulation site (V1, V5/MT and PPC) and 80 control trials. In the color experiment subjects were given 240 trials in blocks of 40, and received 80 TMS trials per stimulation site (V5/MT and PPC) and 80 control trials. In any given block TMS was given either on every trial or on none of the trials. In TMS experiments there is a choice between delivering TMS predictably or randomly. In our experience the best method depends on the experimental design and the task being performed. In this experiment we chose a block design to avoid an interaction between trials that were preceded by TMS or no TMS; this would have been theoretically uninformative in the context of our
hypothesis, and would have required twice the number of TMS and nonTMS trials. Practice blocks were allowed to ensure that the subjects could perform the task accurately and that the priming effect was present. In the motion experiment, the independent variable was the direction of the dots in the target square. The direction could either be the same as or different from that on the previous trial, and the aim of the statistical analyses was to assess the effects of TMS on the relative reaction times of trials in which the preceding target was moving in the same direction versus those trials in which the target was preceded by a different direction. In the color experiment, the color of the dots in the target square was the independent variable. Throughout we refer to these as Same and Different trials respectively. A third experiment was carried out to ensure that any decreases in reaction time could be attributed to intersensory facilitation and not to the effects of preactivating visual cortex by applying TMS over the visually related areas V1, V5/MT and PPC during the intertrial interval. In this control experiment the visual stimuli and magnetic stimulation parameters were identical to those used in the motion experiment with the exception that the TMS was applied over the vertex. The subjects were thus stimulated by TMS and also by the sound of the discharge and the tactile sensation on the scalp. Conceivably, subjects might respond more quickly because the TMS had a nonspecific alerting effect which could decrease reaction times to an irreducible level at which it is not possible to interpret differential effects of primed and nonprimed trials.
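A minimal sketch of how the priming effect could be computed from the trial records is given below; the trial tuple format and the function name are assumptions, not the authors' analysis code.

```python
# Illustrative computation of the priming effect per TMS site: mean RT on Different
# trials minus mean RT on Same trials (a sketch under assumed data structures).
from collections import defaultdict

def priming_effect(trials):
    """trials: iterable of (site, same_as_previous, rt_ms) tuples for correct trials."""
    rts = defaultdict(lambda: {True: [], False: []})
    for site, same, rt in trials:
        rts[site][same].append(rt)
    return {site: sum(d[False]) / len(d[False]) - sum(d[True]) / len(d[True])
            for site, d in rts.items()}

# A positive value (about 40 ms in the no-TMS baseline) indicates intact priming;
# a value near zero, as found after V5/MT TMS, indicates abolished priming.
```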
Magnetic stimulation The stimulator was a MagStim 200 Super-Rapid stimulator delivering current to a 70 mm figure-of-eight coil. Details of the stimulator, coil selection and the intensity of stimulation have been given elsewhere (Ashbridge et al., 1997). Repetitive pulse TMS was applied to three sites. For V5/MT stimulation the coil was held tangentially to the skull with the handle pointing backwards at 90° to the axis of the spinal cord, and for parietal stimulation the
handle pointed backwards at 45° to the spine. For V1 stimulation, the coil handle was oriented vertically. Pulses were delivered at an intensity of 60% of the maximum output of the stimulator, at 10 Hz for 500 ms. The posterior occipital site was located between 2 and 4 cm above the inion at the midline. We shall refer to this site as striate cortex or V1. We are aware that it is unlikely that only V1 was stimulated, but the shorthand is justifiable on several grounds. First, the interpretation of the phosphene data is consistent with the topography of striate cortex (Kastner et al., 1998; Kammer, 1999). On behavioral grounds V1 is also the most frugal interpretation, since single and repetitive pulse TMS produce effects consistent with interference to V1 whether one considers the time course of the effects (Pascual-Leone and Walsh, 2001), the phenomenology of stimulation on perceptual recognition (Kosslyn et al., 1999; Cowey and Walsh, 2000) or the disruptive effects of stimulation (Amassian et al., 1993). A left hemisphere lateral occipital site was located 3 cm dorsal and 5 cm lateral to the inion (individual subjects’ coordinates varied between 3 and 4 cm dorsally and 5 and 6 cm laterally). We refer to this site as V5/MT on the basis of previous anatomical data showing that this site overlies extrastriate area V5/MT and on the numerous previous studies that have selectively impaired visual motion perception with TMS over this site (Hotson et al., 1994; Pascual-Leone et al., 1999; Stewart et al., 1999). As in previous studies we designated the V1 and V5/MT sites on the basis of phosphene production; the V1 site was selected as the point of stimulation which gave the most consistent phosphene in the central few degrees of the visual field, and the chosen V5/MT site was where TMS produced a clear moving phosphene [for descriptions of how moving phosphenes are established and how they vary see Stewart et al. (1999)]. To establish the appropriate parietal cortical site we used a visual search task as a behavioral assay, and the region at which stimulation produced a deficit on this task was taken as the critical parietal location for visuospatial processing. The details of TMS and visual search have been given in several sources (Ashbridge et al., 1997; Walsh et al., 1998a,b; Walsh and Cowey, 1998).
Results Motion priming Baseline trials in the absence of TMS showed a clear priming effect — reaction times on Same trials were 40 ms faster than on Different trials (paired sample t-test, t = 3.3, df = 6, P = 0.016). A two-way analysis of variance (ANOVA) was carried out with factors stimulation site (three levels: V1, V5/MT and parietal) and priming (two levels: Same prime, Different prime). There was a significant effect of priming (F = 8.609, P = 0.026) and a significant interaction between priming and TMS stimulation site (F = 11.465, df = 2, P = 0.002). TMS over V5/MT in the intertrial interval abolished the priming effect (Different trials were marginally, though not significantly, faster than Same trials), and post hoc contrasts showed that the interaction of TMS site and priming was due to the effects of TMS at V5/MT relative to V1 (F = 10.776, df = 1, P = 0.017) and relative to the parietal site (F = 16.847, df = 1, P = 0.006). Repetitive pulse TMS in the intertrial interval decreased reaction times across all conditions, with no differences between V5/MT, V1, or parietal TMS-induced RT decreases (F = 1.05, df = 2, P = 0.38). Thus TMS over V5/MT had a specific effect on reaction time — it changed the relative differences between Same and Different trials. The overall decrease in reaction times has been seen in many TMS experiments and is an example of nonspecific intersensory facilitation. An elegant solution to the problem of evaluating this kind of speeding was presented by Marzi et al. (1998), who interpreted differences between facilitations as the real deficit caused by TMS. In our study this rationale is particularly relevant since the effect of interest was not merely a predicted difference between TMS stimulation sites but a predicted difference within (Same versus Different) and between (V1, V5/MT, parietal) sites. The effects of TMS were not due to an overall decrement in performance as measured by accuracy. A 3 × 2 ANOVA with factors stimulation site and prime showed that performance on Same trials was more accurate than on Different trials (F = 11, df = 1, P = 0.016), but there was no significant effect of stimulation site (F = 0.201, df = 2, P > 0.05) nor an interaction between TMS and accuracy on
Fig. 4. The effects of TMS over V5, V1 and posterior parietal cortex on visual motion priming. (a) Only V5 TMS disrupts the RT advantage conferred by a target moving in the same direction as that presented on the previous trial. (b) Motion accuracy per se was unperturbed by TMS, even over V5.
primed and unprimed trials (F = 0.56, df = 2, P > 0.05). Figure 4 shows the data for motion priming.
Color priming The difference between the effects of TMS over V5/MT, PPC and V1 provided controls for specificity of TMS stimulation site. A color condition was included to control for specific task demands to ensure that the effects of V5/MT stimulation were specific to motion and not any priming task. Baseline
trials in the color priming condition also showed a clear priming effect in the predicted direction, with a mean RT advantage of 30 ms for Same over Different trials (t = 5.55, df = 4, P < 0.005). TMS over V5/MT or parietal cortex had no effect on color priming. An ANOVA with prime (Same versus Different) and rTMS site (V5/MT versus parietal) as factors showed the effect of the prime to be significant (F = 38.83, df = 1, P < 0.005) but no effect of TMS site (F = 0.012, df = 1, P > 0.05) and no interaction between TMS site and prime (F = 0.383, df = 1, P > 0.05). A 3 × 2 ANOVA with factors stimulation site and
Fig. 5. TMS over V5 and PPC had no effects on color priming (a) or on color discrimination accuracy (b).
prime showed that performance on Same trials was slightly more accurate than on Different trials (F = 7.42, df = 1, P = 0.053), and there was no significant effect of stimulation site (F = 1.3, df = 1, P > 0.05) and no interaction between TMS and accuracy on primed and unprimed trials (F = 0.185, df = 1, P > 0.05). In other words, the effect of TMS over V5/MT was specific to motion. Figure 5 shows the data.
Vertex stimulation and motion priming A two-way ANOVA (condition: TMS/nonTMS; priming: Same/Different) showed that the priming effect of Same direction was maintained (F = 26.34, df = 1, P = 0.007) when TMS was applied over the vertex to control for visually related reaction time enhancements. The reaction time enhancements caused by nonspecific intersensory effects of TMS were significant (F = 7.8, df = 1, P = 0.049), and there was no interaction between the effects of prime and TMS (F = 3.06, df = 1, P = 0.155), indicating that TMS over vertex did not interfere with motion priming.
Conclusion The results of this experiment establish that visual area V5 is the region responsible for the short-term
priming of visual motion. Parietal cortex and primary visual cortex, on the other hand, do not play a critical role in priming.
General discussion

There are two themes to this series of experiments. The first is to explore cortico–cortical interactions in vision. The second is to raise the question: why do we always look to higher areas to explain higher functions? Why would we want to raise this question at all? Well, to quote the person honoured in this volume — as soon as everyone believes something, it is likely to be wrong. This dictum is more than mere mischief. If we decide a priori that all spatial and vaguely attentional functions require the posterior parietal cortex, that all memory functions require areas beyond the sensory cortex, and that all back projections to the sensory areas are modulatory (in the top-down sense), then we have made some brave assumptions about how the brain works. And if these assumptions are wrong, or at least greatly limited, as they seem to be, further progress will be hindered. Collectively, the experiments reported here challenge these assumptions and are best interpreted within frameworks that are not yet standard. The finding that V1, or interaction between V1 and extrastriate cortex, is necessary for visual awareness can be explained by Bullier and colleagues’ physiological work and by Ahissar and Hochstein’s Reverse Hierarchy theory. The fourth experiment was conceived within, and is explained by, Tulving and Schacter’s Perceptual Representation System hypothesis (akin to Magnussen and Greenlee’s Perceptual Memory System). We can look forward, then, to extending the approach taken in these experiments, but perhaps these too are limited — particularly by what they have borrowed from standard approaches. While we have begun to explore the reverse hierarchy views, we have done so with old stimuli — bars, dots and colors presented briefly — in the belief that these impoverished conditions either adequately represent the visual environment or at least the building blocks of it. There are two important ways in which this faith in simple stimuli may be wrong, and both can be remedied by the use of natural images and the presentation of familiar stimuli. It is clear that the environment presents many regularities to the sensory
system, and these can be described as statistical distributions of signals along dimensions such as wavelength, intensity, location, contrast, motion, time and so on; we can therefore investigate how the visual system encodes and responds to stimuli it may have evolved to handle (Olshausen and Field, 1996; Baddeley et al., 1997; Baddeley and Hancock, 1991). Using these kinds of stimuli and analytical methods may provide harsh tests of any theory based on simple arrays of bars, colors and movement. Using familiar stimuli is also an important next step to take. Consider most visual experiments: the subjects are naïve to the strange stimuli, the simple response and the experimental setting. In our everyday vision, however, we tend to be very familiar with what we see and what we do with what we see. One tends to wake up in the same bedroom, take the same train, walk the same streets, lose the same diary on the same desk; we search, remember and act in somewhat constant surroundings. Novelty is, well, novel, and a pessimistic view of much of visual cognition might be that we have developed elaborate theories of how subjects respond to novel stimuli in a novel environment and in a novel way. It is not a given that these theories will translate to natural scenes and behaviors. The reverse hierarchy theory presents a framework with which we can test the processing of natural images. The spatial scale of correlations in an image can provide a measure of what might be considered global and local in real scenes, from which we can extend the work reported in Experiment 3. Experiments 1 and 2 can also be extended using complex images. Scenes will contain correlations between attributes as well as within them, and this may provide a new starting point for the study of binding. Finally, what we remember from a scene may be changed greatly by competing information, and it is important to evaluate the sensory memory hypothesis in the real world.
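To make the idea of using the spatial scale of correlations concrete, the sketch below estimates an image's autocorrelation and reads off the lag at which it decays to a criterion, which could serve as a rough boundary between 'local' and 'global' structure. It is illustrative only: a synthetic 1/f image stands in for a natural photograph, and the 1/e criterion is an arbitrary choice rather than one taken from the studies cited.

import numpy as np

rng = np.random.default_rng(1)
n = 256
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
f = np.hypot(fy, fx)
f[0, 0] = 1.0                                   # avoid division by zero at DC
spectrum = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / f
img = np.fft.ifft2(spectrum).real               # synthetic image with a 1/f amplitude spectrum
img -= img.mean()

# Autocorrelation via the Wiener-Khinchin theorem (inverse FFT of the power spectrum).
acorr = np.fft.ifft2(np.abs(np.fft.fft2(img)) ** 2).real
acorr = np.fft.fftshift(acorr) / acorr.max()

# Radially average around the zero-lag center.
cy, cx = n // 2, n // 2
y, x = np.indices(acorr.shape)
r = np.hypot(y - cy, x - cx).astype(int)
radial = np.bincount(r.ravel(), weights=acorr.ravel()) / np.bincount(r.ravel())

# "Correlation length": first lag at which the averaged correlation falls below 1/e.
print("correlation drops below 1/e at ~", int(np.argmax(radial < 1 / np.e)), "pixels")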
Acknowledgments

Gianluca Campana is supported by a European Community Marie Curie Award; Vincent Walsh is supported by the Royal Society. Some of the work reported was supported by grants from the Medical
Research Council and an Equipment Grant from The Wellcome Trust.
References

Ahissar, M. and Hochstein, S. (2000) The spread of attention and learning in feature search: effects of target distribution and task difficulty. Vision Res., 40: 1349–1364.
Amassian, V.E., Maccabee, P.J., Cracco, R.Q., Cracco, J.B., Rudell, A.P. and Eberle, L. (1993) Measurement of information processing delays in human visual cortex with repetitive magnetic coil stimulation. Brain Res., 605: 317–321.
Ashbridge, E., Walsh, V. and Cowey, A. (1997) Temporal aspects of visual search studied by transcranial magnetic stimulation. Neuropsychologia, 35: 1121–1131.
Bar, M. and Biederman, I. (1999) Localizing the cortical region mediating visual awareness of object identity. Proc. Natl. Acad. Sci. USA, 96: 1790–1793.
Baddeley, R.J. and Hancock, P.J.B. (1991) A statistical analysis of natural images matches psychophysically derived orientation tuning curves. Proc. R. Soc. Lond., B, 246: 219–223.
Baddeley, R., Abbott, L.F., Booth, M.C.A., Sengpiel, F., Freeman, T., Wakeman, E.A. and Rolls, E.T. (1997) Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proc. R. Soc. Lond., B, 264: 1775–1783.
Brefczynski, J.A. and DeYoe, E.A. (1999) A physiological correlate of the ‘spotlight’ of visual attention. Nat. Neurosci., 2: 370–374.
Bullier, J. (2001) Integrated model of visual processing. Brain Res. Rev., 36(2–3): 96–107.
Campana, G., Cowey, A. and Walsh, V. (2002) Priming of motion direction and area V5/MT: a test of perceptual memory. Cereb. Cortex, 12(6): 663–669.
Cowey, A. and Walsh, V. (2000) Magnetically induced phosphenes in sighted, blind and blindsighted observers. Neuroreport, 11: 3269–3273.
Grosof, D.H., Shapley, R.M. and Hawken, M.J. (1993) Macaque V1 neurons can signal ‘illusory’ contours. Nature, 365(6446): 550–552.
Gross, C.G., Rocha-Miranda, C.E. and Bender, D.B. (1972) Visual properties of neurons in inferotemporal cortex of the Macaque. J. Neurophysiol., 35(1): 96–111.
Green, D.M. and Swets, J.A. (1966) Signal Detection Theory and Psychophysics. Wiley, New York.
Heywood, C.A. and Cowey, A. (1987) On the role of cortical area V4 in the discrimination of hue and pattern in macaque monkeys. J. Neurosci., 7(9): 2601–2617.
Hotson, J., Braun, D., Herzberg, W. and Boman, D. (1994) Transcranial magnetic stimulation of extrastriate cortex degrades human motion direction discrimination. Vision Res., 34: 2115–2123.
Hubel, D.H. and Wiesel, T.N. (1977) Ferrier lecture. Functional architecture of macaque monkey visual cortex. Proc. R. Soc. Lond. B Biol. Sci., 198: 1–59.
Juan, C.-H. and Walsh, V. (2003) Feedback to V1: a reverse hierarchy in vision. Exp. Brain Res., 150(2): 259–263.
Kammer, T. (1999) Phosphenes and transient scotomas induced by magnetic stimulation of the occipital lobe: their topographic relationship. Neuropsychologia, 37: 191–198.
Kastner, S., Demmer, I. and Ziemann, U. (1998) Transient visual field defects induced by transcranial magnetic stimulation over human occipital pole. Exp. Brain Res., 118: 19–26.
Kosslyn, S.M., Pascual-Leone, A., Felician, O., Camposano, S., Keenan, J.P., Thompson, W.L., Ganis, G., Sukel, K.E. and Alpert, N.M. (1999) The role of area 17 in visual imagery: convergent evidence from PET and rTMS. Science, 284: 167–170.
Lamme, V.A. and Spekreijse, H. (2000) Modulations of primary visual cortex activity representing attentive and conscious scene perception. Front. Biosci., 5: D232–243.
Lee, T.S., Mumford, D., Romero, R. and Lamme, V.A. (1998) The role of the primary visual cortex in higher level vision. Vision Res., 38: 2429–2454.
Magnussen, S. and Greenlee, M.W. (1999) The psychophysics of perceptual memory. Psychol. Res., 62: 81–92.
Marangolo, P., Di Pace, E., Rafal, R. and Scabini, D. (1998) Effects of parietal lesions in humans on color and location priming. J. Cogn. Neurosci., 10: 704–716.
Martin-Elkins, C.L., George, P. and Horel, J.A. (1989) Retention deficits produced in monkeys with reversible cold lesions in the prestriate cortex. Behav. Brain Res., 32: 219–230.
Marzi, C.A., Miniussi, C., Maravita, A., Bertolasi, L., Zanette, G., Rothwell, J.C. and Sanes, J.N. (1998) Transcranial magnetic stimulation selectively impairs interhemispheric transfer of visuo-motor information in humans. Exp. Brain Res., 118: 435–438.
Motter, B.C. (1993) Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J. Neurophysiol., 70: 909–919.
Nothdurft, H.C., Gallant, J.L. and Van Essen, D.C. (1999) Response modulation by texture surround in primate area V1: correlates of ‘popout’ under anesthesia. Vis. Neurosci., 16: 15–34.
Olshausen, B. and Field, D. (1996) Emergence of simple-cell receptive-field properties by learning a sparse code for natural images. Nature, 381(6583): 607–609.
Pascual-Leone, A., Bartres-Faz, D. and Keenan, J.P. (1999) Transcranial magnetic stimulation: studying the brain–behavior relationship by induction of ‘virtual lesions’. Phil. Trans. R. Soc. Lond. B. Biol. Sci., 254: 1222–1238.
Pascual-Leone, A. and Walsh, V. (2001) Fast backprojections from the motion to the primary visual area necessary for visual awareness. Science, 292: 510–512.
Schmolesky, M.T., Wang, Y., Hanes, D.P., Thompson, K.G., Leutgeb, S., Schall, J.D. and Leventhal, A.G. (1998) Signal
timing across the macaque visual system. J. Neurophysiol., 79(6): 3272–3278.
Somers, D.C., Dale, A.M., Seiffert, A.E. and Tootell, R.B. (1999) Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc. Natl. Acad. Sci. USA, 96: 1663–1668.
Stewart, L., Battelli, L., Walsh, V. and Cowey, A. (1999) Motion perception and perceptual learning studied by magnetic stimulation. Electroencephalogr. Clin. Neurophysiol., 51: 334–350.
Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 553–559.
Tulving, E. and Schacter, D.L. (1990) Priming and human memory systems. Science, 247: 301–306.
von der Heydt, R., Peterhans, E. and Baumgartner, G. (1984) Illusory contours and cortical neuron responses. Science, 224: 1260–1262.
Weiskrantz, L. (1997) Consciousness Lost and Found: A Neuropsychological Exploration. Oxford University Press, Oxford.
Walsh, V. and Cowey, A. (1998) Magnetic stimulation studies of visual cognition. Trends Cognit. Sci., 2: 103–110.
Walsh, V., Ashbridge, E. and Cowey, A. (1998a) Cortical plasticity in perceptual learning demonstrated by transcranial magnetic stimulation. Neuropsychologia, 36: 363–367.
Walsh, V., Ellison, A., Battelli, L. and Cowey, A. (1998b) Task-specific impairments and enhancements induced by magnetic stimulation of human visual area V5. Proc. R. Soc. Lond. B. Biol. Sci., 265(1395): 537–543.
Walsh, V., Le Mare, C., Blaimire, A. and Cowey, A. (2000) Normal discrimination performance accompanied by priming deficits in monkeys with V4 or TEO lesions. Neuroreport, 11: 1459–1462.
Zipser, K., Lamme, V.A. and Schiller, P.H. (1996) Contextual modulation in primary visual cortex. J. Neurosci., 16: 7376–7389.
Zeki, S. (1993) A Vision of the Brain. Blackwell Scientific Publications, Oxford.
Progress in Brain Research, Vol. 144
ISSN 0079-6123
Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 9
Two distinct modes of control for object-directed action

Melvyn A. Goodale1,*, David A. Westwood2 and A. David Milner3

1 CIHR Group on Action and Perception, Department of Psychology, The University of Western Ontario, London, ON N6A 5C2, Canada
2 School of Health and Human Performance, Dalhousie University, Halifax, NS B3H 3J5, Canada
3 Cognitive Neuroscience Research Unit, Wolfson Research Institute, University of Durham Queen’s Campus, Stockton-on-Tees TS17 6BH, UK

*Corresponding author. Tel.: +519-661-2070; Fax: +519-661-3680; E-mail: [email protected]

DOI: 10.1016/S0079-6123(03)14400-9
Abstract: There are multiple routes from vision to action that play a role in the production of visually guided reaching and grasping. What remain to be resolved, however, are the conditions under which these various routes are recruited in the generation of actions and the nature of the information they convey. We argue in this chapter that the production of real-time actions to visible targets depends on pathways that are separate from those mediating memory-driven actions. Furthermore, the transition from real-time to memory-driven control occurs as soon as the intended target is no longer visible. Real-time movements depend on pathways from the early visual areas through to relatively encapsulated visuomotor mechanisms in the dorsal stream. These dedicated visuomotor mechanisms, together with motor centers in the premotor cortex and brainstem, compute the absolute metrics of the target object and its position in the egocentric coordinates of the effector used to perform the action. Such real-time programming is essential for the production of accurate and efficient movements in a world where the location and disposition of a goal object with respect to the observer can change quickly and often unpredictably. In contrast, we argue that memory-driven actions make use of a perceptual representation of the target object generated by the ventral stream. Unlike the real-time visuomotor mechanisms, perception-based movement planning makes use of relational metrics and scene-based coordinates. Such computations make it possible, however, to plan and execute actions upon objects long after they have vanished from view.
Introduction
When we reach out to grasp an object, the shape of our hand begins to reflect the size, shape, and orientation of the target almost as soon as the movement is initiated (Jeannerod, 1986). The programming of grasping must therefore depend on visual information about the goal object that is garnered before the movement begins. Once the movement is initiated, feedback information is used to adjust the final position and posture of the hand as it closes in on the object (i.e., movement control). There is a good deal of evidence to suggest that the visual processing used in the programming and the control of grasping is quite distinct from the visual processing that supports our perception and recognition of objects (Goodale and Milner, 1992). Much of the evidence for this distinction between what has been called ‘vision-for-action’ and ‘vision-for-perception’ comes from work with neurological patients. Perhaps the most compelling example is the patient D.F., who developed a profound visual form agnosia as a consequence of an anoxic episode. Even though D.F. shows no perceptual awareness of the form and dimensions of objects, she is able to scale her hand to the size, shape, and orientation of a goal object as she reaches out to pick it up
(Goodale et al., 1991; Milner et al., 1991). On the basis of these and related observations, Goodale and Milner (1992) argued that the visual control of grasping and other visually guided movements is mediated by dedicated visuomotor systems in the dorsal stream — a set of visual projections that arise in primary visual cortex and project to areas in the posterior parietal cortex. The perception of objects, they argued, depends on a quite separate set of ventral-stream projections that also arise in primary visual cortex but project instead to the temporal lobe. As it turns out, D.F.’s brain damage is concentrated at the junction of the occipital and temporal lobes, in a region of the human ventral stream that has been shown to be involved in the visual recognition of objects (James et al., in press). It is presumably this selective damage to her ventral stream that has disrupted her ability to perceive the form of objects without interfering with her ability to reach out and grasp objects. In fact, recent neuroimaging studies have revealed relatively normal activation in D.F.’s dorsal stream when she grasps objects that vary in size and orientation (Culham, 2004). Consistent with these observations, neurological patients with damage to the dorsal stream often demonstrate impaired grasping movements, despite having relatively intact perception of object features (Perenin and Vighetto, 1988; Milner et al., 2001). As soon as their fingers make contact with the target, of course, they can use haptic information to adjust their hand to the correct posture. Many of these patients are also unable to reach in the correct direction to objects placed in different positions in the visual field contralateral to their lesion. But despite exhibiting a clear deficit in the visual control of reaching and grasping (known clinically as ‘optic ataxia’), these same patients have little difficulty describing the orientation, size, shape, and even the relative spatial location of the very objects they are unable to grasp correctly (for review, see Milner and Goodale, 1995). Evidence for the distinction between ‘vision-for-action’ and ‘vision-for-perception’ also comes from experiments with normal observers. Many studies have reported that the scaling of grip aperture in manual prehension — which is strongly influenced by the size of the target object — is unaffected by
size-contrast illusions that, by definition, influence perceptual judgments about the size of the target object (see Carey, 2001 for review). In a recent study, for example, Hu and Goodale (2000) asked participants to estimate the width of a rectangular block that was presented alongside a smaller or larger companion block. As expected, these perceptual estimates were affected by the relative size of the companion block. When accompanied by a larger block, the target block was perceived to be smaller than when it was accompanied by a smaller block. The presence of the companion blocks had no effect, however, on the scaling of the grip aperture when the participants were asked to reach out to pick up the target block. These and other related findings are quite consistent with the idea that separate visual mechanisms mediate object perception and object-directed action. According to the two-visual-systems account, size-contrast illusions affect perception and not action because the ventral and dorsal streams compute the location, size, and disposition of objects in quite different ways. The perceptual mechanisms in the ventral stream use scene-based frames of reference and relational metrics, computations that provide a rich and detailed representation of the world upon which cognitive processes can operate. Thus, the ventral stream — and therefore perception — falls victim to size-contrast illusions because such illusions engage the obligatory relative-size computations that characterize the operation of this system. In contrast, the visuomotor mechanisms in the dorsal stream use egocentric frames of reference and compute the absolute metrics of the target object, computations that are required for accurate and efficient actions. Thus, actions like grasping are refractory to size-contrast illusions because they are scaled to the real not the relative size of the target object (Goodale and Haffenden, 1998). It should be noted that a number of studies have reported that grip scaling is affected by size-contrast illusions, in some cases to the same degree as perceptual size judgments (Pavani et al., 1999; Vishton et al., 1999; Franz et al., 2000; Glover and Dixon, 2001). Clearly then, there must be some conditions under which illusions can affect grip scaling. But one has to be careful in interpreting these experiments, since the investigators have not always
used tasks that reliably invoke the dedicated visuomotor mechanisms in the dorsal stream. Some, for example, have used two-dimensional stimuli, which are often not treated as real goal objects, causing subjects to generate movements that they would not normally use in picking up objects (Vishton et al., 1999). Others have used illusions that operate extremely early in the visual system and thus affect both dorsal and ventral streams of visual processing (Glover and Dixon, 2001). [For a discussion of these issues, see Haffenden et al. (2001) and Dyde and Milner (2002).] But in any case the failure to demonstrate a difference in the effects of illusions on visually guided action and visual perception does not seriously challenge the two-visual-systems theory. These negative findings cannot explain, for example, the behavioral dissociation between vision-for-perception and vision-for-action that has been observed in neurological patients. Nor for that matter can they explain why so many other studies have found evidence for this dissociation in normal observers. The findings do suggest, however, that the control of action is not always encapsulated within the visuomotor modules of the dorsal stream and that, under some circumstances, the control of action will make use of perceptual information that is presumably processed in the ventral stream. In this chapter, we explore some of the conditions that determine whether an action will be controlled entirely by the dorsal stream or will instead be influenced by information provided by perceptual mechanisms in the ventral stream. We will suggest that dedicated, real-time visuomotor mechanisms are engaged for the feedforward control of action only at the moment an individual decides to make a goal-directed action, and only if the target is visible. We shall make an important distinction between two feedforward modes of control for action, which we refer to as planning and programming. In this chapter we will argue that action planning begins as soon as the observer perceives the goal object, and that this action planning depends — at least in part — on the perceptual mechanisms in the ventral stream (see Glover and Dixon, 2001 for a similar argument). We review evidence suggesting that the period of time immediately prior to movement onset is critical for action programming. During this period
of time direct retinal information from the goal object is transformed immediately into a metrically accurate movement program. In other words, action programming — quite unlike action planning — occurs in real time. Psychophysical and neuropsychological evidence suggests that real-time action programming depends on the visuomotor mechanisms in the dorsal pathway, rather than the perceptual mechanisms in the ventral pathway that are so critical for action planning. In light of these arguments, one must interpret with caution the results of experiments that use delayed rather than real-time action tasks to explore the cognitive and neural mechanisms of sensorimotor transformation.
The effects of delay on visually guided actions in neurological patients

Studies with neurological patients have demonstrated clear limits on the ability of the isolated dorsal system to guide manual prehension (Fig. 1). For example, D.F. demonstrates extremely poor size scaling of her grip when she reaches to grasp a target object after it has been removed from view, even though she shows good scaling when the object is visible up until the time of movement initiation (Goodale et al., 1994). In fact, when a 2-s delay was introduced between viewing the goal object and initiating the grasp, there was no correlation at all between the size of the object and the aperture of her grasp in flight.
Fig. 1. Grip scaling in D.F. in real time and in pantomime (after a 2-s delay). In real time, D.F.’s grip aperture is highly correlated with the size of the target object. Her pantomimed grasps, however, show no scaling at all. Reproduced with permission from Goodale et al. (1994).
D.F.’s deficit contrasts sharply with the performance of normal participants in the same task, who demonstrate only subtle differences in their visually guided and delayed grasping movements. These findings suggest that D.F. — unlike normal subjects — could not use a visual memory of the objects that were presented earlier to program her delayed grasping movements, presumably because she did not perceive the dimensions of those objects in the first place. In other words, the visuomotor mechanisms responsible for the control of actions to visible targets do not appear to retain in memory information about the target object or the grasping movement it affords. Visual memory for object features appears instead to depend on the perceptual mechanisms that reside in the ventral stream. Interestingly, however, D.F. was able to pantomime the grasping of familiar objects such as peas or grapefruits, presumably because the size information in this case could be retrieved from long-term, semantic memory. Taken together, these results suggest that (1) the dorsal system operates in ‘real-time’, accessing transient visual signals about the target object at the time of movement programming, and (2) the ventral stream is probably necessary for creating the object representations that are maintained in memory for the control of later actions. These conclusions are supported by evidence from neurological patients with damage to the dorsal stream, who show the opposite pattern of deficits and spared visual abilities from those seen in D.F. The optic ataxia patient A.T., for example, is quite unable to scale her grasp when reaching to unfamiliar objects — even though the object remains visible both before and during the action. Presumably this deficit arises because the action must be generated exclusively on the basis of on-line visual processing of the object’s features — processing that is thought to take place in the dorsal stream, which is of course damaged in this patient. A.T. shows much better scaling of her grasp when reaching to familiar objects, however, because in this case the action can be programmed by accessing stored semantic information about the object from long-term memory; presumably the dorsal stream is not necessary for programming actions from long-term memory (Jeannerod et al., 1994). But what about actions directed to unfamiliar objects that are no
longer visible; objects whose features cannot be accessed from long-term memory, but must instead be stored in short-term memory? If, as suggested earlier, the control of such memory-dependent grasping movements depends not on the visuomotor mechanisms of the dorsal pathway, but rather on the perceptual mechanisms of the ventral pathway, then a rather counterintuitive prediction can be made. Specifically, a patient with optic ataxia who is required to grasp an object after it has been removed from view might show a paradoxical improvement in performance. In this situation the patient would be forced to rely on a stored representation of the target object, laid down by the perceptual mechanisms of the ventral pathway, rather than an on-line computation of the object’s features, which would require the visuomotor mechanisms in the (damaged) dorsal pathway (Fig. 2). This prediction was exactly borne out in a recent experiment with the optic ataxia patient I.G. (Milner et al., 2001). Like other patients with optic ataxia, I.G. is quite unable to open her hand and fingers appropriately when reaching out to pick up objects of different sizes. Yet despite this deficit in real-time grasping, I.G. showed good grip scaling when ‘pantomiming’ a grasp for an object that she had seen earlier but that was no longer present. In fact, after practice, I.G. was able to scale her grip when grasping a real target object that she had previewed 5 s earlier. [Note that the object was visible during the movement, as in the original example; therefore, the surprising finding is that despite the presence of ‘real-time’ vision, her performance improved by virtue of the preview.] By interposing catch trials in which a different object was covertly substituted for the original object during the delay between the preview and the grasp, the experimenters were able to show that I.G. was using memorized visual information to calibrate her grasping movements. In other words, on these catch trials, her grip scaling reflected the size of the object she had previewed earlier, rather than the size of the object that was now in front of her, in sharp contrast to what happened with normal participants. Again, these findings support the idea that the control of grasping movements made after a delay depends on information derived from earlier perceptual processing of the object by mechanisms in the ventral stream.
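The grip-scaling measure that underlies these dissociations is usually quantified by regressing peak grip aperture on object size; a flat slope (or a near-zero correlation) is what 'no scaling' means operationally. The sketch below illustrates that computation with invented numbers chosen only to mimic the qualitative pattern described in the text; it is not patient data.

import numpy as np
from scipy.stats import linregress

object_size = np.array([20, 30, 40, 50, 20, 30, 40, 50, 20, 30, 40, 50])  # object lengths in mm

rng = np.random.default_rng(2)
# Real-time grasps: aperture tracks object size (plus an overshoot and some noise).
pga_realtime = object_size * 0.8 + 25 + rng.normal(0, 3, object_size.size)
# Delayed (pantomimed) grasps by an observer who cannot store the object's size:
# aperture is roughly constant and unrelated to the object actually shown.
pga_delayed = 55 + rng.normal(0, 6, object_size.size)

for label, pga in [("real time", pga_realtime), ("2-s delay", pga_delayed)]:
    fit = linregress(object_size, pga)
    print(f"{label}: slope = {fit.slope:.2f} mm/mm, r = {fit.rvalue:.2f}")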
Fig. 2. Grip scaling in I.G. under different conditions. In real time, I.G. shows no grip scaling, whereas when she is required to delay her grasp or to pantomime, her grip aperture is now scaled to the size of the target object. Reproduced with permission from Milner et al. (2001).
But these findings with neurological patients raise an important question. Why should the processing carried out by the visuomotor mechanisms in the dorsal stream be so sensitive to the interposition of a delay between viewing a goal object and initiating a goal-directed action? The answer to this question requires an examination of the differences in the computational requirements and constraints of vision-for-action and vision-for-perception.
Differences in the timing of action and perception

As was mentioned earlier, vision-for-action and vision-for-perception use different metrics and frames of reference. To be able to grasp an object successfully, for example, it is essential that the brain compute the actual (absolute) size of the object and its orientation and position with respect to the observer. Moreover, the information about the orientation and position of the object must be computed in egocentric frames of reference — in other words, in frames of reference that take into account the orientation and position of the object with respect to the effector that is to be used to perform the action. This explains why an action like object-directed grasping is so refractory to size-contrast illusions. But the time at which these computations are performed is also critical. Observers and goal objects
rarely stay in a static relationship with one another and, as a consequence, the egocentric coordinates of a target object can often change dramatically from moment to moment. It makes sense, therefore, that the required coordinates for action be computed immediately before the movements are initiated and it would be of little value for these ephemeral coordinates (or the resulting motor programs) to be stored in memory. In short, visuomotor systems work best in an on-line mode. The situation is quite different for perception, both in terms of the frames of reference used to construct the percept and the time period over which that percept (or some version of it) can be accessed. Vision-for-perception appears not to rely on computations about the absolute size of objects or their egocentric locations — a fact that explains why we have no difficulty watching television, a medium in which there are no absolute metrics at all. Instead, the perceptual system computes the size, location, shape, and orientation of an object primarily in relation to other objects and surfaces in the scene. In other words, the metrics of perception are inherently relative and the frames of reference are largely scene based, which explains why we are so sensitive to size-contrast illusions and other visual illusions that depend on comparisons between different objects in the visual array. Encoding an object in a scene-based frame of reference (sometimes called an allocentric
frame of reference) permits a perceptual representation of the object that preserves the relations between the object parts and its surroundings without requiring precise information about absolute size of the object or its exact position with respect to the observer. Our perception of the world is also limited by the fact that most of what we perceive in any detail is located within a few degrees of the fovea. Indeed, if our perceptual machinery attempted to deliver the real size and distance of all the objects in the visual array, the computational load on the system would be astronomical. [For a more detailed discussion of these issues, see Goodale and Humphrey (1998).] But perception also operates over a much longer timescale than that used in the visual control of action. We can recognize objects we have seen minutes, hours, days — or even years before. When objects or scenes are coded in allocentric frames of reference, it is much easier to preserve their identity over relatively long periods of time. In other words, we can recognize objects that we have seen before, even though the position of those objects with respect to our body might have changed considerably since the last time we saw them. But of course, this stored information is not only available for mediating recognition of previously encountered objects, but also potentially for contributing to the control of our movements, particularly when we are working in off-line mode. These differences in the timescale over which vision-for-action and vision-for-perception operate help to explain why D.F. and the optic ataxia patients react quite differently to the insertion of a delay between viewing the goal object and initiating the grasp. Because of her damaged ventral pathway, D.F. cannot use perceptual memories of object shape to drive her grasping movements following the delay — but she can use her intact dorsal stream to generate well-formed grasping movements to objects that remain visible. Conversely, the optic ataxia patients show disruptions in the generation of real-time grasping movements, but are able to use perceptual memories to pantomime such movements or even to generate grasping movements on the basis of these memories. In other words, the performance of the optic ataxia patients is paradoxically better when they are working off-line than when they are in on-line mode. In the next section, we show that
the differences between on-line and off-line control can also be seen in the visuomotor performance of neurologically intact individuals, particularly in the context of pictorial illusions.
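Before turning to normal observers, the contrast drawn above between egocentric and scene-based coding can be made concrete with a toy calculation. All coordinates below are hypothetical, and the single rotation matrix stands in for whatever eye-, head- and limb-referenced transformations the visuomotor system would actually apply.

import numpy as np

target_world = np.array([0.40, 0.10, 0.05])    # target position in the room (m), invented
flanker_world = np.array([0.45, 0.10, 0.05])   # a neighbouring object
hand_world = np.array([0.10, -0.20, 0.00])     # current hand position
hand_rotation = np.eye(3)                      # hand orientation (identity, for simplicity)

# Egocentric (effector-centered) coding: absolute metrics relative to the hand.
# This is what a reach or grasp controller needs, and it changes whenever the hand
# or the observer moves, so there is little point in storing it.
target_in_hand_frame = hand_rotation.T @ (target_world - hand_world)
print("target relative to hand (m):", np.round(target_in_hand_frame, 3))

# Scene-based (allocentric) coding: only relations between objects are kept, for
# example the offset of the target from its neighbour. This survives observer
# movement and is easy to store, but it carries no absolute, effector-relative
# metrics; relative comparisons of this kind are what size-contrast illusions exploit.
target_relative_to_flanker = target_world - flanker_world
print("target relative to flanker (m):", np.round(target_relative_to_flanker, 3))

# If the hand moves, the stored egocentric code is stale and must be recomputed,
# while the scene-based relation is unchanged.
hand_world_later = hand_world + np.array([0.05, 0.05, 0.0])
stale_error = (hand_rotation.T @ (target_world - hand_world_later)) - target_in_hand_frame
print("error of a stale egocentric code after the hand moves (m):", np.round(stale_error, 3))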
The effects of delay on actions in normal observers

Goodale et al. (1994) showed that the imposition of a delay of 2 s between object viewing and movement execution produced a dramatic alteration in the kinematics of the grasp in normal observers. Not only were movements to the remembered object slower than movements made to a visible object, but the trajectory of the hand was more curvilinear, rising higher above the surface on which the object had been presented. Moreover, the hand did not show the typical overshoot in grip aperture that characterizes grasping movements initiated to visible objects. Instead the opening of the hand simply matched the size of the remembered object. It was as though the participants were ‘showing’ the experimenter how they would pick up the object they had seen 2 s earlier, rather than replicating a real grasping movement. Their response had to be a pantomime of the real thing because the object was no longer present. Nevertheless, the normal observers, unlike the patient D.F., were able to recover the size of the object they had seen a few moments before and use that information to scale their hand aperture, albeit in a way that did not precisely resemble what unfolds in a grasping movement directed at a visible target. In other experiments, it has been shown that when normal observers delay their grasp for a few seconds and then reach out to grasp an object in the dark, their grasping movements also no longer resemble real-time grasping movements (Hu et al., 1999). Movement time is significantly longer and the hand aperture is significantly wider for delayed than for real-time movements — even though in both cases the object and the moving hand are occluded from view during the execution of the movement. Again, however, grip aperture in the moving hand is correlated with the size of the target for both the delayed and the real-time grasping movements, indicating that some information about the size of the target object can be recovered from memory and used to control grasping movements even after a delay.
The effects of delay on grasp kinematics are particularly dramatic in experiments in which normal observers are required to pick up objects that are presented within size-contrast illusions. As was discussed earlier, grip scaling in real time is quite refractory to size-contrast illusions. The same illusions can have an effect on grip scaling, however, if the participants are required to reach out to grasp the target object after it has been removed from view. Thus, in the size-contrast experiment by Hu and Goodale (2000) discussed earlier, the introduction of a 5-s delay between viewing the target block and initiating the grasping movement made the participants sensitive to the size of the companion block. With the delay, they opened their hand wider for the target block when it was accompanied by a smaller block than they did when the target block was accompanied by a larger block. In this situation, the participants are presumably scaling their grasp on the basis of their perceptual memory of the target’s size, which was originally encoded in scene-based relative metrics. Similar increases in the sensitivity of target-directed actions to illusions following delay have been demonstrated in a variety of pictorial illusions in which relative metrics and scene-based frames of reference drive the illusion (Gentilucci et al., 1996; Bridgeman et al., 2000; Westwood et al., 2000a,b; Bradshaw and Watt, 2002).
Does the dorsal stream have a memory?

As we have just seen, in memory-guided grasping, the motor system must generate a movement program using a stored representation of the target object — a representation that was originally created by perceptual mechanisms in the ventral stream. In real-time grasping, in which fast and metrically accurate movements are generated, the underlying visuomotor computations require direct visual input. It has been argued that the on-line computations that inform the required movement programming have only a brief memory and that these computations (or the program itself) decay quickly when the target is removed from view and the execution of the movement is delayed (Milner and Goodale, 1995). Indeed, it has been argued that these dorsal-stream
computations might last less than 2 s (Goodale et al., 1994). It is possible, however, that the dedicated visuomotor mechanisms in the dorsal-stream pathway are not engaged at all in memory-guided action tasks. In other words, the visuomotor mechanisms might not be engaged until the action is actually required (i.e., at the cue to respond) — and only if the target is still visible at that time. This latter ‘real-time’ view of the visuomotor system is based on the notion that the egocentric position of the target can change quickly — and often unpredictably — in the period of time between target identification and the intention to move. In other words, it might be more efficient to generate a motor program at the time when the action is actually required than to generate a motor program when the target is first identified and then continuously update it in response to egocentric position changes. In a recent experiment (Westwood and Goodale, 2003), we sought to distinguish between these two possibilities by comparing memory-guided and visually-guided grasping using a new experimental paradigm.
Comparing visually-guided and memory-guided grasping

We used a size-contrast illusion to assess the contribution of perceptual mechanisms to the control of visually-guided and memory-guided grasping movements. Previous experiments that have examined this question have employed separate blocks of trials for visually-guided and memory-guided responses (Goodale et al., 1994; Hu and Goodale, 2000; Westwood et al., 2000a,b; Milner et al., 2001). As a consequence, the participants in these experiments could have attended to quite different aspects of the target display in the two conditions. For example, in the memory condition, they could have explicitly directed their attention to the spatial layout of the scene so that they could store task-relevant information in memory for later use — a strategy that would have selectively activated visual mechanisms in the ventral pathway. These ventral-stream mechanisms presumably would be particularly sensitive to relative size cues. In the visually guided condition, however, the dedicated systems in the dorsal stream would be automatically engaged as
soon as the target came into view. In other words, when memory-guided and visually-guided grasps were run in blocks, quite different systems would be engaged as soon as the targets were presented. To make sure this did not happen in our study, we randomly interleaved visually-guided and memory-guided trials, which allowed us to equate the viewing strategies that the participants might have brought to the two types of trials. In other respects, however, our study was similar to the one already described by Hu and Goodale (2000). Participants were instructed to reach out and grasp a rectangular target object (50, 55, or 60 mm long × 20 mm wide × 5 mm thick) that was presented beside either a smaller (40, 45, or 50 mm long), larger (60, 65, or 70 mm long), or same-sized flanking object (Fig. 3A). The objects were presented on a surface that was oriented 20° back from the picture plane, with one object to the left of center (20 mm) and the other to the right (20 mm). The target object was marked with a small dot. Vision was controlled with liquid-crystal shutter goggles and the opening between the index finger and thumb during the grasping movement was tracked with OPTOTRAK (Northern Digital Inc.). Participants previewed the target array for 500 ms and then initiated each grasping movement in response to a 50-ms auditory tone (see Fig. 3B), picking the target object up along its long axis with
their index finger and thumb. During the experiment, participants were reminded to wait for the auditory cue before responding, in order to prevent anticipatory responses. The auditory cue was given immediately after the preview period for one group of participants (No-delay group; N = 12), whereas the cue was given 2500 ms after the preview period for the second group (Delay group; N = 12). For the No-delay group, two visual conditions were randomly interleaved in an equal number of trials: in the vision trials, vision remained available after the tone sounded until the participant’s hand left the start key, whereas in the occlusion trials, vision was occluded coincident with the presentation of the tone (Fig. 3B). For the Delay group, two visual conditions were also randomly interleaved in an equal number of trials: vision was occluded for 2500 ms following the preview period in both types of trials, but in the vision trials, vision was restored from the time the tone sounded until movement onset, whereas in the occlusion trials, vision was not restored. [Six grasps were made to each of the nine possible target arrays (3 target sizes × 3 relative flanker sizes) in both types of visual trials (vision and occlusion), for a total of 108 trials per participant.] Thus, the only difference between the randomly interleaved trials was whether the target array was visible (vision trials) or visually occluded (occlusion trials) for the period of time between the cue to respond and the onset of the hand movement.
Fig. 3. (A) Apparatus for presenting target and flanker objects. (B, C) Event sequences for No-delay (B) and Delay groups (C). Vision was available between movement cueing and movement onset for Vision trials, whereas vision was occluded between cueing and onset for Occlusion trials. Reproduced with permission from Westwood and Goodale (2003).
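The design just described can be summarized in a short sketch of a trial-generation routine. The timings and cell structure follow the text (3 target lengths × 3 relative flanker sizes × vision/occlusion, six repetitions, 500-ms preview, immediate or 2500-ms delayed cue); the randomization details and names are illustrative, not taken from the original experiment scripts.

import itertools
import random

TARGET_LENGTHS_MM = [50, 55, 60]
FLANKER_RELATION = ["smaller", "same", "larger"]
VISUAL_CONDITION = ["vision", "occlusion"]
REPS = 6

def build_trials(group="no-delay", seed=0):
    """Return a randomly interleaved trial list for one (hypothetical) participant."""
    delay_ms = 0 if group == "no-delay" else 2500
    cells = list(itertools.product(TARGET_LENGTHS_MM, FLANKER_RELATION, VISUAL_CONDITION))
    trials = [
        {"target_mm": t, "flanker": f, "visual": v,
         "preview_ms": 500, "delay_ms": delay_ms, "cue_ms": 50}
        for (t, f, v) in cells for _ in range(REPS)
    ]
    random.Random(seed).shuffle(trials)   # interleave vision and occlusion trials
    return trials

trials = build_trials("delay")
print(len(trials), "trials;", trials[0])   # 108 trials per participant, as reported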
The key comparison involved the vision and occlusion trials in the No-delay group. If the dorsal visuomotor mechanisms are engaged when the target array is initially viewed — even if the response has not yet been cued — then the perceptual illusion should have no effect on grasping movements in the vision and occlusion trials for this group. This is because any visuomotor computations carried out during the initial viewing period would have no time to decay when the cue to respond was presented. If, however, the dorsal mechanisms are engaged only after the cue to respond, then the illusion should affect grasping in the occlusion trials but not the vision trials. This is because in the occlusion trials the target array is not visible at the cue to respond, which should
prevent the engagement of the dorsal visuomotor mechanisms. Instead, the control of action should access a stored representation of the target object — a representation laid down by perceptual mechanisms that are sensitive to the size-contrast illusion. Finally, if it is indeed the case that dorsal-stream mechanisms are engaged only when the cue to respond is given, then the reintroduction of vision in the Delay group should similarly make the grasping movement refractory to the illusion. The results were clear. When vision was occluded at the moment the cue to respond was given, participants fell victim to the size-contrast effect — despite the fact that in the No-delay condition vision was available right up to the moment the tone sounded. As Fig. 4 illustrates, for both the No-delay and the Delay groups, the peak grip aperture (which was achieved about 70% of the way through the entire movement) was affected by the size-contrast illusion in the occlusion trials — but not in the vision trials.
Fig. 4. Effect of relative flanker size (small, same, large) on the scaling of peak grip aperture on Vision and Occlusion trials for No-delay and Delay groups. Flanker size did not affect peak grip aperture in Vision trials for either group, but in Occlusion trials, significantly larger peak grip aperture was used to grasp objects presented beside relatively smaller flankers. Inset: Peak grip aperture difference scores for small flanker minus large flanker. Error bars indicate standard error of the mean. Reproduced with permission from Westwood and Goodale (2003).
Fig. 5. Effect of target size on the scaling of peak aperture for the No-delay and Delay groups in Vision and Occlusion trials. Reliable size scaling was observed in all conditions, but the slope of the size-scaling function was reduced for Occlusion trials. Error bars indicate standard error of the mean. Reproduced with permission from Westwood and Goodale (2003).
In other words, in the occlusion trials, the peak grip aperture was greater for targets presented beside smaller rather than larger flanking objects (see the insets in Fig. 4). In fact, the direction and magnitude of the ‘flanker effect’ in these occlusion trials is the same as that observed for perceptual judgments of the size of the target object, in which participants in a separate experiment were simply asked to estimate its size by opening their finger and thumb a matching amount while holding their hand still. In vision trials, the peak aperture of the grasp was scaled to the real size of the target object and was unaffected by the size of the flanker objects. It should be noted, however, that in both the vision and occlusion trials, peak grip aperture was correlated with the actual size of the target (Fig. 5).
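The 'flanker effect' plotted in the insets of Fig. 4 is, in essence, a difference score: mean peak grip aperture beside a smaller flanker minus mean peak grip aperture beside a larger flanker, computed separately for vision and occlusion trials. A minimal sketch with invented apertures shows the computation; the values are chosen only to mimic the reported pattern.

import pandas as pd

# Hypothetical peak grip apertures (mm); not the data from this experiment.
data = pd.DataFrame({
    "visual":  ["vision"] * 4 + ["occlusion"] * 4,
    "flanker": ["smaller", "smaller", "larger", "larger"] * 2,
    "pga_mm":  [92, 93, 92, 93,      # vision: aperture tied to the real target size
                95, 96, 90, 91],     # occlusion: wider aperture beside smaller flankers
})

means = data.groupby(["visual", "flanker"])["pga_mm"].mean()
for cond in ["vision", "occlusion"]:
    effect = means[(cond, "smaller")] - means[(cond, "larger")]
    print(f"{cond}: flanker effect = {effect:.1f} mm")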
Off-line versus real-time control of actions

These findings are consistent with those of earlier studies suggesting that perceptual mechanisms in the ventral pathway are invoked for the control of actions to remembered targets — targets that are removed from view before the action is initiated. But, as was discussed earlier, it is not clear from previous studies that perceptual mechanisms are in fact necessary for the control of memory-guided actions. Because memory trials were blocked in earlier studies, it could have been the case that the perceptual mechanisms were strategically engaged in response to the expected memory requirements of the task, rather than as a consequence of the actual memory requirements. Participants, for example, might choose to pay more attention to the entire visual display in the memory-guided task, in an effort to store in memory as much information as possible.
In the present experiment, participants could not strategically deploy their attention to different features of the target display in the vision and occlusion trials, because these trials were randomly interspersed and indistinguishable from one another when the target display was first viewed. The only difference between trials was whether or not vision was occluded for the period of time between the cue to respond and the onset of movement. For both the Delay and the No-delay groups, the peak aperture of the grasp was sensitive to the perceptual size illusion on the occlusion trials, but not the vision trials. These findings indicate that differences in the deployment of attention or other strategies cannot explain why perceptual information was used to control the memory-guided actions. Rather, we suggest that the perceptual mechanisms are in fact necessary for the control of memory-guided actions — because the dedicated visuomotor modules in the dorsal stream that control grasping can work only in real time and thus effectively have no memory. We propose that the various visuomotor systems in the posterior parietal cortex are engaged for the control of action when the response is required (i.e., at the cue to respond), and not before. But importantly, these mechanisms can be engaged only if the target is visible at the time the movement is to be made. According to this view, the visuomotor mechanisms in the dorsal stream compute the absolute metrics of the target object and deliver this information to the motor system for immediate use. These computations presumably make use of transient retinal and extraretinal signals — signals that, with certain constraints, can be used to recover the absolute metrics of the target object. This information reaches the posterior parietal cortex directly (via primary visual cortex as well as the collicular–pulvinar route) without first being processed by ventral-stream structures. A key aspect of our ‘real-time’ hypothesis is that the dorsal-stream visuomotor mechanisms are not engaged until the movement is cued (or until the individual decides to initiate the response). This proposal is similar in spirit to a ‘conversion-on-demand’ model put forward by Henriques et al. (1998, 2002), in which they argue that the selected target is transformed into the required head, body, and limb coordinates only at the moment a movement is executed.
In our experiment, for example, the visuomotor mechanisms do not convert the retinal information into the required motor coordinates for grasping when the target display is first viewed and then store them in memory for the control of later actions — not even for a very brief period of time. If this were the case, then the effect of the size-contrast illusion would have been the same for the vision and occlusion trials in the No-delay group. After all, there was no appreciable delay between the initial viewing period and the onset of the response in the occlusion trials; any visuomotor computations that took place during the preview period would have had little if any time to decay. The different effect of the illusion in the vision and occlusion trials must therefore be due to visuomotor processing that occurred between the cue to respond and the onset of the movement (see also Milner et al., 2001 for a similar discussion). If the target is not visible when a movement is cued (or when the individual decides to initiate a movement), then the control of action must access a stored representation of the target object. As argued earlier, this stored representation is not encoded by visuomotor mechanisms in the dorsal pathway, but rather by the perceptual mechanisms of the ventral stream — the very mechanisms whose processing created the size-contrast illusion in the first place. According to this view, a perceptual representation of the target object is automatically generated when the target display is first viewed, and then is stored in memory for a host of possible cognitive operations, including the off-line control of action. Indeed, the illusion had a similar effect in the occlusion trials for both the No-delay and the Delay groups, suggesting that the stored perceptual representation of the target (which includes the size-contrast effect) does not change appreciably within the first 2.5 s of target occlusion. Nevertheless, grip aperture on the occlusion trials was larger overall for the Delay group than for the No-delay group, suggesting that the length of the delay between target offset and response execution is important for action control. It is possible that, after a few seconds of visual occlusion, participants become less certain about the size of the remembered object and therefore increase the safety margin used to control the grip aperture. In other words, the
delay-dependent increases in grip aperture on the occlusion trials may have little to do with the nature of the stored perceptual representation, and more to do with the individual’s confidence in that memory. According to the preceding arguments, perception-based movement planning can begin as soon as the target has been seen — even if the response has not yet been cued. Just how (and where in the brain) the perceptual information influences the motor plan is not yet well understood. One clue comes from single-unit work in the monkey parietal cortex. Neurons in the lateral intraparietal sulcus (LIP), a dorsal-stream area that has been implicated in the visual control of voluntary saccades, show activity during the delay phase of memory-guided eye movements tasks in which the monkey is required to ‘hold in mind’ the position of a target (Murata et al., 1996; Snyder et al., 1997). Nearby areas in the posterior parietal cortex that are thought to be involved in the visual control of arm movements also show similar memory-related activity during the delay phase (Mazzoni et al., 1996; Snyder et al., 1997). The activity of these neurons could reflect action planning that is based on a perceptual representation of the target stimulus — planning that is fundamentally different from that which occurs when the target is present on the retina and the response is cued by the experimenter or initiated by the individual. In other words, off-line activity in the dorsal stream may be dependent on information derived from the ventral stream. If this is the case, the use of the delay task to study the activity of neurons in the posterior parietal cortex may reveal a good deal about how actions are planned in off-line mode but little about their role in the programming of actions on-line. The possibility that delay-related activity in the dorsal stream reflects inputs from the ventral stream has yet to be established empirically. There are some tantalizing bits of evidence, however, that suggest that the ventral stream does play such a role. Toth and Assad (2002), for example, have shown that neurons in LIP can become sensitive to an arbitrary cue, such as color, when that attribute is linked to the direction of the required eye movements. Similar changes in the coding of neurons in LIP have also been observed when the monkey has been trained to make eye movements in arbitrary association with
certain visual patterns (Sereno and Maunsell, 1998). In other words, following training, LIP neurons now show coding that is specific to the colors or visual patterns that were used in training. It seems likely, but remains unproven, that the color and pattern information must first be processed in the ventral stream before being relayed to dorsal-stream areas such as LIP. Although it is not known how ventral- and dorsal-stream activity is coordinated in such tasks, some authors have proposed that areas in the prefrontal cortex play the central role in this process, allowing arbitrary information about color, shape, and other visual features that are processed in the ventral stream to influence the activity of neurons in dorsal-stream structures that are involved in visuomotor control (Passingham and Toni, 2001; Toni et al., 2001). Indeed, the notion that networks in the prefrontal cortex (and the striatum) permit humans and other primates to learn associations between particular stimuli and particular actions is an idea that has been put forward (in various forms) by many authors (for reviews, see Asaad et al., 1998, 2000). The memory-driven grasping movements that we have been discussing in this chapter would appear to be prototypical examples of the use of visual information in association with the production of particular (somewhat arbitrary) actions. Although there is evidence that the dorsal stream participates in the planning of actions well in advance of their execution (with the help of information derived from the ventral stream), this is not to say that dorsal-stream mechanisms are necessary for memory-guided action planning. After all, as we saw earlier, Milner et al. (2001) have shown that an optic ataxia patient (with extensive damage to parietal cortex) can grasp remembered — but not visible — objects quite accurately. It would be interesting to compare how the memory-driven actions of such patients differ from those of neurologically intact individuals who presumably would make use of dorsal-stream mechanisms in the planning of such movements. A comparison of this kind might reveal a dorsal-stream contribution to off-line control that has yet to be characterized. Whether or not such off-line control shares some of the same dorsal-stream circuitry as that involved in on-line control remains an open question.
Acknowledgments
Supported by the Canadian Institutes of Health Research, the Canada Research Chairs Program, and the Wellcome Trust.
References Asaad, W.F., Rainer, G. and Miller, E.K. (1998) Neural activity in the primate prefrontal cortex during associative learning. Neuron, 21: 1399–1407. Asaad, W.F., Rainer, G. and Miller, E.K. (2000) Task-specific neural activity in the primate prefrontal cortex. J. Neurophysiol., 84: 451–459. Bradshaw, M.F. and Watt, S.J. (2002) A dissociation of perception and action in normal human observers: the effect of temporal-delay. Neuropsychologia, 40: 1766–1778. Bridgeman, B., Gemmer, A., Forsman, T. and Huemer, V. (2000) Processing spatial information in the sensorimotor branch of the visual system. Vis. Res., 40: 3539–3552. Carey, D.P. (2001) Do action systems resist visual illusions? Trends Cogn. Sci., 5: 109–113. Culham, J.C. (2004) Human brain imaging reveals a parietal area specialized for grasping. In: N. Kanwisher and J. Duncan (Eds.), Attention and Performance XX. Functional Brain Imaging of Visual Cognition. Oxford University Press, Oxford, 415–436. Dyde, R.T. and Milner, A.D. (2002) Two illusions of perceived orientation: one fools all of the people some of the time; the other fools all of the people all of the time. Exp. Brain Res., 144: 518–527. Franz, V.H., Gegenfurtner, K.R., Bulthoff, H.H. and Fahle, M. (2000) Grasping visual illusions: no evidence for a dissociation between perception and action. Psychol. Sci., 11: 20–25. Gentilucci, M., Chieffi, S., Deprati, E., Saetti, M.C. and Toni, I. (1996) Visual illusion and action. Neuropsychologia, 34: 369–376. Glover, S.R. and Dixon, P. (2001) Dynamic illusion effect in a reaching task: evidence for separate visual representations in the planning and control of reaching. J. Exp. Psychol. Hum. Percept. Perform., 27: 560–572. Goodale, M.A. and Haffenden, A.M. (1998) Frames of reference for perception and action in the human visual system. Neurosci. Biobehav. Rev., 22: 161–172. Goodale, M.A. and Humphrey, G.K. (1998) The objects of action and perception. Cognition, 67: 179–205. Goodale, M.A. and Milner, A.D. (1992) Separate visual pathways for perception and action. Trends Neurosci., 15: 20–25. Goodale, M.A., Milner, A.D., Jakobson, L.S. and Carey, D.P. (1991) A neurological dissociation between perceiving objects and grasping them. Nature, 349: 154–156.
Goodale, M.A., Jakobson, L.S. and Keillor, J.M. (1994) Differences in the visual control of pantomimed and natural grasping movements. Neuropsychologia, 32: 1159–1178. Haffenden, A.M., Schiff, K.C. and Goodale, M.A. (2001) The dissociation between perception and action in the Ebbinghaus illusion: nonillusory effects of pictorial cues on grasp. Curr. Biol., 11: 177–181. Henriques, D.Y., Klier, E.M., Smith, M.A., Lowy, D. and Crawford, J.D. (1998) Gaze-centered remapping of remembered visual space in an open-loop pointing task. J. Neurosci., 18: 1583–1594. Henriques, D.Y., Medendorp, W.P., Khan, A.Z. and Crawford, J.D. (2002) Visuomotor transformations for eyehand coordination. Prog. Brain Res., 140: 329–340. Hu, Y. and Goodale, M.A. (2000) Grasping after a delay shifts size-scaling from absolute to relative metrics. J. Cogn. Neurosci., 12: 856–868. Hu, Y., Eagleson, R. and Goodale, M.A. (1999) The effects of delay on the kinematics of grasping. Exp. Brain Res., 126: 109–116. James, T.W., Culham, J., Humphrey, G.K., Milner, A.D. and Goodale, M.A. (in press). Ventral occipital lesions impair object recognition but not object-directed grasping: A fMRI study. Brain. Jeannerod, M. (1986) Mechanisms of visuomotor coordination: a study in normal and brain-damaged subjects. Neuropsychologia, 24: 41–78. Jeannerod, M., Decety, J. and Michel, F. (1994) Impairment of grasping movements following a bilateral posterior parietal lesion. Neuropsychologia, 32: 369–380. Mazzoni, P., Bracewell, R.M., Barash, S. and Andersen, R.A. (1996) Motor intention activity in the macaque’s lateral intraparietal area. I. Dissociation of motor plan from sensory memory. J. Neurophysiol., 76: 1439–1456. Milner, A.D. and Goodale, M.A. (1995). The Visual Brain in Action. Oxford University Press, Oxford. Milner, A.D., Perrett, D.I., Johnston, R.S., Benson, P.J., Jordan, T.R., Heeley, D.W., Bettucci, D., Mortara, F., Mutani, R., Terazzi, E. and Davidson, D.L.W. (1991) Perception and action in ‘visual form agnosia’. Brain, 114: 405–428. Milner, A.D., Dijkerman, H.C., Pisella, L., McIntosh, R.D., Tilikete, C., Vighetto, A. and Rossetti, Y. (2001) Grasping the past. Delay can improve visuomotor performance. Curr. Biol., 11: 1896–1901. Murata, A., Gallese, V., Kaseda, M. and Sakata, H. (1996) Parietal neurons related to memory-guided hand manipulation. J. Neurophysiol., 75: 2180–2186. Pavani, F., Boscagli, I., Benvenuti, F., Rabuffetti, M. and Farne, A. (1999) Are perception and action affected differently by the Titchener circles illusion? Exp. Brain Res., 127: 95–101.
Passingham, R.E. and Toni, I. (2001) Contrasting the dorsal and ventral visual systems: guidance of movement versus decision making. NeuroImage, 14: S125–S131. Perenin, M.T. and Vighetto, A. (1988) Optic ataxia: a specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. Brain, 111: 643–674. Sereno, A.B. and Maunsell, J.H. (1998) Shape selectivity in primate lateral intraparietal cortex. Nature, 395: 500–503. Snyder, L.H., Batista, A.P. and Andersen, R.A. (1997) Coding of intention in the posterior parietal cortex. Nature, 386: 167–170. Toni, I., Rushworth, M.F.S. and Passingham, R.E. (2001) Neural correlates of visuomotor associations. Spatial rules compared with arbitrary rules. Exp. Brain Res., 141: 359–369.
Toth, L.J. and Assad, J.A. (2002) Dynamic coding of behaviourally relevant stimuli in parietal cortex. Nature, 415: 165–168. Vishton, P.M., Rea, J.G., Cutting, J.E. and Nunez, L.N. (1999) Comparing effects of the horizontal-vertical illusion on grip scaling and judgment: relative versus absolute, not perception versus action. J. Exp. Psychol.: Hum. Percept. Perform., 25: 1659–1672. Westwood, D.A. and Goodale, M.A. (2003) Perceptual illusion and the real-time control of action. Spat. Vis, 16: 243–254. Westwood, D.A., Chapman, C.D. and Roy, E.A. (2000a) Pantomimed actions may be controlled by the ventral visual stream. Exp. Brain Res., 130: 545–548. Westwood, D.A., Heath, M. and Roy, E.A. (2000b) The effect of a pictorial illusion on closed-loop and open-loop prehension. Exp. Brain Res., 134: 456–463.
SECTION III
Perception and Attention
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved
CHAPTER 10
Color contrast: a contributory mechanism to color constancy
Anya Hurlbert* and Kit Wolf
Henry Wellcome Building for Neuroecology, School of Biology, Framlington Place, Newcastle upon Tyne NE2 4HH, UK
Abstract: Color constancy — by which objects tend to appear the same color under changes in illumination — is most likely achieved by several mechanisms, operating at different levels in the visual system. One powerful contributory mechanism is simultaneous spatial color contrast. Under changes in natural illumination the spatial ratios of within-type cone excitations between natural surfaces tend to be preserved (Foster and Nascimento, 1994); therefore, the neural encoding of colors as spatial contrasts tends to achieve constancy. Several factors are known to influence the strength of chromatic contrast induction between surfaces, including their relative luminance, spatial scale, spatial configuration and context (Ware and Cowan, 1982; Zaidi et al., 1991). Here we test the hypothesis that color contrast is weakened by differences between surfaces which indicate that they may be under distinct illuminants. We summarize psychophysical measurements of the effects of relative motion, relative depth and texture differences on chromatic contrast induction. Of these factors, only texture differences between surfaces weaken chromatic contrast induction. We also consider neurophysiological and neuropsychological evidence and conclude that the mechanisms which mediate local chromatic contrast effects are sited at low levels in the visual system, in primary visual cortex (V1) or below, prior to image segmentation mechanisms which require computation of relative depth or motion. V1 and lower areas may therefore play a larger role in color constancy than previously thought.
Introduction
Color constancy is a fundamental perceptual phenomenon, by which objects tend to appear the same color regardless of the illumination upon them. Color constancy is a property of surfaces in the context of other surfaces, not of lights in the void. Although it obviously requires interactions between distinct light signals from distinct spatial locations in the image, no single specific physiological mechanism at a particular locus has been identified as responsible for color constancy. In fact, the evidence suggests that color constancy is achieved by a heterogeneous group of mechanisms, operating at different levels in the visual system. For example, chromatic adaptation in the retina must contribute to constancy — although its effects are so fundamental that they are difficult to tease apart from the mechanisms into which they feed — but cortical mechanisms are also essential, as lesion studies demonstrate (Rüttiger et al., 1999). Theoretical arguments also suggest that color constancy may be carried out by different mechanisms, and a number of distinct computational models have been proposed that each achieve constancy with some degree of success, under particular conditions (Hurlbert, 1998). It is useful to group these mechanisms in the following framework, which distinguishes between the type of computational mechanism and the neural level on which it would most likely function:
Sensory. Models at this level require only simple linear transformations of the photoreceptor responses, one example being scaling of the individual receptors by their individual mean activities over the image, or, in other words, DC chromatic adaptation (Von Kries, 1906; Finlayson et al., 1993) (illustrated in the sketch below).

Perceptual. Models at this level require some parsing or segmenting of the image into distinct reflection or surface components, an example being chromaticity convergence algorithms which estimate the illumination spectrum from specular highlights or mutual reflections (Lee, 1986; Funt et al., 1991).

Cognitive. Models at this level require recognition of objects and/or color memory for identified objects, an example being the adjustment of memory color of familiar objects proposed by Beck (1972).

Here, we focus on one contributory mechanism to color constancy, color contrast, and argue from psychophysical, neurophysiological, and neuropsychological evidence that it operates at the sensory level.
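As a concrete illustration of the sensory-level scheme, the following sketch applies von Kries-style DC adaptation to a toy LMS image: each cone channel is rescaled by its own mean over the image, so a purely multiplicative (per-cone) illuminant change is discounted. The image, illuminant gains and three-channel encoding are hypothetical stand-ins, not the stimuli or calibration used in the experiments reported here.

```python
# Von Kries-type DC chromatic adaptation: scale each cone class by its image mean.
import numpy as np

rng = np.random.default_rng(0)
reflectances = rng.uniform(0.1, 1.0, size=(64, 64, 3))  # toy surface "LMS reflectances"
illuminant_1 = np.array([1.0, 1.0, 1.0])                # neutral illuminant (per-cone gains)
illuminant_2 = np.array([1.4, 1.0, 0.7])                # reddish illuminant (per-cone gains)

def von_kries(image_lms):
    """Divide each cone channel by its mean over the image (DC chromatic adaptation)."""
    return image_lms / image_lms.mean(axis=(0, 1), keepdims=True)

scene_1 = reflectances * illuminant_1
scene_2 = reflectances * illuminant_2

# After adaptation the two scenes are identical: the illuminant change is discounted.
print(np.allclose(von_kries(scene_1), von_kries(scene_2)))  # True
```

This only works exactly when the illuminant change really is a separate gain on each cone class; for natural illuminants it is an approximation, which is part of the motivation for the contrast-based mechanism discussed next.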
Color contrast and its link with color constancy

Color contrast occurs when a surface of one color induces its opponent color in an adjacent surface. Typically, the inducing surface is large and entirely surrounds the target surface; for example, a small grey disk acquires a pinkish tinge when surrounded by a large green annulus. Although it is a labile phenomenon, which occurs only under particular conditions, it is nonetheless powerful when it does occur. Under optimal conditions, color contrast is instantaneous, cannot consciously be 'turned off', and almost entirely determines the color appearance of a surface. For example, in a simple image, the color appearance of a small desaturated disk can largely be predicted by the spatial cone-contrast (the within-type spatial ratio of cone excitations) between it and its background (Shepherd, 1997; Hurlbert et al., 1998).

The contribution of color contrast to constancy comes about through this dependence of color appearance on spatial cone-contrasts. Under changes in natural illumination (such as daylight) the spatial cone-contrasts between natural surfaces tend to be
preserved (Foster and Nascimento, 1994). Thus, the encoding of color appearance by cone ratios between surfaces may help to preserve color appearance under natural illumination changes.

While it is clear that the local cone-contrast between a surface and its immediate background has a strong influence on color appearance, it is not clear to what extent the contrast from more distant surfaces may contribute. The distinction is often made between 'local' and 'global' contrast (see Kraft and Brainard, 1999), but it is perhaps more accurate to distinguish between surfaces which share edges with the target surface ('local' contrast) and those which do not ('remote' contrast). Where this distinction is made, the contribution of remote contrast to color appearance has been quantified as much smaller than local contrast (for example, remote fields up to 10° distant from a target surface contribute less than 10% of the induction from its immediate background; Wachtler et al., 2001; Wolf and Hurlbert, 2003). By another estimate, remote surfaces contribute significantly to color constancy, even when they contradict the color specified by immediate local contrast (Kraft and Brainard, 1999). Furthermore, not only the mean chromatic contrast, but also its variance, contributes to color appearance (Brown and MacLeod, 1997), although whether this effect is local or instantaneous is unclear.

Another important distinction is that between simultaneous spatial contrast and temporal adaptation. Ideally, spatial contrast is an instantaneous phenomenon that depends only on spatial interactions between image regions observed within one fixation. Temporal adaptation requires time for nearby and distant regions to influence each other, as neural response gains are adjusted according to signals sampled over several eye movements and via more slowly propagating lateral interactions. Empirically, it is difficult to tease apart the two phenomena; the color appearance of a small target against a large chromatic background, viewed for seconds or longer, will be influenced in the same way by temporal adaptation to the mean chromaticity and luminance of the dominant background as by the instantaneous spatial cone-contrast at the edges between them. When properly teased apart, instantaneous contrast effects may contribute up to 60% of the chromatic induction effects of a large
uniform background, and do so within the first 25 ms of the background's initial display (Rinner and Gegenfurtner, 2000). According to Rinner and Gegenfurtner (2000), the remaining induction effects are produced by two adaptational mechanisms, one fast (half-life 40–70 ms), the other slow (half-life 20–30 s). The time scale and spatial characteristics of the former are consistent with fast, local receptoral adaptation mediated by micro-saccades (Shady and MacLeod, 2002). Qualitatively, therefore, there seem to be at least three distinct sensory mechanisms that interact to determine color appearance: chromatic adaptation to the mean, which occurs over relatively large spatial and temporal scales, alters the neutral point, and requires at least 1 min to complete after a transition in mean chromaticity (Rinner and Gegenfurtner, 2000; Werner et al., 2000); chromatic adaptation to the variance (Webster, 1996), which scales sensitivity to cone-contrasts around the neutral point, and probably requires minutes to complete; and spatial contrast between image regions of distinct cone excitations, which occurs 'instantaneously' (Rinner and Gegenfurtner, 2000). These mechanisms further interact with color filling-in processes, so that color appearance seems to be determined by contrasts spreading away from edges, across regions (Hurlbert and Poggio, 1989; Broerse et al., 1999; Rudd and Arrington, 2001).

Here we focus on near-instantaneous spatial cone-contrast effects between a target surface and its immediate background. The effects we measure are dominated by local contrast, although we cannot exclude a contribution from fast, predominantly local receptoral adaptation.
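To make the ratio-preservation argument explicit, the sketch below treats an illumination change as a separate multiplicative gain on each cone class, an idealization of the empirical result of Foster and Nascimento (1994) for natural surfaces and illuminants. Under that assumption, within-type cone-excitation ratios between two surfaces are exactly unchanged, so a color code based on spatial cone contrasts is automatically constant. The surface and gain values are invented for illustration.

```python
# Within-type cone-excitation ratios between surfaces survive a per-cone gain change.
import numpy as np

surface_a = np.array([0.30, 0.25, 0.10])   # LMS excitations of surface A under illuminant 1
surface_b = np.array([0.15, 0.20, 0.05])   # LMS excitations of surface B under illuminant 1
gain      = np.array([1.30, 0.90, 0.60])   # per-cone gains of an illuminant change

ratio_before = surface_a / surface_b
ratio_after  = (surface_a * gain) / (surface_b * gain)

print(ratio_before)                              # e.g. [2.   1.25 2.  ]
print(np.allclose(ratio_before, ratio_after))    # True: the gains cancel in the ratio
```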
Psychophysical investigations of simultaneous chromatic contrast

In a series of psychophysical experiments, we have measured the effect of distinct factors on color contrast, in order to help pinpoint its locus and mode of operation in the visual system. One might argue that since we are focussing on 'local' color contrast, we have already defined the locus to be low level. But we now know that 'global' effects may occur in V1 — for example, as contextual modulation of
classical receptive field responses (Albright and Stoner, 2002) — as well as in the thalamus, via feedback from higher areas, and conversely, that 'local' effects may dominate neuronal responses in higher visual areas (Allman et al., 1985). Factors already known to influence the strength of color contrast between surfaces include their relative luminance, spatial scale, spatial configuration and context (Ware and Cowan, 1982; Zaidi et al., 1991).

In selecting factors that might influence color contrast, our guiding hypothesis has been that color contrast serves color constancy, and that therefore, reasoning from a 'neuroecological' viewpoint, color contrast should be weakened between surfaces likely to be under different illuminants. Detachable surfaces are more likely to be under different illuminants than attached surfaces. We have therefore considered factors likely to influence the apparent 'detachability' of two surfaces, such as differences in depth, motion, and surface texture (large differences in texture scale are consistent with the surfaces being at different distances from the observer, or made from distinct materials). In phrasing our motivation in this way, we do not mean to imply that chromatic contrast induction is necessarily preceded by the subconscious analysis or conscious perception of the 'attachedness' of the surfaces involved. Rather, we argue that if chromatic contrast induction evolved to serve the purpose of discounting the chromatic effects of a common illuminant, its success in serving its purpose would depend on its being applied in the appropriate circumstances, and therefore, it might co-evolve with mechanisms that signal the likelihood of a common illuminant. One might expect, therefore, that chromatic contrast mechanisms would segregate from mechanisms that signal distinct illuminants. On the other hand, because color contrast enhances material differences between surfaces by factoring out the common illuminant, we could argue that surface 'detachability' is necessary to signal the usefulness of color contrast. For the purposes of behavior, it would be more important to distinguish detachable surfaces from each other, so that they may be picked apart — for example, red fruit against a background of green leaves. Enhancing the contrast between same-surface color patches may have
less behavioral significance, because the patches would not ordinarily need to be picked apart (although representing accurately the colors of a multicolored surface would be important for recognition). We might therefore expect that contrast operates best within a certain range: for surfaces that are detachable from each other, but not so detached as to appear under distinct illuminants. Other factors that influence whether two surfaces appear to be under the same illuminant or not then become crucial, but we would expect these to operate at a cognitive level, beyond elementary image analysis. Our list of factors is designed to help pinpoint the locus of chromatic contrast within the earlier stages of image processing, if it does indeed occur there.
Experimental paradigm

The experiments employ one of two basic types of color appearance judgment task. The first is a two-interval, forced-choice color discrimination task, in which observers compare the color appearance of two central targets, presented in rapid temporal succession against uniform or articulated backgrounds, based on the paradigm of Wachtler et al. (2001). The first (reference) target and background are neutral in chromaticity (but slightly different in luminance, to insure target visibility); the second (test) background is shifted in chromaticity with respect to the reference, in the direction of increasing L-cone excitation, while the test target is shifted in the same direction but to a different, variable degree (see Fig. 1). The two successive presentations thus simulate a temporal change in a spatially uniform, global illumination and the observer's task is to report whether the color of the central target becomes redder or greener under the change. The cone-contrast (with respect to the neutral reference) required for the test target to appear neutral is our measure of the strength of contrast induction. The second basic task (described in more detail below) requires the observer to null apparent color changes of a central target against a background whose chromaticity is temporally or spatially modulated along the (L–M) cone-contrast axis.
Fig. 1. (a) and (b) Stimulus parameters and protocol for the basic 2AFC task. The basic stimulus (1a) consists of a 'target' square presented centrally against a rectangular background. To insure its visibility, its luminance is set 8% dimmer than the background level of 15 cd/m2. The background in this example contains a cartoon of the 'Mondrian' texture that we used (not to scale). Fig. 1(b) shows the time-course of a single trial. During the first stage, presented for 500 ms, the observer memorizes the color of a neutrally colored target (Reference), set against a neutrally colored background. The background color is then shifted isoluminantly along the LM-axis by a fixed amount, and a concomitant variable color shift is introduced to the target (Test). The experimental task is to decide, during the 500 ms for which this stimulus is presented, whether the new target viewed against the shifted background appears reddish or greenish. Observers indicate their responses using a game-pad, during a period of top-up adaptation when they view the full-screen neutral reference background.
The observer adjusts the extent of temporal modulation of the central square's chromaticity along the same axis, and the strength of contrast induction is measured as the amount of modulation necessary to
stabilize the color appearance of the target square at neutral chromaticity. In each task, there is no opportunity for long-term DC adaptation to the nonneutral inducing background (although the temporal changes in both tasks may influence AC adaptation (Webster, 1996); see discussion below). The brief presentation of the test (inducing) background is immediately followed by a period of readaptation to the neutral background (enforced to a minimum of 1 s); under these conditions, the mean adaptational state remains near-neutral as confirmed in separate control experiments. (The achromatic setting under this rapid-change paradigm deviates by at most 5% from the achromatic setting under steady-state adaptation to the neutral background.) The test square against the test background is displayed for 500 ms only, long enough for instantaneous spatial contrast effects and fast, local receptoral adaptation (which, aided by micro-saccades, may propagate over several degrees in that time; Shady et al., 2002) to reach completion, but well before completion of the slow chromatic adaptation mechanisms that dominate steady-state color appearance (half-life 20–30 s; Fairchild and Reniff, 1995; Rinner and Gegenfurtner, 2000). The tasks therefore differ from others used to quantify contrast and constancy which involve simultaneous comparisons between distinct stimuli under prolonged viewing. The experimental methods are described in full in Hurlbert and Wolf (2002) and Wolf and Hurlbert (2002a,b), and briefly summarized in the Methods appendix.
Texture segmentation

Differences in the spatial and spectral characteristics of surface texture may signal that two surfaces are made from different materials, or that they are located at different distances from the viewer. Texture differences are powerful cues for image segmentation (see Nothdurft, 1994), and in the achromatic domain, have been shown to disrupt lightness contrast induction when they occur between target and background (Laurinen et al., 1997). Using the basic 2AFC task with a +L–M background-shift between the reference and test stimuli (see Methods), we investigated whether the
strength of contrast induction depends on surface texture properties of the target and background. We compared three texture-difference conditions with one control condition:

(1) Chromatic texture in the central target; spatially uniform background.
(2) Spatially uniform central target; chromatic texture in the background.
(3) Identical chromatic texture in both central target and background.
(4) Spatially uniform central target and background (the control condition).

The chromatic texture resembled a fine-grained 'Mondrian', and was created by scattering rectangles between 1 × 1 and 3 × 3 pixels in size (at 32 pixels per degree) at random across a square grid covering the target and/or background (see Fig. 1). Each rectangle was assigned one of five distinct isoluminant colors distributed evenly around the reference chromaticity along one axis of cone-contrast space. The method insured that proportions of pixels of each color did not differ by more than 1%. To change the mean chromaticity of the texture, for example to produce a +L–M-shifted background, the reference chromaticity was redefined, but the distribution of the five colors around the reference remained constant in the cone-contrast space that we used. Thus the space-averaged chromaticity and luminance of the chromatic textures remained identical to those of the spatially uniform controls. For condition 1, we also performed separate experiments using textures defined by different characteristics: for example, chromatic textures of differing spatial scale and regularity, and differing chromatic distributions, and luminance textures of different spatial scales. Qualitatively, the results were the same for all textures, although they differed quantitatively (see discussion below).

The observer's task was as usual to report whether the central target became redder or greener between the reference and test presentations on each trial. Even for the multicolored target with chromatic texture, this task was entirely straightforward for all observers, in that reddish or greenish shifts in the average color of the texture were clearly discernible.
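For readers who want a concrete picture of such a stimulus, the sketch below generates a Mondrian-like chromatic texture along the lines just described: small rectangles scattered over a grid, each carrying one of five offsets spaced evenly around the reference chromaticity along a single cone-contrast axis. The grid size, offset amplitude, number of rectangles and the exact pixel-balancing procedure are placeholder choices, not the values used in the experiments.

```python
# A rough sketch of a fine-grained 'Mondrian' chromatic texture in cone-contrast coordinates.
import numpy as np

rng = np.random.default_rng(1)

def mondrian_texture(height, width, reference, axis, amplitude, n_colors=5, n_rects=4000):
    """Return an (height, width, 3) image textured with n_colors offsets around `reference`."""
    offsets = np.linspace(-amplitude, amplitude, n_colors)   # evenly spaced about zero
    image = np.tile(reference, (height, width, 1)).astype(float)
    for _ in range(n_rects):
        h, w = rng.integers(1, 4, size=2)                    # rectangle size: 1x1 to 3x3 pixels
        top = rng.integers(0, height - h + 1)
        left = rng.integers(0, width - w + 1)
        image[top:top + h, left:left + w] = reference + rng.choice(offsets) * axis
    return image

reference = np.array([0.0, 0.0, 0.0])      # neutral point in cone-contrast coordinates
lm_axis   = np.array([1.0, -1.0, 0.0])     # (L-M) direction, S held constant
texture   = mondrian_texture(160, 160, reference, lm_axis, amplitude=0.05)

# Because the offsets are symmetric, the space-averaged chromaticity stays near the reference
# (the experiments additionally balanced pixel counts per color to within 1%).
print(texture.reshape(-1, 3).mean(axis=0))
```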
The results, summarized in Fig. 2, demonstrate that chromatic texture differences do disrupt chromatic contrast induction. But the effect is asymmetric: in condition 1, chromatic texture in the central target almost completely blocks contrast induction from the uniform background, for all 4 observers. In condition 2, chromatic texture in the background weakens contrast induction on a uniform central target, but not as strongly as in condition 1. Nonetheless, when we add chromatic texture to both background and target, contrast induction is restored to approximately the same level as condition 2 for most observers. This result is notable, as one explanation for the lack of induction of the uniform background on the textured target might be the known inability of contrast induction to propagate across chromatic borders (Zaidi et al., 1992). Yet the total number of chromatic borders in condition 3
(with textured target and background) is larger than in condition 1 (textured target against a uniform background), and induction is stronger. Thus the number of edges is unlikely to be the crucial factor in blocking induction. In separate experiments, we varied the intrinsic contrast of the textures, by varying the range of constituent colors around the neutral (or shifted-neutral) point. Even for textures with just-noticeable chromatic contrast, the effect of target-background texture-differences remained powerful, substantially reducing chromatic contrast induction. To determine whether the reduction in contrast-induction might be due to differences in the strength of induction between the individual constituent colors, we measured induction strength for each color individually in the spatially uniform control condition.
Fig. 2. The strength of contrast induction (see Methods in Appendix) under different stimulus configurations for a single observer (KW). Induction is strongest for a uniform gray target viewed against a uniform background. It is weaker for yellow, blue, red or green targets, but in each case still stronger than when the same individual colors are combined to make a blue and yellow texture (S-texture condition) or a red and green texture (LM-texture condition) for which induction is weakened substantially. The bar labeled LM bkg shows that induction is intermediate in strength for a uniform neutral target against a textured background. The rightmost bar (target + background texture) shows that when both background and target have the same LM-texture, the induction strength is greater than for the target-only texture conditions.
As Fig. 2 illustrates, each of the constituent colors (from the highest-contrast supra-threshold chromatic texture) undergoes substantially greater contrast induction from the +L–M-shifted uniform background than does the chromatic-textured target made up of these same colors. The results also suggest that, while chromatic contrast along any direction in cone-contrast space blocks induction to some extent, as does luminance contrast, the blocking effect is strongest when the chromatic texture varies along the same direction as the background chromaticity shift. As Fig. 2 illustrates, when the background chromaticity shift is in the direction of increasing L-cone excitation, chromatic texture varying along the (L–M) axis blocks induction significantly more strongly than chromatic texture along the S axis. In fact, the induction-blocking effect appears to be tuned to the cardinal axes: chromatic texture which varies along an intermediate axis between the S and (L–M) directions, and therefore contains components of both, significantly weakens induction from backgrounds chromaticity-shifted along any direction in cone-contrast space.
Depth segmentation, via binocular disparity

Previous studies on the effects of depth differences on simultaneous contrast have focussed almost exclusively on achromatic lightness or brightness contrast, with conflicting results. There is no doubt that under certain conditions depth perception strongly influences lightness perception. Using occlusion cues as the sole cues to depth under monocular viewing, Gilchrist (1977) demonstrated that the perceived lightnesses of paper squares changed dramatically when the apparent depth plane to which they belonged changed. Earlier studies in which binocular disparity was manipulated as the primary cue to depth (Gogel and Mershon, 1969; Mershon, 1972) also concluded that the perceived depth separation between a test target and its inducing background strongly influenced brightness (and whiteness) contrast between them. On the other hand, using simpler stimuli, Gibbs and Lawson (1973) found no effect of binocular disparity on brightness contrast, and concluded that previous effects may have been mediated by apparent size changes with depth. More
recent studies have reported that depth differences do influence achromatic lightness and brightness contrast (Schirillo and Shevell, 1993) and chromatic contrast (Shevell and Miller, 1996), but the effects described are very small. Using the nulling task described above, we investigated whether depth-differences defined by binocular disparity may disrupt chromatic contrast. In this task, the central target was again a small square, set against a large inducing background (see Methods) presented as separate left- and right-eye views through a modified Wheatstone stereoscope. To provide sufficient form cues for binocular fusion, we added fine-scale luminance texture to the otherwise uniform background, taking into account the slight weakening in chromatic contrast this background texture introduces. By displacing the central target laterally in opposite directions for each eye, we moved it between three planes across trials, one of which was coplanar with the background, the second in front of the background, and the third behind, as if viewed through a window. On each trial, the background chromaticity was modulated sinusoidally along the (L–M) axis between red and green, and the observer nulled the induced color change of the target using a computer keyboard. We found that chromatic contrast induction was equally strong for all three depth configurations, for all observers. This finding contradicts earlier reports for brightness contrast (Gogel and Mershon, 1969) but is consistent with others (Gibbs and Lawson, 1973). Many factors may explain the apparent discrepancies. In general, strong effects of depth differences on perceived lightness or color are found with more complex stimuli; ours employ the simplest possible center-surround configuration. The effects in some earlier studies (e.g. Mershon, 1972) may have been confounded by the presence of black borders separating the test and inducing fields, which themselves are known to inhibit simultaneous contrast. Gilchrist (1977) argued explicitly that target lightness depends on perceived depth only if the perceived lighting framework to which the target belongs also depends on perceived depth; the target must be perceived to move from a brightly lit to a dimly lit framework to change in perceived lightness. The presence of multiple, obvious lighting frameworks may therefore be the crucial difference between
Gilchrist (1977) and previous experiments. But other differences may also be important; for example whether actual changes in depth are used or cues such as occlusion or disparity are used to manipulate depth perception. Our stimuli contain only a single lighting framework, so we cannot exclude the possibility that there are interactions between the perception of lighting frameworks within a scene and the neural mechanisms responsible for simultaneous chromatic contrast. But we do show conclusively that depth segmentation cues alone are not sufficiently strong to disrupt the illusion, unlike the presence of a black border, or texture cues. By keeping our stimuli very simple, we deliberately hoped to avoid contamination of our results by higher mechanisms influencing color perception, which may or may not be independent of those responsible for simultaneous contrast.
Motion segmentation

Relative motion is also a powerful cue to image segmentation (Møller and Hurlbert, 1996), and obviously signals detachability between surfaces. As surfaces move from one location to another, they may move from one illumination framework into another. Using two variants of the color appearance nulling task, we investigated whether motion differences between the inducing background and target may also inhibit contrast induction. In the first paradigm, the target square moves sinusoidally and vertically against a stationary background with a vertical gradient in chromaticity, varying along the (L–M) axis from neutral in the center to red at the top and green at the bottom of the display. In the second paradigm, the target square remains stationary in the center of the display, and the luminance-textured background moves continuously and sinusoidally, either translating or rotating, while simultaneously oscillating in color between red and green along the (L–M) axis. In both paradigms, the chromaticity of the target square is modulated along the (L–M) axis in phase with the background modulation. The observer's task is to adjust the amplitude of the target's modulation so that it no longer changes color over the course of its or the background's motion. We measure the
strength of the induced contrast as the amplitude of modulation required to null the apparent change in color of the target. We found that chromatic contrast induction was strong for both paradigms and all motion-difference conditions, for all speeds and frequencies tested, despite the fact that the target and background were clearly disjoint. Thus, motion segmentation does not disrupt simultaneous contrast induction.
Summary and discussion of psychophysical results

Neither motion nor depth differences (conveyed by binocular disparity) disrupt simultaneous chromatic contrast induction, but texture differences do. These results suggest that the mechanisms underlying chromatic contrast are sited early in visual processing, at a monocular level prior to image segmentation based on the computation of relative depth or motion. The effects of texture-differences are also consistent with a low-level locus, since they are tuned to cardinal axes whose physiological instantiation has not been proved beyond primary visual cortex. Furthermore, the blocking effect of texture-differences is very local: we have shown in other experiments that a small (1/3°) textured annulus surrounding the target is as effective as a full-screen textured background in reducing contrast induction (Hurlbert and Wolf, 2002). Thus, the texture-difference effect most probably reflects an intrinsic dependence of the low-level spatial contrast mechanism on local chromatic variance.

We argue that the induction effects we measure are dominated by local rather than global spatial chromatic contrast for several reasons: (1) earlier experiments using a similar rapid-change paradigm conclude that changes in remote fields up to 10° distant from the central target have only a small effect on the induced contrast (Wachtler et al., 2001; Wolf and Hurlbert, 2003). (2) Control experiments in which only a very small annulus surrounding the central target undergoes the shift in chromaticity between the reference and test presentations, while the remaining background remains neutral, yield similar induction effects. (3) The strength of contrast induction is similar for the different measurement
techniques and conditions we used, despite their temporal differences (Fig. 3). We further argue that these effects are dominated by instantaneous spatial contrast, rather than temporal adaptation, because the very brief exposures to the inducing background do not permit steady-state chromatic adaptation to the same background. Furthermore, in independent measurements, we determined that a 2 min exposure to the textured inducing background did yield near-complete chromatic adaptation (i.e. the central target required almost the same shift in chromaticity as the background to appear neutral). Therefore, the small amount (10–15%) of contrast induced by the 500-ms exposure to the textured background provides an upper limit to the amount of adaptation that could occur within this brief duration, although even this small amount is probably primarily due to spatial contrast rather than temporal adaptation.
Other studies support the conclusion that local chromatic induction is monocular in origin (Shepherd, 1997), while there is some evidence that remote induction effects occur beyond the retina at a binocular locus (Shevell and Wei, 1998) and that spatial cone excitation ratios may be computed binocularly (Nascimento and Foster, 2001). In other experiments, we found that the contrast induction effects of remote surfaces presented to one eye only may be cancelled by a stimulus presented to the other eye, and conclude therefore that remote effects are likely to occur at a binocular site (Wolf and Hurlbert, 2003). Given the evidence suggesting that perceived depth differences inhibit contrast induction, particularly in the achromatic domain, why did we fail to find any such effect of binocular-disparity differences on chromatic contrast induction?
Fig. 3. A comparison of the strength of contrast induction as measured by a variety of techniques. A value of 100% would indicate that the target square must always have the same chromaticity as its immediate background in order to appear neutral. A value of 50% indicates that the target appears neutral when its chromaticity is halfway between that of the immediate background and the neutral point. The strength of induction is comparable in our basic 2AFC experiment and in the ‘moving-center’ experiments. It is weaker for the moving-background experiments, perhaps because of the luminance-texture we used to define the background position. But there are no significant differences between the strength of induction for moving versus stationary backgrounds.
Unlike other studies, we used a nulling technique in order to maintain DC adaptation at a fixed neutral point while allowing temporal modulation of background chromaticity. This temporal chromatic modulation may induce AC adaptation. We would therefore expect decreased sensitivity to color deviations from neutral along the (L–M) axis: reds would look less red and greens less green (Webster, 1996). But the DC shift in neutral chromaticity, which is what we measure here, would not be affected. The more likely explanation for the discrepancy lies in the simplicity of our stimuli compared with earlier studies. In fact, our results are in agreement with those of Gibbs and Lawson (1973) for achromatic contrast, whose stimuli are most similar to ours.

Interestingly, the strong dependence of spatial chromatic contrast on texture, despite its low-level mechanism, may be very effective in weakening its operation between detachable surfaces in the natural world, since most natural surfaces are textured, areas of absolutely uniform color being the exception rather than the rule. The contrast-blocking effect of texture may explain why, in everyday life, objects do not generally change color when viewed against different colored backgrounds, despite the power of chromatic contrast under other conditions.
Neurophysiological evidence for the role of V1 in simultaneous chromatic contrast

Decades of neurophysiological experiments have produced a shifting taxonomy of color-selective cells in the primate brain, yet only a few have proved eligible candidates for achieving color constancy. The most likely candidates have been found in cortical area V4 (Zeki, 1983a,b; Desimone et al., 1985; Schein and Desimone, 1990), well beyond the first stages of visual processing. The general view until now has been that although early visual areas may perform an initial analysis of the spectral content of images, it is not until V4 or beyond that object colors are constructed and first fed to consciousness (Lennie et al., 1990). In a landmark statement, Crick and Koch (1995) specifically argued against primary visual cortex (V1) as the site for consciousness, because neurons there do not display the properties necessary for color constancy and color constancy is a bedrock of our conscious experience.
In the past few years, the argument has begun to swing the other way: new studies of neurons in V1 now suggest that it is a critical site for color constancy. These studies fall broadly into two classes: those that examine contextual modulation and those that examine spatial receptive field properties. The first type stems from the increasing recognition of the large role that nonclassical-receptive-field context plays in modulating the activity of V1 neurons in response to stimuli within the classically defined receptive fields. Context modulates the response to texture or direction-of-motion (Albright and Stoner, 2002); therefore, it should not be surprising if context did the same for color. Motivated by this reasoning, Wachtler et al. (2003) found that the chromatic tuning of V1 neurons is indeed influenced by their nearby and distant chromatic surroundings. The cells' responses to preferred colors are suppressed by a background of the same color, consistent with the change in appearance induced by reduced contrast against the background, and therefore indirectly consistent with the constancy observed in the face of unchanging contrasts. The cells' responses are also influenced by discrete, remote patches of color located up to 6° outside the receptive field. Similarly, MacEvoy and Paradiso (2001) have described 'lightness' neurons in V1, whose responses to achromatic stimuli depend on their contrast with surrounding achromatic stimuli, outside the classical receptive field. Preliminary results from recordings in marmoset V1 provide direct evidence that color-selective neurons there encode spatial cone-contrasts (Hurlbert et al., 2001).

The second set of experiments has probed the spectral and spatial structures of classical receptive fields in V1 in order to assess the specialization of V1 neurons for color. Johnson et al. (2001) found, using sine-wave gratings of varying spatial frequency, that most cells did indeed respond more strongly to luminance than chromatic variations within their receptive fields. But a significant proportion of cells (nearly 40%) either preferred chromatic variations at very low spatial frequencies, or responded equally well to modulations of luminance or color (at the same spatial frequency) when total cone-contrast was equated for the two types of stimuli. Many of these latter 'color-luminance' cells share key
characteristics with the double-opponent cell newly rediscovered by Conway (2001), in that the L and M cone responses are out of phase with each other across the receptive field. Conway (2001) demonstrated that approximately 10% of neurons in V1 satisfy the criteria for true double-opponency, having cone-opponency in the center (e.g. excited by +L contrast, suppressed by +M contrast; +L–M) and the opposite opponency in the surround (i.e. +M–L). Other recent studies report that V1 neurons are more highly specialised for color processing than previously acknowledged (Hanazawa et al., 2000). Double-opponent cells are ideal candidates for the computation of local cone-contrast, in that they explicitly compare within-type cone excitations across space. But they compute very local contrast only, between the center and surround of receptive fields extending little more than 1–2° in total. They are therefore probably distinct from the V1 cells modulated by extra-receptive-field contrast (Hurlbert et al., 2001; Wachtler et al., 2003), whose color preferences were measured using large isoluminant stimuli ill-favored by double-opponent cells. But the context-modulated cells have not been tested explicitly for double-opponency, and indeed little is yet known about their receptive field structure. Conversely, the influence of extra-receptive-field contrast on the double-opponent cell responses has not been tested. So, although it is unclear how many distinct types of V1 cell encode spatial chromatic contrast, it is clear that a substantial proportion of cells do. Intriguingly, most of the above studies have been carried out in alert behaving monkeys, whereas earlier studies which demonstrated color-constant responses in V4 but not V1 were performed in anaesthetised animals. Thus, the rise of V1's importance to color perception has paralleled the rise in the experimental animal's consciousness.
Neuropsychological evidence for the role of V1 in chromatic contrast

Natural lesion studies also suggest that chromatic contrast mechanisms may be mediated at the level of V1 or below. Studies of the cerebrally achromatopsic observer, MS, indicate that he is capable of discriminating small differences in spatial cone-contrasts despite being perceptually unaware of colors (Hurlbert et al., 1998; Kentridge et al., 2003). Similarly, the color names chosen by another cerebrally achromatopsic observer, JPC, show strong dependence on the background against which the colored surfaces are displayed, under otherwise relatively constant viewing conditions (D'Zmura et al., 1998). Both observers suffer damage to the ventral and temporo-occipital cortical regions, including the human 'color center' localized to the lingual and fusiform gyri (controversially labeled as V8 or V4, and probably corresponding to area TEO in macaque monkey; Hadjikhani et al., 1998), yet both observers retain residual sensitivity to spatial chromatic contrast. Observer MS, despite being able to discriminate differences in spatial cone-contrasts in simple center-surround configurations, is unable to discriminate these differences when embedded in more complex scenes (Hurlbert et al., 1998). (This finding argues against his local discrimination ability being mediated solely by retinal adaptation, since DC adaptation is similar for the complex and simple scenes.) Other studies of observers with incomplete achromatopsia demonstrate that deficits in color discrimination may be dissociated from deficits in color constancy (Kennard et al., 1995; Rüttiger et al., 1999), and, further, that the crucial lesion underlying specific deficits in constancy lies in the superior temporal gyrus, anterior and temporal to the location of the human 'color center' in the fusiform and lingual gyri. Taken together, these studies indicate that color perception relies first on the computation of local cone-contrasts in V1 or below, but requires longer-range spatial contrast comparisons that are mediated by higher cortical areas in the ventral visual pathway.

Conclusion
In summary, it is highly plausible that the mechanisms which mediate local chromatic contrast effects are sited at low levels in the visual system, in primary visual cortex or below, prior to image segmentation mechanisms which require computation of relative depth or motion. To the extent that color contrast
contributes to color constancy (and by some accounts, local color contrast is the major contributor), V1 and lower areas therefore mediate color constancy. This conclusion might seem counter to the Crick–Koch hypothesis that activity in V1 cannot rise to awareness; is it possible that the activity in V1 may reach consciousness only if the animal is in fact conscious? Crick and Koch (1995) argue not: "... if neurons in both V1 and V4 in the alert monkey did turn out to show [color constancy] this would not, by itself, disprove our hypothesis". Instead, the neural activity in V1 might serve simply to trigger the awareness-related activity in V4 and beyond. It might be that double-opponent cells carry out the essential local contrast calculations (and segment the image in the bargain), while V4 cells stabilize and adjust colors according to global contrast. Both mechanisms would be aided and abetted by chromatic adaptation in the retina, which may propagate far enough and early enough to show up as long-range contextual effects for all color cells in V1. The fact that color contrast may be perceived without color consciousness further argues that global mechanisms beyond V1 are needed for consciousness of color constancy.
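As an illustration of the kind of local computation being attributed to double-opponent cells, the following toy model (not taken from the chapter, and not a fit to any physiological data) filters an L–M opponent signal with a balanced center-surround profile: the unit responds to a local chromatic contrast, such as a reddish patch on a greenish surround, but gives essentially no response to a chromatically uniform field. All sizes and gains are arbitrary.

```python
# Toy double-opponent unit: +L-M drive in the center, +M-L drive in the surround.
import numpy as np

def dog_kernel(size=21, sigma_c=1.5, sigma_s=4.0):
    """Balanced difference-of-Gaussians profile (center minus surround, each normalized)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_c**2))
    surround = np.exp(-r2 / (2 * sigma_s**2))
    return center / center.sum() - surround / surround.sum()

def double_opponent_response(l_image, m_image, kernel):
    """Response at the image center to the L-M opponent signal under the DoG profile."""
    opponent = l_image - m_image
    h, w = kernel.shape
    top = (opponent.shape[0] - h) // 2
    left = (opponent.shape[1] - w) // 2
    return np.sum(opponent[top:top + h, left:left + w] * kernel)

kernel = dog_kernel()

# Uniform reddish field (L > M everywhere): center and surround cancel, response ~ 0.
l_uniform = np.full((41, 41), 0.6)
m_uniform = np.full((41, 41), 0.4)

# Local chromatic contrast: reddish patch on a greenish surround -> clear positive response.
l_patch = np.full((41, 41), 0.4)
m_patch = np.full((41, 41), 0.6)
l_patch[17:24, 17:24] = 0.6
m_patch[17:24, 17:24] = 0.4

print(double_opponent_response(l_uniform, m_uniform, kernel))  # approximately 0
print(double_opponent_response(l_patch, m_patch, kernel))      # clearly greater than 0
```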
Appendix: Methods

Cone-contrast space. Total cone-contrast (dC) was defined as:

\[ dC = \sqrt{\left(\frac{\Delta L}{L}\right)^{2} + \left(\frac{\Delta M}{M}\right)^{2} + \left(\frac{\Delta S}{S}\right)^{2}} \]

where L, M and S refer to the original cone excitations of the neutral reference color (CIE chromaticity coordinates: x = 0.321; y = 0.337), as calculated using the Smith–Pokorny cone fundamentals, and ΔL, ΔM, and ΔS refer to the changes in cone excitation following the shift, i.e. ΔL = L_shifted − L_prototypic. We further defined two chromatic cardinal axes: the 'LM' axis, for which S-cone stimulation is constant, and the S axis, for which the 'L:M' ratio is invariant. All color shifts were isoluminant with respect to the prototypic color (i.e. ΔL + ΔM = 0). '+L–M' refers to increasing L-cone-contrast along the (LM) axis. The results reported here are for induction along the LM axis, with the exception of those relating to the chromatic tuning of the texture-blocking effect.
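The same definition written out as code; this is a direct transcription of the formula above, assuming the reference and shifted LMS values are already available as numbers. The example values are arbitrary and chosen only to exercise the formula.

```python
# Total cone-contrast dC between a reference LMS triplet and a shifted LMS triplet.
import numpy as np

def total_cone_contrast(lms_reference, lms_shifted):
    """dC = sqrt((dL/L)^2 + (dM/M)^2 + (dS/S)^2), deltas taken relative to the reference."""
    lms_reference = np.asarray(lms_reference, dtype=float)
    delta = np.asarray(lms_shifted, dtype=float) - lms_reference
    return float(np.sqrt(np.sum((delta / lms_reference) ** 2)))

reference = [0.667, 0.333, 0.020]   # hypothetical neutral LMS excitations
shifted   = [0.707, 0.293, 0.020]   # an L-M shift with S unchanged
print(total_cone_contrast(reference, shifted))
```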
Stimuli. Our prototypic stimulus, as shown in Fig. 1(a), consisted of an achromatic central square set against a uniform or textured background (with a neutral space-averaged chromaticity and luminance). Except when otherwise stated, the central target was 1° square; the background dimensions were 30° × 20°. For the 2AFC task, the target and background luminances were, respectively, 13.8 and 15 cd/m2; for the nulling experiments (disparity and motion), the values were 28 and 30 cd/m2. Stimuli were generated as truecolor bitmaps using Matlab on a PC and displayed on a 20-in. SGI CRT monitor, calibrated and checked for spatial chromatic homogeneity with a Minolta CS-100 chromameter. Observers viewed the stimuli in complete darkness from a distance of 63 cm, through a black-lined viewing box, with a chinrest to limit head movement. Observers were instructed to maintain fixation on the target square throughout each trial. To enforce preadaptation, no responses were collected for the first 2 min of each experimental session. Four observers participated in the experiment, all with normal color vision as verified with the Farnsworth–Munsell 100-hue test.

2AFC task. On each trial the following sequence was presented: prototypic stimulus (500 ms); test stimulus (500 ms); neutral background (minimum 500 ms). The test stimulus was identical to the reference in its spatial configuration; only the colors differed. The background colors were uniformly translated along the (L–M) axis in the isoluminant plane relative to their prototypic colors, always with a constant total cone-contrast shift of 0.1. The total cone-contrast of the central square was varied between trials, its value taken at random from a set of constant increments along the isoluminant (L–M) axis relative to the prototypic square color. The observer's task was to indicate by a button-press whether the test square appeared 'redder' or 'greener' than the prototypic square. Responses were taken during the top-up adaptation period (minimum 500 ms) between trials. Stimulus presentation and response collection were performed under the software package 'Presentation' (Neurobehavioral Systems, http://www.neurobehavioralsystems.com). We quantified the chromatic induction for each condition as the total cone-contrast of the central test square (relative to the prototypic square's neutral
color) which, when viewed against the +L–M-shifted background, appeared neither redder nor greener than the prototypic square, expressed as a percentage of the total cone-contrast of the +L–M-shifted background. This value was calculated as the 50% point of the best-fitting Weibull function for the psychometric data relating percentage 'redder' responses to total cone-contrast of the test square (see Fig. 2).

Binocular disparity and motion experiments. The backgrounds in our nulling experiments were modulated at 0.5 Hz, with a cone-contrast (dC) amplitude of 0.15. The 'moving-target' experiment was tested at one additional speed of 1 Hz. For the 'moving-background' experiment, the background moved laterally with an amplitude of either 0° or 8°. For the disparity experiment, the background size was reduced to 15° × 20° to fit the screen. In the 'moving-target' experiment, the background was locally uniform, the only chromatic variation being the vertical chromatic gradient applied to it. In the 'moving-background' and disparity experiments, we defined the background position using a low-contrast (±10% luminance) salt-and-pepper texture that we had previously shown to have little effect on chromatic induction. We investigated disparities of 0° and 0.5°, i.e. with the target in the same plane as its background, in front or behind. These experiments were programmed as Matlab mex functions, written in 'C' using the NVIDIA OpenGL library. The computer used was a GNU/Linux PC equipped with an NVIDIA GeForce2 MX/MX4 64 Mb graphics card with 'Digital Vibrance' disabled. The monitor refresh rate was 100 Hz.
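As a sketch of the analysis step just described, the following fits a Weibull-shaped psychometric function to the proportion of 'redder' responses as a function of the test square's cone-contrast and reads off the 50% point. The exact parameterization and fitting procedure used in the study are not given, so a two-parameter cumulative Weibull and a least-squares fit are assumed here, and the response data are invented; only the background shift of 0.1 is taken from the Methods above.

```python
# Fit a cumulative-Weibull psychometric function and extract its 50% point.
import numpy as np
from scipy.optimize import curve_fit

def weibull(x, scale, shape):
    """Cumulative Weibull rising from 0 to 1 over positive cone-contrast values."""
    return 1.0 - np.exp(-(x / scale) ** shape)

# Test-square cone-contrast (re: the neutral reference) vs proportion of 'redder' responses.
x = np.array([0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08])
p_redder = np.array([0.05, 0.10, 0.20, 0.45, 0.60, 0.80, 0.95, 1.00])

(scale, shape), _ = curve_fit(weibull, x, p_redder, p0=(0.04, 2.0))

# Solve 1 - exp(-(x/scale)^shape) = 0.5 for the 50% point.
x50 = scale * np.log(2.0) ** (1.0 / shape)
background_shift = 0.1   # total cone-contrast of the background shift (from the Methods)
print(f"50% point = {x50:.3f}, induction strength = {100 * x50 / background_shift:.0f}%")
```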
References Albright, T.D. and Stoner, G.R. (2002) Contextual influences on visual processing. Ann. Rev. Neurosci., 25: 339–379. Allman, J.M., Miezin, F.M. and McGuiness, E. (1985) Direction and velocity specific surrounds in area MT of the owl monkey. Perception, 14: 105–126. Beck, J. (1972) Surface Color Perception. Cornell University Press, London, pp. 145–151. Broerse, J., Vludsich, T. and O’Shea, R.P. (1999) Colour at edges and colour spreading in McCullough effects. Vision Res., 39: 1305–1320. Brown, R.O. and MacLeod, D.I.A. (1997) Color appearance depends on the variance of surround colors. Curr. Biol., 7: 844–849.
Conway, B. (2001) Spatial structure of cone inputs to color cells in alert macaque primary visual cortex (V-I). J. Neurosci., 21: 2768–2783. Crick, F. and Koch, C. (1995) Are we aware of neural activity in primary visual cortex? Nature, 375: 121–123. Desimone, R., Schein, S.J., Moran, J. and Ungerleider, L.G. (1985) Contour, color and shape analysis beyond the striate cortex. Vision Res., 25: 441–452. D’Zmura, M., Knoblauch, K., Henaff, A-M. and Michel, F. (1998) Dependence of color on context in a case of cortical color vision deficiency. Vision Res., 38: 3455–3459. Fairchild, M.D. and Reniff, L. (1995) Time course of chromatic adaptation for color-appearance judgements. J. Opt. Soc. Am. A, 12: 824–833. Finlayson, G., Drew, M. and Funt, B.V. (1993) Color constancy: diagonal transforms suffice. CSSI/LCCR TR9302. SFU, Centre for Systems Science, Burnaby, B.C., Canada. Foster, D.H. and Nascimento, S.M.C. (1994) Relational color constancy from invariant cone-excitation ratios. Proc. Roy. Soc. London B, 257: 115–121. Funt, B.V., Drew, M. and Ho, J. (1991) Color constant color indexing. Int. J. Comput. Vision, 6: 5–24. Gibbs, T. and Lawson, R.B. (1973) Simultaneous brightness contrast in stereoscopic space. Vision Res., 14: 983–987. Gilchrist, A. (1977) Perceived lightness depends on perceived spatial arrangement. Science, 95: 185–187. Gogel, W.C. and Mershon, D.H. (1969) Depth adjacency in simultaneous contrast. Percept. Psychophys., 5(1): 13–17. Hadjikhani, N., Liu, A.K., Dale, A.M., Cavanagh, P. and Tootell, R.B.H. (1998) Retinotopy and color sensitivity in human visual cortical area V8. Nat. Neurosci., 1: 235–241. Hanazawa, A., Komatsu, H. and Murakami, I. (2000) Neural selectivity for hue and saturation of colour in the primary visual cortex of the monkey. Eur. J. Neurosci., 12: 1753–1763. Hurlbert, A.C. (1998) Computational models of colour constancy. In: Walsh V. and Kulikowski J. (Eds.), Perceptual Constancy: Why things look as they do. Cambridge University Press, Cambridge, pp. 283–321. Hurlbert, A.C. and Poggio, T. (1989) A network for image segmentation using color. In: Touretzky D. (Ed.), Neural Information Processing Systems I. Morgan Kaufman Publishers, California, pp. 297–303. Hurlbert A. and Wolf K. (2002) The contribution of local and global cone-contrasts to colour appearance: a Retinex-like model. In B.E. Rogowitz and T.N. Pappas (Eds.), Human Vision and Electronic Imaging VII. Proceedings of SPIE 4662, pp. 286–297. Hurlbert, A.C., Bramwell, D.I., Heywood, C. and Cowey, A. (1998) Discrimination of cone contrast changes as evidence for colour constancy in cerebral achromatopsia. Exp. Brain Res., 123: 136–144. Hurlbert, A.C., Gigg, J., Golledge, H. and Tovee, M. (2001) Neurons are selective for local cone-contrast in Marmoset V1. Soc. Neurosci. Abs.
Johnson, E.N., Hawken, M.J. and Shapley, R. (2001) The spatial transformation of color in the primary visual cortex of the macaque monkey. Nature Neurosci., 4: 409–416. Kennard, C., Lawden, M., Morland, A.B. and Ruddock, K.H. (1995) Colour identification and constancy are impaired in a patient with incomplete achromatopsia associated with prestriate cortical lesions. Proc. Roy. Soc. London B, 260: 169–175. Kentridge, R.W., Cole, G.G. and Heywood, C.A. (2003) The primacy of chromatic edge processing in normal and cerebrally achromatopsic subjects. Prog. Brain Res., 144. Kraft, J.M. and Brainard, D.H. (1999) Mechanisms of color constancy under nearly natural viewing. Proc. Natl. Acad. Sci. USA, 96: 307–312. Laurinen, P.I., Olzak, L.A. and Peromaa, T.L. (1997) Early cortical influences in object segregation and the perception of surface lightness. Psychol. Sci., 8: 386–390. Lennie, P., Krauskopf, J. and Sclar, G. (1990) Chromatic mechanisms in striate cortex of macaque. J. Neurosci., 10: 649–669. Lee, H.C. (1986) Method for computing the scene-illuminant chromaticity from specular highlights. J. Opt. Soc. Am. A, 3: 1694–1699. MacEvoy, S.P. and Paradiso, M.A. (2001) Lightness constancy in primary visual cortex. Proc. Natl. Acad. Sci. USA, 98: 8827–8831. Mershon, D.H. (1972) Relative contributions of depth and directional adjacency to simultaneous whiteness contrast. Vision Res., 12: 969–979. Møller, P. and Hurlbert, A.C. (1996) Psychophysical evidence for fast region-based segmentation processes in motion and color. Proc. Natl. Acad. Sci. USA, 93: 7421–7426. Nascimento, S.M.C. and Foster, D.H. (2001) Detecting changes of spatial cone-excitation ratios in dichoptic viewing. Vision Res., 41: 2601–2606. Nothdurft, H.-C. (1994) Common properties of visual segmentation. In: Morgan M.J. (Ed.), Higher-order Processing in the Visual System, Ciba Foundation Symposium 184. Wiley, Chichester, UK, pp. 245–268. Rinner, O. and Gegenfurtner, K.R. (2000) Time course of chromatic adaptation for color appearance and discrimination. Vision Res., 40: 1813–1826. Rüttiger, L., Braun, D.I., Gegenfurtner, K.R., Petersen, D., Schoenle, P. and Sharpe, L.T. (1999) Selective color constancy deficits after circumscribed unilateral brain lesions. J. Neurosci., 19: 3094–3106. Rudd, M.E. and Arrington, K.F. (2001) Darkness filling-in: a neural model of darkness induction. Vision Res., 41: 3469–3662. Schein, S.J. and Desimone, R. (1990) Spectral properties of V4 neurons in the Macaque. J. Neurosci., 10: 3369–3389. Schirillo, J.A. and Shevell, S.K. (1993) Lightness and brightness judgements of coplanar retinally noncontiguous surfaces. J. Opt. Soc. Am. A, 10(12): 2442–2452.
Shady, S. and MacLeod, D.I.A. (2002) Color from invisible patterns. Nat. Neurosci., 5(8):729–730. Shepherd, A.J. (1997) A vector model of colour contrast in a cone-excitation colour space. Perception, 26: 455–470. Shevell, S.K. and Miller, P.R. (1996) Color perception with test and adapting lights perceived in different depth planes. Vision Res., 36(7): 949–954. Shevell, S.K. and Wei, J. (1998) Chromatic induction: border contrast or adaptation to surrounding light? Vision Res., 38: 1561–1566. Von Kries, J. (1906) Chromatic adaptation. In: MacAdam D.L. (Ed.), Sources of Color Science. MIT Press, Cambridge, MA, pp. 109–119. Wachtler, T., Albright, T.D. and Sejnowski, T.J. (2001) Nonlocal interactions in color perception: nonlinear processing of chromatic signals from remote inducers. Vision Res., 41: 1535–1546. Wachtler, T., Sejnowski, T.J. and Albright, T.D. (2003) Representation of color stimuli in awake macaque primary visual cortex. Neuron, 37: 681–691. Ware, C. and Cowan, W.B. (1982) Changes in perceived color due to chromatic interactions. Vision Res., 22: 1353–1362. Webster, M.A. (1996) Human colour perception and its adaptation. Network: Comput. Neural Sys., 7: 587–624. Werner, A., Sharpe, L.T. and Zrenner, E. (2000) Asymmetries in the time-course of chromatic adaptation and the significance of contrast. Vision Res., 40: 1101–1113. Wolf, K. and Hurlbert, A.C. (2002a) Chromatic texture influences chromatic contrast induction. In: ARVO Annual Meeting Abstr. Fort Lauderdale. Wolf, K. and Hurlbert, A.C. (2002b) Influences of chromatic texture on contrast induction. In: VSS Annual Meeting Abstr. Sarasota. Wolf, K. and Hurlbert, A. C. (2003) The effect of global contrast distribution on colour appearance. In: Mollon J. D., Pokorny J. and Kurblanch K. (Eds.), Normal and Defective Color Vision. Cambridge University Press, Cambridge, pp. 239–247. Zaidi, Q., Yoshimi, B. and Flannigan, J. (1991) Influence of shape and perimeter length on induced color contrast. J. Opt. Soc. Am. A, 8: 1810–1817. Zaidi, Q., Yoshimi, B., Flannigan, J. and Canova, A. (1992) Lateral interactions within color mechanisms in simultaneous induced contrast. Vision Res., 32: 1695–1707. Zeki, S. (1983a) Colour coding in the cerebral cortex: The reaction of cells in monkey visual cortex to wavelengths and colours. Neurosci., 9: 741–765. Zeki, S. (1983b) Colour coding in the cerebral cortex: The responses of wavelength-selective and colour-coded cells in monkey visual cortex to changes in wavelength composition. Neurosci., 9: 767–781.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 11
The primacy of chromatic edge processing in normal and cerebrally achromatopsic subjects
R.W. Kentridge*, G.G. Cole and C.A. Heywood
Department of Psychology, Science Laboratories, South Road, Durham DH1 3LE, UK
Abstract: The local chromatic contrast between surfaces in a visual scene plays an important role in theories of color perception. Our studies of cerebral achromatopsia suggest that this contrast signal is computed independently of the more complex processes such as edge integration and anchoring. We report a study in which we attempted to determine whether local-contrast signals also drove behavior in normal subjects. We sought to reduce the role of edge integration and anchoring by using stimuli whose background varied very gradually in color from top to bottom. The local chromatic contrast of patches relative to such backgrounds depends upon the position at which they are presented. It is therefore possible for patches with identical spectral composition to have opposite contrasts. We constructed stimuli in which two of three vertically arranged discs had the same contrast while the third had opposite contrast. The stimuli were also constructed so that the contrast-odd disc and one of the other two had identical spectral composition while the third disc had different composition. We used these stimuli in an attentional task where, after a brief delay, a letter discrimination target was presented in the location of one of the discs. Attention should automatically be attracted to the odd disc in such a display. Normal observers were faster at making the letter discrimination when the target appeared at the contrast-odd as opposed to spectrally odd location. We conclude that local chromatic contrast, but not raw spectral composition, is accessible to normal observers at an appropriate stage in visual processing to drive attention.
Introduction
*Corresponding author. Tel.: +44-191-334-0437; Fax: +44-191-334-3241; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14401-1
Our normal visual experience of the world is filled with color. Color is a property of surfaces, shadows and lights. These colors do not, however, correspond to the physical wavelengths of light striking the retina in as direct a way as might at first be imagined. Color is a distal, not a proximal percept. The perceived color of objects in the world is a property of those objects, not of the wavelength of light reflected from them to the retina. The light reaching our eyes from an object is dependent both on the surface-reflectance properties of the object and the light illuminating that object. Perceived color can be thought of as an estimate of surface-reflectance properties. It depends upon a complex relationship between spatial and temporal patterns of wavelength and intensity variation in the light reaching the retina that undoubtedly relies on specialized neural computations. Some of these computations serve simply to disentangle intensity and wavelength variation in the visual scene. Others allow the visual system to estimate the composition of the illuminant or to discount the effects of changes in the illuminant without explicit estimation. Signals derived from a visual scene which are invariant under changes of illumination provide a useful starting point for the perception of surface color. The wavelength of light reflected from surfaces themselves is, of course, not invariant under changes of illumination. If, however, one compares a pair
of surfaces then if the first surface reflects more long-wavelength light than the other under one illuminant, it will also reflect more long-wavelength light under a different illuminant (given a few realistic provisos about the nature of the illuminants). If the pair of surfaces abut then the comparison between reflectances is being made across a boundary or edge. Land's retinex theory of color vision takes such local contrasts across edges as its starting point (Land and McCann, 1971). It has recently been shown that there are cells in the striate cortex of macaques that respond selectively to the chromatic contrast across edges (Conway, 2001; Conway et al., 2002). The process of 'filling-in' the properties of a surface from its edges, as seen in the Craik–O'Brien–Cornsweet illusion (see Cornsweet, 1970), can be seen as an illustration of the way surface color perception relies on chromatic contrast across edges. The neurological condition of cerebral achromatopsia, however, provides some of the most compelling evidence that edge processing has a quite distinct role in addition to its status as a starting point for surface color perception. Cerebral achromatopsia is a neurological condition in which color vision becomes defective as a result of damage to ventromedial areas of cortex in the vicinity of the fusiform and lingual gyri (Meadows, 1974). Patients with complete cerebral achromatopsia report an absence of phenomenal color experience, do not name colored samples correctly and fail in color sorting and color oddity tasks. Failures in oddity and sorting tasks confirm that cerebral achromatopsia is a perceptual deficit rather than one of color naming or color memory. If cerebral achromatopsia involved a complete loss of wavelength-selective processing then one would expect that patients would fail to respond to wavelength-defined stimuli in which there was no luminance variation. For example, a subject incapable of extracting wavelength information from a stimulus should entirely fail to perceive a green disk against a red background if the disk and the background were of the same luminance. Despite their absence of color experience, cerebral achromatopsics do, however, respond to such stimuli. Moreover, this response is not simply behavioral. They do not just 'guess' reliably when a wavelength-defined stimulus is present in a display. It is obvious from
their reactions and reports that they clearly see such stimuli consciously. They cannot, however, explain what differs between the stimulus and its background. They appear to see the chromatic edge effortlessly despite having no experience of the two surfaces that comprise the edge. Together with Alan Cowey, we have investigated the residual edge-processing abilities in an achromatopsic patient, M.S., over many years. The ability of M.S. to perceive form from color was established by Heywood et al. (1994). In order to avoid the issue of exactly how one should equate the luminance of figure and ground with a specific achromatopsic patient they used the technique of random luminance masking. That is, the figure was composed of a set of rectangles all having the same chromaticity but differing in luminance, embedded in a background of rectangles all having another chromaticity and again all differing in luminance. In such a display the figure cannot be discriminated from the background on the basis of luminance cues as there are luminance edges through the display, not just at the borders between figure and ground. Heywood et al. (1991) showed that M.S. can not only use chromatic boundaries to perceive form but can also discriminate between boundaries. They found that M.S. could discriminate a block of abutting isoluminant squares ordered sequentially in terms of color from a block of the same squares in random order. One key difference between the ordered and random stimuli is that in the ordered stimulus the color contrasts between squares will always be low (because neighbouring squares will have similar colors) whereas the boundaries between squares in the randomly ordered stimulus will usually be higher and will be more variable. The fact that M.S. failed to make this discrimination if the squares comprising the blocks were not abutting emphasizes the role that edge information was playing in this residual ability. Heywood et al. (1998) demonstrated that wavelength-based form perception in M.S. is mediated by the color-opponent P-channel rather than the M-channel in which signals from the three different cone types are simply summed (see Chatterjee and Callaway, 2002 for evidence that S-cones contribute to the M-channel). It is not possible to silence
all M-channel cells even when using stimuli that minimize luminance responses psychophysically. The color-opponent system should, however, respond in a characteristic fashion to stimuli with a spectral distribution in which the wavelengths put into opponency are in balance. In a nutshell, a stimulus in which wavelengths producing excitation and inhibition balance will produce less response than one in which net excitation can be elicited in at least some cells. We were able to show that the same wavelength balance was needed to make such stimuli appear perceptually dimmer both to M.S. and to normal subjects, implying that at least the early stages of the color-opponent P-channel were intact in M.S. In our most recent work we have shown that M.S. can not only discriminate the magnitude of contrast across a chromatic border, but also aspects of its chromatic composition. For example, he can discriminate a red–yellow border from a green–yellow border. This ability disappears if the borders are obscured by black lines. In normal observers chromatic borders are just the starting point in color perception and we usually respond to the color derived from the border contrast rather than the local contrast per se. A pair of ripe oranges look the same color to us even if one is sitting on a yellow plate and the other on a green kitchen worktop. Nevertheless, the local chromatic contrasts at the borders between orange and yellow and between orange and green are quite different. The difference in local chromatic contrast only becomes directly perceptually apparent to us under special circumstances. If there is a very gradual transition between colors in the background the visual system may treat this transition as if it reflected a change in illumination rather than a change in background. A pair of stimuli with identical spectral compositions displayed against different parts of a graduated background will then elicit different color percepts in line with the different local contrasts they produce against their immediate backgrounds. We showed that both M.S. and normal observers responded on the basis of local contrast in a task using a graduated background. Normal observers' behavior switched to one consistent with object surface color when a background with abrupt transitions was used. M.S., however, persisted in responding on the basis of local chromatic contrast.
Although it is clear that against the graduated background both M.S. and the normal observers are responding on the basis of local contrast, the basis of the normal observers' responses in the other conditions is ambiguous. A response based on passing the local-contrast signal through further stages of processing in order to arrive at an estimate of surface reflectance would produce the choices observed. It is, however, also the case that responses based simply on wavelength, a signal simpler even than local contrast, would be equally consistent with the results. Zeki et al. (1999) suggest that cells in striate cortex do, indeed, show wavelength tuning. Such cells may coexist with the contrast-tuned cells found by Conway (2001). Alternatively, the apparent wavelength-based tuning may, however, be a consequence of the stimulus sizes used by Zeki (Maunsell and Newsome, 1987; Wachtler et al., 1999). Whether or not stimulus wavelength directly influences behavior can be disentangled experimentally by pitting wavelength and local contrast against one another as we did in the graduated-background task. In these tasks three disks are presented against the background. Two of them have the same contrast relative to the background immediately surrounding them and two have the same spectral composition. Simply asking normal subjects which is odd yields responses favoring contrast as the property that defines perceptual oddity. However, subjects sometimes find the task difficult, commenting that all three items look a little different from one another. We reasoned that it might be the case that an indirect test of oddity could reveal whether spectral composition could, indeed, be extracted late enough in the visual system to influence behavior. The indirect test we used was that of the color singleton paradigm (Jonides and Yantis, 1988) in which an odd-colored item (a 'color singleton') is presented on each trial but, importantly, the singleton is not relevant to the participant's task. There is considerable debate in the literature over whether a task-irrelevant unique item (i.e. a singleton) presented amongst an array of homogeneous items automatically captures visual spatial attention. Central to this debate is whether color as a singleton can elicit this type of stimulus-driven attentional capture. Clearly color, as with all other singletons, will automatically capture attention if it is relevant to a participant's task. For instance if an
observer is asked to detect a single red line amongst an array of green lines, the color singleton will be effortlessly detected. RT to detect the odd item will be independent of the number of green items. The singleton is said to 'pop out' (Treisman and Gelade, 1980). However, the question remains as to whether an odd item captures attention in the absence of a relevant attentional set. Initially, evidence suggested that a color singleton does not capture stimulus-driven attention (Jonides and Yantis, 1988). However, more recently Turatto and Galfano (2001) have provided data showing that color singletons can indeed capture attention. Participants searched for a target letter amongst distracters each of which occurred inside a small circle. One of the circles was a different color to the others. Unlike Jonides and Yantis, Turatto and Galfano found that response time to detect the target was reduced when it coincided with the color singleton. Using a variation of the color singleton task first employed by Gibson and Jiang (1998), Horstmann (2002) has also provided evidence that color singletons can capture bottom-up attention. Gibson and Jiang suggested that color may in fact be able to summon stimulus-driven attention but participants could habituate to the repeated appearance of color singletons that they know are task irrelevant. They reasoned that capture may occur for the initial (unexpected) presentation of a singleton but fail to do so after repeated presentations. The capture effect would therefore be lost in analysis when reaction times are averaged over the whole experiment. In order to assess this, Gibson and Jiang (1998) manipulated the color-singleton task such that a surprise singleton was presented at various points in the experiment. Although Gibson and Jiang found that surprise singletons did not summon attention, Horstmann (2002) has recently shown that they can. The question of whether color singletons can automatically capture attention has wider theoretical implications for the more general issue of whether the onset of a new object is particularly effective in attentional capture. Our brief review of the color singleton literature has shown how elusive the effect of color singletons on attentional capture has proved to be. Furthermore, data from Jonides and Yantis (1988) and Hillstrom and Yantis (1994) suggest that
luminance and motion singletons also do not accrue attention automatically. However, there exists one class of singleton which has consistently been shown to be effective in attentional capture: the onset singleton. As with the other singleton paradigms, the onset singleton task requires participants to search for a target letter presented amongst distracters. One of the letters appears abruptly and hence is referred to as a 'new' object or an onset, whilst all other letters are created by the offset of elements that camouflage existing letters, that is, by the transformation of already existing 'old' objects. Results show that when the target happens to coincide with the new onset, RT is reduced compared to when the target coincides with one of the old objects. Furthermore, Gellatly et al. (1999) and Gellatly and Cole (2000) have shown that the enhanced attentional facilitation for new objects does not rely on the luminance change that normally accompanies a new onset. In other words, luminance-change detection is not a necessary condition of capture. Finally, Cole et al. (2003) have demonstrated that object onset is less susceptible to 'change blindness' (Rensink, 2002) than many other changes that occur in this paradigm. It is the findings from the onset singleton task compared with findings from other singleton tasks, most notably color singleton, that have led Yantis (1993) to argue that the only feature that can elicit bottom-up attentional capture in the absence of a relevant attentional set is the onset of a new object. Put another way, object onset has a special status in attentional capture. Our assessment of wavelength and local-contrast processing will therefore also enable us to address the issue of chromatically based attentional capture. In the present study we compared performance on a letter-discrimination task in locations in which a cone-contrast or a spectral composition singleton had just been presented. If singletons in one of these two dimensions proved differentially effective in automatically capturing attention then letter discrimination in the location of the odd target in that dimension should be facilitated.
Stimuli
Two types of display were used in the experiment. In the first, three 2.3° diameter isoluminant-colored
disks were presented at an eccentricity of 8° to the right of fixation, spaced vertically with polar angles of –60°, 0° and +60° to the horizontal, against a black background. The borders between colored disks and the black background had identical large luminance contrast; however, as there was no chromatic content in the background there were no useable local chromatic contrast signals. The only dimension that could be used to define oddity was therefore the spectral content of the disks. On every trial two disks had the same content (the same color) and one an odd color. The odd disk was always presented in the uppermost or lowest of the three locations, never the middle. In the second display, disks with spectral compositions matching those used in the first display were presented against an isoluminant-graduated color background. The background was constructed so that all of the disks had the same magnitude of RMS cone contrast relative to their immediate background. The contrast of the spectrally odd disk and one of the other two disks was in the same direction (e.g. they were both redder than their background). For the third disk the contrast was in the opposite direction (e.g. the disk was greener than its background). The odd-contrast location in the graduated-background display was the same as the spectrally odd location in the matching black-background display. For illustration, Fig. 1 shows a luminance-modulated graduated-background display with an odd-contrast central disk in which the upper and central disks have the same luminance. Construction of the graduated-background display in which local cone-contrast magnitudes are precisely matched is not as trivial as it might first appear. It is not sufficient to take two isoluminant colors of equal and opposite cone contrasts from a midpoint as backgrounds and use the midpoint as spectrally identical but opposite-contrast targets. This is because the cone contrast of a stimulus is defined relative to its immediate background. The changes in cone excitation required to produce contrasts of equal magnitude against two differing backgrounds differ according to the composition of those backgrounds. In fact, in a cone space plane (x, y) one has to satisfy the following set
of expressions:

x_e = x_m + x_m c sin(θ_m)
y_e = y_m + y_m c cos(θ_m)
x_t = x_e + x_e c sin(θ_e)
y_t = y_e + y_e c cos(θ_e)

where (x_e, y_e) are the coordinates of a background color, (x_t, y_t) are the coordinates of a stimulus, and c is the cone contrast of the background colors from the midpoint, the cone contrast of a color (x_t, y_t) relative to (x_e, y_e) being defined as

sqrt[ ((x_t − x_e)/x_e)^2 + ((y_t − y_e)/y_e)^2 ].

θ_m is the angle, in the cone-contrast space (Δx/x_m, Δy/y_m), of (x_e, y_e) from the midpoint and θ_e is the angle, in the cone-contrast space (Δx/x_e, Δy/y_e), of (x_t, y_t) from one of the background colors. When both angles are on the isoluminant axis [by definition arctan(y/x) for a cone-contrast space centered on (x, y)] the cone-contrast magnitude at a background patch with coordinates (x_e, y_e) which has a
contrast c relative to the midpoint (x, y) required to satisfy the relations is given by:

2cy·cos(y/x) / { [y + cy·cos(y/x)] / sqrt(1 + (y − cy·cos(y/x))^2 / (x − cy·cos(y/x))^2) + [y − cy·cos(y/x)] / sqrt(1 + (y + cy·cos(y/x))^2 / (x + cy·cos(y/x))^2) }

We used this expression to construct stimuli against linearly graduated backgrounds with a central chromaticity of CIE x = y = 0.45, and a maximum RMS cone contrast from the top to the bottom of the screen of 25% in a cone-contrast plane of just L- and M-cone modulations. Stimuli were presented 1/4, 1/2, and 3/4 of the way along the gradation such that the odd-contrast stimulus could either have positive or negative L contrast and either be at the uppermost or lowest of the three locations. The gradation could either be from green to red or red to green running down the screen. The luminance of the disks and graduated background was 15 cd/m2. For the letter discriminations the same disks were white and contained centered Arial Bold lowercase letters subtending 1° vertically. Stimuli were generated using a Cambridge Research Systems VSG2/3 graphics system driving an Eizo F784T monitor at 100 Hz and responses were collected using a Cambridge Research Systems CB2 button box.
Fig. 1. An illustration of the effect of local contrast on the perception of surfaces against a gradually changing background. The uppermost and central disks are identical shades of gray. The uppermost and lowermost disks are both lighter than their immediate backgrounds while the central disk is darker than its background. The three disks are reproduced to the right of the graduated background to illustrate the strength of the background's effect.
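As a rough illustration of why the excitation changes depend on the immediate background (this sketch is not the authors' stimulus code; the chromaticities, the contrast value and the chosen direction are assumptions made for the example), the following MATLAB fragment applies the relations quoted above to two different backgrounds:

% Illustrative sketch only: the excursion in cone coordinates needed to give
% a disk a fixed RMS cone contrast depends on its immediate background, using
% the relations x_t = x_e + x_e*c*sin(theta), y_t = y_e + y_e*c*cos(theta).
cTarget = 0.05;                 % desired RMS cone contrast of each disk (assumed)
theta   = 3*pi/4;               % an assumed isoluminant direction (radians)

bgA = [0.47 0.43];              % background near the red end of the gradient
bgB = [0.43 0.47];              % background near the green end of the gradient

deltaA = cTarget * [bgA(1)*sin(theta), bgA(2)*cos(theta)];
deltaB = cTarget * [bgB(1)*sin(theta), bgB(2)*cos(theta)];

diskA = bgA + deltaA;           % disk coordinates with equal local contrast
diskB = bgB + deltaB;

fprintf('excursion against background A: (%+.4f, %+.4f)\n', deltaA);
fprintf('excursion against background B: (%+.4f, %+.4f)\n', deltaB);
fprintf('disk against A: (%.4f, %.4f); disk against B: (%.4f, %.4f)\n', diskA, diskB);
% The two excursions differ, so disks of identical spectral composition cannot
% simply be given equal-magnitude contrasts against different backgrounds; the
% background and disk coordinates have to be solved jointly, as in the text.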
Subjects and procedure
Subjects were 22 undergraduate volunteers with no reported color-vision deficiencies. Half of them completed the phase using black backgrounds first, the rest completed the graduated-background phase first. The subjects' task on each trial in both phases was to determine whether the letter 'p' or the letter 'q' occurred, indicating their choice as quickly as possible by pressing the left or right buttons on the response box, respectively. There were equal numbers of trials in which the spectrally odd (black-background phase) or contrast-odd (graduated-background phase) item was presented in the upper or lower locations. Within each location there were equal numbers of positive and negative L-contrast-odd items. The search target, 'p' or 'q', appeared equally often in each of the three locations, with distracters drawn randomly from the rest of the alphabet (apart from 'm' which was much wider than any other letter) being presented in the remaining two locations. There was therefore counterbalancing over 12 conditions [direction of gradation (2), odd-contrast location (2), and target location (3)] within the graduated-background phase, and matching trials within the black-background phase, with 30 instances of each condition, yielding 360 trials per phase. Presentation order was randomized for each phase for each subject. On each trial the subject initially viewed a 0.4° diameter white fixation spot against a uniform background, either black, for the black-background phase, or CIE(x, y) = (0.45, 0.45) with a luminance of 15 cd/m2 for the graduated-background phase. In the graduated condition the background was replaced by the graduated background, again with a fixation disk, after 1 s. After a further 250 ms, 2.3° diameter disks were presented in the three possible target locations. For the graduated-background phase the odd-contrast item was always at the top or bottom and the spectrally odd item was at the center. For the black background the locations of the odd-contrast item and the spectrally odd item were exchanged. This means that differences between reaction times to targets presented at top and bottom locations allow one to compare contrast oddity or spectral oddity in the same locations with the same control.
Results
We initially discarded all incorrect responses and all responses to targets from the central location, so that we could compare spectral oddity against a black background or contrast oddity against a graduated background with spectrally identical distracters. The median reaction times for all remaining trials in which the target appeared in the odd-item location, and for those in which the target appeared in the other location, were then entered into an analysis of variance for both phases of the experiment with the variables oddity and phase. Reaction-time experiments are susceptible to the effects of outliers, and taking medians is the most stringent procedure with which one can minimize their effects. It is, however, the case that statistically similar results are obtained
with less-stringent procedures (e.g. by trimming at means ± 2 standard deviations). The differences between reaction times in the odd-item and control locations are shown in Fig. 2. It can be seen that the reaction-time advantage in the graduated-background phase differs from zero, whereas there is no marked reaction-time advantage in the black-background phase. The analysis of variance confirms that oddity is only effective in speeding reaction time in subsequent letter discriminations when it is defined by contrast, not when it is spectral. There is a significant main effect of oddity (F(1, 21) = 5.41, P<0.05) and a significant interaction between oddity and background (F(1, 21) = 5.59, P<0.05) but no effect of background alone (F(1, 21) = 1.09, n.s.). Average reaction times for odd and control conditions were 745 and 740 ms, respectively (an RT advantage of −5 ms) against the black background and 705 and 722 ms (an RT advantage of +17 ms) against the graduated background.
Fig. 2. Average reaction-time advantage for targets in the odd location over targets in the control location against black and graduated backgrounds. Error bars are ±1 standard error of the mean.
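A minimal MATLAB sketch of this analysis, assuming a simple trial-by-trial data matrix rather than the authors' actual data format, might proceed as follows:

% Illustrative sketch only, not the authors' analysis code: per-subject median
% reaction times for targets at the odd and control locations, and the
% resulting RT advantage in each phase. Error trials and central-location
% targets are assumed to have been removed already; the rows of 'data' are
% made-up examples with columns [subject, phase, oddLocation, rt_ms]
% (phase 1 = black background, phase 2 = graduated background).
data = [ 1 1 1 712;  1 1 0 705;  1 2 1 690;  1 2 0 715; ...
         2 1 1 760;  2 1 0 755;  2 2 1 720;  2 2 0 741 ];

subjects  = unique(data(:,1));
advantage = zeros(numel(subjects), 2);      % columns: black, graduated

for s = 1:numel(subjects)
    for phase = 1:2
        keep      = data(:,1) == subjects(s) & data(:,2) == phase;
        rtOdd     = median(data(keep & data(:,3) == 1, 4));
        rtControl = median(data(keep & data(:,3) == 0, 4));
        advantage(s, phase) = rtControl - rtOdd;   % positive = faster at odd location
    end
end

fprintf('mean RT advantage, black background:     %.1f ms\n', mean(advantage(:,1)));
fprintf('mean RT advantage, graduated background: %.1f ms\n', mean(advantage(:,2)));
% The per-subject medians would then enter an analysis of variance with the
% factors oddity and phase, as in the analysis reported above.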
Discussion We found that odd spectral composition at a given location failed to produce a subsequent reaction time advantage there, and hence can be assumed not to capture attention automatically. Odd local contrast, on the other hand, produced a significant subsequent
reaction-time advantage when compared with a target presented in a location that had been odd neither in contrast nor spectral content. This suggests that local-contrast signals are accessible late enough in the visual system to trigger automatic attentional processes. Signals coding raw wavelength, although they are undoubtedly present early in the visual system (for example in the responses of single-opponent cells to stimuli bathing both their center and surround fields), do not appear to be accessible to automatic attentional processes. It is likely that attentional processes are cortically driven and the effectiveness of local contrast is consistent with evidence for local-contrast tuned cells in striate cortex (Conway, 2001; Conway et al., 2002). While the current results cannot rule out the existence of specifically wavelength-tuned cells in cortex (as opposed, for example, to reflectance-tuned cells whose response is derived from contrast signals), they do imply that signals from any such cells do not have a strong direct interaction with higher visual processes. We consider the results from the color-contrast condition as one of the few but growing number of reports demonstrating the automatic selection of a color singleton. It may be somewhat unsurprising that previous research has failed to show consistent attentional capture by color singletons given our findings that color singletons defined in two different ways (spectral composition and local contrast) reveal differing effects. This demonstrates that color-singleton tasks can be sensitive to extreme subtleties of procedure and stimulus display. Our results therefore also suggest that object onset is not the only feature singleton that can elicit bottom-up attentional capture (Yantis, 1993). Our data also shed light on recent reports of 'surprise' color singletons. As reviewed in the introduction, Gibson and Jiang (1998) suggested that color singletons could, in principle, capture attention but observers might habituate to their repeated presentation. Horstmann (2002) has recently shown that surprise singletons can indeed capture attention. However, our data demonstrate that surprise singletons need not be used to elicit attentional capture. Rather than there being anything particularly important about surprise singletons, we argue that the procedure simply increases the sensitivity of measurements aimed at indexing attentional capture by oddity.
The size of the reaction-time advantage upon which we are drawing these conclusions is certainly small at 17 ms. Turatto and Galfano (2001) found reaction-time advantages ranging between 5 and 75 ms when they assessed reaction times to detect an odd letter in a circular array of eight items. They compared times taken to make the letter discrimination when the target letter appeared against an odd-color disk with the times taken when the odd disk was one, two, or three locations away from the odd letter. The reaction-time advantage depended crucially upon two factors: first, the distance between the odd-color location and the target-letter location and, second, the subjects' experience with the task. They suggest that subjects' attention is initially captured by the task-irrelevant color singleton but that as subjects learn that color oddity is not predictive of target location, the extent to which color oddity captures attention diminishes. This diminution of attentional capture was evident when comparing an initial session of 512 trials with a second similar session the following day. Our subjects completed comparable numbers of trials (360 per session) with half of them undertaking the graduated-background task (where we found our reaction-time advantage) after a block of black-background trials. The letter targets we compared always had one intervening location between them which, again, Turatto and Galfano (2001) found decreased reaction-time advantage in comparison to a situation where target locations were adjacent. In conclusion, a 17-ms reaction-time advantage for attentional capture by local-contrast oddity does not appear to be exceptionally low given the experimental design. There are no obvious sources of cueing apart from the location of the odd-contrast disk: the relationships between odd-contrast polarity, direction of background gradation, odd-contrast location, and target location were all fully counterbalanced. The only unavoidable difference between the black and graduated-background conditions that might affect cueing is the fact that the black background remained constant throughout a test session whereas the graduated background could change from trial to trial. It is, however, hard to conceive of a way in which any temporal-alerting function provided by a change in background preceding presentation of the disks could independently confer a selective advantage on the odd-contrast location.
Finally, we can address some issues regarding the black-background condition in which spectral oddity did not confer a reaction-time advantage on targets presented at the odd location. Although we have characterized this condition as one in which color-contrast signals are not present, this may not strictly be true. A number of studies have shown that wavelength comparisons made over long ranges influence neural responses, for example, Wachtler et al. (2001) showed that patches up to 10° away from a target stimulus can influence the perception of the target (but only when they accompany a change in background color). Neural mechanisms that might underlie such nonlocal influences can be identified both in extrastriate regions (e.g. Schein and Desimone, 1990) and in striate cortex (Wachtler et al., 1999). Although interactions may therefore be occurring between the processing of the three disks in the black-background condition we can conclude that neither oddity in these interactions nor spectral oddity per se are effective in automatically capturing attention in our experiment. To summarize, in a retinex color-perception process local-contrast signals are initially derived from the raw-wavelength input; subsequent, more complex processes, which involve integration of those signals together with anchoring, provide estimates of the reflectance properties of objects in the world and the subjective experience of color. M.S. has apparently lost the ability to carry out these latter stages but does have access to local chromatic contrast signals. We asked the question whether access to these local-contrast signals is purely a consequence of the damage to M.S. or if they were also accessible in the normal visual system. When stimuli provide less information than would normally be available (by virtue of the paucity of edge information against a graded background) do subjects access local-contrast signals from an intermediate point in color processing or does access fall back to earlier 'raw'-wavelength signals? Our finding that automatic orientation of visual attention was driven by local-contrast oddity rather than wavelength oddity suggests the former. Local-contrast signals either drive attentional processes at an early stage or at least pass 'unmodified' through the later stages of color processing and drive attention from there.
It is noteworthy that the local chromatic contrast signal that we assume is responsible for attentional capture in the graduated-background task, and which is absent in the black-background task, is not a necessary condition in the process that gives rise to color experience — the red and green disks in the black background look quite different from one another. The subjective experience of color and its psychological consequences are clearly dissociable in normal subjects.
Acknowledgments We would like to acknowledge the support of MRC component project grant G0000679 and the help of patient M.S. RWK is the University of Durham Sir Derman Christopherson Foundation Fellow.
References Chatterjee, S. and Callaway, E.M. (2002) S cone contributions to the magnocellular visual pathway in macaque monkey. Neuron, 35: 1135–1146. Cole, G.G., Kentridge, R.W., Gellatly, A.R.H. and Heywood, C.A. (2003) Detectability of onsets versus offsets in the change detection paradigm. J. Vis., 3: 22–31 (http:// journalofvision.org/3/1/3/, DOI 10.1167/3.1.3). Conway, B.R. (2001) Spatial structure of cone inputs to color cells in alert macaque primary visual cortex (V1). J. Neurosci., 21: 2768–2783. Conway, B.R., Hubel, D.H. and Livingstone, M.S. (2002) Color contrast in macaque V1. Cereb. Cortex, 12: 915–925. Cornsweet, T.N. (1970). Visual Perception. Academic Press, London. Gellatly, A.R.H. and Cole, G.G. (2000) Accuracy of target detection in new-object and old-object displays. J. Exp. Psychol.: Hum. Percept. Perform., 26: 889–899. Gellatly, A.R.H., Cole, G.G. and Blurton, A. (1999) Do equiluminant object onsets capture visual attention? J. Exp. Psychol.: Hum. Percept. Perform., 25: 1609–1624. Gibson, B.S. and Jiang, Y. (1998) Surprise! An unexpected color singleton does not capture attention in visual search. Psychol. Sci., 9: 176–182.
Heywood, C.A., Cowey, A. and Newcombe, F. (1991) Chromatic discrimination in a cortically colour blind observer. Eur. J. Neurosci., 3: 802–812. Heywood, C.A., Cowey, A. and Newcombe, F. (1994) On the role of parvocellular (P) and magnocellular (M) pathways in cerebral achromatopsia. Brain, 117: 245–254. Heywood, C.A., Kentridge, R.W. and Cowey, A. (1998) Form and motion from colour in cerebral achromatopsia. Exp. Brain Res., 123: 145–153. Hillstrom, A.P. and Yantis, S. (1994) Visual motion and attentional capture. Percept. Psychophys., 55: 399–411. Horstmann, G. (2002) Evidence for attentional capture by a surprising color singleton in visual search. Psychol. Sci., 13: 499–505. Jonides, J. and Yantis, S. (1988) Uniqueness of abrupt visual onset in capturing attention. Percept. Psychophys., 43: 346–354. Land, E.H. and McCann, J.J. (1971) Lightness and retinex theory. J. Opt. Soc. Am., 61: 1–11. Maunsell, J.H.R. and Newsome, W.T. (1987) Visual processing in monkey extrastriate cortex. Ann. Rev. Neurosci., 10: 363–401. Meadows, J.C. (1974) Disturbed perception of colours associated with localized cerebral lesions. Brain, 97: 615–632. Rensink, R.A. (2002) Change detection. Ann. Rev. Psychol., 53: 245–277. Schein, S.J. and Desimone, R. (1990) Spectral properties of V4 neurons in the Macaque. J. Neurosci., 10: 3369–3389. Treisman, A. and Gelade, G. (1980) A feature integration theory of attention. Cognit. Psychol., 12: 97–136. Turatto, M. and Galfano, G. (2001) Attentional capture by color without any relevant attentional set. Percept. Psychophys., 63: 286–297. Wachtler, T., Sejnowski, T.J. and Albright, T.D. (1999) Responses of cells in macaque V1 to chromatic stimuli are compatible with human color constancy. Soc. Neurosci. Abstr., 25: 4. Wachtler, T., Albright, T.D. and Sejnowski, T.J. (2001) nonlocal interactions in color perception: nonlinear processing of chromatic signals from remote inducers. Vis. Res., 41: 1535–1546. Yantis, S. (1993) Stimulus-driven attentional capture and attentional control settings. J. Exp. Psychol.: Hum. Percept. Perform., 19: 676–681. Zeki, S., Aglioti, S., McKeefry, D. and Berlucchi, G. (1999) The neurological basis of conscious color perception in a blind patient. Proc. Natl. Acad. Sci. USA, 96: 14124–14129.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 12
Neuroimaging studies of attention and the processing of emotion-laden stimuli
Luiz Pessoa* and Leslie G. Ungerleider
Laboratory of Brain and Cognition, Department of Health and Human Services, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415, USA
Abstract: Because the processing capacity of the visual system is limited, selective attention to one part of the visual field comes at the cost of neglecting other parts. In this paper, we review evidence from single-cell studies in monkeys and functional magnetic resonance imaging (fMRI) studies in humans for neural competition and how competition is biased by attention. We suggest that, at the neural level, an important consequence of attention is to enhance the influence of behaviorally relevant stimuli at the expense of irrelevant ones, providing a mechanism for the filtering of distracting information in cluttered visual scenes. Psychophysical evidence suggests that processing outside the focus of attention is attenuated and may be even eliminated under some conditions. A major exception to the critical role of attention may be in the neural processing of emotion-laden stimuli, which are reported to be processed automatically, namely, without attention. Contrary to this prevailing view, in a recent study we found that all brain regions responding differentially to faces with emotional content, including the amygdala, did so only when sufficient resources were available to process those faces. After reviewing our findings, we discuss their implications, in particular (1) how emotional stimuli can bias competition for processing resources; (2) the source of the biasing signal for emotional stimuli; (3) how visual information reaches the amygdala; and finally (4) the relationship between attention and awareness.
Over the past 25 years, a great deal has been learned about the neural mechanisms of visual attention. Converging evidence from single-cell recording studies in monkeys and neuroimaging and event-related potential studies in humans has shown that the processing of attended information is enhanced relative to the processing of unattended information (Desimone and Duncan, 1995; Hillyard and Anllo-Vento, 1998; Kastner and Ungerleider, 2001). At the same time, there is increasing evidence indicating that a network of frontal and parietal areas is critical for the control of attention and is
thought to provide the top-down signals that modulate activity within visual processing regions (Mesulam, 1998; Hopfinger et al., 2001; Kastner and Ungerleider, 2001; Nobre, 2001; Corbetta and Shulman, 2002). Because the processing capacity of the visual system is limited, selective attention to one part of the visual field comes at the cost of neglecting other parts (Broadbent, 1958). Thus, several investigators have proposed that there is competition for processing resources (Grossberg, 1980; Bundesen, 1990; Desimone and Duncan, 1995). One instance of this proposal is the biased competition model of attention, as developed by Desimone and Duncan (1995). According to this model, the competition among stimuli for neural representation, which occurs within visual cortex itself, can be biased in several ways. One way is by bottom-up sensory-driven mechanisms,
*Corresponding author. Laboratory of Brain and Cognition, 49 Convent Drive, Building 49, Room 1B80, National Institute of Mental Health, NIH, Bethesda, MD 20892-4415, USA. Tel.: +1-301-496-5625, extn. 275; Fax: +1-301-402-0046; E-mail:
[email protected] DOI: 10.1016/S0079-6123(03)14401-2
such as stimulus salience. For example, stimuli that are colorful or of high contrast will be at a competitive advantage. But, another way is by attentional top-down feedback, which is generated in areas outside the visual cortex. For example, directed attention to a particular location in space facilitates processing of stimuli presented at that location. Stimuli surviving the competition for neural representation will have further access to memory systems for mnemonic encoding and retrieval and to motor systems for guiding action and behavior. Evidence for neural competition and how competition is biased by attention comes from single-cell studies in monkeys and functional magnetic resonance imaging (fMRI) studies in humans (for a recent review, see Pessoa et al., 2002a). It has been shown, for example, that the neuronal response to a single effective stimulus in extrastriate area V4 is reduced when an additional, ineffective stimulus is present in the receptive field (Reynolds et al., 1999); see Fig. 1A. The reduced response to the paired stimuli suggests that the two stimuli within the receptive field interact with each other in a mutually
suppressive way. Such correlates of competitive interactions have been observed in many visual processing areas (Moran and Desimone, 1985; Miller et al., 1993; Rolls and Tovee, 1995; Recanzone et al., 1997). In human cortex, fMRI studies have revealed similar competitive interactions (Kastner et al., 1998). Single-cell recording studies have also demonstrated that spatially directed attention can bias the competition among multiple stimuli in favor of one of the stimuli by modulating competitive interactions. In particular, in extrastriate areas V2 and V4 it has been shown that spatially directed attention to an effective stimulus within a neuron’s receptive field counteracts the suppressive influence induced by a second, ineffective stimulus presented within the same receptive field (Reynolds et al., 1999; see Fig. 1B). Again, recent fMRI studies indicate that similar mechanisms occur in the human extrastriate visual cortex (Kastner et al., 1998). As in monkeys, the effect of spatially directed attention in humans is to reduce the suppressive effect exerted by multiple competing visual stimuli.
Fig. 1. The attentional modulation of competitive interactions within a V4 receptive field (RF). Examples show a typical V4 cell's response to an effective ('good') stimulus (green) alone, or to paired effective and ineffective ('poor', red) stimuli in the RF. (A) Attention outside the RF. (B) Attention inside the RF to the good stimulus. Dotted region indicates the cell's RF. Cone indicates location of attention. Data courtesy of Drs. Reynolds and Desimone.
It therefore appears that, at the neural level, an important consequence of attention is to enhance the influence of behaviorally relevant stimuli at the expense of irrelevant ones, providing a mechanism for the filtering of distracting information in cluttered visual scenes.
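The pattern just described is often summarized with a simple weighted-average account, in which the paired response is a weighted mean of the responses to each stimulus alone and attention shifts the weights. The following MATLAB sketch is only a toy illustration of that idea, with made-up numbers, and is not a model taken from the studies cited above:

% Toy illustration only: a weighted-average account of competitive
% interactions and their attentional bias. The response to a stimulus pair
% lies between the responses to each stimulus alone, and attention shifts
% the weighting toward the attended stimulus.
rGood = 40;     % assumed firing rate (spikes/s) to the effective stimulus alone
rPoor = 10;     % assumed firing rate to the ineffective stimulus alone

pairResponse = @(wGood) wGood * rGood + (1 - wGood) * rPoor;

fprintf('pair, attention directed elsewhere:  %.0f spikes/s\n', pairResponse(0.5));
fprintf('pair, attend effective stimulus:     %.0f spikes/s\n', pairResponse(0.9));
fprintf('pair, attend ineffective stimulus:   %.0f spikes/s\n', pairResponse(0.1));
% With equal weights the paired response (25 spikes/s) is suppressed relative
% to the effective stimulus alone (40 spikes/s); biasing the weight toward the
% attended stimulus approximates the response to that stimulus on its own.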
Attention is needed to process visual stimuli An implicit prediction of the biased competition model is that only items that survive the competition for neural representation in visual processing areas will impact on subsequent memory and motor systems. A related, but stronger, proposal has been advanced by Lavie (1995), who has suggested that the extent to which unattended objects are processed depends on the available capacity of the visual system. If, for example, the processing load of a target task exhausts available capacity, then stimuli irrelevant to that task would not be processed at all. Hence, perceptually such stimuli may not even reach awareness. Consistent with this idea, psychophysical studies in the past decade have demonstrated that processing outside the focus of attention is attenuated and may be eliminated under some conditions. Rock and colleagues showed that even the simplest visual tasks are compromised when attention is taken up elsewhere (Rock et al., 1992), a phenomenon they termed ‘inattentional blindness’. Further, in a striking demonstration, Joseph et al. (1997) showed that so-called ‘preattentive’ tasks, such as orientation pop-out, require attention to be successfully performed. The necessity of attention for perception is perhaps most compellingly illustrated by ‘change blindness’ studies (Rensink et al., 1997; Simons and Levin, 1997; Rensink, 2002), in which subjects may miss even very large changes in complex scenes, provided the changes are not associated with stimulus transients that capture attention. But what is the fate of unattended stimuli? As mentioned above, in extrastriate areas V2 and V4, single-cell studies in monkeys have shown that when an effective and ineffective stimulus are placed within a neuron’s receptive field, spatially directed attention to the effective stimulus results in a response similar to the one elicited by the effective stimulus when presented alone. Remarkably, spatially directed
attention to the ineffective stimulus results in a response similar to the one elicited by the ineffective stimulus when presented alone. In essence, it is as if the unattended stimulus, be it the effective or ineffective one, were not in the receptive field (Reynolds et al., 1999). These findings suggest that, at the neural level, responses evoked by unattended items may be eliminated. Such an interpretation is consistent with fMRI studies demonstrating that the stimulus-evoked fMRI response is essentially abolished when subjects are engaged in a competing task with high attentional load. In one study, Rees et al. (1997) showed that moving stimuli did not elicit fMRI activation in area MT when subjects performed a concurrent, highly demanding linguistic task. In a related study, Rees et al. (1999) showed that activations associated with words were not elicited when subjects performed a concurrent, highly demanding object working memory task. Thus, like the processing of visual motion, even word processing seems to require attention, contrary to claims for full automaticity (Van Orden et al., 1988; Menard et al., 1996).
Is attention necessary for the processing of emotion-laden faces? A major exception to the critical role of attention may be in the neural processing of emotion-laden stimuli, which are reported to be processed automatically, namely, without attention (Vuilleumier et al., 2001a; Ohman, 2002). For example, subjects exhibit fast, involuntary autonomic responses to emotional stimuli, such as aversive pictures or faces with fearful expressions (Wells and Matthews, 1994; Globisch et al., 1999). Other behavioral studies suggest that such autonomic responses to facial expressions occur not only ‘automatically’ (Stenberg et al., 1998) but may even take place without conscious awareness (Ohman et al., 1995). This conclusion is also supported by imaging studies of the neural processing of emotional stimuli in the amygdala, a structure that is known to be important in emotion, particularly the processing of fear (LeDoux, 1996; Aggleton, 2000; Lane and Nadel, 2000). Such studies report that the amygdala is activated not only when normal subjects view fearful faces, but even when these stimuli are masked and
subjects appear to be unaware of their occurrence (Morris et al., 1998; Whalen et al., 1998). Using the backward masking paradigms developed by Ohman and colleagues (Esteves and Ohman, 1993), Whalen et al. (1998) showed that fMRI signals in the amygdala were significantly larger during the viewing of masked, fearful faces than during the viewing of masked, happy faces. In another study, Morris et al. (1998) combined backward masking with classical conditioning to investigate responses to perceived and nonperceived angry faces. Although the participants never reported seeing the masked, angry stimuli, the contrast of conditioned and nonconditioned masked, angry faces activated the right amygdala. The view has thus emerged that the amygdala is specialized for the fast detection of emotionally relevant stimuli in the environment, and that this can occur without attention and even without conscious awareness. If this were indeed the case, amygdala activity would reflect an obligatory response independent of the locus of spatial attention. Vuilleumier et al. (2001a) tested this prediction in an fMRI study in which subjects fixated a central cue and matched either two faces or two houses presented eccentrically. Both fearful and neutral faces were utilized. As in earlier studies (Haxby et al., 1994; Wojciulik et al., 1998), activity in the fusiform gyrus, which is known to respond strongly to faces (Haxby et al., 2000), was modulated by attention. At the same time, Vuilleumier et al. failed to see evidence that attention modulated responses in the amygdala, regardless of stimulus valence. Not surprisingly, these results were interpreted as further evidence for obligatory activation of the amygdala by negative stimuli.
A strong test of automatic amygdala activation In a recent study (Pessoa et al., 2002b), we tested the alternative possibility, namely, that the neural processing of stimuli with emotional content is not automatic and instead requires some degree of attention, similar to the processing of neutral stimuli. We hypothesized that the failure to modulate the processing of emotional stimuli by attention in previous studies was due to a failure to fully engage attention by a competing task. In other words, activation in the amygdala by emotional stimuli should resemble activation in MT to moving stimuli;
if the competing task is of high load, activation should be reduced or absent. We therefore employed fMRI and measured activations in the amygdala and other brain regions that responded differentially to faces with emotional expressions compared to neutral faces and then examined how those responses were modulated by attention. We measured fMRI responses evoked by pictures of faces with fearful, happy, or neutral expressions when attention was focused on them (attended condition), and compared the responses evoked by the same stimuli when attention was directed to oriented bars (unattended condition). In designing our bar orientation task, we chose one that was sufficiently demanding to exhaust all attentional resources on that task and leave little or none available to focus on the faces, even though they were viewed foveally during the bar orientation task. We found that attended compared to unattended faces evoked significantly greater activations bilaterally in the amygdala for all facial expressions (Fig. 2A). Importantly, there was a significant interaction between stimulus valence and attention. That is, the differential response to stimulus valence was observed only in the attended condition (Fig. 2B). Moreover, for the unattended condition, responses to all stimulus types were equivalent and not significantly different from zero. Thus, amygdala responses to emotional stimuli are not automatic and instead require attention. Our findings are in direct contrast to those of Vuilleumier et al. (2001a), who failed to see evidence that attention modulated responses in the amygdala, regardless of stimulus valence. What is the explanation for their negative findings? The most likely explanation is that the attentional manipulation in the Vuilleumier et al. study was not as effective as in ours. For example, behavioral performance for the bar orientation task in our study and house matching in the Vuilleumier et al. study was 64% and 86% correct, respectively, indicating that our competing task was a more demanding one. In addition, in the Vuilleumier et al. study, reaction times while subjects matched the houses were slower when unattended faces were fearful than when they were neutral, demonstrating that faces interfered more effectively when they were fearful. This strongly suggests that they captured attentional resources away from the
Fig. 2. Attention is required for the processing of stimulus valence. (A) Arrows point to the amygdala. Attended faces compared to unattended faces evoked significantly greater activations for all facial expressions. The level of the coronal section is indicated on the small whole-brain inset. (B) Estimated responses for the left and right amygdala regions of interest as a function of attention and valence. FA: fearful attended; FU: fearful unattended; HA: happy attended; HU: happy unattended; NA: neutral attended; NU: neutral unattended. From Pessoa et al. (2002b).
processing of houses, such that there were spared resources to devote to the processing of the faces. In our study, by contrast, no difference in reaction time during the bar orientation task was observed as a function of the emotional content of the unattended face. This lack of interference by fearful faces is consistent with the idea that all processing capacity was exhausted by the bar orientation task. Thus, the difference between the two studies likely reflects the extent to which the competing tasks did or did not exhaust processing resources. In our study, during unattended conditions, responses evoked by different expressions were equivalent. These results demonstrate that the expression of a valence effect requires attention. However, the question remains concerning the level at which attention gates the processing of faces. Our results are consistent with the gating of information at intermediate stages of visual processing, as the responses in the fusiform gyrus were essentially eliminated when attention was devoted to the bar
orientation task. It should be pointed out that fMRI may not be sufficiently sensitive to pick up weak signals that might have been associated with unattended faces. Thus, further studies are needed to determine the neural stage at which face-expression information is gated. To summarize, contrary to the prevailing view, we found that the amygdala responded differentially to faces with emotional content only when sufficient attentional resources were available to process those faces. Indeed, when all attentional resources were consumed by another task, responses to faces were eliminated, consistent with Lavie's (1995) proposal that if the processing load of a target task exhausts available capacity, stimuli irrelevant to that task will not be processed. Moreover, we also found that other brain regions responding differentially to faces with emotional content, including the superior temporal sulcus, the orbitofrontal cortex, the fusiform gyrus, and even the cortex within and around the calcarine fissure, showed a similar dependency on attentional
resources. It therefore does not appear that faces with emotional expressions are a ‘privileged’ category of objects immune to the effects of attention. Like neutral stimuli, faces with emotional expressions must also compete for neural representation. This is illustrated within the context of the biased competition model of attention in Fig. 3.
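The attention-by-valence interaction described above can be made concrete with a small numerical sketch. The code below is purely illustrative: the per-subject response estimates for the conditions of Fig. 2B are invented, and the contrast-based test is only one of several ways such an interaction might be assessed; it is not the analysis used in the original study.

import numpy as np
from scipy import stats

# Hypothetical per-subject response estimates (arbitrary units) for a
# valence (fearful, neutral) x attention (attended, unattended) design;
# one value per subject per condition. These numbers are invented.
rng = np.random.default_rng(0)
n_subjects = 12
fearful_attended   = 0.60 + 0.10 * rng.standard_normal(n_subjects)
neutral_attended   = 0.35 + 0.10 * rng.standard_normal(n_subjects)
fearful_unattended = 0.02 + 0.10 * rng.standard_normal(n_subjects)
neutral_unattended = 0.01 + 0.10 * rng.standard_normal(n_subjects)

# Valence effect (fearful minus neutral) computed separately per attention condition.
valence_attended   = fearful_attended - neutral_attended
valence_unattended = fearful_unattended - neutral_unattended

# Interaction contrast: is the valence effect larger when the faces are attended?
interaction = valence_attended - valence_unattended
t, p = stats.ttest_1samp(interaction, popmean=0.0)
print(f"mean interaction contrast = {interaction.mean():.3f}, t = {t:.2f}, p = {p:.4f}")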
Emotional stimuli can bias competition for processing resources Although our results indicate that attentional resources are required for processing stimulus valence, they do not imply that humans are unable to respond to potential threats outside the focus of attention or that the amygdala only responds to attended stimuli. Indeed, if attentional resources are not exhausted, then even ignored items of neutral valence can attract attention and interfere with ongoing processing (Yantis and Johnson, 1990; Lavie and Tsal, 1994). Moreover, numerous studies have demonstrated that negative stimuli are a more effective source of involuntary interference to ongoing tasks than neutral and positive ones (Hartikainen et al., 2000; Tipples and Sharma, 2000; Vuilleumier et al., 2001a), and more readily recruit attention (Pratto and John, 1991; Bradley et al., 1997; Eastwood et al., 2001). It therefore appears that emotional (especially negative) stimuli can bias the competition for processing resources, such that they are at a competitive advantage compared to neutral stimuli. If so, then just as attention enhances activity within visual cortex to items at attended locations, so too should emotional pictures evoke stronger responses in visual cortex than neutral ones. This is indeed the case. We and others have found that both posterior visual processing areas, such as the occipital gyrus, and more anterior, ventral temporal regions, such as the fusiform gyrus, exhibit differential activation when emotional and neutral pictures are contrasted (Breiter et al., 1996; Lang et al., 1998; Lane et al., 1999; Simpson et al., 2000; Moll et al., 2002). Remarkably, we also obtained evidence for valence-dependent responses in and around the calcarine fissure (V1/V2). It therefore appears that, like attentional modulation of activity in visual cortex, emotional modulation can provide a top-down influence on very early processing areas.
Fig. 3. Biased competition model of visual attention and the processing of emotion-laden stimuli. Facial expressions must compete for neural representation (see arrow labeled ‘stimulus valence’) just as neutral stimuli do. From Pessoa et al. (2002a).
In sum, just as attention favors the processing of attended items, so too does emotional valence favor the processing of the stimuli that carry it. We therefore hypothesize that the increased activation produced by emotional stimuli in visual cortex reflects an emotional modulation by which the processing of this stimulus category is favored over that of neutral stimuli.
What is the source of the biasing signal for emotional stimuli? In the past decade or so, the amygdala has been shown to be a critical node in a circuit mediating the processing of stimulus valence, notably fear. Because of its widespread projections to cortical sensory processing areas (Amaral et al., 1992), it has been suggested that the amygdala may be the source of modulation of activity evoked by emotional stimuli. Consistent with this proposal, Morris et al. (1999) found that amygdala signals covary with signals from visual areas in a condition-dependent manner. Such changes in ‘functional connectivity’ highlight changes in the coupling between brain regions. In their study, the correlation between amygdala and visual cortical activity increased when subjects viewed fearful faces compared to happy ones. In our study, we also observed increased coupling during attended compared to unattended trials between the amygdala and visual areas, including the superior temporal sulcus, the middle occipital, and the fusiform gyri. Interestingly, we found increased amygdala coupling with the calcarine fissure, which is consistent with
projections from the amygdala to very early visual areas, including V1 and V2 (Amaral et al., 1992). Increased coupling was not restricted to visual processing regions, however, but also included the orbitofrontal and parietal cortex. While the results from our study and others (see also Rotshtein et al., 2001) are consistent with a modulatory role for the amygdala, the type of analysis employed (based on activity covariation) cannot determine the direction of the interaction. More direct evidence that the amygdala is a source of emotional modulation comes from a recent study by Anderson and Phelps (2001), who showed that patients with bilateral amygdala lesions did not show an advantage at detecting word stimuli with aversive content compared to neutral content, in stark contrast to the behavior of normal subjects. Emotional modulation can potentially be implemented in one of two ways. First, it could rely on direct feedback projections from the amygdala to visual-processing areas (Amaral et al., 1992), as illustrated in Fig. 4. Once the amygdala attributes valence or significance to an incoming stimulus, it would be in a privileged position to influence visual processing along the entire ventral, occipitotemporal processing stream. If this view is correct, then in patients with bilateral amygdala lesions, visual responses evoked by emotional stimuli should be essentially equivalent to responses evoked by neutral stimuli, unlike in normal controls in whom emotional stimuli elicit stronger responses. A second possibility is that the amygdala could modulate activity within visual processing areas via its projections to frontal sites that control the allocation of attentional resources, including dorsolateral prefrontal and anterior cingulate cortex (Amaral et al., 1992). In this view, emotional modulation would essentially be a form of attentional modulation in which the valence of the stimulus would serve to rapidly inform attentional control regions of a potentially important stimulus. This latter situation would, in turn, be closer to processes of ‘exogenous’ attention (Corbetta and Shulman, 2002) in which stimulus salience is able to direct attention to its location. If emotional modulation were indeed dependent on attentional circuits in frontal cortex, then patients with frontal lesions but spared amygdalas should exhibit little or no differential activity in visual
Fig. 4. Emotional modulation. The amygdala receives highly processed visual input from inferior temporal areas TEO and TE. At the same time, the amygdala projects to several levels of visual processing, including areas as early as V1, which allows it to influence visual processing according to the valence of the stimulus. Note that the amygdala is also interconnected with, among other regions, the orbitofrontal cortex, another brain structure important for the processing of ‘stimulus significance’. Brain regions: Green: occipitotemporal visual processing areas; Orange: posterior superior temporal sulcus; Red: amygdala (note that the amygdala is not visible from a lateral view of the brain; instead it is situated subcortically near the brain’s medial surface); Blue: orbitofrontal cortex (note that important orbitofrontal regions are situated along the midline, and hence are not visible from a lateral view of the brain). From Pessoa et al. (2002a).
processing areas as a function of stimulus valence. Of course, it is possible that the implementation of emotional modulation involves both direct amygdala feedback projections to visual cortex and indirect circuits encompassing the frontal lobe. Finally, although we have emphasized the role of the amygdala in the attribution of stimulus valence, several brain regions, including the orbitofrontal and ventromedial prefrontal cortices, might act in concert with the amygdala in determining the behavioral and social significance of incoming stimuli (Bechara et al., 2000).
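The 'functional connectivity' analyses mentioned in this section amount to asking whether the correlation between two regions' signals changes with the experimental condition. As a rough, purely illustrative sketch, and not the analysis used in the studies cited, the code below correlates hypothetical amygdala and fusiform region-of-interest time courses separately for attended and unattended trials and compares the two correlations after Fisher z-transformation.

import numpy as np

def fisher_z(r):
    # Fisher r-to-z transform, used when comparing correlation coefficients.
    return np.arctanh(r)

# Hypothetical ROI values (one per trial) for two conditions. In a real
# analysis these would be extracted from the fMRI data; here they are
# simulated so that coupling is stronger on attended trials.
rng = np.random.default_rng(1)
n_trials = 80
amygdala_att = rng.standard_normal(n_trials)
fusiform_att = 0.7 * amygdala_att + 0.5 * rng.standard_normal(n_trials)
amygdala_una = rng.standard_normal(n_trials)
fusiform_una = 0.1 * amygdala_una + 1.0 * rng.standard_normal(n_trials)

r_att = np.corrcoef(amygdala_att, fusiform_att)[0, 1]
r_una = np.corrcoef(amygdala_una, fusiform_una)[0, 1]

# Difference of Fisher-z values, scaled by its standard error, gives an
# approximate test of whether coupling differs between conditions.
se = np.sqrt(1.0 / (n_trials - 3) + 1.0 / (n_trials - 3))
z_diff = (fisher_z(r_att) - fisher_z(r_una)) / se
print(f"r(attended) = {r_att:.2f}, r(unattended) = {r_una:.2f}, z = {z_diff:.2f}")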
How does visual information reach the amygdala: slow-cortical and fast-subcortical pathways If the amygdala conveys valence to sensory stimuli, what is the pathway by which it receives its sensory inputs? There is evidence that two pathways exist. One is a cortical pathway that starts in early sensory regions, progresses through several intermediate
stages, and finally delivers highly processed sensory information to the amygdala (Amaral et al., 1992; Friedman et al., 1986). The existence of a parallel, subcortical pathway to the amygdala for auditory processing has been demonstrated by studies of fear conditioning to acoustic stimuli in rats and guinea pigs (LeDoux, 1995; Weinberger, 1995). It has thus been proposed that, in general, acoustic signals are transmitted via a ‘fast’, subcortical route, in addition to the ‘well processed’ signals that the amygdala receives via cortical projections (LeDoux, 1996). Several investigators have proposed that a fast, subcortical pathway also exists for the processing of face stimuli (Morris et al., 1999; de Gelder et al., 2001; Ohman, 2002). For example, Morris et al. (1999) have proposed that a retino-collicular-pulvinar-amygdala pathway provides the neural substrate for the automatic processing of facial expression (but see below regarding the associated anatomy). As stated by Vuilleumier et al. (2001a), a truly automatic pathway should not depend on attentional resources. Yet, in our study, we found a strong interaction between stimulus valence and attention, such that differential responses to emotional and neutral faces only occurred when subjects attended to the faces. Moreover, our results provided strong evidence that both occipitotemporal (including fusiform gyrus) and amygdala responses are eliminated when processing resources are exhausted. In fact, it is unclear how a proposed subcortical pathway would support the processing of the detailed form information required for face perception. For example, superior colliculus neurons resolve far lower spatial frequencies than neurons in the geniculostriate system (Miller et al., 1980; Rodman et al., 1989). Moreover, an anatomical substrate for the putative fast subcortical pathway has not been delineated. In primates, the superior colliculus projects to the inferior pulvinar, but projections to the amygdala from the pulvinar originate in the medial nucleus (Jones and Burton, 1976). Further, interconnections between the inferior and medial nuclei have not been described. And yet, studies with blindsight patient GY (who has a right hemianopia following left occipital lobe damage) reveal that he is able to discriminate emotional facial expressions presented in his blind hemifield (de Gelder et al., 1999), a phenomenon
called affective blindsight (de Gelder et al., 2000). GY has recently been scanned with fMRI in a study in which he was exposed to lateralized presentations of fearful or happy expressions in his blind and intact hemifields (Morris et al., 2001). Despite the absence of normal vision in his blind hemifield, fearful faces presented to that hemifield elicited enhanced amygdala responses. These results were taken to suggest that information reached the amygdala subcortically. However, one difficulty with interpreting GY’s results is that he suffered an occipital lesion at an early age (8 years old), which may have produced experience-dependent changes in collicular function. The pioneering work of Cowey (Weiskrantz and Cowey, 1967) demonstrated a practice-induced recovery of visually guided saccades in monkeys with striate cortex lesions. A subsequent investigation by Mohler and Wurtz (1977) showed that after striate cortex lesions there appeared to be more neurons that showed a (normal) response enhancement when the stimulus in their receptive field was a target for an eye movement. Similar kinds of reorganization may also have occurred in GY, especially because he has undergone repeated testing for many years (Weiskrantz, 2000). Residual vision in his ‘blind’ hemifield may also play a role in his performance (Fendrich et al., 2001). We suggest that in the normal brain the critical pathway for the processing of facial expressions is not subcortical but rather proceeds from V1 to extrastriate areas, including the fusiform gyrus and superior temporal sulcus, and then to the amygdala. If attentional resources are depleted, however, face stimuli, regardless of valence, will fail to reach the amygdala and will fail to be tagged with emotional expression. Consequently, valence information will not be conveyed. This is exactly what we observed when subjects were engaged in a high-load, competing task: neither the amygdala nor regions to which it projects showed a valence effect. Thus, in contrast to simple auditory stimuli, for which subcortical processing is likely to be sufficient, the detailed form information required for face perception seems to demand a cortical pathway. In fact, there is evidence that in conditioning paradigms involving finer acoustic discrimination, where one conditioned stimulus is paired with shock and another is not, cortical lesions interfere with conditioning (Jarrell et al., 1987).
Thus, according to our proposal, when emotional faces are viewed, the initial volley of activation over occipitotemporal cortex would be equivalent to that produced by neutral faces. Later, after feedback from other structures such as the amygdala converges onto occipitotemporal cortex, the responses would be selective for the valence of the stimulus. This view is consistent with results from event-related potential studies, indicating that valence-modulated components in occipitotemporal cortex occur between 250 and 600 ms after stimulus onset, far exceeding the so-called N170 (170 ms poststimulus) face-selective responses (Eimer and Holmes, 2002; Krolak-Salmon et al., 2001). Moreover, in monkeys, neuronal responses to specific facial information (such as expressions) peak, on average, 50 ms after global facial information (Sugase et al., 1999).
Attention and awareness We have proposed that attention is required for the expression of stimulus valence. How does one reconcile this view with the finding that amygdala responses are evoked by masked faces of which subjects are presumably unaware (Morris et al., 1998; Whalen et al., 1998)? Because current theories of human cognition propose that unaware perception involves automatic processes that do not require attention (Eysenck, 1984), our results appear to contradict these previous studies of unaware (subliminal) perception. However, in such studies subjects direct attention to the location of the stimuli and, critically, no concurrent competing stimuli or tasks are employed. Thus, we propose that unaware perception or responses do not necessarily imply that processing of emotional stimuli proceeds without attention. We would like to argue instead that in previous studies of unaware perception attention was still available to process the associated stimuli. Indeed, our working hypothesis is that attention is also needed for unaware perception. If so, it should be possible to eliminate both subliminal perception and associated responses (e.g. skin conductance, fMRI signals) if attention is completely consumed via attentional manipulations. Our view is, perhaps, unconventional as attention and awareness are often inextricably tied. Under some views, if stimuli are subliminal, then attention
cannot affect them, as such dependence would imply that subjects should be aware of them. However, there is no logical inconsistency in our proposal if attention and awareness are not equated (see Thornton and Fernandez-Duque (2002) and Naccache et al. (2002) for similar points; see also Lamme, 2003). Our view is also consistent with recent findings by Lachter et al. (2000) that unaware repetition priming in a lexical decision task occurred only if the masked primes appeared at spatially attended locations. In another study, Naccache et al. (2002) demonstrated that the occurrence of unaware priming in a number-comparison task was determined by the allocation of temporal attention to the time window during which the prime-target pair was presented. These studies thus provide evidence that attention is also required for subliminal perception, further underscoring the distinction between attention and awareness. It is also important to reconcile our results with a recent report by Vuilleumier et al. (2002) in which a patient exhibiting visual ‘extinction’ was studied. In visual extinction, patients may be able to report the presence of a stimulus when presented alone, but fail to detect it when the stimulus is presented at the same time as a ‘competing’ stimulus. In their study, fearful faces elicited greater responses than neutral faces even when extinguished, suggesting that attention is not required for the processing of emotional stimuli. As in the case of visual masking discussed above, we propose that the failure of subjects to report the stimuli does not imply that attention was not required to process them. For instance, under conditions in which the stimulus timing is unpredictable (precluding the temporal allocation of attention), we anticipate that extinguished stimuli will not elicit differential responses in the amygdala. In general, we hypothesize that the lack of awareness may be associated with weak neural signals (see also Farah, 1994; Zeki and Ffytche, 1998). When sufficient attention is devoted to a stimulus, its neural representation will be favored, leading to stronger neural signals. Strong neural signals may be essential for visual awareness. For example, imaging studies of visuospatial neglect show that signals evoked by unseen faces are weak compared to those evoked by seen faces (Rees et al., 2000; Vuilleumier et al., 2001b; see also Zeki and
Ffytche, 1998). Moreover, activity in the fusiform gyrus correlates with the confidence with which a subject reports recognizing an object (Bar et al., 2001). Thus, it appears that some threshold in visual cortex must be reached before visual awareness is possible. It should be stressed, however, that other factors are likely to be important in determining whether a stimulus reaches awareness or not, including the activation of fronto-parietal regions (Beck et al., 2001) and temporal synchrony of neuronal firing (Engel et al., 2001). Interestingly, these factors may also contribute to the generation of robust, strong neural representations.
Acknowledgments We thank David Sturman for assistance in the preparation of the manuscript. The research presented here was supported by the National Institute of Mental Health Intramural Research Program.
References Aggleton, J. (2000) The Amygdala: A Functional Analysis. Oxford University Press, Oxford. Amaral, D.G., Price, J.L., Pitkanen, A. and Carmichael, S.T. (1992) Anatomical organization of the primate amygdaloid complex. In: Aggleton J. (Ed.), The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. Wiley-Liss, New York, pp. 1–66. Anderson, A.K. and Phelps, E.A. (2001) Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411: 305–309. Bar, M., Tootell, R.B., Schacter, D.L., Greve, D.N., Fischl, B., Mendola, J.D., Rosen, B.R. and Dale, A.M. (2001) Cortical mechanisms specific to explicit visual object recognition. Neuron, 29: 529–535. Beck, D.M., Rees, G., Frith, C.D. and Lavie, N. (2001) Neural correlates of change detection and change blindness. Nat. Neurosci., 4: 645–650. Bechara, A., Damasio, H. and Damasio, A.R. (2000) Emotion, decision making and the orbitofrontal cortex. Cerebral Cortex, 10: 295–307. Bradley, B.P., Mogg, K. and Lee, S.C. (1997) Attentional biases for negative information in induced and naturally occurring dysphoria. Behav. Res. Ther., 35: 911–927. Breiter, H.C., Etcoff, N.L., Whalen, P.J., Kennedy, W.A., Rauch, S.L., Buckner, R.L., Strauss, M.M., Hyman, S.E. and Rosen, B.R. (1996) Response and habituation of the
human amygdala during visual processing of facial expression. Neuron, 17: 875–887. Broadbent, D.E. (1958) Perception and Communication. Pergamon Press, London. Bundesen, C. (1990) A theory of visual attention. Psychol. Rev., 97: 523–547. Corbetta, M. and Shulman, G.L. (2002) Control of goaldirected and stimulus-driven attention in the brain. Nat. Rev. Neurosci., 3: 201–215. de Gelder, B., Pourtois, G., van Raamsdonk, M., Vroomen, J. and Weiskrantz, L. (2001) Unseen stimuli modulate conscious visual experience: evidence from inter-hemispheric summation. Neuroreport, 12: 385–391. de Gelder, B., Vroomen, J., Pourtois, G. and Weiskrantz, L. (1999) Non-conscious recognition of affect in the absence of striate cortex. Neuroreport, 10: 3759–3763. de Gelder, B., Vroomen, J., Pourtois, G. and Weiskrantz, L. (2000) Affective blindsight: are we blindly led by emotions? Trends Cogn. Sci., 4: 126–127. Desimone, R. and Duncan, J. (1995) Neural mechanisms of selective attention. Annu. Rev. Neurosci., 18: 193–222. Eastwood, J.D., Smilek, D. and Merikle, P.M. (2001) Differential attentional guidance by unattended faces expressing positive and negative emotion. Percept Psychophys., 63: 1004–1013. Eimer, M. and Holmes, A. (2002) An ERP study on the time course of emotional face processing. Neuroreport, 13: 427–431. Engel, A.K., Fries, P. and Singer, W. (2001) Dynamic predictions: oscillations and synchrony in top-down processing. Nat. Rev. Neurosci., 2: 704–716. Esteves, F. and Ohman, A. (1993) Masking the face: recognition of emtoional facial expressions as a function of the parameters of backward masking. Scand. J. Psychol., 34: 1–18. Eysenck, M. (1984) Attention and performance limitations. In: Eysenck M. (Ed.), A Handbook of Cognitive Psychology. Erlbaum, Hillsdale, NJ, pp. 49–77. Farah, M.J. (1994) Visual perception and visual awareness after brain damage. In: Umilta C. and Moscovitch M. (Eds.), Attention and Performance XV. MIT Press, Cambridge, MA, pp. 37–76. Fendrich, R., Wessinger, C.M. and Gazzaniga, M.S. (2001) Speculations on the neural basis of islands of blindsight. Prog. Brain Res., 134: 353–366. Friedman, D.P., Murray, E.A., O’Neill, J.B. and Mischkin, M. (1986) Cortical connections of the somatosensory fields of the lateral sulcus of macaques: evidence for a cortocolimbic pathway for touch. J. Comp. Neurol., 252: 323–3447. Globisch, J., Hamm, A.O., Esteves, F. and Ohman, A. (1999) Fear appears fast: temporal course of startle reflex potentiation in animal fearful subjects. Psychophysiol., 36: 66–75. Grossberg, S. (1980) How does a brain build a cognitive code? Psychol. Rev., 87: 1–51. Hartikainen, K.M., Ogawa, K.H. and Knight, R.T. (2000) Transient interference of right hemispheric function due
181 to automatic emotional processing. Neuropsychologia, 38: 1576–1580. Haxby, J.V., Hoffman, E.A. and Gobbini, M.I. (2000) The distributed human neural system for face perception. Trends Cogn. Sci., 4: 223–233. Haxby, J.V., Horwitz, B., Ungerleider, L.G., Maisog, J.M., Pietrini, P. and Grady, C.L. (1994) The functional organization of human extrastriate cortex: a PET-rCBF study of selective attention to faces and locations. J. Neurosci., 14: 6336–6353. Hillyard, S.A. and Anllo-Vento, L. (1998) Event-related brain potentials in the study of visual selective attention. Proc. Natl. Acad. Sci. USA, 95: 781–787. Hopfinger, J.B., Woldorff, M.G., Fletcher, E.M. and Mangun, G.R. (2001) Dissociating top-down attentional control from selective perception and action. Neuropsychologia, 39. Jarrell, T.W., Gentile, C.G., Romanski, L.M., McCabe, P.M. and Schneiderman, N. (1987) Involvement of cortical and thalamic auditory regions in retention of differential bradycardiac conditioning to acoustic conditioned stimuli in rabbits. Brain Res., 412: 285–294. Jones, E.G. and Burton, H. (1976) A projection from the medial pulvinar to the amygdala in primates. Brain Res., 104: 142–147. Joseph, J.S., Chun, M.M. and Nakayama, K. (1997) Attentional requirements in a preattentive feature search task. Nature, 387: 805–807. Kastner, S., De Weerd, P., Desimone, R. and Ungerleider, L.G. (1998) Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282: 108–111. Kastner, S. and Ungerleider, L.G. (2001) The neural basis of biased competition in human visual cortex. Neuropsychologia, 39: 1263–1276. Krolak-Salmon, P., Fischer, C., Vighetto, A. and Mauguie`re, F. (2001) Processing of facial emotional expression: spatiotemporal data as assessed by scalp event-related potentials. Eur. J. Neurosci., 13: 987–994. Lachter, J., Forster, K.I. and Ruthruff, E. (2000) Unattended words are not identified. Paper presented at the Annual Meeting of the Psychonomic Society, New Orleans, LA, November. Lamme, V.A.F. (2003) Why visual attention and awareness are different. Trends Cogn. Sci., 7: 12–18. Lane, R.D. and Nadel, L. (2000) Cognitive Neuroscience of Emotion. Oxford University Press, Oxford. Lane, R.D., Chua, P.M. and Dolan, R.J. (1999) Common effects of emotional valence, arousal and attention on neural activation during visual processing of pictures. Neuropsychologia, 37: 989–997. Lang, P.J., Bradley, M.M., Fitzsimmons, J.R., Cuthbert, B.N., Scott, J.D., Moulder, B. and Nangia, V. (1998) Emotional arousal and activation of the visual cortex: an fMRI analysis. Psychophysiology, 35: 199–210.
Lavie, N. (1995) Perceptual load as a necessary condition for selective attention. J. Exp. Psychol. Human, 21: 451–468. Lavie, N. and Tsal, Y. (1994) Perceptual load as a major determinant of the locus of selection in visual-attention. Percept. Psychophys., 56: 183–197. LeDoux, J.E. (1995) In search of an emotional system in the brain: Leaping from fear to emotion and consciousness. In: Gazzaniga, M.S. (Ed.), The Cognitive Neurosciences. MIT Press, Cambridge, MA, 1049–1061. LeDoux, J.E. (1996) The Emotional Brain. Simon & Schuster, New York. Menard, M.T., Kosslyn, S.M., Thompson, W.L., Alpert, N.M. and Rauch, S.L. (1996) Encoding words and pictures: a positron emission tomography study. Neuropsychologia, 34: 185–194. Mesulam, M.M. (1998) From sensation to cognition. Brain, 121: 1013–1052. Miller, E.K., Gochin, P.M. and Gross, C.G. (1993) Suppression of visual responses of neurons in inferior temporal cortex of the awake macaque by addition of a second stimulus. Brain Res., 616: 167–202. Miller, M., Pasik, P. and Pasik, T. (1980) Extrageniculostriate vision in the monkey. VII. Contrast sensitivity functions. J. Neurophysiol., 43: 1510–1526. Mohler, C.W. and Wurtz, R.H. (1977) Role of striate cortex and superior colliculus in visual guidance of saccadic eye movements in monkeys. J. Neurophysiol., 40: 74–94. Moll, J., de Oliveira-Souza, R., Eslinger, P.J., Bramati, I.E., Mourao-Miranda, J., Andreiuolo, P.A. and Pessoa, L. (2002) The neural correlates of moral sensitivity: a functional magnetic resonance imaging investigation of basic and moral emotions. J. Neurosci., 22: 2730–2736. Moran, J. and Desimone, R. (1985) Selective attention gates visual processing in the extrastriate cortex. Science, 229: 782–784. Morris, J.S., DeGelder, B., Weiskrantz, L. and Dolan, R.J. (2001) Differential extrageniculostriate and amygdala responses to presentation of emotional faces in a cortically blind field. Brain, 124: 1241–1252. Morris, J.S., Ohman, A. and Dolan, R.J. (1998) Conscious and unconscious emotional learning in the human amygdala. Nature, 393: 467–470. Morris, J.S., Ohman, A. and Dolan, R.J. (1999) A subcortical pathway to the right amygdala mediating ‘unseen’ fear. Proc. Natl. Acad. Sci. USA, 96: 1680–1685. Naccache, L., Blandin, E. and Dehaene, S. (2002) Unconscious masked priming depends on temporal attention. Psychol. Sci., 13(5): 416–424. Nobre, A.C. (2001) Orienting attention to instants in time. Neuropsychologia, 39: 1317–1328. Ohman, A. (2002) Automaticity and the amygdala: nonconscious responses to emotional faces. Curr. Dir. Psychol. Sci., 11: 62–66.
182 Ohman, A., Esteves, F. and Soares, J.J.F. (1995) Preparedness and preattentive associative learning: Electrodermal conditioning to masked stimuli. J. Psychophysiol., 9: 99–108. Pessoa, L., Kastner, S. and Ungerleider, L.G (2002a) Attentional control of the processing of neutral and emotional stimuli. Cog. Brain Res., 15: 31–45. Pessoa, L., McKenna, M., Gutierrez, E. and Ungerleider, L.G. (2002b) Neural processing of emotional faces requires attention. Proc. Natl. Acad. Sci. USA, 99: 11458–11463. Pratto, F. and John, O.P. (1991) Automatic vigilance: the attention-grabbing power of negative social information. J. Pers. Soc. Psychol., 61: 380–391. Recanzone, G.H., Wurtz, R.H. and Schwarz, U. (1997) Responses of MT and MST neurons to one and two moving objects in the receptive field. J. Neurophysiol., 78: 2904–2915. Rees, G., Frith, C.D. and Lavie, N. (1997) Modulating irrelevant motion perception by varying attentional load in an unrelated task. Science, 278: 1616–1619. Rees, G., Russell, C., Frith, C.D. and Driver, J. (1999) Inattentional blindness versus inattentional amnesia for fixated but ignored words. Science, 286: 2504–2507. Rees, G., Wojciulik, E., Clarke, K., Husain, M., Frith, C. and Driver, J. (2000) Unconscious activation of visual cortex in the damaged right hemisphere of a parietal patient with extinction. Brain, 123: 1624–1633. Rensink, R.A. (2002) Change detection. Annu. Rev. Psychol., 53: 245–277. Rensink, R.A., O’Regan, J.K. and Clark, J.J. (1997) To see or not to see: The need for attention to perceive changes in scenes. Psychol. Sci., 8: 368–373. Reynolds, J.H., Chelazzi, L. and Desimone, R. (1999) Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci., 19: 1736–1753. Rock, I., Linnett, C.M., Grant, P. and Mack, A. (1992) Perception without attention: results of a new method. Cognitive Psychol., 24: 502–534. Rodman, H.R., Gross, C.G. and Albright, T.D. (1989) Afferent basis of visual response properties in area MT of the macaque. I. Effects of striate cortex removal. J. Neurosci., 9: 1510–1526. Rolls, E.T. and Tovee, M.J. (1995) The responses of single neurons in the temporal visual cortical areas of the macaque when more than one stimulus is present in the receptive field. Exp. Brain Res., 103: 409–420. Rotshtein, P., Malach, R., Hadar, U., Graif, M. and Hendler, T. (2001) Feeling or features. Different sensitivity to emotion in high-order visual cortex and amygdala. Neuron, 32: 747–757. Simons, D. and Levin, D.T. (1997) Change blindness. Trends Cogn. Sci., 1: 261–267. Simpson, J.R., Ongur, D., Akbudak, E., Conturo, T.E., Ollinger, J.M., Snyder, A.Z., Gusnard, D.A. and Raichle, M.E. (2000) The emotional modulation of cognitive processing: an fMRI study. J. Cognitive Neurosci., 12: 157–170.
Stenberg, G., Wilking, S. and Dhal, M. (1998) Judging words at face value: Interference in a word processing task reveals automatic processing of affective facial expressions. Cognition Emotion, 12: 755–782. Sugase, Y., Yamane, S., Ueno, S. and Kawano, K. (1999) Global and fine information coded by single neurons in the temporal visual cortex. Nature, 400: 142–147. Thornton, I.M. and Fernandez-Duque, D. (2002) Converging evidence for the detection of change without awareness. Prog. Brain Res., 140: 99–118. Tipples, J. and Sharma, D. (2000) Orienting to exogenous cues and attentional bias to affective pictures reflect separate processes. Brit. J. Psychol., 91: 87–97. Van Orden, G.C., Johnston, J.C. and Hale, B.L. (1988) Word identification in reading proceeds from spelling to sound to meaning. J. Exp. Psychol. Learn. Mem. Cogn., 14: 371–386. Vuilleumier, P., Armony, J.L., Driver, J. and Dolan, R.J. (2001a) Effects of attention and emotion on face processing in the human brain: An event-related fMRI study. Neuron, 30: 829–841. Vuilleumier, P., Sagiv, N., Hazeltine, E., Poldrack, R.A., Swick, D., Rafal, R.D. and Galbrieli, J.D. (2001b) Neural fate of seen and unseen faces in visuospatial neglect: a combined event-related functional MRI and event-related potential study. Proc. Natl. Acad. Sci. USA, 98: 3495–3500. Vuilleumier, P., Armony, J.L., Clarke, K., Husain, M., Driver, J. and Dolan, R.J. (2002) Neural response to emotional faces with and without awareness: event-related fMRI in a parietal patient with visual extinction and spatial neglect. Neuropsychologia, 40: 2156–2166. Weinberger, N.M. (1995) Retuning the brain by fear conditioning. In: Gazzaniga M.S. (Ed.), The Cognitive Neurosciences. The MIT Press, Cambridge, MA, pp. 1071–1089. Weiskrantz, L. (2000) Blindsight: implications for the conscious experience of emotion. In: Lane R.D. and Nadel L. (Eds.), Cognitive Neuroscience of Emotion. Oxford University Press, New York, pp. 277–295. Weiskrantz, L. and Cowey, A. (1967) Comparison of the effects of striate cortex and retinal lesions on visual acuity in the monkey. Science, 155: 104–106. Wells, A. and Matthews, G. (1994) Attention and Emotion: A Clinical Perspective. Lawrence Erlbaum, Hove UK. Whalen, P.J., Rauch, S.L., Etcoff, N.L., McInerney, S.C., Lee, M.B. and Jenike, M.A. (1998) Masked presentations of emotional facial expressions modulate amygdala activity without explicit knowledge. J. Neurosci., 18: 411–418. Wojciulik, E., Kanwisher, N. and Driver, J. (1998) Covert visual attention modulates face-specific activity in the human fusiform gyrus: fMRI study. J. Neurophysiol., 79: 1574–1578. Yantis, S. and Johnson, D.N. (1990) Mechanisms of attentional priority. J. Exp. Psychol. Human, 16: 812–825. Zeki, S. and Ffytche, D.H. (1998) The Riddoch syndrome: insights into the neurobiology of conscious vision. Brain, 121: 25–45.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved. DOI: 10.1016/S0079-6123(03)14401-3
CHAPTER 13
Selective visual attention, visual search and visual awareness Charles M. Butter* Department of Psychology, University of Michigan, Ann Arbor, MI 48109-1109, USA
Abstract: In a previous study, Butter and Goodale (2000) reported that visual search increases the identification of targets relative to distracters. The present series of studies investigated this effect of search further. Search increased identification of Ls when they were targets and decreased their identification when Ls were distracters in concurrent search involving feature conjunctions (Exp. 1). Subjects in Experiment 2 showed increases in sensitivity, but not in response bias, to Ls. Search had a weaker effect on d′ for identification of Ts. Increasing the difficulty of searching for Ts or Ls by increasing the number of distracters enhanced identification of targets versus distracters relative to the effect of search involving eight distracters (Exp. 3). Two studies investigated the effects on probe identification of easy search tasks involving differences in stimulus features. In Experiment 4, there were no reliable effects of search on probe identification when targets were distinguished from distracters by straight versus curved lines (Z vs. O). When horizontal and vertical lines were targets and distracters, search had weak and inconsistent effects on probe identification (Exp. 5). These findings, together with results of neurophysiological studies, support the view that executive mechanisms play a role in visual search by augmenting the activity of goal representations in working memory, thus increasing the likelihood of identifying goal stimuli and enhancing the efficiency of visual search.
Introduction
Alan Cowey’s studies of blindsight have shed considerable light on regions of the brain contributing to visual awareness, as well as on the range of visual stimuli that control behavior in the absence of visual awareness (see Cowey, 1997). Indeed, there is ample and convincing evidence that neurologically intact subjects are capable of visual perception, identification, and recognition in the absence of their awareness of the stimuli evoking these processes (see review by Merikle et al., 2001). However, there is general agreement that one process, selective visual attention, accompanies and often contributes to visual awareness. William James stated this view when he observed: ‘‘Attention is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects . . . Focalization, concentration, of consciousness are of its essence.’’ (James, 1890, pp. 403–404). Crick and Koch (1998, p. 99) expressed a similar view: ‘‘. . . consciousness is enriched by visual attention although attention is not essential for visual consciousness to appear.’’ Some have proposed the view that consciousness is prerequisite for selective attention. For example, in his model of visual processing with and without visual awareness, Rensink (2000, p. 1485) claimed that we experience only things to which we attend: ‘‘Focused attention is necessary and sufficient for conscious visual experience.’’ With reference to the role of attention in visual search, Treisman (1993, p. 13) asserted, ‘‘Before any conscious visual experience [of targets defined by
feature conjunctions] is possible, some form of attention is required . . . ’’ These comments of various investigators suggest the study of visual selective attention may shed light on the problem of visual awareness. For several decades, researchers have used visual search tasks in order to investigate the properties and functions of selective attention. Much of this research has focused on the role of stimulus factors in search efficiency. Treisman and colleagues presented evidence for parallel, preattentive processing when features such as color or shape distinguish search targets and distracters. In contrast, when targets differ from distracters by feature conjunctions, search proceeds more slowly, because subjects serially direct attention to stimuli and terminate search when they find the target (Treisman and Gelade, 1980; Treisman, 1993). Subsequently, a number of studies reported conjunction searches performed in parallel or faster than would be expected according to Treisman’s theory (Nakayama and Silverman, 1986; McLeod et al., 1988; Treisman and Sato, 1990; Cohen and Ivry, 1991; Wolfe, 1992). Treisman (1993) suggested that in these studies, inhibitory connections between highly distinctive features speed search. Duncan and Humphreys (1989) presented findings suggesting that search efficiency varies along a continuum rather than falling into separate categories defined by parallel and serial processes. They proposed that search efficiency (and where attention operates in the chain of processing) is largely determined by two stimulus factors — target distracter similarity and nontarget heterogeneity, and their interactions. To account for the fast performance of conjunction search they found, Wolfe et al. (1989) proposed a guided search model in which information provided by parallel processing of features in ‘candidate targets’ is used to direct attention to conjunctions. Because their model assumed that processes active in the two stages interact, Wolfe et al. (1989) asserted that the two stages are not autonomous. More recently, Wolfe (1998) summarized data from a number of search tasks indicating that search efficiency depends upon several stimulus factors, including salience (contrast with background), dissimilarity among distracters and discriminability between features. Thus, by this view, multiple factors
and their interactions determine search efficiency, which, as Duncan and Humphreys (1989) argued, varies along a continuum, rather than being divided into two distinctly different processes. Whereas a large number of studies have examined the contribution of stimulus factors to attentional processes in visual search, relatively less effort has been devoted to investigating how selective attention operates on perceptual processing in visual search. Shiffrin and Schneider (1977) used the term ‘attentional director’ to describe a mechanism that selects task-relevant stimuli. This hypothetical mechanism is similar to older ideas such as ‘preparatory set’ or ‘attentional (top-down) priming’ (see Kahneman and Treisman, 1984), information-processing analogs to James’ term ‘stimulus preprocessing’ (James, 1890). In a similar vein, LaBerge (1995, pp. 12–13) described ‘preparatory attention’, which, prior to the appearance of a search display, operates on representations of the search target held temporarily in working memory. Preparatory attention, under executive control, elevates the activity of perceptual representations of goal objects, thus increasing search efficiency. LaBerge based his ideas about executive mechanisms upon information-processing models (Norman and Shallice, 1980; Duncan, 1984) and Baddeley’s conception of working memory (Baddeley, 1986). In their model of visual search, Duncan and Humphreys (1989) also considered a mechanism by which top-down effects might operate. Following a parallel stage in which perceptual grouping takes place, representations of inputs engage in competitive interactions, the outcomes of which are determined by the weights attached to inputs. These weights are affected by stimulus factors (such as salience), as well as by top-down attentional influences on target templates, so that attention is guided to the target stimulus (see Bundesen, 1990, for a similar concept). Duncan and Humphreys, like Wolfe et al. (1989), asserted that search efficiency varies along a continuum, but one that is largely determined by similarities between targets and nontargets and by heterogeneity among nontargets. Desimone and Duncan (1995) proposed a similar account of visual search, but one formulated in neurophysiological terms. After
a parallel stage in which search stimuli activate neural representations in the visual cortex, visual neurons competitively interact with one another. These competitive interactions are biased by top-down executive mechanisms, as well as by stimulus factors. Several neurophysiological findings provide evidence for ‘preparatory’ activation of neurons in monkey visual cortex coding search objects. Chelazzi et al. (1998) recorded discharges of neurons in inferior temporal cortex of monkeys performing search tasks. Neurons responding selectively to the search object (presented as a cue prior to search) showed heightened activity during the interval between presentation of the cue object and the choice objects in the search task. Luck et al. (1997) and Reynolds et al. (1999) reported similar findings. Several neuroimaging studies have provided evidence of top-down activation of cortical areas processing sensory stimuli during visual search by human subjects, even in the absence of search stimuli (see review by Kastner and Ungerleider, 2000). In these neural studies, a crucial observation for the biased competition model is the activation of behaviorally relevant cells or cortical regions in the absence of sensory stimulation from the search display. Presearch activation provides evidence that the bias effect includes a preparatory component. If neural representations of search targets are selectively activated prior to search, one might expect that subjects would identify search targets more readily than distracters when they are preparing to search for such targets. This expectation is based on the assumption that identification of a stimulus requires activation of its neural representation to some threshold value. If the representation is already partially activated by an executive mechanism, it would require less stimulus activation than would a task-irrelevant stimulus. Butter and Goodale (2000) tested this notion by examining the effects of visual search on identification of targets versus distracters in a probe task interleaved with search trials. Subjects in that study recognized a briefly presented letter (T or L) more frequently when it was the target than when the same letter was a distracter in a concurrent search task. When presented with a ‘hybrid’ stimulus composed of a T and L, subjects misidentified it as the target in the concurrent search task more often than they did when the same stimulus was a distracter in the concurrent search task.
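The logic of preparatory activation can be illustrated with a toy simulation. The sketch below is not a model from any of the studies cited; it simply lets two stimulus representations compete through mutual inhibition and shows that adding a small top-down bias to one of them allows that representation to reach an identification threshold with the same weak sensory input. All parameter values are arbitrary.

import numpy as np

def compete(input_a, input_b, bias_a=0.0, threshold=1.0, steps=200, dt=0.05):
    # Two leaky units with mutual inhibition; returns which unit (if any)
    # crosses the identification threshold first. Parameters are arbitrary.
    a = b = 0.0
    for _ in range(steps):
        a += dt * (-a + input_a + bias_a - 0.6 * b)
        b += dt * (-b + input_b - 0.6 * a)
        if a >= threshold:
            return "A"
        if b >= threshold:
            return "B"
    return "neither"

# With equal, weak sensory inputs neither representation reaches threshold ...
print(compete(input_a=0.9, input_b=0.9))              # -> "neither"
# ... but a modest preparatory bias toward A lets it win with the same input.
print(compete(input_a=0.9, input_b=0.9, bias_a=0.3))  # -> "A"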
than they did when the same stimulus was a distracter in the concurrent search task. Using similar methods, the author undertook the following studies to examine further the effect of search on identification of targets and distracters. These studies dealt with two sets of questions, the first of which pertains to the nature of the effect of search on target identification: In what way is selective attention to search stimuli altered by visual search? Search might facilitate identification of targets (as the neurophysiological research cited above suggests), or inhibit identification of distracters, or both. In addition, the effect of search on target identification could be due to a change in sensitivity, or a change in response bias or to changes in both of these processes. The second set of questions raised here concerns the possible role of search difficulty in target identification. In the original study, targets and distracters (T and L) were distinguished by feature conjunctions. Consequently, search was moderately difficult (search times on target-absent trials were longer than search times on target-present trials). Would further increases in search difficulty increase target identification? One might expect this outcome, because as search difficulty increases, executive mechanisms would increase their activation of the representation of the search target, thus making target identification easier. A similar argument could be made for reduced identification of search targets as search becomes easier (see General Discussion). To answer these questions, in the second set of studies the author compared the effects on target identification of search tasks with similar, greater, or lesser difficulty than the search task employed earlier (Butter and Goodale, 2000).
Experiment 1
The purpose of this study was to find out whether the previously described effect of search on identification of targets versus distracters was due to enhanced identification of targets, reduced identification of distracters, or both effects. In order to distinguish these potential effects, subjects were tested for identification of a neutral stimulus (not present in search tasks), as well as for targets and distracters, in probe trials interleaved with search trials.
Methods
Subjects
The eight subjects included four men and four women (average age = 22.4 years).
Apparatus and procedures
Subjects viewed stimuli on the screen of a Zenith Supersport 286e computer. The screen was 20.5 cm in width and 15.5 cm in height; it had a liquid crystal display and a refresh rate of 20 ms. In testing, subjects sat in a dimly lit room and faced the screen at a distance of approximately 60 cm. On each trial, they pressed the space bar while fixating the letter X on the center of the screen, which was otherwise blank. As soon as they pressed the space bar, the X vanished; 50 ms later, either a display of characters (in the search task) or a single character (in the probe task) appeared on the screen. Two hundred milliseconds after subjects responded by pressing a key, the X reappeared, prompting subjects to initiate a new trial. In each of two sessions, subjects searched for a single T in a field of seven Ls or for a single L in a field of seven Ts. Both letters were 0.25 in height and 0.24 in width. They appeared yellow against a gray background. The target letter (along with the seven distracters) appeared on 48 search trials in each block of 120 trials. On the remaining 48 target-absent trials in each block, eight distracter letters appeared. Trials with and without a target letter occurred in a pseudorandom order. Stimuli remained on the screen until the subject pressed one of two keys, indicating that a target letter was present or not present. In each block of trials, target-present and target-absent trials occurred with equal frequency. On target-present trials, two distracter letters appeared in each of the three quadrants in which the target was absent; the seventh distracter was presented in the same quadrant occupied by the target. On target-absent trials, the distracter letter appeared twice in each quadrant. On each target-present and target-absent trial, a letter appeared in one of four randomly chosen positions within each quadrant, so that letter displays were not aligned in rows or columns.
On each of 24 probe trials in each block, a single letter, either a T, an L, or a Z (of the same dimensions, color, and eccentricity as the other two letters) appeared equally often 3 to the left or right of the center of the screen for 50 ms. Subjects responded by pressing the key corresponding to the letter that they saw. In each block of trials, each probe stimulus appeared equally often on either side of center. Trials with different probe stimuli followed a pseudorandom sequence. There were six blocks of trials on each of two sessions, separated in time by 1–3 days. In the first session, half the subjects searched for Ts; the other half searched for Ls. Target and distracter letters were reversed in the two sessions.
Results and discussion

Search task
Subjects performed the search tasks accurately (per cent correct search = 98.1, S.D. = 1.1%). Performance with the two targets did not differ: when T was the target, accuracy was 97.8% (S.D. = 0.9); when L was the target, accuracy was 98.4% (S.D. = 1.01). Search times averaged over the two tasks were 1.56 s (S.D. = 0.22). When T was the target, average search time was 1.73 s (S.D. = 0.31); when L was the target, it was 1.49 s (S.D. = 0.21). This difference approached significance (t = 2.043, df = 9; P = 0.084). The average search time on target-absent trials was 1.95 s (S.D. = 0.29); on target-present trials it was 1.17 s (S.D. = 0.16). The average increase in search times on target-absent trials relative to target-present trials was 65.3%. The difference between this increase when T was the target (53.3%, S.D. = 8.4) and when L was the target (77.3%, S.D. = 9.9) approached significance (t = 2.156; df = 9; P = 0.079).
Probe identification
When Ls were search targets, they were recognized on 73.1% of probe trials (S.D. = 10.4), but when they were distracters in search testing they were recognized on only 43.9% of probe trials (S.D. = 21.6). This difference between identification rates for Ls was
significant (t = 3.962, df = 9; P = 0.004, Bonferroni correction, subsequently referred to as Bcorr). Identification rates for Ls in probe trials also differed from identification rates for the neutral stimulus. When the L probe was the target in the concurrent search task, subjects recognized it more often (t = 3.341, df = 11, P = 0.014, Bcorr) than the neutral probe (mean per cent correct = 60.9, S.D. = 14.2). Conversely, when the L probe was the distracter in the concurrent search task (mean = 43.9%, S.D. = 21.6), subjects recognized it less frequently (t = 3.015; df = 11; P = 0.024, Bcorr) than the neutral probe. When Ts were search targets, they were identified in probe trials more frequently (mean per cent correct = 63.7, S.D. = 11.6) than when they were distracters (mean per cent correct = 52.7, S.D. = 19.4). However, this effect of search on identification of Ts in probe trials was marginal (t = 2.221, df = 11, P = 0.096, Bcorr). Identification rates for probe Ts, whether T was the target or the distracter in concurrent search, did not differ reliably from identification rates for the neutral stimulus. Whereas their status in the search tasks affected identification of L probes and, marginally, of T probes, it did not affect RTs to these probes. When L was the target, mean RT to probe Ls was 0.59 s (S.D. = 0.09); when L was the distracter, mean RT to probe Ls was 0.67 s (S.D. = 0.14). When T was the target, mean RT to probe Ts was 0.73 s (S.D. = 0.12); when T was the distracter, mean RT to probe Ts was 0.79 s (S.D. = 0.21). Mean RT to the neutral stimulus (Z) across all sessions was 0.81 s (S.D. = 0.15). None of the comparisons between RTs to Z and RTs to each of the other probes attained or approached significance. The differences in identification rates for Ls and neutral stimuli in probe trials imply that the influence of search on identification of L probes was in fact due to two effects: enhanced identification of Ls when they were targets, and reduced identification when L served as a distracter. It is possible that similar effects were not found with T probes because the overall target-versus-distracter effect on Ts was weak, as found previously (Butter and Goodale, 2000). A potential problem with this study is that identification of the neutral and search stimuli was not first established outside of the search context. However, the fact that rates of correct responses to the neutral stimulus were intermediate between the rates for targets and distracters implies that this may not have been a serious problem.
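The probe comparisons reported above are paired t-tests whose P values carry a Bonferroni correction (Bcorr). As a minimal sketch of that style of analysis, using made-up per-subject scores rather than the study's data:

```python
from scipy.stats import ttest_rel

# Hypothetical per-subject per cent correct scores (illustrative only)
probe_when_target     = [81, 70, 65, 77, 72, 80, 68, 72]
probe_when_distracter = [52, 40, 55, 38, 47, 60, 33, 26]

t_stat, p_uncorrected = ttest_rel(probe_when_target, probe_when_distracter)
n_comparisons = 4                                   # size of the family of probe comparisons
p_bcorr = min(p_uncorrected * n_comparisons, 1.0)   # Bonferroni-corrected P value
print(round(t_stat, 3), round(p_uncorrected, 4), round(p_bcorr, 4))
```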
Experiment 2

Introduction
The enhanced identification of the L probe when it served as the search target in Experiment 1 might have been due to a change in sensitivity (d′), to an increased response bias, or to both. This experiment was conducted in order to distinguish between these alternatives.
Methods

Subjects
The ten subjects included five men and five women (average age = 28.6 years).
Apparatus and procedures
As in Experiment 1, subjects performed two search tasks: in one, L was the target and Ts were distracters; in the other, the roles of the two letters were reversed. The apparatus and procedures were identical to those in Experiment 1, except that there was no neutral stimulus on probe trials. Thus, on each probe trial either a T or an L appeared, and subjects were required to make a forced choice between the two. The parameters of the T and L used in search and in probe trials were the same as those used in Experiment 1.
Results and discussion

Search task
Subjects' average performance on the two search tasks was 98.5% correct (S.D. = 1.3). Performance on the two search tasks did not differ (when T was the target, 98.0%; when L was the target, 98.4%). Mean group search time was significantly longer (t = 3.936, df = 9; P = 0.003) when T was the target (1.63 s, S.D. = 0.263) than when L was the target (1.41 s, S.D. = 0.228). Mean search
time on target-present trials was 1.09 s (S.D. = 0.18); on target-absent trials, mean search time was 1.94 s (S.D. = 0.32). The average increase of target-absent over target-present search times was 77.9% (S.D. = 30.1). By this measure, visual search was somewhat more difficult than it was in Experiment 1 (65.3%) and in the original study (55%). The difference between this increase in the two search tasks approached significance (when T was the target, mean increase = 68.54%, S.D. = 32.07; when L was the target, mean increase = 87.0%, S.D. = 35.02; t = 2.021; df = 9; P = 0.074).
Probe identification
When L was the search target, subjects responded correctly on 86.4% of the concurrent probe trials (S.D. = 12.7). When L was the distracter in search tests, subjects responded correctly on 66.0% of concurrent probe trials (S.D. = 16.8). This difference was statistically significant (t = 5.593, df = 9, P < 0.002, Bcorr). However, the effect of search on identification of T was not significant (t = 1.779, df = 9, P = 0.109, Bcorr). When T was the search target, subjects recognized it on 76.1% of probe trials (S.D. = 17.45); when T served as the distracter in search tests, they responded to it correctly on 70.1% of probe trials (S.D. = 17.4). RTs to the two characters did not differ according to whether they served as target or distracter in concurrent search. When T was the target, mean RT = 0.65 s, S.D. = 0.13; when T was the distracter, mean RT = 0.87 s, S.D. = 0.35 (t = 1.559; df = 9; P = 0.153). When L was the target, mean RT = 0.61 s, S.D. = 0.15; when L was the distracter, mean RT = 0.65 s, S.D. = 0.11 (t = 0.791; df = 9; P = 0.449).
Analysis of d′ and response bias (Xc)
The average value of d′ over all probe tests was 1.443 (S.D. = 0.654). The d′ value for probe Ls was higher (t = 4.230, df = 9, P = 0.004, Bcorr) when L was the search target (d′ = 2.025, S.D. = 0.74) than when L was the distracter in concurrent search tests (d′ = 1.076, S.D. = 0.83). When T was the search target, d′ (1.511, S.D. = 0.964) was higher than it was when T was the distracter (d′ = 1.134, S.D. = 0.08).
False alarms occurred on average in 9.5% (S.D. = 9.3) of probe trials, a value not significantly different from zero (t = 0.989, df = 7, P = 0.349). Response bias (Xc), defined as -z[false alarm rate] (Macmillan and Creelman, 1992), was not significantly elevated when T was the target (mean = 1.149, S.D. = 0.761) compared with Xc when T was the distracter (mean = 1.504, S.D. = 0.844) (t = 1.719, df = 7, P = 0.129, Bcorr). The results of the analysis of Xc when L was the target versus the distracter were similar (mean Xc = 1.019, S.D. = 0.523 when L was the target; mean Xc = 1.298, S.D. = 0.867 when L was the distracter; t = 1.351, df = 7, P = 0.226, Bcorr). The analysis of d′ provides evidence for a difference in sensitivity to probe stimuli depending on their status as target or distracter in the search tasks. However, search had a weaker effect on d′ scores for identification of Ts than for identification of Ls, probably because search had only a weak effect on identification of Ts, a finding also reported in the original study (Butter and Goodale, 2000) and in Experiment 1 here. Whereas search affected d′ for one stimulus (L), and marginally for the other (T), it did not alter Xc scores. Thus, these findings support the conclusion that search for a particular target enhances its identification by altering sensitivity without affecting response bias. This experiment did not address the question whether the d′ change was due to enhanced sensitivity when L was the target, to reduced sensitivity when L was the distracter, or to both of these changes. The conclusion that search enhances sensitivity to targets is supported by studies of brain activity in monkeys and humans performing search tasks (see General Discussion). However, the possibility that search may also decrease sensitivity to distracters is raised by the finding (Experiment 1) that subjects recognize distracters less frequently than neutral stimuli.
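As an aside, here is a minimal sketch of the signal-detection computation behind these measures, assuming the standard equal-variance Gaussian model and taking Xc as -z(false-alarm rate), which is consistent with the values reported above. The function and the example rates are illustrative; this is not the study's analysis code.

```python
from scipy.stats import norm

def sdt_measures(hit_rate, false_alarm_rate, n_trials):
    """Sensitivity d' and criterion location Xc under the equal-variance Gaussian model.
    Rates of 0 or 1 are nudged by 1/(2N) so that the z-transform stays finite."""
    def clip(p):
        floor = 1.0 / (2 * n_trials)
        return min(max(p, floor), 1.0 - floor)
    z_hit = norm.ppf(clip(hit_rate))
    z_fa = norm.ppf(clip(false_alarm_rate))
    return z_hit - z_fa, -z_fa          # (d', Xc)

# e.g. 86% correct identifications of the probe as the target with roughly 10% false alarms
print(sdt_measures(0.864, 0.095, 24))
```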
Experiment 3

This study was undertaken to investigate whether increasing search difficulty relative to that in the previous studies would enhance the effects of search on probe identification. The present search tasks included 32 distracters rather than the eight distracters used
in a prior study that otherwise followed the same procedures (Butter and Goodale, 2000). Consequently, the results of this study and the previous one could be compared in order to directly assess the effects of increasing search difficulty on identification of probe stimuli.
Methods

Subjects
Eight subjects, four males and four females (average age = 24.7 years), participated in this experiment.

Apparatus and procedures
The apparatus and procedures were the same as those used in a prior study (Butter and Goodale, 2000, Experiment 2), except that on search trials 32 distracters (Ts or Ls) appeared on target-absent trials and 31 on target-present trials. As in the prior experiment, three kinds of stimuli appeared singly on probe trials: a T, an L, or a hybrid stimulus composed of a T and an L, which shared the same vertical stem.

Results

Search performance
Subjects' average performance in the two search tasks was 94.3% correct (S.D. = 3.1), a level of performance similar to that found in the earlier study (Butter and Goodale, 2000) with eight distracters. When T was the target, mean accuracy was 92.0% (S.D. = 4.9); when L was the target, mean accuracy was 96.8% (S.D. = 3.2). These two values did not differ reliably (t = 0.995, df = 7; P = 0.353). Search times in the two tasks did not differ (t = 1.421; df = 7; P = 0.1983): when T was the target, mean search time was 2.51 s (S.D. = 0.655); when L was the target, mean search time was 2.26 s (S.D. = 0.24). Mean search times for target-present and target-absent trials were, respectively, 1.35 s (S.D. = 0.323) and 2.99 s (S.D. = 0.51). The increase in target-absent relative to target-present search times was 121%, a value significantly higher (t = 3.486, df = 7, P = 0.010) than that found in the study with eight distracters (55%, S.D. = 14.2). This increase was significantly larger (t = 6.285; df = 7, P < 0.001) when subjects searched for Ls (mean = 220.6%, S.D. = 51.6) than when they searched for Ts (mean = 106.5%, S.D. = 35.2).

Probe performance
Probe performance is shown in Fig. 1A. The effect of search target on identification of Ts in probe trials was significant (t = 4.347, df = 7, P = 0.006, Bcorr), whereas the effect on identification of L probes was marginal (t = 2.406, df = 7, P = 0.094, Bcorr). Search also increased misidentification of hybrids as the search target relative to their misidentification as the distracter. When T was the search target, subjects identified the hybrid as a T more often than they did when T was the distracter (t = 5.108, df = 7, P = 0.002, Bcorr). When the search target was L, subjects identified the hybrid as an L more frequently than they did when L was the distracter (t = 6.686, df = 7, P < 0.001, Bcorr). When T was the target, mean RT was 1.05 s (S.D. = 0.34); when T was the distracter, mean RT was 1.31 s (S.D. = 0.64). When L was the target, mean RT was 0.99 s (S.D. = 0.51); when L was the distracter, mean RT was 1.20 s (S.D. = 0.40). With regard to hybrid RTs, when T was the target, subjects misidentified the hybrid as a T with an average RT of 1.11 s (S.D. = 0.31); when T was the distracter, they misidentified the hybrid as a T with an average RT of 1.42 s (S.D. = 0.52). When L was the target, subjects misidentified the hybrid as an L with an average RT of 1.19 s (S.D. = 0.71); when L was the distracter, they misidentified the hybrid as an L with an average RT of 1.46 s (S.D. = 0.55). Paired t-tests showed that none of these RT comparisons attained or approached significance. As seen in Figs. 1A and B, the effects of search on probe identification were on average consistently larger than those obtained in the earlier study, in which eight distracters were used.
Fig. 1. (A) Percent responses to probe stimuli in Exp. 3. Letters in bars: [T, T*] T = target stimulus; [T, D] T = distracter; [L, T*] L = target stimulus; [L, D] L = distracter. (B) Percent responses to probe stimuli in Butter and Goodale (2000), Exp. 2. Letters in bars refer to the stimulus conditions as in A.
The following ratio was calculated from each subject's data in order to compare the effects of target search on responses to T and L probes when 32 and 8 distracters were present in search:

search effect (probes) = 100 x [% correct responses (probe = target) - % correct responses (probe = distracter)] / [% correct responses (probe = target) + % correct responses (probe = distracter)]

The author used a similar ratio to compare the effect of target search on misidentification of hybrid stimuli when 8 and 32 distracters were presented in search:

search effect (hybrids) = 100 x [% responses to hybrid as target - % responses to hybrid as distracter] / [% responses to hybrid as target + % responses to hybrid as distracter]

These ratios correct for large differences in response levels; they also provide normalized scores, thus
allowing direct comparison of search effects on responses to probes in the two experiments. An overall search-effect score was calculated by averaging the values of the two ratios shown above, derived from data in the present study. Similarly, an overall search-effect score was calculated from data generated in the prior study with eight distracters. The search-effect score derived from the present experiment was 41.8 (S.D. = 20.1), a value significantly larger (t = 3.172, df = 7, P = 0.0157) than the score obtained in the prior experiment with eight distracters (23.4, S.D. = 7.6). Search effects, assessed by these ratios, were also examined separately for each probe character and for the hybrids. Although all search-effect scores were higher when 32 distracters were used than when eight were used, none of these separate comparisons attained significance.
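For concreteness, the normalized search-effect score defined above can be written as a one-line function; the numbers below are made up for illustration and are not taken from the study.

```python
def search_effect(pct_as_target, pct_as_distracter):
    """100 * (target - distracter) / (target + distracter), computed per subject."""
    return 100.0 * (pct_as_target - pct_as_distracter) / (pct_as_target + pct_as_distracter)

# e.g. a subject identifying a probe correctly on 80% of trials when it was the
# search target and on 50% of trials when it was the distracter:
print(round(search_effect(80.0, 50.0), 1))   # 23.1
```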
Search with 32 distracters was considerably more difficult than it was in the earlier study with eight distracters. Accompanying this increase in search difficulty was an elevated effect of search on pooled scores for probe identification. This result would be expected if the executive mechanism controlling search activated the target representation more as search difficulty increased (see General Discussion).
Experiment 4

This is the first of two experiments undertaken to find out whether visual search tasks involving feature detection would have less of an effect on identification of targets versus distracters than the search tasks involving feature conjunctions used heretofore. This result might be expected if, in easy search, executive mechanisms activate representations of targets to a lesser extent than they do when subjects engage in more difficult search tasks.
Methods

Subjects
Ten subjects, five women and five men (average age = 27.8 years), participated in this experiment.
Apparatus and procedures
Subjects searched on separate days for a single target letter, either a Z or an O, in a field of seven O or Z distracters, respectively. The dimensions, eccentricity and color of the letters were the same as those of the characters used previously. A single Z, O, or a hybrid stimulus composed of a superimposed Z and O appeared on each probe trial. Except for the characters used in search and probe trials, the
procedures and apparatus were the same as those used previously (Butter and Goodale, 2000).
Results

Search performance
Average search performance was 96.7% correct (S.D. = 2.7). Performance with the two targets did not differ (for Z: 97.0%; for O: 96.4%). When Z was the target, mean search time was 0.68 s (S.D. = 0.09); when O was the target, mean search time was 0.66 s (S.D. = 0.12). This difference was not significant (t = 0.893; df = 7, P = 0.242). On target-present trials, mean search time was 0.66 s (S.D. = 0.11); on target-absent trials, mean search time was 0.65 s (S.D. = 0.12), a difference that was not significant (t = 0.7436, df = 7; P = 0.379). The overall ratio of target-absent to target-present search times was 0.06% (i.e. search times on target-absent trials were slightly lower than search times on target-present trials). When Z was the target, the ratio of target-absent to target-present trials was 12%; when O was the target, it was 3.2%. This difference was not significant (t = 1.037, df = 7, P = 0.173).
Probe analysis
Mean identification scores for the characters and the hybrid stimulus are shown in Table 1.

Table 1. Experiment 4: mean per cent responses (S.D.s) to O, Z, and the hybrid stimulus (Hyb) in probe trials

OT>O          OD>O          ZT>Z          ZD>Z          Hyb>OT        Hyb>OD        Hyb>ZT        Hyb>ZD
62.3 (17.0)   50.0 (24.2)   69.8 (8.6)    67.2 (19.6)   43.0 (17.4)   42.5 (20.0)   29.9 (19.6)   20.2 (17.8)

The first symbol in each column heading refers to the stimulus; the second symbol refers to the response. T, search target; D, search distracter.

Although all the comparisons were in the expected direction, none of them attained statistical significance. Paired t-tests disclosed no evidence that median RTs to probes differed when the probes served as targets versus when they served as distracters. These negative findings obtained when RTs to the two characters were averaged (mean RT to targets = 0.64 s, S.D. = 0.21; mean RT to distracters = 0.74 s, S.D. = 0.25). Negative results also were found when RTs to each probe were considered separately (mean RT to Z as target = 0.65 s, S.D. = 0.21; mean RT to Z as distracter = 0.74 s, S.D. = 0.253; mean RT to O as target = 0.63 s, S.D. = 0.25; mean RT to O as distracter = 0.78 s, S.D. = 0.26). Similarly, paired t-tests provided no evidence that RTs to hybrids misidentified as targets (mean RT = 0.71 s, S.D. = 0.17) differed from RTs to hybrids misidentified as distracters (mean RT = 0.76 s, S.D. = 0.17). The same negative results occurred when misidentifications of hybrids as Z were analyzed (when Z was the target, mean RT of misidentifications as Z = 0.65 s, S.D. = 0.17; when Z was the distracter, mean RT of misidentifications as Z = 0.69 s, S.D. = 0.22), and when misidentifications of hybrids as O were analyzed (when O was the target, mean RT of misidentifications as O = 0.63 s, S.D. = 0.25; when O was the distracter, mean RT of misidentifications as O = 0.78 s, S.D. = 0.26). It appears, then, that performance of an easy search task involving feature differences has no reliable effect on identification of targets or distracters, or on mistaken identification of a hybrid stimulus. It is possible that increased statistical power, or additional search tests, would reveal effects on probe identification by parallel search, as previous studies employing more difficult search tasks did. Accordingly, the next and final experiment was undertaken to further examine possible effects of parallel search on identification of probe stimuli.
Experiment 5
Introduction
This is the second of two experiments undertaken to find out whether visual search tasks involving simple feature detection, in this case differences in the orientation of two lines, would have less of an effect on identification of targets versus distracters than search tasks involving feature conjunctions.

Methods

Apparatus and procedures
Subjects searched on separate days for a single target, either a horizontal (H) or a vertical (V) line, in a field of seven V or H distracters, respectively. The dimensions of the two stimuli were equal and the same as those of the characters used previously. On each probe trial, a single H, a single V, or a hybrid character composed of the two superimposed lines, forming a cross, appeared. The lines were 0.25° in length and appeared at 3° eccentricity. Except for the characters used in search and probe trials, the procedures and apparatus were the same as those used in a prior study (Butter and Goodale, 2000).

Results

Search performance
Average search performance was 97.6% correct (S.D. = 1.8). Search performance for the two targets did not differ (for Hs: 97.8%, S.D. = 1.7; for Vs: 97.4%, S.D. = 1.9). When Hs were targets, the mean search time was 0.74 s (S.D. = 0.20); when Vs were targets, the mean search time was 0.68 s (S.D. = 0.12). This difference was not significant (t = 0.893; df = 7, P = 0.242). The average overall search time was 0.71 s. Mean search time on all target-present trials was 0.66 s (S.D. = 0.09) and on all target-absent trials 0.76 s (S.D. = 0.115). The increase in search times on target-absent relative to target-present trials was 15.5% (S.D. = 11.9). The increase was 22% (S.D. = 13.0) when Hs were targets and 9.0% (S.D. = 18.8) when Vs were targets, a difference that was not statistically significant (t = 1.505, df = 7, P = 0.187).

Probe performance
Two comparisons yielded marginal differences. When subjects searched for Vs, their identification rate for probe Vs was marginally higher than when Vs were search distracters (t = 2.552, df = 9, P = 0.0622, Bcorr). Second, subjects misidentified hybrid probes as Hs at a marginally higher rate (t = 2.373, df = 9, P = 0.0834, Bcorr) when Hs were the targets than when Hs were distracters in the concurrent search task. The two remaining comparisons yielded statistically nonsignificant differences (identification of Hs when they were targets versus distracters: t = 1.409, df = 9, P = 0.3858, Bcorr; misidentification
of the hybrids as Vs when they were the search target versus the distracter: t = 0.380, df = 9, P = 0.7128, Bcorr) (Table 2).

Table 2. Experiment 5: mean per cent responses (S.D.s) to horizontal (H) and vertical (V) lines and the hybrid stimulus (Hyb) in probe trials

HT>H          HD>H          VT>V          VD>V          Hyb>HT        Hyb>HD        Hyb>VT        Hyb>VD
64.2 (7.6)    52.6 (25.1)   65.1 (8.1)    45.4 (21.7)   44.8 (21.6)   23.3 (16.5)   31.2 (23.1)   28.6 (20.4)

The first symbol in each column heading refers to the stimulus; the second symbol refers to the response. T, search target; D, search distracter.

RTs to probes on sessions when they were targets versus distracters did not differ according to paired t-tests. Significant differences (or differences approaching significance) were not present when RTs were averaged over the two characters (mean RT to targets = 0.68 s, S.D. = 0.15; mean RT to distracters = 0.70 s, S.D. = 0.20). Similarly, negative findings obtained when H and V lines were considered separately (mean RT to Hs as targets = 0.60 s, S.D. = 0.18; mean RT to Hs as distracters = 0.70 s, S.D. = 0.20; mean RT to Vs as targets = 0.64 s, S.D. = 0.24; mean RT to Vs as distracters = 0.73 s, S.D. = 0.197). Paired t-tests provided no evidence that RTs to hybrids misidentified as targets (mean RT = 0.68 s, S.D. = 0.16) differed from RTs to hybrids misidentified as distracters (mean RT = 0.72 s, S.D. = 0.17). Negative outcomes also were found when misidentifications of hybrids as Hs were analyzed (when H was the target, mean RT of misidentifications as H = 0.64 s, S.D. = 0.16; when H was the distracter, mean RT of misidentifications as H = 0.68 s, S.D. = 0.20). Similar results were found when misidentifications of hybrids as Vs were analyzed (when V was the target, mean RT of misidentifications as V = 0.62 s, S.D. = 0.22; when V was the distracter, mean RT of misidentifications as V = 0.70 s, S.D. = 0.23).

The effects of search on probe identification in this study were weak and less consistent than those found in two prior studies in which targets and distracters differed in feature configuration (Butter and Goodale, 2000). Apparently, the search task used here was not sufficiently difficult to exert effects like those found in the earlier studies. Indeed, the increase of 15.5% in search times on target-absent relative to target-present trials was not significantly different (t = 1.736, df = 9, P = 0.117) from the increase (7.5%) found in Experiment 4 (Z vs. O), where no effects of search on probe identification were found.
General discussion

Summarizing the findings presented here, search increased identification of Ls when they were targets and decreased their identification when Ls were distracters in concurrent search (Exp. 1). Subjects in Experiment 2 showed increases in sensitivity (d′), but not in response bias, to L probes. Search had a weaker effect on d′ for identification of Ts, which was not reliably altered by search. In Experiment 3, increasing the difficulty of searching for Ts or Ls by increasing the number of distracters augmented identification of targets versus distracters compared with the effect of search found previously (Exps. 1 and 2). Two studies investigated the effects on target identification of search tasks that were easier than those employed in prior studies here and elsewhere (Butter and Goodale, 2000). When targets were distinguished from distracters by straight versus curved lines (Z vs. O), there were no reliable effects of search on probe identification (Exp. 4). When horizontal and vertical lines were targets and distracters, search had weak and inconsistent effects on probe identification (Exp. 5). The conclusion that conjunction search (T vs. L) affects target identification is tempered by the lack of significant effects of search on identification of Ts in Experiments 1 and 2. However, the predicted trends for this effect appeared in these studies as well as in Experiment 2 of Butter and Goodale (2000). Moreover, when search for Ts was made more difficult by employing 32 distracters (Exp. 3), a significant effect of search on identification of Ts appeared. Also, the d′ values for identification of Ts as targets and as distracters were marginally different. These findings suggest that identification of Ts is altered by search, but that this effect is less reliable and more dependent on search difficulty than is the effect on identification of Ls. RTs to probes were not significantly affected by the search tasks in any of the present experiments, as was found in the earlier study (Butter and Goodale,
2000), although in several experiments there were trends in the direction of faster RTs to targets than to distracters. These findings suggest that search might be shown to affect the speed of RTs to probes if statistical power were increased. The prolongation of search times on target-absent trials in feature-conjunction search (Exps. 1 and 2) was similar in degree to the earlier findings (Butter and Goodale, 2000). This finding, plus the greater increase in target-absent search times in Experiment 3, suggests that these search tasks were difficult. In contrast, search times on target-absent trials were not prolonged (or were much less prolonged) when single features distinguished targets and distracters. The present studies did not examine the slopes of functions relating set size to search times, a method commonly used for distinguishing parallel from serial search. This method requires a large number of search trials, a procedure that in pilot studies reduced or abolished the effect of search on target identification. There may also be a general problem with the use of search times to distinguish different stages of processing. As Wolfe (1998) has pointed out, search times may not be the ideal measure for distinguishing parallel from serial search, which he asserted may be an artificial distinction. An alternative interpretation of the findings presented here attributes the enhancement of target identification to the target stimulus being, on target-present trials, the stimulus most likely to have been seen last. A direct test of this possibility would require analysis of responses to probes following target-present versus target-absent trials; unfortunately, these data are not available. However, this interpretation does not account for the finding that subjects identified distracters less frequently than neutral stimuli (Exp. 1), nor for the greater effect of search on target identification when the number of distracters was increased (Exp. 3). By this interpretation, one might expect that increasing the number of distracters would reduce, not enhance, target identification, because on many target-present trials one or more distracters was close to the target and thus was also identifiable. The findings that search affects responses to probe targets are consistent with those derived from studies of single-unit recordings in monkeys engaged in visual search (Chelazzi et al., 1993; Luck et al., 1997;
Reynolds et al., 1999) and fMRI studies in humans (Kastner and Ungerleider, 2000). These findings and the results presented here converge on the view that executive mechanisms augment the activity of goal representations, thus increasing the likelihood of target identification, as shown here. (See Introduction for a more detailed description of this effect.) This view implies that executive modulation of representations held in working memory enhances the efficiency of search. This would be especially useful when searching a cluttered field of irrelevant objects for an object, a common situation in everyday search. Executive activation of representations in working memory would provide the selective bias required for the object sought to compete successfully with other objects for the searcher’s attention (Desimone and Duncan, 1995). Support for this view derives from findings that visual working memory influences visual selective attention (Downing, 2000). The findings that targets are recognized more frequently than neutral stimuli (Exp. 1) and that their sensitivity is enhanced by search (Exp. 2) are consistent with the activating effects of executive modulation on neurons coding for search stimuli. Enhancement of sensitivity would be a consequence of selectively increasing the activity level of neurons encoding targets. Search may also decrease sensitivity to distracters, a conclusion suggested by subjects’ less-frequent identification of distracters compared to neutral stimuli in Experiment 1. (For neurophysiological evidence for such inhibitory effects, see Moran and Desimone, 1985 and Luck et al., 1997.) However, this inhibitory effect may be limited to search tasks where irrelevant items are homogeneous and appear repeatedly. Sensitivity for irrelevant items is less likely to be reduced when searching for an object in a cluttered setting of irrelevant and different objects. In this common situation, inhibitory control over large numbers of visual coding neurons would be an inefficient way of guiding search to a particular object. The studies reported here show that search difficulty determines to what degree search enhances identification of targets. A four-fold increase in distracters (distinguished from targets by feature conjunctions) made search more difficult, and improved target identification (Exp. 3). This finding
implies that when input processing is more difficult, executive mechanisms increase the activity level of the target's representation, thus making it more likely that an appropriate stimulus will activate that representation to its threshold for identification. The same interpretation applies to search in which difficulty is varied by the size difference between targets and distracters. When subjects in a pilot study searched for a target differing in size from the distracters by 6%, identification of probe targets was significantly better than it was when the size difference between target and distracters was 80%. Conversely, when search was relatively easy, its effects on probe identification were weak (Exp. 5) or not apparent (Exp. 4). It is possible that a more sensitive measure of probe identification would have disclosed effects of easy search. This kind of search task (involving feature differences) may benefit from top-down enhancement of target identification, as well as from stimulus salience (see Mack and Rock, 1998; Yantis and Egeth, 1999). These findings imply that bottom-up processing and top-down modulation of visual input work together in a coordinated manner. When processing becomes difficult (because of many distracters or small differences between target and distracters), top-down modulation increases; consequently, targets are more easily distinguished from distracters. When input processing is less demanding, executively controlled modulation is reduced. Studies of visual search inform us that our awareness of things is controlled in part by the modulating effects of executive mechanisms on internal representations of future goals. These modulating effects regulate awareness of searched-for objects and, in doing so, take into account the demands of input processing. Further studies of the ways in which awareness is modulated by selective attention may bring us closer to understanding visual awareness, a phenomenon to whose understanding Alan Cowey has made outstanding contributions; this chapter is dedicated to him.
References

Baddeley, A. (1986) Working Memory. Oxford University Press, New York.
Bundesen, C. (1990) A theory of visual attention. Psychol. Rev., 97: 523–547.
Butter, C.M. and Goodale, M.A. (2000) Visual search selectively enhances recognition of the search target. Vis. Cogn., 7: 7669–7782.
Chelazzi, L., Miller, E.K., Duncan, J. and Desimone, R. (1993) A neural basis for visual search in inferior temporal cortex. Nature, 363: 345–347.
Chelazzi, L., Duncan, J., Miller, E.K. and Desimone, R. (1998) Responses of neurons in inferior temporal cortex during memory-guided visual search. J. Neurophysiol., 80: 2918–2940.
Cohen, A. and Ivry, R.B. (1991) Density effects in conjunction search: evidence for a coarse location mechanism of feature integration. J. Exp. Psychol. Hum. Percept. Perform., 17: 891–901.
Cowey, A. (1997) Current awareness: spotlight on consciousness. Dev. Med. Child Neurol., 39: 54–62.
Crick, F. and Koch, C. (1998) Consciousness and neuroscience. Cereb. Cortex, 8: 97–107.
Desimone, R. and Duncan, J. (1995) Neural mechanisms of selective visual attention. In: W.M. Cowan (Ed.), Annual Review of Neuroscience, Vol. 18. Annual Reviews, Palo Alto, CA, pp. 193–222.
Downing, P.E. (2000) Interactions between visual working memory and selective attention. Psychol. Sci., 11: 467–473.
Duncan, J. (1984) Selective attention and the organization of visual information. J. Exp. Psychol. Gen., 113: 501–517.
Duncan, J. and Humphreys, G.W. (1989) Visual search and stimulus similarity. Psychol. Rev., 96: 433–458.
James, W. (1890) The Principles of Psychology. Holt, New York.
Kahneman, D. and Treisman, A. (1984) Changing views of attention and automaticity. In: Parasuraman, R. and Davies, D.R. (Eds.), Varieties of Attention. Academic Press, New York.
Kastner, S. and Ungerleider, L.G. (2000) Mechanisms of visual attention in human cortex. In: W.M. Cowan (Ed.), Annual Review of Neuroscience, Vol. 23. Annual Reviews, Palo Alto, CA, pp. 315–341.
LaBerge, D. (1995) Attentional Processing. Harvard University Press, Cambridge, MA.
Luck, S.J., Chelazzi, L., Hillyard, S.A. and Desimone, R. (1997) Neural mechanisms of spatial selective attention in areas V1, V2 and V4 of the macaque visual cortex. J. Neurophysiol., 77: 24–42.
Mack, A. and Rock, I. (1998) Inattentional Blindness. MIT Press, Cambridge, MA.
Macmillan, N.A. and Creelman, C.D. (1992) Detection Theory: A User's Guide. Cambridge University Press, New York.
McLeod, P., Driver, J. and Crisp, J. (1988) Visual search for conjunctions of movement and form is parallel. Nature, 332: 154–155.
Merikle, P.M., Smilek, D. and Eastwood, J.D. (2001) Perception without awareness: perspectives from cognitive psychology. Cognition, 79: 115–124.
Moran, J. and Desimone, R. (1985) Selective attention gates visual processing in the extrastriate cortex. Science, 229: 782–784.
Nakayama, K. and Silverman, G.H. (1986) Serial and parallel processing of visual feature conjunctions. Nature, 320: 264–265.
Norman, D.A. and Shallice, T. (1980) Attention to Action: Willed and Automatic Control of Behavior. University of California, Center for Human Information Processing, San Diego, CA, Report No. 8006.
Rensink, R.A. (2000) Seeing, sensing and scrutinizing. Vis. Res., 40: 1469–1487.
Reynolds, J.H., Chelazzi, L. and Desimone, R. (1999) Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci., 19: 1736–1753.
Shiffrin, R.M. and Schneider, W. (1977) Controlled and automatic human information processing. II. Perceptual learning, automatic attending and a general theory. Psychol. Rev., 84: 127–158.
Treisman, A. (1993) The perception of features and objects. In: Baddeley, A. and Weiskrantz, L. (Eds.), Attention: Selection, Awareness and Control. A Tribute to Donald Broadbent. Oxford University Press, Oxford.
Treisman, A. and Gelade, G. (1980) A feature-integration theory of attention. Cogn. Psychol., 12: 97–136.
Treisman, A. and Sato, S. (1990) Conjunction search revisited. J. Exp. Psychol. Hum. Percept. Perform., 16: 459–478.
Wolfe, J.M. (1992) 'Effortless' texture segmentation and 'parallel' visual search are not the same thing. Vis. Res., 32: 757–763.
Wolfe, J.M. (1998) What can 1 million trials tell us about visual search? Psychol. Sci., 9: 33–39.
Wolfe, J.M., Cave, K.R. and Franzel, S.L. (1989) Guided search: an alternative to the feature integration model for visual search. J. Exp. Psychol. Hum. Percept. Perform., 15: 419–433.
Yantis, S. and Egeth, H.E. (1999) On the distinction between visual salience and stimulus-driven attentional capture. J. Exp. Psychol. Hum. Percept. Perform., 25: 661–676.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 14
First-order and second-order motion: neurological evidence for neuroanatomically distinct systems
Lucia M. Vaina1,2,* and Sergei Soloviev1
1 Department of Biomedical Engineering, Brain and Vision Research Laboratory, Boston University, 44 Cummington Str., Boston, MA 02215, USA
2 Department of Neurology, Harvard Medical School, 75 Francis Str., Boston, MA 02215, USA
*Corresponding author. Tel.: +1-617-353-2455; Fax: +1-617-353-6766; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14401-4
Abstract: An unresolved issue in visual motion perception is how distinct the processes underlying 'first-order' and 'second-order' motion are. The former is defined by spatio-temporal variations of luminance, the latter by spatio-temporal variations in other image attributes, such as contrast or depth. Using neuroimaging and psychophysics, we present data from four neurological patients with unilateral and mostly cortical infarcts which strongly suggest that first- and second-order motion have different neural substrates. We found that from the early stages of processing these two types of motion are mediated by two distinct pathways: first-order motion is processed by mechanisms along the dorsal pathway in the occipital lobe, while second-order motion is processed by mechanisms lying mostly along the ventral pathway. The data reported here also suggest that different cortical regions may be in charge of processing direction discrimination in second-order motion defined by different second-order attributes.
Introduction
Visual motion perception is one of the most fundamental abilities of our visual system. Visual motion can be sensed from spatio-temporal variations in numerous and very different cues in the image, such as luminance, color, local contrast, texture, flicker or disparity. Over the past few decades, neurophysiological and neuroanatomical studies have described motion selectivity in many cortical areas, and there is a consensus that motion processing occurs at different stages in the visual system (for reviews see Albright and Stoner, 1995; Andersen, 1997). Psychophysical and computational research has defined and characterized a large number of motion processes and their relationships. The close link between the neurophysiological, psychophysical and computational approaches to studying visual motion perception converges towards firmly elucidating the underlying neural substrate of motion mechanisms (for reviews see Nakayama, 1985; Snowden, 1994; Sekuler et al., 2002). However, it has proved surprisingly difficult to achieve a consensus among motion-perception researchers on some basic yet fundamental questions, particularly on how many distinct motion systems human vision embodies. Some research groups maintain that a single mechanism is sufficient to account for human motion sensing regardless of the image cue (Johnston et al., 1992; Johnston and Clifford, 1995; Taub et al., 1997; for a summary review see Table 1 in Clifford and Vaina, 1999). Others have put forth the hypothesis that human vision may embody at least two motion-sensing systems: a first-order system sensitive to luminance- or color-defined attributes and at least one other, second-order, system sensitive to moving patterns whose motion attributes are not first-order, but most frequently are defined by texture flicker,
texture-contrast, or texture spatial frequency (Chubb and Sperling, 1988; Cavanagh and Mather, 1990; Derrington and Badcock, 1992; Derrington et al., 1993; Fleet and Langley, 1994; Nishida et al., 1997; Baker Jr., 1999; Clifford and Vaina, 1999). The first-order motion mechanisms are blind to second-order motion because the latter contains no consistent difference in luminance. Yet psychophysics has convincingly demonstrated that normal human observers have no trouble in perceiving purely second-order motion (Chubb and Sperling, 1989). Psychophysical evidence suggests that, at least initially, the visual system analyzes first- and second-order motion by distinct visual pathways and different mechanisms (i.e. a quasi-linear process detects first-order motion, and a nonlinear mechanism detects second-order motion; for a review, see Baker, 1999). In a recent study of first- and second-order motion, Schofield (2000) argues convincingly that the visual world contains both first-order (luminance-defined) and second-order (contrast-defined) information. While first-order motion mechanisms are blind to second-order motion, physiological studies in both awake and anesthetized animals provide clear evidence for the existence of neurons sensitive to second-order motion. Directionally selective neurons that are strongly responsive to first-order visual motion stimuli are found in the primate middle temporal area (MT or V5) (Albright, 1992; O'Keefe and Movshon, 1996, 1998; Churan and Ilg, 2001), the middle superior temporal area (both MSTd, Geesaman and Andersen, 1996, and MSTl, Churan and Ilg, 2001) and the superior temporal polysensory area (STP) (O'Keefe and Movshon, 1998), as well as in cat areas 17 and 18 (Zhou and Baker, 1993), and such neurons have been found to respond to a variety of second-order motion stimuli. The study of O'Keefe and Movshon (1998) compared responses to first- and second-order motion in both MT/V5 and V1 and concluded that while most MT/V5 neurons responded poorly and nonselectively to second-order motion, none of the V1 cells responded to it. A fundamental question common to these studies is whether any of these known motion-responsive cortical areas provide the substrate for the second-order motion mechanism in the same way as has been described for specific first-order mechanisms. There
was no strong evidence that a subpopulation of neurons in any of the cortical areas studied thus far is selective for the different kinds of second-order stimuli employed. Thus, the question still confronting us is whether there are visually responsive areas specifically devoted to processing second-order motion, or at least significantly involved in this processing. Several recent fMRI studies in normal human observers point to areas outside of human hMT+ (MT and MST) as being strongly responsive to several types of second-order motion stimuli (Smith et al., 1998; Somers et al., 1999; Wenderoth et al., 1999). In particular, Smith et al. (1998) report that cortical area V3 (lower hemifield) and its ventral counterpart, VP (upper hemifield), have stronger responses to second-order than to first-order motion. Based on these results, they speculated that V3 and VP may be the first visually responsive areas in which second-order motion is explicitly represented. They also reported that hMT+ was activated by both first- and second-order motion. However, the experimental design and stimuli employed in this study were not conducive to clarifying whether second-order motion is detected in V1 and V2. In a recent, carefully designed fMRI study, Dumoulin and collaborators (Dumoulin et al., 2002, 2003) used first- and second-order stimuli with identical spatial and temporal properties (Boulton and Baker, 1993a,b; Clifford et al., 1998) to further investigate the possibility of cortical specialization for these motions in normal human observers. While many cortical areas had similar responses to both types of motion, a region-of-interest analysis indicated that V1 and V2 were more involved in the processing of first-order motion, while a region posterior to hMT+ was preferentially activated by second-order motion. The neuroanatomical substrate of first- and second-order motion has also been investigated behaviorally in a few neurological studies. For example, Plant and collaborators (Plant and Nakayama, 1993; Plant et al., 1993) studied perceptual abilities for both types of motion stimuli in patients before and after unilateral occipital–temporal resection. For first-order motion they used speed and direction discrimination tasks, while for second-order motion subjects were required to discriminate direction in
a 'beat' pattern resulting from the sum of two oppositely drifting sine-wave gratings. After the surgery, but not before, the contrast threshold for direction discrimination was more impaired for second-order than for first-order motion patterns for stimuli presented in the contralesional visual field. However, the threshold for contrast discrimination of static orientation gratings was normal. Plant and colleagues interpreted their results to indicate that the mechanisms underlying second-order motion are not as widely distributed in the extrastriate cortex as the first-order motion mechanisms, and therefore are less likely to survive insults to the posterior part of the brain. The lesions in these patients were quite large and involved significant white matter, so these studies tell us little about a potential neuroanatomical substrate for second-order motion mechanisms. Moreover, one must be cautious in interpreting the results since, as pointed out by Clifford and Vaina (1999), beat stimuli of the form used in these studies are not well suited to isolating second-order motion processing. Chubb and Sperling (1988) showed that stimuli for studying second-order motion should be 'drift-balanced', so as not to contain first-order motion components. Beat stimuli are not drift-balanced because they are formed by the additive superposition of two sinusoidal components (Barron et al., 1994). For a static carrier, as in Plant and Nakayama's stimulus, the two components have equal and opposite temporal frequencies but differ in spatial frequency, and therefore are not drift-balanced. (Clifford and Vaina, 1999, provide a detailed theoretical discussion of the problem.) Greenlee and Smith (1997) compared detection and discrimination of first- and second-order motion in twenty-one neurosurgery patients with unilateral lesions in the posterior cerebral cortex and in normal control subjects. In the first set of experiments, thresholds for the orientation and direction of moving patterns (first-order) or for contrast-modulation depth (second-order) were measured using the method of constant stimuli. In another experiment, speed discrimination thresholds were determined for both first- and second-order gratings. The first interesting outcome of this study is that direction thresholds were slightly elevated, especially
for second-order stimuli in patients with lesions in the lateral intraparietal (LIP) and superior temporal (ST) areas, as compared with lesions in the inferotemporal region (IT). The second outcome was that speed discrimination thresholds for first-order motion stimuli were just slightly more elevated than those for second-order motion in patients with damage to the ST and LIP areas. These results suggest a significant overlap in the neural substrates of first- and second-order speed discrimination. Nawrot and colleagues (Nawrot et al., 2000) studied a patient with a bilateral resection of the occipital–temporal areas to treat epilepsy. This patient presented with transient deficits of both first- and second-order motion perception which, however, recovered within a few weeks. Similarly, recovery of second-order motion perception was found 20 months after the surgical lesion in the patients studied by Braun et al. (1998). Azzopardi and Cowey (2001) studied motion perception in a patient, G.Y., with a striate cortex lesion, who is cortically blind in the contralesional field. They investigated G.Y.'s ability to discriminate the direction of motion in several first- and second-order stimuli. While G.Y. reliably discriminated the direction of isolated first-order moving bars even when they were presented in the scotoma, he failed to discriminate motion direction in random dot kinematograms (RDK) even at 100% coherence, despite correctly detecting the presence of movement in the stimulus. The same was true for stimuli embodying gratings, plaids, or RDKs of motion in depth. For second-order motion stimuli, he could easily detect and discriminate the direction of motion of second-order bars which had no luminance cues associated with them. However, in his scotoma G.Y. could not even detect RDK stimuli defined by dynamic texture contrast, at any speed or contrast. Of relevance for this article is that G.Y. failed to discriminate direction in RDK stimuli regardless of whether they were first- or second-order. Overall, G.Y.'s performance on a series of carefully designed psychophysical tests demonstrates that motion discrimination is severely impaired in the scotoma, suggesting that the relevant motion information does not reach the intact motion-sensitive extrastriate cortical areas (i.e. MT and MST) by a pathway bypassing the damaged area V1. The other
neurological studies discussed above suggest that various extrastriate lesions may affect the perception of second-order motion, often together with that of first-order motion. However, the lesions in those studies were sufficiently large to include quite a few putative extrastriate areas, and they involved significant amounts of white matter, which limits the precision of functional–anatomical localization. Moreover, because of the diversity of the stimuli used, comparison across studies is difficult. We are therefore still left with the questions: which type of motion is processed, where, and how? Over the past 10 years, as part of our research on the effects of lesions on visual motion perception in humans, we have compared the ability to discriminate the direction of first- and second-order motion in several stroke patients with unilateral lesions. In a few patients (F.D., R.A., T.F. and J.V.) with small, circumscribed, single and cortically centered unilateral infarctions, we carried out a detailed neuroanatomical analysis of their structural MRI data. In this article we relate their performance on first- and second-order motion psychophysical tests to the locus of their lesions, established on the basis of a detailed analysis of MRI data of their brains. To allow a direct comparison among these patients, the anatomical data from all four patients were reprocessed so that their lesion localization could be shown on the 3-D surface of an inflated brain, accompanied by coronal or transverse brain slices that show the lesion in relation to various landmarks. For the first two patients, F.D. and R.A., a direct comparison with recent fMRI data on first- and second-order motion has also been made (Dumoulin et al., 2002, 2003). During these patients' weekly visits to the laboratory over a period of more than a year, we sought to determine the integrity of their visual motion perceptual abilities and whether deficits remained stable or recovered over time. Our data provide more specific evidence than the previously reported neurological cases for the hypothesis that first- and second-order motion are handled by separate mechanisms and that these can be selectively damaged by lesions. The specificity of these lesions and the double dissociation of deficits suggest that, at least to some extent, first- and second-order motion are mediated by different pathways in the visually responsive cortex.
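As a purely illustrative aside (not part of the original chapter's methods), the distinction between the two stimulus classes can be sketched in a few lines of code: a first-order grating modulates luminance itself, whereas a drift-balanced second-order grating keeps mean luminance constant and drifts only the contrast envelope of a noise carrier. The frame count, frequencies and noise carrier below are arbitrary choices made for the example.

```python
import numpy as np

def drifting_gratings(width=256, n_frames=8, spatial_freq=4.0):
    """Return (first_order, second_order) movies as arrays of shape (n_frames, width)."""
    x = np.linspace(0.0, 1.0, width)
    first, second = [], []
    for t in range(n_frames):
        phase = 2.0 * np.pi * t / n_frames
        envelope = np.sin(2.0 * np.pi * spatial_freq * x - phase)   # drifting modulation
        noise = np.random.choice([-1.0, 1.0], size=width)           # binary noise carrier
        first.append(0.5 + 0.5 * envelope)                    # luminance-defined (first-order)
        second.append(0.5 + 0.25 * (1.0 + envelope) * noise)  # contrast-defined (second-order)
    return np.array(first), np.array(second)

first_order, second_order = drifting_gratings()
print(first_order.shape, round(second_order.mean(), 3))   # second-order mean luminance stays ~0.5
```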
Patients F.D. and R.A.

Patients F.D. and R.A. will be discussed first, as their lesions have been analyzed in greater detail and their psychophysical testing was more extensive. Here we will focus only on the comparable first- and second-order motion tasks and on a task of long-range motion (Green, 1986). Detailed longitudinal studies of these patients on a large battery of motion tasks have been published previously (Vaina and Cowey, 1996; Vaina et al., 1996, 1998, 1999b, 2000). Patients T.F. and J.V. will then be discussed briefly, with the intent to demonstrate a finer dissociation of the anatomical pathways proposed to mediate first- and second-order motion analysis (Vaina, Soloviev and Dumoulin, in preparation).
Patient F.D. is a right-handed male, a college-educated social worker, who suffered a left-hemisphere infarct at the age of 41. Neurological examination at the time of the cerebrovascular accident (CVA) revealed slight right-sided weakness, lasting a few days, and a mild anomia lasting a few weeks. For a few weeks after the infarct he complained of feeling disturbed by visually cluttered moving scenes and by (auditorily) noisy surroundings. Neuro-ophthalmological examination, including visual fields, was normal. Contrast sensitivity for detection of static or moving gratings and for discrimination of direction and speed of motion was normal, as was temporal frequency discrimination. The anatomical locus of the lesion is shown first on the inflated probabilistic brain (Fig. 1, top left), in relation to anatomical landmarks, and in Fig. 1 (bottom right) in comparison with areas of activity elicited in a recent fMRI study of first- and second-order motion in normal subjects (Dumoulin et al., 2002, 2003). The 2-D MR slices illustrate, in the coronal plane, F.D.'s lesion on the dorso-lateral surface of the left hemisphere (shown on the right). In this figure, the lesion has been localized using the Cardview software package (Rademacher et al., 1992, 1993). The segmented cortical surface of each hemisphere is subdivided by topographic criteria into parcellation units (PUs). The parcellation system is relatively fine-grained and retains the principal topographic landmarks. The spatial extent of the lesion was outlined on each slice where it was identified.
Fig. 1. Lesion location in Patient F.D. and fMRI data schematically drawn on the average unfolded surface of the left hemisphere. (Top left): The T1-weighted image of F.D.'s brain was co-registered with the Talairach coordinate system using automatic 3-D inter-subject registration tools for MR volumetric data in standardized Talairach space from the Montreal Neurological Institute, which is an average brain volume derived from roughly 300 brains. On the surface, F.D.'s lesion location affecting second-order motion perception is plotted on the inflated brain. Coronal slices: Using the Cardview parcellation and lesion localization software (Rademacher et al., 1993), F.D.'s lesion is shown in coronal slices in relation to the relevant parcellation units, which are familiar gross anatomical structures. The lesion is located dorsolaterally in the left hemisphere, involving the superior (OLs) and inferior (OLi) occipital lateral cortex. It extends anteriorly into the angular gyrus (AG) and middle temporo-occipital cortex (TO2) and terminates in the supramarginal gyrus (Sgp). (Bottom right): The fMRI clusters preferentially activated by second-order motion (Worsley et al., 1996, 2002) are outlined in green (Dumoulin et al., 2000). The circles indicate the peaks of the second-order activation in the parietal (P = 0.001) and occipital (P = 0.001) lobes. The occipital activation and lesion sites are mainly dorsal to hMT+ (depicted in black). F.D.'s lesion is portrayed in green (filled area). It overlaps with the most posterior area of cortical activity elicited by second-order motion stimuli. (S. Dumoulin helped in generating the inflated brain.)
small enough not to disrupt the identification of the requisite sulcal trajectories and landmarks. The Cardview system allows illustration of the lesion in relation to the cortical parcellation units. The lesion involves both the superior (OLs) and inferior (OLi) lateral occipital cortex; it extends anteriorly to involve a portion of the angular gyrus (AG) and middle temporo-occipital cortex (TO2) and terminates in the inferior portion of the posterior supramarginal gyrus (Vaina et al., 1999b). To compare the location of the lesion with the cortical areas identified by fMRI studies, we registered F.D.'s T2-weighted structural MRI volume in Talairach space (Talairach and Tournoux, 1988) and subsequently applied specially developed
scripts that identify the retinotopic areas and hMT+ on the basis of published fMRI studies. The coordinates of the region of interest (ROI) representing the center of F.D.'s lesion were {48(6), 55(8), 11(4)}, suggesting that the lesion is dorsal to hMT+, which several studies report to correspond roughly to {46(6), 73(10), 4(4)} (e.g. Van Oostende et al., 1997; Mendola et al., 1999; Sunaert et al., 1999; Vaina et al., 2001; Zeki et al., 2003). Patient R.A. is a right-handed retired computer manager who suffered a sudden right hemisphere embolic stroke at the age of 66. Visual fields obtained by both Goldmann and Humphrey perimetry revealed a left inferior quadrantanopia, which resolved over a period of 16 months, when the behavioral data
Fig. 2. Lesion location in Patient R.A. and fMRI data schematically drawn on the average unfolded surfaces of the left hemisphere, viewed medially. (Top left): R.A.'s lesion location affecting first-order motion perception (Vaina et al., 1998) is plotted on the inflated brain using the same method as for F.D. Coronal slices: Using the Cardview parcellation and lesion localization software (Rademacher et al., 1993), R.A.'s lesion is shown in coronal slices in relation to the relevant parcellation units, which are familiar gross anatomical structures. The outline of the occipital cortex is shown in red and the outline of the lesion is shown in green. The lesion is located medially in the right hemisphere. It begins in the occipital pole, extending into the cuneus (CN), the supracalcarine cortex (SCLC) and the lingual gyrus (LG). (Bottom right): The fMRI activation for the first-order motion task falls partially in visual areas V1 and V2 (the average V1/V2 border is shown with black lines). Though the first-order activation did not reach statistical significance in a stereotaxic analysis, it did in a volume-of-interest analysis on the early visual areas identified in a separate scanning session (Dumoulin et al., 2003). The ROI analysis provides a signal-to-noise improvement due to intra- and inter-subject averaging, i.e. averaging of voxels within a functional area and averaging of the same area across subjects. Using a ROI analysis, area V1 was found to respond preferentially to first-order motion (P = 0.01), a trend that decreased and eventually reversed in later visual areas. (S. Dumoulin helped in generating the inflated brain.)
presented here were obtained. Contrast sensitivity was normal. As for F.D., R.A.'s lesion is shown first on the inflated probabilistic brain (Fig. 2, top left) and then on coronal MR images of his own brain using the Cardview parcellation system. The lesion is predominantly dorsal to the striate cortex of the calcarine sulcus; it involves the cuneus (CN) and the supracalcarine cortex (SCLC) and then descends to include small portions of the calcarine cortex and the lingual gyrus. Figure 2 (bottom right) shows the lesion again on the probabilistic brain in relation to the fMRI activation to first-order motion stimuli (Dumoulin et al., 2002, 2003), on which R.A. was selectively impaired (Vaina et al., 1999b). It is worth noting that the tasks comparing R.A.'s ability to discriminate direction in first- and second-order motion included the exact stimuli used by Dumoulin et al. (2002, 2003) to localize the neuroanatomical substrate of these two motion mechanisms.
We registered R.A.'s lesion in Talairach space and obtained the x, y, z coordinates of the region of interest (ROI) delineating his right hemisphere cortical lesion. Consistent with the methods of lesion localization reported above, R.A.'s lesion was confined to the occipital lobe, with its centre at {14, 92, 2}. Processing R.A.'s structural MRI data with our anatomical templates, based on the coordinates reported in the fMRI literature for the retinotopic areas and hMT+, suggests that the lesion overlaps with cortical areas V1v and V2v and spares area VP (it lies posterior to VP). For example, the fMRI studies of Mendola et al. (1999) and Sunaert et al. (1999) identified the following Talairach coordinates for area V1v in the right hemisphere: {8(2), 81(4), 5(5)}; for V2v the reported coordinates are {9(8), 78(7), 6(5)}, and for area VP they are {16(10), 79(7), 11(5)}.
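Given such centre coordinates, a crude way to illustrate this comparison is simply to rank the published area centroids by their straight-line distance from the lesion centre in Talairach space. The sketch below does only that, using the coordinates quoted above exactly as printed; it is an illustration of the idea, not the voxel-wise overlap analysis with anatomical templates that was actually used for lesion localization.

```python
import numpy as np

# Published Talairach centroids for right-hemisphere visual areas, as quoted
# above from Mendola et al. (1999) and Sunaert et al. (1999); values are taken
# verbatim from the text (standard deviations omitted).
area_centroids = {
    "V1v": np.array([8.0, 81.0, 5.0]),
    "V2v": np.array([9.0, 78.0, 6.0]),
    "VP":  np.array([16.0, 79.0, 11.0]),
}

# Centre of the ROI delineating R.A.'s right-hemisphere lesion (from the text).
lesion_centre = np.array([14.0, 92.0, 2.0])

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Straight-line distance (in mm) between two Talairach coordinates."""
    return float(np.linalg.norm(a - b))

# Rank candidate areas by the proximity of their centroid to the lesion centre.
# This is only a rough proximity measure; it ignores the spatial extent and
# variability of both the lesion and the functional areas.
for name, centroid in sorted(area_centroids.items(),
                             key=lambda kv: euclidean_distance(lesion_centre, kv[1])):
    print(f"{name}: {euclidean_distance(lesion_centre, centroid):.1f} mm from lesion centre")
```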
Screening for visual motion perception: direction discrimination

Among other visual motion tests, F.D. and R.A. were evaluated on two first-order direction-discrimination tasks. In the first, Direction discrimination, the stimulus was a random dot kinematogram (RDK) in which all the dots moved at a variable angle to the right or left of an imaginary vertical line, and subjects were required to report whether the RDK moved left or right. Figure 3B shows that R.A. was very impaired on this task for stimuli shown in his contralesional visual field, yet his performance for stimuli shown in the ipsilesional field, and that of F.D. in both visual fields, was normal. In the second test, Motion coherence (Fig. 3C), the stimulus was an RDK in which the variable parameter was the proportion of coherently moving dots necessary for reliable global direction discrimination of the dynamic cloud of dots
Fig. 3. Screening battery for motion direction discrimination: (A) Schematic view of the direction discrimination task. The stimulus presented a field of evenly distributed dots all moving in the same direction, either slightly to the right or to the left of an imaginary vertical line (two small lines placed outside the stimulus indicated true vertical). (B) Results on this task from normal controls and patients F.D. and R.A. for stimuli presented in the right and left visual field, at 2° eccentricity. The y axis indicates the smallest angular difference from vertical (in degrees) needed for reliable discrimination of the direction of motion. (C) A schematic view of the motion coherence stimulus. The filled circles denote signal, that is, dots that translate in the same direction (up, down, left, or right). Open circles denote masking motion noise, dots that are replotted at random spatial locations from one frame to another within the stimulus aperture. The arrows refer to the magnitude of the dot jumps. (D) Results on this task from normal controls and patients F.D. and R.A. for stimuli presented in the right and left visual field, at 2° eccentricity. The y axis represents the proportion of coherently moving dots needed for reliable direction discrimination. (E) A schematic view of the flickering bar test. The stimulus consists of dense static random dots and a flickering bar (square-wave grating) moving up and down over the static random dot background. The percentage of flickering dots was varied in a staircase procedure. Flickering is
obtained by randomly changing the polarity of the dots within the bar. (F) Results from F.D. and R.A. on this task. The y axis represents the proportion of flickering dots necessary to determine the direction of motion. (G) A schematic view of the display in the long-range motion task. The display consisted of two pairs of vertically oriented Gabor patches positioned orthogonally to each other and arranged at the four corners of an imaginary square centered on a cross-hair fixation mark. The stimulus consisted of four consecutive frames, displayed twice in succession to give a total of eight frames in one trial. The eight frames, each visible for 75 ms and interleaved with seven 45 ms inter-stimulus intervals, are displayed in one of two sequences, corresponding to clockwise or counterclockwise rotation. The Gabors of each pair have the same spatial frequency, and during a 'rotation' only the position of the Gabors changes, not their orientation. One pair of Gabors, the reference, was held constant at 5 cycles/deg. The other pair could have one of five different central spatial frequencies: 1, 1.7, 3, 5 and 10 cycles/deg. The separation between the centers of like Gabors was 3.6°, and for a 45° rotation each Gabor travelled 1.4°. The viewer discerns movement from the change in position of the textures. (H) The results of the long-range motion task from control subjects (the gray area indicates mean and standard deviation) and from F.D. (open squares) and R.A. (filled circles) show that F.D. was impaired while R.A. showed normal performance. In all the tests, the data illustrate thresholds, with error bars representing 1 SD. In all the tests except Long-range motion, subjects fixated 2° off the lateral edge of the stimulus at midline level; in Long-range motion, fixation was in the middle. Using a staircase procedure, the subject was shown all these stimuli in a 2AFC paradigm (except the motion coherence test, which was a 4AFC).
(Newsome et al., 1990; Newsome and Paré, 1988). Figure 3D shows that R.A. was impaired on this task, while F.D.'s performance was normal. (It is interesting to note that F.D.'s performance on this test was initially in the impaired range for stimuli presented in the visual field contralateral to the lesion but, within 2 months, had recovered to normal; Vaina and Cowey, 1996.) In both these tests the speed of the dots was 3°/s. Two screening tests of direction discrimination in second-order motion were also used: motion defined by flicker, the Flickering Bar test (Fig. 3E), adapted from Albright (1992), and Long-Range Motion (Fig. 3G), in which the direction of motion was defined by spatial-frequency matching and position changes (adapted from Green, 1986). The two patients performed very differently on these tests. On the Flickering Bar test (presented at 6 Hz), R.A. performed normally while F.D. was severely impaired for stimuli presented in the right visual field, contralateral to his lesion (Fig. 3F). Their performance on this test was in stark contrast with their performance on the first-order direction test. On the Long-Range Motion test, F.D. was impaired, while R.A.'s performance was normal. (Details on this task are provided in the legend for Fig. 3G.)
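The building block of the long-range motion display is a vertically oriented Gabor patch whose carrier spatial frequency is the manipulated variable (reference pair at 5 cycles/deg; comparison pairs at 1 to 10 cycles/deg, as described in the Fig. 3G legend). As an illustration only, a standard Gabor element of this kind can be generated as below; the envelope width, patch size, pixel resolution and contrast are assumptions, since they are not specified in the text.

```python
import numpy as np

def gabor_patch(size_deg=1.0, pixels_per_deg=40, spatial_freq_cpd=5.0,
                orientation_deg=0.0, sigma_deg=0.25, contrast=1.0):
    """Luminance profile of a Gabor patch (sinusoidal carrier inside a
    Gaussian envelope), expressed as contrast modulation around a mean of zero.

    spatial_freq_cpd : carrier spatial frequency in cycles/deg (the task uses
                       1, 1.7, 3, 5 and 10 cycles/deg; the reference pair is 5).
    size_deg, pixels_per_deg, sigma_deg, contrast : assumed values chosen only
                       for illustration.
    """
    n = int(size_deg * pixels_per_deg)
    half = size_deg / 2.0
    x, y = np.meshgrid(np.linspace(-half, half, n), np.linspace(-half, half, n))
    theta = np.deg2rad(orientation_deg)
    # Coordinate along the carrier's modulation direction (orientation 0 deg
    # gives vertical stripes, i.e. a vertically oriented Gabor).
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    carrier = np.cos(2.0 * np.pi * spatial_freq_cpd * x_theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma_deg**2))
    return contrast * carrier * envelope

# One reference element (5 cycles/deg) and one comparison element (1.7 cycles/deg).
reference = gabor_patch(spatial_freq_cpd=5.0)
comparison = gabor_patch(spatial_freq_cpd=1.7)
print(reference.shape, comparison.shape)
```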
Comparison of direction discrimination in first- and second-order motion perception

We used a single kind of stimulus to compare F.D.'s and R.A.'s ability to discriminate direction in first- and second-order motion. Each pair of first- and second-order stimuli had identical spatial and temporal properties. The reason for carrying out this specific comparison was that the patients' performance on the standard screening tests suggested a double dissociation in direction discrimination between first- and second-order motion.
Direction discrimination in first- and second-order motion: motion coherence

The pair of global motion-coherence tasks is illustrated schematically in Fig. 4A (first-order) and Fig. 4C (second-order). The task is conceptually
similar to the motion coherence test described above (Fig. 3C). The background consisted of flickering random dots, and subjects had to perform a direction discrimination (left or right) in stochastic first-order or second-order global motion displays. A variable proportion of the signal 'tokens' (i.e. small binary black-and-white texture patches) move coherently, left or right, while the other 'tokens' are presented from frame to frame at random locations within the aperture. In the first-order version of the stimulus, there is a difference between the mean luminance of the tokens and the background while the contrast is identical. In the second-order version of the stimulus, the tokens differ in mean contrast from the background but not in mean luminance. The stimulus field subtended 10° × 10° and was presented against a uniform gray background (9.5 cd/m2) at 2° left or right of a small fixation mark placed at eye level. The stimulus area was divided into a notional grid of 38 × 38 blocks, each subtending 16 min × 16 min. Each block consisted of a dense random dot microtexture made of pixels whose luminance was one of 256 possible gray levels. The dots defining the microtexture could have one of two states, 'on' or 'off', represented by different gray levels, with equal numbers of 'on' and 'off' dots within a block. The mean luminance of a block was the average of its 'on' and 'off' dot luminances; its contrast was the difference between the 'on' and 'off' dot luminances divided by twice the mean luminance. A block could differ from the background either in mean luminance but not contrast (first-order motion), or in contrast but not mean luminance (second-order motion); in both cases it is called a token. Whether a block was 'token' or 'background' was randomly assigned at the beginning of every trial, and token-block density remained constant at 42% throughout the test. The mean luminance of first-order motion token blocks was 12.3 cd/m2 and the contrast within the block was 0.2, while the mean luminance of second-order blocks was 9.5 cd/m2 and the internal contrast was 0.6. The mean luminance of the background was in both cases 9.5 cd/m2, with an internal contrast of 0.2. The strength of the motion signal in the stimulus was varied by changing the proportion of the token-block micropatterns in a given trial that carried the same unidirectional motion signal. The remainder of
Fig. 4. Direction discrimination in first- and second-order motion: Motion Coherence. On the left are shown schematic views of the first-order motion coherence task (A) and the second-order motion coherence task (C). On the right, B and D show the threshold coherence necessary to perform these tasks reliably for control subjects and for F.D. and R.A. Note that, compared to the normal controls, F.D.'s performance on the second-order task was impaired for stimulus presentation in his right visual field, while R.A.'s performance was impaired on the first-order motion task for stimuli shown in the left visual field, contralateral to his lesion.
the token-block micropatterns appeared from frame to frame at random locations, creating the impression of flickering noise. When all the micropatterns reappeared with the same spatial and temporal offset, the display appeared as a cluster of micropatterns all moving to the left or to the right across the flickering background. Token density was 2 tokens per degree, and speed was 3°/s. Using an adaptive staircase procedure, the stimulus was presented for 12 frames, each shown for 45 ms with zero inter-frame interval. Observers were asked whether the direction of the global motion was rightward or leftward. The threshold proportion of signal tokens (percent coherent motion) necessary for reliable discrimination of stimulus direction was computed as the mean of the last six reversals of the staircase. Figures 4B and D show that, in the visual hemifield contralateral to the lesion, R.A. was impaired on the first-order but not on the second-order motion stimulus, whereas F.D. showed the opposite dissociation.
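To make the luminance and contrast manipulation concrete, the sketch below synthesizes single token and background blocks from binary 'on'/'off' dots using the values quoted above (mean luminance in cd/m2 and within-block Michelson contrast). It is only an illustration: the block size in pixels is an assumption, and the actual displays tiled many such blocks over successive frames.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_block(mean_luminance, contrast, size=16):
    """Random binary micro-texture block with half the pixels 'on' and half
    'off', chosen so that the block has the requested mean luminance (cd/m2)
    and Michelson contrast, (L_on - L_off) / (2 * mean luminance).

    The 16 x 16-pixel size is assumed for illustration; the text specifies the
    block size in visual angle (16 min x 16 min), not in pixels.
    """
    l_on = mean_luminance * (1.0 + contrast)
    l_off = mean_luminance * (1.0 - contrast)
    # Equal numbers of 'on' and 'off' pixels, in random positions.
    states = np.array([1, 0] * (size * size // 2))
    rng.shuffle(states)
    return np.where(states.reshape(size, size) == 1, l_on, l_off)

# Values quoted in the text (cd/m2, Michelson contrast):
background   = make_block(9.5, 0.2)    # reference statistics of the background
first_order  = make_block(12.3, 0.2)   # higher mean luminance, same contrast
second_order = make_block(9.5, 0.6)    # same mean luminance, higher contrast

for name, blk in [("background", background),
                  ("first-order token", first_order),
                  ("second-order token", second_order)]:
    mean_l = blk.mean()
    michelson = (blk.max() - blk.min()) / (2.0 * mean_l)
    print(f"{name}: mean luminance {mean_l:.2f} cd/m2, contrast {michelson:.2f}")
```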
Direction discrimination in first- and second-order motion: D-max

The second-order global motion stimulus contained a high proportion of both temporal noise (the flickering background) and motion noise (the 'noise' tokens), and initially F.D. was significantly impaired on global motion direction discrimination (the motion coherence task) when the signal was embedded in masking motion noise (Vaina and Cowey, 1996). R.A. was impaired on first-order global motion (the motion coherence task in the screening test battery). Was the deficit restricted to global motion, or was it a more general direction-discrimination deficit? To address this question, we assessed R.A.'s and F.D.'s ability to discriminate first- and second-order motion in a task that addresses local motion measurements. The concept of this task is similar to the classic D-max tests (Braddick, 1974, 1980), and, at least for the first-order displays, the motion measurements are mediated by motion mechanisms that are spatially local. The characteristics of the display are identical to those in the global stimulus. Motion stimuli consisted of two successively presented frames (frame duration 45 ms, zero inter-frame interval) (Fig. 5). From one frame to the next the 'token' blocks are shifted coherently either to the left or to the right, with the remaining background acting as a static viewing window. The specific spatial pattern of texture defining the tokens is independent from frame to frame; thus, from one frame to the next, the component pixels of the 'token' blocks and of the background flickered by randomly changing their state (from 'on' to 'off' and vice versa), yet keeping their mean luminance and contrast identical throughout the trial. Subjects were instructed to keep their gaze on a fixation mark 2° to the left or right of the lateral edge of the display. In a 2AFC task they reported whether the direction of motion was to the left or right. Stimuli were varied by an adaptive staircase procedure, and the threshold, the maximum displacement for which observers correctly perceived the direction of motion, was averaged over the last six reversals. The results reveal that R.A. was selectively impaired on first-order motion for stimuli presented in his left visual field. F.D.'s performance on this task and R.A.'s performance for stimuli
Fig. 5. First- and second-order D-max motion task. A and C show schematically the display used to measure direction discrimination in textured random dot kinematograms. In A, the motion is first-order: a group of micropatterns differing from the background in luminance is shifted coherently to the left or to the right. In C, the motion is second-order: the shifting micropatterns differ from the background in contrast but not in mean luminance. B and D show the performance of F.D. and R.A. and control subjects on the two tasks for stimuli presented in the inferior quadrants. F.D. was impaired in his right visual field for the second-order stimulus but not for the first-order stimulus. R.A. was impaired on first-order motion but not on second-order motion, for stimuli shown in his contralesional field.
shown in the right visual hemifield were normal. The results on the second-order motion task were exactly the opposite; R.A.’s performance was normal, while F.D. was impaired for stimuli shown in the contralesional visual field.
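All of these thresholds were estimated from adaptive staircases, taking the mean of the last six reversals. The sketch below illustrates that computation with a simple 2-down/1-up rule and a simulated observer; the specific rule, step size and psychometric function are assumptions made for illustration, since the text does not report the staircase parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_staircase(start=0.5, step=0.05, floor=0.01, ceiling=1.0,
                  n_reversals=10, true_threshold=0.2):
    """2-down/1-up adaptive staircase on a generic stimulus level (e.g. motion
    coherence).  Returns the mean of the last six reversals as the threshold
    estimate, mirroring the analysis described in the text.

    The simulated observer (a logistic psychometric function centred on
    `true_threshold`, with 50% chance performance) stands in for a real
    subject and is purely illustrative.
    """
    level, correct_in_a_row, direction = start, 0, -1
    reversals = []
    while len(reversals) < n_reversals:
        p_correct = 0.5 + 0.5 / (1.0 + np.exp(-(level - true_threshold) / 0.03))
        if rng.random() < p_correct:
            correct_in_a_row += 1
            if correct_in_a_row == 2:            # two correct: make it harder
                correct_in_a_row = 0
                if direction == +1:
                    reversals.append(level)      # direction change = reversal
                direction = -1
                level = max(floor, level - step)
        else:                                    # one error: make it easier
            correct_in_a_row = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level = min(ceiling, level + step)
    return float(np.mean(reversals[-6:]))

print(f"estimated threshold: {run_staircase():.3f}")
```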
Discussion of patients R.A. and F.D.

R.A.'s lesion is probably centered on the putative functionally defined area V2 in the occipital lobe, as illustrated in Fig. 2 (it also slightly involves V1). In this figure (bottom right) we see the peak of fMRI activation (from Dumoulin et al., 2003) for the first-order stimulus in normal subjects. That activation comes from a task not discussed in this paper, but one that R.A. performed and on which he was very impaired in the left visual field, contralateral to his right hemisphere lesion (Vaina et al., 1998); the task was not available in our laboratory when F.D. participated in the study. F.D.'s lesion appears to be dorsal to hMT+, and is
consistent with the finding of the fMRI studies of Dumoulin et al. (2002, 2003) that the anatomical substrate of second-order motion involves this region (Fig. 1, bottom right). Furthermore, F.D.'s psychophysical performance is consistent with the hypothesis that this area might provide, at least in part, the neural substrate for second-order motion processing. Thus, our psychophysical results in these two patients, together with the significantly different locations of their lesions, support the idea of relative cortical specialization for first- and second-order motion in the early visual areas of the occipital lobe and in a cortical region posterior to hMT+, respectively: the former is more specific for processing first-order motion, the latter for processing second-order motion. While the anatomical locations of the lesions and the psychophysical data from F.D. and R.A. are consistent with the findings of Dumoulin and colleagues (2002), they are at odds with the functional imaging studies of Smith et al. (1998) and Wenderoth et al. (1999), which suggest cortical area VP, or ventral V3, as the putative substrate for second-order motion processing. This area was not directly, and almost certainly not even indirectly, involved in R.A.'s or F.D.'s lesions.
Two patients with dorsal and ventral V3 unilateral lesions

We will briefly discuss similarities and differences in the behavioral data on first- and second-order motion tasks of two patients with small unilateral infarcts, centered in the ventral occipital region, V2/VP (Patient J.V.), and in dorsal V2/V3 (Patient T.F.; Vaina et al., 2000). Patient J.V. was a 60-year-old, right-handed, college-educated woman who suffered an infarct in the left occipital lobe, as shown in Fig. 6. For 6 weeks after the stroke, she reported that the world looked fragmented, 'like a Picasso painting', and she felt very uncomfortable coping with the visual world surrounding her. Patient T.F. was a 60-year-old, right-handed, college-educated man who suffered a mild infarct in the occipital lobe. Formal Humphrey and Goldmann perimetry visual field testing showed a well-defined upper right
Fig. 6. Lesion localization in Patients J.V. and T.F. The lesion location in Patient J.V. is shown in green (first row) and in Patient T.F. in red (second row). In the first column, lesion locations are shown on inflated images of J.V.'s and T.F.'s brains. The inflation was obtained from T1-weighted images of the patients' brains, processed with the FreeSurfer software package (Dale et al., 1999; Fischl et al., 1999). For each patient the lesion was identified on high-resolution T2-weighted MR images and its contour was overlaid on the inflated brain surface. The adjacent columns show again the lesions of J.V. and T.F., registered on T1-weighted axial brain images in Talairach space (Talairach and Tournoux, 1988). The axial slices also show the overlap of the lesions with Brodmann areas 17 (light blue), 18 (dark blue), and 19 (yellow) in the occipital lobe.
quadrantanopia in J.V., but T.F.'s visual fields were normal. However, he spontaneously reported that he had difficulties seeing motion in the lower left visual quadrant. In both patients full vision was restored within 6 months, when the data reported here were obtained. Figure 6 (top) shows J.V.'s infarct localized on the inflated 3D brain obtained from her MRI study, together with axial slices registered into Talairach space (Talairach and Tournoux, 1988), illustrating the lesion encroaching on Brodmann areas 18 and 19 ventrally in the left hemisphere. Figure 6 (bottom) shows T.F.'s lesion localized on the inflated 3D brain obtained from the MRI study and relevant axial slices in Talairach space, illustrating a small right hemisphere infarct in Brodmann area 18, just above the calcarine fissure. Using anatomical templates generated on the basis of the Talairach coordinates reported in the literature for the retinotopic areas and for hMT+, the ROI corresponding to J.V.'s lesion encroached on BA 18, with its centre at {14, 78, 10}, and BA 19, with its centre at {20, 80, 12}.
The lesion closely overlapped with the ventral areas V2v {10(7), 79(8), 10(6); in BA 18} and VP {19(8), 79(8), 13(6); in BA 19} as defined by Mendola et al. (1999) and Sunaert et al. (1999). The centre of the ROI corresponding to T.F.'s lesion overlapped with the dorsal retinotopic areas V2d {11(9), 93(6), 11(10)} and V3 {18(11), 93(6), 13(10)}. Figure 7 shows the results of J.V. and T.F. on the same screening tests for visual motion perception that were reported above for F.D. and R.A. (Fig. 3). T.F.'s performance on both the direction discrimination and motion coherence tasks was impaired for stimuli shown in the left visual field, contralateral to his right hemisphere lesion (Fig. 7B and D). J.V.'s performance on these tasks was normal. On the other hand, T.F.'s performance was entirely normal on the flickering bar test, which is a second-order motion task, whereas J.V. was impaired for stimuli shown in her contralesional visual field. It is of note that both patients performed within the normal range on the long-range motion test (Fig. 7H).
Fig. 7. Screening battery for motion direction discrimination. As in Fig. 3, a schematic view of the stimuli and the performance of normal controls and the two patients is illustrated. A and B: Direction discrimination; C and D: Motion Coherence; E and F: Flickering bar; G and H: Long-range motion. Specific to the comparison of J.V. and T.F. on these tasks is that the first three reveal a double dissociation of deficits (T.F. is normal on second-order but impaired on first-order motion in the visual field contralateral to the lesion, and J.V. shows the opposite pattern). However, on the Long-range motion task both patients had normal performance.
The same two pairs of direction discrimination tasks in first- and second-order motion, Motion Coherence and D-max, were also administered to these patients (Fig. 8). On these tests T.F. and J.V. presented a double dissociation of deficits. Similar to patient R.A., T.F. was normal on the second-order motion tasks (Fig. 8D and H), but selectively impaired on the first-order tasks (Fig. 8B and F) for stimuli presented in the contralesional field. J.V., similar to patient F.D., had the opposite pattern of performance. She had normal performance
Fig. 8. Top two rows: First- and second-order D-max motion tasks. A and C show schematically the display used to measure direction discrimination in textured random dot kinematograms. In A, the motion is first-order: a group of micropatterns differing from the background in luminance is shifted coherently to the left or to the right. In C, the motion is second-order: the shifting micropatterns differ from the background in contrast but not in mean luminance. B and D show the performance of J.V. and T.F. and control subjects on the two tasks for stimuli presented in each visual field separately. J.V. was impaired in her right visual field for the second-order stimulus but not for the first-order stimulus. T.F. was impaired on first-order motion but not on second-order motion, for stimuli shown in the contralesional field. Error bars represent 1 SD. Bottom two rows: First- and second-order motion coherence tasks. E and G show schematic views of the first- and second-order global motion coherence tasks, respectively. F and H show the threshold coherence necessary to perform these tasks reliably for control subjects and patients J.V. and T.F. Error bars represent 1 SD. Compared to the normal control subjects, T.F.'s performance on the first-order task was impaired for stimulus presentation in his contralesional field. His performance on the second-order motion stimuli was normal for presentation in both visual fields. J.V.'s performance was the opposite: she had normal performance on the first-order task, but was selectively impaired on the second-order task for stimuli presented in the contralesional field.
on the first-order motion stimuli (Fig. 8B and F) but scored at a very impaired level on both second-order motion tasks when the stimuli were presented in the contralesional field (Fig. 8D and H).
Conclusion

In this chapter we discussed deficits in direction discrimination selective for first- or second-order motion in four neurological patients who were also studied neuroanatomically on the basis of the MRIs of their brains. The patients' MRIs were resliced and registered in Talairach space (Talairach and Tournoux, 1988) in order to provide a uniform coordinate system for comparison and reference to the fMRI localization of retinotopic areas and of hMT+. As a point of caution, however, while a fine structural localization of the lesion can be obtained, it is unlikely that on the basis of structural MRI alone one can determine precisely which cortical areas have reduced input or output. Since for all four patients the MR images were registered in stereotaxic space (Talairach and Tournoux, 1988), we applied scripts developed in our laboratory to describe the location of the lesions using as reference the spatial coordinates (x, y, z) of visual areas published in the literature. Human fMRI studies have implicated the early visual areas (V1 and V2) in processing first-order motion (Dumoulin et al., 2002, 2003). Smith et al. (1998) found that areas V3 and VP are more responsive to second-order than to first-order motion, while Dumoulin et al. (2002) found instead that a region posterior to hMT+ was preferentially activated by second-order motion. Where does this leave us in explaining the performance of the four patients discussed here on direction discrimination in first- and second-order motion tasks? Area VP was clearly not involved in R.A.'s lesion, so he might have used it in processing the second-order motion stimuli. Area VP is connected to other motion areas that are rich in direction-selective neurons, for example area MT (Felleman and Van Essen, 1991), and physiological studies have shown that neurons in MT respond to the direction of motion
whether motion is first- or second-order. This could explain R.A.'s normal performance on a broad range of direction discrimination tasks of second-order motion. Moreover, the involvement in R.A.'s lesion of area V2, known to contribute to the analysis of first-order stimuli and to project to MT and other higher motion areas, may explain his severely impaired performance on first-order motion direction discrimination tasks. T.F.'s lesion encroached on areas V2d and V3, which may explain his normal performance on the second-order motion tasks and his impaired performance on the first-order motion tasks. However, we should note that Smith et al. (1998) found that V3 also responds to second-order motion. This is at odds with our findings and should be further investigated in patients with small cortical lesions and by fMRI. J.V.'s lesion, located more ventrally, directly involves area VP, and her impairment in the contralesional visual field on local or global second-order motion defined by flicker or contrast supports a role for this area in second-order motion. However, her normal performance on the long-range motion task suggests that perhaps the different second-order motion mechanisms have different neuroanatomical substrates. F.D., whose lesion was mainly posterior to hMT+, was selectively impaired on direction discrimination in all the second-order motion tasks described here. It is relevant to recall that F.D.'s lesion intersected the ROI found by Dumoulin et al. (2002, 2003) to respond strongly to second-order motion stimuli. It is also interesting to note that, unlike J.V., F.D. was impaired on all second-order motion tasks, including long-range motion, which is a higher-level task. Performance on this task depends on the difference in carrier spatial frequency between the pairs of Gabor patches forming the stimulus, which is consistent with the operation of a second-order long-range mechanism (Werkhoven et al., 1993). Unlike the other tasks described here, it is possible that long-range motion actually reflects a higher-level mechanism based on attention. The idea that there are several types of second-order motion mechanism, and even a third-order motion system (Lu and Sperling, 2001), and that they may have different neuronal substrates is an important one and worth pursuing. Studies of neurological patients can provide double dissociations of deficits (as shown in
this paper), and the patients' self-reports of what they do and do not perceive in the everyday world will provide invaluable information about the relevance of these motion mechanisms to perception beyond the laboratory setting with its carefully controlled stimuli. The tentative suggestions made here on the basis of psychophysics and, indirectly, of fMRI studies would undoubtedly have been stronger if we had also been able to obtain functional anatomical maps of these patients, and thus register their functionally defined retinotopic areas and hMT+ with their brain activity specific to the first- or second-order motion tasks described here. Unfortunately this was not possible for these patients. We believe that carefully designed fMRI studies of neurological patients with small, cortically centered lesions will refine our understanding of the functional architecture of the human visual motion system.
Acknowledgments

This research has been supported in part by NIH grant 2 R01 EY-07861 to LMV. We are grateful to Serge Dumoulin for providing parts of Figures 1 and 2.
References

Albright, T. and Stoner, G. (1995) Visual motion perception. Proc. Natl. Acad. Sci. USA, 92: 2433–2440.
Albright, T.D. (1992) Form-cue invariant motion processing in primate visual cortex. Science, 255: 1141–1143.
Andersen, R.A. (1997) Neural mechanisms of visual motion perception in primates. Neuron, 18: 865–872.
Azzopardi, P. and Cowey, A. (2001) Motion discrimination in cortically blind patients. Brain, 124: 30–46.
Baker, C.L. (1999) Central neural mechanisms for detecting second-order motion. Curr. Opin. Neurobiol., 9: 461–466.
Barron, J.L., Fleet, D.J. and Beauchemin, S.S. (1994) Performance of optical flow techniques. Int. J. Comp. Vis.: 43–77.
Boulton, J. and Baker, C. (1993a) Different parameters control motion perception above and below a critical density. Vision Res., 33: 1803–1811.
Boulton, J.C. and Baker, C.L. (1993b) Dependence on stimulus onset asynchrony in apparent motion: evidence for two mechanisms. Vision Res., 33: 2013–2019.
Braddick, O. (1974) A short-range process in apparent motion. Vision Res., 14: 519–527.
Braddick, O.J. (1980) Low-level and high-level processes in apparent motion. Phil. Trans. R. Soc. B, 290: 137–151.
Braun, D., Petersen, D., Schönle, P. and Fahle, M. (1998) Deficits and recovery of first- and second-order motion perception in patients with unilateral cortical lesions. Eur. J. Neurosci., 10: 2117–2128.
Cavanagh, P. and Mather, G. (1990) Motion: the long and short of it. Spatial Vision, 4: 103–129.
Chubb, C. and Sperling, G. (1988) Drift-balanced random stimuli: a general basis for studying non-Fourier motion perception. J. Opt. Soc. Am. A, 5: 1986–2007.
Chubb, C. and Sperling, G. (1989) Two motion perception mechanisms revealed through distance-driven reversal of apparent motion. Proc. Natl. Acad. Sci. USA, 86: 2985–2989.
Churan, J. and Ilg, U.J. (2001) Processing of second-order motion stimuli in primate middle temporal area and medial superior temporal area. J. Opt. Soc. Am. A, 18: 2297–2306.
Clifford, C. and Vaina, L. (1999) A computational model of selective deficits in first- and second-order motion processing. Vision Res., 39: 113–130.
Clifford, C.W., Freedman, J.N. and Vaina, L.M. (1998) First- and second-order motion perception in Gabor micropattern stimuli: psychophysics and computational modelling. Cogn. Brain Res., 6: 263–271.
Dale, A.M., Fischl, B. and Sereno, M.I. (1999) Cortical surface-based analysis: I. Segmentation and surface reconstruction. NeuroImage, 9: 179–194.
Derrington, A.M. and Badcock, D.R. (1992) Two-stage analysis of the motion of 2-dimensional patterns, what is the first stage? Vision Res., 32: 691–698.
Derrington, A.M., Badcock, D.R. and Henning, G.B. (1993) Discriminating the direction of second-order motion at short stimulus durations. Vision Res., 33: 1785–1794.
Dumoulin, S., Baker, C.L. and Hess, R.F. (2002) Cortical specialization for processing first- and second-order motion in parietal and occipital lobe: an fMRI study. Soc. Neurosci. Abstr., 219.1.
Dumoulin, S., Bitta, R., Kabani, N., Baker, C., Goualher, G., Pike, G. and Evans, A. (2000) A new anatomical landmark for reliable identification of human area V5/MT: a quantitative analysis of sulcal patterning. Cereb. Cortex, 10: 454–463.
Dumoulin, S., Baker, C.L., Hess, R. and Evans, A. (2003) Cortical specialization for first- and second-order motion. Cereb. Cortex, in press.
Felleman, D.J. and Van Essen, D.C. (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex, 1: 1–47.
Fischl, B., Sereno, M.I. and Dale, A.M. (1999) Cortical surface-based analysis II: inflation, flattening, and a surface-based coordinate system. NeuroImage, 9: 195–207.
Fleet, D.J. and Langley, K. (1994) Computational analysis of non-Fourier motion. Vision Res., 34: 3057–3079.
Geesaman, B.J. and Andersen, R.A. (1996) The analysis of complex motion patterns by form/cue invariant MSTd neurons. J. Neurosci., 16: 4716–4732.
Green, M. (1986) What determines correspondence strength in apparent motion? Vision Res., 26: 599–607.
Greenlee, M.W. and Smith, A.T. (1997) Detection and discrimination of first- and second-order motion in patients with unilateral brain damage. J. Neurosci., 17: 804–818.
Johnston, A., McOwan, P.W. and Buxton, H. (1992) A computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells. Proc. R. Soc. Lond. B, 250: 297–306.
Johnston, A.J. and Clifford, C.W.G. (1995) Perceived motion of contrast-modulated gratings: predictions of the multichannel gradient model and the role of full-wave rectification. Vision Res., 35: 1771–1783.
Lu, Z.-L. and Sperling, G. (2001) Three-systems theory of human visual motion perception: review and update. J. Opt. Soc. Am. A, 18: 2331–2370.
MacDonald, D., Kabani, N., Avis, D. and Evans, A. (2000) Automated 3-D extraction of inner and outer surfaces of cerebral cortex from MRI. NeuroImage, 12: 340–356.
Mendola, J.D., Dale, A.M., Fischl, B., Liu, A.K. and Tootell, R.B.H. (1999) The representation of illusory and real contours in human cortical visual areas revealed by functional magnetic resonance imaging. J. Neurosci., 19(19): 8560–8852.
Nakayama, K. (1985) Biological image motion processing: a review. Vision Res., 25: 625–660.
Nawrot, M., Rizzo, M., Rockland, K.S. and Howard, M. (2000) A transient deficit of motion perception in human. Vision Res., 40(24): 3435–3446.
Newsome, W.T., Britten, K.H., Salzman, C.D. and Movshon, J.A. (1990) Neuronal mechanisms of motion perception. Cold Spring Harb. Symp. Quant. Biol., 55: 697–705.
Newsome, W.T. and Paré, E.B. (1988) Relation of cortical areas MT and MST to pursuit eye movements. II. Differentiation of retinal from extraretinal inputs. J. Neurosci., 8: 2201–2211.
Nishida, S.Y., Ledgeway, T. and Edwards, M. (1997) Dual multiple-scale processing for motion in the human visual system. Vision Res., 37: 2685–2698.
O'Keefe, L.P. and Movshon, J.A. (1996) First- and second-order motion processing in the superior temporal sulcus of the alert macaque. Soc. Neurosci. Abstr., 22: 716.
O'Keefe, L.P. and Movshon, J.A. (1998) Processing of first- and second-order motion signals by neurons in area MT of the macaque monkey. Vis. Neurosci., 15: 305–317.
Plant, G.T., Laxer, K.D., Barbaro, N.M., Schiffman, J.S. and Nakayama, K. (1993) Impaired visual motion perception in the contralateral hemifield following unilateral posterior cerebral lesions in humans. Brain, 116: 1303–1335.
Plant, G.T. and Nakayama, K. (1993) The characteristics of residual motion perception in the hemifield contralateral to lateral occipital lesions in humans. Brain, 116: 1337–1353.
Rademacher, J., Caviness, V.S., Steinmetz, H. and Galaburda, A.M. (1993) Topographical variation of the human primary cortices: implications for neuroimaging, brain mapping, and neurobiology. Cereb. Cortex, 3: 313–329.
Rademacher, J., Galaburda, A.M., Kennedy, D.N., Filipek, P.A. and Caviness, V.S., Jr. (1992) Human cerebral cortex: localization, parcellation, and morphometry with magnetic resonance imaging. J. Cogn. Neurosci., 4: 352–374.
Schofield, A.J. (2000) What does second-order vision see in an image? Perception, 29: 1071–1086.
Sekuler, R., Watamaniuk, S. and Blake, R. (2002) Stevens' Handbook of Experimental Psychology. In: Yantis, S. (Ed.), Sensation and Perception, Vol. 1. Wiley, New York, pp. 121–153.
Smith, A.T., Greenlee, M.W., Singh, K.D., Kraemer, F.M. and Hennig, J. (1998) The processing of first- and second-order motion in human visual cortex assessed by functional magnetic resonance imaging (fMRI). J. Neurosci., 18: 3816–3830.
Snowden, R.J. (1994) Motion processing in the primate cerebral cortex. In: Smith, A.T. and Snowden, R.J. (Eds.), Visual Detection of Motion. Academic Press, London.
Somers, D.C., Dale, A.M., Seiffert, A.E. and Tootell, R.B. (1999) Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc. Natl. Acad. Sci. USA, 96: 1663–1668.
Sunaert, S., van Hecke, P., Marchal, G. and Orban, G.A. (1999) Motion-responsive regions in the human brain. Exp. Brain Res., 127: 355–370.
Talairach, J. and Tournoux, P. (1988) Co-Planar Stereotaxic Atlas of the Human Brain. Thieme Medical Publishers, New York.
Taub, E., Victor, J.D. and Conte, M.M. (1997) Nonlinear processing in short-range motion. Vision Res., 37: 1459–1477.
Vaina, L. and Cowey, A. (1996) Impairment of the perception of second-order motion but not first-order motion in a patient with unilateral focal brain damage. Proc. R. Soc. Lond. B, 263: 1225–1232.
Vaina, L.M., Makris, N. and Cowey, A. (1996) The neuroanatomical damage producing selective deficits of first or second-order motion in stroke patients provides further evidence for separate mechanisms. NeuroImage, 3: 360.
Vaina, L.M., Makris, N., Kennedy, D. and Cowey, A. (1998) The selective impairment of the perception of first-order motion by unilateral cortical brain damage. Vis. Neurosci., 15: 333–348.
Vaina, L., Cowey, A. and Kennedy, D. (1999b) The neuroanatomical damage producing selective deficits to first or second-order motion in stroke patients. Human Brain Mapping, 7: 67–77.
Vaina, L.M., Soloviev, S., Bienfang, D.C. and Cowey, A. (2000) A lesion of cortical area V2 selectively impairs the perception of the direction of first-order visual motion. NeuroReport, 11: 1039–1044.
Vaina, L.M., Solomon, J., Chowdhury, S., Sinha, P. and Belliveau, J.W. (2001) Functional neuroanatomy of biological motion perception in humans. Proc. Natl. Acad. Sci. USA, 98(20): 11656–11661.
Van Oostende, S., Sunaert, S., Van Hecke, P., Marchal, G. and Orban, G.A. (1997) The kinetic occipital (KO) region in man: an fMRI study. Cereb. Cortex, 7: 690–701.
Wenderoth, P., Watson, J.D.G., Egan, G.F., Tochon-Danguy, H.J. and O'Keefe, G.J. (1999) Second-order components of moving plaids activate extrastriate cortex: a positron emission tomography study. NeuroImage, 15: 227–234.
Werkhoven, P., Sperling, G. and Chubb, C. (1993) The dimensionality of texture-defined motion: a single channel theory. Vision Res., 33: 463–486.
Worsley, K., Marrett, S., Neelin, P., Vandal, A., Friston, K. and Evans, A. (1996) A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping, 4: 58–73.
Worsley, K.J., Liao, C.H., Aston, J., Petre, V., Duncan, G.H., Morales, F. and Evans, A.C. (2002) A general statistical analysis for fMRI data. NeuroImage, 15: 1–15.
Zeki, S., Perry, R.J. and Bartels, A. (2003) The processing of kinetic contours in the brain. Cereb. Cortex, 13: 189–202.
Zhou, Y.-X. and Baker, C.L. (1993) A processing stream in mammalian visual cortex neurons for non-Fourier responses. Science, 261: 98–101.
CHAPTER 15
Reaching between obstacles in spatial neglect and visual extinction

A. David Milner* and Robert D. McIntosh

Cognitive Neuroscience Research Unit, Wolfson Research Institute, University of Durham, Queen's Campus, University Boulevard, Stockton-on-Tees TS17 6BH, UK
Abstract: The aim of the present studies was to investigate whether ‘perception’ and ‘visually guided action’ could be dissociated with regard to two different aspects of the neglect syndrome. In the first study we tested a group of patients with neglect in two tasks, both within the same experimental setting. One task was to bisect a space between two objects, while the other required subjects to reach between the same pair of objects en route to a target area, so that the objects became potential obstacles to the reach. In the second study we tested a patient with visual extinction to double simultaneous stimulation, using a similar reaching task. Our aim was to determine whether visual awareness of obstacles in the workspace was necessary for successful navigation. In both studies we found evidence that reaching responses took normal account of the presence and location of obstacles on the left side, despite the tendency to neglect such left-sided information in more explicit perceptual tasks. We interpret both sets of results within a theoretical framework that identifies on-line visuomotor control with the occipito-parietal ‘dorsal stream’ (along with associated premotor and subcortical structures), and visual perception with the occipito-temporal ‘ventral stream’, plus associated temporo-parietal areas.
Introduction
The neglect syndrome is a complex and multi-faceted group of symptoms (Heilman et al., 2002), some of which hang together better than others. The brain lesions that cause neglect symptoms vary widely, though there is a region around the parieto-temporal junction, particularly in the right hemisphere in cases of left-sided neglect, that is included in the majority of cases (Vallar, 1993; Vallar et al., 1994; Karnath et al., 2001). Perceptual extinction to double simultaneous stimulation is traditionally included as one of the cardinal symptoms of the neglect syndrome (Heilman et al., 2002), though it frequently occurs in the absence of other signs of neglect, and several authors have argued that it may be causally independent (e.g. Milner, 1987, 1997).
Extinction can be characterized as a lateral bias of spatial attention occurring in the context of a reduced attentional capacity (e.g. Driver et al., 1997), whereby a stimulus on the ipsilesional side briefly attracts attention to the exclusion of simultaneous (or near-simultaneous) stimuli located in more contralesional locations. Other symptoms of neglect, such as rightward errors in line bisection, and left-sided deficits in visual search tasks, are generally regarded as more central to the essence of spatial neglect, though theories as to their causation range widely. The present chapter is concerned with exploring the phenomena of visual neglect and extinction from an unconventional starting point, by asking whether these conditions affect a person's ability to guide their hand to a desired location while avoiding intervening obstacles lying on the right and left of the workspace. The logic of this approach derives from the visual processing model set out
*Corresponding author. Tel.: +44-1642-333850; Fax: +44-1642-385866; E-mail:
[email protected] DOI: 10.1016/S0079-6123(03)14401-5
by Milner and Goodale (1995). They argued that a distinction should be drawn between visual processing for perception and visual processing for action. According to the model, the former is embodied in the occipito-temporal 'ventral stream' of cortical processing, while the latter is embodied in the occipito-parietal 'dorsal stream' (Glickstein and May, 1982; Ungerleider and Mishkin, 1982). Although all visual processing can ultimately be expressed in the form of overt behavior (and indeed that is why it exists), the dorsal stream is thought to interact directly with premotor cortex and brainstem structures to transform visual information into motor coordinates, while the ventral stream can only influence action indirectly. What the latter route loses in immediacy, of course, it gains in flexibility. We have suggested (Milner and Goodale, 1995; Milner, 1997; Milner and McIntosh, 2002) that the temporo-parietal region of the right hemisphere that is most heavily implicated in the causation of neglect probably functions in good part as a high-level representational system that is fed principally by visual inputs arising from the ventral stream. The region can perhaps be regarded as the endpoint of the perceptual processing pathway, where the ever-changing contents of our visual consciousness are represented, operating under the flexible control of attentional and executive systems located in superior parietal and prefrontal regions (Driver and Mattingley, 1998; Driver and Vuilleumier, 2001). Of course there is no doubt that overt behavior is affected by neglect, not just the patient's inner experience. Nonetheless, the hypothesis is that much of this abnormal behavior is driven by the indirect 'perceptual' route, rather than by the direct 'visuomotor' route. In other words, it is an expression of the distorted and disrupted perceptual experience of the patient rather than a direct result of a damaged visuomotor control system in the dorsal stream. Indeed, even some apparently pure motor manifestations of neglect, such as biased ocular exploration in darkness (e.g. Hornak, 1992; Karnath et al., 1998), might derive from a distorted internal representation of external space. A strong prediction of this hypothesis is that, in many cases of neglect, visual processing for direct action should be free from the pronounced perceptual biases that characterize the
syndrome. It is this proposal that we aimed to test in the present studies. It is necessary to distinguish this proposal of dissociated perceptual and visuomotor processing in neglect from the more familiar concept of a division between perceptual and motor contributions to neglect. Traditionally, there has been an assumption that the symptoms of neglect arise from spatial biases either in the processing of sensory input or in the programming of motor responses. A number of attempts have been made to distinguish between input- and output-related biases in neglect, though the validity of many of these studies has been questioned (see Mattingley and Driver, 1997, for a review). In contrast to this serial input–output distinction, however, the model of Milner and Goodale (1995) is concerned with a distinction between parallel visual processing systems, respectively underlying conscious perceptual awareness and automatic goal-directed actions. In relation to neglect symptoms, this model proposes that the expression of spatial biases in visual processing should depend upon the behavioral context in which the visual processing takes place. Specifically, the symptoms of neglect should be more often linked to the visual processing subserving perceptual awareness than to the processing underlying the guidance of automatic goal-directed actions. Whether or not the visual guidance of simple direct actions in neglect patients is subject to lateral spatial biases has been a matter of some debate. Goodale et al. (1990) and Jackson et al. (2000) reported rightwardly curved trajectories in the pointing movements of recovered left neglect patients, and Harvey et al. (1994) observed a similar effect in right-hemisphere-damaged patients without neglect. Such effects, however, are not ubiquitous. For instance, Chieffi et al. (1993) observed no abnormal curvature in the pointing movements of a recovered left neglect patient to isolated visual targets. More importantly, these phenomena have not been substantiated by direct tests of patients with full left neglect. Perenin (1997) failed to observe any directional skewing of visually guided (open-loop) pointing movements among four neglect patients. Similarly, Karnath and colleagues (1997) reported accurate open- and closed-loop visual guidance in five chronic neglect patients, both in terms of terminal
accuracy and hand trajectories, a result replicated recently by Harvey et al. (2002) in four neglect patients performing a simple grasping task. On the basis of such findings, Karnath et al. (1997) have argued that abnormally curved movement trajectories are not characteristic of patients with neglect, although they may be characteristic of optic ataxia (Perenin, 1997). The evidence for a systematic spatial bias in the visually guided actions of neglect patients is therefore not strong. However, at least two published studies have claimed to observe specifically visuomotor manifestations of neglect. Behrmann and Meegan (1998) required six patients with left visual neglect to reach for a target LED, presented alone or simultaneous with a distractor LED, with target and distractor distinguished by color. Like normal subjects, neglect patients were slower to initiate a response to any target that was accompanied by a distractor but, compared to controls, they had an increased RT cost for a distractor on the right and a reduced cost for a distractor on the left. Behrmann and Meegan concluded that information from the neglected side must be processed minimally, if at all, in the visuomotor domain. However, we suggest that it is misleading to portray this effect as visuomotor, since conscious perceptual discrimination of the target from any distractor would be a necessary prerequisite for selecting which stimulus to respond to in the task. Thus, the asymmetrical influence of left and right-sided distractors is most likely to reflect a difficulty in choosing the left stimulus in the presence of the irrelevant stimulus on the right, irrespective of what response has to be made to the target. An accentuated influence of right-sided distractors has also been reported for a recovered left neglect patient performing a grasping task (Chieffi et al., 1993). Again, however, it is unclear whether this reflects unbalanced visuomotor processing, or interference with the perceptual discrimination of the target from the distractor object. A more direct strategy to assess the normality of visuomotor processing in neglect is to compare performance between tasks where the stimuli are matched as closely as possible, but the mode of response differs. Several studies have already employed this general strategy to investigate reaching and grasping in neglect. Robertson et al. (1995, 1997)
found that neglect patients showed significantly less bisection error when asked to pick up a rod at its midpoint than when asked only to point to the rod’s midpoint. They argued that the pointing task reflected disordered perception in their patients, but that reaching to grasp was an action driven more directly by the dorsal than the ventral stream, thus enabling an improvement in bisection accuracy. In a similar vein, Pritchard et al. (1997) found that a neglect patient (E.C.) was able to calibrate her finger-thumb grip aperture accurately when reaching to grasp different sized cylinders, with no asymmetry in grip size between target locations on the two sides of space. Yet when asked to indicate her perceived size of the cylinders, she consistently underestimated them when they were located on her left as compared with her right side. Later studies with groups of neglect patients have replicated this symmetrical grasping behavior (Harvey et al., 2002; McIntosh et al., 2002), though there was no direct demonstration in those papers that the cylinders were perceived as having different sizes on the two sides of space. In the present paper we report a series of studies aimed at assessing the visuomotor processing underlying simple reaching movements in neglect. In the first study, we asked neglect patients to perform two tasks, both involving the presentation of two upright cylindrical stimuli, whose locations varied from trial to trial. In the ‘bisection’ task, the patient was asked to judge the midpoint between the two cylinders and to place their finger at that location. In the ‘reaching’ task, the patient was asked to move the hand rapidly from a start point to touch a ‘target zone’ located beyond the two objects. This second task was designed to be a simple act of reaching under direct visual control, with an implicit requirement that the reach needs to be executed so as to minimize the risk of collision with either cylinder. The conceptual similarity between these two tasks is that they both require the subject to take account of the location of the cylinders on the left and right simultaneously. In one situation, this demand is part of the explicit spatial analysis underlying a bisection response. In the other, it arises implicitly in computing the optimal spatial path for a visually guided reaching movement. Preliminary data on these tasks have been presented already (Milner and
McIntosh, 2002). We found good evidence in this experiment for a dissociation between the two tasks: most of the patients behaved like healthy controls in the reaching task, though they showed asymmetrical behavior when attempting to make explicit bisection responses. In other words, we found that most neglect patients were able to take a potential obstacle on the left as fully into account as one located on the right when reaching between them. However these results do not of course demonstrate that such obstacle avoidance can take place even when there is no conscious perception of the left obstacle. After all, there was no restriction of viewing time, and no independent evidence that the patients were unaware of the presence of the left-side object. We therefore conducted a single-case study in which we attempted to address this question. We tested a patient (V.E.) with persistent visual extinction who, under appropriate conditions, would often report seeing only the right stimulus when in fact a left stimulus was present as well. Again using a reaching task, we assessed whether his awareness or unawareness of the left object, when two objects were present, affected the trajectories taken by his hand. Normal subjects vary their reaches systematically according to the presence of a left-alone, right-alone, or bilateral pair of objects. Our question was whether V.E. would show these same phenomena, even when he did not ‘see’ the obstacle on the left — or would his reaches on such ‘unaware’ trials be more similar to those he made when there literally was only one object, located on the right? Our data indicate that a failure to ‘see’ an obstacle, due to visual extinction, does not necessarily compromise the ability to process the location of that obstacle in order to avoid it when reaching.
Experiment 1: Reaching and bisection in neglect Twelve patients with left visual neglect following unilateral right hemisphere stroke, and ten age-matched controls, took part in this experiment. All patients displayed neglect on three or more of five standard diagnostic tests. Full details of the patients can be found in McIntosh et al. (submitted). Subjects were seated in front of a 60 cm square white stimulus board depicted in Fig. 1, with the right index finger
placed at the start position. Two dark gray cylinders (24.5 cm tall and 3.5 cm in diameter) could be fixed into the board, one to either side of the midline, at a depth of 25 cm with respect to the start position, and 20 cm in front of a 5 cm deep gray strip that spanned the far edge of the board. Each cylinder could occupy one of two possible locations, with its inside edge either 8 cm or 12 cm away from the midline. The factorial combination of these four locations thus created four stimulus configurations. Each patient performed two different tasks on this stimulus board, with the order of tasks alternating between subjects within each group. All responses were recorded by sampling the position of a magnetic marker attached to the nail of the right index finger, at a frequency of 86.1 Hz (Minibird, Ascension Technology Ltd.). Full
Fig. 1. Plan view of the apparatus used in Experiment 1. Cylinders were always presented in pairs, one on the left and one on the right of the midline. The open circles show their possible locations, with inside edges 8 and 12 cm from the midline. Each trial began with the subject’s right index finger resting on the start position (black dot). In the bisection task, the subject was requested to place their finger midway between the two cylinders, and the dependent measure was the lateral position of the response with respect to middle of the stimulus board. In the reaching task, the subject was instructed to reach out and touch the gray strip as fast as possible. The dependent measure was the lateral position of the right index finger as it crossed the virtual line joining the two cylinder locations (dotted line).
methodological details can be found elsewhere (McIntosh et al., submitted). In the bisection task, the subject was asked to place their right index finger midway between the two cylinders. The subject was told that this was a test of ‘accuracy of judgment’, and that an unlimited time was available for each response. On each trial, the subject was allowed to adjust their finger position until satisfied that it was exactly midway between the cylinders. The position of the marker attached to the index finger was then sampled for 1 s. The dependent measure was the average lateral position (P) of the finger marker, with respect to the midline of the stimulus board, during this 1-s period. Each subject made 48 bisection responses, with each of the four stimulus configurations presented 12 times in a fixed pseudo-random order. In the reaching task, the subject received the instruction ‘On the go signal, reach out and touch the gray strip as quickly as you can’, and was told that this was a test of ‘speed of movement’. Prior to the task, they were informed that, whenever a cylinder was present, there would be one on the left and one on the right, and that they should pass their hand between the two cylinders, rather than around the outside edge of the board. The presence of the cylinders was not otherwise mentioned. Each reaching response was recorded in full, and the dependent measure was the lateral position (P) of the index finger marker as it crossed the virtual line joining the two cylinder locations (the exact value of P was
estimated by linear interpolation). Each subject made 60 reaches, 12 for each of the four cylinder configurations, and 12 in which no cylinder was present on the board, in a fixed pseudo-random order. The 12 trials with no cylinder in place were included to check for any systematic reaching biases when the response was not constrained by potential obstacles. An independent t-test (corrected for unequal variances) performed on these responses alone found no difference between the groups [t(13) = 0.59, P = 0.57], with both groups passing on average slightly to the left of the board midline. This constant bias is unsurprising, since the marker from which responses were recorded was attached to the right index finger and was thus on the left side of the hand in its palm-down reaching posture. These no-cylinder trials were excluded from all further analyses. Figure 2 shows the group mean responses for each stimulus configuration in each task. A repeated measures ANOVA was performed with the within-subjects factors of task (bisection, reaching), left cylinder location (near, far) and right cylinder location (near, far), and the between-subjects factor of group (control, neglect). Overall, the neglect group responded slightly further rightwards than the controls in the bisection task, and slightly further leftwards than controls in the reaching task, as reflected in a significant interaction of task by group [F(1, 20) = 6.67, P = 0.02]. Due to the presence of this interaction, and a significant three-way interaction of task by group by left cylinder location [F(1, 20) =
Fig. 2. Experiment 1: Mean responses in the bisection task (left) and the reaching task (right). The large gray circles depict the stimulus cylinders.
20.03, P < 0.001], subsequent ANOVAs were conducted for each task separately. For the bisection task, the factor of subject group was not significant [F(1, 20) = 1.39, P = 0.25]. Thus, although the neglect group bisected slightly further rightwards than controls, this tendency was not reliable at the group level. This was to be expected from previous evidence that the typical rightward line bisection errors of neglect patients tend to be reduced or eliminated when gaps are presented instead of lines (e.g. Bisiach et al., 1996; McIntosh et al., in press). However, the mere lack of an overall rightward bias amongst neglect patients does not imply that their bisection behavior was normal in terms of its dependence on the locations of the left and the right cylinders. Although the main effects of the left cylinder location [F(1, 20) = 224.42, P < 0.001] and the right cylinder location [F(1, 20) = 351.63, P < 0.001] were robust, a significant interaction of left cylinder location by subject group [F(1, 20) = 55.82, P < 0.001] indicated that the responses of neglect patients were far less influenced by the location of the left cylinder than were the responses of the control subjects. In Fig. 2, this effect is reflected in the conspicuous differences between the groups for stimulus configurations B and D of the bisection task. The control subjects moved their bisection point appropriately (by about 20 mm) when the left cylinder was moved from the near to the far location, but the neglect patients were less affected by this change. In the reaching task, by contrast, both the neglect patients and the controls shifted their reaching trajectories symmetrically when either cylinder moved between near and far locations (Fig. 2, right). Accordingly, the analysis of the reaching data found robust main effects of left cylinder location [F(1, 20) = 78.68, P < 0.001] and right cylinder location [F(1, 20) = 131.36, P < 0.001], and these effects did not interact with the factor of subject group. In summary, the reaching responses of both groups were sensitive to the locations of the cylinders on either side of space, but the bisection responses of the neglect group were specifically insensitive to the location of the left cylinder. This pattern can be most clearly appreciated if we visualize the data in terms of the ‘weights’ given to the left and right cylinders in determining the responses. As we have described
elsewhere (McIntosh et al., in preparation), it is possible to analyze bisection data in a way that emphasizes the dependence of the response position (P) on the locations of the left endpoint (L) and the right endpoint (R), provided that these endpoint locations have been manipulated independently. Whenever L or R changes by some amount (40 mm in this experiment), P will also change by some amount: dPL and dPR, respectively. The values of dPL and dPR can be considered to reflect the ‘weight’ given to either endpoint in determining the response. In a series of line bisection experiments, we have found that neglect patients invariably accord a lower weight to the left endpoint (dPL) than to the right endpoint (dPR); this may be true even for patients who do not produce mean rightward bisection errors. Figure 3 represents the data from Fig. 2 in terms of the mean weights, dPL and dPR, given to the left and right cylinders (i.e. how much the response shifts as a function of 40 mm shift of one or the other cylinder). As the left half of the figure illustrates, the control subjects were equally attentive to the location of the left and the right cylinders in making their bisection responses, producing similar values of dPR and dPL (approximately 20 mm). In other words, the controls adjusted their bisection responses by approximately half of the distance by which the cylinders shifted. The neglect patients, however, had values of dPL that were dramatically lower than those of dPR, just as we have found in comparable analyses of line bisection data (McIntosh et al., in preparation). This effect was present in all twelve patients (see Fig. 4), despite the fact that five patients had mean bisection errors that were leftward or zero. A very different picture is apparent from the right half of Fig. 3, where the results for the reaching task are presented. The values of dPR and dPL are more similar across the neglect group as a whole (with two notable exceptions: see Fig. 4), indicating that the point of transection was more evenly determined by the locations of the cylinders on the left and right. It thus appears that task demands may greatly modulate the expression of neglect: the patients take normal account of the location of the left cylinder when making a speeded reaching movement, but fail to do so when making an explicit bisection response.
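To make the weighting analysis concrete, a minimal sketch of how dPL and dPR might be computed from per-trial data is given below. This is an illustration only, not the analysis code used in the study; the trial format, the field ordering and the use of the 40 mm near-to-far separation are assumptions based on the design described above.

```python
# Minimal sketch of the endpoint-weighting analysis described in the text.
# Assumes each trial is recorded as (left_location, right_location, P), where
# locations are 'near' or 'far' and P is the lateral response position in mm
# relative to the board midline (rightward positive). Hypothetical data layout.

from statistics import mean

def endpoint_weights(trials):
    """Return (dPL, dPR): the mean shift in response P produced by moving
    the left or the right cylinder from its near to its far location."""
    def mean_P(side, loc):
        idx = 0 if side == 'left' else 1
        return mean(t[2] for t in trials if t[idx] == loc)

    # Moving the LEFT cylinder outward (near -> far, i.e. further left) should
    # pull the response leftward; the magnitude of that shift is its weight.
    dPL = abs(mean_P('left', 'far') - mean_P('left', 'near'))
    # Moving the RIGHT cylinder outward (near -> far, i.e. further right) should
    # push the response rightward.
    dPR = abs(mean_P('right', 'far') - mean_P('right', 'near'))
    return dPL, dPR

# Example with fabricated numbers: a subject who weights both endpoints fully
# would show dPL and dPR of about 20 mm each for a 40 mm shift of either cylinder.
trials = [('near', 'near', 0.0), ('far', 'near', -20.0),
          ('near', 'far', 20.0), ('far', 'far', 0.0)]
dPL, dPR = endpoint_weights(trials)
asymmetry = dPR - dPL  # positive values indicate under-weighting of the left cylinder
print(dPL, dPR, asymmetry)
```

On this scheme, the asymmetry score (dPR − dPL) plotted in Fig. 4 is simply the difference between the two weights, computed separately for the bisection and reaching tasks.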
Fig. 3. Experiment 1: The mean change in response induced by a 40 mm shift in the location of the left cylinder (dPL) and right cylinder (dPR) in each task.
Fig. 4. Experiment 1: A plot comparing the asymmetries found for each subject in the influence (weighting) of the two cylinders (dPR − dPL), between the reaching task and the bisection task.
Discussion of Experiment 1 In this experiment, the reaching responses of both groups were similarly sensitive to the locations of the cylinders on either side of space, but the bisection responses of the neglect group were peculiarly insensitive to the location of the left cylinder. We
have concluded that neglect patients make normal use of spatial information from the left side in programming a fast reaching response, information that they fail to take into account when making a more explicit bisection judgment. However, this broad conclusion is subject to a caveat. Figure 3 shows that even in the reaching task there is a qualitative (though nonsignificant) trend toward a reduced influence of the left cylinder relative to the right cylinder. This seems to result from the fact that while most patients show the group dissociation between the two tasks, some do not. This individual variation within the neglect group is illustrated in Fig. 4, in which the magnitudes of asymmetry between the weightings of the left and right cylinders (dPR − dPL) for the reaching and bisection tasks are co-plotted as a scattergram. All neglect patients have a positive asymmetry in the bisection task, giving a higher weight to the right cylinder than to the left, and there is no overlap with the control group. In the reaching task, there is considerable overlap between groups, but there are nonetheless two clear instances of an abnormally positive asymmetry within the neglect group. Indeed, the group trend toward a reduced influence for the left cylinder in the reaching task (Fig. 3) is driven very substantially by the behavior of the two patients in the upper right-hand corner of Fig. 4. These two patients show a comparably strong positive asymmetry, reflecting a reduced influence of the left
cylinder, in both tasks. In these two patients at least, there is no dissociation between neglect for the reaching and bisection tasks. It may be that their similar values of (dPR − dPL) on the two tasks reflect the same underlying deficit, causing a reduced sensitivity to the left cylinder location in both tasks. However, it is also possible that the lesions were simply more extensive in these two patients, causing a disruption of mechanisms of visuomotor attention as well as those of perceptual integration (see General Discussion). Certainly these were two of the most severe cases of neglect in our group, as indicated by screening tasks. Excluding these two patients, Experiment 1 provides a clear example of a dissociation between perception and action, in that all of our other patients took good account of the location of the left-sided cylinder during a reaching task, but failed to do so when making a bisection judgment. Of course, the bisection judgment in itself required the patient to make an action, but this was a very different kind of action from reaching between the cylinders to a location beyond them. The bisection response was an act of communication: a form of ostensive behavior, whereby the patient was telling us what he or she perceived to be the midpoint of the space between the two cylinders on each trial. In contrast, the reaching task required the patient only to avoid colliding with the cylinders while moving the hand between and beyond them. Such successful obstacle avoidance clearly required taking into account the location of both cylinders, and this was almost universally achieved by the patients, as shown by their high dPL values in the reaching task. Yet, of course, our data do not allow us to claim that our neglect patients were necessarily unaware of the left-sided cylinder during the reaching and bisection tasks. In fact, it is clear that even during the bisection task, most of them did pay some attention to it, as shown by the fact that their values of dPL were generally greater than zero (though always smaller than their dPR values). This is not too surprising, since there was no restriction of the time available for subjects to view the stimulus array, and no attempt to restrict their eye movements. We sought clearer evidence on this specific question of the role of visual awareness in obstacle avoidance by studying a patient with visual extinction to double simultaneous stimulation.
Experiment 2: Obstacle avoidance in visual extinction We have now undertaken a series of studies of a 75-year-old patient, V.E., who experiences left-sided visual extinction, due to a predominantly parietotemporal infarct in the right hemisphere (based on CT and MRI scans). He shows no visual field defect on Tu¨bingen perimetry, and no hemispatial neglect (though neglect had been apparent acutely following his stroke, some 12 months prior to our study). To assess the influence of extinguished stimuli on V.E.’s visually guided actions, we have taken advantage of the normal tendency for reaching movements to veer away from potential obstacles in the workspace (Tresilian, 1998). Normal reaches tend to swerve leftwards away from an obstacle on the right, rightwards away from an obstacle on the left, and the bilateral presence of obstacles induces intermediate reaching trajectories. Our question was whether V.E. would also produce such intermediate reaching trajectories when obstacles were present bilaterally, and whether this ability would depend on an explicit awareness of both obstacles. To address this question, we recorded V.E.’s reaching responses in the presence of obstacles, whilst collecting a verbal report of his awareness of the obstacles on each trial. By time-limiting V.E.’s view of the stimulus array, we were able to ensure that he correctly reported a single object located on either side of space, but failed to report the left object (i.e. showed extinction) on about half the trials in which both objects were present. We reasoned that if extinguished stimuli are able to influence his reaches, then he should make intermediate reaches irrespective of whether he reports seeing the left obstacle. If, however, extinguished stimuli do not influence his reaches then, whenever he fails to report an obstacle on the left, his trajectories should show a typical leftward swerve as if there was only an obstacle present on the right. Experiment 2.1 employed a simple stimulus setup in which V.E. had to reach and touch a target zone in the presence of potential obstacles, which in this case were thin poles, 15 cm high. There were two obstacle locations, one on either side of the midline, allowing room for the hand to pass through, but making it necessary to pay full regard to any poles present in order to make a smooth and uninterrupted reach
(see Fig. 5). V.E. was required to fixate centrally at the beginning of each trial and, following stimulus exposure, to reach out and touch the target zone as rapidly as possible and then to report verbally any poles that he had seen. A pair of liquid crystal glasses limited stimulus exposure to 500 ms. Poles were presented bilaterally or unilaterally, with bilateral trials twice as frequent as left-unilateral or right-unilateral trials. Trials cycled through a pseudo-random schedule until at least 20 bilateral trials had been collected in which V.E. reported both poles, and at least 20 in which he reported only the right pole. Since V.E. never initiated his reaching movements sooner than 835 ms after stimulus onset, he executed all reaches without visual feedback (visually ‘open-loop’). Reaches were recorded by sampling the position of a marker attached to the right index finger at a frequency of 86.1 Hz, again using the Minibird motion analysis system. Full methodological
Fig. 5. Experiment 2.1: The upper panel shows a schematic diagram of the experimental set-up. V.E. fixated the flag (F) at the start of each trial. Following stimulus exposure, he reached out rapidly to touch the target zone and then reported any poles that he had seen. The lower panel shows spatially averaged trajectories in each condition (dotted lines indicate standard errors), where the zero lateral coordinate is aligned with V.E.’s mid-sagittal axis.
details can be found elsewhere (McIntosh et al., in press). As Fig. 5 illustrates, the spatial path of the hand shifted according to the poles that were present, reflecting a strategic minimization of the risk of collision. The overall influence of the right pole was more pronounced than that of the left pole. This pattern is common in normal subjects (McIntosh et al., in press) and presumably reflects the fact that, when responding with the right arm, objects on the right are more obstructive than those on the left. More interestingly, however, V.E.’s performance on the bilateral-pole trials was totally independent of his verbal report: there was no significant difference between the ‘extinction’ and ‘non-extinction’ reaching trajectories at the point where they crossed the line joining the two poles. A one-way ANOVA performed on the lateral displacement of the index finger at the point crossing the virtual line joining the two pole locations was highly significant [F(3, 85) = 140.79, P < 0.001]. Scheffe´ post hoc tests found reliable differences (P < 0.001) between all conditions except between the two sets of bilateral-pole trials (P = 0.73). In other words, V.E. avoided the obstacle on the left regardless of whether or not he reported its presence verbally. His reaching trajectories when he failed to report the left pole on bilateral-pole trials were not at all like those he made when there really was only an obstacle present on the right. This suggests that V.E. was implicitly processing the left obstacle’s presence and location: he denied awareness of the left pole on many trials, and yet his reaching behavior on those same trials showed that the presence and location of the left pole had been fully taken into account. Nonetheless, a lingering doubt remained. After all, V.E. did not give us his verbal report until after he had completed his reach on each trial. A skeptic might suggest that he somehow ‘forgot’ that he had seen the left obstacle, having first processed it for motor guidance. We therefore ran a second experiment (Experiment 2.2), in which we asked V.E. again to report what he saw on a trial-by-trial basis, but this time under two different conditions. In the ‘motor–verbal’ (MV) condition, he was asked (as before) to ‘touch the gray strip as quickly as possible, then report which poles you saw’. In the ‘verbal–motor’ (VM) condition, however, he was asked to
Fig. 6. Experiment 2.2: V.E.’s spatially averaged trajectories in the motor–verbal (MV) and verbal–motor (VM) tasks (dotted lines indicate standard errors).
‘shout out which poles you see as quickly as possible, then touch the gray strip’. Identical results were obtained in the two temporal order conditions: V.E.’s reaches took full account of any pole on the left, regardless of whether or not that pole was reported (Fig. 6). For the MV task, a one-way ANOVA performed on the lateral displacement of the index finger at the point crossing the virtual line joining the two pole locations was highly significant [F(3, 109) = 68.14, P < 0.001]. Scheffe´ post hoc tests found reliable differences (P < 0.001) between all conditions except between the two sets of bilateral trials (P = 0.90). A similar main effect was found for the VM task [F(3, 113) = 75.28, P < 0.001]. Scheffe´ post hoc tests again found reliable differences (P < 0.002) between all conditions bar the two sets of bilateral trials (P = 0.89). The MV result thus replicated Experiment 2.1, whilst the VM result confirmed that the dissociation between verbal awareness and reaching behavior depends on the two different response modes, and not on their order of occurrence. It is notable that there was now no tendency for V.E.’s trajectories on bilateral trials where he reported only the right pole to be shifted slightly toward those made when there really was only the right pole present. In fact the reverse was the case, suggesting that the trend in Experiment 2.1 (Fig. 5) was not a real one. In our view, this series of experiments establishes beyond reasonable doubt that V.E. was
able to use unconscious visual information just as effectively as conscious information in the guidance of his actions.
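For completeness, the sketch below illustrates how the reaching measure used throughout these experiments (the finger's lateral position where its path crosses the virtual line joining the two object locations) can be estimated by linear interpolation between successive marker samples. It is a minimal illustration under assumed conventions, not the authors' analysis code: samples are taken to be (x, y) pairs in mm, with x the lateral position, y the depth from the start position, and the obstacle line placed at an assumed depth of 250 mm.

```python
# Minimal sketch: estimate the lateral position at which a recorded reach
# crosses the virtual line joining the two obstacle locations.
# Each sample is (x, y): x = lateral position (mm, rightward positive),
# y = depth from the start position (mm). Hypothetical format and values.

def lateral_crossing(samples, line_depth_mm=250.0):
    """Linearly interpolate x at the depth of the obstacle line."""
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        if y0 <= line_depth_mm <= y1:      # first pair of samples straddling the line
            if y1 == y0:                   # degenerate case: no change in depth
                return (x0 + x1) / 2.0
            frac = (line_depth_mm - y0) / (y1 - y0)
            return x0 + frac * (x1 - x0)
    raise ValueError("trajectory never crossed the obstacle line")

# Example with a fabricated trajectory sampled at roughly 86 Hz:
trajectory = [(0.0, 0.0), (-2.0, 80.0), (-5.0, 180.0), (-6.5, 260.0), (-4.0, 340.0)]
print(lateral_crossing(trajectory))  # interpolated lateral position at 250 mm depth
```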
General discussion The results of our first study, documenting a dissociation between perception and action in neglect, agree nicely with those of Robertson et al. (1995), who contrasted picking up a rod at its midpoint with pointing to the rod’s midpoint. In effect we have provided evidence for the generality of this phenomenon by replacing the rod with an empty gap. At a general level, these dissociations between perception and action in neglect are reminiscent of the examples of preserved visuomotor control observed in the visual-form agnosic patient D.F., in the face of dramatically impaired perceptual experience (Goodale et al., 1991; Milner et al., 1991). Indeed, we have long been aware at an informal level that D.F. has no difficulty in negotiating her way through a cluttered room. More recently we have observed this preserved ability in a formal test situation in which D.F. was asked to reach out to grasp an object which was flanked by an irrelevant object on the left or right (McIntosh et al., 2000). She was consistently found to adjust her trajectories appropriately to take account of the obstacles.
This parallel with D.F. encourages us to construct a hypothesis for explaining our neglect data in terms similar to those we have used to understand D.F.’s pattern of visual abilities and disabilities. On the basis of the known neural properties of the ventral and dorsal visual processing streams in the primate cortex, Milner and Goodale (1995) proposed that D.F. had lost her ability to process form for perception through damage to the ventral stream, but was still able to use her residual dorsal stream structures to process form for the purpose of visuomotor control. This hypothesis has now been dramatically confirmed by recent functional and structural MRI studies (Culham, 2003, in press; James et al., submitted). These studies have revealed that the lateral occipital complex (LOC), generally believed to be equivalent to the inferotemporal area in the monkey’s ventral stream, has been destroyed bilaterally in D.F. The posterior parietal region has also suffered some degeneration, but an area lying anteriorly in D.F.’s intraparietal sulcus, believed to correspond to area AIP in the monkey’s dorsal stream, is still active during visual grasping, just as it is in healthy subjects. Of course the lesions that cause hemispatial neglect are different from those that are present in D.F., but they are also very different from those that cause visuomotor problems such as optic ataxia (Perenin 1997; Karnath et al., 2001). As we mentioned in the Introduction, the lesions in neglect are typically located inferiorly, in the right parietotemporal region, rather than in the superior parts of the parietal lobe. In contrast, it is in superior parietal regions, in and around the intraparietal sulcus, where lesions cause optic ataxia. This superior region thus, by analogy with the monkey data, appears to form a central part of the human homolog of the dorsal visual stream, a conclusion further supported by fMRI evidence (Culham and Kanwisher, 2001; Culham, 2003 in press). Since neglect is almost by definition a disorder of perceptual awareness, we have therefore argued that the inferior parts of the right parietal lobe (and superior parts of the right temporal lobe) probably relate more closely to the ventral than to the dorsal stream (Milner and Goodale, 1995). [Of course, it is not claimed that activity in the ventral stream circuits is sufficient for visual awareness; indeed there is
evidence that even in patients with extinction unconscious stimuli can still elicit activation — at low levels — within the ventral stream (Driver and Vuilleumier, 2001). Presumably it is through such sub-optimal ventral activation that aspects of object perception and identification can still occur in the absence of conscious awareness (see review by Merikle et al., 2001).] Our present data are consistent with this broad idea: we may assume that most of our neglect patients will have a relatively intact dorsal stream enabling them to avoid obstacles normally, just as D.F. appears to do. However, most of them have suffered damage to areas in the right inferior parietal lobule and/or superior temporal gyrus, which we believe embody a system concerned with constructing and manipulating mental representations of the spatial array (Milner, 1997; Driver and Mattingley, 1998; Driver and Vuilleumier, 2001). This damage evidently unbalances that system, causing the patients to attach reduced weightings to items on the contralesional side of the array when making perceptual judgments. Our related studies of patient V.E. have taken us further, by establishing a novel form of unconscious processing of extinguished visual stimuli. Although V.E. was unaware of the presence of the left object on many of the bilateral-object trials, his reaches on those trials were influenced by the extinguished object to just the same extent as on ‘aware’ bilateral trials. It is important to note that V.E.’s brain damage lies in the right inferior parietal lobe and right temporal lobe. This damage is consistent with our view that these areas play an important role in determining the contents of our perceptual awareness. The lesion does not encroach on the superior parietal region around the intraparietal sulcus, so we may assume that the human homologue of the dorsal visual stream is unscathed. Indeed it seems likely that V.E.’s implicit processing of obstacle location reflects the sparing of this superior region. But, of course, the superior parietal region is not only implicated in the visual guidance of movement. It also plays an important role in the control of visuospatial attention. This function seems to be most closely associated with systems that provide visual guidance for saccadic eye movements, particularly area LIP (Rizzolatti et al., 1994; Goldberg et al., 2002). This attentional control appears to operate
‘top-down’ by modulating activity in the inferior parietal region and occipito-temporal visual areas (Corbetta et al., 2000; Hopfinger et al., 2000; Kastner and Ungerleider, 2001; Yantis et al., 2002). Such a multiplicity of areas involved in visual attention makes it unsurprising that visual extinction can result from a range of different lesion sites. For example, although an early report by Posner et al. (1984) reported a stronger ‘extinction-like effect’ in the standard Posner spatial cuing paradigm at short inter-stimulus intervals in patients with superior rather than inferior parietal lesions, more recent evidence from Friedrich et al. (1998) indicates that lesions around the parieto-temporal junction are more crucial. These results make sense in the context of a modulatory pathway or pathways passing down from the intraparietal sulcal region to the inferior parietal and occipito-temporal regions. It is reasonable to suppose that the pathological imbalance of perceptual attention that constitutes extinction could result from damage at any point in these pathways. In the present context, where we have been at pains to distinguish between visual processing for perception and visual processing for action, it is important to note that the attentional imbalance that constitutes extinction is by definition one of attention for perception, whereby the ipsilesional stimulus of a pair consistently out-competes the contralesional stimulus for entry into awareness. As V.E.’s case shows, however, this can happen without a similar imbalance in visuomotor attention (Rizzolatti et al., 1994; Milner and Goodale, 1995), presumably by virtue of V.E.’s bilaterally intact superior parietal areas, including LIP. In contrast, if an extinction patient’s lesion were to include such superior parietal damage, then we would not expect visuomotor attention to escape unscathed. In other words, we would predict that the patient would not show the spared obstacle avoidance that V.E. shows so beautifully: instead, reaching trajectories on extinguished trials should look more like trials with only a unilateral right obstacle present, and not at all like unextinguished bilateral trials. Our evidence for a neurological dissociation between ‘visuomotor attention’ and ‘perceptual attention’ provides direct support for a distinction between these two concepts (e.g. Milner and Goodale, 1995: p. 190). Milner and Goodale
suggested specifically that these two varieties of visuospatial attention might take the form of modulations of neural activity within the dorsal and ventral visual streams, respectively. It is important to emphasize that there has never been any suggestion that these two kinds of attention would operate independently in most normal circumstances. Both functional MRI (Corbetta et al., 2000; Hopfinger et al., 2000; Kastner and Ungerleider, 2001; Yantis et al., 2002) and behavioral evidence (e.g. Schneider and Deubel, 2002) now support the idea that the two kinds of attention are normally closely coupled, and agree with Milner’s (1995) proposal that the dorsal stream takes the lead in co-ordinating the two. This would make good sense for most everyday situations. Our present findings indicate however that a ventrally located lesion can have a direct unbalancing effect on perceptual attention, without involving visuomotor attention at all. Interestingly, if our speculations are correct, the reverse should never occur: that is, it should not be possible to find a case of ‘visuomotor extinction’ in the absence of perceptual extinction.
Acknowledgments We thank all of our test subjects for their patience and co-operation, and also several colleagues whose collaboration has been central to the work described here, especially Chris Dijkerman, Kevin McClements, Tim Cassidy and Igor Schindler. We also gratefully acknowledge the financial support of the Wellcome Trust (grant no. 052443), the Leverhulme Trust (grant no. F00128C), and the UK Medical Research Council (grant no. G0000680).
References Behrmann, M. and Meegan, D.V. (1998) Visuomotor processing in unilateral neglect. Consc. Cognit., 7: 381–409. Bisiach, E., Pizzamiglio, L., Nico, D. and Antonucci, G. (1996) Beyond unilateral neglect. Brain, 119: 851–857. Chieffi, S.M., Gentilucci, M., Allport, A., Sasso, E. and Rizzolatti, G (1993) Study of selective reaching and grasping in a patient with unilateral parietal lesion. Brain, 116: 1119–1137. Corbetta, M., Kincade, J.M., Ollinger, J.M., McAvoy, M.P. and Shulman, G.L. (2000) Voluntary orienting is dissociated
225 from target detection in human posterior parietal cortex. Nature Neurosci., 3: 292–297. Culham, J.C. and Kanwisher, N.G. (2001) Neuroimaging of cognitive functions in human parietal cortex. Curr. Opin. Neurobiol., 11: 157–163. Culham, J.C. (2003) Neuroimaging investigations of visuallyguided grasping. In: Kanwisher, N. and Duncan, J. (Eds.), Attention and Performance XX: Functional Brain Imaging of Human Cognition. Oxford University Press, Oxford. Driver, J., Mattingley, J.B., Rorden, C. and Davis, G. (1997) Extinction as a paradigm measure of attentional bias and restricted capacity following brain injury. In: Thier, P. and Karnath, H.-O. (Eds.), Parietal Lobe Contributions to Orientation in 3D Space. Springer-Verlag, Heidelberg, pp. 401–429. Driver, J. and Mattingley, J.B. (1998) Parietal neglect and visual awareness. Nature Neurosci., 1: 17–22. Driver, J. and Vuilleumier, P. (2001) Perceptual awareness and its loss in unilateral neglect and extinction. Cognition, 79: 39–88. Friedrich, F.J., Egly, R., Rafal, R.D. and Beck, D. (1998) Spatial attention deficits in humans: a comparison of superior parietal and temporal-parietal junction lesions. Neuropsychology, 12: 193–207. Glickstein, M. and May, J.G. (1982) Visual control of movement: the circuits which link visual to motor areas of the brain with special reference to the visual input to the pons and cerebellum. In: Neff, W.D. (Ed.), Contributions to Sensory Physiology, Vol. 7. Academic Press, New York, pp. 103–145. Goldberg, M.E., Bisley, J., Powell, K.D., Gottlieb, J. and Kusunoki, M. (2002) The role of the lateral intraparietal area of the monkey in the generation of saccades and visuospatial attention. Ann. N.Y. Acad. Sci., 956: 205–215. Goodale, M.A., Milner, A.D., Jakobson, L.S. and Carey, D.P. (1990) Kinematic analysis of limb movements in neuropsychological research: Subtle deficits and recovery of function. Canad. J. Psychol., 44: 180–195. Goodale, M.A., Milner, A.D., Jakobson, L.S. and Carey, D.P. (1991) A neurological dissociation between perceiving objects and grasping them. Nature, 349: 154–156. Harvey, M., Milner, A.D. and Roberts, R.C. (1994) Spatial bias in visually-guided reaching and bisection following right cerebral stroke. Cortex, 30: 343–350. Harvey, M., Jackson, S.R., Newport, R., Kra¨mer, T., Morris, D.L. and Dow, L. (2002) Is grasping impaired in hemispatial neglect? Behav. Neurol., 13: 17–28. Heilman, K.M., Watson, R.T. and Valenstein, E. (2002) Spatial neglect. In: Karnath, H.-O., Milner, A.D. and Vallar, G. (Eds.), Cognitive and Neural Bases of Spatial Neglect. Oxford University Press, Oxford, pp. 3–30. Hopfinger, J.B., Buonocore, M.H. and Mangun, G.R. (2000) The neural mechanisms of top-down attentional control. Nature Neurosci., 3: 284–291.
Hornak, J. (1992) Ocular exploration in the dark by patients with visual neglect. Neuropsychologia, 30: 547–552. Jackson, S.R., Newport, R., Husain, M., Harvey, M. and Hindle, J.V. (2000) Reaching movements may reveal the distorted topography of spatial representations after neglect. Neuropsychologia, 38: 500–507. James, T.W., Culham, J., Humphrey, G.K., Milner, A.D., and Goodale, M.A. (in press) Ventral occipital lesions impair object recognition but not object-directed grasping: an fMRI study. Brain. Karnath, H.-O., Niemeier, M. and Dichgans, J. (1998) Space exploration in neglect. Brain, 121: 2357–2367. Karnath, H.-O., Dick, H. and Konczak, J. (1997) Kinematics of goal-directed arm movements in neglect: control of hand in space. Neuropsychologia, 35: 435–444. Karnath, H.-O., Ferber, S. and Himmelbach, M. (2001) Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411: 950–953. Kastner, S. and Ungerleider, L.G. (2001) The neural basis of biased competition in human visual cortex. Neuropsychologia, 39: 1263–1276. Mattingley, J.B. and Driver, J. (1997) Distinguishing sensory and motor deficits after parietal damage: an evaluation of response selection biases in unilateral neglect. In: Thier, P. and Karnath, H.-O. (Eds.), Parietal Lobe Contributions to Orientation in 3D Space. Springer-Verlag, Heidelberg, pp. 309–337. McIntosh, R.D., Dijkerman, H.C., Mon-Williams, M. and Milner, A.D. (2000) Visuomotor processing of spatial layout in visual form agnosia. Presented at Experimental Psychology Society, Cambridge. McIntosh, R.D., Pritchard, C.L., Dijkerman, H.C., Milner, A.D. and Roberts, R.C. (2002) Prehension and perception of size in left visuospatial neglect. Behav. Neurol., 13: 3–15. McIntosh, R.D., McClements, K.I., Dijkerman, H.C. and Milner, A.D. (in press) ‘Mind the gap’: the size-distance dissociation in visual neglect is a cueing effect. Cortex. McIntosh, R.D., McClements, K.I., Dijkerman, H.C. and Milner, A.D. (submitted) Preserved obstacle avoidance in patients with left visual neglect. McIntosh, R.D., McClements, K.I. and Milner, A.D. (in preparation) Weights and measures: a new look at line bisection behaviour in neglect. McIntosh, R.D., McClements, K.I., Schindler, I., Cassidy, T.P., Birchall, D. and Milner, A.D. (in press) Avoidance of obstacles in the absence of visual awareness. Proc. R. Soc. Lond. Merikle, P.M., Smilek, D. and Eastwood, J.D. (2001) Perception without awareness: perspectives from cognitive psychology. Cognition, 79: 115–134. Milner, A.D. (1987) Animal models for the syndrome of spatial neglect. In: Jeannerod, M. (Ed.), Neurophysiological and
226 Neuropsychological Aspects of Spatial Neglect. Elsevier, Amsterdam, pp. 259–288. Milner, A.D. (1995) Cerebral correlates of visual awareness. Neuropsychologia, 33: 1117–1130. Milner, A.D. (1997) Neglect, extinction, and the cortical streams of visual processing. In: Thier, P. and Karnath, H.-O. (Eds.), Parietal lobe contributions to orientation in 3D space. Springer-Verlag, Heidelberg, pp. 3–22. Milner, A.D. and Goodale, M.A. (1995) The Visual Brain in Action. Oxford University Press, Oxford. Milner, A.D. and McIntosh, R.D. (2002) Perceptual and visuomotor processing in spatial neglect. In: Karnath, H.O., Milner, A.D. and Vallar, G. (Eds.), Cognitive and Neural bases of Spatial Neglect. Oxford University Press, Oxford, pp. 153–166. Milner, A.D., Perrett, D.I., Johnston, R.S., Benson, P.J., Jordan, T.R., Heeley, D.W., Bettucci, D., Mortara, F., Mutani, R., Terazzi, E., Davidson, D.L.W. (1991) Perception and action in ‘visual form agnosia’. Brain, 114: 405–428. Perenin, M.-T. (1997) Optic ataxia and unilateral neglect: clinical evidence for dissociable spatial functions in posterior parietal cortex. In: Thier, P. and Karnath, H.-O. (Eds.), Parietal Lobe Contributions to Orientation in 3D Space. Springer-Verlag, Heidelberg, pp. 289–308. Posner, M.I., Walker, J.A., Friedrich, F.J. and Rafal, R.D. (1984) Effects of parietal lobe injury on covert orienting of attention. J. Neurosci., 4: 1863–1874. Pritchard, C.L., Milner, A.D., Dijkerman, H.C. and MacWalter, R.S. (1997) Visuospatial neglect: veridical coding of size for grasping but not for perception. Neurocase, 3: 437–443. Rizzolatti, G., Riggio, L. and Sheliga, B.M. (1994) Space and selective attention. In: Umilta`, C. and Moscovitch, M. (Eds.), Attention and Performance XV: Conscious and
Nonconscious Information Processing. MIT Press, Cambridge, MA, pp. 231–265. Robertson, I.H., Nico, D. and Hood, B. (1995) The intention to act improves unilateral left neglect: two demonstrations. NeuroReport, 7: 246–248. Robertson, I.H., Nico, D. and Hood, B.M. (1997) Believing what you feel: using proprioceptive feedback to reduce unilateral neglect. Neuropsychology, 11: 53–58. Schneider, W.X. and Deubel, H. (2002) Selection-for-perception and selection-for-spatial-motor-action are coupled by visual attention: a review of recent findings and new evidence from stimulus-driven saccade control. In: Prinz, W. and Hommel, B. (Eds.), Attention and Performance XIX: Common Mechanisms in Perception and Action. Oxford University Press, Oxford, pp. 609–627. Tresilian, J.R. (1998) Attention in action or obstruction of movement? A kinematic analysis of avoidance behaviour in prehension. Exp. Brain Res., 120: 352–368. Ungerleider, L.G. and Mishkin, M. (1982) Two cortical visual systems. In: Ingle, D.J., Goodale, M.A. and Mansfield, R.J.W. (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, MA, pp. 549–586. Vallar, G. (1993) The anatomical basis of spatial hemineglect in humans. In: Robertson, I.H. and Marshall, J.C. (Eds.), Unilateral Neglect: Clinical and Experimental Studies. Erlbaum, Hove, UK, pp. 27–59. Vallar, G., Rusconi, M.L., Bignamini, L., Geminiani, G. and Perani, D. (1994) Anatomical correlates of visual and tactile extinction in humans: a clinical CT scan study. J. Neurol. Neurosurg. Psychiat., 57: 464–470. Yantis, S., Schwarzbach, J., Serences, J.T., Carlson, R.L., Steinmetz, M.A., Pekar, J.J. and Courtney, S.M. (2002) Transient neural activity in human parietal cortex during spatial attention shifts. Nature Neurosci., 5: 995–1002.
SECTION IV
Blindsight and Visual Awareness
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 16
Roots of blindsight
L. Weiskrantz*
Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK
*Corresponding author. Tel.: +01865-271362; Fax: +01865-310447; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14401-6
Abstract: The chapter reviews the historical background to demonstrations that there is residual visual function in the total absence of striate cortex (V1) in monkey and humans. The late 19th century evidence by Munk and others, as reviewed by William James, was that this was not possible in humans, and doubtful at best in monkeys. It has gradually become realized, starting in the middle of the 20th century, that even total bilateral removal of striate cortex in monkeys does not abolish all visual capacity, including spatial and pattern vision. The situation regarding unilateral or incomplete bilateral lesions in the monkey did not become clarified until Cowey’s doctoral work in the 1960s, demonstrating that field defects were not absolute, that sensitivity continued to improve over several months of postoperative testing, that the size of the field defect gradually shrank, that the sensitivity was poorest at the center of the field defect, and that recovery was not spontaneous but depended on sustained practice. In human subjects with unilateral lesions from the 1970s onwards, using forced-choice methodology parallel to animal studies, a wide range of visual discriminations was demonstrated but with alterations or complete absence of acknowledged awareness by subjects (blindsight). Various varieties of skepticism are discussed and rebutted. The gap between humans and animals was diminished by the demonstration by Cowey and Stoerig that monkeys, like humans, classify responses to blind-field stimuli as being ‘unseen’. Further recent degrees of closure and developments in human blindsight research are discussed.
Background of monkey research
It is always risky trying to identify the very first seeds from which a plant’s roots have sprung. It is not clear when the concept first emerged, but ‘blindsight’ as a word in the English Language (‘‘a condition in which the sufferer responds to visual stimuli without consciously perceiving them’’. Oxford Concise Dictionary) first surfaced in 1973. One nurturing source, however, was the long history of animal research on occipital cortex, which had demonstrated that primates could carry out visual discriminations in the absence of visual cortex (see Weiskrantz, 1972, 1998). Admittedly there was a wide range of views, even extreme disagreements, and the answer was slow to emerge. Luciani (1884, p. 153), for example, had concluded that ‘‘some time . . . after their extirpation, [monkeys’] visual sensations become perfect again; they are able to see minute objects, what they want is the discernment of things . . . ; they are deficient, in a word, of visual perception’’, but his conclusions regarding the lesion were doubted by William James, without citing his evidence. James was a great believer in foreign sources, and so he referred (1886) to Luciani’s work only in the German, failing strangely to refer to an excellent English translation (by one of the editors of Brain) of the original Italian, which had appeared in that journal in 1884. James thereby ensured, unfortunately, that most of his English readers would take Luciani’s prescient view no further. Ferrier (1886) had concluded that the occipital lobes were completely dispensable for visual function (did not cause ‘the slightest appreciable impairment’), but it is almost certain that the lesions were incomplete. William James, on the other hand, cited Munk to the effect that bilateral occipital lesions in the
monkey caused permanent and total blindness. In fact, Munk allowed that ‘‘very gradually [the monkey’s] vision slightly improved so that he will not bump into things’’ (1881, translation by Von Bonin, 1960, p. 106) and, depending on the limits of the lesion, recovery could be much more complete. James also quotes Scha¨fer’s work (1888), at the time correctly, that blindness was permanent. But later Scha¨fer (1900) concluded ‘‘the blindness was not permanent, unless the lesion extended somewhat in advance of what is generally taken to be the limit of the lobe, on the inner and lower surface.’’ (1900, p. 753). James ended up hedging his bets; he suggested that occipital removal ‘‘makes the animal almost blind . . . a crude sensitivity to light . . . may remain’’ (the author’s italics), adding, if so, that ‘‘nothing exact is known about its nature or its seat’’ (1890, p. 63). But work in the first half of the 20th century established clearly that monkeys without striate cortex could respond to light. Thus, Marquis and Hilgard (1937) demonstrated that a conditioned eyelid response to light could be established in monkeys with complete striate cortex removal (and with histological confirmation of the lesion and complete degeneration of the lateral geniculate nucleus). Heinrich Klu¨ver’s work (Klu¨ver, 1927, 1942) was widely accepted as providing the best empirically-based description of the nature of the ‘crude sensitivity’ that James allowed as a possibility. Klu¨ver concluded that in the absence of striate cortex monkeys (with rather extensive lesions exceeding the limits of striate cortex) could only discriminate the total amount of luminous energy and not its pattern or distribution. Weiskrantz’s study (1963, 1972) suggested an extension to an integration of all retinal ganglionic activity, not just for luminous flux, so as to include total contour length and movement. Subsequent work, however, made it clear that the capacity was richer than could be accounted for by mere summation of activity. For example, in their seminal research the Pasiks and colleagues (Pasik and Pasik, 1971, 1982; Schilder et al., 1972) demonstrated pattern discrimination and brightness discrimination, both with total flux equated, in monkeys with total removal. They also measured the visual acuity and orientation discrimination thresholds, demonstrating good, but reduced capacity.
Humphrey (Humphrey and Weiskrantz, 1967; Humphrey, 1970, 1974) taught monkeys to respond to a variety of ‘salient’ visual events, demonstrating this simply and elegantly by requiring the animal merely to reach out and touch the source of a visual event. He found that the animals could respond to small objects and to their location. This could not have been explained by random eye movements to intact regions (if any) of the visual field because it was later demonstrated that monkeys with striate cortex removal can locate stimuli randomly located in space, and do so with durations too brief to allow an eye movement to be executed before the stimulus disappeared (Weiskrantz et al., 1977). Humphrey’s and others’ evidence, in turn, could be linked to the original ‘two-visual system theory’, in which cortex was postulated to mediate visual identity, and the subcortex to mediate the detection of visual events (Ingle, 1967; Schneider, 1967; Trevarthen, 1968). It was influential at the time, and still deserves to be. If the situation regarding total striate lesions in the monkey historically was slow to gel, it was even more uncertain concerning subtotal lesions — which are the directly important comparison with human clinical cases, on which the evidence for human blindsight eventually was based. Bilateral and total lesions in clinical patients in the absence of lesions well outside of visual cortex are rare. Human subjects with unilateral damage, both in clinical examination and in everyday life, demonstrate a region of blindness in the visual field, or at least some part of it, that is rendered defective by a V1 lesion, and well-known maps in textbooks plot the correspondence between lesion and field defect. The measurement of the field defect clinically, visual perimetry, is based on the subject’s responses to the presence of a light presented systematically over the whole visual field. The instruction is simply ‘tell me (or press a key) when (or whether) you see a light’. Whether it yields a fair representation of the human visual defect is another matter to which the author will return later. But if monkeys, in seeming contrast, possess some residual function even with total removal, it is not clear what would, or should, result from a subtotal lesion. If there was a corresponding defect in a specific region of the monkey’s visual field, would it be an absolute hole, would it be fuzzy at the edges, or indeed would there be any hole at all
(Weiskrantz, 1961)? The answer is difficult to reach just by looking at the animals because, apart from some transitory misreaching for food, the animals appeared to be quite normal (Cowey and Weiskrantz, 1963; Weiskrantz, 1972).
Cowey’s monkey perimetry results To have a definite answer one must have good evidence as to where the animal is looking at the crucial moment when visual capacity is measured objectively. The bare minimum one wants for perimetric charts in the monkey is reliable evidence about the detection of a visual event as a function of its position in the animal’s visual field. This had never been obtained for the monkey before 1960. Indeed, the eminent physiologist, John Fulton, declared in his influential neurophysiology textbook (1949, p. 345) that perimetry in the animal is ‘virtually impossible’. Earlier, Settlage (1939, p. 106) had commented that ‘‘there is no known method of perimetric examination of the monkey’s visual field’’. That is how matters stood when Alan Cowey started his doctoral research in the late 1950s. In fact, the effort started even earlier, while he was still an undergraduate. At Cambridge we had a kind of research project auction in which ideas were thrown out to the assembled Part II Psychology students for their undergraduate research projects. Richard Gregory and the author had earlier discussed the eye position issue together in relation to visual cortical lesions, and Gregory made the clever suggestion of using highlights on the eye to assess eye position — the highlights have very useful optical properties. And so the author tossed this idea out for grabs at the student meeting, and Alan took it up and perfected it. He proceeded to demonstrate that photographs of highlights could be used to assess eye position in human subjects reliably and with an error of no more than 2.2°, and even lateral head movements (of about 1.2 eye diameters) could be tolerated. And so the necessary groundwork was done. In his subsequent doctoral research, completed in 1961, Cowey applied the measurement technique to monkeys with visual cortex lesions. The animals’ differential behavioral responses to random sequences of lights or blanks — differential key
presses — were recorded. The method allowed him to tell where the animal was looking (typically, gazing fixedly at a small mirror in which the animal could see the reflection of its own eye), and hence performance could be related to eye position and thus to the position of the stimulus in the visual field. It was no mean task — it was, in fact, heroic. Training and analysis took up to 100,000 trials per animal, but the results were clear (Cowey, 1961, 1962, 1963, 1967; Cowey and Weiskrantz, 1963): local impairments were found approximately where they would be predicted from known primate anatomy and electrophysiology, for example that of Talbot and Marshall (1941) and Daniel and Whitteridge (1961). And so on this point Ferrier and others were wrong. But the main conclusion had to be qualified in several very important ways. The defect was by no means an absolute one; in fact, it was only when the intensity was reduced by several log units below the training level that the defect could be defined reliably. Secondly, the ability of every animal improved markedly and radically during several months of postoperative testing. In fact, it was necessary to continue to reduce the flash intensity further and successively in order for the defect to be revealed. Thirdly, there was a gradual shrinkage in the size of the defective region over the same period. And so on these two points Munk was right and James was wrong. Fourthly, thresholds were not distributed uniformly, but were highest in the center of the defective region of visual field and lowest at the edges. Fifthly, the gradual improvement probably did not occur spontaneously, because an animal who was not tested for two years postoperatively showed an unchanged picture on retesting, although it did then show the typical gradual improvement with subsequent retesting. Two tests conclusively and painstakingly ruled out that the results were due to artifacts based on stray light (Cowey, 1962, 1963). (For reviews, see Weiskrantz and Cowey, 1970; Weiskrantz, 1972). The landmark findings of Cowey’s still stand, unchallenged. They proved that the monkey’s visual defect is not absolute, and that its size and sensitivity need not be permanent. Clear and reliable quantitative determinations of the detection thresholds within and across the field defect were obtained, both before and after intervals of time and after
practice. The benefits of practice were shown in an elegant study by Mohler and Wurtz (1977), who demonstrated that practice in a portion of the monkey’s field defect can lead to a differential improvement in sensitivity in the practiced region relative to the unpracticed region. Given that the monkey’s field defects are not absolute, and human perimetry charts show regions of absolute blindness, it is not surprising that it continued to be widely assumed that there is a major evolutionary separation between the visual systems of these two species of primates, even though the neuroanatomy seems closely similar: in both species there are parallel pathways originating in the retina, projecting to noncortical targets that might allow visual function to be sustained in the absence of the cortical target. Thus Marquis (1935) and Monokow (1914), among others, appealed to a doctrine of encephalization of function, such that the higher the phylogenetic status of the animal, the more vision is dependent upon the visual cortex and its integrity, i.e., visual processes that must be presumed to be mediated subcortically in the monkey are elevated to an absolute dependence on the cortex in the human. That position, in some ways, carries some curious implications (see Weiskrantz, 1961, 1977), but the author just notes here that such an eccentric view has not altogether disappeared today.
Human research background
But, leaving perimetry aside for the moment, in other respects, just as with the monkey, opinions regarding the effects of V1 damage in humans were widely divided well into the middle of the 20th century. William James was in no doubt. ‘‘The literature is tedious ad libitum . . . The occipital lobes are indispensable for vision in man. Hemiopic disturbance comes from lesion of either one of them, and total blindness, sensorial as well as psychic, from destruction of both’’. (1890, p. 47). A similar conclusion was advanced by Gordon Holmes, the doyen of British neurologists during World War I, who held the view that ‘‘severe lesions of the visual cortex produce complete blindness’’, although he added that if the lesions were ‘incomplete’ an amblyopia could result in which objects appeared indistinct, although
moving objects might ‘excite sensations’ (1918, p. 384). (In other words, parenthetically, he suggested that Riddoch’s (1917) claim that some patients could see moving stimuli but not stationary ones was based on incomplete lesions). On the other hand, in striking contrast, Holmes’ neurological counterpart with the German forces, Walter Poppelreuter (1917; see 1990 translation), declared that he could never find an absolutely blind scotoma with occipital damage; some rudimentary function was always present, which was also the conclusion of Wilbrand and Saenger in 1917 in their study of several hundred cases. A similar conclusion was drawn by Teuber et al. (1960) in their influential monograph of World War II and Korean War brain-damaged soldiers. They reported that there was no permanent visual blindness in the field defects in any of their brain-damaged group of more than 46 patients, even though their perimetry charts showed absolute blind regions. They especially advocated the use of adjunct methods based on dark adaptation, tachistoscope presentation of forms, recognition of hidden figures, perception of apparent and real motion, and so forth as methods of increasing sensitivity. Teuber, however, later was more cautious about his colleague’s evidence — and also of Poppelreuter’s — because of the problem of diffusion of light into intact visual fields. And, of course, the issues were and are always complicated by the vagaries of the limits and complexity of lesion sites in clinical cases. And so, surprisingly, about the largest and the most intensely studied system in the brain, the visual system, by the mid-20th century there was still controversy regarding the limits and permanence of field defects caused by cortical damage in human subjects. Fundamentally, the situation had not advanced that much beyond the position at the end of the previous century. In some ways we were further ahead in characterizing the field defects of monkeys, in relation to the site of damage, than we were for human subjects. Whatever the ultimate answer is about permanence and absoluteness, the point is that during clinical examination, and in everyday life, human subjects who have sustained a V1 lesion resolutely insist that they do not see the stimulus in a region of the visual field. But given the earlier results showing that monkeys with V1 lesions can respond to visual events
and discriminate them, and given that the human visual system has a similar anatomical organization to that of the monkey, why is the human field defect blind? Are we forced to conclude that there is a qualitatively distinct quantal evolutionary gap? There are two important differences in the comparison between human and monkey. Human perimetry charting is typically based on the subject reporting ‘yes’ (or pressing the ‘yes’ key) when a stimulus is detected. The monkey perimetry carried out by Cowey used two alternatives, essentially, ‘yes, I detect’, and ‘no, I do not’. Secondly, of course, the human subject is instructed and reports verbally (even if there is a response key, he or she would have been instructed as to its use verbally). And the verbal instruction would inevitably be of the form, ‘press, or say ‘yes,’ when you see a light.’ It would be possible to do monkey perimetry using only a single ‘yes’ key, but at the expense of large numbers of false positives and shifts in response criteria. But it is not possible to instruct the monkey verbally, and certainly not to instruct it to report its actual visual experience.
Human blindsight research with animal methodology
If one is comparing monkey and human, therefore, one must try to compare like with like methodologically. That means, first, dispensing with the luxury of asking human subjects what they see, but, second, requiring a discrimination along the lines of animal testing, using forced-choice methods. (An obsessional investigator would even dispense peanut rewards if the research grant would rise to such lavishness.) The first group, as far as the author is aware, who deliberately dispensed with a dependence on subjects’ verbal reports were Pöppel et al. (1973). Animal studies by Denny-Brown and Chambers (1955) and Humphrey (1974) had noted that monkeys with total visual cortical removal would direct their eyes to novel visual events. Pöppel et al. flashed a light briefly in different locations of the field defects caused by gun-shot wounds in war veterans, and encouraged them to look in the direction in which the flash had just occurred. The point of departure in their study was not that they recorded visual reflexes, in fact, they did not, but that they asked their subjects
voluntarily to look in the direction in which the flash occurred. In other words, they engaged an instrumental response, an operant response, if you will. The instruction caused some considerable puzzlement in their subjects, because they could not actually ‘see’ the flash, but with encouragement they ‘played the game’. There was a weak but significant positive correlation between the original target position and the position taken up by the eye, at least for eccentricities out to about 25°. Shortly afterwards a patient, DB, was seen at the National Hospital in London who had undergone surgical removal of a tumor from the right calcarine fissure, and who appeared to be able to locate events in his blind hemifield better than one might have expected from clinical perimetry. Elizabeth Warrington and the author, and colleagues, confirmed the eye movement result of Pöppel et al., but then we carried out a range of tests based on monkey testing, as for example opened up by Nick Humphrey’s work and the Pasiks (Weiskrantz et al., 1974). We followed it up in this way for a further 10 years, including detection and visual acuity, motion directional discrimination, orientation discrimination and spatial localization, summarized in book form (Weiskrantz, 1986; Weiskrantz, 1998). DB often reported being completely unaware of the stimuli he could detect or discriminate. Soon afterwards work on another hemianopic subject, GY, was started up by Keith Ruddock, John Barbur and colleagues (Barbur et al., 1980) and this expanded into a worldwide enterprise with this subject. Cowey some years later also pursued work on human blindsight research, actively combining it with his animal work. This is not the place for a review (several are available, Cowey and Stoerig, 1991; Stoerig and Cowey, 1997; Weiskrantz, 1998, 2001b) — the important point was that subjects seemed to be able to make visual discriminations in the clinically blind fields even though they acknowledged no awareness of them. Blindsight as a term got attached almost by happenchance, when the author quickly had to find a title for a seminar to the Oxford neurologists (‘hindsight and blindsight’). The response requirements for the visual discriminations were forced-choice, typically with two verbal responses (red/green; present/absent; grating/uniform field; moving/not moving; moving left/moving right, etc.) or, preferably, two response keys.
To avoid response bias, some tests employed a two-interval forced-choice paradigm (was the specified target stimulus in the first or second temporal interval?), usually with essentially the same result as with two forced-choice alternatives in a single interval. Thus, it would appear that, in terms of objective performance, the subject with a restricted V1 lesion can respond to visual stimuli even in the areas of the visual field that for the human are defined as absolutely blind. In neither the monkey nor the human is there an absolutely functionless visual field. In both species two (or more) alternative choice methodology is used to demonstrate this. But a gap still remains in this comparison between monkey and human. That is, we generally would assume, as a matter almost of definition if not of preconception, that the monkey must be aware of the stimulus when it presses the ‘present’ key, whereas even when the human subject correctly signals the difference between present and absent (or a variety of other binary choices along other dimensions) he nevertheless asserts that he does not perceive it, but is doing it by guessing as insisted upon by the experimenter. DB would frequently judge, after a block of trials, that he was just performing at chance (‘pure guesswork’) even when his performance was better than 90%, and express puzzlement when shown the results. In one memorable experiment on GY using a Posner attention paradigm, he commented to Kentridge, the experimenter, that he might as well stop because as far as he was concerned nothing was happening (Kentridge et al., 1999). (Fortunately the experiment went on!) Another kind of response of subjects, if they do not deny having any experience whatever, is that they ‘know’ or ‘feel’ that something is happening, especially with rapidly moving or sudden transient stimuli, and yet do not ‘see’ anything. We will return to the apparent monkey–human gap later in this essay.
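The contrast drawn here between criterion-dependent ‘yes/no’ reporting and criterion-free forced choice can be made concrete with a small simulation. The sketch below assumes a standard equal-variance Gaussian signal detection model with illustrative values for sensitivity and criterion; it is not a reconstruction of any experiment described in this paper, only an illustration of how a very conservative report criterion can abolish ‘yes’ responses while forced-choice performance remains far above chance.

```python
import numpy as np

rng = np.random.default_rng(0)

d_prime = 1.5      # assumed sensitivity of the observer
criterion = 3.0    # assumed, very conservative 'yes' criterion
n_trials = 10_000

noise = rng.normal(0.0, 1.0, n_trials)        # internal response, stimulus absent
signal = rng.normal(d_prime, 1.0, n_trials)   # internal response, stimulus present

# Yes/no perimetry: report 'yes' only when the internal response exceeds the criterion.
hit_rate = np.mean(signal > criterion)
false_alarm_rate = np.mean(noise > criterion)

# Two-alternative forced choice: pick the interval with the larger internal response.
# No report criterion is involved, so performance reflects sensitivity alone.
pc_2afc = np.mean(signal > noise)

print(f"yes/no hit rate        : {hit_rate:.3f}")
print(f"yes/no false alarms    : {false_alarm_rate:.3f}")
print(f"2AFC proportion correct: {pc_2afc:.3f}")
```

With these assumed values the simulated observer reports ‘yes’ on fewer than about 7% of stimulus-present trials, yet chooses the correct alternative on roughly 86% of forced-choice trials, which is the kind of dissociation described in the text.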
Commentaries and points of controversy
The relationship between the strength of the discriminatory performance and the subject’s awareness of the stimuli can be studied more directly by providing keys not only for the discrimination itself but also, on separate keys, for the subject’s reports of
presence or absence, or some degree, of awareness, which we have termed ‘a commentary key paradigm’ (Weiskrantz, 1986; Weiskrantz et al., 1995; Sahraie et al., 1997). Thus, there are two scales: one for discrimination, the other for awareness. Blindsight, as is the case with all implicit residual phenomena in neuropsychology, is essentially a disjunction between the two psychophysical scales in contrast to their close bonding in normal, intact function. The author has argued that this is a more satisfactory and direct route to interpretation than trying to appeal to signal detection theory to derive presence or absence of acknowledged awareness from the discrimination data themselves. Likening the difference between the blindness revealed in clinical perimetry and its absence in blindsight psychophysics to a difference between a ‘yes/no’ response requirement in clinical testing, which allows for a response-criterion shift, and forced-choice guessing in blindsight, which is criterion independent, Azzopardi and Cowey (1997) arranged for both the ‘yes/no’ and two-alternative forced-choice discriminations to be independent of response bias. The result was that there was still a difference between the two modes of discrimination for the blindsight subject but not for the normal subjects, leading to the conclusion that ‘blindsight is unlike normal, near-threshold vision’ (1997, p. 14190). However, and this is the critical point, even if there had been no such difference, the subject’s reported commentary would still be that there was no awareness, and this would be the case whether it was forced-choice yes/no or 2AFC (Weiskrantz, 1995, 2001a). Nor can awareness be finessed by fiat from an implicit assumption, taken by some to be almost a matter of definition, that a significant d′ necessarily entails awareness of the discriminative stimuli. An off-line commentary is not equivalent to a bias in the on-line discrimination task. As this is a paper about roots, it is worth mentioning that almost at the very outset somehow the qualifier ‘controversial’ got attached, barnacle-like, onto ‘blindsight’ in the same way that ‘red’ is attached to ‘pillar box’ or ‘deep’ is to ‘sea’ (no pun intended). Perhaps it was just part of the same background characterized by William James: ‘‘. . . the quarrel about the function of occipital cortex is very acrimonious; indeed the subject of localization of functions in the brain seems to have a peculiar
effect on the temper of those who cultivate it experimentally’’. (1890, p. 46). Aside from the continuation of such a vigorous tradition, another reason for controversiality is that blindsight — among the whole set of neuropsychological implicit syndromes — is the most deeply counterintuitive. One no longer finds evidence of long-term retention by priming in amnesic subjects so surprising (at least, not now — although it was vigorously doubted at the outset) but to be able to discriminate without some degree of conscious seeing — impossible! It naturally attracts skepticism. Soon after our 1974 paper with Warrington and other colleagues at the National, critiques were offered (see Campion et al., 1983): the phenomena could have been based on an artifact due to stray light into the intact visual field; or inadvertent eye movements to bring the stimuli into the intact field; or blindsight might be weak but normal vision, together with a change in criterion; or the striate cortex damage might be incomplete; and, finally, why so few cases? Such concerns deserve to be taken seriously, and they have been. The issues have been well ventilated and the author does not propose to review them in detail here, but will focus on one of them a little later. Briefly, stray light and inadvertent eye movements were already put to rest, or should have been, the author believes, by 1979 and were reviewed in the author’s Bartlett Lecture (Weiskrantz, 1980), and later in the blindsight book (1998), at least for DB. The claim of weak normal vision, similarly, does not convince unless near-perfect performance in the blind field is taken to be weak vision. Azzopardi and Cowey’s conclusion, based on a signal detection analysis, countering the view that blindsight is like normal near-threshold vision, has already been alluded previously. Other suggestions (Fendrich et al., 1992; Gazzaniga et al., 1994) had to wait for better brain imaging techniques to demonstrate completeness of striate cortex lesions, at least within the limits of high resolution MRIs (Barbur et al., 1993; Baseler et al., 1999; Morland, personal communication; see also Weiskrantz, 1995; Kentridge et al., 1997 regarding visual ‘islands’). (Unfortunately MRI imaging is not possible in DB because of metal clips implanted during surgery.) Incomplete lesions are, of course, always a possibility given the vagaries of the clinical case material, but they can be ruled out as a general explanation. Of course, residual function in
the monkey cannot be due to islands of intact striate cortex because there is histological confirmation, but some of the critics will not accept the animal primate evidence as having validity for the human visual system (Gazzaniga et al., 1994; see also Weiskrantz, 1995, 1998; Weiskrantz et al., 1998, demonstrating a very close correspondence between monkey and human pupillometry and human blindsight psychophysics.). In sum, the author believes there is now direct evidence concerning all of the issues, at least in critically studied subjects, except the question of small numbers of subjects. But he does wonder sometimes why the term ‘controversial’ perseverates. The term has enduring adhesive and somewhat disengaging properties. Aside from the territorial imperative that is commonly a root of attack in science, the author believes that one reason for continuing debate stems from a positive attribute, because it is an example in cognitive neuroscience that entails the conjoining of specific evidence with theoretical and philosophical as well as scientific implications at both the neurological and the perceptual and conceptual psychological levels. It occupies a very large arena, and there is a lot to tempt a skeptic. Critiques can and have come at it from several different angles. It is not just whether there is implicit perception, with all the issues entailed there, but whether it specifically applies to hemianopia and specifically and uniquely to V1 lesions; whether there is subcortical mediation of unconscious perceptual discrimination; what might be the role of visual reflexes versus commentaries; whether blindsight might be a route to the neural correlates of conscious perception; what are the evolutionary and phylogenetic differences, if any, in neural organization between human and nonhuman primates; is there a comparison with putative examples of blindsight in normal subjects; and a host of issues raised by philosophers regarding the relevance of the empirical evidence to the philosophy of mind, and especially of consciousness. the author takes such a very rich, tempting target as a large plus sign for cognitive neuroscience and hopes it is a sign of the future.
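Since response criterion and d′ figure prominently in the discussion above, it may help to state the standard equal-variance signal detection relations explicitly. The block below is the generic textbook formulation; it is not meant to reproduce the particular analysis of Azzopardi and Cowey (1997).

```latex
% Yes/no detection: H = hit rate, F = false-alarm rate,
% z(.) = inverse of the standard normal cumulative distribution \Phi.
d' = z(H) - z(F), \qquad c = -\tfrac{1}{2}\left[ z(H) + z(F) \right]
% Unbiased two-alternative forced choice: proportion correct depends on d' alone.
P_c = \Phi\!\left( d'/\sqrt{2} \right) \quad\Longleftrightarrow\quad d' = \sqrt{2}\, z(P_c)
```

The last relation is why forced-choice performance is described as criterion-free: the proportion correct depends on d′ alone, whereas the yes/no hit rate depends jointly on d′ and the criterion c.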
Closing some ellipses
Some major apparent gaps still remain to be discussed. The first is the matter of whether blindsight
is rare, which is commonly assumed. Until recently only a small number of suitable subjects have been tested intensively, which has been taken by some to imply that blindsight is a rare condition. Even Azzopardi and Cowey start their paper (1997, p. 14190) as follows: ‘‘Blindsight is a rare and paradoxical ability of some human subjects with occipital lobe brain damage’’. But we simply do not know how rare it is. In his Bartlett lecture the author reported that he had found evidence of residual visual function in 14 out of 22 National Hospital patients with occipital damage in scotomata judged by clinical assessment to be absolutely blind; and he could not rule it out in the other 8 cases (Weiskrantz, 1980). But this was carried out under far from ideal conditions, on acute cases, and with no possibility of thorough psychophysical investigation or follow-ups. More recently Arash Sahraie and his colleagues in Aberdeen have started to tackle the question head-on by testing a population of hemianopic patients with a variety of etiologies. But it is necessary to introduce two further matters in order to deal with his study, which the author takes to be the first really intensive survey to be initiated. The first is that until now there has been an absence of a common spatiotemporal metric to compare all hemianopes. Arbitrary choice of a particular and narrow set of stimulus parameters can lead one badly astray, as Weiskrantz et al. (1991) showed in a repetition of a study by Hess and Pointer (1989) that had reported negative results for blindsight. These were confirmed using the same parameters, but with a slight change in temporal parameters the results were strongly positive. Sahraie’s basic paradigm (see Sahraie et al., 2002) is detection of sine-wave gratings over a range of spatial frequencies with a range of temporal frequencies. Secondly, the question is how to characterize the mode in which patients say they ‘know’ that something has happened, even though they do not ‘see’ it, a commentary frequently uttered by DB and GY. When it occurs, it is generally for higher contrast and higher temporal frequencies than for those discriminations that can be carried out in the complete absence of any awareness, by sheer guesswork according to the subjects. The author has called the latter, pure form, blindsight Type 1 and the other Type 2. The author recognizes that the distinction may be a smooth and shaded one, rather than binary, and
not everyone likes it, but one needs some sort of term. Petra Stoerig prefers ‘amblyopia’ instead of Type 2, and indeed back in 1972 the author also suggested that same term for the residual function in the monkey. But in our original 1974 human blindsight paper in Brain, we state why we consider amblyopia to be inappropriate, at least for DB. The author still adheres to the view that amblyopia, ‘fuzzy vision’, is not an adequate or appropriate term. In any case, whatever one calls it, Arash Sahraie uses a commentary key paradigm to plot the thresholds of both Type 1 and Type 2 for all patients over a wide range of spatial frequencies. So far, in the first 10 patients tested thoroughly, 8 out of 10 show blindsight (personal communication). (The two who do not show blindsight have deep lesions extending into the thalamus.) The author does not believe that any Scottish hemianope will escape their attention, and so the issue of rarity is on the way to being solved. There certainly is no other way to do it, from the author’s point of view. Another gap concerns the use of meaningful, especially emotional stimuli in the blind field, and their relation to the amygdala via an extra-striate route. Marcel (1998) reported that unseen meaningful words in the blind field could prime the meaning of ambiguous words in the intact field. De Gelder and her colleagues (De Gelder et al., 1999, 2001, 2002) have published evidence showing that discrimination of facial expressions and also of evocative stimuli — e.g., puppy dogs versus spiders — is possible by GY and DB in their blind fields, calling the phenomena ‘affective blindsight’. Functional MRI evidence demonstrates activation of the amygdala by emotional visual stimuli, correlated with that in the superior colliculus and pulvinar, in the absence of striate cortex (Morris et al., 2001). (The linkage of blindsight with the amygdala in this way closes a personal ellipse for the author, given that he started his research with the amygdala before embarking on visual research.) Cowey had reported in his original thesis research that monkeys with V1 lesions do not react behaviorally to emotional stimuli, such as snakes, or tempting food, but we do not know if there are reflexive autonomic changes or activation via the subcortical route implicated in humans, nor whether discriminations would be possible by formal testing. Another elliptical path started in a somewhat misleading way, based on the possibility of a reflex
response to visual stimuli. When the results of the monkey field defects were published, a suggestion was put to us that the monkeys may have been responding to their pupillary changes. Cowey and the author showed that the behavioral results were unchanged when the pupils were paralyzed with atropine. Of course the animals might still have been responding to the command signal controlling the reflex circuitry even though the final effector was blocked — an unassailable possibility. But rather than dismissing the pupil, it has turned out to be a very important adjunct to the assessment of blindsight. Pupillary changes occur to sine-wave gratings, movement and color, and these are closely correlated with the forced-choice psychophysical findings of blindsight (Barbur and Forsyth, 1986; Barbur and Thomson, 1987). As such they provide a very useful screening method for its occurrence (Weiskrantz, 1990; Weiskrantz et al., 1999). In fact, pupillometry can actually be more sensitive than psychophysics. For example, it reveals the complementary colored after-effect of a colored patch presented to the blind field (Barbur et al., 1999), demonstrating the integrity of successive color contrast. As this is an ‘unseen’ after-effect of an ‘unseen’ stimulus, it would present a challenge for psychophysical demonstration. After-effects connect with yet another ellipse. DB reports that he consciously perceives after-images following his fixation of ‘unseen’ stimuli in his blind field. They are produced by a range of achromatic and chromatic stimuli, and approximate to Emmert’s Law (Weiskrantz et al., 2002). Several years ago it was reported that after-effects can be generated by invisible stimuli ‘fixated by imagination’ (Weiskrantz, 1950; Oswald, 1957), which also conform to Emmert’s Law. To date no other blindsight subject has reported after-images, but the demonstration of after-effects by pupillometry suggests that they might occur but are below threshold for other subjects. DB, the longest-standing and most practised blindsight subject, may well have recovered sensitivity which is only potential in other subjects. A rather more likely explanation for DB, however, stems from his history of migraine. There is evidence that migraineurs have an increased duration of visual adaptation effects (Shepherd, 2001) and increased amplitudes and defective habituation of evoked potentials (Afra et al., 1998; Connolly et al., 1982). In any event, the
presence of after-images in DB offers a unique opportunity to study conscious and unconscious attributes of precisely the same stimulus presented to the same position in the same visual field. Because of the surgical metal clips this cannot be done in DB with fMRI, but Cowey and the author and colleagues are actively pursuing the matter with event-related potentials. There is one final ellipse to discuss, and it is a major one. Alan’s seminal work on monkeys with V1 lesions, as we have seen, stood in contrast to the evidence on human subjects but it was also a backdrop against which the human subjects, like the monkeys, could be shown to possess residual function. One essential route for the human research led from the background work with monkeys, demonstrating that their field defects, as Cowey showed so clearly, are not absolutely blind nor static. Nor, it seems, are they in humans. But this is clearly demonstrable in humans only if the discrimination is not based on verbal judgments of ‘awareness’ or ‘seeing’, but on forced-choice methodology, akin to those used in animal research. When commentary keys are provided for the human subjects they can report their awareness or lack of it in parallel with their psychophysical choices. But if the monkey had a commentary key, how would it judge the appearance of the stimuli? Would the monkey have blindsight? No one has succeeded in training monkeys on a commentary key paradigm. But Cowey, together with Stoerig, has closed the gap in a logically related way. The commentary key is, essentially, an off-line judgment of a parallel discrimination. Cowey and Stoerig confirmed, firstly, that the animals can detect and localize stimuli very sensitively in their blind fields: when a light is randomly and briefly presented in the blind field, the monkeys detect it very well. But in a subsequent experiment the animals were trained to discriminate between ‘lights’ or ‘blanks’, by rewarding them appropriately for differentially touching either of two loci, in the intact hemifield, one for lights and one for blanks, in a random series of lights and blanks. They were rewarded for correct performance for both keys. The crucial question was how the animals would respond to lights in the blind field, the same lights for which the animals earlier had shown such exquisite sensitivity for detection and localization. The answer was that the monkeys
consistently pressed the ‘blank’ panel for lights in the blind field. The result was reliable and robust, even with increases in luminance and even when the stimuli were moving (Cowey and Stoerig, 1995, 1997; Stoerig et al., 2002). And so the monkey performs just as a human blindsight subject does — good discrimination of stimuli which are classified as being nonvisual. The monkey has blindsight. (Or at least one form of blindsight — we still do not know whether it is Type 1 or Type 2.) This is further evidence for the similarity in organization between the primate and human visual systems. It is time to draw things to a close. The author started with Cowey’s definitive work on V1 in the monkey. Blindsight in human subjects, as a concept and as an empirical pursuit occurred more than 15 years afterwards, and the closure of the ellipse back to monkeys via blindsight, from Cowey and Stoerig’s work, much more recently. But from whence did any idea itself arise about the possibility of unconscious perception? Well, here is an extract from a paper by Alan and the author, published in 1963. ‘‘Discussions which attempt a comparison of vision in man and in other animals usually omit to say what we mean by two of the terms most frequently used, namely see and blind. When we say that a human being sees a visual stimulus we mean that he had experienced something . . . if he is incapable of this experience we say he is totally blind; if visual stimuli fail to elicit the experience . . . we say the subject has an absolute scotoma. If the subject can respond to a light only because it makes him blink or because it can be used as a conditioned stimulus for blinking, we nevertheless say he is blind for he can tell us that he did not see the stimulus . . . . Is it not conceivable that the monkey is much better equipped than man to utilize the effects of a visual stimulus as cues but that seeing a stimulus is organized in a very similar manner in the two species?’’ (Cowey and Weiskrantz, 1963, p. 113) The paper from which the author quotes was jointly authored, but he knows that Alan wrote that section because he can remember discussing it with him quite intensely in draft form. In his own case he thinks that the distinction between performance and awareness derived mainly from his being primed from the years of work with Elizabeth Warrington on implicit memory and amnesia, which accustomed the
author to considering the counter-intuitive, as well as many, many discussions that she and the author had about the remarkable capacities that were gradually unfolding with DB, who at the time was the only blindsight subject to be studied intensively. Blindsight was also preceded and accompanied by a number of other examples that had emerged from neuropsychological syndromes (Weiskrantz, 1991, 1997). But here in this joint paper with Alan was a seed that long antedates the human blindsight work itself. But it is never possible to link a root to the first seed. The author’s colleague, Peter McLeod, has jestingly suggested to him that the term should be attributed to Shakespeare (‘‘Dead life, blind sight, poor mortal living ghost . . . ’’ — Richard III, Act 4, Scene 4), but of course the meaning in Shakespeare was precisely the opposite — sight that is blind rather than blindness that is sighted. When an article on blindsight appeared in the New York Academy’s Current Sciences (Weiskrantz, 1992) the journal forwarded to the author an irate letter they had received from a reader stating that the idea of blindsight had already been advanced long before, in 1905 in France by a Dr. Bard, published in La Semaine Me´dicale. The author had not known of it. And, indeed, Bard did say something like it, without much quantitative evidence (Bard, 1905). But it demonstrates that there seems never to be a first for an idea in science; the roots may be there but they require the right time and conditions to grow.
References
Afra, J., Cecchni, A.P., De Pasqua, V., Albert, A. and Schonen, J. (1998) Visual evoked potentials during long periods of pattern-reversal stimulation in migraine. Brain, 121: 233–241. Azzopardi, P. and Cowey, A. (1997) Is blindsight like normal, near-threshold vision? Proc. Natl. Acad. Sci. USA, 94: 14190–14194. Barbur, J.L. and Forsyth, P.M. (1986) Can the pupil response be used as a measure of the visual input associated with the geniculo-striate pathway? Clin. Vis. Sci., 1: 107–111. Barbur, J.L. and Thomson, W.D. (1987) Pupil response as an objective measure of visual acuity. Ophthal. Physiol. Opt., 7: 425–429. Barbur, J.L., Harlow, J.A. and Weiskrantz, L. (1994) Spatial and temporal response properties of residual vision in a case of hemianopia. Phil. Trans. Roy. Soc. B., 343: 157–166.
Barbur, J.L., Ruddock, K.H. and Waterfield, V.A. (1980) Human visual responses in the absence of the geniculo-striate projection. Brain, 102: 905–992. Barbur, J.L., Watson, J.D.G., Frackowiak, R.S.J. and Zeki, S. (1993) Conscious visual perception without V1. Brain, 116: 1293–1302. Barbur, J.L., Weiskrantz, L. and Harlow, J.A. (1999) The unseen color after-effect of an unseen stimulus: Insight from blindsight into mechanisms of color afterimages. Proc. Natl. Acad. Sci. USA, 96: 11637–11641. Bard, L. (1905) De la persistance des sensations lumineuses dans le champ aveugle des hemianopsiques. La Semaine Médicale, 22: 3–255. Baseler, H.A., Morland, A.B. and Wandell, B.A. (1999) Topographic organization of human visual areas in the absence of input from primary cortex. J. Neurosci., 19: 2619–2627. Campion, J., Latto, R. and Smith, Y.M. (1983) Is blindsight an effect of scattered light, spared cortex, and near-threshold vision? Behav. Brain Sci., 6: 423–448. Connolly, J.F., Gawell, M. and Rose, F.C. (1982) Migraine patients exhibit abnormalities in the visual evoked potentials. J. Neurol. Neurosurg. Psychiatr., 45: 464–467. Cowey, A. (1961) Perimetry in monkeys. D. Phil. Thesis. Univ. of Cambridge. Cowey, A. (1962) Visual field defects in monkeys. Nature, 193: 302. Cowey, A. (1963) The basis of a method of perimetry with monkeys. Quart. J. Exp. Psychol., 15: 81–90. Cowey, A. (1967) Perimetric study of field defects in monkeys after cortical and retinal ablations. Quart. J. Exp. Psychol., 19: 232–245. Cowey, A. and Weiskrantz, L. (1963) A perimetric study of visual field defects in monkeys. Quart. J. Exp. Psychol., 15: 91–115. Cowey, A. and Stoerig, P. (1991) The neurobiology of blindsight. Trends Neurosci., 29: 65–80. Cowey, A. and Stoerig, P. (1995) Blindsight in monkeys. Nature Lond., 373: 247–249. Cowey, A. and Stoerig, P. (1997) Visual detection in monkeys with blindsight. Neuropsychologia, 35: 929–937. Daniel, P.M. and Whitteridge, D. (1961) The representation of the visual field on the cerebral cortex in monkeys. J. Physiol., 159: 203–221. De Gelder, B., Vrooman, J., Pourtois, G. and Weiskrantz, L. (1999) Non-conscious recognition of affect in the absence of striate cortex. NeuroReport, 10: 759–763. Denny-Brown, D. and Chambers, R.A. (1955) Visuo-motor function in the cerebral cortex. J. Nerv. Ment. Dis., 121: 288–289. Fendrich, R., Wessinger, C.M. and Gazzaniga, M.S. (1992) Residual vision in a scotoma; implications for blindsight. Science, 258: 1489–1491.
Ferrier, D. (1886) The Functions of the Brain. Smith, Elder, London. Fulton, J.F. (1949) Physiology of the Nervous System, 3rd ed., Oxford University Press, New York. Gazzaniga, M.S., Fendrich, R. and Wessinger, C.M. (1994) Blindsight reconsidered. Curr. Dir. Psychol. Sci., 3: 93–96. Hess, R.F. and Pointer, J.S. (1989) Spatial and temporal contrast sensitivity in hemianopia. A comparative study of the sighted and blind hemifields. Brain, 112: 871–894. Holmes, G. (1918) Disturbances of vision by cerebral lesions. Brit. J. Ophthalm., 2: 353–384. Humphrey, N.K. (1970) What the frog’s eye tells the monkey’s brain. Brain Behav. Evol., 3: 324–337. Humphrey, N.K. (1974) Vision in a monkey without striate cortex: a case study. Perception, 3: 241–255. Humphrey, N. and Weiskrantz, L. (1967) Vision in monkeys after removal of the striate cortex. Nature, Lond., 215: 595–597. Ingle, D. (1967) Two visual mechanisms underlying the behavior of fish. Psychol. Forsch., 31: 44–51. James, W. (1890) Principles of Psychology. Macmillan, London. Kentridge, R.W., Heywood, C.A. and Weiskrantz, L. (1997) Residual vision in multiple retinal locations within a scotoma: Implications for blindsight. J. Cogn. Neurosci., 9: 191–202. Kentridge, R.W., Heywood, C.A. and Weiskrantz, L. (1999) Attention without awareness in blindsight. Proc. Roy. Soc. B., 266: 1805–1811. Klu¨ver, H. (1927) Visual disturbances after cerebral lesions. Psych. Bull., 24: 316–358. Klu¨ver, H. (1942) Functional significance of the geniculo-striate system. Biol. Sympos., 7: 253–299. Luciani, L. (1884) On the sensorial localisations in the cortex cerebri. Brain, 7: 145–160. Marcel, A.J. (1998) Blindsight and shape perception: deficit of visual consciousness or of visual function? Brain, 121: 1565–1588. Marquis, D.G. (1935) Phylogenetic interpretation of the functions of the visual cortex. Arch. Neurol. Psychiat., 33: 807–815. Marquis, D.B. and Hilgard, E.R. (1937) Conditioned responses to light in monkeys after removal of the occipital lobes. Brain, 60: 1–12. Mohler, C.W. and Wurtz, R.H. (1977) Role of striate cortex and superior colliculus in visual guidance of saccadic eye movements in monkeys. J. Neurophysiol., 43: 74–94. Morris, J.S., De Gelder, B., Weiskrantz, L. and Dolan, R.J. (2001) Differential extrageniculate and amygdala responses to presentation of emotional faces in a cortically blind field. Brain, 124: 1241–1252. Monokow, C. von (1914) Die Lokalization im Grosshirn. Bergmann, Weisbaden.
Munk, H. (1881) Ueber die Funktionen der Grosshirnrinde; gesammelte Mittheilungen aus den Jahren 1877–1880. Hirschwald, Berlin. Oswald, I. (1957) After-images from retina and brain. Quart. J. Exp. Psychol., 9: 88–100. Pasik, P. and Pasik, T. (1971) The visual world of monkeys deprived of visual cortex: effective stimulus parameters and the importance of the accessory optic system. In: Shipley T. and Dowling J.E. (Eds.), Visual Processes in Vertebrates, Vision Research Supplement no. 3. Pergamon Press, Oxford, pp. 419–435. Pasik, P. and Pasik, T. (1982) Visual functions in monkeys after total removal of visual cerebral cortex. Contributions Sensory Physiology, 7: 147–200. Pöppel, E., Held, R. and Frost, D. (1973) Residual visual function after brain wounds involving the central visual pathways in man. Nature, Lond., 243: 295–296. Poppelreuter, W. (1917) Disturbances of Lower and Higher Visual Capacities Caused by Occipital Damage. Oxford University Press, Oxford. Riddoch, G. (1917) Dissociation of visual perceptions due to occipital injuries, with especial reference to appreciation of movement. Brain, 40: 15–17. Sahraie, A., Weiskrantz, L., Barbur, J.L., Simmons, A., Williams, S.C.R. and Brammer, M.L. (1997) Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc. Natl. Acad. Sci. USA, 94: 9406–9411. Sahraie, A., Weiskrantz, L., Trevethan, C.T., Cruce, R. and Murray, A.D. (2002) Psychophysical and pupillometric study of spatial channels of visual processing in blindsight. Exp. Brain Res., 143: 249–256. Schäfer, E.A. (1888) Experiments on special sense localisations in the cortex cerebri of the monkey. Brain, 10: 362–380. Schäfer, E.A. (1900) Textbook of Physiology, Vol. 2. Young J. Pentland, Edinburgh. Schilder, P., Pasik, P. and Pasik, T. (1972) Extrageniculate vision in the monkey. III. Circle vs triangle and ‘red vs green’ discrimination. Exp. Brain Res., 14: 436–448. Schneider, G.E. (1967) Contrasting visuomotor functions of tectum and cortex in the golden hamster. Psychol. Forsch., 31: 52–62. Settlage, P.H. (1939) The effect of occipital lesions on visually guided behavior in the monkey. J. Comp. Psychol., 27: 93–131. Shepherd, A.J. (2001) Increased visual after-effects following pattern adaptation in migraine: a lack of intracortical excitation? Brain, 124: 2310–2318. Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 535–559. Stoerig, P., Zontanou, A. and Cowey, A. (2002) Aware or unaware: assessment of cortical blindness in four men and a monkey. Cereb. Cortex, 12: 565–574.
Talbot, S.A. and Marshall, W.H. (1941) Physiological studies on neural mechanisms of localization and discrimination. Am. J. Ophthal., 24: 1255–1264. Teuber, H.-L., Battersby, W.S. and Bender, M.B. (1960) Visual Field Defects after Penetrating Missile Wounds of the Brain. Harvard University Press, Cambridge MA. Trevarthen, C.B. (1968) Two mechanisms of vision in primates. Psychol. Forsch., 31: 299–337. Von Bonin, G. (1960) Some Papers on the Cerebral Cortex. Thomas, Springfield, Ill. Weiskrantz, L. (1950) An unusual case of after-imagery following fixation of an ‘imaginary’ visual pattern. Quart. J. Exp. Psychol., 2: 170–171. Weiskrantz, L. (1961) Encephalisation and the scotoma. In: Thorpe W. H. and Zangwill O. L. (Eds.), Current Problems in Animal Behaviour. Cambridge University Press, Cambridge, pp. 30–58. Weiskrantz, L. (1963) Contour discrimination in a young monkey with striate cortex ablation. Neuropsychologia, 1: 145–164. Weiskrantz, L. (1972) Behavioural analysis of the monkey’s visual nervous system. Proc. Roy. Soc. Lond. B., 182: 427–455. Weiskrantz, L. (1977) Trying to bridge some neuropsychological gaps between monkey and man. Br. J. Psychol., 68: 431–445. Weiskrantz, L. (1980) Varieties of residual experience (Eighth Sir Frederick Bartlett Lecture). Quart. J. Exp. Psychol., 32: 365–386. Weiskrantz, L. (1990) Outlooks for blindsight: explicit methodologies for implicit processes. The Ferrier Lecture. Proc. Roy. Soc. Lond. B., 239: 247–278. Weiskrantz, L. (1991) Disconnected awareness for detecting, processing, and remembering in neurological patients. J. Roy. Soc. Med., 84: 466–470. Weiskrantz, L. (1992) Unconscious vision. The Sciences (N.Y. Acad. of Sciences) (Sept–Oct), pp. 22–30. Weiskrantz, L. (1995) Blindsight: Not an island unto itself. Curr. Dir. Psychol. Sci., 4: 146–151. Weiskrantz, L. (1997) Consciousness Lost and Found. A Neuropsychological Exploration. Oxford University Press, Oxford. Weiskrantz, L. (1998, 2nd edition, 1st in 1986) Blindsight. A Case Study and Implications. Oxford University Press, Oxford. Weiskrantz, L. (2001a) Blindsight — putting beta (β) on the back burner. In: de Gelder B., de Haan E.H.F. and Heywood C.A. (Eds.), Out of Mind: Varieties of Unconscious Processes. Oxford University Press, Oxford, pp. 20–31. Weiskrantz, L. (2001b) Blindsight. In: Behrmann M. (Ed.), Handbook of Neuropsychology. Elsevier, Amsterdam, pp. 215–237. Weiskrantz, L., Warrington, E.K., Sanders, M.D. and Marshall, J. (1974) Visual capacity in the hemianopic field following a restricted occipital ablation. Brain, 97: 709–728.
Weiskrantz, L. and Cowey, A. (1970) Filling in the scotoma: a study of residual vision after striate cortex lesions in monkeys. In: Stellar E. and Sprague J.M. (Eds.), Progress in Physiological Psychology, Vol. 3. Academic Press, New York, pp. 237–260. Weiskrantz, L., Cowey, A. and Passingham, C. (1977) Spatial responses to brief stimuli by monkeys with striate cortex ablations. Brain, 100: 655–670. Weiskrantz, L., Harlow, A. and Barbur, J.L. (1991) Factors affecting visual sensitivity in a hemianopic subject. Brain, 114: 2269–2282. Weiskrantz, L., Barbur, J.L. and Sahraie, A. (1995) Parameters affecting conscious versus unconscious visual discrimination without V1. Proc. Natl. Acad. Sci. USA, 92: 6122–6126.
Weiskrantz, L., Cowey, A. and LeMare, C. (1998) Learning from the pupil: a spatial visual channel in the absence of V1 in monkey and human. Brain, 121: 1065–1072. Weiskrantz, L., Cowey, A. and Barbur, J.L. (1999) Differential pupillary constriction and awareness in the absence of striate cortex. Brain, 122: 1533–1538. Weiskrantz, L., Cowey, A. and Hodinott-Hill, I. (2002) Prime-sight in a blindsight subject. Nature Neuro., 5: 101–102. Wilbrand, H. and Saenger, A. (1917) Die Neurologie des Auges. Vol VII: Die Erkrankungen der Sehbahn vom Tractus bis in den Cortex B Die homonyme Hemianopsie nebst ihren Beziehungen zu den anderen cerebral Herderscheinungen. Bergmann, Wiesbaden.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier B.V. All rights reserved
CHAPTER 17
‘Double-blindsight’ revealed through the processing of color and luminance contrast defined motion signals
John L. Barbur*
Applied Vision Research Centre, City University, Northampton Square, London EC1V 0HB, UK
*Corresponding author. Tel.: +44-20-70405060; Fax: +44-20-70408355
DOI: 10.1016/S0079-6123(03)14401-7
Abstract: Background perturbation techniques using both static and dynamic luminance contrast and chromatic contrast noise have been employed to investigate the interaction between luminance and chromatic contrast signals. A number of experiments involving contrast detection thresholds for static stimuli, simple reaction times, and pupil color responses yield evidence for independent processing of luminance and color signals. One exception is the perception of color-defined coherent motion of randomly distributed checks, when dynamic, luminance-defined motion signals can completely disrupt the perception of coherent motion. When the moving stimulus contains color-defined, recognizable features that are spatially distinct from background noise, the perception of coherent motion is restored and becomes completely independent of dynamic luminance contrast noise. Feature-tracking mechanisms must therefore play a major part in the detection of color-defined motion. Other findings reveal the existence of a motion channel that receives exclusively achromatic inputs and a separate channel that extracts and combines luminance contrast and color-defined motion signals. Further studies carried out to investigate the interaction between luminance and chromatic signals show that color-defined, position-change thresholds are mediated by motion detection mechanisms and are, therefore, also disrupted by dynamic luminance contrast noise. In spite of the perceived uncertainty of spatial location, other experiments show that accurate, color-defined position information is preserved and available for use in other tasks. For example, accurate position information can be used to detect the vertical misalignment of color-defined checks, independently of luminance contrast noise, even when position-change thresholds are severely impaired. The analogy with ‘blindsight’ is obvious and irresistible. ‘Blindsight’ describes the ability to make accurate use of visual information in the absence of conscious visual perception. The findings that have emerged from this investigation reveal the ability of normal subjects to discard misleading, consciously perceived visual signals. The ability to override perception and to make good use of accurate visual information, in spite of misleading percepts, can best be described as ‘Double-blindsight’.
Introduction
The coding of visual information in static images is based on spatial modulation of luminance and/or spectral content. Temporal changes in either of these attributes can be used to code additional information that results largely in the perception of flicker and motion. It is of great interest to establish how the visual system extracts and combines spatial and temporal modulations of luminance and spectral content and the extent to which the processing of luminance and chromatic contrast signals involves separate and independent mechanisms. The discovery of anatomically distinct Pα and Pβ ganglion cells (Perry and Cowey, 1981) that respond best to different attributes of the retinal image and project to distinct layers of the dLGN (Perry et al., 1984) and from there to separate layers of the primary visual cortex has provided much of the evidence for the existence of parallel pathways (Hubel and Wiesel, 1972). Studies of differences in spatiotemporal and chromatic properties of magnocellular (M) and parvocellular (P) neurons have provided further support for specialization in processing of different stimulus attributes. The large receptive fields, transient responses, high contrast gain, and lack of chromatic opponency of magnocellular neurons make them ideal for detection of flickering or moving stimuli defined by luminance contrast (Kaplan and Shapley, 1982; Derrington and Lennie, 1984). Parvocellular neurons, on the other hand, exhibit smaller receptive fields and lower contrast gain, respond to both chromatic and luminance contrast, and exhibit bandpass spatial frequency response characteristics, properties consistent with detection of static, fine spatial detail defined by both luminance and chromatic boundaries (Derrington and Lennie, 1984). Many other, less numerous ganglion cell classes have also been identified, but the exact contribution made by each type of cell to the detection of different stimulus attributes remains unclear (Rodieck, 1988). Many of these cells project not only to the six laminae of the dLGN, but also to its interlaminar regions (Hendry and Yoshioka, 1994) and to different midbrain nuclei (Cowey et al., 1994). Since many parvocellular neurons respond well to both luminance and chromatic contrast, the coding of such signals becomes ambiguous (Rodieck, 1991). These observations make it difficult to answer important questions concerning the independent processing of luminance and chromatic signals. In general, evidence from visual psychophysics tends to support the separate, independent processing hypothesis of luminance and chromatic signals, particularly when the visual task involves detection of static, threshold stimuli (King-Smith and Carden, 1976; Reffin et al., 1991; Birch et al., 1992; Barbur et al., 1996). A number of interesting questions have also been asked in relation to color and luminance contrast-defined motion. Although there is no longer any doubt that the visual system has developed neural mechanisms for the processing of color-defined motion, it is by no means clear whether these mechanisms are wired up to extract ‘first-order’ motion signals (Cavanagh and Favreau, 1985; Mullen and Baker, 1985; Gorea and Papathomas, 1989) or rely on higher-order mechanisms that compute motion from the tracking
of the spatial features of the stimulus (Cavanagh, 1991; Gorea and Papathomas, 1991; Lu et al., 1999). Considerable controversy also remains as to whether color-defined motion is processed independently of first-order mechanisms that extract luminance contrast-defined motion (Gorea and Papathomas, 1989). In this study we have addressed a number of related questions concerning the processing of luminance and chromatic signals using an experimental approach that relies on background perturbation. The novel techniques developed for this work can be used to investigate the independent processing of different stimulus attributes by examining the extent to which the spatiotemporal and chromatic properties of background noise affect thresholds for detection of either static or moving, luminance or color-defined test stimuli.
Experimental methods
Apparatus
The measurement of reaction times, pupil responses, detection thresholds, and the generation of appropriate visual stimuli were carried out on the P_SCAN system (Barbur et al., 1987). The version of the system employed in this study incorporates a Sony Trinitron monitor (Model F520) driven by an ELSA Gloria XL 10 bit graphics card and facilities for monitoring direction of gaze and the measurement of pupil size, using infrared illumination. Measurement of the spectral radiance output of each phosphor was carried out using a Gamma Scientific telespectroradiometer (Model 2030-31). The luminance output for each possible gun-voltage value was also calibrated automatically using an LMT 1003 luminance meter. The calibration was carried out at the center of the display with the test region surrounded by a steady uniform background of luminance similar to that employed in the actual experiments. Standard colorimetric transformations (Wyszecki and Stiles, 1982) and the use of the spectral and luminance calibration data made possible the generation of any specified color/luminance combination within the limits of the phosphors of the display. A standard Zeiss chin/forehead rest was employed and the subject viewed the display from
75 cm through a large dichroic mirror that reflects infrared light and transmits >95% of incident light over the visible part of the spectrum.
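A minimal sketch of the kind of colorimetric transformation involved is given below: given the chromaticities and peak luminances of the three phosphors, a target chromaticity and luminance is converted to linear gun weights by inverting the phosphor matrix. The numerical primaries used here are generic Rec. 709-like values chosen purely for illustration, not the measured values for the display described above, and the gamma correction provided by the measured gun-voltage tables is omitted.

```python
import numpy as np

def xyY_to_XYZ(x, y, Y):
    """Convert CIE xyY to tristimulus values XYZ."""
    return np.array([x * Y / y, Y, (1.0 - x - y) * Y / y])

# Illustrative phosphor chromaticities and maximum luminances (cd/m^2).
# These are assumed values, not the calibration data of the actual display.
primaries = {
    "R": (0.64, 0.33, 20.0),
    "G": (0.30, 0.60, 60.0),
    "B": (0.15, 0.06, 10.0),
}

# Columns of M hold the XYZ of each gun at full drive, so a linear mix of the
# guns with weights (r, g, b) has tristimulus values M @ [r, g, b].
M = np.column_stack([xyY_to_XYZ(*primaries[k]) for k in ("R", "G", "B")])

def rgb_for(x, y, Y):
    """Linear gun weights (0-1) producing chromaticity (x, y) at luminance Y."""
    rgb = np.linalg.solve(M, xyY_to_XYZ(x, y, Y))
    if np.any(rgb < 0.0) or np.any(rgb > 1.0):
        raise ValueError("requested color is outside the display gamut")
    return rgb

# Example: a background of 24 cd/m^2 at CIE (x, y) = (0.305, 0.323), as in Fig. 1.
print(rgb_for(0.305, 0.323, 24.0))
```

In practice the linear weights would then be passed through the inverse of the measured luminance-versus-gun-voltage functions to obtain the actual values sent to the graphics card.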
Visual stimuli
The area of the stimulus was defined as an array of checks. Random, spatial modulation of either the luminance or the chromaticity of each check gave rise to the perception of luminance or color noise, as shown in Figs. 1 and 6. In the case of spatial textures defined by luminance contrast (LC) noise, the luminance of each check was allowed to vary with equal probability within a range specified as a percentage of background luminance. This percentage defined the amplitude of LC noise (Fig. 1). Chromatic contrast (CC) noise was generated by allowing the chromaticity of each check to vary randomly in saturation, away from background chromaticity, along a line of fixed length, pointing toward a specified region of the spectrum locus in the CIE (u′, v′) chromaticity chart (CIE, 1964). The length of this line provided a measure of CC noise amplitude (Fig. 6). The spatial distribution of luminance or chromaticity either remained constant throughout the stimulus, causing ‘static’ noise, or changed at intervals ranging from 13.3 to 80 ms, causing the perception of ‘dynamic’ noise. A subset of checks, modulated in either color or luminance contrast, formed the test stimulus. The test stimulus was presented either static or moving in either direction along the horizontal meridian. The subset of checks that formed the test stimulus was either selected randomly or obeyed simple, spatial selection rules so as to generate two-dimensional spatial primitives similar to those shown in Fig. 7.
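The following sketch illustrates, under stated assumptions, how a single frame of such a check array might be generated: luminance contrast noise as a uniform random modulation of each check about the background luminance, and a randomly chosen subset of checks given a fixed chromatic offset to form the color-defined test stimulus. The array size, noise amplitude, number of test checks and chromatic displacement are placeholder values, and chromaticity is represented simply as a (u′, v′) offset rather than through the full calibration pipeline described above.

```python
import numpy as np

rng = np.random.default_rng(1)

ROWS, COLS = 24, 22                  # check array (illustrative size)
L_BG = 24.0                          # background luminance, cd/m^2
NOISE_AMPLITUDE = 0.30               # LC noise: checks vary by +/-30% of background
N_TEST_CHECKS = 40                   # checks carrying the color-defined signal
CC_OFFSET = np.array([0.02, 0.0])    # (u', v') displacement toward the spectrum locus

def noise_frame():
    """One frame of LC noise: each check varies uniformly about the background."""
    return L_BG * (1.0 + NOISE_AMPLITUDE * rng.uniform(-1.0, 1.0, (ROWS, COLS)))

def add_test_stimulus(luminance, background_uv=(0.19, 0.46)):
    """Embed color-defined checks: luminance statistics unchanged, chromaticity shifted."""
    uv = np.tile(np.asarray(background_uv, dtype=float), (ROWS, COLS, 1))
    idx = rng.choice(ROWS * COLS, size=N_TEST_CHECKS, replace=False)
    r, c = np.unravel_index(idx, (ROWS, COLS))
    uv[r, c] += CC_OFFSET            # color signal only; LC noise is left untouched
    return luminance, uv, (r, c)

luminance, chromaticity, test_locations = add_test_stimulus(noise_frame())
print(luminance.mean(), chromaticity[test_locations].mean(axis=0))
```

For dynamic noise the frame would simply be regenerated at each refresh (every 13.3–80 ms in the experiments described above), whereas for static noise a single frame is held throughout the presentation; coherent motion of the test checks amounts to shifting the selected locations along the horizontal meridian from frame to frame.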
Fig. 1. Examples of color-defined stimuli buried in dynamic luminance contrast (LC) noise. The template for the stimulus is an array of checks subtending some 4° × 4.5° of visual angle. The mean luminance of the array remains constant and equal to that of the larger, uniform background field (24 cd/m², CIE (x, y): 0.305, 0.323). The luminance of each check in the array can vary randomly in time, with equal probability, within a range specified as a percentage of background luminance. This percentage determines the amplitude of dynamic LC noise, as illustrated by the cross-section plot of the corresponding luminance profile. In addition to the random LC noise, a subset of these checks is also given a chromaticity that is different to that of the background checks. The colored checks form the test stimulus and can be constrained to form spatially coherent patterns as shown in the two upper sections of this figure or can be distributed randomly amongst the background checks. The colored checks can either be presented stationary or moving coherently in either direction along the horizontal meridian.
Detection thresholds
The principal aim of this study was to investigate how the presence of LC or CC noise affects thresholds for detection of coherent motion of test patterns defined by either luminance or chromatic signals. Detection thresholds were also measured with equivalent, but static, briefly presented stimuli. Staircase procedures with variable step sizes were used throughout to measure thresholds for detection of coherent motion or thresholds for detection of the corresponding static stimuli. The number of possible combinations of each set of background noise attributes with each of the test stimulus conditions is large, and each combination can, in principle, yield useful information. It is, however, hoped that the results will fall naturally into a small number of groups that illustrate either the presence or the absence of interaction between the background noise and test stimulus attributes.
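The staircase logic referred to above can be sketched as follows. This is a generic 2-down/1-up rule with a step size that halves at each reversal, driving a simulated observer; the specific rule, the starting value and the logistic observer are assumptions for illustration rather than the authors' actual parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulated_observer(signal, threshold=0.08, slope=40.0):
    """Probability correct grows with signal strength (e.g. chromatic saturation)."""
    p_correct = 0.5 + 0.5 / (1.0 + np.exp(-slope * (signal - threshold)))
    return rng.random() < p_correct

def staircase(start=0.30, step=0.08, min_step=0.005, n_reversals=12):
    """2-down/1-up staircase with a variable (halving) step size."""
    level, direction, reversals, correct_run = start, -1, [], 0
    while len(reversals) < n_reversals:
        if simulated_observer(level):
            correct_run += 1
            if correct_run == 2:                 # two correct: make the task harder
                correct_run = 0
                if direction == +1:              # the track has turned around
                    reversals.append(level)
                    step = max(step / 2.0, min_step)
                direction = -1
                level = max(level - step, 0.0)
        else:                                    # one error: make the task easier
            correct_run = 0
            if direction == -1:
                reversals.append(level)
                step = max(step / 2.0, min_step)
            direction = +1
            level += step
    return np.mean(reversals[-6:])               # threshold from the late reversals

print(f"estimated detection threshold: {staircase():.3f}")
```

A 2-down/1-up rule of this kind converges on roughly the 70.7% correct point of the psychometric function; the text does not specify which transformed rule was used, so this is only one plausible instantiation of a staircase with variable step sizes.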
Reaction times

The experimental procedure and the stimulus configuration employed to measure reaction times have been described previously (Barbur et al., 1998).
In this study we have investigated the extent to which dynamic LC noise affects reaction times to chromatic stimuli. Reaction times to luminance and chromatic stimuli were also measured under stimulus conditions that yield the shortest possible responses. The stimulus conditions employed in this study are described in the caption to Fig. 3.
Pupil response measurements

The discovery that the processing of chromatic stimuli (Barbur et al., 1996) and the corresponding chromatic afterimages (Barbur et al., 1999) cause systematic pupil responses, even when the colored stimuli are buried in dynamic luminance contrast noise, makes the pupil color response attractive for studying the independent processing of color and luminance contrast signals. The pupil color response is particularly useful when suprathreshold chromatic stimuli are employed. The aim of this study was to discover whether the processing of chromatic signals, the generation of chromatic afterimages, and the corresponding pupil responses can be masked by the addition of luminance contrast noise. The stimulus employed was photopically isoluminant and in addition it also had zero rod contrast (Young and Teller, 1991), a requirement necessary to avoid the generation of rod-driven, pupil light-reflex responses. This constraint resulted in the generation of a 'greenish' stimulus (see Fig. 5) with a corresponding 'reddish-magenta' afterimage. The stimulus was a central disc of 8° diameter and was buried in progressively larger amounts of dynamic LC noise (see Fig. 5). Binocular measurements of pupil responses were carried out using the P_SCAN system that has been described previously (Alexandridis et al., 1992). A number of colored stimuli, each of fixed chromatic strength but buried in different amounts of dynamic LC noise, were interleaved randomly and pupil response traces were measured 16 times for each level of noise employed. The remaining stimulus parameters are as described in the caption to Fig. 5.
Position-change thresholds

It is of great interest to establish whether the breakdown of coherent motion of randomly distributed,
colored checks when buried in dynamic LC noise (see Experimental findings, Motion detection thresholds) is caused by the loss of position information. The breakdown of coherent motion perception for these stimuli may reflect the failure of feature-tracking motion mechanisms caused by the loss of accurate position information. This hypothesis can account for our experimental findings without having to postulate the existence of a first-order motion mechanism capable of extracting and combining the random motion signals present in dynamic LC noise with the coherent motion signals of color-defined checks. It is therefore important to test whether position-change thresholds for color-defined checks are affected by dynamic LC noise. The stimulus designed for this test consisted of four colored checks, located along diagonal lines on a circle with its center on the fixation point (Fig. 10). Each quadrant contained one colored check and the four checks together would normally form a perfect square. During the stimulus presentation, one of the four colored checks changed position with one of its nearest diagonal neighbors. The subject's task was to press one of four response buttons, arranged as the corners of a square, to indicate the quadrant containing the check that had changed position. In order to avoid the use of symmetry cues and shape changes (i.e., selective distortion of the square pattern defined by the colored checks, as a result of displacement of one of its corners, see Fig. 10), small, random changes in the initial location of each of the four colored checks were introduced in each stimulus. Thresholds for correct identification of the quadrant containing the colored check that changed position during the presentation were measured for each level of noise using interleaved, multiple staircases with variable step sizes. The chromatic saturation of the test check was then decreased only after two consecutive correct responses. This procedure yields a low chance probability of 1/16.
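The quoted chance level follows directly from the response rule: with four equally probable quadrants, two consecutive correct guesses occur with probability (1/4)² = 1/16. A short check of this figure (purely illustrative):

```python
import random

def chance_two_consecutive_correct(n_alternatives=4, n_runs=100_000, seed=0):
    """Monte-Carlo estimate of guessing two 4AFC trials correctly in a row."""
    rng = random.Random(seed)
    hits = sum(rng.randrange(n_alternatives) == 0 and
               rng.randrange(n_alternatives) == 0
               for _ in range(n_runs))
    return hits / n_runs

print(1 / 4 ** 2)                          # analytic value: 0.0625
print(chance_two_consecutive_correct())    # simulated value, close to 1/16
```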
Misalignment detection thresholds

Detection of position change can, in principle, be achieved using a number of different mechanisms that could rely on the coding of absolute or relative position of color-defined checks with respect
to other objects in the field, or on the detection of directional changes that could be mediated entirely by motion detection mechanisms. In order to confirm that dynamic LC noise causes the loss of position information of color-defined checks and that position-change thresholds do not reflect the loss of motion sensitivity, a new test was developed that does not confound position-change and direction discrimination or motion thresholds. The test is based on the detection of vertical misalignment of colored checks buried in dynamic LC noise. The stimulus configuration is shown in Fig. 11 and consists of two test stimulus checks that are vertically collinear with the fixation stimulus. For each trial, either one or the other of the two test stimulus checks is presented shifted to the left or to the right of the vertical, causing the corresponding check to appear vertically misaligned (see Fig. 11). Short presentation durations in the range 107–373 ms were employed. The subject's task was to indicate the direction of stimulus misalignment by pressing one of four buttons, arranged to form a square. This spatial arrangement of the buttons makes it easy for the subject to link the misaligned direction of one of the two test checks
with the top left, top right, bottom left, or bottom right locations of the four response buttons. The fixation stimulus was defined by luminance contrast and was always well above threshold. The test checks were defined by either luminance or chromatic contrast and buried in either luminance or chromatic contrast noise.
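The mapping from the perceived misalignment onto the four response buttons can be written down directly; the function name and labels below are illustrative only, as the button box itself is not described beyond its square arrangement.

```python
def response_button(misaligned_check: str, shift: str) -> str:
    """Map the misaligned check ('upper' or 'lower') and its horizontal shift
    ('left' or 'right') onto one of four buttons arranged as the corners of a
    square."""
    assert misaligned_check in ("upper", "lower") and shift in ("left", "right")
    return ("top" if misaligned_check == "upper" else "bottom") + " " + shift

# e.g. the upper test check shifted to the right maps onto the top-right button
print(response_button("upper", "right"))
```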
Experimental findings

Chromatic detection thresholds

Thresholds for detection of chromatic signals were found to be relatively independent of static or dynamic LC noise (Barbur et al., 1992, 1994). Chromatic detection thresholds for static patterns consisting of color-defined, vertical bars were measured for a number of directions of chromatic displacement, as defined in the CIE-(x, y) chromaticity chart (Fig. 2A). The threshold detection contour forms the well-known chromatic detection ellipse, originally derived by MacAdam from estimates of standard deviation of successive color matches (MacAdam, 1942). The results demonstrate clearly
Fig. 2. (A, B) Section A shows thresholds for detection of color-defined vertical bars buried in various amounts of dynamic luminance contrast noise. The mean luminance of the stimulus array remains constant and equal to that of the larger uniform background field (24 cd m⁻², CIE-(x, y): 0.305, 0.323). The results show that the processing of chromatic signals is independent of the ongoing LC noise (Barbur et al., 1994). Section B shows thresholds for detection of luminance contrast-defined motion buried in either static or dynamic LC noise (see stimulus appearance above the graph). The results reveal the effectiveness of dynamic LC noise in masking the detection of luminance contrast signals. Static LC noise, on the other hand, has little or no effect on the detection of LC-defined motion.
that the detection of chromatic signals is not affected by the presence of LC noise.
Luminance contrast thresholds

The usefulness of the results shown in Fig. 2A becomes more apparent when one examines the detection of luminance contrast-defined stimuli buried in dynamic LC noise. Thresholds for detection of static stimuli were found to increase linearly with LC noise amplitude (Barbur et al., 1994). We have extended these findings by measuring thresholds for detection of luminance contrast-defined motion, with the moving stimulus buried in either static or dynamic LC noise. The results shown in Fig. 2B reveal the effectiveness of dynamic LC noise in masking the sensitivity of mechanisms that can normally respond to and detect very small luminance contrast changes. The results are of great interest since they demonstrate clearly that luminance contrast-defined motion involves only transient mechanisms that are not affected by static LC noise. This finding confirms a very well-known observation that camouflage achieved by appropriate choice of static, spatial noise breaks down completely when movement is involved. We have extended these
observations to luminance contrast-defined motion signals buried in static or dynamic chromatic contrast noise. The results plotted in Fig. 6 show that neither static nor dynamic chromatic contrast noise has any effect on the detection of luminance-defined coherent motion. These findings are important since they confirm the existence of a motion channel that responds exclusively to luminance contrast-defined motion.
Reaction times

Manual reaction times are known to decrease asymptotically with increasing stimulus contrast (Lupp et al., 1973; Mollon and Krauskopf, 1973; Breitmeyer and Julesz, 1975; Kranda, 1983). In the flat, asymptotic region of this function, reaction times become independent of stimulus strength and can therefore be used to measure differences in processing times when different stimulus attributes are involved (Barbur et al., 1998). In this study we wanted to investigate whether reaction times to suprathreshold chromatic stimuli are affected by the presence of luminance contrast noise. Figure 3A shows the relationship between reaction time and chromatic saturation for colored stimuli buried in small (3%) and large (25%)
Fig. 3. (A, B) Manual reaction times (RTs) to color and luminance contrast signals. RTs to color-defined stimuli exhibit the well-known asymptotic dependence on stimulus strength, but are not affected by the presence of dynamic LC noise (section A). Contrast thresholds for detection of luminance stimuli increase monotonically with LC noise amplitude (see Fig. 2B). Consistent with this observation is the increase in RT with dynamic LC noise amplitude for luminance, but not for color-defined stimuli (section B). The stimulus was a disc of 6° diameter and was buried in dynamic LC noise. The luminance of the uniform background field was 24 cd m⁻² and its CIE-(x, y) chromaticity was 0.305, 0.323. The achromatic stimulus (section B) had a luminance contrast of 0.9. The colored stimulus was isoluminant with respect to the uniform background and its chromaticity was 0.425, 0.323. For convenience we use the distance between the chromaticity of the test stimulus and the adjacent uniform background as a measure of chromatic saturation. Both the luminance contrast of the achromatic stimulus and the chromatic saturation of the colored stimulus were sufficiently large to yield the shortest RTs possible for either luminance or color signals (see Fig. 4A). The mean data points shown for each stimulus attribute represent the results of randomly interleaved trials carried out in one session with 64 presentations per stimulus.
levels of dynamic LC noise. Figure 3B shows how reaction times to color and luminance contrast-defined stimuli depend on the amplitude of dynamic LC noise. In the case of an LC-defined test stimulus, the presence of dynamic LC noise causes a systematic increase in reaction time. Reaction times to the color-defined stimulus, on the other hand, show little or no dependence on LC noise amplitude. We have also investigated differences in processing times for color and luminance using stimuli that yield the shortest reaction times. Figure 4B shows the statistical distribution of reaction times for the two stimulus conditions. There is a clear 20-ms difference between the two conditions and this difference remains relatively unchanged when reaction times are plotted as a function of stimulus strength (Fig. 4A). This experimental approach, with the colored test stimulus buried in dynamic LC noise, ensures that the measured reaction times reflect the processing of chromatic signals with no contribution from luminance contrast signals, even when suprathreshold colored stimuli are employed.
Pupil color responses

Pupil responses to chromatic stimuli and the involvement of color-opponent mechanisms in the control of the pupil response have been reported in previous studies (Young and Alpern, 1980; Krastel et al., 1985; Barbur and Harlow, 1990). More recently, we reported pupil responses that seemed to be linked directly to the generation of chromatic afterimages in two patients with unilateral damage to the primary visual cortex (Barbur et al., 1999). The patients were completely unaware of the presentation of the isoluminant stimulus, or its chromatic afterimage. It is of great interest to establish whether the observed pupil constrictions to chromatic afterimages are associated entirely with chromatic mechanisms or whether such responses reflect the involvement of magnocellular pathways, as has been suggested in earlier studies (Kelly and Martinez-Uriegas, 1993). If transient, magnocellular pathways contribute to this response, both the generation of the chromatic afterimage and the subsequent pupil response are
Fig. 4. (A, B) The results illustrate the asymptotic behavior of RTs with increasing stimulus strength for both color and luminance signals (section A). The data are fitted well by an exponential equation of the form c1*exp(−c2*x) + c3, where x represents the stimulus strength and c1, c2, and c3 are constants. The best-fit parameters were 57.018*exp(−33.519*x) + 204.2 for color RTs, and 33.364*exp(−38.39*x) + 184.4 for luminance RTs. The frequency histograms shown in section B illustrate the clear difference in processing times for luminance and color signals. The stimulus parameters employed were again selected to yield the shortest RTs for both color and luminance (Color: chromatic saturation = 0.12 units, n = 512, mean RT = 206 ms, se = 1.26 ms, LC noise = 25%; Luminance: contrast (L/Lb) = 100%, n = 512, mean RT = 186 ms, se = 1.25 ms, LC noise = 3%). The difference of 20 ms between the mean RTs to color and luminance is highly significant because of the large sample sizes employed.
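The asymptotic RT function quoted in the caption can be fitted with a standard nonlinear least-squares routine. The sketch below assumes raw (stimulus strength, mean RT) pairs are available; the data points shown are invented purely to make the example runnable and do not reproduce the measured values.

```python
import numpy as np
from scipy.optimize import curve_fit

def rt_model(x, c1, c2, c3):
    """Asymptotic reaction-time model: RT = c1 * exp(-c2 * x) + c3."""
    return c1 * np.exp(-c2 * x) + c3

# Invented example data: chromatic saturation vs mean RT (ms).
strength = np.array([0.02, 0.04, 0.06, 0.08, 0.10, 0.12])
rt_ms = np.array([233.0, 219.0, 212.0, 208.0, 206.5, 206.0])

params, _ = curve_fit(rt_model, strength, rt_ms, p0=(60.0, 30.0, 200.0))
c1, c2, c3 = params
print(f"RT ~ {c1:.1f} * exp(-{c2:.1f} * x) + {c3:.1f}")
```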
Fig. 5. Effect of dynamic LC noise on pupil responses to the onset of a colored stimulus and its corresponding chromatic afterimage. The stimulus was a disc of 8° diameter and was generated in the center of a uniform background field that approximated daylight at 6500 K (L = 24 cd m⁻², x = 0.305, y = 0.323). The duration of the stimulus was 2 s and this is indicated by the rectangular pulse along the abscissa. The stimulus was photopically isoluminant and, in addition, it also had zero rod contrast. The stimulus color can be described as green and its complementary afterimage was magenta-red. The colored stimulus was buried in dynamic luminance contrast noise of varying amplitude as shown in the legend. The results show that the dynamic LC noise has very little effect on the pupil color response. This observation is consistent with the independent processing of chromatic signals. The second pupil constriction at stimulus offset also remains relatively unaffected by dynamic or static LC noise.
likely to be affected strongly by the presence of dynamic LC noise. The results of Fig. 5 show that neither the response to the onset of the chromatic stimulus, nor that to the subsequent afterimage is affected significantly by dynamic LC noise. This observation is in marked contrast to pupil responses to luminance contrast increments that show a rapid decrease when the test flashes are buried in dynamic LC noise (Barbur, 2003) (Fig. 6).
Fig. 6. Effect of static or dynamic chromatic contrast noise on thresholds for detection of coherent motion defined by luminance contrast. Chromatic contrast noise was generated by allowing the chromaticity of each check in the array to change randomly along a line of fixed length, pointing away from the background chromaticity point towards a specified location on the spectrum locus in the CIE-(x, y) chromaticity chart. The luminance of each check remained unchanged and equal to that of the uniform background (24 cd m⁻²). The length of this line determined the amplitude of chromatic noise. The latter varied from 0.005 units (close to the color detection threshold) to 0.06 units (well above threshold). The noise was either static or changed randomly, every 80 ms. The subject's task was to detect the direction of motion of the test pattern defined by luminance contrast. The results show that luminance contrast-defined motion is not affected by either static or dynamic chromatic contrast noise.
Motion detection thresholds

Previous experiments involving threshold detection of chromatic stimuli, the measurement of reaction times, and pupil responses to suprathreshold stimuli show that both threshold and suprathreshold chromatic signals are processed independently of either static or dynamic LC noise. Based on these findings it seemed natural to expect that the processing of color-defined coherent motion would also be independent of LC noise. The first experiment carried out involved measurement of color thresholds for detection of coherent motion of randomly distributed colored
Fig. 7. Effect of dynamic LC noise on thresholds for detection of color-defined motion. Unlike color detection thresholds, RTs, and pupil color responses, which are not affected by LC noise, the perception of coherent motion of randomly distributed, colored checks breaks down completely in the presence of dynamic LC noise. Static LC noise, on the other hand, has little or no effect on thresholds for detection of color-defined coherent motion. When dynamic LC noise is present, the random motion of luminance contrast-defined checks interferes with the coherent motion of the colored checks and this causes the threshold for detection of coherent motion to increase linearly with LC noise. Interestingly, when the colored checks are constrained to form recognizable spatial primitives (as shown in the right-hand side image of the stimulus above the graph), the perception of coherent motion is restored completely and becomes independent of the amplitude of LC noise. At least 12 subjects took part in various stages of this study, and the results above have been reproduced by each of these subjects.
checks, buried in dynamic LC noise. The results were unexpected in that the presence of random motion of luminance contrast-defined background checks caused the complete breakdown of color-defined coherent motion. Thresholds for detection of color-defined coherent motion increased linearly with the amplitude of dynamic LC noise (Figs. 7 and 8C). This observation is completely consistent with a motion mechanism that extracts and combines luminance and color-defined motion signals (Gorea and Papathomas, 1989). Further serendipitous findings revealed that when the colored checks were no longer
distributed randomly over the array, but were constrained to form recognizable spatial primitives, similar to those shown in Fig. 7, the perception of color-defined coherent motion was restored completely and was no longer affected by even large amplitudes of dynamic LC noise. This observation is completely consistent with color-defined motion being mediated by feature-tracking mechanisms, as has been suggested in several previous studies (Cavanagh and Seiffert, 1999; Lu et al., 1999). Sperling and colleagues went further to suggest that color-defined motion relies exclusively on feature-tracking mechanisms. The breakdown of color-defined coherent motion when the moving colored checks are distributed randomly over the pattern (Fig. 7) is not inconsistent with this claim, provided the random motion of background checks is also detected largely by feature-tracking mechanisms. Other studies have shown that first-order motion mechanisms respond best to fast-moving stimuli with a temporal frequency cut-off as high as 15 Hz. Feature-tracking mechanisms, on the other hand, extract motion after processing of distinct spatial features at discrete locations in the visual field has been secured. Accurate position information is therefore important when motion detection relies on feature-tracking and the additional processes involved set a lower temporal cut-off limit of 3 Hz (Lu et al., 1999; Lu and Sperling, 1995). Based on this knowledge we designed a new experiment to investigate whether the very slow coherent motion of colored checks is affected by high-speed, random motion of LC-defined background checks. If color-defined motion relies entirely on feature-tracking mechanisms then the high-speed, first-order motion signals of background checks should have little or no effect on the detection of coherent motion of the colored checks. The results of Fig. 9A and B show that the selection of background noise parameters that favor the properties of first-order motion mechanisms has little or no effect on the interaction between the coherent motion of color-defined checks and the random motion of background checks.
A measure of suprathreshold chromatic sensitivity

Figure 8 shows how we can make use of coherent motion thresholds to derive both a threshold and
Fig. 8. (A–D) Measures of threshold and suprathreshold chromatic sensitivity based on thresholds for detection of color-defined coherent motion. Section A shows examples of coherent motion thresholds measured for different colors (i.e., different directions of chromatic modulation in the CIE-(x, y) diagram). Each direction is indicated by the angle it makes with the abscissa. For example, 0° corresponds to a displacement, away from background chromaticity, toward the red region of the spectrum locus. Similarly, 70° corresponds to the yellow region and 240° corresponds to the blue region. The stimulus employed is that shown in Fig. 6 (on the right) and consists of randomly generated, recognizable, color-defined spatial primitives. The thresholds for detection of the colored pattern are not significantly different to those for detection of coherent motion and remain unaffected by dynamic LC noise. The thresholds were therefore averaged for each direction and the reciprocal of the mean threshold provides a measure of chromatic sensitivity. Similar measurements were carried out for several directions of chromatic displacement toward different points on the spectrum locus and the corresponding chromatic sensitivity function is shown in section B. These data are equivalent to the chromatic threshold ellipse shown in Fig. 2A. When the moving, color-defined checks are distributed randomly within the array, the thresholds for detection of coherent motion increase monotonically with the amplitude of dynamic LC noise. Section C shows examples for two different stimulus directions. Experimentally one finds that the gradient of each graph varies systematically with the direction of chromatic displacement and this provides a measure of 'noise-equivalent' chromatic sensitivity. This can be realized by computing the reciprocal of the gradient of the least-squares line fitted to each set of data points (as shown in section C). Such measurements were repeated for a number of different color directions and the corresponding suprathreshold chromatic sensitivity curve is shown in section D. Although the two estimates of chromatic sensitivity have some similarity, the noise-equivalent technique shows that suprathreshold, red stimuli are more effective than any other colors.
a suprathreshold measure of chromatic sensitivity. The presence of distinct spatial features makes the detection of coherent motion independent of LC noise. This observation justifies the averaging of data such as those shown in Fig. 8A for each direction of chromatic displacement, specified as an angle measured with respect to the x-axis. The reciprocal of these thresholds provides a measure of chromatic sensitivity, with the blue–yellow and the red–green axes indicating the major and minor semiaxes of the chromatic threshold ellipse (Fig. 8B). When the moving stimulus does not contain distinct spatial features, coherent motion thresholds increase systematically with dynamic LC noise. For a given direction of chromatic displacement, the gradient of the measured straight-line relationship represents the
Fig. 9. (A, B) Effect of fast and slow random motion of background checks on thresholds for detection of color-defined coherent motion. The stimulus consisted of randomly distributed colored checks buried in dynamic luminance contrast noise (see Fig. 7). Thresholds for detection of coherent motion of the colored checks are plotted as a function of LC noise. The colored checks moved slowly in a horizontal direction at a constant speed of 2.3° s⁻¹. Fast, random motion of background checks was generated by allowing the luminance of each check to vary randomly within a specified range (i.e., the amplitude of dynamic LC noise) every 13.3 or 26.7 ms. A longer interval of 80 ms between successive changes of luminance was employed to generate slow, random motion of background checks. Thresholds for detection of coherent movement are shown for two different directions of chromatic modulation in the CIE-(u′, v′) diagram. These correspond to red (section A) and to green (section B) stimulus colors. The results do not reveal a significant difference between 'fast' and 'slow' random motion of background checks.
chromatic signal per unit luminance contrast noise. The reciprocal of this quantity is a measure of suprathreshold chromatic sensitivity (Fig. 8D). The results suggest that when suprathreshold colors are involved, ‘red’ stimuli have the greatest sensitivity.
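Numerically, the noise-equivalent measure is just the reciprocal of the slope of the threshold-versus-noise line. A minimal sketch (threshold values invented for illustration):

```python
import numpy as np

def noise_equivalent_sensitivity(noise_amplitude, motion_threshold):
    """Fit a least-squares line to coherent-motion thresholds measured at
    several dynamic LC noise amplitudes and return 1 / slope, the
    noise-equivalent measure of suprathreshold chromatic sensitivity."""
    slope, _intercept = np.polyfit(noise_amplitude, motion_threshold, deg=1)
    return 1.0 / slope

# Invented example: thresholds (chromatic saturation units) vs LC noise amplitude.
noise = np.array([0.05, 0.10, 0.15, 0.20, 0.25])
threshold = np.array([0.012, 0.021, 0.033, 0.041, 0.052])
print(noise_equivalent_sensitivity(noise, threshold))
```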
Position-change thresholds

Since feature-tracking mechanisms for motion perception rely on the processing of spatial features at discrete locations in the visual field, accurate position information may be an essential requirement for the normal functioning of these mechanisms. It is therefore possible that the random motion of background checks does not interfere directly with the coherent motion signals of color-defined checks, but instead it weakens the ability of the system to secure accurately the position information of colored checks. In order to test this hypothesis, a new experiment was designed to measure position-change thresholds, as described in Experimental methods, Position-change thresholds. The results of these experiments are shown in Fig. 10. Static LC noise, and dynamic or static color contrast noise were found to have little or no effect on position-change thresholds. Dynamic LC noise, on the other hand, causes position-change thresholds to increase linearly with LC noise amplitude.

Vertical misalignment detection thresholds
The results of Fig. 10 suggest that dynamic LC noise affects the ability of the visual system to extract accurately the position information of color-defined checks and that, in turn, this causes the failure of feature-tracking motion mechanisms. This conclusion is consistent with the perceptual appearance of the colored checks. When buried in dynamic LC noise, the perceived location of colored checks is not stable, but instead the checks appear to jump around and often to follow the random motion of LC-defined checks. The very similar effect that dynamic and static LC and CC noise have on position-change and motion detection thresholds of colored checks suggests that position-change thresholds may also rely on direction-sensitive motion mechanisms. In other words, the increased position-change thresholds when colored stimuli are buried in dynamic LC noise
Fig. 10. (A, B) Effect of luminance contrast (section A) and color noise (section B) on 'position-change' thresholds. Four color-defined or luminance contrast-defined checks were presented, one in each quadrant, on a circle of radius 1.8°, centered on the fixation stimulus. The subject's task was to maintain fixation and judge which of the four checks changed location during each presentation, by pressing one of four response buttons that formed a square, each button corresponding to one stimulus quadrant. Two sequential correct responses were required before the contrast of the stimulus was altered in a staircase procedure. The background noise was either static or dynamic with sequential changes every 40 ms. Position-change thresholds are not affected by either static or dynamic chromatic contrast noise when the stimulus checks are defined by luminance contrast. This is also the case when color-defined checks are buried in static LC noise. When the colored checks are buried in dynamic LC noise, position-change thresholds were found to increase systematically with LC noise amplitude.
reflect the poorer performance of a motion mechanism that extracts and combines first-order color and luminance contrast-defined motion signals. If this hypothesis is correct, then the accurate position information of colored checks is not necessarily lost when these checks are buried in dynamic LC noise. To test this hypothesis we require a visual task that relies on accurate position information, but does not confound position-change and motion detection thresholds. With this in mind, we designed the vertical misalignment task described in Experimental methods, Misalignment detection thresholds, and investigated how both static and dynamic background noise affect thresholds for detection of vertical misalignment of stimulus checks. The results of Fig. 11 show convincingly that thresholds for detection of vertical misalignment of colored checks remain completely unaffected by dynamic LC noise, for both long and short stimulus durations. The accurate position information of colored checks is therefore preserved and available for use in visual tasks that do not rely on direction discrimination mechanisms.
Discussion

The experimental findings illustrate the usefulness of background perturbation techniques in revealing the interactions between visual mechanisms involved in the processing of different stimulus attributes. In spite of the large number of combinations of different background and test stimulus parameters, the experimental findings suggest that the majority of effects fall into three principal classes.
Class I — The case of no significant interaction

This outcome includes those stimulus conditions when the addition of noise has little or no effect on thresholds for detection of the test stimulus. This kind of 'null' finding is of great significance since it suggests that the detection of noise and test stimulus attributes involves separate mechanisms. The evidence derived from chromatic detection thresholds, reaction times, and pupil color responses suggests that chromatic
signals are processed independently from static or dynamic luminance contrast noise. This discovery has made possible the development of a number of tests of color deficiency and chromatic sensitivity that achieve a high level of luminance contrast masking, without any need to establish isoluminance conditions for each subject (Barbur et al., 1994).

Fig. 11. (A–C) Effect of dynamic LC noise on thresholds for detection of misaligned, vertically oriented, colored checks. In the absence of misalignment, the two stimulus checks and the fixation point were collinear and aligned vertically. Misalignment was then generated by presenting one of the two stimulus checks shifted either to the left or to the right of the vertical line. The presentation of the stimulus was brief, but always preceded by 0.5 s of dynamic noise. Following each presentation, the subject had to indicate the direction of misalignment by pressing one of four buttons, each button corresponding to one of the four possible tilt directions. Two sequential correct responses were again required before the contrast of the stimulus was altered in a staircase procedure. Results are shown for two stimulus eccentricities of 1.8° and 3.6° and for presentation durations of 107 and 373 ms. The results show that tilt thresholds generated by a small change in stimulus position vary systematically with both eccentricity and stimulus presentation time, but are completely unaffected by either static or dynamic LC noise.

Class II — The case of shared or identical mechanisms

This type of interaction causes the threshold for detection of the test pattern to increase monotonically with noise amplitude. This finding suggests that the same mechanisms are involved in the detection of both test pattern and background noise attributes. Although LC-defined motion remains largely unaffected by static LC noise, or static and dynamic chromatic contrast noise, dynamic LC noise causes the threshold for detection of LC-defined motion to increase linearly with noise amplitude. These findings illustrate the usefulness of dynamic LC noise in masking the detection of luminance contrast signals. They also reveal the existence of a first-order motion detection mechanism that receives exclusively transient achromatic inputs.

Class III — The case of both separate and shared stages of visual processing
This class of observations is unusual in that although the added noise fails to cause a significant increase in thresholds for detection of the test pattern, some other stimulus attribute, such as the perception of coherent motion, can be disrupted by the noise. This kind of finding suggests that although separate mechanisms are involved in the processing of background and test target signals, common mechanisms exist that combine specific stimulus attributes such as motion signals carried by test stimulus and background checks. The breakdown of color-defined coherent motion when randomly distributed colored checks are buried in dynamic luminance contrast noise is consistent with the existence of a mechanism that extracts and combines the coherent motion of color-defined checks with the random motion of LC-defined background checks (Gorea and Papathomas, 1989).
Suprathreshold measures of chromatic sensitivity

Both reaction times and measurement of pupil color responses with suprathreshold stimuli suggest that the extraction of chromatic signals is independent of LC noise, even when suprathreshold stimuli are involved. The observed breakdown of color-defined coherent motion when the colored checks are buried in LC noise can be used to provide a measure of suprathreshold chromatic sensitivity. It is interesting to note that thresholds for detection of coherent
motion increase linearly with LC noise amplitude (Figs. 7 and 8C) for all directions of chromatic modulation investigated. Threshold measures of chromatic sensitivity cannot, however, be used to predict the suprathreshold effectiveness of chromatic signals. The results of Fig. 8 suggest that although ‘red’ and ‘green’ stimuli of equal chromatic strength have approximately equal threshold sensitivity, the same suprathreshold colors are no longer equally effective with the ‘red’ showing the greatest sensitivity.
Feature-tracking mechanisms for color-defined motion

When the moving stimulus contains unique spatial primitives, the perception of color-defined coherent motion becomes completely independent of LC noise. This observation provides strong support for the existence of feature-tracking mechanisms that mediate color-defined motion (Cavanagh and Seiffert, 1999). Sperling and colleagues go further to suggest that feature tracking is the only mechanism for detection of color-defined motion (Lu et al., 1999). Their conclusion is based on a number of experiments with isoluminant stimuli, the results of which appear to follow strictly the properties of feature-tracking mechanisms (Lu and Sperling, 1995). Other studies provide clear evidence for the existence of a mechanism that extracts and combines luminance and color-defined motion signals (Cavanagh and Favreau, 1985; Gorea and Papathomas, 1989; Cropper and Derrington, 1994, 1996).
Alternative hypotheses

Given such diverse findings, it is of interest to establish whether the visual system has developed visual mechanisms capable of extracting and combining color and luminance contrast-defined motion signals. We have therefore examined two other hypotheses that could, in principle, be used to explain the breakdown of coherent motion when moving, randomly distributed colored checks are buried in dynamic LC noise. The first hypothesis is that detection of the random motion of LC-defined background checks is also mediated largely by feature-tracking mechanisms. The observed breakdown of
color-defined motion when the spatiotemporal characteristics of the background noise favor the properties of first-order motion mechanisms fails to support this hypothesis. The second hypothesis explains the breakdown of coherent motion as a loss of accurate position information of color-defined checks. When buried in dynamic LC noise, the spatial location of color-defined checks can no longer be secured accurately and this could, in principle, cause the poor functioning of feature-tracking motion mechanisms. This hypothesis is consistent with the poor perceived localization of checks when buried in dynamic LC noise. The colored checks appear to follow the random motion of LC-defined checks and their perceived spatial location becomes uncertain. The experiments designed to test this hypothesis reveal a linear increase in position-change thresholds with dynamic LC noise. Although this finding is consistent with feature-tracking being the only mechanism for color-defined motion, it is interesting to note that both position-change and coherent motion thresholds exhibited the same dependence on every form of background noise employed in this study. This observation has prompted us to question whether position-change thresholds of color-defined checks are also mediated by the same motion detection mechanism. The poorer performance of the same motion mechanism that extracts and combines first-order color and luminance contrast-defined motion signals could also explain the subject's increased position-change thresholds when colored checks are buried in dynamic LC noise. It is also possible that the accurate position information of colored checks is not lost when these checks are buried in dynamic LC noise and may therefore be available to other visual mechanisms, in spite of the perceived uncertainty of spatial location. The new vertical misalignment task relies on accurate processing of position information, but does not confound position-change and motion detection thresholds. The finding that neither static nor dynamic LC noise has any effect on chromatic thresholds for detection of vertically misaligned checks demonstrates that accurate position information of color-defined checks is available and can be used by other visual mechanisms, in spite of the perceived uncertainty of spatial location. One last remaining alternative hypothesis is that in spite of the perceived uncertainty
of spatial location, the subject manages to carry out the vertical misalignment task by averaging and extracting the mean location of colored checks. This new hypothesis does, however, predict that standard deviations for vertical misalignment thresholds should increase with LC noise amplitude. When short stimulus durations are employed and averaging for mean location is no longer possible, mean thresholds for detection of vertical misalignment should also increase with LC noise amplitude. The results show a complete absence of significant changes in either standard deviation or mean thresholds with increased LC noise, when the presentation time is reduced from 373 to 107 ms (Fig. 11C). These findings suggest that the position information of suprathreshold colored checks is preserved and remains independent of LC noise. The breakdown of color-defined motion is not, therefore, caused by the loss of position information. The results suggest that chromatic signals are processed independently of luminance contrast noise and are then available for use in different visual tasks either independently, or in combination with luminance contrast signals.
Classification of motion mechanisms

The use of luminance and chromatic background noise reveals the existence of at least three motion mechanisms. The first mechanism responds exclusively to luminance contrast-defined motion. Its sensitivity is not affected by the presence of static luminance contrast or static or dynamic chromatic contrast noise. It is a first-order mechanism since the presence of distinct luminance contrast-defined features fails to restore the detection of motion when the moving features are buried in dynamic luminance contrast noise (Fig. 2B). A second mechanism can extract and combine luminance and color-defined motion signals and this accounts for the breakdown of coherent motion when the random motion of luminance contrast-defined checks is combined with the coherent motion of spatially identical colored checks. When the colored stimulus contains recognizable features that are distinct from background noise, the detection of color-defined motion must rely on feature-tracking mechanisms since the perception of coherent motion is restored completely and becomes
independent of background noise. Further work is needed to establish the extent to which these feature-tracking mechanisms respond exclusively to chromatic signals or extract and combine the motion of features defined by both luminance and chromatic contrast. Results from other studies that also postulate the existence of three mechanisms to account for luminance and color-defined motion (Gorea and Papathomas, 1989) suggest that the third mechanism responds exclusively to color-defined motion.
A case for 'double-blindsight' in human vision

The work reported here has focused largely on the processing of chromatic and achromatic signals in relation to motion perception. Experiments designed to test different hypotheses have produced a number of unexpected findings. One such finding is our ability to make use of accurate, color-defined position information in a vertical misalignment task that does not confound motion and position-change thresholds. The ability to make use of visual signals, even when the subject is unconscious of any visual input, has been demonstrated in numerous studies. This use of visual information that defies perception has been appropriately labeled as 'Blindsight' (Weiskrantz, 1986). The findings that have emerged from this investigation reveal the ability of normal subjects to discard misleading, consciously perceived visual signals. The results show that detection of vertical misalignment of colored checks remains completely unaffected by dynamic LC noise, in spite of the perceived uncertainty of spatial location and the severe impairment of position-change thresholds under the same stimulus conditions. The ability to make accurate use of visual information in spite of misleading percepts can be appropriately described as 'Double-blindsight'.
Acknowledgments

I acknowledge the MRC and the Wellcome Trust for support with the equipment employed in these studies. I am grateful to A. Harlow, A. Goodbody and M. Rodriguez-Carmona for their help with equipment, programming and the numerous experiments.
I also thank David Edgar for the critical reading of the manuscript and Manfred Fahle and Michael Morgan for their useful comments and criticism.
References

Alexandridis, E., Leendertz, J.A. and Barbur, J.L. (1992) Methods of studying the behaviour of the pupil. J. Psychophysiol., 5: 223–239. Barbur, J.L. (2003) Learning from the pupil — studies of basic mechanisms and clinical applications. In: Chalupa L.M. and Werner J.S. (Eds.), The Visual Neurosciences. MIT Press, Cambridge, MA. Barbur, J.L. and Harlow, A.J. (1990) Pupillary responses to stimulus structure and colour. Perception, 19: 412. Barbur, J.L., Thomson, W.D. and Forsyth, P.M. (1987) A new system for the simultaneous measurement of pupil size and two-dimensional eye movements. Clin. Vis. Sci., 2: 131–142. Barbur, J.L., Plant, G. and Williams, C. (1992) Colour discrimination measurements in patients with cerebral achromatopsia. Perception, 21: 74. Barbur, J.L., Harlow, A.J. and Plant, G.T. (1994) Insights into the different exploits of colour in the visual cortex. Proc. R. Soc. Lond. B, 258: 327–334. Barbur, J.L., Birch, J. and Harlow, A.J. (1996) Colour vision testing using spatiotemporal luminance masking: psychophysical and pupillometric methods. In: Drum B. (Ed.), Colour Vision Deficiencies XI. Kluwer Academic Publishers, Netherlands, pp. 417–426. Barbur, J.L., Wolf, J. and Lennie, P. (1998) Visual processing levels revealed by response latencies to changes in different visual attributes. Proc. R. Soc. Lond. B, 265: 2321–2325. Barbur, J.L., Weiskrantz, L. and Harlow, J.A. (1999) The unseen color aftereffect of an unseen stimulus: insight from blindsight into mechanisms of color afterimages. Proc. Natl. Acad. Sci. USA, 96: 11637–11641. Birch, J., Barbur, J.L. and Harlow, A.J. (1992) New method based on random luminance masking for measuring isochromatic zones using high resolution colour displays. Ophthalm. Physiol. Opt., 12: 133–136. Breitmeyer, B.G. and Julesz, B. (1975) The role of on and off transients in determining the psychophysical spatial frequency response. Vis. Res., 15: 411–415. Cavanagh, P. (1991) The contribution of colour to motion. In: Valberg A. and Lee B.B. (Eds.), From Pigments to Perception. Advances in Understanding Visual Processes. Plenum, New York, pp. 151–164. Cavanagh, P. and Favreau, O.E. (1985) Color and luminance share a common motion pathway. Vis. Res., 25: 1595–1601.
Cavanagh, P. and Seiffert, A.E. (1999) Position-based motion perception for color and texture stimuli: effects of contrast and speed. Vis. Res., 39: 4172–4185. CIE (1964) CIE Proceedings 1963 (Vienna Session), Vol. B., Committee Report E-1.4.1, Bureau Central de la CIE, Paris 1964, pp. 209–220. Cowey, A., Stoerig, P. and Bannister, M. (1994) Retinal ganglion cells labelled from the pulvinar nucleus in macaque monkeys. Neuroscience, 61: 691–705. Cropper, S.J. and Derrington, A.M. (1994) Motion of chromatic stimuli: first-order or second-order? Vis. Res., 34: 49–58. Cropper, S.J. and Derrington, A.M. (1996) Rapid colourspecific detection of motion in human vision. Nature, 379: 72–74. Derrington, A.M. and Lennie, P. (1984) Spatial and temporal contrast sensitivities of neurones in lateral geniculate nucleus of macaque. J. Physiol. (Lond.), 357: 219–240. Gorea, A. and Papathomas, T.V. (1989) Motion processing by chromatic and achromatic visual pathways. J. Opt. Soc. Am., 6: 590–602. Gorea, A. and Papathomas, T.V. (1991) Texture segregation by chromatic and achromatic visual pathways: and analogy with motion processing. J. Opt. Soc. Am., 8: 386–393. Hendry, S.H.C. and Yoshioka, T. (1994) A neurochemically distinct third channel in the macaque dorsal lateral geniculate nucleus. Science, 264: 575–577. Hubel, D.H. and Wiesel, T.N. (1972) Laminar and columnar distribution of geniculo-cortical fibres in the macaque monkey. J. Comp. Neurol., 146: 421–450. Kaplan, E. and Shapley, R.M. (1982) X and Y cells in the lateral geniculate nucleus of the macaque monkey. J. Physiol. (Lond.), 330: 125–144. Kelly, D.H. and Martinez-Uriegas, E. (1993) Measurements of chromatic and achromatic afterimages. J. Opt. Soc. Am., 10: 29–37. King-Smith, P.E. and Carden, D. (1976) Luminance and opponent-color contributions to visual detection and adaptation and to temporal and spatial integration. J. Opt. Soc. Am., 66: 709–717. Kranda, K. (1983) Analysis of reaction times to coloured stimuli. Ophthal. Physiol. Opt., 3: 223–231. Krastel, H., Alexandridis, E. and Gertz, J. (1985) Pupil increment thresholds are influenced by color opponent mechanisms. Ophthalmologica, 191: 35–38. Lu, Z.L. and Sperling, G. (1995) The functional architecture of human visual motion perception. Vis. Res., 35: 2697–2722. Lu, Z.L., Lesmes, L.A. and Sperling, G. (1999) The mechanism of isoluminant chromatic motion perception. Proc. Natl. Acad. Sci. USA, 96: 8289–8294. Lupp, U., Hauske, G. and Wolf, W. (1973) Perceptual latencies to sinusoidal gratings. Vis. Res., 13: 2219–2234. MacAdam, D.L. (1942) Visual sensitivities to color differences in daylight. J. Opt. Soc. Am., 32: 247–274.
Mollon, J.D. and Krauskopf, J. (1973) Reaction time as a measure of the temporal response properties of individual colour mechanisms. Vis. Res., 13: 27–40. Mullen, K.T. and Baker, C.L. (1985) A motion aftereffect from an isoluminant stimulus. Vis. Res., 25: 685–688. Perry, V.H. and Cowey, A. (1981) The morphological correlates of X- and Y-like retinal ganglion cells in the retina of monkeys. Exp. Brain Res., 43: 226–228. Perry, V.H., Oehler, R. and Cowey, A. (1984) Retinal ganglion cells that project to the dorsal lateral geniculate nucleus in the macaque monkey. Neuroscience, 12: 1101–1123. Reffin, J.P., Astell, S. and Mollon, J.D. (1991) Trials of a computer-controlled colour vision test that preserves the advantages of pseudoisochromatic plates. In: Drum B., Moreland J.D. and Serra A. (Eds.), Colour Vision Deficiencies X. Kluwer Academic Publishers, Dordrecht, Netherlands, pp. 69–76.
Rodieck, R.W. (1988) The primate retina. In: Alan R. Liss (Ed.), Chapter 4 — Neurosciences: Comparative Primate Biology. John Wiley & Sons, Inc., New York. pp. 203–278. Rodieck, R.W. (1991) Which cells code for color? In: Valberg A. and Lee B.B. (Eds.), From Pigments to Perception — Advances in Understanding Visual Processes. Plenum, New York, pp. 83–94. Weiskrantz, L. (1986) Blindsight. Clarendon Press, Oxford. Wyszecki, G. and Stiles, W.S. (1982) Color Science — Concepts and Methods, Quantitative Data and Formulae. John Wiley & Sons, Inc., New York. Young, R.S.L. and Alpern, M. (1980) Pupil responses to foveal exchange of monochromatic lights. J. Opt. Soc. Am., 70: 697–706. Young, R.S.L. and Teller, D.Y. (1991) Determination of lights that are isoluminant for both scotopic and photopic vision. J. Opt. Soc. Am., 8: 2048–2053.
CHAPTER 18
Stimulus cueing in blindsight

Alan Cowey1,* and Petra Stoerig2

1 Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford OX1 3UD, UK
2 Institute of Experimental Psychology, Heinrich Heine University, Universitätsstrasse 1, 40225 Düsseldorf, Germany

*Corresponding author. Tel.: +01865-271352; Fax: +01865-310447; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14401-8
Abstract: When visual stimuli are presented in the cortically blind visual field of patients or monkeys with verified destruction of striate cortex, many subjects can voluntarily respond to them. In studies of this blindsight, the on- and/or offset of the visual stimulus is usually known to the subject, either because it is signaled in some way or because the subject can present the stimulus himself. To study the effect of stimulus uncertainty on the responses of four hemianopic monkeys and one human hemianope, we compared trials on which the subjects themselves could instantly trigger the stimulus with trials on which the same stimulus appeared 1–7 s after the start-light that normally served as the trigger was first touched. The latter manipulation diminished both the percentage of trials on which the subjects responded and the percentage correct when they did respond. As the start-light disappeared when touched in the first but not second condition, we interpret our results as indicating an influential role for attention in blindsight. Although keeping attention focussed on the start-light and delaying the target impaired performance especially in the monkeys, localization was still significant in three and hardly affected in GY.
Introduction
Blindsight, as the Oxford English Dictionary now knows, is the ability to localize, and often additionally to discriminate between, different visual stimuli confined to a field defect caused by destruction of striate cortex or its thalamic input. Its defining characteristic is the lack of any awareness, or at least report of awareness, of the stimulus in the presence of statistically significant performance (Weiskrantz et al., 1974; Stoerig, 1999a,b). Among the features that can be discriminated are flux, motion (although not necessarily its direction, Barton and Sharpe, 1997), orientation, size, chroma, and flicker (see Stoerig and Cowey, 1997; Weiskrantz, 2001, for reviews). Performance with 2-alternative forced-choice guessing (2AFC) can be as high as 95–100% correct, barely worse than or as good as in the intact visual field (Pizzamiglio et al., 1984; Perenin, 1991; Cowey and Stoerig, 1997). While the published reports clearly demonstrate that performance depends on the stimulus and conditions used, the extent to which it depends on the subjects' foreknowledge, attention, and expectation about the stimuli presented in the cortically blind field has hardly been assessed. In formal, laboratory based tests the subjects usually know when a stimulus will be presented, either because its presentation is announced by a second, acoustic or visual, signal, or because it is presented within one of two successive intervals both of which are individually signaled (e.g. Barbur et al., 1980; Azzopardi and Cowey, 1997; Sahraie et al., 2002). When stimulus characteristics other than spatial location are being assessed, subjects also know where targets will be presented, and when functions other than detection are being assessed, subjects also know whether to expect a stimulus. Last but not least, human subjects are customarily informed about the type(s) of stimuli presented because the choice is usually limited to two or three that differ in some pertinent dimension such as color, size, motion, or shape which they are required to distinguish. The stimuli are also commonly shown
to the subjects in their normal visual field. All of this should, and probably does, allow the majority of subjects to focus their attention on a particular time, place, and possibly stimulus feature. Only monkeys with complete cortical blindness may, despite extensive training, remain ignorant about stimulus types. As far as we are aware, there are only two exceptions to this customary knowledge about the stimulus. One concerns pupillary responses to stimuli in the cortically blind field. For example, Weiskrantz et al. (1998, 1999) demonstrated a pupillary response in two hemianopic monkeys (neither of them among the monkeys used here) and in subject GY to behaviorally irrelevant grating stimuli presented at unpredictable times in the hemianopic field while the subjects fixated a cross whose dimming they had to detect. As pupil responses are reflexive and may even be present in unconscious patients, their resistance to stimulus uncertainty may simply reflect their great sensitivity. The second exception concerns a series of experiments with hemianopic monkeys who had to move their eyes to small visual targets presented in the periphery. When the signal normally provided by the extinction of the fixation spot was omitted, only monkeys operated on as infants (5–6 weeks of age) still responded to the peripheral targets, while adult-operated animals largely failed (Moore et al., 1995, 1996, 1998). The authors suggested that these findings indicate that adult-lesioned animals have blindsight rather than residual conscious vision, thus implying that only animals lesioned as infants, when there is much more plasticity (Payne et al., 1996), retained some kind of visual awareness of the stimuli and therefore responded when no separate signal was given. We recently argued that the infant-lesioned monkeys may simply have developed more or better blindsight (Stoerig et al., 2002), and here address this issue by omitting the offset of the start-light, to which our monkeys were accustomed and which normally cued them to immediate stimulus presentation, and in addition introducing some temporal uncertainty as to when a target appears. The highly trained subjects were required to touch the position at which a stimulus appeared; two positions per hemifield were used. We compared performance in two conditions. In the first, touching the start-light extinguished it and instantaneously produced a target at one of four possible positions,
two in each hemifield. In the second, the start-light was disabled, touching it failed to extinguish it, and a target appeared 1–7 s after the monkeys first touched the now ineffective trigger, which they continued to do throughout the delay. The results show that the latter procedure greatly impairs both responsiveness and accurate spatial localization but without abolishing either. The localization performance of the human subject GY, especially, was hardly affected, possibly because he had to be told that the start-light was now ineffective but a target would still appear, possibly because he is aware of sufficiently salient stimuli in his hemianopic field (Barbur et al., 1980; Blythe et al., 1987; Weiskrantz et al., 1995; Stoerig and Barth, 2001), and thus possesses residual vision (Blythe et al., 1987; Stoerig et al., 2002) or Type 2 blindsight (Weiskrantz, 1998) rather than a complete dissociation between function and awareness. That two of the monkeys, and subsequently a third monkey in later tests, still significantly localized targets in their hemianopic field indicates that, despite being lesioned as young (in one instance elderly) adults, they have retained or regained some residual vision, although previous experiments suggested that they have blindsight rather than residual vision (Cowey and Stoerig, 1995). Alternatively, they are exhibiting blindsight even when their focal attention is not released from the central start-light, they no longer know that a target is to be expected, and the targets are temporally less predictable.
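Schematically, the two cueing conditions compared in this study can be summarized as follows (an illustrative sketch only; apart from the stated 1–7 s delay, the timings, labels, and function names are assumptions, since the actual touch-screen control software is not described):

```python
import random

def run_cueing_trial(condition, present_target, rng=None):
    """Outline of the two cueing conditions.

    'immediate': touching the start-light extinguishes it and a target appears
                 at once at one of four positions (two per hemifield).
    'delayed':   the start-light stays lit and ineffective, and the target
                 appears 1-7 s after it is first touched.
    present_target(position, delay_s) stands in for the display and
    response-collection code.
    """
    rng = rng or random.Random()
    position = rng.choice(["left upper", "left lower", "right upper", "right lower"])
    delay_s = 0.0 if condition == "immediate" else rng.uniform(1.0, 7.0)
    return present_target(position, delay_s)
```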
Methods Subjects Three adult male macaques (two M. mulatta and one M. fascicularis) and one female M. mulatta, all aged about 16 years, were studied under Personal and Project Licensing of the UK Home Office. The males were already adult when the left striate cortex was removed about 10 years before the present experiment. The female (Rosie) was first an unoperated control subject before having the same operation at age 14, two years before the present experiment. Details of surgery and examples of the histological sections through the excised occipital lobe containing most of the striate cortex, together with MR images
of the brains of the three males have been published (Cowey and Stoerig, 1997; Stoerig et al., 2002). The total aspiration of the striate cortex in the calcarine fissure after the lobectomy and the uniform degeneration of the dLGN has since been confirmed histologically in the males. Figure 1 shows, for the first time, representative sections through the region of the calcarine fissure and through the degenerated dLGN in these monkeys. Monkey Rosie is still taking
part in experiments but her excised and histologically sectioned occipital lobe shows that all lateral and medial striate cortex was successfully removed (Stoerig et al., 2002). The human subject, GY, has a right hemianopia with about 3–4° of macular sparing, following a lesion suffered at the age of 8, almost 40 years before the present experiment. MRI scans reveal an extensive hypodense region in the left occipital lobe,
Fig. 1. In each column the first three photomicrographs show coronal sections through the occipital lobes of three of the monkeys, rostral to the plane of the occipital lobectomy and spanning the remaining extent of the calcarine sulcus (c.s.), which contains the representation of the retina for far peripheral vision and is arrowed in each section. Note that in all three monkeys the calcarine fissure is missing in the left hemisphere. The large cavity in the left hemisphere of Dracula is a greatly enlarged lateral ventricle, which was also visible by structural MRI in the living brain several years before the present experiment (Cowey and Stoerig, 1995) and is therefore not a perfusion artefact. Below each set of whole brain sections are three sections through the lateral ventral thalamus, from caudal to rostral, showing the extensive and uniform degeneration and shrinkage of the dLGN, arrowed. At the bottom of each column is a single section, just behind the optic chiasma and stained for myelin, illustrating the optic tracts (arrowed) and the huge shrinkage of the left optic tract, that projects to the operated side of the brain. The top scale bar in the first column applies to the three top sections of all columns and equals 2 cm. The longer scale bar beneath it applies to the three photomicrographs through the thalamus and also equals 2 cm. It also applies to the bottom photomicrograph through the optic tracts, where it equals 1 cm.
destroying the striate cortex except for the foveal representation at the occipital pole and sparing most of the adjacent extra-striate cortex (Barbur et al., 1993; Baseler et al., 1999; Azzopardi and Cowey, 2001; Goebel et al., 2001). His impaired hemifield is relatively blind in that he has low-level conscious awareness especially of fast-moving stimuli of high luminance contrast. In the present experiments, he consistently said that he was usually aware of the stimuli presented in the hemianopic field, but that he did not ‘see’ them, i.e. he denied any awareness of events in the hemianopic field as involving visual percepts. GY has participated in studies of his residual visual sensitivity for a period of over 20 years (e.g. Barbur et al., 1980, 1993; Blythe et al., 1987; Brent et al., 1994; Azzopardi and Cowey, 1997, 2001; Benson et al., 1998; Stoerig et al., 2002), and, like our four monkeys, is unusually experienced with psychophysical investigations. The tests on GY were carried out with his consent and the approval of the local Institutional Ethical Committee.
Testing procedure The monkeys were tested while squatting in a primate chair with the head restrained by moulded perspex baffles. Viewing was binocular and the eyes were monitored continuously by CCTV that provided a picture of the face that filled a TV screen and showed on both eyes the specular reflections of three infra-red lights around the stimulus display. The visual targets were presented on a Philips VDU (UP2799, 17 in., vertical refresh period = 16.6 ms, i.e. 60 Hz) at a viewing distance of 28 cm and an angular screen size of 80° × 60°. Luminance values were checked frequently with a luminance meter (Minolta, LS 110). The monkey started each trial by reaching out and touching the centrally placed, white, 4°, 40 cd/m2 start-light. Background luminance was 3 cd/m2. When the monkeys reached out to start each trial they invariably looked at the start-light, thereby allowing brief visual stimuli to be confined to the left or right visual hemifield. Figure 2 is a representation of the sequence of events in the initial sessions where the timing of the presentation of the target was under the control of the subject and completely predictable. As soon as the central start-light (Fig. 2A) was touched (Fig. 2B) it disappeared and was instantly followed (Fig. 2C) by a 14° by 14°, 0.30 cpd stationary horizontal square-wave grating, presented at random in one quadrant of the VDU for 200 ms (100 ms for monkey Rosie). The position of the stimulus was randomly determined by the computer with the constraint that the total number at each of the four possible positions was the same in each session. Only if the subject then touched the remembered position of the target (Fig. 2D) was a reward delivered. The target was too brief for the monkey to make a saccadic eye movement from the start-light to the target before it disappeared. The latter was verified for three of the
Fig. 2. From top to bottom (A to D) the figure shows an example of the sequence of events on a trial in Experiment 1. In this example the target appears briefly in the left upper quadrant of the display and the subject responds correctly by touching its remembered position.
monkeys by making a video recording of the eyes for several testing sessions; although they did make a saccadic eye movement (and head movement) to the visual quadrant in which the stimulus had appeared, or in which the monkey incorrectly responded, it was not swift enough to reach the target before it had disappeared, in both the left and right hemifields. In other words they did not make express saccades, presumably because they could not predict where the target would appear and they could not disengage attention from the start-light until it had disappeared (Fischer, 1985). Following each correct response the monkey was rewarded with a peanut or raisin automatically delivered to a food well beneath the display. Trials were self-paced and ended when the monkey responded or after 3 s without a response following stimulus presentation, which was rare in the condition where the stimulus always followed a touch to the start-light. After an incorrect response there was no food reward and the entire screen went black for 1.5 s. An individual session lasted for 120 or 200 trials. Subject GY was tested in essentially the same way except that he sat in a chair in front of the display and
his ‘reward’ was signaled by the conspicuous noise of the feeder and incorrect responses by the instantaneous blanking of the display. He could not be tested as often as the monkeys or for as many trials. In this first experiment the contrast of the grating was systematically varied. Initially it was presented at a contrast of 0.98 and a mean luminance of 20 cd/m2, i.e. 0.8 log units above mean screen luminance. After the monkey was responding at better than 90% correct in the impaired hemifield, mean luminance was made the same and then the grating contrast was systematically reduced, while preserving mean luminance, as follows. On each testing session the contrast was reduced from its value on the previous day until performance in the impaired field declined to about 60% correct. The contrast in the impaired field was then restored to 0.98 and the mean luminance to 20 cd m2 — in order to maintain a high level of performance — while the contrast in the normal hemifield was systematically reduced from session to session to determine the threshold for about 60% correct. After the completion of these tests, the conditions were altered for the second experiment (Fig. 3). The
Fig. 3. Cartoon of the sequence of events in different testing conditions. In the standard condition (A) the start-light disappears as soon as it is touched and the target (cross-hatched) appears at the same time (Experiment 1 and 3A). When the monkey responds correctly within three seconds he is rewarded with a food pellet. (B) In Experiment 2, touching the start-light has no effect. Instead the experimenter presents the target from 0 to 7 s later and the monkey then has 3 s in which to respond while the start-light is still on. (C) In Experiment 3, two conditions were used in addition to that also used in Experiment 1. In task 3B, the start-light remained on when touched, but a target nevertheless appeared immediately. In task 3C, the start-light again stayed on when touched, and a target appeared, but with a delay of about 3 s and only if the monkeys continued to touch the start-light.
start-light was disabled and remained on even though the subject touched it. At some point in the following 7 s the experimenter presented the target. Watching the display on the remote monitor, he initiated the trial only when the subject was trying to activate the start-light and was still looking at it. The temporal delay was randomized such that the mean delay was between 3 and 4 s and all possible delays were roughly equally probable. The stimulus contrast for these sessions was at a level where performance had been better than 90% correct under the condition where the stimulus was presented immediately. As before, the subject had 3 s in which to respond after the target had been presented. As the conditions used in Experiment 2 differ from those of Experiment 1 both in the introduction of the temporal uncertainty and in the start-light’s continued presence, we performed a third experiment on monkey Rosie (post-op) several months later. She was tested for 1200 trials, 400 trials under each of 3 conditions: (A) the start-light disappeared instantly when it was touched, and was immediately followed by the stimulus, (B) it did not disappear when touched but the stimulus was presented by the experimenter as soon as she touched it, (C) as for (B) but the stimulus was delayed by the experimenter for about 3 s. This slightly imprecise delay reflects the fact that the monkey had still to be looking at the start-light and touching it for the delayed stimulus to be presented. In B the delivery of the stimulus was thus not overtly signaled, but the monkey herself triggered it in the sense that the experimenter pressed a button as soon as she touched the start-light. In C her actions only led to the stimulus 3 s after the first touch and if she was continuing to touch it. The first touch therefore no longer signaled the stimulus. As the monkeys were accustomed to move their eyes and head to search the screen when the start-light disappeared, we could not use a further condition in which the target was delayed although the start-light went off when touched, because the retinal position of the target would have become uncontrolled. The results were analyzed with respect to proportion of trials on which the subjects responded, their percentage correct on those trials when they did respond, and their reaction time on different types of trial. Reaction time was the time from the onset of the stimulus target (whether it was presented
as soon as the subject pressed the start-light or was presented with a delay) to the subject’s response on the touch screen. All reaction times faster than 200 ms were excluded from the analyses but long outliers up to the maximum duration of 3 s after target presentation were included. The latter was done in order to determine whether random responding might occur when there was a lengthy delay. Reaction times were recorded with a timer card triggered by the onset of the stimulus and stopped by the first response in any of the four possible target positions. Statistical analyses were performed with SPSS version 10.
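As a concrete illustration of how the three behavioral measures were derived, the following minimal Python sketch (not the SPSS pipeline actually used) computes percentage responding, percentage correct on responsive trials, and mean reaction time with the 200 ms cut-off described above; the trial records and field names are hypothetical.

```python
# Minimal sketch (not the SPSS analysis actually used) of the three measures
# described above. The trial records below are hypothetical examples.
trials = [
    # hemifield of the target, whether a response occurred, whether it was
    # correct, and the reaction time in ms (None when there was no response)
    {"field": "left",  "responded": True,  "correct": True,  "rt_ms": 430},
    {"field": "right", "responded": True,  "correct": True,  "rt_ms": 910},
    {"field": "right", "responded": True,  "correct": False, "rt_ms": 1300},
    {"field": "right", "responded": False, "correct": False, "rt_ms": None},
]

def summarize(trials, field):
    """Percentage responding, percentage correct on responsive trials, and
    mean RT (responses faster than 200 ms excluded, up to 3000 ms kept)."""
    in_field = [t for t in trials if t["field"] == field]
    responsive = [t for t in in_field if t["responded"]]
    pct_responding = 100.0 * len(responsive) / len(in_field)
    pct_correct = 100.0 * sum(t["correct"] for t in responsive) / len(responsive)
    rts = [t["rt_ms"] for t in responsive
           if t["rt_ms"] is not None and 200 <= t["rt_ms"] <= 3000]
    mean_rt = sum(rts) / len(rts) if rts else float("nan")
    return pct_responding, pct_correct, mean_rt

print(summarize(trials, "right"))  # approx. (66.7, 50.0, 1105.0)
```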
Results Experiment 1 Frequency of responding and percentage correct The monkeys received several thousand trials on the localization task in which their performance was measured with respect to grating contrast in the normal and hemianopic fields. The results are shown in Fig. 4 and provide the basis for the choice of stimulus in the subsequent experiment. All monkeys and GY performed at 90% correct or better in the hemianopic field at a contrast of 0.96 and a mean luminance that was the same as the background. They were marginally better at a slightly higher contrast of 0.98 and a mean luminance of 20 cd m2 but ceiling effects would have prevented an even larger difference. But at lower contrasts at equiluminance there was a steep decline in performance in the hemianopic field, previously noted by Stoerig et al. (2002) for both Rosie and GY. Blindsight has poor contrast sensitivity and a grating stimulus that is still perceptually prominent in the normal visual field can be undetectable in the hemianopic field, as first shown for monkeys with total bilateral removal of striate cortex by Pasik and Pasik (1982). Table 1 shows the results with the four monkeys on the localization task, using a high grating contrast (0.96) and a mean luminance equal to the background’s, and shortly after the tests of contrast sensitivity had been completed. As in the normal hemifield the monkeys almost always responded
Fig. 4. Percentage correct performance as a function of luminance contrast of the grating target on the localization task on trials where the target appeared the instant the start-light was pressed (Experiment 1). The mean luminance of the grating was equal to that of the background except for the points indicated by an asterisk, where it was 0.8 log units more intense than the background. For the monkeys the number of trials for each point was about 250. For GY it was only 60. s = normal field, d = blind field.
when the grating was presented in the hemianopic field, and scored 90% correct or better on those trials. As also shown in Table 1, GY’s performance is perfect in the normal hemifield, and almost as good in
the impaired one although, in view of his greater sensitivity (see Fig. 4), he was tested with a stimulus of 0.4 grating contrast and at a mean grating luminance the same as the background.
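For readers unfamiliar with the contrast convention, the grating contrasts quoted here are assumed to be Michelson contrasts (the usual convention for gratings, though not stated explicitly in the text), so the luminances of the bright and dark bars follow directly from the mean luminance; the short sketch below works through the two stimulus conditions mentioned above.

```python
# Assuming the grating contrasts quoted in the text are Michelson contrasts,
# the bright- and dark-bar luminances follow from the mean luminance L_mean:
#   L_max = L_mean * (1 + C),   L_min = L_mean * (1 - C)
def grating_luminances(mean_cd_m2, michelson_contrast):
    l_max = mean_cd_m2 * (1 + michelson_contrast)
    l_min = mean_cd_m2 * (1 - michelson_contrast)
    return l_max, l_min

# Equiluminant condition: mean grating luminance equal to the 3 cd/m2 background.
print(grating_luminances(3.0, 0.96))   # ~ (5.88, 0.12) cd/m2
# Higher-luminance condition: mean of 20 cd/m2 at 0.98 contrast.
print(grating_luminances(20.0, 0.98))  # ~ (39.6, 0.4) cd/m2
```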
Table 1. Experiment 1: Each row shows, for the left and right hemifields separately, the percentage of trials on which the subject responded and the percentage of the latter that were correct

Animal           Left trials   % Responding   % Correct   Right trials   % Responding   % Correct
Rosie pre-op         480           100            100          480            96            100
Rosie post-op        480           100             98          480            98             90
Dracula              490           100            100          490            98             92
Lennox               550            94             99          550            92             90
Wrinkle              440           100             99          440            92             90
GY                   180           100            100          180            98             99

The stimulus was always cued by turning off the start-light as soon as it was touched and instantly presenting the stimulus.
Reaction times Figure 5 shows the mean reaction times of the subjects. It is noteworthy that all four monkeys and GY were slower to respond correctly when the target was in the right, hemianopic, field (paired samples test, df = 4, t = 3.6, P = 0.02, two-tailed). There was a similar trend for incorrect responses but they were so rare that comparisons are unreliable. For example, Rosie and Dracula each made only one incorrect response in the good field on this task, and GY, while making no errors in his good field, made only 3 in his blind field.
Experiment 2 Frequency of responding and percentage correct The same stimulus values — 0.96 and 0.4 grating contrast for the monkeys and GY respectively, presented at the same mean luminance as the background — were used. The effects of unpredictably delaying the onset of the target for up to 7 s and of never signaling its appearance by turning off the start-light are shown in Table 2. The smaller number of trials reflects the fact that the testing sessions became longer and the monkeys were irritated by the persistent start-light. The table reveals two prominent effects. First, the proportion of trials on which the monkeys and GY responded at all when the grating appeared in the normal, left, hemifield was barely altered for two of the monkeys but declined to just above 80% with Dracula and Lennox. This change was not statistically significant (paired samples test, df = 4, t = 2.42, P = 0.073, two-tailed). However, the incidence of responding declined strikingly for
stimuli in the blind hemifield, to a mean of 33% for the monkeys and GY; this effect was highly significant (paired samples test, df = 4, t = 12.7, P<0.001). Note that in the normal hemifield, the frequency of responding remained high; it dropped to 81% (Lennox) at the most, and mean percentage responding was not significantly different from the corresponding value of Experiment 1 (see Table 1). Second, even when the monkeys did respond, their performance in the hemianopic field fell from better than 90% to a mean of 60%. The change was significant (paired samples test, df = 3, t = 3.44, P = 0.041, two-tailed). Even when GY, who was only slightly impaired, was included in the analysis the difference remained significant (paired samples test, df = 4, t = 2.8, P = 0.048, two-tailed). This should be evaluated with respect to the normal hemifield where the percentage correct on trials where there was a response was almost indistinguishable in the two conditions (paired samples test, df = 4, t = 0.5, P = 0.67, two-tailed). Rosie was unusual in that her score on trials when the stimulus was in her hemianopic field was well below 50% correct (binomial test, N1 = 72, N2 = 168, P<0.001). The explanation is that, unlike the other monkeys, on those trials she responded on the left as well as the right, i.e. in the good as well as the blind field; indeed, she responded in the normal hemifield rather than in the blind one on 64.8% of responsive trials. Equally important is whether the subjects still scored better than expected by chance when the stimulus presentation was not cued. Dracula and Wrinkle were both better than expected by chance (binomial tests, P<0.001 in each case) whereas Lennox (P = 0.27) was not. When Rosie did respond in the blind field (i.e. ignoring trials when she
Fig. 5. The mean reaction time for each subject for stimuli in the left and right (hemianopic) visual field on the stimulus localization task with stimuli of 0.96 grating contrast (0.4 for GY) that yielded better than 90% correct responding. The blank entries for Rosie pre-op and for GY indicate that there were no responses in that particular category. Error bars show standard error of the mean.
incorrectly responded to a target on the right by pressing on the left), she scored 76.6% correct (72/94, binomial test, P<0.01). The 28% correct given in Table 2 thus refers to her overall percentage correct in response to right-field targets, and is not significantly better than chance because at this stage she frequently responded impetuously on the left; when only her responses on the hemianopic side are considered, her localization of targets in that field is significant.
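The binomial comparison against chance can be illustrated with the counts quoted above for Rosie's blind-field responses; the sketch below assumes a chance level of 50% correct (two possible positions within the responded-to hemifield, as stated later for Experiment 3) and uses scipy, whereas the exact test the authors ran is not specified.

```python
# Sketch of the binomial test quoted above, assuming a chance level of 0.5
# (two possible target positions within the hemifield responded to). The
# counts are those given for Rosie's blind-field responses: 72 correct of 94.
from scipy.stats import binomtest  # requires scipy >= 1.7

result = binomtest(k=72, n=94, p=0.5, alternative="two-sided")
print(result.pvalue)  # far below the P < 0.01 criterion reported in the text
```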
Table 2. Experiment 2: A similar analysis to that in Table 1, but the target was uncued: the start-light was never turned off and the stimulus was delivered from 0 to 7 s after the subject began to touch the start-light

Animal           Left trials   % Responding   % Correct   Right trials   % Responding   % Correct
Rosie pre-op         240            95            100          240            90             99
Rosie post-op        500            94             93          500            52             28
Dracula              240            82             94          240            33             76
Lennox               220            81             99          220            18             59
Wrinkle              255            97             98          255            22             75
GY                   120           100            100          120            59             97
For easier comparison, the percentage correct scores of the subjects in the two tasks are shown one above the other in Fig. 6, where the white bars refer to a normal hemifield and the black bars to the hemianopic field. GY was given a total of 240 trials. When the target was on the left he responded on every trial and scored 100% correct. But when it was in his right hemifield he responded on only 59% of trials, far less frequently than before but still more frequently than any of the monkeys. The latter is hardly surprising given that the nature of the new task had to be explained to him beforehand and he therefore knew that if several seconds had elapsed he should probably respond. However, this will not explain why, in striking contrast to the monkeys, when he did respond he scored 97% correct. Asked about this afterwards he remarked that on those trials he was fairly confident that something had been presented, even though he did not ‘see’ anything.

Fig. 6. Percentage correct on trials where the subject responded. Each pair of bars indicates performance in the left (normal) and right (hemianopic) hemifields. Black bars indicate the hemianopic field. The top histogram shows performance with no stimulus delay (Experiment 1); the bottom row shows the corresponding results for responsive trials in Experiment 2 when the start-light remained on when touched and the stimulus was unpredictably delayed. Note the decrease in performance, which is no longer different from chance in Rosie post-op and Lennox. Error bars show s.e. of the mean. The error bars are omitted for performance above 90% correct because they are so small.

Reaction times

The reaction times when the stimulus was uncued and less predictable are shown in Fig. 7. Even in the normal hemifield, when the monkeys responded correctly, they were slower to respond than in the cued condition (paired samples test, df = 4, t = 5.7, P = 0.004, two-tailed). Reaction times were longer and more variable when the monkeys responded incorrectly to stimuli in the normal field but this remained so rare that comparisons are unreliable. To targets in the hemianopic field, they responded correctly with a mean of about 995 ms, which is significantly longer than the mean of 660 ms in the cued condition (paired samples test, df = 4, t = 3.8, P = 0.03, two-tailed). However, the relatively small lengthening of these particular reaction times is important because it shows that the subjects did not simply wait after sensing nothing and then respond at random; that behavior would have produced much longer reaction times, especially on trials where the stimulus delay was short. Unfortunately, the stimulus delay on each trial was not recorded.
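The paired-samples comparison of reaction times between the cued and uncued conditions (df = 4, i.e. one mean per subject) can be sketched as follows; the per-subject means are shown only graphically in Figs. 5 and 7, so the values below are illustrative placeholders constrained only to reproduce the group means of about 660 and 995 ms, and the resulting statistic will not match the published one.

```python
# Illustration of the paired-samples comparison reported above (df = 4, one
# mean per subject). The individual means are not given numerically in the
# text, so the numbers below are placeholders chosen merely to reproduce the
# group means of about 660 ms (cued) and 995 ms (uncued); the computed t
# therefore differs from the published value.
from scipy.stats import ttest_rel

rt_cued   = [620, 650, 700, 640, 690]      # Experiment 1, blind-field targets (ms)
rt_uncued = [720, 1150, 950, 1090, 1065]   # Experiment 2, blind-field targets (ms)

t, p = ttest_rel(rt_uncued, rt_cued)
print(f"t({len(rt_cued) - 1}) = {t:.2f}, two-tailed P = {p:.4f}")
```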
Fig. 7. The mean reaction time for each subject in Experiment 2 where the stimulus was uncued and less predictable in time. See Table 2 for the number of responsive trials. Error bars show standard error of the mean.
The greatest increase occurred with incorrect responses to stimuli in the hemianopic field. For Rosie (post-op), Dracula, Lennox, and Wrinkle the mean difference in response latency between the two conditions was 1.16 s (compare Figs. 5 and 7, but note the different scales). This conspicuous lengthening of the reaction times of incorrect responses was significant for the monkeys (paired samples test, df = 3, t = 5.9,
P = 0.01, two-tailed). GY could not be included because he hardly ever responded wrongly, and the number of incorrect responses he made in either hemifield is too small for reaction time to be informative. Nevertheless, his reaction time to targets in the hemianopic field increased with temporal uncertainty, although the absolute and relative differences were smaller than those of the monkeys.
Experiment 3 Frequency of responding and percentage correct Only Rosie was subsequently tested for 400 trials on each of three different testing conditions. The first was identical to Experiment 1, and was repeated in order to provide data gathered at roughly the same time (actually during the same week) for better comparison. The stimulus appeared as soon as the monkey touched the start-light, which disappeared at the same time (3A). In the second condition the stimulus appeared when the start-light was touched but the latter remained on (3B). In the third condition, the start-light remained on when touched but the stimulus was delayed by about 3 s (3C); this condition differs from those used for Experiment 2 only in the fixed rather than comparatively unpredictable (1–7 s) delay. Conditions 3B and 3C are thus similar in that the start-light remains on when touched, but in 3B touching it immediately triggers the target, whereas in 3C it does so only after a 3 s delay and only while the monkey continues to touch it. The results are shown in Table 3. In the original Task 3A Rosie performed at a high level in both hemifields, scoring 100 and 95% correct in the normal left and impaired right field, respectively. In conditions 3B and 3C her frequency of responding declined in
both hemifields, but much more prominently in the hemianopic one (91 and 90% compared to 99% in Task 3A in the normal, 53 and 41.5% compared to 86% in Task 3A in the hemianopic field). The difference between 3B and 3C (53% vs. 41.5%) was significant (χ² = 5.3; df = 1; P = 0.025, two-tailed). While the drop is larger when the target appears after a fixed delay, the fact that the start-light is not extinguished when touched decreases the response rate markedly even when touching it actually produces the target. We assume that this is due to attention being fixed on the start-light. Nevertheless, when Rosie did respond, she did so almost perfectly in the normal field, and at 67 and 78% rather than the 50% correct expected by chance in the hemianopic field. There was a highly significant difference overall (χ² = 36.9; df = 2; P<0.001), attributable to her much higher score under condition 3A, but no difference between her percentage correct for 3B and 3C (χ² = 2.9; df = 1; P = 0.085). The results shown in Table 3 indicate that the removal of the start-light cue to the presentation of the stimulus was more important than the delay that followed the first touch on the start-light.
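The chi-squared comparison of blind-field response rates in Tasks 3B and 3C can be reconstructed from the counts implied by Table 3 (53% and 41.5% of 200 trials); the sketch below reproduces the uncorrected statistic quoted above, with the caveat that the exact test the authors ran is not specified.

```python
# Reconstruction of the chi-squared comparison of blind-field response rates
# in Tasks 3B and 3C from the counts implied by Table 3: 53% and 41.5% of
# 200 trials, i.e. 106 and 83 responsive trials. Yates' correction is turned
# off so that the uncorrected statistic quoted in the text is obtained.
from scipy.stats import chi2_contingency

table = [[106, 200 - 106],   # Task 3B: responded, did not respond
         [ 83, 200 -  83]]   # Task 3C: responded, did not respond
chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.1f}, P = {p:.3f}")  # ~ chi2(1) = 5.3, P = 0.021
```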
Reaction times In her good hemifield there was a significant main effect on correct trials (one-way ANOVA, F = 8.2; df = 2; P<0.001). Rosie's correct responses were significantly faster (P<0.001) in condition 3A (mean 371 ms) than in 3C (415 ms); the difference between 3A and B (380 ms) or 3B and C is not significant (P = 0.7). To targets in the blind field, her correct responses took even longer. Again, the mean is lowest for the double-cued condition 3A (426 ms), and was significantly different from the other two (3B: 585 ms; 3C: 560 ms; P<0.001) which did not
Table 3. The postoperative performance of monkey Rosie when the stimulus was cued by turning off the start-light as soon as it was touched (A), when it was not overtly cued (the start-light stayed on) but was presented as soon as it was touched (B), and when it was not overtly cued and was delayed for about 3 s after the start-light was first touched (C)

Task                     Left trials   % Responding   % Correct   Right trials   % Responding   % Correct
A: cued                      200            99            100          200            86             95
B: uncued, zero delay        200            91             99          200            53             67
C: uncued, 3 s delay         200            90             99          200           41.5            78
differ significantly from each other (P>0.55). They mirror the percentage correct in the hemianopic field, which is highest for condition 3A, and only slightly better for 3C than B. The effect of delaying the target by 3 s was to increase mean response time in the good field, but to decrease it in the hemianopic field.
Discussion The results of the first experiment, where the target was presented as soon as the start-light was pressed, demonstrate that all our subjects were excellent at localizing the high-contrast gratings in both hemifields even when their mean luminance was equal to that of the background. This agrees with previous reports based on testing under similar conditions (Cowey and Stoerig, 1997). Overall, reaction times were longer for incorrect responses but these occurred rarely. They were also significantly longer when the target was in the blind hemifield, by about 185 ms. The results of the second experiment in which the target was uncued and variably delayed show that despite the subjects’ irritation with the ‘misbehaving’ start-light (expressed by rattling the chair and slapping the screen), they still responded not only to targets in the good field but also, albeit to a much smaller extent, to targets in the blind field. Despite the greater disruption for targets in the blind field, performance remained significantly better than chance for GY, for monkeys Dracula and Wrinkle, and for Rosie on the proportion of trials on which she directed her responses to the hemianopic side. In addition, although reaction times increased particularly to stimuli in the blind field, they still differed by approximately the same extent between correct and incorrect responses. More importantly, and although we could not analyze them as a function of the stimulus delay, it is clear from the size of the increase of response times to targets in the hemianopic field that the monkeys, when they responded, did not do so at random, out of boredom, or ‘blindly’, because reaction times should then have become much more variable and on average much longer. Finally, the results of the third experiment with monkey Rosie show that: (1) performance in the cued and undelayed Task 3A, which was identical to Experiment 1, was largely unchanged after a two-year interval; (2) in Tasks 3B and C, the monkey still
responded with a markedly reduced frequency to targets presented in the hemianopic field, as she had done in Experiment 2. Nevertheless, her response bias in Tasks 3B and C differed from that in Experiment 2 because she now responded much more often by touching positions within the impaired (rather than the good) field, and overall scored better than expected by chance with stimuli in the hemianopic field. Together, the results indicate that the fall in the response rate is attributable to the fact that the start-light remained on although she touched it. As she herself effectively triggered the target by touching the start-light in Task 3B, we assume that her attention remained fixed on the start-light which, together with her not knowing that a target would be presented, reduced the frequency of responses to the presumably weak signal she received from the hemianopic field. That the fixed delay of 3 s introduced in Task 3C slightly improved her performance is consistent with this hypothesis because the delay may have increased her attention to what happened elsewhere. Even GY, who performed almost faultlessly in both hemifields in Experiment 1, failed to respond to targets in his impaired field in 41% of trials in Experiment 2, although when he did respond, he was almost always correct. An even larger decrease in the proportion of responsive trials was found by King et al. (1996) who asked GY to indicate whether a prominent white disk of 188.5 cd/m2 had been presented. Presentation time was 165 ms, the background luminance 9.5 cd/m2, and the target could appear in either hemifield, in both, or in neither. Whereas GY always responded correctly when the hemianopic field presentation was cued by a tone, he responded on only 4 of 39 trials, i.e. roughly 10%, in the hemianopic field when not cued. What do these results tell us with respect to the role attention and awareness play in the residual visual functions demonstrated in fields of cortical blindness?
Role of attention The results of Experiments 1 and 3A confirm the high level of localization performance in the hemianopic field already reported for the same monkeys (Cowey and Stoerig, 1995, 1997). Provided the stimuli have sufficient luminance contrast with the background, or
sufficient grating contrast with mean luminance equal to that of the background, the monkeys performed at over 90% correct. But when, in the latter condition, the grating contrast was reduced, this ability declined steeply, as also shown in experiments done on totally cortically blind monkeys and reviewed by Pasik and Pasik (1982). Clearly, the excellent albeit significantly slower localization of targets in the hemianopic field would be less striking if the subjects had some awareness of the high-contrast targets. As our earlier control measurements on the same monkeys demonstrate that awareness based on the presence of light scattered onto the normal retina could not account for their performance (see Cowey and Stoerig, 1997), any such awareness would indicate that the monkeys have retained or regained some residual visual processing, as is the case of GY (see below). The pronounced decrease in response rate as well as the smaller drop in percentage correct when the start-light remained on when touched, and the target presentation was delayed, indicate that together these manipulations reduce performance in the hemianopic field, where Lennox’s performance actually fell to chance level. Together with Rosie’s results in tasks 3B and C of the third experiment, they indicate that when attention remains focussed on the start-light when the stimulus appears, and the subjects are uninformed about the impending stimulus, it often fails to elicit a response to a blind-field target. The positive effects of attention directed to the place or stimulus features of a target, expressed behaviorally as well as physiologically, come at a cost for the unattended stimulus (e.g. Moran and Desimone, 1985). The findings of Richmond et al. (1983) are of special interest in the present context. They recorded responses from inferior temporal neurons in monkeys trained to attend to the dimming of a fixation spot rather than to a stimulus presented elsewhere, and found much smaller responses to the ‘elsewhere’ stimulus in this condition. When, instead, the fixation spot was extinguished before the peripheral stimulus appeared, the cells responded much more vigorously to the latter, suggesting that ‘the reduction of response to the stimulus in the presence of the fixation point is caused by an interaction between the responses to the fixation point and the visual stimulus’ (p.1415). We deduce that effects of this kind are reflected in the present experiment even in
the slightly reduced response rates to stimuli in the good hemifield, while the greater effect for targets in the hemianopic field arises from the relative weakness of signals from this field (e.g. Rodman et al., 1989; see Bullier et al., 1994, for review). In fact, a pronounced fall in response rate was still seen in Experiment 3B, where Rosie failed to respond to 47% of targets in the hemianopic field even though the stimulus appeared as soon as she touched the start-light and thus effectively triggered it herself. Being ignorant of that fact, she obviously kept her attention focussed on the start-light. Remarkably, localization performance for the responsive trials in Experiment 2 was still significant in monkeys Wrinkle and Dracula, and in GY in whom alone it remained close to ceiling. Significant performance in the hemianopic field was also found for Rosie when only her responses at locations in the hemianopic field are considered. That she performed above chance when she did respond in that field was confirmed in Experiments 3B and C. Only monkey Lennox, who responded on a mere 18% of those trials, did not localize above chance when the start-light remained on. Reaction times, much longer to targets in the hemianopic field in the second experiment, bear witness to the difficulty of the task. They also demonstrate that on the responsive trials the signal from the hemianopic field was processed, because random responses in the 1–7 s interval between first touching the start-light and the stimulus presentation should have yielded very variable reaction times. This conclusion is further supported by the reaction times recorded for Rosie in Experiment 3. In Task 3B where the target appeared as soon as the inextinguishable start-light was touched, her mean response time was 585 ms, which, although longer than in the normal hemifield or in Task 3A, is far too swift to reflect random responses. In Task 3C where stimuli appeared about 3 s after she touched the start-light, her mean response time was 560 ms, again too fast to reflect random responding. The large difference between Rosie’s reaction times on correct responses in Experiment 3C (mean: 560 ms) and 2 (mean: 995 ms), when compared to the insignificant difference between those collected in Experiment 3B (585 ms) and C, indicates that the unpredictability of the stimulus onset had only a slight exacerbating effect. Although this comparison is confounded by a
possible effect of time — Experiment 3 took place about 2 years after Experiment 2 — the greater predictability of the delay is likely to have contributed at least partly to the difference. Even subject GY took somewhat longer in the unpredictably delayed condition. Nevertheless, he scored 97% correct on the 59% of hemianopic field trials on which he responded, even though he was tested with a grating contrast of 0.4, showing that despite the impaired detection expressed in the response rate the temporal delay did not abolish his ability to localize in the hemianopic field. This result agrees with those reported by Kentridge et al. (1999) who asked GY to verbally identify at which of two positions in the hemianopic upper right quadrant a 1.5°, 400 ms target with high, medium or low negative contrast with respect to the background appeared during each 10 s trial. The stimulus onset time varied from 1.5 to 8.5 s, and although the fixation light was never extinguished, the presentation of the target was either cued by dimming the fixation light, or not cued. Localization in the hemianopic field remained high at all target contrasts when the timing of the stimulus was cued, and fell only at the lowest contrast when uncued. Their results, which the authors attributed to modulation of selective attention or arousal by temporal cueing, agree with ours by showing (a) that cueing helps, and (b) that a temporal delay per se, even if unpredictable within certain limits, is tolerable, especially if the delay has been explained. That Kentridge et al. (1999) did not observe a significant deterioration in the response rate may reflect their use of luminance-defined stimuli; ours had the same space-averaged mean luminance as the background, rendering detection more difficult.
Role of awareness In the experiment by Moore et al. (1995, 1996), hemianopic monkeys learned to fixate a dim central target and to move their eyes to a peripheral target (up to 24° eccentric) of high luminance contrast (3.1 log units) as soon as it appeared. Both the location and timing of the peripheral target were unpredictable. The authors found that initially all three adult-operated monkeys failed to respond to targets in the field defect, although one monkey
subsequently learned to do so. However, even the latter monkey failed when the stimulus luminance contrast was reduced to 2.1 log units above background, much greater than in the present experiment. The effects of similar lesions made in early infancy were much less severe. The authors interpreted these results as indicating that the adult-lesioned animal who did respond when cued had blindsight while the infant-lesioned ones who responded in either condition had residual vision. Comparing their results to ours, we can confirm that a subject with acknowledged awareness of stimuli in his hemianopic field (GY) and a lesion suffered at age 8, still responds more often than not when uncued and regardless of a 1–7 s temporal delay. It is conceivable that, had he suffered his lesion in infancy (rather than at 8 years) or had we used stimuli as intense as those of Moore et al. (1995, 1996), his response rate might have been unaffected as well. In contrast to their adult-lesioned monkeys however, three of our equally adult-lesioned monkeys (among them Rosie who underwent surgery at 14 yrs of age) responded, albeit less frequently, better than expected by chance even in the uncued task(s). This difference is even more striking when considering that our monkeys were accustomed over a period of years to the start-light going off when touched, meaning that they continued to look at it and touch it when it did not. They thus differed from the monkeys of Moore and colleagues because (a) they almost always responded and were better than 90% correct when cued, and (b) three monkeys — Wrinkle and Dracula in Experiment 2, Rosie in Experiment 3 — even when uncued significantly localized the targets in the blind field when they responded to them. Although this could imply that our monkeys, due to their extensive experience with blindsight testing, had regained some residual vision, we think this is unlikely. Our monkeys had, not long before the experiments described here, participated in experiments designed to show whether or not they had blindsight rather than residual vision. Like Lennox, who also participated and behaved in the same way, they were excellent at localizing cued targets but nevertheless categorized such targets in the same way as blanks in equally cued conditions in which blank stimuli were introduced and required a different response (touching a permanently outlined region near the top of the
screen (Cowey and Stoerig, 1995, 1997)). We still interpret this finding as indicative of blindsight because, like human patients with this condition, the monkeys answered ‘no target light’ when given that option. However, there is another possibility that may explain the change in Rosie’s behavior between Experiments 2 and 3. In a separate experiment carried out shortly after Experiment 1 but using the same procedure, her performance was compared to that of patients with field defects of absolute and relative blindness. In that experiment, Rosie responded better to stimuli in the hemianopic field than the patients with the absolute defects, but not as well as those with relative defects (Stoerig et al., 2002). We can therefore not rule out the possibility that she has recovered some awareness of visual stimuli in her hemianopic field, although she still responded at a much lower level of performance than human subject GY, whose awareness has definitely changed over time even though he prefers not to describe it as ‘visual’ (see also Morland et al., 1999; Stoerig and Barth, 2001). We do not know whether the monkeys, who unlike GY could not be informed that a target would be presented even though the start-light stayed on, would eventually have learned to ignore this with more practice. It should be noted that they had relatively few trials in this condition, not just compared to the number in Experiment 1 reported here, but more importantly with respect to the many thousands of trials they have had over the years of participating in different tests. The fact that, unlike the adult-operated monkeys of Moore and colleagues (1995, 1996, 1998), three of our four adult-lesioned monkeys nevertheless demonstrated blindsight under the difficult conditions of Experiment 2 may be attributable to their lengthy experience and indicate that learning does play a role. It is already apparent that performance, rather than depending solely on the factors already mentioned, such as the function tested and the stimulus conditions used, may improve with practice when feedback is given to patients, who, unlike monkeys, do not normally receive any (Stoerig, unpublished). The fall in response rate observed in Experiment 2 suggests that both blindsight and low-level residual vision profit from attention being deployed to the hemianopic field, and from knowing at least whether and where to expect a stimulus.
Whether and under what conditions improvements in blindsight performance with extended practice may eventually bring about a return of at least low-level conscious vision, as sometimes present in subject GY, is still uncertain, although changes in the density and extent of the blind field have been observed in some of our long-term subjects (Stoerig, 1999b). If blindsight can open a path toward visual field restitution, the pronounced effects of knowledge, attention, and expectation that we report here ought to decrease in tandem with the reduction in density of the defect.
Acknowledgments Our research was supported by the UK Medical Research Council, Grant G971/397B, the Deutsche Forschungsgemeinschaft, and a Network Grant from the Oxford McDonnell Centre for Cognitive Neuroscience. We thank Carolyne le Mare and Iona Hoddinot-Hill for their help in testing subjects and analyzing results. We are grateful to subject GY for his continuing interest and cooperation.
Abbreviations
dLGN, dorsal lateral geniculate nucleus
References Azzopardi, P. and Cowey, A. (1997) Is blindsight like normal, near-threshold vision? Proc. Natl. Acad. Sci. USA, 94: 14190–14194. Azzopardi, P. and Cowey, A. (2001) Motion discrimination in cortically blind patients. Brain, 124: 30–46. Barbur, J.L., Ruddock, K.H. and Waterfield, V. A. (1980) Human visual responses in the absence of the geniculo-striate projection. Brain, 103: 905–928. Barbur, J.L., Watson, J.D., Frackowiak, R.S.J. and Zeki, S. (1993) Conscious visual perception without V1. Brain, 116: 1293–1302. Barton, J.J.S. and Sharpe, J.A. (1997) Smooth pursuit and saccades to moving targets in blind hemifields. A comparison of medial occipital, lateral occipital and optic radiation lesions. Brain, 120: 681–699. Baseler, H. A., Morland, A. B. and Wandell, B.A. (1999) Topographic organization of human visual areas in the absence of input from the primary cortex. J. Neurosci., 19: 2619–2627.
277 Benson, P.J., Guo, L. and Blakemore, C. (1998) Direction discrimination of moving gratings and plaids and coherence of dot displays without primary visual cortex (V1). Eur. J. Neurosci., 10: 1767–1772. Blythe, I.M., Kennard, C. and Ruddock, K.H. (1987) Residual vision in patients with retrogeniculate lesions of the visual pathways. Brain, 110: 887–905. Brent, P.J., Kennard, C. and Ruddock, K.H. (1994) Residual colour vision in a human hemianope: spectral responses and colour discrimination. Proc. Roy. Soc. Lond. B, 256: 219–225. Bullier, J., Girard, P. and Salin, P.-A. (1994) The role of area 17 in the transfer of information to extrastriate visual cortex. In: Peters A. and Rockland K.S. (Eds.), Cerebral Cortex, Vol. 10. Plenum Press, New York, pp. 301–330. Cowey, A. and Stoerig, P. (1995) Blindsight in monkeys. Nature, 373: 247–249. Cowey, A and Stoerig, P. (1997) Visual detection in monkeys with blindsight. Neuropsychologia, 35: 929–939. Fischer, B. (1985) The preparation of visually guided saccades. Rev. Physiol. Biochem. Pharmacol., 106: 1–35. Goebel, R., Muckli, L., Zanella, F.E., Singer, W. and Stoerig, P. (2001) Sustained extrastriate cortical activation without visual awareness revealed by fMRI studies of hemianopic patients. Vision Research, 41: 1459–1474. Kentridge, R.W., Heywood, C.A. and Weiskrantz, L. (1999) Effects of temporal cueing on residual visual discrimination in blindsight. Neuropsychologia, 37: 479–483. King, S.M., Azzopardi, P., Cowey, A., Oxbury, J. and Oxbury, S. (1996) The role of light scatter in the residual visual sensitivity of patients with complete cerebral hemispherectomy. Vis. Neurosci., 13: 1–13. Moore, T., Rodman, H.R. and Gross, C.G. (1998) Man, monkey, and blindsight. The Neuroscientist, 4: 227–230. Moore, T., Rodman, H.R., Repp, A.B. and Gross, C.G. (1995) Localization of visual stimuli after striate cortex damage in monkeys: parallels with human blindsight. Proc. Natl. Acad. Sci. USA., 92: 8215–8218. Moore, T., Rodman, H.R., Repp, A.B., Gross, C.G. and Mezrich, R.S. (1996) Greater residual vision in monkeys after striate cortex damage in infancy. J. Neurophysiol., 76: 3928–3933. Moran, J. and Desimone, R. (1985) Selective attention gates visual processing in the extrastriate cortex. Science, 229: 782–784. Morland, A.B., Jones, S.R., Findlay, A.L., Deyzac, D., Le, S. and Kemp, S. (1999) Visual perception of motion, luminance and colour in a human hemianope. Brain, 122: 1183–1198. Pasik, P. and Pasik, T. (1982) Visual functions in monkeys after total removal of visual cerebral cortex. Contributions to Sensory Physiology, 7: 147–200. Payne, B.R., Lomber, S.G., MacNeil, M.A. and Cornwell, P. (1996) Evidence for greater sight in blindsight following
damage of primary visual cortex early in life. Neuropsychologia, 34: 741–774. Perenin, M.-T. (1991) Discrimination of motion direction in perimetrically blind fields. Neuroreport, 2: 397–400. Pizzamiglio, L., Antonucci, G. and Francia, A. (1984) Response of the cortically blind hemifields to a moving visual scene. Cortex, 20: 89–99. Richmond, B.J., Wurtz, R.H. and Sato, T. (1983) Visual responses of inferior temporal neurons in awake rhesus monkey. J. Neurophysiol., 50: 1415–1432. Rodman, H.R., Gross, C.G. and Albright, T.D. (1989) Afferent basis of visual response properties in area MT of the macaque: I. Effects of striate cortex removal. J. Neurosci., 9: 2033–2050. Sahraie, A., Weiskrantz, L., Trevethan, C.T., Cruce, R. and Murray, A.D. (2002) Psychophysical and pupillometric study of spatial channels of visual processing in blindsight. Exp. Brain Res., 143: 249–256. Stoerig, P. (1999a) Blindsight. In: Wilson R. and Keil F. (Eds.), The MIT-Encyclopedia of the Cognitive Sciences. MITPress, Cambridge, MA, pp. 88–90. Stoerig, P. (1999b) Wege zur Gesichtsfeldwiederherstellung [Routes to visual field recovery]. Zeit. fu¨r Neuropsychologie, 2: 91–93. Stoerig, P. and Barth, E. (2001) Phenomenal vision following unilateral destruction of primary visual cortex. Conscious. Cogn., 10: 574–587. Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 535–559. Stoerig, P., Zontanou, A. and Cowey, A. (2002) Aware or unaware: Assessment of cortical blindness in four men and a monkey. Cereb. Cortex, 12: 567–574. Weiskrantz, L. (1998) Consciousness and commentaries. In: Hameroff S.R., Kaszniak A.W. and Scott A.C. (Eds.), Towards a Science of Consciousness. MIT Press, Boston, pp. 371–377. Weiskrantz, L. (2001) Blindsight. In: Behrmann M. (Ed.), Handbook of Neuropsychology, Vol. 4. Elsevier, Amsterdam, pp. 215–237. Weiskrantz, L., Barbur, J.L. and Sahraie, A. (1995) Parameters affecting conscious versus unconscious visual discrimination in a patient with damage to the visual cortex (V1). Proc. Natl. Acad. Sci. USA, 92: 6122–6126. Weiskrantz, L., Cowey, A. and Barbur, J.L. (1999) Differential pupillary constriction and awareness in the absence of striate cortex. Brain, 122: 1533–1538. Weiskrantz, L., Cowey, A. and Le Mare, C. (1998) Learning from the pupil: A spatial visual channel in the absence of V1 in the monkey and human. Brain, 121: 1065–1072. Weiskrantz, L., Warrington, E.K., Sanders, M.D. and Marshall, J. (1974) Visual capacity in the hemianopic field following a restricted cortical ablation. Brain, 97: 709–728.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved
CHAPTER 19
Visually guided behavior after V1 lesions in young and adult monkeys and its relation to blindsight in humans

Charles G. Gross1,*, Tirin Moore1 and Hillary R. Rodman2

1 Department of Psychology, Green Hall, Princeton University, Princeton, NJ 08544, USA
2 Department of Psychology and Yerkes RPRC, Emory University, 532 N. Kilgo Circle, Atlanta, GA 30322, USA
Abstract: After lesions of striate cortex in primates, there is still the capacity to detect and localize visual stimuli. In this chapter we review three aspects of our study of this phenomenon in macaques. First, we found that macaques that received their striate lesions as infants had considerably greater ability to detect and localize stimuli than those that received similar lesions as adults. Second, we suggest that the visual functions that survive striate lesions in macaques made in adulthood resemble those in human ‘blindsight’. Third, we report that monkeys with striate lesions made in infancy are able to discriminate direction of visual motion.
Damage to primary visual cortex (striate cortex, V1) has a devastating effect on vision in humans and other primates. This chapter addresses three questions about the effects of striate cortex lesions in macaque monkeys. The first is whether striate lesions sustained in infancy have different effects from similar lesions made in adulthood. The second question is whether the vision that survives striate lesions in monkeys resembles the implicit or nonconscious vision that survives striate lesions in humans, that is, can destriate monkeys show blindsight? The third and related question is whether destriate monkeys can discriminate direction of stimulus movement. All three questions are relevant for understanding the role of striate cortex in visually guided behavior and visual consciousness in human and nonhuman primates.

The effects of age at the time of lesion
The idea that brain lesions in infancy might have lesser effects than lesions in adults was first suggested in 1865 by Paul Broca (Berker et al., 1986). The first systematic experiments on the question were those of Margaret Kennard in the 1930s (Kennard, 1936, 1938; Kennard and Fulton, 1942). Since then considerable evidence has accrued indicating greater recovery after early brain damage in primates in several different regions of the cerebral cortex (Goldman, 1972; Carlson, 1984a,b; Carlson and Burton, 1988; Bachevalier et al., 1990; Bachevalier and Mishkin, 1994). However, until our studies, the behavioral effects of lesions of striate cortex in infancy and adulthood had never been compared in primates. Furthermore, it was possible that the effects could have been the opposite to the usual ones of lesser deficits after lesions in infancy that were found in other systems. This is because early striate lesions in monkeys produce more rapid and more extensive
secondary degeneration in the lateral geniculate (Milhailovic et al., 1971) and, transneuronally, in the retina than do later ones (Cowey, 1974; Dineen and Hendrickson, 1981; Weller and Kaas, 1989). On the basis of this evidence, one might expect early lesions of striate cortex to cause greater visual deficits than later ones. To examine these alternatives we studied the ability of monkeys to detect and localize visual stimuli after striate lesions made in infancy or adulthood (Moore et al., 1996, 1998).

*Corresponding author. Tel.: +1-609-258-4430; Fax: +1-609-258-1113; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14401-9

Subjects
Six Macaca fascicularis monkeys received large unilateral lesions of striate cortex either in adulthood (the adult lesion group, A-1, A-2, and A-3) or at 5–6 weeks of age (the infant lesion group, I-1, I-2, and I-3). Sagittal sections through the lesions of an adult lesion animal are shown in Fig. 1 and of an infant lesion one in Fig. 2. Both figures also show examples of MR images of the lesion and intact hemispheres
Fig. 1. Striate cortex lesion in Monkey A-1. Sagittal sections showing the intact (left) and operated (right) hemispheres. Striate cortex in the intact hemisphere is indicated by shading. In the operated hemisphere the bold lines show the borders of the lesion. The diagrams of a standard dorsal view of the brain (top) show the approximate level of the sagittal sections. The photographs show magnetic resonance (MR) images of sagittal sections through the intact and damaged hemisphere (details of the MR methods are in Moore et al., 1995). The oblique lines show the drawing of the section closest to each MR image. ca, calcarine; lu, lunate; io, inferior occipital.
Fig. 2. Striate cortex lesion in Monkey I-2. See legend to Fig. 1.
and illustrate how these images can provide an indication of the site and size of experimental lesions prior to histological analysis after sacrifice of the animal. The removal of striate cortex was complete or nearly so in two of the adult lesion animals, A-1 and A-3, and two of the infant lesion ones, I-1 and I-2. In the other two animals there was considerable sparing of the cortex representing portions of the visual field beyond eccentricities of 25°. Testing of the animals began 2–6 years after surgery.
Behavioral procedures

During training and testing sessions the animals were placed in a primate chair. They were trained to fixate and make saccades to visual targets. In the initial task, the targets were small spots that appeared at unpredictable locations and unpredictable times. For a reward the animal had to both detect the onset of the target and saccade to it. The targets were on for 1 s, and failure to initiate an eye movement during that period was defined as a detection error. On a third of the trials no target was presented; on these 'blank' trials the animal was rewarded for maintaining fixation. This behavioral paradigm is summarized in Fig. 3. All the animals were tested monocularly. For each monkey, detection was tested at 48 locations within the central 24° (Fig. 3).
Fig. 3. Paradigm for visual target detection task. (Left) The animals were trained to fixate on a small spot (the fixation point) and then saccade to another small spot (the stimulus) that appeared at an unpredictable time at one of the sites shown with the open circles on the right. The central square represents the fixation point window (2° radius). Details of the procedure are in Moore et al., 1995.
After training, each animal was tested for about 3400 trials divided into 12 repetitions. Each repetition covered all 48 points, with 4 trials/point in the visual half field contralateral to the lesion and 8 trials/point in the half field ipsilateral to the lesion. As a control for light scatter, the blindspot in the ipsilateral field was plotted. The animals consistently failed to detect targets presented within their blindspot, as shown in Fig. 4.
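As a purely illustrative sketch (not code from the original study), the snippet below shows one way that trial records from such a session could be tallied into per-location detection error rates and a false-alarm rate on blank trials; the record fields and example values are hypothetical.

```python
# Minimal sketch (assumed data format): each trial record holds the target
# location (or None on 'blank' trials) and whether a saccade was initiated
# within the 1-s target period. Detection errors are tallied per location.
from collections import defaultdict

def detection_error_rates(trials):
    presented = defaultdict(int)   # target trials per location
    missed = defaultdict(int)      # trials with no saccade within 1 s
    false_alarms = 0               # saccades on blank trials (fixation broken)
    blanks = 0
    for t in trials:
        if t["location"] is None:            # blank trial
            blanks += 1
            if t["saccade_within_1s"]:
                false_alarms += 1
        else:
            presented[t["location"]] += 1
            if not t["saccade_within_1s"]:
                missed[t["location"]] += 1
    error_rate = {loc: missed[loc] / presented[loc] for loc in presented}
    return error_rate, (false_alarms / blanks if blanks else None)

# Two hypothetical trials: a detected target at 6 deg eccentricity, 90 deg
# polar angle, and a blank trial on which fixation was maintained.
trials = [
    {"location": (6, 90), "saccade_within_1s": True},
    {"location": None, "saccade_within_1s": False},
]
rates, fa_rate = detection_error_rates(trials)
```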
Visual perimetry

Figure 5 shows the overall results from one animal with an infant lesion (I-2) and one animal with an adult lesion (A-1). Both animals performed with near perfect accuracy in the field ipsilateral to the striate lesion. The infant lesion animal also performed well in the field contralateral to the lesion except for the most eccentric points. By contrast, the adult lesion animal failed to detect contralateral stimuli about 70% of the time, which is near chance performance. Figure 6 shows the detection errors over the course of the 12 repetitions of 288 trials each. All the infant lesion animals began at a higher performance level than any adult lesion subject, and their final level was also much higher. Only one of the adult lesion animals, A-3, showed considerable improvement over the course of testing. The recovery of detection at each target site of one adult lesion animal (A-3) is shown in Fig. 7. The recovery occurred first at the most central points and then at the more peripheral ones. The infant animal that showed the poorest initial performance and therefore the most recovery (I-3)
also showed this same center-to-periphery pattern of recovery (Fig. 7). We then investigated the effect of target contrast: the animals with the best final performance in each age-at-lesion group were tested with the contrast reduced by 1 log unit (Fig. 8). The reduced contrast had no effect on the animal with the infant lesion (I-2), but it virtually reinstated the original deficit in the animal with the adult lesion (A-3). Thus, the recovery of the adult lesion subject seemed dependent on the target contrast, whereas the animal with the infant lesion continued to detect and localize the less intense stimulus within the scotoma. In summary, the animals that had received their lesions in infancy could detect and localize targets at most locations within the scotoma from the beginning of testing and, after some practice, performed virtually perfectly everywhere. By contrast, the animals that received their lesions as adults appeared blind to targets at most sites for some time. Only one of the adult lesion animals, and only after considerable time and practice, was able to localize targets fairly consistently within its scotoma. However, its performance was never as good as that of the worst infant lesion animal.
Do monkeys with striate cortex lesions show blindsight?

The virtually blind performance of the three adult lesion animals described in the previous section was unexpected.
Fig. 4. As a control for possible light scatter, detection of target stimuli in and near the blindspot ipsilateral to the lesion was tested. Target stimuli were presented in 1° steps in the vicinity of the blind spot. (The center of the blindspot is usually 14°–17° from the fovea along the horizontal meridian.) Each circle represents 4–12 stimulus presentations. The stimulus contrast was 3 log units above the background except for monkey I-2, where a contrast of 2 log units above the background was used. Since the animals were unable to detect stimuli at two or more locations in the blind spot, the amount of effective light scatter must have been less than the radius of the optic disc (about 3°).
Fig. 5. Total detection errors (failure to initiate saccades) for monkeys I-2 and A-1. Each circle represents one test point; for each polar angle, loci were tested at 6°, 12°, 18° and 24° eccentricity. The visual half field contralateral to the striate lesion is shown on the right for both plots (Moore et al., 1996).
Fig. 6. Detection performance across all repetitions for all the animals with striate lesions. Filled symbols, contralateral; open symbols, ipsilateral. (Moore et al., 1996.)
In previous studies, monkeys with striate lesions received in adulthood had been able to detect and localize stimuli (Cowey and Weiskrantz, 1963; Mohler and Wurtz, 1977; Newsome et al., 1985; Seagraves et al., 1987). Similarly, after striate lesions, some humans — those with 'blindsight' — were also able to do much better than our adult lesion monkeys (e.g. Weiskrantz, 1986). However, we noticed something that suggested that our animals actually had much more knowledge of target location than their high incidence of error had indicated. When the animals did not detect targets in the scotoma (i.e. failed to saccade to them), their post-trial behavior indicated that they had knowledge of target location. On such trials, after the trial had ended and both the fixation point and target stimulus were turned off, the animals often made saccades to the site where that target had just been presented (Fig. 9). This post-trial saccade indicated that the animal had implicit knowledge of target location. We then changed the task to try to tap into this implicit knowledge and to make the procedure more similar to those in earlier studies that had demonstrated better performance after adult striate lesions (Mohler and Wurtz, 1977; Seagraves et al., 1987).
Fig. 7. Recovery of detection over successive blocks of 1152 trials for monkeys A-3 and I-3. Note that for both animals recovery was earlier at the more central sites.
Fig. 8. The effect of reducing target contrast (from 3.0 to 2.0 log units above background) on detection in monkeys A-3 and I-2.
Fig. 9. Frequency of location of final position of saccades that occurred after the offset of an undetected target stimulus presented in the contralateral field in monkeys I-1 and A-1. The targets were presented at an eccentricity of 24° at the polar angle shown by the arrows. The polar angles of the saccades were collapsed into 30° bins. These data indicate that the animals had information about the location of the targets even on trials in which they failed to saccade to the target.
Use of a procedure resembling 'forced choice'

In our original behavioral procedure (Fig. 10, left) the fixation point stayed on after the target appeared. We then changed the procedure so that the fixation point went off when the target appeared (Fig. 10, right). When the fixation point turned off, it was as if the animal was 'forced' to saccade to a new location. To our surprise, with this change the two adult lesion monkeys, A1 and A2, who had previously acted virtually blind in the half field contralateral to their striate lesion, were now able to detect and accurately localize many of the targets in the contralateral half field.
Fig. 10. Effect of time of stimulus onset on detection. (Left) Poor performance of A1 and A2 in the original paradigm in which the fixation point remains on after onset of the target stimulus (bottom). See also legends to Figs. 3 and 5. (Right) Localization of previously undetected targets during the new procedure in which the fixation point was extinguished simultaneously with the onset of the target. The locations of the targets are shown by the small arrows in the figure on the left and the black circles in the figure on the right. Mean endpoints of the saccades are shown by the open symbols. Note that with the original procedure the targets in the contralateral field were not detected, whereas with the revised paradigm the endpoints of the saccades were on or near the targets in both half fields (Moore et al., 1995).
Figure 10 (right) shows the endpoints of saccades to targets at specified eccentricities with the new procedure. The endpoints of the saccades in both half fields were close to the targets. The accuracy of saccades to all the targets is shown in Fig. 11, where the direction of the target is plotted against the direction of the saccade for both half fields. The accuracy of the saccades on the contralateral side is virtually as good as on the ipsilateral side for one animal and only slightly impaired for the other.
Whereas in the first saccade paradigm (Fig. 10, left) the adult lesion monkeys continued to fixate on the fixation spot and ignored the targets flashed into the scotoma, in the new paradigm they now initiated accurate saccades to the targets. Note that the new procedure essentially forces the animals to initiate a saccade and, in humans with blindsight, it is a forced-choice paradigm that is usually required to reveal accurate detection and localization (Stoerig and Cowey, 1997). In a comparable situation, as first shown by Zihl and Werth (1984), humans who do show blindsight also require a cue as to target onset in order to localize and detect stimuli.

Monkeys appear to show blindsight

In summary, the adult lesion animals showed good detection and localization, but only when the fixation point was turned off, thereby 'forcing' the animal to saccade to a new location. This provides an answer to our second question: monkeys with striate lesions do show visually guided behavior similar to that of human blindsight patients. Cowey and Stoerig came to a similar conclusion in a study in which monkeys were able to categorize visual targets in the scotoma as nontargets in spite of their ability to localize them accurately (Cowey and Stoerig, 1995; Stoerig and Cowey, 1997; Stoerig et al., 2002).
Can destriate monkeys discriminate direction of movement?
Fig. 11. Accuracy of saccades to targets in the ipsilateral and contralateral fields for A1 and A2. Each point represents mean endpoints of saccades to targets appearing at different angular directions from the fixation point. Error bars show SEM across all eccentricities. (Moore et al., 1995.)
Motion discrimination

We trained the three infant lesion animals and the best-performing adult lesion animal (A-3) from the experiments described above on various movement discrimination tasks using a go/no-go procedure (Fig. 12). On each trial the animal fixated on the fixation point and then a moving stimulus was presented in either the field contralateral or ipsilateral to the lesion. If the stimulus moved in one direction (S+), the animal was rewarded for saccading to it. However, if the stimulus moved in the opposite direction (S−), the animal was rewarded for continuing to fixate.
Fig. 12. Direction of movement discrimination paradigm. The monkey first fixated on a fixation point, shortly after which a stimulus was presented. The task required the animal to saccade (arrow) to the stimulus (‘go’ trial, left) or to withhold such movements (‘no-go’ trial, right) depending on the stimulus.
Horizontal Position (deg.) Fig. 13. Position of the stimulus apertures (5 and 15 diameter circles) within the scotomata of the animals with striate lesions used in the movement discrimination experiments. Unshaded areas represent zones of the visual field with corresponding intact striate cortex. (In part after Moore et al., 2001a,b). The estimate of the field defects are based on maps of the visual topography of striate cortex (Gattass et al., 1988) and the lateral geniculate nucleus (Malpeli et al., 1996).
Discrimination performance was based on the degree to which the monkey could choose the correct trial on which to initiate a saccadic eye movement to the motion display. The monkey's performance was therefore the average of the percent correct on the S+ and S− trials, regardless of whether the trial types occurred at different frequencies. To determine whether or not the monkey could discriminate between the positive and negative stimuli above the level expected by chance, the total number of saccadic eye movements made to the two stimuli was compared in a 2 × 2 contingency table analysis.
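The two measures just described can be illustrated with a short sketch; the counts below are hypothetical, and the use of SciPy's chi-square routine is an assumption for illustration, not software used in the original experiments.

```python
# Hypothetical go/no-go counts: rows are trial type (S+ vs S-), columns are
# the response (saccade vs withheld). A saccade is correct on S+ trials and
# incorrect on S- trials.
from scipy.stats import chi2_contingency

saccade_on_splus, withheld_on_splus = 42, 8     # S+ trials: 84% correct (go)
saccade_on_sminus, withheld_on_sminus = 20, 30  # S- trials: 60% correct (no-go)

# Performance: average of percent correct on S+ and S- trials, so unequal
# trial frequencies do not bias the score.
pct_correct = 100 * 0.5 * (
    saccade_on_splus / (saccade_on_splus + withheld_on_splus)
    + withheld_on_sminus / (saccade_on_sminus + withheld_on_sminus)
)

# 2 x 2 contingency test: are saccades distributed differently across
# S+ and S- trials than expected by chance?
table = [[saccade_on_splus, withheld_on_splus],
         [saccade_on_sminus, withheld_on_sminus]]
chi2, p, dof, expected = chi2_contingency(table)
print(f"balanced percent correct = {pct_correct:.1f}%, chi2 = {chi2:.2f}, p = {p:.3f}")
```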
We tested the three infant lesion animals and the best-performing adult animal (A-3). The stimuli were presented within the scotoma (Fig. 13) and, as a control, in the field ipsilateral to the lesion. When the discriminandum was a horizontal bar that fell within the scotoma, each monkey could easily discriminate upward from downward motion inside the scotoma. This is consistent with the finding that in the absence of striate cortex many single neurons in Area MT and Area V3A are still sensitive to the direction of motion of a bar (Rodman et al., 1989; Girard et al., 1991, 1992). However, discrimination of the direction of movement of a bar is not the same as true discrimination of direction of movement (Nakayama and Tyler, 1981). Since the moving bar begins its traverse at very different positions when the movement direction is upward as opposed to downward, the monkeys could solve the discrimination by localizing the bar within the scotoma at the start of the trial.
Fig. 14. Direction of motion discrimination in a 15° aperture. Percent correct trials on discrimination of upward moving from static dot patterns and of dot patterns moving upward or downward at 4°/s or 20°/s within the scotoma or in the intact field. Note that all the animals could discriminate moving from static stimuli but only the infant lesion animals could discriminate direction of movement above a chance level. (After Moore et al., 2001a,b.)
In order to determine whether the animals had true discrimination of movement, we removed the possibility for the animal to use stimulus displacement cues by using as discriminanda coherently moving dot fields, i.e. random dot kinetograms (Britten et al., 1992). The dots were presented in a 15° aperture that fell within the scotomata of all the animals. All the infant lesion animals and adult monkey A-3 were able to discriminate moving from static dot fields inside their scotomata as accurately as in their good half field (Fig. 14). On the discrimination of direction of movement of the random dot patterns, the infant lesion animals were somewhat impaired in the scotoma but performed significantly above chance at both speeds tested (4°/s and 20°/s) (Fig. 14). By contrast, the best-performing adult lesion animal (A-3) was totally unable to tell direction of motion in her scotoma even though she had recovered some detection and localization capacity, as described earlier (Fig. 14). This ability of the infant lesion animals to detect direction of motion was crucially dependent on the size of the aperture used in the testing. When the aperture was reduced to 5° diameter, all three infant lesion animals performed at chance at the up versus down discrimination (Fig. 15). Two of them (I-2 and I-3) could not even distinguish upward movement from random motion of the dots. However, in the smaller aperture they could still discriminate moving from static dot patterns (Fig. 15).
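For readers unfamiliar with this type of stimulus, the sketch below illustrates one conventional way of generating the random dot kinetograms described above, in which a chosen fraction of dots steps coherently in the signal direction on each frame while the remainder are replotted at random positions; the dot number, step size, and aperture size are illustrative assumptions rather than the parameters used in the experiments described here.

```python
# Minimal random dot kinetogram sketch: 'coherence' sets the fraction of dots
# stepping in the signal direction each frame; the remaining dots are replotted
# at random positions, so the pattern as a whole carries no displacement cue.
import numpy as np

def rdk_frames(n_dots=100, n_frames=60, coherence=0.5, step=0.2,
               direction_deg=90.0, aperture_radius=7.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.deg2rad(direction_deg)
    dxy = step * np.array([np.cos(theta), np.sin(theta)])
    # start with dots uniformly placed in a square bounding the aperture
    pos = rng.uniform(-aperture_radius, aperture_radius, size=(n_dots, 2))
    frames = []
    for _ in range(n_frames):
        signal = rng.random(n_dots) < coherence          # dots moving coherently
        pos[signal] += dxy                               # coherent step
        pos[~signal] = rng.uniform(-aperture_radius, aperture_radius,
                                   size=((~signal).sum(), 2))  # noise dots replotted
        # dots that leave the aperture are put back at a random position
        outside = np.hypot(pos[:, 0], pos[:, 1]) > aperture_radius
        pos[outside] = rng.uniform(-aperture_radius, aperture_radius,
                                   size=(outside.sum(), 2))
        frames.append(pos.copy())
    return frames

frames = rdk_frames(coherence=1.0)   # fully coherent upward motion
```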
Fig. 15. Direction of motion discrimination in a 5° aperture. Percent correct trials on discrimination of upward moving and static dot patterns, of upward and randomly moving ('noise') dot patterns, and of upward and downward moving dot patterns in the scotoma and intact fields. Note that none of the infant animals could discriminate direction of movement in the 5° aperture, whereas they were able to do so in the 15° one. (After Moore et al., 2001a,b.)
This dependence of the residual motion sensitivity of the infant lesion animals on the size of the motion stimulus fits with the finding of Weiskrantz et al. (1995) that human blindsight patients require large-scale displacements to detect direction of movement of a spot. The total inability of the adult lesion animal to discriminate direction of movement of a random dot field is strikingly similar to the situation with those human patients who show blindsight after striate lesions. They too are unable to discriminate the direction of movement of a random dot pattern, although they can often detect the direction of movement of a bar, grating, or a single large dot (Blythe et al., 1986; Magnussen and Mathiesen, 1989; Barton and Sharpe, 1997; Azzopardi and Cowey, 2001; Cowey and Azzopardi, 2001). The behavioral abilities of human blindsight patients and the adult lesion monkey are thus parallel to the properties of single neurons in Area MT after striate lesions: sensitivity to the direction of motion of a bar but not to that of random dot fields, i.e. no true direction of movement discrimination (Rodman et al., 1989; Azzopardi et al., 1998).
Implicit knowledge about direction of movement

When we examined the fine grain of the monkeys' saccades to the moving dot fields, we realized that the metrics of the saccades differed on correct trials from those on incorrect ones. As shown in Fig. 16 for one of the infant lesion animals, when the animal made a correct saccade to upward moving dots, the vertical component of the saccade was larger than when the animal made, incorrectly, a saccade to the downward moving dots. Thus it had extracted information about the direction of movement that was reflected in the metrics of the oculomotor response. This finding is consistent with the ability of the infant lesion animals to perform the direction of movement discrimination far above chance level. What was surprising, however, was the behavior of the adult lesion animal that had completely failed the go/no-go discrimination of the direction of stimulus movement (Fig. 17). This animal's eye movements indicated that it too was able to extract information about direction of movement despite its overt behavioral performance. Unlike the infant lesion animals, the adult animal appeared to have covert or implicit 'knowledge' of direction of movement while still failing the discrimination task. As in the case of the role of the temporal cues in our perimetry study, and in human blindsight, the ability to use visual information after striate lesions depends on how that ability is assessed.
Fig. 16. Evidence for implicit direction of motion sensitivity in animal I-1. Mean horizontal (top) and vertical (bottom) components of saccadic eye movements made to fields of moving dots during direction of motion discrimination (with the 15° aperture). The animal was trained to saccade to a field of upward moving dots and withhold movement to downward moving ones. Although the animal performed above chance, as noted above, it made enough errors on the no-go or downward movement trials to compare the saccades made to upward and downward movement. The black trace shows the amplitude of correct or go saccades to the upward moving stimulus and the grey trace shows the amplitude of incorrect saccades to the downward moving stimuli. Note that the vertical amplitudes to the upward moving dots are greater than those to the downward moving ones. (After Moore et al., 2001a.)
Fig. 17. Evidence for implicit direction of motion sensitivity in animal A-3. See legend to Fig. 16. Mean horizontal (top) and vertical (bottom) components of saccadic eye movements made to fields of moving dots during direction of motion discrimination (with the 15° aperture). In this case, the animal was required to saccade to downward moving dots and withhold saccades to upward moving dots. As noted above, this animal did not perform above chance on this task. It made saccades to both directions of movement. Nevertheless, the vertical amplitude to the moving dots differed between the two directions of movement. This indicates that the animal had some implicit information about the direction of stimulus motion. (After Moore et al., 2001a.)
Concluding discussion

Comparison of monkeys and humans

The visual behavior of the monkeys that received striate lesions as adults closely resembled that of humans with blindsight in two major ways. First, they could detect and localize visual stimuli in their scotomata only if there was a temporal cue to respond, as in a forced-choice paradigm. Second, they
were unable to discriminate the direction of a moving dot field although, again like human blindsight patients, they could detect whether a dot display was moving and the direction of movement of a bar (Barton and Sharpe, 1997; Azzopardi and Cowey, 2001). The analysis of the incorrect saccades of the monkeys suggests that they may have information about the direction of moving dots but are unable to use this information to control their behavior in an operant task. It would be interesting to see if there are any parallels to this implicit knowledge of direction of movement in human blindsight patients. The visually guided behavior of the infant lesion animals was much better than that of the adult lesion animals and was different from that of the usual blindsight patients both in not needing a forced-choice situation and in being able to discriminate the direction of movement. Patient GY is a much-studied blindsight patient who, since he received his striate lesion in childhood at the age of 8, makes an interesting comparison with our infant lesion animals (Kentridge et al., 1999). He was similar to them in that his ability to localize a visual stimulus remained above chance even in the absence of temporal cues as to target onset. Yet, he was different from them in his apparent inability to discriminate the direction of movement of a random dot field (Azzopardi and Cowey, 2001).
The neural basis of the vision that survives striate lesions

What are the neural mechanisms that underlie the visual functions that survive striate lesions in adulthood? At least in the monkey, these residual or recovered functions are abolished by lesioning the superior colliculus (Mohler and Wurtz, 1977; Rodman et al., 1990). Tecto-fugal fibers project to the pulvinar (Benevento and Standage, 1983), which in turn projects widely to extra-striate visual cortical areas and might provide them with visual information sufficient to sustain residual visual functions. There are three lines of evidence that the activity of extra-striate areas might be crucial for the full scope of the visual capacities after striate lesions. The first line of evidence is that several extrastriate areas in the dorsal visual cortical system
continue to respond to visual stimuli after striate lesions, namely area MT, area V3A and the Superior Temporal Polysensory area (STP) (Bruce et al., 1986; Rodman et al., 1989, 1990; Girard et al., 1991, 1992; Gross, 1991). In fact, after striate lesions, cells in these areas maintain their normal visuotopic organization (except for STP, which is not visuotopically organized in normal animals) and continue to be sensitive to orientation and direction of motion of a moving bar. Thus, the activity of these areas could be the basis of the visually guided but unconscious behavior that survives striate lesions. The second line of evidence for a role of extrastriate cortex in blindsight is that in three separate sets of studies, patients with hemidecortication showed little or no signs of explicit blindsight (i.e. involving forced-choice testing) in the blind half field (King et al., 1996; Stoerig et al., 1996; Faubert et al., 1999). Similarly, hemidecortication in monkeys prevents accurate saccades to a target (Tusa et al., 1986). The third line of evidence is that there is considerable activity in extrastriate cortex after striate lesions in humans as revealed by functional imaging (Barbur et al., 1990; Stoerig et al., 1998; Baseler et al., 1999; Bittar et al., 1999).
Striate lesions in infancy

What might be the underlying mechanism for the much greater recovery in the infant lesion animals? Following damage to striate cortex in both infant and adult monkeys, the retino-geniculate pathway degenerates dramatically. The remaining excitatory input to the geniculate appears to be from the superior colliculus (Kisvaraday et al., 1991). A possible insight into greater recovery following early damage comes from studies of transneuronal degeneration of retinal ganglion cells after striate lesions. These studies show much faster and possibly greater degeneration in younger monkeys (Cowey, 1974; Dineen et al., 1982; Weller and Kaas, 1989). Thus, it is conceivable that the faster degeneration of the retino-geniculate pathway in infant-damaged monkeys actually facilitates recovery by disinhibiting the residual retino-collicular pathway (Moore et al., 2001). In other words, the retino-tecto-geniculate pathway may be unmasked more rapidly after early striate cortex damage and thus be more able to contribute to the residual function after such damage. Conversely, surviving lateral geniculate neurons, the geniculo-extrastriate pathway, and direct retinal inputs to the pulvinar may play a greater role after early lesions, as suggested by studies of early visual cortex lesions in cats (reviewed in Payne et al., 1996). After early lesions of V1 in monkeys, surviving geniculate neurons show dramatically expanded dendritic branching (Hendrickson and Dineen, 1982) and are overwhelmingly drawn from the calbindin-immunopositive subpopulation of the koniocellular channel, suggesting the survival of a unique population which projects to specific components of extra-striate cortex (Rodman et al., 2001).

Acknowledgments

This work was supported by National Science Foundation Grant BNS-9109743 and National Institutes of Health Grants MH-19420, MH-12336, and EY-11374. We thank M.A. Pinsk and W.T. Clark for their help and A.B. Repp for participating in some of the experiments.

References
Azzopardi, P. and Cowey, A. (2001) Motion discrimination in cortically blind patients. Brain, 124: 30–46. Azzopardi, P., Fallah, M., Gross, C.G. and Rodman, H.R. (1998) Responses of neurons in visual areas MT and MST after lesions of striate cortex in macaque monkeys. Neurosci. Abs., 24: 648. Bachevalier, J. and Mishkin, M. (1994) Effects of selective neonatal temporal lobe lesions on visual recognition memory in rhesus monkeys. J. Neurosci., 14: 2128–2139. Bachevalier, J., Brickson, M., Hagger, C. and Mishkin, M. (1990) Age and sex differences in the effects of selective temporal lobe lesion on the formation of visual discrimination habits in rhesus monkeys (Macaca mulatta). Behav. Neurosci., 104: 885–899. Barbur, J.L., Watson, J.D.G., Frackowiak, R.S.J. and Zeki, S. (1990) Conscious visual perception without V1. Brain, 116: 1293–1302. Barton, J.J.S. and Sharpe, J.A. (1997) Motion detection in blind half fields. Ann. Neurol., 41: 255–264.
293 Baseler, H.A., Morland, A.B. and Wandell, B.A. (1999) Topographic organization of human visual areas in the absence of input from primary cortex. J. Neurosci., 19: 2619–2627. Benevento, L.A. and Standage, G.P. (1983) The organization of projections of the retinorecipient and nonretinorecipient nuclei of the pretectal complex and layers of the superior colliculus to the lateral pulvinar and medial pulvinar in the macaque monkey. J. Comp. Neurol., 217: 307–336. Berker, E.A., Berker, A.H. and Smith, A. (1986) Translation of Broca’s 1865 report: Localization of speech in the third left frontal convolution. Hist. Neurol., 43: 1065–1072. Bittar, R.G., Ptito, M., Faubert, J., Dumoulin, S.O. and Ptito, A. (1999) Activation of the remaining hemisphere following stimulation of the blind hemifield in hemispherectomized subjects. Neuroimage, 10: 339–346. Blythe, I.M., Bromly, J.M., Kennard, C. and Ruddock, K.H. (1986) Visual discrimination of target displacement remains after damage to the striate cortex in humans. Nature Lond., 320: 619–621. Britten, K.H., Shadlen, M.N., Newsome, W.T. and Movshon, J.A. (1992) The analysis of visual motion: a comparison of neuronal and psychophysical performance. J. Neurosci., 12: 4745–4765. Bruce, C., Desimone, R. and Gross, C.G. (1986) Both striate cortex and superior colliculus to visual properties of neurons in the superior temporal polysensory area of the macaque. J. Neurophysiol., 55: 1057–1075. Carlson, M. (1984a) Development of tactile discrimination capacity in Macaca mulatta. II. Effects of partial removal of primary somatic sensory cortex. (Sml) in infants and juveniles. Brain Res., 318: 83–101. Carlson, M. (1984b) Development of tactile discrimination capacity in Macaca mulatta. III. Effects of total removal of primary somatic sensory cortex. (Sml) in infants and juveniles. Brain. Res., 318: 103–117. Carlson, M. and Burton, H. (1988) Recovery of tactile function after damage to primary or secondary somatic sensory cortex in infant Macaca mulatta. J. Neurophys., 8: 833–859. Cowey, A. (1974) Atrophy of retinal ganglion cells after removal of striate cortex in a rhesus monkey. Perception, 3: 257–260. Cowey, A. and Azzopardi, P. (2001) Is blindsight motion blind? In: de Gelder B., De Haan E. and Heywood C. (Eds.), Out of Mind Varieties of Unconscious Processing: New Findings and New Comparisons. Oxford University Press, Oxford, pp. 87–103. Cowey, A. and Stoerig, P. (1995) Blindsight in monkeys. Nature, 373: 247–249. Cowey, A. and Weiskrantz, L. (1963) A perimetric study of visual field defects in monkeys. Q. J. Exp. Psychol., 15: 91–115. Dineen, J. and Hendrickson, A. (1981) Age correlated differences in the amount of retinal degeneration after striate
cortex lesions in monkeys. Invest. Ophthalmol. Vis. Sci., 5: 749–752. Dineen, J., Hendrickson, A. and Keating, E.G. (1982) Alterations of retinal inputs following striate cortex removal in adult monkey. Exp. Brain Res., 47: 446–456. Faubert, J., Diaconu, V., Ptito, M. and Ptito, A. (1999) Residual vision in the blind field of hemidecorticated humans predicted by a diffusion scatter model and selective spectral absorption of the human eye. Vision Res., 39: 149–157. Gattass, R., Sousa, A.P. and Gross, C.G. (1988) Visuotopic organization and extent of V3 and V4 of the macaque. J. Neurosci., 8: 1831–1845. Girard, P., Salin, P.A. and Bullier, J. (1991) Visual activity in areas V3a and V3 during reversible inactivation of area V1 in the macaque monkey. J. Neurophysiol., 66: 1493–1503. Girard, P., Salin, P.A. and Bullier, J. (1992) Response selectivity of neurons in area MT of the macaque monkey during reversible inactivation of area V1. J Neurophysiol., 67: 1437–1446. Goldman, P.S. (1972) Development determinants of cortical plasticity. Acta Neurobiol. Exp., 32: 495–511. Gross (1991) Contribution of striate cortex and the superior colliculus to visual function in area MT, the superior temporal polysensory area and the inferior temporal cortex, Neuropsychologia. 29: 497–515. Hendrickson, A. and Dineen, J.T. (1982) Hypertrophy of neurons in dorsal lateral geniculate nucleus following striate cortex lesions in infant monkeys. Neurosci. Lett., 30: 217–222. Kennard, M.A. (1936) Age and other factors in motor recovery from precentral lesions in monkeys. Am. J. Physiol., 115: 138–146. Kennard, M.A. (1938) Reorganization of motor function in the cerebral cortex of monkeys deprived of motor and premotor areas in infancy. J. Neurophysiol., 1: 477–496. Kennard, M.A. and Fulton, J.F. (1942) Age and reorganization of central nervous system. Mt. Sinai J. Med., 9: 594–606. Kentridge, R.W., Heywood, C.A. and Weiskrantz, L. (1999) Effects of temporal cueing on residual visual discrimination in blindsight. Neuropsychologia, 37: 479–483. King, S.M., Azzopardi, P., Cowey, A., Oxbury, J. and Oxbury, S. (1996) The role of light scatter in the residual visual sensitivity of patients with complete cerebral hemispherectomy. Vis. Neurosci., 13: 1–13. Kisvaraday, Z.F., Cowey, A., Stoerig, P. and Somogyi, P. (1991) Direct and indirect retinal input into degenerated dorsal lateral geniculate nucleus after striate cortical removal in monkey: implications for residual vision. Exp. Brain Res., 86: 271–292. Magnussen, S. and Mathiesen, T. (1989) Detection of moving and stationary gratings in the absence of striate cortex. Neuropsychologia, 27: 725–728. Malpeli, J.G., Lee, D. and Baker, F.H. (1996) Laminar and retinotopic organization of the macaque lateral geniculate
294 nucleus: magnocellular and parvocellular magnification functions. J. Comp. Neurol., 3: 363–377. Milhailovic, L.T., Dragoslava, C. and Dekleva, N. (1971) Changes in the number of neurons and glial cells in the lateral geniculate nucleus of monkey during retrograde cell degeneration. J. Comp. Neurol., 142: 223–230. Mohler, C.W. and Wurtz, R.H. (1977) Role of striate cortex and superior colliculus in visual guidance of saccadic eye movements in monkeys. J. Neurosphysiol., 40: 74–94. Moore, T., Rodman, H.R. and Gross, C.G. (1998) Man, Monkey and blindsight. The Neuroscientist, 4: 227–230. Moore, T., Rodman, H.R., Repp, A.B. and Gross, C.G. (1995) Localization of visual stimuli after striate cortex damage in monkeys: parallels with human blindsight. Proc. Natl. Acad. Sci., 92: 8215–8218. Moore, T., Rodman, H.R., Repp, A.B., Gross, C.G. and Mezrich, R.S. (1996) Greater residual vision in monkeys after striate cortex damage in infancy. J. Neurophysiol., 76: 3928–3933. Moore, T., Rodman, H.R. and Gross, C.G. (2001a) Recovery of visual function following damage to striate cortex in monkeys. In: B. de Gelder, E. De Haan, C. Heywood (Eds.), Varieties of Unconscious Processing: New Findings and New Comparisons. Oxford University Press, Oxford, 35–51. Moore, T., Rodman, H.R. and Gross, C.G. (2001b) Direction of motion discrimination after early lesions of striate cortex. (V1) of the macaque monkey. Proc. Natl. Acad. Sci., 98: 1273–1276. Nakayama, K. and Tyler, C.W. (1981) Psychophysical isolation of movement sensitivity by removal of familiar position cues. Vision Res., 21: 427–433. Newsome, W.T., Wurtz, R.H., Dursteler, M.R. and Mikami, A. (1985) Punctate chemical lesions of striate cortex in the macaque monkey: effect on visually guided saccades. Exp. Brain Res., 58: 392–399. Payne, B.R., Lomber, S.G., MacNeil, M.A. and Cornwell, P. (1996) Evidence for greater sight in blindsight following damage of primary visual cortex early in life: a review. Neuropsychologia, 34: 741–774. Rodman, H.R., Gross, C.G. and Albright, T.D. (1989) Afferent basis of visual response properties in area MT of the macaque. I. Effects of striate cortex removal. J. Neurosci., 9: 2033–2050.
Rodman, H.R., Gross, C.G. and Albright, T.D. (1990) Afferent basis of visual response properties in area MT of the macaque. II. Effects of superior colliculus removal. J. Neurosci., 10: 1154–1164. Rodman, H.R., Sorenson, K.M., Shim, A.J. and Hexter, D.P. (2001) Calbindin immunoreactivity in the geniculo-extrastriate system of the macaque: implications for heterogeneity in the koniocellular pathway and recovery from cortical damage. J. Comp. Neurol., 431: 168–181. Seagraves, M.A., Goldberg, M.E., Deng, S.Y., Bruce, C.J., Ungerleider, L.G. and Mishkin, M. (1987) The role of striate cortex in the guidance of eye movements in the monkey. J. Neurosci., 7: 3040–3058. Stoerig, P. and Cowey, A. (1997) Blindsight in man and monkey. Brain, 120: 535–559. Review. Stoerig, P., Faubert, J., Ptito, M., Diaconu, V. and Ptito, A. (1996) No blindsight following hemidecortication in human subjects? Neuroreport, 7: 1990–1994. Stoerig, P., Kleinschmidt, A. and Frahm, J. (1998) No visual responses in denervated V1: high-resolution functional magnetic resonance imaging of a blindsight patient. Neuroreport, 9: 21–25. Stoerig, P., Zontanou, A. and Cowey, A. (2002) Aware or unaware: assessment of cortical blindness in four men and a monkey. Cereb. Cortex, 12: 565–574. Tusa, R.J., Zee, D.S. and Herdman, S.J. (1986) Effect of unilateral cerebral lesions on ocular behavior in monkeys: Saccades and quick phases. J. Neurophysiol., 56:1590–1625. Weller, R.E. and Kaas, J.H. (1989) Parameters affecting the loss of ganglion cells of the retina following ablations of striate cortex in primates. Vision Neurosci., 3: 327–349. Weiskrantz, L. (1986) Blindsight: A Case Study and Implications. Oxford University Press, Oxford. Weiskrantz, L, Barbur, J.L. and Sahraie, A. (1995) Parameters affecting conscious and unconscious visual discrimination with damage to the visual cortex (V1). Proc. Natl. Acad. Sci. USA, 92: 6122–6126. Zihl, J. and Werth, R. (1984) Contributions to the study of blindsight. I. Can stray light account for saccadic localization in patients with postgeniculate visual field defects? Neuropsychologia, 22: 13–22.
CHAPTER 20
Is blindsight in normals akin to blindsight following brain damage?

C.A. Marzi, A. Minelli and S. Savazzi

Department of Neurological and Visual Sciences, University of Verona, 8 Strada Le Grazie, 37134 Verona, Italy
Abstract: The aim of this chapter is to discuss evidence bearing on two related issues: first, whether the neural pathways of subliminal perception are the same as those subserving suprathreshold perception; and second, whether the pathways for subliminal perception in normals are similar to those subserving blindsight in brain-damaged patients. As to the former question, the overall balance is in favor of the different-pathway hypothesis, while a tentative answer to the second question might be that blindsight is basically similar to subliminal perception in normals. The differences that undoubtedly exist between the two conditions depend mainly on differences in the stimuli used to reveal them.
Introduction
Unconscious visual processing can occur in normals when using subthreshold or masked stimuli, or in brain-damaged patients with suprathreshold or unmasked stimuli which fail to be detected because of a perceptual or attentional impairment. In both cases the unperceived stimuli can influence behavior. In healthy humans this dissociation between awareness and performance goes under the name of subliminal perception; in patients with removal of the primary visual cortex (V1) it goes under the name of 'blindsight' (Weiskrantz, 1986). There is a tendency to call blindsight all dissociations between stimulus awareness and behavior which have been shown to occur in various neuropsychological impairments such as neglect, agnosia, amnesia and so on. Here the term will be used only in reference to patients with a lesion of the primary visual cortex (V1) or its direct input and downstream output, as occurs following damage to the optic radiations. This chapter will first deal with the possible neural pathways underlying subliminal visual perception in normals and then with the neural pathways subserving blindsight in brain-damaged patients.
Pathways of subliminal and supraliminal perception: same or different?

The concept of subliminal perception has been a subject of interest and controversy for decades (for a recent review on its neurophysiological correlates, see Shevrin, 2001). Very little is known about its neural bases. In principle there are two possibilities: one is that the pathways for subliminal and supraliminal perception are one and the same and what differs in the two conditions is that subliminal stimuli activate the brain to a smaller extent than supraliminal stimuli. According to this view, degree of neural activation would be the critical variable determining the emergence of conscious awareness. A reasonable possibility is that putative consciousness center/s have higher activation thresholds than 'unconscious' centers and therefore a presumably weak activation such as that produced by subliminal stimuli is not sufficient to excite them. The second possibility is that there are separate pathways for conscious and unconscious vision diverging from relatively early processing stages and that only the conscious
pathway is connected to putative consciousness center/s. A more 'dynamic' alternative to the concept of consciousness center/s, one that cuts across the same/different-pathway distinction, is that of feedback cortico-cortical connections whose activation would be critical for perceptual awareness (Lamme, 2001; Pascual-Leone and Walsh, 2001). In principle, back-projecting neurons from higher-order visual areas to the primary visual cortex might have higher activation thresholds than forward-projecting neurons and therefore might not be activated by the weak input typically associated with unperceived stimuli. The lack of feedback from higher-order visual areas would be incompatible with stimulus awareness. By the same token, one could argue that only 'conscious pathways' connect to areas providing feedback to primary visual areas and that is the reason why their activation yields stimulus awareness. How can one decide between the same-pathway and different-pathway alternatives? Not many studies have specifically addressed this problem. Among the most recent and better controlled studies demonstrating the existence of subliminal perception in normals are experiments using psychophysical (Bonneh et al., 2001; He and MacLeod, 2001; Kolb and Braun, 1995; MacLeod and He, 1993; Shady and MacLeod, 2002; Watanabe et al., 2001), priming (Bar and Biederman, 1999; Dehaene et al., 1998) or masking techniques (Meeres and Graves, 1990). The majority of these studies, however, have not been specifically concerned with the possible neural bases of the subliminal effect documented. Therefore, one can find only indirect clues to try and decide between the two hypotheses outlined above. In principle, convincing evidence for the same-pathway hypothesis would be to show that consciously perceived stimuli activate the same brain structures as identical unperceived stimuli, albeit to a greater extent. Alternatively, different structures might be activated following presentation of perceived and unperceived identical stimuli, and this would provide support for the different-pathway hypothesis. Unfortunately, evidence of this type is difficult to gather in normals because the manipulations used to render a stimulus invisible yet capable of being subliminally processed inevitably make such a stimulus different from its visible counterpart. Thus, in principle, one cannot be sure if the different pattern
of cerebral activation found for the two conditions of stimulation depends on the consciousness variable or on the manipulation used to control visibility. One way out of this problem is to use stimuli that can be perceived only on a given proportion of trials because they are difficult to detect or discriminate, rather than as a result of manipulations such as masking or tachistoscopic presentation or as a result of attentional slips. The critical procedure would then be to separately analyze the neural activation produced by physically identical unseen and seen stimuli, making sure that the former yield subliminal processing. To our knowledge such an experiment has not yet been done in normals. Recently, we used a logically similar approach (Marzi et al., 2000) by testing a patient with partial unilateral extinction as a result of right-hemisphere frontal-parietal-temporal damage and no other neuropsychological impairment. We compared the ERP responses to normally perceived pairs of small patches of light (about 60%) to those to pairs of identical stimuli in which the left stimulus went unnoticed, i.e. was 'extinguished' (about 40%). It is important to point out that the trials in which the patient showed left extinction were interspersed among the others and were not a mere consequence of attentional shifts. We found that, in contrast to the normally perceived stimulus, the extinguished left stimulus did not evoke any P1 and N1 components, while it did evoke later components such as P300. We think, therefore, that extinction affects a relatively early stage of cortical visual processing. Unfortunately, in that study the patient did not show evidence of implicit processing (such as an implicit redundancy gain with double stimuli, like that found by Marzi et al. (1996) in some extinction patients) and therefore gave us no information on the neural correlate of subliminal processing, because the lack of P1 and N1 correlated with the lack of both conscious and unconscious early visual processing. Further similar experiments are necessary with patients who do show subliminal effects. Given that P1 and N1 are the earliest components of the visual ERP to be influenced by attention (Hillyard and Anllo-Vento, 1998), our results underline the importance of spatial attention for conscious processing. A functional magnetic resonance imaging (fMRI) study which specifically addressed the comparison of the cerebral areas that are activated by
the same stimuli when they are perceived or not was recently carried out by Moutoussis and Zeki (2002) in normal subjects. They used dichoptic fusion to render monocular stimuli (faces or houses) visible or not depending upon whether they were of the same or of the opposite color, respectively. It was found that houses and faces activated specific cortical areas, independently of whether they were perceived or not. As predicted by the same-pathway hypothesis, the activation observed with unperceived stimuli was weaker than with their perceived counterparts but was located in the same areas as those activated by perceived stimuli. One note of caution is in order, however: a critical assumption of the experiment is that binocular rivalry was eliminated, and this requires complete dichoptic fusion in all subjects. If this was not entirely achieved, then the residual activation found in the dichoptically fused condition, that is, in the condition in which conscious perception was absent, might have been related to a partial breakdown of fusion caused by binocular rivalry. Another interesting approach to the study of the pathways of subliminal perception is priming: in a recent study, Naccache and Dehaene (2001) found that the intraparietal region, i.e. an area involved in semantic processing of numbers, is affected by both subliminal and supraliminal primes. In another study with words as stimuli (Dehaene et al., 2001), the activation obtained with masked words was largely reduced in comparison with that evoked in the same areas by conscious reading. Again, these results provide support for the same-pathway hypothesis of subliminal perception, at least when tested with a priming paradigm. Among the studies advocating the different-pathway hypothesis there is a priming experiment by Bar and Biederman (1999), who found that pictures too briefly presented to be recognized, or even guessed above chance on a forced-choice test, can nonetheless facilitate the recognition of the same pictures many trials later. This subliminal effect was evident only for images that remained within the same quadrant in priming and test trials, but survived a translation of about 5°. Thus, one can reasonably suggest that subliminal visual priming is likely mediated by cortical areas in which cells have receptive fields larger than 5° but are confined to a single quadrant. Candidate areas in humans could be
the homolog of macaque V4 or TEO (the posterior part of the inferior temporal cortex). It follows that awareness of objects might be associated with hierarchically higher-level areas such as the anterior part of the inferior temporal cortex, namely area TE. Thus, the pathways for sub- versus supraliminal perception need not be entirely different but might differ solely in the involvement of higher-order areas when stimuli are consciously perceived. Perhaps more direct support for the different-pathway hypothesis comes from the observations of Azzopardi and Cowey (1997), which were obtained with blindsight patient GY rather than with normal participants. By using a signal detection theory procedure they showed that blindsight has different characteristics from normal degraded vision and argued that blindsight vision is mainly subserved by the dorsal 'where' pathway rather than by the ventral 'what' pathway originally proposed by Ungerleider and Mishkin (1982). This is because the absence of V1 forces visual information to be relayed to the cortical centers by tectal and pulvinar pathways that innervate extrastriate areas such as MT, which belong to the dorsal route. A similar concept of a differential sensitivity of the parallel visual streams could be applied to the well-known results of Meeres and Graves (1990) in normal participants. By using briefly presented (and masked) open circles at six possible locations, as well as blanks, these authors showed that when subjects reported that the circle was absent, they nevertheless guessed the exact locations of the gap significantly better than chance. A signal detection theory analysis found that the subjects were more sensitive to stimulus location than to stimulus detection, and this is in broad keeping with the idea that subliminal processing is biased toward dorsal ('where') system processing.
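For reference, sensitivity in such signal detection analyses is conventionally expressed as d′, computed from the hit rate H and the false-alarm rate FA with the inverse standard normal cumulative distribution Φ⁻¹; the formula below is the standard textbook definition, not a value or derivation taken from either study:

$$ d' = \Phi^{-1}(H) - \Phi^{-1}(\mathit{FA}) $$

Saying that subjects were more sensitive to stimulus location than to stimulus detection then amounts to comparing two such indices computed from the same sessions.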
Further support for the different-pathway hypothesis comes from psychophysical studies in which unperceived stimuli nonetheless yield visual after-effects. Recently, He and MacLeod (2001) have shown that visual after-effects that are likely to be subserved by cortical neurons show a lower spatial resolution than thalamic neurons in the monkey (McMahon et al., 2000), and this justifies their observation in humans that orientation-selective adaptation and tilt after-effects can be elicited by invisible stimuli. They found that after looking at a very fine grating so high in spatial frequency that it was indistinguishable from a uniform field, participants required more contrast to detect a test grating presented at the same orientation than one presented at the orthogonal orientation. The subjects also experienced a tilt after-effect that depended on the relation between the tilt of the test pattern and the orientation of the unseen pre-exposed pattern. This result suggests a reasonable possibility, namely that the difference between conscious and unconscious vision depends upon the sensory thresholds of the pathways: those subserving conscious vision have higher absolute thresholds and lower spatial selectivity than those subserving unconscious vision. Finally, a study supporting the different-pathway hypothesis is an fMRI experiment carried out a few years ago by Sahraie et al. (1997) on blindsight subject G.Y. These authors presented stimuli moving at different speeds to the sighted or blind hemifield and asked the patient to discriminate the direction of motion and to report, by pressing a commentary key, whether he was aware of the presence of the stimulus or not. For fast stimulus speeds the patient reported that he was aware of any stimulus presentation in either the good or the blind hemifield. In contrast, for slow stimulus speeds he reported no awareness of any stimulus presentation in the blind hemifield, while he was aware of the stimulus in his good field. Importantly, G.Y.'s discrimination accuracy was very good even in the unaware condition. The fMRI results pointed to a shift in the pattern of cerebral activity from the aware to the unaware mode: in the former there was an activation of prestriate and dorsolateral prefrontal cortex, while in the latter the superior colliculus (SC) was active together with medial and orbital prefrontal cortex. These results indicate that prefrontal areas such as Brodmann areas 46 and 47 are strongly implicated in perceptual awareness, while during the unaware mode there is a shift to subcortical activity mediated by the SC. Again, as in the case of Azzopardi and Cowey's (1997) results, one should consider that this pattern of activity may be peculiar to hemianopic patients such as G.Y., and a similar study in normals is needed before generalizing the above conclusion to the healthy brain. What can one conclude from this brief review of the evidence on the neural substrate of subliminal
versus supraliminal visual processing? Clearly, the results are far from concordant, but at least one straightforward conclusion can be drawn, namely, that the pathways for subliminal vision must have different psychophysical thresholds from supraliminal pathways (masking effects might be an exception) and therefore, logically, they cannot be identical. If the dissociation between conscious awareness and behavior concerns different functions, such as localization versus detection as in Meeres and Graves' (1990) study, then the pathways for sub- and supraliminal processing are likely to involve different systems with different subjective thresholds, e.g., the dorsal versus the ventral route of Ungerleider and Mishkin (1982). When the dissociation is within the same function and has been brought about by luminance differences, as in our study on the implicit redundancy gain (Savazzi and Marzi, 2002), then the pathways might involve different subsystems with different luminance thresholds within the same main stream. As we will see in the next section, it is reasonable to assume that the SC might have a lower activation threshold for visual stimuli than the visual cortex and therefore might be a likely neural site of some subliminal effects involving luminance change.
Subliminal visual effects as studied with the redundant signal effect (RSE) paradigm
To study the possible pathways of subliminal visual processing in normals we have employed a paradigm known as the redundant signal effect (RSE) that has enabled us to reveal blindsight effects in patients with hemianopia as a result of V1 lesions (Corbetta et al., 1990; Marzi et al., 1986) or hemispherectomy (Tomaiuolo et al., 1997) or in patients with unilateral extinction as a result of right temporal-parietal lesions (Marzi et al., 1996). Because of its simplicity and robustness this paradigm is a very handy means to reveal implicit effects. It consists in the speeding up of RT when responding to multiple versus single stimuli and represents an example of divided visual attention in which signal processing can be carried out in parallel to the advantage of response speed (Cavina-Pratesi et al., 2001; Miller, 1982). Two main models have been proposed for the RSE: A probabilistic (race) model (Raab, 1962) and a neural (coactivation) model (Miller, 1982). The former
postulates independent channels carrying information from each of the redundant stimuli: Whichever channel wins the race activates a decision center triggering the response. By increasing the number of channels one increases the probability that the fastest channel will be quicker than the average channel. The latter model postulates that signals from the various channels are summed in an activational pool and, therefore, the threshold for initiating the response is reached more quickly than for single stimuli. Miller (1982) has provided a straightforward mathematical method to discriminate between probabilistic and neural summation with his race inequality, which sets an upper limit on the cumulative probability of a response given multiple signals. If the upper bound is violated, a coactivation explanation is likely, because under independent races the cumulative probability of a response to redundant stimuli can never exceed the sum of the cumulative probabilities of responses to the single stimuli. Miller's test is conservative: a violation eliminates the probabilistic explanation, whereas a lack of violation does not exclude the possibility of coactivation (a schematic illustration of the test is given at the end of this section). In the last 15 years we have been able to demonstrate that some patients show an RSE for stimuli bilaterally presented across the vertical visual field meridian despite their lack of awareness of the stimulus presented in the affected hemifield. To try and understand the possible pathways underlying these implicit effects, we recently carried out a similar experiment in normal subjects by using pairs of stimuli in which one stimulus had a subthreshold light intensity and therefore could not be consciously detected (Savazzi and Marzi, 2002). Twelve healthy right-handed students (six males) were asked to press a key as quickly as possible following presentation of single or double small luminous squares of about 1° presented on a PC screen for 150 ms at an eccentricity of 6° along the horizontal meridian, either in the right or left hemifield or bilaterally. In the latter case, the distance between the two stimuli was 12°. The paradigm was simple RT, and therefore, subjects were asked to respond without having to discriminate between single and double stimuli. Both the hemifield of presentation of single stimuli and the alternation between single and double stimuli were unpredictable. There were three levels of stimulus luminance, determined by previous individual threshold assessment with the method of constant stimuli (range of luminance:
0.02–0.58 cd/m²). Suprathreshold stimuli were set at a luminance that represented the minimum value at which the stimuli could be detected by all subjects on at least 99% of presentations (this value was 0.30 cd/m² for all subjects). Subthreshold stimuli were those that were detected on less than 1% of presentations; their luminance was 0.05 cd/m² for nine subjects and 0.04 cd/m² for the remaining three. Finally, control stimuli had such a low luminance (0.02 cd/m²) that they were never detected. Each subject received 13 combinations of stimulus number and luminance that were alternated in a random series. The most important comparison was between RT to single suprathreshold stimuli and RT to double mixed stimuli in which one stimulus in a pair was suprathreshold and the other was subthreshold. Confirming previous reports, double suprathreshold stimuli yielded an RSE, i.e. reliably faster RTs (356 ms) than single suprathreshold stimuli (372 ms). However, the novel finding was the occurrence of an RSE even in the double mixed condition, i.e. when subjects reported having seen only one stimulus because one stimulus in the pair was subthreshold. These 'virtual' double stimuli yielded a mean RT (367 ms) which was reliably shorter than that for single suprathreshold stimuli despite their identical subjective appearance. It is important to mention that when double stimuli were made up of the combination of a control and a suprathreshold stimulus there was no RSE, and this provides convincing evidence that the implicit RSE found was not an artifact. Further control experiments ruled out the possibility that the implicit effect found was due to the minuscule proportion of seen subthreshold stimuli or to the different overall luminosity of the display during the presentation of single versus double stimuli (see Savazzi and Marzi, 2002). We then verified whether the observed RSE was related to neural (Miller, 1982) or to probability summation (Raab, 1962) by using the race inequality test proposed by Miller (see above) and found that for both the supraliminal and the subliminal RSE there was a violation of the race model; therefore a probabilistic explanation could be ruled out in favor of a neural coactivation explanation. What might be a likely neural site for the implicit effect observed? A previous study (Tomaiuolo et al., 1997) found that hemispherectomy patients show an RSE even though one stimulus of a pair could not
be consciously detected because it was presented in the hemianopic hemifield. This result strongly suggests the SC as a likely site of coactivation. This possibility is strengthened by the well-known fact that the SC is a center of multimodal sensory convergence (Stein, 1998) and that a large redundancy gain has been classically described with multimodal stimuli (Nickerson, 1973). Furthermore, there is recent functional magnetic resonance imaging (fMRI) evidence (Iacoboni et al., 2000) suggesting that the RSE might be mediated by the SC under modulatory influences from the extrastriate cortex. This possibility has received support from our own recent ERP evidence suggesting that the neural coactivation underlying the RSE might take place at the level of extrastriate cortex (Miniussi et al., 1998), an area richly interconnected with the SC. Finally, there are studies (Cavina-Pratesi et al., 2001; Miller et al., 1999; Mordkoff et al., 1996) demonstrating that the coactivation effect is unlikely to occur at a premotor stage of RT, but rather at a perceptual stage. Typically, the RSE is not retinotopic and coactivation can occur between stimuli practically anywhere in the visual field. Therefore, the level at which the RSE occurs might well be the SC, a visual structure in which retinotopy is rather coarse. On the whole, the picture emerging from these results reinforces the idea that when visual processing is mainly subserved by a subcortical structure it remains unconscious. Our present experiments indicate that the SC presumably has a lower threshold for the visual activation of its neurons than visual cortical areas, an important result that bears on the general problem of the neural correlates of conscious experience. Why should the cortex be less sensitive to low luminance than the SC? One simple possibility is that the visual cortical system has adapted for operations such as object and color vision that require an adequate degree of luminance. To reduce overall visual noise the visual cortex might impose a block on low-luminance signals. In contrast, subcortical centers mediate phylogenetically ancient responses to luminance changes and it is advantageous for them to have lower luminance thresholds. The cost is a lack of visual awareness. Of course, this explanation might work for stimuli which go unperceived because of low luminance but it might not apply to masked stimuli, for example.
Furthermore, it does not apply to implicit processing in blindsight proper and the other forms of implicit–explicit dissociations caused by brain damage that occur with suprathreshold stimuli. In these cases the residual unconscious visually guided behavior does not seem to have particularly low thresholds for localization by pointing or eye movements, as might have been predicted if it were mediated by the SC. This might be a consequence of the lesion of V1 or of the optic radiations causing blindsight. It has long been known from single-cell studies in cats and monkeys that the responsiveness of SC neurons is decreased following a V1 lesion (see Sprague, 1991, for a review). Therefore, implicit perception subserved by the SC in blindsight patients may be less sensitive than subliminal perception in normals. At any rate, it should be noted that, to our knowledge, threshold testing in blindsight has never been carried out using indirect rather than direct (i.e. forced-choice) methods. Indirect methods, such as the RSE with bilateral stimuli, might possibly reveal a threshold sensitivity similar to that found in subliminal perception in normals (Savazzi and Marzi, 2002). Why then is blindsight blind? In our opinion there are two possibilities: The first follows from the previous discussion of subcortical vision and is almost a tautology: Subcortical vision is by itself unconscious, and if all the processing that remains after a cortical lesion is subcortical then vision is unconscious. The second possibility is that following a V1 lesion the surviving functions are mediated by cortical areas that had never been activated during visual perception before the lesion; in such a case conscious awareness is lost (see Marzi, 1999, for a more extended discussion of this point).
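For readers unfamiliar with the race inequality referred to above, the following is a minimal schematic sketch of the test written in Python; it is not the analysis code used in any of the studies cited, and the function names, Gaussian distributions and trial numbers are invented purely for illustration (the means loosely echo the RT values reported above). The sketch compares, at each time point, the cumulative probability of a response to redundant stimuli with the sum of the cumulative probabilities for the two single stimuli, which is the upper bound allowed by probability summation.

import numpy as np

def ecdf(rts, t):
    # Empirical cumulative probability of a response by time t: P(RT <= t).
    return np.mean(np.asarray(rts, dtype=float) <= t)

def race_violations(rt_left, rt_right, rt_both, times):
    # Miller's (1982) race inequality:
    #   P(RT <= t | both) <= P(RT <= t | left) + P(RT <= t | right)
    # Return the time points at which the redundant-stimulus distribution
    # exceeds this bound (capped at 1, since a probability cannot exceed 1),
    # i.e., where probability summation alone cannot account for the data
    # and coactivation is implied.
    return [int(t) for t in times
            if ecdf(rt_both, t) > min(ecdf(rt_left, t) + ecdf(rt_right, t), 1.0)]

# Hypothetical reaction times (ms), simulated only to illustrate the test.
rng = np.random.default_rng(1)
rt_left = rng.normal(372, 30, 1000)
rt_right = rng.normal(372, 30, 1000)
rt_both = rng.normal(356, 30, 1000)

print(race_violations(rt_left, rt_right, rt_both, np.arange(260, 460, 10)))

Any time point returned by this sketch marks a descriptive violation of the race bound; in actual studies such violations are of course evaluated statistically across participants before coactivation is inferred.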
Implicit RSE in normals and in brain-damaged patients
What, then, is the difference between blindsight in normals and in brain-damaged patients? Before trying to analyze commonalities and differences one consideration is in order, namely that conscious and unconscious pathways can converge. One example comes from our own experiments on the RSE: a stimulus of subthreshold luminance (for conscious
perception) and a suprathreshold stimulus converge at the level of SC neurons to yield a redundancy gain. The same occurs in brain-damaged patients: a stimulus which has been excluded from consciousness because it has been presented to a hemianopic or neglected visual field area can interact with a stimulus presented to normal areas and yield an RSE. Thus, the two pathways, conscious and subconscious, can in fact converge upon common neurons (see Fig. 1). The results with the implicit RSE can help us understand blindsight in normals and in brain-damaged patients. But what are the commonalities and what are the differences between these two aspects of implicit vision? The commonalities are essentially that both might occur in the SC–extrastriate cortex pathway. The differences are that blindsight in normals is presumably subserved by pathways with lower detection thresholds than those subserving blindsight in brain-damaged patients, because psychophysical studies have generally found elevated rather than lower thresholds in the blindsighted hemifield in comparison to the unaffected hemifield. Moreover,
one should be aware of the aforementioned study by Azzopardi and Cowey (1997) showing that the sensitivity of blindsight patients might change depending on the task used. In contrast to thresholds, however, the level of discriminative performance seems to be much higher in blindsight than in many subliminal paradigms, although this difference has never been systematically investigated. This is not unexpected given that in blindsight the stimuli used are typically clearly visible (when presented to the normal portions of the visual field) while in subliminal testing they are by definition weak because of their physical characteristics. All in all, one could conclude that the similarities are greater than the differences. Our initial query was whether the aware and unaware processing modes in normal vision are subserved by similar or different neural pathways and whether the latter is similar to blindsight. While Azzopardi and Cowey (1997) have convincingly shown that blindsight is not similar to normal near-threshold vision, our work with the RSE in normal and brain-damaged subjects suggests that
blindsight might indeed be similar not to near- but to below-threshold vision.
Fig. 1. Schematic representation of the pathways for subliminal summation in normals and in blindsight patients. Thick arrows indicate suprathreshold stimuli and thin arrows indicate stimuli of subliminal intensity. In normals the subliminal stimulus is forced to use the superior colliculus (SC) → pulvinar (PUL) → extrastriate pathway because this pathway is hypothesized to have lower detection thresholds than the lateral geniculate nucleus (LGN) → primary visual cortex (V1) → extrastriate cortex pathway. By the same token, in blindsight patients the stimulus presented to the hemianopic hemifield is forced to use the SC → PUL → extrastriate cortex pathway because the LGN → V1 → extrastriate cortex pathway is not viable. Despite this difference, in both cases, coactivation of the redundant stimuli occurs at the level of the bilateral SC by means of the intercollicular commissure. The dashed line indicates the pathway that cannot be activated by subthreshold stimuli in normals. The dashed box in blindsight patients indicates that the LGN undergoes retrograde degeneration following a lesion in V1.
References Azzopardi, P. and Cowey, A. (1997) Is blindsight like normal, near-threshold vision? Proc. Natl. Acad. Sci. USA, 94: 14190–14194. Bar, M. and Biederman, I. (1999) Localizing the cortical region mediating visual awareness of object identity. Proc. Natl. Acad. Sci. USA, 96: 1790–1793. Bonneh, Y.S., Cooperman, A. and Sagi, D. (2001) Motioninduced blindness in normal observers. Nature, 411: 798–801. Cavina-Pratesi, C., Bricolo, E., Prior, M. and Marzi, C.A. (2001) Redundancy gain in the stop-signal paradigm: implications for the locus of coactivation in simple reaction time. J. Exp. Psychol. Hum. Percept. Perform., 27: 932–941. Corbetta, M., Marzi, C.A., Tassinari, G. and Aglioti, S. (1990) Effectiveness of different task paradigms in revealing blindsight. Brain, 113: 603–616. Dehaene, S., Naccache, L., Le Clec, H.G., Koechlin, E., Mueller, M., Dehaene-Lambertz, G., van de Moortele, P.F. and Le Bihan, D. (1998) Imaging unconscious semantic priming. Nature, 395: 596–600. Dehaene, S., Naccache, L., Cohen, L., Bihan, D.L., Mangin, J.F., Poline, J.B. and Riviere, D. (2001) Cerebral mechanisms of word masking and unconscious repetition priming. Nat. Neurosci., 4: 752–758. He, S. and MacLeod, D.I. (2001) Orientation-selective adaptation and tilt after-effect from invisible patterns. Nature, 411: 473–476. Hillyard, S.A. and Anllo-Vento, L. (1998) Event-related brain potentials in the study of visual selective attention. Proc. Natl. Acad. Sci. USA, 95: 781–787. Iacoboni, M., Ptito, A., Weekes, N.Y. and Zaidel, E. (2000) Parallel visuomotor processing in the split brain: cortico– subcortical interactions. Brain, 123: 759–769. Kolb, F.C. and Braun, J. (1995) Blindsight in normal observers. Nature, 377: 336–338. Lamme, V.A. (2001) Blindsight: the role of feedforward and feedback corticocortical connections. Acta Psychol. (Amst.), 107: 209–228. MacLeod, D.I. and He, S. (1993) Visible flicker from invisible patterns. Nature, 361: 256–258. Marzi, C.A. (1999) Why is blindsight blind? J. Consciousness Studies, 6: 12–18. Marzi, C.A., Tassinari, G., Aglioti, S. and Lutzemberger, L. (1986) Spatial summation across the vertical meridian in hemianopics: a test of blindsight. Neuropsychologia, 24: 749–758. Marzi, C.A., Smania, N., Martini, M.C., Gambina, G., Tomelleri, G., Palamara, A., Alessandrini, F. and Prior, M.
(1996) Implicit redundant-targets effect in visual extinction. Neuropsychologia, 34: 9–22. Marzi, C.A., Girelli, M., Miniussi, C., Smania, N. and Maravita, A. (2000) Electrophysiological correlates of conscious vision: evidence from unilateral extinction. J. Cogn. Neurosci., 12: 869–877. McMahon, M.J., Lankheet, M.J.M., Lennie, P. and William, D.R. (2000) Fine structure of parvocellular receptive fields in the primate fovea revealed by laser interferometry. J. Neurosci., 20: 2043–2053. Meeres, S.L. and Graves, R.E. (1990) Localization of unseen visual stimuli by humans with normal vision. Neuropsychologia, 28: 1231–1237. Miller, J. (1982) Divided attention: evidence for coactivation with redundant signals. Cognit. Psychol., 14: 247–279. Miller, J., Ulrich, R. and Rinkenauer, G. (1999) Effects of stimulus intensity on the lateralized readiness potential. J. Exp. Psychol. Hum. Percept. Perform., 25: 1454–1471. Miniussi, C., Girelli, M. and Marzi, C.A. (1998) Neural site of the redundant target effect electrophysiological evidence. J. Cogn. Neurosci., 10: 216–230. Mordkoff, J.T., Miller, J. and Roch, A.C. (1996) Absence of coactivation in the motor component: evidence from psychophysiological measures of target detection. J. Exp. Psychol. Hum. Percept. Perform., 22: 25–41. Moutoussis, K. and Zeki, S. (2002) The relationship between cortical activation and perception investigated with invisible stimuli. Proc. Natl. Acad. Sci. USA, 99: 9527–9532. Naccache, L. and Dehaene, S. (2001) The priming method: imaging unconscious repetition priming reveals an abstract representation of number in the parietal lobes. Cereb. Cortex, 11: 966–974. Nickerson, R.S. (1973) Intersensory facilitation of reaction time: energy summation or preparation enhancement? Psychol. Rev., 80: 489–509. Pascual-Leone, A. and Walsh, V. (2001) Fast backprojections from the motion to the primary visual area necessary for visual awareness. Science, 292: 510–512. Raab, D. (1962) Statistical facilitation of simple reaction times. Trans. N.Y. Acad. Sci., 24: 574–590. Sahraie, A., Weiskrantz, L., Barbur, J.L., Simmons, A., Williams, S.C.R. and Brammer, M.J. (1997) Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc. Natl. Acad. Sci. USA, 94: 9406–9411. Savazzi, S. and Marzi, C.A. (2002) Speeding up reaction time with invisible stimuli. Curr. Biol., 12: 403–407. Shady, S. and MacLeod, D.I. (2002) Color from invisible patterns. Nat. Neurosci., 5: 729–730. Shevrin, H. (2001) Event-related markers of unconscious processes. Int. J. Psychophysiol., 42: 209–218. Sprague, J.M. (1991) The role of the superior colliculus in facilitating visual attention and form perception. Proc. Natl. Acad. Sci. USA, 88: 1286–1290.
Stein, B.E. (1998) Neural mechanisms for synthesizing sensory information and producing adaptive behaviors. Exp. Brain Res., 123: 124–135. Tomaiuolo, F., Ptito, M., Marzi, C.A., Paus, T. and Ptito, A. (1997) Blindsight in hemispherectomized patients as revealed by spatial summation across the vertical meridian. Brain, 120: 795–803.
Ungerleider, L.G. and Mishkin, M. (1982). Two cortical visual systems. In: Ingle, D.J., Goodale, M.A. and Mansfield, R.J.W. (Eds.), Analysis of Visual Behavior. The MIT Press, Cambridge Massachusetts, pp. 549–586. Watanabe, T., Nanez, J.E. and Sasaki, Y. (2001) Perceptual learning without perception. Nature, 413: 844–848. Weiskrantz, L. (1986). Blindsight. A Case Study and Implications. Oxford University Press, Oxford.
Progress in Brain Research, Vol. 144 ISSN 0079-6123 Copyright © 2004 Elsevier BV. All rights reserved
CHAPTER 21
Auras and other hallucinations: windows on the visual brain Frances Wilkinson* Centre for Vision Research, Toronto Western Research Institute, and York University, 4700 Keele St, Toronto ON M3J 1P3, Canada
Abstract: Hallucinations in psychologically normal individuals provide a valuable route to studying the neural mechanisms of visual awareness. Migraine auras, epileptic auras and the hallucinations of Charles Bonnet Syndrome are examined in this context. Both similarities and striking differences in content are noted and the extent to which we are currently able to localize the source of these forms of endogenously driven visual awareness is discussed.
Visual awareness, the topic of this volume in honour of Alan Cowey, normally refers to the processes by which the visual pathways of the brain encode and represent external reality as it is conveyed by light emitted from and reflected by objects in our surroundings. Avoiding the thorny issue of animal consciousness, let us agree that this awareness allows humans to recognize individual objects and scenes and to interact with them in real time in a manner that is normally effortless. A widely used approach to studying nervous system function is to examine its workings under abnormal conditions, often following structural damage, where the function of interest is absent or impaired. The long tradition of work on blindsight — vision without awareness — pioneered by Weiskrantz et al. (1974) and by Cowey (Cowey and Stoerig, 1991; Weiskrantz and Cowey, 2002) is well represented in earlier chapters in this volume. In blindsight, visual information can apparently still be garnered from the retinal input and used to respond to certain types of visual challenge, but the awareness — the sense of ‘watching the world’ — is lost. Hallucinations are the other side of the coin. The sense of ‘watching’ is vividly present; it is the stimulus — the external world — that is missing. The visual pathways have been co-opted by endogenous
sources of activation, which are nonetheless able to elicit percepts that are vivid and often complex. There are numerous conditions that would qualify as vision without a stimulus. The most common is dreaming; others include the visions brought on by mind-altering drugs, and the hallucinations associated with psychoses. However, the present discussion will be limited to just three types of visual hallucination which are amenable to careful scientific investigation: the auras of migraine, which have been studied in the author's laboratory for several years, the auras of occipital lobe epilepsy, and the hallucinations associated with visual loss, known as Charles Bonnet syndrome (CBS). All three occur in awake, aware individuals whose self-reports are not subject to the uncertainties of psychosis or drug states, nor to the amnesia that usually occurs with dreams. Both the potential value of hallucinations as a window on the brain and the difficulties entailed in their study were recognized long ago by the neurologist W.R. Gowers who said in his 1895 Bowman Lecture on epilepsy and migraine: "The difficulty of ascertaining the facts depends on their subjectivity. That which is to be discerned can only be seen through the vision of another. Moreover, this is the sight of the unreal; it is the sight of that which is not. Yet, though unreal to the
*Corresponding author. Tel.: +416-736-2100 x33184; Fax: +416-736-5662; E-mail:
[email protected] DOI: 10.1016/S0079-6123(03)14402-1
subject it is, as a sensation, a profound reality which confuses the mind and may make even recollection painful. Hence the opportunities for ascertaining trustworthy facts are very rare, and when they come it is important that they should be made the most of." (Gowers, 1895, pp. 3–4) Today it is possible to combine careful behavioral observations with data from functional imaging, electrophysiology and other modern techniques to give us a much clearer picture of the neural basis of these phenomena. In the following pages, a brief review of old and recent literature on hallucinations in migraine, epilepsy and Charles Bonnet syndrome is provided, highlighting similarities and differences. We also describe some observations from our laboratory on the auras of migraine.
Classification of visual hallucinations
The hallucinations to be considered here differ from dreams in several important ways. First, they are limited to the visual modality. While complex objects may be seen, they are generally not heard, felt, tasted or smelled. Secondly, the hallucinator is an observer of the events, not an active participant, either in terms of actions or in terms of emotional engagement. Finally, unlike the dreamer, the hallucinator is simultaneously aware of the external environment. There is no widely accepted system for classifying visual hallucinations. For our present purposes, let us distinguish among four general categories: (1) elementary, (2) complex, (3) illusions and distortions, and (4) ictal blindness. A similar subdivision has been made by several other authors (Ritchie Russell and Whitty, 1955; Sveinbjornsdottir and Duncan, 1993; Bien et al., 2000), and other more complex systems usually involve the subdivision of one or more of these primary categories (Gru¨sser and Landis, 1991; ffytche and Howard, 1999). Elementary visual hallucinations include all simple visual phenomena lacking meaning and form beyond simple geometric shape. This includes points of light (phosphenes) occurring singly or in static or moving arrays, spots, line elements, and simple geometric forms. Properties which are relevant in describing
such elementary positive hallucinations include colour/achromaticity, brightness, pattern of flicker/ movement if any, and location in the visual field. In contrast, complex hallucinations entail reports of objects and scenes. Among the questions of interest here are the categories of objects reported and their relative frequency, whether objects are recognized exemplars from the individual’s past experience (e.g., the face of a family member), whether they appear natural or distorted, three-dimensional or flat, coloured or achromatic, stationary or in motion and so on. Similar questions may be asked of hallucinations consisting of complex scenes. The elementary and complex hallucinations might be thought of as ‘true hallucinations’ or positive visual phenomena. Something is perceived which is not present in the external world. The third category — illusions and distortions — is somewhat different. Real visual input is modulated by endogenously generated activity producing an alteration in the perception of the external world. Such distortions take several forms. One is the enhancement of an elementary property such as color. More common is distortion of shape and size: lines may appear fragmented, wavy or bent, shapes altered, and objects too small (micropsia) or too large (macropsia). Motion is induced or altered, appearing too fast or too slow. Palinopsias would also fall into this category — images persist after the objects that have generated them cease to be present, or multiple copies of a single real object may be seen (polyopsia). Finally, vision can be lost, blurred or dimmed, either in local regions, in an entire hemifield, or through gradual constriction of the whole field of view (tunnel vision). In some cases, visual loss is accompanied by ‘filling in’ and the subject is not aware of the loss except under specialized testing; in other instances, normal vision may appear to be actively occluded by a ‘cloud’ of darkness or of visual noise, like static on a television screen. While these different classes of hallucination in no way tie visual awareness to a particular level of the visual pathway, they do suggest that the generation of intrinsically patterned visual stimulation may occur at many levels, and that the percepts generated reflect the level at which this patterning occurs. In the context of the well-known distinction between dorsal (where) and ventral (what) visual pathways (Ungerleider and
Mishkin, 1982), it is tempting to relate elementary positive hallucinations to early cortical (or possibly even pre-cortical) levels, complex hallucinations to later stages of the ventral pathway, and illusions and distortions predominantly to the dorsal pathway (Santhouse et al., 2000). At this point in time, this can at best be considered a working hypothesis.
Migraine visual auras
Background and phenomenology
Visual auras as a premonitory symptom of migraine headache have been recognized for well over a hundred years in the medical literature (Liveing, 1873); allusions to hallucinations which could be migraine aura go back to antiquity (Sacks, 1992). According to the widely used International Headache Society (IHS) classification system for migraine (Classification Committee, 1988), an aura is a totally reversible neurological disturbance which precedes the headache by up to an hour and persists for up to 60 min. Migraine aura may also occur alone, with no subsequent headache. While auras may involve other sensory systems, the motor system, and central functions such as language reception and production, visual auras are by far the most common type (Russell and Olesen, 1996). Furthermore, when more than one type of aura occurs within an episode, the visual aura usually occurs first (Olesen, 1993), suggesting that something about the visual system — its cytoarchitecture, vasculature, or pattern of stimulation — predisposes it to this form of hallucinatory activity. The common migraine conditions (migraine with and without aura) occur in more than 10% of the adult population (approximately 5–8% of men and 15–20% of women) (Russell et al., 1995; Lipton et al., 2001). Between 10 and 40% of these individuals experience auras in at least some episodes. Little epidemiological information exists about aura without headache. The best current estimate is that approximately 1% of the population experiences these episodes (Russell et al., 1995), but this is likely to be a significant underestimate. Thus, apart from dreams, migraine auras are the most common form of visual hallucination.
Because of their prolonged duration, and because they precede the headache in those cases in which headache occurs, migraine auras are particularly amenable to observation and description. Not surprisingly, scientists and physicians who were themselves affected by migraine have been responsible for introducing these descriptions to the medical literature. Among the more famous cases of self-report are papers by Airy (1870), Jolly (1902), Lashley (1941), Alvarez (1960) and Gru¨sser (1995). There are, in addition, numerous clinical reports (e.g., Klee and Willanger, 1966; Fisher, 1980) which generally confirm the major points of the self-reports. Finally, a small number of prospective studies have been conducted in which groups of subjects have been trained to record specific aspects of their auras as they progress (Hare, 1973; Russell et al., 1994; Wilkinson et al., 1999; Crotogino et al., 2001). The following discussion reflects the findings of all three sources. The cardinal feature of visual migraine aura is that it typically consists of simple positive and/or negative (scotoma) components. The most common reports are of visual phosphenes and of simple geometric patterns. The positive components are always described as very bright, usually white or silver but colored in some instances, and as scintillating, flashing, or flickering. A second striking property of many, but not all, visual auras is that they spread in a stereotyped pattern across the visual field. The most familiar descriptions of visual auras are of the form known as fortification spectra. The self-reports of Airy (1870), Lashley (1941) and Gru¨sser (1995), and the detailed analysis by Richards (1971) of his subject's auras are all of this type. As illustrated in Fig. 1, such an aura typically consists of bright line elements, oriented at angles to one another, forming an arc. Generally the aura begins near the fovea in one hemifield and spreads out toward the periphery of the same field, respecting the vertical meridian and eventually disappearing in the far periphery. This pattern suggests a spread of neural activation across some retinotopically organized brain region, a point first made by Lashley (1941). A region of blindness or scotoma usually follows in the wake of the positive arc and persists for several minutes. The scotoma may pass unnoticed by the subject because of 'filling in', but can be demonstrated with careful probing of the visual field. The average duration of such an aura is
about 20–25 min (Hare, 1973). In our laboratory we have used a matching technique to estimate the rate at which the elements of fortification spectra flash or scintillate. Migraineurs adjusted the flicker rate of a light-emitting diode viewed in one hemifield to match the aura activity in the other hemifield. Our data (Crotogino et al., 2001) demonstrate that individual subjects experience quite consistent flicker rates across episodes. Across subjects there is greater variability, but the majority reported rates in the range of 15–20 Hz. Ongoing work in our laboratory applies a similar matching approach to examine other features of auras as they progress. The majority of positive migraine auras fall into the elementary hallucination category. Strikingly, there are no reports of complex objects or scenes in cases of migraine aura meeting strict IHS criteria and uncomplicated by epilepsy. Ictal blindness (scotoma) may accompany the positive features or may exist alone in a variety of forms including complete hemianopia and tunnel vision. Much less frequent are auras that would be classified as illusions or distortions. For example, the appearance of shimmering or 'heat waves' in the visual scene has been described both by our subjects and elsewhere in the literature. Straight lines may appear broken, and objects fractured (Podoll and Robinson, 2000). Micropsia and macropsia (Alice in Wonderland syndrome; Todd, 1955) have been reported, and Sacks (1992) describes cases reporting mosaic vision — the breaking up of the visual image into crystal-like facets — and cinematic vision, which he describes as loss of smooth movement. Both these manifestations are quite rare.
Fig. 1. Fortification type migraine aura drawn at successive times (in minutes) after onset. Dark lines represent bright components. From Lashley (1941).
Neural substrate of migraine auras
A critical clue to understanding the neural mechanisms underlying migraine aura is provided by the rate and pattern of spread of the aura across the visual field (see above). As Lashley first noted, and others have confirmed (Gru¨sser, 1995), the increasing size of the positive elements, and the increasing speed of spread across the visual field with eccentricity, lend themselves to an explanation based on spread of activation at a constant rate across a retinotopically organized but distorted neural representation of visual space. Striate cortex (V1), in which the foveal representation is greatly magnified compared to the periphery (Horton and Hoyt, 1991), could provide such a substrate. Lashley (1941) was the first to estimate the rate of spread to be about 3 mm/min. In the world of neural events, this timescale suggests activation regenerated and spread locally as a result of chemical diffusion rather than synaptic spread. In the same decade that Lashley wrote his landmark paper on auras, Lea˜o (1944, 1947) first described the phenomenon of spreading depression in rabbit neocortex, and in 1958 Milner explicitly proposed a link between the two. Spreading depression is characterized by a wavefront of neuronal activation (Grafstein, 1956) triggered by dramatic local increases in extracellular potassium, followed by profound neuronal depression which persists for several minutes. The local diffusion of extracellular potassium released at the wavefront propagates the wave across the neural structure. There has been considerable controversy as to whether spreading depression, which was originally described in lissencephalic brains, could be generated in human cortex (Gloor, 1986; McLachlan and Girvin, 1994). More recent in vitro studies indicate that it can, at least under abnormal local ionic conditions (e.g., reduced magnesium; Avoli et al., 1991). Whether or not the mechanism is precisely the same as the classic spreading depression of Lea˜o, it is now widely accepted that a ‘spreading depression-like’ process underlies the visual aura in migraine (Hardebo, 1992; Lauritzen, 1994). The wavefront of neural excitation operating on intrinsic cortical networks is presumed to underlie the positive hallucinations and the subsequent neuronal depression, the scotoma. What remains unanswered is what constellation of
abnormalities in the cortical milieu occurs in the course of a migraine episode to enable the triggering of the aura. Among the possibilities we are examining in ongoing computational work modeling aura onset and spread (Wilkinson and Wilson, 2000) are impaired inhibition (Chronicle and Mulleners, 1994), abnormalities in neuromodulatory processes (particularly those mediated by serotonin), and intrinsic hyperexcitability of glutamate pathways, perhaps due to reduced magnesium (Ramadan et al., 1989). The simple nature of most aura features suggests an early locus in the visual pathway. In the case of the fortification type of aura, which has received closest attention, cortical area V1 seems the most likely locus based on three properties. The restriction to a single hemifield with a sharp cut-off at the vertical meridian points to a post-chiasmal origin. Secondly, the fact that the elements are described as short line segments — elements with clear orientation — suggests the involvement of orientation tuned cells (Hubel and Wiesel, 1968) which effectively rules out a pre-cortical locus such as the LGN. Finally, the fact that most auras appear continuous across the horizontal midline is most easily explained if they originate in an area containing a topographic representation of the visual field which is continuous across the horizontal meridian. Such a representation is found in human V1 (Holmes, 1917). In contrast, V2 is split along the horizontal meridian with the superior visual field representation lying below the calcarine fissure, and the inferior field representation lying above the fissure. Of the many visual cortical representations described in the primate cortex (Van Essen et al., 1992), thus far the only area other than V1 to show continuous mapping is area V3a, a motion-related area in the dorsal pathway (Tootell et al., 1997). In our own work, we have projected maps of auras, drawn by subjects with fixation and viewing distance controlled, into V1 coordinates (Fig. 2) and have confirmed a rate of spread of approximately 2 mm/min, well within the range described in the literature on spreading depression (Wilkinson et al., 1999). One would expect that so spectacular and prolonged a neural disturbance as the visual aura should yield a striking signature in brain imaging studies, particularly fMRI (functional magnetic resonance imaging). However, the difficulty in capturing this
phenomenon has been its unpredictability. Since auras generally come with little advance warning, and since their mean duration is less than 25 minutes, the task of imaging an aura, particularly its early phase, is daunting. One possible route would be to trigger the aura, if indeed an effective trigger is known. This approach has, in fact, been taken in a landmark study by Hadjikhani et al. (2001). One subject was able to induce his own auras by exercise; his auras were recorded from onset. Two other individuals worked on site, and hence could begin scanning within 15–20 min of symptom onset. In the striking illustrations in this report, changes in the BOLD (blood oxygenation level-dependent) signal spread across the posterior cortex affecting multiple cortical areas simultaneously, impinging on foveal representations first and moving forward along the medial aspect of the hemisphere to affect the peripheral visual representations later. In the subject who was able to induce auras allowing the capture of the very beginning of the cortical changes, the first signs of the BOLD change were reported, rather surprisingly, to be in area V3a. However, it should be noted that the phenomenology of this subject’s auras was not the fortification pattern described above; rather a crescent-shaped cloud of white noise (‘television snow’) moved out across the visual field, followed in its wake by a region of scotoma. The authors speculate that while the aura in this patient might reflect V3a processing, other
auras might begin in other topographic visual regions. The fact that all posterior cortex including several retinotopically organized visual areas showed simultaneous activation during much of the period of the aura in this study makes interpretation of the fMRI data difficult. At least two alternatives emerge. Either the apparent wave of activation indicates that the spreading depression process engulfs all of posterior cortex, but that only a subset of this activation results in visual awareness, or alternatively, the activation directly due to spreading depression is much more limited in extent, and the rest of the spreading activation in adjacent cortical areas represents synaptic activation through feed-forward and feedback circuitry. fMRI is unlikely to allow us to distinguish between these alternatives; however, investigations of spreading depression in animal models using a combination of electrophysiological (Morlock et al., 1964) and imaging techniques (Basarsky et al., 1998; James et al., 1999) may provide one route to resolving this issue.
Fig. 2. Cortical mapping of a single migraine aura. The map represents a mathematical description of unfolded left striate cortex based on Horton and Hoyt (1991). Inferior fields are represented above the horizontal midline, superior fields below. Heavy black lines represent the projections of the subject's drawings of his aura at approximately five-minute intervals, numbered in temporal sequence.
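To make the projection underlying Fig. 2 concrete, the following is a minimal schematic sketch, in Python, of how an aura's leading edge can be mapped onto V1 and a spread rate derived. It assumes the Horton and Hoyt (1991) approximation of human V1 magnification, M(E) ≈ 17.3/(E + 0.75) mm/deg; the function name, eccentricities and timings are hypothetical, and the code is not the mapping procedure actually used in our laboratory.

import numpy as np

def cortical_distance_mm(ecc_deg, a=17.3, e2=0.75):
    # Distance (mm) from the foveal representation along V1, obtained by
    # integrating the magnification M(E) = a / (E + e2) from 0 to E.
    return a * np.log(1.0 + np.asarray(ecc_deg, dtype=float) / e2)

# Hypothetical leading-edge eccentricities (deg) of a fortification arc,
# reported at successive times (min) after aura onset.
times_min = np.array([0.0, 10.0, 20.0])
ecc_deg = np.array([2.0, 10.0, 30.0])

d_mm = cortical_distance_mm(ecc_deg)
rate_mm_per_min = (d_mm[-1] - d_mm[0]) / (times_min[-1] - times_min[0])
print(f"cortical positions (mm): {np.round(d_mm, 1)}")      # approx. 22.5, 46.1, 64.2
print(f"estimated spread rate: {rate_mm_per_min:.1f} mm/min")  # approx. 2.1

With these invented values the estimate comes out near 2 mm/min, in line with the figure quoted above, but the numbers serve only to illustrate the computation.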
Visual epilepsies
Background and phenomenology
Epilepsies are classified both by phenomenology and by presumed or established locus (usually in terms of cerebral lobes). The introduction of the term aura (from the Greek for 'breeze') to describe the initial warning stage of an epileptic episode is attributed to Galen (Penfield and Kristiansen, 1951). In their monograph on epileptic seizure patterns, Penfield and Kristiansen (1951, p. 3) begin with the claim that 'the initial phenomenon in an epileptic seizure is usually the most important clue to the localization of the original focus of epileptic discharge in the brain'. However, the link between phenomenology and structure is not always as close as one might expect (Fried et al., 1995; Bien et al., 2000), so in the present discussion we will start from phenomenology without making initial inferences about localization. In the present context, 'visual epilepsy' will include all forms of epilepsy in which visual aura is the initial or only ictal manifestation. Some visual epilepsies progress to include such well recognized seizure types
as complex partial seizures and generalized tonic– clonic seizures. However, others — including some forms of benign childhood occipital epilepsy — have only visual manifestations and must be differentiated from other disorders such as migraine on the basis of EEG criteria and the specific properties of the aura (Andermann and Zifkin, 1998; Panayiotopoulos, 1999a). Photosensitive epilepsy will not be included in our discussion because visual awareness is not part of the ictal process despite the fact that visual stimulation precipitates the ictal events. Epilepsies with visual manifestations are rather rare: of the 259 cases of focal epilepsy described by Penfield & Kristiansen, only 11 were classified as having visual auras; an additional 18 had ‘psychic hallucinations and/or illusions’ some of which would fit the categories of complex visual hallucinations and illusions/distortions as defined earlier. A similar small percentage of visual epilepsies have been described in other surgical populations (Williamson et al., 1992; Fried et al., 1995). Two characteristics of epileptic visual auras have limited their accessibility — they are generally very brief, lasting seconds rather than minutes and they are frequently followed by other, more debilitating, seizure manifestations which preclude immediate recording of visual impressions. Therefore reports in the literature are all retrospective, and most consist of verbal description, often as summarized by the recording physicians in case notes that reflect preconceived categories. One recent exception is the work of Panayiotopoulos (1994, 1999a) who had patients provide drawings of their auras. Despite the retrospective nature of all these reports, similarity across studies suggests that there are commonalities in the properties of epileptic hallucinations compared to hallucinations of other origin. Elementary hallucinations are very frequent in visual epilepsies. These typically have the appearance of dots, spots or disks and are most frequently multiple and colored. They are variously described as stars, balls of light, halos, streaks and very rarely as a zig–zag (Gowers, 1895; Penfield and Kristiansen, 1951; Salanova et al., 1992; Williamson et al., 1992; Sveinbjornsdottir and Duncan, 1993; Kuzniecky et al., 1997; Bien et al., 2000). When there is a pre-existing field defect, as is the case in many of the post-traumatic (gun-shot wound) epilepsies
described by Ritchie Russell and Whitty (1955), positive visual hallucinations generally occur in the region of the field defect. In idiopathic forms of occipital lobe epilepsy, in which the fields are intact, the hallucinated spots overlie but do not totally replace the external visual scene. The elementary components may be stationary and long-lasting, but more commonly flicker, pulsate, twinkle or move. Several patterns of motion (translational, rotary, expansion, contraction and random) have been described (Gowers, 1895; Penfield and Kristiansen, 1951; Panayiotopoulos, 1999b). Like elementary hallucinations, complex hallucinations may exist in an already blind hemifield (Lance, 1976) — in which case their relationship to Bonnet syndrome, which we address later in this chapter, must be considered. However, they may also cover part or all of the normally functioning portion of the field. Faces are particularly common among objects seen; animals and human figures are also reported. Interestingly, in some instances both faces and whole creatures have grotesque or distorted form (Gowers, 1895; Panayiotopoulos, 1999b) and do not represent specific familiar individuals, whereas in other cases they appear normal and are sometimes recognized as persons known to the subject. A particularly striking report is of seeing oneself in mirror image, mimicking one’s own movements (Ritchie Russell and Whitty, 1955, case 23, p. 92). Complex visual hallucinations are typically, but not always colored appropriately. Their size may be normal, but they may also be tiny or larger than life. Epilepsies arising from more anterior temporal and limbic structures are more likely to involve whole scenes, to entail auditory as well as visual components and to emotionally involve the subject (Penfield and Kristiansen, 1951). Illusions and distortions of size, shape and color have been reported in occipital and temporal-occipital epilepsies; palinopsias have also been described (Ritchie Russell and Whitty, 1955; Bien et al., 2000; for review, see Sveinbjornsdottir and Duncan, 1993). However, like complex hallucinations, illusions and distortions are less common in epilepsy than are elementary hallucinations. Loss of vision without loss of consciousness or blurring or dimming of vision has also been reported, associated particularly with seizures in children
(Panayiotopoulos, 1999a), but seen also in epilepsy of traumatic origin (Ritchie Russell and Whitty, 1955). In some cases, the extinction of vision is followed by elementary hallucinations. In addition to total visual loss, there are reports of transient loss of specific abilities such as recognition of faces (prosopagnosia; Agnetti et al., 1978) and objects (Ritchie Russell and Whitty, 1955, case 34). One might speculate that ictal blindness and specific functional losses arise through similar mechanisms operating in different cortical regions.
Neural basis of epileptic visual auras
The most common causes of visual epilepsies in adults are tumours and traumatic damage. In children, some occipital epilepsies are now recognized as arising from developmental malformations which have been revealed by modern structural MRI technology (Panayiotopoulos, 1993; Andermann and Zifkin, 1998). Epileptic visual auras are widely assumed to reflect irritative activation of neural circuits caused by abnormalities in the tissue at and around the site of the lesion or malformation. By combining intracranial electrical stimulation and recording during surgery, Penfield and Rasmussen (1957) demonstrated that stimulation of electrophysiologically abnormal cortical regions often elicits the characteristic aura of the patient's seizure episodes, providing strong support for this hypothesis. The underlying basis of ictal blindness in epilepsy has received less discussion. It has been suggested that the visual loss represents active inhibition of cortical visual processing at some level (Ritchie Russell and Whitty, 1955). An alternative possibility would be that spontaneous activity in certain visual areas, or activity that does not settle into an organized pattern, actively interferes with the processing of external visual input without giving rise to coherent percepts. Due to the brevity of the visual auras, and to the subsequent seizures suffered by many individuals, visual epilepsy is not amenable to investigation with functional imaging. However, visual epilepsy does provide an alternative route to examining hallucinations not afforded by either migraine or Charles Bonnet syndrome. Because epileptic symptoms and
their underlying cause (e.g., tumour) are often progressive, neurosurgery is a frequent treatment option. Great advances have been made in intracranial electrophysiological techniques since Penfield’s pioneering work, and today the use of chronically implanted micro- and macro-electrodes as a route to localizing and characterizing the epileptic focus is considered ethically justified. This route, when carefully used, provides valuable insights into the relationship between lesion, electrophysiological activity and conscious experience (aura) on a temporal scale not available with even the best non-invasive imaging technologies. An excellent example of this approach is a case study by Babb et al. (1981) in which activity in both posterior cortex (medial and lateral surfaces) and the hippocampus was monitored with implanted electrodes on a 24 hour basis along with video monitoring of attacks. The data in this report show clear evidence of initial activation at some medial occipital sites slightly preceding and concurrent with reports of the patient’s typical visual aura, which in this case was flashing colored ‘butterfly’ lights, lasting 10–20 s before loss of consciousness and onset of psychomotor automatisms. The eight hippocampal sites monitored showed changes to epileptiform activity only at the onset of the seizure. These findings provide strong support for the argument that the aura in this case was generated endogenously in the occipital lobe and not by back-propagation from limbic regions. An astrocytoma was surgically removed from the lateral occipital/posterior temporal cortex, successfully eliminating seizures, but not the visual auras which continued to occur. The more recent development of large strips of electrodes which can be implanted subdurally for diagnostic purposes prior to surgery (e.g., Allison et al., 1999) has great promise as a route to localizing and investigating the dynamics of the neural activation associated with epileptic visual auras.
Charles Bonnet syndrome
Background and phenomenology
The final type of hallucination to be considered is that associated with loss of visual input to the brain
in the absence of dementia or other psychological disorder. First described by Charles Bonnet (Bonnet, 1769; De Morsier, 1967) and now known as Charles Bonnet syndrome (CBS), this condition occurs most often in the elderly as vision dims through cataract, glaucoma or age-related macular degeneration. However, CBS has been reported at all ages including in childhood following visual loss due to ocular or central pathology (Ko¨lmel, 1985; Lepore, 1990; Schultz and Melzack, 1991; Teunisse et al., 1996; Mewasingh et al., 2002). Two major definitional problems exist in the literature on CBS. The first concerns etiology — generally both peripheral and central visual loss are considered acceptable preconditions for CBS, and often the nature of the visual loss is not clearly specified. Secondly, the hallucinations described by Bonnet were complex, and the occurrence of complex hallucinations has been taken as a defining feature of CBS by some authors. However, when a broader definition of any visual hallucinations in the presence of low vision is used, it becomes clear that a full spectrum of hallucination types may occur, sometimes in the same individual, following visual loss. In the present review, this broader definition of the visual phenomenology of CBS will be employed. The other criteria for CBS diagnosis are (1) the absence of dementia, (2) the hallucinations are exclusively visual, (3) the hallucinations occur in a state of conscious awareness, coexisting with normal ongoing perception and (4) the individual is not deceived by the hallucination (except perhaps briefly at their onset) (Schultz and Melzack, 1991; Teunisse et al., 1996). The prevalence of hallucinations after visual loss is unknown. It is highly likely that CBS, especially in the elderly, is under-reported. Patients have fears about their sanity, particularly with the heightened awareness of Alzheimer’s and other dementias of the aged. Even when they have no personal doubts about the integrity of their mental state, the fear that family and/or medical personnel might judge them mentally incompetent, were they aware of the hallucinations, doubtless contributes to a reluctance to discuss these symptoms. There appears to be good reason for this: Teunisse et al. (1996), in a study of 60 CBS cases, found that only one of the 16 who had consulted physicians had been correctly diagnosed. It has been suggested that CBS may occur in as many as 10–30%
of elderly patients with low vision of peripheral etiology (Schultz and Melzack, 1991). One difference between CBS and epileptic auras is that the hallucinations of CBS are generally much more prolonged, and because they are not anticipatory signals of an impending aversive seizure, it is easier to evaluate their intrinsic emotional content. In general, the complex hallucinations of CBS, while sometimes amusing and sometimes disturbing, do not appear to represent emotionally laden perceptual events (Schultz and Melzack, 1991; Teunisse et al., 1996; Santhouse et al., 2000). The objects and scenes themselves do not generally have emotional valence for the subjects — faces are rarely recognized as individuals known to the subject, for example, and if they are recognized, there is little evidence that they have been invoked by their emotional loading. While the literature on CBS emphasizes complex hallucinations, it is clear that the full range of hallucination types may occur following visual loss. Unfortunately, the studies that provide the largest samples and the best descriptions of hallucination phenomenology do not always detail the nature of visual loss. In the following, we focus on a representative set of studies in which the peripheral (retina/optic nerve) or central origin of the visual loss is clearly identified. Santhouse et al. (2000; see also ffytche and Howard (1999) for further descriptions of a largely overlapping sample) presents data from 39 cases of peripheral visual loss, most with macular degeneration, glaucoma, cataract or a combination of these. Despite the fact that complex hallucinations are emphasized, the numbers reported indicate that elementary hallucinations formed the most frequent group. These included flashes, simple shapes, and ‘tesselopsias’ or repeating simple grid-like patterns, typically coloured, and generally seen in the center of the visual field. Of the complex hallucinations described, the most common were faces, followed by figures, branching patterns such as road maps, and vehicles. Faces were very frequently described as distorted, cartoon or caricature like, and ugly. Finally, illusions and distortions were also reported, including perseveration, polyopia and palinopsia, micropsia and macropsia and abnormally intense colours. A factor analysis performed on the data identified three factors which roughly equate to (1)
figures and landscapes, (2) faces and (3) visual distortion. Teunisse and colleagues studied 60 individuals with CBS (Teunisse et al., 1996). Again, most were seniors (average age: 75.4 years) and, again, most visual loss was due to macular degeneration, diabetic retinopathy, glaucoma or corneal disease. This study reports only complex hallucinations. No mention is made of elementary hallucinations, so one cannot tell whether these did not occur or whether the interview procedure simply did not elicit them. The complex hallucinations most commonly included people (whole figures, faces, groups, and miniature people), animals, trees and plants, scenes and a wide range of inanimate objects. Often the hallucinated object fitted into the real scene (e.g., a hallucinated figure sitting on a real chair). Colour was a very common feature, reported as always present by 63% of cases and as sometimes present by 10%. Finally, another well-documented aspect of the hallucinations experienced by individuals with peripheral visual loss is that they are of higher visual quality (more acute in detail, more intensely coloured) than their residual vision. When one moves to central visual loss (defined as originating at the lateral geniculate nucleus or more centrally), the mixture of etiologies becomes more worrisome because of the possible overlap with epilepsy. Moreover, the visual situation changes since most commonly these individuals have hemianopia or a smaller field defect, with normal vision in the intact parts of the field rather than the globally reduced visual function seen with peripheral disorders. Lance (1976) and Kölmel (1985) provide reviews of the older literature and striking descriptions of their own patients’ hallucinations. Both reports focus on occipital lobe damage and both point to the difficulties of differentiating cases in which an epileptic focus may be present from cases in which there are apparently no irritative sources of activation. The focus of both studies is on complex hallucinations, although both elementary hallucinations and illusions are mentioned in individual cases. However, in a separate report, Kölmel (1984) describes a subgroup from the same hemianopic patient population who experienced vividly coloured geometric patterns either immediately prior to or in the days following the onset of their field defects. Lastly, Vaphiades et al.
(1996) prospectively examined 32 patients with retro-chiasmal ischemic infarction and found hallucinations (or, in their terminology, ‘positive spontaneous visual phenomena’) in 13 (41%), covering the range of elementary and complex hallucinations and palinopsia. As with descriptions of both epilepsy and CBS of peripheral origin, the most frequently reported complex hallucinations in these studies are of human figures, animals, faces and body parts (particularly hands). Figures frequently appeared in groups — rows of similar or identical people, typically with unrecognizable faces. In some cases, the same images occurred repeatedly; in others, a wide range of forms was seen. In general, the hallucinations of CBS do not appear to reflect visual memories, although of course one cannot prove the negative — an object not recalled still may have been seen in the past. Comparison of studies of peripheral and central visual loss leads to the conclusion that the nature of the hallucinations experienced is not closely tied to the site or etiology of the visual loss. This conclusion is further supported by Lepore (1990), who examined a group of 104 consecutive cases with visual loss, some of central and some of peripheral origin. No relationship was found between the site of visual loss and either the incidence of hallucinations or the type (elementary versus complex). Of the 59 individuals with hallucinations, 63% had only elementary components, 10% had only complex components and 27% had both. In the complex category, again human figures ranked first in frequency, followed by faces and body parts, animals (including insects), and other objects including vehicles and clothing. Probably because of the variety of levels and etiologies involved, the literature provides tantalizing suggestions but no consistent account of either the factors triggering such hallucinations, the conditions necessary and sufficient to sustain them, or effective methods of their suppression. In the central cases discussed by Kölmel (1985), onset was typically within a few days of the event causing the hemianopia, and frequency generally decreased over time. Among triggers or predisposing conditions, individuals mention low stimulation, fatigue and emotional stress (Schultz and Melzack, 1991; Teunisse et al., 1996). Some cases of CBS following
peripheral visual loss report hallucinations only when the eyes are open and others, only when the eyes are shut (Schultz et al., 1996; Teunisse et al., 1996). Kölmel (1985) makes the strong claim that the hallucinations experienced by his patients were terminated by saccades, but not by pursuit eye movements. A subset of patients studied by Vaphiades et al. (1996) also reported that saccades, in this case toward the normal field, eliminated the hallucinations. The explanation for these intriguing reports is unclear. Possibly the mechanism mediating saccadic suppression in normal vision is sufficiently strong to disrupt the endogenous activity of the hallucination, a notion first proposed by Kölmel (1985). However, saccadic suppression acts most strongly to suppress visual motion, suggesting effects predominantly on the magnocellular pathway and the dorsal cortical stream, although there is as yet no general agreement about the level at which this occurs (see Ross et al., 2001, for review). The hallucinations of CBS, particularly the complex hallucinations, seem more likely to arise from the ventral stream (ffytche et al., 1998; see below).
Neural mechanisms underlying CBS hallucinations
In attempting to explain the hallucinations of CBS, frequent reference has been made to ‘release phenomena’ (Cogan, 1973; Lepore, 1990; Schultz and Melzack, 1991; Vaphiades et al., 1996). The basic notion is that when input from the environment is eliminated or severely reduced, central neural mechanisms become spontaneously active reflecting either circuitry laid down through previous memory consolidation or novel constructions of central networks. Schultz and Melzack (1991) have drawn an analogy to the phantom limbs experienced by many amputees. Phenomenologically, it is interesting to review the findings of the McGill sensory deprivation studies from the 1950s in this context. Noting that subjects of prolonged deprivation studies reported hallucinations during deprivation and visual distortions upon returning to a normal visual environment, Heron et al. (1956) subjected themselves to sensory deprivation (visual, auditory and somatosensory). Within a day of deprivation onset, all experienced simple hallucinations (rows
of dots, geometric patterns, mosaics). Over time hallucinations became complex: scenery, people, ‘bizarre architecture’. Polyopsia — identical small geometric forms, figures or plants arrayed in symmetric rows — was frequent in the early stages. The similarity to CBS hallucinations is evident. Numerous suggestions have been made as to how deprivation-induced ‘release’ might work (see Schultz and Melzack, 1991, for review): removal of cortical inhibition normally maintained by sensory input, activation of central networks by ascending reticular arousal mechanisms, hyper-excitability of deafferented central mechanisms due to local changes in membrane properties. However, to date, none of these speculations have been developed into rigorous models to explain complex phenomena such as hallucinations. As to why certain classes of complex forms (e.g., faces) seem to occur particularly frequently, Lance (1976) suggested that this may reflect the amount of cortical tissue devoted to representing these familiar categories in much the same way that the hands and face, which are greatly over-represented in somatosensory and motor maps, tend to be the site of onset of sensorimotor seizures. While Lance was alluding to local regions of cortex, the greater emphasis today on distributed networks in neural processing might lead to a restatement of this idea to suggest that form categories which involve networks distributed most broadly within the cortex might be most likely to be activated endogenously. It is somewhat surprising that to date only two studies have applied modern imaging techniques to studying the hallucinations of CBS. ffytche et al. (1998) used fMRI to study eight patients with CBS, four of whom actually had spontaneous hallucinations in the scanner. A striking correspondence between hallucination content and activation locus is reported: colour with posterior fusiform area, faces with left middle fusiform area, objects with right middle fusiform and textures with the collateral sulcus. In a second part of the study, the difference between activation in the presence and absence of an external stimulus was found to be greater in nonhallucinators than in hallucinators. The interpretation proposed by the authors is that activity underlying hallucinations reduced the difference between stimulus and non-stimulus intervals. This leads to the proposal that the same extrastriate visual areas that
are active during normal perception of faces and other complex forms are participating in the generation of hallucinatory percepts. While interesting, these conclusions await verification in further studies. More recently, Adachi et al. (2000) employed single photon emission computed tomography (SPECT) to examine cerebral blood flow during hallucinations in five individuals with severe eye disease leading to blindness. The SPECT data show regions of hyperperfusion in the lateral temporal cortex, the thalamus and the striatum in the presence of normal levels in the occipital, parietal and frontal lobes. There was some evidence of asymmetry in the hyperperfusion: generally the same side was affected at all three sites, but the side varied across individuals. Unfortunately, the comparison made was relative to ‘normal’; there is no comparison in this study to periods without hallucinations in the affected individuals, so it is difficult to draw solid conclusions. It is clear that with the aging of our population, and the attendant increased incidence of age-related forms of blindness, the number of CBS cases will increase. Furthermore, as recognition of the existence of CBS as a purely visual condition rather than a sign of dementia grows, a greater proportion of cases are likely to self-identify. This should provide opportunities for further investigation employing functional imaging aimed at capturing hallucinations in progress. Meanwhile, much could be learned using a much simpler approach. To date, the entire literature is based on retrospective reports of hallucination phenomenology. While the visual loss may preclude drawings of the sort possible during a migraine aura, or immediately after an epileptic seizure, the prolonged nature of many CBS hallucinations points to fruitful ground for ‘real-time’ verbal descriptions, requiring nothing more complex than a cassette recorder.
Conclusions
What do hallucinations tell us about the organization of the visual system? Descriptions of hallucinations on their own do not go a great distance in illuminating our understanding of the visual brain. However, by comparing these three
Table 1. Summary of common properties of hallucinations based on literature surveyed

| Hallucination type | Property | Migraine | Epilepsy | Charles Bonnet |
|---|---|---|---|---|
| All | Duration | Minutes (5–60) | Seconds | Seconds–hours |
| (1) Elementary | Shape | Line elements, phosphenes | Circular discs, phosphenes | Phosphenes |
| | Chromaticity | Achromatic | Colored | Colored |
| | Temporal properties | Flicker, drift across field but limited element motion | Flicker and motion | Static or moving elements |
| (2) Complex | Most frequent object classes | NA | Figures, faces | Figures, faces |
| | Stereotyped/varied | NA | Stereotyped | Varied |
| | Distorted (grotesque) | NA | Occasional | Frequent |
| | Familiar faces/scenes | NA | Frequent | Infrequent |
| (3) Illusions/distortions | Micropsia/Macropsia | Very rare | Yes | Yes |
| | Palinopsia/Polyopsia | Not reported | Yes | Yes |
| (4) Ictal blindness | As only symptom | Yes | Yes | NA |
| | Preceding or following positive aura | Yes | Yes | NA |

Since this survey is based predominantly on case reports rather than on well-designed prospective studies, relative frequencies should be viewed with caution. NA, not applicable to this condition.
hallucinatory syndromes on several parameters that are well documented for each (see Table 1), a number of striking points emerge that indicate directions for future research. In all three cases, simple hallucinations form the majority in any study in which all types are assessed, despite the fact that emphasis on intriguing and amusing complex hallucinations in many reports gives a different impression. In migraine, oriented components are very common and achromatic percepts dominate, although point-like phosphenes and colour are reported in a moderate proportion of cases. In both epilepsy and CBS, colour appears much more ubiquitous, and spots, flashes and stars are the most common configuration. Thus while the range of elementary hallucinations is coextensive, the highest frequency percept differs, which demands explanation. Complex hallucinations of people and objects have almost never been reported in migraine auras whereas they are common in both other conditions. If one accepts the spreading depression model of auras, either the explanation lies at the level of neuronal membrane and network properties such that spreading depression can be initiated and supported in some but not all cortical territories, or else the spatiotemporal pattern of activation in spreading depression is inconsistent with eliciting coherent percepts from higher visual cortical areas even if these regions
are silently invaded. Compared to CBS, complex epileptic auras more often have a memory component — people and scenes are often recognized from the individual’s past. Faces are frequently reported to be distorted and grotesque in CBS; this also occurs in the epilepsy literature, although it is a less common feature of the published descriptions. Among objects reported in complex hallucinations, faces, figures and body parts seem to exceed all other classes. This may reflect the relative amount or excitability of the cortical circuitry devoted to their encoding. However, retrospective reporting biases on the part of subjects and recording biases on the part of the clinicians documenting the cases may also contribute to this apparent over-emphasis. Clearly, what are needed are more studies encompassing large sample groups, multiple episodes and immediate recording of hallucination details in order to acquire a more accurate picture of the range and relative importance of different hallucination properties. In our experience with migraine, a combination of carefully structured observations and free reports is optimal. On the other hand, the phenomena that do appear in existing reports, particularly those that appear frequently, such as the fortification spectra of migraine, the colored spots of epilepsy and the distorted faces of CBS, should inform our thinking about the organization of neural networks within the
visual system. So, for example, computational models based on normal visual function in cortical neural networks should also be able to generate these characteristic hallucination patterns when subjected to the appropriate conditions. The spatial characteristics of the fortification-type migraine aura are clearly reminiscent of the orientation-selective tuning properties of neurons in V1. In this respect, these auras are similar to many geometric hallucinations which occur with drug use, most vividly described by Klüver in his detailed studies of mescaline (Klüver, 1966). A number of models have attempted to reproduce migraine auras and/or other geometric hallucinations (Ermentrout and Cowan, 1979; Reggia and Montgomery, 1996; Bressloff et al., 2001). Of these, the recent work by Bressloff et al. is most comprehensive, incorporating much of the recent literature on the internal circuitry of striate cortex. While their paper does not address migraine aura explicitly, some of the patterns their network generates are strongly reminiscent of aura components. A second example of hallucinations playing into current thinking about visual cortical organization involves the processing of faces by the brain. The way in which faces are encoded, stored and recognized by the brain has been the focus of a great deal of interest in recent years. Are certain brain regions devoted uniquely to face recognition? Do we build up representations of faces from the thousands we see in our lifetime? Models based on population statistics and principal components analysis have attracted particular attention (Sirovich and Kirby, 1987; Hancock et al., 1996; Wilson et al., 2002). Such a model would argue that the brain can generate an infinite number of face exemplars within an enormous multidimensional face space. The great emphasis on faces in hallucination reports from both epilepsy and CBS suggests large or widespread circuits in the brain supporting face perception. The frequent reports of grotesque faces with distorted or caricatured features would fit well into the framework of multidimensional face spaces, particularly if strength of neural response were the code for strength of contribution of a particular dimension in face-space. Of note is the observation that individual face principal components are highly distorted relative to actual faces, which combine many such components.
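Both of the modelling ideas just mentioned can be made concrete with toy simulations. The two fragments below are minimal, hypothetical sketches written for this discussion, not reconstructions of the published models: the array sizes, parameters and synthetic data are illustrative assumptions only. The first sketch follows the face-space idea, treating each face as a point in a low-dimensional space spanned by principal components, so that novel exemplars, including caricatured ones produced by overdriving a single dimension, can be generated without recalling any stored image. Synthetic random vectors stand in for aligned face images.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_pixels = 200, 32 * 32               # 200 synthetic "faces", each a 32x32 image vector
faces = rng.normal(size=(n_faces, n_pixels))   # placeholder data, not real face images

mean_face = faces.mean(axis=0)
centered = faces - mean_face

# Principal components via SVD: rows of vt are the principal directions ('eigenfaces')
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:20]                           # keep a 20-dimensional face space

coords = centered @ components.T               # coordinates of each face in face space

# Generate a novel exemplar by sampling plausible coordinates, then exaggerate
# one dimension to mimic a caricatured or 'distorted' face
new_coords = rng.normal(scale=coords.std(axis=0))
new_coords[0] *= 5.0
novel_face = mean_face + new_coords @ components
print(novel_face.shape)                        # (1024,): a synthetic image vector
```

The second sketch illustrates, in the spirit of the pattern-forming models cited above (Ermentrout and Cowan, 1979; Bressloff et al., 2001), how a sheet of units with short-range excitation and longer-range inhibition can settle into spatially periodic activity from a near-uniform starting state, with the stripe spacing set by the connectivity. Again, the parameters are arbitrary choices for illustration.

```python
import numpy as np

n = 256                                        # cortical units arranged on a ring
idx = np.arange(n)
dist = np.minimum(np.abs(idx[:, None] - idx[None, :]),
                  n - np.abs(idx[:, None] - idx[None, :]))

# 'Mexican hat' connectivity: short-range excitation minus longer-range inhibition
w = np.exp(-dist**2 / (2 * 2.0**2)) - 0.5 * np.exp(-dist**2 / (2 * 6.0**2))

gain, dt = 0.5, 0.1
rng = np.random.default_rng(1)
a = 0.01 * rng.standard_normal(n)              # near-silent state plus a little noise

# Euler integration of da/dt = -a + gain * W tanh(a)
for _ in range(500):
    a = a + dt * (-a + gain * (w @ np.tanh(a)))

# Activity settles into a spatially periodic pattern; print it as a crude strip chart
print(''.join('#' if v > 0 else '.' for v in a))
```

Neither sketch is offered as a model of hallucination per se; they simply show, under the stated assumptions, the kind of machinery the cited proposals invoke.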
What do hallucinations tell us about visual awareness?
Hallucinations tell us that visual awareness is an intrinsic property of the visual brain, normally structured by input from the external environment but not dependent upon that input. In the past, comparisons have been drawn between externally driven vision and endogenously produced visual imagery (Kosslyn et al., 1999). However, imagery, no matter how vivid, lacks that quality of awareness alluded to in the introduction — the sense of ‘watching the world’. Hallucinations unroll automatically, generally uninfluenced by cognition. Even in the three conditions discussed in this chapter, in which mental status is unclouded by disease or drugs, the hallucinator is often initially fooled by these endogenous visions and only distinguishes illusion from external stimulation by logical reality testing (‘if it moves with my eyes, it must be in my head’; ‘people aren’t that small, so I must be seeing things’ etc.). The brain shows self-organization: activity produced by an irritative focus (epilepsy), by release or hypersensitization following deprivation of external input (CBS), and by local ionic imbalances in the extracellular medium (migraine) in each case gives rise to coherent percepts rather than random noise. Each of the three types of hallucination considered here offers unique opportunities to explore the problem of visual awareness: epilepsy through electrophysiology, migraine and CBS through careful documentation of the visual phenomena as they occur, ideally in conjunction with functional brain imaging. Such data, carefully collected and wisely interpreted, will surely provide new insights into the neural foundations of visual awareness.
Abbreviations
BOLD: blood oxygenation level-dependent
CBS: Charles Bonnet syndrome
fMRI: functional magnetic resonance imaging
IHS: International Headache Society
SPECT: single photon emission computed tomography
Acknowledgments
The support of the Canadian Institutes of Health Research and of the Krembil Foundation is gratefully acknowledged. The author would also like to thank Dr. H.R. Wilson for valuable discussions.
References
Adachi, N., Watanabe, T., Matsuda, H. and Onuma, T. (2000) Hyperperfusion in the lateral temporal cortex, the striatum and the thalamus during complex visual hallucinations: Single photon emission computed tomography findings in patients with Charles Bonnet Syndrome. Psychiatry Clin. Neurosci., 54: 157–162.
Agnetti, V., Carreras, M., Pinna, L. and Rosti, G. (1978) Ictal prosopagnosia and epileptogenic damage of the dominant hemisphere. A case history. Cortex, 14: 50–57.
Airy, H. (1870) On a distinct form of transient hemiopsia. Philos. Trans. R. Soc. Lond., 160: 247–264.
Allison, T., Puce, A., Spencer, D.D. and McCarthy, G. (1999) Electrophysiological studies of human face perception. I: Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cereb. Cortex, 9: 415–430.
Alvarez, W.C. (1960) The migrainous scotoma as studied in 618 persons. Am. J. Ophthalmol., 49: 489–504.
Andermann, F. and Zifkin, B. (1998) The benign occipital epilepsies of childhood: an overview of the idiopathic syndromes and of the relationship to migraine. Epilepsia, 39: S9–S23.
Avoli, M., Drapeau, C., Louvel, J., Pumain, R., Olivier, A. and Villemure, J.-G. (1991) Epileptiform activity induced by low extracellular magnesium in the human cortex maintained in vitro. Ann. Neurol., 30: 589–596.
Babb, T.H., Halgren, E., Wilson, C., Engel, J. and Crandall, P. (1981) Neuronal firing patterns during the spread of an occipital lobe seizure to the temporal lobes in man. Electroencephalogr. Clin. Neurophysiol., 51: 104–107.
Basarsky, T.A., Duffy, S., Andrew, R.D. and MacVicar, B.A. (1998) Imaging spreading depression and associated calcium dynamics in brain slices. J. Neurosci., 18: 7189–7199.
Bien, C.G., Benninger, F.O., Urbach, H., Schramm, J., Kurthen, M. and Elger, C.E. (2000) Localizing value of epileptic visual auras. Brain, 123: 244–253.
Bonnet, C. (1769) Essai Analytique sur les Facultés de l’Âme, 2nd Ed. Cl. Philibert, Copenhagen, pp. 176–178.
Bressloff, P.C., Cowan, J.D., Golubitsky, M., Thomas, P.J. and Wiener, M.C. (2001) Geometric visual hallucinations, Euclidean symmetry and the functional architecture of striate cortex. Philos. Trans. R. Soc. Lond. B Biol. Sci., 356: 299–330.
Chronicle, E. and Mulleners, W. (1994) Might migraine damage the brain? Cephalalgia, 14: 415–418.
Cogan, D.G. (1973) Visual hallucinations as release phenomena. Graefes Arch. Clin. Exp. Ophthalmol., 188: 139–150.
Cowey, A. and Stoerig, P. (1991) The neurobiology of blindsight. Trends Neurosci., 14: 140–145.
Crotogino, J., Feindel, A. and Wilkinson, F. (2001) Perceived flicker frequency of scintillating migraine auras. Headache, 41: 40–48.
De Morsier, G. (1967) Le syndrome de Charles Bonnet: hallucinations visuelles sans déficience mentale. Ann. Med. Psychol. (Paris), 125: 677–702.
Ermentrout, G.B. and Cowan, J.D. (1979) A mathematical theory of visual hallucination patterns. Biol. Cybern., 34: 137–150.
ffytche, D.H. and Howard, R.J. (1999) The perceptual consequences of visual loss: ‘positive’ pathologies of vision. Brain, 122: 1247–1260.
ffytche, D.H., Howard, R.J., Brammer, M.J., David, A., Woodruff, P. and Williams, S. (1998) The anatomy of conscious vision: an fMRI study of visual hallucinations. Nat. Neurosci., 1: 738–742.
Fisher, C.M. (1980) Late-life migraine accompaniments as a cause of unexplained transient ischemic attacks. Can. J. Neurol. Sci., 7: 9–17.
Fried, I., Spencer, D.D. and Spencer, S.S. (1995) The anatomy of epileptic auras: focal pathology and surgical outcome. J. Neurosurg., 83: 60–66.
Gloor, P. (1986) Migraine and regional cerebral blood flow. Trends Neurosci., 9: 21.
Gowers, W.R. (1895) Subjective visual sensations. Trans. Ophthalmol. Soc. UK, 15: 1–38.
Grafstein, B. (1956) Mechanism of spreading cortical depression. J. Neurophysiol., 19: 155–171.
Grüsser, O.-J. (1995) Migraine phosphenes and the retinocortical magnification factor. Vision Res., 35: 1125–1134.
Grüsser, O.-J. and Landis, T. (1991) Visual agnosias and other disturbances of visual perception and cognition. Vision and Visual Dysfunction, Vol. 12. CRC Press, Boca Raton, pp. 158–178; 467–482.
Hadjikhani, N., Sanchez del Rio, M., Wu, O., Schwartz, D., Bakker, D., Fischl, B., Kwong, K.K., Cutrer, F.M., Rosen, B.R., Tootell, R.H., Sorensen, A.G. and Moskowitz, M.A. (2001) Mechanisms of migraine aura revealed by functional MRI in human visual cortex. Proc. Natl. Acad. Sci. USA, 98: 4687–4692.
Hancock, P.J.B., Burton, A.M. and Bruce, V. (1996) Face processing: human perception and principal components analysis. Mem. Cognit., 24: 26–40.
Hardebo, J.E. (1992) A cortical excitatory wave may cause both the aura and the headache of migraine. Cephalalgia, 12: 75–80.
Hare, E.H. (1973) The duration of the fortification spectrum in migraine. In: Cumings, J.N. (Ed.), Background to Migraine. Springer, New York, pp. 93–98.
Heron, W., Doane, B.K. and Scott, T.H. (1956) Visual disturbances after prolonged perceptual isolation. Can. J. Psychol., 10: 13–18.
Holmes, G. (1917) Disturbances of vision by cerebral lesions. Br. J. Ophthalmol., 2: 353–384.
Horton, J.C. and Hoyt, W.F. (1991) The representation of the visual field in human striate cortex. Arch. Ophthalmol., 109: 816–824.
Hubel, D.H. and Wiesel, T.N. (1968) Receptive fields and functional architecture of monkey striate cortex. J. Physiol., 195: 215–243.
International Headache Society Classification Committee (1988) Classification and diagnostic criteria for headache disorders, cranial neuralgias and facial pain. Cephalalgia, 8: 1–96.
James, M.F., Smith, M.L., Bockhorst, K.H., Hall, L.D., Houston, G.C., Papadakis, N.G., Smith, J.M., Williams, A., Xing, D., Parsons, A.A., Huang, C.L. and Carpenter, T.A. (1999) Cortical spreading depression in the gyrencephalic feline brain studied by magnetic resonance imaging. J. Physiol., 519: 415–425.
Jolly, F. (1902) Ueber Flimmerskotom und Migräne. Berlin Klin. Wschr., 42: 973–976.
Klee, A. and Willanger, R. (1966) Disturbance of visual perception in migraine. Acta Neurol. Scand., 42: 400–414.
Klüver, H. (1966) Mescal and the Mechanisms of Hallucinations. University of Chicago Press, Chicago.
Kölmel, H. (1984) Coloured patterns in hemianopic fields. Brain, 107: 155–167.
Kölmel, H. (1985) Complex visual hallucinations in the hemianopic field. J. Neurol. Neurosurg. Psychiatry, 48: 29–38.
Kosslyn, S.M., Pascual-Leone, A., Felician, O., Camposano, S., Keenan, J.P., Thompson, W.L., Ganis, G., Sukel, K.E. and Alpert, N.M. (1999) The role of area 17 in visual imagery: convergent evidence from PET and rTMS. Science, 284: 167–170.
Kuzniecky, R., Gilliam, F., Morawetz, R., Faught, E., Palmer, C. and Black, L. (1997) Occipital lobe developmental malformations and epilepsy: clinical spectrum, treatment and outcome. Epilepsia, 38: 175–181.
Lance, J. (1976) Simple formed hallucinations confined to the area of a specific visual field defect. Brain, 99: 719–734.
Lashley, K.S. (1941) Patterns of cerebral integration indicated by the scotomas of migraine. Arch. Neurol. Psychiatr., 46: 331–339.
Lauritzen, M. (1994) Pathophysiology of the migraine aura: the spreading depression theory. Brain, 117: 199–210.
Leão, A.A.P. (1944) Spreading depression of activity in cerebral cortex. J. Neurophysiol., 7: 359–390.
Leão, A.A.P. (1947) Further observations on the spreading depression of activity in the cerebral cortex. J. Neurophysiol., 10: 409–414.
Lepore, F. (1990) Spontaneous visual phenomena with visual loss: 104 patients with lesions of retinal and neuronal afferent pathways. Neurology, 40: 444–447.
Lipton, R.B., Hamelsky, S.W. and Stewart, W.F. (2001) Epidemiology and impact of headache. In: Silberstein, S., Lipton, R.B. and Dalessio, D.J. (Eds.), Wolff’s Headache and Other Head Pain. Oxford University Press, Oxford.
Liveing, E. (1873) On Megrim, Sick-Headache and Some Allied Disorders: A Contribution to the Pathology of Nerve-Storms. Churchill, London.
McLachlan, R.S. and Girvin, J.P. (1994) Spreading depression of Leão in rodent and human cortex. Brain Res., 666: 133–136.
Mewasingh, L., Kornreich, C., Christophe, C. and Dan, B. (2002) Pediatric phantom vision (Charles Bonnet) syndrome. Pediatr. Neurol., 26: 143–145.
Milner, P.M. (1958) Note on the possible correspondence between the scotomas of migraine and spreading depression of Leão. Electroencephalogr. Clin. Neurophysiol., 10: 705.
Morlock, N.L., Mori, K. and Ward, A.A.J. (1964) A study of single cortical neurons during spreading depression. J. Neurophysiol., 27: 1192–1198.
Olesen, J. (1993) Migraine with aura and its subforms. In: Olesen, J., Tfelt-Hansen, P. and Welch, K.M.A. (Eds.), The Headaches. Raven Press, New York, pp. 263–275.
Panayiotopoulos, C.P. (1993) Benign childhood partial epilepsies: benign childhood seizure susceptibility syndromes. J. Neurol. Neurosurg. Psychiatry, 56: 2–5.
Panayiotopoulos, C.P. (1994) Elementary visual hallucinations in migraine and epilepsy. J. Neurol. Neurosurg. Psychiatry, 57: 1371–1374.
Panayiotopoulos, C.P. (1999a) Elementary visual hallucinations, blindness, and headache in idiopathic occipital epilepsy: differentiation from migraine. J. Neurol. Neurosurg. Psychiatry, 66: 536–540.
Panayiotopoulos, C.P. (1999b) Visual phenomena and headache in occipital epilepsy: a review, a systematic study and differentiation from migraine. Epileptic Disord., 1: 205–216.
Penfield, W. and Kristiansen, K. (1951) Epileptic Seizure Patterns. Charles C. Thomas, Springfield, IL, p. 104.
Penfield, W. and Rasmussen, T. (1957) The Cerebral Cortex of Man. Macmillan, New York, p. 248.
Podoll, K. and Robinson, D. (2000) Illusory splitting as visual aura symptom in migraine. Cephalalgia, 20: 228–232.
Ramadan, N., Halvorson, H., Vande-Linde, A., Levine, S., Helpern, J. and Welch, K. (1989) Low brain magnesium in migraine. Headache, 29: 590–593.
Reggia, J.A. and Montgomery, D. (1996) A computational model of visual hallucinations in migraine. Comput. Biol. Med., 26: 133–141.
Richards, W. (1971) The fortification illusions of migraines. Sci. Am., 224: 88–96.
Ritchie Russell, R.W. and Whitty, C.W.M. (1955) Studies in traumatic epilepsy. 3. Visual fits. J. Neurol. Neurosurg. Psychiatry, 18: 79–96.
Ross, J.M., Morrone, M.C., Goldberg, M.E. and Burr, D.C. (2001) Changes in visual perception at the time of saccades. Trends Neurosci., 24: 113–121.
Russell, M.B., Iversen, H.K. and Olesen, J. (1994) Improved description of the migraine aura by diagnostic aura diary. Cephalalgia, 14: 107–117.
Russell, M.B. and Olesen, J. (1996) A nosographic analysis of migraine aura in a general population. Brain, 119: 355–361.
Russell, M.B., Rasmussen, B.K., Thorvaldsen, P. and Olesen, J. (1995) Prevalence and sex-ratio of the subtypes of migraine. Int. J. Epidemiol., 24: 612–618.
Sacks, O. (1992) Migraine. University of California Press, Berkeley.
Salanova, V., Andermann, F., Olivier, A., Rasmussen, T. and Quesney, L.F. (1992) Occipital lobe epilepsy: electroclinical manifestations, electrocorticography, cortical stimulation and outcome in 42 patients treated between 1930 and 1991. Surgery of occipital lobe epilepsy. Brain, 115: 1655–1680.
Santhouse, A.M., Howard, R.J. and ffytche, D.H. (2000) Visual hallucinatory syndromes and the anatomy of the visual brain. Brain, 123: 2055–2064.
Schultz, G. and Melzack, R. (1991) The Charles Bonnet syndrome: ‘phantom visual images’. Perception, 20: 809–825.
Schultz, G., Needham, W., Taylor, R., Shindell, S. and Melzack, R. (1996) Properties of complex hallucinations associated with deficits in vision. Perception, 25: 715–726.
Sirovich, L. and Kirby, M. (1987) Low dimensional procedure for the characterization of human faces. J. Opt. Soc. Am. A, 4: 519–524.
Sveinbjornsdottir, S. and Duncan, J.S. (1993) Parietal and occipital lobe epilepsy: a review. Epilepsia, 34: 493–521.
Teunisse, R., Cruysberg, J.R., Hoefnagels, W.H., Verbeek, A.L. and Zitman, F.G. (1996) Visual hallucinations in psychologically normal people: Charles Bonnet’s syndrome. Lancet, 347: 794–797.
Todd, J. (1955) The syndrome of Alice in Wonderland. Can. Med. Assoc. J., 73: 701–704.
Tootell, R.H., Mendola, J.D., Hadjikhani, N.K., Ledden, P.J., Liu, A.K., Reppas, J.B., Sereno, M.I. and Dale, A.M. (1997) Functional analysis of V3A and related areas in human visual cortex. J. Neurosci., 17: 7060–7078.
Ungerleider, L.G. and Mishkin, M. (1982) Two cortical visual systems. In: Ingle, D.J., Goodale, M.A. and Mansfield, R.J.W. (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, MA, pp. 549–586.
Van Essen, D.C., Anderson, C.H. and Felleman, D.J. (1992) Information processing in the primate visual system: an integrated systems perspective. Science, 255: 419–423.
Vaphiades, M., Celesia, G.G. and Brigell, M.G. (1996) Positive spontaneous visual phenomena limited to the hemianopic fields in lesions of central visual pathways. Neurology, 47: 408–417.
Weiskrantz, L. and Cowey, A. (2002) Prime-sight in a blindsight patient. Nat. Neurosci., 5: 101–102.
Weiskrantz, L., Warrington, E.K., Sanders, M.D. and Marshall, J. (1974) Visual capacity in the hemianopic field following a restricted occipital ablation. Brain, 97: 709–728.
Wilkinson, F., Feindel, A. and Grivell, J.E. (1999) Mapping of visual migraine auras into cortical coordinates. Headache, 39: 386.
Wilkinson, F. and Wilson, H.R. (2000) A neural network model of visual migraine aura. Headache, 40: 437.
Williamson, P.D., Thadani, V.M., Darcey, T.M., Spencer, D.D., Spencer, S.S. and Mattson, R.H. (1992) Occipital lobe epilepsy: clinical characteristics, seizure spread patterns, and results of surgery. Ann. Neurol., 31: 3–13.
Wilson, H.R., Loffler, G. and Wilkinson, F. (2002) Synthetic faces, face cubes, and the geometry of face space. Vision Res., 42: 2909–2923.
CHAPTER 22
Theories of visual awareness

Adam Zeman*

Department of Clinical Neurosciences, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK

*Corresponding author. Tel.: +131-537-1167/8; Fax: +131-537-1106; E-mail: [email protected]
DOI: 10.1016/S0079-6123(03)14402-2

Grau, teurer Freund, ist alle Theorie
Und grün des Lebens goldner Baum
(All theory is grey, dear friend,
But the golden tree of life springs ever green)
Johann Wolfgang von Goethe
Abstract: The past decade has provided a wealth of data for theorists of visual awareness. Two empirical approaches, both seeking to dissociate conscious from unconscious neural processes, have been particularly fruitful. The first has focused on the neural correlates of changes in experience occurring in the absence of change in external stimuli, for example during binocular rivalry; the second has investigated the neural correlates of unconscious processes such as blindsight. Several of the theories based on these data propose that visual consciousness arises from interactions between thalamo-cortical modules whose independent operation is unconscious; popular candidate ‘modules’ include visual regions in the ‘ventral’ visual pathway and parieto-frontal regions associated with action planning. These theories can be tested against recent findings from patients in the vegetative state, a state of ‘wakefulness without awareness’, which can follow major insults to the brain. The findings indicate that stimulus-evoked cortical activity occurs in the vegetative state, but tends to be limited in extent, is often restricted to primary sensory areas, and is poorly integrated with activity elsewhere in the cerebrum. The theories of visual awareness reviewed previously predict that such activity should not give rise to visual experience. This prediction is reassuring, but can we be sure that it is correct? Reflection on the indirect nature of the evidence available to theorists of visual awareness makes it doubtful that we can confidently specify the minimum conditions for awareness, unless we are prepared to modify our everyday concept of consciousness. O’Regan and Noe have recently proposed a sophisticated redefinition of visual awareness along these lines. Progress at this frontier of visual neuroscience requires that scientists and philosophers join forces to clarify the concepts of experience and consciousness.
Introduction
From an everyday perspective, visual experience is marvellously rich. Almost every moment of our waking lives, we can enjoy and exploit the play of form and color, depth and motion which compose our visual world. From the perspective of neuroscience, vision is our dominant and most intensively investigated sense. It is no surprise that vision should have been the target of so many theories of consciousness in recent years.
In this chapter the author first outlines the kind of evidence which grounds current theories of visual awareness (see Theories of visual awareness: the evidence base), then reviews a number of these theories, emphasizing their common features and logical structure (see Theories of visual awareness: some examples). The vegetative state is a tragic, extreme, clinical disorder of consciousness: it provides a test case for theories of visual awareness (see ‘Wakefulness without awareness’: the vegetative state). Putting these theories to work on this task reveals a potential shortcoming, related to some features of our ordinary concept of consciousness
(see The minimum conditions for awareness). The section ‘Vision as action’ considers a radical redefinition of visual experience, which would overcome this shortcoming — at a price. The final section looks to the future of the ‘science of consciousness’. These topics address questions close to the heart of psychology, in particular the relationships between subject and object, and between subjective and objective points of view.
Theories of visual awareness: the evidence base
One of the goals of visual physiology has always been to account for the contents of visual consciousness. A series of major discoveries over the past century have revolutionized our picture of the cerebral events which underlie conscious vision. Key findings include the delineation of the retinotopic map in striate cortex (Holmes and Lister, 1916); the discovery of its orientation-specific columns by Hubel and Wiesel (1977); the realization that 30–40 functionally and anatomically distinct visual areas surround area V1 (Cowey, 1994); the evidence that parallel, though interconnected, streams of visual information flow through these areas, subserving the perception of form, color, depth, and motion (Livingstone and Hubel, 1988); and the broad distinction between an occipito-temporal stream concerned with object identification and an occipito-parietal stream concerned with visually guided action (Milner and Goodale, 1995). Work in this tradition has revealed increasingly fine-grained correlations between cerebral events and features of visual experience, for example the correlations between the perception of visual motion and activity in ‘human V5’ on the one hand, and the perception of color and activity in ‘human V4’ on the other (Zeki, 1993). But correlation, of course, does not imply cause: associations like these may point to accompaniments rather than causes of the activity which is crucial for visual consciousness. Such accompaniments might be wholly or virtually irrelevant to the neurology of visual experience, like the pupillary light reflex, or related to, but remote from, the putative neural basis of the experience, like activity in the retina.
There are several ways in which the case for a causal connection can be strengthened. One is by showing that damage to an area or process impairs the associated feature of visual experience. Descriptions of ‘central akinetopsia’ and ‘central achromatopsia’ suggest that this is true of areas V5 and V4: damage to V5 impairs awareness of visual motion while damage to V4 impairs the awareness of color (Zeki, 1993). Another is to show that stimulation of the area or process provokes or modulates the visual experience in question: transcranial magnetic stimulation of the cortical visual areas has gone some way down this road in man (Walsh, this volume); microstimulation of small groups of cells has been used to support a similar conclusion in animals (Salzman, 1990). Two other lines of approach to the neural correlate of consciousness have shown promise in recent years. Both aim to disentangle the unconscious accompaniments of conscious neural processes from the neural correlates of consciousness itself. The first focuses on changes in brain activity occurring when experience changes in the absence of stimulus change: this happens, for example, when we summon up visual imagery, during shifts of visual attention and during alternations of the conscious percept under conditions of binocular rivalry. The aim of the exercise is to pinpoint activity which is tied closely to visual experience, discounting the background of correlated but loosely linked events. This interesting approach runs the risk of missing some of the relevant action: it targets the moving tip of the iceberg of the brain activity which is required for visual experience, but it might well be that some of the unchanging background of activity is also required for visual consciousness. Nancy Kanwisher’s findings illustrate this approach well (Kanwisher, 2001). Her work in man follows the lead of Desimone (Moran and Desimone, 1985), Logothetis (Logothetis and Schall, 1989; Leopold and Logothetis, 1996) and others in animals, showing that both attentional shifts and alternations between bistable percepts modulate activity in cortical visual areas. In the case of binocular rivalry the percentage of cells showing correlations with awareness ranges from about 20% in V1 and V2, at an early stage of visual processing, to about 90% in inferotemporal cortex, well downstream (Kanwisher, 2001). Using functional MRI, Kanwisher has shown that, during binocular rivalry between an image of a
face and an image of a place, activity in the fusiform face area (FFA) and the parahippocampal place area (PPA) oscillates in time with the oscillation of the conscious percept: in other words, at this level of the visual system, activity reflects the contents of visual consciousness rather than the pattern of retinal stimulation. Indeed, the magnitude of the oscillation as the conscious percept changes between the two rivalrous stimuli is similar to the magnitude of the change during alternate presentation of one or other stimulus to both eyes. Kanwisher has gone on to show that summoning up visual images of faces and places, and attentional shifts between them, evoke similar though smaller alternations of activity between the FFA and PPA. The tentative implication is that these areas are likely to contribute to the ‘neural correlate of consciousness’ (NCC). This line of attack goes directly to the focus of visual awareness. The second approach is indirect. It aims to conquer visual consciousness by stealth. If much of the activity occurring in the brain takes place without exciting consciousness, understanding the location and nature of these unconscious processes should help to highlight the distinctive qualities of conscious processing (the author assumes without argument here that unconscious neural processes occur). Such unconscious processes have now been studied in a wide variety of contexts: for example, in normal subjects using briefly presented, low-intensity or masked stimuli; during anesthesia; and in patients in whom brain injuries have disabled conscious capacities, leaving unconscious or ‘implicit’ abilities behind. The author gives a few examples to illustrate the kinds of data which this work can provide for theorists of consciousness, emphasizing the potential to use functional imaging to contrast the mechanisms of conscious and unconscious processes. Sobel et al. (1999) have described the phenomenon of ‘blind smell’ in normal subjects. Subjects asked to indicate the presence or absence of a low-intensity odor perform above chance in the absence of awareness: fMRI revealed brain activation in the anterior medial thalamus and inferior frontal gyrus. Using PET, Morris et al. (1998) showed that masked, undetected, presentations of an angry face previously paired with an aversive stimulus elicited activity in the right amygdala of human subjects, whereas unmasked
presentations activated the left amygdala. Dehaene et al. (1998) have shown that in a task requiring subjects to classify numbers as larger or smaller than five by pressing a button with the left or right hand, presentation of masked, undetected numerical primes sets in train a stream of perceptual, cognitive, and motor processes in precisely the areas which are engaged by the unmasked, detected stimulus. These experiments were performed in healthy subjects. Two pathologies of awareness, blindsight and neglect, also offer opportunities to examine the neural correlates of unconscious perception. Some subjects with blindsight report visual awareness under certain conditions of stimulation, while under others they perform above chance on visual tasks in the absence of awareness. Sahraie et al. (1997) compared the brain activity induced by stimuli of these two kinds in the much studied patient G.Y.: they found that the shift between ‘aware’ and ‘unaware’ modes was associated with a shift in the pattern of activity from cortical to subcortical and from dorsolateral to medial prefrontal. In contrast, Zeki and ffytche (1998), in work performed with the same patient, report enhanced activation of area V5 in the aware mode by contrast with the unaware, with additional activation of a region of the brainstem. Rees et al. (2000) have shown, in a patient with neglect following a right parietal stroke, G.K., that unseen, extinguished stimuli excite both striate and extrastriate visual areas, suggesting that this activity is insufficient for awareness. It is too early to draw firm conclusions from these fascinating but preliminary studies. However, they illustrate the potential of work using ‘unconscious’ or ‘implicit’ perception to explore the neural correlates of consciousness. The results reviewed by the author hint that the intensity or ‘activation strength’, site and connectivity of neural processes can all influence the likelihood that they will enter awareness.
Theories of visual awareness: some examples
Like any biological phenomenon, awareness must have underlying mechanisms, functions, phylogeny, and ontogeny. A comprehensive theory of awareness would address all these. The author touches on some candidate mechanisms here, prime suspects in the
hunt for the ‘neural correlate of consciousness’. These proposals are grounded in evidence of the kind outlined in the previous section. Several relevant reviews have appeared recently (Zeman et al., 1997; Frith et al., 1999; Kanwisher, 2001; Zeman, 2001; Baars, 2002; Rees et al., 2002). Most current theories share some basic assumptions: that awareness matters, in the sense that it allows us to do things which would be impossible without it; that awareness is a function of the brain, but that not all the activity occurring in the brain is conscious; that deep structures in the brainstem and thalami are crucial to arousal, while activity in the thalami and cortex determines the contents of awareness; that the processes giving rise to awareness are distributed across the brain, and that several psychological systems contribute to it. Most theories also assume that the neural correlate of consciousness will be a loosely linked but temporarily coherent network of neurons, a grouping akin to Donald Hebb’s ‘cell assembly’. These points of agreement leave plenty of scope for dispute about critical details. How large must a cell assembly be to give rise to awareness? Need it incorporate particular types of neuron, or particular layers of cortex? Must the interactions within the assembly attain a certain level of complexity? Must its activity be of a particular kind or duration? Need it involve particular cortical regions, or have a certain range of connections with regions elsewhere? And how do these details relate to the psychological structure of awareness? The recent proposals of Edelman and Tononi (Edelman, 1992; Tononi and Edelman, 1998), Crick and Koch (Crick, 1994; Crick and Koch, 1995; Koch, 1998), Weiskrantz (1997), Kanwisher (2001) and others differ along these dimensions on points of detail. For present purposes the author wants to emphasize certain informative similarities between them. First, they emphasize the importance of interactions between neural systems to awareness: Edelman and Tononi single out the role of re-entrant connections between neocortex, thalamus, and the limbic system; Crick and Koch point to interactions between higher-order sensory association cortex and executive regions of the frontal lobes; Kanwisher implicates interactions between the ventral and dorsal streams of visual processing. Second, they emphasize
the role of interactions between psychological systems or items: sensation and memory, perception and action, sensory–motor processing and a further ‘commentary stage’, token and type. Third, each theory identifies the function of these interactions as the generation of awareness from processes and systems which operate more or less unconsciously in isolation: the underlying idea is that sensation becomes conscious only when it undergoes some further process — encounters past associations, or is used to govern action, allows the fusion of token and type, or becomes the object of reflection. These might be called ‘atomic’ theories of awareness. They try to show how the contents of awareness can be constructed from unconscious building blocks consisting of neural systems. They can be contrasted with ‘field’ theories of consciousness, which regard the contents of consciousness as modulations of a basal field (Searle, 2000; John, 2001). Field theories tend to highlight the physiology of awareness, including the possibility that synchronized gamma oscillations are relevant to consciousness, while atomic theories emphasize the anatomy of awareness. But field theories, just like their atomic rivals, point to the importance of interactions between brain regions in the creation of consciousness. It is possible to abstract a very general, prototypical, theory from these examples. Here are its key features: awareness occurs as the result of physiologically appropriate interactions between neural systems which serve sensation, memory, and action; activity which remains confined within a single system, or which fails to attain certain physiological requirements, can influence behavior but will not enter awareness. What happens when we try to apply these broad principles to a clinical example?
‘Wakefulness without awareness’: the vegetative state
Thirty years ago, a neurologist and a neurosurgeon, Fred Plum and Brian Jennett (Jennett and Plum, 1972; Zeman, 1997; Jennett, 2002) coined the term ‘vegetative state’ to describe a condition of ‘wakefulness without awareness’ which can follow a variety of severe insults to the brain. The need for this new
diagnostic category had arisen because of an unintended effect of advances in neurosurgical treatment and medical resuscitation. Some patients were surviving massive insults to the brain, causing such extensive damage to any combination of the cerebral cortex, cerebral white matter, and thalamus that no sign of ‘a functioning mind’ remained. Yet in those patients in whom the upper brainstem was relatively unimpaired, ‘wakefulness’ might recover, leading to the eerie spectacle of a patient with open eyes, and some capacity to ‘alert’ to salient stimuli, but no sign of discriminative perception, communication or conscious purpose. The eeriness is compounded by the fact that patients in this condition are often far from inert. They can make a range of spontaneous movements, including chewing, teeth grinding, and swallowing. More distressingly, they may smile, shed tears, grunt, moan, and scream without any discernible reason. Spontaneous roving eye movements are common. Head and eyes may turn fleetingly to follow a moving object or toward a loud sound. Brainstem reflexes are usually intact. Grasp reflexes may be present. Painful stimulation of the limbs may provoke a flexor or an extensor response, as well as eye opening, a quickening of respiration and a grimace. The vegetative state might be caricatured as a state of cortical death, in the presence of a working brainstem. But the caricature turns out to be inaccurate: the cortex is not always silent. Electrical and magnetic-evoked potentials may be detected over primary sensory areas, visual, auditory and somesthetic, and functional imaging also reveals modality-specific activation of sensory cortices (de Jong et al., 1997; Menon et al., 1998; Laureys et al., 2000a; Schiff et al., 2002). In rare patients in whom ‘fragments’ of behavior are preserved — such as the occasional utterance of an inappropriate word, or a quietening to soothing music — there is evidence for preservation of function in corresponding cortical regions — left perisylvian and right hemispheric in these two examples (Schiff et al., 2002). But in most vegetative subjects in whom stimulus-driven cortical activity can be discerned, it is confined to early sensory regions, and fails to set in train activity in sensory areas downstream (Laureys et al., 2000a). Activity in association areas, and connectivity between distant cortical regions, impaired for the duration of the
vegetative state, are restored as awareness recovers (Laureys et al., 2000b). How does this evidence of limited cortical activity in patients in the vegetative state sit with the theories of awareness outlined in the last section? The answer is that it sits rather neatly: physiological studies of cortical responses in the vegetative state reveal impairment of the integration between neural systems which the theories in question regard as the precondition for awareness. This lends some theoretical support to the traditional view that, despite the range of behavior they display, patients in the vegetative state are indeed unaware. This view is, of course, supported from other directions: cerebral metabolic rates in the vegetative state are usually at or below the levels seen during general anesthesia, and the extent of the cerebral injury may be profound (Jennett, 2002). This is surely a reassuring convergence: clinical impression, empirical data, and scientific theory all point to a single conclusion. But when a patient in the vegetative state alerts and grimaces in response to pressure on a nail bed, can we be certain that he or she is unaware?
The minimum conditions for awareness
The question we are asking amounts to this: can we establish the minimum conditions for awareness? If so, we can determine for sure whether patients in the vegetative state fall short of the line or not. Let us see, first, what kind of answer intuition offers, and then put intuition under the microscope. We do not normally suppose that awareness requires the capacity for overt behavior: subjects who regain awareness while paralyzed by anesthetists provide hard-won proof of this. Nor do we regard language as indispensable: prelinguistic infants and aphasic adults are, surely, aware. Memory may be closer to the kernel of consciousness: but profound disorders of anterograde memory are compatible with awareness. How much more can we whittle away from the roster of cognitive functions before we extinguish awareness completely? A thought experiment highlights a minimal case. Imagine that a subset of the cortical visual areas were isolated from the remainder of the brain, and activated in the manner correlated with visual experience in an intact brain: would this activity
give rise to the experience normally associated with activation of this kind, say an experience of color? Nancy Kanwisher (2001) thinks not, but intuitions on this point disagree. Clearly any experience that did arise would be impoverished — it would lack self-reference, explicit resonance with past experience, any linguistic dimension or capacity to give rise to action. Yet the notion of such an ‘unarticulated flash of experience’ does not strike most of us as absurd. Nothing in our ordinary concept of awareness seems to exclude the possibility of episodes of ‘unreportable awareness’ (Zeman, 2000). The author does not wish to persuade us that this intuition is correct: the point is precisely that the question is undecidable. And if the occurrence of experience under these conditions is a possibility, we are left with a problem: we cannot be sure that patients with severely damaged brains, supporting some residual cerebral activity, are unaware. The absence of evidence of awareness does not amount to proof of the absence of awareness. The explanation of our quandary is not hard to find. Our ordinary concept of awareness is thoroughly ‘internal’: we speak of ‘objective’ and ‘subjective’ worlds, a world without and a world within; we take it that awareness is a deeply private matter, inaccessible to observation by third parties. It casts an ‘inner light’ on a private performance: in a patient just regaining awareness we imagine the light casting a faltering glimmer, which grows steadier and stronger as a richer awareness returns. We imagine a similar process of illumination at the phylogenetic dawning of awareness, when animals with simple nervous systems first became conscious. We wonder whether a similar light might one day come to shine in artificial brains. But, bright or dim, the light is either on or off: awareness is present or absent — and only the subject of awareness knows for sure. The light of awareness is invisible to all but its possessor.
Vision as action

The author believes that most neurologists and neuroscientists operate with a concept of perceptual awareness of this kind. It is what Alan Cowey has in mind when he refers to ‘conscious, aware vision’ or ‘phenomenal consciousness’, by contrast, for
example, to the visual capacity subserving blindsight (Cowey, 1997). It is what Daniel Dennett (1991) has in mind when he describes the prevailing concept of consciousness in neuroscience as ‘cartesian materialism’. It is natural, seductive and deeply problematic. Here are two of its problems.

In the opening sections of this chapter the author has written enthusiastically about scientific theories of awareness. But if awareness is, at heart, a private matter, these theories fall short of their goal: for they are necessarily theories of reports or evidence of awareness, as science can only build its theories on evidence which is publicly available. Either science will fail to give us a theory of what we wish to explain — the subjective, private events of awareness — or, perhaps, awareness is not quite what we took it to be in the first place (Zeman, 2002). In the terms introduced by David Chalmers, the problem the author is drawing attention to is the failure of the theories he has discussed to solve the ‘hard problem’ of consciousness — to bridge the ‘explanatory gap’ between events in the brain and events in the mind (Chalmers, 1995, 1996).

Secondly, the light of awareness, which we were hoping scientific theory would explain, can seem curiously redundant. What is it for? Compare the theory of consciousness with the theory of matter. Physical theory, also, has to deal with invisible entities, subatomic particles. But physical theories come up with palpable predictions: E = mc² gives rise to bombs and reactors. What comparable consequence could we expect when we implemented a theory of consciousness? Only the dawning of an invisible light.

These problems suggest that something may be seriously wrong with our ‘intuitive’ picture of awareness. How can we fix it? Can we somehow bring our concept of consciousness more closely into line with the kinds of evidence on which a scientific theory of awareness must necessarily be based? Kevin O’Regan and Alva Noe (2001) have proposed a radical redefinition of visual awareness which would make it possible to escape from — or, some might say, to displace — these dilemmas. The author briefly summarizes their proposal here, not because he expects everyone to wish to follow their lead, but because it offers one challenging solution to
the impasse we have encountered (for a full account of their sophisticated theory, see their 2001 paper). Perhaps, the idea goes, quite contrary to our intuitive picture, experience does not arise from the brain. Instead, experience is what occurs when we explore our surroundings: seeing, for example, ‘is a way of acting’. Once we appreciate that experience results from an interaction between an agent and her surroundings, we begin to see that the ‘explanatory gap’ which troubled us before, between neural events and mental events, was imposed by drawing the boundaries of explanation too narrowly around the brain. O’Regan and Noe adduce several lines of evidence which undermine our initial picture of visual awareness as a determinate private occurrence. The surprising phenomena of change blindness and inattentional blindness seem to show that our ‘internal representation’ of our visual surroundings is much sparser than we thought; our remarkable capacity for sensorimotor adaptation, evidenced, for example, by the work of Kohler and others with inverting lenses, hints that ‘experience’ is determined by highly plastic relationships between sensory inputs and motor outputs, rather than by any fixed properties of the brain; similarly, research on sensory substitution, for example using Paul Bach-y-Rita’s tactile transformations of visual scenes, indicates that the ‘visual’ properties of visual experience are conferred by a particular kind of exploration of the world rather than by the ‘specific nerve energy’ of the normal visual pathways. Once we accept that our experience, when closely examined, is not quite as we usually take it to be, the notions that visual experience consists in a type of exploration, and that its richness lies as much in the world as in our heads, become more attractive than they might at first appear.
The future of the science of consciousness

Some readers will be unconvinced by the functionalist-behaviorist account of visual experience offered by O’Regan and Noe. It may seem to ‘leave out the mind’ (Searle, 1992), the intrinsic ‘qualities’ of our experience. The author has briefly described it here as it offers one rather invigorating way out of
the dilemma we encountered when we tried to define minimum conditions for awareness. On a view like O’Regan and Noe’s, this attempt is misdirected: there is no moment at which the light of consciousness dawns. There are simply more and less sophisticated ways of interacting with the world: a patient in the vegetative state interacts with his surroundings, but at a level so primitive that none of our peculiarly human, or deeply valued, forms of interaction are engaged. O’Regan and Noe’s ideas also remind us that too narrow a focus on the brain may deprive us of the resources we need to help explain awareness: considerations of the physical environment, development, and culture.

More generally, it seems that a solution to the problem of consciousness requires that the interested parties — from neuroscience, psychology and philosophy — should put their heads together. The problem of awareness raises fundamental issues in the theory of meaning and perception: What do statements about experience mean? Can they really be unverifiable in principle yet meaningful, like claims about unreportable awareness? And what is happening when we perceive our surroundings? Are we creating ‘a picture in the head’, as discussions of visual perception so often suppose, or are we, instead, exploring the world around us? These embarrassingly simple but vexatious questions must rank high on the agenda of any future science of awareness.
A personal note

I am honored to be given an opportunity to contribute to Alan Cowey’s Festschrift. It was clear from the meeting held to celebrate his work that Alan’s colleagues are his friends, and hold him in equally high scientific and personal regard. I am not surprised. He taught me the psychology of vision, and I could not have asked for a more amiable, thorough, or stimulating tutor. He treated my illegible handwriting as a challenge in visual psychophysics, reporting small discoveries with satisfaction. I have greatly appreciated his benevolent interest over the years, which has encouraged me to pursue my fascination — rather eccentric for a clinician — with the science of consciousness. Alan’s own work is severely rigorous and objective,
but I suspect him of having a particularly rich enjoyment of the phenomenal world. I hope that he is not averse to a contribution to this volume which asks whether the appearances — which science must ‘save’ — are quite as we took them to be.
References

Baars, B.J. (2002) The conscious access hypothesis: origins and recent evidence. Trends Cogn. Sci., 6: 47–52.
Chalmers, D. (1995) Facing up to the problem of consciousness. J. Consc. Stud., 2: 200–219.
Chalmers, D. (1996) The Conscious Mind. Oxford University Press, New York.
Cowey, A. (1994) Cortical visual areas and the neurobiology of higher visual processes. In: Farah M.J. and Ratcliff G. (Eds.), The Neuropsychology of High-level Vision. Lawrence Erlbaum, Hillsdale.
Cowey, A. (1997) Current awareness: spotlight on consciousness. Dev. Med. Child Neurol., 39: 54–62.
Crick, F. (1994) The Astonishing Hypothesis. Simon and Schuster, London.
Crick, F.H.C. and Koch, C. (1995) Are we aware of neural activity in primary visual cortex? Nature, 375: 121–124.
Dehaene, S., Naccache, L., Le Clec’H, G., Koechlin, E., Mueller, M., Dehaene-Lambertz, G., van de Moortele, P.-F. and Le Bihan, D. (1998) Imaging unconscious semantic priming. Nature, 395: 595–600.
de Jong, B.M., Willemsen, A.T.M. and Paans, A.J.M. (1997) Regional cerebral blood flow changes related to affective speech presentation in persistent vegetative state. Clin. Neurol. Neurosurg., 99: 213–216.
Dennett, D. (1991) Consciousness Explained. Penguin Press, London.
Edelman, G. (1992) Bright Air, Brilliant Fire. Penguin Books, London.
Frith, C., Perry, R. and Lumer, E. (1999) The neural correlates of conscious experience: an experimental framework. Trends Cogn. Sci., 3: 105–114.
Holmes, G. and Lister, W.T. (1916) Disturbances of vision from cerebral lesions, with special reference to the cortical representation of the macula. Brain, 39: 34–73.
Hubel, D.H. and Wiesel, T.N. (1977) The Ferrier Lecture: functional architecture of macaque monkey visual cortex. Proc. R. Soc. Lond. B, 198: 1–59.
Jennett, B. (2002) The Vegetative State. Cambridge University Press, Cambridge.
Jennett, B. and Plum, F. (1972) Persistent vegetative state after brain damage. Lancet, 1: 734–737.
John, E.R. (2001) A field theory of consciousness. Consc. Cogn., 10: 184–213.
Kanwisher, N. (2001) Neural events and perceptual awareness. Cognition, 79: 89–113.
Koch, C. (1998) The neuroanatomy of visual consciousness. In: Jasper H.H., Descarries L., Castelucci V.F. and Rossignol S. (Eds.), Consciousness at the Frontiers of Neuroscience. Lippincott-Raven, Philadelphia.
Laureys, S., Faymonville, M.-E., Degueldre, C., Fiore, G.D., Damas, P., Lambermont, B., Janssens, N., Aerts, J., Franck, G., Luxen, A., Moonen, G., Lamy, M. and Maquet, P. (2000a) Auditory processing in the vegetative state. Brain, 123: 1589–1601.
Laureys, S., Faymonville, M.-E., Luxen, A., Lamy, M., Franck, G. and Maquet, P. (2000b) Restoration of thalamocortical connectivity after recovery from persistent vegetative state. Lancet, 355: 1790–1791.
Leopold, D.A. and Logothetis, N.K. (1996) Activity changes in early visual cortex reflect monkeys’ percepts during binocular rivalry. Nature, 379: 549–553.
Livingstone, M. and Hubel, D. (1988) Segregation of form, color, movement and depth: anatomy, physiology and perception. Science, 240: 740–749.
Logothetis, N.K. and Schall, J.D. (1989) Neuronal correlates of subjective visual perception. Science, 245: 761–763.
Menon, D.K., Owen, A.M., Williams, E.J., Minhas, P.S., Allen, C.M.C., Boniface, S.J., Pickard, J.D. and the Wolfson Brain Imaging Centre Team (1998) Cortical processing in the persistent vegetative state. Lancet, 352: 200.
Milner, A.D. and Goodale, M.A. (1995) The Visual Brain in Action. Oxford University Press, Oxford.
Moran, J. and Desimone, R. (1985) Selective attention gates visual processing in the extrastriate cortex. Science, 229: 782–784.
Morris, J.S., Ohman, A. and Dolan, R.J. (1998) Conscious and unconscious emotional learning in the human amygdala. Nature, 393: 467–470.
O’Regan, J.K. and Noe, A. (2001) A sensorimotor account of vision and visual consciousness. Behav. Brain Sci., 24: 5.
Rees, G., Wojciulik, E., Clarke, K., Husain, M., Frith, C. and Driver, J. (2000) Unconscious activation of visual cortex in the damaged right hemisphere of a parietal patient with extinction. Brain, 123: 1624–1633.
Rees, G., Kreiman, G. and Koch, C. (2002) Neural correlates of consciousness in humans. Nat. Rev. Neurosci., 3: 261–270.
Sahraie, A., Weiskrantz, L., Barbur, J.L., Simmons, A., Williams, S.C.R. and Brammer, M.J. (1997) Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc. Natl. Acad. Sci. USA, 94: 9406–9411.
Salzman, C.D., Britten, K.H. and Newsome, W.T. (1990) Cortical microstimulation influences perceptual judgements of motion direction. Nature, 346: 174–177.
Schiff, N.D., Ribary, U., Moreno, D.R., Beattie, B., Kronberg, E., Blasberg, R., Giacino, J., McCagg, C., Fins, J.J., Llinas, R. and Plum, F. (2002) Residual cerebral activity and behavioural fragments in the persistently vegetative brain. Brain, 125: 1210–1234.
Searle, J. (1992) The Rediscovery of the Mind. MIT Press, Cambridge.
Searle, J. (2000) Consciousness. Ann. Rev. Neurosci., 23: 557–578.
Sobel, N., Prabhakaran, V., Hartley, C.A., Desmond, J.E., Glover, G.H., Sullivan, E.V. and Gabrieli, J.D.E. (1999) Blind smell: brain activation induced by an undetected airborne chemical. Brain, 122: 209–217.
Tononi, G. and Edelman, G.M. (1998) Consciousness and the integration of information in the brain. In: Jasper H.H., Descarries L., Castelucci V.F. and Rossignol S. (Eds.), Consciousness at the Frontiers of Neuroscience. Lippincott-Raven, Philadelphia.
Weiskrantz, L. (1997) Consciousness Lost and Found. Oxford University Press, Oxford.
Zeki, S. (1993) A Vision of the Brain. Blackwell Scientific Publications, Oxford.
Zeki, S. and ffytche, D.H. (1998) The Riddoch syndrome: insights into the neurobiology of conscious vision. Brain, 121: 25–45.
Zeman, A. (1997) Persistent vegetative state. Lancet, 350: 795–799.
Zeman, A. (2000) The problem of unreportable consciousness. In: Toward a Science of Consciousness, Imprint Academic, Thorverton. Abstract 96.
Zeman, A. (2001) Consciousness. Brain, 124: 1263–1289.
Zeman, A. (2002) Consciousness: A User’s Guide. Yale University Press, London.
Zeman, A., Grayling, A.C. and Cowey, A. (1997) Contemporary theories of consciousness. J. Neurol. Neurosurg. Psychiatry, 62: 549–552.
Subject Index

Achromatopsia, 162–163
Amygdala and emotional valence, 174–179
Attention
  and area V4, 172–173
  and awareness, 179–180
  and color, 163–169
  Biased competition model of, 171
  Neuroimaging of, 171–180
Awareness
  and area V1, V5/MT, 117–123
  and hallucinations, 317
  and the cerebellum, 61–73
  Theories of, 322–324
  The vegetative state, 324–325
  Minimum conditions for, 325–326
Backward masking
  Neurophysiology of, 96–99
Blindsight
  and animal methodology, 233–234
  and motion discrimination, 287–291
  and stimulus cueing, 261–273
  Role of attention, 273–275
  Role of awareness, 275–276
  and subliminal perception, 295–298
  and the redundant signal effect, 300–302
  and visual agnosia, 81–82
  Comparison on monkeys and humans, 291
  Dissociation between implicit and explicit processes, 103–105
  Effects of age at time of lesion, 279–282
  in normal observers, 295–302
  in monkeys, 237–238, 282–287
  Interaction with conscious awareness, 79–81
  Neural substrates of, 84–85
  Transcranial Magnetic Stimulation, 118–119
Charles Bonnet syndrome, 312–314
  Neural mechanism of, 314–315
Cholinergic amacrine cells, 10–12
Consciousness (see Awareness)
Cerebellum and visual awareness, 70–72
  Decorrelation control, 64–70
  Models of, 62–64
Color
  Color constancy, 147–148
    and color contrast, 148–149, 161–162
  Color contrast, 149–151
    and depth segmentation, 153–154
    and luminance contrast defined motion mechanisms, 243–257
    and motion segmentation, 154
    and texture segmentation, 151–153
    and the role of area V1, 156–158
Corollary discharge
  Identifying the, 51–58
  Influence on visual processing, 49–51
Corpus callosum
  Agenesis of, 89–91
  and taste, 87–88
  and dichotic listening, 88–89
  Maps in, 85–87
Dorsal stream
  Does the dorsal stream have a memory? 137
Dorsomedial nucleus of the thalamus and eye-movements, 52–58
Emotion, 171–179
Faces
  Inferotemporal cortex and visual perception of, 96–100
  Rapid serial visual presentation, 107–115
Frontal eye field, and eye-movements, 52–58
Grasping
  Comparing visually guided and memory guided, 137–140
  Effect of delays in neurological patients, 133–135
  in normal observers, 136–137
Hemianopia
  Conscious vision in, 81
Inner plexiform layer, 6
Migraine auras, 307–308
  Neural substrate of, 308–310
Monkeys
  Effects of occipital lesions, 229–232
  Inferotemporal cortex and face identification, 95–100
  Neuronal responses to rapid serial visual presentation, 107–115
  Retinal Ganglion cells, 23–29
  Saccadic eye-movements, 47–58
  Stimulus cueing in blindsight, 261–276
Motion perception
  and color, 243–258
  and dorsal and ventral area V3, 206–209
  in destriate monkeys, 287–291
  First and second order, 197–209
Neuroimaging
  and the amygdala, 171–180
  and visual motion processing
Optic ataxia, 134–136
Photoreceptors
  Development, 3–14
  Differentiation, 4
  Environmental determinants of, 4–5
Pupillometry, 246, 249, 262
Reaching in spatial neglect and extinction, 213–220
Redundant signal effect, 298–300
Retinal ganglion cells, 23–29, 243–244
  M- and P-cells
    Contrast sensitivity of, 31–32
    Photoreceptor signals to, 36–40
    Role in achromatic and chromatic vision, 32–35
    Spatial properties of, 30–31
    Temporal properties of, 31
Saccadic eye-movements, 48
Stroop test, 82
  in a visually agnosic patient, 83
  and visual imagery, 83–84
Superior colliculus and eye-movements, 50–58
Subliminal perception, 295–302
Transcranial magnetic stimulation, 124–128
Visual control of action, 131–142
  The effect of delay in neurological patients, 133–135
  in normal observers, 136–137
  Off-line versus real time control of, 140–142
Visual cortex
  Area V1 and color contrast, 156–158
  Area V1 lesions in infant and adult monkeys, 279–292
  Areas V1, V5/MT and priming, 123–127
  Effects of area V1 damage in humans, 232–237
  Inferior temporal cortex, 95–100
  Superior temporal sulcus, 107–115
Visual epilepsies, 310–311
  Neural substrate of, 311–312
Visual extinction and obstacle avoidance, 220–222
Visual form agnosia, 81, 131–137
Visual Hallucinations
  and the organization of the visual system, 315–317
  classification of, 306
Visual neglect, 213–215
  Reaching and bisection, 216–220
Visual search, 120–122, 183–195