Visual Masking
OXFORD PSYCHOLOGY SERIES
Editors: Mark D’Esposito, Daniel Schacter, Jon Driver, Anne Treisman, Trevor Robbins, Lawrence Weiskrantz
1. The neuropsychology of anxiety: an enquiry into the functions of the septohippocampal system J. A. Gray
2. Elements of episodic memory E. Tulving
3. Conditioning and associative learning N. J. Mackintosh
4. Visual Masking: an integrative approach B. G. Breitmeyer
5. The musical mind: the cognitive psychology of music J. Sloboda
6. Elements of psychophysical theory J.-C. Falmagne
7. Animal intelligence Edited by L. Weiskrantz
8. Response times: their role in inferring elementary mental organization R. D. Luce
9. Mental representations: a dual coding approach A. Paivio
10. Memory, imprinting, and the brain G. Horn
11. Working memory A. Baddeley
12. Blindsight: a case study and implications L. Weiskrantz
13. Profile analysis D. M. Green
14. Spatial Vision R. L. DeValois and K. K. DeValois
15. The neural and behavioural organization of goal-directed movements M. Jeannerod
16. Visual pattern analyzers N. V. Graham
17. Cognitive foundations of musical pitch C. L. Krumhansl
18. Perceptual and associative learning G. Hall
19. Implicit learning and tacit knowledge A. S. Reber
20. Neuromotor mechanisms in human communication D. Kimura
21. The frontal lobes and voluntary action R. E. Passingham
22. Classification and cognition W. Estes
23. Vowel perception and production B. S. Rosner and J. B. Pickering
24. Visual Stress A. Wilkins
25. Electrophysiology of mind Edited by M. Rugg and M. Coles
26. Attention and memory: an integrated framework N. Cowan
27. The visual brain in action A. D. Milner and M. A. Goodale
28. Perceptual consequences of cochlear damage B. C. J. Moore
29. Binocular vision and stereopsis I. P. Howard
30. The measurement of sensation D. Laming
31. Conditioned taste aversion J. Bures, F. Bermúdez-Rattoni, and T. Yamamoto
32. The developing visual brain J. Atkinson
33. Neuropsychology of anxiety, second edition J. A. Gray and N. McNaughton
34. Looking down on human intelligence: from psychometrics to the brain I. J. Deary
35. From conditioning to conscious recollection: memory systems of the brain H. Eichenbaum and N. J. Cohen
36. Understanding figurative language: from metaphors to idioms S. Glucksberg
37. Active Vision J. M. Findlay and I. D. Gilchrist
38. The science of false memory C. J. Brainerd and V. F. Reyna
39. Seeing Black and White A. Gilchrist
40. The case for mental imagery S. Kosslyn
41. Visual masking: time slices through conscious and unconscious vision B. G. Breitmeyer and H. Öğmen
Visual Masking
Time slices through conscious and unconscious vision
SECOND EDITION
BRUNO G. BREITMEYER and HALUK ÖĞMEN
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto
With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
Published in the United States by Oxford University Press Inc., New York
© Oxford University Press, 2006
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First published 1984
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Breitmeyer, Bruno G. Visual masking : time slices through conscious and unconscious vision / Bruno G. Breitmeyer and Haluk Öğmen. -- 2nd ed. p. ; cm. -- (Oxford psychology series ; 41) Includes bibliographical references and indexes. 1. Visual perception. I. Öğmen, Haluk. II. Title. III. Series. [DNLM: 1. Visual Perception. WW 105 B835v 2006] BF241.B73 2006 152.14--dc22 2005031831
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Biddles Ltd., King’s Lynn
ISBN 0–19–853067–6 (Hbk.) 978–0–19–853067–1 (Hbk.)
10 9 8 7 6 5 4 3 2 1
Preface
The first Oxford University Press edition of Visual Masking, written by one of us, appeared in 1984. Its intent was to present a scholarly review of empirical research and theoretical developments in visual masking up to that time as well as a description of the by now well known dual-channel transient-sustained approach to masking and information processing. Despite some of its regrettable flaws, the book seems to have enjoyed a noticeable and gratefully acknowledged impact on research in visual perception and cognition over the past two decades. However, over the same two decades the study of visual masking has been marked by several new developments. In the year of the publication of the first edition, a new theoretical approach to the study of visual masking and related spatio-temporal phenomena, known as perceptual retouch, was published by Talis Bachmann, followed by many additional publications, including two book-length monographs (Bachmann 1994, 2000), on the relationship of perceptual retouch to the microgenesis of perception. Except for this development, however, there generally seems to have been a lull in the study of visual masking until the last decade or so, when additional empirical, methodological, theoretical, and technological innovations occasioned a renewed and vigorous interest in visual masking. In part, new empirical findings on the effects of attention and perceptual grouping on visual masking (Ramachandran and Cobb 1995; Shelley-Tremblay and Mack 1999) have extended the scope of interest, as has a spate of published findings, relevant to the understanding of unconscious visual processing, which have emanated from Odmar Neumann’s laboratory in Bielefeld, Germany, over the past decade and a half. In addition to these human psychophysical results, informative findings regarding the neural correlates of masking have been reported in several recent neurophysiological investigations of monkey visual cortex (see Chapter 3).
Methodological innovations known as ‘common-onset’ and ‘four-dot’ masking (Di Lollo et al. 1993; Enns and Di Lollo 1997) have also generated new findings and a new theoretical approach to masking,
known as object-substitution, that emphasized the roles of attention and re-entrant activation in visual processing (Di Lollo et al. 2000; Enns and Di Lollo 2000; Enns 2004). Concurrently, new neural-network models of masking were published. One model, developed by Francis (1997), explored the emergent temporal dynamics of the FACADE model proposed by Grossberg and colleagues (Grossberg and Mingolla 1985a,b; Grossberg 1994). The other model, developed by one of us (Öğmen 1993), explored the retinocortical dynamics of the dual-channel approach originally proposed almost three decades ago by Breitmeyer and Ganz (1976). This was a particularly fortunate event for the other one of us, who, although mathematically trained, failed to express that approach in quantitative neural-network terms. Our collaboration, a particularly fruitful and enjoyable one over the past 10 years, has itself added to the new interest in visual masking (Breitmeyer and Öğmen 2000). In addition to these empirical and conceptual innovations, three other developments, each exploiting visual masking as a methodological tool, conspired to reinvigorate interest in visual masking. One consists of the many recent explorations of various kinds of unconscious visual processing under the methodological rubric of masked priming (Kinoshita and Lupker 2003; see also Chapter 8). The second is an equally wide use of masking by neuroscientists in recent brain-mapping studies of the neural correlates of conscious and unconscious vision. The third is its use in the study of visual information processing in special subject populations (see Chapter 9). For all of these reasons, the current edition of Visual Masking is not merely a revision of the first edition, but is basically a new monograph on visual masking that could easily have been given another title such as the subtitle, Time Slices through Conscious and Unconscious Vision.
In the current edition, we have retained Chapter 1 of the first edition, covering the history of masking, with only minor changes. The topics of visual masking by light and of visual integration and persistence, covered in Chapters 2 and 3, respectively, of the first edition, are not covered here. The reason for this omission is that masking by light occurs mainly at peripheral (most likely) retinal levels of visual processing, and therefore sheds little ‘light’ on the study of central cortical object-forming processes. Moreover, interest in visual persistence and the cognate topics of iconic memory and visual short-term store has waned appreciably since the 1980s (Haber 1985), although visual
persistence has retained some currency in relation to motion perception (see Chapter 6).
The coverage of topics in the present edition is given below. In Chapter 2 we review methods, findings, and applications of visual masking. Some material in this chapter is also found in Chapter 4 of the first edition, but many other reviewed results and applications of masking have been reported only in the last two decades. Chapter 3 deals with the neurophysiological correlates of visual masking and information processing. The chapter includes neurophysiological and neuroanatomical findings obtained from the study of (mainly) the monkey visual system as well as electrophysiological and brain-imaging findings obtained from human observers. We regret that discussion of some of the more interesting brain-imaging results known to us has not been included because they have not yet appeared in published form. Chapter 4 reviews past and recent models of visual masking, some of which were also covered in Chapter 5 of the first edition. Despite the consignment, in the fast-paced world of visual neuroscience, of developments older than 5 years to ‘history’, we have decided to include discussion of the older models in Chapter 4 rather than in Chapter 1. Chapter 5 outlines in detail our own preferred model, which is an elaboration of the neural-network model proposed by one of us (Öğmen 1993). This model is dynamic not only because of its substantive content, retinocortical dynamics, but also in its development. Although open to fine tuning based on new findings, it nonetheless draws a conceptual line based on (i) mutual inhibitory interaction between sustained and transient channels and (ii) interactions between feedforward and feedback neural activations. Chapter 6 covers the relationship between motion perception and visual masking.
While such relationships have been noted before, here they take on special significance because our model outlined in Chapter 5 was originally developed to account for several visual-motion phenomena, in particular motion blur, or more specifically its relative absence, during perceived object motion. With regard to spatiotemporal visual phenomena, this renders an empirical scope to our model which is not limited to visual masking. Chapter 7 covers the effects of figural (gestalt) context and of selective visual attention on masking. With regard to figural contexts we review two types of findings, those obtained when contexts are provided by the mask configurations and
those provided by target configurations; and for selective visual attention we review findings relevant to space-based as well as object- or feature-based attention. Moreover, we assess the role of attention as a mere ancillary vs. an essential constitutive process in theoretical accounts of visual masking. Chapter 8 reviews recent findings from studies using visual masking, particularly metacontrast and backward pattern masking, to explore the types and levels of unconscious visual information processing. This chapter is particularly relevant to those cognitive scientists, philosophers, and neuroscientists interested in the neural correlates of conscious and unconscious vision. In Chapter 9 we cover the use of visual masking, again mainly metacontrast and backward pattern masking, in the study of special subject populations, including those with neurological, ophthalmological, psychiatric, and reading deficits. These studies are important not only for their applied significance but also for their theoretical implications. We conclude with an epilogue that places the study of visual masking in a wider cross-discipline context.
Our preface would be incomplete without acknowledging the help and patience accorded to us by the following individuals. During a recent meeting of the Psychonomics Society, one of us initially proposed writing the current revised edition of Visual Masking to Catharine Carlin, the psychology editor of the US branch of Oxford University Press based in New York. Without doubt or hesitation she supported the proposal, eager to have it published by the US branch. However, since the first edition was published by the UK branch in Oxford, the working editorship was assigned to Martin Baum. We thank both Catharine and Martin for their enthusiastic support, and Martin and his editorial assistant, Carol Maxwell, for generously extending the deadline for submitting our manuscript by a little over 6 months.
Without that concession, the last 2 years of work could easily have triggered a ‘peritraumatic’ stress disorder in at least one of us; happily they comprised one of the most productively and creatively exciting periods of our professional careers. We also thank Alpay Koç for assisting in the preparation of the final version of the manuscript and especially the bibliographic references.
Houston, Texas
July 2005
B.G.B.
H.Ö.
Contents
Abbreviations
1 A history of visual masking
2 Methods, applications, and findings in visual pattern masking
3 Neurobiological correlates of visual pattern masking
4 Models and mechanisms of visual masking: a selective review and comparison
5 The sustained–transient channel approach to visual masking: an updated model
6 Metacontrast and motion perception
7 Figural context and attention in visual masking
8 Unconscious processing revealed by visual masking
9 Visual masking in selected subject populations
Epilogue
Appendix A
Appendix B
References
Index
Abbreviations
AB: attentional blink
BCS: boundary contour system
cff: critical flicker fusion
CVEP: cortical visually evoked potential
DPS: direct parameter specification
FCS: feature contour system
FEF: frontal eye field
FG: fusiform gyrus
fMRI: functional magnetic resonance imaging
GFS: generalized flash suppression
IB: inattentional blindness
ISI: interstimulus interval
IT: inferior temporal
LGN: lateral geniculate nucleus
MIB: motion-induced blindness
NSP: non-specific pathway
PCN: posterior contralateral negativity
PR: perceptual retouch
RD: response difference
RECOD: retino-cortical dynamics
ROC: receiver operating characteristic
RSVP: rapid serial visual presentation
SFS: specific flash suppression
SOA: stimulus onset asynchrony
SP: specific pathway
SRD: specific reading disability
STA: stimulus termination asynchrony
TMS: transcranial magnetic stimulation
VEP: visually evoked potential
Chapter 1
A history of visual masking
1.1. Introduction
Science can be viewed as a means of giving cognitive coherence to our observations or perceptions of the world, and the history of science can be viewed as the study of the evolving contents and modes of such cognitive structuring. Typically, a science progresses when it confronts puzzles, problems, or inconsistencies that require solutions or when fortuitous discoveries are made. The resulting advances are usually piecemeal and methodical. By reinforcing and adding to the edifice of observations structured around extant theory and method, these advances indicate the more or less continuous development of what Kuhn (1962) has called ‘normal science’. However, significant conceptual or empirical anomalies crop up occasionally and remain deeply and inextricably rooted within a particular scientific paradigm. Their deep-rooted intractability signals an intellectual turning point, which demands not only a restructuring of the manifest theoretical and methodological framework but also a radical shift of the oft-tacit presuppositions or metatheoretical foundations on which it is based. Relative to the normal-science time frame, such profound resolutions of scientific anomalies comprise what Kuhn (1962) has termed ‘paradigm-shifts’. What follows from this brief sketch of the scientific enterprise is that the history of a science consists not merely of a catalogue and chronicle of its theories, methods, and observations but also, and more importantly, a chronicle of the analysis and interpretation of their attendant problematic situations (Popper 1972) and fundamental presuppositions (Collingwood 1940). Situational analysis and interpretation of this sort is akin to what Hempel (1966) has termed ‘explication’ or ‘rational reconstruction’ and to Collingwood’s (1956) method of ‘reenactment of past thought’ (for further discussions and critiques of these historical methods, see Donagan (1966) and Skagestadt (1975)).
In addition to attempting to render an adequate situational analysis of the science of visual masking, the historical approach adopted here also highlights numerous significant similarities between past and present presuppositions, theories, methods, and observations. This tactic is not premised on an a priori notion of history as recurrent or repetitive but rather on an a posteriori analysis drawing on relevant prior and current sources. Nor is the approach meant to yield the impression that in regard to visual perception nothing new is under the sun. On the contrary, the study of visual perception in the last century, and in particular in the past five decades, has been marked by vast accumulations of new findings and by dramatic conceptual and technical developments (see, for instance, Palmer’s (1999) extensive treatise of recent developments in vision science). Nonetheless, despite many theoretical, methodological, and empirical advances, the claim that nothing as radical or profound as a ‘Copernican Revolution’ (Kuhn 1957a,b) has occurred up to now in the science of visual perception or, specifically, visual masking, seems hardly disputable. In fact, the current choice of highlighting, whenever possible, significant similarities between past and recent aspects of the study of visual masking was made to illustrate its piecemeal, continuous, and normal-science mode of development.
1.2. Background
Visual masking refers to the reduction of the visibility of one stimulus, called the target, by a spatiotemporally overlapping or contiguous second stimulus, called the mask. Historically, masking has played a leading role in the study of spatial and temporal properties of visual perception. In this role it has remained of great importance to the present and promises to continue as such in the future (Breitmeyer and Öğmen 2000).
To understand the historical roots of our present and future interest in visual masking, we review the scientific context within which masking developed not only as a methodological tool but also as a phenomenon deserving empirical and theoretical investigation per se. Initially, as now, the use and study of visual masking were grounded in attempts to delineate the temporal stages and parameters of the perceptual process. These attempts involved the conceptual parsing and the experimental measure of, among others, the following temporal stages: the time for a stimulus to reach focused awareness (Apperceptionzeit), perception time (Wahrnehmungszeit), perceptual
duration (Wahrnehmungsdauer), sensation time (Empfindungszeit), the rise and fall times of sensation, sensory persistence, retinal (transduction) latency, conduction velocity, cortical processing latency, and so on. Cattell (1885a, 1886) condensed these measures into a coarser parsing of four basic temporal parameters, each corresponding to one sensory–perceptual operation or stage.
1. The time that a stimulus must be present in order that a sensation be excited (threshold level).
2. The duration of a stimulus required to maximize sensory intensity (saturation level).
3. The time required for a stimulus to be changed into a nervous impulse (transduction latency).
4. The time taken up in the nerve and brain before the stimulus is seen (perceptual latency).
As Baade (1917b) pointed out, this ‘microtomization’ (Mikrotomierung) or temporal slicing of the perceptual process occurred in a context concerned with the complementary studies of (a) ‘pure’ sensations and (b) the initial stages in the microgenesis of the perceptual process. Perhaps it is best to quote Baade’s characterization of this context.
Out of the following consideration it will become clear what especially great significance the study of the initial stages of the perceptual process has for the investigation of sensations. The chemist seeks to produce pure chemical bonds; the physicist seeks to free the research-relevant processes from all superfluous and accidental side effects; the bacteriologist seeks to isolate his research objects in a ‘pure culture’ [or medium]. Should the psychologist not strive to observe, in an isolated state, a state of purity, or however one wishes to express it, those objects of psychology to which he grants the name and role of ‘element’, ‘primitive form’, or something similar? Should one search for the above-mentioned isolated sensations, one would naturally expect their presence only in the first stages of the perceptual process; for it is, so to say, palpably clear that nothing of the later stages will be as simple as the initial ones. (Baade 1917b, pp. 99–100; our translation)
Within this microgenetic context, some of the earliest research on the initial stages and temporal parameters of perceptual processes was conducted in the latter half of the nineteenth century (Baxt 1871; Cattell 1885a, 1886; Erdmann and Dodge 1898; Exner 1868; Tigerstedt and Bergqvist 1883) and continued well into the initial decades of the twentieth century (Fröhlich 1923; Monjé 1927). Although these investigations were criticized throughout on logical and methodological
grounds (Cattell 1885a, 1886; Erismann 1935; McDougall 1904b; Rubin 1929; Wundt 1899a,b, 1900), they left in their wake a host of interesting experimental techniques and empirical findings. Moreover, they raised persistent theoretical and methodological problems (Gibson 1979; Neisser 1976; Shaw and Bransford 1977; Turvey 1977). A thorough and extensive discussion of these problems is beyond the scope of this chapter. However, briefly, they revolved fundamentally around the following questions.
1. Is it desirable or possible to isolate perceptual elements (e.g. pure sensations)?
2. Correspondingly, is it desirable or possible to determine the primitive stages or mechanisms of the perceptual process in which the elements can be isolated?
3. As a methodological corollary, by employing brief and static (i.e. tachistoscopic) stimuli, is one not introducing into the perceptual process laboratory artefacts that bear little resemblance or relevance to more naturalistic extra-laboratory perception?
The recognition of the scientific importance attached to these questions (Neisser 1976; Turvey 1977) is a recurrent theme in the study of perception. To illustrate, let us quote verbatim the introductory paragraph of Ebbecke’s (1920) article entitled Über das Augenblicksehen (‘On momentary seeing’).
Typically our seeing process is one characterized by a roving view. As soon as one is prevented, through some unnatural way, from running one’s eye over the objects in the visual field and, so to speak, probing them, all sorts of disturbances intrude into visual sensation. Under [prolonged] rigid fixation, visibility begins to blur in that brightness and hue differences disappear and afterimages appear. Conversely, when the eye catches only a brief glimpse of a visual object, the visual impression is rendered inaccurate or altered [relative to free viewing conditions] (Ebbecke 1920, p. 13; our translation, emphasis added).
Despite this time-honored recognition of the problems attending attempts to delineate stages of the perceptual process and the corollary use of the tachistoscopic method, chronometric research on the perceptual process flourished then as it does now (see Öğmen and Breitmeyer 2005; Posner 1978; Shapiro 2001). With that in mind, we turn to that subclass of tachistoscopic techniques and observations relevant to the historical understanding of
visual masking. We focus primarily on the following topics:
(1) the types of masking termed metacontrast and paracontrast;
(2) the relation of stroboscopic motion to metacontrast;
(3) the type of masking termed masking by light;
(4) the existence of response persistence and temporal integration in vision;
(5) the role of central, cognitive (non-sensory) processes involved in visual masking.
This choice of topics was not arbitrary; rather, it reflects the main thrust of past as well as present research in visual masking and cognate areas.
1.3. Metacontrast and paracontrast
By metacontrast we mean the reduction in the visibility of one briefly presented stimulus, the target, by a spatially adjacent and temporally succeeding briefly presented second stimulus, the mask. Therefore metacontrast is a form of backward masking in so far as the masking stimulus exerts a retroactive effect on the visibility of the target stimulus. By exchanging the temporal order of the above two sequential stimuli, i.e. by designating the first stimulus as the mask and the second as the target to be masked, the conditions for paracontrast, a type of forward masking, are met. The coining of the terms metacontrast and paracontrast and the first extensive investigation of these two masking effects are credited to Stigler (1910, 1926). However, as noted in Alpern’s (1952) historical review of metacontrast, evidence for these effects predated Stigler’s (1910) work by a decade or two. Moreover, the use and importance of metacontrast as a methodological tool in the study of the time course and elementary processes of visual perception, although explicitly acknowledged by Stigler (1926) and Piéron (1935) in the early twentieth century, was already apparent several decades earlier. According to Stigler (1908, 1910), Exner (1868) was the first to employ the experimental technique that eventually developed to become known as metacontrast and paracontrast masking.
Exner used the subjective comparison of two spatially bordering stimuli, one of which was flashed slightly before the other, in order to investigate the time course of light sensations produced by a brief tachistoscopic light
stimulus. Without going into the details of Exner’s rationale and method, we shall briefly analyze the following assumptions that he made to justify his method. Exner assumed that two objectively equal and briefly flashed stimuli, presented sequentially in immediately adjacent retinal areas, elicit two equal sensations. What was implied is that the two sensory effects produced by the brief stimuli are independent and do not interact spatially. Without this constancy assumption, whether true or not, Exner could not have employed the sensation produced by the second stimulus to monitor the time course of sensation produced by the spatially bordering first stimulus. This working hypothesis, later also adopted in related investigations by Kunkel (1874) and Petrén (1893), may have been premised, additionally, on the then prevailing notion of ‘local signs’ introduced by Lotze (1852) in his Medicinische Psychologie (Medical Psychology). According to Lotze, local signs resulted from stimulation of spatially delimited ‘sensory circles’, each of which was connected via a separate independently acting nerve fiber to its appropriate cortical area. Lotze’s notion of local signs was adopted and adapted by many of his contemporaries, particularly by the noted nativist Hering and two of the most influential empiricists, Wundt and Helmholtz. Furthermore, to produce independent local sensory activity, Exner (as well as Kunkel and Petrén) assumed that the use of very briefly flashed stimuli effectively eliminated reciprocal spatial contrast mechanisms. This assumption was refuted by Stigler’s (1910, 1913, 1926) subsequent work on masking. Consequently, it should be clear why the former investigators, who regarded their work primarily as a way of determining the time course of isolatable visual sensations, failed to consider the significance of their investigations in terms of spatial contrast phenomena. 
Moreover, the use of brief flashes, presumably to circumvent spatial contrast phenomena, implies that these investigators were aware of such contrast effects. In fact, Exner (1868, p. 615) makes reference to the existence of edge or border contrast effects. As early as 1834, Müller, in his Handbuch der Physiologie des Menschen (Handbook of Human Physiology) proposed that spatial contrast phenomena were based on mutual and reciprocal action between separate retinal areas, and several of Exner’s contemporaries (Hering 1872, 1878; Hermann 1870; Mach 1865, 1866a,b, 1868) had already published some major works on spatial contrast and the reciprocal dependence and interactions between adjacent retinal areas
of stimulation. Therefore it seems more credible that Exner, Kunkel, and Petrén, rather than having been entirely ignorant of prior and contemporaneous work on spatial contrast, simply did not consider it as a significant factor in their studies. Be that as it may, it was not until Sherrington’s (1897) work on reciprocal action in the retina and McDougall’s (1904a,b) investigations of the sensory intensity of brief visual stimuli that spatiotemporal contrast phenomena were given due notice in the study of the time course of visual sensations. In this regard, Sherrington (1897, p. 33) stated that:
the physiological result of applications of a stimulus to any point of a sensifacient surface is decided by not only the particular stimulus there and then incident but also by circumjacent and immediately antecedent retinal events in determining the final physiological or sensory effect produced by a given retinal point of stimulation.
Sherrington’s (1897) definition of ‘simultaneous contrast’ and ‘successive contrast’ as reciprocal sensory relations or interactions across an interval of space and time, respectively, also implied the existence of and distinction between metacontrast (temporally backward masking) and paracontrast (temporally forward masking), although his study failed to draw this distinction either conceptually or experimentally. Nonetheless, one can infer from Sherrington’s (1897) and McDougall’s (1904a) investigations (both published prior to Stigler’s work) that both investigators were aware of metacontrast and paracontrast effects, although neither named or isolated these effects as such. It was left for Stigler (1908, 1910, 1913, 1926) to do so, and for subsequent investigators to rediscover, reconceptualize, or elaborate on them (Alpern 1953; Baroncz 1911; Baumgardt and Segal 1942; Fry 1934, 1935; Fry and Bartley 1936; Piéron 1935). Some of the major findings, theories, and techniques are reviewed in the next section.
1.3.1. Principal investigations: findings and theories
Sherrington’s (1897) investigation was perhaps the first to indicate the existence of a metacontrast effect. One of the stimuli that Sherrington employed was a disk (Fig. 1.1) which could be rotated, at varying angular speeds, in either a clockwise or a counterclockwise direction. When the disk spins in the clockwise direction, the single white arc bracketed by the two black arcs shown at the bottom of the figure is followed in time by the two spatially adjacent double white arcs shown at the top of the figure. The percept is one of three bands that appear to flicker at rates
Fig. 1.1 One of the rotating discs used by Sherrington (1897) in his investigations of reciprocal, spatiotemporal interactions in human vision. (Reproduced from Sherrington 1897.)
dependent on the speed of the rotation. Sherrington found that as the disk spun at increasing speeds in the clockwise direction, flicker persisted longest in the middle band corresponding to the single white arc. Conversely, when rotation reversed, flicker persisted longest in the bands produced by the double white arcs. This result can be summarized as follows: when brief illumination of a given retinal area is followed rapidly by brief illumination of a spatially adjacent area, the critical flicker fusion (cff) in the former area is increased relative to that of the latter. Assuming that the interactions are reciprocal in time, one can say alternatively that the cff in the latter area decreased relative to that in the former area. Under similar experimental arrangements (see Fig. 1.3(a)), Piéron (1935) reported essentially the same result. These results are somewhat puzzling in the context of meta- and paracontrast. The fact that stimulation of a given area can decrease cff in a subsequently illuminated and spatially adjacent area is easily reconcilable with a forward paracontrast masking effect. However, if metacontrast is viewed as a backward suppressive effect, why is it that cff is increased in the first of two illuminated spatially adjacent areas? To answer this, we must draw a distinction between flicker and brightness sensations; the two types of sensation may rely on different sensory mechanisms. In fact, what Piéron reported (and Sherrington failed to report) is that at intermediate rates of rotation the brightness visibility of the leading stimulus could be suppressed entirely by the succeeding spatially adjacent stimulus. However, as rotation rate decreased or increased, or alternatively as the stimulus onset asynchrony (SOA)
between adjacent retinal areas increased or decreased, the brightness of the leading stimulus became progressively greater, i.e. the suppressive effect that the lagging stimulus had on the leading one decreased. Hence, by taking the perceived brightness of the band produced by the leading arc as a sensory response index, Piéron obtained a nonmonotonic metacontrast suppression as a function of SOA. And, like Sherrington, he reported a complementary facilitatory effect on cff. The empirical and theoretical work discussed in Chapters 4 and 5 clarifies why flicker sensitivity does not serve as an index of metacontrast suppression but rather as an index of complementary metacontrast facilitation. It will also become clear why decreases of flicker and brightness visibility indicate the existence of two separate mechanisms of paracontrast suppression. To anticipate in summary fashion, we shall make the following two claims.
1. The fast ‘flicker-detectors’, activated by the spatially adjacent temporally lagging stimulus, suppress, and are also reciprocally suppressed by, the activity of the slow pattern ‘brightness (or contrast) detectors’ generated by the leading stimulus.
2. Consequently, this in turn results in (a) an inhibition of the flicker detectors responding to the lagging stimulus and (b) an inhibition of the contrast detectors, and thus a simultaneous disinhibition of the flicker detectors, responding to the leading stimulus.
Given that we index metacontrast (or paracontrast) by studying variations in perceived brightness or contrast, let us now turn to McDougall’s (1904a) investigation which, to our knowledge, yielded the first clearly interpretable metacontrast effect. McDougall also used an experimental apparatus similar to that employed by Sherrington. Specifically, he used an opaque disk with two small apertures (Fig. 1.2(a)), which was rotated in front of a luminous background and visually fixated at its center.
Both apertures were continuously trans-illuminated when the disk rotated. However, when aperture b was covered and the disk was rotated at fairly slow speeds (1 rev/s), McDougall reported seeing a traveling pattern of alternate dark and light bands, known as Charpentier’s bands, produced by aperture a. These bands (Fig. 1.2(b)) were in turn trailed by a grayish band known as Bidwell’s ghost or the Purkinje image (Brown 1965). The former bands are related to the frequently reported temporal oscillations in the visibility of positive after-images produced by flashed stationary
Fig. 1.2 (a) McDougall’s (1904a) stimulus apparatus, consisting of an opaque rotating disk with two apertures (a and b) transilluminated by a light source behind the disk. (b) The appearance of the leading Charpentier bands and the trailing Bidwell’s ghost or Purkinje image produced, for instance, by aperture a while b is masked when the disk is rotated clockwise. (Reproduced from McDougall 1904a.)
stimuli (Alpern and Barr 1962; Aubert 1865; Corwin et al. 1976; Fechner 1840a,b; Fröhlich 1921, 1922a,b, 1929; Helmholtz 1866; Müller 1834; Plateau 1834; Purkinje 1819) and have recently been related to oscillatory metacontrast functions (Purushothaman et al. 2000; 2003) and motion blur (Chen et al. 1995; Purushothaman et al. 1998). The latter grayish band is related to a more prolonged dull and ill-defined phase of the after-image. McDougall hypothesized that Bidwell’s ghost represented the trailing end of the primary sensation (Primärempfindung) produced by the leading aperture a. Moreover, he reasoned that by again uncovering the temporally trailing aperture b, the initial stronger portion of the primary sensation produced by this aperture could suppress the weaker trailing end of the sensation, i.e. Bidwell’s ghost, produced by the leading aperture a. As predicted, when aperture b was uncovered, the trailing Bidwell’s ghost of aperture a disappeared.
Note that McDougall’s explanation of the ‘backward’ suppression of Bidwell’s ghost rests on the following implicit assumptions: (i) although two physical stimuli are spatially and temporally segregated, their sensory responses interact not only via reciprocal lateral interaction but also via their temporal overlap, thus in effect simulating simultaneous brightness contrast at the physiological level; (ii) as a corollary, the sensory response of a brief stimulus must persist, in some form or other, beyond the momentary duration of the stimulus. These assumptions were made explicit and elaborated in subsequent investigations of meta- and paracontrast conducted by Stigler (1910, 1913, 1926). Based on Exner’s (1868) notion of ‘positive after-image’, Stigler (1910) drew the following distinction between the initial and trailing portions of the primary sensation produced by a flashed stimulus. The (initial) part of the primary sensation (Primärempfindung), which was produced by the presence of the stimulus, was designated the homophotic image, and the (trailing) part, which outlasts the stimulus, was designated the metaphotic image and corresponded to the Exnerian positive after-image. As stated by Stigler, the metaphotic image was further distinguishable in that it, relative to the homophotic image, is particularly susceptible to masking by spatially adjacent stimuli. Although this masking was believed to tap the same mechanisms as simultaneous or homophotic contrast, Stigler named it metaphotic contrast or metacontrast in order to highlight the fact that the metaphotic image of a temporally leading stimulus was affected by (the homophotic image of) a succeeding stimulus. As stated so far, this explanation, based on (i) a form of visual persistence and (ii) a mechanism related to simultaneous brightness contrast, seems reasonably and simply stated in terms of temporal integration of interactive sensory responses. 
However, in 1926 Stigler complicated this theoretical explanation somewhat by explicitly acknowledging another possible mechanism which we can alternatively term the overtake or delay hypothesis. He states that metacontrast shows that it is possible for an excitation to be overtaken on its way from the retina to the central organs and masked via a contrast effect [produced] by a succeeding [spatially] neighboring stimulus.
Metacontrast shows further that the visual excitation is measurably delayed at one and probably several sites along its way from the periphery to the center [i.e. the brain]. (Stigler 1926, pp. 950–1; our translation, emphasis added)
Thus Stigler inadvertently or, more likely, deliberately raised two hypothetical explanations of metacontrast: one could conveniently be called the integration–inhibition hypothesis, and the other the overtake–inhibition hypothesis. The most likely reason for Stigler’s introduction of the latter hypothesis is his finding of dichoptic metacontrast in 1926, following his 1910 failure to find such metacontrast (see below). Based on his 1910 results, Stigler proposed, following Exner’s (1898) similar proposal (see note 1), that horizontal cells in the retina mediate the metacontrast effects between adjacent retinal areas. Consequently, for it to be suppressed, the metaphotic image (the Exnerian positive after-image) would also have to be localized at retinal, in particular at receptor, levels of activity, a hypothesis which antedates a more recent explanation of visual persistence proposed by Sakitt (1975, 1976). However, Stigler’s (1926) dichoptic metacontrast results could no longer be explained by the scheme outlined above. As a result, the integration–inhibition hypothesis could be saved only by a separate cortical source of visual persistence, or else one that depends on temporal properties of sensory activity transferred from more peripheral levels to the brain. Except for Monjé’s (1931) somewhat later work (see section 1.5), not many experimental findings prior to or contemporaneous with those of Stigler bore clearly on the issue of the locus of visual persistence. However, that the receptor signal must traverse one or more ganglia before it reaches the central visual areas of the brain was already an established anatomical fact (Minkowski, 1913, 1920a,b). Most likely Stigler (1926), having been aware of this fact, opted for the more adequately supported overtake–inhibition hypothesis. 
What was furthermore implied by this hypothesis and the dichoptic metacontrast results is that the brightness suppression of the leading stimulus by the following one must depend on central sources of inhibition. Stigler’s two hypotheses, in some form or another, have been adopted in subsequent theories of metacontrast and masking, and more recently have been expressed in abbreviated terms as the integration and interruption hypotheses, respectively (Kahneman 1968; Scheerer 1973; Sperling 1963). Within the framework of these theoretical devices, Stigler (1910, 1913, 1926) used two spatially adjacent light-on-dark semicircles or rectangles as target and mask, with the target temporally preceding the mask. Extending over roughly two decades, Stigler’s research supported
the following main conclusions, most of which have been repeatedly corroborated in later and more recent investigations.
1. Metacontrast can be obtained binocularly as well as monocularly (Stigler 1910).
2. Although Stigler (1910) initially denied it, he later found that metacontrast could also be obtained dichoptically (Stigler 1926; see also Kolers and Rosner 1960; May et al. 1980; Schiller and Smith 1968; Weisstein 1971; Werner 1940).
3. Metacontrast is not only a dichoptic effect, it is also an interhemispheric (transcallosal) effect (Stigler 1926; see also McFadden and Gummerman 1973).
4. ‘Metacontrast is exclusively dependent on the luminance of the two [adjacent] contrast fields and [is] essentially unaffected by color. It [i.e. metacontrast] is as evident when one semicircle is red, and the other green as when both are white’ (Stigler 1926, p. 967). However, see Chapter 2, section 2.6.8, for discussions of both corroborative and disconfirming results reported in more recent studies.
5. Metacontrast depends on the temporal onset asynchronies of the two adjacent stimuli, or as Stigler (1910, p. 418) stated: ‘should the time difference between the onsets of the two luminous stimuli be increased beyond the temporal-resolution threshold . . . a difference value is finally attained . . . at which Field I [the leading stimulus or target] appears darker than Field II [the succeeding stimulus or mask]’. A reworked version of this ‘SOA law’ has more recently been stated by Kahneman (1967) and Di Lollo et al. (2004) for metacontrast and by Turvey (1973) for central pattern masking.
6. In foveal view, which was employed by Stigler (1910), metacontrast was eliminated or progressively attenuated when an increasingly wide dark vertical strip separated the two successive and neighboring stimuli, a result also reported more recently by Kolers and Rosner (1960) and Bridgeman and Leff (1979).
7. Metacontrast depends on the direction of gaze (Stigler 1910).
When the border between the test and mask field was viewed parafoveally rather than foveally or directly, metacontrast seemed to be stronger and more immune to spatial separation between the two neighboring stimuli (Alpern 1953; Kolers and Rosner 1960; Merikle 1980; Saunders 1977; Stewart and Purcell 1970, 1974).
8. The first stimulus can reduce the apparent brightness of the succeeding one. This latter effect is known as paracontrast (Stigler 1926; see also Alpern 1953; Kolers and Rosner 1960; Weisstein 1972).
Before discussing later relevant studies of metacontrast and masking, let us examine one more of Stigler’s (1910) findings that in hindsight seems to be of relevance to his theoretical explanations of metacontrast. Stigler found that metacontrast obtains not only when the second stimulus is equal to or greater in intensity than the first, but also when the second stimulus is substantially weaker than the first (Stigler 1910, p. 394, Experiment 9, and p. 399, Experiment 45). This raises the following set of problems. Stigler (1910), among others (e.g. Cattell 1885b), was aware that a relative increase in stimulus intensity decreased the response latency and persistence of a visual sensation. This poses a puzzle for both the integration–inhibition and the overtake–inhibition hypotheses. With a shorter latency and persistence of the first stimulus relative to the second, how is it that, according to the former hypothesis, the slow and weaker sensory effects of the second stimulus integrate temporally with and inhibit the briefly persisting and stronger ones of the first? According to the latter hypothesis, how is it that the weaker sensation produced by the second stimulus overtakes and inhibits the faster stronger one produced by the first? This theoretical puzzle, also inherent in a later related theory (Ganz 1975) (see Chapter 4), was not apparent to Stigler, his contemporaries, or his immediate successors, and as such did not attain central status in theories of metacontrast until much later (Alpern 1953). Of additional particular interest to the present historical review of theories and findings are the studies reported by Fry (1934), Werner (1935), Piéron (1935) and Alpern (1953).
Fry (1934), independently of Stigler’s work, rediscovered the metacontrast effect, and his theoretical explanation is basically a composite of Stigler’s two hypotheses. To explain the temporally backward influence in metacontrast, Fry stated: what seems to happen is that the response of the retina to the first stimulus is considerably delayed and prolonged and overlaps in time the response to the second [spatially adjacent] stimulus and is inhibited by it by some kind of interaction between retino-cortical pathways at synapses either at the retina, or at the basal ganglia, or at the cortex. (Fry 1934, p. 706, emphasis added)
Within this theoretical framework, Fry (1934) extended Stigler’s basic findings by obtaining the following important quantitative results.
Similar to Stigler’s (1910) results, Fry (1934) found that parafoveal metacontrast could be obtained when a spatial (dark) gap of up to 1.25° was introduced between the target and the mask stimuli. Moreover, Fry found that the strength of metacontrast decreased with the size of the gap, indicating that metacontrast interactions are, relatively speaking, spatially local. Piéron (1935) added the following technique to the basic metacontrast motif. As shown in Figure 1.3, he employed not only the, by now, standard metacontrast paradigm (Fig. 1.3(a)) but also an interesting variation in which several spatially staggered and adjacent stimuli could sequentially mask each other (Fig. 1.3(b)). Figure 1.3(b) shows that, as the disk rotates clockwise, stimulus d is followed in time by adjacent stimulus c, which in turn is followed by adjacent stimulus b, and so on. Piéron found that at a given optimal rate of clockwise rotation, stimulus c suppressed the visibility of stimulus d, stimulus b in turn suppressed stimulus c, and finally stimulus a suppressed stimulus b. In effect, only stimulus a remained visible. What, in the context of the above theoretical explanations, is perhaps surprising and puzzling is not only the staggered masking of stimuli d, c, and b but also the fact that stimulus b’s inhibition of the visibility of stimulus c (or stimulus a’s inhibition of the visibility of stimulus b) in turn failed to disinhibit the visibility of stimulus d (or of stimulus c). That is to say, if the sensory processes elicited by stimulus b inhibit the sensory processes which
Fig. 1.3 (a) One of Piéron’s (1935) modifications of McDougall’s (1904a) stimulus apparatus. With Piéron’s apparatus, stimulation from aperture a was followed by adjacent stimulation from two adjacent apertures, b and c, when the disk was rotated clockwise. (b) A second of Piéron’s modifications showing the spatially staggered series of apertures giving rise to a sequential blanking or masking of stimulation arising from apertures d, c, and b when the disk is rotated clockwise. (After Piéron 1935.)
otherwise would have been elicited by stimulus c, would one not expect that the sensory processes elicited by stimulus d in turn not be inhibited since this inhibition relies on the presence of stimulus c’s sensory processes? As shown in Chapters 5 and 8, the resolution of this puzzle depends on a theoretical reconceptualization of metacontrast, relying on two separate types of neural processing. In his extensive studies of metacontrast, Alpern (1953) confirmed most of these prior findings. Of notable exception, however, was his failure to obtain metacontrast (i) when the stimuli fell within the foveal region, (ii) when they were presented dichoptically, or (iii) when the stimulus intensities of the mask fell below photopic threshold levels. The failure to obtain foveal and dichoptic metacontrast is somewhat surprising in view of Stigler’s (1910) prior report of foveal metacontrast (Stigler’s stimulus display subtended the central 1.5° of vision), and the prior reports of dichoptic metacontrast by Stigler (1926) and Werner (1940). Be that as it may, Alpern (1953) (see also Alpern 1965) inferred from his findings that metacontrast is a retinal phenomenon, most likely due to inhibitory interactions between fast cone and slow rod processes.1 To our knowledge, Alpern (1953) was the first investigator of metacontrast to publish the by now typical U-shaped or type B (Kolers 1962) metacontrast function, although such a function also could have been inferred from Piéron’s (1935) and Werner’s (1935) less quantitative and more phenomenally oriented studies of metacontrast. That is, Alpern’s quantitative experimental methods demonstrated clearly that the optimal suppression exerted by a circumjacent mask on the brightness of a target occurred not when both stimuli were presented simultaneously, but rather when the mask lagged the target by about 50 ms, despite stimuli of equal energy and thus of equal sensory latency and persistence.
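The difference between a U-shaped (type B) function and a monotonic (type A, in Kolers's 1962 terminology) function can be illustrated with a minimal sketch. The function name, the trough-location rule, and all numerical values below are assumptions introduced purely for illustration; they are not data from Alpern (1953) or any other study, and the rule checks only where target visibility is lowest rather than testing strict monotonicity.

```python
from typing import Sequence


def classify_masking_function(soas_ms: Sequence[float],
                              visibility: Sequence[float]) -> str:
    """Label a target-visibility curve by where its minimum falls.

    Hypothetical helper: 'type A' if visibility is lowest at the
    shortest SOA, 'type B' if it is lowest at an intermediate SOA,
    and 'unclassified' if it is lowest at the longest SOA.
    """
    if len(soas_ms) != len(visibility) or len(soas_ms) < 3:
        raise ValueError("need at least three paired (SOA, visibility) samples")
    # Order the samples by SOA so that index 0 is the shortest asynchrony.
    ordered = sorted(zip(soas_ms, visibility))
    vis = [v for _, v in ordered]
    trough = vis.index(min(vis))
    if trough == 0:
        return "type A"
    if trough < len(vis) - 1:
        return "type B"
    return "unclassified"


# Illustrative values only (arbitrary visibility units, SOA in ms).
soas = [0, 25, 50, 75, 100, 150]
monotonic_curve = [0.10, 0.30, 0.50, 0.70, 0.85, 0.95]  # masking strongest at SOA 0
u_shaped_curve = [0.90, 0.50, 0.20, 0.40, 0.70, 0.90]   # masking strongest near 50 ms

print(classify_masking_function(soas, monotonic_curve))  # type A
print(classify_masking_function(soas, u_shaped_curve))   # type B
```

In the second made-up curve the trough sits at a 50 ms SOA, mirroring the qualitative shape described above, in which suppression is greatest when the mask lags the target rather than when the two are simultaneous.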
Moreover, Alpern (1953) was also the first investigator to note the theoretical significance of obtaining such type B functions even when the mask energy was substantially lower than the target energy. This reintroduced the puzzle already raised in connection with prior discussions of some of Stigler’s (1910) results. Given a response latency to the second stimulus equal to or longer than that to the first, how is it, as Stigler (1910, 1926) and later investigators believed, that excitation produced by the second overtakes and suppresses that of the first? Perhaps it was the inconsistency of the prior explanations, based implicitly on response persistences or response latencies within the
same types of sensory processes, which led Alpern to posit an alternative explanation based on differential temporal response properties of cone and rod processes. As such, Alpern’s (1953) explanation, although only partially correct, was in spirit the first to highlight what was subsequently called the dual-channel approach to metacontrast (Breitmeyer and Ganz 1976; Matin 1975; Weisstein et al. 1975). Half a century before Alpern’s (1953) work, McDougall (1904a) had already implicitly invoked the longer response latency of rod, relative to cone, processes in explanations of his lateral and backward masking effects. McDougall hypothesized that many, although not all, of the trailing Bidwell’s ghost phenomena were due to rod processes, which have a longer response latency than the cone processes responsible for the earlier (Charpentier) bands produced by a rotating illuminated window (Fig. 1.2(b)). Further implied by McDougall’s (1904a) investigation and made explicit by Alpern (1953) is the belief that cone mechanisms can inhibit rod mechanisms (and vice versa). In fact the issue of rod–cone interactions was brought into the limelight by Alpern’s (1953) study and, as we shall see, continued to attract appreciable attention (see Chapter 2, section 2.6.8).
1.3.2. Related phenomena: stroboscopic motion, feature inheritance, and standing-wave illusion
The existence of stroboscopic motion was demonstrated experimentally by Exner (1875, 1888) several decades before the publication of Wertheimer’s (1912) phenomenon-rich and epoch-making paper ‘Experimentelle Studien über das Sehen von Bewegung’ (‘Experimental studies on seeing motion’). Although Wertheimer’s study primarily addressed the sensory processes underlying (stroboscopic) motion perception, it is also relevant to the study of masking and metacontrast. In fact, Wertheimer was aware of the potential relation between masking and stroboscopic motion. For instance, he reported, as Schulz (1908) had done earlier, the suppressed visibility of one, usually the first, of the two stimuli used in stroboscopic motion. This was particularly true for the type of motion called phi. Referring to the apparent rotation of (a) a flashed vertical toward (b) a subsequently flashed horizontal line, Wertheimer noted that: in the extreme sense, the subject had no inkling that the vertical [first line] was actually exposed . . . In the course of the experiments several cases resulted in which
one of the two exposed objects plainly was not seen, nor could it be imagined; and the subject judged that only one was exposed; in regard to the other, perceived one, phi motion was clearly apparent either coming from the first (a) or alternately approaching the second (b) locus [or stimulus] (Wertheimer 1912, p. 217, our translation).2
Moreover, Wertheimer was aware of prior work reporting masking phenomena, and he cites Schumann’s description of an apparent explosion (Explodieren) or expansion of a stimulus at one location when followed by a masking stimulus. Wertheimer did not emphasize this aspect of his study, and therefore its importance and relation to metacontrast and masking does not seem to have been clear to him. What is important, however, is that, like Exner (1875) and more recently Anstis and Moulden (1970), Wertheimer demonstrated that stroboscopic motion could be obtained dichoptically. This, as noted by Wertheimer, implicated the involvement of central or cortical mechanisms in stroboscopic motion perception. Although not explicitly stated by Wertheimer, we can infer that the masking phenomenon accompanying stroboscopic motion under monocular or binocular viewing was also obtained here. Therefore, by extension of Wertheimer’s conclusion, masking also partakes of a central cortical component. Of additional importance is Wertheimer’s finding that differences between the colors and shapes of the two successive stimuli did not eliminate stroboscopic motion. What one perceived during the stroboscopic motion when using such heterogeneous stimuli was a transformation (Veränderung) from one color or pattern to the other, an observation later replicated by Kolers and von Grünau (1976). It was as though the mechanisms producing stroboscopic motion sensations were more or less unaffected by those concerned with pattern or color discrimination. Finally, Wertheimer also noted, as did Kolers and von Grünau (1977) more recently, the importance of attention in stroboscopic motion (and, by inference, in masking). The degree and smoothness of stroboscopic motion depended on where one directed one’s attention and gaze. Of course, in Wertheimer’s case, changes in the direction of attention and gaze may have been confounded with changes in the retinal location of stimuli. 
As reported earlier by Exner (1888), the retinal periphery is particularly sensitive to stroboscopic motion—an important fact, since, as Stigler (1910) had shown, metacontrast was also stronger under indirect rather than direct viewing of the edge separating the test from the masking flash. However, Werner (1935, Experiment 28), anticipating
more recent findings (Enns and Di Lollo 1997; Ramachandran and Cobb 1995), reported that the magnitude of metacontrast suppression depended on where an observer’s attention was covertly directed, the suppression being stronger when spatial attention was directed away from the location of the target. Among his many phenomenal descriptions of metacontrast, Werner (1935) additionally made two that are particularly relevant to current studies of the temporal dynamics of object perception (Herzog and Koch 2001; Macknik and Livingstone 1998) (see Chapter 2, sections 2.7.2 and 2.7.3). In his Experiment 1, Werner (1935) repeatedly cycled a brief (12–25 ms) black target disk and an equally brief and black mask ring at variable meta- and paracontrast SOAs and observed the effects on the visibility of the disk and the ring. He found that in certain ranges of meta- and paracontrast SOAs, the visibility of the disk was completely suppressed while that of the ring was left standing. Werner (1935) found that this ‘standing-wave illusion’ (Macknik and Livingstone 1998) was optimal when (i) the metacontrast SOA between the disk and the following ring was 120–240 ms and, in turn, (ii) the paracontrast SOA between the ring and the next presentation of the disk was 280–560 ms. In his Experiment 22, Werner (1935) also demonstrated that the contour features of the target, even when its visibility is suppressed, become phenomenally attached to the contours of the ring. This ‘feature migration’ (Enns 2002), ‘feature inheritance’ (Herzog and Koch 2001), or ‘feature transposition’ (Wilson and Johnson 1985) is important not only because it demonstrates that invisible target information can nonetheless attach to the phenomenal representation of the mask, but also because it raises interesting questions about the mechanisms, and their spatiotemporal properties, that contribute to feature formation and feature binding.
1.4. Masking by light
Masking by light is a form of visual masking in which a briefly flashed, uniformly illuminated field obscures the visibility of a prior or later flashed target stimulus. In the history of masking, backward masking by light (mask flash presented after target flash) had been employed extensively ever since Exner’s (1868) pioneering work on the time course and fate of visual sensations elicited by the leading target
stimulus (Baade 1917a,b; Baxt 1871; Cattell 1885a, 1886; Fröhlich 1923; Monjé 1927; Schumann 1899; Tigerstedt and Bergqvist 1883). The main findings of these studies can be summarized as follows: (i) the more intense the after-coming mask flash, the less the visibility of the prior target (e.g. another light flash, a letter, or a word) (Baxt 1871; Cattell 1885a, 1886; Sperling 1965 (see his Fig. 2.5)); (ii) the greater the temporal interval between prior target and succeeding mask, the weaker the masking magnitude (Cattell 1885a, 1886; Schumann 1899; Sperling 1965 (see his Fig. 2.5)). That is, with increasing temporal separation, backward masking by light is a monotonically decreasing or type A function (Kolers 1962). Most theoretical explanations of this masking effect offered then and subsequently (e.g. Eriksen 1966) followed from the sensory persistence hypothesis, according to which the sensory response of the leading target stimulus persists in the form of a decaying positive after-image that can be suppressed by integrating with the response of an after-coming mask. In these early studies, the mask was used as a tool to monitor the temporal processing stages (e.g. the response persistence) of the target stimulus. However, as noted later by Sperling (1964), one can conversely use the visibility of the target to monitor the time course of the sensory response elicited by the mask. The latter approach was successfully employed and exploited by Crawford (1940, 1947) to investigate the effects on a small test flash produced by the onset, duration, and offset of a larger bright conditioning flash. Since the on- and offset of a conditioning flash produce sudden changes in the adaptation level of the visual system, related studies on such adaptation effects are of prime importance in the historical understanding of masking by light. The study of the sensory effects of sudden changes, particularly sudden increases, in luminance has had a long history.
McDougall (1904b) and Stigler (1908) cite the relevant work of Plateau, Exner, and Helmholtz in the first half and middle of the nineteenth century; and McDougall (1904b) himself investigated the currently well-known transient overshoot and sudden decline in visual activity following an abrupt stimulus onset. These overshoots and subsequent declines in brightness were well documented by Stainton (1928) and related not only to the Broca–Sulzer effect (Broca and Sulzer 1902) but also to masking effects such as metacontrast (Baumgardt and Segal 1942). Moreover, both light and dark adaptation were well known and
extensively studied phenomena by the beginning of the twentieth century (Blanchard 1918; Löhmann 1906; Piper 1903). However, it was not until several decades later that Crawford (1947) quantitatively investigated the sudden changes of visual sensitivity or, alternatively, of visual adaptation produced by the on- and offset of a uniform light flash. The method employed by Crawford was to measure the detection threshold of a small 10-ms test flash spatially centered on a larger conditioning field flashed for a duration of about 500 ms. By varying the temporal interval between the onset of the test flash and the on- and offset of the conditioning flash, Crawford was able to monitor the changes in visual sensitivity produced by rapid light and dark adaptation, respectively. The main result is shown in Fig. 1.4. The time interval between the onset of the test flash and the conditioning flash is given on the abscissa. Positive values indicate that the test flash was presented after onset of the conditioning flash. The test flash threshold values are given on the ordinate. Several important results should be noted.
[Figure 1.4 plot. Ordinate: log10 brightness of test stimulus (c/sq ft), from –1 to 3. Abscissa: time (seconds) after start of conditioning stimulus, from –0.4 to 1.2. Three curves, for conditioning-field brightnesses of 100, 30, and 10 c/ft².]
Fig. 1.4 The change in test field threshold (ordinate) as a function of the time of presentation of the test field relative to the onset of the conditioning field (abscissa). Negative and positive time values indicate that the test field was flashed before and after, respectively, the onset of the conditioning field. (Reproduced from Crawford 1947.)
1. The greater the intensity of the conditioning or masking flash, the greater is the overall test flash threshold or, alternatively, the lower is the overall visual sensitivity.
2. The rise in test flash threshold occurs up to 100 ms before the onset of the conditioning flash.
3. There are transient overshoots of the test flash threshold at and near the time of the on- and offset of the conditioning flash.
4. After offset, the immediate sensory effects of the conditioning flash are prolonged on the order of 200–300 ms, before assuming the longer-lasting phase of dark adaptation.

The first and last of the results are merely replications of aforementioned findings on (a) the increase in mask effectiveness with intensity and (b) sensory response persistence. It is the second and third results that are of greater theoretical importance. The problem posed to Crawford (1947) by the second result was to explain how the visibility of a prior test flash is masked by a following conditioning flash onset. He offered two alternate explanations that are similar to Stigler’s (1910) overtake and temporal integration accounts of metacontrast. He maintained that:

either the relatively strong conditioning stimulus overtakes the weaker test stimulus on its way from retina to brain and interferes with its transmission; or the process of perception of the test stimulus, including the receptive processes in the brain, takes an appreciable time of the order of 0.1 s, so that the impression of the second (large) stimulus within this time interferes with the perception of the first. (Crawford 1947, p. 285)
Although Crawford (1947) noted the suddenness of the threshold rise at onset and the presence of the smaller rise just prior to offset, he does not elaborate on these threshold rises or overshoots per se. With hindsight, however, it turns out that these overshoots play very important roles in theories of light and dark adaptation (Wald 1961; Baker 1963). At issue in these theories are the adequacies of photochemical and neural explanations of visual adaptation. As noted by Wald (1961), disproportionately large rises of visual threshold (e.g. the transient overshoots reported at onset or offset by Crawford (1947)) can accompany quite a small amount of bleaching of photopigments. Hence some non-photochemical neural process also seems to contribute to these dramatic changes of visual sensitivity. In fact, as Wald (1961) points out, evidence for such a neural component can be inferred from
extrapolation of data reported as early as 1918 by Blanchard. Similarly, extrapolation of the results of Löhmann’s (1906) study suggests the presence of a neural component in light adaptation; and independent, yet related, work by Bartley (1938) on brightness perception and by Bernhard (1940) on electrophysiological correlates of light stimulation indicated that some of these neural components may be centrally or cortically located (Battersby et al. 1964). An earlier related view on the role of central mechanisms in backward masking by light flashes had already been expressed by Cattell (1885a, 1886).

1.5. Sensory response persistence and temporal integration in vision

As noted above, Crawford (1947) found that the sensory effects produced by a conditioning or masking flash outlasted its offset by several hundred milliseconds. The fact that sensory responses elicited by a brief visual stimulus outlast its duration had been confirmed repeatedly since the late nineteenth century (Aubert 1865; Baxt 1871; Cattell 1885a, 1886; Charpentier 1890; D’Arcy 1773, cited by Boynton 1972; Exner 1868; Fechner 1840a,b; Fröhlich 1921, 1922a,b, 1923, 1929; Helmholtz 1866; Martius 1902; McDougall 1904a,b; Monjé 1931; Müller 1834; Plateau 1834; Schumann 1899). Exner (1868) described the brightness sensation following a brief light stimulus in terms of a relatively fast rise toward a peak value followed by a more gradual decline or decay. He designated that part of the primary sensation (Primärempfindung) that outlasted the stimulus as the ‘positive after-image’, and subsequently Monjé (1931) differentiated this primary sensation from the longer-lasting secondary after-images. We have seen above that in some form or another (e.g. the metaphotic image) such visual persistence was an integral part of theoretical explanations of metacontrast (Fry 1934; Stigler 1910, 1926). In more recent nomenclature, Exner’s positive after-image or Stigler’s metaphotic image correspond to what is termed iconic persistence (Coltheart 1980; Neisser 1967). In fact, one of the major conceptualizations of iconic persistence is that it is basically a type of afterimage (Hochberg 1968, 1978) whose source is, at least in one version, thought to reside in retinal receptor activity (Sakitt 1976; Turvey 1977). The issue of whether iconic persistence is based on peripheral (retinal) or central (cortical) processes, or perhaps both, an issue which was
central to recent theories of iconic persistence (reviewed by Coltheart 1980; Long 1980)—was already implied or even made explicit (Schumann 1899, cited by Baade 1917b, p. 123) in many of the investigations of the late nineteenth century cited previously. In order to address this issue historically, we shall introduce a more recent distinction between previsible or neural and visible or phenomenal persistence (Coltheart 1980; Turvey 1978). By previsible persistence we mean the neural response persistence at peripheral sites which, although contributing to visible persistence, does not constitute it; by visible persistence we mean the persisting phenomenally observable image associated with some central cortical process activated by a brief stimulus flash. Such a distinction is not entirely without historical foundation, since it was at least implicit in the work of Baxt (1871), Cattell (1885a, 1886), Tigerstedt and Bergqvist (1883), Baade (1917b), and Monjé (1931). Keeping this distinction in mind, let us look at the following methods and principal findings on response persistence in vision that were reported between the mid-1800s and the early decades of the twentieth century. To our knowledge, four main methods of investigating visual persistence were employed at or near the start of the twentieth century. One was a variation of the backward masking method introduced by Exner (1868). The rationale of this method ran something as follows. If a brief stimulus is followed at some temporal interval by a masking flash, the latter will in effect disrupt the processing of the positive afterimage produced by the former stimulus. By determining the minimal interval after offset of the first stimulus at which the later flash no longer exerted its masking effect, one could infer that the sensory effects elicited by the prior stimulus must have persisted for at least that minimal duration (Baade 1917b; Baxt 1871). 
A lower bound of visual persistence could thus be established. A second method, an adaptation of the one initially employed by D’Arcy (1773, cited by Boynton 1972), was to exploit the presence of Bidwell’s ghost and Charpentier bands produced by a rotating stimulus (see Fig. 1.2). Fröhlich (1921, 1922a,b, 1923, 1929) proposed the following rationale. By measuring the spatial extent of the bands or ghosts trailing a moving stimulus (and subtracting the time required for the entire stimulus to move across a given point), one can, with knowledge of the velocity of the stimulus, determine the duration of any portion of the primary sensation that outlasts the presence of the stimulus.
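The arithmetic behind this second method can be sketched in a few lines of Python. This is only an illustration of the rationale as described above, not a reconstruction of Fröhlich's actual procedure: the function name, the parameter names, the choice of degrees of visual angle as the unit, and the assumption that the measured extent spans the stimulus together with its trailing bands are all hypothetical.

```python
def persistence_duration_ms(trail_extent_deg, stimulus_width_deg, velocity_deg_per_s):
    """Estimate how long the primary sensation outlasts a moving stimulus.

    Assumption (not from the source): the measured extent covers the stimulus
    plus its trailing bands or ghosts, and all lengths are in degrees of
    visual angle.
    """
    if velocity_deg_per_s <= 0:
        raise ValueError("velocity must be positive")
    # Time for which the sensation is present at a given point.
    total_time_s = trail_extent_deg / velocity_deg_per_s
    # Time required for the entire stimulus to move across that point.
    transit_time_s = stimulus_width_deg / velocity_deg_per_s
    # What remains is the portion of the sensation that outlasts the stimulus.
    return (total_time_s - transit_time_s) * 1000.0


# Made-up numbers: a 1 deg stimulus moving at 20 deg/s with a 3 deg extent.
estimate_ms = persistence_duration_ms(3.0, 1.0, 20.0)
```

With these made-up numbers the estimate comes out at roughly 100 ms. Because both terms are divided by the same velocity, any error in the velocity measurement scales the estimate proportionally.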
A third method was to measure the critical flicker frequency (cff) of a light source. The cff is defined as the frequency at and above which the flickering stimulus appears steady rather than flickering. For example, Ferry (1892) reasoned that one can estimate retinal persistence by measuring the temporal interval between successive isochronal exposures of a flickering stimulus at which perceptual fusion just occurs. Finally, a fourth procedure relies on what is called the ‘seeing-more-than-there-is’ phenomenon (McCloskey and Watkins 1978) in which either a narrow vertical slit-aperture is moved left or right in front of a much wider stationary pattern display or the slit-aperture is stationary and the pattern display is moved behind the aperture. At any moment, an observer has only a small portion of the otherwise occluded pattern in view. However, despite these momentary limited views, as the aperture or the display moves at optimal speeds, the observer perceives the entire pattern of the display. Since some of the partial views of the pattern occur later in the motion sequence than others, these separate views must be integrated over time to form a complete pattern percept, suggesting that at some level of the visual system the activity produced by earlier views persists until at least the onset of the activity produced by later views. This phenomenon was reported as early as the 1860s by, among others, Zöllner (1862), Helmholtz (1866), and Vierordt (1868). In addition to newer methods, variations of these four procedures have been employed more recently to investigate response persistence in vision (Allport 1970; Erwin 1976; Haber and Nathanson 1968; Meyer and Maguire 1977; Parks 1965, 1968, 1970; Spencer 1969). These early methods yielded the following main findings that have been replicated more recently (see Coltheart 1980).

1. As the intensity of a stimulus increases, response persistence decreases (Bowen et al. 1974; Exner 1868; Ferry 1892; Martius 1902; Monjé 1931).
2. As the photopic luminosity or physiological efficacy of a bandlimited light source increases, persistence decreases (Ferry 1892).
3. As a corollary, variations in persistence are not attributable to variations in wavelength (color) but rather to the covariations in photopic luminosity (Ferry 1892).
4. Persistence decreases as light adaptation level increases (Fröhlich 1923; Haber and Standing 1970; Schumann 1899, cited by Baade 1917b, p. 111).
5. For durations smaller than the critical duration limiting temporal integration (Bloch’s law; see McDougall (1904b)), persistence decreases with increases in stimulus duration (Baroncz 1911; Bowen et al. 1974; Fröhlich 1923; Haber and Standing 1970; Martius 1902).
6. According to Exner (1868) persistence is greater foveally than extrafoveally (Breitmeyer and Halpern 1978; Mezrich 1984), although subsequent work by Fröhlich (1923) indicated contrary findings.
7. Visual persistence, at least for stimuli of intermediate intensity, is longer (by a factor of 2^(1/2), indicating central binocular brightness summation between the two eyes) under monocular than under binocular vision (Monjé 1931).

Up to now, these results, except the last one, indicate the importance of peripheral sensory variables in determining persistence in vision. As such, they refer to what we have called previsible or neural persistence. The seventh or last finding pointed out the importance of binocular compared with monocular vision, and suggested the possibility of central cortical processes involved in determining brightness and persistence. In fact, the work of Baxt (1871), Cattell (1885a, 1886), and Schumann (1899, cited by Baade 1917b) also pointed tentatively to the involvement of such central processes. Baxt (1871) used a backward masking technique to study the temporal parameters of persistence in vision. When using three letters as a target stimulus presented for 12.9 ms followed at varying interstimulus intervals (ISIs) by a second uniform flash of light of duration 55 ms, Baxt found that the target escaped the backward mask’s influence when the ISI was at least 57.9 ms. On the other hand, when, other things being equal, the same subjects were required to recognize a more complicated and hence difficult Lissajous figure, the critical ISI value increased to 195.6 ms.
Apparently the difficulty of the task determined how long the sensory trace must persist in order to recognize a stimulus correctly. Unfortunately, a Lissajous figure is not only more complicated than alphabetic characters but also contains finer detail and more figural elaboration. Thus the greater difficulty associated with the Lissajous figure may have tapped sensory rather than cognitive sources of difficulty. Baxt (1871) himself notes the fact that a stimulus containing
small spatial differences, i.e. detail, requires a longer exposure duration or integration time in order to be recognized than do larger stimuli (Kahneman 1964). It follows that the longer mask-escape ISI and hence the greater inferred response persistence of a Lissajous figure may be correlated with a concomitant increase in the required temporal integration time (Bowling and Lovegrove 1980, 1981). Related findings, interpretable along similar lines, can be found in Cattell’s (1885a, 1886) comparison of backward masking effects on letters composed of the simpler Latin type as opposed to the more complicated and detailed Gothic type. However, that cognitive difficulty per se may affect response persistence in vision was demonstrated by Schumann (1899) in the following statement regarding recognition performance on words of (equal type but) varying difficulty. When, for example, one [briefly] exposes a word which is difficult to recognize, the experimental subject experiences recognition problems when a mask follows the word by 0.2 sec; whereas with a more easily recognized word, the perceptual image seems already to have decayed completely (cited by Baade 1917b, p. 123; our translation).
Baade stresses and elaborates this finding as follows: ‘the finding, that the afterimage of words read with difficulty persists longer than that of easily read ones, points via this strong influence of central factors, to the fact that one must seek its seat at a central level’ (Baade 1917b, p. 123; our translation), a conclusion similarly and more recently arrived at by Erwin (1976). What, if any, were some of the central factors one can infer from these and contemporaneous works? Of prime significance was the factor of effort or activity (Tätigkeit) and attention (Aufmerksamkeit). We have seen how this latter factor, in the form of directed gaze, played a role in stroboscopic motion and metacontrast (Stigler 1910; Wertheimer 1912). Reference to attentional variables in the study of basic sensory processes had been made on numerous other occasions (Baade 1917a,b; Exner 1888; Rubin 1929; Tigerstedt and Bergqvist 1883). Particularly significant is Baade’s (1917a,b) distinction between the possible sensory and attentional effects of an after-coming mask on the visibility of the prior target stimulus. Baade (1917a,b) notes that the sensory consequences of such a mask are often confounded with its ability to evoke and thus divert attention from the target or test stimulus. As such, Baade implicitly raised the possibility of non-sensory
cognitive masking mechanisms (Enns and Di Lollo 1997; Michaels and Turvey 1979), and hence undermined his own rationale of using backward masking to index visuosensory persistence and his conclusion that such persistence is influenced by central factors. The stage of visual processing at which such attentional masking might play its role was, to our knowledge, not made explicit in any of the above investigations. However, there is some implicit indication that it was conceived of as intervening between conscious registration of a stimulus and its recognition or cognitive categorization (Cattell 1885a, 1886; Tigerstedt and Bergqvist 1883). Tigerstedt and Bergqvist employed the following distinction initially made by Wundt in his treatise Physiologische Psychologie (Physiological Psychology). Wundt partitioned the psychophysical process into three stages of which the first two were (i) the entry of the visual impression into consciousness, otherwise designated perception, and (ii) the entry of the conscious impression into the focal point of awareness, otherwise designated apperception. The former is similar to Neisser’s (1967) and Treisman’s (1988) more recent notion of preattentive process; the latter, of course, is similar to a focally attentive process. An alternative to Wundt’s perception stage or the preattentive process is the notion of iconic memory or persistence, a level of (parallel) visual processing at which a literal visual and visible representation of the stimulus is given without its yet being further processed, identified, or cognitively categorized. The latter focal recognition process presumably requires effort or attention. It is at this transitional stage from perception (iconic memory) to apperception (focal awareness or recognition) where, according to Tigerstedt and Bergqvist (1883), attention would exert its role. 
In more recent masking terminology, attention is diverted from or interrupted at this stage so that, in turn, the transfer of information from a precategorical feature representation or icon of the stimulus to a higher object or categorical representation is also interrupted (Enns and Di Lollo 1997; Michaels and Turvey 1979). In this regard, using a backward masking technique, Tigerstedt and Bergqvist (1883) determined that the rate at which information is transferred from the perception to the apperception stage averaged one stimulus item per 13.8 ms; a rate very similar to that determined by Sperling (1963), Scharf et al. (1966), and Scharf and Lefton (1970) in more recent times.
What is the fate of items not so transferred? Cattell made the following relevant observations on what he called ‘the limits of consciousness’:

In making these experiments I notice that the impressions [of briefly exposed letters, words, or sentences] crowd simultaneously into my consciousness, but beyond a certain number, leave traces too faint for me to grasp. Though unable to give the impression, I can often tell, if asked, whether a certain one was present or not. This is especially marked in the case of long sentences; I have a curious feeling of having known the sentence and having forgotten it. The traces of impressions beyond the limits of consciousness seem very similar to those left by my dreams. (Cattell 1885a, p. 312)

One can infer from Cattell’s observation that the process of focally attending to and cognitively grasping these visual impressions or icons takes time and effort; and since these iconic impressions decay rather rapidly, not all of them can be transferred into focal awareness.

1.6. Summary

One of the major tasks undertaken by visual scientists in the latter half of the nineteenth century was the identification of (i) perceptual elements or building blocks and (ii) the stage or stages at which these elemental components were processed during perceptual microgenesis. Under various guises this program has continued to the present, although its limitations were pointed out then as they still are now. The initial emphasis of this program was on establishing the time course of sensation and perception. In this context, the discovery that the duration of sensations could outlast that of the brief stimuli producing them gave visual persistence a central role. Determining its intensive and temporal properties was a major project in which two varieties of masking were employed. For instance, we noted that lateral masking (para- and metacontrast) was unwittingly employed by, among others, Exner to index sensory intensity over time, since he assumed the absence of spatial contrast effects. However, at about the same time, estimates of visual persistence also quite wittingly employed backward masking by light. The former paradigm, with a shift of emphasis to spatial interactions, led to the early twentieth century studies of meta- and paracontrast by Stigler, although both lateral masking effects had already been investigated several decades earlier, without naming them as such, by Sherrington and McDougall. The latter paradigm of masking by light highlights, for one, its disruptive effects on central cognitive processes
such as attention and, what has been called more recently, read-out of iconic information. Moreover, it took a significant turn in Crawford’s investigation of early light and dark adaptation. Here the mask or conditioning flash was no longer used to index the persisting response to a brief prior stimulus; instead, a brief test probe, flashed before, during, and after the mask, was employed to monitor the response of the visual system to the mask. Besides anticipating and defining many of the more global issues, problems, and paradigms surrounding visual information processing as presently studied, the history of visual masking and persistence also shows that many of the specific findings, methods, and theories reported in the past anticipated those found in the more recently published literature (e.g. Geremek et al. 2002). Although many new empirically and theoretically important findings in the study of visual masking and related phenomena have been placed under the spotlight, their connection to the past, although not always acknowledged, is one of methodical and gradual refinement and elaboration. Consequently, our history of visual masking exemplifies the history of a scientific discipline in its normal-science phase.

Notes

1. In his initial metacontrast investigation Stigler (1910) also failed to obtain dichoptic metacontrast. He, like Alpern (1953), interpreted this result as indicating a peripheral retinal locus of metacontrast interaction. However, Stigler’s explanation differed from Alpern’s in that the response integration hypothesis was maintained in the context of hypothetical inhibitory lateral interactions transmitted between adjacent receptors via horizontal cells (Exner 1898), whereas Alpern’s theoretical explanation rests on the faster cone response being able to overtake and inhibit the slower rod response.
2. Wertheimer (1912, p. 226) also notes the disappearance of both vertical and horizontal lines when they were presented alternately in back-and-forth sequence. This sequential blanking is similar to the aforementioned temporally staggered metacontrast effects reported by Piéron (1935). That is, the two stimuli were alternated at a temporal rate such that each was able to mask the brightness and contour aspects of the other, but each left unmasked the neural information producing stroboscopic motion perception. This and other observations by Wertheimer are, to our knowledge, the first clear demonstrations of a perceptual dissociation of brightness or contour perception and motion perception. Unfortunately, Wertheimer failed to explore further this psychophysical segregation of figural and motion aspects of a stimulus sequence, and to our knowledge such a segregation or distinction was not made again until it was mentioned by Saucer (1954).
Chapter 2
Methods, applications, and findings in visual pattern masking
2.1. Introduction

In visual pattern masking the visibility of one stimulus pattern, called the ‘target’, is reduced by another stimulus pattern, called the ‘mask’. Visual masking has been and continues to be used as a powerful psychophysical tool for investigating pattern-processing mechanisms under steady-state viewing conditions (Dakin and Hess 1997; Foley and Chen 1997; Glennerster and Parker 1997; McKee et al. 1994a; Mussap and Levi 1997; Stromeyer and Julesz 1972). Although it is important for theoretical and practical reasons to understand the steady-state properties of pattern vision, in this chapter we focus on the dynamic properties of visual pattern processing. The rationale for investigating dynamic masking rests on the following preconditions. First, some period of time, usually of the order of a few tens or hundreds of milliseconds, is required after a stimulus impinges on the retinae before it either affects behavior or is consciously perceived. Secondly, active processing of the information conveyed by a stimulus occurs during the interval between retinal input and behavior or conscious percept. Thirdly, there are several processing pathways, each with several levels or stages of processing required before a behavioral response or visual awareness is generated. Fourthly, the response to a mask stimulus can interact with the response to the target stimulus at specifiable levels of processing and thus affect the visibility of the target. Finally, humans are mobile visual explorers and, owing to frequent shifts of gaze when we inspect the world of objects and events, vision is by definition a highly dynamic process. Given the usual two to four fixations per second, behaviorally relevant as well as phenomenally rich representations must be modified or newly constructed by
the visual system in a period of time lasting from about 250 to 500 ms. Thus even the manifest steady-state properties of visual experience during a fixation period rely on highly dynamic underlying neural processes that must be updated several times per second. According to this dynamical system point of view, the visual system may be operating dominantly in a transient regime. Within this conceptual framework, various masking methods can be employed to investigate the time course, pathways, and stages of visual information processing required for control of behavior or perception. Thus masking provides a powerful tool for studying the microgenesis not only of behavioral control but also of visual awareness. The information garnered from such masking studies depends on the choice of a number of display, stimulus, timing, and task parameters. Display parameters include the luminance and wavelength of the background on which the target and mask stimuli are presented. Stimulus parameters include the shape, size, luminance, wavelength, retinal eccentricity, number, and degree of spatial overlap of the target and mask patterns. Timing parameters include the duration of the target and mask stimuli as well as the time interval, most commonly expressed in terms of SOA, separating the target and mask stimuli. Task parameters include the viewing condition (i.e. monocular, binocular, or dichoptic), and the response (e.g. visibility rating, luminance matching, forced-choice pattern discrimination, or reaction time) used in the masking study. By varying SOA and any one or more of the other parameters, we can infer from the results how, in what pathways, and at what stages of visual processing the responses to the target and mask interact. In particular, by examining how masking magnitude varies with SOA we can infer the process of object formation, i.e.
the microgenesis of object perception unfolding after a stimulus impinges on the retina until, some 300 ms later, its conscious registration occurs. Of course, such inferences also rely significantly on our detailed anatomical and physiological knowledge of the primate visual system and on neuropsychological findings in humans. Over the years, reviews of methods, findings, and theories relevant to visual masking have been provided elsewhere (Bachmann 1994; Breitmeyer and Ganz 1976; Breitmeyer and Öğmen 2000; Fox 1978; Kahneman 1968; Lefton 1973; Scheerer 1973; Weisstein 1972). We shall provide an updated review and analysis of methods and findings in the
present chapter. Although some relevant theoretical topics will also be mentioned and discussed, a more extensive review and analysis of these topics is deferred until Chapters 4 and 5 which concentrate on theories and models of metacontrast.

2.2. Methods of visual masking

In this book we focus on visual pattern masking, which is the reduction of a target’s visibility by a mask when both consist of spatially patterned forms and contours. By varying the temporal interval between target and mask, pattern masking can be used to investigate the microgenesis of contrast and form perception in human vision (Werner 1935). As illustrated in Figure 2.1, several variants of pattern masking exist. Figure 2.1(a) shows a typical target–mask stimulus combination used in the study of paracontrast and metacontrast as defined in Chapter 1. The target and mask need not consist of a disk and annulus as shown; equally useful would be a rectangle serving as a target and two spatially adjacent rectangles serving as a mask or any other set of stimuli which, without overlapping, preserve spatial contiguity between the target and mask contours. When the mask temporally precedes the target, paracontrast masking is obtained; when the target–mask sequence is reversed, metacontrast masking prevails. Figure 2.1(b) illustrates a target–mask combination that is used in a pattern-masking procedure termed masking by noise (Kinsbourne and Warrington 1962a). Here, the elements and contours of the random-dot mask, although spatially overlapping those of the target, are designed
[Figure 2.1: panels (a), (b), and (c), each showing a mask and a target.]
Fig. 2.1 Examples of target and mask stimuli typically used in (a) paracontrast and metacontrast, (b) pattern masking by noise, and (c) pattern masking by structure. (Reproduced from Breitmeyer and Ganz 1976.)
to bear little, if any, structural relationship to the target contours. On the other hand, when the overlapping contours of the mask structurally resemble the contours of the target in terms of their orientation, curvature, angularity, or some other figural characteristic, we have (Fig. 2.1(c)) a masking technique called masking by structure (Breitmeyer and Ganz 1976). Because they share contour contiguity and similarity between the target and mask, metacontrast and paracontrast are special cases of structure masking. Thus six types of pattern masking are distinguishable operationally, depending on three spatial and two temporal relationships between the target and mask patterns. Any of three pattern masks (made of overlapping noise, overlapping structure, or non-overlapping contiguous patterns) can be used either in forward masking, where the mask precedes the target at negative SOAs, or in backward masking, where the mask follows the target at positive SOAs. These six masking methods can be distinguished from each other by the functions they yield relating masking magnitude to the SOA separating the target and mask (Fig. 2.2). Using Kolers’ (1962) terminology, Figure 2.2(a) shows idealized monotonic type A forward and monotonic type A backward masking functions. Analogously, Figure 2.2(b) illustrates idealized non-monotonic type B forward and non-monotonic type B backward masking functions. In our review of findings, we focus chiefly on para- and metacontrast masking and backward masking by structure. They, unlike the other types of masking, can yield a non-monotonic U-shaped (or type B) masking effect as a function of SOA and powerful masking effects when target and mask are presented to separate eyes (dichoptic viewing) as well as when they are presented to the same eye(s) (monocular or binocular viewing) (Breitmeyer 1984; Foster and Mason 1977; Kolers and Rosner 1960; Michaels and Turvey 1979; Schiller and Smith 1968; Turvey 1973; Weisstein 1971).
This choice of emphasis is motivated by the fact that object perception, while initially depending on activity at retinal and other precortical levels of processing, ultimately depends on the activity in higher cortical centers where dichoptic masking effects most likely occur. Consequently, our use of the terms ‘forward masking’ and ‘backward masking’ from this point on will apply respectively to either para- and metacontrast or forward and backward masking by structure. Specific references to the other masking methods will be made as the occasional need arises.
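The operational taxonomy described above (three spatial target–mask relationships crossed with two temporal orders) and the sign convention for SOA can be made concrete with a short Python sketch. It is purely illustrative: the names `SPATIAL_RELATIONS`, `TEMPORAL_ORDERS`, `masking_direction`, and `method_name` are hypothetical and are not taken from the masking literature, and labeling an SOA of exactly 0 ms as 'simultaneous' is an assumption made here for completeness.

```python
from itertools import product

# Three spatial relationships between mask and target (as listed in the text).
SPATIAL_RELATIONS = (
    "overlapping noise",
    "overlapping structure",
    "non-overlapping contiguous",
)
# Two temporal orders (as listed in the text).
TEMPORAL_ORDERS = ("forward", "backward")


def masking_direction(soa_ms):
    """Classify temporal order from the sign of the SOA.

    Negative SOA: mask precedes target (forward masking).
    Positive SOA: mask follows target (backward masking).
    Zero SOA is labeled 'simultaneous' (an assumption of this sketch).
    """
    if soa_ms < 0:
        return "forward"
    if soa_ms > 0:
        return "backward"
    return "simultaneous"


def method_name(spatial, temporal):
    """Return a readable name for one of the six masking methods."""
    if spatial == "non-overlapping contiguous":
        # Contiguous, non-overlapping masks give para- or metacontrast.
        return "paracontrast" if temporal == "forward" else "metacontrast"
    kind = spatial.split()[-1]  # 'noise' or 'structure'
    return f"{temporal} masking by {kind}"


# Cross the three spatial relations with the two temporal orders.
SIX_METHODS = [
    method_name(spatial, temporal)
    for spatial, temporal in product(SPATIAL_RELATIONS, TEMPORAL_ORDERS)
]

if __name__ == "__main__":
    for name in SIX_METHODS:
        print(name)
    print(masking_direction(-50), masking_direction(60))
```

The sketch only enumerates the operational categories; it says nothing about the mechanisms that distinguish them.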
[Figure 2.2: panels (a) type A, forward and backward; (b) type B, forward and backward; (c) bimodal, forward; (d) bimodal, backward; (e) multimodal (oscillatory), backward. Vertical axis: target visibility; horizontal axis: stimulus onset asynchrony.]
Fig. 2.2 Schematic representation of masking functions: (a) unimodal type A forward and backward; (b) unimodal type B forward and backward; (c) bimodal forward; (d) bimodal backward; (e) multimodal oscillatory backward.
Figures 2.2(c), 2.2(d), and 2.2(e) show the finer multimodal structure of masking functions revealed by recent studies. A more detailed discussion of the fine morphology of masking functions will be given in the following sections.
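As a rough illustration of how the shapes in Figure 2.2 differ, the following Python sketch labels a backward masking function by counting the dips in target visibility across increasing SOAs. It is a minimal sketch under stated assumptions: the function names and the visibility values are hypothetical and chosen only for illustration (they are not data from any study), the function is assumed to be sampled at increasing SOAs starting at 0 ms, and no smoothing is applied, so noisy empirical data would produce spurious dips.

```python
def count_dips(visibility):
    """Count strict interior local minima in a list of visibility values."""
    return sum(
        1
        for i in range(1, len(visibility) - 1)
        if visibility[i] < visibility[i - 1] and visibility[i] < visibility[i + 1]
    )


def classify_backward_function(visibility):
    """Label a backward masking function sampled at increasing SOAs from 0 ms.

    No interior dip and monotonically rising visibility -> 'type A'
    One interior dip (U-shaped)                          -> 'type B'
    Two interior dips                                    -> 'bimodal'
    More than two interior dips                          -> 'multimodal'
    """
    dips = count_dips(visibility)
    if dips == 0:
        rising = all(b >= a for a, b in zip(visibility, visibility[1:]))
        return "type A" if rising else "unclassified"
    if dips == 1:
        return "type B"
    if dips == 2:
        return "bimodal"
    return "multimodal"


# Hypothetical, idealized visibility values (illustration only).
EXAMPLES = {
    "monotonic": [0.1, 0.3, 0.6, 0.9, 1.0],
    "u_shaped": [1.0, 0.6, 0.2, 0.5, 0.9, 1.0],
    "two_dips": [1.0, 0.4, 0.8, 0.3, 0.9, 1.0],
    "oscillatory": [1.0, 0.3, 0.8, 0.4, 0.9, 0.5, 1.0],
}

if __name__ == "__main__":
    for label, values in EXAMPLES.items():
        print(label, "->", classify_backward_function(values))
```

Counting dips is only one possible operationalization; a practical analysis would need a criterion for how deep a dip must be before it counts as a separate mode.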
2.3. Applications and uses of pattern masking
The study of pattern masking is informative for several reasons. First, the phenomenon of backward pattern masking is interesting in its own right because of the counter-intuitive finding that the mask can impede the visibility of the target even though the target is presented first (Bachmann 1994; Breitmeyer 1984; Breitmeyer and Öğmen 2000; Enns and Di Lollo 1997). Several competing quantitative models as well as qualitative explanations of this phenomenon have been proposed in the last two decades, and of course testing them requires the study of pattern masking. We will review these models and explanations in subsequent chapters. Secondly, models of visual pattern masking may be relevant to our understanding of a variety of spatiotemporal phenomena such as motion perception, visual persistence, on- and offset reaction time, and discrimination of temporal order (Breitmeyer 1984). In particular, several recent neural-network models (Bachmann 1994; Francis 1997; Öğmen 1993; Breitmeyer and Öğmen 2000) to be discussed in Chapters 4 and 5 can account for backward masking as well as many other spatiotemporal phenomena. Thirdly, visual processing is a dynamic temporally evolving phenomenon (VanRullen and Thorpe 2001), and pattern masking can be a useful tool to investigate the temporal sequence and levels of visual information processing involved in the recognition of stimuli ranging from simple geometric forms to faces (Loffler et al. 2005) or complex scenes (Bacon-Macé et al. 2005; Rieger et al. 2005). Several psychophysical (Bachmann 1984; Bowen and Wilson 1994; Michaels and Turvey 1979; Muise et al. 1991; Turvey 1973) and neurophysiological approaches (Bridgeman 1975, 1980; Kovács et al. 1995; Macknik and Livingstone 1998; Rolls et al. 1999; Thompson and Schall 1999) have been developed. Past and recent psychophysical applications (e.g. Carrasco et al.
2002; Rauschenberger and Yantis 2001) have often simply assumed that the after-coming mask acts to ‘erase’ or add ‘noise’ to visual information or to ‘interrupt’ or ‘terminate’ its further processing. A clearer understanding of underlying mechanisms producing such erasure, noise addition, interruption, or termination (Kahneman 1968; Scheerer 1973; Sperling 1963) is required for the informed use of masking as a methodological tool.
APPLICATIONS AND USES OF PATTERN MASKING
Fourthly, higher-level visuocognitive processes can modulate visual masking and vice versa. For instance, the effects of masking are influenced by perceptual grouping and figure–ground segmentation (Calis and Leeuwenberg 1981; Caputo 1998; Kahan and Mathis 2002; Kurylo 1997; Wolf et al. 1995), by deployment of selective visual attention (Boyer and Ro, in press; Enns and Di Lollo 1997; Havig et al. 1998; Kirschfeld and Kammer 1999, 2000; Michaels and Turvey 1979; Ramachandran and Cobb 1995; Scharlau and Neumann 2003; Shelley-Tremblay and Mack 1999; P.L. Smith 2000; P.L. Smith and Wolfgang 2004; Tata 2002; Tata and Giashi 2004; Weisstein 1966), and by the generation of visual imagery (Reeves 1980). As noted by Breitmeyer and Öğmen (2000), the role of gestalt grouping and attention in backward masking is currently of particular theoretical interest. Moreover, masking plays a significant role in studies of the temporal parameters characterizing visual attention (Shapiro 2001), especially in studies of the attentional blink (Brehaut et al. 1999; Breitmeyer et al. 1999; Dehaene et al. 2003; Dell’Acqua et al. 2003; Enns et al. 2001; Giesbrecht and Di Lollo 1998; Giesbrecht et al. 2003; Grandison et al. 1997; Seiffert and Di Lollo 1997) in which a rapid serial visual presentation (RSVP) task is used. Fifthly, and closely related, backward masking has recently been used to explore visual awareness (Bachmann 1997; Dennett 1991) and its implications for the controversial field of ‘subliminal’ perception (Duncan 1985; Holender 1986; Kihlstrom 1996; Marcel 1983a,b). The fact that information rendered unavailable to conscious report because of visual pattern masking can nonetheless influence a variety of motor, cognitive, and emotional processes has been repeatedly established in recent years (Ansorge 2003; Ansorge et al. 1998; Dehaene et al. 2001; Dimberg et al.
2000; Dolan 2002; Eimer 1999; Eimer and Schlaghecken 1998, 2002; Esteves and Öhman 1993; Klotz and Neumann 1999; Klotz and Wolff 1995; Merikle and Joordens 1997; Morris and Dolan 2001; Neumann and Klotz 1994; Öhman 2002; Ortells et al. 2003; Taylor and McCloskey 1990; Whalen et al. 1998; Wong and Root 2003). Sixthly, visual masking has been and continues to be used to study certain clinical anomalies related to vision and brain function, such as amblyopia (Tytla and McAdie 1981; Tytla and Steinbach 1984), closed head injury (Mattson et al. 1994), Parkinson’s disease
(Bachmann et al. 1998), developmental dyslexia (Williams et al. 1989, 1990), mania (Green et al. 1994a,b), and schizophrenia (Brand et al. 2004; Green et al. 1994a,b, 1997, 1999, 2005a; Herzog et al. 2004; Merritt and Balogh 1984; Saccuzzo and Schubert 1981; Slaghuis and Bakker 1995) as well as to non-clinical specific subject populations (Atchley et al. 2002). Therefore studies of visual masking may provide a better understanding of perceptual anomalies and markers in any of these subject populations (Green et al. 1997; Williams et al. 1989, 1990). Finally, several of the above clinical entities are treated with psychoactive drugs. Such drugs are frequently tested for their effects on various perceptual and cognitive functions. Here, masking has also proved to be a useful means of measuring the effects of such drugs on visual information processing (Emre et al. 1989; Fisch et al. 1983; Giersch and Herzog 2004; Holland 1963).
2.4. The phenomenology of pattern masking
A wealth of information about masking can be found in subjective descriptions of the target’s and the mask’s appearance. In this section, we outline a general phenomenology of masking. More specific observations related to particular masking paradigms and configurations will be elaborated in the relevant parts of the book. A depiction of some of our observations is given in Figure 2.3. In metacontrast, the mask is highly visible over the entire range of SOAs. Its appearance is characterized by sharp contours and by clearly discernible internal features of uniform contrast and color. At SOAs close to 0 ms or larger than 150 ms the target’s visibility matches that of the mask. However, at optimal masking SOAs, whose exact value, ranging from 30 to 100 ms, depends on stimulus parameters and viewing conditions, metacontrast produces a total or nearly total suppression of the perception of the target’s contrast, color, and contour.
It is also possible to observe a reversal of the target’s perceived contrast (Brussel et al. 1978). At other intermediate SOAs, one obtains intermediate percepts of target visibility. The contours of the target may appear incomplete and fragmented, and the area occupied by the target may appear at intermediate and often non-uniform contrast. In addition, one may experience the sensation of an ‘explosive’ or ‘split’ apparent motion proceeding from the masked target outward toward the
THE PHENOMENOLOGY OF PATTERN MASKING
[Figure 2.3: rows for metacontrast, noise mask, and structure mask, each with a target–mask configuration and the resulting target–mask percepts. Metacontrast percept labels: dynamic sensations (explosive ‘motion’; non-moving transient ‘blip’); surface color, contrast; boundaries, contour details.]
Fig. 2.3 Physical stimuli and attendant phenomenology of pattern masking. See text for further details.
surrounding mask. As Stoper and Banffy (1977) have shown, such split apparent motion is not an invariable accompaniment of metacontrast. One may also experience what we can best describe as a non-moving transient ‘blip’, or a sensation of change, in the target area. When the mask partially surrounds the target, masking effects tend to occur on the side of the target that is adjacent to the mask. If the target’s contour contains ‘features’, such as gaps or angles, these may be perceived as being part of the mask.
2.5. Forward masking (paracontrast)
Up to now, the numbers and types of stimulus dimensions that may affect the magnitude and form of the paracontrast function have not been investigated sufficiently. Although one can infer from results reported by Breitmeyer et al. (1981b) that the effective range of target–mask spatial separations is more limited in paracontrast than in metacontrast, systematic research on the effects of stimulus size, retinal location, target–mask spatial separation, and their interactions remains to be done. Nevertheless, scattered reports of several stimulus and response variables affecting the paracontrast function have been published. For instance, paracontrast can yield either type B or type A masking functions depending on the experimental task required of the observer (Alpern 1953; Kolers and Rosner 1960; Lefton and Newman 1976; Pulos et al. 1980; Weisstein 1972). When suppression of brightness or contrast is used as an indicator of the masking effect, type B functions are obtained (Growney et al. 1977; Kolers and Rosner 1960; Weisstein 1972). Here, target suppression is maximal when the mask leads the target by about 30–50 ms. Moreover, the magnitude of the effect decreases as target–mask spatial separation increases (Blanc-Garin 1973; Growney et al. 1977; Kolers and Rosner 1960) and increases as the mask energy increases relative to the target energy (Alpern 1953; Weisstein 1972). As noted, the type B effect can also be obtained dichoptically (Foster and Mason 1977; Kolers and Rosner 1960), indicating that paracontrast occurs when the target and mask responses interact at or beyond the level of binocular combination, presumably somewhere in the visual cortex or beyond. A type A paracontrast effect has been reported by Lefton and Newman (1976). In this study mask-produced changes of target detection rather than of its perceived brightness or contrast were used as an indicator of masking.
At each target–mask SOA, duration thresholds for target detection were measured, with longer threshold durations indicating stronger masking. Masking was strongest near target–mask synchrony and declined monotonically as the temporal interval separating the mask from the lagging target increased. This dependence of the type of paracontrast effect on task variables points out the importance of the role of criterion content in visual masking (Kahneman 1968). Criterion content refers to the stimulus dimension along which an observer makes his or her perceptual
FORWARD MASKING (PARACONTRAST)
judgment about the target. Since brightness judgments and detection of a target can rely on different stimulus dimensions, it is not surprising that different paracontrast functions result in different experimental tasks. Similar differences produced by varying task parameters and corresponding criterion contents also occur in other masking paradigms. A further important role of criterion content in paracontrast is illustrated by the following findings. Measurable paracontrast effects when the mask precedes the target at SOAs larger than 100 ms have been reported by Kahneman (1967), Scharf and Lefton (1970), and Cavonius and Reeves (1983). Using subjective ratings of target visibility, Kaitz et al. (1985) found in their study of paracontrast a bimodal masking function (Fig. 2.2(c)) with one masking maximum at or near an SOA of 0 ms and the other at an SOA between −150 and −100 ms. However, since their results were based on having their observers combine contour and contrast visibility in their ratings, it is not certain which of these two criterion contents was responsible for the two maxima. Figure 2.4 plots two masking functions generated with slightly different stimulus parameters for the same observer (Öğmen et al. 2003). One of the functions was based on a rating technique and the other was based on a brightness-matching technique. It appears that an additional dip in the paracontrast function was obtained with the rating technique. Another difference between the two paracontrast tasks is the enhancement of target visibility at SOAs near zero observed when the brightness-matching technique is used. Stober et al. (1978) also observed an enhancement at short SOAs in paracontrast when
[Figure 2.4: vertical axis: normalized perceived brightness; horizontal axis: SOA (ms), from −600 to 600; legend: BB match, BB rating, baseline.]
Fig. 2.4 Normalized perceived brightness estimated by matching and rating techniques applied by the same observer.
brightness judgments were used, but no enhancement was observed when the judgments were based on contour clarity. Recently Breitmeyer et al. (in press) and Breitmeyer, Ziegler, and Hauske (in preparation) systematically explored several variations of stimulus and task parameters in paracontrast. The variable stimulus parameter was the mask-to-target (M–T) contrast ratio. Two tasks, one using subjective contrast matching and the other forced-choice contour discrimination, were performed. In the former task observers were required to match the apparent contrast of the target disk to a comparison disk whose physical contrast could be adaptively adjusted by the observer from trial to trial. In the latter, the same observers were required to indicate which of three possible targets was presented: a disk with a contour deletion at its top, at its bottom, or at neither position. Figure 2.5 shows typical results obtained for SOAs ranging from −350 ms (paracontrast) to 140 ms (metacontrast). Limiting ourselves to the paracontrast case, we see that maximal contrast masking, a dip in the masking function, is obtained at an SOA of −170 ms. Although a dip in the contour masking function is also obtained at that SOA, a still greater contour masking effect is obtained at an SOA of −10 ms where, however, the apparent contrast of the target matches that of the target presented in the baseline unmasked condition. That is, at an SOA of −10 ms, there is a dissociation between the high contrast visibility and
[Figure 2.5: vertical axis: target visibility; horizontal axis: SOA (ms); legend: contrast match, contour discrimination, baseline.]
Fig. 2.5 Normalized target visibility as a function of SOA based on subjective contrast matching (full symbols) and forced-choice contour discrimination (open symbols) criteria. (Reproduced from Breitmeyer et al., in press.)
METACONTRAST
the minimal contour visibility of the target. Thus it appears that the maximal paracontrast masking effect is obtained at long SOAs (−200 to −100 ms) when apparent contrast is the criterion content, and at shorter SOAs (−40 to 0 ms) when contour or figural identity (Kolers and Rosner 1960) is the criterion content. In agreement with these results, Stober et al. (1978) reported stronger paracontrast masking with a contour discrimination criterion than with a brightness criterion at SOAs between −50 and 0 ms. However, in their data the dip in paracontrast was at an SOA of 0 ms. Notwithstanding this difference, these data taken together indicate that surface contrast and form contour are processed by different neural substrates. Moreover, Breitmeyer et al. (in preparation) showed that both contrast and contour masking effects are obtained dichoptically as well as monoptically, thus implicating a cortical site for both types of paracontrast masking. It is also interesting that, at SOAs ranging from −140 to −20 ms, there is a relative enhancement of the target’s contrast visibility but not of its contour visibility. In addition to reinforcing the dissociation between contrast and contour visibility, the contrast visibility results indicate that a mask preceding the target can enhance the visibility of some of its attributes, in this case its contrast. Related enhancements of target visibility by a preceding mask have been reported previously by Michaels and Turvey (1979) and Bachmann (1988, 1994), and are relevant for assessments of Bachmann’s perceptual-retouch approach to masking discussed in Chapter 4, section 4.5.2.
2.6. Metacontrast
Metacontrast masking can also produce type A or type B functions, and again the type and magnitude of the function depend on the experimental task and stimulus parameters.
Here we review experimental findings of metacontrast in the context of the following general variables: task parameters and criterion content, stimulus (target, mask, and background) parameters, and viewing condition.
2.6.1. Task parameters and criterion content
A most crucial, yet little understood and often neglected, aspect of masking concerns the range of criterion contents that observers can use when making perceptual or behavioral decisions about the target (Kahneman 1968). That task parameters and associated criterion
contents can affect performance in visual masking is a well-documented fact (Bernstein et al. 1973b, 1976; Breitmeyer et al., in press; Delord 1998; Haber 1969; Hernandez and Lefton 1977; Hofer et al. 1989; Kahneman 1968; Petry 1978; Proctor et al. 1983; Stober et al. 1978; Ventura 1980). Such effects are most aptly illustrated when one compares the types of metacontrast functions obtained in a simple target detection or target reaction time task with those obtained in tasks requiring either subjective judgments of target contrast or forced-choice identification of its form or contour. Type B metacontrast functions are generally obtained with suppression of the target’s brightness or contrast (Alpern 1953; Flaherty and Matteson 1971; Growney and Weisstein 1972; Öğmen et al. 2003; Weisstein 1972; Weisstein et al. 1970), its contour or contour detail (Breitmeyer 1978a; Breitmeyer et al. 1974; Burchard and Lawson 1973; Enns and Di Lollo 1997; Gilden et al. 1988; Hofer et al. 1989; Stober et al. 1978; Tata 2002; Werner 1935; Westheimer and Hauske 1975) or of its form or figural identity (Averbach and Coriell 1961; Weisstein and Haber 1965; Weisstein et al. 1970). Even here, the use of different criterion contents can yield different metacontrast functions. In particular, as shown by Stoper and Mansfield (1978), metacontrast suppresses the visibility of the contourless area, usually characterized by a uniform brightness or color, enclosed by, and well away from, the edges or boundaries of a target stimulus. Stoper and Mansfield (1978) called this type of masking ‘area suppression’ and argued that masking results depend on two distinct mechanisms. One processes the contour or boundary contrast of a stimulus, and the other, related to perceptual filling-in (Gerrits and Timmerman 1969; Gerrits and Vendrik 1970), processes the brightness or area contrast.
The distinction between boundary and area contrast has been known for some time (Hering 1878; Mach 1865), and was attributed by von Békésy (1969) to ‘Mach-type’ and ‘Hering-type’ lateral inhibition. More recently, it has been correlated with cortical neural responses to the boundary of a stimulus that precede the neural responses to the interior of the stimulus by about 100 ms (Lee et al. 1995). Moreover, as noted by Breitmeyer and Öğmen (2000), the distinction between boundary- and area-contrast effects in metacontrast may be related conceptually to the distinction between the boundary contour system (BCS) and the feature contour system (FCS) proposed
by Grossberg and coworkers (Grossberg 1987, 1994; Grossberg and Mingolla 1985a,b; Grossberg and Todorovic 1988). Paradiso and Nakayama (1991) investigated the spatial and temporal parameters affecting brightness perception and filling-in using a forward–backward masking paradigm. Related studies of brightness perception and filling-in have been extended to texture segregation (Caputo 1998; Motoyoshi 1999). Based on their extensive findings, Paradiso and Nakayama (1991) and Motoyoshi (1999) note that their results pose problems for several extant models of masking and metacontrast that rely on lateral contour-inhibiting processes. Like Stoper and Mansfield (1978), they propose the necessity of filling-in processes to explain their results. The study by Breitmeyer et al. (in press) mentioned previously also indicates a distinction between contour-specific and contrast-specific processes not only in paracontrast (section 2.5) but also in metacontrast. As Figure 2.5 shows, for identical stimuli, the optimal metacontrast SOAs for target contour and target contrast suppression are 10 ms and 40 ms, respectively. Together with the paracontrast results, these results suggest that two separate processes characterized by different temporal parameters are activated by the target. Whereas the metacontrast data indicate that the activation of the contour-specific process precedes that of the contrast-specific process by about 30 ms, the paracontrast data indicate that the timing differences between the respective activations may be larger. Further research is warranted to clear up these apparently conflicting results. Stober et al. (1978) reported metacontrast functions of different shapes based on brightness versus contour clarity judgments. In general, at small SOAs metacontrast was weaker when a subjective contour clarity criterion was used than when a subjective brightness criterion was used.
Thus, while there is overall agreement between studies that different metacontrast functions are obtained when using brightness versus contour judgments, the nature of the differences and their relation to stimulus parameters remain to be explored further. Type B functions can also be obtained when a choice reaction time task is employed in which the subject must respond as fast as possible to one of several possible targets on the basis of its discriminable figural properties (Eriksen and Eriksen 1972). Here, choice reaction time to a target can reach a maximum at some intermediate positive
SOA temporally separating the target from the mask. This is not very surprising because suppression of target contrast, contour, or shape, upon which the choice reactions are based, also reaches its maximum at intermediate SOA values. In these studies, the target stimuli typically are either invisible or appear to have lower contrast, ‘fuzzy’ contours, and thus distorted shapes. However, when using simple detection as the response criterion, one can obtain very different metacontrast functions. For instance, in their simple reaction time studies of metacontrast, Fehrer and Raab (1962) and Fehrer and Biederman (1962) showed that reaction time to the target does not vary much as a function of SOA, i.e. neither a type A nor a type B effect was obtained. This finding has been replicated in several subsequent investigations of metacontrast (Harrison and Fox 1966; Bernstein et al. 1973a). In a related study, Schiller and Smith (1966) showed that choice reaction time to, or forced-choice detection of, a target disk displayed at one of two possible locations and followed by two simultaneously flashed rings, each of which could surround the target disk at each of its possible locations, did not change as a function of SOA. The above results, recently confirmed by Vorberg et al. (2003, 2004), indicate that the observers were able to detect the presence and location of a target on the basis of information (criterion content) which, unlike information about brightness, contour, or figure, is immune to the effects of the metacontrast mask. 
As noted previously, when looking at a metacontrast display, the target contrast and contours appear to be entirely absent; nonetheless one can often detect a type of ‘explosive’ or split-stroboscopic motion or a non-moving ‘blip’, which may provide the criterion content for the simple detection task.1 However, Schiller and Smith (1966), when using their reaction time or detection criterion, obtained an absence of masking only when the target and mask were of equal energies (43 foot-lamberts (ft-L)). When the target energy was lowered to 4.3 and 0.43 ft-L relative to the constant mask energy (43 ft-L), type A functions were obtained in both target reaction time and detection tasks. Similar findings were also reported in studies of target detection when the target duration, and thus its time-integrated energy, was substantially lower than that of the mask (Lefton and Griffin 1976; Lefton and Newman 1976). Hence, whatever process is responsible for the mere detection of a target, it can be suppressed, and the more so the higher the energy of the surrounding mask (Öğmen et al. 2003). This, of course, points out that the type of
metacontrast function one obtains depends jointly on the task parameters and, as noted below in section 2.6.2, the characteristics of the target and mask stimuli as well as on the background on which they are displayed. Changes in criterion content can be effected not only by explicit instructions given to the observers, as discussed above, but also implicitly through practice or ‘learning’. Ventura (1980) and Hogben and Di Lollo (1984) reported reductions in the magnitude of metacontrast over successive experimental sessions. Ventura (1980) suggested that, through practice, observers learned to shift their brightness judgments from the dim appearance of the target occurring at later stages of perception to its brighter appearance at earlier stages. Hogben and Di Lollo (1984) added that observers may be learning to exploit subtle ‘plastic deformations’ or feature migration/inheritance effects (see sections 2.4 and 2.7.3) when making their judgments.
2.6.2. Effects of stimulus duration, intensity, and contrast
Variations in the duration, intensity, or contrast of either the target or the mask stimulus can have measurable effects on the shape or magnitude of the metacontrast masking function. Kahneman (1967) used target and mask stimuli of equal duration to study the effect of target/mask duration on the shape of the metacontrast function. When plotted as a function of SOA, masking functions obtained for exposure durations ranging from 25 to 125 ms showed a strong overlap. Based on this and some additional findings that will be discussed in Chapter 4, Kahneman proposed the ‘onset–onset law’ (also known as the SOA law) which states that SOA is the most critical variable in metacontrast. More recently, Macknik and Livingstone (1998) varied the durations of the target and the mask independently in the 20–140 ms range and showed that the SOA at which peak masking occurred ranged from 20 to 200 ms. Plotting these results in terms of ISI and stimulus termination asynchrony (STA), the time between the offset of the target and the offset of the mask, yielded peak masking ranges of −50 to 40 ms and 80 to 120 ms, respectively. Since the peak masking effect was dispersed less for STA than for either SOA or ISI, Macknik and Livingstone concluded that the STA is the best descriptor of the time of peak backward masking. However, it is far from clear which, if any, temporal parameter (SOA, ISI, or STA) is the best or the most critical for explaining the peak masking effect during backward masking. Macknik and Livingstone’s results
tend to support this tentative view since there was also noticeable (although less) variability in the peak STAs compared with ISIs and SOAs. The SOA at which peak masking occurs is a function of several variables including light adaptation level (Alpern 1953; Purcell et al. 1974; Stewart and Purcell 1974), the ratio of target to mask energy (Fehrer and Smith 1962; Hellige et al. 1979; Kolers 1962; Stewart and Purcell 1974; Weisstein 1972), the degree to which the mask depicts a three-dimensional object (Williams and Weisstein 1981), the degree to which figural and semantic features of the mask resemble those of the target (Hellige et al. 1979; Michaels and Turvey 1979), and, under dichoptic viewing, whether the target is presented to the dominant or non-dominant eye (Michaels and Turvey 1979). Therefore to speak of the SOA, STA, or ISI as though backward masking functions should yield peaks at a fixed value is clearly unrealistic. For stimulus detection performance, Bloch’s (1885) law states that, up to a critical duration, the visual system temporally integrates its input. This implies that the effects of duration and intensity can be considered jointly through their product, namely stimulus energy. Extending this observation to masking, one of the prime determining variables is the mask-to-target (M/T) energy ratio. In her review of metacontrast, Weisstein (1972) noted the following general empirical relationships depending on M/T energy ratio: (i) when the ratio is less than or equal to unity (target energy greater than or equal to mask energy), type B U-shaped backward masking functions are obtained; (ii) for ratios greater than unity (target energy less than mask energy), the shape of the metacontrast function tends to shift from a type B to a monotonic type A function as the ratio becomes progressively larger. 
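The temporal descriptors and the energy ratio discussed above can be written out explicitly. The Python sketch below is illustrative only: the function names are hypothetical, ISI is taken in its usual sense as the interval from target offset to mask onset (an assumption here, since only STA is defined in this section), energy is approximated as intensity times duration in the spirit of Bloch's law and therefore applies only below the critical duration, and `expected_shape` encodes Weisstein's (1972) empirical generalization very coarsely rather than predicting any particular data set.

```python
def timing_descriptors(soa_ms, target_dur_ms, mask_dur_ms):
    """Return (SOA, ISI, STA) in ms, taking target onset as time 0."""
    target_offset = target_dur_ms
    mask_onset = soa_ms
    mask_offset = soa_ms + mask_dur_ms
    isi = mask_onset - target_offset   # target offset to mask onset (assumed definition)
    sta = mask_offset - target_offset  # target offset to mask offset (as defined in the text)
    return soa_ms, isi, sta


def stimulus_energy(intensity, duration_ms):
    """Energy as intensity x duration; only meaningful below the critical duration."""
    return intensity * duration_ms


def mt_energy_ratio(mask_intensity, mask_dur_ms, target_intensity, target_dur_ms):
    """Mask-to-target (M/T) energy ratio."""
    mask_energy = stimulus_energy(mask_intensity, mask_dur_ms)
    target_energy = stimulus_energy(target_intensity, target_dur_ms)
    return mask_energy / target_energy


def expected_shape(mt_ratio):
    """Coarse encoding of the two empirical relationships summarized from Weisstein (1972)."""
    if mt_ratio <= 1.0:
        return "type B"
    return "shifting from type B toward type A"


if __name__ == "__main__":
    # Hypothetical stimulus values, for illustration only.
    print(timing_descriptors(soa_ms=60, target_dur_ms=20, mask_dur_ms=40))
    # ISI becomes negative when the mask starts before the target ends.
    print(timing_descriptors(soa_ms=20, target_dur_ms=60, mask_dur_ms=40))
    ratio = mt_energy_ratio(1.0, 32, 1.0, 16)
    print(ratio, expected_shape(ratio))
```

Because ISI and STA both follow from SOA once the two durations are fixed, the three descriptors can only be dissociated by varying target and mask durations independently, as Macknik and Livingstone (1998) did.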
This relation between M/T contrast ratio and shape and magnitude of metacontrast functions has been reported in several other studies of metacontrast and backward masking (Breitmeyer 1978b; Fehrer and Smith 1962; Kolers 1962; Spencer and Shuntich 1970; Stewart and Purcell 1974) and is a key feature in the description and testing of alternative models of metacontrast (Francis 2000). Breitmeyer (1978b) investigated the relationship between the magnitude and shape of the metacontrast function, as indexed by contrast suppression of the target, and the M/T energy ratio as indexed by the M/T duration ratio. The target duration was fixed at 16 ms, and the duration of the mask could vary in 0.3 log unit steps from 1 to 32 ms.
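The design just described can be checked with a few lines of arithmetic. The sketch below assumes target and mask of equal luminance, so that the M/T energy ratio reduces to the duration ratio (as the text indicates when it says the energy ratio was indexed by the duration ratio); the variable names are illustrative.

```python
import math

TARGET_DURATION_MS = 16  # fixed target duration reported in the text

# Mask durations from the text. Each step is a doubling, and
# log10(2) is approximately 0.3, i.e. one 0.3 log unit step.
mask_durations_ms = [1, 2, 4, 8, 16, 32]
LOG_STEP = math.log10(2)

# With equal luminance assumed, the energy ratio equals the duration ratio.
mt_ratios = [d / TARGET_DURATION_MS for d in mask_durations_ms]

for d, r in zip(mask_durations_ms, mt_ratios):
    print(f"mask {d:>2} ms -> M/T duration ratio {r:.4f}")
```

Only the two longest masks reach or exceed a ratio of unity, which is where Weisstein's rule would lead one to expect the function to begin departing from a pure type B shape.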
[Figure 2.6 plot. Vertical axis: masking magnitude (labelled RK); horizontal axis: SOA (ms), 0 to 120. One curve for each mask duration: 1, 2, 4, 8, 16, and 32 ms.]
Fig. 2.6 Metacontrast masking magnitude as a function of SOA and mask durations as indicated. Target duration was fixed at 16 ms. (Reproduced from Breitmeyer 1978b.)
Thus the M/T duration or energy ratios ranged from 0.0625 to 2.0. The results are displayed in Figure 2.6. When the mask duration was only 1 ms, little if any metacontrast masking was obtained. A noticeable type B function, peaking at an SOA of 56 ms, emerged at a mask duration of 2 ms and increased progressively in magnitude up to a mask duration of 8 ms. At further increments of mask duration, the masking magnitude at and beyond an SOA value of 56 ms did not change; however, what did change drastically was the masking magnitude at lower SOA values. Here, the masking effect increased progressively as mask duration increased beyond 8 ms, thus producing a concomitant change in the shape of the metacontrast function from type B to type A. As an extension of this change from type B to type A metacontrast functions, one can offer the following plausible and testable hypothesis. Since type B paracontrast magnitude increases when the M/T energy ratio increases (Weisstein 1972), the corresponding shift from type B to type A metacontrast may reflect the increasing contribution of the paracontrast effect at non-optimal, i.e. positive, SOAs. Other things being equal, in the study by Macknik and Livingstone (1998) the M/T energy ratio was directly proportional to Dm/Dt, where Dm and Dt are the durations of the mask and the target, respectively. In the light of the aforementioned results, as this ratio increases (i.e. as Dm increases or Dt decreases), one would expect to see shifts in the peak masking toward lower SOAs; conversely, as the ratio decreases, the shift ought to be toward higher SOAs. In terms of SOA, the results obtained by Macknik and Livingstone support this trend. As the stimulus duration becomes longer than the critical duration, temporal
integration by the visual system becomes less effective. For example, doubling the physical M/T energy ratio by doubling Dm to a value beyond the critical duration will not double the effective energy in the perceptual system for stimulus detection performance. This suggests that, for studies using relatively long stimulus durations, a perceptual rather than a physical measure of effective stimulus energy is needed. Because different visual processes can have different integration characteristics, it might be impossible to equate completely the effectiveness of stimuli of different durations. However, equating stimulus effectiveness in terms of detection thresholds offers a first approximation. Using this rationale, we explored systematically the effects of stimulus duration and intensity on the shape of the masking function.2 Target and mask durations, to the nearest 10 ms, were 10, 40, 80, and 120 ms. In the ‘compensated conditions’ target and mask luminances were set eight times above their detection threshold. In the ‘uncompensated conditions’, target and mask luminances were set eight times above the detection threshold of the 10 ms stimulus, regardless of their actual durations. Figure 2.7 shows the data averaged across the three observers. The panels in the left column plot the masking functions, and the panels in the right column plot the ‘change in masking’ with respect to the baseline condition, in which the duration of both target and mask was 10 ms. Figure 2.7(a) shows the masking functions obtained with fixed target duration and varying mask durations for compensated (C) and uncompensated (U) conditions. The findings are similar to those shown in Figure 2.6, with the additional observation that uncompensated conditions provide a more pronounced change at short SOAs than the corresponding compensated conditions. Figure 2.7(b) shows metacontrast masking functions for a fixed mask and varying target durations. 
In the uncompensated conditions, the brief mask is ineffective and a near-perfect target identification performance is obtained. For the compensated conditions, reduced masking is observed at short SOAs as the target duration is increased. Figure 2.7(c) shows metacontrast functions for target and mask stimuli of the same duration. As the target and mask durations increase, masking at short SOAs decreases. Overall, these results suggest that none of the laws based solely on temporal relationships, such as SOA, ISI, and STA, or laws based solely on the M/T energy ratio can give a complete description of metacontrast masking functions. Changes in the masking function that
[Figure 2.7 plots. Horizontal axis in all panels: stimulus onset asynchrony, 0 to 160. Left column: number of correct responses; right column: change in number of correct responses. Conditions shown: (a) 10/10C, 10/40C, 10/40U, 10/80C, 10/80U, 10/120C, 10/120U; (b) 10/10C, 40/10C, 40/10U, 80/10C, 80/10U; (c) 10/10C, 40/40C, 80/80C; (d) 10/10C, 40/40C, 40/10C, 80/80C, 80/10C.]
Fig. 2.7 Metacontrast masking (indexed by the number of correct answers out of 40 trials) as a function of SOA. The left column shows metacontrast functions for different target and mask durations as indicated in the insets. The right column shows changes in masking with respect to the 10/10C condition for the data in the corresponding panel of the left column. In the compensated conditions (C), the contrast of each stimulus was set to a luminance contrast eight times greater than its detection threshold. In the uncompensated conditions (U), the contrasts of the stimuli were the same (uncompensated for detection threshold) at all durations.
occur when the stimulus luminance is changed without any change in stimulus duration (e.g. 10/120U vs. 10/120C in Fig. 2.7(a)) preclude the application of pure temporal relationships. Although the M/T energy ratio is a very good predictor of the general shape of the masking
function, it cannot give a complete description of these functions, as can be seen by the changes that occur in masking functions without any change in the M/T energy ratio (Fig. 2.7(c)). Figure 2.7(d) shows all stimulus conditions where a decrease in masking is observed compared with the briefest stimulus condition. By inspecting the left column of Figure 2.7, we can conclude that changes in metacontrast function occur at relatively short SOAs. This finding can clearly be seen in the plots in the right column. An increase in mask (target) energy, by an increase either in luminance or in duration, causes an increased (decreased) masking at short SOAs. Furthermore, as can be seen from Figures 2.7(c) and 2.7(d), an increase in target energy produces a greater change than an equal increase in mask energy. These findings, as well as related ones recently reported by Di Lollo et al. (2004), are important for several reasons. First, the results of the recent parametric study by Di Lollo et al. suggest that the most suitable temporal descriptor of peak metacontrast masking takes us back to square one; the best descriptor is neither ISI, as argued by Francis (1997), nor STA, as argued by Macknik and Livingstone (1998), but SOA, as originally suggested by Kahneman (1967). A second, methodological, reason is that they have clear implications for explanations of backward masking and metacontrast. Since, as illustrated above, type A backward masking effects can obscure type B effects when the M/T energy ratio is substantially greater than 1.0, experiments failing to obtain type B effects, without specifying the M/T energy ratio, cannot be used unequivocally as evidence against possible underlying mechanisms responsible for generating type B effects (e.g. Eriksen et al. 1970). 
Only a type A metacontrast function obtained with a M/T energy ratio less than 1.0 and with a brightness, contour discrimination, or form-identification task would pose a problem for theories of type B metacontrast. However, to our knowledge no study has ever reported such a result. Another reason that they are important is that the smooth transition from type B to type A functions as the M/T ratio increases, of the sort shown in Figures 2.6 and 2.7(a), represents the rule. It is important to note that at all SOAs the magnitude of the type B effect always remains equal to or less than that of the type A effect. However, an important exception to that empirical regularity has recently been reported by Francis and Herzog (2004), who compared the visibility of a misaligned vernier target when masked by four
adjacent (metacontrast) elements with its visibility when it was masked by an overlapping grating mask composed of 25 aligned vernier elements. In the former case, masking magnitude varied in the typical U-shaped manner with SOA; in the latter case, it varied in a type A manner. Despite being less energetic, the four-element mask produced stronger rather than weaker masking at the intermediate SOAs of 40–100 ms. Hence these results contradict the general findings shown in Figure 2.3. This is probably because the grating mask, in conjunction with the brief 10-ms target, produces a newly discovered phenomenon known as the shine-through effect (Herzog and Koch, 2001), in which the vernier target appears to shine through the grating mask, often in exaggerated form. The shine-through effect and its companion, feature inheritance, are discussed more fully in section 2.6.3 below. An exception to the broad unimodal (i.e. with a single maximum) type A/type B classification of metacontrast functions occurs for spatiotemporally localized stimuli. In a series of experiments, Vrolijk and van der Wildt (van der Wildt and Vrolijk 1981; Vrolijk and van der Wildt 1982, 1985) reported bimodal metacontrast functions, i.e. with two extrema (Fig. 2.2(d)). More recently, using similar stimuli, Purushothaman et al. (2000) and Fotowat et al. (in press) showed that metacontrast functions can be oscillatory (Fig. 2.2(e)). These findings and their clinical implications (Green et al. 1999; Wynn et al., 2005) will be discussed in more detail in Chapter 9.
2.6.3. Effects of background luminance and adaptation level
In the above studies, M/T energy ratio varied while the adaptation level determined by the background intensity was generally constant. Purcell et al. (1974) investigated not only the effects of M/T energy ratio but also the effects of background intensity. They measured metacontrast effects when the target and mask were flashed against either a dark background or a background at a luminance of 40 ft-L. U-shaped masking functions were obtained at each background intensity and at each M/T energy ratio. However, at the low background intensity the peak of the U-shaped metacontrast functions tended to shift to lower SOA values than at the high background intensity. Similar results have been reported by Stewart and Purcell (1974) under transient dark adaptation interposed between target and mask flashes. A similar shift
toward lower SOA values of peak type B metacontrast occurs when the intensities of the target and mask are both varied from high to low values. In particular, Alpern (1953) found that, as stimulus energies ranged from a high of 3000 ft-L to a low of 3.6 ft-L, not only did the metacontrast magnitude decrease, but the SOA at which peak metacontrast occurred also shifted from 125 ms to approximately 75 ms.
2.6.4. Effects of stimulus contrast polarity
Most studies of metacontrast have employed target and mask stimuli with a contrast of the same sign, i.e. both stimuli were either dark-on-light surrounds or light-on-dark surrounds. Using a detection task, Sherrick et al. (1974) reported metacontrast suppression of both black and white targets by either black or white masks. However, they used a target duration of 15 ms, a mask duration of 100 ms, and an ISI of 0 ms (mask onset at target offset). Consequently, only the single SOA value of 15 ms was sampled and no extensive inferences can be drawn about the shape or magnitude of the metacontrast functions. Breitmeyer (1978c) extended the study of metacontrast with black-on-white or white-on-black surrounds. Here the target and mask durations (energies) were equal so as not to obscure type B metacontrast effects. Moreover, Breitmeyer, instead of employing a simple detection task, used a form or contour discrimination task, which, as noted above, is more likely to yield type B effects. Disk-like targets were employed and could be either black or white on a medium gray background. Spatially surrounding rings served as masks and, again, could be either black or white on gray. The general findings can be summarized as follows: (i) U-shaped metacontrast functions were obtained for any combination of target and mask contrasts; (ii) the magnitude of the masking effect tended to be greater when the target and mask contrasts had the same sign. These findings indicate that, although the magnitude of type B contour suppression in metacontrast is affected by contrast–polarity relations between the target and mask, the activation of mechanisms producing these contour-related type B effects per se is largely indifferent to these same relations. 
However, Becker and Anstis (2004) recently showed that when a target disk’s visibility is assessed by judgments of perceived brightness or contrast, one does not obtain metacontrast masking when the contrast polarities of the target and mask do not match.3 This result points out two interesting
properties. First, it again demonstrates that the degree and type of masking one obtains depends on task parameters and criterion content (see sections 2.5 and 2.6.1 above); secondly, it suggests that form or contour properties and surface properties like brightness are processed by different visual channels (see Chapter 8, section 8.4).
2.6.5. Figural variables: stimulus orientation and size
In one of his experiments, Werner (1935, experiment 18) investigated the effect of the orientation of internal contours of a target on metacontrast. The target consisted of either a vertically or a horizontally oriented grating. The metacontrast mask consisted of adjacent black vertically oriented bars flanking the target area. Werner reported that although suppression of the vertical target grating was obtained in about 80 percent of the trials, the horizontal target grating always remained clearly visible. In other words, the magnitude of suppression of the target’s visibility is orientation specific and depends on the similarity between target and mask orientation. Stimulus size, in particular mask size, has been found to affect the magnitude of metacontrast, but the reported directions of empirical effects have not been consistent across studies. Although some investigators have reported decreases of metacontrast magnitude as the width of the surrounding mask increases (Schiller and Greenfield 1969; Sturr and Frumkes 1968; Sturr et al. 1965), others have reported the opposite effect (Growney and Weisstein 1972; Kao and Dember 1973; Matteson 1969). In fact, several of the results obtained by Growney and Weisstein indicate that masking magnitude may be a U-shaped function of mask width. For stimuli centered 1.0 from the fovea, they reported an initial rapid increase of metacontrast magnitude as mask width increased from 1 to about 10, followed by a more gradual decline in mask effectiveness as the width increased further. This particular trend may be an index of the antagonistic processes of spatial summation and spatial (lateral) inhibition found with variation of stimulus size as measured by the Westheimer function (Teller 1971; Teller et al. 1971). 
This nonmonotonic relationship between mask width and metacontrast magnitude, in conjunction with the findings of Bridgeman and Leff (1979) that the effects of stimulus size interact with the effects of retinal locus, makes it plausible that variations of mask width across different studies can yield inconsistent metacontrast effects.
What seems to be needed is a careful empirical investigation of orthogonal variations of target size, mask size, and retinal locus. A metacontrast study that closely approximates these conditions has been reported by Bridgeman and Leff (1979). They showed that large brightness suppression effects can be obtained in the fovea only when relatively small targets (0.25 diameter disks) and masks are used. As the target and mask dimensions increased, foveal metacontrast magnitude decreased substantially, whereas parafoveal and peripheral metacontrast magnitude remained robust, either not changing noticeably or increasing slightly. In a related study using a contour discrimination task, Lyon et al. (1981) also showed strong metacontrast effects in the fovea. In fact, masking magnitude was found to be a function of the difficulty of contour discrimination and retinal locus. Foveal metacontrast was as strong as parafoveal metacontrast when a finer contour detail had to be discriminated at the fovea rather than in the parafovea (see also Breitmeyer 1978a; Westheimer and Hauske 1975). However, for discrimination of equal contour detail parafoveal metacontrast was stronger than foveal metacontrast.
2.6.6. Spatial variables: stimulus location and separation
The finding that the magnitude of metacontrast suppression of brightness, contour, and figural identity generally increases with increasing eccentricity of the retinal locus of stimulation has been replicated frequently (Alpern 1953; Kolers and Rosner 1960; Lefton 1970; Merikle 1980; Saunders 1977; Stewart and Purcell 1970, 1974; Stoper and Banffy 1977). Moreover, the fact that several earlier investigations reported little if any foveal metacontrast (Alpern 1953; Kolers and Rosner 1960; Toch 1956) can, in light of the analyses by Bridgeman and Leff (1979), be attributed to the use of excessively large, and thus suboptimal, foveal targets (e.g. in the study by Kolers and Rosner (1960) the smallest foveal target disk was 0.42 in diameter and the largest was 1.67 in diameter). Target–mask spatial separation is also known to affect the magnitude and form of type B metacontrast functions. Generally, as the spatial separation of the target and mask increases, the metacontrast magnitude decreases (Alpern 1953; Blanc-Garin 1973; Breitmeyer and Horman 1981; Breitmeyer et al. 1981b; Growney et al. 1977; Kolers 1962; Kolers and Rosner 1960; Levine et al. 1967; Öğmen et al. 2003;
Weisstein and Growney 1969). Moreover, the peak of the U-shaped metacontrast function, in addition to decreasing in magnitude, at times also tends to shift toward shorter SOA values (Alpern 1953; Growney et al. 1977) as the target–mask spatial separation increases, although in other studies (Kolers and Rosner 1960; Weisstein and Growney 1969) no such trend occurred. Perhaps of greater significance is the fact that target–mask spatial separation, like stimulus size, interacts with the retinal locus of stimulation. For instance, Kolers and Rosner (1960), like Stigler (1910) in his pioneering work, noted that very small contour separations between the target and mask drastically reduced or even eliminated foveal metacontrast. In contrast with this finding, for non-foveal stimulus loci robust type B metacontrast functions can be obtained at target–mask spatial separations as large as 2 (Alpern 1953; Breitmeyer et al. 1981b; Growney et al. 1977). Since the above studies indicate that the effects of stimulus size, target–mask spatial separation, and retinal locus on metacontrast magnitude mutually interact, it would be desirable in future investigations to study the effects of these variables in greater detail while using the same experimental design, method, and observers. Carefully measured results from such intra-experimental comparisons may clear up some of the ambiguities and contradictions found when comparing the extant inter-experimental results. These ambiguities are particularly important, and their careful consideration is warranted when drawing conclusions from results produced by varying figural and spatial aspects of the target and mask stimuli. For instance, in his study of the standing-wave illusion, Enns (2002, Experiment 3) draws conclusions from results produced by varying mask size that may not be warranted.
2.6.7. Viewing conditions: monoptic, dichoptic, and cyclopean
Although there is little controversy about the existence of monoptic metacontrast, the results of investigations of dichoptic metacontrast are somewhat equivocal. We noted in Chapter 1 that although Stigler’s (1910) original investigation of metacontrast failed to show dichoptic masking effects, his later investigation (Stigler 1926) yielded dichoptic metacontrast. Although some subsequent studies (Alpern 1953) also
reported an absence of dichoptic metacontrast, the majority of later studies obtained dichoptic type B metacontrast effects (Breitmeyer and Kersey 1981; Kolers and Rosner 1960; May et al. 1980; Schiller and Smith 1968; Weisstein 1971; Werner 1940). These investigations, and in particular those showing that dichoptic metacontrast can be as strong as or even stronger than monoptic metacontrast (Schiller and Smith 1968; Weisstein 1971), indicate (i) that the site of target–mask interactions responsible for generating type B metacontrast effects must be at or beyond the level of binocular combination of the respective monocular inputs and (ii) that binocular rivalry between target and mask may be an additional source of target suppression under dichoptic compared with monoptic viewing (Breitmeyer et al., in preparation; Schiller and Smith 1968). Based on these results and interpretations, one might infer that type B metacontrast interactions also occur in cyclopean or stereo-space when one uses random-dot stereograms to generate the target and mask patterns. That this is not the case has been shown by investigations of cyclopean metacontrast by Vernoy (1976), Lehmkuhle and Fox (1980), and ourselves (unpublished),4 in all of which only type A backward masking effects were obtained. It appears that a necessary condition for obtaining type B metacontrast is that one uses standard first-order target and mask stimuli whose edges are defined by luminance or wavelength differences. Thus each stimulus is capable of generating its own non-cyclopean or monocular contrast- and contour-forming processes in the visual pathway, although eventually they may also contribute to binocular stereo-effects (Szoc 1973). 
These combined findings of dichoptic type B metacontrast when using standard stimuli, and the failure to obtain such metacontrast when using cyclopean stimuli, indicate that binocular cortical mechanisms sensitive to luminance or wavelength contrast are used in generating type B metacontrast. Mechanisms sensitive to mere cyclopean depth contrast (devoid of luminance contrast) are not used in type B metacontrast, although they are in type A backward masking. Consequently, mechanisms giving rise to type B metacontrast, although located at the level of binocular combination, are probably not located at as late a level of cortical processing as that used in generating cyclopean contours (spatial boundaries defined solely by binocular image disparity or depth differences).
2.6.8. Stimulus wavelength variables: effects of chromatic and rod–cone interactions and of background color
Between 1950 and 1980 one of the most extensive and fruitful applications of the metacontrast masking method was to psychophysical investigations of interactions (or lack of them) between chromatic and rod–cone mechanisms. Alpern and his collaborators provided most of the impetus for this line of research (Alpern 1965; Alpern and Rushton 1965; Alpern et al. 1970a,b,c,d). Their results and conclusions inspired further investigations which are discussed below. Alpern’s (1965) rationale for the use of metacontrast in studying chromatic (cone–cone) and rod–cone interactions developed along the following lines. First, it relied on the studies of chromatic mechanisms performed by Stiles (1939, 1949, 1959)5 and on the work of du Croz and Rushton (1963) on cone dark-adaptation. These studies showed that the rod mechanism and the short-wavelength (s-w), medium-wavelength (m-w), and long-wavelength (l-w) cone mechanisms behaved independently of each other. For either the increment-threshold or the dark-adaptation technique, it was found that a mechanism’s test-flash threshold was lowered in proportion to that mechanism’s sensitivity to the test-flash wavelength, and raised in proportion to its sensitivity to the background wavelength, a result one would predict from the independence of the four mechanisms. Furthermore, in his original metacontrast investigation Alpern (1953) found that the brightness of a test flash activating only rod mechanisms could be optimally suppressed at an SOA of 50 ms by a flanking mask flash activating both rod and cone mechanisms. These results motivated Alpern’s (1965) hypothesis that the latency difference between rod and cone responses was equal to the SOA of 50 ms that yielded optimal test flash suppression. 
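One simple way to read this latency hypothesis is sketched below. It assumes, purely for illustration, that suppression is optimal when the slower response to the target and the faster response to the later mask coincide in time; that coincidence assumption, the function name, and the latency values are all hypothetical (the latencies are chosen only so that their difference matches the 50 ms SOA mentioned in the text).

```python
def predicted_optimal_soa_ms(target_latency_ms: float,
                             mask_latency_ms: float) -> float:
    """SOA at which target and mask responses would coincide.

    Under the illustrative coincidence assumption, the mask must be
    delayed by exactly the amount by which the target response lags.
    """
    return target_latency_ms - mask_latency_ms


# Hypothetical latencies, not measured values.
ROD_LATENCY_MS = 150.0   # slower rod-mediated response to the target
CONE_LATENCY_MS = 100.0  # faster cone-mediated response to the mask

soa = predicted_optimal_soa_ms(ROD_LATENCY_MS, CONE_LATENCY_MS)
print(f"predicted optimal SOA: {soa:.0f} ms")
```

On this reading, swapping which mechanism serves the target and which serves the mask would reverse the sign of the predicted SOA, which is the kind of prediction the rod and cone isolation experiments were designed to test.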
However, Alpern (1965) found that, at that SOA, the visibility of a test flash isolating rod mechanisms was suppressed only when the surrounding after-coming mask flash also activated the rod mechanism and not when it isolated any of the cone mechanisms. These results indicated that rod and cone mechanisms do not interact in metacontrast. Using a similar rationale, Alpern and Rushton (1965) found no interaction among the three different cone mechanisms in metacontrast. Both this apparent specificity of cone mechanisms and the lack of rod–cone interactions were replicated in subsequent studies using metacontrast (Alpern et al.
1970a,c,d) and in studies of increment thresholds measured against flashed or steady backgrounds (Hallett 1969; McKee and Westheimer 1970; Westheimer 1970). These results rule out not only interactions among the different receptor mechanisms but also an explanation of type B metacontrast based on latency differences known to exist between rod and cone responses (McDougall 1904a) or between different types of cone mechanisms (Mollon and Krauskopf 1973) as Alpern (1953, 1965), for example, initially hypothesized. However, we shall see below that, since the time they were reported, these negative results have been contradicted and superseded by the results of other later investigations of interactions among receptor mechanisms. Despite the earlier failure to obtain psychophysical indices of interactions among these mechanisms, several subsequent studies using a variety of investigative techniques have reported, for instance, rod–cone interactions (Barris and Frumkes 1978; Blick and MacLeod 1978; Buck et al. 1979; Foster 1977; Frumkes et al. 1973; Frumkes and Temme 1977; Latch and Lennie 1977; Makous and Peeples 1979; Sandberg et al. 1981; Temme and Frumkes 1977; von Grünau 1976). Moreover, these psychophysical findings are consistent with the results of several neurophysiological studies indicating rod–cone signal convergence and interactions, including those at post-receptor levels of neural processing (Andrews and Hammond 1970; Enroth-Cugell et al. 1977; Gouras and Link 1966; Hammond 1971, 1972; Rodieck and Rushton 1976). Subsequent metacontrast investigations also have found evidence for both rod–cone interactions and interactions among different cone mechanisms (Foster 1976, 1978, 1979; Foster and Mason 1977; Reeves 1981; Yellott and Wandell 1976). Foster (1976) and Foster and Mason (1977) reported rod–cone interactions in metacontrast when the test and mask stimuli were presented either monoptically or dichoptically. 
The last result implicates cortical processing and is consistent with Hammond’s (1971, 1972) finding of an interaction of rod- and cone-generated responses in lateral geniculate and cortical visual cells. In Foster’s (1976) study, an s-w green stimulus and an l-w red stimulus were chosen as either mask or target. The s-w stimulus isolated the rod mechanism whereas the l-w stimulus isolated cone mechanisms. Figure 2.8 shows the monoptic masking results. Note that both type B paracontrast and metacontrast functions were obtained with this combination of wavelengths. Furthermore, it should be noted that when the test and mask flashes were green and red, respectively (Fig. 2.8(c)), optimal
metacontrast was obtained at an SOA value about 200 ms higher than that obtained when the two flashes were red and green, respectively (Fig. 2.8(a)). Moreover, the reverse temporal shift seems to occur for paracontrast. Foster (1976) interpreted this shift of optimal SOA in terms of the latency difference between the slower (green-activated) rod and the faster (red-activated) cone responses. This indicates, contrary to what might be inferred on the basis of Alpern’s (1965) results, that rod–cone interactions and latency differences do play a role in determining the shape of metacontrast masking functions. In light of Foster’s (1976) findings, Alpern’s (1965) failure to obtain evidence supporting rod–cone interactions in metacontrast no longer seems puzzling. Note that in Figure 2.8(c), the green test flash isolates the rod mechanism whereas the red mask flash isolates the l-w cone mechanism. This target–mask wavelength relation essentially replicates
[Figure 2.8 plot. Vertical axis: log relative threshold elevation, −0.2 to 0.4; horizontal axis: SOA (ms), −400 to 1000. Panels, each with a baseline: (a) T red, M green; (b) T red, M red; (c) T green, M red.]
Fig. 2.8 Masking magnitude (log relative threshold elevation) as a function of SOA and the dominant wavelengths of the test and surrounding mask stimuli: (a) results when the red test stimulus activated cone mechanisms and the green mask activated rod mechanisms; (b) results when the red test and the red mask stimuli activated cone mechanisms; (c) results when the green test stimulus activated rod mechanisms and the red mask activated cone mechanisms. (Reproduced from Foster 1976.)
that of Alpern (1965). Furthermore, note that in this condition optimal metacontrast is obtained at an SOA of 300 ms, far exceeding the fixed 50-ms SOA value employed by Alpern. In fact, if one takes the solid line connecting individual data points as an approximate interpolation of masking magnitude, at the shorter 50-ms SOA value one finds, as did Alpern, almost no metacontrast effect. Moreover, the same argument may hold for the reported failure to find metacontrast interactions between cone mechanisms (Alpern and Rushton 1965). However, both conjectures require clearer empirical demonstrations; in particular ones based on a finer sample of SOA values than the rather coarse 100-ms sample interval used by Foster (1976). Furthermore, the fact that type B metacontrast (paracontrast) functions occur when the test (mask) and mask (test) flashes activate faster cone and slower rod mechanisms, respectively (Figs 2.8(b) and 2.8(c)), indicates that latency differences between more central neural mechanisms also play a role in determining the metacontrast function, a conclusion consistent with the existence of dichoptic metacontrast (paracontrast) functions between rod and cone mechanisms (Foster and Mason 1977). Furthermore, to obtain type B metacontrast (paracontrast) when the target (mask) and mask (target) stimuli activate cone and rod mechanisms, respectively, the latency differences between more central neural mechanisms must be larger and override the oppositely signed latency differences between rod and cone responses. Such an interpretation would dovetail with that offered by Yellott and Wandell (1976). Their study, comparing chromatic effects in metacontrast under monoptic and dichoptic viewing, indicated two types of metacontrast effects: a receptor-specific one originating in the retina and another originating centrally and, unlike the retinal mechanism, governed by spectral sensitivities that have been sharpened by opponent-process operations. 
Moreover, Yellott and Wandell arrived at the following pertinent observation:
If a 10 cd/m² 10 ms red test flash (e.g. a 1 3 bar) is followed after about 70 ms by an equally intense 10 ms metacontrast mask (e.g. a pair of flanking bars) of the same color, its brightness is so dramatically reduced that naive observers will normally report that they have not seen any test flash at all. If the red mask is then replaced with a green one of the same luminance, the red test flash is greatly restored in visibility and seems to be hardly masked at all . . . . . . [These results] indicate that for the brightness reduction phenomenon ordinarily thought of as metacontrast, a mask’s effectiveness depends on its subjective color similarity to the test flash . . . (Yellott and Wandell 1976, p. 1279)
METACONTRAST
A similar conclusion was drawn by Holland (1963) and by Bevan et al. (1970) in their studies of visual masking. They also found that the metacontrast effect was weakened as color differences between the target and mask stimuli were introduced. On the other hand, we noted earlier (Chapter 1, section 1.3) that Stigler (1910, 1913, 1926) found metacontrast to be indifferent to color differences between the target and mask stimuli. Stigler’s contrary findings, based largely on phenomenal report, shed little light on the role of color in metacontrast masking. The later investigations, using more sophisticated methods, have shown that color effects in metacontrast are quite complex. For instance, in the experiment described by Yellott and Wandell (1976), the target and mask flashes were presented as a combination of luminance increments and hue substitutions against a uniform background. Therefore one can characterize the two stimuli in terms of their luminance transients or their chromatic transients or both—one transient was confounded with the other. To unconfound these variables, Bowen et al. (1977) investigated the effects of chromatic transients and chromatic plus luminance transients on metacontrast. The target and flanking masks each consisted of a 620-nm reddish orange flash presented on an achromatic background. In condition 1, the intensity of the stimuli was such that either stimulus introduced both a luminance transient and a chromatic transient relative to the background. In condition 2, the intensity of the two stimuli was set to match the brightness of the background; hence only chromatic transients relative to the background were used. In condition 3, the target consisted only of a chromatic transient and the mask of a luminance plus chromatic transient; and in condition 4 the reverse arrangement was employed. Bowen et al. 
found that type B metacontrast was obtained only in conditions 1 and 3 in which the mask produced both a brightness and a chromatic transient; in the other two conditions in which the mask produced only chromatic transients no metacontrast effects were obtained. Based on these results, it follows that, despite color similarity of the target and mask stimuli, metacontrast is not obtained in the absence of brightness transients produced by the mask. As a corollary one could conclude along with Bowen et al. that the presence of brightness transients in the mask is a necessary condition for obtaining type B metacontrast. However, in view of more recent findings reported by Reeves (1981) and Breitmeyer et al. (1991), even this conclusion has had to be altered.
For instance, Reeves capitalized on the finding reported by Glass and Sternheim (1973) that stimuli consisting of transient hue substitutions against an isoluminant background can produce Crawford-type masking by light at and near the instant of hue substitution, provided that the hues are sufficiently different in wavelength. There was little if any masking when the hues were the same or similar in wavelength. Reeves (1981) used a variety of hue stimuli against an isoluminant background on which they were exposed. The findings showed that as the target and mask hue deviate progressively more from the background hue, progressively stronger type B metacontrast is obtained. Thus the failure of Bowen et al. (1977) to obtain metacontrast with their combination of orange (620 nm) test stimuli on a white achromatic surround may have been due to the fact that the colored test and the neutral background hues were not sufficiently different in chromatic space. From these results we can conclude that type B metacontrast can be obtained by pure chromatic transients provided that the chromatic differences between the target and mask are sufficiently large. This conclusion is given additional support by Foster’s (1978, 1979) findings that type B metacontrast functions can be obtained when the test and mask flash selectively activate s-w and l-w mechanisms, respectively. These mechanisms were activated by stimuli with dominant wavelengths at 421 and 664 nm, respectively, which were sufficiently different in wavelength composition to produce type B metacontrast (Reeves 1981). In addition, Foster (1979) obtained the following intriguing result. When the s-w mechanism was masked by the l-w mechanism, the optimal metacontrast masking effect was obtained at an SOA of 100 ms; however, when the l-w mechanism activated by the target was masked by a similar l-w mechanism, the peak masking effect occurred at an SOA of 50 ms. 
This relative shift in SOA at which peak masking occurs is consistent with the existence of a longer response latency of s-w mechanisms relative to that of l-w mechanisms (Mollon and Krauskopf 1973). The following interim summary of the seemingly complex effects of stimulus wavelength on metacontrast described above can now be given.
1. Contrary to Alpern’s (1965) original study, rod–cone interactions exist in metacontrast (Foster 1976; Foster and Mason 1977).
2. The latency differences between rods and cones can play a role in determining the SOA at which peak type B metacontrast masking occurs (Foster 1976).
3. Target and mask stimuli of the same or similar wavelength composition can produce type B metacontrast, provided that at least the mask stimulus produces a brightness transient (Bowen et al. 1977).
4. Even in the absence of brightness transients, type B metacontrast can be produced by pure hue transients, provided that the wavelength composition of the target and flanking mask is sufficiently different from that of the background (Breitmeyer et al. 1991; Reeves 1981).
5. Result 4, as well as those of Foster (1978, 1979), show that, contrary to the earlier study by Alpern and Rushton (1965), different cone mechanisms can interact to produce type B metacontrast.
6. The latency differences between cone mechanisms play a role in determining the SOA at which optimal type B metacontrast occurs (Foster 1979).
7. There may be a metacontrast effect specific to peripheral receptor processes and another, more centrally located, metacontrast effect expressing mechanisms with spectral sensitivities that have been sharpened and influenced by opponent processes (Reeves 1981; Yellott and Wandell 1976).
These findings have important implications for theories of metacontrast, in particular for those based on the sustained–transient channel distinction proposed by Breitmeyer and Ganz (1976). Because of developments over the past 25 years in the study of parallel pathways in monkey vision, an updated sustained–transient approach to masking (Breitmeyer 1992) (see Chapter 5) takes the parvocellular (P) and magnocellular (M) pathways as neural analogs of these channels. It is sometimes found (Hicks et al. 1983; Krüger 1979; Lee et al.
1988, 1989a,b), and hence claimed (Livingstone and Hubel 1988; Ramachandran 1990; Skottun and Parke 1999), that stimuli produced by isoluminant hue substitutions produce little, if any, activity in the M pathway. However, neurophysiological evidence indicates that M cells can indeed respond to isoluminant stimuli varying only in wavelength, provided that the cone contrast is sufficiently high, as it is with stimuli at red–green (or blue–yellow) isoluminance (Krüger 1979; Lee et al. 1988; Schiller and Colby 1983). In particular, Lee et al. (1989b) note that
non-linearities occurring at or before the summing of m-w and l-w cone inputs to the M cells could produce their brisk responses to red–green isoluminant borders. Hence transient-masking paradigms should yield masking effects with stimuli consisting not only of luminance changes but also isoluminant hue changes that produce large cone-contrast changes. The strong effects with red–blue hue substitutions reported by Glass and Sternheim (1973) and the strong metacontrast masking with red–green hue substitutions reported by Reeves (1981) and Breitmeyer et al. (1991) indicate that the transient magnocellular responses play a significant role in transient Crawford-type masking as well as in metacontrast masking. A feature of the foveal and parafoveal M pathway that is relevant to masking theories based on inhibitory interactions between P and M channels is the l-w sensitivity of receptive-field inhibitory surrounds and the consequent suppression of M-pathway activity by diffuse red light (De Monasterio 1978a,b; De Monasterio and Schein 1980; Dreher et al. 1976; Livingstone and Hubel 1984; Marrocco et al. 1982; van Essen 1985; Wiesel and Hubel 1966). Assuming that transient responses (Francis 2000) and, in particular, transient-on-sustained inhibition contribute to backward pattern masking, we should find weaker metacontrast when stimuli are presented on red backgrounds than when they are presented on isoluminant white, green, or blue backgrounds. Such results have been reported in a number of metacontrast studies (Breitmeyer et al. 1991; Breitmeyer and Williams 1990; Edwards et al. 1996; Williams et al. 1991). 
Moreover, these effects of background wavelength on human transient M-pathway activity, and thus on metacontrast, have recently been corroborated in a number of other psychophysical paradigms including simple reaction time (Breitmeyer and Breier 1994), choice reaction time discriminations when making categorical and coordinate spatial judgments (Roth and Hellige 1998), choice reaction time discriminations between local and global stimuli (Michimata et al. 1999), and temporal resolution (Yeshurun 2004).
2.7. Variations on the backward masking and metacontrast themes
The standard metacontrast technique entails the presentation of a target followed by a spatially surrounding mask. A typical example is the use of a target disk and a surrounding annulus as a mask (see Fig. 2.1(a)). However, other stimulus-presentation techniques not strictly adhering to this procedure can also yield metacontrast effects. The major variations in the metacontrast method are described below with a discussion of their associated empirical findings.
2.7.1. Sequential blanking
A type of masking that is directly relatable to metacontrast is what Mayzner and coworkers have called sequential blanking (Mayzner and Tresselt 1970; Newark and Mayzner 1973; Tresselt et al. 1970). Although the exact sequence order and variables such as meaningfulness of the stimuli can affect performance (Mayzner and Tresselt 1970; Mewhort et al. 1978; Newark and Mayzner 1973), in our opinion, these sequential blanking effects are a variety of sequential metacontrast. The discussion of Piéron’s (1935) work in Chapter 1 showed that an optimal temporal sequence of spatially adjacent stimuli can successively produce masking of any prior stimulus in the spatiotemporal sequence by its immediately following and spatially adjacent stimulus; only the last stimulus is immune to brightness or form suppression. Less sequential masking was observed at suboptimal temporal sequences, either slower or faster ones, i.e. at larger and smaller SOAs separating successive stimuli in the sequence. Since metacontrast is typically optimal at intermediate SOAs of 50–100 ms, it would be reasonable to expect that sequential blanking is also optimal at successive SOAs of about 50–100 ms. This result was obtained in an experiment reported by Hearty and Mewhort (1975). These investigators used a sequentially presented array of eight letters. The letters were presented sequentially for 5 ms each at varying ISIs either from left to right or from right to left, i.e. the temporal order reflected the left-to-right or the right-to-left spatial order. ISIs of 0, 50, 100, and 200 ms were used, corresponding to SOAs of 5, 55, 105, and 205 ms. Hearty and Mewhort (1975) found that optimal masking of the location of the letter E embedded randomly in the spatial eight-letter array was obtained at an SOA of 55 ms. 
Since determining the location of E also relies on determining its identity, multistimulus sequential blanking, much like metacontrast using a two-stimulus sequence, was also a type B U-shaped function of SOA. Based on these results, it can reasonably be concluded that sequential blanking is a form of sequential metacontrast as originally demonstrated by Piéron (1935).
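The correspondence between the ISIs and SOAs quoted above for Hearty and Mewhort (1975) follows from the onset-to-onset definition of SOA: with no temporal overlap, the SOA is the duration of the leading stimulus plus the blank ISI that follows it. The short Python sketch below only illustrates that arithmetic. The 5-ms duration, the four ISIs, and the eight-letter array come from the text; the function names are hypothetical, and the onset schedule assumes a strictly sequential presentation with a constant ISI.

```python
# Illustrative sketch only: the function names are hypothetical, not taken
# from the original study. Durations and ISIs are the values quoted in the text.

LETTER_DURATION_MS = 5          # each letter was presented for 5 ms
ISIS_MS = [0, 50, 100, 200]     # interstimulus intervals used
N_LETTERS = 8                   # size of the sequentially presented array


def soa_from_isi(duration_ms, isi_ms):
    """SOA is measured onset to onset: leading duration plus blank ISI."""
    return duration_ms + isi_ms


def onset_times(n_items, duration_ms, isi_ms):
    """Onset (ms) of each item, assuming strictly sequential presentation."""
    soa = soa_from_isi(duration_ms, isi_ms)
    return [i * soa for i in range(n_items)]


soas = [soa_from_isi(LETTER_DURATION_MS, isi) for isi in ISIS_MS]
print(soas)                                            # [5, 55, 105, 205]
print(onset_times(N_LETTERS, LETTER_DURATION_MS, 50))  # onsets at the 55-ms SOA
```

Under these assumptions, the 55-ms SOA at which masking of the letter's location was optimal places the eighth letter's onset 385 ms after the first.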
Moreover, our own unpublished observations of the sequential blanking effect, employing an array of five adjacent vertical bars presented temporally in sequence from left to right, showed that sequential brightness and form suppression of the first four bars is optimal at an SOA at which one obtains a perception of rapid formless motion (phi motion) proceeding from the first to the fifth bar position, and hence through the fourth position. Moreover, as already reported by Wertheimer (1912), by alternately flashing two adjacent bars at optimal rates we were able to produce pure formless phi motion for some time, after which, in line with the findings of Mewhort et al. (1978), pattern or form was increasingly observable. Contrary to von Grünau’s (1978b) claim, these results suggest that form per se is not necessary to obtain stroboscopic motion. Of course, one could argue that, in the former five-bar sequence, the form of the final fifth bar is not masked and therefore is necessary for perceiving motion throughout the first four bar positions. Here, one could use an adaptation of von Grünau’s (1978b, 1979, 1981) procedure and mask the final bar by two flanking bars to see if one can nonetheless perceive formless phi movement.
2.7.2. The standing-wave illusion
As noted in Chapter 1, section 1.3.2, Werner (1935) showed that when a sequence composed of a target and a surrounding mask is cycled, one can adjust the target-to-mask SOA (metacontrast) and separately the mask-to-target SOA (paracontrast) to obtain almost complete elimination of the target’s visibility. Related findings were reported by Schiller and Smith (1966). Macknik and Livingstone (1998) fine-tuned this illusory effect and called it the standing wave of invisibility. They reported that the illusion was strongest when the target and mask, at durations of 60 ms and 110 ms, respectively, were cycled in counterphase. Using optical imaging techniques and very similar target and mask durations of 50 ms and 100 ms, respectively, Macknik and Haglund (1999) found neural correlates of the standing-wave illusion in V1 of monkey. Macknik and Livingstone (1998) argued that the illusion most likely occurs because every mask presentation strongly masks not only the preceding target via metacontrast but also the following target via paracontrast. We concur with this interpretation and add the following elaboration. In our own studies (Breitmeyer et al. 2004a; Öğmen et al. 2003; see also sections 2.4 and 2.5.1 above) we consistently found
metacontrast and paracontrast brightness suppression to be maximal at SOAs ranging from 50 to 60 ms and from –120 to –90 ms, respectively. Since the effective meta- and paracontrast SOAs used in the standing-wave illusion reported by Macknik and Livingstone (1998) fall in these ranges, one would expect almost optimal suppression of the target’s visibility.
2.7.3. Feature inheritance and shine-through
Another phenomenon noted by Werner (1935; see also Chapter 1, section 1.3.3) was that at times features of the suppressed target were transferred to the mask. This process, now known as feature inheritance (Herzog and Koch 2001), was subsequently also noted by Stewart and Purcell (1970) and Hofer et al. (1989) and was investigated by Wilson and Johnston (1985). The latter investigators, along with Enns (2002), point out that feature inheritance poses problems for feedforward contour-interaction theories of visual masking. Deferring these theoretical issues until Chapter 6, we now turn to some of the properties characterizing feature inheritance and its close ally the shine-through effect (Herzog and Koch 2001). Using target and mask durations of 3.5 ms or less, Wilson and Johnston (1985) showed that the inheritance of a gap in a target line by an uninterrupted adjacent mask line is, like metacontrast masking, a type B function of SOA. Gap inheritance was relatively weak at SOAs of 0 and 400 ms, but reached a peak at an intermediate SOA of 100 ms. Although Wilson and Johnston (1985) did not measure target visibility as a function of SOA, it is highly likely that feature inheritance is directly related to the extent to which the target’s visibility is suppressed in a similar type B fashion. This indicates that the mechanism responsible for type B metacontrast overlaps substantially with that responsible for feature inheritance. Recently, Herzog and coworkers (Herzog and Fahle 2002; Herzog and Koch 2001; Herzog et al. 2001a,b, 2003a,b,c) have extensively investigated properties of feature inheritance and its companion, the shine-through effect. In these studies, the target usually consists of a brief presentation (e.g. 20 ms) of a pair of line segments and the aftercoming mask, varying in duration from about 30 to 300 ms, is a grating-like stimulus that spatially overlaps the target. 
Figure 2.9 shows examples of target and mask stimuli and the attendant percepts when feature
inheritance prevails. Similar to the results reported by Wilson and Johnston (1985), the mask not only suppresses the visibility of the targets (a pair of vernier-offset bars, tilted bars, or motion-displaced bars) but also perceptually inherits the respective offset, tilt, or motion of the target. As shown in Figure 2.9, feature inheritance is strongest when the number of elements in the mask grating is between three and seven; as this number increases, the target ‘shines through’ the mask grating, thus becoming increasingly visible. The presence and strength of both these phenomena are affected significantly by a variety of spatial and temporal properties of the mask and target stimuli. In terms of spatial properties of the mask, the spatial layout of the mask elements and their contextual modulation by gestalt and grouping factors have strong effects on whether feature inheritance or shine-through prevails (Herzog and Fahle 2002; Herzog et al. 2001a, 2003a). Generally speaking, when gestalt or grouping factors disrupt the perceptual regularity or homogeneity of an extensive mask grating of, say, 25 elements, so that the central or kernel five to seven elements that overlap the target are perceptually segregated from their flanking contextual elements, the strength of shine-through diminishes substantially. Although numerous, complex, and sometimes subtle, these psychophysically measured spatial effects of the grating mask can nonetheless be explained by a simple neural network with dynamically defined lateral interactions (Herzog and Fahle 2002), consistent with the existence of horizontal connections that characterize interactions within levels of
[Figure 2.9: physical stimuli (a target shown for 0–30 ms or 0–20 ms, followed by a mask shown for 30–330 ms or 20–320 ms) and the resulting percepts (feature inheritance and shine-through).]
Fig. 2.9 Stimuli used in studies of feature inheritance and shine-through effects. (Reproduced from Herzog and Koch 2001.)
visual cortex (Ernst et al. 2001; Hupé et al. 2000; Kapadia et al. 1995; Li and Gilbert 2002; Rossi et al. 2001; Spillmann and Werner 1996; Stettler et al. 2002). The strength of shine-through also depends on the temporal integrity of the kernel and contextual elements comprising the aftercoming mask grating. Small asynchronies (from 10 to 30 ms) between the onsets of the kernel and contextual elements of the mask grating produce dramatic reductions in the shine-through effect (Herzog and Koch 2001; Herzog et al. 2001b, 2003b). Similarly, increases of asynchrony between the offsets of a 300-ms kernel and briefer contextual elements reduced the shine-through effect significantly (Herzog et al. 2001b). The presence and strength of feature inheritance and shine-through also depend on temporal aspects of the target. According to Herzog et al. (2001a,b), neither shine-through nor feature inheritance is observed consistently for target durations less than 10 ms. With target durations of 10–20 ms, shine-through emerges and subsequently feature inheritance also emerges at target durations of 30–50 ms. In so far as shine-through and feature inheritance are related to feature-segmentation and feature-binding processes, respectively, Herzog et al. (2001b) conclude that feature binding requires more time than feature segmentation. In either case, we believe that feature binding to perceptual objects is the key process. In shine-through, the target features are bound to the target object, so that it and the mask are perceived as two independent objects; in feature inheritance, the target features are inherited by the perceived mask grating while the target is rendered perceptually invisible. These phenomena present a powerful paradigm for investigating the spatiotemporal properties of image segmentation, feature binding and feature fusion (Herzog et al. 2001b). 
Moreover, since feature inheritance implies a key role of re-entrant cortical activity in masking (Enns 2002; Herzog et al. 2001b), it is of particular significance to object-substitution models of masking, which will be discussed more fully in Chapter 4.
2.7.4. Common-onset masking
Equally relevant for object-substitution models of masking are findings from a form of masking called common-onset masking (Bischof and Di Lollo 1995; Cohene and Bechtold 1974, 1975; Di Lollo et al. 1993, 2000; Enns and Di Lollo 2000; Markoff and Sturr 1971). In this
masking paradigm the onsets of the target and spatially neighboring mask are synchronous; however, while the target duration can be as brief as 10 ms, the mask duration is varied systematically and thus is capable of outlasting the target’s duration up to 350 ms or more. Two key features of the findings obtained with common-onset masking are illustrated in Figure 2.10. First, the level of masking increases as the number of elements in the target display increases. Secondly, masking magnitude increases as mask duration increases and attains a maximum at a mask duration of about 150 ms, with that maximum maintained for even longer durations. Thus little if any masking is obtained at any mask duration when only a single target element is presented; however, when several elements are in the target display, masking magnitude increases as mask duration increases. When target arrays are larger, the increased number of distracters puts a higher load on limited attentional resources, thus reducing attentional effectiveness at any one location. As a consequence, systematic variation of attention and variation of mask duration jointly modulate the strength of common-onset masking. Thus attention, or more precisely the relative lack of attention, is of prime importance in producing strong common-onset masking. A general interpretation of these findings is that the processing of the mask up to at least 150 ms
[Figure 2.10: percentage of correct responses (0–100) plotted against duration of mask (0–320 ms), with separate curves for set sizes 1, 2, 4, 8, and 16.]
Fig. 2.10 Percentage of correct responses as a function of the duration of mask in the common-onset masking experiments. The different symbols correspond to different numbers of elements on the display. The full and open symbols show results from dark- and light-adapted conditions, respectively. (Reproduced from Di Lollo et al. 2000.)
after termination of the target in some way increasingly disrupts the processing of the target, but only when attentional resources are scarce at the target location. The fact that later post-target (levels of pattern) processing can affect earlier ones suggests a contributory role of re-entrant activity, thus rendering these results of direct relevance to object-substitution models of masking. However, as noted in Chapter 4, section 4.5.2, several models not requiring re-entrant processing can also account for common-onset masking.
2.8. Pattern masking by noise and structure masks
Generally, noise masks yield strong type A forward and backward masking effects with monoptic viewing of stimuli (Kinsbourne and Warrington 1962a,b; Schiller 1966; Schiller and Smith 1965). With such viewing, the forward masking effects are generally stronger and extend over a longer temporal interval than backward masking effects (Kinsbourne and Warrington 1962b; Scharf and Lefton 1970; Schiller 1966; Schiller and Smith 1965; Smith and Schiller 1966). Moreover, noise masking becomes stronger with higher mask intensities and as the spatial overlap of the noise mask with the target increases (Schiller 1966). When the target and mask stimuli are presented dichoptically, one still obtains type A forward and backward masking effects; however, compared with the backward effect, the forward effect is relatively weak and extends over a smaller temporal interval (Greenspoon and Eriksen 1968; Smith and Schiller 1966; Turvey 1973). Moreover, in dichoptic backward noise masking, Turvey (1973) found that the intensity of the mask does not seem to be an important masking parameter, although Monahan and Steronko (1977) subsequently reported that mask (as well as target) intensity is an important parameter. However, it should be noted that Monahan and Steronko selected their subjects so that equal masking results were obtained with the target stimulus (a letter) presented to either eye. 
This procedure presumably eliminated eye-dominance differences between the left and right eyes, which indicates, in line with the results reported by Breitmeyer and Kersey (1981) and Michaels and Turvey (1979), that eye dominance may play an important role in dichoptic pattern masking. Masking by structure differs from masking by noise in that the mask is structurally related to the target pattern (compare Figs 2.1(b) and 2.1(c)), i.e. the mask and target share figural features like orientation or
curvature (Gilinsky 1967, 1968; Houlihan and Sekuler 1968; Sekuler 1965; Uttal 1973), thus producing spatial proximity of similar contours. Under these conditions, particularly at M/T energy ratios greater than 1.0, strong type A forward and backward masking effects can be obtained monoptically and dichoptically (Gilinsky 1967, 1968; Greenspoon and Eriksen 1968; Scharf and Lefton 1970; Sekuler 1965; Taylor and Chabot 1978). However, whereas monoptic or dichoptic forward masking effects are generally type A, monoptic backward structure masking effects can also be type B (Bachmann and Allik 1976; Herrick 1974; Weisstein 1971), especially when the M/T energy ratio is at or below 1.0 (Hellige et al. 1979; Michaels and Turvey 1973, 1979; Purcell and Stewart 1970; Spencer and Shuntich 1970; Turvey 1973) or when the stimuli are presented extrafoveally (Purcell et al. 1975). Under dichoptic foveal or extrafoveal viewing, type B backward masking functions can be obtained at M/T ratios above as well as below 1.0 (Michaels and Turvey 1979). This indicates that under monoptic, especially foveal, viewing, the transition from type B to type A functions as the M/T ratio increases above 1.0 is due to common integration of mask and target luminance-contrast information at peripheral levels (Hellige et al. 1977; Turvey 1973). For instance, similar to the results reported by Breitmeyer (1978b) (see Fig. 2.3), Hellige et al. (1979) and Michaels and Turvey (1979) found that the backward structure-masking function tended to shift from type B to type A as the M/T energy ratio changed from 1/2 to 2/1. This shift, as shown in Figure 2.11, was more pronounced for masks whose figural features were different from those of the target. 
The similarity between type B backward masking functions, produced particularly when the overlapping target and mask share many figural features and when standard nonoverlapping metacontrast stimuli are used, suggests that both type B effects result from interactions among correspondingly similar underlying mechanisms selectively sensitive to the same figural information. Moreover, transient overshoots at on- and offsets of a prolonged structure mask, similar to those obtained with a uniform light flash (Crawford 1947), have also been reported (Matin 1974; Mitov et al. 1981). Mitov et al. used the following adaptation of Green’s (1981) study of the masking of gratings by a prolonged uniform flash of light. In their study, Mitov et al. employed sinusoidal gratings as masks (500 ms) as well as targets (20 ms). As long as the spatial frequencies of
[Figure 2.11: three panels labeled 2:1, 1:1, and 1:2 (top to bottom), each plotting percentage correct (0–100) against SOA (0–200 ms) for ‘Same’ and ‘Different’ masks.]
Fig. 2.11 Percentage of correct target recognition as a function of SOA. The results plotted with solid circles and lines show masking functions obtained when the target and mask had the same or very similar features; the results plotted with open circles and broken lines show masking functions when the target and mask were composed of different features. Upper, middle, and lower panels show results at T/M energy ratios of 2:1, 1:1, and 1:2, respectively. (Reproduced from Hellige et al. 1979.)
both target and mask gratings were below 6 c/deg (but not necessarily equal to each other), pronounced transient masking overshoots were obtained at and near the on- and offsets of the mask flash. However, when the target and mask consisted of an 18-c/deg and a 6-c/deg grating, respectively, only weak attenuated transient overshoots were obtained. Moreover, when the latter target and mask spatial frequencies were reversed, no transient overshoots occurred; only a sustained masking effect (reducing the visibility of the 20-ms 6-c/deg test grating for the entire 500-ms duration of the 18-c/deg mask grating) was obtained. The former findings of transient overshoots when low spatial frequency masks were employed may be related to the results of
Matthews (1971) and Teller et al. (1971), showing that transient overshoots at on- and offsets of the mask occur only if the diameter of a uniform-light mask exceeds that of the spatial excitation pools of rod- and cone-driven processes. Backward pattern and noise masking also have differential effects on judging the location compared with the identity of a target. We have already discussed such a difference between metacontrast suppression of target location and target contrast or identity information (see section 2.5.1). Results from several studies (Finkel 1973; Finkel and Smythe 1973; Mewhort and Campbell 1978; Mewhort et al. 1993; Schiller 1965) indicate that there are separate post-offset representations of stimulus-location and stimulus-identity information that are processed by functionally or anatomically distinct systems (Dick and Dick 1969). Consistent with these results, Smythe and Finkel (1974) showed that a backward noise mask suppresses target-identity information over a longer temporal interval than it suppresses target-location information. This indicates that post-offset location information persists for a shorter interval or is processed sooner than identity information. Our own unpublished experimental findings confirm this result. In our experiment, a random-dot pattern served as a mask. Four capital letters randomly chosen from a set of 12 and arranged randomly at any four of 12 clock positions around an imaginary circle concentric with the fixation point served as targets. Target and mask were flashed for 15 ms and 50 ms, respectively, with the target–mask SOA ranging, in equal 25-ms steps, from 15 to 315 ms. The proportion of errors made by observers in reporting location or identity information (the two reports were tested in separate sub-experiments) was taken as a measure of masking magnitude. The results showed that both location and identity information were masked equally well at an SOA of 15 ms. 
However, although both functions declined in type A fashion with SOA, the location-masking function declined at a significantly faster rate than the identity-masking function. Whereas no location masking was obtained at SOAs beyond 90 ms, identity masking was obtained up to an SOA of 240 ms.
2.9. Summary
Pattern masking has several useful methodological and theoretical applications when investigating the temporal dynamics of visual
perception, the interaction of visuosensory with higher visuocognitive functions, and the varieties of unconscious and conscious visual processing in normal as well as selected subject populations. It can be differentiated operationally using several procedures. When the target and mask patterns share common structural features and overlap spatially, masking by structure prevails. A mask consisting of random elements with no obvious structural relation to features of the target pattern is employed in masking by noise. Non-overlapping spatially adjacent mask and target patterns are used to investigate (forward) paracontrast and (backward) metacontrast forms of lateral masking. In para- and metacontrast the type of function one obtains depends on the response criterion adopted by an observer and on several stimulus variables. Paracontrast is typically type B when brightness ratings/matches, contour discrimination, or form identification are used, and is type A when simple detection is employed. With any of these criteria, the magnitude decreases as mask intensity decreases and spatial separation between the target and mask stimuli increases. Type B paracontrast can also be obtained dichoptically, implicating mechanisms at or beyond binocular levels of visual processing. Metacontrast yields type B functions when brightness, color, contrast, contour, or figural identity provide the criterion content. Simple detection or reaction time tends to yield either no masking, indicating that some target information is immune to the suppressing effects of the mask, or type A masking if the M/T energy ratio is significantly greater than 1.0. Without corresponding shifts in criterion content, type B metacontrast shifts to a type A function as M/T energy increases from a value below 1.0 to values above 1.0. For M/T energy ratios less than or equal to 1.0, the SOA at which optimal type B metacontrast occurs shifts to lower values as either stimulus or background luminance decreases.
Type B metacontrast suppression of contour is indifferent to the sign of contrast of the target relative to that of the mask; moreover, metacontrast can be produced by the offset of a prolonged mask, as well as by the briefly flashed mask that is typically employed. In addition to these contrast and intensity variables, spatial variables also affect metacontrast. The magnitude of metacontrast decreases as the orientation or contour difference between target and mask increases, and as the spatial separation between target and mask stimuli increases. Generally, its magnitude is also weakest at the fovea and increases with
retinal eccentricity. Stimulus size is an important variable which interacts with retinal location; smaller and larger stimuli are more effective foveally and extrafoveally, respectively. Interocular or dichoptic presentation of target and mask stimuli yields type B metacontrast which can be of greater magnitude than that obtained with monocular presentation, indicating interaction of target and mask responses at or beyond the level of binocular combination and also the possible additional contribution of binocular rivalry. At stereoscopic or cyclopean levels of visual processing, which bypass monocular pattern processing, metacontrast yields only type A functions, indicating that the mechanism responsible for type B metacontrast (i) requires standard first-order stimuli composed of luminance or chromatic contrast relative to a uniform background and (ii) resides at prestereoptic binocular levels of processing. Although initial studies failed to yield evidence for metacontrast when target and mask stimuli activated different wavelength-selective or receptor mechanisms, more recent studies have found that both rod–cone interactions and interactions between different cone mechanisms can be obtained. Moreover, transients produced by hue-substitution stimuli against isoluminant fields are also capable of producing type B metacontrast. Masking phenomena associated with stroboscopic motion and sequential blanking can also be related to metacontrast masking effects. Moreover, recycling target and mask stimuli at optimal para- and metacontrast SOAs can produce a standing-wave illusion in which one perceives only the masks flanking a completely suppressed target. Monocular pattern masking by noise or structure is typically a type A effect for M/T energy ratios greater than 1.0. This holds for forward as well as backward masking, although the forward masking effect is usually effective over a larger time interval than backward masking.
When the mask energy or contrast is low relative to that of the target (M/T < 1.0), a backward foveal type B function can result with monocular masking, particularly with masking by structure. Moreover, type B effects are also obtainable extrafoveally with equal-energy stimuli. This shows that the type B effect obeys similar target energy and retinal location relationships as does metacontrast. Dichoptic masking by noise usually also produces type A effects; however, in contrast with the monoptic effects, backward masking is usually stronger than forward masking. Furthermore, a backward type B dichoptic structure masking
effect can also be obtained for equal-energy target and mask stimuli, indicating that this type B effect, like the metacontrast effect, includes central, cortical levels of processing. Additionally, the attenuation of monocular type A effects with relatively weak masks or under dichoptic viewing shows that it is due to energy-dependent integration of target and mask contrast in common peripheral pattern-processing channels. Both backward structure masking and metacontrast can produce a phenomenon known as feature inheritance whereby the mask inherits features of the visibly suppressed target. Backward structure masking can also yield the phenomenon of shine-through in which the target appears in a stronger or exaggerated form ‘through’ the veil of the overlapping mask. These phenomena, which are critically dependent on timing parameters in the millisecond range, have proved to be a very useful means of investigating the temporal and spatial properties of feature coding and feature binding and may also be important in distinguishing between different types of backward masking mechanisms. A briefly flashed target pattern informs the visual system not only of its surface (brightness, color) and figural (contour, edge) aspects but also of its location in the visual field. Masking by noise and structure affects location and figural identification differentially. The temporal interval over which one can obtain masking of target location information is significantly shorter than the interval over which one can obtain masking of pattern information. This result suggests that channels signaling the location of a stimulus integrate information over a shorter interval of time than channels carrying pattern information.
Notes
1. Cox and Dember (1972) obtained type B metacontrast functions using a detection task; hence their results are an exception to the general finding that no masking is obtained with a detection criterion.
This discrepancy can be explained on the basis of employing different criterion contents. Cox and Dember (1972) used black-on-white target and mask stimuli and subjects were required to detect the target. One way of obtaining a type B function in a detection task is to require subjects to detect the presence of a black target rather than just any target presence. In the former case the criterion content would be brightness or contrast; in the latter case, it could be apparent movement. Since brightness or contrast reversal can be obtained in backward masking by light and metacontrast (Barry and Dick 1972; Brussel et al. 1978; Purcell and Dember 1968) at optimal SOAs, the location of the black target will appear even brighter than the background. Hence, if a contrast criterion is used, the target will not be detected as black.
2. The stimuli were generated using a Cambridge Research Systems visual stimulus generator (VSG2/3). A Vis Res Graphics VRG2–21 monochrome monitor with P46 phosphor driven at a frame rate of 239 Hz was used. The resolution of the monitor was 704 pixels horizontally (H) and 400 pixels vertically (V). At the viewing distance of 108 cm, each pixel was 1.65 (H) × 1.46 (V) min². VSG2/3 was used in 12-bit luminance resolution mode and the luminance of the monitor was gamma-corrected for linearity. The background field was of dimensions 19.4 (H) × 9.8 (V) deg² and had a luminance of 10 cd/m². A fixation stimulus was presented at the center of the display. The stimulus configuration was similar to that used by Macknik and Livingstone (1998). The target consisted of two vertical bars presented symmetrically 3.3 to the left and right of the fixation stimulus. The width of the target bars was 16.5. One of the target bars (left or right chosen randomly from trial to trial) had a height of 3.9 and the other had a height of 4.88. The mask stimuli consisted of two flanking bars for each target bar. Each of the mask bars was of dimensions 16.5 (H) × 4.88 (V). The horizontal center-to-center separation between the target bar and each of the mask bars was 19.86. The shorter target bar was always presented within the space between the mask bars but had a random vertical offset to minimize position cues. The subject’s task was to indicate which of the two target bars (left or right) appeared shorter.
3. In our own informal observations using a variety of target and mask shapes and sizes (e.g. vertical and horizontal bars of varying lengths and heights) we have observed cross-polarity metacontrast suppression of brightness or contrast, particularly when the target (and mask) stimuli were relatively small. Thus a more systematic parametric study to determine under what task and stimulus conditions one does or does not obtain such metacontrast suppression seems to be in order.
4. While one of us (BGB) was a research associate with Dr Bela Julesz at Bell Laboratories, Murray Hill, NJ, in the year 1973–1974, he conducted an extensive set of unreported experiments on backward lateral masking using cyclopean stimuli consisting of a central vertical rectangle flanked by two vertical mask rectangles. Orthogonally varying parameters such as lateral separation between target and mask stimuli, size of target and mask stimuli, disparity (crossed or uncrossed) of both the target and mask stimuli relative to the cyclopean background plane, and disparity or depth of the cyclopean target relative to the mask uniformly yielded type A backward masking functions. These results are in line with those of the two experiments published by Vernoy (1976) and Lehmkuhle and Fox (1980).
5. For a more extensive and illustrated account of Stiles’s rationale and the accompanying experimental results, see Marriott (1962).
6. One of us (BGB) has also noted this effect in his unpublished observations of metacontrast. Here, an outline square with a gap in one of the randomly chosen sides served as a target and a surrounding larger outline square served as a metacontrast mask. At intermediate SOAs, where metacontrast suppression of the target is pronounced, one could nevertheless determine the position of the gap in the target by the phenomenal presence of an inherited gap or indentation in the adjacent mask contour (see Fig. 2.2).
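The display geometry reported in note 2 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming that every pixel subtends the same visual angle (a small-angle simplification); the function and constant names are illustrative and do not come from the original study.

```python
# Values quoted in note 2; names are illustrative assumptions.
PIXELS_H, PIXELS_V = 704, 400      # monitor resolution in pixels
ARCMIN_PER_PIXEL_H = 1.65          # horizontal pixel size in arcmin
ARCMIN_PER_PIXEL_V = 1.46          # vertical pixel size in arcmin
FRAME_RATE_HZ = 239                # monitor frame rate


def extent_deg(n_pixels, arcmin_per_pixel):
    """Angular extent in degrees, treating all pixels as equal in angular size."""
    return n_pixels * arcmin_per_pixel / 60.0


def frame_duration_ms(frame_rate_hz):
    """Duration of one video frame in milliseconds."""
    return 1000.0 / frame_rate_hz


width_deg = extent_deg(PIXELS_H, ARCMIN_PER_PIXEL_H)
height_deg = extent_deg(PIXELS_V, ARCMIN_PER_PIXEL_V)
frame_ms = frame_duration_ms(FRAME_RATE_HZ)

print(f"field: {width_deg:.2f} x {height_deg:.2f} deg, frame: {frame_ms:.2f} ms")
```

Under this simplification the full screen works out to roughly 19.4 deg horizontally and 9.7 deg vertically, close to the stated background field, and one frame lasts a little over 4 ms, which sets the finest step in which stimulus durations and SOAs could be varied on such a display.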
Chapter 3
Neurobiological correlates of visual pattern masking
3.1. Introduction
Because masking is not a unitary phenomenon, the visibility of a target can be reduced as a result of a large variety of neural factors including light adaptation at the photoreceptors, inhibitory receptive-field surrounds, and sophisticated cortical interactions. Since it is beyond the scope of this chapter to review all of these mechanisms, we will focus on neural correlates of visual pattern masking at cortical levels. Within this context, studies that employed a uniform flash of light as a mask and obtained type A forward and backward masking functions (Coenan and Eijkman 1972; Donchin and Lindsey 1965; Donchin et al. 1963; Fehmi et al. 1969; Lindsey et al. 1967) are not reviewed. Such functions would be expected, since uniform flashes of light typically produce type A forward and backward masking functions when psychophysical indicators of masking are used (e.g. Sperling 1965). Since uniform flashes of light, by definition, do not contain any contours defined by luminance or chromatic contrasts, the use of such masks renders these results largely irrelevant to pattern masking. Early single-cell studies of visual masking in the lateral geniculate nucleus (LGN) of cat (Schiller 1968) also failed to yield anything other than type A functions of metacontrast or masking by pattern. Schiller (1968) correctly concluded that correlates of type B brightness suppression characterizing metacontrast are not to be found in single-cell activities of the retina or LGN. Below we review more recent studies that have investigated electrophysiological correlates of visual pattern masking at cortical levels in primates.
NEUROBIOLOGICAL CORRELATES OF VISUAL PATTERN MASKING
3.2. Evidence from animal studies
3.2.1. Metacontrast and pattern masking by structure in V1
Bridgeman reported evidence of metacontrast in responses of single cells in the striate cortex of cat (Bridgeman 1975) and monkey (Bridgeman 1980). His study in monkey V1 combined behavioral and electrophysiological measurements. The target was a disk and the mask was a ring. A pair of such disk–ring combinations was presented. In brightness discrimination trials, the SOA was zero for both disk–ring combinations. One of the disks had a higher luminance than the other and the monkey’s task was to press a response panel on the side of the brighter disk. The monkey was rewarded when the choice was correct. Metacontrast trials were interleaved with brightness discrimination trials. In metacontrast trials, both disks had the same luminance, but one disk–ring combination was presented at an SOA = 0 ms while the other was presented at an SOA > 0 ms. Based on training and reward schemes used in brightness discrimination trials, the monkey was expected to press the panel on the side of the disk which was perceived to be brighter even though both disks had the same luminance (in metacontrast trials, the monkey was always rewarded regardless of the choice). The target and mask stimuli were each of 40 ms duration. Stimulus parameters were kept fixed and thus were not matched to the response preferences of the cells under study. In the behavioral experiments, maximum metacontrast occurred at SOAs of 80 ms and 120 ms for the two monkeys (SOA = 120 ms was the highest SOA value used in the study). Electrophysiological recordings for metacontrast were conducted at a single SOA corresponding to the previously mentioned ‘optimal’ SOA for each monkey. Single-cell responses were divided into ‘early’ and ‘late’ components. The early component corresponded to neural firing from stimulus onset to (SOA + 70) ms (i.e. 150 ms and 190 ms for the two monkeys). The neural response during the period from the end of the early component to 400 ms after stimulus onset was analyzed as the late component.
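The division of a response into early and late components described above can be sketched as follows. This is a minimal illustration, assuming half-open time windows measured from stimulus onset; the function names and the spike times are hypothetical and are not taken from Bridgeman (1980).

```python
def split_windows(soa_ms, early_extension_ms=70, late_end_ms=400):
    """Return (early, late) analysis windows in ms relative to stimulus onset.

    The early window runs from onset to SOA + 70 ms and the late window from
    there to 400 ms, following the description in the text.
    """
    early = (0, soa_ms + early_extension_ms)
    late = (early[1], late_end_ms)
    return early, late


def count_spikes(spike_times_ms, window):
    """Count spikes in a half-open window [start, end) (an assumed convention)."""
    start, end = window
    return sum(1 for t in spike_times_ms if start <= t < end)


# Hypothetical spike times (ms after stimulus onset) for a single trial.
spikes = [20, 95, 140, 160, 210, 305, 390, 410]

early, late = split_windows(80)           # monkey with optimal SOA of 80 ms
early_count = count_spikes(spikes, early)
late_count = count_spikes(spikes, late)
print(early, late, early_count, late_count)
```

For the two optimal SOAs of 80 ms and 120 ms, the early windows end at 150 ms and 190 ms, matching the values given in the text. Comparing spike counts per window between correct and incorrect trials would then be one way to ask which component tracks the animal's choice.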
Most of the recorded cells had higher spike densities in the late interval than in the early interval. In the brightness discrimination task, the firing during the early interval did not correlate with the monkey’s performance; in other words, the distributions of spikes per trial for correct and incorrect trials were not
EVIDENCE FROM ANIMAL STUDIES
statistically different. On the other hand, firing during the late interval was significantly higher when the monkey’s choice was correct than when it was incorrect. Similarly, in metacontrast trials, firing during the late period, and not the early period, correlated with the monkey’s choice. Specifically, during the late period cells fired more strongly in trials in which the monkey chose the simultaneous disk–ring configuration than in trials in which it chose the disk–ring configuration with the optimum metacontrast SOA. With the reasonable assumption that there exists a perceptual correlate for the monkey’s behavioral choice, these results suggest that the responses during the late and not the early period correlate with perceived brightness whether perceived brightness is manipulated directly by luminance (brightness discrimination trials) or indirectly by adjacent stimuli (metacontrast trials). Macknik and Livingstone (1998) recorded primarily from V1 complex neurons (S.L. Macknik, personal communication) which not only responded briefly to the onset of a stimulus but, for stimuli of sufficiently long duration, also responded to stimulus offset with a relatively weaker after-discharge. The target in these experiments consisted of an oriented bar flanked laterally by a pair of similarly oriented bars acting as the mask. The dimensions of the target bar were optimized for each cell. In a first experiment on alert monkeys, the target and mask durations were 60 ms and 110 ms, respectively. In the paracontrast condition, the target was turned on immediately after the offset of the mask, producing an ISI of 0 ms. Similarly, in metacontrast, the mask was turned on immediately after the target was turned off (i.e. ISI = 0 ms). The results showed that the paracontrast mask inhibited the target’s response almost completely (i.e. both the initial response and the after-discharge), while the metacontrast mask selectively inhibited the target’s after-discharge.
However, it is not clear how these results would correlate with perception. First, unlike Bridgeman, Macknik and Livingstone did not investigate concomitant behavioral indices of masking in the monkeys. Secondly, because the paracontrast mask led to substantially stronger inhibition of the target’s response, one would expect the perceptual effect of the paracontrast mask to be much stronger than that of the metacontrast mask. However, metacontrast typically produces much stronger suppression of the target’s visibility than paracontrast (Alpern 1953; Weisstein 1972).
During the experiments on awake monkeys, the mask was often within the receptive field of the neuron and thus generated a response that was superimposed on the response generated by the target. To avoid this confound, Macknik and Livingstone conducted similar experiments in anesthetized paralyzed monkeys. In the absence of eye movements, they were able to position the masks outside the excitatory but inside the inhibitory regions of the cells’ receptive fields. In one experiment, the target and mask were of 100 ms duration each. As in the awake-monkey experiments, the target generated an initial strong discharge with a latency of about 90 ms. This target onset response decayed to baseline within approximately 85 ms and after a brief interval of baseline activity (approximately 35 ms) was followed by a weaker after-discharge. The after-discharge decayed to baseline by 350 ms after the onset of the target. The effect of the mask was to exert an inhibition with a delay of approximately 200 ms from its onset (or, equivalently, 100 ms from its offset) on the ongoing response of the target. Thus inhibition of the target’s onset response occurred when the mask preceded the target at an SOA of 100 ms (equivalently, an ISI of 0 ms; at this ISI, the delay of the mask’s inhibition coincides with the latency of the target’s onset response). The target’s after-discharge was inhibited when the mask was delayed by another 100 ms (equivalently, at an STA of 100 ms). In psychophysical experiments with similar stimuli conducted on human observers, Macknik and Livingstone (1998) found that the optimal paracontrast and metacontrast occurred at an ISI of 0 ms and an STA of 100 ms, respectively (see Chapter 2, section 2.6.2, for a discussion of these findings).
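The timing arithmetic for the paracontrast case in the anesthetized-monkey experiment can be made explicit with a short sketch. It assumes that all times are expressed relative to target onset, and that the 85-ms decay and the 35-ms baseline interval follow the 90-ms response latency in sequence; that reading of the text, like the function names, is an assumption.

```python
# Approximate values quoted in the text (ms); names are illustrative.
MASK_INHIBITION_DELAY_MS = 200   # delay of inhibition from mask onset
TARGET_LATENCY_MS = 90           # latency of the target's onset response
ONSET_RESPONSE_DECAY_MS = 85     # time for the onset response to decay
BASELINE_GAP_MS = 35             # baseline interval before the after-discharge


def isi_ms(soa_ms, first_stimulus_duration_ms):
    """Interstimulus interval: SOA minus the duration of the first stimulus."""
    return soa_ms - first_stimulus_duration_ms


def inhibition_arrival_ms(mask_onset_ms):
    """Time at which mask inhibition arrives, relative to target onset at 0 ms."""
    return mask_onset_ms + MASK_INHIBITION_DELAY_MS


# Paracontrast: a 100-ms mask precedes the target at an SOA of 100 ms.
paracontrast_isi = isi_ms(100, 100)
paracontrast_arrival = inhibition_arrival_ms(-100)

# Assumed start of the after-discharge: latency + decay + baseline gap.
after_discharge_start_ms = (
    TARGET_LATENCY_MS + ONSET_RESPONSE_DECAY_MS + BASELINE_GAP_MS
)

print(paracontrast_isi, paracontrast_arrival, after_discharge_start_ms)
```

With these numbers the paracontrast mask's inhibition arrives about 100 ms after target onset, near the 90-ms latency of the onset response, while the after-discharge would begin roughly 210 ms after target onset under the stated assumption.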
They concluded that in humans changes in target visibility during paracontrast are correlated with changes in the neural onset response, whereas changes in target visibility during metacontrast are correlated with changes in the after-discharge. However, it should be noted that, as mentioned above, the paracontrast mask at ISI = 0 ms suppressed both the onset response and the after-discharge in awake monkeys. Furthermore, to test whether the sustained portion of the response plays a role in visibility, Macknik and Livingstone analyzed the effect of an 84-ms mask on a 334-ms target in anesthetized paralyzed monkeys. The results showed that the mask can also suppress the sustained part of the target’s response. Macknik and Livingstone argued that this suppression occurred at SOA intervals at which no visual masking occurs and therefore the sustained portion of the
response should be less important than transient portions in determining the visibility of targets. To investigate the effects of mask stimuli on V1 responses, Lamme et al. (2002) used a figure–ground segregation paradigm in awake monkeys. The stimulus consisted of randomly positioned line segments oriented at either 45° or 135°. Figure and ground regions were determined by the orientation of the elements (i.e. by second-order information). For example, when the elements in the background had an orientation of 45°, the elements in the figure (a small square region) had an orientation of 135°. This figure region was placed randomly either to the left or to the right of fixation, and the task of the monkey was to indicate with an eye movement the location of the figure region. In previous studies, Lamme et al. (1999) showed that the early response of V1 neurons was identical whether the cells sampled the background or the figure region. On the other hand, an enhanced late response starting about 100 ms after stimulus onset was evident when the cells sampled the figure region compared with the case when they sampled the background region. Lamme et al. (2002) tested whether a pattern mask would suppress this response enhancement. A pattern mask consisting of randomly positioned patches, each containing line segments oriented at either 45° or 135°, was used. The duration of the pattern mask was 300 ms. The target was immediately followed by the pattern mask and thus target duration equaled the SOA, covering a range of 14–110 ms. In order to factor out the effects of target duration from those of masking, a second condition was used where the target was followed by a uniformly illuminated screen set at the space-averaged luminance of the pattern mask. This mask produced masking by light.
Behaviorally, both the pattern-mask and light-mask conditions generated monotonic type A functions, consistent with results obtained with human observers (Michaels and Turvey 1979; Sperling 1965; Turvey 1973). However, in the case of the light-mask condition, performance increased rapidly, reaching asymptotic levels for SOAs longer than approximately 50 ms, whereas in the case of pattern-mask, the increase in performance as a function of SOA was much slower, reaching the blank-screen level only at the longest SOA of 110 ms. Therefore the differences in performance as a function of target duration/SOA between the pattern-mask and light-mask conditions can be attributed mainly to pattern masking. This reasoning comports
well with human psychophysical results, in which masking by a large uniform light flash (blank screen) is found to be largely a precortical (perhaps retinal) phenomenon (Battersby and Wagman 1962), whereas pattern masking is a central cortical phenomenon (Michaels and Turvey 1979; Turvey 1973). Lamme et al. (2002) calculated the average multi-unit (neural) activity (MUA) obtained at different electrode sites and computed the difference for each SOA when the cells were sampling the figure (figure response) and when they were sampling the background (background response). The early part of the neural response (approximately the first 100 ms) was identical for the figure and the background MUAs. Differences between the figure and background signals were evident in the later part of the response. In the pattern-masking condition, the mean value of the difference between figure and ground MUAs computed from 100 to 140 ms after stimulus onset produced a type A curve that mimicked the behavioral type A curve (compare Fig. 3.1(a) and 3.1(b)).
Fig. 3.1 (a) Behavioral target visibilities (proportion correct) and (b) neurophysiologically derived target responses (MUA) as a function of target duration/SOA (ms) for the pattern-mask and light-mask conditions. (c) Functions showing visibility differences between the pattern- and light-mask conditions and (d) corresponding MUA differences. (Reproduced from Lamme, Zipser and Spekreijse (2002).)
Similarly, the response
difference in the blank-interval condition mimicked its corresponding behavioral type A curve (again, compare Fig. 3.1(a) and 3.1(b)). Although Lamme et al. (2002) did not compute the differences between the pattern-mask and light-mask type A functions, we show such difference functions,1 which yield the ‘pure’ cortical pattern-mask components, in the lower panels of Figures 3.1(c) and 3.1(d). Note that both the behavioral and MUA difference functions yield type B decreases of target visibility or target signal as a function of SOA, with peak pattern masking occurring at 30–40 ms, results similar to those found in humans (Michaels and Turvey 1979; Turvey 1973). Lamme et al. (2002) suggested that the early and the late portions of the recorded MUAs correspond to feedforward and feedback processing stages, respectively. They suggested that the role of the pattern backward mask was to reduce the late feedback response by selectively interrupting recurrent interactions between V1 and higher visual areas. If this is true, it appears that type B pattern masking, including metacontrast masking, is due to the disruption of such cortical recurrent interactions. This clearly has negative implications for any models of masking that rely strictly on feedforward neural mechanisms. For example, the original dual-channel transient-sustained model proposed by Breitmeyer and Ganz (1976), which was based on feedforward inhibitory interactions between sustained and transient channels, would have to be amended so that the inhibitory effect clearly acts on feedback activity from higher to lower visual cortical areas. In contrast with the aforementioned studies, von der Heydt et al. (1997) found no evidence for metacontrast effects in the responses of 20 single cells recorded in V1 and V2 of awake monkeys. The stimuli were of 33 ms duration. Cells that responded to uniform illumination were not affected by the mask.
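The difference functions between the pattern-mask and light-mask conditions described above amount to a pointwise subtraction of two curves. The sketch below shows how two monotonic (type A) functions can yield a U-shaped (type B) difference. All numbers are invented purely for illustration and are not the data of Lamme et al. (2002); the function names are likewise hypothetical.

```python
def difference_function(pattern, light):
    """Pattern-mask minus light-mask value at each SOA shared by both curves."""
    return {soa: pattern[soa] - light[soa] for soa in pattern}


def peak_masking_soa(diff):
    """SOA at which the difference is most negative, i.e. masking is strongest."""
    return min(diff, key=diff.get)


# Hypothetical proportions correct at hypothetical SOAs (ms); both curves
# rise monotonically, but the light-mask curve rises much faster.
pattern_mask = {14: 0.50, 28: 0.55, 42: 0.62, 56: 0.72, 83: 0.85, 110: 0.95}
light_mask = {14: 0.55, 28: 0.80, 42: 0.92, 56: 0.95, 83: 0.96, 110: 0.96}

diff = difference_function(pattern_mask, light_mask)
peak = peak_masking_soa(diff)
print(diff, peak)
```

Because the light-mask curve saturates early while the pattern-mask curve recovers slowly, the difference is small at the shortest and longest SOAs and largest at an intermediate SOA, which is the signature of a type B function.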
Responses of the cells that responded to edge contrast were inhibited by the mask for the SOAs ranging from −33 to 33 ms. However, it should be noted that these cells responded to the 33 ms stimuli very briefly and thus failed to show any sustained or late secondary responses like the cells in studies where backward masking effects have been reported. Breitmeyer (1984, pp. 183–4, 226) specifically hypothesized that the most likely site of the response-suppression effects of metacontrast masking ought to be found in P cells located in the upper layers of V1 (and beyond). Of the 16 cells
recorded in V1 by von der Heydt et al. (1997), 13 were located in layers 2–3; most were color selective, and thus most likely cortical P-blob cells (R. von der Heydt, personal communication). S.L. Macknik (personal communication) similarly reports that response suppression was not found predominantly in P cells in the study by Macknik and Livingstone (1998). However, both these studies failed to investigate concomitant behavioral indices of masking in their monkeys (S.L. Macknik, personal communication; R. von der Heydt, personal communication).
3.2.2. Paracontrast and metacontrast in V4
Kondo and Komatsu (2000) recorded from cells in area V4 of two awake monkeys. The target stimulus was a disk whose size was adjusted to be equal to or smaller than the size of the receptive field. The mask was a surrounding non-overlapping ring that shared a contour with the target. The sizes of the target and the mask stimuli were chosen so as to generate a much stronger response from the target than the mask when these stimuli were presented in isolation. The target and mask were each of 17 ms duration. At this short duration, the on and off responses tend to be fused. Kondo and Komatsu focused their investigation on the on-responses and excluded those cells that produced off-responses (assessed by a control stimulus of 500 ms duration). The SOAs varied from −133 to 133 ms. The neural response to the target was computed as the average firing rate during the period 50–250 ms after stimulus onset minus the background discharge rate during the 200–700 ms prior to stimulus onset. The responses had a typical latency of 50 ms and a duration of 100 ms. The neural response as a function of SOA showed monotonic type A functions for both paracontrast and metacontrast. The paracontrast function was broader than the metacontrast function, with masking effects lasting up to SOAs of approximately 77 and 65 ms, respectively. For a metacontrast mask, the late component of the typical 100 ms response was inhibited more strongly than the earlier component, similar to the findings reported by Bridgeman (1980). For a paracontrast mask, there was no clear distinction as to which part of the response was inhibited, similar to the findings reported by Macknik and Livingstone (1998). No behavioral measure of masking was obtained.
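The response measure used by Kondo and Komatsu, an evoked firing rate minus a pre-stimulus background rate, can be sketched as below. The window limits follow the text; the half-open window convention, the function names, and the spike times are assumptions made for illustration only.

```python
def firing_rate_hz(spike_times_ms, start_ms, end_ms):
    """Mean firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    n_spikes = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return n_spikes * 1000.0 / (end_ms - start_ms)


def target_response_hz(spike_times_ms):
    """Evoked rate 50-250 ms after onset minus the rate 700-200 ms before onset."""
    evoked = firing_rate_hz(spike_times_ms, 50, 250)
    background = firing_rate_hz(spike_times_ms, -700, -200)
    return evoked - background


# Hypothetical spike times (ms relative to stimulus onset) for one trial.
spikes = [-650, -400, -250, 60, 80, 120, 150, 200, 240, 300]
response = target_response_hz(spikes)
print(response)
```

In this made-up trial, six spikes fall in the 200-ms evoked window (30 spikes/s) and three in the 500-ms background window (6 spikes/s), giving a baseline-corrected response of 24 spikes/s. Plotting this quantity against SOA would give the kind of masking function discussed in the text.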
3.2.3. Backward pattern masking in frontal eye field
The neurophysiological studies discussed above probed changes in activities at early cortical levels in correlation with either the expected or the measured changes in the strength of backward masking. In their study on behaving monkeys, Thompson and Schall (1999, 2000) took a different approach to the problem by recording neural responses in the frontal eye field (FEF), a subdivision of the prefrontal cortex. The target stimulus was a dim blue square which, when present in a trial, appeared randomly at one of several possible locations. The mask consisted of bright white squares of the same dimension as the target but appearing at all possible target locations. The monkey’s task was to make a saccade after stimulus presentation to the location of the target when the target was present and to maintain fixation when the target was absent. Monotonic type A masking functions were obtained in both monkeys studied. During the electrophysiological experiments, the performance of the monkeys was monitored and the SOA was adjusted to obtain about 50 percent correct performance in target-present trials. Because both the target and the mask could generate responses in the neurons, there was significant activity under all stimulus–response combinations (‘hit’, ‘miss’, ‘correct rejection’, and ‘false alarm’ cases). The magnitude of early activity, particularly when averaged over a time interval equal to the SOA and immediately preceding the mask response, correlated with the behavioral response. The highest early activity corresponded to hits, and progressively lower levels of early activity corresponded to misses, false alarms, and correct rejections. The differences in the early activity for hits vs. misses and for false alarms vs. correct rejections were small, corresponding to 1–2 spikes per trial.
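The four stimulus–response categories named above are the standard outcome classes of signal detection theory, from which a sensitivity index and a response criterion can be derived. The following minimal sketch uses the equal-variance Gaussian model; the trial counts are hypothetical (the hit rate is set near the 50 percent level mentioned in the text, the false-alarm rate is invented), and this analysis is not claimed to be the one performed by Thompson and Schall.

```python
from statistics import NormalDist


def z_score(p):
    """Inverse of the standard normal CDF; p must lie strictly between 0 and 1."""
    return NormalDist().inv_cdf(p)


def d_prime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Equal-variance Gaussian sensitivity (d') and criterion (c) from counts."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    d_prime = z_score(hit_rate) - z_score(fa_rate)
    criterion = -0.5 * (z_score(hit_rate) + z_score(fa_rate))
    return d_prime, criterion


# Hypothetical counts: 50% hits on target-present trials, 10% false alarms.
d_prime, criterion = d_prime_and_criterion(
    hits=50, misses=50, false_alarms=10, correct_rejections=90
)
print(round(d_prime, 3), round(criterion, 3))
```

A positive criterion in this sketch corresponds to a conservative observer who reports the target less often than an unbiased one would. Rates of exactly 0 or 1 would need a correction before applying the inverse normal transform.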
Furthermore, in many visual and all eye-movement neurons recorded in the FEF, there was a late response phase whose amplitude correlated with the behavioral choice of the monkey, i.e. higher activity for hits and false alarms and lower activity for misses and correct rejections. In contrast with the differences observed in early responses, the differences for hits vs. misses and for false alarms vs. correct rejections were large in the late responses. Because eye-movement neurons project to oculomotor structures, the late activity in these neurons can be interpreted as the motor-command signal leading to the behavioral response. Thompson and Schall (1999, 2000) suggested that the late activity in visual neurons, which do not
project to the oculomotor centers, initiates feedback signals to the extrastriate visual cortex. Although this study does not address directly the effect of stimulus timing, a crucial parameter in backward masking, it sheds light on the relation between neural responses and the behavioral report. From the perspective of signal detection theory (Green and Swets 1966), the simple detection tasks used by Thompson and Schall imply the use of criterion contents and levels that are based on significant neural responses even in trials where the monkeys report ‘not seeing’ the stimulus.
3.2.4. Backward pattern masking in temporal cortex
One limitation of the study by Thompson and Schall (1999, 2000) was that both target and mask stimuli generated responses in neurons under study, rendering an isolation of the ‘pure target’ response impossible. This problem was bypassed by other studies that recorded neural responses in macaque temporal cortex under backward masking conditions (Kovács et al. 1995; Rolls and Tovée 1994; Rolls et al. 1994, 1999). Since the temporal cortex contains neurons with a high degree of shape selectivity (e.g. Desimone et al. 1984; Gross et al. 1972; Pasupathy and Connor 2002; Rolls 1992; Tanaka et al. 1991; Tsunoda et al. 2001), a mask stimulus that generates negligible responses in the neurons under study can be used. Kovács et al. (1995) recorded from monkey inferior temporal (IT) cortex under a backward pattern-masking paradigm. The task was shape discrimination. The mask onset immediately followed the target offset, and the total duration of the target and mask stimuli was kept at 300 ms. Thus variations of SOA produced correlated changes in target and mask durations. With these stimulus parameters, the behavioral masking function was type A. To assess the task-related information, the difference between the responses to the shapes that generated the strongest and weakest activity in the cell was computed. This ‘response difference’ (RD) was significant under all conditions, including those that induced strong backward masking in psychophysical experiments. Of the four target duration (or SOA) values used in the electrophysiological experiments, two (20 and 40 ms) produced masking effects in behavioral experiments. To assess the effect of masking on neural responses, Kovács et al. (1995) compared the RD for target-only (unmasked) and target-and-mask
(masked) conditions for these two durations. For the 20 ms target duration (or SOA) condition, the RD integrated over 20 ms after response onset was the same for unmasked and masked conditions. As the integration period was increased from 20 to 160 ms, the RD increased for the unmasked condition but remained relatively constant for the masked condition. Thus the masking effect was significant when the RD was integrated for intervals longer than 20 ms. For the 40 ms target duration (or SOA), the RD increased for an integration period of up to 80 ms for the masked condition, but over the entire range (20–160 ms) for the unmasked condition. Thus, while the RD was about the same in the masked and unmasked conditions for integration periods of 20 and 40 ms, the masking effect became significant for integration periods longer than 40 ms. Consequently, backward masking effects can be observed in IT neurons if one assumes that the neuronal responses are temporally integrated. The integration time for which the masking effect becomes apparent depends on the duration of the stimulus. The role of temporal integration was also supported by a receiver operating characteristic (ROC) analysis of the data. The earliest part of the response (approximately 20 ms) did not contain reliable information for accurate discrimination of shapes, since integration of information over longer durations is required. The effect of the mask was to suppress the information available in the later part of the response (or to interrupt the ongoing response), thereby reducing the performance by making temporal integration ineffective. Another finding was that a difference (relative increase) of a few spikes within an integration interval of 80 ms was a sufficient correlate of a high level of discrimination.
This finding is in agreement with information-theoretic analyses of spike trains which suggest that a single spike can carry several bits of information when the system is subjected to time-varying stimuli (Rieke et al. 1997). Rolls and colleagues (Rolls and Tovée 1994; Rolls et al. 1994, 1999; reviewed by Rolls 2005) studied the effect of backward pattern masking in monkey IT cortex using faces as target stimuli (duration 16 ms) and less effective non-face stimuli or ineffective faces as masks (duration 300 ms). These stimulus configurations generated type A masking functions. When the target stimulus was presented alone, neurons responded for durations of the order of 200–300 ms. Rolls et al. (1999) applied information-theoretic analysis to neural responses and
quantified the amount of information available under backward masking conditions. They found significant responses, even under conditions yielding strong masking as measured by a previous study (Rolls et al. 1994), and concluded that the information is encoded in the difference of activities and not by the simple presence or absence of activity. When the firing rate or the cumulative number of spikes was considered, only the responses to the most effective stimulus changed as a function of SOA. The average response of all cells and the response to the least effective stimulus did not change significantly as a function of SOA. However, when the amount of information was calculated using Shannon’s (1948) formulation, SOA had a significant effect not only on responses to the most effective stimulus, but also on average responses. Moreover, the attenuation generated by the mask was strongest in the part of the response carrying the peak information about the stimulus. Consequently, the mask reduced the amount of information more strongly than the firing rate.
3.3. Evidence from human studies
3.3.1. VEP studies of para- and metacontrast
Recordings of cortical visually evoked potentials (CVEPs) have been used to study both para- and metacontrast. In addition to obtaining subjective target-visibility ratings, Kaitz et al. (1985) recorded CVEPs during ring–disk paracontrast. Together with paracontrast maxima at or near target–mask synchrony and at SOAs of −150 to −75 ms when subjective ratings were used (see Chapter 2, section 2.5), the observers also tended to yield two similar maxima peaking correspondingly between −30 and 0 ms and between −100 and −75 ms when CVEP measures were used. According to the findings of Breitmeyer et al. (in press) and their interpretation of paracontrast, these two maxima would correspond to surface/contrast-specific and contour/form-specific processes, respectively. The CVEP studies of metacontrast, while more numerous, are less consistent in their results and interpretations. In the following, we shall try to disentangle these inconsistencies, and give plausible reasons for them. In their CVEP study of metacontrast, Jeffreys and Musselwhite (1986) looked at the effects of a metacontrast mask on the C1 and C2 components (Jeffreys and Axford 1972a,b) of the CVEP to the target.
They assumed that the C1 and C2 components of the CVEP originate from the striate cortex (area 17/V1) and the extrastriate cortex (area 18/V2 (or beyond?)), respectively. Their results showed that these components of the target’s pattern-specific CVEP are not affected by an aftercoming mask. Here we issue several caveats. First, Jeffreys and Musselwhite (1986) note that it is difficult to compare their CVEP results with those of neurophysiological studies of metacontrast since the relation of scalp potentials to underlying neural activity is not known. Hence the ability to localize the components is limited in precision. Therefore it is entirely possible that the C1 and C2 components are generated by cortical pattern-specific mechanisms responding prior to the suppressive effects of the mask. Moreover, there is not even a clear consensus as to the general site of neural generation of the C1 and C2 components. While Jeffreys and Musselwhite (1986) favor striate area 17/V1 and area 18/V2 (and possibly beyond), respectively, the analyses of CVEPs by Drasdo (1980) and Maier et al. (1987) favor area 18 (or 19) and area 17, respectively. These localization difficulties aside, other CVEP studies show that indicators of sequential blanking (Andreassi et al. 1971, 1974) and type B metacontrast effects (Andreassi et al. 1975; Schiller and Chorover 1966; Vaughan and Silverstein 1968) can be found in the later (N2 and P2) but not the earlier (Schwartz and Pritchard 1981) components of the target’s CVEP. The later response components are optimally suppressed at SOAs of 30–60 ms where psychophysical indicators also show optimal masking. Bridgeman’s (1988) reanalysis of Jeffreys and Musselwhite’s (1986) results tends to confirm this finding. However, as noted by Jeffreys and Musselwhite (1986), interpretations of such results are in turn complicated by the fact that the target-evoked and mask-evoked responses are not clearly separable in the later CVEP components.
Despite this complication, these CVEP results have important theoretical implications which are discussed more fully in Chapters 4 and 8.
3.3.2. Functional MRI studies of backward pattern masking
Functional magnetic resonance imaging (fMRI) is a relatively new tool which has been used recently to investigate neural correlates of masking. Green et al. (2005a) used a circle with a gap and 12 overlapping circles as the target and mask stimuli, respectively. The target and mask
durations were 34 and 68 ms, respectively. Three SOA values (34, 68, and 102 ms) were used. The observers viewed the stimuli in the scanner and their task was to report the location of the gap (top, bottom, left, or right) by pushing one of four coded buttons on a response pad. The masking functions were of type A. Two methods were used to identify regions of interest. The a priori method identified four areas known to be involved in visual processing, namely early visual areas (V1 and V2), the motion-sensitive regions in the lateral occipital (LO) lobe (human area MT), and the dorsal and ventral components of the object-sensitive region in the LO. Among these areas, only the ventral LO region, and to a lesser degree the dorsal LO region, showed activity correlated with the masking function, i.e. higher activity for longer SOAs. The second method, the data-driven approach, identified six regions based on their activation by the stimuli regardless of their sensitivity to SOAs. These regions were the inferior parietal, anterior cingulate, precentral, insula, thalamic, and occipital areas. Correlations between activity and masking function were observed in the thalamus, inferior parietal, and anterior cingulate. Since Green et al. (2005a) employed backward masking by an overlapping pattern, it is not clear to what extent the results they obtained reflect cortical processes that relate uniquely to type B metacontrast mechanisms rather than other types of pattern-masking mechanisms. Recently, Haynes et al. (2005) obtained fMRI as well as psychophysical data on human observers in a masking study that used a clear metacontrast task in which target and mask elements did not overlap spatially. An illustration of their targets and masks is shown in Figure 3.2(a). The targets consisted of a regular honeycomb-like array of white hexagons that could be surrounded by white-outlined hexagons. Four such arrays were presented, one in each visual quadrant, equidistant from fixation.
Along either the lower-left to upper-right quadrant axis or the upper-left to lower-right quadrant axis, one of the two target displays had a central hexagon that was darker (target B) than that of the other display (target A). In the psychophysical task the observer in any trial was given a visual cue as to which quadrant axis to attend to and had to decide in which target display of the two cued quadrants the dark central hexagon was located. The mask display, consisting of four outlined hexagon displays, one of which surrounded each of the four target displays, was presented at SOAs of 16.7, 33.3, 66.7, and 100 ms.
[Figure 3.2. Panel (a): Target A, Target B, and Mask. Panel (b): accuracy and correlation plotted against SOA.]
Fig. 3.2 (a) ‘Honeycomb’ target and mask stimuli. (b) Correlation, derived from the fMRI results of the same observer, between activity in V1 level and the fusiform gyrus (FG) level of cortical processing as a function of the SOA between the targets and the mask. (Reproduced from Haynes et al. 2005.)
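The correlation plotted in panel (b) is one value per SOA relating activity at two processing levels. As a rough sketch only, and assuming a plain Pearson correlation over paired activity estimates (the measure actually used by Haynes et al. may differ), the per-SOA computation could look as follows. All names and numbers are hypothetical and invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coupling_by_soa(v1_activity, fg_activity):
    """Map each SOA to the V1-FG correlation of the paired activity estimates.

    Both arguments are dicts keyed by SOA (ms) holding equal-length lists.
    """
    return {soa: pearson_r(v1_activity[soa], fg_activity[soa])
            for soa in v1_activity}


# Hypothetical activity estimates (arbitrary units); not data from the study.
v1 = {16.7: [1.0, 2.0, 3.0, 4.0], 66.7: [1.0, 2.0, 3.0, 4.0]}
fg = {16.7: [2.0, 4.0, 6.0, 8.0], 66.7: [2.0, 1.0, 3.0, 2.0]}
for soa, r in coupling_by_soa(v1, fg).items():
    print(f"SOA {soa} ms: r = {r:.2f}")
```

In this made-up example the coupling is weaker at the intermediate SOA, which is the qualitative pattern a U-shaped correlation function would show.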
The psychophysical results of a typical observer (broken line in Fig. 3.2(b)) show the U-shaped nature of metacontrast masking as a function of SOA which is usually obtained. The open circles in Figure 3.2(b) show the correlation between activity in the V1 and the fusiform gyrus (FG) levels of cortical processing which was derived from the fMRI results of the same observer. SOA-dependent changes in the correlation closely parallel those of the U-shaped psychophysical target-accuracy function. Since FG is part of the ventral visual pathway implicated in human object perception (Tanaka 1997), it appears that some aspect of the functional connectivity between the target-evoked lower-level activity in V1 and the corresponding higher-level FG activity is disrupted by the metacontrast mask. As noted by Haynes et al. (2005), it is not clear whether the decrease in correlation values at intermediate SOAs is due to disruption of afferent feedforward or to re-entrant feedback activation.
3.4. Summary
Backward pattern masking effects are found at a variety of cortical levels from V1 to temporal cortex and FEFs. The only neurophysiological study where masking was shown to be non-monotonic type B was that of Bridgeman (1975). Replotting of the data obtained by Lamme et al.
(2002) to estimate cortical pattern-masking effects (Fig. 3.1) also shows a type B masking function. In other studies, masking was either of type A or was not explicitly assessed at the behavioral level. In future research, it would be highly desirable to use standard stimuli that reliably generate type B masking. Furthermore, a combination of behavioral and neurophysiological measures would buttress further the relationship between neural activities and masking functions. One general observation from the studies reviewed here is that the backward mask tends to inhibit or interrupt ‘late parts’ of the responses. However, the early vs. late distinction varies considerably from area to area and from study to study, with the demarcation between these two phases ranging from 20 to 190 ms. Another common finding is that significant neural activities exist even under strong masking conditions. Therefore backward masking is not simply related to the presence or absence of activity but rather depends on quantitative, often small, changes in the activity. The details of neural coding strategies become important for an analysis of these differences. Methods from signal detection theory as well as information theory suggest that task-related information is temporally distributed in neural responses and that temporal integration is necessary to obtain reliable information. The effect of the mask in backward masking appears to make this temporal integration ineffective. Distribution of information across cell assemblies and its relation to backward masking need further study. If the information is temporally distributed in neural responses, it remains to be determined whether the same information is distributed or different types of information are represented at different phases of the response. The three-phase operation proposed in the retino-cortical dynamics (RECOD) model is an example of how information can be multiplexed in neural responses (see Chapter 5).
The responses in the feedforward phase reflect coarse boundary information signaled by afferent connections, while the responses in the feedback phase reflect fine boundary information sent by efferent connections (see Fig. 5.11). Similarly, the results of Lamme et al. (2002) suggest that the early responses in V1 cells encode first-order stimulus-related information, while the late (feedback-dependent) responses signal second-order more global figure–ground relationships. Another study provides evidence that the initial part of the responses in macaque IT cortex codes coarse
information (monkey face vs. human face vs. shape) while the finer information about stimulus attributes (identity and expression of the face) is signaled in the part of the response starting approximately 50 ms later (Sugase et al. 1999). If information multiplexing is carried out through ‘traveling waves’ of activity among different neural centers, backward masking might involve multiple neural loci depending on the SOA, the task, and the criterion content. When corrected for latencies, backward masking effects are found in earlier parts of neural responses in the temporal cortex (within approximately the first 80 ms) but in later parts of V1 responses (150 ms and higher). This observation consolidates the suggestion that the suppression of late activities observed in V1 might be a manifestation of activities at levels beyond and feeding back to V1 (Bridgeman 1980; Lamme et al. 2000; Super et al. 2001). Similarly, Thompson and Schall (1999, 2000) suggest that late activity in visual FEF neurons feeds back to the extrastriate cortex. Finally, recent fMRI studies are starting to provide a more global view of masking correlates in terms of their localization and distribution within and among different cortical areas. A combination of these studies with approaches that produce high temporal resolution would be key not only in revealing cortical correlates of masking but also in elucidating the distributed architecture of the visual system’s real-time information processing.
Note
1. We thank Victor Lamme for making his data available to us.
Chapter 4
Models and mechanisms of visual masking: a selective review and comparison
4.1. Introduction
In the preceding chapters we reviewed many of the important findings and only briefly mentioned some of the theories of masking by pattern. In this chapter we provide detailed outlines of recent models of lateral masking, i.e. para- and metacontrast, as well as a review of several older models. The topics covered in the preceding chapters are important in providing the empirical framework for assessing the explanatory validity and scope of the proposed mechanisms and models. The empirical findings on lateral masking are very extensive, and the significant ones, particularly those reported in the past two decades, are reviewed in Chapter 2. Other reviews of past and recent findings have been given by Breitmeyer (1984), Breitmeyer and Ögmen (2000), and Bachmann (1994). Despite this breadth of coverage, our evaluation of models and mechanisms is intentionally selective since only their most salient features are discussed. Additional findings will be introduced in later chapters where we deal with issues of how masking relates to motion perception, conscious and unconscious perception, the control function of attention, visual context and gestalt organization, and visual processing in selected subject populations. When appropriate, the theoretical relevance of these findings will also be noted in the later chapters. This will place the models and their evaluations in a broader and richer empirical context. A comparison of mathematical formalisms used in different models is provided in Appendix A. Ever since paracontrast, metacontrast, and cognate phenomena were discovered in the late nineteenth and early twentieth centuries, investigators have speculated about possible masking mechanisms. For example, in Chapter 1 we noted that McDougall (1904a) proposed
at least one possible mechanism for the suppression of Bidwell’s ghost (also known as the Purkinje image). In his experiment Charpentier bands and the trailing Bidwell’s ghost were produced by a rotating trans-illuminated aperture that putatively isolated rod responses. When a spatially displaced and temporally lagging aperture that activated cones was added to the display, the visibility of Bidwell’s ghost produced by the leading aperture was suppressed. McDougall reasoned that the faster cone response to the temporally lagging aperture suppressed the slower rod activity, and thus Bidwell’s ghost, generated by the leading aperture. This mechanism predates a similar one proposed by Alpern (1953) for explaining metacontrast. However, McDougall further noted that another masking mechanism, in addition to cone–rod interactions, resides in cone–cone interactions since he found that Bidwell’s ghost was suppressed when both apertures presumably isolated the same type of cone activity. Several subsequent studies of metacontrast have confirmed this latter conjecture (Foster 1978, 1979). Regarding interactions among receptor responses, Stigler (1910, 1926), following Exner’s (1898) conjecture, proposed that horizontal cells in the retina provided the basis for lateral inhibitory interactions between the neural responses of the spatially adjacent stimuli typically used in his para- and metacontrast studies. In his first series of studies, Stigler (1910) failed to demonstrate dichoptic metacontrast effects and this failure may have motivated his placing the interactive mechanism at a peripheral retinal level. However, his later study (Stigler 1926) demonstrated the existence of dichoptic metacontrast. Although not ruling out interactions mediated by horizontal cells, this finding pointed to the existence of additional lateral interactions mediated at post-retinal and most likely cortical sites of visual processing. 
It also suggested to Stigler two additional and related properties, synaptic delay(s) and overtake of the target response by that of the mask, that may contribute significantly to metacontrast. A version of the overtake hypothesis was adopted later by Crawford (1947) to explain masking when the target was presented shortly before a more intense conditioning or mask flash. In line with Stigler’s (1926) assumption of post-retinal lateral interactions, Fry (1934) pointed out that in metacontrast the lateral effects of a temporally lagging flash on a leading spatially adjacent flash may be mediated by processes along the entire retino-geniculo-cortical pathway.
In the last two decades, since the publication of the first edition of this book (Breitmeyer 1984), several additional models and mechanisms of para- and metacontrast have been proposed. A few of these mechanisms either explicitly or implicitly share features with those proposed by earlier investigators. Others were briefly mentioned in Chapter 2, where we reviewed empirical findings. To accommodate dichoptic masking effects, one feature that past and current models share in common is that their proposed mechanisms responsible for the type B or U-shaped metacontrast and pattern masking functions are located at cortical levels. The following models and mechanisms selected for review can be distinguished according to five general characteristics: (1) models based on spatiotemporal response sequences; (2) models adopting some version of an overtake hypothesis; (3) models based on two separate neural processes or channel activations; (4) models relying on stimulus or object substitution; (5) models based on emergent properties of distributed neural networks. These characteristics are not mutually exclusive; however, they do point to differences of theoretical emphasis among the models. As will be seen, most models rely on some version of the overtake hypothesis. One past model based on spatiotemporal response sequences also invokes separate visual processing channels. Two of the recent models explicitly adopt activation of separate neural channels or pathways to explain metacontrast and other backward masking phenomena; the other two recent models, while not relying on separate pathway activations, invoke separate neural processes. Although all subscribe to object substitution as a phenomenal description of metacontrast that must be explained, only two of them take object substitution as an underlying operative mechanism. 
Again in a general sense, all models subscribe to the notion of distributed neural networks, although they differ in the degree to which they explicitly formulate the quantitative properties of the networks. Our typology is somewhat limited (cf. Bachmann 1994) and arbitrary in that it is guided by what we regard as the distinctive characteristics of only the more noteworthy past and recent models.
4.2. Spatiotemporal sequence models
Presenting two spatially adjacent stimuli in rapid succession can produce not only metacontrast masking of the first stimulus but also a strong sensation of apparent motion (Kolers 1972; Korte 1915). As noted in Chapter 1, in his extensive studies of apparent motion Wertheimer (1912) had already reported the masking of the first stimulus in the apparent-motion sequence. Wertheimer’s informal observations have been confirmed subsequently by numerous studies demonstrating similarities between apparent motion and metacontrast functions (Bischof and Di Lollo 1995; Breitmeyer et al. 1974, 1976; Didner and Sperling 1980; Di Lollo et al. 1993; Fehrer 1966; Fisicaro et al. 1977; Kahneman 1967; von Grünau 1978b; Yantis and Nakama 1998). Since backward masking and apparent motion activate spatiotemporal sequence detectors, it seems plausible that the processes underlying apparent motion and metacontrast are closely linked. Possible links have been proposed in three separate models. Further relations of masking to motion perception are explored more fully in Chapter 6.
4.2.1. Kahneman’s impossible motion model
Kahneman (1967) proposed a model of metacontrast in which stroboscopic motion plays a causal role in metacontrast. Specifically, metacontrast is a special case of impossible stroboscopic motion. For example, with the frequently employed metacontrast display in which a rectangle serves as a target and two spatially flanking rectangles serve as masks, the target–mask sequence activates two oppositely directed stroboscopic motion events of the same object—a physically impossible event. Kahneman (1967) argues that the cognitive–perceptual system is unable to resolve these apparently opposite and contradictory motions of the same stimulus and thus suppresses the perception of object motion by suppressing the visibility of the target stimulus. Although simple and ingenious, this formulation was flawed on several counts. First, the empirically based spatiotemporal laws that govern apparent motion diverge from those characterizing metacontrast (Breitmeyer and Horman 1981; Weisstein and Growney 1969). Moreover, split stroboscopic motion from the target to the flanking masks can be, but need not be, observed in metacontrast (Stoper
and Banffy 1977). Consequently, the perceptual system can simultaneously accommodate such apparently contradictory motions as noted earlier by Wertheimer (1912). This perceptual accommodation to the metacontrast display may use processes responsible for the perception of ecologically possible events such as radial dispersion (explosion) of an object’s surface (Gibson 1979) or rapid visual expansion (looming) in the visual field when objects move toward an observer (D.N. Lee 1976, 1980; Regan and Cynader 1979). Finally, metacontrast can be obtained in a two-stimulus display (e.g. two adjacent rectangles or disks) in which the perception of stroboscopic motion is readily possible and present (Breitmeyer and Horman 1981; Breitmeyer et al. 1974, 1976). These results, as well as those of Stoper and Banffy (1977), do not support the causal necessity of any form of impossible or contradictory stroboscopic motion in producing metacontrast. As noted by Stoper and Banffy, the mechanisms underlying stroboscopic motion and metacontrast are largely independent of each other although under some favorable stimulus conditions they may interact. In order to show that such interaction can yield impossible stroboscopic motion and thus also metacontrast, as required by Kahneman’s (1967) cognitive model, one may want to adopt the following strategy. What needs to be demonstrated is (i) that each of two flanking mask stimuli can separately induce a type B stroboscopic motion function when flashed at varying SOAs after the target stimulus (preferably, but not necessarily, without producing a type B metacontrast effect), but (ii) that in combination the two mask stimuli yield a type B metacontrast function without split stroboscopic motion. In view of the findings reviewed above, such a demonstration would most likely hold only under highly specific non-generalizable conditions. 
In hindsight, it would nonetheless be more convincing than the equally non-generalizable and fortuitous correlation between type B stroboscopic motion and metacontrast functions used by Kahneman (1967) to support his cognitive model. It also should be mentioned that Kahneman’s (1967) model cannot adequately account for the absence of type B (and type A) metacontrast when a forced-choice detection or simple reaction time measure is used by an observer. In fact, we noted that the split apparent motion (devoid of target contrast or figural information) generally observed in metacontrast is most likely a powerful source of information for
detecting the mere presence or location of the target. Without such split apparent motion, i.e. with its suppression by the cognitive system, target detection or localization might be greatly impaired. Moreover, even if such motion information were suppressed by the cognitive system, it is hard to imagine how it could be recovered to produce apparent stroboscopic motion between a masked target and a third stimulus flashed at a locus laterally displaced from the target–mask area (Kolers 1963; von Grünau 1978b).
4.2.2. Matin’s three-neuron model
Matin (1975) proposed a model of metacontrast based on the existence of three classes of neurons. In a typical metacontrast experiment, a target is presented first, followed at a short interval by a laterally displaced mask. Matin assumed that the target activates one class of neurons that she calls T neurons. Similarly the mask activates a second class of neurons that are termed M neurons. Finally, at appropriate temporal intervals the target–mask sequence activates a third class of neurons that are called T–M neurons. According to Matin (1975), this last class of neurons could consist of either succession (motion) detectors used in analyzing object motion or neurons activated during relative image displacement produced by high-velocity saccades. In the former case, activation of succession detectors could, but need not, produce a sensation of (stroboscopic) motion. Via this assumption Matin avoids the problems of Kahneman’s (1967) model of metacontrast that ascribes a causal role to stroboscopic motion. Saccade neurons comprise part of the latter class of T–M neurons and their neural analogs were taken to be Y or transient neurons, which are assumed to have a shorter response latency and higher conduction velocity than X or sustained neurons.1 The sustained neurons are further assumed to be neural analogs of T neurons that are related to the perception of the target. Although not explicitly stated by Matin (1975), the M neurons could also be identified with sustained neurons since the mask is typically perceived in a lateral masking experiment. With this set of three neural classes, Matin (1975) accounted for several findings obtained in lateral masking studies. First, the existence of T–M neurons could explain the sensation of stroboscopic motion that generally, although not always (Stoper and Banffy 1977), accompanies lateral masking. Moreover, several kinds of empirically derived
lateral masking findings can be adequately explained. These are type A and type B metacontrast as well as type A and type B paracontrast. According to Matin (1975), pronounced type B metacontrast could be produced in one of two possible ways. First, Matin notes that the suppressive effect that T–M neurons exert on T neurons must be retroactive in order to produce type B metacontrast. She states that:
“Although the T–M neurons do not fire until the presentation of the mask, the magnitude of response in these neurons would be expected to be greatest at some temporal interval between target and mask other than zero. It could therefore be argued that in those classes of metacontrast experiment for which the presentation of the target–mask sequence is an adequate stimulus for the T–M neurons, the psychophysical metacontrast function peaks at some interval other than zero not per se because the target precedes the mask, but in spite of that fact and because the responses of the T–M neurons, which are a part of the total mask response, are greater at that interval than at some longer or shorter one” (Matin 1975, p. 457).
In this case, production of type B metacontrast uses the class of T–M neurons that are responsible for the detection of succession (with or without an accompanying sensation of motion). In addition to this explanation, Matin (1975) offers a second one relying on the existence of high-velocity saccade neurons that are also activated by the T–M sequence. These saccade neurons are assumed to be fast-conducting transient cells that can suppress the activity of slower-conducting sustained cells. Since the former are a class of T–M neurons and the latter comprise the set of T neurons, the T–M (saccade) neurons would exert optimal suppression of the T neurons when the mask is delayed relative to the target, thus producing a type B metacontrast effect. Implicit in this additional specification of T–M neurons is the conclusion that here the metacontrast function peaks at some SOA other than 0 ms, not in spite of but because of the fact that the target, activating slower T neurons, precedes the mask and in conjunction with the mask activates faster T–M (transient) neurons. Consequently, Matin (1975) assumes that two mutually reinforcing mechanisms are used to produce type B metacontrast. The existence of type A metacontrast, typically obtained when the mask energy is substantially higher than the target energy, is explained as follows. In type A metacontrast, peak masking occurs at an SOA of 0 ms. Here, the high energy of the mask leads to more vigorous activation of M neurons than of T–M neurons. Since the T and M neurons are assumed to have similar conduction velocities, we would expect greatest
suppression of T neurons by M neurons at target–mask synchrony, i.e. at an SOA of 0 ms. At progressively greater SOAs we would also expect that the suppression of T neuron activity by M neurons declines, thus producing a type A metacontrast function. An additional assumption must be made explicit here, namely that at an SOA of 0 ms and when the M/T energy ratio is greater than unity, the suppression of T neurons by M neurons is stronger than the maximal suppression of T neurons by T–M neurons at some optimal positive SOA value. If that were not the case, a type B function would still result. Type B paracontrast or forward lateral masking effects, which typically are weaker than metacontrast masking effects (Alpern 1953; Kolers and Rosner 1960), can be explained by invoking an essential asymmetry between the interactions of T and T–M neurons used in paracontrast and metacontrast. Recall that in metacontrast the activity of T–M neurons is not only optimal when the mask follows the target but is also conducted faster than the activity of T neurons. In paracontrast the laterally displaced mask precedes the target in time. Here, T–M neurons are activated by the mask–target sequence. Because these neurons (transient neurons) conduct more rapidly than T neurons (sustained neurons), the suppressive effects of T–M neurons dissipate to some extent by the time the activity of T neurons arrives at the site where interaction between these two classes of neurons occurs. In fact, if the difference in conduction velocity were sufficiently great, the later activity of T neurons could entirely escape the earlier suppressive effects of the T–M neurons, and consequently the paracontrast effect would then be produced only by the M neurons. In the case of either partial or complete escape of T neuron activity from T–M neuron suppression, one would expect paracontrast to be weaker than metacontrast. 
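The timing argument for metacontrast in the preceding paragraphs can be illustrated with a small numerical sketch. It is not Matin’s own formulation: the latencies, the Gaussian coincidence window, the gains, and the rule that the stronger source of suppression dominates are all hypothetical choices made only for illustration. Under these assumptions, suppression by fast T–M activity peaks when the mask is delayed by the latency difference between sustained and transient neurons (type B), whereas sufficiently strong M activity, which shares the latency of T activity, moves the peak to an SOA of 0 ms (type A).

```python
import math

# Hypothetical parameters (not taken from Matin 1975), all in ms.
LAT_SUSTAINED = 100.0  # assumed latency of T and M (sustained) activity
LAT_TRANSIENT = 40.0   # assumed latency of T-M (transient) activity
WINDOW = 25.0          # assumed tolerance for two signals to interact


def coincidence(mismatch_ms):
    """Interaction strength for an arrival-time mismatch (1.0 = perfect overlap)."""
    return math.exp(-mismatch_ms ** 2 / (2.0 * WINDOW ** 2))


def suppression(soa_ms, gain_tm=1.0, gain_m=0.5):
    """Suppression of T activity in metacontrast (target first, SOA >= 0)."""
    # T activity arrives LAT_SUSTAINED after target onset; T-M activity is
    # triggered by the mask and arrives at soa + LAT_TRANSIENT.
    by_tm = gain_tm * coincidence(soa_ms + LAT_TRANSIENT - LAT_SUSTAINED)
    # M activity has the same latency as T activity, so the mismatch is the SOA.
    by_m = gain_m * coincidence(soa_ms)
    # Assumption: the stronger of the two suppressive sources dominates.
    return max(by_tm, by_m)


def peak_soa(gain_tm=1.0, gain_m=0.5):
    """SOA (0-200 ms, 10 ms steps) at which suppression is greatest."""
    return max(range(0, 201, 10), key=lambda s: suppression(s, gain_tm, gain_m))


if __name__ == "__main__":
    print("weak M activity, peak SOA:", peak_soa())
    print("strong M activity, peak SOA:", peak_soa(gain_m=2.0))
```

Raising `gain_m` above `gain_tm` mimics a high M/T energy ratio and corresponds to the additional assumption stated above, namely that suppression by M neurons at synchrony must exceed the maximal suppression by T–M neurons for a type A function to result.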
Although Matin (1975) does not make the following explicit claims regarding paracontrast, they are nevertheless implicit in the description of her model. In the case of partial escape of T neuron activity from T–M neuron suppression, we would expect a type B paracontrast effect that is weaker than a type B metacontrast effect (see Chapter 2, Figs 2.2(b) and 2.5). Furthermore, if the above escape from T–M neuron suppression is total and only M neurons inhibit the T neurons, paracontrast ought to be of type A, being optimal at an SOA of 0 ms and decreasing as the mask-to-target asynchrony increases. An additional
implicit aspect of Matin’s model is that the activation of T–M neurons preserves information about the presence and locus of the target even when target contrast and figural information processed by T neurons are suppressed during metacontrast. Hence it can explain the ability to detect the presence of the target even when it cannot be identified or seen (Fehrer and Biederman 1962; Fehrer and Raab 1962; Schiller and Smith 1966). Although Matin’s model accounts adequately for major aspects of masking and apparent motion, several specific results pose problems. First, Breitmeyer and Rudd (1981) demonstrated that we can obtain suppression of a target’s visibility in a single-transient paradigm. Here a single brief transient presentation of the surrounding mask reduced the visibility of a stationary target presented for a prolonged time interval. Of course, such suppression could be accounted for by activation of only the M neurons in Matin’s three-neuron model, and hence without any activation of T–M neurons. More problematic is the role of T–M neurons in paracontrast. We would expect the optimal SOAs for activating the T–M and M–T neurons in meta- and paracontrast, respectively, to be the same, and thus the optimal masking SOAs for meta- and paracontrast likewise to be the same. However, in several recent studies (Breitmeyer et al. 2004a, 2005b, in press, in preparation) we found that paracontrast masking was maximal at SOAs ranging from 100 to 200 ms, whereas metacontrast masking was maximal at SOAs ranging from 20 to 60 ms. Moreover, based on the metacontrast result, Matin’s model states that the strongest activation of spatiotemporal sequence detectors occurs at an SOA of about 70 ms. In contrast, at an SOA of 100–200 ms we would expect little, if any, such activation. Hence, according to the model, paracontrast at these SOAs should be weak rather than maximal.
4.2.3. Burr’s spatiotemporal receptive-field model
Burr and colleagues (Burr 1984; Burr et al. 1986) proposed a different version of the spatiotemporal sequence model to account for metacontrast as well as apparent motion. They used the notion of spatiotemporal receptive fields (Burr et al. 1986) to characterize the spatiotemporal response of motion detectors. The receptive fields are made from alternating regions of opposing (excitatory/inhibitory) polarity, and these regions are elongated in space–time along the
preferred velocity axis of the detector (Burr et al. 1986, Figs 6 and 7). This organization allows the analyses of form and motion by the detector. The strength of this model is that, extending beyond explaining detection of real motion, it can be applied to a variety of associated phenomena including metacontrast (Burr 1984), stroboscopic motion (Burr et al. 1986), and the reduction of motion smear (Burr 1980, 1981), topics that we will cover more thoroughly in Chapter 6. Unlike Matin’s spatiotemporal sequence model, which posits that T–M neurons inhibit the responses of the form-processing T neurons, Burr’s model includes no such inhibition. According to Burr and colleagues (Burr 1984; Burr et al. 1986), the spatiotemporal sequence of target and mask stimuli activates receptive fields oriented in space–time, and these receptive fields report the gestalt of a single bar in motion rather than two bars at separate spatial locations. The result of this gestalt-unit formation is that the two stimuli become perceptually merged into one. Burr relates this merging to energy summation of the two stimuli at threshold, a process consistent with a variety of energy models of motion perception (Adelson and Bergen 1985, 1986; Watson and Ahumada 1985; see also van Santen and Sperling 1985). However, target and mask stimuli, and more generally any moving stimuli, also activate spatiotemporally unoriented mechanisms (i.e. ‘static form analyzers’), as well as a large number of spatiotemporally oriented mechanisms, whose tuning matches to varying degrees the spatiotemporal energy of the stimulus. Without a selection and/or inhibition process, it is not clear how a single gestalt will emerge from the activities of a large number of mechanisms. Lacking a clear statement of how or by what underlying mechanism the unitary gestalt is formed, the model remains incomplete.
As noted by Marr and coworkers (Marr 1982; Marr and Ullman 1981), workable models of object motion in human vision must combine analysis of motion with the analysis of contours or form (see also Grossberg 1991). In the human visual system this translates into combining the outputs of separate motion- and form-detecting channels whose neural analogs may be the magnocellular (M) and parvocellular (P) pathways, respectively (De Yoe and Van Essen 1988; Livingstone and Hubel 1987, 1988) (see Chapter 5). Evidence exists for convergence of M and P inputs in neurons found as early as layers 2, 3, and 4B of monkey striate cortex (Ferrera et al. 1992;
Nealy and Maunsell 1994; Sawatari and Callaway 1996; Yabuta and Callaway 1998). Evidence for an analogous convergence has been reported in cortical complex cells of cat (Emerson et al. 1992). Indeed, such neurons could have the space–time oriented receptive fields required by Burr’s model. Moreover, psychophysical evidence for combined inputs to directionally sensitive motion analyzers from transient (motion) and sustained (form) channels has been reported (Bischof et al. 1996). Although these results support the existence of space–time oriented receptive fields and thus Burr’s model, other psychophysical results indicate that the relation between metacontrast and motion is more complex and that the relation between form- and motion-processing channels includes mutual suppression of activities (Banta and Breitmeyer 1985; Juola and Breitmeyer 1988; von Grünau 1978a,b). Consistent with the latter findings are the brain scans of human visual cortex reported by Zeki (1999, pp. 154–55), which show that increases in activity in area V4 (the color- and form-processing area) are accompanied by decreases in activity in area V5/MT (the motion-processing area). This indicates that the form- and motion-processing channels also interact in a mutually inhibitory manner. Thus an alternative mechanism, namely inter-channel inhibition, as proposed by Breitmeyer and Ganz (1976; see also Matin 1975; Weisstein et al. 1975; Chapter 5), may be responsible for suppression of form information in metacontrast. Moreover, vivid perception of apparent motion can be obtained with two successive stimuli when their spatial separations are so large that no metacontrast suppression of the first stimulus is obtained (Breitmeyer and Horman 1981). The puzzle then is: Why can one perceptually identify the first (as well as the second) stimulus when the two-stimulus sequence gives rise to sensations of motion?
One way to avoid this problem is to assume that the space–time receptive fields proposed by Burr to account for metacontrast only process short-range and not long-range displacements of stimuli. Of course, this explanation in turn assumes distinct neural mechanisms underlying the perceptions of short- and long-range motion (Braddick 1974, 1980), an assumption that may not be warranted (Bischof and Di Lollo 1990; Cavanagh and Mather 1989). Finally, as discussed in Chapter 6, gestalt formation by a spatiotemporally oriented mechanism cannot account for masking mechanisms that are implied in motion deblurring.
4.3. Two-process models
4.3.1. Ganz’s interactive trace decay and random encoding time model
It is well known from Heinemann’s (1955) work on simultaneous brightness contrast that a more intense spatially surrounding stimulus can appreciably reduce the apparent contrast of a central stimulus. Simultaneous brightness contrast is believed to be induced by the lateral inhibitory effect that the surrounding stimulus exerts on the central one. Ganz (1975) proposed a model of metacontrast based on these findings and this assumption, in which the temporally decaying traces (icons) of the target and mask stimuli interact laterally. As such, it is conceptually an adaptation of Stigler’s (1910) persistence–lateral inhibition explanation of metacontrast. Ganz’s analysis relies on results reported in a study of metacontrast conducted by Sukale-Wolf (1971). In that study the target and the mask were of equal energy, a condition which, as noted in Chapter 2, is conducive to the production of type B metacontrast. Furthermore, Ganz’s (1975) model assumes that the following properties and events characterize metacontrast. Both the target and mask produce temporally decaying neural traces. Since the target and mask are of equal energy, the proportionality constants and the decay constants are assumed to be identical for the target and the mask. Moreover, since the mask is presented after the target, the neural trace of the mask ought to be stronger at any positive SOA than that of the target. Because the weaker trailing end of the target trace overlaps temporally with the stronger leading part of the mask trace, in effect we have a case of simultaneous brightness contrast induced via lateral inhibition by the stronger mask trace on the partially decayed target trace. Hence reduction of the target’s visibility ought to increase progressively as SOA increases. (At an SOA of 0 ms the two equally strong traces are activated simultaneously and decay at the same rate, and little if any brightness contrast or masking should occur). 
Therefore this lateral interactive trace–decay process would be responsible for the rising part of the U-shaped metacontrast effect as SOA increases from 0 ms to the intermediate value at which optimal masking occurs. In order to explain the descending portion of the type B metacontrast effect for progressively greater SOA values, Ganz (1975) incorporates
a further assumption stating that the decaying trace of the target needs some minimal time E for its brightness or contrast to be fully encoded. Additionally, the duration of E is assumed to be distributed randomly in a Gaussian manner. Accordingly, since the encoding probability is very small at low SOAs and increases as SOA increases, we would expect the metacontrast function to rise initially from an SOA of 0 ms to some positive SOA at a rate which is determined by the mean and standard deviation of the Gaussian distribution of encoding time. At longer SOAs the likelihood of encoding becomes increasingly great and masking ought to decrease, in a statistical sense, after its maximum is attained. Although this model adequately explains metacontrast when the target and mask stimuli are of equal energy, the question remains as to how well it would fare when the energies of the two stimuli are unequal. In the case of mask energies greater than target energy it is not clear that it would correctly predict type A metacontrast (Breitmeyer 1978b; Weisstein 1972). Let us assume that the proportionality constant of the mask trace is twice that of the target trace. Then we should indeed obtain stronger masking at an SOA of 0 ms. However, since the target trace decays with time, we additionally obtain increasing masking as SOA increases, since the initial portion of the much stronger mask trace would inhibit a progressively weaker portion of the decaying target trace. In other words, the initial portion of the metacontrast function ought to rise again—indeed, more steeply—before the encoding process takes over to yield the later descending part of the metacontrast function. Consequently, contrary to the obtained type A metacontrast function, Ganz’s (1975) model predicts a type B function. 
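The two stages of Ganz’s account, and the objection just raised, can be made concrete with a minimal numerical sketch. The functional forms are assumptions introduced here for illustration and are not Ganz’s (1975) equations: an exponentially decaying target trace, inhibition that operates only when the mask trace exceeds the target trace at mask onset, and a Gaussian encoding time whose mean, standard deviation, and decay constant are hypothetical values.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability of a Gaussian with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ganz_masking(soa_ms, mask_to_target=1.0, tau=50.0, enc_mean=60.0, enc_sd=20.0):
    """Toy masking magnitude at a given SOA (all parameter values are assumed)."""
    # Stage 1: lateral interaction between traces at mask onset.
    target_trace = math.exp(-soa_ms / tau)   # what is left of the target trace
    mask_trace = mask_to_target              # fresh mask trace, scaled by energy ratio
    inhibition = max(0.0, 1.0 - target_trace / mask_trace)
    # Stage 2: the target escapes masking if it was encoded before mask onset.
    p_not_encoded = 1.0 - normal_cdf(soa_ms, enc_mean, enc_sd)
    return inhibition * p_not_encoded


if __name__ == "__main__":
    for ratio in (1.0, 2.0, 0.5):
        curve = [round(ganz_masking(s, ratio), 3) for s in range(0, 161, 20)]
        print("M/T energy ratio", ratio, curve)
```

With equal energies the sketch gives no masking at synchrony and a peak at an intermediate SOA. Doubling the mask strength raises masking at 0 ms, but the function still rises before it falls, which is the type B prediction criticized above. Halving the mask strength gives no masking at all at short SOAs.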
One possible way of circumventing this difficulty would be to incorporate a ‘saturation’ parameter so that whenever the mask energy becomes sufficiently large relative to the target energy, its trace inhibits the target trace equally well at the lower range of SOAs, thus yielding an approximation to a type A metacontrast function. However, a greater difficulty may arise with the model when the target energy is greater than that of the mask. In Chapter 2 (see Fig. 2.3) we noted that type B metacontrast functions can be generated when the mask energy is only half or a quarter of that of the target. Hence, according to Ganz’s model, we would expect, on the basis of Heinemann’s (1955) results, that no masking would occur for a range
of low SOAs which would first have to be exceeded before the weaker mask trace becomes effective in suppressing the trailing end of a decaying but stronger target trace. In fact, according to Ganz’s model it may be entirely possible that the target trace is encoded prior to any inhibitory effects that the mask trace may exert on the trailing end of the target trace. Thus type B masking ought to be very weak (especially at SOAs ranging from 0 ms to some intermediate value) or entirely absent. The fact that empirical findings do not show this trend when mask energy is reduced by, say, a half or a quarter relative to the target energy, but rather reveal a type B function characterized by a rising portion, a peak, and then a declining portion as SOA increases, militates against Ganz’s model. Moreover, Ganz’s model predicts paracontrast effects which very likely do not match empirical results. Paracontrast brightness suppression, given equal energy target and mask stimuli, is a type B forward masking effect; however, it is generally weaker than type B metacontrast. If the target and mask stimuli are of equal energy, we would expect weak simultaneous brightness induction at an SOA of 0 ms, an expectation consistent with the typically weak metacontrast obtained at that SOA. What, according to Ganz’s model, would be expected to happen as the mask leads the target at progressively greater SOAs? Since the neural traces of the mask and the target are characterized by the same proportionality and decay constants, we would expect the trailing end of the leading mask’s neural trace, which overlaps in time with the leading end of the target’s trace, to weaken progressively as the SOA increases. Hence, with target and mask stimuli of equal energy, little, if any, paracontrast should be obtained. If it is obtained, we would expect it to be at best a type A rather than a type B function, contrary to what is found (Alpern 1953; Kolers and Rosner 1960).
4.3.2. Reeves’ temporal integration and segregation model
Despite the failure of Ganz’s (1975) model to account adequately for some major metacontrast and paracontrast findings, results reported by Reeves (1982) generally seem to concur with and lend some credence to Ganz’s (1975) account of metacontrast. In particular, Reeves (1982), adopting Weisstein’s (1972) magnitude rating method, required his observers (i) to rate on each trial the brightness of the target as
a function of target–mask onset asynchrony, and (ii) to indicate whether the two stimuli were perceived as being simultaneous or successive. Although not specified by Reeves, perceived simultaneity would reflect an integrative process which, in turn, based on the discussion of integration in Chapter 1, section 1.5, most likely rests on the presence of visual persistence; however, perceived successions would reflect a segregative process leading to temporally separate encodings and thus percepts of the two stimuli. It should be evident that these two processes are analogous to Ganz’s (1975) trace decay and encoding stages for, as one might expect, Reeves found that the proportion of simultaneity judgments decreased monotonically from a constant value of 1.0 obtained at SOAs of 0–40 ms, to 0.0 as SOA increased from 40 to 120 ms. Conversely, the proportion of successiveness judgments increased monotonically from 0 to 1.0 over the 40–120 ms range of SOAs. The former result indicates that temporal integration of target and mask stimuli (based on decaying visual persistence or visual traces of the target) extends with decreasing strength up to an SOA of 120 ms; whereas the latter result indicates that separate target and mask encoding processes occur with a probability of zero below an SOA of 40 ms, but thereafter increase monotonically to a value of 1.0 at an SOA of 120 ms. In line with these results, when Reeves separated the overall averaged metacontrast brightness ratings, characterized by the typical type B function of SOA, on the basis of the type of concurrent temporal judgment, he found that the type B function could be approximated by separable monotonic functions. 
One component showed progressively increasing target-brightness suppression over the corresponding SOA range of 40–120 ms (which yielded a progressive decrease in the proportion of simultaneity reports); conversely, the other yielded progressively decreasing brightness suppression over the same SOA range (yielding a progressive increase in the proportion of successiveness reports). At face value this binary decomposition, and the evident correlation between changes of target–mask temporal judgment and target brightness ratings, agree with Reeves’ two-process integration–segregation account, and in particular with Ganz’s more specific two-process model based on an initial stage of lateral interactive trace decay and a later perceptual encoding and segregation stage. However, we shall see below that these two-process models, in addition to a third
version proposed by Navon and Purcell (1981), are characterized, as is already evident from discussion of Ganz’s (1975) model, by a lack of generalization or applicability to extant data outside the immediate scope of their investigations.
4.3.3. Navon and Purcell’s integration and interruption model
Using a variety of chromatic and achromatic patterns consisting of target letters (e.g. F) and a non-letter mask, such that each letter target formed a subset of contours of the mask, Navon and Purcell (1981) obtained an inverted type A monotonic backward masking function over an SOA range of 0–50 ms. That is, masking of target letters was minimal at an SOA of 0 ms and increased monotonically over the 50 ms SOA range. Such a function is reminiscent of the ascending part of the type B metacontrast function found over approximately the same SOA range. In fact, Navon and Purcell (1981) maintain that this increase of masking cannot be due to an integration mechanism but rather to what is commonly termed an interruption mechanism (Scheerer 1973). Now, the target–mask (T–M) spatial composites could be subdivided into two mutually exclusive components (M∩T) and (M∩T̄), i.e. that area of stimulation common to mask and target contours and that area stimulated only by mask contours. What Navon and Purcell (1981) proposed on the basis of their study is that, since chromatic or achromatic integration of contrast makes (M∩T) appear different from (M∩T̄), the target information is in fact preserved in the T–M composite rather than being masked (for a related study, see Schultz and Eriksen (1977)). Because such integration generally decreases in type A fashion with target–mask SOA, whereas the masking effect that they obtained increased in type A fashion as SOA increased from 0 to 50 ms, Navon and Purcell (1981) further proposed that the type B backward masking function that they obtained is a composite of two separable and additive processes: (i) ‘fortunate’ integration, which preserves target information, decreases monotonically at a rapid rate as SOA increases from zero to, say, 50 ms; (ii) masking by interruption, which destroys target information, decreases monotonically at a somewhat slower rate as SOA increases from zero to, say, 100 ms.
Adding these two countervailing type A processes gives a U-shaped type B backward pattern masking function, with optimal masking occurring at 50 ms.
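The arithmetic of this additive combination can be sketched as follows. The linear ramps are an assumption made here for simplicity (the source specifies only that both processes decline monotonically, over roughly 50 and 100 ms respectively), and the function names are hypothetical. Net masking is taken as target-destroying interruption minus target-preserving integration.

```python
def integration(soa_ms, limit_ms=50.0):
    """Target-preserving integration: assumed linear decline, gone by ~50 ms."""
    return max(0.0, 1.0 - soa_ms / limit_ms)


def interruption(soa_ms, limit_ms=100.0):
    """Target-destroying interruption: assumed linear decline, gone by ~100 ms."""
    return max(0.0, 1.0 - soa_ms / limit_ms)


def net_masking(soa_ms):
    """Net loss of target information: interruption offset by integration."""
    return interruption(soa_ms) - integration(soa_ms)


soas = list(range(0, 101, 10))
curve = [net_masking(s) for s in soas]
peak = soas[curve.index(max(curve))]

if __name__ == "__main__":
    for s, m in zip(soas, curve):
        print(s, "ms:", round(m, 2))
    print("peak masking at", peak, "ms")
```

Under these assumed ramps the two processes cancel at an SOA of 0 ms, net masking rises while integration is still declining, and it falls once only interruption remains, so the maximum lies at the SOA where integration reaches zero.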
Although this two-process model may account for the particular results reported by Navon and Purcell (1981), it is not consistent with the general findings (see Chapter 2, section 2.8) of type A masking by integration found when noise masks are employed (Greenspoon and Eriksen 1968; Kinsbourne and Warrington 1962a,b; Schiller 1966; Schiller and Smith 1965; Turvey 1973), and when structure masks having a higher energy than the target are employed (Hellige et al. 1979; Michaels and Turvey 1973; Purcell and Stewart 1970; Spencer and Shuntich 1970; Turvey 1973).
4.3.4. General criticism of two-process models
Over and above the discrepancies discussed in the previous section, the two-process model of Navon and Purcell (1981), as well as those of Reeves (1982) and Ganz (1975), would make the wrong prediction regarding the expected shifts of the SOA at which optimal metacontrast occurs as background or stimulus intensity is altered. The review of metacontrast in Chapter 2, section 2.6.2, showed that higher background or stimulus luminances shift the peak of the type B metacontrast function to larger SOA values relative to the peak obtained with lower luminances (Alpern 1953; Purcell et al. 1974). We also know, on the basis of the inverse background- and stimulus-intensity effects, that visual persistence, integration, and successive-flash resolution thresholds occupy a shorter duration at higher, relative to lower, intensities (Bowen et al. 1974; Di Lollo 1980; Haber and Standing 1970). The implication for Ganz’s (1975) two-process model is that, since the visual traces and therefore the lateral interactions of target and mask stimuli are curtailed in duration at higher background or stimulus intensities, the rising portion of the type B metacontrast function should terminate at a shorter SOA and thus shift the peak masking effect to a correspondingly shorter SOA value. Similarly, for Reeves’ model, the expected shift of the upper limit of temporal integration and the lower limit of temporal resolution to lower SOA values predicts that the two respective and correlated type A monotonic processes, comprising the overall type B metacontrast function, would also shift to lower SOA values. Hence, again, peak masking ought to occur at a lower SOA when background intensity is increased. Finally, since the duration of temporal integration is also shortened at higher background intensities, according to the model of Navon and Purcell
(1981) we would again expect the additive combination of the target-preserving integrating function of the mask and countervailing interrupting or masking function to produce a type B effect which peaks at a shorter SOA relative to lower backgrounds. Thus all the above two-process models predict a result which is contradicted by the findings of Alpern (1953) and Purcell et al. (1974) showing that the peak of the type B metacontrast function shifts to higher SOA values as stimulus or background intensity increases. The main problem with the above two-process models is that they apply in a restrictive manner only to the immediate data collected in support of them. Their explanatory extension to other extant data is, as shown, inadequate. Moreover, like the earlier models, none of the two-process models makes any specific attempt to explain the absence of masking if, rather than using brightness ratings or figural identification as a response criterion, others such as simple reaction time or detection are employed. Ganz’s model, based on lateral contrast induction between brightness response traces, could not account for the absence of metacontrast masking effects when the latter response criteria are employed but, rather, would predict a type B metacontrast function. It is not clear how Reeves’ model would fare with the use of the latter simple detection criteria, since no specific hypotheses were proposed regarding how the two components of his model, integration and segregation, affect simple detection. A similar uncertainty about the use of detection as opposed to identification criteria applies to Navon and Purcell’s model. Here, without specifying additional hypothetical processes, it is again not clear how the two temporally overlapping processes of target-preserving integration and target-destroying interruption would affect simple detection of the target.
4.4. Past neural-network models
4.4.1. Bridgeman’s Hartline–Ratliff inhibitory network
Bridgeman’s model of metacontrast is based on recurrent lateral inhibition among neurons in a distributed neural network (Bridgeman 1971, 1977, 1978, 2001). The equations specifying the inhibitory activity within the network are similar to those derived by Ratliff (1965):
$$r_i(t) = e_i(t) - \sum_{\substack{j=1 \\ j \neq i}}^{n} w_{j,i}\,\bigl[\, r_j(t - |i - j|) - r^{0}_{j,i} \,\bigr]$$
where $r_i(t)$ is the firing rate of neuron i at time t, $e_i(t)$ is the excitatory input to the ith neuron, $w_{j,i}$ is the synaptic weight for the connection from the jth to the ith neuron, $r^{0}_{j,i}$ is the firing threshold for the connection between the jth and the ith neuron, and n is the number of neurons in the network. This is a single-layer network where each neuron inhibits its neighbors through recurrent lateral inhibitory connections that have distance-dependent delays. The delayed inhibition generates oscillatory activities in both space and time. Moreover, when a spatially localized input is applied to the network, the activity spreads laterally with time, leading to a spatiotemporally distributed representation of stimuli. After the offset of the input, the spatially distributed activity decays gradually, producing a form of visual persistence or icon. The storage lasts for several iterations of the inhibitory process, after which the spatiotemporal oscillations of the network fade away. In addition, the pattern of spatiotemporal oscillation in the network specifies the stimulus used in metacontrast. For instance, the target alone and the mask alone each activate a characteristic pattern of spatiotemporal oscillations in the network. Thus each pattern of activity corresponds to one of the stimuli. In order to generate lateral masking or metacontrast functions, Bridgeman (1971) assumed that the activity of the target alone is stored in the neural network and is then compared with the activity produced by the target–mask combination. An example of the network activity for the target alone, and for the target and mask at SOAs of 0 and 60 ms is shown in Figure 4.1. The neural-network activity for the mask alone is shown for comparison. Although the exact mechanism of the comparison process is not specified, it is realized via a simple Pearson r coefficient between the stored activity set up by the target (or mask) alone and the activity set up by the target–mask sequence.
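The network equation and the correlation-based comparison just described can be sketched in a few lines. This is a toy reimplementation under stated assumptions, not Bridgeman’s simulation code: the network size, the stimulus layout, the fall-off of the weights with distance, the single shared threshold, and the rectification of rates and inhibitory drive at zero are all hypothetical choices, and time is counted in network iterations rather than milliseconds.

```python
import numpy as np


def simulate(inputs, w0=0.3, threshold=0.0):
    """Iterate r_i(t) = e_i(t) - sum_{j != i} w_ji * [r_j(t - |i - j|) - r0].

    inputs has shape (steps, n) and holds the excitatory input e_i(t).
    Assumed here: w_ji = w0 / |i - j|, one threshold r0 for all connections,
    and rectification of both firing rates and inhibitory drive at zero.
    """
    steps, n = inputs.shape
    rates = np.zeros((steps, n))
    for t in range(steps):
        for i in range(n):
            inhibition = 0.0
            for j in range(n):
                delay = abs(i - j)  # distance-dependent delay, in iterations
                if j == i or t - delay < 0:
                    continue
                drive = max(rates[t - delay, j] - threshold, 0.0)
                inhibition += (w0 / delay) * drive
            rates[t, i] = max(inputs[t, i] - inhibition, 0.0)
    return rates


def stimulus(cells, onset, steps=40, n=30, duration=2):
    """Excitatory input of 1.0 to the given cells for `duration` iterations."""
    e = np.zeros((steps, n))
    e[onset:onset + duration, cells] = 1.0
    return e


TARGET = [13, 14, 15, 16]                # hypothetical central target
MASK = [9, 10, 11, 12, 17, 18, 19, 20]   # hypothetical flanking mask


def target_match(soa):
    """Pearson r between target-alone and target-plus-mask network activity."""
    alone = simulate(stimulus(TARGET, 0))
    combined = simulate(stimulus(TARGET, 0) + stimulus(MASK, soa))
    return float(np.corrcoef(alone.ravel(), combined.ravel())[0, 1])


if __name__ == "__main__":
    for soa in (0, 2, 4, 8):
        print("SOA", soa, "iterations: r =", round(target_match(soa), 3))
```

Lower values of `target_match` stand for poorer recovery of the stored target pattern. Whether the values trace a type A or type B function depends on the assumed parameters, so the sketch shows the procedure rather than reproducing the published curves.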
Thus it corresponds to a cross-correlation or template-matching process of pattern recognition. High correlations correspond to good target recognition, and lower correlations correspond to poor target recognition. Simulated masking functions, using the disk–ring paradigm, for paracontrast, simultaneous masking, and metacontrast are shown in Figure 4.2. The solid lines correspond to the magnitudes of the r coefficient obtained when comparing the disk activity with the disk–ring activity, and the broken lines correspond to the comparison between the ring activity and the disk–ring activity.
Fig. 4.1 The activity in Bridgeman’s lateral inhibitory network in the presence of (a) the target alone, (b) the simultaneous presentation of the target and mask, (c) the presentation of the mask after the target, and (d) the mask alone. Axis labels: firing rate; nerve net size. (Reproduced from Bridgeman 1971.)
Fig. 4.2 Pearson r coefficients (ordinate, 0.0 to 1.0) as a function of the temporal interval between the disk and ring presentations: (a) ring presented before disk; (b) ring and disk presented simultaneously; (c) ring presented after disk. Solid lines, r coefficients of the disk; broken lines, r coefficients of the ring. (Reproduced from Bridgeman 1971.)
Note that the disk’s r coefficients are fairly high when preceded by the ring. This corresponds to a relatively weak paracontrast effect. At disk–ring synchrony the disk’s r coefficients are somewhat lower, and when the ring follows the disk they are reduced substantially. The latter reduction corresponds to
pronounced metacontrast at an SOA of 60 ms. Roughly symmetrical effects are obtained when the ring activity is compared with the disk–ring activity. One aspect of this model, noted by Weisstein et al. (1975), is that if one simulates metacontrast using the r coefficients over a wider range of SOAs, a pronounced temporal oscillation of the metacontrast function is obtained. As discussed in Chapter 2, section 2.6.2, oscillatory metacontrast functions are prominent only under specific stimulus configurations which are different from those used in Bridgeman’s simulations. Furthermore, at the time that Bridgeman carried out his simulations, there was no empirical evidence for oscillatory metacontrast functions. To minimize oscillations, Bridgeman (1977, 1978) incorporated a modification in which r² rather than r was used as a measure for comparing target with target–mask activity. With varying degrees of adequacy, the modified model predicts type A and B masking functions and the effects of mask size, separation, T/M energy ratio, and repetitive presentations of the T–M sequence (Bridgeman 1978, 2001). Masking functions derived from the early and late parts of the temporal response were of weaker type A and stronger type B, respectively. Bridgeman related this finding to the effects of response criterion level on masking, specifically to type A and type B functions observed with simple detection thresholds vs. brightness/pattern judgments (Kahneman 1968) and with speeded vs. slow brightness judgments (Lachter and Durgin 1999). However, this model also has several shortcomings. Because oscillatory activity is spread spatially, the effect of target–mask separation appears oscillatory in the model, in contradiction to empirically obtained monotonic functions. The model predicts a type A metacontrast function when the target and mask do not have the same sign of contrast, in particular when the target is dark and the mask is light. 
It was noted in Chapter 2, section 2.6.4, that Breitmeyer (1978c) obtained type B metacontrast suppression of a target’s contour detail under these stimulus conditions. Since in Bridgeman’s (1977, 1978) modified model the network activity presumably represents the entire target stimulus and not just its contour information, the type B contour suppression may fail to be predicted by his model. A further modification of Bridgeman’s model may be to incorporate separate processes for generating brightness or contrast suppression and suppression of
contour detail. Bridgeman’s (1971) original and modified (Bridgeman 1977, 1978) models suffer from the same predictive flaw as the two-process models discussed in section 4.3. In his simulations, Bridgeman, relying on Singer and Creutzfeldt’s (1970) study of cells in cat lateral geniculate nucleus, assumed a fixed time constant of 30 ms specifying the latency of recurrent lateral inhibitions in his network. Thus one iteration of the network’s inhibitory process required 30 ms. Since the outputs of these successive 30-ms iterations provide the input to the cross-correlating process, which determines the type B metacontrast function, the temporal characteristics of this function are in turn determined by the 30-ms latency of recurrent lateral inhibition. In Bridgeman’s particular application, the peak metacontrast effect (the lowest cross-correlation) occurs at an SOA of 60 ms, i.e. after two network iterations of the target-inhibition activity. However, if we let the latency of recurrent lateral inhibition vary, we could expect to obtain correlated variations of the SOA at which peak masking occurs. For instance, if the inhibitory time constant is 15 ms, the peak of the metacontrast function should shift to an SOA of 30 ms; similarly, if the time constant is 60 ms, the peak should shift to an SOA of 120 ms. One way of varying the inhibitory time constant of the visual system is to change its light adaptation level. Electrophysiological studies (Barlow et al. 1957a,b; Maffei et al. 1970; Poggio et al. 1969; Sasaki et al. 1971; Virsu et al. 1977) showed that the spatiotemporal response properties of single cells along the entire retinogeniculostriate pathway change with the adaptation level of the visual system. Singer and Creutzfeldt’s (1970) study of the response latencies of lateral geniculate cells, on which Bridgeman based his lateral inhibitory time constant of 30 ms, was conducted at a fixed background luminance. However, Virsu et al. 
(1977) replicated Singer and Creutzfeldt’s investigation and found that, with progressive dark adaptation, response latencies increased. Moreover, response latencies increase as stimulus intensity decreases. Consequently, as we change from a high to a low background or stimulus luminance level, Bridgeman’s model, like the two-process models discussed in the previous section, predicts corresponding peak metacontrast shifts from low to high SOA values. However, as we have already noted, the findings of Alpern (1953) and Purcell et al. (1974) show exactly the opposite trend. Finally, like all other single-channel models, Bridgeman’s model cannot
predict the double dissociation observed in target recovery (see Chapter 8, section 8.2).
4.4.2. Weisstein’s Rashevsky–Landahl two-factor neural network
Weisstein’s (1968, 1972) model of metacontrast applies strictly to post-receptor neural interactions in the visual pathway. In this neural-network model, each neuron’s response was assumed to be a function of the difference between excitatory and inhibitory postsynaptic potentials which rise and decay exponentially in time. In the modified model (Weisstein et al. 1975) shown in Figure 4.3, the neural network schematically consists of eight neurons, of which six are excitatory and two are inhibitory. In the figure, the target (S1) stimulus and the mask (S2) stimulus are presented simultaneously. The S1 stimulus activates an excitatory pathway in which the neurons n11, n12, and n13 synapse in sequence. Similarly, the S2 stimulus also activates an excitatory pathway consisting of a primary (n21), secondary, and tertiary neuron. The primary excitatory neurons in each pathway are assumed to be located in the periphery of the visual pathway; the secondary and tertiary neurons of each pathway are assumed to be central neurons.
Fig. 4.3 Weisstein’s modified Rashevsky–Landahl two-factor neural net model. The behavior of the net illustrates the hypothetical activity of target and mask neurons to simultaneous presentation of the target and mask stimuli. Activation functions are shown for the primary, secondary, and tertiary excitatory neurons in each pathway. Shaded regions correspond to the activation functions of the inhibitory interneurons. (Reproduced from Weisstein et al. 1975.)
At the first synapse along each pathway, inhibitory interneurons are also activated
by the primary excitatory neurons (n11 and n21). These inhibitory neurons cross spatially and form inhibitory synapses on the tertiary neuron in each pathway. This makes explicit the spatial symmetry of the reciprocal inhibition that the target and mask stimuli can exert on each other. Earlier versions of the model (Weisstein 1968, 1972) only implicitly incorporated this symmetry. As can be seen, the inhibition is of a feedforward non-recurrent type. Each of the eight neurons is characterized by a temporally rising and decaying activity function. The crucial difference between activity functions is seen when comparing the output of the secondary excitatory neurons with that of the inhibitory interneurons. The rise and fall times of the former are longer than those of the latter. In effect, the inhibitory interneurons respond faster than the secondary excitatory neurons. Since both types of neurons synapse on the tertiary neurons, the response of the tertiary neurons, in the case of target–mask simultaneity, is determined by the difference of a leading inhibitory postsynaptic potential and a lagging excitatory postsynaptic potential. Since the two potentials are temporally out of phase at target–mask simultaneity (SOA = 0 ms), the inhibitory potential will have dissipated appreciably by the time the lagging excitatory potential is generated. Hence there should be little if any suppression of the target’s tertiary neuron. The target’s visibility is not masked appreciably. However, at some positive SOA of the mask relative to the target, the generation of two antagonistic potentials should coincide in time. Here we would expect optimal suppression of the output of the target’s tertiary neuron and consequently optimal masking of the target’s visibility. At still greater SOAs, the generation of the inhibitory potential will again be out of phase, occurring somewhat later than the generation of the excitatory potential, and result in less than optimal target masking. 
Thus, by progressively increasing the SOA, a typical type B metacontrast function can be generated. Mathematically, Weisstein’s model is based on an earlier masking model proposed by Landahl (1967) and the canonical equations used to describe the activities of neurons are Rashevsky’s two-factor neuron model equations (Landahl 1961; Rashevsky 1960):
$$\frac{d\varepsilon_j}{dt} = -a_j \varepsilon_j + A_j f_i(t) \tag{4.1}$$
$$\frac{dj_j}{dt} = -b_j j_j + B_j f_i(t) \tag{4.2}$$
$$f_j(t) = [\varepsilon_j - j_j - h_j] \tag{4.3}$$
where $\varepsilon_j$ and $j_j$ are the excitatory and inhibitory factors, respectively, to the jth neuron (hence the ‘two-factor’ model). They can be interpreted as excitatory and inhibitory neurotransmitters. Parameters $A_j$, $a_j$, $B_j$, and $b_j$ are non-negative constants, function $f_j$ produces the output of the jth neuron ($f_i$ in equations 4.1 and 4.2 being the input it receives from neuron i), and $h_j$ is a threshold parameter associated with the jth cell. The function [.] denotes a linear-above-threshold function. In some versions of the model, saturation is also added to this non-linearity. The merit of the model lies in its conceptual simplicity. A modification of the model incorporating a slow inhibitory and fast excitatory process could also account for type B paracontrast (Weisstein et al. 1975). Moreover, since the model assumes that the excitatory–inhibitory interactions occur at central levels of processing, it can account for both monocular and dichoptic para- and metacontrast. Another advantage of the model is that it is described mathematically and therefore can make quantitative predictions. For example, it can simulate some of the empirical type B metacontrast functions obtained by Weisstein and Haber (1965), Schiller and Smith (1966), and Alpern (1953) with a reasonable degree of success (Weisstein 1972, Figs 3, 5, and 7). The model can capture changes in masking qualitatively, rather than quantitatively, as a function of the T/M energy ratio. The model correctly predicts a shift from type B to type A functions as the T/M energy ratio decreases. The empirical results also show this trend. Nonetheless, in several cases there appear to be large deviations between the simulated and empirical results in terms of both the location of the SOA at which peak masking occurs and the overall magnitude of masking. Moreover, since only the n11 and n21 neurons shown in Figure 4.3 respond to the target and mask, and since they presumably code only brightness information, the model of Weisstein et al. 
cannot really predict absence of metacontrast effects when simple reaction time to, or forced-choice detection of, the target stimulus is required (Bernstein et al. 1973a,b; Fehrer and Biederman 1962; Fehrer and Raab 1962; Harrison and Fox 1966; Schiller and Smith 1966; Ögmen et al. 2003).
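The way a fast inhibitory and a slow excitatory process can yield a type B function in a two-factor net may be illustrated with a short Python sketch. It is a simplification under stated assumptions, not Weisstein's simulation: the primary neurons are omitted, the two factors of the tertiary neuron are driven by separate inputs, the decay and gain constants are hypothetical, decay terms are taken as negative, and the summed tertiary output is used as a stand-in for target visibility.

```python
def two_factor_neuron(exc_drive, inh_drive, a, A, b, B, h=0.0, dt=1.0):
    """Euler integration of a two-factor neuron (1-ms steps by default).

    eps and j are the excitatory and inhibitory factors; the output is
    linear above the threshold h. Driving the two factors from separate
    inputs is an assumption of this sketch.
    """
    eps = j = 0.0
    out = []
    for fe, fi in zip(exc_drive, inh_drive):
        eps += dt * (-a * eps + A * fe)
        j += dt * (-b * j + B * fi)
        out.append(max(eps - j - h, 0.0))
    return out

TOTAL_MS = 400
SILENT = [0.0] * TOTAL_MS

def pulse(onset, duration=10):
    """A brief stimulus of unit amplitude starting at `onset` ms."""
    return [1.0 if onset <= t < onset + duration else 0.0 for t in range(TOTAL_MS)]

def target_visibility(soa):
    # Secondary excitatory neuron of the target pathway: slow rise and fall.
    secondary = two_factor_neuron(pulse(0), SILENT, a=0.02, A=0.02, b=0.0, B=0.0)
    # Inhibitory interneuron driven by the mask: fast rise and fall.
    interneuron = two_factor_neuron(pulse(soa), SILENT, a=0.3, A=0.3, b=0.0, B=0.0)
    # Tertiary target neuron: lagging excitation minus crossed, leading inhibition.
    tertiary = two_factor_neuron(secondary, interneuron, a=0.05, A=0.05, b=0.3, B=0.3)
    return sum(tertiary)

visibility = {soa: target_visibility(soa) for soa in range(0, 201, 10)}
best_soa = min(visibility, key=visibility.get)
print("SOA of strongest masking (ms):", best_soa)
```

With these hypothetical constants the inhibition at an SOA of 0 ms arrives while the target's excitation is still building up, so the strongest suppression falls at a positive SOA and weakens again at long SOAs.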
4.5. Neural-network models adopting overtake and dual-channel activation hypotheses
It may be intuitively obvious that a mask stimulus that is much more energetic (greater contrast or duration) than the target stimulus (Crawford 1947; Sperling 1965) activates neural responses that overtake those of the target and thus contribute to backward masking. However, one of the most salient and counter-intuitive features of metacontrast is that the perceived contrast and form of the target stimulus is optimally suppressed not at an SOA of 0 ms but rather when the mask follows the target by 30–100 ms, even when the energy of the mask is equal to or even less than that of the target. Here we review two additional models that rely most clearly on two of the four properties listed above: overtake of slower responses in one neural channel or pathway by activity in a separate and faster responding pathway. Therefore the response overtake is not due to the general finding that more intense stimuli activate shorter latency neural responses. Rather, the overtake is ascribed to the faster activation of one channel or pathway compared with a separate one even when the intensity of the mask is less than or equal to that of the target. The models proposed by Matin (1975) and by Weisstein and colleagues (Weisstein 1968, 1972; Weisstein et al. 1975) are also based on an overtake or dual-channel-activation hypothesis. However, since they were discussed in sections 4.1 and 4.2 above, they will not be covered here.
4.5.1. The retino-cortical dynamics (RECOD) neural-network model
The retino-cortical dynamics (RECOD) neural-network model is an elaboration and quantification of the dual-channel model based on inhibitory interactions within and between sustained and transient channels proposed by Breitmeyer and Ganz (1976). We merely list it here and defer a detailed exposition until Chapter 5.
4.5.2. The perceptual retouch model
A model of visual masking based on perceptual retouch (PR) was introduced by Bachmann (1984) and elaborated more recently (Bachmann 1994). In this model, interactions between activities of two anatomically distinct pathways also assume a key role. Hence we can
consider it a type of dual-channel or two-process model (Bachmann 1997). The model includes recurrent lateral inhibition, akin to that specified in Bridgeman’s (1971) Hartline–Ratliff neural net, at early stages in the afferent visual pathway. However, the inhibition does not play a primary role in backward masking as it does in Bridgeman’s model. Instead it merely serves to preprocess target and mask representations which subsequently interact in a more crucial way at later cortical levels to yield the typically observed backward masking effects. Moreover, whereas the dual channels of the RECOD model project in parallel from the retina via the lateral geniculate nucleus to the visual cortex, the two channels or pathways of the PR model take different routes from the retina to the cortex. One, called the specific pathway (SP), projects from the retina via the lateral geniculate to the visual cortex. The other, called the non-specific pathway (NSP), projects via collaterals from the retina to the reticular centers (in the midbrain and brainstem) and from there to the cortex. Adhering as closely as possible to Bachmann’s (1994) terminology and abbreviation scheme, we list the main assumptions and consequences of the model as illustrated in Figure 4.4.
Fig. 4.4 Representation of the perceptual retouch (PR) model. The specific pathway (SP) consists of receptors (P), detectors (D), and command neurons (K). The non-specific pathway (NSP) comprises the modulatory neuron (M) which pools its inputs from receptors and which projects diffusely to detector neurons. The subscripts t and m denote cells activated by the target and mask, respectively. The figure legend distinguishes excitatory from inhibitory synapses. (Adapted from Bachmann 1994.)
1. A stimulus briefly activates both the retino-geniculo-striate SP pathway and, via collaterals, the retino-reticulo-cortical NSP pathway.
2. Whereas the SP activity determines the contents of consciousness, the NSP activity generated in subcortical modulatory structures M is necessary for subjective awareness per se.
3. The NSP activity and the SP activity converge at the same cortical locus of the detector unit D; however, the NSP activity arrives about 50 ms later than the SP activity.
4. For a stimulus representation to attain consciousness, SP impulses and NSP impulses must overlap temporally at the same retinotopic cortical locus of D.
Given these assumptions, the process of perceptual retouch can generate a type B backward masking function as follows. The target and mask stimuli, respectively, generate not only short-latency SPt and SPm activities but also long-latency NSPt and NSPm activities. When the two stimuli are presented simultaneously or else clearly successively (SOA = 0 ms or SOA > 150 ms), the temporal convergences of the SP and NSP impulses on the respective Dt and Dm loci, although non-optimal, are nevertheless equal for both the target and mask. Thus, at Dt and Dm, the cortical loci of convergence of SP and NSP impulses, the signal-to-noise ratios representing the target and mask stimuli are at equal but non-optimal values. Subsequently, the outputs of the Dt and Dm units, respectively, activate the command units Kt and Km through direct feedforward excitation but suppress Km and Kt activities through crossed feedforward inhibition. Since the excitatory and inhibitory inputs to Kt and Km are equal, their outputs, correlating with the visibility of the target and mask stimuli, should also be equal. Therefore at SOAs near 0 ms or larger than 150 ms, the target and mask ought to be equally visible.
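The timing argument in assumptions 3 and 4 can be made concrete with a small Python sketch. Only the 50-ms NSP delay comes from the model as described here; the 80-ms durations of SP and NSP activity, the idea of measuring convergence as milliseconds of temporal overlap, and all names are hypothetical choices made for illustration. Following Figure 4.4, the modulatory unit M is taken to pool both stimuli and to project to both detectors.

```python
NSP_DELAY = 50      # ms by which NSP activity lags SP activity (assumption 3)
SP_DURATION = 80    # ms of SP activity at a detector (hypothetical value)
NSP_DURATION = 80   # ms of NSP activity at a detector (hypothetical value)

def active(t, onset, duration):
    """True if an activity starting at `onset` is still present at time t."""
    return onset <= t < onset + duration

def convergence(sp_onset, stimulus_onsets, horizon=600):
    """Milliseconds during which SP activity at a detector D coincides
    with NSP activity evoked by any of the stimuli."""
    total = 0
    for t in range(horizon):
        sp = active(t, sp_onset, SP_DURATION)
        nsp = any(active(t, s + NSP_DELAY, NSP_DURATION) for s in stimulus_onsets)
        if sp and nsp:
            total += 1
    return total

def convergence_at_detectors(soa):
    """Return (overlap at Dt, overlap at Dm) for a target at 0 ms and a mask at `soa` ms."""
    onsets = (0, soa)
    return convergence(0, onsets), convergence(soa, onsets)

for soa in (0, 50, 150):
    print(soa, convergence_at_detectors(soa))
# With the hypothetical durations above: (30, 30), (30, 80), (30, 30)
```

Under these assumed durations the overlap is equal and partial at both detectors for SOAs of 0 and 150 ms, whereas at an SOA of 50 ms the mask's SP activity coincides fully with the NSP activity evoked by the target.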
However, because the NSP activity arrives at D about 50 ms later than the SP activity, an SOA of 50 ms produces an optimal temporal convergence of the NSPt and the SPm impulses at the retinotopic cortical locus Dm. Hence, the signal-to-noise ratio of the cortical activity representing the mask at Dm attains its maximal value. It follows that the excitatory activation of Km will be significantly larger than that of Kt. Moreover, this difference in activity will be accentuated by the feedforward inhibition, i.e. the output of Kt will be strongly inhibited while that of Km will not. Therefore the target’s
visibility will be strongly suppressed while the mask’s visibility is enhanced. Bachmann’s PR approach has several strengths. First, it is highly plausible from a neuroscientific standpoint. Well-known anatomical and electrophysiological findings indisputably establish the existence of specific retino-geniculo-cortical processing pathways in vision (reviewed by Breitmeyer 1992; Schiller 1986; Shapley 1992). Less well known but equally indisputable is the empirical evidence for nonspecific afferents that project to visual thalamic and cortical areas from subcortical reticular areas of the brainstem and midbrain (Frizzi 1979; Hartveit et al. 1993; Hassler 1978; Purpura 1970; Singer 1977, 1979; Singer et al. 1976; Steriade and McCarley 1990). The reticular complex has played an increasingly prominent role in recent formulations of visual selective attention (Crick 1984; LaBerge 1995; LaBerge and Brown 1989; Singer 1994). Of relevance here is Berson and McIlwain’s (1982) demonstration that collateral fibers provide direct inputs from transient Y ganglion cells of the cat retina to neurons in the deep layers of the cat superior colliculus. Moreover, on the basis of physiological and cytological criteria, Edwards et al. (1979) have argued that the cells in these deep layers ought to be classified as reticular rather than collicular. If this scheme generalizes to the primate visual system, we have, as stipulated by Bachmann’s model, the neural bases for the activation of retino-reticulo-cortical (NSP) as well as retino-geniculo-cortical (SP) pathways when a brief target or mask stimulus is presented. One of us (Breitmeyer 1984, Chapter 10, 1986) has previously discussed the role of brainstem and midbrain reticular activation in visual masking and information processing. 
In these discussions reticular activation served not as a constitutive component of the neural masking process as it does in Bachmann’s model, but rather as a component ancillary to the sustained–transient channel interactions that constituted the essential neural masking processes in the model introduced by Breitmeyer and Ganz (1976). The additions were introduced to account for the roles of selective attention and visual exploratory behavior characterized by stimulus-guided fixation–saccade sequences in visual information processing (Breitmeyer 1980). These topics are particularly relevant to our understanding of recent results showing that the magnitude of metacontrast is reduced by selectively attending to the area or the configuration of the target–mask
display (Boyer and Ro, in press; Enns and Di Lollo 1997; Havig et al. 1998; Ramachandran and Cobb 1995; Shelley-Tremblay and Mack 1999). Bachmann’s PR approach clearly is consistent with these and a number of other relevant findings, including results obtained when pharmacological and neurosurgical interventions produce predictable response changes in the NSP activating system (Bachmann 1994). A second strength is that the PR model accords especially well with the enhancement of the visibility of the second of two spatiotemporally proximate stimuli (e.g. a mask following a target) compared with when that stimulus is presented alone. Both Michaels and Turvey (1979) and Bachmann (1988, 1994) reported that the visibility enhancement of an aftercoming mask is maximal at SOAs at which the visibility of the target is optimally suppressed. These mask enhancement effects, related to the paracontrast enhancement effects discussed in Chapter 2, section 2.5, may also be related to the contrast reversals of the (preceding) target stimulus reported in several previous investigations (Brussel et al. 1978; Heckenmueller and Dember 1965; Purcell and Dember 1968). For example, under suitable conditions and at optimal SOAs, a black disk-like target followed by a black ring-like surrounding mask actually appears brighter than the white background on which the target and mask are presented. In Bachmann’s PR model, both the enhancement of the second stimulus and the contrast reversal of the first stimulus could be explained by the enhanced output of Dm neurons relative to Dt neurons and consequently a strong inhibition of Kt neurons relative to a strong excitation of Km neurons. Of course, in this interpretation of the PR model it is assumed that the activity levels of the K neurons determine the perceptual salience (brightness, contrast, clarity) of the stimuli. 
However, in Bachmann’s (1994) model, the visibility of a stimulus is not clearly or consistently tied to the output of its associated K neuron. This ambiguity or inconsistency in Bachmann’s (1994) discussion of the PR model is due to a lack of specificity as to which of the two sets of neurons, D (detection) or K (command), is to be associated with the perceived aspects of the stimuli. At one point, Bachmann (1994, pp. 180, 183–4) claims that the perceptual efficiency and subjective clarity of a stimulus is directly proportional to the level of activity produced by the stimulus in the corresponding D neuron. Alternatively, Bachmann (1994, pp. 183, 221) argues that the K neurons (or later-stage gnostic units) generate responses leading to perceptual categorization and
recognition of their corresponding stimuli. We believe that this minor ambiguity in the PR model can be resolved by unambiguously taking the output of K neurons as the correlate of perceived stimulus properties such as the perceptual enhancement of the mask’s contrast and the reversal of the target’s contrast. Correcting these ‘tuning’ problems with Bachmann’s (1994) PR model would be easy and would enhance its explanatory power. The PR model also predicts that the second stimulus appears in consciousness faster when compared with an isolated presentation of the same stimulus. Bachmann (1994, 1997) presents data supporting this prediction, although the supporting evidence rests on what we consider to be a questionable rationale (Breitmeyer and Ögmen 2000, footnote 2). As a corollary, the PR model also predicts that the judged order of appearance of a target and a following mask stimulus can be reversed with an appropriate choice of SOAs, such as those giving rise to maximum (or near-maximum) metacontrast suppression. To our knowledge, no extant data bear on this prediction. The relevant experiment remains to be conducted.
4.6. Object-substitution models
Bachmann’s PR model can be viewed as an object-substitution model as well as a dual-process model. The target object’s SP activity SPt is replaced by the mask object’s SP activity SPm as the latter (mis)appropriates the slower target-activated NSP response at the level of the D and K neurons. Since combination of SP and NSP activities is required for stimulus information to register as a conscious percept, it follows that the mask takes the place of the target in perception (Bachmann 1999, 2000). A model specifically adopting object substitution as a masking mechanism has been proposed recently by Enns and Di Lollo (1997) and elaborated by Di Lollo et al. (2000). 
The theory is intended to explain not only what is claimed to be a new type B masking effect when the onset of the mask follows that of the target, but also the interesting case of simultaneous-onset masking. These two cases are discussed in turn below.
4.6.1. Object substitution and type B backward masking
Regarding the type B backward masking effect, Enns and Di Lollo (1997) investigated contour discriminability of a diamond-shaped
target with either the left or right corner missing when it was masked by a surrounding diamond-shaped mask or by a mask made of four dots falling at the corners of a notional square that would enclose the diamond target. The key results showed that the four-dot mask, like the contour-adjacent diamond-shaped mask, could produce substantial type B masking under the following conditions: (i) when the target and mask were presented 3° peripherally but not when they were presented foveally; (ii) when attention was distributed over large areas of visual space, i.e. when attention could not be fully allocated to a given target–mask location. While attributing type B backward masking, obtained when the surrounding mask was used, to interactions between contour-sensitive mechanisms, Enns and Di Lollo attributed a similar effect, found when the four-dot mask was used, to high-level processes of object substitution. They argued that their data cannot be explained by contour-based mechanisms of metacontrast because (a) the four-dot masking was not significantly affected by varying the spatial separation between the target and mask contours, (b) none of the masking models based on contour-sensitive mechanisms predict differential masking outcomes as a function of attention, and (c) under conditions of distributed attention not only backward masking but also simultaneous and forward masking are obtained. These findings, as noted by Enns and Di Lollo (1997), can pose problems for masking models based strictly on contour-sensitive mechanisms. Although the types of contour-sensitive mechanisms are not specified, they would no doubt include a number of proposed mechanisms relying on inhibition between spatially neighboring stimulus activities such as lateral inhibitory networks (e.g. Bridgeman 1971; Weisstein 1968) or mechanisms based on (lateral) inter-channel inhibition (e.g. Breitmeyer and Ganz 1976). 
However, the results reported by Enns and Di Lollo, while consistent with the notion of object substitution, are damaging to the alternative models only under implicitly limited interpretations. In particular, two interpretive limitations are implied: (1) contour interactions are narrowly limited in spatial extent under all conditions, and (2) contour proximity is a sine qua non for obtaining type B metacontrast. Thus, when the target and mask stimuli are centered at the fovea, strong type B masking with the surrounding mask but not with the four-dot mask is fully consistent with previous findings and with limited foveal lateral interactions
between the target and mask (Bridgeman and Leff 1979; Kolers and Rosner 1960). However, target–mask contour proximity is not a sine qua non for obtaining type B backward masking functions, especially when the stimuli are presented outside the fovea, i.e. in regions of the visual field where object substitution presumably manifests itself most strongly. There, substantial metacontrast masking can be obtained at target–mask contour separations of 1° or more (Alpern 1953; Breitmeyer et al. 1981b; Growney et al. 1977; Weisstein and Growney 1969) even when the mask stimulus does not surround the target stimulus (Breitmeyer et al. 1974, 1976). Hence the presence of four-dot masking outside but not inside the fovea would be entirely expected in ‘standard’ models. In particular, extant models such as the sustained–transient model proposed by Breitmeyer and Ganz (1976) and Matin’s (1975) spatiotemporal sequence model, both of which allow for spatially extensive lateral effects generated by the transient channels (see Breitmeyer et al. 1981b, Fig. 7), could account for the extrafoveal surround mask and four-dot mask results. Enns and Di Lollo (1997) correctly note that most extant theories of masking do not generally account for attentional effects. However, one masking model proposed by Michaels and Turvey (1979) incorporates attention, but largely as an add-on process working in conjunction with spatial inhibitory processes. Such a general add-on role for attention makes sense in light of the fact that facilitative effects of selective attention, although found in metacontrast (Boyer and Ro, in press; Havig et al. 1998; Ramachandran and Cobb 1995; Shelley-Tremblay and Mack 1999), are not unique to backward masking tasks. Effects of selective attention generalize to all sorts of perceptual tasks and criterion contents (Bashinski and Bacharach 1980; LaBerge 1995; Posner 1980; Sagi and Julesz 1985; P.L. Smith et al. 2004), and beyond that to many other cognitive tasks. 
If, for instance, we consider the finding that focusing attention on the locus of a stimulus increases detection sensitivity d’ (Bashinski and Bacharach 1980; P.L. Smith et al. 2004), we need not conclude that decreased sensitivity found with diffuse spatial attention rules out the role of low-level detection units and requires a higher-level mechanism that is qualitatively different from them. We need only assume that the output of the low-level detectors is modulated by the level of attention. Likewise, a modulation of target visibility when attentional allocation to it varies does not
MODELS AND MECHANISMS OF VISUAL MASKING
warrant ruling out the role of lower-level contour inhibitory mechanisms. These alternative ways of explaining the results reported by Enns and Di Lollo (1997) do not exclude the possible role of object substitution as a masking mechanism. However, they do point out that the results obtained by Enns and Di Lollo (1997) are not decisive in rejecting standard ‘contour-sensitive’ models of masking.
4.6.2. Object substitution and common-onset masking
Di Lollo et al. (1993, 1995) reported substantial masking of the target when the onsets of the target and mask are simultaneous but the duration of the mask is prolonged relative to the brief target duration. Generally, masking increases with mask duration until a critical mask duration is reached beyond which no further increase in masking is obtained. Di Lollo et al. (2000) extended this paradigm and reported the results of a number of parametric experiments showing how similar masking functions can be obtained with four-dot and surround masks. Their model, now given quantitative expression, distinguished between early masking processes affected by early sensory factors, such as the level of light adaptation, and later attentional factors, such as target set size. As well as the attentional component, the model incorporated cortical processes depending not only on feedforward mechanisms but also on re-entrant feedback mechanisms that play a prominent role in several approaches to the study of visual perception (Bridgeman 1971; de Kamps and van der Velde 2001; Edelman 1987; Grossberg and Mingolla 1985a,b; Ögmen 1993; Ratliff 1965; Zeki 1993). The re-entrant processes play a particularly important role in the process of object substitution. Di Lollo et al. (2000) claim that their findings, expressing the combined effects of common-onset and four-dot masking, not only support their model but also defy explanation by extant feedforward or contour-sensitive models of masking. However, this claim is somewhat puzzling in view of the fact that Bischof and Di Lollo (1995) had already shown that both metacontrast and common-onset masking are consistent with Bridgeman’s (1971) Hartline–Ratliff neural-network model. Although the model incorporates recurrent lateral inhibition that can be expressed as a form of negative feedback, it certainly does not qualify as the high-level attention-dependent re-entrant activity envisaged by Di Lollo et al. (2000) as an essential component of their object-substitution model.
In Bridgeman’s model, the recurrent network merely serves to set up and sharpen stimulus-induced neural-network activities, while correlations between target- and mask-induced activities determine the perceptual status of the target when it is followed by a mask. Recurrent lateral inhibition is a low-level property of vision; versions of it are implemented in visual systems as simple as that of Limulus (Ratliff 1965). Hence, invoking a high-level mechanism to explain masking phenomena that a relatively low-level process might explain as well appears unnecessary for explanations of common-onset masking. In fact, as shown recently by Francis and Hermens’s (2002) extension of Francis’s (2000) original analysis of quantitative models of masking, a number of existing models besides Bridgeman’s (1971) can explain the major findings of common-onset masking as well as the four-dot masking reported by Di Lollo et al. (2000). Thus, again, the results, while supporting the object-substitution model, do not rule out extant lateral inhibitory or feedforward models. Despite these unresolved issues (Di Lollo et al. 2002), an important upshot of more recently reported findings by Enns and Di Lollo (1997), Di Lollo et al. (2000), Lleras and Moore (2003) and Enns (2002, 2004) for visual masking theories is the sharper focus they place on late object-specific (Williams and Weisstein 1981) and attentional (Michaels and Turvey 1979) levels of cortical processing that cannot be explained by cortical contour-interactive processes alone (Breitmeyer 1984, pp. 256–61). Such an approach dovetails nicely with evidence for an interruptive mechanism of visual masking located in or probably no later than the inferotemporal cortex (Kovács et al. 1995; Rolls and Tovée 1994; Rolls et al. 1999). Spatial selective attention powerfully modulates the responses of visual cells there and in other extrastriate cortical areas (Moran and Desimone 1995; Motter 1993; Sato 1988; Schiller and Lee 1991). 
Hence such interruptive masking would be particularly consistent with the view of an attention-modulated masking at these later levels of cortical processing and, specifically, with a mechanism of object substitution. On the other hand, in a recent study Kahan and Mathis (2002) showed that the strength of common-onset masking did not depend on gestalt grouping factors such as form, similarity in color, position, luminance polarity, and common region, thus providing evidence against the role of object-specific factors in common-onset masking.
4.7. Emergent dynamic properties of the boundary contour system (BCS) neural network
The boundary contour system (BCS) is a model of a cortical neural network proposed by Grossberg and Mingolla (1985a,b) to account for spatial aspects of the visual perception of stationary steady-state stimuli. By exploring its dynamic properties, Francis and colleagues (Francis 1996a,b, 1997; Francis and Grossberg 1996a,b; Francis et al. 1994) have demonstrated its ability to account also for a variety of empirical findings obtained when visual inputs change rapidly over time. The model is a cooperative–competitive network incorporating, in addition to the obvious afferent feedforward excitatory drive, three key properties: excitatory feedback, feedforward inhibition, and inhibitory feedback. Since the BCS, at least with respect to form perception of stationary stimuli, relies on the cortical P pathway (Grossberg 1994), we can consider these mechanisms to operate within a single channel. Figure 4.5 shows a schematic diagram of the BCS model. At the earliest levels of processing, it consists of unoriented contrast-specific filters (on-center off-surround cells) whose outputs feed into oriented contrast-specific simple cells. In turn, the rectified outputs of these units feed into complex cells selective for the same orientation but insensitive to contrast polarity. At the first competitive stage, the complex cells project their outputs via on-center off-surround connections to first-level hypercomplex cells. Because of the off-surround connections, these hypercomplex cells, while remaining orientation selective, additionally are selective for end-stopped stimuli (Hubel 1988; Hubel and Wiesel 1965). At the next stage, competition among orientations results when higher-order hypercomplex cells are activated via antagonistic inputs from the lower-order hypercomplex cells tuned to different orientations.
The outputs of these higher-order hypercomplex cells, which specify the location of oriented stimulus boundaries in the visual field, feed into cooperative bipole cells. The bipole cells generate feedback which, on the one hand, excites location- and orientation-consistent patterns of activity and, on the other, inhibits inconsistent patterns of activity. The temporal dynamics of this model account for a number of empirical regularities characterizing visual persistence (Francis 1996a,b). Persistence arises through activation of the excitatory feedback loops resulting in long-lasting reverberatory activity in the BCS network. When
PROPERTIES OF THE BOUNDARY CONTOUR SYSTEM
[Figure 4.5 diagram: processing stages labelled Unoriented (LGN cells); Oriented, contrast polarity (simple cells); Oriented, no-contrast (complex cells); Competitive stage 1 (end-stopped cells); Competitive stage 2 (hypercomplex cells); Cooperative stage (bipole cells); and Spatial sharpening (excitatory–inhibitory feedback). Legend: excitatory and inhibitory connections.]
Fig. 4.5 The boundary contour system (BCS) model. The solid and dotted lines represent excitatory and inhibitory connections, respectively. The first competitive stage consists of feedforward on-center off-surround connections between spatially neighboring cells that have the same orientational tuning. The second competitive stage is a ‘push–pull’ type interaction at every retinotopic position among cells with opponent orientation preference (e.g. horizontal vs. vertical). The bipole cells pool the outputs of hypercomplex cells in an orientation-selective manner. The on-center off-surround feedback from bipole cells implements spatial sharpening. (Adapted from Francis 1997.)
stimuli change or move, this persistent activity would be problematic in that it could give rise to forward masking (Breitmeyer 1980, 1984) or motion smear (Chen et al. 1995) (see also Chapter 6). In the BCS model, such persistence can be curtailed in one of two ways: by a gated-dipole mechanism (Francis et al. 1994) which at stimulus offset produces a reset signal inhibiting the persisting reverberatory activity, or by lateral inhibition operating mainly at the first competitive stage (Francis 1996a). According to Francis (1996a, 1997), the latter mechanism of lateral inhibition is the more significant contributor to the
dynamics of stimulus boundary erosion over time under metacontrast conditions. Figure 4.6 shows schematically how the model accounts for key properties of metacontrast. Figure 4.6(a) depicts the effect of the mask on the target at short SOAs. The solid curve T illustrates the response of a target-activated hypercomplex cell. Because hypercomplex cells receive positive feedback signals, their response persists after the offset of the stimulus. However, the lateral feedforward inhibitory signal generated by the mask at the first competitive stage is outside the feedback loop and its duration is shorter. Consequently, the effect of the mask is to reduce transiently the activity of the hypercomplex cells responding to the target, as shown by the dotted curve M. The net response to the target, after taking into account the inhibitory effect of the mask, is shown by the shaded area. As the SOA increases, this transient suppression starts to occur at the weaker portions of the response, as shown in Figures 4.6(b) and 4.6(c). The amount of suppression of the target activity is weak for short SOAs because of the presence of the strong feedback signal. As the strength of this feedback signal decays at midrange SOAs, the amount of suppression becomes stronger. At longer SOAs, the net effect of inhibition becomes smaller because the target-generated activity has already decayed (Fig. 4.6(c)). The model uses the linking assumption according to which the final visibility, or perceptual quality, of the target is proportional to the duration of the boundary signals generated by the target. Therefore the magnitude of masking will be given by the change in this duration. As shown in Figure 4.6(a), although the mask suppresses part of the target-generated responses at short SOAs, this suppression does not have any significant effect on the duration of the signal, thereby yielding very weak masking.
Similarly, at long SOAs, only part of the inhibitory signal overlaps with the decaying target activity and the change in duration will be relatively small. As a result, the strongest masking is obtained at intermediate SOAs, yielding a U-shaped masking function. From this simplified analysis, it can be seen that any change in target stimulus parameters that modifies the neural response to the target (e.g. changes in target energy, by varying either its luminance (Weisstein 1972) or duration (Schiller 1965)) will affect the shape of the metacontrast function. Similarly, since strength of feedforward inhibition is directly related to the strength of the mask stimulus, changes in mask stimulus parameters (energetic or spatial)
[Figure 4.6: three panels, (a), (b), and (c), each plotting activity against time, with the target (T) and mask (M) stimulus traces below the time axis and the annotations d, Δd, and Δd = 0 marking the response duration and its change.]
Fig. 4.6 Simplified depiction of temporal interactions leading to metacontrast effects in the BCS model: target and mask stimuli at (a) short, (b) mid-range, and (c) long SOAs, respectively. The temporal profile of the target (T) and mask (M) stimuli are shown by the traces at the bottom of the graphs. The grey and white areas under the curve represent the response that the target would generate in the absence of the mask. The shaded area is the net response to the target stimulus after taking into account the inhibition exerted by the mask stimulus. The duration of this net response is shown by d. The change Δd in this duration caused by the presence of the mask is used to compute the magnitude of masking.
are also predicted to affect metacontrast (Breitmeyer 1978b; Di Lollo et al. 1993; Sherrick and Dember 1970). Finally, several spatial and temporal properties of target recovery (disinhibition) produced when introducing a second mask to the target–mask metacontrast sequence (Breitmeyer 1978a,1981b) can be explained by the component of the inhibitory signal arising from the inhibitory feedback pathway implementing spatial sharpening. On the other hand, in its current form the model cannot explain the double dissociation in target recovery discussed in Chapter 8, section 8.2. As mentioned above, the extant findings of backward masking are voluminous, revealing not only the main and highly replicated empirical regularities but also more specific effects. So far the BCS model has been able to account in a robust manner for a number of these regularities and specific effects. However, much work remains to be done to account for the vast remainder of findings. Unfortunately, a practical limitation of this as well as similar network models is that solving the network’s differential equations is a very time-consuming process (see Francis 1997, Appendix), and this may prohibit extensive testing of the model. Despite this practical limitation, the model already manifests a wide explanatory scope, providing an integrative scheme accounting
for major findings on visual persistence, temporal integration, metacontrast masking, and apparent motion (Francis 1996a,b, 1997, 1999; Francis and Grossberg 1996b; Francis et al. 1994). Moreover, the model makes several explicit predictions regarding psychophysical and neurophysiological findings. A particularly strong psychophysical prediction is that optimal metacontrast masking ought to occur at a constant ISI between target offset and mask onset despite variations in the target duration. This prediction can be visualized from Figure 4.6 by noticing that the suppressive effect of the mask becomes significant when the target-generated activity starts to decay and becomes weak, which occurs after the offset of the target. Therefore the optimal delay for the mask can be measured from the offset of the target stimulus irrespective of the duration of the target, provided that changes in the duration of the target do not significantly alter the post-stimulus activity. A previous explanation of metacontrast (Kahneman 1967) has explicitly taken the SOA, rather than the ISI, to be the critical temporal parameter specifying optimal backward masking. Since $\mathrm{SOA} = D_t + \mathrm{ISI}$ (where $D_t$ is the duration of the target), a fixed optimal SOA implies an optimal ISI of $\mathrm{SOA} - D_t$, so this account predicts that the optimal ISI should decrease, rather than remain constant, as target duration increases. However, as discussed in Chapter 2, section 2.6.2, no single temporal parameter (SOA, ISI, or STA) can be viewed as being best or critical for explaining the peak masking effect during backward masking. Finally, the linking hypothesis used in the current approach needs to be refined or revised since the duration of boundary signals is not related to perceived brightness or percept quality in a simple or direct way, as demonstrated by the inverse relationship between visible persistence and stimulus intensity (Coltheart 1980; Di Lollo and Bischof 1995).
Indeed, earlier simulations of the BCS model showed that increasing the luminance of the stimulus caused a decrease in the duration of boundary signals, suggesting an inverse relationship between perceived brightness and the duration of boundaries (Francis et al. 1994). However, the linking assumption used in the simulations of backward masking was a direct relationship between perceived brightness and the duration of boundaries (Francis 1997). Simulations that take into account interactions between BCS and FCS (Francis and Grossberg 1996a) or alternative linking assumptions might be used to rectify this problem.
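The account of metacontrast summarized in Figure 4.6, together with the constant-ISI prediction, can be illustrated with a toy calculation. The sketch below is not the BCS network and does not reproduce the simulations of Francis (1997). It assumes, purely for illustration, that the unmasked target activity stays at full strength while the target is on and then decays exponentially, that the mask subtracts an exponentially decaying inhibitory transient, and that masking magnitude is the reduction in the span over which the net activity exceeds a threshold. All function names and parameter values are hypothetical.

```python
import math


def target_trace(t, target_dur, tau_persist):
    """Unmasked target activity: full strength during the target, then an
    exponential decay standing in for feedback-driven persistence."""
    if t < target_dur:
        return 1.0
    return math.exp(-(t - target_dur) / tau_persist)


def mask_inhibition(t, soa, strength, tau_mask):
    """Transient feedforward inhibition that starts at mask onset (t = soa)."""
    if t < soa:
        return 0.0
    return strength * math.exp(-(t - soa) / tau_mask)


def response_duration(soa=None, target_dur=20, tau_persist=80.0,
                      strength=0.5, tau_mask=20.0, theta=0.1, t_max=800):
    """Span (ms) between the first and last 1-ms samples at which the net
    target activity exceeds theta. soa=None means that no mask is shown."""
    above = []
    for t in range(t_max):  # time in integer milliseconds
        net = target_trace(t, target_dur, tau_persist)
        if soa is not None:
            net = max(net - mask_inhibition(t, soa, strength, tau_mask), 0.0)
        if net > theta:
            above.append(t)
    return above[-1] - above[0] if above else 0


def masking_function(target_dur, soas):
    """Masking magnitude at each SOA: reduction in duration due to the mask."""
    baseline = response_duration(None, target_dur)
    return [baseline - response_duration(s, target_dur) for s in soas]


soas = list(range(0, 401, 10))
curve = masking_function(20, soas)
peak_soa = soas[curve.index(max(curve))]

# Optimal ISI (peak SOA minus target duration) for several target durations.
durations = (20, 60, 100)
optimal_isis = []
for dur in durations:
    c = masking_function(dur, soas)
    optimal_isis.append(soas[c.index(max(c))] - dur)

# What an SOA law would imply instead: ISI = SOA - D_t at a fixed optimal SOA.
soa_law_isis = [peak_soa - dur for dur in durations]

print("peak SOA (ms):", peak_soa)
print("optimal ISI per target duration (ms):", optimal_isis)
print("ISI implied by a fixed optimal SOA (ms):", soa_law_isis)
```

Under these assumed values masking is weak at short and long SOAs and strongest at an intermediate SOA. The optimal ISI comes out the same for every target duration only because the post-offset activity is, by construction, independent of target duration, which is exactly the proviso stated above.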
SUMMARY
4.8. Summary
In this chapter we have provided an overview of models and mechanisms of pattern masking. The typology proposed in section 4.1 provides a ‘five-dimensional space’ in which each model can be placed and compared with others. Several models are specified mathematically and can be simulated to produce quantitative predictions. Quantitative description and simulations of early models were limited because of the limited power of computational technology. For example, Bridgeman’s model contains a single layer to represent all visual areas and processes. Weisstein’s model does not incorporate a spatial (or retinotopic) layout. More recent models, such as the BCS and RECOD, incorporate such details and thus provide a relatively more sophisticated representation of visual processes. Nevertheless, simulation of these extended models requires extensive computational power and easily reaches the limits of current computing technology. It is interesting to note that some fundamental concepts can be found in a variety of theories and models, albeit sometimes following different formalisms. For example, a role for simultaneous brightness contrast in masking can be found in explanations offered by McDougall (1904a), Stigler (1910), Ganz (1975), and Anbar and Anbar (1982) (see Appendix A, section A.2), among others. Lateral inhibition has been used as a neural mechanism to explain simultaneous brightness contrast, and thus several models using lateral inhibition incorporate explicitly or implicitly a similar approach although the anatomical locus of this inhibition varies widely among models, ranging from retina to cortex. We have seen that a variety of mechanisms can generate the basic U-shaped characteristics of pattern masking. However, to show the generality of a model it is necessary to compare its predictions with a broad range of data.
Although many models can generate the basic U-shaped function, they fail to generalize when the effects of other variables, such as background luminance or color, on masking functions are considered. While some experimental findings can be handled by parametric variations or simple add-on hypotheses to a given model, in our opinion dissociation and double-dissociation phenomena between the visibility and masking effectiveness of a stimulus are critical in testing models because they put very strong constraints on mechanisms and processes underlying perceptual
effects. As discussed throughout this book, very few models can accommodate such dissociation phenomena.
Note
1. X and Y neurons correspond to two major classes of retinal ganglion cells identified in the cat (Enroth-Cugell and Robson 1966). Among the differences in their response characteristics are the latencies and transience. Y cells have shorter latencies and respond more transiently than X cells. In Chapter 5, section 5.2.1, we review the properties of primate P and M cells with long-latency sustained and short-latency transient responses. However, although X/Y and P/M classifications have several similarities, the correspondence is not one-to-one (see Chapter 5, note 1).
Chapter 5
The sustained–transient channel approach to visual masking: an updated model
5.1. Introduction
The fact that a mask presented later in time can be optimal in suppressing the visibility of a target that has already reached the cortex has been a puzzling aspect of masking. If we conceptualize neural and perceptual phenomena as instantaneous events, then notions of the arrow of time and causality appear to be at stake. However, if we take into consideration the fact that transmission and processing of a stimulus take time, then an after-coming mask need not work backwards in time: It simply interferes with the ongoing response to the target. This is the fundamental rationale of single-channel models. Dual-channel models add another dimension to this framework by highlighting that a given stimulus can activate more than one process and therefore the timing relations need to be analyzed in terms of not only the timing of stimuli but also the relative timing of these processes. In this chapter, we will review two dual-channel models. We note that the models proposed by Matin (1975) and Weisstein et al. (1975) also adopt a dual-channel or dual-process structure, as does the perceptual retouch approach (Bachmann 1984). Although these models bear some resemblance to the models outlined below, we will not discuss them here, since we have already done so in Chapter 4, sections 4.1.2, 4.2.2, and 4.3.2.
5.2. Parvocellular/magnocellular pathways and sustained/transient channels in the primate visual system
The two models to be discussed connect their structures directly with parvocellular (P) and magnocellular (M) afferent pathways as well as
THE SUSTAINED–TRANSIENT CHANNEL APPROACH
with sustained and transient channels. Therefore, to introduce the necessary background, we will first present a brief review of the neurophysiological and psychophysical bases of these pathways and channels.
5.2.1. Parvocellular and magnocellular afferent pathways
Two major types of ganglion cells have been identified in the primate retina (e.g. Gouras 1969; De Monasterio 1978a,b; Kaplan et al. 1990; Croner and Kaplan 1995; Lee 1996; Silveira et al. 2005): The P retinal ganglion cells project to the parvocellular layers of the lateral geniculate nucleus (LGN), while the M retinal ganglion cells project to the magnocellular layers of the LGN. P and M cells are differentiated based on their anatomical and physiological properties (see notes 1 and 2). Anatomically, M cells correspond to parasol cell types. The large majority of P cells correspond to midget cells, and the remainder are small bistratified cells or are of unknown type (Lee 1996). P and M cells make up approximately 75 percent and 10 percent of retinal ganglion cells, respectively. Parasol cells have larger dendritic fields than the midget cells. The dendritic field sizes of both types increase with retinal eccentricity (Dacey and Petersen 1992; Perry et al. 1984; Watanabe and Rodieck 1989). Similarly, the receptive field sizes of both M and P cells increase with increasing retinal eccentricity, with the M cell receptive field center radius being about twice that of P cells (Croner and Kaplan 1995). On the other hand, according to Croner and Kaplan (1995), the peak sensitivities of center and surround regions co-vary with receptive field size so as to produce a constant integrated contrast sensitivity across the visual field. The contrast gain of M cells is about six times greater than that of P cells (Croner and Kaplan 1995). M cells have shorter latencies and respond more transiently than P cells. Most P cells exhibit spectral selectivity and opponency, while most M cells lack these properties. M cells have a low contrast threshold and a high luminance-contrast gain and saturate at relatively low contrast, while the responses of P cells have higher contrast thresholds, a low contrast gain, and a linear dependence on contrast (Kaplan and Shapley 1986).
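The contrast-response differences just described can be made concrete with a small sketch. The functional forms and numbers below are assumptions chosen for illustration, not fits to the cited data: a saturating hyperbolic function stands in for an M cell, a straight line stands in for a P cell, and the hypothetical parameters r_max, c50, and gain are set so that the low-contrast gain of the M function is six times that of the P function.

```python
def m_cell_response(contrast, r_max=60.0, c50=0.25):
    """Saturating (hyperbolic) contrast response standing in for an M cell.
    r_max and c50 are hypothetical illustration values."""
    return r_max * contrast / (contrast + c50)


def p_cell_response(contrast, gain=40.0):
    """Linear contrast response standing in for a P cell.
    gain is a hypothetical illustration value."""
    return gain * contrast


# Contrast gain near zero contrast, estimated as response per unit contrast.
small = 1e-6
m_gain = m_cell_response(small) / small
p_gain = p_cell_response(small) / small
gain_ratio = m_gain / p_gain

# Response increments over the lower and upper halves of the contrast range.
m_low_half = m_cell_response(0.5) - m_cell_response(0.0)
m_high_half = m_cell_response(1.0) - m_cell_response(0.5)
p_low_half = p_cell_response(0.5) - p_cell_response(0.0)
p_high_half = p_cell_response(1.0) - p_cell_response(0.5)

print("low-contrast gain ratio M/P:", round(gain_ratio, 2))
print("M increments (low, high half):", m_low_half, m_high_half)
print("P increments (low, high half):", p_low_half, p_high_half)
```

In this sketch the M function gains most of its response over the lower half of the contrast range and adds little over the upper half, whereas the P function adds the same amount over each half. That is the qualitative contrast between saturating and linear behavior described in the text.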
M cells project to layers 1–2 of LGN (magnocellular layers) while P cells project to layers 3–6 (parvocellular layers) (Perry et al. 1984). The neuronal receptive field organization and response properties in LGN closely mimic those of corresponding retinal cells, although some
PARVOCELLULAR/MAGNOCELLULAR PATHWAYS
differences exist. The optimal temporal frequency for LGN cells is lower than that for retinal ganglion cells (Derrington and Lennie 1984; Kremers et al. 1997). The magnocellular pathway projects to layers 4Cα and 4B of the primary visual cortex, while the parvocellular pathway projects to layer 4Cβ. Neurochemical data in the owl monkey suggest that M axons projecting to V1 have more interactions with GABAergic interneurons than P axons (Shostak et al. 2003). The parvocellular and magnocellular pathways exhibit complementary spatiotemporal frequency tuning, with the former preferring high spatial and low temporal frequencies, and the latter preferring low spatial and high temporal frequencies (Derrington and Lennie 1984). There have been several studies of how the ratio of the number of P cells to the number of M cells varies with eccentricity. Although early results were somewhat equivocal (e.g. Livingstone and Hubel 1988), more recent data suggest that the P/M ratio decreases with increasing eccentricity from 35:1 at the fovea to 5:1 at an eccentricity of 15° (Azzopardi et al. 1999).
5.2.2. Cortical pathways
Although the segregation of afferent magnocellular and parvocellular pathways is well established, whether such segregation is maintained at the cortical level has been under debate. Livingstone and Hubel (1988) suggested a clear segregation with three subdivisions: (1) magnocellular → 4Cα → 4B projecting to thick stripes of V2; (2) parvocellular → 4Cβ projecting to V1 interblobs, which in turn project to pale stripes in V2; (3) parvocellular (magnocellular?) → 4C projecting to blobs, which in turn project to thin stripes in V2. They further suggested that the parvocellular and magnocellular systems could remain segregated by projecting to the ventral and dorsal systems, respectively. The occipito-parietal dorsal and occipito-temporal ventral systems have been associated with spatial (‘where’) and object (‘what’) vision, respectively (Ungerleider and Mishkin 1982). Subsequent reinterpretations associate these systems with vision-for-action and vision-for-perception (Milner and Goodale 1995) and with behavioral engagement in near space (personal/peripersonal) and far space (extrapersonal) (Previc 1990, 1998).
Several lines of evidence suggest that at the cortical level parvocellular and magnocellular afferents interact (Ferrera et al. 1992; Nealy and Maunsell 1994; Sawatari and Callaway 1996; Van Essen et al. 1992) but the loci and degree of their interactions are not fully established (Martin 1992; Sincich and Horton 2002, 2005). Neuro-anatomical data indicate that the magnocellular afferents provide the dominant, but not exclusive, inputs to the dorsal (‘where’) pathway whereas parvocellular afferents provide the dominant, but not exclusive, inputs to the ventral (‘what’) pathway (Yabuta and Callaway 1998).
5.2.3. Transmission and processing latencies
Several studies have investigated transmission and processing latencies in the primate visual system (Bair et al. 2002; Maunsell and Gibson 1992; Maunsell et al. 1999; Nowak and Bullier 1997; Nowak et al. 1995; Raiguel et al. 1989; Schmolesky et al. 1998). Measuring processing latencies, or dynamics, is inherently a difficult problem from both the input and output perspectives: Neurons differ in their selectivity to stimulus parameters and therefore selecting the appropriate input for comparison requires ad hoc criteria. Moreover, without a detailed knowledge of how information is encoded in time, the selection of response criterion for comparison can also bias the results. For example, it is conceivable that a particular neuron starts responding earlier but takes longer to process information than another neuron. In this case, comparing onset latencies would provide an artificial timing advantage for this neuron. Nevertheless, comparison of response timing in different parts of the visual system provides a rough estimate for transmission and processing latencies as well as information processing dynamics. For example, Schmolesky et al. (1998) investigated onset latencies of neurons in a variety of visual areas in anesthetized macaque monkeys. The stimulus consisted of a 500 ms pulse. Other characteristics, such as color, orientation, and size, were selected so as to provide optimal stimulation to the neuron under study. Figure 5.1 shows the distribution of onset latencies for different areas. It can be seen that the magnocellular pathway has an average 20-ms advance over the parvocellular pathway as measured in LGN. The earliest average onset latency was found for magnocellular LGN neurons (33 ± 3.8 ms) while the slowest response was observed in V4 (104 ± 23.4 ms) (data expressed as mean ± SD).
[Figure 5.1: ‘Latencies across the visual system’; cumulative latency curves for M LGN, P LGN, V1, V2, V3, V4, MT, MST, and FEF. Abscissa: time from stimulus onset (ms), 30–120. Ordinate: percentile, 0–1.0.]
Fig. 5.1 Percentage of cells that respond to the onset of a visual pulse input as a function of the response onset delay with respect to stimulus onset. Each cumulative histogram corresponds to a specific area of the macaque visual system as marked on the figure. (Reproduced from Schmolesky et al. 1998.)
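The latency relations between the magnocellular and parvocellular stages can be checked against the onset latencies printed in the panels of Figure 5.2 (Schmolesky et al. 1998). The short calculation below uses only those ten-panel values for LGN, V1 layer 4C, and V2. They come from single representative neurons, so they need not match the population means quoted above, and the variable names are illustrative.

```python
# Onset latencies (ms) of the representative neurons in Figure 5.2.
stages = ["LGN", "V1 (layer 4C)", "V2"]
m_latency_ms = [29, 44, 59]  # magnocellular route: panels (a), (c), (e)
p_latency_ms = [56, 67, 82]  # parvocellular route: panels (b), (d), (f)


def successive_differences(values):
    """Delay added between each stage and the next one."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


m_stage_delays = successive_differences(m_latency_ms)  # [15, 15]
p_stage_delays = successive_differences(p_latency_ms)  # [11, 15]

# Lead of the magnocellular over the parvocellular neuron at each stage.
p_minus_m = [p - m for m, p in zip(m_latency_ms, p_latency_ms)]  # [27, 23, 23]

for stage, lead in zip(stages, p_minus_m):
    print(f"{stage}: M leads P by {lead} ms")
print("M stage-to-stage delays (ms):", m_stage_delays)
print("P stage-to-stage delays (ms):", p_stage_delays)
```

For these example cells each stage adds 11 to 15 ms, and the magnocellular neuron leads its parvocellular counterpart by 23 to 27 ms, in rough agreement with the differential of about 20 ms reported for the population data.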
A representative set of responses is shown in Figure 5.2. Figures 5.2(a), 5.2(c), and 5.2(e) illustrate how signals would propagate through the magnocellular afferents. A brief short latency transient response to stimulus onset (and offset in LGN and V2) propagates with an additional delay of approximately 15 ms added at each subsequent stage. In comparison, Figures 5.2(b), 5.2(d), and 5.2(f) show the parvocellular equivalents, which exhibit a longer latency and more sustained responses. Similarly, onset latencies in each subsequent area are delayed by 10–15 ms, thus maintaining a differential latency of approximately 20 ms between the M and P streams as one ascends through the hierarchy. Figures 5.2(g)–(j) show that areas in the middle tier of the dorsal pathway exhibit approximately the same onset latencies.
5.2.4. Sustained and transient channels
During the last three decades, evidence deriving from several paradigms, including spatiotemporal contrast sensitivity, visual reaction time, uniform field flicker masking, and visual pattern masking, has provided a firm foundation for the existence of sustained and transient channels in primates (reviewed by Breitmeyer 1992). Sustained channels are more sensitive to high spatial and low temporal frequencies and transient channels exhibit a complementary behavior. Before proceeding to review the theoretical and empirical aspects of human sustained and transient visual processing as studied by these paradigms,
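[Figure 5.2: ten panels of spike rasters and response histograms, each labelled with area and estimated onset latency: (a) LGNd M, 29 ms; (b) LGNd P, 56 ms; (c) V1 4Cα, 44 ms; (d) V1 4Cβ, 67 ms; (e) V2, 59 ms; (f) V2, 82 ms; (g) MT, 72 ms; (h) V3, 72 ms; (i) MST, 74 ms; (j) FEF, 80 ms. Abscissa: time from stimulus onset (ms), 0–1000.]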
Fig. 5.2 Responses to a pulse input of a representative set of neurons in different visual areas of the macaque visual system. The two horizontal bars at the top of the figure show the stimulus timing, with the dark region corresponding to the 500 ms during which the stimulus was applied. Responses to individual trials are shown by tick marks that represent the timing of action potentials. At the top of individual trials, the average density of action potentials is shown as a function of time, with each division along the ordinate corresponding to 50 Hz. Periods of significant activation, as determined by a spike train analysis, are highlighted by horizontal lines above the tick marks. The arrow on the time axes in each panel shows the estimated onset latency of the cell. The exact value of this estimate is given at the top right corner of each panel. Panels (a), (c), and (e) illustrate how signals would propagate through the magnocellular afferents. Panels (b), (d), and (f) show the parvocellular equivalents. Panels (g)–(j) show areas in the middle tier of the dorsal pathway. (Reproduced from Schmolesky et al. 1998.)
the following caveat is in order. Any claim for a clear functional dichotomization of sustained and transient channels is explicitly denied. As will become evident below, in the past sustained and transient channels have been identified, often misleadingly, as ‘pattern’ and ‘motion’ detectors, ‘high spatial frequency’ and ‘low spatial frequency’ detectors, or ‘pattern’ and ‘flicker’ detectors, respectively. Although such distinctions serve as first and useful approximations, these distinctions are now believed to be not absolute but relative, in the sense that in regard to a particular function or response property one type of channel merely outperforms the other. We will revisit this issue at the end of the section.
5.2.4.1. Spatiotemporal contrast sensitivity
Tolhurst (1973) investigated the contrast sensitivity to stationary vertical gratings and gratings either drifting or modulated sinusoidally in counterphase at a rate of 5 Hz. The spatial frequencies employed ranged from about 0.3 to 10.0 c/deg. On the assumption that transient channels are more selective for movement and for lower spatial frequencies than sustained channels, we would expect drifting or counterphase gratings to yield a higher contrast sensitivity at lower spatial frequencies than at higher ones. This expectation was confirmed in that, relative to stationary gratings, presumably detected by sustained channels, drifting and counterphase gratings yielded an increase in contrast sensitivity only at spatial frequencies of 4 c/deg and below. Above 4 c/deg the contrast sensitivities obtained with stationary and non-stationary gratings did not differ. Similar results have been reported by Kulikowski and Tolhurst (1973) and Kulikowski (1975). In these studies, subjects were asked to set two contrast thresholds of gratings modulated in counterphase at rates ranging from 3.5 to 8 Hz and ranging in spatial frequency from about 0.5 to 20.0 c/deg. One threshold setting required the subjects to detect any temporal change in the stimulus display (e.g. flicker or motion), and the other required the subjects to detect the spatial structure or pattern of the counterphase grating. A third condition was used in which subjects were asked to set the contrast threshold for stationary gratings. From Tolhurst’s (1973) study it can be inferred that the latter threshold settings measure the contrast sensitivity of sustained channels. Figure 5.3 shows the sensitivity ratios obtained for the movement and
[Figure 5.3 panels: (a) Subject J. J. K.; (b) Subject C. F. Ordinate: contrast sensitivity ratio (1.0 to 2.0); abscissa: spatial frequency (c/deg), 0.3 to 30.]
Fig. 5.3 The sensitivity ratio for threshold pattern recognition and flicker detection as a function of spatial frequency. The full symbols represent the sensitivity ratio of flicker detection of counterphase gratings relative to pattern detection of stationary gratings, and the open symbols represent the sensitivity ratio of pattern detection of counterphase gratings relative to pattern detection of stationary gratings. (Reproduced from Kulikowski and Tolhurst 1973.)
pattern thresholds of the counterphase gratings relative to the contrast threshold for the stationary gratings. It should be noted that the pattern sensitivity ratio is almost 1.0 at all spatial frequencies of the counterphase grating. However, the movement sensitivity ratio is roughly 2.0 at the lowest spatial frequencies and does not attain a value of 1.0 until spatial frequencies of about 5 c/deg and above are reached. This indicates that movement or transient channels respond preferentially to the lower range of spatial frequencies, whereas sustained channels respond preferentially to intermediate and higher spatial frequencies, although they can also respond to the lower range of spatial frequencies (see section 5.2.4.3). Similar conclusions have been reached by Nagano (1980), who measured the duration threshold of gratings as a function of their spatial frequency and contrast. He also used the two-threshold criterion employed by Kulikowski and Tolhurst (1973) and found that the sensitivity ratio of the transient to the pattern criterion
was well above 1.0 for spatial frequencies ranging from 0.25 to 4.0 c/deg and was equal to 1.0 for higher spatial frequencies. The above experiments employed the method of adjustment in which the observer adjusts the contrast of a test grating until it appears to give rise to a flicker sensation or a pattern sensation, depending on which criterion content is adopted.3 A psychophysical result related to those of Nagano (1980), showing differential contrast sensitivity of sustained and transient channels in human vision, was reported by Breitmeyer and Julesz (1975) (see also Kelly 1973; Tulunay-Keesey and Bennis 1979). These investigators measured contrast sensitivity to vertical sinusoidal gratings under two presentation conditions. In one condition the gratings were presented for 480 ms with an abrupt on- and offset, and in the other condition the gratings were presented with a slow 200-ms ramped on- and offset, thus eliminating temporal transients. Relative to the latter condition, the former condition, which retained abrupt transients at on- and offset, produced about a twofold increase in contrast sensitivity at spatial frequencies ranging from 0.5 to 4.0 c/deg. At higher spatial frequencies no difference in contrast sensitivity between the two conditions was observed. Here, presumably only sustained channels were used in the detection of the gratings. This result corroborates those of the above investigators. In summary, the experiments cited above are in agreement regarding the range of spatial frequencies to which transient and sustained channels preferably respond. Moreover, the results indicate that at low spatial frequencies transient channels are characterized by a lower contrast threshold than sustained channels, results consistent with the single-cell studies reviewed in section 5.2.1. In another study, King-Smith et al. 
(1976) measured the contrast threshold of a narrow (0.05° × 2.0°) vertical line as a function of the oscillation frequency with which it moved to and fro in a left-to-right direction over an amplitude of 0.05°. Relative to the contrast threshold of a stabilized line, the contrast threshold was lowered at oscillation frequencies of about 0.1–8 Hz but increased at higher frequencies. This range of oscillation frequencies corresponds to velocities ranging roughly from 0.01 to 0.8°/s. Since the target line was only 3′ wide, for the most part its fundamental and higher spatial frequency components were all above 10.0 c/deg. Consequently, on the basis of the above studies, it should activate mainly sustained channels. Given this
reasonable assumption, it can be inferred that sustained channels activated by the line are selectively sensitive to fairly low velocities, whereas transient channels responding to, say, a 0.3-c/deg grating drifting at a rate of 8 Hz would be selectively sensitive to a velocity of at least 25.6°/s, a conclusion consistent with the psychophysical findings reported by Harris (1980) (see also Butler et al. 1976). Harris found that pattern sensitivity exceeded flicker sensitivity at grating drift velocities below 1°/s whereas the reverse was true at higher velocities. Several electrophysiological studies on humans point to the existence of sustained and transient channels. Kulikowski (1974) measured the cortical visually evoked potential to a counterphase grating before and after prolonged adaptation to the same stationary grating. Since the latter grating contained no temporal transients, it selectively adapted sustained channels but left transient channels unaffected. Consequently, Kulikowski (1974) found little, if any, difference between the shape and form of the counterphase evoked potential produced prior to and after selective adaptation of sustained channels. Related and similar psychophysical results have been reported by Bodis-Wollner and Hendley (1977, 1979). Based on the above results showing that transient channels prefer rapid movement and abrupt onsets and offsets, it can be inferred that other temporal transients such as flicker also affect sustained and transient channels differentially. Tulunay-Keesey (1972) measured flicker and pattern thresholds for a vertical line (0.067° × 1.0°) at flicker frequencies ranging from 0.3 to 30 Hz. Here subjects were required to detect flicker (regardless of pattern detail) and pattern detail (e.g. line orientation), respectively. She found that flicker sensitivity was higher than pattern sensitivity for practically the entire range of flicker frequencies.
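The velocity figures quoted above for the oscillating line and the drifting grating follow from simple arithmetic, which the following minimal Python sketch makes explicit. It rests on two assumptions that are not stated in the text: that the quoted oscillation amplitude is the full excursion of the line in degrees of visual angle, so that the line covers it twice per cycle, and that the velocity of a drifting grating is its temporal frequency divided by its spatial frequency. The function names are illustrative only.

```python
def mean_oscillation_speed(excursion_deg, frequency_hz):
    """Mean speed (deg/s) of a line swept to and fro once per cycle.

    Assumption: one cycle covers the excursion twice (there and back),
    so the mean speed is 2 * excursion * frequency.
    """
    return 2.0 * excursion_deg * frequency_hz


def drift_velocity(temporal_hz, spatial_cpd):
    """Velocity (deg/s) of a drifting grating: temporal / spatial frequency."""
    return temporal_hz / spatial_cpd


if __name__ == "__main__":
    # Oscillating line: 0.05 excursion at the two ends of the 0.1-8 Hz range.
    for f in (0.1, 8.0):
        print(f, "Hz ->", round(mean_oscillation_speed(0.05, f), 3), "deg/s")
    # Drifting grating: 0.3 c/deg at 8 Hz.
    print("0.3 c/deg at 8 Hz ->", round(drift_velocity(8.0, 0.3), 1), "deg/s")
```

Under these assumptions the oscillating line spans mean speeds of about 0.01 to 0.8 deg/s, matching the range given in the text, and the drifting grating moves at roughly 27 deg/s, of the same order as the lower bound quoted above.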
Flicker sensitivity was greatest at frequencies ranging from about 2.0 to 15.0 Hz. The two sensitivity functions tended to converge at lower and higher frequencies. This result again shows that flicker or transient channels have a lower threshold than pattern or sustained channels, particularly for intermediate to high flicker rates. In contrast, sustained channels seem to prefer low flicker frequencies or higher ones nearing or exceeding the critical flicker frequency of transient channels, above which the flickering line at threshold would eventually appear as a non-flickering, sustained stimulus.
The spatial response profiles of flicker and pattern channels can also be assessed psychophysically using subthreshold summation techniques. King-Smith and Kulikowski (1973, 1975) measured the effect that two parallel narrow (1.2′ wide) subthreshold lines had on the threshold detectability of a central line of the same width. All three lines were flickered in phase at a rate of 12 Hz. Separate flicker and pattern thresholds were obtained as a function of the spatial separation of the two flanking lines from the central test line. Figure 5.4 shows the results obtained. Note again (Fig. 5.4(a)) that the sensitivity of the flickering line detector is greater than that of the pattern line detector. Furthermore, the spatial response profiles of both detectors are characterized by a central summative or excitatory region flanked by two symmetric subtractive or inhibitory regions, similar to what is found in single-cell studies of visual receptive fields. However, note also that the spatial response profile of flicker or transient detectors has an overall greater spatial extent and a relatively weaker surround inhibitory region than the response profile of pattern or sustained detectors. This difference corresponds to the difference between the sizes and response
[Figure 5.4 axes: (a) line contrast sensitivity as a function of distance from centre (min); (b) grating contrast sensitivity as a function of spatial frequency (c/deg).]
Fig. 5.4 (a) Spatial response profiles and (b) spatial frequency sensitivity of flicker (broken lines) and pattern (solid lines) detectors. (Reproduced from King-Smith and Kulikowski 1975.)
gradients of the transient- and sustained-cell receptive fields discussed in section 5.2.1. Figure 5.4(b) shows the respective spatial contrast sensitivity functions derived from the response profiles in Figure 5.4(a) by applying the Fourier transform. Note again that the flicker or transient channel is more sensitive at lower than at higher spatial frequencies, with a peak at a frequency of about 1.0 c/deg. On the other hand, the sensitivity of the pattern or sustained channel peaks at a spatial frequency of about 5.0 c/deg, also consistent with the corresponding electrophysiologically measured differences between sustained and transient neurons discussed in section 5.2.1. Moreover, employing the same subthreshold technique, King-Smith and Kulikowski (1975) also psychophysically measured the spatial response profiles of pattern and flicker detectors as a function of flicker frequencies ranging from 1.0 to 24.0 Hz. Although no variation in the response profiles of the pattern detectors was observed as flicker frequency varied, the response profiles of flicker detectors decreased in magnitude and increased in width as flicker frequency increased. This suggests that in human vision flicker or transient channels sensitive to higher flicker frequencies also have larger receptive fields. Such a trend has not yet been identified at the single-cell level. A psychophysical trend which is consistent with known electrophysiological results (section 5.2.1) is that human transient mechanisms increase in size as a function of retinal eccentricity (Wilson 1980). As a consequence of the findings of King-Smith and Kulikowski (1975), it can be inferred that flicker sensitivity also becomes greater at larger eccentricities. Hartmann et al. (1979) found that the critical flicker frequency, especially for larger stimuli, increases as the locus of stimulation is moved from the fovea to the 10–20° periphery, and then declines.
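The step from a spatial response profile to a spatial frequency sensitivity function, as in Figure 5.4, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each profile is a difference of two Gaussians (an excitatory centre minus a surround three times as wide) and takes its cosine transform numerically. The widths and surround gains below are hypothetical values chosen to mimic a narrow profile with strong surround inhibition and a broad profile with weaker surround inhibition; they are not fitted to the data of King-Smith and Kulikowski (1975).

```python
import math


def dog_profile(x_deg, sigma_c, surround_gain, ratio=3.0):
    """Difference-of-Gaussians profile built from unit-area Gaussians.

    sigma_c (deg), surround_gain, and ratio are illustrative assumptions.
    """
    def unit_gauss(x, s):
        return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

    return unit_gauss(x_deg, sigma_c) - surround_gain * unit_gauss(x_deg, ratio * sigma_c)


def sensitivity(freq_cpd, sigma_c, surround_gain, half_width=1.5, dx=0.002):
    """Numerical cosine (Fourier) transform of the even profile at one frequency."""
    n = int(round(half_width / dx))
    total = 0.0
    for i in range(-n, n + 1):
        x = i * dx
        total += dog_profile(x, sigma_c, surround_gain) * math.cos(2.0 * math.pi * freq_cpd * x)
    return total * dx


def peak_spatial_frequency(sigma_c, surround_gain):
    """Spatial frequency (c/deg) of maximum sensitivity on a 0.25 c/deg grid."""
    freqs = [0.25 * k for k in range(1, 49)]  # 0.25 to 12 c/deg
    return max(freqs, key=lambda f: sensitivity(f, sigma_c, surround_gain))


if __name__ == "__main__":
    # Narrow centre, strong surround (hypothetical "pattern" profile).
    print("pattern-like peak:", peak_spatial_frequency(0.025, 1.0), "c/deg")
    # Broad centre, weaker surround (hypothetical "flicker" profile).
    print("flicker-like peak:", peak_spatial_frequency(0.1, 0.5), "c/deg")
```

With these assumed parameters the broad, weakly inhibited profile yields a sensitivity function that peaks at a much lower spatial frequency than the narrow, strongly inhibited one, which is the qualitative relation between the two curves described for Figure 5.4(b).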
Additional psychophysical evidence indicates that sustained pattern detectors also increase in size with retinal eccentricity. Enoch et al. (1970a,b) and Ransom-Hogg and Spillmann (1980) found that the sustained Westheimer function (Westheimer 1967) increases in spatial extent with eccentricity. Moreover, Rijsdijk et al. (1980) reported that the contrast sensitivity to stationary gratings decreases with eccentricity, and correspondingly Lie (1980) and others (Kerr 1971; Wertheim 1894) replicated the now well-known finding that visual pattern resolution decreases as eccentricity increases.
Another psychophysical means of demonstrating the existence and differential properties of human sustained and transient channels is to measure spatial frequency contrast sensitivity as a function of grating duration. Such contrast sensitivity functions have been reported by Schober and Hilz (1965), Nachmias (1967), and Legge (1978). In general, contrast sensitivity at all spatial frequencies increases up to a limiting value as duration increases. However, whereas a low-frequency attenuation of sensitivity is present at longer durations, it is absent at shorter ones where only a high-frequency attenuation prevails. If we were to measure changes in contrast threshold as a function of duration for a range of spatial frequencies, we could obtain the duration–contrast reciprocity function, which specifies temporal integration in spatial frequency channels according to Bloch’s law. This empirical law states that, at threshold, contrast and duration can be traded off reciprocally up to a critical duration. Breitmeyer and Ganz (1977) measured duration–contrast reciprocity functions for vertical sinusoidal gratings at spatial frequencies of 0.5, 2.8, and 15.0 c/deg. The results indicate that the critical duration for which duration–contrast reciprocity holds is determined by spatial frequency. The critical durations at frequencies of 0.5, 2.8, and 15.0 c/deg are 60 ms, 150 ms, and 200 ms, respectively; above these values threshold contrast seems to level off or decrease at only a low rate (Legge 1978). These results show that low spatial frequency transient channels are characterized by a shorter integration time than are higher spatial frequency sustained channels; this result was also reported by Legge (1978) and is consistent with Hood’s (1973) finding that optically blurred stimuli yield a shorter integration time than sharply focused ones. Similar psychophysical results derived from monkey have been reported by Harwerth et al. (1980).
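The duration–contrast reciprocity described above can be written as a small Python sketch. It is an idealization under stated assumptions: threshold contrast is taken to be exactly inversely proportional to duration up to the critical duration and exactly constant beyond it, whereas the text notes that thresholds may continue to decrease at a low rate. The critical durations are those quoted from Breitmeyer and Ganz (1977); the constant `k` is a hypothetical scale factor, and in practice it would differ between spatial frequencies.

```python
# Critical durations (ms) quoted in the text, keyed by spatial frequency (c/deg).
CRITICAL_DURATION_MS = {0.5: 60.0, 2.8: 150.0, 15.0: 200.0}


def threshold_contrast(duration_ms, critical_ms, k=1.0):
    """Idealized Bloch's law.

    Up to the critical duration, contrast * duration = k (reciprocity);
    beyond it the threshold is assumed to stay constant at k / critical_ms.
    k is a hypothetical constant, not a measured value.
    """
    return k / min(duration_ms, critical_ms)


if __name__ == "__main__":
    for sf, tc in CRITICAL_DURATION_MS.items():
        row = [round(threshold_contrast(d, tc) * d, 2) for d in (20.0, 60.0, 150.0, 400.0)]
        # The product contrast * duration stays at k until d exceeds tc.
        print(sf, "c/deg, contrast x duration:", row)
```

In this idealization the product of threshold contrast and duration remains constant for longer at 15.0 c/deg than at 0.5 c/deg, which is one way of expressing the longer integration time attributed to high spatial frequency sustained channels.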
Since, as discussed in sections 5.2.1 and 5.2.2, monkeys have anatomically and physiologically distinct populations of sustained and transient cells in their visual system, these and other analogies (see below) between human and monkey psychophysical results reinforce the interpretation of the human findings in terms of the sustained–transient channel approach. Psychophysical evidence also indicates that human transient channels are characterized by different impulse responses than those of sustained channels. Kelly (1971a) measured uniform field flicker sensitivity functions and, via the Fourier transform, derived the impulse response
corresponding to the flicker detector. He found that at high background luminance the impulse response could be characterized by an initial excitatory phase followed by an inhibitory phase, which in turn was followed by a smaller excitatory phase. The temporal interval τ_i between the primary and secondary excitatory phases of the damped oscillating impulse response defined the temporal frequency 1/τ_i at which peak flicker sensitivity was attained. However, as background luminance was decreased progressively, the duration of the impulse response and τ_i increased. Moreover, the inhibiting phases eventually attenuated until, at scotopic levels, only a monophasic excitatory component like that characterizing the impulse response of pattern detectors was evident. In other words, the transient flicker detectors become more sluggish or ‘sustained’ as background luminance decreases, a result consistent with single-cell recordings (B.B. Lee et al. 1990, 1994; Purpura et al. 1990). Rashbass (1970) used a subthreshold summation technique similar to that employed by Ikeda (1965) to measure the interaction of two impulse responses produced by two consecutively flashed subthreshold stimuli. He also found that the impulse response to transient changes of luminance can be characterized by a triphasic excitatory–inhibitory oscillation. Manahilov (1995) estimated the suprathreshold impulse response by using a brightness-matching technique and found a triphasic temporal profile. Similar oscillating functions have been empirically derived by Grossberg (1970) and Ueno (1977) using visual reaction time to two sequentially and briefly flashed light stimuli. On the assumption that human transient channels are the primary flicker and transient detectors, their impulse response ought to oscillate from excitation to inhibition as described above.
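The link between the shape of an impulse response and the flicker sensitivity function can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the oscillating impulse response is an exponentially damped cosine whose successive excitatory phases are separated by `interval_s`, and that the monophasic response is a plain exponential decay; neither is the function derived by Kelly (1971a), and the 100 ms values are hypothetical. The amplitude spectrum is obtained by a direct numerical Fourier transform.

```python
import math


def impulse_response(t_s, decay_s, interval_s=None):
    """Illustrative impulse response at time t_s (seconds).

    With interval_s given: damped cosine with excitatory phases interval_s apart.
    Without it: monophasic exponential decay. Both shapes are assumptions.
    """
    envelope = math.exp(-t_s / decay_s)
    if interval_s is None:
        return envelope
    return envelope * math.cos(2.0 * math.pi * t_s / interval_s)


def amplitude_spectrum(freq_hz, decay_s, interval_s=None, dt=0.001, total_s=1.0):
    """Magnitude of the numerical Fourier transform at one temporal frequency."""
    re = 0.0
    im = 0.0
    for k in range(int(round(total_s / dt))):
        t = k * dt
        h = impulse_response(t, decay_s, interval_s)
        re += h * math.cos(2.0 * math.pi * freq_hz * t)
        im -= h * math.sin(2.0 * math.pi * freq_hz * t)
    return math.hypot(re, im) * dt


def peak_temporal_frequency(decay_s, interval_s=None):
    """Temporal frequency (Hz) of maximum amplitude on a 0-30 Hz integer grid."""
    return max(range(0, 31), key=lambda f: amplitude_spectrum(f, decay_s, interval_s))


if __name__ == "__main__":
    print("oscillating response peak:", peak_temporal_frequency(0.1, 0.1), "Hz")
    print("monophasic response peak:", peak_temporal_frequency(0.1), "Hz")
```

Under these assumptions the oscillating response gives a band-pass spectrum peaking near the reciprocal of the interval between excitatory phases, while the monophasic response gives a low-pass spectrum with its maximum at the lowest frequency, in line with the loss of low-frequency attenuation described for sustained, pattern-like responses.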
Moreover, by employing the subthreshold technique, measures of the impulse responses of transient and sustained channels can be obtained by varying the spatial frequency of a grating from low to high values. Breitmeyer and Ganz (1977) and Watson and Nachmias (1977) used this technique and found distinct differences between impulse responses at low and high spatial frequencies. The results of the latter study are shown in Figure 5.5. Note that at the lowest spatial frequency the interaction of the two subthreshold impulse responses can be characterized as an initial summative or excitatory phase followed by a weaker subtractive or inhibitory phase. As spatial frequency increases, the inhibitory phase attenuates and the impulse response assumes a monophasic excitatory shape. A similar result
[Figure 5.5 panels: 1.7, 3.5, 7.0, and 10.5 c/deg. Ordinate: summation index (−0.4 to 1.0); abscissa: onset asynchrony (ms), 0 to 200.]
Fig. 5.5. Subthreshold impulse response summation curves for four grating spatial frequencies. Note the biphasic facilitatory–inhibitory response at the lowest spatial frequency. The inhibitory phase attenuates at higher frequencies and is not evident at the highest one. (Reproduced from Watson and Nachmias 1977.)
was obtained in Kelly’s (1971b) investigation of impulse response functions derived from psychophysical data on contrast sensitivity to temporal counterphase modulation of gratings. With increasing spatial frequency, the low temporal-frequency attenuation characterizing uniform field flicker and counterphase modulation at low spatial frequencies (Kelly 1972) drops out. The corresponding effect on the impulse response is to eliminate the inhibitory phases and thus the oscillation of the temporal impulse response as spatial frequency increases. Impulse responses estimated by a motion discrimination task exhibit the same trend (Stromeyer and Martini 2003). It can be inferred from these results that human transient channels show a multiphasic oscillation of excitation alternating with inhibition, whereas sustained channels show only a single excitatory phase.
5.2.4.2. Visual reaction time
Several psychophysical studies investigated the dependence of visual reaction time on stimulus dimensions that allow selective
activation of parvocellular vs. magnocellular systems and sustained vs. transient channels. Breitmeyer (1975) measured the visual reaction time to the onset of a 50-ms presentation of sinusoidal gratings at a contrast of about 60 percent and at spatial frequencies ranging from 0.5 to 11.0 c/deg. He found that reaction times over this range increased monotonically from about 200 to 350 ms. Since contrast sensitivity declines at higher spatial frequencies, Breitmeyer (1975) also measured reaction times to the same gratings but with their individual physical contrasts adjusted so that their apparent contrasts were equal. Despite this canceling out of differences in contrast sensitivity, Breitmeyer (1975) found that the reaction times increased by about 40 ms over the same range of spatial frequencies employed. Similar and related psychophysical results have subsequently been reported by several other investigators (Breitmeyer et al. 1981a; Levi et al. 1979; Lupp et al. 1976; Tartaglione et al. 1975; Vassilev and Mitov 1976). These findings are indicative of a shorter response latency of low spatial frequency, transient as compared with high spatial frequency, sustained channels. Moreover, similar increases in reaction times as a function of spatial frequency can be obtained when a subject is asked to respond to the sudden offset or contrast reversal of a grating (Breitmeyer et al. 1981a; Long and Gildea 1981; Parker 1980). A more telling set of experiments which indicated that sustained and transient channels differ in response latency as a function of spatial frequency has been reported by Lupp (1977) and Lupp et al. (1978). The latter investigators used the following technique to measure visual reaction times to the onset of a sinusoidal grating. A grating at a spatial frequency of 1.0–16.0 c/deg and with a contrast 1.6 times above its threshold value was presented abruptly for 500 ms. 
The abrupt onset of a 1000-ms grating of the same respective spatial frequency and at a contrast of 0.6 times (i.e. below) its threshold value occurred at temporal intervals ranging from 500 ms before the onset of the suprathreshold grating to 250 ms after its onset. It was reasoned that the addition of the subthreshold grating would facilitate or increase the response rate of the respective spatial frequency channels and thus facilitate or decrease reaction time. The changes in reaction time as a function of onset asynchrony of the subthreshold gratings are shown for the 1.0, 2.0, 5.3, and 16.0 c/deg gratings in Figure 5.6. Note that at a spatial frequency of 1.0 c/deg, the subthreshold grating facilitates reaction time only transiently at asynchronies ranging
[Figure 5.6 panels: 1, 2, 5.3, and 16 c/deg. Ordinate: ΔRT (ms), −60 to 20; abscissa: ΔT (ms), −500 to 100.]
Fig. 5.6 Reaction time facilitation (ΔRT) produced by a subthreshold grating on a suprathreshold grating at four spatial frequencies and as a function of the interval of onset of the subthreshold grating relative to the suprathreshold grating. (Reproduced from Lupp et al. 1978.)
from 200 ms (subthreshold grating onset precedes suprathreshold grating onset) to about 50 ms (subthreshold onset follows suprathreshold onset). The largest facilitation was obtained at an asynchrony of about 30 to 40 ms. On the other hand, with higher spatial frequencies, the facilitation became more sustained; in particular, with the 5.3 and 16.0 c/deg gratings, reaction time facilitation was sustained over the entire 500 ms interval preceding the onset of the suprathreshold grating and was eliminated at an asynchrony of about 100 ms. These psychophysical results clearly demonstrate that the facilitation by low spatial frequency gratings occurs within transient channels, whereas the facilitation by high spatial frequency gratings occurs within sustained channels, and that the summation or integration time of transient channels, in line with the results reported by Breitmeyer and Ganz (1977) and Legge (1978), is shorter than that of sustained channels (see section 5.2.4.1). Harwerth and Levi (1978) investigated reaction times to vertical sine wave gratings as a function of spatial frequency and contrast, among other variables. For 500-ms flashes of gratings, the reaction time at low spatial frequencies (e.g. 0.5 c/deg) and high spatial frequencies (e.g. 12 c/deg) generally decreased continuously and exponentially with increases of contrasts that ranged from the respective threshold values to 45 percent. However, for intermediate spatial frequencies (1–8 c/deg) the decrease in reaction time with similar increases of contrast was characterized by a discontinuity revealing that one exponentially decaying function dominated up to a contrast value of about 5–10 percent, followed by another function which dominated at higher contrast values. Analogous psychophysical findings have been reported by Harwerth et al. (1980) in their study of monkey vision. 
The biphasic relationship between reaction time and contrast has been replicated and extended in subsequent studies (Felipe et al. 1993; Murray and Plainis 2003). Schwartz (1992) studied the distributions of reaction times in response to stimuli selectively activating achromatic and chromatic systems. Homochromatic pulses (white on white, 620 on 620 nm, or 540 on 540 nm) of 1 and 1.5 s were used to activate the achromatic system. Heterochromatic pulses (680, 620, 440, and 400 nm on white) were used to activate the chromatic system. Reaction time distributions for the achromatic conditions peaked earlier and were narrower than
those for the chromatic conditions. Moreover, reaction time distributions for the achromatic conditions were bimodal, reflecting stimulus on- and offset. On the other hand, the reaction time distributions for the chromatic conditions were unimodal reflecting stimulus onset. Based on the temporal response properties and chromatic sensitivities of parvocellular and magnocellular systems reviewed in section 5.2.1, these results support selective contributions of parvocellular and magnocellular systems to visual reaction time. The above psychophysical reaction time results also have been corroborated in several electrophysiological studies. Spehlmann (1965) demonstrated that the late-component waves of the human CVEP shifted to longer latencies as the size of the elements in a flashed checkerboard pattern became smaller. Similarly, several investigators (Jones and Keck 1978; Kulikowski 1977; Parker and Salzen 1977a,b, 1982; Vassilev and Strashimirov 1979) have investigated the latency of the CVEP to on- and offset of sinusoidal gratings varying in spatial frequency. Typical results showed that the latency of major positive-wave components of the evoked potential (e.g. P1, P2) increased by about 100 ms as spatial frequency increased from 0.5 to 10 c/deg or more. The relatively weaker N0–P0 components, which are particularly interesting because they are the earliest waves in the potential, can be evoked strongly only by low spatial frequency gratings (Jones and Keck 1978; Kulikowski 1977), and therefore may reflect the earlier activity of more sparsely distributed transient channels.4 Correlates of the spatial frequency selective and biphasic characteristic of reaction times as a function of contrast have also been observed in CVEP waveforms (Baseler and Sutter 1997; Hartwell and Cowan 1993; Murray and Kulikowski 1983; Rudvin et al. 2000). 
The fact that physiological measures of visual latency correspond well to psychophysical measures has been demonstrated for both neuroelectric and neuromagnetic potentials. Hartwell and Cowan (1993) plotted reaction times as a function of CVEP response times and found a linear relationship. This relationship was the same regardless of contrast, spatial frequency, and the temporal profile of the stimulus. Instead of measuring the cortically evoked neuroelectric potential, Williamson et al. (1977, 1978) measured the latency of the human cortical neuromagnetic response to 66 percent contrast gratings as a function of spatial frequency ranging from about 0.2 to 11.0 c/deg,
i.e. contrast and spatial frequency ranges approximating those employed by Breitmeyer (1975). Figure 5.7 shows the latency of the neuromagnetic response (left ordinate) and the reaction times (right ordinate) obtained in Breitmeyer’s (1975) study. Note the close correspondence between the neuromagnetic and psychophysical results. Both functions increase monotonically by about 150 ms over spatial frequencies ranging from 0.2 to 11.0 c/deg. Moreover, the neuromagnetic responses were about 120 ms shorter overall than the reaction times. This indicates that the motor component in Breitmeyer’s (1975) study comprised a constant 120 ms of the total reaction time at all spatial frequencies. Consequently, the motor component cannot be a source of the variation of reaction time with spatial frequency.
[Figure 5.7 legend: latency at 2fc for 8, 10, 13, 16, and 20 Hz; Breitmeyer's RT data. Left ordinate: latency (ms); right ordinate: reaction time (ms); abscissa: spatial frequency (c/deg), 0.1 to 10.]
Fig. 5.7 The latency of the human cortical neuromagnetic response (symbols, left ordinate) and psychophysical RT functions from Breitmeyer (1975) (solid lines, right ordinate) as a function of spatial frequency. (Reproduced from Williamson et al. 1978.)
All the above results, indicating that human transient channels respond and conduct faster than sustained ones and that the transient and sustained channels can be linked to magnocellular and parvocellular systems, buttress one of the major assumptions incorporated into the two models outlined in this chapter.
5.2.4.3. Effects of transient and flicker adaptation
Section 5.2.4.1 showed that adapting to a stationary grating does not affect transient mechanisms (Kulikowski 1974). However, since transient channels respond selectively to abrupt brief stimuli and flicker, and since the human visual system contains flicker-selective pathways (Nillson et al. 1975; Pantle 1971; Regan 1970; Sternheim and Cavonius 1972), we would expect that masking or adapting the visual system with such stimuli would selectively affect transient mechanisms. Legge (1978) used a procedure in which a brief (20 ms) mask flash of a grating immediately preceded and followed the same grating flashed at durations ranging from 20 to 3000 ms. Legge measured the effects of the preceding and following transient mask flash on the duration–contrast reciprocity function, at threshold, of the intervening test grating. Compared with the no-mask condition, the critical duration, for which duration–contrast reciprocity held, increased, particularly at low spatial frequencies. This result indicates that the preceding and following 20-ms flashes of the grating effectively attenuated or masked the transient responses at test on- and offset, thus leaving the sustained mechanisms, characterized by a longer integration time, to determine the critical duration of the duration–contrast reciprocity function. Related results have been reported by Breitmeyer et al. (1981a). These investigators studied the masking effects of 6 Hz uniform field flicker on a variety of psychophysical responses including on- and offset reaction time, visual persistence, contrast sensitivity, and reaction time to near-threshold gratings. The spatial frequencies tested ranged from 0.25 to 15.0 c/deg. Some of the main results can be summarized as follows.
Compared with the no-mask condition in which a non-flickering steady background field was used, the 6 Hz uniform field flicker mask selectively and dramatically increased both on- and offset reaction times for spatial frequencies ranging from 0.25 to about 4.0 c/deg. The effect was obtainable under both monoptic and dichoptic viewing conditions, indicating that the flicker mask affected central cortical
mechanisms (Green 1981a; Lipkin 1962; Thomas 1954). Visual persistence, as measured by the phenomenal continuity procedure used by Meyer and Maguire (1977), also increased dramatically and selectively for the same range of low spatial frequencies. Moreover, the very existence of an increase in visual persistence with spatial frequency in unmasked conditions (Breitmeyer et al. 1981a; Corfield et al. 1978; Meyer and Maguire 1977) also suggests that sustained channels have a longer response persistence. In addition, movement or counterphase contrast sensitivity also decreased significantly and selectively for that range of spatial frequencies, consistent with related flicker-adaptation effects reported by Green (1981a). These results indicate that when transient channels are masked by the background flicker, low spatial frequency sustained channels, which have a longer reaction time, longer response persistence, and higher contrast threshold, are left unmasked and thus responsive to the range of frequencies which otherwise preferably activate transient channels. In the final experiment in this series, Breitmeyer et al. (1981a) measured reaction times to a 500-ms presentation of a 0.5 c/deg grating at a contrast 0.15 log unit above threshold. This technique was employed previously by Tolhurst (1975), who showed that reaction times to a 0.2 c/deg grating distributed themselves probabilistically and bimodally at intervals corresponding to the abrupt on- and offset of the grating. From these results, Tolhurst (1975) inferred that the reaction times were determined by transient channels selectively activated by the abrupt on- and offsets of the grating. However, using the flicker masking technique, Breitmeyer et al. (1981a) found that sustained channels are also activated by low spatial frequencies. 
The results of their experiment are shown in Figure 5.8. The upper two panels show reaction time distributions to the presentation of the near-threshold grating when no flicker mask is used. Note that here, as in Tolhurst’s (1975) and Schwartz’s (1992) studies, reaction times appear to be distributed probabilistically and bimodally, with the modal reaction times being about 500 ms apart, corresponding to the transient on- and offset of the 0.5 c/deg grating. On the other hand, the lower panels show the probabilistic distribution of reaction times when a 6 Hz uniform field flicker mask is used. Note that, here, the reaction time distribution is unimodal; moreover, this mode is displaced to longer latencies relative to the initial mode in the unmasked condition. This latter finding
Fig. 5.8 RT distribution to a 0.5 c/deg grating at 0.15 log units above threshold. Upper panels (A: observer B.B.; C: observer R.S.H.), distributions obtained against a steady background without uniform-field flicker masking; lower panels (B: observer B.B.; D: observer R.S.H.), distributions obtained with 6 Hz uniform-field flicker (U.F.F.) masking. Abscissa, reaction time (ms); ordinate, number of reaction times. (Reproduced from Breitmeyer et al. 1981a.)
indicates that when transient channels are selectively masked, the unmasked sustained channels, having longer response latencies, determine reaction time. It also suggests that the reaction times in the unmasked condition, although bimodally distributed with the first and last mode corresponding to transient activity at on- and offset, additionally reflect the sustained activity, which yields intermediate reaction times bracketed by the two extreme modes.
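The interpretation above can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no experimental data: the mixture weights, modal latencies, and spreads are hypothetical values chosen only to mimic the qualitative pattern described for Figure 5.8. It assumes that each reaction time is triggered by one of three events (the transient response at onset, the transient response at offset 500 ms later, or the slower sustained response) and that flicker masking removes the two transient contributions.

```python
import random

GRATING_MS = 500  # grating duration from the experiment described in the text


def simulate_rts(n, masked, seed=0):
    """Draw n hypothetical reaction times (ms) to a near-threshold grating.

    All numeric parameters below are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    rts = []
    for _ in range(n):
        u = rng.random()
        if masked or u >= 0.8:
            # Slower sustained-channel response (the only one left under flicker masking)
            rts.append(rng.gauss(550, 60))
        elif u < 0.4:
            # Transient-channel response to grating onset
            rts.append(rng.gauss(400, 40))
        else:
            # Transient-channel response to grating offset, GRATING_MS later
            rts.append(rng.gauss(400 + GRATING_MS, 40))
    return rts


def histogram(rts, lo=200, hi=1000, width=100):
    """Count reaction times in consecutive bins of `width` ms between lo and hi."""
    counts = [0] * ((hi - lo) // width)
    for rt in rts:
        if lo <= rt < hi:
            counts[int((rt - lo) // width)] += 1
    return counts


unmasked = histogram(simulate_rts(2000, masked=False))
masked = histogram(simulate_rts(2000, masked=True))
print("steady background:", unmasked)
print("6 Hz flicker mask:", masked)
```

Under these assumptions the unmasked histogram shows two modes separated by the grating duration, with the sustained responses filling in intermediate latencies, whereas the masked histogram has a single mode at a longer latency than the first unmasked mode.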
Similar findings have been reported by Harwerth et al. (1980) in monkey. As in Tolhurst’s (1975) study, bimodally distributed reaction times, corresponding to onsets and offsets of a 500-ms near-threshold grating, were obtained at spatial frequencies below 2 c/deg; however, above 2 c/deg clear unimodal reaction times were obtained. Moreover, whereas the modal reaction time of the first of the two bimodal distributions at spatial frequencies below 2.0 c/deg was about 400 ms, the modal reaction times of the unimodal distributions obtained with gratings at 4 and 8 c/deg were about 530 ms and 600 ms, respectively. These results indicate that in monkey, as in human, sustained channels have a longer response latency than do transient channels, and among sustained channels response latency increases as spatial frequency increases.
5.2.5. Interim summary
There is ample evidence for separate retino-geniculo-cortical pathways with complementary spatiotemporal response properties. The mapping of afferent magnocellular and parvocellular pathways to cortical dorsal and ventral pathways, respectively, is one of ‘dominance’ as opposed to exclusion. Similarly, as mentioned before, the systems-level sustained vs. transient channel distinction is not a pure dichotomy. For instance, Lehmkuhle et al. (1980) argued that transient neurons, in addition to sustained ones, can perform some pattern analysis. More recently, T. Lee et al. (1995) argued that fine (temporo)spatial discriminations can be based on the information conveyed by the transient M pathway. Given that the tuning characteristics of sustained and transient channels overlap to some extent and that their sensitivities with respect to other stimulus dimensions (e.g. contrast) are different, the channel/pathway that may subserve a given behavioral task depends on the task as well as on the multidimensional aspects of the stimulus. For example, a fast reaction time to the orientation of a rapidly displaced low-contrast grating could be based on directionally selective transient channel responses, even though the task requires judgments ostensibly based on the spatial pattern characteristics of the stimulus.
5.3. Breitmeyer and Ganz’s sustained–transient dual-channel model
This model was originally published by Breitmeyer and Ganz (1976) and later modified by Breitmeyer (1984). Its main assumptions are as follows.
1. Both the brief target and mask in the target–mask stimulus sequence activate long-latency sustained as well as short-latency transient channels.
2. Within a class of channels, inhibition is realized via the center-surround antagonism of receptive fields. This type of antagonism is called intra-channel inhibition.
3. Between the classes of neurons there exists mutual and reciprocal inhibition. This type of antagonism is called inter-channel inhibition.
4. Masking can occur in one of three ways: (i) via intra-channel inhibition (particularly realized in sustained channels); (ii) via inter-channel inhibition (particularly the transient-on-sustained channel inhibition); (iii) via the sharing of common sustained or else transient pathways by the neural activity generated by the target and mask when they are spatially overlapping. Also implicit in this last assumption is the sharing of prior common peripheral receptor activity by both stimuli.
5. Transient channels primarily signal the location and presence of stimuli or their rapid changes of location (displacement, motion) over time; sustained channels primarily signal pattern aspects such as brightness, contrast, and contour of stationary or slowly moving stimuli.
Other, for now implicit, empirical boundary conditions affecting these masking mechanisms will be made explicit in the following discussion. However, these assumptions must be sufficient to explain the main features of visual masking functions as shown in Figure 2.2. Type A forward and backward masking effects are schematized in Figure 2.2(a). Figure 2.2(b) shows the typically weaker type B paracontrast effect and the stronger type B metacontrast effect. These masking effects and prior theories explaining them were reviewed in Chapters 2 and 4. The basic properties of the model are illustrated in Figure 5.9. Figure 5.9(a) indicates the types of interactions that can occur when the mask (M) precedes the target (T) in forward masking.
We can distinguish between two general types of forward masking: masking with spatially overlapping patterns such as masking by structure or noise, and masking with spatially adjacent patterns as in paracontrast (see Fig. 2.1). In both types of forward masking, the mask’s transient activity, indicated by the short-latency spike-like response, cannot
interact in any way with target activity; however, reciprocal inhibition between the target’s transient activity and the mask’s sustained activity can occur as indicated by the left negatively signed arrow. Moreover, for the spatially overlapping stimuli, two mechanisms of forward masking are possible. First, mask sustained channels and target sustained channels may mutually inhibit each other (intra-channel inhibition), as indicated by the negative sign on the right arrow. Second, mask sustained activity can integrate with target sustained activity via the sharing of common retinotopically organized pathways (intra-channel integration), as indicated by the positive sign on the right arrow. For paracontrast forward masking, in which the mask and target patterns do not overlap spatially, intra-channel inhibition of target sustained channels by mask sustained channels constitutes the major masking mechanism. Figure 5.9(b) demonstrates the neural interactions at target–mask synchrony. With spatially overlapping masks, one would again expect both intra-channel inhibition and intra-channel integration of activity in common target and mask sustained (or else transient) pathways to contribute to the overall masking effect. In fact, at temporal synchrony masking by intra-channel integration ought to be optimal. However, when using spatially adjacent target and mask stimuli, as in meta- or paracontrast, one would expect only intra-channel inhibition to contribute to the overall masking effect. Figure 5.9(c) schematizes the neural interactions that occur when the mask onset follows that of the target by 100 ms. Again, we can distinguish two general types of backward masking: masking with spatially overlapping patterns (masking by noise or structure), and masking with spatially adjacent patterns (metacontrast).
In both types of backward masking the transient activity of the mask and the earliest sustained activity of the target can interact via mutual inter-channel inhibition. However, only in the former type of spatially overlapping masking paradigm would integration or sharing of sustained activity in common sustained pathways provide an additional masking mechanism. The same argument holds for the neural interactions illustrated in Figure 5.9(d), except for the fact that the earlier transient or sustained activity of the mask interacts with progressively later sustained activity of the target. The three masking mechanisms outlined above—intra-channel inhibition among spatially adjacent sustained (or transient) pathways,
Fig. 5.9 The hypothetical time course of transient and sustained channels activated by the target (T) and the mask (M) at various target–mask asynchronies: (a) mask onset precedes target onset; (b) target and mask onsets are synchronous; (c), (d) mask onset follows target onset at increasing temporal asynchronies. The transient response is represented by the short-latency spike-like function; the sustained responses at increasing spatial frequencies are indicated by the symbol Su in (a). The inhibitory and excitatory interactions within and between the two types of channels are indicated by two-way arrows signed negatively and positively, respectively. Abscissa, time (ms). (Reproduced from Breitmeyer and Ganz 1976.)
inter-channel inhibition between transient and sustained pathways, and integration or sharing of common activity within sustained (or transient) pathways—are intended to account, respectively, for the main features of the masking phenomena shown in Figure 2.2: type B forward masking or paracontrast, type B backward masking or metacontrast, and type A forward and backward masking. It should be mentioned that although the designations ‘mask’ and ‘target’ serve a valid methodological distinction, physiologically, in terms of the possible interactions outlined above, that distinction is lost. Consequently, as will become apparent in several cases below, we must consider mutual interactions between the physiological activities generated separately by the target and the mask stimuli, regardless of the temporal order of their presentation.
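As a rough illustration of the second mechanism (inter-channel transient-on-sustained inhibition), the sketch below computes how strongly a mask's fast transient response overlaps in time with a target's slower sustained response at each stimulus onset asynchrony (SOA). It is a toy calculation, not the Breitmeyer and Ganz formulation: the Gaussian-shaped response profiles, their latencies and widths, and the `strength` parameter are all hypothetical, and intra-channel inhibition and integration are deliberately left out.

```python
import math


def bump(t, peak, width):
    """Smooth response profile that is zero before stimulus onset (t < 0)."""
    if t < 0:
        return 0.0
    return math.exp(-0.5 * ((t - peak) / width) ** 2)


def target_sustained(t):
    # Hypothetical long-latency, long-duration sustained response (ms after target onset)
    return bump(t, peak=120.0, width=40.0)


def mask_transient(t):
    # Hypothetical short-latency, brief transient response (ms after mask onset)
    return bump(t, peak=50.0, width=15.0)


def inhibition(soa, t_max=600):
    """Temporal overlap of mask transient and target sustained activity at a given SOA (ms).

    soa > 0 means the mask follows the target (backward masking).
    """
    return sum(target_sustained(t) * mask_transient(t - soa) for t in range(t_max))


def visibility_curve(soas, strength=0.8):
    """Target visibility (1 = unmasked) after transient-on-sustained inhibition."""
    overlaps = [inhibition(s) for s in soas]
    peak = max(overlaps)
    return [1.0 - strength * o / peak for o in overlaps]


soas = list(range(-100, 301, 10))
curve = visibility_curve(soas)
best = soas[curve.index(min(curve))]
print("SOA of strongest masking (ms):", best)
```

Because the sustained response lags the transient one, the overlap peaks at a positive SOA and falls off on either side, giving the U-shaped (type B) metacontrast function; at negative SOAs the mask's transient activity has already subsided and contributes almost nothing, in line with the description of forward masking above.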
In this form, the theory accounts for much of the masking data, as well as for a wide range of findings obtained from related areas of research. However, owing to developments in visual neurophysiology and neuroanatomy in the last two decades, as well as to the advent of new theories of cortical dynamics, the basic sustained–transient approach has been in need of revision and re-evaluation. In the following section, we present an updated sustained–transient approach to masking that incorporates recent findings in neuroscience and the dynamics of visual information processing.
5.4. The retino-cortical dynamics (RECOD) model
5.4.1. Theoretical rationale behind the model
The original RECOD model was not developed to account for masking phenomena directly. Rather, at the core of the model was the theoretical question of how to incorporate non-linear feedback (recurrent, reentrant) processes into real-time visual processing. The primate visual system exhibits extensive anatomical feedback, including significant excitatory connections. The functional role of these feedback connections is largely unknown. As discussed in section 5.2.2, information processing at different neural loci exhibits a broad range of latencies. From the engineering and mathematical perspectives, it is well known that positive and/or delayed feedback can easily lead a system into unstable behavior. How does the visual system avoid this problem? A second problem related to feedback processing concerns how feedforward and feedback signals are combined. As illustrated schematically in Figure 5.10, the feedforward signal delivers a stimulus-dependent activity through afferent pathways. The feedback signal processes this activity in order to transform it into a percept-dependent activity. There is a trade-off in setting the gains of the feedforward and feedback signals. First, assume that the gain of the feedforward signal is much higher than the gain of the feedback signal. The stimulus-dependent feedforward activity will energize feedback loops; however, because the feedforward signal has a high gain, it will remain dominant and will interfere with the generation and establishment of percept-dependent activities. This will lead to a failure to attain a perceptual synthesis. Conversely, if the gain of the feedback signal is much higher than the gain of the feedforward signal, perceptual synthesis can occur. However, the percept will be very insensitive to changes in the input. Assume, for example, that the shape of the stimulus changes from a square to a triangle. Because afferent signals are relatively weak, the synthesis corresponding to the square will persist in the positive feedback loops and the percept will be either the continuation of the previous object (square) or a highly blurred version of the two objects (a combination of a square and a triangle). We call this problem the trade-off between stimulus read-out and perceptual synthesis in a feedback system.⁵
Fig. 5.10 Illustration of feedforward and feedback processing as they relate to inputs and perceptual synthesis. Afferent (feedforward) activity is stimulus dependent; efferent (feedback) activity is percept dependent.
5.4.2. A solution: temporal multiplexing of conflicting tendencies
Ögmen (1993) proposed a theory that offers a solution to this trade-off by multiplexing in time the conflicting tendencies, i.e. the need for strong feedforward signals for a reliable read-out of the input and for energizing the feedback loops vs. the need for a strong feedback signal to establish activities that underlie perceptual synthesis. According to this theory, the real-time dynamics of visual processes unfolds in three phases.
1. A feedforward-dominant phase where strong afferent signals travel to higher cortical areas allowing the read-out of the input and energizing the feedback loops.
2. A feedback-dominant phase during which the afferent signal decays to a lower plateau value and the feedback, or re-entrant, signals establish perceptual synthesis.
3. A reset phase that is initiated when inputs change. During the reset phase the feedback signals receive a fast transient inhibition so as to allow the dominance of the afferent signals (feedforward-dominant mode) which deliver the new input.
By limiting the real-time dynamics of the system to a succession of transient epochs, this scheme has the additional advantage of avoiding the asymptotically unstable behavior that can emerge in nonlinear positive feedback systems. How can these phases be regulated in real time? The RECOD model proposes that the input is delivered to the feedback system by two parallel complementary pathways. The first is a relatively fast and transient response that is activated when changes occur in the input. The activity of this pathway is assumed to inhibit (reset) the reverberating activity in the feedback loop. A second pathway, with a relatively slower and sustained response, delivers the excitatory input to the feedback system. Moreover, it is assumed that this input is non-monotonic, overshooting to a peak first and then decaying to a lower plateau. The initial peak response corresponds to the feedforward-dominant phase. The decay of the activity to a lower plateau allows the feedback signals to dominate, thereby providing a transition from the feedforward-dominant to the feedback-dominant phase. Figure 5.11 shows a simulation of the model that depicts these three phases.
5.4.3. A neural model for the theory: basic architecture and neurophysiological correlates
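Before turning to the architecture, the three-phase scheme of section 5.4.2 can be caricatured in a few lines of code. This is only a sketch under strong simplifying assumptions and is not the RECOD simulation itself: the sustained afferent drive is modeled as an exponential decay from a peak to a plateau (its rise is treated as instantaneous), the transient pathway as a fixed-length pulse after each input change, and the feedback signal as a constant `feedback_gain`. All names and numeric values are hypothetical.

```python
import math


def sustained_afferent(dt, peak=1.0, plateau=0.3, tau=60.0):
    """Sustained drive dt ms after an input change: overshoot that decays to a plateau."""
    return plateau + (peak - plateau) * math.exp(-dt / tau)


def transient_active(dt, duration=30.0):
    """Transient pathway fires only for a short window after an input change."""
    return 0.0 <= dt < duration


def phase(t, change_times, feedback_gain=0.5):
    """Label the processing phase at time t (ms), given the times at which the input changed."""
    past = [c for c in change_times if c <= t]
    if not past:
        return "no input"
    dt = t - max(past)
    if transient_active(dt):
        # Fast transient inhibition clears reverberating feedback activity
        return "reset"
    if sustained_afferent(dt) > feedback_gain:
        # Afferent overshoot exceeds the feedback signal: read-out of the input
        return "feedforward-dominant"
    # Afferent drive has decayed to its plateau: feedback establishes the percept
    return "feedback-dominant"


for t in (10, 50, 200, 410):
    print(t, phase(t, change_times=[0, 400]))
```

With these values a new input at 400 ms re-triggers the reset phase, so the system passes through the same sequence again rather than settling into a persistent feedback state.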
Figure 5.12 shows the basic architecture of the RECOD model (Breitmeyer and Ögmen 2000; Ögmen 1993; Ögmen et al. 2003; Purushothaman et al. 2000). The two ellipses at the bottom of the figure represent two populations of retinal ganglion cells, one with a fast phasic (transient) response and a second with a slower tonic (sustained) response. The neural correlates of these model cells are the primate M and P retinal ganglion cells, respectively, which project to distinct layers of the LGN and form two parallel afferent pathways (the magnocellular and the parvocellular) as shown in Figure 5.12 (see section 5.2.1). We consider these pathways as neural correlates for the transient and sustained afferents in our model. Magnocellular and parvocellular projections to the cortex provide selective inputs to different visual areas subserving various functions such as the computation of motion, form, and brightness. As mentioned in section 5.2.2, at the cortical level these two pathways interact but the loci and degree of their interactions are not fully established. Neuroanatomical data indicate that the magnocellular afferents provide the dominant inputs to the dorsal (‘where’) pathway, whereas parvocellular afferents provide
Fig. 5.11 The activities in the model in response to the input signal shown in the bottom panel (INPUT). The middle panels show the thresholded activities in the sustained (SUSTAINED) and transient (TRANSIENT) populations of retinal cells in response to this input. The top panel (POST-RETINAL) illustrates activities in the post-retinal network, with the reset, feedforward-dominant, and feedback-dominant phases marked. Each panel plots activity against time (ms) and space (cell index). Note that, for simplicity, in this simulation the full spatial extent of the receptive fields of transient cells was not incorporated in the model and hence the spatial spread of the transient cell activity is narrower than that of the sustained cell activity. (Reproduced from Purushothaman et al. 1998.)
the dominant inputs to the ventral (‘what’) pathway. The model uses a lumped representation for the cortical targets of the magnocellular and parvocellular pathways. The main cortical targets of the magnocellular pathway represent the areas in the dorsal pathway. The main cortical
Fig. 5.12 Schematic diagram of the basic architecture of the RECOD model. The bottom two layers represent the sustained and the transient retinal ganglion cells, whose typical step responses are illustrated next to each layer. The top layers represent their post-retinal targets as lumped networks. Open and full synaptic symbols depict excitatory and inhibitory connections, respectively. Labeled elements: input, retina, M pathway, P pathway, post-retinal areas, and inter-channel inhibition. (Adapted from Ögmen 1993.)
targets of the parvocellular pathway represent the areas in the ventral pathway (see the upper ellipses in Fig. 5.12). The lumped representation of the areas involved in the computation of dynamic form and brightness contains recurrent connections to represent the extensive feedback observed between cortical areas as well as the feedback from the cortex back to the LGN (reviewed by Sherman and Guillery 1996). As discussed in section 5.2.4, at the perceptual level psychophysical studies identified transient and sustained channels in humans and monkeys whose properties are consistent with those of M and P afferents, respectively. In our model, at a lumped ‘systems level’ we identify the post-retinal areas that receive dominant M and P inputs with transient and sustained channels, respectively. Thus the properties of sustained and transient channels are determined by a combination of the properties of specific afferent pathways and post-retinal networks. In the model, the representation of cortical networks in a relatively lumped manner arises from two major constraints: First, from a practical viewpoint it is not feasible to introduce all details of cortical circuitry into the model. Modeling consists of representing those aspects of the system that are critical for the scientific question at hand. Accordingly, including the details of chromatic processing will not be relevant when
color is not a significant dimension for a given research question. On the other hand, the model should be flexible, i.e. generalizable, so that it can be unlumped when necessary. As an example, we unlumped the cortical networks to include specific spatial frequency channels in our studies of blur perception where the spatial frequency content of the input was varied (Purushothaman et al. 2000). However, in other studies, where we did not manipulate the spatial frequency of our stimuli, these networks were represented in a lumped manner (Ögmen et al. 2003). The second benefit of using lumped representations concerns the computational load. Very detailed models tend to take extensive simulation times and are difficult to implement, simulate, and analyze. Therefore modeling, like any scientific enterprise, involves a delicate choice for the level of analysis, i.e. the delineation of significant and insignificant factors. In the next section we will provide an example of how the model can be unlumped to account for the differential properties of contour and surface network dynamics. In the RECOD model, each channel (or pathway) possesses both positive and negative connectivity patterns. We will refer to the negative, i.e. inhibitory, connections within each channel as intrachannel inhibition (Section 5.3). The reset mechanism mentioned earlier is implemented by inhibitory connections from the M (transient) pathway to the P (sustained) pathway. We will refer to this inhibition as inter-channel transient-on-sustained inhibition (Breitmeyer and Ganz 1976). As shown in Figure 5.12, this inhibitory connection is accompanied by a reciprocal inhibitory connection, the inter-channel sustained-on-transient inhibition. This is shown by the arrows between the upper ellipses in Figure 5.12. This reciprocal inhibition creates a competition between the sustained and transient channels and allows the system to maintain a dynamic balance between figural synthesis and reset. 
As discussed before, the reset signal is required to allow the new afferent signal to reach the cortex in an effective manner. However, without a reciprocal inhibition, the system would be very sensitive to noise in the transient channel. Because figural synthesis requires time, without a reciprocal inhibition, noise would often reset the activity in the form channel, thus interfering with figural synthesis. In sum, although it starts from a different perspective, this model converges to a structure very similar to the sustained–transient model
discussed in the previous section. Furthermore, it expands the sustained–transient model by incorporating recent neurophysiological findings on afferent and cortical networks, incorporating feedback mechanisms, proposing additional feedforward- and feedback-dominant phases of operation, explicitly incorporating a network structure, formulating a functional dynamic system framework, and providing a quantitative description that can be simulated and compared directly with experimental data.
5.4.4. Example of model unlumping: Contour and surface dynamics
As discussed in Chapter 2, section 2.5, one obtains different para- and metacontrast functions when observers report contour versus surface properties of a target stimulus (Figure 2.5). These findings are consistent with psychophysical (Arrington 1994; Elder and Zucker 1998) and neurophysiological (Lamme et al. 1999; Lee et al. 1995) results showing that the contours of visual stimuli are processed faster than are their surface properties. Psychophysical (Arrington 1994; Elder and Zucker 1998; Paradiso and Nakayama 1991; Stoper and Mansfield 1978) and neurophysiological (De Yoe and Van Essen 1988; Lamme et al. 1999; Xiao et al. 2003) findings indicate that this difference in processing speed is partly due to differences in the cortical circuits processing these attributes. In particular, activities in cortical P-interblob and P-blob pathways are associated with the processing of form and surface properties, respectively (Grossberg 1994). In terms of the RECOD model, these results can be accommodated, as schematized in Figure 5.13, by unlumping the P-pathway driven post-retinal network into two networks, one processing contour information and a second processing surface-brightness information. As depicted in Figure 5.14, each briefly flashed stimulus produces a fast transient (M) activation, a slower sustained (P) contour process, and a still slower sustained (P) surface/brightness process. Each of the latter activities produced by the target can be suppressed (see dashed vertical arrow) by the fast transient activity of the mask. Although Figure 5.14 shows only the suppression of the target’s contour process, it is evident from the figure that the unlumped model correctly predicts that the SOA of optimal suppression should be shorter for contour visibility than for brightness visibility.
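The prediction can be made explicit with a back-of-the-envelope relation. As an assumption of this sketch (it is not a formula given in the text), suppose suppression is maximal when the peak of the mask's transient activity coincides with the peak of the target's sustained process. Writing $L_M$ for the latency of the transient (M) response and $L_C$ and $L_S$ for the hypothetical latencies of the contour and surface processes:

```latex
\mathrm{SOA}_{\mathrm{opt}}^{\mathrm{contour}} \approx L_C - L_M, \qquad
\mathrm{SOA}_{\mathrm{opt}}^{\mathrm{surface}} \approx L_S - L_M, \qquad
L_M < L_C < L_S \;\Longrightarrow\;
0 < \mathrm{SOA}_{\mathrm{opt}}^{\mathrm{contour}} < \mathrm{SOA}_{\mathrm{opt}}^{\mathrm{surface}} .
```

With purely illustrative latencies of 50, 90, and 130 ms for $L_M$, $L_C$, and $L_S$, the optimal SOAs would be about 40 ms for contour and 80 ms for surface brightness; only the ordering, not the particular values, follows from the model as described.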
Fig. 5.13 An unlumped version of the RECOD model. The sustained channel is unlumped into separate contour and surface networks as shown at the top right of the figure. A subcortical network has been added to account for facilitatory effects in paracontrast. Open and closed triangular symbols depict excitatory and inhibitory connections, respectively. The open circular connection denotes a multiplicative synaptic interaction. For simplicity, only a small subset of connections is shown. For example, to avoid clutter, feedback connections in post-retinal areas are not shown. Labeled elements: input, retina, subcortical network, M pathway, P pathway, post-retinal areas (contour and surface), and inter-channel inhibition. (Reproduced from Breitmeyer et al. in press.)
The paracontrast results appear to present a more complex picture. We analyze these results in terms of three processes as depicted in Figure 5.15. Two of these processes are inhibitory. In our dual-channel RECOD model, a suppressive effect is produced by intra-channel center-surround antagonism of sustained (P) neural activity. It is known that the inhibitory surround activation of classical receptive fields is slower by 10 to 30 ms than activation of the center region (Benardete and Kaplan 1997; Maffei et al. 1970; Poggio et al. 1969; Singer and Creutzfeldt 1970). One would then expect that the surrounding mask has to precede the target by SOAs of 10 to 30 ms to obtain optimal suppression of target-induced excitatory activity. These intrachannel, center-surround inhibitory effects are most likely fast and of a short duration (Connors et al. 1988). However, the paracontrast results shown in Figure 2.5 indicate that an additional inhibitory effect lasts
Fig. 5.14 Schematic diagram of the optimal metacontrast suppression effect of a mask on the contour and brightness visibilities of a prior target stimulus. For both target and mask, the M, subcortical, P-contour, and P-surface activities are plotted against time, offset by the SOA. Dashed vertical arrow indicates inhibition of the target’s sustained activity by the mask’s transient activity. Same conventions as in Figure 5.13 are used in depicting activities. (Reproduced from Breitmeyer et al. in press.)
for up to 450 ms. Suppression of target visibility can begin when the mask precedes the target by about 450 ms. This effect is explained in our model by a cortical long-lasting intra-channel inhibition (Ögmen et al. 2003). Evidence for both the brief and prolonged inhibition has been found in visual cortex (Berman et al. 1991; Connors et al. 1988; Nelson 1991). In sum, according to our model the two suppressive effects in paracontrast are: 1) a relatively fast intrachannel inhibition realized in the center-surround antagonism of classical receptive fields, and 2) a slower more prolonged inhibition, associated with other
properties of cortical activity. In addition, the paracontrast results in Figure 2.5 show that a prior mask can have not only suppressive effects on target visibility but also a counteracting facilitating effect. Evidence for facilitatory effects of a prior stimulus on the visibility of a following one has also been reported elsewhere (Bachmann 1988, 1994; Michaels and Turvey, 1979; Stober et al. 1978). A plausible explanation for the enhancement effect has been proposed by Bachmann (1988, 1994, 1997) in terms of his perceptual retouch (PR) approach (see Chapter 4, section 4.5.2). To account for the facilitation effect in paracontrast, we introduced to our model an additional network that we tentatively identify as a subcortical network. As shown in Figure 5.13, the output of this subcortical network multiplicatively gates the input signals to the surface and contour networks. Figure 5.15 provides a schematic summary of mechanisms involved in paracontrast. Figure 5.16 illustrates how a facilitation produced by the slower subcortical system could enhance the visibility of a target’s brightness and contour during paracontrast. For instance, as shown, the facilitatory effect on visibility of a target’s brightness is maximal when the mask precedes the target at an SOA of a few tens of milliseconds. Although not shown, it is evident from Figure 5.16 that the facilitatory effect on
Fig. 5.15 Schematic diagram of three processes used in the RECOD model to explain paracontrast effects: facilitation, brief (intra-channel) suppression, and prolonged (intra-channel) suppression, shown as a function of SOA (ms; axis marked at –200, –100, and –10). (Reproduced from Breitmeyer et al. in press.)
THE SUSTAINED–TRANSIENT CHANNEL APPROACH
Fig. 5.16 Schematic diagram of the optimal paracontrast enhancement effect of a mask on the contour and brightness visibilities of a following target stimulus. (Panel labels: MASK and TARGET, each with M, Subcortical, P-contour, and P-surface traces; axes: SOA and t.) The dashed vertical arrow indicates facilitation of the target’s sustained activity by the mask-generated sub-cortical activity. The same conventions as in Figure 5.13 are used in depicting activities. (Reproduced from Breitmeyer et al. in press.)
visibility of a target’s contour is maximal when the mask precedes the target by a slightly larger SOA.
5.4.5. Mathematical basis of the model
The computational structure of the model is constructed by specifying, according to neurophysiological, neuroanatomical, and functional constraints, some canonical equations that describe basic aspects of neuronal dynamics. We review these equations below and discuss their general properties that contribute to masking effects. The first type of
equation used in the model has the form of a generic Hodgkin–Huxley equation

\[ \frac{dV_m}{dt} = -(E_p + V_m)\,g_p + (E_d - V_m)\,g_d - (E_h + V_m)\,g_h \tag{5.1} \]

where $V_m$ is the membrane potential, $g_p$, $g_d$, and $g_h$ are the conductances for passive, depolarizing, and hyperpolarizing channels, respectively, and $E_p$, $E_d$, and $E_h$ are their Nernst potentials. This equation has been used extensively in neural modeling to characterize the dynamics of membrane patches, single cells, and networks of cells (reviewed by Grossberg 1988; Koch and Segev 1989). For simplicity, we will assume that $E_p = 0$ and use the symbols $B$, $D$, and $A$ for $E_d$, $E_h$, and $g_p$, respectively, to obtain the generic form of the ‘multiplicative’ or ‘shunting’ equation (Grossberg 1988):

\[ \frac{dV_m}{dt} = -A V_m + (B - V_m)\,g_d - (D + V_m)\,g_h. \tag{5.2} \]

The depolarizing and hyperpolarizing conductances are used to represent the excitatory and inhibitory inputs, respectively. The second type of equation is a simplified version of equation (5.1), called the ‘additive’ or ‘leaky-integrator’ model, where the external inputs influence the activity of the cell not through conductance changes but directly as depolarizing and hyperpolarizing currents, yielding the form:

\[ \frac{dV_m}{dt} = -A V_m + \text{excitatory inputs} - \text{inhibitory inputs}. \tag{5.3} \]
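The difference between the two equation types can be illustrated with a minimal numerical sketch. It assumes that the shunting equation (5.2) reads dVm/dt = −A·Vm + (B − Vm)·gd − (D + Vm)·gh and that the additive equation (5.3) reads dVm/dt = −A·Vm + excitation − inhibition. The parameter values, the function names, and the forward-Euler scheme are illustrative assumptions, not the values or the solver used in the model.

```python
# Minimal sketch (assumed parameters, forward-Euler integration) comparing the
# shunting equation (5.2) with the additive equation (5.3) for constant inputs.

def simulate(kind, excitation, inhibition, A=1.0, B=1.0, D=0.5, dt=1e-3, t_end=20.0):
    """Integrate the membrane potential from rest (V = 0) and return its final value."""
    v = 0.0
    for _ in range(int(t_end / dt)):
        if kind == "shunting":
            # Inputs act as conductances that scale the distances (B - V) and (D + V).
            dv = -A * v + (B - v) * excitation - (D + v) * inhibition
        else:
            # Inputs act directly as depolarizing and hyperpolarizing currents.
            dv = -A * v + excitation - inhibition
        v += dt * dv
    return v


def shunting_steady_state(excitation, inhibition, A=1.0, B=1.0, D=0.5):
    """Equilibrium of the shunting equation, obtained by setting dV/dt = 0."""
    return (B * excitation - D * inhibition) / (A + excitation + inhibition)


if __name__ == "__main__":
    for g in (0.1, 1.0, 10.0, 100.0):
        print(g, round(simulate("shunting", g, 0.0), 4), round(simulate("additive", g, 0.0), 4))
```

Under these assumptions the shunting response stays between −D and B however large the input becomes, whereas the additive response grows in proportion to the input. The denominator A + gd + gh in the steady-state expression is also the decay rate of the shunting cell, which is why its time constant depends on the input.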
Mathematical analyses showed that, with appropriate connectivity patterns, shunting networks can automatically adjust their dynamic range to process small and large inputs (Grossberg 1988). Accordingly, we use shunting equations when we have interactions among a large number of neurons so that a given neuron can maintain its sensitivity to a small subset of its inputs without becoming saturated when a large number of inputs become active. We use the simplified additive equations when the interactions involve few neurons. With non-linearities, these equations can generate extremely rich and complex behavior. In terms of masking, many of these complex properties can be understood as variants of three fundamental ways masking effects can be obtained: masking by inhibition, by normalization, and by integration. An input to the inhibitory term (activated by
the mask) will reduce the response generated by the excitatory term (activated by the target). This inhibition can be additive or multiplicative. Multiplicative inhibition can lead to activity normalization (Grossberg 1988; Ögmen 1993) which can be viewed as another form of masking. If more than one input is applied to the excitatory term, their effects will be integrated and thus their responses will be fused by temporal integration. These equations are leaky integrators because of the presence of the decay term (the first term on the right-hand side), and thus integration effects have a time span determined by the time constant of this term. A difference between shunting and additive networks is that the additive network has a single fixed time constant while the shunting network has a variable input-dependent time constant (Grossberg 1988; Ögmen 1993). Finally, a third type of equation is used to express biochemical reactions of the form
\[ S + Z \xrightarrow{\;\gamma\;} Y \xrightarrow{\;\delta\;} X \xrightarrow{\;\alpha\;} S + Z \]

where a biochemical agent S, activated by the input, interacts with a transducing agent Z (e.g. a neurotransmitter) to produce an ‘active complex’ Y which carries the signal to the next processing stage. This active complex decays to an inactive state X, which in turn dissociates back into S and Z. It can be shown (Sarikaya et al. 1998, Appendix) that when the active state Y decays very fast, the dynamics of this system can be written as

\[ \frac{dz}{dt} = \alpha(\beta - z) - \gamma s z \tag{5.4} \]

with the output given by

\[ y(t) = \frac{\gamma}{\delta}\, z(t)\, s(t) \tag{5.5} \]

where $s$, $z$, and $y$ represent the concentrations of S, Z, and Y, respectively, and $\gamma$, $\delta$, and $\alpha$ denote rates of complex formation, decay to inactive state, and dissociation, respectively. This equation has been used in a variety of neural models, in particular to represent temporal adaptation, or the gain control property, occurring for example through synaptic depression (Abbott et al. 1997; Carpenter and Grossberg 1981; Gaudiano 1992; Grossberg 1972; Ögmen 1993; Ögmen and Gagné 1990). This equation contributes to masking through gain control. As
can be seen from equation (5.5), the input s(t) is transmitted to the next stage via the ‘gain’ (γ/δ)z(t). It can be mathematically shown that when an additional input (mask) is applied, z(t) decays to a lower value, thereby reducing the gain of the signal transmission. As a result, the effective signal transmitted by the target becomes smaller, leading to its masked state.
As discussed before, the level of lumping and unlumping in the model depends on the specific phenomena under analysis. To illustrate some key properties of the model, we provide below details of a version that was used in a recent study (Ögmen et al. 2003). The retinal network is designed to capture the basic spatiotemporal properties of the retinal output without necessarily incorporating all details of the retinal circuitry. To the extent possible, parameters of the model reflect the physiologically measured parameters of the primate retina.
Retinal cells with sustained activities (parvocellular pathway)
The activities of sustained retinal cells are described in three functional stages.
Stage 1. Temporal adaptation (gain control)
We use equation (5.4) to achieve temporal adaptation (gain control):

\[ \frac{1}{\tau}\frac{dz_i}{dt} = \alpha(\beta - z_i) - \gamma\,(J + I_i)\,z_i \tag{5.6} \]

where $z_i$ represents the concentration of a transducing agent at the ith spatial location, $J$ is a baseline input generating a dark current, and $I_i$ is the external input (luminance value) at the ith spatial position. This temporal adaptation, or gain control, stage causes the neural activity to decay to a plateau level after an initial peak response to a sustained input, as observed in sustained retinal ganglion cell responses. The parameter $\tau$ adjusts the time constant of the decaying response. This equation can account for some of the effects of masking by light. When a large uniform field is superimposed on a target stimulus, the input generated by this light mask will reduce the gain of cells transmitting the response of the target and thus make it a less effective stimulus.
Stage 2. Spatial center–surround organization
Signals from the first stage are convolved by the kernels $G^{se}_k$ and $G^{si}_k$, which represent the excitatory-center and the inhibitory-surround of
the receptive field. The kernels are Gaussian functions of the form $G^{se}_k = Amp_{se}\exp(-k^2/sd_{se}^2)$, and the parameters $Amp_{se}$ and $sd_{se}$ were selected according to the receptor spacing at the fovea (Coletta and Williams 1987; Dacey 1993) and the physiologically measured receptive field characteristics at the corresponding region of the primate retina (Croner and Kaplan 1995). For simplicity, only the on-center off-surround cells were considered. The membrane potential $w_i$ of the ith sustained cell is described by

\[ \frac{1}{\tau_w}\frac{dw_i}{dt} = -A_s w_i + (B_s - w_i)\sum_{j=i-n_s}^{i+n_s} G^{se}_{j-i}\,\Psi\big((J + I_j)\,z_j\big) - (D_s + w_i)\sum_{j=i-n_s}^{i+n_s} G^{si}_{j-i}\,\Psi\big((J + I_j)\,z_j\big) \tag{5.7} \]

where the center and surround convolution sums provide the excitatory and inhibitory inputs to a shunting equation (compare equations (5.2) and (5.7)). The input signal is processed by a second-order polynomial $\Psi(\cdot)$ whose coefficients were determined by fitting the contrast response of the model neurons to the physiological data from Kaplan and Shapley (1986) (see Purushothaman et al. 2000, Appendix A.1 and Fig. 6). As discussed before, this equation provides several mechanisms whereby masking can be obtained: masking by integration, masking by inhibition, and masking by gain control. All these mechanisms are within the sustained system and are of the intra-channel type. Receptive field kernels provide limits on the lateral extent of these effects and can explain the dependence of masking magnitude on target–mask spatial separation. Furthermore, by introducing an additional delay to the inhibitory term to represent the center–surround delay of the receptive fields (Benardete and Kaplan 1997), we obtain peak masking at a negative SOA in the vicinity of the center–surround delay.
Stage 3. Quadratic non-linearity with threshold and persistence
The ‘membrane potential’ of the ith cell is transformed into an output signal (e.g. spike frequency) through a quadratic non-linearity with threshold $(\kappa\,[w_i - \Gamma_s]^+)^2$, where $[a]^+$ denotes the threshold or half-wave rectification function (i.e. $[a]^+ = a$ if $a > 0$, and $[a]^+ = 0$ otherwise). Parameters $\kappa$ and $\Gamma_s$ represent the gain and the threshold level of this function, respectively. The thresholded signal provides the input to the additive equation

\[ \frac{dv_i}{dt} = \lambda\Big(-v_i + \big(\kappa\,[w_i - \Gamma_s]^+\big)^2\Big) \tag{5.8} \]

where the parameter $\lambda$ determines the overall temporal persistence of the signal in the parvocellular pathway. For simplicity, the main persistence terms have been lumped into a single equation. A more detailed version can incorporate additional stages using equations similar to equation (5.7) to represent the details of masking obtained in LGN, for example.
Retinal cells with transient activities (magnocellular pathway)
The equation for retinal cells with transient activities is similar to that for sustained activities except that it is tailored to produce transient responses and its Gaussian kernel parameters reflect physiologically measured receptive field characteristics of the transient cells in the primate retina (Croner and Kaplan 1995).
Post-retinal cells mainly driven by the parvocellular pathway (post-retinal sustained cells)
The activity $p_i$ of the ith cell is given by the shunting equation

\[ \frac{1}{\tau_p}\frac{dp_i}{dt} = -A_p p_i + (B_p - p_i)\big[\Phi(p_i) + 2\,v_i(t - \eta)\big] - p_i\Bigg[\sum_{\substack{j=i-n_{pf} \\ j \neq i}}^{i+n_{pf}} \Phi(p_j) + \sum_{j=i-n_p}^{i+n_p} H^{pi}_{j-i}\,v_j(t - \eta_p) + \sum_{j=i-n_p}^{i+n_p} Q^{mp}_{j-i}\,m_j\Bigg] \tag{5.9} \]

where the excitation consists of the afferent parvocellular signal and a feedback signal. The inhibitory signal consists of feedback, feedforward and inter-channel terms. Excitatory and inhibitory recurrent (feedback, re-entrant) signals are passed through the non-linear function $\Phi(a) = 10a[(a + 1)^2 - 1]$ if $a \le 0.05$, and $\Phi(a) = a(a + 0.975)$ otherwise. The inhibitory kernels $H^{pi}_k$ and $Q^{mp}_k$ determine the spatial spread of intra-channel and inter-channel inhibition, respectively. The parameter $\eta$ represents the relative delay between the parvocellular and magnocellular signals, and the parameter $\eta_p$ reflects the relative delay of the intra-channel inhibitory signal with respect to the excitatory signal.
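The recurrent non-linearity described above can be checked with a short sketch. It assumes the reading 10a[(a + 1)² − 1] for a ≤ 0.05 and a(a + 0.975) otherwise; the operators are an assumption, chosen because this reading makes the two branches meet at a = 0.05. The function name `phi` is hypothetical.

```python
# Minimal sketch of the piecewise recurrent signal function, under the assumed
# reading stated above. The name `phi` is illustrative only.

def phi(a):
    """Return the recurrent signal for activity a (assumed piecewise polynomial)."""
    if a <= 0.05:
        # Low-activity branch: 10a[(a + 1)^2 - 1] = 10a^3 + 20a^2.
        return 10.0 * a * ((a + 1.0) ** 2 - 1.0)
    # High-activity branch: a^2 + 0.975a.
    return a * (a + 0.975)


if __name__ == "__main__":
    for a in (0.0, 0.01, 0.05, 0.5, 1.0):
        print(a, phi(a))
```

With this reading both branches give the same value at a = 0.05, and the ratio phi(a)/a grows with a, so weak recurrent activity is attenuated relative to strong activity.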
Post-retinal inhibitory interneurons
The post-retinal inhibitory interneurons carry the inhibition from sustained cortical cells to transient cortical cells via the additive equation

\[ \frac{dq_i}{dt} = -A_q q_i + B_q p_i \tag{5.10} \]

where $q_i$ is the activity of the ith post-retinal inhibitory interneuron.
Post-retinal cells mainly driven by the magnocellular pathway (post-retinal transient cells)
The post-retinal transient cells receive excitatory and inhibitory inputs from the magnocellular pathway and a post-retinal sustained-on-transient inhibition via the kernel $Q^{pm}_k$, yielding the shunting equation:

\[ \frac{dm_i}{dt} = -A_m m_i + (B_m - m_i)\,2\,[y_i(t)] - m_i\Bigg\{\sum_{j=i-n}^{i+n} H^{mi}_{j-i}\,[y_j(t - \eta_m)] + \sum_{j=i-n}^{i+n} Q^{pm}_{j-i}\,q_j\Bigg\} \tag{5.11} \]

where $m_i$ is the activity of the ith post-retinal transient cell. The function [·] denotes the full-wave rectification that generates the ‘on–off’ response characteristics of transient cells. Parameter $\eta_m$ reflects the relative delay of the intra-channel inhibitory signal with respect to the excitatory signal. As in previous stages, these equations provide masking mechanisms by integration, inhibition, and normalization. However, the significant addendum at the cortical level is the reciprocal inter-channel inhibition which is the main contributor to type B masking.
Unlumped model with contour and surface networks
To show how the model can be unlumped to incorporate additional details, consider the example discussed in Section 5.4.4. Two major modifications were introduced: the addition of a sub-cortical network and unlumping of the cortical surface and contour networks.
The sub-cortical network
This network has been added to account for the facilitatory effect observed in paracontrast. The main requirement for this network is a
relatively slow activity that gates inputs to the cortical surface network. However, for definiteness, following Bachmann’s (1994) approach we identified this network as a sub-cortical network. For simplicity we provide an input to this network directly from retinal cells with transient activities. The activity of the ith cell, si, in the subcortical network is governed by the additive equation

\[ \frac{1}{\tau_s}\frac{ds_i}{dt} = -s_i + \sum_{|j-i| \le \rho} y_j \tag{5.12} \]

Parameters $\tau_s$ and $\rho$ represent the time constant of activity dynamics and the spatial spread of the summation, respectively.
Cortical contour and surface networks
For simplicity, we assumed that contour and surface networks obey the same general equation but use different parameters. To incorporate the multiplicative effect of the sub-cortical network, equation (5.9) has been modified as

\[ \frac{1}{\tau_p}\frac{dp_i}{dt} = -A_p p_i + (B_p - p_i)\Big\{\Phi(p_i) + \big(2 + \xi_s\,[s_i(t - \eta_s)]^+\big)\,v_i(t - \eta)\Big\} - p_i\Bigg\{\sum_{\substack{j=i-n_{pf} \\ j \neq i}}^{i+n_{pf}} \Phi(p_j) + \sum_{j=i-n_p}^{i+n_p} H^{pi}_{j-i}\,v_j(t - \eta_p) + \sum_{j=i-n_p}^{i+n_p} Q^{mp}_{j-i}\,m_j\Bigg\} \tag{5.13} \]

to include a signal from the sub-cortical network, $\xi_s\,[s_i(t - \eta_s)]^+$, which modulates the excitatory parvocellular signal multiplicatively. Parameter $\xi_s$ determines the gain of this multiplicative action. When it is zero, the equation becomes identical to (5.9).
5.4.6. Computer simulations
The model can be simulated for arbitrary inputs by solving the system of ordinary differential equations (ODEs) using numerical methods. Because the system contains processes that unfold at different time scales, it may exhibit ‘stiffness’, and thus ODE solvers that can handle stiff equations may be necessary. In our recent work, we used the CVODE package which is based on variable-coefficient forms of the Adams and backward differentiation formula methods (Cohen and Hindmarsh 1994). Parameters used in the simulations are given in Appendix B.
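The need for a stiff solver can be illustrated on the gain-control equation alone. The sketch below does not use CVODE. It applies backward Euler, the first-order member of the backward differentiation formula family, to equation (5.4), assumed here to read dz/dt = α(β − z) − γsz, and compares it with forward Euler at the same step size. All parameter values, the input amplitudes, and the names `simulate_gain` and `gain` are hypothetical.

```python
# Minimal sketch: implicit (backward Euler) versus explicit (forward Euler)
# integration of the assumed gain-control equation dz/dt = alpha*(beta - z) - gamma*s*z.

def simulate_gain(inputs, dt, alpha=1.0, beta=1.0, gamma=10.0, implicit=True):
    """Integrate z over a sampled input sequence, starting from z = beta."""
    z = beta
    trace = []
    for s in inputs:
        rate = alpha + gamma * s  # input-dependent decay rate of z
        if implicit:
            # Backward Euler: solve z_new = z + dt*(alpha*beta - rate*z_new) for z_new.
            z = (z + dt * alpha * beta) / (1.0 + dt * rate)
        else:
            # Forward Euler: diverges once dt * rate exceeds 2.
            z = z + dt * (alpha * beta - rate * z)
        trace.append(z)
    return trace


def gain(z, gamma=10.0, delta=1.0):
    """Transmission gain (gamma/delta) * z from the assumed output equation (5.5)."""
    return gamma / delta * z


dt = 0.05
steps = 400
target_only = [1.0] * steps        # hypothetical target input
target_plus_mask = [5.0] * steps   # target plus a hypothetical mask contribution of 4.0

z_target = simulate_gain(target_only, dt)[-1]
z_masked = simulate_gain(target_plus_mask, dt)[-1]
z_explicit = simulate_gain(target_plus_mask, dt, implicit=False)[-1]

if __name__ == "__main__":
    print("gain, target only:", gain(z_target))
    print("gain, target plus mask:", gain(z_masked))
    print("forward Euler, same step:", z_explicit)
```

Under these assumptions the implicit scheme settles at z = αβ/(α + γs) for both inputs, so the added mask input lowers the gain, while the explicit scheme diverges at this step size because the masked decay rate is too fast for it. Variable-order, variable-step codes such as CVODE handle the same difficulty more efficiently and accurately.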
5.5. Explanatory scope of the dual-channel RECOD model
Because the original sustained–transient model and the more recent RECOD model share common mechanisms critical to masking, most of the material in this section applies to both models. The extensive explanatory scope of these models is developed in the context of the empirical data reviewed in Chapters 1 and 2 and some additional results introduced in this chapter. Clearly, it is not possible to review and discuss all masking effects and findings published in the literature. The intent is to expose the reader to the scope of these models by discussing a representative set of findings. We will also compare the explanatory power of our model with that of previous models reviewed in Chapter 4. Furthermore, several shortcomings of the models will be discussed, and suggestions will be offered for their future developments and improvements.
(1) In Chapter 1, section 1.3.1, we noted that Sherrington (1897) and subsequently Piéron (1935) reported a facilitatory effect on the critical flicker frequency (cff) of the first of two spatially adjacent stimuli flashed in recycled sequence relative to an inhibitory effect on cff of the second of the two stimuli. As noted, the latter effect is consistent with a form of paracontrast suppression; however, the former result is not consistent with a form of metacontrast suppression, but rather with its opposite, metacontrast facilitation. Unlike Sherrington (1897), Piéron (1935) investigated the effect of such repetitively cycled sequential and spatially adjacent stimuli not only on cff but also on brightness perception. With the brightness criterion, a metacontrast suppression was obtained. The dual-channel RECOD model can account for these differential criterion-dependent findings in the following manner.
1. Metacontrast suppression of the brightness of a preceding flash of light by a subsequent spatially adjacent flash is a direct consequence of the faster transient activity generated by the lagging flash inhibiting the slower sustained response of the leading flash.
2. Conversely, the paracontrast suppression of cff of the second of two spatially adjacent repetitively flashed stimuli is a direct consequence of the slower sustained activity generated by the first stimulus reciprocally inhibiting the faster transient (flicker) activity
generated by the second stimulus. Hence, the first stimulus has a higher cff relative to the inhibited, i.e. lower, cff of the second stimulus. These results can be related to psychophysical studies revealing effects on the cff of an intermittently flashed stimulus when a second steady or sustained neighboring stimulus is present. Several investigators (Berger 1954; Fry and Bartley 1936; Geldard 1932, 1934; Graham and Granit 1931) have reported enhancement of the cff of an intermittently flashed test stimulus when the brightness of the steady neighboring mask stimulus increased to slightly above the Talbot level of the flickering stimulus, followed by a suppression of the cff as the brightness of the steady mask stimulus increased further. What processes contribute to this non-monotonic effect of steady surround or adjacent luminance on the cff of a flickering stimulus? In our opinion there may be at least two processes which contribute to the initial rise in cff as the surround luminance increases to slightly above the Talbot brightness of the flickering stimulus. First, as the intensity of the steady stimulus increases, there is an increase in the local light adaptation level of neural summation pools (Rushton 1965; Rushton and Westheimer 1962; Westheimer 1968) which intrude into the areas spatially adjacent to the steady stimulus. Since, according to the Ferry–Porter law, cff increases linearly with the logarithm of luminance, and hence rises with the light adaptation level, we would expect the cff to increase as the luminance of the steady stimulus increases. Secondly, despite an increase in cff, the apparent brightness of the flickering stimulus simultaneously decreases as the luminance of the steady stimulus increases.
Since, according to our model, longer persistent sustained channels are primarily involved in determining brightness or contrast of a stimulus, the more sluggish sustained activity of the flickering stimulus, persisting from one flash to another, would be increasingly inhibited via simultaneous brightness induction (Heinemann 1955), i.e. via lateral intra-channel inhibition, by the adjacent steady stimulus. This in turn should result in a reduced inter-channel sustained-on-transient inhibition produced between successive flashes of the flickering test stimulus. Such a disinhibition of transient channels would also be expected to result in an increase in their activity as reflected in the initial increase of cff. However, as a countervailing process, when the luminance of the steady surround increases, it also exerts progressively more lateral
inter-channel sustained-on-transient inhibition on the flickering stimulus. Thus, as the brightness of the steady stimulus exceeds the Talbot brightness of the flickering stimulus, the transient activity generated by the latter stimulus also ought to be laterally inhibited. On the assumption that this latter countervailing lateral inter-channel inhibition of transient channels eventually outweighs the facilitatory effects on flicker of light adaptation and the simultaneous lateral disinhibitory effect produced by lateral intra-channel sustained inhibition, we would expect an initial rise or facilitation of cff followed by its suppression as the luminance of the steady stimulus increases.
(2) In Chapter 1 we mentioned that patterns can be masked by uniform light flashes. Masking by light refers to the reduction in visibility of a test flash by a mask flash consisting of a uniform field (Sperling 1965). Typically, the test stimulus is relatively small and is spatially superimposed on a much larger mask field. When the test flash itself consists of a spatially uniform stimulus and the observer is required simply to detect the presence of the test stimulus, the paradigm is called masking of light by light. Masking by light, devoid of contour interactions, is of peripheral prechiasmic origin (Battersby and Wagman 1962; Battersby et al. 1964). When the test stimulus consists of a pattern or form, such as an alphabetic character, the paradigm is called masking of pattern by light. For example, sudden luminance changes produced by the abrupt on- and offset of a prolonged masking flash are known to produce transient masking of patterns such as wide-stroke letters (Boynton and Miller 1963). Green (1981b) showed that these transient masking effects at the on- and offset of a prolonged luminance flash are specific to the spatial frequency of a sinusoidal test grating.
He found that a brief 30-ms test grating at 1.0 c/deg yielded transient masking overshoots at abrupt on- and offset of a 700-ms luminance flash set at 58.4 cd/m². However, for the same masking flash, no transient mask overshoots were obtained when a 7.8-c/deg test grating was employed; here sustained and equal masking was obtained for the duration of the flash. If we assume that the on- and offsets of the luminance flash activate peripheral (e.g. retinal) transient channels and that the 1.0- and 7.8-c/deg gratings predominantly activate low spatial frequency transient and high spatial frequency sustained channels, respectively, then the peripheral transient activity generated by the mask flash adds ‘noise’ to the ‘signal’
of transient channels activated by the 1.0-c/deg test grating, but not to the sustained channels activated by the 7.8-c/deg grating. In effect, in transient channels the internal signal-to-noise or Weber ratio is reduced, yielding transient masking overshoots at on- and offset. Conversely, since the sustained response component of the mask flash (recall that a stimulus can activate both transient and sustained channels) adds ‘noise’ only to the ‘signal’ in the sustained channels activated by the 7.8-c/deg test grating, only a sustained masking effect without transient overshoots at onset or offset of the mask flash ought to occur. We may wonder why the transient activity generated at on- and offset of the luminance mask does not inhibit, via inter-channel inhibition, the sustained activity generated by the 7.8-c/deg test grating and thus also produce transients near the abrupt luminance transitions of the mask. Recall first that masking by light, devoid of contour interactions, is of peripheral prechiasmic origin (Battersby and Wagman 1962; Battersby et al. 1964). However, since masking by pattern can be obtained with dichoptic presentations of mask and target stimuli (see Chapter 2, sections 2.2 and 2.6.7), we would expect the existence of mask contours to be a crucial aspect of dichoptic masking effects. Thus, if contour interactions also play a role in masking by light, we would expect transient-on-sustained inter-channel inhibition to contribute to masking by light only if the spatial separation between contours of the test pattern and the border edge of the uniform mask flash is sufficiently small (Battersby and Wagman 1962; Markoff and Sturr 1971; Weisstein 1971).
Since the stimulus field used by Green (1981b) was 4.5° in diameter, visibility of the 7.8-c/deg test grating may have been appreciably suppressed by transient channels activated at the border of the field; however, in the interior of the test field, such suppression would be absent or, at best, highly attenuated. In summary, masking by light can be explained by several mechanisms whose activations depend critically on the nature of the stimuli.
(3) In order to produce contour proximity and overlap of mask and target stimuli, Mitov et al. (1981) altered Green’s (1981b) technique by employing the masking-by-pattern paradigm in which a 500-ms flash of one grating and a 20-ms flash of another grating served as mask and test stimuli, respectively. Their results can be summarized as follows. When the spatial frequency of the mask was 6 c/deg or lower,
the magnitude of the transient overshoots at mask on- and offsets was inversely related to the spatial frequency of the test grating; specifically, the magnitude was largest for a 2-c/deg test grating, smaller, but pronounced, for a 6-c/deg test grating, and smallest for an 18-c/deg test grating. We would expect such a result if transient channels are preferentially activated at spatial frequencies at and below 6 c/deg and if the magnitude of transient activation decreases with spatial frequency while that of sustained channels increases. Since these experiments were conducted under binocular viewing, which uses peripheral luminance, as well as central contour masking mechanisms, the role of more central transient-on-sustained inter-channel inhibition cannot be assessed. A similar study, using dichoptic viewing of target and mask stimuli, may reveal such target–mask inhibitory interactions. Moreover, the study of Mitov et al. (1981) dovetails neatly with results reported by Teller et al. (1971), employing scotopic luminance, and Matthews (1971), employing photopic luminance. These authors reported that conditioning or mask flashes of relatively small diameter (e.g. 30’) produced no transient masking overshoots at their onset or offset; however, such transient overshoots were obtained with relatively large-diameter mask flashes (e.g. 60’). Since larger diameter masks, like lower spatial frequency gratings, are optimal for activating transient channels, we would expect pronounced transient activity at on- and offset of a large conditioning flash to produce such transient masking overshoots. However, smaller or high spatial frequency stimuli are either suboptimal for or incapable of activating transient channels (see section 5.2.4.1); hence transient mask overshoots are attenuated or absent.
Moreover, based on the related results of Breitmeyer and Julesz (1975) and Tulunay-Keesey and Bennis (1979) (see section 5.2.4.1), we can infer that the transient overshoots at on- and offset of the mask flashes employed by Green (1981b) and Mitov et al. (1981) depend on the rise and fall times of the mask at its on- and offsets. In particular, since slowly ramped, relative to abrupt, on- and offsets attenuate the transient response, we would expect a corresponding attenuation of the transient masking overshoots reported by these authors. Such attenuations of transient masking overshoots have been reported
by Matsumura (1976) in his investigation of masking of light by luminance increments and decrements.
(4) In the review of pattern masking in Chapter 2, it was noted that type B paracontrast masking is optimal when the onset of the spatially flanking mask precedes that of the central target by several tens of milliseconds (Alpern 1953; Kolers and Rosner 1960; Pulos et al. 1980; Weisstein 1972). The mechanism responsible for this effect is intra-channel inhibition within sustained pathways. Recall from section 5.4.4 that the surround activity of sustained retinal ganglion cells lags the center by several tens of milliseconds (Benardete and Kaplan 1997). Consequently, to have an optimal inhibitory effect on a sustained neuron, the stimulus activating the surround must precede the center stimulus by a corresponding asynchrony. At longer or shorter asynchronies, less than optimal inhibition of the center response is produced. Consequently, in paracontrast an optimal type B masking effect should also be produced when the mask precedes the target by several tens of milliseconds. In addition to peak paracontrast effects in the SOA range −30 to −20 ms, peak paracontrast effects were also found in the −150 to −100 ms range (Breitmeyer et al., in press; Cavonius and Reeves 1983; Ögmen et al. 2003) (see Chapter 2, section 2.5). Figure 5.17 shows the data averaged across two observers together with the model prediction (Ögmen et al. 2003). Overall, the model predictions match the data well in terms of the location of the dips in the masking function. In the model, paracontrast masking at relatively large SOA magnitudes arises from a slow intra-channel cortical inhibition which was not included in the earlier versions of the dual-channel masking models. In our simulations, the relative delay of this intra-channel inhibition was 144 ms.
As mentioned before, the estimated delays for antagonistic interactions (center–surround) in the early visual pathways are at least an order of magnitude less than this value (Benardete and Kaplan, 1997). However, our paracontrast results indicate that such inhibition can last for several hundreds of milliseconds. Evidence for both the short- and long-lasting inhibition has been found in visual cortex (Berman et al. 1991; Connors et al. 1988; Nelson 1991). Therefore we propose two suppressive effects in paracontrast: (i) a relatively fast effect realized in the local center–surround antagonism
Fig. 5.17 The perceived brightness of the target as a function of SOA averaged across two observers and normalized with respect to the condition where only the target was presented. (Axes: normalized perceived brightness against SOA from –600 to 600 ms; plotted series: two-observer average, two-observer target-only baseline, and model.) The error bars represent ±1 SEM; the model predictions are superimposed on the data points. The target consisted of a disk 0.86° in diameter. The mask was a ring with an outer diameter of 1.66°, and the contour separation between the disk and the mask ring was 2’. Target and mask stimuli were presented in separate frames, each lasting 13.33 ms. The model is in good agreement with the data for the locations of the peak masking effects. Quantitatively, the model underestimates the magnitude of metacontrast and the span of masking at large SOA magnitudes. For computational simplicity, the simulations were based on a simplified (one-dimensional) version of the stimuli used in the experiment. It is possible that the quantitative discrepancies between the model and the data are due to the failure in our model and simulations to represent adequately all stimulus parameters such as eccentricity, stimulus size, energy, and background level. These parameters are known to influence both the magnitude and the morphology of the masking functions. (Reproduced from Ögmen et al. 2003.)
of classical receptive fields, and (ii) a slower more prolonged effect, associated with other, perhaps more global, properties of cortical activity. In our simulations, we used a functionally feedforward signal to implement this long-lasting inhibition; however, a functionally feedback signal is also possible. Additional experiments are needed to test these possibilities.
(5) The unlumping and the extension discussed in section 5.4.4 allow the model to explain quantitatively the different para- and metacontrast functions obtained when the criterion contents relate to contour versus surface properties of a target stimulus. Figure 5.18 shows
EXPLANATORY SCOPE OF THE DUAL-CHANNEL RECOD MODEL
simulations of the model (bottom panel) along with experimental data (top panel). As discussed in explanation (4), owing to simplifications used in the simulations, masking effects in the model are limited to a smaller range of SOAs than in the experimental data. Accordingly, to make the comparison easier, the scales of the abscissa for the top and bottom panels in Figure 5.18 range from –500 ms to 400 ms for the data and from –200 ms to 200 ms for the model. First, consider the results for metacontrast (positive SOAs): Overall, the model captures
[Figure 5.18 plots: visibility (y-axis, 0 to 1.4) as a function of SOA (ms) in two panels, each divided into PARACONTRAST and METACONTRAST. DATA panel (abscissa –500 to 400 ms), legend: Surface-Exp; Contour-Exp; T-only. MODEL panel (abscissa –200 to 200 ms), legend: Surface-Model; Contour-Model; T-only.]
Fig. 5.18 Data from Fig. 2.5 are shown in the top panel. Results of model simulations are plotted in the bottom panel. Open and closed symbols represent contour and surface/brightness results, respectively. Results are with respect to target-only baseline condition which is normalized to a value of 1. Note that the scales of the abscissa for the top and bottom panels are different. (Reproduced from Breitmeyer et al. in press).
well the shape of the metacontrast functions. In agreement with the data, the strongest metacontrast occurs at a shorter SOA for the contour network than for the surface/brightness network (20 ms for contour vs. ca. 60 ms for surface). The values of SOA where optimal suppression occurs compare well with those observed in the data (10–20 ms for contour vs. 40 ms for surface). For paracontrast, for the contour network one observes a gradual long-lasting suppression coupled with a strong suppression around SOA = –10 ms. For the surface network, the long-lasting suppression is weaker and an enhancement occurs at SOA = –40 ms. This enhancement is followed by a dip at an SOA around –10 ms (the dip is much weaker in the data). The seemingly different morphologies for contour and surface paracontrast functions are obtained in the model by using an identical set of equations. The only difference lies in the weightings associated with the inhibitory and facilitatory processes as they interact within surface and contour networks. The long-lasting inhibitory process has a higher weight for the contour network (parameter H pi in Appendix B, with values 1.5 and 0.2 for contour and surface networks, respectively, as shown in Table B.4); and the multiplicative action of the facilitatory process has a higher gain for the surface network (parameter s in Appendix B, with values 0.09 and 0.25 for contour and surface networks, respectively, as shown in Table B.4).
(6) The transition from type B to type A metacontrast as mask energy increases relative to target energy can also be explained by a correspondingly increasing effectiveness of sustained intra-channel inhibition. Breitmeyer (1978b) (see Chapter 2, section 2.6.2 and Fig. 2.6) showed that, at a target duration of 16 ms, metacontrast shifted from type B to type A as the mask duration increased from 2 to 32 ms. 
Beyond a mask duration of 8 ms, the mask became progressively more effective at lower SOAs whereas the peak metacontrast effect obtained at an SOA of 56 ms did not change. Breitmeyer (1978b) was able to determine the masking threshold at the lowest SOA of 16 ms and at the optimal type B metacontrast SOA of 56 ms by plotting the strength of masking at each SOA as a function of mask energy or duration. The results are shown in Fig. 5.19. Note that the masking effect for the optimal type B SOA has a low threshold; it increases linearly from the 1-ms to the 8-ms mask duration and then becomes asymptotic. On the other hand, at the shortest SOA of 16 ms, the masking
[Figure 5.19 plots: mask magnitude (y-axis, 0 to 10) as a function of mask duration (ms; x-axis, 1 to 40), in separate panels labeled RK and BB. Legend: SOA (ms) 16; 56.]
Fig. 5.19 Metacontrast magnitude functions at SOAs of 16 ms (full circles, solid line) and 56 ms (open circles, broken line) as a function of mask duration for a target flashed for 16 ms. Note the lower masking threshold for the SOA 56 ms condition relative to the SOA 16 ms condition. (Reproduced from Breitmeyer 1978b.)
effect has a higher mask threshold. No masking is obtained at mask durations of 1, 2, 4, and 8 ms, beyond which increasing amounts of masking are obtained. This difference in masking contrast thresholds corresponds to the lower thresholds of transient vs. sustained channels obtained electrophysiologically in single neurons as well as psychophysically in humans (see sections 5.2.1 and 5.2.4.1). Thus, when mask energy increases relative to that of the target, we can infer that the type B metacontrast obtained at lower mask contrasts and produced by a high-gain, rapid-saturation transient-on-sustained inhibition will make a transition to type A metacontrast at high mask contrasts, which in turn is produced by a low-gain linear intra-channel sustained-on-sustained inhibition superimposed on the former inter-channel inhibition.
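The contrast between the two thresholds in explanation (6) can be made concrete with a toy calculation. The sketch below is not the RECOD model: it only assumes a high-gain component that saturates at an 8-ms mask duration (standing in for transient-on-sustained inhibition at the 56-ms SOA) and a low-gain linear component that is silent up to 8 ms (standing in for sustained-on-sustained inhibition at the 16-ms SOA). The function names, the gain of 0.03, and the unit ceiling are hypothetical choices made for illustration only.

```python
def inter_channel_masking(mask_ms, saturation_ms=8.0, ceiling=1.0):
    """High-gain, rapidly saturating component (hypothetical parameters).

    Grows linearly with mask duration and stays at `ceiling` once the
    duration reaches `saturation_ms`.
    """
    return ceiling * min(mask_ms, saturation_ms) / saturation_ms


def intra_channel_masking(mask_ms, threshold_ms=8.0, gain=0.03):
    """Low-gain linear component (hypothetical parameters).

    Produces no masking up to `threshold_ms` and grows linearly beyond it.
    """
    return gain * max(0.0, mask_ms - threshold_ms)


# Mask durations of the kind discussed in the text (ms).
for duration in (1, 2, 4, 8, 16, 32):
    print(duration,
          round(inter_channel_masking(duration), 3),
          round(intra_channel_masking(duration), 3))
```

Under these assumptions the first component is already at its ceiling by 8 ms, whereas the second only begins to contribute beyond 8 ms, which is the qualitative pattern that would turn a type B function into a type A function as mask duration grows.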
(7) Dichoptic type A forward masking by noise or structure is typically weaker than type A backward masking (Greenspoon and Eriksen 1968; Schiller and Smith 1965; Turvey 1973). This can be explained by invoking an essential asymmetry between forward and backward masking (Fig. 5.20) that depends on the effectiveness of the target’s transient activity in inhibiting the mask’s sustained activity. In forward masking by structure or noise, post-retinal transient activity generated by the target can locally inhibit post-retinal sustained activity of the mask. Therefore the intrusion at cortical levels of mask sustained activity into the sustained channels shared with the target should be less than maximal and hence there should be less masking by integration (Fig. 5.20, open circles). On the other hand, in backward masking by noise or structure, not only does the sustained activity of the mask intrude unobstructed into the sustained channels of the target, but also, at post-retinal levels, the transient activity of the mask can inhibit the sustained activity of the target and facilitate this intrusion; thus, there should be more masking (Fig. 5.20, full circles). Moreover, since these interactions are obtained dichoptically, they very likely exist at cortical levels.
[Figure 5.20 plot: percentage letters correctly identified (y-axis, 0 to 100) as a function of SOA (x-axis, 0 to 250 ms). Legend: Lagging Target; Leading Target.]
Fig. 5.20 The effect, as a function of SOA, on visibility of a consonant-trigram target, measured in percentage correct letter identification, produced by a spatially overlapping pattern mask when the target temporally lags (open circles) or leads (full circles) the mask. (Reproduced from Michaels and Turvey 1979.)
(8) Under monoptic viewing conditions type A forward masking is typically stronger than type A backward masking (Kinsbourne and Warrington 1962a,b; Scharf and Lefton 1970; Schiller and Smith 1965; Turvey 1973). This difference between dichoptic and monoptic masking by structure or noise is attributable to the fact that under monoptic conditions integration of target and mask activity can occur as early as the photoreceptor level and at post-receptor neural levels prior to the centrally located sustained–transient inhibitory interactions. The inference that these masking effects entail integration of target and mask information in common peripheral pathways prior to the later inhibitory interactions among sustained or transient channels is consistent with the finding that these powerful and long-lasting peripheral on-effects produced by mask onset are not obtained dichoptically (Battersby et al. 1964; Turvey 1973), whereas interactions among and between sustained and transient channels can be demonstrated dichoptically (see explanation (10)).
(9) The fact that transient-on-sustained inhibition occurs in backward masking by noise and structure has been demonstrated by Michaels and Turvey (1973, 1979) and Turvey (1973). In these studies the energy or contrast of the masks was appreciably lower than that of the target. Here one would expect not only weak peripheral (photoreceptor) on-effects produced by the mask but also a weak activation of the sustained channels of the mask relative to the target, and consequently little masking by integration should occur in common post-receptor sustained pathways. However, based on Breitmeyer’s (1978b) study (see Chapter 2, Fig. 2.6), we know that even with monoptic viewing lower-energy masks can yield type B metacontrast functions which are produced by transient-on-sustained inhibition. 
Michaels and Turvey (1973) and Turvey (1973) also reported type B backward masking functions when a low-energy structure or noise mask was used, indicating that this form of inter-channel inhibition was also involved in producing their results. This interpretation was given additional clarification by Michaels and Turvey (1979). They showed that with a target/mask energy ratio of 1/2, a type B function was generated under dichoptic viewing whereas a type A function was generated under monoptic viewing. Presumably here the monoptic type A effect, especially at low SOA values, was due to peripheral processes such as photoreceptor on-effects produced by the mask
and common integration of weak-target and strong-mask sustained activity, whereas the dichoptic type B effect was produced by a more central mask transient-on-target sustained inhibition isolated from the peripheral energy-dependent integrative effects obtained monoptically (Battersby et al. 1964; Turvey 1973). In this connection, Purcell et al. (1975) showed that in backward masking by pattern, moving the retinal location of the target and mask stimuli from the fovea to extrafoveal areas can facilitate production of type B masking functions. In section 5.2.1 it was indicated that the P/M cell ratio decreases with retinal eccentricity. Hence, in comparison with foveal stimuli, extrafoveal stimuli should yield weak type A masking by common integration of target and mask sustained pattern information but relatively stronger type B masking produced by transient-on-sustained inhibition.
(10) Type A forward and backward pattern masking as well as type B para- and metacontrast are obtained dichoptically and monoptically (Alpern 1953; Kinsbourne and Warrington 1962a; Kolers and Rosner 1960; Michaels and Turvey 1973; Schiller and Smith 1965, 1968; Turvey 1973; Weisstein 1972; Werner 1940). We know that sustained and transient neurons exist in the primary visual cortex where a majority of visual cells already receive binocular innervation (Hubel 1988; Hubel and Wiesel 1962, 1968, 1977). Consequently, both dichoptic and monoptic masking effects due to either integration in common sustained pathways or inter-channel inhibition should be obtainable.
(11) Type B metacontrast effects decrease in magnitude as the spatial separation between the target and mask stimuli increases (Alpern 1953; Breitmeyer and Horman 1981; Breitmeyer et al. 1981b; Weisstein and Growney 1969). This can be explained on the basis of the spatially restricted receptive fields of sustained and transient neurons and the topographic mapping from the retina to the visual cortex (Brooks and Jung 1973). 
Although transient receptive fields are larger and exert their influence over greater distances than sustained channels (see section 5.2.1), at increasingly larger distances between stimuli their inhibitory effects on sustained neurons should attenuate. In particular, at the cortical level inter-channel inhibition should be maximal within a cortical column of cells and should decrease as the physical separation of retinotopically organized columns subserving different regions of visual space (Hubel and Wiesel 1977) increases. These properties
are captured in the model by the Gaussian convolution kernels that attenuate the strength of interactions as the distance between the neurons increases.
(12) Metacontrast is relatively weak in the fovea (Alpern 1953; Bridgeman and Leff 1979; Kolers and Rosner 1960; Lyon et al. 1981; Saunders 1977). At progressively larger eccentricities, metacontrast becomes more robust (Bridgeman and Leff 1979; Kolers and Rosner 1960; Lyon et al. 1981; Saunders 1977; Stewart and Purcell 1970) and can be obtained at target–mask separations exceeding 1° of visual angle (Alpern 1953; Breitmeyer et al. 1981b; Growney et al. 1977; Weisstein and Growney 1969). The relative weakness of the effect in the fovea is consistent with and can be explained by the following facts.
1. The fovea is characterized by a stronger activity, higher coverage, and higher concentration of sustained channels than transient channels (Azzopardi et al. 1999) (see section 5.2.1).
2. The precipitous decrease in the response strength and relative density of sustained channels with retinal eccentricity accompanied by an increase in the relative response strength and relative density of transient channels is consistent with more robust metacontrast as retinal eccentricity of stimulation increases.
Moreover, since both transient and sustained receptive fields increase with retinal eccentricity we should expect (i) stronger metacontrast with smaller target and mask stimuli in the fovea than in the parafovea or periphery (Bridgeman and Leff 1979; Lyon et al. 1981), and (ii) the inhibitory effect of transient channels on sustained channels to extend over larger target–mask spatial separation in the parafovea than in the fovea (Alpern 1953; Breitmeyer et al. 1981b; Kolers and Rosner 1960; Weisstein and Growney 1969).
(13) Type B contour masking and brightness suppression are obtained during stroboscopic motion (Breitmeyer and Horman 1981; Breitmeyer et al. 1976). 
Specifically, as in the metacontrast paradigm, only the first of two stroboscopic stimuli is masked in accordance with a type B function. The second stimulus is not masked at all (Breitmeyer et al. 1976; Breitmeyer and Horman 1981). Since low spatial frequency transient channels are most likely used to detect rapid motion, a type B suppression of high spatial frequency contour information or contrast information carried in sustained channels activated by the first
stimulus would be expected during stroboscopic motion, just as in metacontrast. Moreover, since a type B masking function was obtained as long as the spatial separation between the first and second stimuli did not exceed a value of about 1.5–2.0°, it is clear that strict contour proximity is not a necessary condition for obtaining such masking functions (see also explanation (12)).
(14) In Chapter 2, section 2.6.1, we noted that simple reaction time to the target (Bernstein et al. 1973a,b; Fehrer and Biederman 1962; Fehrer and Raab 1962; Harrison and Fox 1966; May and Grannis, unpublished observations) or forced-choice detection of target location (Schiller and Smith 1966; Öğmen et al. 2003) are unaffected in metacontrast masking. However, when a pattern discrimination task is used, type B choice reaction time functions can be generated (Eriksen and Eriksen 1972; May and Grannis, unpublished observations). Our model states that, since optimal intra-channel inhibition is obtained when the mask precedes the target, transient channels activated by the target are immune to the intra-channel inhibition by the following transient activity generated by the mask. Moreover, because of the longer response latency of sustained compared with transient channels, the sustained activity of the later flashed mask cannot inhibit the transient activity of the preceding target. Therefore, since both intra- and inter-channel inhibition of the target’s transient response are effectively eliminated, the activity generated by the abrupt onset of the target could easily be detected by transient channels in the visual cortex or the superior colliculus, activated either via direct retinocollicular projections (Hoffmann 1973; McIlwain and Lufkin 1976) or indirectly via corticofugal projections from transient cells in the ipsilateral cortex (Bunt et al. 1975; Finlay et al. 1976; Leventhal and Hirsch 1978; Palmer and Rosenquist 1974; Rosenquist and Palmer 1971). 
In contrast, the type B reaction time function obtained when pattern discrimination is employed is consistent with the existence of inter-channel transient-on-sustained inhibition in metacontrast. These results again indicate that figural information is carried mainly in sustained channels, while location information is carried mainly in transient channels. This conclusion is further corroborated by the fact that in backward masking by noise location information is masked for a shorter interval than is pattern or figural information (Breitmeyer, unpublished findings; Smythe and Finkel 1974) (see Chapter 2, section 2.8), indicating that
[Figure 5.21 plot: Delta RT (ms; y-axis, –20 to 80) as a function of SOA (x-axis, –320 to 320 ms), with paracontrast at negative and metacontrast at positive SOAs. Legend: M/T = 1; M/T = 3; Mean; Model.]
Fig. 5.21 Change ΔRT in reaction times due to contour interactions between the target and the mask as a function of SOA for two M/T contrast ratios. The middle curve corresponds to the average of the M/T = 3 and M/T = 1 data. Error bars represent 1 SE of the mean. The squares are the model predictions. (Reproduced from Öğmen et al. 2003.)
location information is carried in the faster and briefer persisting transient channels whereas pattern information is transmitted in the slower and longer persisting sustained channels. Furthermore, we recently predicted that, in paracontrast, reaction times for target localization should increase (Öğmen et al. 2003; see also Chapter 8, section 8.3). Figure 5.21 shows the change in reaction time (ΔRT) due to contour interactions in paracontrast and metacontrast together with the model predictions. For metacontrast, ΔRT values fluctuate around averages of 5.5 ms and 1.7 ms for M/T energy ratios of 1 and 3, respectively. However, for paracontrast ΔRT values depend strongly on SOA, peaking at SOA = –150 ms. The peak ΔRT values are 28.7 ms and 51.1 ms for M/T ratios of 1 and 3, respectively. Overall, both the data and the model show an inverse U-shaped function for paracontrast and a relatively constant function for metacontrast. For paracontrast, closer examination of the model and the data (in particular for M/T = 3) suggests a finer structure that can be described as an ‘inverse W function’, although the peaks and dips of this function in the model and the data are shifted with respect to each other. The two peaks in the W-shaped function may reflect the separate contributions of inter-channel sustained-on-transient inhibition and intra-channel transient-on-transient inhibition to reduction of the activity of the transient channels responding to the target.
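The timing argument in explanation (14) can be illustrated with a simple arrival-time check. The sketch below is a deliberately crude assumption-laden illustration, not part of the model: it treats each channel's response as a single arrival time (onset plus latency), ignores response duration and gain, and uses hypothetical latencies of 50 ms for transient and 100 ms for sustained channels. The function name is also hypothetical.

```python
def mask_channels_preceding_target_transient(
        soa_ms, transient_latency_ms=50.0, sustained_latency_ms=100.0):
    """List mask channels whose activity arrives before the target's
    transient response.

    The target is flashed at t = 0 and the mask at t = soa_ms, so
    soa_ms > 0 is metacontrast and soa_ms < 0 is paracontrast.
    Latency values are hypothetical placeholders.
    """
    target_transient_arrival = transient_latency_ms
    mask_arrivals = {
        "transient": soa_ms + transient_latency_ms,
        "sustained": soa_ms + sustained_latency_ms,
    }
    return sorted(channel for channel, arrival in mask_arrivals.items()
                  if arrival < target_transient_arrival)


print(mask_channels_preceding_target_transient(60))    # metacontrast
print(mask_channels_preceding_target_transient(-30))   # short paracontrast SOA
print(mask_channels_preceding_target_transient(-150))  # long paracontrast SOA
```

With these placeholder latencies, no mask activity precedes the target's transient response at any positive SOA, which corresponds to the claim that localization and simple reaction time escape metacontrast. At negative SOAs the mask's transient activity leads first and its sustained activity leads only at longer negative SOAs, which is one way to picture how two separate inhibitory contributions could appear at different paracontrast SOAs.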
(15) Kolers (1963) and von Grünau (1978b, 1981) report that when the shape or contour of either of two spatially separated and sequential stimuli is masked by metacontrast, it can still contribute to the perception of stroboscopic motion when the second stimulus follows the first at optimal SOAs. Explanation (14) stated that transient activity generated by a given stimulus can escape the masking effects produced by the following metacontrast mask. Furthermore, since transient channels are presumably involved in spatiotemporal sequence detection (Matin 1975) or the detection of stroboscopic motion (see explanation (13)), the results obtained by Kolers (1963) and von Grünau (1978b, 1981) can be readily explained in terms of the strong (unmasked) transient activity generated by the two sequential stimuli.
(16) Both on- and off-center sustained and transient cells can be found along the entire visual pathway. Moreover, as one progresses up the visual pathway, transient neural responses become progressively more indifferent to the sign of the stimulus contrast (Citron et al. 1981; Schiller et al. 1976). Additionally, since transient channels characteristically respond to onset as well as offset of stimuli (Enroth-Cugell and Robson 1966) (see also section 5.2.3), we should be able to obtain metacontrast contour masking irrespective of the sign of mask contrast as shown by Breitmeyer (1978c) and with mask offset as demonstrated by Breitmeyer and Kersey (1981) and suggested earlier by Turvey et al. (1974). We noted in Chapter 2, section 2.6.4, that Becker and Anstis (2004) recently failed to obtain brightness suppression in metacontrast when the target and mask had opposite contrast signs. As noted, this may be due to the use of different criterion contents in visual masking or to the use of large vs. small stimuli (Chapter 2, note 3). 
(17) Werner’s (1935) investigation of metacontrast indicated that the magnitude of metacontrast suppression of a target pattern is inversely related to the orientation difference between target and mask stimuli. Put differently, his results indicated that metacontrast is orientation specific. This finding is readily explained by the facts that cortical transient as well as sustained neurons are orientation selective (Ikeda and Wright 1975; Stone and Dreher 1973), and that mutual inhibition among cortical orientation-selective cells is itself orientation selective (Benevento et al. 1972; Blakemore et al. 1970; Blakemore and Tobin 1972; Creutzfeldt et al. 1974; Nelson and Frost 1978), extending up to orientation differences of about 30–40°.
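As a rough illustration of the orientation specificity described in explanation (17), the sketch below weights inhibition by the orientation difference between target and mask. The Gaussian form and the width of 17.5° (chosen so that two standard deviations fall at 35°, inside the 30–40° range mentioned above) are assumptions made for illustration; the cited physiology does not specify this function, and the function names are hypothetical.

```python
import math


def orientation_difference(a_deg, b_deg):
    """Smallest angle between two orientations, in degrees (0 to 90).

    Orientation is periodic over 180 degrees, so 10 and 170 differ by 20.
    """
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def inhibition_weight(target_deg, mask_deg, sigma_deg=17.5):
    """Hypothetical Gaussian fall-off of inhibition with orientation
    difference; equals 1.0 when target and mask are parallel."""
    d = orientation_difference(target_deg, mask_deg)
    return math.exp(-0.5 * (d / sigma_deg) ** 2)


for mask_orientation in (0, 20, 40, 90):
    print(mask_orientation, round(inhibition_weight(0, mask_orientation), 4))
```

Under this assumed tuning, inhibition is maximal for parallel target and mask contours and is negligible for orthogonal ones, in line with the inverse relation between masking magnitude and orientation difference.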
(18) The present models specify that to obtain type B metacontrast functions the transient channels activated by the mask must inhibit sustained activity generated by the target. Transient channels, relative to sustained ones, are insensitive to high spatial frequencies. Consequently they are also insensitive to image blur, whereas sustained channels are sensitive to it and yield a weaker response with greater blur (Ikeda and Wright 1972). Therefore the prediction follows that blurring of the mask should not substantially reduce metacontrast of a non-blurred target. Growney (1976) has shown that mask blurring does not reduce the magnitude of metacontrast appreciably. (19) Since high spatial frequency sustained channels have a longer response latency than intermediate or low spatial frequency channels, type B metacontrast should peak at longer SOAs for high spatial frequency targets than for lower spatial frequency ones (see Figs 5.9(b)–5.9(d)). Rogowitz (1983) investigated metacontrast using flanking gratings as a mask and a central grating as a target. She found, as expected, that the SOA at which metacontrast was optimal increased as the spatial frequency of the target stimulus increased. Moreover, as spatial frequency increases, the transient response should decrease in magnitude and increase in latency. Consequently, as the spatial frequency of the flanking mask increases, type B metacontrast should decrease in magnitude, with its optimal value shifting to lower SOAs. Rogowitz (1983) also reported the presence of both of these trends. (20) As noted in Chapter 4, section 4.2.2, target suppression can also be obtained using a single-transient paradigm (Breitmeyer and Rudd 1981; Kanai and Kamitani 2003) in which a brief mask suppresses the visibility of a prolonged sustained peripheral target for several seconds. 
Since the prolonged target stimulus activated only sustained channels, the reduction of its visibility was a result of their activity being inhibited by the transient channels activated by the mask. Moreover, as also noted in Chapter 4, the single-transient paradigm indicates that any single-transient stimulus can activate transient-on-sustained inhibition. Therefore, despite the methodologically necessary use of a two-transient paradigm in metacontrast, contrary to Matin’s (1975) claim, activation of target–mask (T–M) neurons is not required; the activation of transient neurons by the mask alone is sufficient. (21) Metacontrast can also be obtained with transient hue substitution rather than achromatic contrast or brightness transients
(Breitmeyer et al. 1991; Reeves 1981). Reeves obtained type B metacontrast when parafoveal target and mask stimuli (white, yellow, blue, or red) were briefly substituted against a green background of equivalent brightness. Optimal masking was obtained when red target and mask stimuli were substituted against the green background. This may correlate with the fact that, at and near the fovea, most broad-band color-opponent transient cells as well as narrow-band color-opponent sustained cells are maximally sensitive to red and green light. Moreover, such hue-substitution masking is consistent with the fact, noted in Chapter 2, section 2.6.8, that the transient M neurons can respond to hue-substitution stimuli (Krüger 1979; B.B. Lee et al. 1988; Schiller and Colby 1983). (22) In Chapter 2, section 2.6.3, we indicated that background and stimulus luminances affect the locus of the peak metacontrast on the SOA axis. In particular, at lower relative to higher background luminances (Purcell et al. 1974) or stimulus luminances (Alpern 1953), the peak effect occurs at shorter SOAs. Since neurophysiological findings show that the temporal response characteristics of transient neurons tend to converge toward those of sustained neurons as adaptation level (background luminance) decreases, we would expect, among other things (Breitmeyer 1992), the response latencies of transient and sustained channels to converge at lower adaptation levels. Hence, according to our model, the SOA at which peak metacontrast masking occurs should correspondingly shift toward a value of 0 ms. It may be worthwhile reiterating that several prior theoretical explanations proposed by Bridgeman (1971, 1977, 1978), Ganz (1975), Navon and Purcell (1981), and Reeves (1982) predict the opposite shift of the SOA at which optimal metacontrast occurs. Moreover, Matin’s (1975) model, discussed in explanation (20), faces difficulties here. 
With a reduction of background luminance, optimal velocity sensitivity to real and apparent motion shifts toward lower values (Breitmeyer 1973; Crook 1937; Oyama 1970). This may relate to the following facts. 1. The fusion velocity, i.e. the velocity at or above which a moving grating is no longer distinguishable from a uniform field, decreases linearly with decreases of log background luminance (Crook 1937; Oyama 1970), in line with the Ferry–Porter law, which was originally formulated to account for similar background-luminance dependent changes in the critical flicker frequency (CFF) (Ferry 1892; Kelly 1961). 2. The respective upper-limit velocities at which sustained and transient neural responses are barely activated decrease as background luminance decreases (Cleland et al. 1973). Therefore, since the optimal ‘velocity’ characterizing stroboscopic motion as well as the optimal temporal resolution decrease as background luminance decreases, the SOA at which we obtain either optimal stroboscopic motion or optimal T–M neuron activation correspondingly increases. Accordingly, of Matin’s two proposed T–M neuron properties contributing to metacontrast, the first, specifying their optimal response to a given SOA, would predict incorrectly that the corresponding optimal metacontrast effect also shifts to a higher SOA as background luminance decreases, whereas the second property (response latency) would predict correctly a shift to a lower SOA. Here, and possibly also in other situations, Matin’s (1975) model faces the problem of stating which property dominates under high compared with low background luminance and how such shifts from presumably bilateral co-operative to unilateral countervailing interactions of the two properties arise. It would seem from a standpoint of parsimony that the response latency property by itself readily accounts for these background-dependent results without the complications arising when the optimal SOA property is also invoked. Parenthetically, it also may be worth mentioning that Kahneman’s (1967) impossible-motion formulation of type B metacontrast suffers on the same grounds as Matin’s (1975) formulation based on an optimal SOA-dependent response magnitude of T–M neurons. Of course, deciding to eliminate this latter process from explanations of metacontrast does not, as yet, necessitate its elimination in Matin’s (1975) explanation of type B paracontrast. 
A systematic, heretofore unperformed, investigation of type B paracontrast as a function of background luminance should yield the relevant answers.
(23) Besides background luminance, the wavelength or color of a background also affects the magnitude of metacontrast masking, with green or blue backgrounds yielding stronger masking than red ones (Breitmeyer and Williams 1990; Breitmeyer et al. 1991; Edwards et al.
1996; Williams et al. 1991). This result, as noted in Chapter 2, section 2.6.8, is consistent with the fact that uniform red light is known to suppress the activity of M neurons (De Monasterio 1978a,b; De Monasterio and Schein 1980; Dreher et al. 1976; Livingstone and Hubel 1984; Marrocco et al. 1982; Van Essen 1985; Wiesel and Hubel 1966).
(24) Pantle (1971) and Nilsson et al. (1975) demonstrated the existence of selective flicker adaptation (see section 5.2.4.3). The latter investigators found in particular that flicker frequencies ranging from about 8.0 to 24.0 Hz produced the largest adaptation effect. Since transient cells are selectively sensitive to the higher range of flicker frequencies, flicker adaptation should produce a weaker response in transient channels (Breitmeyer et al. 1981a; Green 1981a). Therefore, if flicker adaptation is used prior to a metacontrast presentation, the magnitude of transient-on-sustained inhibition, and hence metacontrast, should decrease. Indeed, Petry et al. (1979) confirmed this prediction in their study of metacontrast.
(25) In Chapter 7, we will review data showing that attention and figural grouping effects can modulate the magnitude of masking functions. Our model can explain these findings if we consider attention and figural effects as ancillary modulatory factors influencing basic mechanisms underlying visual masking. For example, if we assume that attention enhances the activity in the sustained channel at the location of the target, this enhancement would give a competitive advantage to sustained channels over transient channels and thereby reduce the magnitude of the masking effect. An interesting prediction of this formulation is that if attention favors the sustained channels in the sustained–transient competition, attention should degrade tasks carried out by the transient channels. Recently, Yeshurun and colleagues presented evidence in support of this prediction. 
When two brief pulses are presented at the same spatial location, because of temporal integration they are perceived as a single pulse until the ISI reaches a critical value (two-pulse fusion threshold). By using a spatial cueing paradigm, Yeshurun and Levy (2003) showed that the two-pulse fusion threshold was higher when attention was allocated to the spatial location of the target. Furthermore, Yeshurun (2004) showed that such a degradation in temporal resolution was not apparent when stimuli were chosen so as to bias the task toward the parvocellular system (by using isoluminant stimuli or a red background). In summary, our
EXPLANATORY SCOPE OF THE DUAL-CHANNEL RECOD MODEL
model can explain the influence of spatial attention on masking as a bias for the parvocellular system in the parvo–magno competition. (For further effects of attention on masking see Chapter 7.) (26) Since sequential blanking (Mayzner and Tresselt 1970; Newark and Mayzner 1973) (see Chapter 2, section 2.7.1) can be identified with type B metacontrast (Hearty and Mewhort 1975; Mewhort et al. 1978; Piéron 1935), it also falls easily into the explanatory scope of our model. (27) A further implication of the sustained–transient approach to human visual information processing for iconic or visible persistence concerns the decrease of visible persistence both as stimulus duration increases (Haber and Standing 1970; Bowen et al. 1974) and as the spatial frequency of a prolonged stimulus decreases (Bowling et al. 1979; Breitmeyer et al. 1981a; Corfield et al. 1978; Meyer and Maguire 1977) (see section 5.2.4.3). In explanation (16), we noted that pattern stimulus offsets can produce type B backward masking (Breitmeyer and Kersey 1981). As noted in section 5.2.3, stimulus offsets can activate the transient M neurons and thus produce transient-on-sustained inhibition. Consequently, as stimulus duration increases, transient channels activated at offset of the stimulus should retroactively inhibit sustained channels and thus curtail their response persistence beyond the offset of the stimulus. A related line of reasoning applies to the decrement of visual persistence of prolonged low spatial frequency stimuli. Since transient cells are selectively sensitive to low spatial frequencies, their activation at stimulus offset would again retroactively suppress the activity of low spatial frequency sustained channels and thus curtail their persistence beyond stimulus offset. At high spatial frequencies this inhibitory mechanism at grating offset would be absent or greatly attenuated since transient channels are relatively insensitive to these spatial frequencies.
Moreover, Meyer et al. (1975, 1979) found that orientation and spatial frequency adaptation decreased visible persistence selectively. Since selective adaptation for prolonged periods of time reduces the sensitivity of sustained channels but leaves that of transient channels unaffected (Bodis-Wollner and Hendley 1979; Kulikowski 1974) (see section 5.2.4.1), visible persistence in sustained channels should decrease since, because of their lower response after adaptation, they can be more effectively inhibited by the non-adapted transient channels activated at grating offset.
(28) The ‘four-dot masking’ paradigm discussed in Chapter 4, section 4.5, can be readily explained by our model by considering the fact that at eccentric locations target–mask contour proximity and similarity become less critical for metacontrast (explanation (12)). Similarly, as noted in explanation (25), distributed spatial attention leads to stronger masking. (29) The common-onset masking discussed in Chapter 4, section 4.5.2, can be explained by feedforward as well as feedback mechanisms. 1. Consider the possibility that ‘equalization methods’, whether by the criterion of equal perceived brightness or the equal level above detection-threshold, do not fully control for temporal integration of the mask stimulus at all levels of visual processing. Accordingly, one way our model can explain increased masking as a function of mask duration at SOA 0 is by an increased effectiveness of sustained intra-channel inhibition by the mask on the target (see explanation (6)). 2. Another factor that can lead to masking is feedback inhibition by the prolonged presentation of the mask signal. 3. Finally, for intermediate mask durations, inter-channel transient-on-sustained inhibition due to mask offset can contribute to masking. (30) The findings of Lamme et al. (2002), discussed in Chapter 3, section 3.2.1, provide support for the existence of feedforward- and feedback-dominant phases of cortical processing predicted by the RECOD model. Our model can account for their findings by (i) unlumping the cortical network so that V1 is represented separately from subsequent areas and (ii) applying transient-on-sustained inhibition directly to the feedback signals from higher areas back to V1. (31) As discussed below in Chapter 8, section 8.2, the addition of a second mask (M2) to a target (T) and a primary metacontrast mask (M1) sequence can lead to the recovery of the visibility of the target. In particular, Breitmeyer et al.
(1981b) investigated the timing of target recovery in metacontrast and found that M2 had to be presented before M1 in order to lead to target recovery. Furthermore, they found that, when the visibility of the target recovered, there was no concomitant change in the visibility of the primary mask M1. This indicates a dissociation between the visibility of M1 and its masking effectiveness.
When M2 was presented after M1, the opposite effect was observed, i.e. a reduction in the visibility of M1 without a concomitant change in the visibility of the target. Taken together, these results show a double dissociation between the visibility and the masking effectiveness of a stimulus. Our model can explain this double dissociation by noting that the visibility and metacontrast masking effectiveness of a stimulus are associated with distinct processes, i.e. the sustained and transient responses. Furthermore, the dependence of double dissociation on T–M2 asynchrony can be explained by the fact that the main contributor to target recovery is sustained-on-transient inhibition (and to a lesser extent transient-on-transient inhibition). For this inhibition to be effective, M2 has to be presented before M1 so as to inhibit M1’s transient activity. The resulting depression of M1’s transient activity in turn releases its inhibition on the sustained activity of T, leading to target recovery. On the other hand, reduction in the visibility of M1 occurs via inter-channel transient-on-sustained inhibition. For this to occur, M2 has to be presented after M1, as in standard metacontrast. (32) We can now make the following additional predictions. According to the model, M2’s modulations of M1’s visibility originate from the inter-channel transient-on-sustained inhibition while target recovery originates mainly from inter-channel sustained-on-transient inhibition. As discussed in section 5.2.1, M cells have high gain and saturate rapidly as contrast is increased, while P cells have lower gain and are primarily linear as a function of contrast. Consequently, the target recovery effect and the reduction of M1’s visibility produced by M2 as a function of its contrast should exhibit the signatures of parvocellular and magnocellular contrast dependence functions, respectively. Recently, Ögmen et al. 
(2004) tested this prediction by systematically varying the contrast of the secondary mask M2 in an experiment similar to that of Breitmeyer et al. (1981b). Figure 5.22 shows the experimental results together with model predictions. The x-axis is the SOA between the secondary mask M2 and the target T. The y-axis is the amount of change in visibility with respect to the baseline T–M1 sequence (i.e. without M2) in log units. Changes in the visibility of the target are indicated by triangles and changes in the visibility of M1 by squares. Target visibility increases (recovery) for primarily negative T–M2 SOAs, while the visibility of M1 decreases for primarily positive T–M2
[Figure 5.22 plot omitted. Panels: data (left) and model (right). Abscissa: T–M2 SOA (ms), −250 to 250. Ordinate: change in visibility (LU), −0.6 to 0.5. Inset legend: T and M1 curves for M2/M1 contrast ratios of 1.5, 1.0, 0.5, 0.25, and 0.125.]
Fig. 5.22 Triangles plot the changes in the visibility of the target (T) in log units (LU) as a function of T–M2 SOA. Squares plot the changes in the visibility of the primary mask M1. For both symbols, darker gray levels correspond to higher M2/M1 contrast ratios as indicated in the inset. Left and right panels show the psychophysical data and model simulations, respectively. The values falling inside the rectangles are used to plot Figure 5.23. The data represent the average across the observers. (Reproduced from Öğmen et al. 2004.)
FURTHER ASSESSMENTS, COMPARISONS, AND CRITIQUES
[Figure 5.23 plot omitted. Abscissa: M2/M1 contrast ratio, 0 to 1.5. Ordinate: relative change (LU), −0.1 to 0.3. Legend: T data, M1 data, T model, M1 model.]
Fig. 5.23 Contrast dependence of target recovery (triangles) and M1’s visibility (squares). Open and full symbols correspond to data and model, respectively. The data represent the average across the observers. (Reproduced from Öğmen et al. 2004.)
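The contrast-dependence argument of explanation (32) can be illustrated with a minimal sketch. It assumes, purely for illustration, a saturating (Naka–Rushton-type) function for the magnocellular response and a straight line for the parvocellular response, and it treats the M2/M1 contrast ratio as a stand-in for contrast. The function names, gains, and semi-saturation constant are hypothetical; they are not the parameters used in the RECOD simulations.

```python
def m_response(contrast, r_max=1.0, c50=0.1):
    """Hypothetical high-gain, rapidly saturating (M-like) contrast response."""
    return r_max * contrast / (contrast + c50)


def p_response(contrast, gain=0.5):
    """Hypothetical lower-gain, linear (P-like) contrast response."""
    return gain * contrast


def relative_growth(response, low=0.5, high=1.5):
    """Factor by which a response grows when contrast rises from low to high."""
    return response(high) / response(low)


# M2/M1 contrast ratios listed in the inset of Fig. 5.22.
ratios = [0.125, 0.25, 0.5, 1.0, 1.5]
for c in ratios:
    print(f"ratio {c:5.3f}:  M-like = {m_response(c):.3f}   P-like = {p_response(c):.3f}")

print("growth 0.5 -> 1.5, M-like:", round(relative_growth(m_response), 3))
print("growth 0.5 -> 1.5, P-like:", round(relative_growth(p_response), 3))
```

Under these assumptions the P-like response triples when contrast triples, whereas the M-like response barely changes over the same range. This is the qualitative signature the text attributes to target recovery and to the reduction of M1's visibility, respectively.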
SOAs. The model is in good agreement with the data, except for the slightly lower target recovery and much stronger masking of M1. This quantitative difference can be rectified by changing the reciprocal inhibition weights between transient and sustained systems. However, this quantitative difference does not affect the relative changes of these effects as a function of contrast, which is the main focus of the study. To visualize these relative changes, the data points at the optimal target recovery and metacontrast, shown by the rectangles in Figure 5.22, are plotted as a function of M2/M1 contrast ratio in Figure 5.23. Open and full symbols correspond to the data and the model, respectively. Triangles and squares correspond to target recovery and M1’s visibility, respectively. As predicted, target recovery increases more or less linearly with the M2/M1 contrast ratio, while M1’s visibility saturates rapidly. Taken together, these results show that metacontrast masking is driven by signals originating from the magnocellular pathway, and target recovery in metacontrast is driven by signals originating from the parvocellular pathway.
5.6. Further assessments, comparisons, and critiques
The extensive explanatory scope of the two models presented in this chapter already exceeds that of the other major models reviewed in Chapter 4. These two models, besides adequately explaining the main aspects or motifs of pattern masking illustrated in Chapter 2, Figure 2.2,
can also give adequate accounts of many variations on those motifs, which reflect mainly the effects of correlated variations of boundary conditions such as wavelength, intensity, spatial frequency, retinal eccentricity, etc. on the responses of transient and sustained cells. The effects of each of these variables can be specified on the basis of either electrophysiological or related psychophysical results. The most closely related of the models reviewed in this and the previous chapter are Breitmeyer’s sustained–transient model and Ögmen’s RECOD model. The revised version (Weisstein et al. 1975) of Weisstein’s original Rashevsky–Landahl neural-net simulation of metacontrast (Weisstein 1968, 1972) also has similarities with these models. Weisstein’s older model was a fast-inhibition slow-excitation model. In the newer version it is realized, for metacontrast, by transient-on-sustained inhibition of the non-recurrent forward type (negative feedforward as opposed to recurrent inhibition or negative feedback) and for paracontrast by the reverse sustained-on-transient inhibition, implicitly also of the non-recurrent forward type. These assumed reciprocal inhibitory mechanisms correspond to Assumption 3 specifying Breitmeyer and Ganz’s model (see section 5.3). Moreover, as noted in Chapter 4, the revised version by Weisstein et al. (1975) explicitly incorporates the symmetrical physiological masking effects that the stimuli arbitrarily designated as ‘target’ and ‘mask’ can exert on each other. This assumption of the model of Weisstein et al. corresponds to Assumption 1 of Breitmeyer and Ganz’s model. Where the models differ is in combined Assumptions 2 and 4, which in Breitmeyer and Ganz’s model state that paracontrast is realized via intra-channel inhibition, effected particularly in sustained channels, rather than Weisstein et al.’s corresponding assumption of inter-channel, sustained-on-transient inhibition.
Only future research can decide which of the two alternatives is more viable in explaining paracontrast brightness suppression. Even if the intra-channel hypothesis is correct, this does not necessarily imply that sustained-on-transient inhibition fails to manifest itself in pattern masking; in fact, as shown in explanation (31) above and Chapter 8, section 8.2, this particular inter-channel inhibition can manifest itself in the recovery or disinhibition of the target rather than in target-masking effects. Moreover, as argued in explanation (1) above, it may also be involved in a suppression of the cff of the second of two spatially adjacent repetitively flashed stimuli.
However, as it stands now, Weisstein et al.’s model cannot adequately account for the absence of type B metacontrast when simple reaction time or detection rather than brightness perception are used as criterion responses. This is because her initially activated primary target and mask neurons n1 and n21 (see Fig. 5.1) produce a single response to each stimulus and therefore cannot differentiate between fast transient activity and separate slow sustained activity. Moreover, at the level of the secondary neurons where the first differentiation of fast (transient) and slow (sustained) activity is specified, the fast activity acts only to inhibit the central tertiary slow activity laterally without itself being separately channelled to a central tertiary detector level. Hence the fast activity serves only to produce type B metacontrast brightness or contrast suppression, but does not serve to detect the mere presence or location of the target as would be required to account for the fact that simple reaction time and target detection tasks yield no metacontrast. Of course, minor modifications of Weisstein et al.’s model could incorporate these required features. The similarity between Matin’s (1975) model and the sustained–transient models is more remote, particularly in regard to her model’s required activation of T–M neurons. In so far as Matin identifies T–M neurons with transient ones and T neurons with sustained ones, her assumption of a shorter response latency of T–M neurons compared with T neurons does bear a similarity to the sustained–transient model. In fact, this latter hypothesized response latency difference when combined with inter-channel inhibition is essentially equivalent to Assumptions 1 and 3 of the sustained–transient model and the fast-inhibition hypothesis of the model of Weisstein et al. (1975).
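The ingredient shared by these models, a short-latency transient response that inhibits a longer-latency sustained response, can be sketched numerically. The example below is a toy illustration, not the Breitmeyer–Ganz, Weisstein et al., Matin, or RECOD implementation. The Gaussian response profiles, the latencies (50 ms transient, 110 ms sustained), the response width, and the function names are all assumptions chosen for illustration. Masking strength is taken to be the temporal overlap between the mask's transient response and the target's sustained response.

```python
import math

# All values below are illustrative assumptions, not fitted parameters.
TRANSIENT_LATENCY_MS = 50.0   # assumed peak latency of the fast transient response
SUSTAINED_LATENCY_MS = 110.0  # assumed peak latency of the slow sustained response
RESPONSE_WIDTH_MS = 30.0      # assumed temporal spread (SD) of both responses


def response(t_ms, peak_ms, width_ms=RESPONSE_WIDTH_MS):
    """Gaussian-shaped response profile peaking at peak_ms."""
    return math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)


def masking_strength(soa_ms):
    """Overlap of the mask's transient response with the target's sustained response.

    The target is presented at time 0 and the mask at soa_ms; positive SOAs mean
    that the mask follows the target, as in metacontrast.
    """
    total = 0.0
    for t in range(-200, 601):  # 1 ms steps
        mask_transient = response(t, soa_ms + TRANSIENT_LATENCY_MS)
        target_sustained = response(t, SUSTAINED_LATENCY_MS)
        total += mask_transient * target_sustained
    return total


soas = list(range(-100, 201, 10))
strengths = [masking_strength(s) for s in soas]
peak_soa = soas[strengths.index(max(strengths))]
print("SOA of maximal masking (ms):", peak_soa)
```

With these assumed latencies the overlap peaks at an SOA of 60 ms, which is the sustained-minus-transient latency difference, and falls off on either side. This gives the non-monotonic, type B dependence on SOA that the three models are said to predict qualitatively.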
Consequently, based on these similarities, the three models would, at least qualitatively, make the same predictions of type B metacontrast variations with experimental variations of the stimulus conditions listed in the relevant explanations reviewed in section 5.5. As noted before, although the RECOD model starts from a different perspective, it converges to a structure very similar to the sustained–transient model. Furthermore, it expands the sustained–transient model by incorporating recent neurophysiological findings on afferent and cortical networks, by incorporating feedback mechanisms, by proposing additional feedforward- and feedback-dominant phases of operation, by explicitly incorporating a network structure, by
formulating a functional dynamic system framework, and by providing a quantitative description that can be simulated and compared directly with experimental data. The feedback structure of the RECOD model makes this class of models more comparable to other feedback models reviewed in Chapter 4. Nonetheless, the dual-channel aspect of this model makes it significantly different from Bridgeman’s (1971, 1977, 1978) neural-network model, Ganz’s (1975) trace decay–lateral inhibition model, and the non-neural models of Reeves (1981) or Navon and Purcell (1981) discussed in Chapter 4. None of the neural or non-neural models incorporate the distinction between transient response components and slow sustained ones which can reciprocally inhibit each other. Specifically, Bridgeman’s model is based on a single-channel rather than dual-channel approach. His Hartline–Ratliff neural net model assumes a spatially and temporally isotropic excitatory–inhibitory interactive net, for example like that characterizing the eye of Limulus. A time constant, tying the latency of lateral inhibition, and a space constant, specifying the variations of lateral interaction with variations of spatial separation between the target and mask, characterize the lateral inhibitory network. The role of lateral inhibition is not to suppress the target’s pattern information, and thus decrease its signal-to-noise ratio, as in the dual-channel models, but rather to distribute and store target–mask pattern information in the network’s briefly persisting spatiotemporal activity. As a consequence, the role of the mask activity, rather than to suppress the target activity, is to bury or alter it in the persisting spatiotemporal distribution of combined target and mask activities, thus reducing the target’s signal-to-noise ratio by increasing the mask-generated noise.
It is the subsequent process of template matching or cross-correlation of this combined iconic representation (Bridgeman 1978) with a more permanently stored target representation, rather than any latency differences between antagonistic target and mask response components, which yields the type B metacontrast (and paracontrast) functions. Hence it should be apparent that the dual-channel neural models and Bridgeman’s single-channel model of lateral masking are conceptually quite distinct. The RECOD model incorporates feedback (recurrent) connections as in Bridgeman’s model because they are viewed as essential components of visual information processing. However, it
SUMMARY
incorporates a dual-channel structure to avoid the type of spatiotemporal blurring that occurs in Bridgeman’s model so that perceptual dynamics can be organized into transient epochs in which stimuli can be processed individually. Let us note that the problem of blurring between the target and mask processing also exists in the re-entrant model of Enns and Di Lollo. In the simulations of their model, Di Lollo et al. (2000, p. 497) state that ‘At the outset of a new series of iterative cycles, corresponding to the onset of a new perceptual event, the contents of W [working space] are reset to zero’. Thus the fundamental problem of delineating perceptual cycles corresponding to new perceptual events is handled by the modeler and not by the model itself. When a reset mechanism is incorporated directly into the model, as in RECOD, type B backward pattern masking becomes an emergent property of the model, which in turn relegates other mechanisms, such as attention, to secondary modulatory roles.
5.7. Summary
A review of psychophysical studies of spatiotemporal properties of human vision, as measured by separate pattern and movement or flicker thresholds, temporal integration and persistence, reaction time, and the effects of flicker adaptation, all as a function of spatial frequency, points to the existence of sustained and transient channels in the visual system of humans. The existence of sustained and transient channels is also supported by primate neurophysiological data showing two parallel afferent pathways with similar characteristics. The two models discussed in this chapter are based on a dual-channel sustained/transient structure. Breitmeyer’s original masking model was based on five major assumptions or initial conditions.
1. Both target and mask stimuli activate sustained and transient channels.
2. Within channels, inhibition is realized via the center–surround antagonism of neural receptive fields.
3. Mutual and reciprocal inhibition exists between transient and sustained channels.
4. Masking occurs in one of three ways: (i) via intra-channel inhibition, (ii) via inter-channel inhibition, and (iii) via sharing in common sustained or transient pathways of the respective neural activities generated by spatially overlapping target and mask stimuli.
5. Whereas transient channels primarily signal the sudden appearance at a given location or the sudden change of location (motion) of a stimulus, sustained cells process its figural attributes such as brightness, contrast, and contour.
The RECOD model expands the sustained–transient model by incorporating recent neurophysiological findings on afferent and cortical networks, by incorporating feedback mechanisms, by proposing additional feedforward- and feedback-dominant phases of operation, by explicitly incorporating a network structure, by formulating a functional dynamic system framework, and by providing a quantitative description that can be simulated and compared directly with experimental data. This model can give an adequate account of the wide range of visual masking phenomena discussed throughout this book.
Notes
1. A third type of retino-geniculate-cortical pathway is the koniocellular pathway (reviewed by Hendry and Reid 2000). Currently, relatively little is known about this pathway compared with the other two pathways. The properties of cells in this pathway tend to be more mixed and typically intermediate between M and P cells.
2. Initially, the cat X/Y classification was considered analogous to primate P/M classification. However, more recent research suggested that M cells would be representative of cat X and Y cells while the primate P cell would represent a new group specific to primates (Benardete et al. 1992).
3. Burbeck (1981) reported that, when using a criterion-free forced-choice procedure, pattern thresholds were actually lower than flicker thresholds, contrary to the reports of Tolhurst (1973) and Kulikowski and Tolhurst (1973). Burbeck’s (1981) criterion-free procedure employed the following rationale.
In order to establish pattern-detection thresholds of a counterphase flickering grating, observers were asked to discriminate between the counterphase grating and a uniform field flickering at the same temporal frequency and set slightly above threshold contrast. Conversely, to establish flicker thresholds of a counterphase flickering grating, observers were asked to discriminate between the counterphase grating and a stationary grating of the same spatial frequency also set slightly above threshold. Let us examine the pattern threshold procedure first. In our own observations, a counterphase flickering (or moving) grating does not mimic uniform field flicker unless the grating is of sufficiently low spatial frequency; in fact, at higher frequencies one usually sees a ‘ripple’ effect at threshold spread throughout the viewing field rather than uniform field flicker. Since Burbeck (1981) used a field of diameter 8º and a lowest
spatial frequency of only 0.5 c/deg, most of the test gratings would have produced ripple rather than a uniform field flicker effect. In fact, since the two thresholds seem to converge at the lowest spatial frequencies used by Burbeck (1981), particularly when higher flicker rates are employed, it seems likely that even lower spatial frequencies (e.g. 0.25–0.125 c/deg) may actually mimic uniform field flicker and thus reverse the advantage of pattern relative to flicker thresholds, since here ripple effects can no longer be used to discriminate a counterphase grating from a uniform field flicker. Hence, with the range of spatial frequencies (0.5–8.0 c/deg) used by Burbeck (1981), the ripple effect, rather than either uniform field flicker or spatial structure per se, could have been used to discriminate the counterphase grating from a comparison stimulus comprised of uniform field flicker. To overcome this difficulty, it may have been a better procedure to present as a control stimulus not a uniformly flickering field but rather one containing a two-dimensional counterphase grating of the same spatial frequency as the test counterphase grating but composed of random orientations or of two orthogonal gratings each at 45° inclination relative to the fixed orientation of the test grating. Setting this control grating at or slightly above flicker/ripple threshold, we could then have observers discriminate the spatial orientation (i.e. pattern component) of the test pattern from this flicker/ripple effect. With regard to the flicker threshold procedure, an observer was given, as control stimulus, a barely visible stationary grating of the same spatial frequency as the counterphase flickering test stimulus. The task was to discriminate the counterphase test grating from the control grating. However, we might ask here whether the threshold discrimination of the test grating was done only on the basis of a perceived flicker/ripple effect.
After all, observers were required to discriminate the test stimulus from a stationary control stimulus whose pattern or spatial structure was already, albeit slightly, above threshold contrast. It is not clear whether such a procedure is indeed criterion-free as claimed or whether it biases the observer to a more conservative criterion based on pattern plus flicker/ripple detection as opposed to flicker/ripple detection alone. Given this bias, we would expect that (i) the ‘flicker’ threshold may in fact have been based on a flicker/ripple plus pattern criterion, and (ii), as noted in the prior paragraph, the ‘pattern’ threshold might in fact have been based on a ripple rather than a pattern criterion relative to a uniform field flicker criterion. Research using a more extensive signal detection analysis may better establish whether the effects reported by Burbeck are due to threshold sensitivity changes free of criteria or to a biasing of criteria used in her procedure. Until such research is done, the claim of criterion-free pattern and flicker thresholds is questionable.
4. The fact that the latency of later components of the CVEP increases with spatial frequency indicates that they reflect the processing of longer-latency, sustained-channel information. In this regard, CVEP studies of metacontrast, reviewed in Chapter 3, section 3.3.1, show that relative to the CVEP elicited by a target disk, the CVEP elicited by a disk–ring metacontrast sequence is characterized by attenuation of the later components, especially at intermediate disk–ring SOAs of 30–60 ms.
5. The importance of the trade-off between figural synthesis and reset has also been highlighted within the context of the boundary contour system (BCS) model (Francis et al. 1994). The BCS model includes positive feedback (Grossberg and Mingolla 1985a, b) and the analysis of the model showed that without a reset mechanism this
feedback causes extensive persistence (Francis et al. 1994). In the BCS model, a competition between orientationally selective (cortical) mechanisms implements the reset. Francis and colleagues showed that this reset mechanism can form the basis of several visual illusions (Francis and Kim 1999; 2001; Francis and Rothmayer 2003; Kim and Francis 1998). Given that numerous cortical feedback circuits exist, it is highly likely that the nervous system uses multiple reset mechanisms. In the RECOD model, we used a reset mechanism that is outside the feedback loops and directly driven by the external input. A static input can generate time-varying cortical activity, such as oscillatory potentials. Since the transient activity in the M pathway is closely related to the transients of the input (rather than intrinsic cortical time-varying activity), our reset mechanism is designed so that intrinsic cortical time-varying activity does not interfere with figural synthesis. On the other hand, a reset mechanism such as the one used in the BCS model can account for bistable percepts, i.e. switches in perceptual organization in response to certain static stimuli.
Chapter 6
Metacontrast and motion perception
6.1. Introduction
As discussed in Chapter 1, section 1.3.2, similarities between masking and stroboscopic motion were noted by early investigators such as Wertheimer. From the perspective of the stimulus, studies of metacontrast and motion typically use very similar displays consisting of sequential stimulation of adjacent spatial positions. However, one distinction is that a typical motion stimulus is spatially asymmetric, i.e. directional (e.g. proceeding from left to right), while a typical metacontrast stimulus is symmetric (e.g. a disk and a surrounding ring). In Chapter 4, section 4.2, we reviewed theoretical approaches to metacontrast based on activation of spatiotemporal sequence or motion detectors. In the current chapter we revisit only the approach adopted by Burr and coworkers (Burr 1984; Burr et al. 1986) and the RECOD model discussed in the previous chapter, as they are most relevant to the topics covered here. The perceived clarity/blur of moving targets is another phenomenon that reveals a close relationship between metacontrast and motion. A briefly presented stimulus remains visible for a significant time period after its offset, a phenomenon known as visible persistence (see Chapter 1, section 1.5). A direct application of visible persistence to moving stimuli would predict that moving targets should appear highly blurred. This is analogous to taking pictures of moving objects at low shutter speeds. Yet under normal viewing conditions, the perceived blur of moving targets is negligible when compared with estimates derived from the duration of visible persistence. This phenomenon is known as motion deblurring (Burr 1980).
6.2. Motion deblurring and metacontrast
6.2.1. Basic findings
Several studies have shown that the visual persistence of stationary targets is approximately 120 ms under daylight viewing conditions (Haber and Standing 1970; see also Coltheart 1980). Based on this duration of visual persistence, we would expect that moving objects should appear highly smeared. For example, a target moving at a speed of 10°/s should generate a comet-like trailing smear of extent 1.2°. However, under normal viewing conditions, objects in motion usually look sharp and clear, even if the moving object is physically blurred to some degree (Bex et al. 1994; Hammet 1997; Ramachandran et al. 1974). Burr (1980) and Hogben and Di Lollo (1985) measured the perceived extent of the motion smear produced by a field of moving dots as a function of exposure duration. For exposure durations shorter than approximately 40 ms, the extent of perceived smear increased with exposure duration, as would be expected from the visible persistence of static objects. However, for exposure durations longer than 40 ms, the length of the perceived smear was much less than predicted from the persistence of static displays. This reduction of perceived smear for moving objects is called ‘motion deblurring’ (Burr 1980). Contrary to the reports of motion deblurring, it has long been known that isolated targets in real motion exhibit extensive smear, within which lighter and darker regions known as Charpentier’s bands (see Chapter 1, Fig. 1.2(b)) can often be identified (Bidwell 1899; McDougall 1904a). The phenomenon was described by McDougall as follows:
A radial slit 2° in width and 7 cm in length, its mid point 15 cm from the center of the disk, rotating at the rate of 1 rev per 3 s before the glass lit by four burners, appears as a fanlike bundle of narrow bright rays of diminishing brightness from before backwards. Four such rays can usually be distinguished with certainty at this speed [about 9°/s]. They are not separated by distinct dark intervals but appear to overlap one another, and together they fill a sector about 12° width [equivalent to a duration of 100 ms]. (McDougall 1904a, p. 91)
Isolated targets in sampled motion also exhibit prolonged persistence, particularly when the spatial separation between successively presented targets is greater than about 10 (Castet 1994; Di Lollo and Hogben 1985; Dixon and Hammond 1972; Farrell 1984; Farrell et al. 1990).
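The smear expected from persistence alone is simple kinematics: smear length equals speed multiplied by persistence, and a smear length converts back to a duration by dividing by speed. The following minimal sketch reproduces the 10°/s example given above and checks the bracketed equivalence in McDougall's description, assuming that his disk turned at 1 revolution per 3 s (120° of rotation per second). The function names are illustrative and are not taken from any of the cited studies.

```python
def smear_length_deg(speed_deg_per_s, persistence_ms):
    """Length (deg) of the trailing smear if each point persists for persistence_ms."""
    return speed_deg_per_s * persistence_ms / 1000.0


def smear_duration_ms(length_deg, speed_deg_per_s):
    """Convert a smear length (deg) into the equivalent duration (ms)."""
    return 1000.0 * length_deg / speed_deg_per_s


if __name__ == "__main__":
    # Target moving at 10 deg/s with 120 ms of visual persistence.
    print(f"predicted smear: {smear_length_deg(10.0, 120.0):.2f} deg")

    # McDougall's rotating slit: assumed 1 rev per 3 s, sector of about 12 deg.
    rotation_deg_per_s = 360.0 / 3.0
    print(f"sector duration: {smear_duration_ms(12.0, rotation_deg_per_s):.0f} ms")
```

Under these assumptions the first call gives the 1.2° smear quoted in the text and the second gives the 100 ms that McDougall's sector is said to correspond to.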
MOTION DEBLURRING AND METACONTRAST
In order to reconcile the apparently contradictory observations of motion deblurring for a field of moving dots and extensive smear for isolated moving targets, Chen et al. (1995) systematically varied the density of moving dots, ranging from a single dot to 7.5 dots/deg². In addition, they used continuous motion instead of sampled motion to assess whether motion deblurring can be generalized to stimuli in continuous real motion. Figures 6.1 and 6.2 show the duration (duration of perceived smear = perceived length/speed of the target, shown on the left ordinate) and the length (right ordinate) of the perceived smear as a function of exposure duration for target velocities of 5 and 10°/s. Dot density is the parameter in each plot. For comparison, each panel contains a dotted line with a slope of 1, indicating where the duration of perceived smear is equal to the exposure duration (i.e. the duration of visual persistence is equal to or longer than the exposure duration). For
[Figure 6.1: three panels, one per observer (SC, HO, HB). Abscissa: exposure duration (ms), 0–160. Left ordinate: duration of perceived smear (ms), 0–160. Right ordinate: length of perceived smear (deg), 0–0.8. Legend: single dot; 0.75, 1.5, 5, and 7.5 dots/deg².]
Fig. 6.1 The duration (left ordinate) and length (right ordinate) of the perceived motion smear as a function of exposure duration. Different symbols represent data for different stimuli (single dot, and random dot arrays with densities as shown in the inset). The speed was 5°/s. Results for different observers are plotted in different panels. (Reproduced from Chen et al. 1995.)
[Figure 6.2: three panels, one per observer (SC, HO, HB). Axes and legend as in Figure 6.1, except that the right ordinate (length of perceived smear) runs from 0 to 1.6 deg.]
Fig. 6.2 As Figure 6.1 but for a stimulus speed of 10°/s. (Reproduced from Chen et al. 1995.)
exposure durations up to approximately 50 ms, the duration of the perceived smear closely follows the dotted line for all dot densities. However, a pronounced effect of dot density is observed for exposure durations longer than 50 ms. In the case of a single dot, the duration of the perceived smear increases to a value around 100 ms, comparable to the duration of persistence reported for static targets. For the highest dot density, the duration of the perceived smear either reaches an asymptote or decreases at exposure durations longer than 50 ms. Comparison of the results in Figures 6.1 and 6.2 shows that the data depart from the dotted line at approximately the same exposure duration, indicating that the duration of the perceived smear changes in a qualitatively similar way for velocities of 5 and 10°/s. The main difference is quantitative: for exposure durations greater than approximately 50 ms, the duration of the perceived smear is, in general, longer for targets moving at a speed of 10°/s than for targets moving at 5°/s.
Chen et al. (1995) used a bright background of 200 cd/m², and therefore the results in Figures 6.1 and 6.2 show that the extensive smear observed by McDougall (1904a) and Bidwell (1899) for a bright moving slit in a dark field also occurs for a single dot moving against a bright background. At high dot densities, Chen et al.’s data are similar to those of Burr (1980) and Hogben and Di Lollo (1985). This indicates that motion deblurring generalizes to continuously moving stimuli. Furthermore, Westerink and Teunissen (1995) showed that motion deblurring also occurs in more complex ‘natural images’. An important upshot of Chen et al.’s study is that motion deblurring depends critically on the density of dots, i.e. on the spatiotemporal proximity of targets in complex displays. Thus their data offer a reconciliation between studies that reported extensive blur for isolated moving targets and motion deblurring for moving arrays of dots.
6.2.2. Mechanisms of motion deblurring
A large number of models, ranging from conceptual to computational, have been proposed for the motion deblurring phenomenon. As noted in section 6.1, metacontrast and motion deblurring resemble each other both in their stimuli (spatiotemporal sequences) and in their phenomenology (in both cases the visibility of the first stimulus is suppressed), so these models can be viewed explicitly or implicitly as models of metacontrast. Indeed, empirical findings show a striking similarity between motion deblurring and metacontrast in the way they depend on the luminance, spatial and temporal separations, and the eccentricity of the targets. For example, several studies using stimuli in apparent motion have shown that the duration of visual persistence decreases as the spatial separation between successively presented targets is reduced (Castet et al. 1993; Di Lollo and Hogben 1985; Farrell 1984). A similar target–mask separation effect occurs in metacontrast (see Chapter 2, section 2.6.6). Chen et al. (1995) reported that motion deblurring is stronger in the periphery than in the fovea, in agreement with stronger metacontrast in the periphery (see Chapter 2, section 2.6.6). Motion deblurring is closely related to sequential blanking, which in turn can be viewed as a form of metacontrast (see Chapter 2, section 2.7.1).
6.2.2.1. Motion estimation–compensation models
Many models (Anderson and van Essen 1987; Burr 1980; Burr et al. 1986; Martin and Marshall 1993; Pääkkönen and Morgan 1994) rely on a motion estimation procedure which is used to compensate for the adverse blurring effect resulting from the object motion. According to Burr (1980), motion estimation is achieved by the spatiotemporally oriented receptive fields of motion mechanisms. In Burr’s approach, motion and form are analyzed by the same mechanisms. The analysis of the form for a moving object is carried out by spatiotemporally oriented receptive fields whose spatiotemporal orientation matches the velocity of the object. This match implies that the stimulus remains ‘aligned’ or stationary with respect to the receptive field. As a result, no blur is generated across networks of cells that share this particular spatiotemporal orientation. On the other hand, this stimulus will generate motion blur in mechanisms whose spatiotemporal orientations (i.e. velocity tuning) do not match the velocity of the moving target. Burr did not specify how this blur would be suppressed. Martin and Marshall (1993) proposed a similar model in which excitatory and inhibitory feedback connections suppress the persistent activity of neurons along the motion path. The ‘shifter circuit’ model of Anderson and van Essen (1987) uses an estimation of motion in order to generate a cortically localized representation of moving stimuli, thereby avoiding the persistence which would result from the change of cortical locus of neural activities. Pääkkönen and Morgan (1994) proposed a two-stage model in which the second stage carries out a ‘translation-invariant integration’ of moving stimuli. Although no specific mechanisms were suggested, this would presumably require an estimation of target motion to achieve translation invariance. 
According to Nijhawan (1994, 1997), when a moving object’s trajectory is predictable, its perceived position is extrapolated forward in order to compensate for the neural transmission delays. Although no mechanisms were specified, such a motion extrapolation scheme, if it exists, may also be used to remove the blur at the previous locations of the moving object. The effect of exposure duration on motion deblurring can be explained in these models by assuming that motion estimation–compensation mechanisms require time to become effective. All these models predict that motion deblurring should be greater for coherently moving targets than for targets that move in independent random directions. Such an effect has been reported (Watamaniuk 1992). However, this reduction occurred in Watamaniuk’s study only when the spatial separation between the successive dots in apparent motion was about 0.2–0.6°. When the separation between successive dots was 0.1° (i.e. the condition most closely approximating real motion), persistence was comparably brief for both fixed and random trajectories of motion. Another prediction of this class of models is that an isolated moving target should produce no visual blur provided that it stimulates motion estimation–compensation mechanisms sufficiently. This prediction is in contradiction with the extensive blur observed for a moving isolated target (Bidwell 1899; Chen et al. 1995; McDougall 1904a; V.C. Smith 1969a,b; Lubimov and Logvinenko 1993). However, it could be argued that isolated targets do not provide sufficient stimulation for motion estimation–compensation mechanisms.
To test these models more conclusively, Chen et al. (1995) studied motion deblurring using the stimuli shown in Fig. 6.3. In one experiment, motion blur was measured for vertical and horizontal arrays of dots (Figs 6.3(a) and 6.3(b)). Anderson and Burr (1991) compared the extent of spatial summation parallel and orthogonal to the direction of moving gratings and concluded that human motion detectors were equal in length and width (see also Anderson et al. 1991). Motion mechanisms with this geometry should be stimulated equally by horizontal and vertical arrays of dots and, according to motion estimation–compensation models, these stimuli should be perceived as producing equivalent amounts of smear. Chen et al.’s results showed clear differences between the extent of motion blur in vertical and horizontal arrays. In the vertical array, the proximity of the dots did
Fig. 6.3 Stimulus configurations used in motion deblurring experiments. Horizontally moving (a) horizontal array, (b) vertical array, and (c) pair of dots.
not have any significant effect and an extensive motion smear, comparable to the single dot stimulus in Figures 6.1 and 6.2, was observed. However, in the horizontal array, the proximity of the dots produced an effect similar to dot density in Figures 6.1 and 6.2, leading to motion deblurring as the distance between the individual dots decreased. Finally, in a third experiment, the stimulus consisted of two horizontally separated dots that moved in tandem in the horizontal direction (Fig. 6.3(c)). Because each dot should activate motion estimation–compensation mechanisms approximately equally, motion estimation–compensation models predict that the leading and the trailing dots should produce the same blur extent. This experiment provides a stronger test for these models because no assumptions are required concerning the shape or size of the receptive fields of motion-detecting mechanisms. Contrary to the prediction of these models, the results (Fig. 6.4) show that the perceived smear of the leading dot decreases systematically as the dot-to-dot separation is reduced (comparable to
[Figure 6.4: four panels for observer SC. Left column: leading dot; right column: trailing dot. Upper row: 5°/s (right ordinate 0–0.8 deg); lower row: 10°/s (right ordinate 0–1.6 deg). Abscissa: exposure duration (ms), 0–160. Left ordinate: duration of perceived smear (ms). Right ordinate: length of perceived smear (deg). Legend: 0.75, 1.5, 5, and 7.5 dots/deg².]
Fig. 6.4 The duration (left ordinate) and the length (right ordinate) of the perceived motion smear as a function of exposure duration for the stimulus configuration shown in Figure 6.3(c). The left and right panels correspond to the leading and the trailing dots, respectively. Data are for two speeds (upper panels, 5°/s; lower panels, 10°/s) from one observer. Dot-to-dot separations are expressed in units of equivalent densities for comparison with data in Figures 6.1 and 6.2. (Reproduced from Chen et al. 1995.)
the perceived smear obtained for dot arrays of similar density in Figs 6.1 and 6.2), whereas the trailing dot generates extensive smear independent of dot-to-dot separation (comparable to the perceived smear generated by a single dot in Figs 6.1 and 6.2). Taken together, these results provide strong evidence against motion estimation–compensation models.
6.2.2.2. Inhibition-based models
The general idea of inhibition to explain motion deblurring has been proposed by several investigators (Castet 1994; Di Lollo and Hogben 1985, 1987; Dixon and Hammond 1972; Francis et al. 1994; McDougall 1904a; Ögmen 1993). According to these models, the smear of a moving target is reduced because of the inhibition exerted by spatiotemporally adjacent stimuli. Therefore the critical factor determining motion deblurring is not the activation of motion mechanisms, as the previous class of models would predict, but rather the spatiotemporal proximity of a stimulus to the spatial locus of smear. The experiments of Chen et al. (1995) discussed in the previous section were designed to test this class of models as well. The effect of dot density can readily be explained by the separation between adjacent stimuli: the closer the stimuli are to each other, the more inhibition they exert on each other’s trailing smear. A similar effect is predicted for the horizontal array of dots. However, when the dots are positioned into a vertical array, a given dot in the array does not overlap with the trailing smear of its neighbors. As a result, extensive smear is predicted to occur, in agreement with data. In the case of two-dot stimuli, inhibition-based models predict greater deblurring for the leading dot of the pair, provided that the trailing dot is close enough to inhibit the smear of the leading dot. As there is no spatially proximal dot to inhibit the smear of the trailing dot, the extent of its perceived smear should be comparable to that of a single moving dot. These predictions are in agreement with the data of Chen et al. To support these qualitative arguments further, Purushothaman et al. (1998) conducted a simulation study of the RECOD model. Figure 6.5 shows the response of the model to a single moving dot stimulus. The different panels in the figure plot spatiotemporal activity profiles for sustained, transient, and post-retinal cells. The stimulus generates a
[Figure 6.5: four surface plots of activity as a function of space (cell index, 0–200) and time (ms, 0–400). Panels from bottom to top: INPUT; SUSTAINED and TRANSIENT; POST-RETINAL.]
Fig. 6.5 Simulations of the RECOD model for a single moving dot. The lowest panel shows the input, the middle two panels show the activities of retinal transient and sustained cells, and the upper panel shows the activities of post-retinal sustained cells. (Reproduced from Purushothaman et al. 1998.)
blurred profile across the network of retinal sustained cells and a similar blur profile can be observed for post-retinal cells. Figure 6.6 shows superimposed density plots for afferent sustained and transient signals. Because of the relative latency between these signals, the transient activity ‘leads’ the sustained activity and therefore does not inhibit the trailing blur of the moving dot. Figure 6.7 shows the model
[Figure 6.6: density plot. Abscissa: space (cell index), 0–200. Ordinate: time (ms), 0–800.]
Fig. 6.6 Superimposed density plots of afferent sustained and transient activities replotted from the middle two panels of Figure 6.5 to show their spatiotemporal relationship. (Reproduced from Purushothaman et al. 1998.)
[Figure 6.7: duration of blur (ms) plotted against exposure duration (ms), with separate curves for model and data.]
Fig. 6.7 Comparison of model prediction with data from Figure 6.2, averaged across three observers. (Reproduced from Purushothaman et al. 1998.)
prediction superimposed on data for a single moving dot. The duration (or spatial extent) of the blur increases with exposure duration. Figure 6.8 shows model activities for the ‘two-dot’ paradigm following the same conventions as Figure 6.5. While an extensive blur for the leading and the trailing dot can be observed across the network of retinal sustained cells, the post-retinal network shows blur only
[Figure 6.8: four surface plots of activity as a function of space (cell index, 0–200) and time (ms, 0–400). Panels from bottom to top: INPUT; SUSTAINED and TRANSIENT; POST-RETINAL.]
Fig. 6.8 Simulations of the RECOD model for a pair of moving dots. Conventions as in Figure 6.5. (Reproduced from Purushothaman et al. 1998.)
for the trailing dot. Figure 6.9 shows superimposed density plots for afferent sustained and transient signals. It can be seen that the transient activity of the trailing dot overlaps the smear of the leading dot. As a result of post-retinal transient-on-sustained inhibition, the smear of the leading dot becomes suppressed. However, there is no transient activity to suppress the smear of the trailing dot, and thus the trailing dot appears blurred. Figure 6.9 also clarifies why motion deblurring
[Figure 6.9: density plot. Abscissa: space (cell index), 0–200. Ordinate: time (ms), 0–800.]
Fig. 6.9 Superimposed density plots of afferent sustained and transient activities replotted from the middle two panels of Figure 6.8 to show their spatiotemporal relationship. (Reproduced from Purushothaman et al. 1998.)
depends on exposure duration. Because the transient signal has shorter latency and persistence than the sustained signal, the overlap between the transient and sustained activities occurs only when the stimulus is exposed for longer than a critical duration. Figure 6.10 directly compares model predictions with data and shows good quantitative agreement.
6.3. Summary
Because metacontrast and motion involve mechanisms activated by spatiotemporal sequences, there have been several attempts to equate and/or relate these phenomena to each other. A careful parametric analysis shows clearly that they involve different mechanisms. Nonetheless, there is a parametric range wherein stimuli strongly activate both mechanisms. In such cases, the outcome of the experiment can depend critically on the criterion content of the observers. As discussed in Chapter 1, section 1.3.2, and Chapter 2, sections 2.5 and 2.6.1, observers can ‘see’ (i.e. detect) the target based on motion cues even when the visibility of the target is completely suppressed. The perception of figural properties of an object is not a prerequisite to sensing its motion. Motion mechanisms can operate over a heterogeneous set of features and forms (Kolers 1972). It has been argued
[Figure 6.10: two panels (top: leading dot; bottom: trailing dot). Abscissa: exposure duration (ms), 20–160. Ordinate: duration of blur (ms), 0–160. Legend: model, large separation; model, small separation; data, large separation; data, small separation.]
Fig. 6.10 Comparison of model prediction with data from Figure 6.4, averaged with data from two additional observers. Individual data for these observers can be found in Chen et al. (1995). The top and bottom panels correspond to the leading and the trailing dot, respectively. (Reproduced from Purushothaman et al. 1998.)
that phi (formless) and ‘form-cue invariant’ motion perception can play an important role in triggering time-critical behavioral responses such as escape (Stoner and Albright 1992). However, perceiving the form of moving objects is critical in many other tasks. As discussed in this chapter, metacontrast can improve the clarity of form perception by suppressing the smear of moving objects. This occurs when the spatiotemporal characteristics of the stimuli jointly activate metacontrast and motion mechanisms. Thus, under appropriate conditions, metacontrast and motion mechanisms can operate in concert to generate the perception of form in motion. The data discussed in this chapter also showed cases leading to a strong activation of motion mechanisms without a concomitant motion deblurring effect. These findings provide evidence against models where motion plays a causal role in metacontrast.
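The qualitative prediction of the inhibition-based account for the two-dot experiment discussed in this chapter can be illustrated with a deliberately crude sketch. It is not the RECOD model and is not taken from Chen et al. (1995) or Purushothaman et al. (1998). It assumes only that an uninhibited dot leaves a smear lasting for the smaller of the exposure duration and a fixed persistence (100 ms here, an assumed value), and that the smear of the leading dot is cut off when the trailing dot reaches the same location, i.e. after separation/speed. All names and parameter values are hypothetical.

```python
PERSISTENCE_MS = 100.0  # assumed persistence of an uninhibited moving dot


def uninhibited_smear_ms(exposure_ms, persistence_ms=PERSISTENCE_MS):
    """Smear duration of a dot that nothing follows: limited by exposure and persistence."""
    return min(exposure_ms, persistence_ms)


def two_dot_smear_ms(exposure_ms, separation_deg, speed_deg_per_s,
                     persistence_ms=PERSISTENCE_MS):
    """Return (leading, trailing) smear durations in ms for two dots moving in tandem.

    Toy assumption: the trailing dot abolishes the leading dot's smear at a given
    location as soon as it arrives there, separation/speed after the leading dot.
    """
    trailing = uninhibited_smear_ms(exposure_ms, persistence_ms)
    arrival_ms = 1000.0 * separation_deg / speed_deg_per_s
    leading = min(trailing, arrival_ms)
    return leading, trailing


if __name__ == "__main__":
    for separation in (0.2, 0.5, 2.0):  # hypothetical separations in deg
        lead, trail = two_dot_smear_ms(160.0, separation, 10.0)
        print(f"separation {separation} deg: leading {lead:.0f} ms, trailing {trail:.0f} ms")
```

In this sketch the smear of the leading dot shrinks as the separation shrinks, whereas the trailing dot behaves like a single dot, which is the asymmetry described for Fig. 6.4. The sketch does not reproduce the dependence on exposure duration, which in the RECOD account arises from the relative latencies of the transient and sustained signals.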
Chapter 7
Figural context and attention in visual masking
7.1. Introduction
The gestalt rules of perceptual organization and the effects of attention studied in the laboratory reflect two constraints: (i) the structural correlation of elements in natural scenes that are exploited for efficient coding (Geisler and Diehl 2003; Sigman et al. 2001; Simoncelli 2003), and (ii) the capacity limits on focally processing the superabundance of information in such scenes (Posner 1994). Figural context, gestalt rules of sensory organization, and attention are also known to reflect high-level visual processes and, moreover, to interact with each other (Pomerantz and Pristach 1989) (see section 7.2.1). These are important considerations for visual masking, since it is influenced not only by low-level bottom-up processes but also by high-level top-down ones. Evidence exists for effects on masking of mechanisms at a number of visual processing stages, from the earliest pre-attentive and preconscious ones to those that engage gestalt grouping and attentional processes (Bachmann 1994; Boyer and Ro, in press; Breitmeyer 1984; Enns and Di Lollo 1997; Foster 1976, 1977, 1978, 1979; Lamme et al. 2002; Michaels and Turvey 1979; Ramachandran and Cobb 1995; Shelley-Tremblay and Mack 1999; Turvey 1973; Vidnyánszky et al. 2001; M.C. Williams and Weisstein 1981). In our opinion, these are not unexpected or surprising findings. Since the effects of figural contrast, context, and attention are ubiquitous and known to affect the quality and quantity of processing in many tasks besides visual masking (Bashinski and Bacharach 1980; Frith and Dolan 1997; Han and Humphreys 1999; Handy et al. 1996; Kapadia et al. 1995; King et al. 1993, 1995; Pomerantz and Pristach 1989; Posner and Petersen 1990; Posner and Rothbart 1994; Prinzmetal and Banks 1977; Saarinen et al. 1997), there is no a priori reason to rule out such effects in visual masking. We shall return to this point when we discuss the implications of these effects for theories of visual masking. Below we discuss the effects of figural context and attention, keeping in mind that the two effects sometimes go hand in hand (Freeman et al. 2001).
7.2. Effects of perceptual context and grouping
7.2.1. Effects of spatiotemporally grouping the target with the mask display
In several psychophysical investigations Weisstein and coworkers (Berbaum et al. 1975; Lanze et al. 1985; Weisstein and Harris 1974; Weisstein and Maguire 1978; Weisstein et al. 1973; A. Williams and Weisstein 1978) (see also McClelland 1978) demonstrated that the visibility of a briefly flashed target element was enhanced when, in conjunction with the following mask, it comprised a three-dimensional or connected object or a higher-level emergent feature. This object-superiority effect is analogous to the word-superiority effect (Baron and Thurstone 1973; Egeth and Gilmore 1973; Johnston and McClelland 1974; Reicher 1969; Smith and Haviland 1972; Wheeler 1970) in which a briefly flashed letter is typically recognized better when, in conjunction with the following mask, it comprises a word rather than a meaningless and unpronounceable string of letters. Typical examples of context patterns varying in perceived
Fig. 7.1 Typical stimuli employed to study the object-superiority effect. Observers are required to identify one of the four line segments either alone as shown in (c) or in the context of decreasing apparent three-dimensionality or depth as shown in (a), (b), and (d). (Reproduced from A. Williams and Weisstein 1978.)
three-dimensionality are shown in Figure 7.1. Figure 7.1(a) shows context patterns that appear strongly three-dimensional. Figures 7.1(b) and 7.1(d) show context patterns that appear progressively less three-dimensional, to the point of looking flat in Figure 7.1(d). In Figure 7.1(c) the diagonal target lines, which are parts of the context patterns shown in Figures 7.1(a), 7.1(b), and 7.1(d), are represented without any context pattern. Using these stimuli, A. Williams and Weisstein (1978) investigated the effects of the various context patterns (including no context) on the perceptual identifiability of the diagonal test lines when the stimuli were flashed for 20 ms. Performance differences in the three stimulus conditions displayed in Figures 7.1(b), 7.1(c), and 7.1(d), relative to the condition displayed in Figure 7.1(a), are shown for four separate experiments in Figure 7.2. Note the progressive decline in performance as the context pattern shifts from a strong apparent three-dimensional object to a flat pattern. In fact, in Experiments 1 and 2, the flat-context condition yielded poorer performance than the no-context condition, indicating that the context acted as a noise source and impaired identification.
[Figure 7.2: values listed row by row in the order of the original figure; the first row is the reference context of Fig. 7.1(a).]
          Exp 1     Exp 2     Exp 3     Exp 4
Row 1     --        --        --        --
Row 2     –2.4%     –4.7%     –7.7%     –5.8%
Row 3     –11.2%    –15.3%    –11.6%    –14.5%
Row 4     –16.0%    –20.0%    –10.2%    –9.3%
Fig. 7.2 The mean differences in accuracy of identifying the line segments between the object or highest apparent depth context (Fig. 7.1(a)) and the other three contexts (Figs 7.1(b)–7.1(d)). Separate results are shown for four experiments. (Reproduced from A. Williams and Weisstein 1978.)
Performance in the first two display conditions, where the contexts combined with the target lines appear three-dimensional, was significantly better than that in the no-context condition. Such findings are important for theories of pattern recognition and perception. Many recent theories (e.g. Biederman 1987; Treisman 1988; Ullman 1989) view the process of pattern recognition as constructional and, in part, sequential. They assume that the first step is the extraction and identification of visual primitives or component features of a pattern, followed by constructive binding of such features into more complex representations, which eventually result in a representation of an object. Such an approach, appropriately called the bottom-up approach, is consistent with the increasing specificity and complexity of contour feature activation in progressively higher cortical visual areas (Desimone et al. 1984; Fujita et al. 1992; Hubel 1988; Kobatake and Tanaka 1994; Pasupathy and Connor 2002; Tanaka et al. 1991; Tsunoda et al. 2001). However, the fact that three-dimensional or other object contexts facilitate identification of a target line relative to the no-context conditions indicates, in turn, that the output of the target line from the feature encoding level is facilitated by the contextual object representation. This dependency of feature extraction on context suggests that a global perceptual construction of the object occurs either in parallel with or prior to the encoding or transfer of local feature information to subsequent levels (Navon 1977; Sugase et al. 1999; Watt 1988), i.e. global pattern processing at a same-level or higher stage may facilitate the encoding or transfer of local features. Consequently, as well as relying on bottom-up activation, the perceptual process may also incorporate a top-down feedback activity or a same-level horizontal modulation between the representations of contexts and local features (Hupé et al. 1998; Kapadia et al. 1995; Lamme 1995; Li and Gilbert 2002; Rossi et al. 2001; Spillmann and Werner 1996).
The time-course of object context effects has also been investigated using metacontrast by M.C. Williams and Weisstein (1981; see also Purcell and Stewart 1991). Here, the brief diagonal target lines are followed at varying SOAs by patterns defined by depth or connectedness. Figure 7.3 shows an example of six depth and three connectedness patterns with their associated ratings of apparent depth and connectedness. Typical metacontrast functions obtained with the different context stimuli serving as masks are shown in Figure 7.4. Overall, less
[Figure 7.3: (a) depth ratings of the six depth context patterns, top to bottom: 8.7, 8.1, 7.2, 5.5, 3.2, 2.9. (b) Connectedness ratings of the three connectedness context patterns, top to bottom: 7.7, 6.0, 2.4.]
Fig. 7.3 (a) Six apparent depth context patterns with decreasing associated depth ratings from top to bottom as shown. (b) Three apparent connectedness context patterns with decreasing connectedness ratings from top to bottom as shown. (Reproduced from M.C. Williams and Weisstein 1981.)
masking of the target lines was obtained with greater depth and connectedness of the context stimuli. However, two trends may be of potential interest. First, while variations of the level of context depth affected the optimal masking SOA, variations of context connectedness did not. In particular, the optimal masking SOA decreases from 150 to 60 ms as the level of context depth decreases, but maintains a value of 80 ms for all levels of context connectedness. On the other hand, the magnitude of masking is affected much more by variation of connectedness than variation of depth. While masking magnitude increases only slightly as depth decreases, it increases markedly as connectedness decreases. Presumably these differences relate to (as yet unexplored)
[Figure 7.4: percent accuracy (40–80) plotted against SOA (0–270 ms). (a) Depth: curves for depth ratings 8.7, 8.1, 7.2, 5.5, 3.2, and 2.9, plus baseline. (b) Connectedness: curves for connectedness ratings 7.7, 6, and 2.4, plus baseline.]
Fig. 7.4 (a) Metacontrast masking functions of the test line segments by the six depth context patterns as shown in Figure 7.3(a). (b) Metacontrast masking functions of the test line segments by the three connectedness context patterns as shown in Figure 7.3(b). (Reproduced from M.C. Williams and Weisstein 1981.)
differences in the processing of flat-appearing patterns relative to patterns that appear to have an additional depth dimension. The gestalt factors operative in these two situations may not be identical and thus may reflect different spatial or temporal properties (Kurylo 1997). Additional effects on metacontrast have been obtained when the target can be spatiotemporally integrated with the mask by the gestalt factors of collinearity, symmetry, or similarity (Havig et al. 1998; M.C. Williams and Weisstein 1980). For instance, Havig et al. (1998)
EFFECTS OF PERCEPTUAL CONTEXT AND GROUPING
used a target element that was flanked by two mask elements that could be located to the right and left of (horizontal alignment), or above and below (vertical alignment), the target. Prior to the target–mask sequence, a visual cue signaled the horizontal or vertical mask alignment correctly in 80 percent of the trials (valid trials), and incorrectly in the remainder. The target was significantly easier to detect (weaker masking prevailed) in validly cued than in invalidly cued trials, i.e. when the target was collinear with the validly cued mask alignment. However, Williams and Weisstein (1980) found that spatiotemporal integration of the target and mask via symmetry and similarity can enhance rather than reduce the masking effect. Specifically, when the target elements could be perceptually integrated with the mask, they were presumably embedded in a larger perceptual grouping that rendered their detection more difficult in this case¹ (Prinzmetal and Banks 1977) rather than easier as in other cases (Pomerantz et al. 1977).
7.2.2. Effects of spatially embedding the target within a larger target gestalt
In recent years, additional studies have shown that a target element’s visibility in backward masking is affected not only by its fit into a larger target–and–mask configuration but also by how it fits into a target–only configuration. Although fast and slow response components in areas as early as V1 may process visual information in a distinct manner (Müller et al. 2001; Lamme et al. 2000), the effects of context on the masking of simple targets like oriented lines or textures manifest themselves at the earliest (Vidnyánszky et al. 2001) as well as later (Lamme et al. 2002) response levels in cortical processing. These two masking effects may in turn depend on distinct forms of contextual modulation produced by intrinsic horizontal connections (Kapadia et al. 1995) and by feedback from higher to lower visual areas (Lamme et al. 2000). Our review below will include context-dependent effects, which probably depend on both kinds of modulation; however, the extent to which each contributes to masking effects is unknown, although intelligent guesses might be made. One example of target-configuration effects is that inverted faces or faces with scrambled features are more prone to be masked than are upright unscrambled faces (Purcell and Stewart 1988; Shelley-Tremblay and Mack 1999). Since facial recognition can proceed holistically or by
component features (Cabeza and Kato 2000; Farah et al. 1998; Tanaka and Farah 1993; Tanaka and Sengco 1997), a facial-superiority effect (Homa et al. 1976; van Santen and Jonides 1978), analogous to the object-superiority effect (Weisstein and Harris 1974), would render facial components more visible and thus more useful for the detectability of a face. More directly discernible effects of target configuration have been reported by Ramachandran and Cobb (1995). In one experiment the central one of three horizontally collinear disks could be masked by an array of four vertically aligned rectangles, with the two central rectangles flanking the central target disk. At a masking SOA of 116 ms, the rated visibility of the target disk was significantly higher when observers attended to the target array than when they attended to the mask array. While both the mask and target arrays produced grouping by collinearity, the grouping of the target array was strengthened by the additional factor of similarity and thus could lead to increased visibility of the central disk element when observers attended to the array. Our own (unpublished) observations of metacontrast masking, using subjective ratings of target visibility, showed that collinearity of the target array did not decrease the masking magnitude. Specifically, compared with presenting an isolated horizontal target bar, embedding the same bar in an array of (horizontally) collinear bars did not decrease the masking, or alternatively increase the visibility, of the target element. This is a counterintuitive result, since studies of detection thresholds show that embedding a briefly flashed target element in an array of collinear suprathreshold elements increases its threshold visibility (Dresp 1993; Dresp and Grossberg 1997; Polat and Sagi 1993, 1994).
However, effects of collinear context may be contrast dependent, varying from facilitation of visibility at lower target contrasts to no change or suppression of visibility at higher contrasts (Polat et al. 1997; Wehrhahn and Dresp 1998). This is consistent with the finding that grouping of concurrently presented high-contrast elements may actually reduce activity to visual elements in primary visual cortex (V1) (Murray et al. 2002). In our experiments both the target and mask were at a maximal contrast of 1.0, and therefore we might not have optimized conditions for obtaining facilitatory effects. However, King et al. (1995) demonstrated independently that the gestalt factors of similarity and proximity affect the magnitude of metacontrast. They showed that the masking of
ATTENTIONAL EFFECTS
a horizontal target line by two adjacent and collinear mask lines was reduced when a second line, adjacent and parallel to the target line, was included in the target array. Moreover, this reduction of masking magnitude decreased as the separation between the two lines in the target array increased.
7.3. Attentional effects
Several of the studies cited above (Havig et al. 1998; Ramachandran and Cobb 1995) may have combined the effects of selective attention and grouping, and others may have confounded what some investigators (Abrams and Law 2000; Duncan 1984; Egly et al. 1994a,b; Iani et al. 2001; Lamy and Egeth 2002) have termed space- or location-based attention with object- or configuration-based attention. Consequently, it is not entirely clear to what extent the modulation of masking magnitude was affected by grouping per se or by space-based or object-based attention. Below we discuss findings of studies that more clearly isolate effects of visual selective attention. Recent electrophysiological and brain-imaging findings indicate that activity in areas as early as V1 (Di Russo and Spinelli 1999; Gilbert et al. 2000; Lamme and Spekreijse 2000; Noesselt et al. 2002; Roelfsema et al. 1998; Somers et al. 1999; Watanabe et al. 1998), and even as early as the LGN (O’Connor et al. 2002), can be modulated by attention, although such early modulation may depend on feedback from higher cortical visual areas. Moreover, since the effects of visual selective attention on detectability of a target may be more prominent when multi-element target displays are used (Di Lollo et al. 2000; Reynolds et al. 1999; Motter 1993; Tata 2002), we would expect the effects of selective attention to interact with the effects of multi-element figural context discussed above. Thus we should be able to discern various attentional effects, each depending on the types and levels of visual information processing used in a perceptual task.
7.3.1. Evidence of space-based attentional effects in masking
The results of Experiment 1 of Ramachandran and Cobb (1995), although supporting the existence of attentional effects in metacontrast, are ambiguous as to their source. The target and mask display is shown in Figure 7.5. The target array consisted of three disks (A, B,
Fig. 7.5 An example of target and mask element configurations used to study gestalt grouping effects and attention in metacontrast. Observers were asked to group the non-target element (B) with either the target element (A) or the other non-target element (C). (Reproduced from Ramachandran and Cobb 1995.)
and C), and the mask comprised two rectangles flanking target disk A. It was found that when disks A and B, rather than C and B, were grouped via attentional instruction, masking of disk A was reduced. Ramachandran and Cobb (1995, p. 373) correctly concluded ‘ . . . that when visual attention is used to bind the target disk with other adjacent features in the image, masking is reduced considerably’. Although we agree, it is not clear whether the binding or grouping per se is of primary significance, as would be predicted from an object-based interpretation, or whether it simply serves space-based attention by shifting the spatial ‘center of gravity’ of attention toward the target when it is bound with the neighboring disk B. We believe that the latter plays the major role, and therefore we have tentatively included this result as supporting the role of a space-based attention in modulating metacontrast magnitude. More direct evidence of space-based attention was reported by Averbach and Coriell (1961), who studied metacontrast using either a single-element or a multi-element target display. They found weaker metacontrast in the former than in the latter condition. In this study the location of a single target was entirely predictable since it did not vary over trials. Hence attention could be directed to the location of the target even before a target–mask sequence. However, in a multi-element display the location of the relevant target is not known until the surrounding masking stimulus appears. Until that time the space-based attentional system is probably in a diffused state. Hence focusing of attention on the mask-designated target location is delayed.
Moreover, since visual pattern sensitivity to stimuli falling at a given spatial location can be enhanced by the direction of attention to that location (Bashinski and Bacharach 1980; McAdams and Maunsell 1999; P.L. Smith 2000; P.L. Smith and Wolfgang 2004; P.L. Smith et al. 2004; Treue 2001), the more expeditious allocation of attention to the single-target display should enhance sensitivity relative to the later allocation of attention to the target in the multi-element display. Consequently, metacontrast will be weaker in the former than in the latter situation. Additional evidence for this interpretation derives from studies conducted by Spencer (1969) and Spencer and Shuntich (1970) who, similarly to Averbach and Coriell (1961), investigated backward pattern masking with a single-letter and a 12-letter target array. In particular, Spencer and Shuntich (1970) investigated backward pattern masking at three mask-energy levels. With the single-letter array, the magnitude of backward masking increased as mask energy increased but was eliminated for all mask energies at a target–mask asynchrony of about 100 ms. However, with the 12-letter array all three mask energies yielded not only more pronounced but also more prolonged backward masking effects extending up to SOAs of 300 ms. Each of these results has been replicated by Enns and Di Lollo (1997, Experiment 3) and Tata (2002, Experiment 1). Additional direct evidence for space-based attentional modulation of metacontrast has also been reported recently (Boyer and Ro, in press; Enns and Di Lollo 1997; Havig et al. 1998; Tata 2002; Tata and Giaschi 2004). Enns and Di Lollo (1997, Experiment 1) varied spatial attention by varying the spatial uncertainty of a target–mask sequence. Compared with a condition in which the target–mask sequence was presented at a spatially uncertain location, masking was weaker in the spatial certainty condition.
Moreover, like increases of items in the target array (Averbach and Coriell 1961; Spencer 1969; Spencer and Shuntich 1970), an increase in spatial uncertainty shifted the optimal masking SOA to higher values. Boyer and Ro (in press), Havig et al. (1998), and Tata (2002) used the spatial cueing paradigm developed by Posner (1980) and found, as expected, that when the target–mask location was validly cued, masking magnitude was weaker than when it was invalidly cued. In a clever experiment, Tata (2002) additionally manipulated attentional cueing of location by exploiting the ‘pop-out’
effect, according to which attention is drawn to an ‘odd’ singleton item embedded in a number of distractor items (Treisman 1988; Treisman and Gelade 1980). The masking was much weaker when the target was a pop-out item in a multi-item target display (thus attracting attention) than when it did not pop out.
7.3.2. Evidence of object- or feature-based attentional effects in masking
Given the existence of inattentional blindness (IB) and, of course, attentional seeing (Mack and Rock 1998), the work reviewed in section 7.2.1, especially that of Purcell and Stewart (1988), Havig et al. (1998), Ramachandran and Cobb (1995, Experiment 2), and Shelley-Tremblay and Mack (1999, Experiment 1), indicates that object-based attention can modulate backward masking magnitude. From the assumption that our perceptual object and scene representations are constrained by more or less likely (i.e. statistical) spatiotemporal regularities in the world (Marr 1982), it follows that each of these representations in turn is more or less typical, salient, or familiar to us. Hence, we would expect more familiar, typical, or salient visual objects, in so far as they resist IB, to be masked less than items that are less resistant to IB.
7.3.3. Evidence for the effects of central attentional mechanisms in masking
Michaels and Turvey (1979), who explicitly incorporated attentional mechanisms in their model of backward masking, proposed a multiprocess model of masking along the following lines. They argued that the masking effects at the shorter SOAs were more likely to be due to target–mask sensory interactions (integration or suppression of target activity with or by the mask activity), whereas the prolonged masking effects extending to the longer SOAs, also reported by Spencer and Shuntich (1970) and Enns and Di Lollo (1997, Experiment 3), were more likely to be due to target–mask attentional interactions, in particular to an interruption by the aftercoming mask of central attentional mechanisms devoted to the transfer of target information from iconic memory to a level of non-visual categorical representation, such as verbal short-term memory (Neisser 1967; Sperling 1960, 1963, 1967), that is immune to visual masking effects. Arguing that iconic read-out of meaningful information should proceed more efficiently than that
of meaningless information, Michaels and Turvey (1979) found that the masking strength and temporal extent depended on higher-order semantic information in the letter strings comprising their target and mask stimuli, results and interpretations anticipating related ones reported more recently by Shelley-Tremblay and Mack (1999). In contrast with Michaels and Turvey’s (1979) process of attentional interruption, Enns and Di Lollo (1997) prefer to introduce a new process called object substitution. This process, based on re-entrant cortical activation (Edelman 1987; Zeki 1993), requires modulation of central attentional states in order to produce modulations of masking magnitude (see also Bachmann 1999). Whether object substitution is distinct from attentional interruption is not clear. Enns and Di Lollo (1997) argue for a distinction between the two. The distinction is based on analysis of false-positive reports in studies of the attentional blink (AB) in which RSVPs of spatially overlapping items are used (Giesbrecht and Di Lollo 1998; Isaak et al. 1999). Here, the item immediately following the designated target acts as an object-substitution mask. Thus this item, rather than the designated target, is reported. However, subsequent work by Giesbrecht et al. (2003) fails to support the hypothesis that object substitution plays a major role in the AB. In fact, these investigators conclude, in line with interpretations proposed by Breitmeyer et al. (1999), that masking effects in the AB are due to early peripheral rather than late central visual processes. Nonetheless, subsequent research (Di Lollo et al. 2000), using the common-onset masking paradigm (Bischof and Di Lollo 1995; Di Lollo et al. 1993) (see also Chapter 2, section 2.6.4), does point to a distinction between interruption and object-substitution masking.
Recall that interruption masking in the Michaels and Turvey (1979) model refers to an interruption in the attentive read-out of information from iconic memory. Recently, Lamme (2003) proposed that iconic memory corresponds to what Block (1995, 1996) has termed phenomenal consciousness, whereas post-iconic levels of processing correspond to what Block has termed access consciousness. Moreover, Lamme (2003) argues convincingly that only items in phenomenal consciousness that are attended to can be transferred to access consciousness, a line of reasoning that resonates well with the hypothesis that read-out from iconic memory to more permanent visual (or nonvisual) representations available to conscious report requires attention.
However, the phenomenology of common-onset masking (supported by objective measures) indicates that target information fails to attain phenomenal consciousness or the iconic level of processing. Given the validity of Lamme’s argument, the object substitution produced by common-onset masking cannot be the same as interruption of information transfer from phenomenal awareness (iconic memory) to access consciousness (post-iconic levels). While these topics already suggest interesting links between attention and consciousness, we reserve fuller discussion of the important relations between attention and consciousness for the next chapter (see Chapter 8, section 8.5).
7.4. What role do attention and figural grouping play in metacontrast?
The ubiquitous roles of space- or object-based attention and gestalt grouping in all sorts of visual (and non-visual) processing that we noted in section 7.1 indicate to us that they play no specialized constitutive role in metacontrast. Thus we agree with Kirschfeld and Kammer (1999, 2000) that metacontrast and attention produce separate and opposing effects, the former decreasing or retarding and the latter increasing or accelerating the visibility of the target (see also Bachmann 1999; Bachmann and Põder 2001). The same reasoning applies to the opposing effects of figural context and metacontrast on the visibility of a target (King et al. 1993; Ramachandran and Cobb 1995). The question remains as to what distinctive process constitutes the mechanism underlying metacontrast. In our opinion, the signature feature of metacontrast pointing to a constitutive masking mechanism, as distinct from a specific type of visual masking method, is the typical but counterintuitive U-shaped masking effect obtained as a function of SOA. In this regard, several of the recent investigations (Boyer and Ro, in press; Havig et al. 1998; Ramachandran and Cobb 1995; Shelley-Tremblay and Mack 1999) showed only that attention to the configurational or positional information of the target can decrease the magnitude of metacontrast while still preserving the typical U-shaped masking function. Even when certain experimental conditions (Enns and Di Lollo 1997; Tata 2002) can produce an elimination of backward masking, and thus of the type B effect when attention is directed to the target–mask display, we can demonstrate in other studies, in which
SUMMARY
attention is focused on a fixed readily anticipated target location (Breitmeyer 1978a) or selectively directed to the location of one of several possible target locations (Carter et al. 2003), that clear and strong type B effects are obtainable. Thus the signature nonmonotonicity as a function of SOA can be demonstrated to be the defining characteristic of metacontrast mechanisms regardless of whether or not attention is allocated to the target. Hence we believe that the top-down modulatory effects of attention and figural grouping on metacontrast (to the point of eliminating type B metacontrast effects) are analogous to the bottom-up modulatory effects of stimulus wavelength that we discussed in Chapter 2, section 2.5.8. These bottom-up effects are produced by visual processes that influence the magnitude and timing of metacontrast but are not solely responsible for producing the U-shaped metacontrast effect, as once (wrongly) proposed by Alpern (1953) (see Chapter 2, section 2.6.8). Exceptions to this argument are the model of metacontrast proposed by Bachmann (1984, 1994, 1997) and the object-substitution approach developed by Enns and Di Lollo (1997) and Di Lollo et al. (2000). As discussed in Chapter 4, section 4.5.2, the perceptual-retouch approach posits a truly constitutive role of attention (a spatially localized target-induced activation of the non-specific arousal system that is appropriated by the mask) in producing metacontrast. But even here, to account for the results of selective spatial attention on masking reported by Boyer and Ro (in press) and Carter et al. (2003), an additional cue-induced direction of selective attention to the location of the target would merely counteract the masking effects. In the object-substitution approach it is the relative lack of attentional resources to a particular target location that contributes to greater backward or common-onset masking.
Since re-entrant cortical activity is assumed to play a critical or constitutive role in object-substitution masking, attention here also appears to modulate the masking effect produced by re-entrant activation rather than constitute the masking effect itself.
7.5. Summary
The effects of gestalt context and attention reflect constraints on the statistical distribution of elements in natural visual scenes and on the visual system’s limited ability to process these elements
simultaneously. In visual masking, the context for the target can be provided either by the aftercoming mask or by a target configuration which, together with the target element, forms a gestalt. In the former case, if the target element can be integrated into the mask context to form a coherently connected flat pattern or a three-dimensional object, it is masked the less the greater the rated connectedness or object quality of the integrated target–mask percept. The time courses of the connectedness and object-superiority effects during metacontrast differ, most likely reflecting different temporal properties governing the operation of different gestalt factors of figural organization. At times the target–mask integration makes the target less detectable, particularly if the target is a figure in its own right embedded in a larger figural target–mask gestalt. In contrast with the context effects of target–mask gestalt grouping, when the target element is part of a larger target gestalt (e.g. when it is a feature of a normally arranged face), it typically (but not invariably) is masked less than when it is a part of a meaningless arrangement of target elements (e.g. scrambled facial features). Both space-based and object- or feature-based deployment of visual attention can modulate the magnitude of metacontrast and backward masking. Typically, attending to the location or figural aspects of the target enhances its visibility. In addition to these stimulus-driven forms of attentional allocation, more centrally controlled attentional processes that mediate the transfer of information from iconic levels to post-iconic levels of processing also play an important role in determining an observer’s performance during backward masking. While attention is incorporated as an essential constitutive feature in one or two theoretical approaches to backward masking, in most other models of masking both attention and figural context play an ancillary role. 
In other words, figural context or attention predictably affects the visibility of the target, and hence the magnitude of masking, in a way that is not specific or limited to visual masking paradigms, but rather is a general feature of attention and context when studied in a variety of other experimental paradigms. Thus both top-down influences on backward masking can be viewed simply as modulators of masking analogous to the bottom-up modulatory effects produced by varying certain physical parameters (e.g. wavelength composition) of the target and mask stimuli.
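The distinction drawn in this chapter, between factors that only modulate the magnitude of metacontrast and the U-shaped (type B) form of the masking function itself, can be illustrated with a small numerical sketch. The accuracy values, the baseline, the `gain` parameter, and the function names below are hypothetical and are not taken from any of the studies cited; the type B test used here (lowest visibility at an intermediate SOA) is a deliberately simple working criterion, not an established analysis method.

```python
def optimal_soa(visibility_by_soa):
    """Return the SOA (ms) at which target visibility is lowest."""
    return min(visibility_by_soa, key=visibility_by_soa.get)


def is_type_b(visibility_by_soa):
    """True if visibility is lowest at an intermediate SOA (U-shaped function)."""
    soas = sorted(visibility_by_soa)
    best = optimal_soa(visibility_by_soa)
    return soas[0] < best < soas[-1]


def modulate(visibility_by_soa, baseline, gain):
    """Scale the masking magnitude (baseline minus visibility) by `gain`.

    A gain below 1 mimics a modulator, such as attention to the target,
    that weakens masking without changing the shape of the function.
    """
    return {soa: baseline - gain * (baseline - v)
            for soa, v in visibility_by_soa.items()}


# Hypothetical percent-accuracy values, for illustration only.
BASELINE = 75.0
unattended = {0: 70.0, 30: 55.0, 60: 40.0, 90: 50.0, 120: 65.0, 150: 72.0}
attended = modulate(unattended, BASELINE, gain=0.5)

# A monotonic (type A) function for comparison: strongest masking at SOA 0.
type_a = {0: 40.0, 30: 50.0, 60: 60.0, 90: 70.0}

print(optimal_soa(unattended), is_type_b(unattended))
print(optimal_soa(attended), is_type_b(attended))
print(optimal_soa(type_a), is_type_b(type_a))
```

In this sketch the modulated function has half the masking magnitude of the original at every SOA, yet its minimum stays at an intermediate SOA, which is the sense in which a modulator leaves the type B signature intact.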
Note 1. Analogs of such decreases in the detectability of parts comprising wholes are the visual puzzles sometimes found in the puzzle pages of newspapers and magazines where the task is to find the ‘hidden object’ in a larger object array or the embedded-figure tests used in neuropsychological assessments. However, in addition to this general source of difficulty in target detection, another reason for the greater masking with mask elements that constituted symmetric or similar complements to the target elements may have been the greater match between the contour orientations of the target and mask elements, thus rendering the mask more effective (see Chapter 2, sections 2.6.5 and 2.8).
Chapter 8
Unconscious processing revealed by visual masking
8.1. Introduction
Attention, a topic addressed in the previous chapter, is closely linked to consciousness (Block 1995, 1996; James 1950; Mack and Rock 1998; Posner 1994). In the last two decades masking has been increasingly used in studies of masked priming (Kinoshita and Lupker 2003) and visual consciousness (Assad 1999; Bachmann 1997; Dennett 1991; Libet 1996; Thompson and Schall 1999), and in the related and controversial field of perception without awareness (‘subliminal’ perception) (Ansorge 2003; Ansorge et al. 1998; Balota 1983; Dimberg et al. 2000; Dolan 2002; Duncan 1985; Fowler et al. 1981; Holender 1986; Kihlstrom 1987, 1996; Klinger and Greenwald 1995; Klotz and Neumann 1999; Klotz and Wolff 1995; Leuthold and Kopp 1998; Marcel 1983a,b; Merikle 1992; Merikle et al. 2001; Reingold and Merikle 1990; Morris et al. 1998; Neumann and Klotz 1994; Taylor and McCloskey 1990; Wong and Root 2003). These developments are supported by the well-known fact that, at optimal masking SOAs of about 30–100 ms, a mask can totally suppress from awareness qualia such as color, contrast, and contour of a preceding target, although traces of target-generated activity persist at unconscious levels in the visual system. Both aspects, the qualia-absent registration of the target and its unconscious traces that are available to the behavioral systems, can be assessed by objective measures (Bridgeman et al. 1979, 1981; Goodale et al. 1986; Pélisson et al. 1986; Klotz and Wolff 1995; Vorberg et al. 2003, 2004). In the review of the neurobiological bases of masking phenomena in Chapter 3, we mentioned the distinction between stimulus- and percept-dependent neural activities in the visual cortex. Stimulus- and percept-dependent levels of processing have been correlated with, on
the one hand, early (V1) and later (V4/V5 and beyond) stages of cortical processing, respectively (Leopold and Logothetis 1996; Logothetis and Schall 1989; Scheinberg and Logothetis 1997) and, on the other hand, with early and late response components in primary visual cortex (V1), with the later components reflecting re-entrant activation from higher visual brain centers (Lamme 2003; Lamme et al. 2000; Super et al. 2001). These observations agree with the suggestion that the suppression of these late response components in V1 is a neural correlate of backward masking (Bridgeman 1980; Lamme et al. 2002). Therefore, for object perception, we proceed on the working assumptions that the early and late components of neural responses are associated with stimulus- and percept-dependent activities and that the stimulus-dependent level is clearly associated with the unconscious and qualia-poor level of processing.¹ In this chapter, we review findings of unconscious information processing in the visual system as it relates to the metacontrast mechanism per se, the location, color, and form of phenomenally suppressed targets, and the allocation of attention. We conclude that metacontrast, in its effects but not its mechanism, is akin to a transient form of blindsight and that it also may share properties with binocular rivalry suppression.
8.2. The unconscious mechanism of metacontrast suppression revealed by target recovery (disinhibition)
Studies too numerous to cite (Breitmeyer 1984, pp. 270–84) have shown that a second mask introduced into the target–mask sequence used in backward masking can restore (at least partially) the otherwise suppressed visibility of the target’s contours and surface properties. This phenomenon is referred to as target recovery or target disinhibition. In the following discussion, we designate the primary masking stimulus that suppresses the visibility of the target by M1, and the secondary mask that produces target recovery by M2. In addition to metacontrast and pattern masking, a number of other methods can be used to produce backward masking of the target’s visibility; correspondingly, there are also a number of ways, depending on the type of stimuli used, to produce target recovery (again, see Breitmeyer 1984, pp. 270–84). We limit ourselves to discussion of target recovery phenomena reported
THE UNCONSCIOUS MECHANISM OF METACONTRAST SUPPRESSION
by Breitmeyer et al. (1981b). We make this choice because this study not only bears significantly on our understanding of consciousness and its neural correlates but also has important theoretical implications. The design of the Breitmeyer et al. (1981b) study is depicted in Figure 8.1. As shown in Figure 8.1(a), a 30-ms target T (a black 1° disk) was followed at an optimal masking SOA of 60 ms (as determined in an ancillary experiment) by M1, a 30-ms surrounding black annulus or ring 0.5° wide. The spatial separation between T and M1 was 0.006°. M1, and thus the fixed T–M1 sequence, could be preceded (Fig. 8.1(b)) or followed (Fig. 8.1(c)) at variable SOAs by M2, consisting of a larger 30-ms black ring, also 0.5° wide, surrounding M1. The spatial separation between M1 and M2 was 0.006°. Before discussing the rationale for this study and its implications for theories of backward masking and consciousness, we turn to the results shown in Figure 8.2. At any M1–M2 SOA, target visibility was defined as the logarithm of the ratio of the target’s matched contrast to its matched contrast when only it and M1 were presented at the optimal masking SOA of 60 ms. Thus the suppressed contrast visibility of T at the masking SOA of 60 ms was defined as its baseline visibility of zero. Since masking magnitude was optimal at an SOA of 60 ms, T’s baseline contrast visibility of zero also corresponds to the optimal masking effectiveness
Fig. 8.1 Schematic diagram of stimuli used in target recovery from metacontrast masking. The temporal sequences depict T (disk) and M1 (annulus) at a fixed T–M1 SOA of 60 ms. The SOAs separating the onset of M1 from M2 vary from −240 ms (M2 precedes M1) to 120 ms (M2 follows M1). (Adapted from Breitmeyer et al. 1981b.)
[Diagram not reproduced. Panel (a): T followed by M1. Panel (b): M2 preceding M1. Panel (c): M2 following M1. Horizontal axis: M1–M2 onset asynchrony, from −240 to 120.]
Fig. 8.2 Log relative response strength of neural activities contributing to the visibility of T, the visibility of M1, and the masking effectiveness of M1 as a function of M1–M2 SOA. The array of disks and concentric annuli at the bottom of the figure render the (approximate) subjectively perceived contrasts of T, M1, and M2 as M1–M2 SOA is varied. (Adapted from Breitmeyer et al. 1981b.)
[Plot not reproduced. Vertical axis: log change in contrast visibility, from −0.5 (masking) to 0.4 (recovery). Horizontal axis: M1–M2 onset asynchrony, from −270 to 150 ms. Curves: visibility of T, visibility of M1, masking effectiveness of M1, and baseline.]
of M1. Target recovery was given by the logarithm of the relative increases in target contrast visibility. Since target recovery results from a decrease in the masking effectiveness of M1, the negative of the target recovery function reflects the changes of the masking effectiveness of M1 as a function of M1–M2 SOA. Let us first look at target recovery (full triangles) produced by M2 when its onset precedes that of M1. Note that as the M1–M2 SOA increases from −240 to −90 ms target recovery increases to a maximum. At an M1–M2 SOA of −30 ms, target recovery is still very strong; however, when the onsets of M2 and M1 coincide, target recovery has declined substantially. Phenomenologically the black target’s apparent contrast is depicted in the disk–annulus series shown
below the data curves. Its apparent blackness is close to zero at an M1–M2 SOA of −240 ms and is close to maximal at an M1–M2 SOA of −90 ms, after which it declines and is nearly zero again at an M1–M2 SOA of 0 ms. Let us now look at target recovery when the onset of M2 follows that of M1. It can be seen that there is no target recovery at positive M1–M2 SOAs, i.e. the target’s black contrast stays suppressed. Phenomenologically this is depicted by the unchanged lightness of the central disks in the two rightmost disk–annulus displays below the data curves. The masking effectiveness of M1 as a function of M1–M2 SOA can be obtained by simply inverting the target recovery function. This is shown by the full squares in Figure 8.2. In contrast with these mask effectiveness (and target recovery) functions, let us now inspect the effects of M2 on the visibility of M1. At any M1–M2 SOA, M1 visibility is defined as the logarithm of the ratio of M1’s matched contrast relative to its matched contrast when only the target and M1 are presented at the optimal masking SOA of 60 ms. Thus the unsuppressed contrast visibility of M1 at the optimal masking SOA of 60 ms is defined as its baseline visibility of zero. Inspection of Figure 8.2 shows that at all negative M1–M2 SOAs (including the SOA of 0 ms) the visibility of M1 (open circles) is hardly affected by the preceding (or simultaneous) M2.² Phenomenologically, this is shown by the highly visible inner black annuli depicted below the data curves of Figure 8.2 for M1–M2 SOAs ranging from −240 to 0 ms. However, when the onset of M2 follows that of M1 at progressively larger positive SOAs, M1’s visibility changes are defined by a typical type B U-shaped metacontrast function. Phenomenologically, this is depicted by the near, if not total, elimination of the blackness of M1 at M1–M2 SOAs of 30–60 ms followed by a recovery in the perceived blackness of M1 as the M1–M2 SOA increases to 120 ms.
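The visibility measures defined above can be illustrated with a minimal sketch. It assumes a base-10 logarithm (the text says only "logarithm"), and the function names and contrast values are hypothetical, chosen for illustration rather than taken from Breitmeyer et al. (1981b).

```python
import math


def log_visibility(matched_contrast, baseline_contrast):
    """Log ratio of a matched contrast to its baseline matched contrast.

    The baseline is the matched contrast obtained with only T and M1
    presented at the optimal masking SOA, so the baseline maps to zero.
    Base 10 is an assumption; the text does not state the base.
    """
    if matched_contrast <= 0 or baseline_contrast <= 0:
        raise ValueError("contrasts must be positive")
    return math.log10(matched_contrast / baseline_contrast)


def masking_effectiveness(target_recovery):
    """Masking effectiveness of M1 is the inverted target recovery function."""
    return -target_recovery


# Hypothetical matched contrasts for the target T (illustrative values only).
baseline_t = 0.10   # T matched contrast with T and M1 alone
recovered_t = 0.20  # T matched contrast when M2 precedes M1

recovery = log_visibility(recovered_t, baseline_t)
effectiveness = masking_effectiveness(recovery)
print(f"target recovery = {recovery:.3f}, M1 effectiveness = {effectiveness:.3f}")
```

In this sketch a doubling of the target's matched contrast yields a positive recovery value and an equal negative change in M1's masking effectiveness, while M1's own visibility would be computed separately from M1's matched contrasts.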
The striking feature for consideration of theories of masking and consciousness and their neural bases is that M2 produces a double dissociation, at negative and positive M1–M2 SOAs, between changes of M1’s contrast visibility and changes of its masking effectiveness. In other words, over the negative M1–M2 SOA intervals, where one sees almost no changes in M1’s qualia-rich visibility, one nevertheless sees dramatic decreases in M1’s masking effectiveness (corresponding to the increases of recovery of the target’s qualia-rich visibility). However, at positive M1–M2 SOAs, where one obtains a profound suppression of M1’s qualia-rich visibility, there is no decrease in M1’s
masking effectiveness (corresponding to no recovery of T’s qualia-rich visibility). Similar to explanations of double dissociations found in neuropsychological investigations (Teuber 1955), we take the double dissociation between the decrease in M1’s masking effectiveness and the lack of change in M1’s visibility at negative M1–M2 SOAs, on the one hand, and the lack of change in M1’s masking effectiveness and the decrease in M1’s visibility at positive M1–M2 SOAs, on the other hand, as strong indications for the existence of two separate neural mechanisms or processes. The upshot of the above results for theories of consciousness can be summarized in the following abstract way that does not depend on which particular model of masking is adopted. The onset of M1 activates a neural process S which suppresses target visibility. At negative M1–M2 SOAs, M1’s process S in turn can be suppressed by M2, but without M2 suppressing the visibility of M1. Hence, we can conclude that process S does not contribute to the qualia-rich contour and contrast visibility of M1. In other words, process S does not register in visual awareness. Similarly, at positive M1–M2 SOAs, the onset of M2 suppresses the qualia-rich visibility of M1 but does not interfere with M1’s ability to suppress the qualia-rich visibility of T (as there is no recovery of T’s suppressed visibility). This again shows that the M1-activated process S is invisible and thus does not contribute to the qualia-rich percept of M1. Therefore in each case we postulate a second stimulus-activated neural process V which does contribute to the qualia-rich awareness of M1. To convert this abstract scheme into the more concrete RECOD model of masking we outlined in Chapter 5, we further propose that processes S and V correspond to neural activities in the transient magnocellular (M) and parvocellular (P) pathways, respectively. 
This proposal dovetails nicely with (a) the proposal of Crick and Koch (2003), and also of Milner and Goodale (1995), that the activity in the cortical dorsal M-dominant pathway does not contribute substantially to conscious vision and (b) evidence derived from blindsight studies (Kentridge et al. 1997, 1999; Sahraie et al. 1997, 1998; Weiskrantz 1997). Even with blindsight-induced disruption of the normal cortical M pathway activity, retinal M-cell activity can project directly to the superior colliculus (Kaas 1986) and hence, via the pulvinar, to extrastriate visual centers in the cortex. This subcortical M pathway could
provide a basis for the qualia-poor visual performance found in blindsight subjects. In normal observers also, the subcortical activity could contribute to the qualia-poor performance during metacontrast, of course in addition to the intact dorsal cortical activity. For example, spatiotemporally sequenced activation of this M pathway may give rise to a ‘sense’ of objectless motion reported by blindsight subjects and may also give rise to the similar objectless ‘pure phi’ sensations reported in studies of apparent motion in normal observers (Korte 1915; Neuhaus 1930). However, typically we see objects in motion, and for perception of qualia-rich object-in-motion to occur, as already realized in Marr’s (1982) computational model, the activity of the object-processing cortical P channels needs to converge with that of the motion-processing M channels. Neurophysiological evidence for convergence of M and P activities has been reported in numerous studies (Ferrera et al. 1992; Nealey and Maunsell 1994; Sawatari and Callaway 1996; Yabuta and Callaway 1998).
8.3. Unconscious processing of object location
The double dissociation discussed above reveals that many aspects or qualities of a stimulus such as its contours and surface properties can be totally suppressed from conscious perception without changing its effectiveness as a metacontrast mask. According to the RECOD model, the transient-channel processes activated by a stimulus are responsible not only for its metacontrast effects, should that stimulus be designated as a mask, but also for signaling its location, should it be designated as the target.
Our sketch of the phenomenology of masking in Chapter 2 indicated that there are three sources of information for detecting the presence or location of a target when perception of its qualia is suppressed: (1) a sensation of objectless ‘explosive’ or ‘split’ apparent motion proceeding from the location of the masked target outward toward the location of the surrounding mask, (2) a paradoxical reversal of the target’s perceived contrast, and (3) the perception of what we can best describe as a non-moving transient ‘blip’ in the target area. The existence of such qualia-suppressed target location representations is reflected in several findings showing that some residual target information is immune to metacontrast masking.
For instance, Fehrer and Raab (1962) and Fehrer and Biederman (1962) measured simple reaction times to the detection of a target during metacontrast and found that the reaction time did not vary appreciably with SOA, contrary to what one would expect if target reaction time correlated with its subjective visibility. These counterintuitive findings relating simple reaction time detection to lack of subjective awareness have been replicated on several occasions (Bernstein et al. 1973a,b; Harrison and Fox 1966; Lachter and Durgin 1999) and in various modalities (Imanaka et al. 2002; Taylor and McCloskey 1990). Similarly, as shown by Schiller and Smith (1966), even when observers are required to make choice reaction times as to which of two possible locations the target occupied, the choice reaction times did not vary with SOA. On the other hand, when a choice reaction time is required as to the target’s identity, the reaction times vary in an inverted U-shaped (type B) function with SOA (Eriksen and Eriksen 1972). These results, indicating a dissociation between motor response and conscious percept, may be related to the ability of observers to track or point correctly to the location of target stimuli that are rendered invisible by saccadic suppression (Bridgeman et al. 1981; Goodale et al. 1986; Pélisson et al. 1986). Moreover, they reiterate two important features of metacontrast discussed in Chapter 2. One is that an observer’s performance depends on the criterion content that he or she adopts, which in turn depends on the experimental task (Breitmeyer 1984; Ventura 1980). Another feature is that, depending on which criterion content is adopted, observers are able to detect, on the basis of residual mask-immune information, the mere presence or location of the target at metacontrast SOAs at which perception of its qualia-rich identity information is optimally suppressed (Vorberg et al. 2003, 2004).
However, our preceding analysis of target recovery in the context of the RECOD model indicates that while metacontrast should not affect the ability to localize a target, paracontrast should decrease it in a non-monotonic U-shaped manner. This follows from the model’s assumption that the slower sustained-channel activity of a preceding mask suppresses the faster transient-channel activity of a following target stimulus. Consistent with such a decrease, Ögmen et al. (2003) recently reported that paracontrast indeed raises the choice reaction time to the detection of a target. In this investigation, subjects responded as quickly and accurately as possible to the location of a target appearing randomly to the left or right of the vertical meridian.
Figure 8.3(a) shows an example of a target disk falling to the left when it was preceded (paracontrast) or followed (metacontrast) by two simultaneously presented mask annuli surrounding the left and right target locations. A typical result (obtained by observer BB) is shown in Figure 8.4. Note that while the choice reaction time does not vary appreciably with metacontrast SOA, it increases monotonically from about 260 ms at a paracontrast SOA of −293 ms to 360 ms at an SOA
[Figure 8.3: panels (a) and (b); labels: fixation cross (+), Mask, Target, Paracontrast, Metacontrast.]
Fig. 8.3. Stimulus configurations used for target localization experiment. (a) contour-mask configuration; (b) pseudo-mask configuration. (Reproduced from Öğmen et al. 2003.)
[Figure 8.4: RT (ms, 300 to 500) for observer BB plotted against SOA (ms, −300 to 240; paracontrast and metacontrast ranges) for M/T = 1 and M/T = 3.]
Fig. 8.4. The average reaction time as a function of SOA for observer BB for mask/target (M/T) energy ratios of 3 and 1. (Reproduced from Öğmen et al. 2003.)
of 0 ms. This pattern deviates from our model prediction. One factor which is not considered in our model is a generalized ‘interference effect’ in the response to one stimulus when a second one precedes or follows it at short time intervals. We hypothesized that as well as contour-specific suppression of target location information by the mask, the mask also produces a response-interference effect and, assuming additivity, the total reaction time $RT_{\text{total}}$ can be expressed as
$$RT_{\text{total}} = RT_{\text{contour mask}} + RT_{\text{interference}} + C$$
where $C$ is the baseline reaction time obtained in the absence of any mask, and $RT_{\text{contour mask}}$ and $RT_{\text{interference}}$ represent the contributions of contour mask and interference effects, respectively, to the reaction time. $RT_{\text{contour mask}}$ can be estimated by designing a control experiment where a ‘pseudo-mask’ produces an interference effect with minimal contour masking. An example of a target and a pseudo-mask is shown in Figure 8.3(b). Because the pseudo-mask elements were sufficiently distant from the target contours, they should have little if any contour-specific masking effect while still generating the non-specific interference effect. The reaction time $RT_{\text{control}}$ obtained in the control experiment can be written as
$$RT_{\text{control}} = RT_{\text{interference}} + C.$$
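The additive decomposition above can be made concrete with a short sketch. The numbers and the function name are hypothetical, chosen only to show that subtracting the control (pseudo-mask) reaction time from the contour-mask reaction time cancels both the interference term and the baseline, leaving the contour-specific contribution.

```python
def contour_mask_effect(rt_total, rt_control):
    """Estimate the contour-specific contribution to reaction time.

    Under the additivity assumption stated in the text:
        rt_total   = rt_contour_mask + rt_interference + c
        rt_control = rt_interference + c
    so the difference isolates rt_contour_mask.
    """
    return rt_total - rt_control


# Hypothetical component values in ms (not data from the study).
baseline_c = 250.0      # reaction time with the target alone
interference = 40.0     # non-specific response interference
contour = 30.0          # contour-specific masking contribution

rt_total = contour + interference + baseline_c   # contour-mask condition
rt_control = interference + baseline_c           # pseudo-mask condition

delta_rt = contour_mask_effect(rt_total, rt_control)
print(f"estimated contour-mask contribution: {delta_rt} ms")
```

The estimate is only as good as the additivity assumption: if the interference produced by the pseudo-mask differs from that produced by the contour mask, the difference no longer isolates the contour-specific term.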
It can be seen that the difference $\Delta RT = RT_{\text{total}} - RT_{\text{control}}$ provides an estimate for $RT_{\text{contour mask}}$. Figure 8.5(a) shows the choice reaction time results, averaged across six observers, for the control and mask conditions at mask/target (M/T) contrast ratios of 1.0 and 3.0. Note that for both contrast ratios the choice reaction times were larger for the annular mask than for the control conditions. The choice reaction time difference between the two conditions is shown in Figure 8.5(b). At positive metacontrast SOAs, $\Delta RT$ values fluctuate unsystematically around averages of 5.5 ms and 1.7 ms for M/T ratios of 1 and 3, respectively. However, for paracontrast, $\Delta RT$ values depend strongly on SOA, peaking at an SOA of −150 ms. The peak $\Delta RT$ values are 28.7 ms and 51.1 ms for M/T ratios of 1 and 3, respectively. Like the results obtained previously by Schiller and Smith (1966), these results indicate that the detection of the target location is immune to metacontrast masking. That the target cannot be seen at optimal
[Figure 8.5: (a) RT (ms, 240 to 400) and (b) ΔRT (ms, −20 to 60) plotted against SOA (ms, −300 to 240; paracontrast and metacontrast ranges). Legend for (a): M/T = 1, M/T = 1 (control), M/T = 3, M/T = 3 (control), baseline. Legend for (b): M/T = 1, M/T = 3, mean.]
Fig. 8.5 (a) Reaction times averaged across six observers as a function of SOA for M/T energy ratios of 3 (circles) and 1 (squares). Full and open symbols correspond to contour mask and pseudo-mask conditions, respectively. The horizontal broken line shows the baseline reaction time obtained by presenting only the target. (b) ΔRT values computed from the data as a function of SOA. The middle curve corresponds to the average of the M/T = 3 and M/T = 1 data. (Reproduced from Öğmen et al. 2003.)
metacontrast SOAs implies that the detection of location can rely on unconscious processes, i.e. processes distinct from those giving rise to the conscious percept of the target’s form, color, and contrast qualia. According to our model these unconscious processes correspond to activity in the transient M pathway. As noted, this assumption conforms with recent claims that the most likely cortical stream supporting the
‘zombie’ (Koch and Crick 2001) or unconscious mode of processing is the dorsal M-dominated stream (Crick and Koch 2003; Milner and Goodale 1995). Of course, studies of blindsight patients indicate that midbrain sites such as the superior colliculus, which receives direct M-pathway projections from the retina (Kaas 1986) and interacts with the dorsal stream, could also contribute to unconscious processing of the location or change of location (motion) of visual stimuli (Ro et al. 2004; Stoerig 1996, 2002; Weiskrantz 1997).
8.4. Unconscious processing of object-identity information: color and form
The perception and identification of objects depend not only on form but also on color. For instance, we often differentiate ripe from unripe fruit or edible from inedible foods not on the basis of form but rather on the basis of color. The ventral cortical stream of processing for object vision (Ungerleider 1985; Ungerleider and Mishkin 1982) is a P-dominated pathway, which in turn is divided into the ‘blob’ and ‘interblob’ pathways processing surface properties of objects, such as color and brightness, and form properties, such as oriented edges and contours, respectively (De Yoe and Van Essen 1988; Xiao et al. 2003). These anatomically and physiologically distinct P pathways comprise analogues of neural-network models such as the FACADE model developed by Grossberg and colleagues (Cohen and Grossberg 1984; Grossberg 1994). In this model the blob and interblob P pathways correspond to the feature contour system (FCS) and the boundary contour system (BCS), with the former processing surface properties and the latter edge or contour information of objects. Although activity in the ventral pathway is associated with perception or conscious vision (Crick and Koch 2003; Milner and Goodale 1995), results we review indicate that unconscious form and color processing may occur at several levels in this pathway. While a reasonably strong case for unconscious visual processing in blindsight patients has been made over the past several decades (Weiskrantz 1997), in normal observers the case has been more problematic if not contentious (Holender 1986; Merikle 1992). Some evidence for unconscious visual processing in normal observers has also accumulated over the past two decades (Marcel 1983a,b;
Kihlstrom 1987, 1996), and recently the theory of direct parameter specification (DPS), developed and explored by Neumann and coworkers (Ansorge 2004; Ansorge and Neumann 2005; Klotz and Neumann 1999; Neumann 1990; Neumann and Klotz 1994; Scharlau and Ansorge 2003) has provided additional convincing evidence. The DPS theory states that a motor response can be directly specified by visual input, i.e. without the mediation of conscious awareness of the visual input. Development and testing of the theory use metacontrast (or backward pattern masking) to suppress conscious registration of targets while leaving intact their ability to prime discriminative responses to the following masks. Below we discuss some of the applications of this approach to the study of unconscious form and color processing.
8.4.1. Unconscious priming by form
A study reported by Klotz and Wolff (1995) provides a clear demonstration of DPS theory when applied to form perception. As shown in Figure 8.6(a), the target prime was either a small square or diamond or
[Figure 8.6: (a) target and mask configurations for congruent, neutral, and incongruent pairings; (b) RT (ms, 370 to 470) for the congruent and incongruent conditions.]
Fig. 8.6 (a) Examples of target and mask configurations used in unconscious form priming. The stimulus consisted of a square- or diamond-shaped target or two circle-shaped forms, as shown, followed in time by either a larger square- or diamond-shaped mask that spatially surrounded the targets. Target–mask pairings could be congruent, incongruent, or neutral, as indicated. (Adapted from Klotz and Wolff 1995.) (b) Choice reaction times to the mask configuration as a function of target–mask congruency condition. (Reproduced from Klotz and Wolff 1995.)
one of two circular shapes presented above or below fixation followed at an optimal metacontrast SOA by either a larger square or diamond mask that surrounded the target. The task of the observers was to depress as quickly and accurately as possible one of two response keys depending on which mask shape was presented. The target–mask pairings were congruent (e.g. diamond-shaped target followed by a diamond-shaped mask), incongruent (e.g. square-shaped target followed by a diamond-shaped mask) or neutral (e.g. one of the circle-shaped targets followed by either mask). As shown in Figure 8.6(b), relative to the neutral pairings (broken line), the congruent and incongruent pairings yielded faster and slower choice reaction times, respectively. This differential response-speed effect is consistent with the expectation that the targets can facilitate responses to same-shaped masks while, in contrast, interfering with responses to differently shaped masks. Applying a signal-detection approach, Klotz and Wolff (1995) showed in separate control experiments that, indeed, the target elements were not visible, as indicated by a d′ of zero. Hence, the facilitative and interfering priming effects of the targets occurred at an unconscious level of processing. Using a similar rationale, we explored the loci or levels of processing producing these unconscious form-priming effects (Breitmeyer et al. 2004a; Breitmeyer et al. 2005b). In one study (Breitmeyer et al. 2004a), we explored the possible consequences of findings reported by Lamme et al. (2000, 2002) and Macknik and Livingstone (1998) regarding the early response components of neural responses in V1. As noted above, the former investigators demonstrated that while the late response components of V1 neurons were percept dependent, the early response components were stimulus dependent and thus unconscious.
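The signal-detection check on target visibility mentioned above rests on the sensitivity index d′, the difference between the z-transformed hit rate and false-alarm rate. The sketch below is one common way to compute it and is not a reconstruction of the procedure used by Klotz and Wolff (1995); the function name, the trial counts, and the added 0.5 correction (a convention used to avoid rates of exactly 0 or 1) are assumptions.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adds 0.5 to each count and 1 to each total so that rates of exactly
    0 or 1 cannot occur; this correction is an assumption of the sketch.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: the observer says "target present" equally often
# whether or not a target was shown, so sensitivity is zero.
invisible = d_prime(hits=50, misses=50, false_alarms=50, correct_rejections=50)

# Hypothetical counts for a clearly visible target.
visible = d_prime(hits=80, misses=20, false_alarms=20, correct_rejections=80)

print(f"masked target d' = {invisible:.2f}, visible target d' = {visible:.2f}")
```

A d′ near zero for the masked target, combined with reliable congruency effects on reaction time, is the pattern that supports the claim of priming without awareness.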
The latter investigators showed additionally that the early response components are prone to suppression by paracontrast masks. Along with several models of low-level vision (Marr 1982) and high-level vision (Biederman 1987; Hummel and Biederman 1992; Treisman 1988; Ullman 1996), we proceeded on the assumption that the earliest stages of cortical form vision occurred in V1 where form primitives such as edge or line orientation are processed. If this is the case, it seems reasonable to assume further that the early stimulus-dependent components of V1 neural responses are responsible for the unconscious form-priming effect. Since these early components are prone to
paracontrast suppression, it follows that a mask preceding the target at optimal paracontrast SOAs should reduce the unconscious form-priming effect that the target has on a mask following at optimal metacontrast SOAs. However, our results showed that a paracontrast mask did not affect the unconscious form priming; if anything, the priming effects on the choice reaction times were slightly, although not significantly, increased by a paracontrast mask. These findings rule out the earliest cortical response levels as the sites of unconscious form priming. In fact, our follow-up studies (Breitmeyer et al. 2005b) indicated that unconscious form processing probably occurs at or beyond levels where the form primitives of orientation are conjoined or integrated into representations of more complex form features such as corners and vertices, or very likely even the whole form, as independently indicated by the recent findings of Mitroff and Scholl (2005). Hence unconscious form priming may occur at late whole-form-dependent rather than early feature-dependent levels of processing. Moreover, our findings imply that the mechanism responsible for metacontrast suppression of form (edge and contour information) is located at relatively late levels of processing. This, as we saw in Chapter 4, has important theoretical consequences, since it, together with other evidence (Breitmeyer and Ögmen 2000), raises serious doubts concerning early cortical sites of metacontrast suppression as proposed by Breitmeyer (1984).
8.4.2. Unconscious priming by color
Different conclusions apply to unconscious processing of chromatic information. Evidence supporting direct parameter specification of color information has been reported recently by Schmidt (2000, 2002). These studies showed that, for example, a green target facilitated choice reaction times to following metacontrast masks with a congruent green color but interfered with choice reaction times to masks of an incongruent red color. Follow-up studies reported by Breitmeyer and coworkers (Breitmeyer et al. 2004a,c) indicate that, unlike form priming, unconscious color priming occurs at the early stimulus-dependent cortical levels of processing. In particular, Breitmeyer et al. (2004c) showed that a white target prime produced an incongruency effect that was significantly larger when the following mask was (a desaturated) blue than when it was (a desaturated) green (Fig. 8.7),
[Figure 8.7: RT (ms, 450 to 510) for blue and green masks under congruent, white, and incongruent prime conditions.]
Fig. 8.7 Mean choice reaction times to blue and green annuli when preceded by fully masked blue, white, and green disks. (Reproduced from Breitmeyer et al. 2004c.)
i.e. the white target tended to act more like an incongruent green target prime for a blue mask than an incongruent blue target prime for a green mask. This asymmetry could be explained by processes occurring at either the percept- or stimulus-dependent level of processing. If it occurred at percept-dependent levels, we would expect the white targets to be perceptually more similar to the green than the blue targets. However, a separate control study (Breitmeyer et al. 2004c) showed that the white target was perceptually more confusable with the blue than the green target, thus ruling out percept-dependent levels of processing as sites of unconscious color priming. Since (a) the white target was generated by exciting the red, green, and blue phosphors of a Sony Trinitron video display and (b) the green phosphor contributes the largest luminance component to the white target, the green-like effect of the white target can be explained most readily on the basis of physical wavelength properties rather than perceptual color properties. Thus the unconscious green-like priming effect of the white target probably occurs at early stimulus-dependent cortical levels of processing like V1. This interpretation is consistent with findings of neural responses correlating progressively less with the wavelength composition and more with the perceived color qualities of a stimulus as one progresses along the visual color-processing hierarchy (Gegenfurtner 2003; Moutoussis and Zeki 2002; Wachtler et al. 2003; Zeki 1997). It also agrees with recent findings reported by Schulz and Sanocki (2003) on the time course of perceptual grouping by color, which showed greatest wavelength dependency at short exposure durations and progressively greater percept
or color dependency at progressively longer exposure durations. Breitmeyer et al. (2004a) independently confirmed the wavelength dependency of unconscious color priming by specifically manipulating the strength of the early feedforward stimulus-dependent component of V1 neural responses (Lamme et al. 2000). They showed that a mask presented at optimal paracontrast SOAs preceding the target prime reduces the color priming effect of the target on the subsequently presented metacontrast mask. Such a reduction would be expected if, as shown by Macknik and Livingstone (1998), the early response component of V1 neurons to the target is suppressed by a paracontrast mask. Several interesting conclusions can be drawn from the combined results of the form- and color-priming studies. First, both types of priming are most likely due to the activity in the rapid feedforward sweep of cortical processing that proceeds unconsciously. The unconscious nature of this processing is also supported by recent evidence reported by Lamme (2000) and VanRullen and Koch (2003). Secondly, they show that while unconscious color priming can occur at the earliest wavelength-dependent levels of processing in V1, unconscious form priming appears to occur at later levels (e.g. V4 or IT cortex) where at least conjunctions of simple form primitives if not representations of whole forms are processed. Thirdly, they are consistent with the notion that an object’s surface properties, such as color or contrast, are processed in visual pathways distinct from those processing form properties like edge orientation, angles, or vertices (De Yoe and Van Essen 1988; Xiao et al. 2003). Finally, together with the findings reviewed in Chapter 2, sections 2.5 and 2.6.4, they suggest that masking is not a unitary process in that masking of surface properties (e.g. its contrast) is distinct from masking of contour properties. 
However, more research is required to elucidate fully the properties of these distinct masking processes.
8.5. Unconscious processing and attention
We noted above that when conscious registration of a target’s attributes is suppressed by a mask, residual unconsciously processed information can nonetheless be made available to a number of behavioral control systems. In view of the fact that consciousness and attention are closely related (James 1950; Posner 1994), two interesting questions are raised by these findings. One is whether the effective use of this residual target
information, being processed preconsciously, is also processed preattentively, i.e. without the need of visual attention. The other question is whether the unconsciously processed target information can be used to control attention. Here we review work indicating that attention can influence the processing of target information when it is rendered phenomenally invisible to conscious report by the aftercoming mask and conversely that fully masked targets can influence the deployment of visual attention. The answer to the first question appears to be that the processing of residual preconscious information requires attention. In a series of experiments, Naccache et al. (2002) recently examined unconscious priming effects of a masked number prime on a number-comparison task. A briefly flashed number prime (1–4 or 6–9) was immediately preceded and followed by a pattern mask, thus rendering the prime inaccessible to consciousness. A clearly visible probe number was presented immediately after the mask, and observers were asked to press one of two response keys indicating whether the probe was less or greater than 5. Analyses of the choice reaction times showed that obtaining unconscious priming effects, in the form of faster responses when the prime and probe numbers were both either greater or less than 5, requires that observers allocate attention to the temporal interval during which the prime–probe pair is presented. In a related study, Ansorge (2004) recently showed that unconscious priming effects of a masked prime are subject to dual-task interference by being reduced when a second task interferes with the processing of the prime. Such a result is not expected if the processing of unconscious information is automatic or preattentive.
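The congruency rule in the number-comparison task described above can be stated compactly: a prime–probe pair is congruent when both numbers fall on the same side of 5. The sketch below encodes only that rule; the function names are hypothetical, and the restriction of probes to the same 1–4 and 6–9 range as the primes is an assumption, since the text gives the range only for primes.

```python
def side_of_five(number):
    """Classify a digit as 'less' or 'greater' relative to 5."""
    if number == 5 or not 1 <= number <= 9:
        raise ValueError("expected a digit from 1-4 or 6-9")
    return "less" if number < 5 else "greater"


def is_congruent(prime, probe):
    """A trial is congruent when prime and probe call for the same response."""
    return side_of_five(prime) == side_of_five(probe)


# Hypothetical trials as (masked prime, visible probe) pairs.
trials = [(2, 4), (2, 7), (8, 6), (9, 1)]
labels = ["congruent" if is_congruent(p, q) else "incongruent" for p, q in trials]
print(labels)
```

Unconscious priming would then appear as faster choice reaction times on the congruent trials than on the incongruent ones, an effect that, according to the findings above, depends on attention being allocated to the prime–probe interval.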
Moreover, the belief that the processing of emotional faces rendered invisible by an aftercoming mask occurs automatically or preattentively (Dolan 2002; Öhman 2002) may have to be revised, since a recent study on monkey (Pessoa et al. 2002) showed that brain regions responding differentially to emotional faces were activated only when sufficient levels of attention were available for processing the faces. Results such as these indicating that attention must, and thus can, modulate the unconscious processing of stimuli are consistent with the results of the investigation of attention in a neurological patient with blindsight (Kentridge et al. 1999). Blindsight results from damage restricted to sites in area V1 of the visual cortex (Weiskrantz 1997).
In the patient investigated by Kentridge et al., the reaction times to stimuli presented to the blind field were facilitated by his attending to the location of a test stimulus presented in the blind field. This shows that attention directed to a test stimulus can facilitate its unconscious processing. The second question, as to whether the unconsciously processed target information can be used by the visual system to strategically control and direct attention and action, can also be answered positively (Ansorge 2004; Ansorge and Neumann 2005; Jaśkowski, Skalska and Verleger 2003; Jaśkowski et al. 2002; Schlotterbeck and Verleger 2002; Woodman and Luck 2003). Recording VEPs, Jaśkowski et al. (2002) recently explored the posterior contralateral negativity (PCN) known to reflect attention-controlled stimulus selection (Eimer 1996; Wascher et al. 2001; Wauschkuhn et al. 1998; Woodman and Luck 1999). Their results indicate that the form-selective aspects of an invisible prime can activate attentional processes that modify the PCN to the (mask) stimulus following the prime (target). Presumably this form-selective or object-based attentional modulation uses the unconsciously processed form information reviewed in section 8.4.1 above. Similarly, McCormick (1997) and Scharlau and Ansorge (2003) showed that invisible stimuli can activate location-specific attentional processes. The conclusion that attention can be directed to the location of stimuli rendered invisible by an aftercoming mask is supported by independent results of studies on neurological blindsight patients. Many of the residual visual abilities of a blindsight patient rely on the spared extrageniculate pathway consisting of direct retinal projections to the superior colliculus of the midbrain, in turn projecting to the pulvinar of the thalamus, and thence to spared post-striate visual areas of cortex (Rafal et al. 1990; Weiskrantz 1997).
Since evidence implicates this pathway in the orienting of attention to visual stimuli and their features (LaBerge 1995; Posner and Petersen 1990), reaction time benefits of attention to stimuli presented in the cortical (V1) scotomas of a blindsight subject should be obtainable despite the subject’s being unaware of the stimuli. These combined findings—that attention can modulate the unconscious (as well as conscious) processing of visual information and conversely that unconsciously (as well as consciously) processed information can be used to strategically deploy attention—indicate
that attention and consciousness, although typically closely related (James 1950; Posner 1994), are by and large distinct, a point recently articulated more fully by Lamme (2000, 2003). To us it seems fair, on the basis of the above findings, to claim that attention is necessary but not sufficient for at least a type of visual consciousness which corresponds to what Block (1995, 1996) has termed access consciousness, a form of consciousness that, for example, accompanies retrieval of information from visual short-term (iconic) store for verbal report or phenomenal description about a select few items from a larger array of items (Sperling 1960). Whether attention is necessary for what Block (1995, 1996) terms phenomenal consciousness (e.g. display items present in the phenomenal field that fail to be reported) is debatable, although recent results reported by Joseph et al. (1997) indicate that even the putative ‘preattentive’ processing of information in the phenomenal field actually does require some (spatially global or distributed) attention. However, the nature of the extrafoveal ‘preattentive’ search task, ancillary to a primary target-identification task in a foveal attentional blink paradigm, would also require the entry of peripheral targets into access consciousness. This dual task in turn would require that at least some attention, perhaps otherwise not available, be made available by the observer for processing of the peripheral target. Hence the question as to whether phenomenal awareness, i.e. awareness that is not accessed for report, requires attention remains an open issue.
8.6. Metacontrast and the psychophysics of induced ‘blindness’
8.6.1. Metacontrast and binocular rivalry suppression
There are ways other than metacontrast to ‘skin’ conscious perceptions, thus producing temporary ‘blindness’. One of them is to induce binocular rivalry, which alternately suppresses from conscious awareness the inputs to one eye or the other (Alais and Blake 2005). Recently, Breitmeyer et al. (2005a) combined binocular rivalry suppression with metacontrast suppression to assess the functional locus of the latter relative to the former. The target and mask were each presented for 10 ms at a metacontrast SOA of 50 ms, which produced optimal suppression of the target visibility. The target, one of three disk-like stimuli, was presented to the left eye and the mask, one of three ring-like stimuli, to the right
eye. In the non-rivalrous condition, the target and mask were presented under standard dichoptic viewing without binocular rivalry. Based on prior reports of dichoptic metacontrast (see Chapter 2, section 2.6.7), the expected outcome here was a low visibility of the target and a high visibility of the mask. In the rivalrous condition, the observers were asked to initiate a target–mask sequence only when the left (target) eye was in a dominant state and the right (mask) eye was in a suppressed state during binocular rivalry. Recall from section 8.2 that a brief stimulus, such as a metacontrast mask, activates two neural processes: one occurring at unconscious levels of processing and giving rise to its effectiveness as a mask, the other giving rise to its visibility. Here two predictions are warranted, depending on whether or not binocular rivalry can cancel or suppress the neural mechanism responsible for the mask’s effectiveness in suppressing the target disk’s visibility.
1. If the cortical metacontrast mechanism, like the process of steady-state (non-transient) dichoptic masking (Westendorf 1989), is itself suppressed by binocular rivalry, metacontrast suppression of the disks should not occur. Consequently, the target should be highly visible. Moreover, the mask should have low visibility, since the activation of neural processes giving rise to its perception is suppressed when the right (mask) eye is in a suppressed phase of binocular rivalry.
2. In contrast, if binocular rivalry suppression does not suppress the cortical metacontrast process, metacontrast suppression of the disks will occur, thus rendering a low target visibility. The mask again should not be visible, since it is presented when the right (mask) eye is in a suppressed state of binocular rivalry.
As shown in Figure 8.8, in the non-rivalrous condition, the expected dichoptic metacontrast effect was obtained: target visibility was low and mask visibility was high.
In the rivalrous condition, Prediction 1 was confirmed: the target was highly visible, whereas the visibility of the mask was suppressed. Thus, along with the neural processes contributing to the visibility of the mask, the unconscious mechanism responsible for metacontrast suppression of target visibility can in turn be suppressed by binocular rivalry. Exactly where binocular rivalry or metacontrast is resolved is not clear. Several studies indicate that rivalry is resolved at relatively low levels of cortical processing (Carlson and He 2004; Lee and Blake 2002, 2004; Polonsky et al. 2000; Tong and
[Plot: proportion correct (0.0 to 1.0) for the target and the mask, by eye suppressed (neither, mask eye).]
Fig. 8.8 Proportion of correct stimulus contour identifications under non-rivalrous and rivalrous dichoptic viewing of target and mask stimuli as indicated in the inset. In the non-rivalrous condition neither eye’s input was suppressed; in the rivalrous condition, the input to the mask eye was suppressed.
Engel 2001). However, global and high-level perceptual-grouping factors are known to modulate binocular rivalry (Kovács et al. 1996), and amplification of rivalry appears to occur along the ascending cortical processing pathways (Nguyen et al. 2003). It follows that the metacontrast suppression mechanism, wherever it occurs, could be suppressed by binocular rivalry at one or more cortical levels of processing. In combination, psychophysical findings such as these suggest the existence of functional hierarchies of unconscious information processing (Breitmeyer et al. 2005a). Some of these processes are immune to the suppressive effects of binocular rivalry and most likely occur at subcortical levels (Pasley et al. 2004; M.A. Williams et al. 2004), while others like metacontrast are subject to binocular-rivalry suppression and occur at cortical levels. Moreover, binocular rivalry, metacontrast and more generally backward masking can suppress the visibility of words (Zimba and Blake 1983; Dehaene et al. 2001). Despite these similarities, backward pattern masking does not eliminate semantic priming (Naccache et al. 2005; Nakamura et al. 2005) whereas binocular-rivalry suppression does (Zimba and Blake 1983). This indicates that binocular rivalry suppresses the unconscious processing of forms – here specifically those of words – whereas metacontrast and
backward pattern masking do not (Breitmeyer et al. 2005b; Naccache et al. 2005; Nakamura et al. 2005). Again this suggests the existence of distinct levels and types of unconscious processing accessed by different psychophysical techniques used to suppress the visibility of objects. Consequently, systematic comparative investigations, when possible, of the effects of backward masking, binocular rivalry, the attentional blink and so on may reveal distinctly different types and levels of unconscious visual processing.
8.6.2. Metacontrast, flash suppression, and motion-induced blindness
In typical metacontrast studies both the target and the mask are presented briefly and transiently. About 25 years ago, one of us investigated the possibility that the visibility of a target with a sustained or prolonged duration would be suppressed by a briefly flashed mask (Breitmeyer and Rudd 1981). The design of the experiment entailed a 10-s presentation of a black-on-white bar flanked symmetrically at spatial separations of 0.3°, 0.6°, 1.2°, or 2.4° by two mask bars flashed for 50 ms at a target–mask onset asynchrony of 2 s. At such a large SOA (2000 ms), onset–onset response interactions between the target and the mask, typically entailed in the production of type B metacontrast, would be eliminated. If any suppression were to occur, it would have to be of the sustained neural response to the prolonged target by the transient response to the brief mask. The suppression of the target’s visibility was monitored as follows. If the onset of the flanking masks produced a phenomenal disappearance of the target, the subject pressed a button as soon as he or she noticed the disappearance. This started a millisecond clock counter which ran until the subject indicated the reappearance of the target by releasing the button. In this way several measurements were obtained from which an average suppression duration was calculated. The results for two subjects in Breitmeyer and Rudd’s (1981) study are shown in Figure 8.9. The target–mask display, as indicated by the different symbols in the figure, could be centered at one of three eccentricities relative to the fovea: 1.7°, 6.0°, and 10.3°. At the smallest eccentricity, no suppression of target visibility was obtained at any target–mask spatial separation for either observer. The duration of target suppression increased at higher eccentricities and at smaller target–mask spatial separations. Both trends are consistent
[Plot: target suppression duration (s) against target–mask spatial separation (deg) at eccentricities of 1.7, 6.0, and 10.3 deg; panels labeled MK and BB.]
Fig. 8.9 The duration of suppression of the visibility of the sustained target bar after presentation of a 50-ms set of flanking mask bars as a function of target mask spatial separation and viewing eccentricity. Results are shown for two observers. (Reproduced from Breitmeyer and Rudd 1981.)
with similar variations in the magnitude of metacontrast suppression obtained when viewing eccentricity or target–mask spatial separation is systematically varied (see Chapter 2, section 2.6.6). Thus it appears that the two-transient paradigm is not a necessary condition for obtaining masking effects, but rather that a single transient elicited by the mask is sufficient. We called this effect specific flash suppression (SFS), since the mask was specifically aimed at (or near) the location of the target. Recently, SFS has been rediscovered by Kanai and Kamitani (2003), and a variation of it has also been reported by May et al. (2003). Other psychophysical methods of rendering visual stimuli unconscious are generalized flash suppression (GFS) (Wilke et al. 2003) and motion-induced blindness (MIB) (Bonneh et al. 2001; Hofstoetter et al. 2004). GFS is akin to SFS in that a large array of incoherently moving dots (similar to Brownian motion) surrounding, but not immediately adjacent to, the area of the target is presented a couple of seconds after
introduction of the target display. Here too, fading of the target is timed to the onset of the mask, increases with target eccentricity, and decreases with the spatial separation of the target and the random dots (Wilke et al. 2003). MIB differs from GFS in that the motion of the random dots is coherent, but it similarly produces target fading which also varies with stimulus parameters such as the density and contrast of the dots (Bonneh et al. 2001). All three methods (SFS, GFS, and MIB) are closely related to metacontrast suppression. In our opinion they rely on the activation of the transient- or motion-sensitive M pathway, which in turn suppresses the sustained P activity generated by the target. Moreover, we believe that, via this common suppressive mechanism, they exploit the semistabilized nature of extrafoveal stationary stimuli and thus are a form of ‘facilitated’ Troxler fading (Troxler 1804).
8.7. The masking of visual targets by transcranial magnetic stimulation
While visual masks have been used for over a century to degrade or suppress the conscious registration of visual targets, recent developments have allowed the use of transcranial magnetic stimulation (TMS) to disrupt the brain’s processing of visual targets. In TMS masking a magnetic pulse, applied to the area overlying the occipital lobe, produces a brief and spatially localized scotoma in the visual field (sometimes accompanied by a localized phosphene sensation) that masks the visibility of a target falling in the same area as the scotoma. By systematically moving the location of the magnet producing the TMS over the occipital region of the skull, one can (to some extent) vary the location of the scotoma in the visual field. Moreover, by additionally varying the SOA between the TMS pulse and the target, one can investigate the time course of processing in early (V1/V2) visual areas of the brain (Amassian et al. 1989, 1998; Corthout et al. 1999a,b, 2000) and how that processing relates to the microgenesis of visual consciousness.
8.7.1. TMS suppression of visual targets
A series of experiments conducted by Corthout et al. (1999a,b) illustrates the masking effects of TMS on foveal targets consisting of
[Plot: target visibility (proportion correct) against T-TMS SOA (ms), from –120 to 200 ms.]
Fig. 8.10 Typical results showing changes of target visibility as a function of target-to-TMS SOA. Worsening target identification performance indicates an increasing suppressive effect of the TMS pulse. Note the maximal suppressive effects at SOAs of 30 and 100 ms. (Adapted from Corthout et al. 1999b.)
individual letters. Figure 8.10 shows typical results (Corthout et al. 1999b), averaged across three observers, of TMS masking as a function of the SOA between the TMS pulse and the visual target. Negative and positive SOAs indicate that the TMS onset respectively preceded and followed the onset of the visual target. Masking magnitude is indicated by the proportion of correct identifications of the target letters, with lower proportions corresponding to stronger masking. Note that two masking maxima were obtained, one at an SOA of 30 ms and the other at an SOA of 100 ms. A similar SOA value of about 100 ms for the later period of maximal TMS suppression has also been reported by Amassian et al. (1989, 1998) and by Kammer et al. (2003). Corthout et al. (1999b) concluded that these two maxima corresponded to the TMS-induced disruption of two processing intervals, the former corresponding to the early feedforward activation of cortical neurons and the latter to activation depending on re-entrant feedback from higher cortical visual areas. This interpretation dovetails with the aforementioned proposal of Lamme and coworkers (Lamme et al. 2001, 2002; Super et al. 2001) regarding an early feedforward and stimulus-dependent component and a later re-entrant and percept-dependent component of V1 neural responses. In section 8.8 we will relate these TMS results and their interpretations to para- and metacontrast masking.
8.7.2. TMS and the disinhibition or recovery of visually masked targets
In section 8.2 we reviewed findings on how a secondary visual mask, by inhibiting the masking effects of a primary visual mask, can recover a target’s visibility which would otherwise, in the absence of the secondary mask, remain suppressed by the primary mask. In similar fashion a TMS pulse can be used as a secondary mask to suppress the inhibitory effects of a primary visual mask on a target. In a study by Amassian et al. (1993) a brief target consisting of a three-letter trigram was followed at an SOA of 100 ms by an equally brief mask also consisting of a three-letter trigram. As noted in Chapter 2, section 2.2., this constitutes a case of masking by structure, which would produce not only a minor amount of masking by integration of target and mask information at peripheral as well as central levels, but also a large amount of masking by central interruption of target processing (Breitmeyer 1984; Michaels and Turvey 1979). These combined visual masking effects acting to suppress the visibility of the target were counteracted by a TMS pulse when it was presented over the occipital cortex 60–140 ms after the onset of the visual trigram mask. During these intervals the TMS increased the visibility of the trigram target while simultaneously suppressing the visibility of the trigram mask. Peak recovery of target visibility occurred when the TMS pulse followed the trigram mask at an SOA of 100 ms. Since spatially overlapping targets and masks consisting of letter trigrams were used by Amassian et al. (1993), we cannot conclude with certainty to what respective extents the target recovery produced by the TMS was due to suppression of the mask’s integration mechanism vs. its interruption mechanism. In order to eliminate spatiotemporal integration as a source of backward masking, Ro et al. (2003) used a disk as a target and a surrounding annulus as a metacontrast mask. The mask followed the target at an optimal metacontrast SOA of 43 ms. 
Thus by eliminating the integration of overlapping target and mask information, only central interruption could be a source of masking. Ro et al. (2003) measured visibility changes of the annular mask and the disk target produced by a TMS pulse as a function of the SOA separating its onset from that of the preceding mask.
Figure 8.11 shows the changes of visibility of the mask ring and the target disk as a function of the mask–TMS SOA. Here the baseline mask-ring visibility takes a value of 0.95, the proportion of correct detections in the absence of the TMS. Similarly, baseline target-disk visibility, when the disk is followed by the ring at the optimal metacontrast SOA of 43 ms, is defined as 0.0. As can be seen from inspection of Figure 8.11, the effect of the TMS generally is to suppress the visibility of the mask ring and to increase or recover the target disk’s visibility. Particularly noteworthy here is that the TMS pulse produces maximal target recovery at the same mask–TMS SOA of 100 ms at which it produces maximal suppression of the mask following the target disk. Unlike the study by Amassian et al. (1993), Ro et al. (2003) measured how a TMS pulse suppressed the effects
[Plot, titled ‘TMS-induced visibility changes’: stimulus visibility (proportion correct) against annulus-TMS SOA (ms), from 50 to 175 ms, for annulus (TMS), annulus (baseline), target (TMS), and target (baseline).]
Fig. 8.11 Visibility, in proportion of perceived-stimulus reports, of the mask ring and the preceding disk as a function of the SOA separating the annulus from the following TMS pulse. The ring’s baseline visibility (0.95) was obtained in the absence of the TMS pulse. The disk visibilities are TMS-induced changes of the proportion of disk reports relative to the baseline (no change) in the absence of the TMS pulse. Positive changes indicate TMS-induced recovery of the (otherwise suppressed) disk visibility. (Adapted from Ro et al. 2003.)
of the mask’s isolated interruption mechanism. Nevertheless, both studies share a common feature in that the suppression of the visibility of the visual mask was accompanied by a decrease in its masking effectiveness. As shown by Breitmeyer et al. (2004b), although this positive correlation between the visibility of a stimulus and its effectiveness as a mask may at first glance appear obvious and even necessary, it is neither. The reason is that, as shown by Breitmeyer et al. (1981b) (see section 8.2 and Figs 8.1 and 8.2), the visibility of a primary metacontrast mask (M1 in Fig. 8.1) can be suppressed by a secondary metacontrast mask (M2 in Fig. 8.1) without in turn reducing the effectiveness of the primary mask in suppressing the target’s visibility, i.e. without producing target recovery. We believe this is so for the following reasons. In the two studies in which a TMS pulse was used as a secondary mask, it is highly likely that the pulse had a spatially local but otherwise non-specific suppressive effect on processing in the visual cortex. Consequently, it suppressed activity not only in the mask-activated P pathway processing the form aspects of the visual mask (letter trigrams or annulus), but also in the mask-activated M pathway responsible for the central interruption or suppression of target-activated form processing in the P pathway. The latter effect of the TMS pulse on the mask’s M activity leads to less suppression and therefore greater recovery of the target’s visibility, whereas the former effect on the mask’s P activity leads to suppression of the mask’s visibility. In contrast, in the study by Breitmeyer et al. (1981b), the secondary metacontrast mask (M2) suppressed the P activity of the primary metacontrast mask (M1) without suppressing its M activity, and hence there was a reduction of M1’s visibility without a reduction in its masking effectiveness. If correct, this analysis has important methodological implications.
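The logic of this P/M account can be made concrete with a deliberately crude sketch in Python. It is an illustration only, not a model proposed in the studies cited: the function names, the subtractive inhibition rule, and all numerical values are hypothetical assumptions chosen to show the qualitative pattern.

```python
# Toy illustration (hypothetical, not the authors' model): the mask's
# visibility is carried by its P activity, while its masking effectiveness
# is carried by its M activity, which inhibits the target's P activity.

def apply_secondary_mask(mask_p, mask_m, strength, affects_m):
    """Scale down the primary mask's P activity and, optionally, its M activity."""
    new_p = mask_p * (1.0 - strength)
    new_m = mask_m * (1.0 - strength) if affects_m else mask_m
    return new_p, new_m

def visibilities(target_p, mask_p, mask_m):
    """Return (target visibility, mask visibility) under subtractive inhibition."""
    target_vis = max(0.0, target_p - mask_m)  # mask M activity suppresses target P
    mask_vis = mask_p                         # mask visibility follows its P activity
    return target_vis, mask_vis

# Arbitrary activity levels, assumed for illustration only.
TARGET_P, MASK_P, MASK_M = 1.0, 1.0, 0.8

baseline = visibilities(TARGET_P, MASK_P, MASK_M)
# Non-specific secondary mask (TMS-like): suppresses both P and M of the mask.
tms_like = visibilities(
    TARGET_P, *apply_secondary_mask(MASK_P, MASK_M, 0.5, affects_m=True))
# P-selective secondary mask (M2-like): suppresses only the mask's P activity.
m2_like = visibilities(
    TARGET_P, *apply_secondary_mask(MASK_P, MASK_M, 0.5, affects_m=False))

print("baseline:", baseline)
print("TMS-like:", tms_like)
print("M2-like: ", m2_like)
```

In this sketch the TMS-like case lowers mask visibility and raises target visibility together, whereas the M2-like case lowers mask visibility while leaving target visibility at its baseline value, which is the dissociation described above.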
Since current techniques of TMS application do not selectively disrupt cortical P or M processing, it may be preferable to use a visual mask to produce more selective disruption of cortical processing (at least until such time as a more pathway-specific version of TMS masking is developed). By acting more specifically, a visual mask is able to yield more useful information than a TMS mask, particularly as applied to study of unconscious and conscious processes in vision. Of course, this
assumes that in other respects the visual mask can replicate the effects of a TMS on visual processing. We argue below that this is indeed the case when one is interested in studying the temporal dynamics of object vision.
8.8. The relation between visual masking and TMS masking
In terms of its primary masking effects it appears that a TMS pulse can produce effects similar to a visual mask. As shown in Figure 8.10, a TMS pulse produces two minima in the function relating a target’s visibility to the SOA separating the onset of the target from the TMS mask. One minimum occurs at the shorter SOA of 30 ms and the other at the longer SOA of 100 ms; once retinocortical delay is taken into account (see below), these fall respectively before and after the arrival of target activity in the cortex. The analogy to para- and metacontrast masking effects is obvious. As noted, Corthout et al. (1999b) concluded that these two ‘paracontrast’ and ‘metacontrast’ maxima correspond to the TMS-induced disruption of two processing intervals, the former corresponding to the early feedforward activation of cortical neurons and the latter to activation depending on re-entrant feedback from higher cortical visual areas. This interpretation is consistent, on the one hand, with the finding (Macknik and Livingstone 1998) that paracontrast suppresses the early response component of V1 neurons and, on the other, with the finding that backward pattern masking suppresses the later response components (Andreassi et al. 1975; Bridgeman 1975, 1980, 1988; Lamme et al. 2002; Schiller and Chorover 1966; Schwartz and Pritchard 1981; Vaughan and Silverstein 1968). Figure 8.12(a), taken from a recent study reported by Breitmeyer et al. (2004b), shows again the results of Corthout et al. (1999b) in comparison with paracontrast and metacontrast masking results obtained with visual masks. Corthout et al.’s TMS masking results are based on averages of three observers. The visual masking results are based on averages of three observers participating in experiments conducted in our laboratory. To make a proper comparison of the two sets of findings (Fig. 8.12b), we shifted the visual masking results so that the visual masking SOA of 0 ms aligned with a TMS SOA of 60 ms for the following reasons. Assuming that the cortical effects of a TMS pulse occur at very short latencies (e.g.
10 ms or less), we took the value of 60 ms as an estimate of the time delay (produced by sensory transduction
[Plot: normalized target visibility against T-M1 SOA or T-TMS SOA (ms), from –200 to 200 ms, for Breitmeyer et al. 2004b, Corthout et al. 1999b, and baseline. Panel (a): unadjusted retinocortical delay; panel (b): adjusted retinocortical delay.]
Fig. 8.12 (a) Comparison of a typical masking function obtained in our laboratory using a visual para- or metacontrast mask with a typical masking function obtained by Corthout et al. (1999b) using a TMS pulse as a mask. Negative and positive SOAs indicate that the masks were presented before and after the target, respectively. Results are not adjusted for retinocortical transmission delay. (b) As in (a) but with the results adjusted for a 60-ms delay of cortical M activity due to retinocortical transmission time (Baseler and Sutter 1997). (Reproduced from Breitmeyer et al. 2004b.)
and retino-geniculo-cortical transmission) separating the onset of the cortical TMS effect and the onset of the cortical effects of a visual mask presented to the retinae (Baseler and Sutter 1997). Despite the use of different observers and procedures, the two studies yield masking functions that agree to a surprising extent, especially regarding the SOAs at which masking maxima occur. This result would be expected if the early and late TMS-suppression maxima and the para- and metacontrast masking maxima both correspond to the suppression of the early and late responses of V1 neurons, respectively.
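The 60-ms alignment described above amounts to a constant shift between the visual-mask SOA axis and the TMS SOA axis. The following Python sketch illustrates that bookkeeping only. The function and constant names are ours, introduced for illustration, and the only numbers used are those quoted in the text: the 60-ms retinocortical delay estimate and the TMS masking maxima at SOAs of 30 and 100 ms.

```python
# Assumed constant: the retinocortical delay estimate quoted in the text (ms).
RETINOCORTICAL_DELAY_MS = 60

def visual_to_tms_soa(visual_soa_ms, delay_ms=RETINOCORTICAL_DELAY_MS):
    """Map a target-to-visual-mask SOA onto the target-to-TMS SOA axis."""
    return visual_soa_ms + delay_ms

def tms_to_visual_soa(tms_soa_ms, delay_ms=RETINOCORTICAL_DELAY_MS):
    """Map a target-to-TMS SOA onto the target-to-visual-mask SOA axis."""
    return tms_soa_ms - delay_ms

# TMS masking maxima reported in the text (Corthout et al. 1999b), in ms.
tms_maxima_ms = [30, 100]
equivalent_visual_soas = [tms_to_visual_soa(s) for s in tms_maxima_ms]
print(equivalent_visual_soas)  # [-30, 40]
```

Under this mapping the earlier TMS maximum falls on the negative (paracontrast) side of the visual SOA axis and the later one on the positive (metacontrast) side, which is the correspondence on which the comparison in Figure 8.12(b) relies.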
8.9. Summary
Visual masking can make several significant contributions to our understanding of the types and levels of conscious and unconscious information processing. First, careful analyses of target recovery studies reveal that the mechanism of metacontrast itself occurs at unconscious levels of processing, and this leads to the additional conclusion that the visual processes responsible for the visibility of a stimulus are distinct and dissociable from those responsible for its effectiveness as a mask. Visual masking techniques applied to the study of priming effects produced by a masked target stimulus also reveal the unconscious processing of a number of object attributes. Although the unconscious processing of an object’s spatial location can be inferred from a number of earlier metacontrast studies, more recent studies show additionally that attributes of an object that allow it to be identified, such as its chromatic surface features and form or contour features, can also be processed at unconscious levels. In addition, the results of these studies support the neurophysiological evidence and neural-network models linking distinct cortical levels and pathways to the processing of form or contour features, on the one hand, and surface features like color on the other. Although priming by color and form can be obtained at unconscious or, more specifically, preconscious levels, a number of masking studies have revealed that much of preconscious processing can be or even must be accompanied by selective spatial and temporal allocation of attention. Conversely, a number of other studies have shown that masked stimuli can nonetheless be used at unconscious levels by the visual system for strategic control of attention. Consequently, despite the close link between attention and consciousness, attentive processing appears to be distinct from conscious processing.
TMS masking, recently also used to study the time course of conscious and unconscious processing, produces at least two epochs of suppression when varying the target-to-TMS SOA, with the earlier one corresponding to the SOA of maximal paracontrast suppression and the later one to the SOA of maximal metacontrast suppression. Despite this resemblance between TMS and visual masking, it appears that for now visual masks yield suppressive effects that are more pathway-specific than those produced by TMS masks.
Notes
1 We assume that although neural activity is percept related, it can nonetheless occur at unconscious levels of processing. For instance, percept-dependent activity can be found in the later response components of V1 neurons (Lamme 2003; Lamme et al. 2000; Super et al. 2001). Nevertheless, as argued by Crick and Koch (1995), V1 neural activity on its own does not give rise to conscious experience, i.e. whereas V1 neural activity may be necessary it is not sufficient for conscious experience.
2 In fact, if anything a slight enhancement of the visibility of M1 is indicated at M1–M2 SOAs of 90 ms. Similar enhancements produced by a preceding mask (in this case M2) were discussed in Chapter 2, section 2.5, and Chapter 4, section 4.3.2.
Chapter 9
Visual masking in selected subject populations
9.1. Introduction
In Chapter 2 we noted that visual masking is a useful tool for investigating dynamic temporally evolving sequences and levels of information processing in the visual system. The vast majority of such studies have been conducted on human observers with normal visual systems. When applied to subjects who exhibit a variety of clinical anomalies or other distinct but non-clinical characteristics, interesting differences emerge when comparing masking performance between normal and clinical or special-subject populations. Such comparisons are informative because they allow us to understand how the response properties of underlying masking mechanisms express themselves differentially in the various subject populations. Since, by definition, visual masking is a spatiotemporal phenomenon, the extent to which these differences are due to the temporal or spatial response properties can also be assessed. However, a caveat is in order before we proceed to discussion of visual masking in specific subject populations. The evidence concerning various visuocognitive functions in some special populations is often ambiguous or mixed; studies either fail to replicate findings or report contradictory findings. For instance, in schizophrenia, evidence regarding such functions as gestalt organization (Cox and Leventhal 1978; Silverstein et al. 1996, 2000) and forward masking (Green et al. 2003, in press; Rassovsky et al. 2004, 2005; Saccuzzo et al. 1996; Slaghuis and Bakker 1995) is mixed. The evidential problem also holds for the study of visual deficits in specific reading disability (SRD) or dyslexia (Borsting et al. 1996; Demb et al. 1998; Edwards et al. 1996; Lehmkuhle et al. 1993; Livingstone et al. 1991; Pammer and Lovegrove 2001; Samar et al. 2002; Skottun 2000, 2001; Skottun and Parke 1999; Slaghuis and
Ryan 1999; Stein and Walsh 1997; Stuart et al. 2001; Victor et al. 1993). This makes it difficult, although not impossible, to specify a set of empirically based diagnostic and nosological criteria that are characterized by discriminative power and reliability. For example, in SRD the visuocognitive deficits may be limited to a particular subpopulation of SRD individuals (Borsting et al. 1996; Slaghuis and Ryan 1999), and likewise deficits in schizophrenia may also be limited to specific subpopulations of schizophrenics (Green and Walker 1986; Slaghuis and Bakker 1995). Therefore, in our opinion, the problem of mixed evidence is better viewed as a challenge to broaden and refine thinking about how differential diagnosis relates to differential visuocognitive deficits rather than as a definitive narrowing or elimination of possibilities. To that end much diligent and at times painfully systematic work remains to be done. As a starter, a careful and comprehensive meta-analysis of extant findings covering the entire range of experimental paradigms within each of the various syndromes would clearly be in order.
9.2. Visual masking in amblyopia
In humans, amblyopia is an anomaly that is associated with abnormal binocular visual development induced by strabismus or anisometropia. Nonetheless, the neural bases of amblyopia and binocular vision deficits are probably different (Blakemore and Eggers 1978). Recent animal models of amblyopia (reviewed by Kiorpes and McKee 1999) have revealed a physiological correlate of this condition in neuronal dysfunction at cortical levels of visual processing, particularly in the neural representations of the central visual field. The associated loss of visual acuity and spatial contrast sensitivity has been amply documented (Levi 1994; McKee et al. 2003). An interesting feature of human amblyopia is that usually only one eye is amblyopic.
Hence the non-amblyopic eye can serve as the control or normal eye against which the performance of the amblyopic eye can be compared. Together with a number of visual tasks, visual backward masking reveals differences between the normal and amblyopic eye. Tytla and Steinbach (1984) investigated metacontrast in both eyes of human amblyopes. As shown in Figure 9.1, the amblyopic eye not only yielded a more pronounced type B metacontrast effect but also a shift in the SOA at which peak masking occurred to higher values. According
[Figure 9.1: two panels (target sizes 1.0° and 0.5°) plotting blackness estimation (0–100) against SOA (0–140 ms) for the control and amblyopic eyes.]
Fig. 9.1 Metacontrast suppression of a black target disk by a black surrounding ring for the non-amblyopic (open symbols) and amblyopic (full symbols) eyes of a human amblyope. The results shown in each of the two panels were obtained for each of two target diameters as indicated. Note the shift to a larger optimal metacontrast SOA in the amblyopic eye. (Adapted from Tytla and McAdie 1981.)
to the dual-channel models (e.g. the RECOD neural network) discussed in Chapter 5, these results indicate that, in the amblyopic eye, the response magnitudes of the sustained P channels are smaller and their latencies longer than those of analogous channels in the normal eye. The decrease in response magnitudes is also indicated by a loss of visual acuity in the amblyopic eye (Levi and Harwerth 1980), whereas the increase in response latencies is also confirmed by corresponding increases in visual reaction time (Levi et al. 1979). Recent VEP recordings of strabismic amblyopes and control subjects made by Demirci et al. (2002) also support this interpretation.
9.3. Visual masking in neurological and psychiatric patients
There appears to be a general decline in the efficiency and speed of visual processing in patients suffering from neurological deficits associated with, for instance, closed head injury (Mattson et al. 1994) or Parkinson’s disease (Bachmann et al. 1998). Backward masking tends to be stronger or to last over longer SOA ranges in neurological subjects compared with control subjects. As noted by Bachmann et al. (1998), such global differences do not form a compelling basis for development of diagnostic or nosological criteria. Moreover, since
visual information processing includes everything from bottom-up feedforward sensory coding of stimulus properties to top-down feedback influences of gestalt organization, figural context, and attention, such differences per se do not allow specification of what mechanisms or stages in visual information processing are affected by the neurological deficit. These global decreases in efficiency and speed of processing also characterize psychiatric subjects (Braff 1981; Green et al. 1994a,b, 2000; Merritt and Balogh 1984; Rund and Landrø 1990; Saccuzzo and Braff 1980, 1981, 1986; Saccuzzo and Schubert 1981), unaffected siblings of psychiatric patients (Green et al. 1997), and non-clinical subjects falling within a psychometrically determined psychosis-prone spectrum (Merritt and Balogh 1990). However, there is additional evidence pointing to more specific differences and similarities between various diagnostic groups of psychiatric patients and control subjects in the masking functions they yield. For instance, Slaghuis and Bakker (1995) reported that the backward masking functions of positive-symptom schizophrenic patients did not differ significantly relative to the backward masking functions obtained by control subjects, whereas those of negative-symptom patients did. The negative-symptom group of patients was more prone to backward masking than were the control subjects. These masking differences may relate to differences in other cognitive functions between positive- and negative-symptom schizophrenics (Green and Walker 1986). Moreover, masking research by Herzog and coworkers (Brand et al. 2004; Herzog et al. 2003d, 2004) has shown that although processing of target elements may be abnormal in schizophrenics relative to control subjects, schizophrenics nonetheless seem to possess some normal gestalt grouping (Herzog et al. 2004) and feature fusion mechanisms (Brand et al. 2004). 
Schizotypic and schizophrenic subjects also show differences in the shape of the backward masking function relative to control subjects. Results reported by Merritt and Balogh (1990) illustrate such differences for schizotypics.1 While both types of subject showed similar type A monotonic masking functions for a high spatial frequency mask, a low spatial frequency mask yielded overall weak type A masking for the control subjects but, in contrast, a strong U-shaped type B masking effect for the schizotypic subjects. According to the RECOD model (Purushothaman et al. 2000), it is the activity of low spatial frequency
transient M channels which is responsible for type B masking. Accordingly, these results suggest, as argued by several investigators (Green et al. 1994a; Schuck and Lee 1989; Slaghuis and Bakker 1995), that schizophrenic and schizotypic subjects have overly active transient M channels. As noted in Chapters 2 and 5, Purushothaman et al. (2000) and Fotowat et al. (2003) demonstrated that, when a fine sampling of SOAs is used, metacontrast masking can yield functions which show γ-range (40-Hz) oscillations superposed on the type B U-shaped masking function. Figure 9.2 illustrates the recent findings of Green et al. (1999) (see also Green et al. 1994a, Fig. 3; Green et al. 2003a,b; Wynn et al. 2005). They show again that schizophrenic subjects yield stronger metacontrast masking than do normal subjects but also that control subjects yield a prominent γ-range oscillation whereas schizophrenic subjects do not. Moreover, regarding this latter difference, Green et al. (2003a,b) found correspondingly weaker γ-range oscillations in the later but not the earlier components of the CVEPs of the schizophrenics. This result dovetails nicely with the evidence that the later and not earlier CVEP components yield amplitude changes that correlate with the masking of target visibility during metacontrast (Andreassi et al. 1975; Bridgeman 1975, 1980, 1988; Lamme et al. 2002; Schiller and
[Figure 9.2: two panels, (a) M/T = 0.5 with a sharp target and (b) M/T = 0.25 with a blurred target, plotting mean number correct (0–12) against interstimulus interval (0–100 ms) for control and schizophrenic subjects.]
Fig. 9.2 Metacontrast masking functions obtained from normal and schizophrenic subjects with (a) a sharp target and a mask-to-target (M/T) energy ratio of 0.5 and (b) a blurred target and an M/T energy ratio of 0.25. A lower mean number of correct responses corresponds to stronger masking. Note the lack of oscillations in the schizophrenics’ masking functions. (Adapted from Green et al. 1999.)
Chorover 1966; Schwartz and Pritchard 1981; Vaughan and Silverstein 1968) (see also Chapter 3, section 3.1, and Chapter 8, section 8.6). Whether or not these and related (Cadenhead et al. 1996) deficits in γ-range oscillations are merely symptomatic rather than causal in schizophrenia is open to question. In recent years the existence of γ-range oscillations has been tied to a number of perceptuocognitive processes. For instance, γ-range oscillations are believed to provide not only the temporal binding of neural activity relevant for feature integration and conscious perceptual organization (Engel et al. 1999; Fries et al. 1997; Keil et al. 1999; Rodriguez et al. 1999; Tallon-Baudry and Bertrand 1999), but variations of γ-range activity are also taken to relate to changes of focused arousal and attention (Engel et al. 1999; Munk et al. 1996; Sheer 1984; Steriade et al. 1996; Tiitinen et al. 1993). Schizophrenics are known to have abnormal cortical–cortical connectivity (Parnas et al. 1996), which may underlie their abnormal patterns of γ-range oscillations (Clements et al. 1997; Hoffman et al. 1996), attentional deficits (Mori et al. 1996; Nuechterlein and Dawson 1984) as well as their ‘cognitive dysmetria’ (Andreasen et al. 1998).
9.4. Visual masking in subjects with specific reading disability
9.4.1. Is there a transient M-channel deficit in specific reading disability?
The interest in applying visual masking to SRD and dyslexia grew out of a theoretically motivated psychophysical approach originally developed by Lovegrove and followed up by M.C. Williams (Lovegrove 1993; Lovegrove and M.C. Williams 1993; Lovegrove et al. 1986; M.C. Williams and LeCluyse 1990; M.C. Williams et al. 1989) to perform experimental investigations of deficits in the transient channels of SRD subjects. Under a renamed M-channel deficit, this approach was taken up and continued by several other investigators (Breitmeyer 1989, 1993a,b, 1994; Demb et al. 1998; Felmingham and Jakobson 1995; Lehmkuhle et al. 1993; Livingstone et al. 1991; Stein 1993; Stein and Walsh 1997; Talcott et al. 1998). This approach has been criticized on theoretical as well as empirical grounds (Gross-Glenn et al. 1995; Skottun 1997a,b, 2000, 2001; Skottun and Parke 1999; Stuart et al. 2001; M.J. Williams et al. 2003).
Regarding the empirical aspect, we believe that the controversial and unsettled nature of the approach arises mainly from inconsistencies among the results of studies comparing psychophysical performances of SRD and normal readers. However, this chapter is not the place for reviewing and analyzing this vast amount of psychophysical literature. As mentioned above, a critical meta-analysis is still in order. However, several key points should be raised about the M-channel deficit approach to SRD. First, as mentioned above, the deficit may only be evident in certain SRD subgroups. Secondly, to the extent that it is evident, it may show up weakly, if at all, with insensitive psychophysical measures. Stein and Walsh (1997) also made this important point in their assessment of evidence for and against the M-channel deficit approach. Hence the procedures applied to the study of visual deficits in SRD should use optimal psychophysical methods and exploit sufficiently sensitive psychophysical measures. While adhering to the tenet of disconfirmability, we maintain that only when the experimental methods are convincingly stringent can negative results be informative. Finally, theoretical developments based on the M-channel deficits can remain agnostic vis à vis their etiological relation to SRD. While maintaining the existence of these deficits, we can regard them merely as markers rather than causal factors, although speculations as to possible etiological connections have been made (Breitmeyer 1991, 1993a,b, 1994; Stein and Walsh 1997). Unlike the psychophysical results, results obtained from studies using CVEP measures point almost unanimously to a transient M-channel deficit in SRD compared with normal subjects. Of seven such reported studies, only one (Victor et al. 1993) reports absence of such a deficit in SRD subjects, whereas the remaining six (Kubová et al. 1996; Lehmkuhle et al. 1993; Livingstone et al. 1991; May et al. 1991, 1992; Samar et al. 
2002) obtained CVEP waveform differences consistent with cortical M-channel deficits in SRD subjects. The CVEP waveform differences are particularly evident for the early component peaking at about 50–75 ms, which represents M-channel activation (Baseler and Sutter 1997). Here, the results of Samar et al. (2002) are particularly interesting, since these investigators compared CVEP waveforms of normal deaf readers with those of disabled deaf readers. Their study, like the studies using hearing subjects (Lehmkuhle et al. 1993; Livingstone et al. 1991), found that the amplitudes of the early CVEP
component (but not of the later components) were lower in deaf SRD readers than in deaf normal readers.
9.4.2. Metacontrast masking performance in SRD and normal readers
According to original versions of the dual-channel approach to masking (Breitmeyer and Ganz 1976; Matin 1975) and the more recent RECOD version (Öğmen 1993; Purushothaman et al. 2000), the inhibition of the target’s sustained-channel activity by the mask’s transient-channel activity provides the mechanism for type B backward masking. Therefore if SRD subjects suffer a deficit in their transient M channels, their backward masking should be weaker than that in normal readers. Such differences have been reported in a series of masking studies conducted by M.C. Williams and coworkers (M.C. Williams and LeCluyse 1990; M.C. Williams et al. 1989, 1990). Figure 9.3 shows the results of metacontrast masking reported by M.C. Williams and LeCluyse (1990). Metacontrast functions were obtained for stimuli presented foveally and peripherally. Recall from Chapter 2, section 2.6.6, that metacontrast increases in strength with retinal eccentricity of stimulus presentation, consistent with the higher ratio of M-channel to P-channel activity in the periphery than in the fovea. Two aspects of the results of Figure 9.3 are noteworthy. First, the magnitude of metacontrast masking is weaker in SRD
[Figure 9.3: two panels (disabled readers and normal readers) plotting relative accuracy (%), from −25 to 10, against SOA (0–210 ms) for foveal, peripheral, and baseline conditions.]
Fig. 9.3 Metacontrast masking functions for foveally and peripherally presented stimuli obtained from normal and specifically reading disabled subjects. Lower relative target identification accuracy corresponds to stronger masking. Note the smaller metacontrast effects obtained by the reading disabled subjects, particularly for peripheral stimulus presentations. (Adapted from Williams and LeCluyse 1990.)
subjects than in normal subjects. Secondly, in normal subjects the peripheral stimulus location yields stronger metacontrast than the foveal location, as expected; in contrast, in SRD subjects the peripheral location yields metacontrast that is weaker and almost absent compared with that yielded by the foveal location. Thus the decrease in metacontrast magnitude found in SRD subjects is particularly magnified in peripheral areas of the retina where metacontrast and transient activity are usually quite strong. Similar results confirming a weaker metacontrast masking effect in SRD than in normal subjects were also reported by Edwards et al. (1996).
9.5. Summary
In order to be maximally informative, the application of visual masking to special subject populations depends on (a) sound theoretical reasoning about masking and the characteristics that distinguish the visual systems of special subjects from those of normal ones and (b) solid empirical results. Unfortunately, existing empirical findings are at best mixed, with a significant number of studies reporting no differences in visual performance between normal and special subjects. However, negative results should be accepted only when very rigorous methodological controls have been implemented; the more stringent an experimental test, the more confident one can be about theoretical implications of its results. Moreover, given the current state and stage of research in special subject populations, mixed results might be as fruitfully directed to rethinking the etiology, nosology, and differential diagnosis applying to a given syndrome as to rethinking theories of visual deficits. Given these caveats, the application of visual masking to amblyopia, neurological and psychiatric disorders, and SRD has revealed interesting and theoretically significant findings. In amblyopia, the shift toward longer optimal metacontrast SOAs is consistent with weaker or slower responses in the P pathway.
Schizophrenic and schizotypic subjects tend to yield stronger metacontrast and backward masking functions, which moreover tend to lack the high-frequency γ oscillations found in normal subjects. This indicates that schizophrenic subjects may have an overactive transient M-channel and an abnormal synchronization of neural activity in the object-processing P channel (as well as in other parts of the brain). In contrast with schizophrenic subjects, SRD
subjects tend to be less susceptible to metacontrast and thus to have an underactive M-channel, a characteristic that has implications for a number of other visual functions besides those tapped through visual masking.
Note
1 The control groups consisted of a group of normal control subjects (normal controls) and of subjects identified by the Minnesota Multiphasic Personality Inventory (MMPI) as being prone to antisocial personality disorder but not to schizophrenia (4–9 controls). The experimental groups consisted of two MMPI-identified schizotypic groups of subjects: the 8–9 group and the 2–7–8 group.
Epilogue
Today, as in the past, the value of visual masking lies in its use as an investigative tool and its being an interesting phenomenon intrinsically worthy of study. Of course, the same applies to numerous other visual phenomena such as optical illusions, after-effects, binocular rivalry, figure–ground reversals, multistable perceptions, etc. Each of these is interesting in its own right and each has uses as an investigative tool not only in vision science, of which there are too many to mention, but also in fields as varied as philosophy and the visual arts. One of us recalls a conversation held several decades ago with a colleague and respected vision researcher who opined that interest in metacontrast masking was misplaced; after all, he maintained, metacontrast is merely a parlor trick or a puzzling curio that one occasionally displays for effect, but otherwise a conceptually vacuous phenomenon. We hope that the foregoing chapters, particularly those dealing with theories of visual masking, have proved this critic wrong by showing that a conceptual grasp of visual masking is not an empty trifle. Instead, it turns out to be a phenomenon whose fullest explanation depends on understanding the complex interplay of sensory, cognitive, and attentional processes, which in turn depend on interactions between bottom-up feedforward, lateral/horizontal, and top-down feedback processes. The subtitle of this book places an understanding of visual masking, both as a phenomenon per se and as an investigative tool, in the broader framework of the temporal dynamics of conscious and unconscious visual processing. Other visual phenomena obtained under free viewing, such as binocular rivalry, the Troxler fading effect, and the perceptual fluctuations accompanying prolonged viewing of multistable visual displays, also deal with the temporal dynamics of perceptual organization. 
However, the spontaneous perceptual changes accompanying these phenomena are characterized by temporally stochastic properties; in contrast, visual masking, by precisely controlling temporal parameters such as SOA, can reveal temporally deterministic
properties of perceptual change. What applies to visual masking techniques also applies to the more recently developed technique of ‘flash suppression’ during binocular rivalry (Kreiman et al. 2005) and TMS masking, although a more careful psychophysics of TMS masking may have to be worked out. Therefore both these techniques are particularly useful in studying the time course of unconscious and conscious visual processes during the first few hundred milliseconds after a stimulus impinges on the retinae. Understanding these processes is important for two reasons. First, normal vision is by nature highly dynamic. Because of the very small area of sharpest foveal vision, we usually change our visual fixations somewhere between two and four times per second in order to scrutinize or ‘read’ the information in a vastly larger visual field. Thus the processing of visual information must fully occur within the 250–500 ms duration of each fixation. It is known that the saccades separating two successive fixations can suppress the vision of stimuli presented just before, during, and just after a saccade, and that normally we are not aware of this suppression (Volkmann 1986; Ross et al. 2001). This is odd, since on the one hand saccadic suppression demonstrates a loss of visibility of stimuli while on the other we are not aware of such a loss in everyday non-laboratory settings. This is akin to the lack of awareness of an artificially produced or naturally occurring scotoma such as the ‘blind spot’ in each eye. Perhaps the temporal ‘scotoma’ or ‘anesthesia’ (Dodge 1900), i.e. the loss of visibility between successive fixations, is ‘filled in’ by a process akin to the filling in believed to occur with spatial scotoma (Pessoa and De Weerd 2003). 
While the detailed nature of this temporal trans-saccadic filling in of the stream of consciousness remains to be worked out (Irwin 1996), saccadic suppression also raises the interesting prospect that some information about visual stimuli can be processed during the saccade despite the loss of their phenomenal visibility (Bridgeman et al. 1979, 1981). Since visual masking can also produce suppression of phenomenal vision while leaving unconsciously processed information intact, it is also important (along with TMS masking) as a powerful tool for studying not only the time course but also the contents and the levels of unconscious visual processing (see Chapter 8). Although especially useful in precise studies of the temporal properties of vision, visual masking is not unique in its ability to specify levels or stages of
unconscious processing. In Chapter 8 we were already able to specify some of these levels of unconscious processing. Much effort has been devoted in recent years to discovering and delineating the neural correlates of conscious vision (Koch 2004; Metzinger 2000; Pollen 1999; Weiskrantz 1997) and also of unconscious vision (de Gelder et al. 2001; Stoerig 1996), and much work remains to be done in each of these pursuits. Using the various psychophysical techniques for suppressing conscious registration of stimuli, neuroscientists also have tried to specify the cortical levels and loci of processing that are correlated with conscious and unconscious vision (e.g. Sheinberg and Logothetis 1997; Haynes et al. 2005; Hofstoetter et al. 2004; Wilke et al. 2003). However, since serious uncertainties remain regarding the cortical sites and levels of these visual-suppression mechanisms (e.g. Blake and Logothetis 2002; Tong 2001), specifying the levels and loci of conscious and unconscious processing remains equally uncertain. Be that as it may, we believe that a useful step for studying and clarifying the neural correlates of conscious and unconscious vision is to establish a relational network of unconscious processing produced by these psychophysical techniques. We have already shown that metacontrast suppression, itself a mechanism occurring at unconscious levels of processing, functionally occurs after binocular rivalry suppression (Chapter 8, section 8.6). Other known ways of psychophysically rendering stimuli inaccessible to consciousness depend on pattern rivalry (Andrews and Purves 1997), flash suppression (Breitmeyer and Rudd 1981; Kanai and Kamitani 2003; Kreiman et al. 2005; Wilke et al. 2003), motion-induced blindness (Bonneh et al. 2001; Hofstoetter et al. 2004), and transient stimulus decrements (May et al. 2003). 
When used in conjunction with neuroscientific techniques, the informative content of each of these techniques would be greatly enhanced if we could specify where in the functional hierarchy of visual information processing each of these suppression mechanisms occurs relative to the others. For instance, if pattern rivalry were to occur functionally before motion-induced blindness, and neuroscientific research were to show that motion-induced blindness occurs in, say, area V2, we could fairly confidently maintain that the neural mechanism of pattern rivalry occurs before V2. Moreover, as well as knowing that the neural correlates for conscious vision reside later
than the stages of both pattern rivalry suppression and motion-induced blindness, we could also possibly unravel the types of information processed at and between each of these two unconscious processing stages. Such findings may bear relevantly on our understanding of neurological disorders such as blindsight, neglect, etc. in which unconscious forms of visual processing are preserved. Additionally, they could be of interest to the larger community of cognitive neuroscientists, including those concerned with the distinctions between vision for action, vision for perception, and vision for emotion, as well as philosophers grappling with the various (hard and soft) problems concerning consciousness and body–mind relations. Even if a functional hierarchy of unconscious visual processing is established, it is hard to imagine that a definite site or locus of neural correlates of conscious vision will readily be established. The search space may have been narrowed; however, if, as assumed by many cognitive scientists and neuroscientists, consciousness supervenes on the exchange of neural signals among the many sites in a highly interactive and complex cortico-thalamic network, a definite site or locus for conscious vision will be hard to pinpoint. Perhaps, at some time in the future, we will have to be satisfied with specifying the various cortical and subcortical loci responsible for rendering the contents of conscious and unconscious vision, knowing all along that consciousness as a state, taken as given, must be differentiated from consciousness as a trait or aspect of at least some visual contents (Stoerig 2002).
What we hope that this book demonstrates is the circumscribed promise that at least one of these psychophysical techniques holds for specifying the levels of types of conscious and unconscious processing of visual contents, and the wider promise it may hold when combined with other psychophysical techniques without, however, solving the critical and hard problems of consciousness left to researchers of more optimistic and sanguine temperament.
Appendix A
Some mathematical aspects of masking models
A.1. Multiplicative (shunting) and additive equations
In Chapter 5, we introduced two ‘generic’ formalisms that have been used extensively in neural modeling (e.g. Grossberg 1988; Koch and Segev 1989). The first equation has the form of the Hodgkin–Huxley equation and is written as
$$\frac{dV_m}{dt} = -A V_m + (B - V_m)\,g_d - (D + V_m)\,g_h. \tag{A1}$$
The physiological interpretation of the variables and parameters is given in Chapter 5. In particular, $g_d$ and $g_h$ represent variable (active) conductances through which input signals modulate the membrane potential $V_m$. In this sense, this is an ‘active’ model of a neuron. Because of the multiplicative interactions between the input and the membrane potential, the model is also known as the multiplicative or shunting model (e.g. Grossberg 1988). A simpler version of this model is the ‘additive’ model
$$\frac{dV_m}{dt} = -A V_m + \text{excitatory inputs} - \text{inhibitory inputs}, \tag{A2}$$
where conductances are fixed and the input modulates the membrane potential via additive currents. The BCS model (Chapter 4, section 4.7) and the RECOD model (Chapter 5) build their dynamic network representations using equations (A1) and (A2). We will show below that the equations used in Bridgeman’s and Weisstein’s models are of the additive type.
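The contrast between the two formalisms can be made concrete with a small numerical sketch. The Python code below integrates both equations with a forward-Euler step, assuming the sign conventions dV_m/dt = −A V_m + (B − V_m) g_d − (D + V_m) g_h for the shunting form and dV_m/dt = −A V_m + excitatory − inhibitory for the additive form. The parameter values, the step size, and all function names are illustrative assumptions, not values taken from the BCS or RECOD models.

```python
def simulate(rhs, v0=0.0, dt=0.001, steps=20000):
    """Forward-Euler integration of dV/dt = rhs(V); returns the final V."""
    v = v0
    for _ in range(steps):
        v += dt * rhs(v)
    return v


def shunting_rhs(v, g_d, g_h, A=1.0, B=1.0, D=1.0):
    """Right-hand side of the shunting form assumed for (A1)."""
    return -A * v + (B - v) * g_d - (D + v) * g_h


def additive_rhs(v, excitation, inhibition, A=1.0):
    """Right-hand side of the additive form assumed for (A2)."""
    return -A * v + excitation - inhibition


def shunting_steady_state(g_d, g_h, A=1.0, B=1.0, D=1.0):
    """Value of V at which the shunting right-hand side vanishes."""
    return (B * g_d - D * g_h) / (A + g_d + g_h)


if __name__ == "__main__":
    # Hypothetical input strengths spanning two orders of magnitude.
    for strength in (1.0, 10.0, 100.0):
        v_shunt = simulate(lambda v: shunting_rhs(v, strength, 0.0))
        v_add = simulate(lambda v: additive_rhs(v, strength, 0.0))
        print(strength, round(v_shunt, 4), round(v_add, 4))
```

Under these assumptions the shunting response stays between −D and B however strong the input becomes, because the input multiplies the distance of V_m from its bounds, whereas the additive response grows in proportion to the net input.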
Consider first the equations of Bridgeman’s Hartline–Ratliff inhibitory network:
$$r_i(t) = e_i(t) - \sum_{j=1}^{n} w_{j,i}\,[r_j(t - |i - j|) - r^0_{j,i}] \tag{A3}$$
where $r_i(t)$ is the firing rate of neuron $i$ at time $t$, $e_i(t)$ is the excitatory input to the $i$th neuron, $w_{j,i}$ is the synaptic weight for the connection from the $j$th neuron to the $i$th neuron, $r^0_{j,i}$ is the firing threshold for the connection between the $j$th and the $i$th neuron, and $n$ is the number of neurons in the network. Equation (A3) is a difference equation and equation (A2) is a differential equation. The two equations can be compared by approximating the derivative by a backward-difference formula. Following the transformations used by Grossberg (1988), let
$$\frac{dr_i(t)}{dt} \approx r_i(t) - r_i(t - 1). \tag{A4}$$
Equation (A3) can be written as
$$r_i(t) - r_i(t - 1) = -r_i(t - 1) + e_i(t) - \sum_{j=1}^{n} w_{j,i}\,[r_j(t - |i - j|) - r^0_{j,i}]. \tag{A5}$$
Using equation (A5), we obtain
$$\frac{dr_i(t)}{dt} \approx -r_i(t - 1) + e_i(t) - \sum_{j=1}^{n} w_{j,i}\,[r_j(t - |i - j|) - r^0_{j,i}], \tag{A6}$$
which has the same form as equation (A2) with $A = 1$ and with $e_i(t)$ and $\sum_{j=1}^{n} w_{j,i}\,[r_j(t - |i - j|) - r^0_{j,i}]$ corresponding to the excitatory and inhibitory inputs, respectively. Note that the state variables in the two equations correspond to different physiological variables (membrane potential in equation (A2) and firing rate in equation (A6)). Moreover, the Hartline–Ratliff model applies thresholds directly to firing rates, whereas network formulations of equations (A1) and (A2) apply thresholds to membrane potentials.
As discussed in Chapter 4, the building blocks of Weisstein’s model are the two-factor Rashevsky–Landahl equations
$$\frac{d\varepsilon_j}{dt} = -a_j \varepsilon_j + A_j f_i(t) \tag{A7}$$
MULTIPLICATIVE (SHUNTING) AND ADDITIVE EQUATIONS
x1
–
+ x2 +
fi
Fig. A.1 Two-neuron additive model equivalent for the two-factor Rashevsky-Landahl model. (Reproduced from Ög ˘men 1993)
djj bj jj Bj fi(t) dt
(A8)
fj(t) [ j jj hj]
(A9)
where $\varepsilon_j$ and $j_j$ are the excitatory and inhibitory factors, respectively, of the $j$th neuron. These factors can be interpreted and modeled as excitatory and inhibitory neurotransmitters. Alternatively, we can introduce an additional ‘interneuron’, as shown in Figure A.1, to express this model as a small circuit of two neurons (Öğmen 1993, Appendix B). Let the two neurons in Figure A.1 obey the additive equations

$$\frac{dx_1}{dt} = -\alpha x_1 + \beta f_i(t) - \gamma x_2 \tag{A10}$$

$$\frac{dx_2}{dt} = -\delta x_2 + \eta f_i(t) \tag{A11}$$

with the output of $x_1$ given by

$$[x_1(t) - \Gamma]^{+}. \tag{A12}$$

This output corresponds to the output of the two-factor neuron with the following parameter identifications: $\alpha = a_j$, $\beta = A_j - B_j$, $\gamma = a_j - b_j$, $\delta = b_j$, $\eta = B_j$, and $\Gamma = h_j$.
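The equivalence between the two-factor neuron and the two-neuron circuit can be checked numerically. The Python sketch below is illustrative only. It assumes the additive circuit has the form dx1/dt = -alpha*x1 + beta*f - gamma*x2 and dx2/dt = -delta*x2 + eta*f with output [x1 - Gamma]+, and it identifies x1 with the difference between the excitatory and inhibitory factors and x2 with the inhibitory factor. The function names, the forward-Euler integration, the parameter values, and the labels `eta` and `Gamma` are choices made for this example and are not taken from Öğmen (1993).

```python
def rectify(x):
    """Half-wave rectification [x]+."""
    return max(x, 0.0)


def simulate_two_factor(inputs, a, b, A, B, h, dt):
    """Euler integration of the two-factor equations.

    eps: excitatory factor, j: inhibitory factor, output [eps - j - h]+.
    """
    eps = 0.0
    j = 0.0
    out = []
    for f in inputs:
        eps += dt * (-a * eps + A * f)
        j += dt * (-b * j + B * f)
        out.append(rectify(eps - j - h))
    return out


def simulate_two_neuron(inputs, alpha, beta, gamma, delta, eta, Gamma, dt):
    """Euler integration of the assumed two-neuron additive circuit.

    x1 is the output neuron, x2 the inhibitory interneuron.
    """
    x1 = 0.0
    x2 = 0.0
    out = []
    for f in inputs:
        # Both derivatives are computed from the state at the start of the step.
        dx1 = -alpha * x1 + beta * f - gamma * x2
        dx2 = -delta * x2 + eta * f
        x1 += dt * dx1
        x2 += dt * dx2
        out.append(rectify(x1 - Gamma))
    return out


if __name__ == "__main__":
    # Arbitrary illustrative parameters: a 2-unit-long pulse followed by silence.
    pulse = [1.0] * 200 + [0.0] * 300
    a, b, A, B, h = 1.0, 0.2, 2.0, 1.0, 0.1
    ref = simulate_two_factor(pulse, a, b, A, B, h, dt=0.01)
    alt = simulate_two_neuron(pulse, alpha=a, beta=A - B, gamma=a - b,
                              delta=b, eta=B, Gamma=h, dt=0.01)
    print(max(abs(r - s) for r, s in zip(ref, alt)))
```

Because the change of variables is linear, the two simulations should agree to within floating-point rounding at every step, whatever the step size.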
A.2. The approach of Anbar and Anbar

Anbar and Anbar (1982) proposed a mathematical model for masking based on three assumptions: a step visual response function (VRF) with exponential decay, temporal integration, and lateral inhibition.

A.2.1. Step visual response function

The VRF $v(t)$ for an input of intensity $I$ applied at time $t = 0$ and of duration $t_0$ is assumed to have the following form:

$$v(t) = cI^{\beta} \ \text{for } 0 \le t \le t_0 \quad \text{and} \quad v(t) = cI^{\beta}\exp[-\alpha(t - t_0)] \ \text{for } t_0 < t \tag{A13}$$

where $\beta$ is a constant and $\alpha$ is a power function of the input, i.e. $\alpha = kI^{\gamma}$, where $k$ and $\gamma$ represent two constants. Thus the response is assumed to increase stepwise to the level $cI^{\beta}$ while the stimulus is on and to decay exponentially after the stimulus is turned off. Using equations (A1) and (A2) we also obtain an exponential rise and decay in the response. Anbar and Anbar use a step increase as a simplification and assume specific power relations for response magnitude and decay rate.

A.2.2. Temporal integration
Temporal integration is used as a linking assumption for perceived brightness $V(I)$:

$$V(I) = \int_0^{\infty} v(t)\,dt = cI^{\beta} t_0 \left(1 + \frac{1}{\alpha t_0}\right). \tag{A14}$$
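As a sanity check on equation (A14), the step-and-decay VRF of equation (A13) can be integrated numerically and compared with the closed form. The Python sketch below is a minimal illustration. The exponent names `beta` and `gamma`, the constants `c` and `k`, their default values, and the midpoint-rule integrator are assumptions made for this example and are not values from Anbar and Anbar (1982).

```python
import math


def vrf(t, intensity, t0, c=1.0, beta=0.5, k=0.05, gamma=0.3):
    """Step visual response function with exponential decay (equation A13)."""
    amplitude = c * intensity ** beta
    alpha = k * intensity ** gamma  # decay rate as a power function of intensity
    if t < 0.0:
        return 0.0
    if t <= t0:
        return amplitude
    return amplitude * math.exp(-alpha * (t - t0))


def brightness_closed_form(intensity, t0, c=1.0, beta=0.5, k=0.05, gamma=0.3):
    """Perceived brightness from the closed form of equation (A14)."""
    alpha = k * intensity ** gamma
    return c * intensity ** beta * t0 * (1.0 + 1.0 / (alpha * t0))


def brightness_numeric(intensity, t0, t_max=600.0, n_steps=60000, **params):
    """Midpoint-rule approximation of the integral of v(t) from 0 to t_max."""
    dt = t_max / n_steps
    total = 0.0
    for i in range(n_steps):
        total += vrf((i + 0.5) * dt, intensity, t0, **params)
    return total * dt


if __name__ == "__main__":
    print(brightness_closed_form(4.0, 20.0))
    print(brightness_numeric(4.0, 20.0))
```

The integration window `t_max` only needs to be long enough for the exponential tail to become negligible. With the default decay rate used here, several hundred time units is ample.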
A temporal integration assumption is also used in other models (e.g. Weisstein’s model and the RECOD model) to link neural activities to perceived brightness.

A.2.3. Lateral inhibition
Finally, Anbar and Anbar assume that masking results from a type of simultaneous brightness contrast effected by lateral inhibition such that the weaker of the two stimuli is suppressed by the stronger. This suppression is assumed to be a step decrease by an amount proportional to the $p$th power of the ratio between the two VRFs at the onset of the masking stimulus. For example, assume that the target and the mask have intensities $I_T$ and $I_M$, respectively, with $I_T < I_M$. Assume also that the mask is applied while the target is on. By equation (A13), when the target is on, the VRF for the target is $cI_T^{\beta}$. Similarly, the VRF for the mask is $cI_M^{\beta}$. Since the mask is stronger than the target, the VRF for the target will be suppressed by the ratio

$$(I_T/I_M)^{\beta p} \tag{A15}$$

reaching stepwise the level

$$(I_T/I_M)^{\beta p}\, cI_T^{\beta}. \tag{A16}$$

However, if the mask is applied after the offset of the target at time $\tau > t_0$, the VRF will be suppressed by the ratio

$$\left\{ \frac{I_T^{\beta} \exp[-\alpha_0(\tau - t_0)]}{I_M^{\beta}} \right\}^{p} \tag{A17}$$

reaching stepwise the level

$$\left\{ \frac{I_T^{\beta} \exp[-\alpha_0(\tau - t_0)]}{I_M^{\beta}} \right\}^{p} cI_T^{\beta} \exp[-\alpha_0(\tau - t_0)]. \tag{A18}$$
Inspection of these expressions shows how a U-shaped backward masking function is obtained. First, consider equation (A17). For a fixed mask intensity, larger values of the mask onset time τ lead to smaller values of this ratio. Since the VRF is multiplied by this ratio, smaller values of the ratio imply larger suppression of activity. Thus, from the perspective of the ratio (A17), masking becomes more effective as ISI increases. On the other hand, inspection of (A18) shows that the suppressed VRF is given by the product of this ratio and an exponentially decaying VRF. Because larger values of ISI correspond to smaller values of the exponentially decaying VRF, less activity remains to be suppressed, so from the perspective of the ongoing VRF, masking becomes less and less effective as ISI increases. Putting the two opposing tendencies together, we obtain a U-shaped function. It should be noted that a secondary effect of the drop in activity is the concomitant change in the rate of subsequent decay. However, Francis (2000) showed that a U-shaped function can be obtained without the change in decay rate, although joint changes in amplitude and decay rate produce a U-shaped masking function that is quantitatively different from that obtained solely by a change in amplitude.
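The two opposing tendencies can be made concrete with a short numerical sketch. The Python code below is illustrative only and rests on stated assumptions: visibility is taken to be the temporal integral of the target's VRF, the decay rate is held fixed after suppression (the simplified case attributed above to Francis 2000), and all parameter values and function names are arbitrary choices for this example. Under these assumptions the integrated visibility as a function of ISI has a single interior minimum, located where exp(-alpha * p * ISI) = 1 / ((1 + p) * r0**p), with r0 = (I_T / I_M)**beta.

```python
import math


def suppression_ratio(isi, i_t=1.0, i_m=1.2, beta=1.0, alpha=0.02, p=2.0):
    """Ratio of equation (A17), written in terms of ISI = tau - t0."""
    return ((i_t ** beta) * math.exp(-alpha * isi) / (i_m ** beta)) ** p


def target_visibility(isi, i_t=1.0, i_m=1.2, t0=20.0, c=1.0,
                      beta=1.0, alpha=0.02, p=2.0):
    """Integrated target VRF when the mask arrives `isi` after target offset.

    Assumes the decay rate alpha is unchanged by the suppression step.
    """
    amplitude = c * i_t ** beta
    plateau = amplitude * t0                                     # 0 <= t <= t0
    before = amplitude * (1.0 - math.exp(-alpha * isi)) / alpha  # t0 < t <= tau
    ratio = suppression_ratio(isi, i_t, i_m, beta, alpha, p)
    after = ratio * amplitude * math.exp(-alpha * isi) / alpha   # t > tau
    return plateau + before + after


def optimal_isi(i_t=1.0, i_m=1.2, beta=1.0, alpha=0.02, p=2.0):
    """Analytic ISI of maximal masking under the same assumptions."""
    r0 = (i_t / i_m) ** beta
    return math.log((1.0 + p) * r0 ** p) / (alpha * p)


if __name__ == "__main__":
    isis = [0.5 * n for n in range(401)]            # ISIs from 0 to 200
    visibility = [target_visibility(s) for s in isis]
    best = isis[visibility.index(min(visibility))]
    print(best, optimal_isi())
```

An interior minimum exists only when (1 + p) * r0**p exceeds 1. For a mask much stronger than the target, or for a small exponent p, the same expressions give maximal masking at zero ISI, so the masking function is monotonic rather than U-shaped.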
A.3. Concepts of mask and target blocking

The simple mathematical formulation of the Anbar and Anbar model leads to an important insight into how a U-shaped masking function can be produced. As mentioned above, because ratio (A17) depends on the strength of the target's activity, masking is more efficient when the mask is delayed. However, because the activity is decaying, the mask has to be applied while the activity remains at a significant level; otherwise, most of the target’s activity remains intact, leading to a high level of visibility. These two constraints together lead to an ‘optimal’ intermediate ISI at which maximum masking occurs. Francis (2000) formulated this intuition into a general property that he called mask blocking. Mask blocking is defined as the blocking of the mask’s inhibitory effect by the target’s activity (cf. the numerator in equation (A17)). The complementary concept, target blocking, occurs when the mask blocks the target signal from contributing to the visibility of the target. Assume, for example, that the inhibitory signal of the mask is largely independent of the activity of the target. In this case the inhibitory activity of the mask will reduce the activity of the target without a reciprocal interference from the target, and thus target blocking will occur. More detailed analysis and predictions can be found in Francis and Cho (2005).
Appendix B
Parameters of the RECOD model
Table B.1. Parameters used in Equation (5.6) Parameter
Value
J
0.40 16.0 0.13 12.0 0.0035
Table B.2. Parameters for the retinal network Parameter
Value
As Bs Ds Js Gse: Amp Gse: sd Gsi: Amp Gsi: sd s ns nt At Bt Gtse: Amp Gtse: sd
2.0 250.0 10.0 6.0 1.0 (0.03*60.0*60.0)/23.0 0.0135 (0.18*60.0*60.0)/23.0 0.10 0.00125 230.877 28.0 40.0 2.0 600.0 1.0 0.00743 (0.1*60*60)/23.0
Table B.3. Parameters for the sub-cortical network Parameter
Value
s
0.00035 60
Table B.4. Parameters for the postretinal network

Parameter               Value
Ap                      1.0
Bp                      1.0
: Contour               2.0
: Surface               7.0
s: Contour              0.09
s: Surface              0.25
npf                     24
np                      120
Ks                      10.0
kp                      9.0
Hpi: Amp Contour        1.5
Hpi: Amp Surface        0.2
Hpi: sd                 100.0
Qmp: Amp                5.0
Qmp: sd                 100.0
Aq                      1.0
Bq                      10.0
Am                      10.0
Bm                      1.0
Km                      10.0
Hmi: Amp                7.0
Hmi: sd                 56.0
Qpm: Amp                300.0
Qpm: sd                 80.0
References
Abbott, L.F., Varela, K., Sen, K., and Nelson, S.B. (1997). Synaptic depression and cortical gain control. Science 275, 220–3. Abrams, R.A. and Law, M.B. (2000). Object-based visual attention with endogenous orienting. Percept Psychophys 62, 818–33. Adelson, E.H. and Bergen, J.R. (1985). Spatio-temporal energy models for the perception of motion. J Opt Soc Am A 2, 284–99. Adelson, E.H. and Bergen, J.R. (1986). The extraction of spatiotemporal energy in human and machine vision. In Motion: Representation and Analysis (IEEE Workshop Proceedings). Piscataway, NJ: IEEE Publications, pp. 151–5. Alais, D. and Blake, R. (2005). Binocular Rivalry. Cambridge, MA: MIT Press. Allport, D.A. (1970). Temporal summation and phenomenal simultaneity: experiments with the radius display. Q J Exp Psychol 22, 686–701. Alpern, M. (1952). Metacontrast: historical introduction. Am J Opt 29, 631–46. Alpern, M. (1953). Metacontrast. J Opt Soc Am 43, 648–57. Alpern, M. (1965). Rod–cone independence in the after-flash effect. J Physiol 176, 462–72. Alpern, M. and Barr, L. (1962). Durations of the after-images of brief light flashes and the theory of the Broca and Sulzer effect. J Opt Soc Am 52, 219–21. Alpern, M. and Rushton, W.A.H. (1965). The specificity of the cone interaction in the after-flash effect. J Physiol 176, 473–82. Alpern, M., Rushton, W.A.H., and Torii, S. (1970a). The size of rod signals. J Physiol 206, 193–208. Alpern, M., Rushton, W.A.H., and Torii, S. (1970b). The attenuation of rod signals by backgrounds. J Physiol 206, 209–27. Alpern, M., Rushton, W.A.H., and Torii, S. (1970c). The attenuation of rod signals by bleachings. J Physiol 207, 449–61. Alpern, M., Rushton, W.A.H., and Torii, S. (1970d). Signals from cones. J Physiol 207, 463–75. Amassian, V.E., Cracco, R.Q., Maccabee, P.J., Cracco, J.B., Rudell, A., and Eberle, L. (1989). Suppression of visual perception by magnetic coil stimulation of human occipital cortex. Electroencephalogr Clin Neurophysiol 74, 458–62.
Amassian, V.E., Cracco, R.Q., Maccabee, P.J., Cracco, J.B., Rudell, A.P., and Eberle, L. (1993). Unmasking human visual perception with the magnetic coil and its relationship to hemispheric asymmetry. Brain Res 605, 312–16. Amassian, V.E., Cracco, R.Q., Maccabee, P.J., Cracco, J.B., Rudell, A.P., and Eberle, L. (1998). Transcranial magnetic stimulation in study of the visual pathway. J Clin Neurophysiol 15, 288–304. Anbar, S. and Anbar, D. (1982). Visual Masking: a unified approach. Perception 11, 427–39.
Anderson, C.H. and van Essen, D.C. (1987). Shifter circuits: a computational strategy for dynamic aspects of visual processing. Proc Natl Acad Sci USA 84, 6297–301. Anderson, S.J. and Burr, D.C. (1991). Spatial summation properties of directionally selective mechanisms in human vision. J Opt Soc Am A 8, 1330–9. Anderson, S.J., Burr, D.C., and Morrone, M.C. (1991). Two-dimensional spatial and spatial-frequency selectivity of motion-sensitive mechanisms in human vision. J Opt Soc Am A 8, 1340–51. Andreasen, N.C., Paradiso, S., and O’Leary, D.S. (1998). ‘Cognitive dysmetria’ as an integrative theory of schizophrenia: a dysfunction in cortical-subcortical-cerebellar circuitry? Schizophr Bull 24, 203–18. Andreassi, J.L., Mayzner, M.S., Beyda, D.R., and Davidovics, S. (1971). Visual cortical evoked potentials under conditions of sequential blanking. Percept Psychophys 10, 164–8. Andreassi, J.L., Stern, M., and Okamura, H. (1974). Visual cortical evoked potentials as a function of intensity variations in sequential blanking. Psychophysiology 11, 336–45. Andreassi, J.L., De Simone, J.J., and Mellers, B.W. (1975). Amplitude changes in the visual evoked potential with backward masking. Electroencephalogr Clin Neurophysiol 41, 387–98. Andrews, D.P. and Hammond, P. (1970). Mesopic increment threshold spectral sensitivity of single optic tract fibres in the cat: cone–rod interaction. J Physiol 209, 65–81. Andrews, T.J. and Purves, D. (1997). Similarities in normal and binocularly rivalrous viewing. Proc Natl Acad Sci USA 94, 99005–8. Ansorge, U. (2003). Asymmetric influences of temporally vs. nasally presented masked visual information: evidence for collicular contributions to nonconscious priming effects. Brain Cogn 51, 317–25. Ansorge, U. (2004). Top-down contingencies of nonconscious priming revealed by dual-task interference. Q J Exp Psychol 57A, 1123–48. Ansorge, U. and Neumann, O. (2005). 
Intentions determine the effect of nonconsciously registered visual information: evidence for direct parameter specification in the metacontrast dissociation. J Exp Psychol Hum Percept Perform 31, 762–77. Ansorge, U., Klotz, W., and Neumann, O. (1998). Manual and verbal responses to completely masked (unreportable) stimuli: exploring some conditions for the metacontrast dissociation. Perception 27, 1177–89. Anstis, S.M. and Moulden, B.P. (1970). After effects of seen movement: evidence for peripheral and central components. Q J Exp Psychol 22, 222–9. Arrington, K.F. (1994). The temporal dynamics of brightness filling-in. Vision Res 34, 3371–87. Assad, J. (1999). Now you see it: frontal eye field responses to invisible targets. Nat Neurosci 2, 205–6. Atchley, P., Grobe, J., and Fields, L.M. (2002). The effect of smoking on sensory and attentional masking. Percept Psychophys 64, 328–36. Aubert, H. (1865). Physiologie der Netzhaut. Breslau: Morgenstern. Averbach, E. and Coriell, A.S. (1961). Short-term memory in vision. Bell Syst Tech J 40, 309–28. Azzopardi, P., Jones, K.E., and Cowey, A. (1999). Uneven mapping of magnocellular and parvocellular projections from the lateral geniculate nucleus to the striate cortex in the macaque monkey. Vision Res 39, 2179–89.
Baade, W. (1917a). Selbstbeobachtungen und Introvokation. Zeitschr Psychol Physiol Sinnesorg 79, 68–96. Baade, W. (1917b). Experimentelle Untersuchungen der darstellenden Psychologie des Wahrnehmungsprozesses. Zeitschr Psychol Physiol Sinnesorg 79, 97–127. Bachmann, T. (1984). The process of perceptual retouch: nonspecific afferent activation dynamics in explaining visual masking. Percept Psychophys 35, 69–84. Bachmann, T. (1988). Time course of the subjective contrast enhancement for a second stimulus in successively paired above-threshold transient forms: perceptual retouch instead of forward masking. Vision Res 28, 1255–61. Bachmann, T. (1994). Psychophysiology of Visual Masking: The Fine Structure of Conscious Experience. Commack, NY: Nova Science. Bachmann, T. (1997). Visibility of brief images: the dual-process approach. Conscious Cogn 6, 491–518. Bachmann, T. (1999). Twelve spatiotemporal phenomena and one explanation. In Cognitive Contributions to the Perception of Spatial and Temporal Events (ed G. Aschersleben, T. Bachmann, and J. Muesseler). Amsterdam: Elsevier, pp. 173–206. Bachmann, T. (2000). Microgenetic Approach to the Conscious Mind. Amsterdam: John Benjamins. Bachmann, T. and Allik, J. (1976). Integration and interruption in the masking of form by form. Perception 5, 79–97. Bachmann, T. and Põder, E. (2001). Change in feature space is not necessary for the flash-lag effect. Vision Res 41, 1103–6. Bachmann, T., Asser, T., Sarv, M., et al. (1998). Speed of elementary visual recognition operations in Parkinson’s disease as measured by the mutual masking method. J Clin Exp Neuropsychol 20, 118–34. Bacon-Macé, N., Macé, M.J., Fabre-Thorpe, M., and Thorpe, S.J. (2005). The time course of visual processing: backward masking and natural scene categorisation. Vision Res 45, 1459–69. Bair, W., Cavanaugh, J.R., Smith, M.A., and Movshon, J.A. (2002). The timing of response onset and offset in macaque visual neurons. J Neurosci 22, 3189–205. Baker, W. (1963).
Initial stages of light and dark adaptation. J Opt Soc Am 53, 98–103. Balota, D.A. (1983). Automatic semantic activation and episodic memory encoding. J Verb Learn Verb Behav 22, 88–104. Banta, A. and Breitmeyer, B.G. (1985). Stationary patterns suppress the perception of stroboscopic motion. Vision Res 25, 1501–5. Barlow, H.B., Fitzhugh, R., and Kuffler, S.W. (1957a). Dark adaptation, absolute threshold and Purkinje shift in single units of the cat’s retina. J Physiol 137, 327–37. Barlow, H.B., Fitzhugh, R., and Kuffler, S.W. (1957b). Change of organization of the receptive fields of the cat’s retina during dark adaptation. J Physiol 137, 338–54. Baron, J. and Thurstone, I. (1973). An analysis of the word superiority effect. Cogn Psychol 4, 207–28. Baroncz, Z. (1911). Versuch über den sogenannten Metakontrast. Pflugers Arch Gesamte Physiol 140, 491–507. Barris, M.C. and Frumkes, T.E. (1978). Rod–cone interaction in human scotopic vision. IV. Cones stimulated by contrast flashes influence rod threshold. Vision Res 18, 801–8.
Barry, S.H. and Dick, O. (1972). On the ‘recovery’ of masked targets. Percept Psychophys 12, 117–20. Bartley, S.H. (1938). A central mechanism in brightness discrimination. Proc Soc Exp Biol Med 38, 535–6. Baseler, H.A. and Sutter, E.E. (1997). M and P components of the VEP and their visual field distribution. Vision Res 37, 675–90. Bashinski, H.S. and Bacharach, V.R. (1980). Enhancement of perceptual sensitivity as the result of selectively attending to spatial locations. Percept Psychophys 28, 241–8. Battersby, W.S. and Wagman, I.H. (1962). Neural limitations of visual excitability. IV. Spatial determinants of retrochiasmal interaction. Am J Physiol. 203, 359–65. Battersby, W.S., Oesterreich, R.E., and Sturr, J.F. (1964). Neural limitations of visual excitability. VII. Nonhomonymous retrochiasmal interactions. Am J Physiol 206, 1181–8. Baumgardt, E. and Segal, J. (1942). Facilitation et inhibition parameters de la function visuelle. Ann Psychol 43–44, 54–102. Baxt, N. (1871). Über die Zeit, welche nötig ist, damit ein Gesichtseindruck zum Bewusstsein kommt und über die Grösse (Extension) der bewussten Wahrnehmung bei einem Gesichtseindruck von gegebener Dauer. Arch Gesamte Psychol 4, 325–36. Becker, M.W. and Anstis, S.M. (2004). Metacontrast masking is specific to luminance polarity. Vision Res 44, 2537–43. Benardete, E.A. and Kaplan, E. (1997). The receptive field of the primate P retinal ganglion cell. I. Linear dynamics. Vis Neurosci 14, 169–85. Benardete, E.A., Kaplan, E., and Knight, B.W. (1992). Contrast gain control in the primate retina: P cells are not X-like, some M cells are. Vis Neurosci 8, 483–6. Benevento, L.A., Creutzfeldt, O.D., and Kuhnt, U. (1972). Significance of intracortical inhibition in the visual cortex. Nat New Biol 238, 124–6. Berbaum, K., Weisstein, N., and Harris, C. (1975). A vertex-superiority effect. Bull Psychon Soc 6, 418. Berger, C. (1954). 
Illumination of surrounding field and flicker fusion frequency with foveal images of different sizes. Acta Physiol Scand 30, 161–70. Berman, N.J., Douglas, R.J., Martin, K.A., and Whitteridge, D. (1991). Mechanisms of inhibition in cat visual cortex. J Physiol 440, 697–722. Bernhard, C.G. (1940). Time correlations in man of electrophysiological and sensory phenomena following light stimuli. Acta Physiol Scand 1 (Suppl 1), 52–94. Bernstein, I., Amundson, V.E., and Schurman, D.L. (1973a). Metacontrast inferred from reaction time and verbal report: replications and comments on the Fehrer–Biederman experiment. J Exp Psychol 100, 195–201. Bernstein, I.H., Proctor, J.D., Proctor, R.W., and Schurman, D.L. (1973b). Metacontrast and brightness discrimination. Percept Psychophys 14, 293–7. Bernstein, I.H., Fisicaro, S.A., and Fox, J.A. (1976). Metacontrast suppression and criterion content: a discriminant function analysis. Percept Psychophys 20, 198–204. Berson, D.M. and McIlwain, J.T. (1982). Retinal Y-cell activation of deep-layer cells in superior colliculus of the cat. J Neurophysiol 47, 700–14. Bevan, W., Jonides, J., and Collyer, S.C. (1970). Chromatic relationships in metacontrast suppression. Psychon Sci 19, 367–8.
Bex, P.J., Edgar, G.K., and Smith, A.T. (1995). Sharpening of drifting blurred images. Vision Res 35, 2539–46. Bidwell, S. (1899). Curiosities of Light and Sight. London: Swan Sonnenschein & Co. Ltd. Biederman, I. (1987). Recognition-by-components: a theory of human image understanding. Psychol Rev 94, 115–47. Bischof, W.F. and Di Lollo, V. (1990). Perception of directional sampled motion in relation to displacement and spatial frequency: evidence for a unitary motion system. Vision Res 30, 1341–62. Bischof, W.F. and Di Lollo, V. (1995). Motion and metacontrast with simultaneous onset of stimuli. J Opt Soc Am A 12, 1623–36. Bischof, W.F., Seiffert, A.E., and Di Lollo, V. (1996). Transient-sustained input to directionally selective motion mechanisms. Perception 25, 1263–80. Blake, R. and Logothetis, N.K. (2002). Visual competition. Nat Rev Neurosci 3, 13–23. Blakemore, C. and Eggers, H.M. (1978). Effects of artificial anisometropia and strabismus on the kitten’s visual cortex. Arch Ital Biol 116, 385–9. Blakemore, C. and Tobin, E.A. (1972). Lateral inhibition between orientation detectors in the cat’s visual cortex. Exp Brain Res 15, 439–40. Blakemore, C., Carpenter, R.H., and Georgeson, M.A. (1970). Lateral inhibition between orientation detectors in the human visual system. Nature 228, 37–9. Blanc-Garin, J. (1973). Effet de la distance spatiale dans le masquage latéral rétroactif et proactive. Perception 1, 441–52. Blanchard, J. (1918). The brightness sensibility of the retina. Phys Rev 11 (Series 2), 81–99. Blick, D.W. and MacLeod, D.I.A. (1978). Rod threshold: influence of neighboring cones. Vision Res 18, 1611–16. Bloch, A.M. (1885). Experience sur la vision. C R Seances Soc Biol Fil 37, 493–5. Block, N. (1995). On a confusion of a function of consciousness. Behav Brain Sci 18, 227–87. Block, N. (1996). How can we find the neural correlate of consciousness? Trends Neurosci 19, 456–9. Bodis-Wollner, I. and Hendley, C. D. (1977). 
Relation of evoked potentials to pattern and local luminance detectors in the human visual system. In Visual Evoked Potentials in Man: New Developments (ed J.E. Desmedt). Oxford: Clarendon Press, pp. 197–207. Bodis-Wollner, I. and Hendley, C.D. (1979). On the separability of two mechanisms involved in the detection of grating patterns in humans. J Physiol 291, 251–63. Bonneh, Y.S., Cooperman, A., and Sagi, D. (2001). Motion-induced blindness in normal observers. Nature 411, 798–801. Borsting, E., Ridder, W.H., III, Dudeck, K., Kelley, C., Matsui, L., and Motoyama, J. (1996). The presence of a magnocellular defect depends on the type of dyslexia. Vision Res 36, 1047–53. Bowen, R.W. and Wilson, H.R. (1994). A two-process analysis of pattern masking. Vision Res 34, 645–57. Bowen, R.W., Pola, J., and Matin, L. (1974). Visual persistence effects of flash luminance, duration, and energy. Vision Res 14, 295–303. Bowen, R.W., Pokorny, J., and Cacciato, D. (1977). Metacontrast masking depends on luminance transients. Vision Res 17, 971–5.
Bowling, A. and Lovegrove, W. (1980). The effect of stimulus duration on the persistence of gratings. Percept Psychophys 27, 574–8. Bowling, A. and Lovegrove, W. (1981). Two components to visible persistence: effects of orientation and contrast. Vision Res 21, 1241–51. Bowling, A., Lovegrove, W., and Mapperson, B. (1979). The effect of spatial frequency and contrast on visual persistence. Perception 8, 529–39. Boyer, J. and Ro, T. (in press). Attention attenuates metacontrast masking. J Cogn. Boynton, R.M. (1972). Discrimination of homogeneous double pulses of light. In Handbook of Sensory Physiology. Vol. 7/4: Visual Psychophysics (ed D. Jameson and L.M. Hurvich). New York: Springer, pp. 202–32. Boynton, R.M. and Miller, N.D. (1963). Visual performance under conditions of transient adaptation. Illum Eng 58, 541–50. Braddick, O.J. (1974). A short-range process in apparent motion. Vision Res 14, 519–27. Braddick, O.J. (1980). Low-level and high-level processes in apparent motion. Philos Trans R Soc Lond B 290, 137–51. Braff, D.L. (1981). Impaired speed of information processing in nonmedicated schizotypal patients. Schizophr Bull 7, 499–508. Brand, A., Kopmann, S., and Herzog, M. (2004). Intact feature fusion in schizophrenic patients. Eur Arch Psychiatry Clin Neurosci 254, 281–8. Brehaut, J.C., Enns, J.T., and Di Lollo, V. (1999). Visual masking plays two roles in the attentional blink. Percept Psychophys 61, 1436–48. Breitmeyer, B.G. (1973). A relationship between the detection of size, rate, orientation and direction in the human visual system. Vision Res 13, 41–58. Breitmeyer, B.G. (1975). Simple reaction time as a measure of the temporal response properties of transient and sustained channels. Vision Res 15, 1411–12. Breitmeyer, B.G. (1978a). Disinhibition of metacontrast masking of Vernier acuity targets: sustained channels inhibit transient channels. Vision Res 18, 1401–5. Breitmeyer, B.G. (1978b). Metacontrast masking as a function of mask energy.
Bull Psychon Soc 12, 50–2. Breitmeyer, B.G. (1978c). Metacontrast with black and white stimuli: evidence for inhibition of on and off sustained activity by either on or off transient activity. Vision Res 18, 1443–8. Breitmeyer, B.G. (1980). Unmasking visual masking: a look at the ‘why’ behind the veil of the ‘how’. Psychol Rev 87, 52–69. Breitmeyer, B.G. (1984). Visual Masking. New York: Oxford University Press. Breitmeyer, B.G. (1989). A visually based deficit in specific reading disability. Ir J Psychol 10, 566–73. Breitmeyer, B.G. (1991). Reality and relevance of sustained and transient channels in reading and reading disability. In Oculomotor Control and Cognitive Processes—Normal and Pathological Aspects (ed R. Schmid and D.D. Zambarbieri). Amsterdam: Elsevier, pp. 473–83. Breitmeyer, B.G. (1992). Parallel processing in human vision: history, review, and critique. In Applications of Parallel Processing in Vision (ed J. Brannan). Amsterdam: Elsevier, pp. 37–78.
Breitmeyer, B.G. (1993a). Sustained and transient channels in vision: a review and implications for reading. In Visual Processes in Reading and Spelling (ed D. Willows, E. Corcos, and R. Kruk). Hillsdale, NJ: Erlbaum, pp. 95–110. Breitmeyer, B.G. (1993b). The roles of sustained (P) and transient (M) channels in reading and reading disability. In Facets of Dyslexia and its Remediation (ed S.F. Wright and R. Groner). Amsterdam: Elsevier, pp. 13–31. Breitmeyer, B.G. (1994). The role of sustained and transient pathways in reading and reading disability. In Eye Movements in Reading (ed J. Ygge and G. Lennerstrand). Oxford: Elsevier, pp. 219–33. Breitmeyer, B.G. and Breier, J.I. (1994). Effects of background color on reaction time to stimuli varying in size and contrast: Inferences about human transient channels. Vision Res 34, 1039–45. Breitmeyer, B.G. and Ganz, L. (1976). Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression and information processing. Psychol Rev 83, 1–36. Breitmeyer, B.G. and Ganz, L. (1977). Temporal studies with flashed gratings: inferences about human transient and sustained channels. Vision Res 17, 861–5. Breitmeyer, B.G. and Halpern, M. (1978). Visual persistence depends on spatial frequency and retinal locus. Presented at the Annual Meeting of the Psychonomic Society, San Antonio, TX, November 1978. Breitmeyer, B.G. and Horman, K. (1981). On the role of stroboscopic motion in metacontrast. Bull Psychon Soc 17, 29–32. Breitmeyer, B.G. and Julesz, B. (1975). The role of on and off transients in determining the psychological spatial frequency response. Vision Res 15, 411–15. Breitmeyer, B.G. and Kersey, M. (1981). Backward masking by pattern stimulus offset. J Exp Psychol Hum Percept Perform 7, 972–7. Breitmeyer, B.G. and Öğmen, H. (2000). Recent models and findings in visual backward masking: a comparison, review, and update. Percept Psychophys 62, 1572–95. Breitmeyer, B.G. and Rudd, M.
(1981). A single-transient masking paradigm. Percept Psychophys 30, 604–6. Breitmeyer, B.G. and Williams, M.C. (1990). Effects of isoluminant-background color on metacontrast and stroboscopic motion: interactions between sustained (P) and transient (M) channels. Vision Res 30, 1069–75. Breitmeyer, B.G., Love, R., and Wepman, B. (1974). Contour masking during stroboscopic motion and metacontrast. Vision Res 14, 1451–6. Breitmeyer, B., Battaglia, F., and Weber, C. (1976). U-shaped backward contour masking during stroboscopic motion. J Exp Psychol Hum Percept Perform 2, 167–73. Breitmeyer, B.G., Levi, D.M., and Harwerth, R.S. (1981a). Flicker-masking in spatial vision. Vision Res 21, 1377–85. Breitmeyer, B.G., Rudd, M., and Dunn, K. (1981b). Metacontrast investigations of sustained–transient channel inhibitory interactions. J Exp Psychol Hum Percept Perform 7, 770–9. Breitmeyer, B.G., May, J.G., and Heller, S.S. (1991). Metacontrast reveals asymmetries at red-green isoluminance. J Opt Soc Am A 8, 1324–9.
Breitmeyer, B., Ehrenstein, A., Pritchard, K., Hiscock, M., and Crisan, J. (1999). The roles of location specificity and masking mechanisms in the attentional blink. Percept Psychophys 61, 798–809. Breitmeyer, B.G., Öğmen, H., and Chen, J. (2004a). Unconscious priming by color and form: different processes and levels. Conscious Cogn 13, 138–57. Breitmeyer, B.G., Ro, T., and Öğmen, H. (2004b). A comparison of masking by visual and transcranial magnetic stimulation: implications for the study of conscious and unconscious visual processing. Conscious Cogn 13, 829–43. Breitmeyer, B.G., Ro, T., and Singhal, N. (2004c). Unconscious priming with chromatic stimuli occurs at stimulus- not percept-dependent levels of visual processing. Psychol Sci 15, 198–202. Breitmeyer, B.G., Öğmen, H., and Koc, A. (2005a). Metacontrast and binocular-rivalry suppression reveal hierarchies of unconscious visual processing. Paper presented at the Annual Meeting of the Vision Sciences Society, Sarasota, FL, 6–11 May 2005. Breitmeyer, B.G., Öğmen, H., Ramon, J., and Chen, J. (2005b). Unconscious and conscious priming by forms and their parts. Vis Cogn 12, 720–36. Breitmeyer, B.G., Kafaligönül, H., Öğmen, H., Mardon, L., Todd, S., and Ziegler, R. (in press). Para- and metacontrast masking reveal different effects on brightness and contour visibility. Vision Res. Breitmeyer, B.G., Ziegler, R., and Hauske, G. (in preparation). Monoptic and dichoptic paracontrast effects on brightness and contour visibility. Bridgeman, B. (1971). Metacontrast and lateral inhibition. Psychol Rev 78, 528–39. Bridgeman, B. (1975). Correlates of metacontrast in single cells of the cat visual system. Vision Res 15, 91–9. Bridgeman, B. (1977). Reply to Brooks and Fuchs: exogenous and endogenous contributions to saccadic suppression. Vision Res 17, 323–6. Bridgeman, B. (1978). Distributed sensory coding applied to simulations of iconic storage and metacontrast. Bull Math Biol 40, 605–23. Bridgeman, B. (1980).
Temporal characteristics of cells in monkey striate cortex measured with metacontrast masking and brightness discrimination. Brain Res 196, 347–64. Bridgeman, B. (1988). Visual evoked potentials: concomitants of metacontrast in late components. Percept Psychophys 43, 401–3. Bridgeman, B. (2001). A comparison of two lateral inhibitory models of metacontrast. J. Math Psychol 45, 780–8. Bridgeman, B. and Leff, S. (1979). Interaction of stimulus size and retinal eccentricity in metacontrast masking. J Exp Psychol Hum Percept Perform 5, 101–9. Bridgeman, B., Lewis, S., Heit, G., and Nagle, M. (1979). Relation between cognitive and motor-oriented systems in visual position perception. J Exp Psychol Hum Percept Perform 5, 692–700. Bridgeman, B., Kirch, M., and Sperling, A. (1981). Segregation of cognitive and motor aspects of visual function using induced motion. Percept Psychophys 29, 336–42. Broca, A. and Sulzer, D. (1902). La sensation lumineuse en function du temps. J Physiol Pathol Gen 4, 632–40.
Brooks, B. and Jung, R. (1973). Neuronal physiology of the visual cortex. In Handbook of Sensory Physiology. Vol. VII/3: Central Processing of Visual Information. Part B (ed R. Jung). New York: Springer-Verlag pp. 325–440. Brown, J.L. (1965). Afterimages. In Vision and Visual Perception (ed C.H. Graham). New York: Wiley, pp. 479–503. Brussel, E.M., Stober, S.R., and Favreau, O.E. (1978). Contrast reversal in backward masking. Vision Res 18, 225–7. Buck, S.L., Peeples, D.R., and Makous, W. (1979). Spatial patterns of rod–cone interactions. Vision Res 19, 775–82. Bunt, A.H., Hendrickson, A.E., Lund, J.S., Lund, R.D., and Fuchs, A.F. (1975). Monkey retinal ganglion cells: morphometric analysis and tracing of axonal projections, with a consideration of the peroxidase technique. J Comp Neurol 164, 265–85. Burbeck, C.A. (1981). Criterion-free pattern and flicker thresholds. J Opt Soc Am 71, 1343–50. Burchard, S. and Lawson, R.B. (1973). A U-shaped detection function for backward masking of similar contours. J Exp Psychol 99, 35–41. Burr, D.C. (1980). Motion smear. Nature 284, 164–5. Burr, D.C. (1981). Temporal summation of moving images by the human visual system. Proc R Soc Lond B Biol Sci 211, 321–39. Burr, D.C. (1984). Summation of target and mask metacontrast stimuli. Perception 13, 183–92. Burr, D.C., Ross, J., and Morrone, M.C. (1986). Seeing objects in motion. Proc R Soc Lond Biol Sci 227, 249–65. Butler, P.D., Harkavy-Friedman, J.M., Amador, X.F., and Gorman, J.M. (1996). Backward masking in schizophrenia: relationship to medication status, neuropsychological functioning, and dopamine metabolism. Biol Psychiatry 40, 295–8. Butler, T.W., King-Smith, P.E., Moore, R.K., and Riggs, L. (1976). Visual sensitivity to retinal image motion. J Physiol 263, 170–1P. Cabeza, R. and Kato, T. (2000). Features are also important: contributions of featural and configural processing to face recognition. Psychol Sci 11, 429–33. Cadenhead, K.S., Perry, W., and Braff, D.L. (1996). 
The relationship of information-processing deficits and clinical symptoms in schizotypal personality disorder. Biol Psychiatry 40, 853–8. Calis, G. and Leeuwenberg, E. (1981). Grounding the figure. J Exp Psychol Hum Percept Perform 7, 1386–97. Caputo, G. (1998). Texture brightness filling-in. Vision Res 38, 841–51. Carlson, T.A. and He, S. (2004). Competing global representations fail to initiate binocular rivalry. Neuron 43, 907–14. Carpenter, G.A. and Grossberg, S. (1981). Adaptation and transmitter gating in vertebrate photoreceptors. J Theor Neurobiol 1, 1–42. Carrasco, M., Williams, P.E., and Yeshurun, Y. (2002). Covert attention increases spatial resolution with or without masks: support for signal enhancement. J Vis 2, 467–79. Carter, M., Brown, V., Breitmeyer, B., and Havig, P. (2003). Allocation of attention affects the time-course of metacontrast masking. Paper presented at the Annual Meeting of the Vision Sciences Society, Sarasota, FL, May 2003.
Castet, E. (1994). Effect of the ISI on the visible persistence of a stimulus in apparent motion. Vision Res 34, 2103–14. Castet, E., Lorenceau, J., and Bonnet, C. (1993). The inverse intensity effect is not lost with stimuli in apparent motion. Vision Res 33, 1697–1708. Cattell, J.McK. (1885a). The inertia of the eye and brain. Brain 8, 295–312. Cattell, J.McK. (1885b). The influence of the intensity of the stimulus on the length of the reaction time. Brain 8, 511–15. Cattell, J.McK. (1886). Über die Trägheit der Netzhaut und des Sehcentrums. Philos Stud 3, 94–127. Cavanagh, P. and Mather, G. (1989). Motion: the long and the short of it. Spat Vis 4, 103–29. Cavonius, C.R. and Reeves, A.J. (1983). The interpretation of metacontrast and contrast-flash spectral sensitivity functions. In Color Vision: Physiology and Psychophysics (ed J.D. Mollon and L.T. Sharpe). London: Academic Press, pp. 471–8. Charpentier, A. (1890). Recherches sur la persistence des impressions retiniennes et sur la excitations lumineuses de courte duree. Arch Ophthalmol 10, 108–35. Chen, S., Bedell, H.E., and Öğmen, H. (1995). A target in real motion appears blurred in the absence of other proximal moving targets. Vision Res 35, 2315–28. Citron, M.C., Emerson, R.C., and Ide, L.S. (1981). Spatial and temporal receptive-field analysis of the cat’s geniculocortical pathway. Vision Res 21, 385–96. Cleland, B.G., Levick, W.R., and Sanderson, K.J. (1973). Properties of sustained and transient ganglion cells in the cat retina. J Physiol 228, 649–80. Clements, B.A., Blumenfeld, L.D., and Cobb, S. (1997). The gamma band response may account for poor P50 suppression in schizophrenia. Neuroreport 8, 3889–93. Coenan, A.M.L. and Eijkman, E.G.J. (1972). Cat optic tract and geniculate unit responses corresponding to human visual masking effects. Exp Brain Res 15, 441–51. Cohen, L.S. and Bechtold, H.P. (1974). Visual recognition as a function of stimulus offset asynchrony and duration. 
Percept Psychophys 15, 221–6. Cohen, L.S. and Bechtold, H.P. (1975). Visual recognition of dot pattern bigrams: An extension and replication. Am J Psychol 88, 187–99. Cohen, M.A. and Grossberg, S. (1984). Neural dynamics of brightness perception: features, boundaries, diffusion, and resonance. Percept Psychophys 36, 428–56. Cohen, S.D. and Hindmarsh, A.C. (1994). CVODE User Guide. Lawrence Livermore National Laboratory Tech. Rep. UCRL-MA-118618. Coletta, N.J. and Williams, D.R. (1987). Psychophysical estimate of extrafoveal cone spacing. J Opt Soc Am A 4, 1503–13. Collingwood, R.G. (1940). Essay on Metaphysics. London: Oxford University Press. Collingwood, R.G. (1956). The Idea of History. London: Oxford University Press. Coltheart, M. (1980). Iconic memory and visible persistence. Percept Psychophys 27, 183–228. Connors, B.W., Malenka, R.C., and Silva, L.R. (1988). Two inhibitory postsynaptic potentials, and GABAA and GABAB receptor-mediated responses in neocortex of rat and cat. J Physiol 406, 443–68. Corfield, R., Frosdick, J.P., and Campbell, F.W. (1978). Grey-out elimination: the roles of spatial waveform, frequency and phase. Vision Res 18, 1305–11.
Corthout, E., Uttl, B., Walsh, V., Hallett, M., and Cowey, A. (1999a). Timing of activity in early visual cortex as revealed by transcranial magnetic stimulation. Neuroreport 10, 2631–4. Corthout, E., Uttl, B., Ziemann, U., Cowey, A., and Hallett, M. (1999b). Two periods of processing in the (circum)striate visual cortex as revealed by transcranial magnetic stimulation. Neuropsychologia 37, 137–45. Corthout, E., Uttl, B., Juan, C.-H., Hallett, M., and Cowey, A. (2000). Suppression of vision by transcranial magnetic stimulation; a third mechanism. Neuroreport 11, 2345–9. Corwin, T.R., Volpe, L.C., and Tyler, C.W. (1976). Images and after-images of sinusoidal gratings. Vision Res 16, 345–9. Cox, S.I. and Dember, W.N. (1972). U-shaped metacontrast functions with a detection task. J Exp Psychol 95, 327–33. Cox, M.D. and Leventhal, D.B. (1978). A multivariate analysis and modification of a preattentive, perceptual dysfunction in schizophrenia. J Nerv Ment Dis 166, 709–18. Crawford, B.H. (1940). The effect of field size and pattern on the change of visual sensitivity with time. Proc R Soc London B Biol Sci 129, 94–106. Crawford, B.H. (1947). Visual adaptation in relation to brief conditioning stimuli. Proc R Soc London B Biol Sci 134, 283–302. Creutzfeldt, O.D., Kuhnt, U., and Benevento, L.A. (1974). An intracellular analysis of visual cortical neurones to moving stimuli: response in a co-operative neuronal network. Exp Brain Res 21, 251–74. Crick, F. (1984). Function of the thalamic reticular complex: the searchlight hypothesis. Proc Natl Acad Sci USA 81, 4586–90. Crick, F. and Koch, C. (1995). Are we aware of neural activity in primary visual cortex? Nature 375, 121–3. Crick, F. and Koch, C. (2003). A framework for consciousness. Nat Neurosci 6, 119–26. Croner, L.J. and Kaplan, E. (1995). Receptive fields of P and M ganglion cells across the primate retina. Vision Res 35, 7–24. Crook, M.N. (1937). Visual discrimination of movement. J Psychol 3, 541–58. Dacey, D.M. 
(1993). The mosaic of midget ganglion cells in the human retina. J Neurosci 13, 5334–55. Dacey, D.M. and Petersen, M.R. (1992). Dendritic field size and morphology of midget and parasol ganglion cells of the human retina. Proc Natl Acad Sci USA 89, 9666–70. Dakin, S.C. and Hess, R.F. (1997). The spatial mechanism mediating symmetry perception. Vision Res 37, 2915–30. De Gelder, B., de Haan, E., and Heywood, C. (2001). Out of Mind. Oxford: Oxford University Press. Dehaene, S., Naccache, N., Cohen, L., et al. (2001). Cerebral mechanisms of word masking and unconscious repetition priming. Nat Neurosci 4, 752–8. Dehaene, S., Sergent, C., and Changeux, J.-P. (2003). A neuronal network model linking subjective reports and objective physiological data during conscious perception. Proc Natl Acad Sci USA 100, 8520–5. De Kamps, M. and van der Velde, F. (2001). Using a recurrent network to bind form, color and position into a unified percept. Neurocomputing 38–40, 523–8.
Dell’Acqua, R., Pascali, A., Jolicoeur, P., and Sessa, P. (2003). Four-dot masking produces the attentional blink. Vision Res 43, 1907–13. Delord, S. (1998). Which mask is most efficient: a pattern or noise? It depends on the task. Vis Cogn 5, 313–38. Demb, J.B., Boynton, G.M., Best, M., and Heeger, D.J. (1998). Psychophysical evidence for a magnocellular pathway deficit in dyslexia. Vision Res 38, 1555–9. Demirci, H., Gezer, A., Sezen, F., Ovali, T., Demiralp, T., and Isoglu-Alkoc, U. (2002). Evaluation of the functions of the parvocellular and magnocellular pathways in strabismic amblyopia. J Pediatr Ophthalmol Strabismus 39, 215–21. De Monasterio, F.M. (1978a). Properties of concentrically organized X and Y ganglion cells of macaque retina. J Neurophysiol 41, 1394–1417. De Monasterio, F.M. (1978b). Center and surround mechanisms of opponent-color X and Y ganglion cells of retina of macaques. J Neurophysiol 41, 1418–34. De Monasterio, F.M. and Schein, S.J. (1980). Protan-like spectral sensitivity of foveal Y ganglion cells of the retina of macaque monkeys. J Physiol 299, 385–96. Dennett, D.C. (1991). Consciousness Explained. Boston, MA: Little, Brown. Derrington, A.M. and Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurones in lateral geniculate nucleus of macaque. J Physiol 357, 219–40. Desimone, R., Albright, T.D., Gross, C.G., and Bruce, C. (1984). Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci 4, 2051–62. DeYoe, E.A. and Van Essen, D.C. (1988). Concurrent processing streams in monkey visual cortex. Trends Neurosci 11, 219–26. Dick, A.O. and Dick, S.O. (1969). An analysis of hierarchical processing in visual perception. Can J Psychol 23, 203–11. Didner, R. and Sperling, G. (1980). Perceptual delay: a consequence of metacontrast and apparent motion. J Exp Psychol Hum Percept Perform 6, 235–43. Di Lollo, V. (1980). Temporal integration in vision. J Exp Psychol Gen 109, 75–97. Di Lollo, V. 
and Bischof, W.F. (1995). Inverse-intensity effect in duration of visible persistence. Psychol Bull 118, 223–37. Di Lollo, V. and Hogben, J.H. (1985). Suppression of visible persistence. J Exp Psychol Hum Percept Perform 11, 304–16. Di Lollo, V. and Hogben, J.H. (1987). Suppression of visible persistence as a function of spatial separation between inducing stimuli. Percept Psychophys 41, 345–54. Di Lollo, V., Bischof, W.F., and Dixon, P. (1993). Stimulus-onset asynchrony is not necessary for motion perception or metacontrast masking. Psychol Sci 4, 260–3. Di Lollo, V., Enns, J.T., and Rensink, R.A. (2000). Competition for consciousness among visual events: the psychophysics of reentrant visual processes. J Exp Psychol Gen 129, 481–507. Di Lollo, V., Enns, J.T., and Rensink, R.A. (2002). Object substitution without reentry? J Exp Psychol Gen 131, 594–6. Di Lollo, V., von Muhlenen, A., Enns, J.T., and Bridgeman, B. (2004). Decoupling stimulus duration from brightness in metacontrast masking: data and models. J Exp Psychol Hum Percept Perform 30, 733–45. Dimberg, U., Thunberg, M., and Elmehed, K. (2000). Unconscious facial reactions to emotional facial expressions. Psychol Sci 11, 86–9.
Di Russo, F. and Spinelli, D. (1999). Electrophysiological evidence for an early attentional mechanism in visual processing in humans. Vision Res 39, 2975–85. Dixon, N.F. and Hammond, E.J. (1972). The attenuation of visual persistence. Br J Psychol 63, 243–54. Dodge, R. (1900). Visual perception during eye movement. Psychol Rev 7, 454–65. Dolan, R.J. (2002). Emotion, cognition, and behavior. Science 298, 1191–4. Donagan, A. (1966). The Popper–Hempel theory reconsidered. In Philosophical Analysis and History (ed W.H. Dray). New York: Harper and Row, pp. 127–59. Donchin, E. and Lindsey, B.B. (1965). Retroactive brightness enhancement with brief paired flashes of light. Vision Res 5, 59–70. Donchin, E., Wicke, J.D., and Lindsey, D.B. (1963). Cortical evoked potentials and perception of paired flashes. Science 141, 1285–6. Drasdo, N. (1980). Cortical potentials evoked by pattern presentation in the foveal region. In Evoked Potentials (ed C. Barker). Baltimore, MD: University Park Press, pp. 167–74. Dreher, B., Fukuda, Y., and Rodieck, R.W. (1976). Identification, classification and anatomical segregation of cells with X-like and Y-like properties in the lateral geniculate nucleus of old-world primates. J Physiol 258, 433–52. Dresp, B. (1993). Bright lines and edges facilitate the detection of small line targets. Spat Vis 7, 213–25. Dresp, B. and Grossberg, S. (1997). Contour integration across polarities and spatial gaps: from local contrast filtering to global grouping. Vision Res 37, 913–24. du Croz, J.J. and Rushton, W.A.H. (1963). Cone dark-adaptation curves. J Physiol 168, 52P. Duncan, J. (1984). Selective attention and the organization of visual information. J Exp Psychol Gen 113, 501–17. Duncan, J. (1985). Two techniques for investigating perception without awareness. Percept Psychophys 38, 296–8. Ebbecke, U. (1920). Über das Augenblicksehen. Pflugers Arch Gesamte Physiol 185, 181–95. Edelman, G.M. (1987). Neural Darwinism. New York: Basic Books. 
Edwards, S.B., Ginsburgh, C.L., Henkel, C.K., and Stein, B.E. (1979). Sources of subcortical projections to the superior colliculus in the cat. J Comp Neurol 184, 309–30. Edwards, V.T., Hogben, J.H., Clark, C.D., and Pratt, C. (1996). Effects of a red background on magnocellular functioning in average and specifically disabled readers. Vision Res 36, 1037–45. Egeth, H. and Gilmore, G. (1973). Perceptibility of the letters in words and nonwords with complete control of redundancy. Bull Psychon Soc 2, 329. Egly, R., Driver, J., and Rafal, R.D. (1994a). Shifting attention between objects and locations: evidence from normal and parietal lesion subjects. J Exp Psychol Gen 123, 161–77. Egly, R., Rafal, R., Driver, J., and Starrveveld, Y. (1994b). Covert orienting in the split brain reveals hemispheric specialization for object-based attention. Psychol Sci 5, 380–3. Eimer, M. (1996). The N2pc component as an indicator of attentional selectivity. Electroencephalogr Clin Neurophysiol 99, 225–34. Eimer, M. (1999). Facilitatory and inhibitory effects of masked prime stimuli on motor activation and behavioral performance. Acta Psychol 101, 293–313.
Eimer, M. and Schlaghecken, F. (1998). Effects of masked stimuli on motor activation: behavioral and electrophysiological evidence. J Exp Psychol Hum Percept Perform 24, 1737–47. Eimer, M. and Schlaghecken, F. (2002). Links between conscious awareness and response inhibition: evidence from masked priming. Psychon Bull Rev 9, 514–20. Elder, J.H. and Zucker, S.W. (1998). Evidence for boundary-specific grouping. Vision Res 38, 143–52. Emerson, R.C., Bergen, J.R., and Adelson, E.H. (1992). Directionally selective complex cells and the computation of motion energy in cat visual cortex. Vision Res 32, 203–18. Emre, M., Groner, M., Hofer, D., Groner, R., and Fisch, H.U. (1989). Effects of benzodiazepines on forward and backward visual masking. Clin Vis Sci 4, 257–63. Engel, A.K., Fries, P., Koenig, P., Brecht, M., and Singer, W. (1999). Temporal binding, binocular rivalry, and consciousness. Conscious Cogn 8, 128–51. Enns, J.T. (2002). Visual binding in the standing-wave illusion. Psychon Bull Rev 9, 489–96. Enns, J.T. (2004). Object substitution and its relation to other forms of visual masking. Vision Res 44, 1321–31. Enns, J.T. and Di Lollo, V. (1997). Object substitution: a new form of masking in unattended visual locations. Psychol Sci 8, 135–9. Enns, J.T. and Di Lollo, V. (2000). What’s new in visual masking? Trends Cogn Sci 4, 345–52. Enns, J.T., Visser, T.A., Kawahara, J., and Di Lollo, V. (2001). Visual masking and task switching in the attentional blink. In The Limits of Attention (ed K. Shapiro). Oxford: Oxford University Press, pp. 65–81. Enoch, J.M., Sunga, R.N., and Bachmann, E. (1970a). A static perimetric technique believed to test receptive field properties. I. Extension of Westheimer’s experiments on spatial interaction. Am J Ophthalmol 70, 113–26. Enoch, J.M., Sunga, R.N., and Bachmann, E. (1970b). A static perimetric technique believed to test receptive field properties. II. Adaptation of the method to the quantitative perimeter. 
Am J Ophthalmol 70, 126–37. Enroth-Cugell, C. and Robson, J.G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187, 517–52. Enroth-Cugell, C., Hertz, B.G., and Lennie, P. (1977). Convergence of rod and cone signals in the cat’s retina. J Physiol 269, 297–318. Erdmann, B. and Dodge, R. (1898). Psychologische Untersuchungen über das Lesen auf Experimenteller Grundlage. Halle: Max Niemeyer. Eriksen, C.W. (1966). Temporal luminance summation effects in backward and forward masking. Percept Psychophys 1, 87–92. Eriksen, C.W. and Eriksen, B.A. (1972). Visual backward masking as revealed by choice reaction time. Percept Psychophys 12, 5–8. Eriksen, C.W., Becker, B.A., and Hoffman, J.E. (1970). Safari to masking land: a hunt for the elusive U. Percept Psychophys 8, 245–50. Erismann, T. (1935). Die Empfindungszeit. Arch Gesamte Psychol 93, 453–519. Ernst, U.A., Pawelzik, K.R., Sahar-Pikielny, C., and Tsodyks, M.V. (2001). Intracortical origin of visual maps. Nat Neurosci 4, 431–6.
Erwin, D.E. (1976). Further evidence for two components in visual persistence. J Exp Psychol Hum Percept Perform 2, 191–209. Esteves, F. and Öhman, A. (1993). Masking the face: recognition of emotional facial expressions as a function of parameters of backward masking. Scand J Psychol 34, 1–18. Exner, S. (1868). Über die zu einer Gesichtswahrnehmung nöthige Zeit. Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 58 (Part 2), 601–32. Exner, S. (1875). Experimentelle Untersuchungen der einfachsten psychischen Prozesse. Pflugers Arch Gesamte Physiol 11, 403–32. Exner, S. (1888). Über optische Bewegungsempfindungen. Biol Zentralbl (Leipzig) 8, 437–48. Exner, S. (1898). Studien auf dem Grenzgebiet des lokalisierten Sehens. Pflugers Arch Gesamte Physiol 73, 117–71. Farah, M.J., Wilson, K.D., Drain, M., and Tanaka, J.N. (1998). What is ‘special’ about face perception? Psychol Rev 105, 482–98. Farrell, J.E. (1984). Visible persistence of moving objects. J Exp Psychol Hum Percept Perform 10, 502–11. Farrell, J.E., Pavel, M., and Sperling, G. (1990). The visible persistence of stimuli in stroboscopic motion. Vision Res 30, 921–36. Fechner, G.T. (1840a). Über die subjektiven Nachbilder und Nebenbilder I. Poggend Ann Phys Chem 50, 193–221. Fechner, G.T. (1840b). Über die subjektiven Nachbilder und Nebenbilder II. Poggend Ann Phys Chem 50, 427–70. Fehmi, L.G., Adkins, J.W., and Lindsey, D.B. (1969). Electrophysiological correlates of visual perceptual masking in monkeys. Exp Brain Res 7, 299–316. Fehrer, E. (1966). Effect of stimulus similarity on retroactive masking. J Exp Psychol 71, 612–15. Fehrer, E. and Biederman, I. (1962). A comparison of reaction and verbal report in detection of masked stimuli. J Exp Psychol 64, 126–30. Fehrer, E. and Raab, D. (1962). Reaction time to stimuli masked by metacontrast. J Exp Psychol 63, 143–7. Fehrer, E. and Smith, E. (1962). Effects of luminance ratio on masking. Percept Mot Skills 14, 243–53. 
Felipe, A., Buades, M.J., and Artigas, J.M. (1993). Influence of the contrast sensitivity function on the reaction time. Vision Res 33, 2461–6. Felmingham, K.L. and Jakobson, L.S. (1995). Visual and visuomotor performance in dyslexic children. Exp Brain Res 106, 467–74. Ferrera, V.P., Nealy, T.A., and Maunsell, J.H.R. (1992). Mixed parvocellular and magnocellular geniculate signals in visual area V4. Nature 358, 756–8. Ferry, E.S. (1892). Persistence of vision. Am J Sci 44 (Ser. 3), 192–207. Finkel, D.L. (1973). A developmental comparison of two types of visual information. J Exp Child Psychol 16, 250–66. Finkel, D.L. and Smythe, L. (1973). Short-term storage of spatial information. Dev Psychol 9, 424–8. Finlay, B.L., Schiller, P.H., and Volman, S.F. (1976). Quantitative studies of single-cell properties in monkey striate cortex. IV. Corticotectal cells. J Neurophysiol 39, 1352–61.
Fisch, H.U., Groner, M., Groner, R., and Menz, C. (1983). Influence of diazepam and methylphenidate on identification of rapidly presented letter strings: diazepam enhances visual masking. Psychopharmacology 80, 61–6. Fisicaro, S.A., Bernstein, I.H., and Narkiewicz, P. (1977). Apparent motion and metacontrast suppression: a decision analysis. Percept Psychophys 22, 517–25. Flaherty, T.B. and Matteson, H.H. (1971). Comparison of two measures of metacontrast. J Opt Soc Am 61, 828–30. Foley, J.M. and Chen, C.-C. (1997). Analysis of the effect of pattern adaptation on pattern pedestal effects: a two-process model. Vision Res 37, 2779–88. Foster, D.H. (1976). Rod–cone interaction in the after-flash effect. Vision Res 16, 393–6. Foster, D.H. (1977). Rod- and cone-mediated interactions in the fine-grain movement illusion. Vision Res 17, 123–7. Foster, D.H. (1978). Action of red-sensitive colour mechanism on blue-sensitive colour mechanism in visual masking. Opt Acta 25, 1001–4. Foster, D.H. (1979). Interactions between blue- and red-sensitive colour mechanisms in metacontrast masking. Vision Res 19, 921–31. Foster, D.H. and Mason, R.J. (1977). Interaction between rod and cone systems in dichoptic visual masking. Neurosci Lett 4, 39–42. Fotowat, H., Öğmen, H., Bedell, H.E., and Breitmeyer, B.G. (in press). Probing oscillatory visual dynamics at the perceptual level. In Handbook of Neural Engineering (ed M. Akay). New York: Wiley. Fowler, C.A., Wolford, G., Slade, R., and Tassinary, L. (1981). Lexical access with and without awareness. J Exp Psychol Gen 110, 341–62. Fox, R. (1978). Visual masking. In Handbook of Sensory Physiology. Vol. 8, Perception (ed R. Held, H. Leibowitz, and H.L. Teuber). New York: Springer, pp. 621–53. Francis, G. (1996a). Cortical dynamics of visual persistence and temporal integration. Percept Psychophys 58, 1203–12. Francis, G. (1996b). Cortical dynamics of lateral inhibition: visual persistence and ISI. Percept Psychophys 58, 1103–9. Francis, G. 
(1997). Cortical dynamics of lateral inhibition: metacontrast masking. Psychol Rev 104, 572–94. Francis, G. (2000). Quantitative theories of metacontrast masking. Psychol Rev 107, 768–85. Francis, G. and Cho, Y.S. (2005). Computational models of masking. In The First Half Second (ed H. Öğmen and B.G. Breitmeyer). Cambridge, MA: MIT Press, pp. 111–26. Francis, G. and Grossberg, S. (1996a). Cortical dynamics of boundary segmentation and reset: persistence, afterimages, and residual traces. Perception 25, 543–67. Francis, G. and Grossberg, S. (1996b). Cortical dynamics of form and motion integration: persistence, apparent motion, and illusory contours. Vision Res 36, 149–73. Francis, G. and Hermens, F. (2002). Comment on ‘Competition for consciousness among visual events: the psychophysics of reentrant visual processes’ (Di Lollo, Enns and Rensink). J Exp Psychol Gen 131, 590–3. Francis, G. and Herzog, M. (2004). Testing quantitative models of backward masking. Psychon Bull Rev 11, 104–12. Francis, G. and Kim, H. (1999). Motion parallel to line orientation: disambiguation of motion percepts. Perception 28, 1243–55.
Francis, G. and Kim, H. (2001). Perceived motion in orientational afterimages: direction and speed. Vision Res 41, 161–72. Francis, G. and Rothmayer, M. (2003). Interactions of afterimages for orientation and color: experimental data and model simulations. Percept Psychophys 65, 508–22. Francis, G., Grossberg, S., and Mingolla, E. (1994). Cortical dynamics of feature binding and reset: control of visual persistence. Vision Res 34, 1089–1104. Freeman, E., Driver, J., and Sagi, D. (2001). Psychophysical measurement of attentional modulation in low-level vision using the lateral-interaction paradigm. In Visual Attention Mechanisms (ed V. Cantoni, M. Marinaro, and A. Petrosino). New York: Plenum, pp. 1–15. Fries, P., Roelfsema, P.R., Engel, A.E., Koening, P., and Singer, W. (1997). Synchronization of oscillatory responses in visual cortex correlates with perception in interocular rivalry. Proc Natl Acad Sci USA 94, 12669–704. Frith, C. and Dolan, R.J. (1997). Brain mechanisms associated with top-down processes in perception. Philos Trans R Soc Lond B Biol Sci 352, 1221–30. Frizzi, J.T. (1979). Midbrain reticular stimulation and brightness detection. Vision Res 19, 123–30. Fröhlich, F.W. (1921). Untersuchungen über periodische Nachbilder. Zeitschr Sinnesphysiol 52, 60–8. Fröhlich, F.W. (1922a). Über den Einfluss der Hell- und Dunkeladaptation auf den Verlauf der periodischen Nachbilder. Zeitschr Sinnesphysiol 53, 79–107. Fröhlich, F.W. (1922b). Über die Abhängigkeit der periodischen Nachbilder von der Dauer der Belichtung. Zeitschr Sinnesphysiol 53, 108–21. Fröhlich, F.W. (1923). Über die Messung der Empfindungszeit. Zeitschr Sinnesphysiol 54, 58–78. Fröhlich, F.W. (1929). Die Empfindungszeit. Jena: Fischer Verlag. Frumkes, T.E. and Temme, L.A. (1977). Rod–cone interaction in human scotopic vision. II: Cones influence rod increment thresholds. Vision Res 17, 673–9. Frumkes, T.E., Sekuler, M.D., Barris, M.C., Reiss, E.H., and Chalupa, L.M. (1973). 
Rod–cone interaction in human scotopic vision. I: Temporal analysis. Vision Res 13, 1269–82. Fry, G.A. (1934). Depression of the activity aroused by a flash of light by applying a second flash immediately afterwards to adjacent areas of the retina. Am J Physiol 108, 701–7. Fry, G.A. (1935). Color sensations produced by intermittent white light and the three-component theory of color vision. Am J Psychol 47, 464–9. Fry, G.A. and Bartley, S.H. (1936). The effect of steady illumination on one part of the retina upon the critical flicker frequency in another. J Exp Psychol 19, 351–6. Fujita, I., Tanaka, K., Ito, M., and Cheng, K. (1992). Columns for visual features of objects in monkey inferotemporal cortex. Nature 360, 343–6. Ganz, L. (1975). Temporal factors in visual perception. In Handbook of Perception, Vol. 5 (ed E.C. Carterette and M.P. Friedman). New York: Academic Press, pp. 169–231. Gaudiano, P. (1992). A unified neural network of spatio-temporal processing in X and Y retinal ganglion cells. II: Temporal adaptation and simulation of experimental data. Biol Cybern 67, 23–34. Gegenfurtner, K.R. (2003). Cortical mechanisms of color vision. Nat Rev 4, 563–72.
Geisler, W.S. and Diehl, R.L. (2003). A Bayesian approach to the evolution of perceptual and cognitive systems. Cogn Sci 27, 379–402. Geldard, F.A. (1932). Foveal sensitivity as influenced by peripheral stimulation. J Gen Psychol 7, 185–9. Geldard, F.A. (1934). Flicker relations within the fovea. J Opt Soc Am 24, 299–302. Geremek, A., Stürzel, F., da Pos, O., and Spillmann, L. (2002). Masking, persistence and transfer in rotating arcs. Vision Res 42, 2509–19. Gerrits, H.J.M. and Timmerman, G.J.M.E.N. (1969). The filling-in process in patients with retinal scotoma. Vision Res 9, 439–42. Gerrits, H.J.M. and Vendrik, A.J.H. (1970). Simultaneous contrast, filling-in process and information processing in man’s visual system. Exp Brain Res 11, 411–30. Gibson, J.J. (1979). The Ecological Approach to Visual Perception. Boston, MA: Houghton Mifflin Co. Giersch, A. and Herzog, M.H. (2004). Lorazepam strongly prolongs visual information processing. Neuropsychopharmacology 29, 1386–94. Giesbrecht, B.L. and Di Lollo, V. (1998). Beyond the attentional blink: visual masking by object substitution. J Exp Psychol Hum Percept Perform 24, 1454–66. Giesbrecht, B.L., Bischof, W.F., and Kingstone, A. (2003). Visual masking during the attentional blink: Tests of the object substitution hypothesis. J Exp Psychol Hum Percept Perform 29, 238–58. Gilbert, C., Ito, M., Kapadia, M., and Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Res 40, 1217–26. Gilden, D.L., MacDonald, K.E., and Lasaga, M.I. (1988). Masking with minimal contours: selective inhibition with low spatial frequencies. Percept Psychophys 44, 127–32. Gilinsky, A.S. (1967). Masking of contour-detectors in the human visual system. Psychon Sci 8, 395–6. Gilinsky, A.S. (1968). Orientation-specific effects of patterns of adapting light on spatial acuity. J Opt Soc Am 58, 13–18. Glass, R.A. and Sternheim, C.E. (1973). 
Visual sensitivity in the presence of alternating monochromatic fields of light. Vision Res 13, 689–99. Glennerster, A. and Parker, A.J. (1997). Computing stereo channels from masking data. Vision Res 37, 2143–52. Goodale, M.A., Pélisson, D., and Prablanc, C. (1986). Large adjustments in visually guided reaching do not depend on vision of the hand or perception of target displacement. Nature 320, 748–50. Gouras, P. (1969). Antidromic responses of orthodromically identified ganglion cells in monkey retina. J Physiol 204, 407–419. Gouras, P. and Link, K. (1966). Rod and cone interaction in dark-adapted monkey ganglion cells. J Physiol 184, 499–510. Graham, C.H. and Granit, R. (1931). Comparative studies on the peripheral and central retina. VI: Inhibition, summation, and synchronization of impulses in the retina. Am J Physiol 98, 664–673. Grandison, T.D., Ghirardelli, T.G., and Egeth, H.E. (1997). Beyond similarity: masking of the target is sufficient to cause the attentional blink. Percept Psychophys 59, 266–74.
Green, D.M. and Swets, J.A. (1966). Signal Detection Theory and Psychophysics. New York: John Wiley & Sons. Green, M. (1981a). Psychophysical relationships among mechanisms sensitive to pattern, motion and flicker. Vision Res 21, 971–83. Green, M. (1981b). Spatial frequency masking in masking by light. Vision Res 21, 861–6. Green, M.F. and Walker, E. (1986). Symptom correlates of vulnerability to backward masking in schizophrenia. Am J Psychiatry 143, 181–6. Green, M.F., Nuechterlein, K.H., and Mintz, J. (1994a). Backward masking in schizophrenia and mania. I: specifying a mechanism. Arch Gen Psychiatry 51, 939–44. Green, M.F., Nuechterlein, K.H., and Mintz, J. (1994b). Backward masking in schizophrenia and mania. II: specifying the visual channels. Arch Gen Psychiatry 51, 945–51. Green, M.F., Nuechterlein, K.H., and Breitmeyer, B. (1997). Backward masking performance in unaffected siblings of schizophrenic patients. Arch Gen Psychiatry 54, 465–72. Green, M.F., Nuechterlein, K.H., Breitmeyer, B., and Mintz, J. (1999). Backward masking performance in remitted, unmedicated schizophrenia: suggestive evidence for aberrant cortical oscillations. Am J Psychiatry 156, 1367–73. Green, M.F., Kern, R.S., Braff, D.L., and Mintz, J. (2000). Neurocognitive deficits and functional outcome in schizophrenia: are we measuring the ‘right stuff’? Schizophren Bull 26, 119–36. Green, M.F., Mintz, J., Salveson, D., et al. (2003a). Visual masking as a probe for abnormal gamma range activity in schizophrenia. Biol Psychiatry 53, 1113–19. Green, M.F., Nuechterlein, K.H., Breitmeyer, B., Tsuang, J., and Mintz, J. (2003b). Forward and backward visual masking in schizophrenia: influence of age. Psychol Med 33, 887–95. Green, M.F., Glahn, D., Engel, S.A., et al. (2005a). Regional brain activity associated with visual backward masking. J Cogn Neurosci 17, 13–23. Green, M.F., Nuechterlein, K.H., Breitmeyer, B., and Mintz, J. 
(in press). Forward and backward masking in unaffected siblings of schizophrenia patients. Biol Psychiatry. Greenspoon, T.S. and Eriksen, C.W. (1968). Interocular non-independence. Percept Psychophys 3, 93–6. Gross, C.G., Rocha-Miranda, C.E., and Bender, D.B. (1972). Visual properties of neurons in inferotemporal cortex of the macaque. J Neurophysiol 35, 96–111. Grossberg, S. (1970). Neural pattern discrimination. J Theor Biol 27, 291–337. Grossberg, S. (1972). Neural expectation: cerebellar and retinal analogs of cells fired by learnable or unlearned pattern classes. Kybernetik 10, 49–57. Grossberg, S. (1987). Cortical dynamics of three-dimensional form, color, and brightness perception. I: Monocular theory. Percept Psychophys 41, 87–116. Grossberg, S. (1988). Nonlinear neural networks: principles, mechanisms, and architectures. Neural Netw 1, 17–61. Grossberg, S. (1991). Why do parallel cortical systems exist for the perception of static form and moving form? Percept Psychophys 49, 117–41. Grossberg, S. (1994). 3-D vision and figure–ground separation by visual cortex. Percept Psychophys 55, 48–120. Grossberg, S. and Mingolla, E. (1985a). Neural dynamics of perceptual grouping: textures, boundaries, and emergent segmentations. Percept Psychophys 38, 141–71.
Grossberg, S. and Mingolla, E. (1985b). Neural dynamics of form perception: boundary completion, illusory figures, and neon color spreading. Psychol Rev 92, 173–211. Grossberg, S. and Todorovic, D. (1988). Neural dynamics of 1-D and 2-D brightness perception: a unified model of classical and recent phenomena. Percept Psychophys 43, 241–77. Gross-Glenn, K., Skottun, B.C., Glenn, W., et al. (1995). Contrast sensitivity in dyslexia. Vis Neurosci 12, 153–63. Growney, R. (1976). The function of contour in metacontrast. Vision Res 16, 253–61. Growney, R. and Weisstein, N. (1972). Spatial characteristics of metacontrast. J Opt Soc Am 62, 690–6. Growney, R., Weisstein, N., and Cox, S.I. (1977). Metacontrast as a function of spatial separation with narrow line targets and masks. Vision Res 17, 1205–10. Haber, R.N. (1969). Repetition, visual persistence, visual noise and information processing. In Information Processing in the Nervous System (ed D.N. Leibovic). New York: Springer, pp. 121–40. Haber, R.N. (1985). The impending demise of the icon: a critique of the concept of iconic storage in visual information processing. Behav Brain Sci 6, 1–54. Haber, R.N. and Nathanson, L.S. (1968). Post-retinal storage? Some further observations on Park’s camel as seen through the eye of a needle. Percept Psychophys 3, 349–55. Haber, R.N. and Standing, L. (1970). Direct estimates of the apparent duration of a flash. Can J Psychol 24, 216–29. Hallett, P.E. (1969). Rod increment thresholds on steady and flashed backgrounds. J Physiol 202, 344–77. Hammett, S.T. (1997). Motion blur and motion sharpening in the human visual system. Vision Res 37, 2505–10. Hammond, P. (1971). Chromatic sensitivity and spatial organization of cat visual cortical cells: cone–rod interaction. J Physiol 213, 475–94. Hammond, P. (1972). Chromatic sensitivity and spatial organization of LGN neuron receptive fields in cat: cone–rod interaction. J Physiol 225, 391–413. Han, S. and Humphreys, G.W. (1999). 
Interactions between perceptual organization based on gestalt laws and those based on hierarchical processing. Percept Psychophys 61, 1287–98. Handy, T.C., Kingstone, A., and Mangun, G.R. (1996). Spatial distribution of visual attention: perceptual sensitivity and response latency. Percept Psychophys 58, 613–27. Harris, M.G. (1980). Velocity specificity of the flicker to pattern sensitivity ratio in human vision. Vision Res 20, 687–91. Harrison, K. and Fox, R. (1966). Replication of reaction time to stimuli masked by metacontrast. J Exp Psychol 71, 162–3. Hartmann, E., Lachenmayr, B., and Brettel, H. (1979). The peripheral critical flicker frequency. Vision Res 19, 1019–23. Hartveit, E., Ramberg, S.I., and Heggelund, P. (1993). Brainstem modulation of spatial receptive field properties of single cells in the dorsal lateral geniculate nucleus of the cat. J Neurophysiol 70, 1644–55. Hartwell, R.C. and Cowan, J.D. (1993). Evoked potentials and simple motor reaction times to localized visual patterns. Vision Res 33, 1325–37.
Harwerth, R.S. and Levi, D.M. (1978). Reaction time as a measure of suprathreshold grating detection. Vision Res 18, 1579–86. Harwerth, R.S., Boltz, R.L., and Smith, E.L. (1980). Psychophysical evidence for sustained and transient channels in the monkey visual system. Vision Res 20, 15–22. Hassler, R. (1978). Interaction of reticular activating system for vigilance and the truncothalamic and pallidal systems for directing awareness and attention under striatal control. In Cerebral Correlates of Conscious Experience (ed P.A. Buser and A. Rougeul-Buser). Amsterdam: North-Holland, pp. 111–29. Havig, P.R., Breitmeyer, B.G., and Brown, V.R. (1998). The effects of pre-cueing attention on metacontrast masking. Presented at the Annual Meeting of the Association for Research in Vision and Ophthalmology, Fort Lauderdale, FL, May 1998. Haynes, J.D., Driver, J., and Rees, G. (2005). Visibility reflects dynamic changes of effective connectivity between V1 and fusiform cortex. Neuron 46, 811–21. Hearty, P.J. and Mewhort, D.J.K. (1975). Spatial localization in sequential letter displays. Can J Psychol 29, 348–59. Heckenmueller, E.G. and Dember, W.N. (1965). Paradoxical brightening of a masked black disk. Psychon Sci 3, 457–8. Heinemann, E.G. (1955). Simultaneous brightness induction as a function of inducing and test-field luminances. J Exp Psychol 50, 89–96. Hellige, J.B., Walsh, D.A., Lawrence, V.S., and Cox, P.J. (1977). The importance of figural relationships between target and mask. Percept Psychophys 21, 285–6. Hellige, J.B., Walsh, D.A., Lawrence, V.S., and Prasse, M. (1979). Figural relationship effects and mechanisms of visual masking. J Exp Psychol Hum Percept Perform 5, 88–100. Helmholtz, H. von (1866). Handbuch der Physiologischen Optik (1st edn). Leipzig: Voss (trans. J.P.C. Southall (1962). Handbook of Physiological Optics (3rd edn). New York: Dover Publications). Hempel, C.G. (1966). Explanations in science and history. 
In Philosophical Analysis and History (ed W.H. Dray). New York: Harper and Row, pp. 95–126. Hendry, S.H. and Reid, R.C. (2000). The koniocellular pathway in primate vision. Ann Rev Neurosci 23, 127–53. Hering, E. (1872). Zur Lehre vom Lichtsinn. Über successive Lichtinduction. Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 66 (Part 3), 5–24. Hering, E. (1878). Zur Lehre vom Lichtsinn. Vienna: Carl Gerold’s Sohn. Hermann, L. (1870). Eine Erscheinung simultanen Kontrastes. Pflugers Arch Gesamte Physiol 3, 13–15. Hernandez, L.L. and Lefton, L.A. (1977). Metacontrast as measured under signal detection model. Perception 6, 696–702. Herrick, R.M. (1974). Foveal light-detection thresholds with two temporally placed flashes: a review. Percept Psychophys 15, 361–7. Herzog, M.H. and Fahle, M. (2002). Effects of grouping on contextual modulation. Nature 415, 433–6. Herzog, M.H. and Koch, C. (2001). Seeing properties of an invisible object: feature inheritance and shine through. Proc Natl Acad Sci USA 98, 4271–5.
Herzog, M.H., Fahle, M., and Koch, C. (2001a). Spatial aspects of object formation revealed by a new illusion, shine-through. Vision Res 41, 2325–35. Herzog, M.H., Koch, C., and Fahle, M. (2001b). Shine-through: temporal aspects. Vision Res 41, 2337–46. Herzog, M.H., Schmonsees, U., and Fahle, M. (2003a). Timing of contextual modulation in the shine-through effect. Vision Res 43, 2039–51. Herzog, M.H., Schmonsees, U., and Fahle, M. (2003b). Collinear contextual suppression. Vision Res 43, 2915–25. Herzog, M.H., Harms, M., Ernst, U.A., Eurich, C.W., Mahmud, S.H., and Fahle, M. (2003c). Extending the shine-through to classical masking paradigms. Vision Res 43, 2659–67. Herzog, M.H., Parish, L., Koch, C., and Fahle, M. (2003d). Fusion of competing features is not serial. Vision Res 43, 1331–60. Herzog, M.H., Kopmann, S., and Brand, A. (2004). Intact figure–ground segmentation in schizophrenia. Psychiatry Res 129, 55–63. Hicks, T.P., Lee, B.B., and Vidyasagar, T.R. (1983). The responses of cells in macaque lateral geniculate nucleus to sinusoidal gratings. J Physiol 337, 183–200. Hochberg, J. (1968). In the mind’s eye. In Contemporary Theory and Research in Visual Perception (ed R.N. Haber). New York: Holt, Rinehart and Winston, pp. 309–31. Hochberg, J. (1978). Perception. Englewood Cliffs, NJ: Prentice-Hall. Hofer, D., Walder, F., and Groner, M. (1989). Metakontrast: ein berühmtes, aber schwer messbares Phänomen. Schweiz Zeitschr Psychol 48, 219–32. Hoffmann, K.-P. (1973). Conduction velocity in pathways from retina to superior colliculus in the cat: a correlation with receptive-field properties. J Neurophysiol 36, 409–24. Hoffman, R.E., Buchsbaum, M.S., Jensen, R.V., Guich, S.M., Tsai, K., and Nuechterlein, K.H. (1996). Dimensional complexity of EEG waveforms in neuroleptic-free schizophrenic patients and normal control subjects. J Neuropsychiatry 8, 436–41. Hofstoetter, C., Koch, C., and Kiper, D.C. (2004). 
Motion-induced blindness does not affect the formation of negative afterimages. Conscious Cogn 13, 691–708. Hogben, J.H. and Di Lollo, V. (1984). Practice reduces suppression in metacontrast and apparent motion. Percept Psychophys 35, 441–5. Hogben, J.H. and Di Lollo, V. (1985). Suppression of visible persistence in apparent motion. Percept Psychophys 38, 450–60. Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: a survey and appraisal. Behav Brain Sci 9, 1–66. Holland, H.C. (1963). ‘Visual masking’ and the effects of stimulant and depressant drugs. In Experiments with Drugs (ed H.J. Eysenck). Oxford: Pergamon Press, pp. 69–106. Homa, D., Haver, B., and Schwartz, T. (1976). Perceptibility of schematic faces: evidence for a perceptual gestalt. Mem Cognit 4, 176–85. Hood, D.C. (1973). The effects of edge sharpness and exposure duration on detection threshold. Vision Res 13, 759–66. Houlihan, K. and Sekuler, R.W. (1968). Contour interactions in visual masking. J Exp Psychol 77, 281–5.
Hubel, D.H. (1988). Eye, Brain, and Vision. New York: W.H. Freeman and Co. Hubel, D.H. and Wiesel, T.N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160, 106–54. Hubel, D.H. and Wiesel, T.N. (1965). Binocular interaction in striate cortex of kittens reared with artificial squint. J Neurophysiol 28, 1041–59. Hubel, D.H. and Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. J Physiol 195, 215–43. Hubel, D.H. and Wiesel, T.N. (1977). Ferrier lecture. Functional architecture of macaque monkey visual cortex. Proc R Soc Lond B Biol Sci 198, 1–59. Hummel, J.E. and Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychol Rev 99, 480–517. Hupé, J.-M., James, A.C., Payne, B.R., Lomber, S.G., Girard, P., and Bullier, J. (1998). Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons. Nature 394, 784–7. Hupé, J.-M., James, A.C., Girard, P., and Bullier, J. (2000). Response modulation by static texture surround in area V1 of the macaque monkey does not depend on feedback connections from V2. J Neurophysiol 85, 146–63. Iani, C., Nicoletti, R., Rubichi, S., and Umilta, C. (2001). Shifting attention between objects. Brain Res Cogn Brain Res 11, 157–64. Ikeda, H. and Wright, M.J. (1972). Differential effects of refractive error and receptive field organization on central and peripheral ganglion cells. Vision Res 12, 1465–72. Ikeda, H. and Wright, M.J. (1975). Spatial and temporal properties of ‘sustained’ and ‘transient’ neurones in area 17 of the cat’s visual cortex. Exp Brain Res 22, 363–83. Ikeda, M. (1965). Temporal summation of positive and negative flashes in the visual system. J Opt Soc Am 55, 1527–34. Imanaka, K., Kita, I., and Suzuki, K. (2002). Effects of nonconscious perception on motor response. Hum Mov Sci 21, 541–61. Irwin, D. (1996). 
Integrating information across saccadic eye movements. Curr Dir Psychol Sci 5, 94–100. Isaak, M.I., Shapiro, K.L., and Martin, J. (1999). The attentional blink reflects retrieval competition among multiple rapid serial visual presentation items: tests of an interference model. J Exp Psychol Hum Percept Perform 25, 1774–92. James, W. (1950). Principles of Psychology, Vol. 1. New York: Dover Publications. Jaśkowski, P., Skalska, B., and Verleger, R. (2003). How the self controls its ‘automatic pilot’ when processing subliminal information. J Cogn Neurosci 15, 911–20. Jaśkowski, P., van der Lubbe, R.H.J., Schlotterbeck, E., and Verleger, R. (2002). Traces left on visual selective attention by stimuli that are not consciously identified. Psychol Sci 13, 48–54. Jeffreys, D.A. and Axford, J.G. (1972a). Source locations of pattern specific components of human visual evoked potentials. I. Components of striate cortical origin. Exp Brain Res 16, 1–21. Jeffreys, D.A. and Axford, J.G. (1972b). Source locations of pattern specific components of human visual evoked potentials. II. Components of extrastriate cortical origin. Exp Brain Res 16, 22–40.
Jeffreys, D.A. and Musselwhite, M.J. (1986). A visual evoked potential study of metacontrast masking. Vision Res 26, 631–42. Johnston, J.C. and McClelland, J.L. (1974). Perception of letters in words: Seek not and ye shall find. Science 184, 1192–3. Jones, R. and Keck, M.J. (1978). Visual evoked response as a function of grating spatial frequency. Invest Ophthalmol Vis Sci 17, 652–9. Joseph, J.S., Chun, M.M., and Nakayama, K. (1997). Attentional requirements in a ‘preattentive’ search task. Nature 387, 805–7. Juola, J.F. and Breitmeyer, B.G. (1988). A discussion of models of motion perception. In Working Models of Human Perception (ed H. Bouma, B.A.G. Elsendorn, D.G. Bowhuis, S.G. Nooteboom, and J.A.J. Roufs). New York: Academic Press, pp. 251–9. Kaas, J.H. (1986). The structural basis for information processing in the primate visual system. In Visual Neuroscience (ed J.D. Pettigrew, K.J. Sanderson, and W.R. Levick). Cambridge: Cambridge University Press, pp. 315–40. Kahan, T.A. and Mathis, K.M. (2002). Gestalt grouping and common onset masking. Percept Psychophys 64, 1248–59. Kahneman, D. (1964). Temporal summation in an acuity task at different energy levels—a study of the determinants of summation. Vision Res 4, 557–66. Kahneman, D. (1967). An onset–onset law for one case of apparent motion and metacontrast. Percept Psychophys 2, 577–84. Kahneman, D. (1968). Method, findings, and theory in studies of visual masking. Psychol Bull 70, 404–25. Kaitz, M., Monitz, J., and Nesher, R. (1985). Electrophysiological correlates of visual masking. Int J Neurosci 28, 261–8. Kammer, T., Scharnowski, F. and Herzog, M.H. (2003). Combining backward masking and transcranial magnetic stimulation in human observers. Neurosci Let 343, 171–4. Kanai, R. and Kamitani, Y. (2003). Time-locked perceptual fading induced by visual transients. J Cogn Neurosci 15, 664–72. Kao, K.C. and Dember, W.N. (1973). Effect of size of ring on backward masking of a disk by a ring. 
Bull Psychon Soc 2, 15–17. Kapadia, M.K., Ito, M., Gilbert, C.D., and Westheimer, G. (1995). Improvement in visual sensitivity by changes in local contrast: parallel studies in human observers and in V1 of alert monkey. Neuron 15, 843–56. Kaplan, E. and Shapley, R.M. (1986). The primate retina contains two types of ganglion cells, with high and low contrast sensitivity. Proc Natl Acad Sci USA 83, 2755–7. Kaplan, E., Barlow, R.B., Renninger, G., and Purpura, K. (1990). Circadian rhythms in Limulus photoreceptors. II. Quantum bumps. J Gen Physiol 96, 665–85. Keil, A., Müller, M.M., Ray, W.J., Gruber, T., and Elbert, T. (1999). Human gamma band activity and perception of a gestalt. J Neurosci 19, 7152–61. Kelly, D.H. (1961). Flicker fusion and harmonic analysis. J Opt Soc Am 51, 917–18. Kelly, D.H. (1971a). Theory of flicker and transient responses. I. Uniform fields. J Opt Soc Am 61, 537–46. Kelly, D.H. (1971b). Theory of flicker and transient responses. II. Counterphase gratings. J Opt Soc Am 61, 632–40.
Kelly, D.H. (1972). Adaptation effects on spatio-temporal sine-wave thresholds. Vision Res 12, 89–101. Kelly, D.H. (1973). Lateral inhibition in human colour mechanisms. J Physiol 228, 55–72. Kentridge, R.W., Heywood, C.A., and Weiskrantz, L. (1997). Residual vision in multiple retinal locations within a scotoma: Implications for blindsight. J Cogn Neurosci 9, 191–202. Kentridge, R.W., Heywood, C.A., and Weiskrantz, L. (1999). Attention without awareness in blindsight. Proc R Soc Lond B Biol Sci 266, 1805–11. Kerr, J.L. (1971). Visual resolution in the periphery. Percept Psychophys 9, 375–8. Kihlstrom, J.F. (1987). The cognitive unconscious. Science 237, 1145–52. Kihlstrom, J.F. (1996). Perception without awareness of what is perceived, learning without awareness of what is learned. In The Science of Consciousness (ed V. Velman). London: Routledge, pp. 23–46. Kim, H. and Francis, G. (1998). A computational and perceptual account of motion lines. Perception 27, 785–97. King, D.L., Hicks, H., and Brown, P.D. (1993). Context-produced increase in visibility. Psychol Res 55, 10–14. King, D.L., Mose, H.F., and Nixon, N.S. (1995). One line decreases the visibility of a simultaneous identical distant second line. Percept Psychophys 57, 393–401. King-Smith, P.E. and Kulikowski, J.J. (1973). Line, edge and grating detectors in human vision. J Physiol 230, 23–5. King-Smith, P.E. and Kulikowski, J.J. (1975). Pattern and flicker detection analysed by subthreshold summation. J Physiol 249, 519–48. King-Smith, P.E., More, R.K., and Riggs, L.A. (1976). A new approach to the frequency response of human vision. J Physiol 257, 36–7. Kinoshita, S. and Lupker, S.J. (2003). Masked Priming. New York: Psychology Press. Kinsbourne, M. and Warrington, E.K. (1962a). The effect of an aftercoming random pattern on the perception of brief visual stimuli. Q J Exp Psychol 14, 223–34. Kinsbourne, M. and Warrington, E.K. (1962b). 
Further studies of visual masking of brief visual stimuli by a random pattern. Q J Exp Psychol 14, 235–45. Kiorpes, L. and McKee, S.P. (1999). Neural mechanisms underlying amblyopia. Curr Opin Neurobiol 9, 480–6. Kirschfeld, K. and Kammer, T. (1999). The Fröhlich effect: a consequence of the interaction of visual focal attention and metacontrast. Vision Res 39, 3702–9. Kirschfeld, K. and Kammer, T. (2000). Visual attention and metacontrast modify latency to perception in opposite directions. Vision Res 40, 1027–33. Klinger, M.R. and Greenwald, A.G. (1995). Unconscious priming of association judgments. J Exp Psychol Learn Mem Cogn 21, 569–81. Klotz, W. and Neumann, O. (1999). Motor activation without conscious discrimination in metacontrast masking. J Exp Psychol Hum Percept Perform 25, 976–92. Klotz, W. and Wolff, P. (1995). The effect of a masked stimulus on the response to the masking stimulus. Psychol Res 58, 92–101. Kobatake, E. and Tanaka, K. (1994). Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. J Neurophysiol 71, 856–67.
Koch, C. (2004). The Quest for Consciousness. Englewood, CO: Roberts & Co. Koch, C. and Crick, F. (2001). The zombie within. Nature 411, 893. Koch, C. and Segev, I. (1998). Methods in Neuronal Modeling (2nd Ed.). Cambridge, MA: MIT Press. Kolers, P. (1962). Intensity and contour effects in visual masking. Vision Res 2, 277–94. Kolers, P. (1963). Some differences between real and apparent visual movement. Vision Res 3, 191–206. Kolers, P. (1972). Aspects of Motion Perception. Oxford: Pergamon Press. Kolers, P. and Rosner, B.S. (1960). On visual masking (metacontrast): dichoptic observations. Am J Psychol 73, 2–21. Kolers, P. and von Grünau, M.W. (1976). Shape and color in apparent motion. Vision Res 16, 329–35. Kolers, P. and von Grünau, M.W. (1977). Fixation and attention in apparent motion. Q J Exp Psychol 29, 389–95. Kondo, H. and Komatsu, H. (2000). Suppression of neuronal responses by a metacontrast masking stimulus in monkey V4. Neurosci Res 36, 27–33. Korte, A. (1915). Kinematoskopische Untersuchungen. Zeitschr Psychol 72, 194–296. Kovács, G., Vogels, R., and Orban, G.A. (1995). Cortical correlate of pattern backward masking. Proc Natl Acad Sci USA 92, 5587–91. Kovács, I., Papathomas, T.V., Yang, M., and Feher, A. (1996). When the brain changes its mind: interocular grouping during binocular rivalry. Proc Natl Acad Sci USA 93, 15508–11. Kreiman, G., Fried, I., and Koch, C. (2005). Response of single neurons in the human brain during flash suppression. In Binocular Rivalry (ed D. Alais and R. Blake). Cambridge, MA: MIT Press, pp. 213–30. Kremers, J., Weiss, S., and Zrenner, E. (1997). Temporal properties of marmoset lateral geniculate cells. Vision Res 37, 2649–60. Krüger, J. (1979). Responses to wavelength contrast in the afferent visual systems of the cat and rhesus monkey. Vision Res 19, 1351–8. Kubová, Z., Kuba, M., Peregrin, J., and Nováková, V. (1996). Visual evoked potential evidence for magnocellular system deficit in dyslexia. Physiol Res 45, 87–9. 
Kuhn, T.S. (1957a). The Copernican Revolution. Cambridge, MA: Harvard University Press. Kuhn, T.S. (1957b). The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press. Kuhn, T.S. (1962). Historical structure of scientific discovery. Science 136, 760–4. Kulikowski, J.J. (1974). Human averaged occipital potentials evoked by pattern and movement. J Physiol 242, 70–1. Kulikowski, J.J. (1975). Apparent fineness of briefly presented gratings: balance between movement and pattern channels. Vision Res 15, 673–80. Kulikowski, J.J. (1977). Visual evoked potentials as a measure of visibility. In Visual Evoked Potentials in Man: New Developments (ed J.E. Desmedt). Oxford: Clarendon Press, pp. 168–83. Kulikowski, J.J. and Tolhurst, D.J. (1973). Psychophysical evidence for sustained and transient detectors in human vision. J Physiol 232, 149–62.
Kunkel, A. (1874). Über die Abhängigkeit der Farbempfindung von der Zeit. Pflugers Arch Gesamte Physiol 135, 197–200. Kurylo, D.D. (1997). Time course of perceptual grouping. Percept Psychophys 59, 142–7. LaBerge, D. (1995). Attentional Processing: The Brain’s Art of Mindfulness. Cambridge, MA: Harvard University Press. LaBerge, D. and Brown, V. (1989). Theory of attentional operations in shape identification. Psychol Rev 96, 101–24. Lachter, J. and Durgin, F.H. (1999). Metacontrast masking functions: a question of speed? J Exp Psychol Hum Percept Perform 25, 1–12. Lamme, V.A.F. (1995). The neurophysiology of figure–ground segregation in primary visual cortex. J Neurosci 15, 1605–15. Lamme, V.A.F. (2000). Neural mechanisms of visual awareness: a linking hypothesis. Brain Mind 1, 385–406. Lamme, V.A.F. (2003). Why visual attention and awareness are different. Trends Cogn Sci 7, 12–18. Lamme, V.A.F. and Spekreijse, H. (2000). Modulations of primary visual cortex activity representing attentive and conscious scene perception. Front Biosci 5, 232–43. Lamme, V.A., Rodriguez-Rodriguez, V., and Spekreijse, H. (1999). Separate processing dynamics for texture elements, boundaries and surfaces in primary visual cortex of the macaque monkey. Cereb Cortex 9, 406–13. Lamme, V.A.F., Super, H., Landman, R., Roelfsema, P.R., and Spekreijse, H. (2000). The role of primary visual cortex (V1) in visual awareness. Vision Res 40, 1507–21. Lamme, V.A.F., Zipser, K., and Spekreijse, H. (2002). Masking interrupts figure–ground signals in V1. J Cogn Neurosci 14, 1044–53. Lamy, D. and Egeth, H. (2002). Object-based selection: the role of attentional shifts. Percept Psychophys 64, 52–66. Landahl, H.D. (1961). A note on mathematical models for the interaction of neural elements. Bull Math Biophys 23, 91–7. Landahl, H.D. (1967). A neural net model for masking phenomena. Bull Math Biophys 29, 227–32. Lanze, M., Maguire, W., and Weisstein, N. (1985). 
Emergent features: a new factor in the object-superiority effect? Percept Psychophys 38, 438–42. Latch, M. and Lennie, P. (1977). Rod–cone interaction in light adaptation. J Physiol 269, 517–34. Lee, B.B., Martin, P.R., and Valberg, A. (1988). The physiological basis of heterochromatic flicker photometry demonstrated in the ganglion cells of the macaque retina. J Physiol 404, 323–47. Lee, B.B., Martin, P.R., and Valberg, A. (1989a). Amplitude and phase of responses of macaque retinal ganglion cells to flickering stimuli. J Physiol 414, 245–63. Lee, B.B., Martin, P.R., and Valberg, A. (1989b). Nonlinear summation of M- and L-cone inputs to phasic retinal ganglion cells of the macaque. J Neurosci 9, 1433–42. Lee, B.B., Pokorny, J., Smith, V.C., Martin, P.R., and Valberg, A. (1990). Luminance and chromatic modulation sensitivity of macaque ganglion cells and human observers. J Opt Soc Am A 7, 2223–6.
Lee, B.B., Pokorny, J., Smith, V.C., and Kremers, J. (1994). Responses to pulses and sinusoids in macaque ganglion cells. Vision Res 34, 3081–96. Lee, D.N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception 5, 437–59. Lee, D.N. (1980). The optic flow field: the foundation of vision. Philos Trans R Soc Lond B Biol Sci 290, 169–79. Lee, S.H. and Blake, R. (2002). V1 activity is reduced during binocular rivalry. J Vis 2, 618–26. Lee, S.H. and Blake, R. (2004). A fresh look at interocular grouping during binocular rivalry. Vision Res 44, 983–91. Lee, T., Mumford, D., and Schiller, P. (1995). Neuronal correlates of boundary and medial axis representations in primate visual cortex. J Invest Ophthalmol Vis Sci 36, 477. Lefton, L.A. (1970). Metacontrast: Further evidence for monotonic functions. Psychon Sci 21, 85–7. Lefton, L.A. (1973). Metacontrast: a review. Percept Psychophys 13, 161–71. Lefton, L.A. and Griffin, J.R. (1976). Metacontrast with internal contours: more evidence for monotonic functions. Bull Psychon Soc 7, 29–32. Lefton, L.A. and Newman, Y. (1976). Metacontrast and paracontrast: both photopic and scotopic luminance levels yield monotones. Bull Psychon Soc 8, 435–8. Legge, G. (1978). Sustained and transient mechanisms in human vision: temporal and spatial properties. Vision Res 18, 341–76. Lehmkuhle, S. and Fox, R. (1980). Effect of depth separation on metacontrast masking. J Exp Psychol Hum Percept Perform 6, 605–21. Lehmkuhle, S., Kratz, K.E., Mangel, S.C., and Sherman, S.M. (1980). Spatial and temporal sensitivity of X- and Y-cells in dorsal lateral geniculate nucleus of the cat. J Neurophysiol 43, 520–41. Lehmkuhle, S., Garzia, R.P., Turner, L., Hash, T., and Baro, J.A. (1993). A defective visual pathway in children with reading disability. N Engl J Med 328, 989–96. Leopold, D.A. and Logothetis, N.K. (1996). Activity changes in early visual cortex reflect monkey’s percept during binocular rivalry. 
Nature 379, 549–52. Leuthold, H. and Kopp, B. (1998). Mechanisms of priming by masked stimuli: inferences from event-related brain potentials. Psychol Sci 9, 263–9. Leventhal, A.G. and Hirsch, H.V. (1978). Receptive-field properties of neurons in different laminae of visual cortex of the cat. J Neurophysiol 41, 948–62. Levi, D.M. (1994). Pathophysiology of binocular vision and amblyopia. Curr Opin Ophthalmol 5, 3–10. Levi, D.M. and Harwerth, R.S. (1980). Contrast sensitivity in amblyopia due to stimulus deprivation. Br J Ophthalmol 64, 15–20. Levi, D.M., Harwerth, R.S., and Manny, R. (1979). Suprathreshold spatial frequency detection and binocular interaction in strabismic and anisometropic amblyopia. Invest Ophthalmol Vis Sci 18, 714–25. Levine, R., Didner, R., and Tobenkin, N. (1967). Backward masking as a function of interstimulus distance. Psychon Sci 9, 185–6. Li, W. and Gilbert, C.D. (2002). Global contour saliency and local collinear interactions. J Neurophysiol 88, 2846–56.
Libet, B. (1996). Neural processes in the production of conscious experience. In The Science of Consciousness (ed V. Velman). London: Routledge, pp. 97–117. Lie, I. (1980). Visual detection and resolution as a function of retinal locus. Vision Res 20, 967–74. Lindsey, D.B., Fehmi, L.G., and Adkins, J.W. (1967). Visually evoked potentials during perceptual masking in man and monkey. Electroencephalogr Clin Neurophysiol 23, 79. Lipkin, B.S. (1962). Monocular flicker discrimination as a function of the luminance and area of contralateral steady light. I: Luminance. J Opt Soc Am 52, 1287–1300. Livingstone, M.S. and Hubel, D.H. (1984). Anatomy and physiology of a color system in the primate visual cortex. J Neurosci 4, 309–56. Livingstone, M.S. and Hubel, D.H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. J Neurosci 7, 3416–68. Livingstone, M. and Hubel, D. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science 240, 740–9. Livingstone, M.S., Rosen, G.D., Drislane, F.W., and Galaburda, A.M. (1991). Physiological and anatomical evidence for a magnocellular defect in developmental dyslexia. Proc Natl Acad Sci USA 88, 7943–7. Lleras, A. and Moore, C.M. (2003). When the target becomes the mask: using apparent motion to isolate the object-level component of object substitution masking. J Exp Psychol Hum Percept Perform 29, 106–20. Loffler, G., Gordon, G.E., Wilkinson, F., Goren, D., and Wilson, H.R. (2005). Configurational masking of faces: Evidence for high-level interactions in face perception. Vision Res 45, 2287–97. Logothetis, N.K. and Schall, J.D. (1989). Neuronal correlates of subjective visual perception. Science 245, 761–3. Löhmann, W. (1906). Über Helladaptation. Zeitschr Sinnesphysiol 41, 290–311. Long, G. (1980). Iconic memory: a review and critique of the study of short-term visual storage. Psychol Bull 88, 785–820. Long, G.M. and Gildea, T.J. (1981). 
Latency for the perceived offset of brief target gratings. Vision Res 21, 1395–9. Lotze, H. (1852). Medicinische Psychologie oder Physiologie der Seele. Leipzig: Weidmann. Lovegrove, W. (1993). Weakness in the transient visual system: a causal factor in dyslexia? Ann NY Acad Sci 682, 57–69. Lovegrove, W.J. and Williams, M.C. (1993). Visual temporal processing deficits in specific reading disability. In Visual Processes in Reading and Reading Disabilities (ed D.M. Willows, R.S. Kruk, and E. Corcos). Hillsdale, NJ: Erlbaum, pp. 311–29. Lovegrove, W.J., Martin, F., and Slaghuis, W. (1986). A theoretical and experimental case for visual deficit in specific reading disability. Cogn Neuropsychol 3, 225–67. Lubimov, V. and Logvinenko, A. (1993). Motion blur revisited. Perception 22 (Suppl): 77. Lupp, U., Hauske, G., and Wolf, W. (1976). Perceptual latencies to sinusoidal gratings. Vision Res 16, 969–72. Lupp, U., Hauske, G., and Wolf, W. (1978). Different systems for the visual detection of high and low spatial frequencies. Photogr Sci Eng 22, 80–4.
Lyon, J.E., Matteson, H.H., and Marx, M.S. (1981). Metacontrast in the fovea. Vision Res 21, 297–9. McAdams, C.J. and Maunsell, J.H.R. (1999). Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J Neurosci 19, 431–41. McClelland, J.L. (1978). Perception and masking of wholes and parts. J Exp Psychol Hum Percept Perform 4, 210–23. McCloskey, M. and Watkins, M.J. (1978). The seeing-more-than-is-there phenomenon: implications for the locus of iconic storage. J Exp Psychol Hum Percept Perform 4, 553–64. McCormick, P.A. (1997). Orienting attention without awareness. J Exp Psychol Hum Percept Perform 23, 168–80. McDougall, W. (1904a). The sensations excited by a single momentary stimulation of the eye. Br J Psychol 1, 78–113. McDougall, W. (1904b). The variations of the intensity of visual sensation with the duration of the stimulus. Br J Psychol 1, 151–89. McFadden, D. and Gummerman, K. (1973). Monoptic and dichoptic metacontrast across the vertical meridian. Vision Res 13, 185–96. McIlwain, J.T. and Lufkin, R.B. (1976). Distribution of direct Y-cell inputs to the cat’s superior colliculus: are there spatial gradients? Brain Res 103, 133–8. McKee, S.P. and Westheimer, G. (1970). Specificity of cone mechanisms in lateral interaction. J Physiol 206, 117–28. McKee, S.P., Bravo, M.J., Taylor, D.G., and Legge, G.E. (1994a). Stereo matching precedes dichoptic masking. Vision Res 34, 1047–60. McKee, S.P., Levi, D.M., and Movshon, J.A. (1994b). The pattern of visual deficits in amblyopia. J Vis 3, 380–405. Mach, E. (1865). Über die Wirkung der räumlichen Vertheilung des Lichtreizes auf die Netzhaut. Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 52, Part 2, 303–22. Mach, E. (1866a). Über den physiologischen Effekt räumlich verteilter Lichtreize (Zweite Abhandlung). Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 54, Part 2, 131–44. Mach, E. (1866b). 
Über die physiologische Wirkung räumlich verteilter Lichtreize (Dritte Abhandlung). Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 54, Part 2, 393–408. Mach, E. (1868). Über die physiologische Wirkung räumlich verteilter Lichtreize (Vierte Abhandlung). Wiener Sitzungsber Math-Naturwiss Cl Kaiserlichen Akad Wiss 57, Part 2, 11–1. Mack, A. and Rock, I. (1998). Inattentional Blindness. Cambridge, MA: MIT Press. Macknik, S.L. and Haglund, M.M. (1999). Optical images of visible and invisible percepts in the primary visual cortex of primates. Proc Natl Acad Sci USA 96, 15208–10. Macknik, S.L. and Livingstone, M.S. (1998). Neuronal correlates of visibility and invisibility in the primate visual system. Nat Neurosci 1, 144–9. Maffei, L., Cervetto, L., and Fiorentini, A. (1970). Transfer characteristics of excitation and inhibition in cat retinal ganglion cells. J Neurophysiol 33, 276–84.
Maier, J., Dagnelie, G., Spekreijse, H., and van Dijk, B.W. (1987). Principal components analysis for source localization of VEPs in man. Vision Res 27, 165–77. Makous, W. and Peeples, D. (1979). Rod–cone interaction: reconciliation with Flamant and Stiles. Vision Res 19, 695–8. Manahilov, V. (1995). Spatiotemporal visual response to suprathreshold stimuli. Vision Res 35, 227–37. Marcel, A.J. (1983a). Conscious and unconscious perception: experiments on visual masking and word recognition. Cogn Psychol 15, 197–237. Marcel, A.J. (1983b). Conscious and unconscious perception: an approach to the relation between phenomenal experience and perceptual processes. Cogn Psychol 15, 238–300. Markoff, J.I. and Sturr, J.F. (1971). Spatial and luminance determinants of the increment threshold under monoptic and dichoptic viewing. J Opt Soc Am 61, 1530–7. Marr, D. (1982). Vision. San Francisco, CA: W.H. Freeman & Co. Marr, D. and Ullman, S. (1981). Directional selectivity and its uses in early visual processing. Proc R Soc Lond B Biol Sci 211, 151–80. Marriott, F.H.C. (1962). Colour vision: the two-colour threshold technique of Stiles. In The Eye. Vol 2, The Visual Process (ed H. Davson). New York: Academic Press, pp. 251–72. Marrocco, R.T., McClurkin, J.W., and Young, R.A. (1982). Spatial summation and conduction latency classification of cells of the lateral geniculate of macaques. J Neurosci 2, 1275–91. Martin, K.A. and Marshall, J.A. (1993). Unsmearing visual motion: development of long-range horizontal intrinsic connections. In Advances in Neural Information Processing Systems, Vol. 5 (ed S. Hanson, J. Cowan, and C. Giles). San Mateo, CA: Morgan Kaufmann. Martin, K.A.C. (1992). Parallel pathways converge. Curr Biol 2, 555–7. Martius, G. (1902). Über die Dauer der Lichtempfindungen. Beitr Psychol Physiol Leipzig 1, 275–366. Matin, E. (1974). Light adaptation and the dynamics of induced tilt. Vision Res 14, 255–65. Matin, E. (1975). The two-transient (masking) paradigm. 
Psychol Rev 82, 451–61. Matsumura, M. (1976). Visual masking by luminance increments and decrements: Effects of rise time and decay time. Tohoku Psychol Folia 34, 95–102. Matteson, H.H. (1969). Effect of surround size and luminance on metacontrast. J Opt Soc Am 59, 1461–8. Matthews, M.L. (1971). Spatial and temporal factors in masking by edges and disks. Percept Psychophys 9, 15–22. Mattson, A.J., Levin, H.S., and Breitmeyer, B.G. (1994). Visual information processing after severe closed head injury: effects of forward and backward masking. J Neurol Neurosurg Psychiatry 57, 818–24. Maunsell, J.H.R. and Gibson, J.R. (1992). Visual response latencies in striate cortex of the macaque monkey. J Neurophysiol 68, 1332–44. Maunsell, J.H.R., Ghose, G.M., Assad, J.A., McAdams, C.J., Boudreau, C.E., and Noerager, B.D. (1999). Visual response latencies of magnocellular and parvocellular LGN neurons in macaque monkeys. Vis Neurosci 16, 1–14.
May, J.G., Grannis, S.W., and Porter, R.J., Jr (1980). The lag effect in dichoptic viewing. Brain Lang 11, 19–29. May, J., Lovegrove, W., Martin, F., and Nelson, W. (1991). Pattern-elicited visual evoked potentials in good and poor readers. Clin Vis Sci 6, 131–6. May, J., Dunlap, W., and Lovegrove, W. (1992). Factor scores derived from visual evoked potential latencies differentiate good and poor readers. Clin Vis Sci 7, 67–70. May, J.G., Tsiappoutas, K.M., and Flanagan, M.B. (2003). Disappearance elicited by contrast decrements. Percept Psychophys 65, 763–9. Mayzner, M.S. and Tresselt, M.E. (1970). Visual information processing with sequential inputs: a general model for sequential blanking, displacement, and overprinting phenomena. Ann NY Acad Sci 169, 599–618. Merikle, P.M. (1980). Selective metacontrast. Can J Psychol 34, 196–9. Merikle, P.M. (1992). Perception without awareness. Am Psychol 47, 792–5. Merikle, P.M. and Joordens, S. (1997). Parallels between perception without attention and perception without awareness. Conscious Cogn 6, 219–36. Merikle, P.M., Smilek, D., and Eastwood, J.D. (2001). Perception without awareness: perspectives from cognitive psychology. Cognition 79, 115–34. Merritt, R.D. and Balogh, D.W. (1984). The use of a backward masking paradigm to assess information-processing deficits among schizophrenics: a re-evaluation of Steronko and Woods. J Nerv Ment Dis 172, 216–24. Merritt, R.D. and Balogh, D.W. (1990). Backward masking as a function of spatial frequency: a comparison of MMPI-identified schizotypics and control subjects. J Nerv Ment Dis 178, 186–93. Metzinger, T. (2000). Neural Correlates of Consciousness. Cambridge, MA: MIT Press. Mewhort, D.J. and Campbell, A.J. (1978). Processing spatial information and the selective-masking effect. Percept Psychophys 24, 93–101. Mewhort, D.J.K., Hearty, P.J., and Powell, J.E. (1978). A note on sequential blanking. Percept Psychophys 23, 132–6. Mewhort, D.J.K., Huntley, M.F., and Duff-Fraser, H. 
(1993). Masking disrupts recovery of location information. Percept Psychophys 54, 759–62. Meyer, G.E. and Maguire, W.M. (1977). Spatial frequency and the mediation of short-term visual storage. Science 198, 524–5. Meyer, G.E., Lawson, R., and Cohen, W. (1975). The effects of orientation-specific adaptation on the duration of short-term visual storage. Vision Res 15, 569–72. Meyer, G.E., Jackson, W.E., and Yang, C. (1979). Spatial frequency, orientation and color: interocular effects of adaptation on the perceived duration of gratings. Vision Res 19, 1197–1201. Mezrich, J.J. (1984). The duration of visual persistence. Vision Res 24, 631–2. Michaels, C.F. and Turvey, M.T. (1973). Hemiretinae and nonmonotonic masking functions with overlapping stimuli. Bull Psychon Soc 2, 163–4. Michaels, C.F. and Turvey, M.T. (1979). Central sources of visual masking: Indexing structures supporting seeing at a single, brief glance. Psychol Res 41, 1–61. Michimata, C., Okubo, M., and Mugishima, Y. (1999). Effects of background color on the global and local processing of hierarchically organized stimuli. J Cogn Neurosci 11, 1–8.
Milner, A.D. and Goodale, M.A. (1995). The Visual Brain in Action. Oxford: Oxford University Press. Minkowski, M. (1913). Experimentelle Untersuchungen über die Beziehung der Grosshirnrinde und der Netzhaut zu den primären optischen Zentren, besonders zum Corpus geniculatum externum. Arbeitsbl Hirnanat Inst Zurich 7, 255–62. Minkowski, M. (1920a). Über den Verlauf, die Endigung und die zentrale Repräsentation von gekreuzten und ungekreuzten Sehnervenfasern bei einigen Säugetieren und beim Menschen. Schweiz Arch Neurol Psychol 6, 201–52. Minkowski, M. (1920b). Über den Verlauf, die Endigung und die zentrale Repräsentation von gekreuzten und ungekreuzten Sehnervenfasern bei einigen Säugetieren und beim Menschen. Schweiz Arch Neurol Psychol 7, 268–303. Mitov, D., Vassilev, A., and Manahilov, V. (1981). Transient and sustained masking. Percept Psychophys 30, 205–10. Mitroff, S.R. and Scholl, B.J. (2005). Forming and updating object representations without awareness: evidence from motion-induced blindness. Vision Res 45, 961–7. Mollon, J.D. and Krauskopf, J. (1973). Reaction time as a measure of the temporal response properties of individual color mechanisms. Vision Res 13, 27–40. Monahan, J.S. and Steronko, R.J. (1977). Stimulus luminance and dichoptic pattern masking. Vision Res 17, 385–90. Monjé, J. (1927). Die Empfindungszeitmessung mit der Methode des Löschreizes. Zeitschr Biol 87, 23–40. Monjé, J. (1931). Über die gegenseitige Beeinflussung der Empfindungen bei binokularem Sehen. Zeitschr Biol 91, 387–98. Moran, J. and Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science 229, 782–4. Mori, S., Tanaka, G., Ayaka, Y., et al. (1996). Preattentive and focal attentional processes in schizophrenia: a visual search study. Schizophr Res 22, 69–76. Morris, J.S., Öhman, A., and Dolan, R.J. (1998). Conscious and unconscious emotional learning in the human amygdala. Nature 393, 467–70. Morris, J. and Dolan, R. (2001). 
The amygdala and unconscious fear processing. In Out of Mind (ed B. de Gelder, E. de Haan, and C. Heywood). Oxford: Oxford University Press, pp. 185–204. Motoyoshi, I. (1999). Texture filling-in and texture segregation revealed by transient masking. Vision Res 39, 1285–91. Motter, B.C. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J Neurophysiol 70, 909–19. Moutoussis, K. and Zeki, S. (2002). Responses of spectrally selective cells in macaque area V2 to wavelengths and colors. J Neurophysiol 87, 2104–12. Muise, J.G., LeBlanc, R.S., Lavoie, M.E., and Arsenault, A.S. (1991). Two-stage model of visual backward masking: sensory transmission and accrual of effective information as a function of target intensity and similarity. Percept Psychophys 50, 197–204. Müller, J. (1834). Handbuch der Physiologie des Menschen. Coblenz: Hölscher. Müller, J.R., Metha, A.B., Krauskopf, J., and Lennie, P. (2001). Information conveyed by onset transients in responses of striate cortical neurons. J Neurosci 21, 6978–90.
Munk, M.H.J., Roelfsema, P.R., Koenig, P., Engel, A.K., and Singer, W. (1996). Role of reticular activation in the modulation of intracortical synchronization. Science 272, 271–4. Murray, I.J. and Kulikowski, J.J. (1983). VEPs and contrast. Vision Res 23, 1741–3. Murray, I.J. and Plainis, S. (2003). Contrast coding and magno/parvo segregation revealed in reaction time studies. Vision Res 43, 2707–19. Murray, S.O., Kersten, D., Olshausen, B.A., Schrater, P., and Woods, D.L. (2002). Shape perception reduces activity in human primary visual cortex. Proc Natl Acad Sci USA 99, 15164–9. Mussap, A.J. and Levi, D.M. (1997). Vernier acuity with plaid masks: the role of oriented filters in vernier acuity. Vision Res 37, 1325–40. Naccache, L., Blandin, E., and Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychol Sci 13, 416–24. Naccache, L., Gaillard, R., Adam, C., et al. (2005). A direct intracranial record of emotions evoked by subliminal words. Proc Natl Acad Sci USA 102, 7713–17. Nachmias, J. (1967). Effect of exposure duration on visual contrast sensitivity with square-wave gratings. J Opt Soc Am A 57, 421–7. Nagano, T. (1980). Temporal sensitivity of the human visual system to sinusoidal gratings. J Opt Soc Am 70, 711–16. Nakamura, K., Dehaene, S., Jobert, A., Le Bihan, D., and Kouider, S. (2005). Subliminal convergence of Kanji and Kana words: further evidence for functional parcellation of the posterior temporal cortex in visual word perception. J Cogn Neurosci 17, 954–68. Navon, D. (1977). Forest before the trees: the precedence of global features in visual perception. Cogn Psychol 9, 353–83. Navon, D. and Purcell, D. (1981). Does integration produce masking or protect from it? Perception 10, 71–83. Nealy, T.A. and Maunsell, J.H.R. (1994). Magnocellular and parvocellular contributions to the responses of neurons in macaque striate cortex. J Neurosci 14, 2069–79. Neisser, U. (1967). Cognitive Psychology. 
New York: Appleton-Century-Crofts. Neisser, U. (1976). Cognition and Reality. San Francisco: W.H. Freeman & Co. Nelson, J.I. and Frost, B.J. (1978). Orientation-selective inhibition from beyond the classical receptive field. Brain Res 139, 359–66. Nelson, S.B. (1991). Temporal interactions in the cat visual system. I: Orientation-selective suppression in the visual cortex. J Neurosci 11, 344–56. Neuhaus, W. (1930). Experimentelle Untersuchungen der Scheinbewegung. Arch Gesamte Psychol 75, 315–458. Neumann, O. (1990). Direct parameter specification and the concept of perception. Psychol Res 52, 207–15. Neumann, O. and Klotz, W. (1994). Motor responses to nonreportable, masked stimuli. Where is the limit of direct parameter specification? In Attention and Performance XV (ed C. Umilta and M. Moscovitch). Cambridge, MA: MIT Press, pp. 123–50. Newark, J. and Mayzner, M.S. (1973). Sequential blanking effects for two interleaved words. Bull Psychon Soc 2, 74–6. Nguyen, V.A., Freeman, A.W., and Alais, D. (2003). Increasing depth of binocular rivalry suppression along two visual pathways. Vision Res 43, 2003–8.
Nijhawan, R. (1994). Motion extrapolation in catching. Nature 370, 256–7. Nijhawan, R. (1997). Visual decomposition of colour through motion extrapolation. Nature 386, 66–9. Nillson, T.H., Richmond, C.F., and Nelson, T.M. (1975). Flicker adaptation shows evidence of many visual channels selectively sensitive to temporal frequency. Vision Res 15, 621–4. Noesselt, T., Hillyard, S.A., Woldorff, M.G., et al. (2002). Delayed striate cortical activation during spatial attention. Neuron 35, 575–87. Nowak, L.G. and Bullier, J. (1997). The timing of information transfer in the visual system. In Cerebral Cortex: Extrastriate Cortex in Primates (ed J. Kaas, K. Rockland, and A. Peters). New York: Plenum Press, pp. 205–41. Nowak, L.G., Munk, M.H., Girard, P., and Bullier, J. (1995). Visual latencies in areas V1 and V2 of the macaque monkey. Vis Neurosci 12, 371–84. Nuechterlein, K.H. and Dawson, M.E. (1984). Information processing and attentional functioning in the developmental course of schizophrenic disorders. Schizophr Bull 10, 160–203. O’Connor, D.H., Fukui, M.M., Pinsk, M.A., and Kastner, S. (2002). Attention modulates responses in the human lateral geniculate nucleus. Nat Neurosci 5, 1203–9. Öğmen, H. (1993). A neural theory of retino-cortical dynamics. Neural Netw 6, 245–73. Öğmen, H. and Breitmeyer, B.G. (2005). The First Half Second: The Microgenesis and Temporal Dynamics of Unconscious and Conscious Processes. Cambridge, MA: MIT Press. Öğmen, H. and Gagné, S. (1990). Neural models for sustained and on-off units of insect lamina. Biol Cybern 63, 51–60. Öğmen, H., Breitmeyer, B.G., and Melvin, R. (2003). The what and where in visual masking. Vision Res 43, 1337–50. Öğmen, H., Breitmeyer, B.G., Todd, S., and Mardon, L. (2004). Double dissociation in target recovery: effect of contrast. Paper presented at the Annual Meeting of the Vision Sciences Society, Sarasota, FL, 30 April–5 May 2004. Öhman, A. (2002). 
Automaticity and the amygdala: nonconscious responses to emotional faces. Curr Dir Psychol Sci 11, 62–6. Ortells, J.J., Daza, M.T., and Fox, E. (2003). Semantic activation in the absence of perceptual awareness. Percept Psychophys 65, 1307–17. Oyama, T. (1970). The visually perceived velocity as a function of aperture size, stripe size, luminance and motion direction. Jpn Psychol Res 12, 163–71. Pääkkönen, A.K. and Morgan, M.J. (1994). Effects of motion on blur discrimination. J Opt Soc Am A 11, 992–1002. Palmer, L.A. and Rosenquist, A.C. (1974). Visual receptive fields of single striate cortical units projecting to the superior colliculus in the cat. Brain Res 67, 27–42. Palmer, S.E. (1999). Vision Science. Cambridge, MA: MIT Press. Pammer, K. and Lovegrove, W. (2001). The influence of color on transient system activity: implications for dyslexia research. Percept Psychophys 63, 490–500. Pantle, A. (1971). Flicker adaptation. I: Effect on visual sensitivity to temporal fluctuations of light intensity. Vision Res 11, 943–52.
Paradiso, M.A. and Nakayama, K. (1991). Brightness perception and filling-in. Vision Res 31, 1221–36. Parker, D.M. (1980). Simple reaction times to the onset, offset, and contrast reversal of sinusoidal grating stimuli. Percept Psychophys 28, 365–8. Parker, D.M. and Salzen, E.A. (1977a). The spatial selectivity of early and late waves within the human visual evoked response. Perception 6, 85–95. Parker, D.M. and Salzen, E.A. (1977b). Latency changes in the human visual evoked response to sinusoidal gratings. Vision Res 17, 1201–4. Parker, D.M. and Salzen, E.A. (1982). Evoked potentials and reaction times to the offset and contrast reversal of sinusoidal gratings. Vision Res 22, 205–7. Parks, T.E. (1965). Post-retinal visual storage. Am J Psychol 78, 145–7. Parks, T.E. (1968). Further comments on the evidence for post-retinal storage. Percept Psychophys 4, 373. Parks, T.E. (1970). A control for ocular tracking in the demonstration of postretinal visual storage. Am J Psychol 83, 442–4. Parnas, J., Bovet, P., and Innocenti, G.M. (1996). Schizophrenic trait features, binding, and cortico-cortical connectivity: a neurodevelopmental pathogenetic hypothesis. Neurol Psychiatr Brain Res 4, 185–96. Pasley, B.N., Mayes, L.C., and Schultz, R.T. (2004). Subcortical discrimination of unperceived objects during binocular rivalry. Neuron 42, 163–72. Pasupathy, A. and Connor, C.E. (2002). Population coding of shape in area V4. Nat Neurosci 5, 1332–8. Pélisson, D., Prablanc, C., Goodale, M.A., and Jeannerod, M. (1986). Visual control of reaching movements without vision of the limb. II: Evidence of fast unconscious processes correcting the trajectory of the hand to the final position of a double-step stimulus. Exp Brain Res 62, 303–11. Perry, V.H., Oehler, R., and Cowey, A. (1984). Retinal ganglion cells that project to the dorsal lateral geniculate nucleus in the macaque monkey. Neuroscience 12, 1101–23. Pessoa, L., McKenna, M., Gutierrez, E., and Ungerleider, L.G. (2002). 
Neural processing of emotional faces requires attention. Proc Natl Acad Sci USA 99, 11458–63. Pessoa, L. and De Weerd, P. (2003). Filling-in. Oxford: Oxford University Press. Petrén, K. (1893). Untersuchungen über den Lichtsinn. Skand Arch Physiol 4, 421–47. Petry, S. (1978). Perceptual changes during metacontrast. Vision Res 18, 1337–41. Petry, S., Grigonis, A., and Reichert, B. (1979). Decrease in metacontrast masking following adaptation to flicker. Perception 8, 541–7. Piéron, H. (1935). Le processus du metacontraste. J Psychol (Paris) 32, 5–24. Piper, H. (1903). Über Dunkeladaptation. Zeitschr Psychol Physiol Sinnesorg 31, 169–214. Plateau, J. (1834). Über das Phänomen der zufälligen Farben. Poggend Arch Phys Chem 32, 543–54. Poggio, G.F., Baker, F.H., Lamarre, Y., and Sanseverino, E.R. (1969). Afferent inhibition at input to visual cortex of the cat. J Neurophysiol 32, 892–915. Polat, U. and Sagi, D. (1993). Lateral interactions between spatial channels: suppression and facilitation revealed by lateral masking experiments. Vision Res 33, 993–9.
Polat, U. and Sagi, D. (1994). Spatial interactions in human vision: from near to far via experience-dependent cascades of connections. Proc Natl Acad Sci USA 91, 1206–9. Polat, U., Sagi, D., and Norcia, A.M. (1997). Abnormal long-range spatial interactions in amblyopia. Vision Res 37, 737–44. Pollen, D.A. (1999). On the neural correlates of visual perception. Cereb Cortex 9, 4–13. Polonsky, A., Blake, R., Braun, J., and Heeger, D.J. (2000). Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry. Nat Neurosci 3, 1153–9. Pomerantz, J.R. and Pristach, E.A. (1989). Emergent features, attention, and perceptual glue in visual form perception. J Exp Psychol Hum Percept Perform 15, 635–49. Pomerantz, J.R., Sager, L.C., and Stoever, R.J. (1977). Perception of wholes and their component parts: Some configurational superiority effects. J Exp Psychol Hum Percept Perform 3, 422–35. Popper, K. (1972). Objective Knowledge. Oxford: Oxford University Press. Posner, M. (1978). Chronometric Explorations of Mind. Hillsdale, NJ: Lawrence Erlbaum & Associates. Posner, M. (1980). Orienting of attention. Q J Exp Psychol 32, 3–25. Posner, M.I. (1994). Attention: the mechanism of consciousness. Proc Natl Acad Sci USA 91, 7398–403. Posner, M.I. and Petersen, S.E. (1990). The attention system of the human brain. Ann Rev Neurosci 13, 25–42. Posner, M.I. and Rothbart, M.K. (1994). Constructing neuronal theories of mind. In Large-Scale Neuronal Theories of the Brain (ed C. Koch and J.L. Davis). Cambridge, MA: MIT Press, pp. 183–99. Previc, F.H. (1990). Functional specialization in the lower and upper visual fields: its ecological origin and neurophysiological implications. Behav Brain Sci 13, 519–41. Previc, F.H. (1998). The neuropsychology of 3-D space. Psychol Bull 124, 123–64. Prinzmetal, W. and Banks, W.P. (1977). Good continuation affects visual detection. Percept Psychophys 21, 389–95. Proctor, R.W., Nunn, M.B., and Pallos, I. (1983). 
The influence of metacontrast masking on detection and spatial-choice judgments: an apparent distinction between automatic and attentive response mechanisms. J Exp Psychol Hum Percept Perform 2, 278–87. Pulos, E., Raymond, J.E., and Makous, W. (1980). Transient sensitization by a contrast flash. Vision Res 20, 281–8. Purcell, D.G. and Dember, W.N. (1968). The relation of phenomenal brightness reversal and re-reversal to backward masking and recovery. Percept Psychophys 3, 290–2. Purcell, D.G. and Stewart, A.L. (1970). U-shaped backward masking functions with nonmetacontrast paradigms. Psychon Sci 21, 361–3. Purcell, D.G. and Stewart, A.L. (1988). The face-detection effect: configuration enhances perception. Percept Psychophys 43, 355–66. Purcell, D.G. and Stewart, A.L. (1991). The object-detection effect: configuration enhances perception. Percept Psychophys 50, 215–24. Purcell, D.G., Stewart, A.L., and Brunner, R.L. (1974). Metacontrast target detection under light and dark adaptation. Bull Psychon Soc 3, 199–201.
Purcell, D.G., Stewart, A.L., Davis, J., et al. (1975). U-shaped masking functions under backward masking by pattern. Bull Psychon Soc 5, 498–500. Purkinje, J.E. (1819). Beiträge zur Kenntnis des Sehens in Subjektiver Hinsicht. Prague: J.G. Calve. Purpura, D.P. (1970). Operations and processes in thalamic and synaptically related neural subsystems. In The Neurosciences: Second Study Program (ed F.O. Schmitt). New York: Rockefeller University Press, pp. 458–70. Purpura, K., Tranchina, D., Kaplan, E., and Shapley, R.M. (1990). Light adaptation in primate retina: analysis of changes in gain and dynamics of retinal ganglion cells. Vis Neurosci 4, 75–93. Purushothaman, G., Öğmen, H., Chen, S., and Bedell, H.E. (1998). Motion deblurring in a neural network model of retino-cortical dynamics. Vision Res 38, 1827–42. Purushothaman, G., Öğmen, H., and Bedell, H.E. (2000). Gamma-range oscillations in backward masking functions and their putative neural correlates. Psychol Rev 107, 556–77. Purushothaman, G., Öğmen, H., and Bedell, H.E. (2003). Suprathreshold intrinsic dynamics of the human visual system. Neural Comput 15, 2883–908. Rafal, R., Smith, J., Krantz, J., Cohen, A., and Brennan, C. (1990). Extrageniculate vision in hemianopic humans: saccade inhibition by signals in the blind field. Science 250, 118–20. Raiguel, S.E., Lagae, L., Gulyas, B., and Orban, G.A. (1989). Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Res 493, 155–9. Ramachandran, V.S. (1990). Visual perception in people and machines. In AI and the Eye (ed A. Blake and T. Troscianko). New York: Wiley, pp. 21–77. Ramachandran, V.S. and Cobb, S. (1995). Visual attention modulates metacontrast masking. Nature 373, 66–8. Ramachandran, V.S., Rao, V.M., and Vidyasagar, T.R. (1974). Sharpness constancy during movement perception. Perception 3, 97–8. Ransom-Hogg, A. and Spillmann, L. (1980). Perceptive field size in fovea and periphery of the light- and dark-adapted retina. 
Vision Res 20, 221–8. Rashbass, C. (1970). The visibility of transient changes of luminance. J Physiol 210, 165–86. Rashevsky, N. (1960). Mathematical Biophysics. New York: Dover Publications. Rassovsky, Y., Green, M.F., Nuechterlein, K.H., Breitmeyer, B., and Mintz, J. (2004). Paracontrast and metacontrast in schizophrenia: clarifying the mechanism for visual masking deficits. Schizophr Res 71, 485–92. Rassovsky, Y., Green, M.F., Nuechterlein, K.H., Breitmeyer, B., and Mintz, J. (2005). Visual processing in schizophrenia: a confirmatory factor analysis of visual masking parameters. Schizophr Res 78, 251–60. Ratliff, F. (1965). Mach Bands: Quantitative Studies on Neural Networks in the Retina. San Francisco, CA: Holden-Day. Rauschenberger, R. and Yantis, S. (2001). Masking unveils pre-amodal completion representation in visual search. Nature 410, 369–72. Reeves, A. (1980). Visual imagery in backward masking. Percept Psychophys 28, 118–24. Reeves, A. (1981). Metacontrast in hue substitution. Vision Res 21, 907–12. Reeves, A. (1982). Metacontrast U-shaped functions derive from two monotonic functions. Perception 11, 415–26.
Regan, D. (1970). Evoked potentials and psychophysical correlates of changes in stimulus colour and intensity. Vision Res 10, 163–78. Regan, D. and Cynader, M. (1979). Neurons in area 18 of cat visual cortex selectively sensitive to changing size: nonlinear interactions between responses to two edges. Vision Res 19, 699–711. Reicher, G.M. (1969). Perceptual recognition as a function of meaningfulness of stimulus material. J Exp Psychol 81, 275–80. Reingold, E.M. and Merikle, P.M. (1990). On the inter-relatedness of theory and measurement in the study of unconscious processes. Mind Lang 5, 9–28. Reynolds, J.H., Chelazzi, L., and Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. J Neurosci 19, 1736–53. Rieger, J.W., Braun, C., Bulthoff, H.H., and Gegenfurtner, K.R. (2005). The dynamics of visual pattern masking in natural scene processing: a magnetoencephalography study. J Vis 5, 275–86. Rieke, F., Warland, D., de Ruyter van Steveninck, R., and Bialek, W. (1997). Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press. Rijsdijk, J.P., Kroon, J.N., and van der Wildt, G.J. (1980). Contrast sensitivity as a function of position on the retina. Vision Res 20, 235–41. Ro, T., Breitmeyer, B., Burton, P., Singhal, N.S., and Lane, D. (2003). Feedback contributions to visual awareness in human occipital cortex. Curr Biol 11, 1038–41. Ro, T., Shelton, D., Lee, O.L., and Chang, E. (2004). Extrageniculate mediation of unconscious vision in transcranial magnetic stimulation-induced blindsight. Proc Natl Acad Sci USA 101, 9933–5. Rodieck, R.W. and Rushton, W.A.H. (1976). Cancellation of rod signals by cones, and cone signals by rods in the cat retina. J Physiol 254, 775–85. Rodriguez, E., George, N., Lachaux, J.-P., Martinerie, J., Renault, B., and Varela, F.J. (1999). Perception’s shadow: long-distance synchronization of human brain activity. Nature 397, 430–3. Roelfsema, P.R., Lamme, V.A.F., and Spekreijse, H. (1998). 
Object-based attention in the primary visual cortex of the macaque monkey. Nature 395, 376–81. Rogowitz, B. (1983). Spatial/temporal interactions: backward and forward metacontrast masking with sine-wave gratings. Vision Res 23, 1057–73. Rolls, E.T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philos Trans R Soc Lond B Biol Sci 335, 11–20. Rolls, E.T. (2005). Consciousness absent and present: a neurophysiological exploration of masking. In The First Half Second (ed H. Öğmen and B.G. Breitmeyer). Cambridge, MA: MIT Press, pp. 89–108. Rolls, E.T. and Tovée, M.J. (1994). Processing speed in the cerebral cortex and the neurophysiology of visual masking. Proc R Soc Lond B Biol Sci 257, 9–15. Rolls, E.T., Tovée, M.J., Purcell, D.G., Stewart, A.L., and Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates, and face identification and detection. Exp Brain Res 101, 473–84.
Rolls, E.T., Tovée, M.J., and Panzeri, S. (1999). The neurophysiology of backward visual masking: information analysis. J Cogn Neurosci 11, 300–11. Rosenquist, A.C. and Palmer, L.A. (1971). Visual receptive field properties of cells of the superior colliculus after cortical lesions in the cat. Exp Neurol 33, 629–52. Ross, J., Morrone, M.C., Goldberg, M.E., and Burr, D. (2001). Changes in visual perception at the time of saccades. Trends Neurosci 24, 113–21. Rossi, A.F., Desimone, R., and Ungerleider, L. (2001). Contextual modulation in primary visual cortex of macaques. J Neurosci 21, 1698–1709. Roth, E.C. and Hellige, J.B. (1998). Spatial processing and hemispheric asymmetry: contributions of the transient/magnocellular visual system. J Cogn Neurosci 10, 472–84. Rubin, E. (1929). Kritisches und Experimentelles zur ‘Empfindungszeit’ Fröhlichs. Psychol Forsch 13, 101–12. Rudvin, I., Valberg, A., and Kilavik, B.E. (2000). Visual evoked potentials and magnocellular and parvocellular segregation. Vis Neurosci 17, 579–90. Rund, B.R. and Landrø, N.I. (1990). Information processing: a new model for understanding cognitive disturbances in psychiatric patients. Acta Psychiatr Scand 81, 305–16. Rushton, W.A.H. (1965). Bleached rhodopsin and visual adaptation. J Physiol 181, 645–55. Rushton, W.A. and Westheimer, G. (1962). The effect upon the rod threshold of bleaching neighbouring rods. J Physiol 164, 318–29. Saarinen, J., Levi, D.M., and Shen, B. (1997). Integration of local pattern elements into global shape in human vision. Proc Natl Acad Sci USA 94, 8267–71. Saccuzzo, D.P. and Braff, D.L. (1980). Associative cognitive dysfunction in schizophrenia and old age. J Nerv Ment Dis 168, 41–5. Saccuzzo, D.P. and Braff, D.L. (1981). Early information processing deficit in schizophrenia. New findings using schizophrenic subgroups and manic control subjects. Arch Gen Psychiatry 38, 175–9. Saccuzzo, D.P. and Braff, D.L. (1986). 
Information-processing abnormalities: trait- and state-dependent components. Schizophr Bull 12, 447–59. Saccuzzo, D.P. and Schubert, D.L. (1981). Backward masking as a measure of slow processing in schizophrenia spectrum disorders. J Abnorm Psychol 90, 305–12. Saccuzzo, D.S., Cadenhead, K.S., and Braff, D.L. (1996). Backward versus forward visual masking deficits in schizophrenic patients: centrally, not peripherally, mediated? Am J Psychiatry 153, 1564–70. Sagi, D. and Julesz, B. (1985). Enhanced detection in the aperture of focal attention during simple discrimination tasks. Nature 321, 693–5. Sahraie, A., Weiskrantz, L., Barbur, J.L., Simmons, A., Williams, S.C.R., and Brammer, M.J. (1997). Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc Natl Acad Sci USA 94, 9406–11. Sahraie, A., Weiskrantz, L., and Barbur, J.L. (1998). Awareness and confidence ratings in motion perception without geniculo-striate projection. Behav Brain Res 96, 71–7. Sakitt, B. (1975). Locus of short-term visual storage. Science 190, 1318–19.
Sakitt, B. (1976). Iconic memory. Psychol Rev 83, 257–76. Samar, V.J., Parasnis, I., and Berent, G.P. (2002). Deaf poor readers’ pattern reversal visual evoked potentials suggest magnocellular system deficits: implications for diagnostic neuroimaging of dyslexia in deaf individuals. Brain Lang 80, 21–44. Sandberg, M.A., Berson, E.L., and Effron, M.H. (1981). Rod–cone interaction in the distal human retina. Science 212, 829–31. Sarikaya, M., Wang, W., and Öğmen, H. (1998). Neural network model of on-off units in the fly visual system: simulations of dynamic behavior. Biol Cybern 78, 399–412. Sasaki, H., Saito, Y., Bear, D.M., and Ervin, F.R. (1971). Quantitative variation in striate receptive fields of cats as a function of light and dark adaptation. Exp Brain Res 113, 273–93. Sato, T. (1988). Effects of attention and stimulus interaction on visual responses of inferior temporal neurons in macaque. J Neurophysiol 60, 344–64. Saucer, R.T. (1954). Processes of motion perception. Science 120, 806–7. Saunders, J. (1977). Foveal and spatial properties of brightness metacontrast. Vision Res 17, 375–8. Sawatari, A. and Callaway, E.M. (1996). Convergence of magno- and parvocellular pathways in layer 4B of macaque primary visual cortex. Nature 380, 442–6. Scharf, B. and Lefton, L.A. (1970). Backward and forward masking as a function of stimulus and task parameters. J Exp Psychol 84, 331–8. Scharf, B., Zamansky, H.S., and Brighthill, R.F. (1966). Word recognition with masking. Percept Psychophys 1, 110–2. Scharlau, I. and Ansorge, U. (2003). Direct parameter specification of an attention shift: evidence from perceptual latency priming. Vision Res 43, 1351–63. Scharlau, I. and Neumann, O. (2003). Perceptual-latency priming by metacontrast-masked stimuli: evidence for an attentional interpretation. Psychol Res 67, 184–97. Scheerer, E. (1973). Integration, interruption and processing rate in visual backward masking. Psychol Forsch 36, 71–93. Scheinberg, D.L. 
and Logothetis, N.K. (1997). The role of temporal visual areas in perceptual organization. Proc Natl Acad Sci USA 94, 3408–13. Schiller, P.H. (1965). Backward masking for letters. Percept Mot Skills 20, 47–50. Schiller, P.H. (1966). Forward and backward masking as a function of relative overlap and intensity of test and masking stimuli. Percept Psychophys 1, 161–4. Schiller, P.H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Res 8, 855–66. Schiller, P.H. (1986). The central visual system. Vision Res 26, 1351–86. Schiller, P.H. and Chorover, S.L. (1966). Metacontrast: its relation to evoked potentials. Science 153, 1398–1400. Schiller, P.H. and Colby, C.L. (1983). The responses of single cells in the lateral geniculate nucleus of the rhesus monkey to color and luminance contrast. Vision Res 23, 1631–41. Schiller, P.H. and Greenfield, A. (1969). Visual masking and the recovery phenomenon. Percept Psychophys 6, 182–4. Schiller, P.H. and Lee, K. (1991). The role of primate extrastriate area V4 in vision. Science 251, 1251–3.
Schiller, P.H. and Smith, M.C. (1965). A comparison of forward and backward masking. Psychonomic Sci 3, 77–8. Schiller, P.H. and Smith, M.C. (1966). Detection in metacontrast. J Exp Psychol 71, 32–9. Schiller, P.H. and Smith, M.C. (1968). Monoptic and dichoptic metacontrast. Percept Psychophys 3, 237–9. Schiller, P.H., Finlay, B.L., and Volman, S.F. (1976). Quantitative studies of single cell properties in monkey striate cortex. I–V. J Neurophysiol 39, 1288–1374. Schmidt, T. (2000). Visual perception without awareness: priming responses by color. In Neural Correlates of Consciousness (ed T. Metzinger). Cambridge, MA: MIT Press, pp. 157–69. Schmidt, T. (2002). The finger in flight: real-time motor control by visually masked color stimuli. Psychol Sci 13, 112–18. Schmolesky, M.T., Wang, Y., Hanes, D.G., et al. (1998). Signal timing across the macaque visual system. J Neurophysiol 79, 3272–8. Schober, H.A.W. and Hilz, R. (1965). Contrast sensitivity of the human eye for square-wave gratings. J Opt Soc Am A 55, 1086–91. Schuck, J.R. and Lee, R.G. (1989). Backward masking, information processing, and schizophrenia. Schizophr Bull 15, 491–500. Schultz, D.W. and Eriksen, C.W. (1977). Do noise masks terminate target processing? Mem Cognit 5, 90–6. Schulz, A.J. (1908). Untersuchungen über die Wirkung gleicher Reize auf die Auffassung bei momentaner Exposition. Zeitschr Psychol 52, 238–96. Schulz, M.F. and Sanocki, T. (2003). Time course of perceptual grouping by color. Psychol Sci 14, 26–30. Schumann, F. (1899). Sitzungsberichte des Psychologischen Vereins zu Berlin. Zeitschr Psychol 1, 96–100. Schwartz, M. and Pritchard, W.S. (1981). AERs and detection in tasks yielding U-shaped backward masking functions. Psychophysiology 18, 678–85. Schwartz, S.H. (1992). Reaction time distributions and their relationship to the transient/sustained nature of the neural discharge. Vision Res 32, 2087–92. Seiffert, A.E. and Di Lollo, V. (1997). Low-level masking in the attentional blink. 
J Exp Psychol Hum Percept Perform 23, 1061–73. Sekuler, R.W. (1965). Spatial and temporal determinants of visual backward masking. J Exp Psychol 70, 401–6. Shannon, C.E. (1948). A mathematical theory of communication. Bell Syst Tech J 27, 379–423. Shapiro, K. (2001). The Limits of Attention. Oxford: Oxford University Press. Shapley, R. (1992). Parallel retinocortical channels: X and Y and P and M. In Applications of Parallel Processing in Vision (ed J. Brannan). Amsterdam: Elsevier, pp. 3–36. Shaw, R. and Bransford, J. (1977). Perceiving, Acting, and Knowing. Hillsdale, NJ: Lawrence Erlbaum Associates. Sheer, D.E. (1984). Focused arousal, 40 Hz EEG, and dysfunction. In Self Regulation of the Brain and Behavior (ed T. Elbert, B. Rockstroh, W. Lutzenberger, and N. Birbaumer). Berlin: Springer, pp. 66–84.
Sheinberg, D.L. and Logothetis, N.K. (1997). The role of temporal cortical areas in perceptual organization. Proc Natl Acad Sci USA 94, 3408–13. Shelley-Tremblay, J. and Mack, A. (1999). Metacontrast masking and attention. Psychol Sci 10, 508–15. Sherman, S.M. and Guillery, R.W. (1996). Functional organization of thalamocortical relays. J Neurophysiol 76, 1367–95. Sherrick, M. and Dember, W. (1970). Completeness and spatial distribution of mask contours as factors in backward visual masking. J Exp Psychol 84, 179–80. Sherrick, M.F., Keating, J.K., and Dember, W.N. (1974). Metacontrast with black and white stimuli. Can J Psychol 28, 438–45. Sherrington, C.S. (1897). On the reciprocal action in the retina as studied by means of some rotating discs. J Physiol 21, 33–54. Shostak, Y., Ding, Y., and Casagrande, V.A. (2003). Neurochemical comparison of synaptic arrangements of parvocellular, magnocellular, and koniocellular geniculate pathways in owl monkey (Aotus trivirgatus) visual cortex. J Comp Neurol 456, 12–28. Sigman, M., Cecchi, G.A., Gilbert, C.D., and Magnasco, M.O. (2001). On a common circle: natural scenes and gestalt rules. Proc Natl Acad Sci USA 98, 1935–40. Silveira, L.C., Grünert, U., Kremers, J., Lee, B.B., and Martin, P.R. (2005). Comparative anatomy and physiology of the primate retina. In The Primate Visual System: A Comparative Approach (ed J. Kremers). Chichester: Wiley, pp. 127–60. Silverstein, S.M., Knight, R.A., Schwarzkopf, S.B., West, L.L., Osborn, L.M., and Kamin, D. (1996). Stimulus configuration and context effects in perceptual organization in schizophrenia. J Abnorm Psychol 105, 410–20. Silverstein, S.M., Kovács, I., Corry, R., and Valone, C. (2000). Perceptual organization, the disorganization syndrome, and context processing in chronic schizophrenia. Schizophr Res 43, 11–20. Simoncelli, E.P. (2003). Vision and the statistics of the visual environment. Curr Opin Neurobiol 13, 144–9. Sincich, L.C. and Horton, J.C. (2002).
Divided by cytochrome oxidase: a map of the projections from V1 to V2 in macaques. Science 295, 1734–7. Sincich, L.C. and Horton, J.C. (2005). The circuitry of V1 and V2: integration of color, form, and motion. Annu Rev Neurosci 28, 303–26. Singer, W. (1977). Control of thalamic transmission by corticofugal and ascending reticular pathways in the visual system. Physiol Rev 57, 386–420. Singer, W. (1979). Central-core control of visual cortex functions. In The Neurosciences, Fourth Study Program (ed F.O. Schmitt and F.G. Worden). Cambridge, MA: MIT Press, pp. 1093–110. Singer, W. (1994). Putative functions of temporal correlations in neocortical processing. In Large-Scale Neuronal Theories of the Brain (ed C. Koch and J.L. Davis). Cambridge, MA: MIT Press, pp. 201–37. Singer, W. and Creutzfeldt, O.D. (1970). Reciprocal lateral inhibition of on- and off-center neurones in the lateral geniculate body of the cat. Exp Brain Res 10, 311–30. Singer, W., Tretter, F., and Cynader, M. (1976). The effect of reticular stimulation on spontaneous and evoked activity in the cat visual cortex. Brain Res 102, 71–90.
Skagestadt, P. (1975). Making Sense of History. Oslo: Universitetsforlaget. Skottun, B. (1997a). The magnocellular deficit theory of dyslexia. Trends Neurosci 20, 397–8. Skottun, B. (1997b). Some remarks on the magnocellular deficit theory of dyslexia. Vision Res 37, 965–6. Skottun, B. (2000). On the use of metacontrast to assess magnocellular function in dyslexic readers. Percept Psychophys 63, 1271–4. Skottun, B. (2001). Is dyslexia caused by a visual deficit? Vision Res 41, 3069–71. Skottun, B.C. and Parke, L.A. (1999). The possible relationship between visual deficits and dyslexia: examination of a critical assumption. J Learn Disabil 32, 2–5. Slaghuis, W.L. and Bakker, V.J. (1995). Forward and backward visual masking of contour by light in positive- and negative-symptom schizophrenia. J Abnorm Psychol 104, 41–54. Slaghuis, W.L. and Ryan, J.F. (1999). Spatio-temporal contrast sensitivity, coherent motion, and visible persistence in developmental dyslexia. Vision Res 39, 651–68. Smith, E.E. and Haviland, S.E. (1972). Why words are perceived more accurately than nonwords: inferences versus unitization. J Exp Psychol 92, 59–64. Smith, M.C. and Schiller, P.H. (1966). Forward and backward masking: a comparison. Can J Psychol 20, 191–7. Smith, P.L. (2000). Attention and luminance detection: effects of cues, masks, and pedestals. J Exp Psychol Hum Percept Perform 26, 1401–20. Smith, P.L. and Wolfgang, B.J. (2004). The attentional dynamics of masked detection. J Exp Psychol Hum Percept Perform 30, 119–36. Smith, P.L., Wolfgang, B.J., and Sinclair, A.J. (2004). Mask-dependent attentional cuing effects on visual signal detection: the psychometric function of contrast. Percept Psychophys 66, 1056–75. Smith, V.C. (1969a). Scotopic and photopic functions for visual band movement. Vision Res 9, 293–304. Smith, V.C. (1969b). Temporal and spatial interactions involved in the band movement phenomenon. Vision Res 9, 665–76. Smythe, L. and Finkel, D.L. (1974).
Masking of spatial and identity information from geometric forms by a visual noise field. Can J Psychol 28, 399–408. Somers, D.C., Dale, A.M., Seiffert, A.E., and Tootell, R.B.H. (1999). Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc Natl Acad Sci USA 96, 1663–8. Spehlmann, R. (1965). The averaged electrical responses to diffuse and to patterned light in the human. Electroencephalogr Clin Neurophysiol 19, 560–9. Spencer, T.J. (1969). Some effects of differing masking stimuli on iconic storage. J Exp Psychol 81, 132–40. Spencer, T.J. and Shuntich, R. (1970). Evidence for an interruption theory of backward masking. J Exp Psychol 85, 198–203. Sperling, G. (1960). The information available in brief visual presentations. Psychol Monogr 74 (498), 1–29. Sperling, G. (1963). A model for visual memory tasks. Hum Factors 5, 19–31.
Sperling, G. (1964). What visual masking can tell us about temporal factors in perception. Proc 17th Int Congr on Psychology. Amsterdam: North Holland, pp. 541–59. Sperling, G. (1965). Temporal and spatial visual masking. I. Masking by impulse flashes. J Opt Soc Am 55, 541–59. Sperling, G. (1967). Successive approximations to a model for short-term memory. Acta Psychol 27, 285–92. Spillmann, L. and Werner, J.S. (1996). Long-range interactions in visual perception. Trends Neurosci 19, 428–34. Stainton, W.H. (1928). The phenomenon of Broca and Sulzer in foveal vision. J Opt Soc Am 16, 26–39. Stein, J. (1993). Dyslexia—impaired temporal information processing? Ann NY Acad Sci 682, 83–6. Stein, J. and Walsh, V. (1997). To see but not to read: the magnocellular theory of dyslexia. Trends Neurosci 20, 147–52. Steriade, M. and McCarley, R.W. (1990). Brainstem Control of Wakefulness and Sleep. New York: Plenum Press. Steriade, M., Contreras, D., Amzica, F., and Timofeev, I. (1996). Synchronization of fast (30–40 Hz) spontaneous oscillations in intrathalamic and thalamocortical networks. J Neurosci 16, 2788–808. Sternheim, C.E. and Cavonius, C.R. (1972). Sensitivity of the human ERG and VECP to sinusoidally modulated light. Vision Res 12, 1685–95. Stettler, D.D., Das, A., Bennett, J., and Gilbert, C.D. (2002). Lateral connectivity and contextual interactions in macaque primary visual cortex. Neuron 36, 739–50. Stewart, A.L. and Purcell, D.G. (1970). U-shaped masking functions in visual backward masking: effects of target configuration and retinal position. Percept Psychophys 7, 253–6. Stewart, A.L. and Purcell, D.G. (1974). Visual backward masking by a flash of light: a study of U-shaped detection functions. J Exp Psychol 103, 553–66. Stigler, R. (1908). Über die Unterschiedsschwelle im aufsteigenden Teile einer Lichtempfindung. Pflugers Arch Gesamte Physiol 123, 163–223. Stigler, R. (1910). Chronotopische Studien über den Umgebungskontrast. 
Pflugers Arch Gesamte Physiol 135, 365–435. Stigler, R. (1913). Metacontrast (demonstration), IX Congress International de Physiologie, Gröningen. Arch Int Physiol 14, 78. Stigler, R. (1926). Die Untersuchungen des zeitlichen Verlaufes der optischen Erregung mittels des Metakontrastes. In Handbuch der Biologischen Arbeitsmethoden, Part 6 (ed E. Aberhalden). Berlin: Urban und Schwarzenberg, pp. 949–68. Stiles, W.S. (1939). The directional sensitivity of the retina and the spectral sensitivities of the rods and cones. Proc R Soc Lond B Biol Sci 127, 64–105. Stiles, W.S. (1949). Increment thresholds and the mechanisms of colour vision. Doc Ophthalmol 3, 138–63. Stiles, W.S. (1959). Color vision: the approach through increment threshold sensitivity. Proc Natl Acad Sci USA 45, 100–14. Stober, R.S., Brussel, E.M., and Komoda, M.K. (1978). Differential effects of metacontrast on target brightness and clarity. Bull Psychon Soc 12, 433–6.
Stoerig, P. (1996). Varieties of vision: from blind responses to conscious recognition. Trends Neurosci 19, 401–6. Stoerig, P. (2002). Neural correlates of consciousness as state and trait. In Encyclopedia of Cognitive Neuroscience (ed L. Nadel). London: MacMillan, pp. 233–40. Stone, J. and Dreher, B. (1973). Projection of X- and Y-cells of the cat’s lateral geniculate nucleus to areas 17 and 18 of visual cortex. J Neurophysiol 36, 551–67. Stoner, G.R. and Albright, T.D. (1992). Neural correlates of perceptual motion coherence. Nature 358, 412–14. Stoper, A.E. and Banffy, S. (1977). Relation of split apparent motion to metacontrast. J Exp Psychol Hum Percept Perform 3, 258–77. Stoper, A.E. and Mansfield, J.G. (1978). Metacontrast and paracontrast suppression of a contourless area. Vision Res 18, 1669–74. Stromeyer, C.F. and Julesz, B. (1972). Spatial-frequency masking in vision: critical bands and spread of masking. J Opt Soc Am 62, 1221–32. Stromeyer, C.F., 3rd, and Martini, P. (2003). Human temporal impulse response speeds up with increased stimulus contrast. Vision Res 43, 285–98. Stuart, G.W., McAnnaly, K.I., and Castles, A. (2001). Can contrast sensitivity functions in dyslexia be explained by inattention rather than a magnocellular deficit? Vision Res 41, 3205–11. Sturr, J.F. and Frumkes, T.E. (1968). Spatial factors in masking with black and white targets. Percept Psychophys 4, 282–4. Sturr, J.F., Frumkes, T.E., and Veneruso, D.M. (1965). Spatial determinants of visual masking: effects of mask size and retinal position. Psychon Sci 3, 327–8. Sugase, Y., Yamane, S., Ueno, S., and Kawano, K. (1999). Global and fine information coded by single neurons in the temporal visual cortex. Nature 400, 869–73. Sukale-Wolf, S. (1971). Prediction of the metacontrast phenomenon from simultaneous brightness contrast. PhD Thesis, Stanford University, Stanford, CA. Super, H., Spekreijse, H., and Lamme, V.A.F. (2001).
Two distinct modes of sensory processing observed in monkey primary visual cortex (V1). Nat Neurosci 4, 304–10. Szoc, R. (1973). Metacontrast with stereoscopically displayed stimuli. Master’s Thesis, University of California, Santa Barbara, CA. Talcott, J., Hansen, P., Willis-Owen, C., McKinnell, I., Richardson, A., and Stein, J. (1998). Visual magnocellular impairment in adult developmental dyslexics. Neuro-ophthalmology 20, 187–201. Tallon-Baudry, C. and Bertrand, O. (1999). Oscillatory gamma activity in humans and its role in object representation. Trends Cogn Sci 3, 151–62. Tanaka, J.W. and Farah, M.J. (1993). Parts and wholes in face perception. Q J Exp Psychol 46A, 225–45. Tanaka, J.W. and Sengco, J.A. (1993). Features and their configuration in face perception. Mem Cognit 25, 225–45. Tanaka, K. (1997). Mechanisms of visual object recognition: monkey and human studies. Curr Opin Neurobiol 7, 523–9. Tanaka, K., Saito, H., Fukuda, Y., and Moriya, M. (1991). Coding visual images of objects in the inferotemporal cortex. J Neurophysiol 66, 170–89.
Tartaglione, A., Goff, D.P., and Benton, A.L. (1975). Reaction time to square-wave gratings as a function of spatial frequency, complexity and contrast. Brain Res 100, 111–20. Tata, M.S. (2002). Attend to it now or lose it forever: selective attention, metacontrast masking, and object substitution. Percept Psychophys 64, 1028–38. Tata, M.S. and Giashi, D.E. (2004). Warning: attending to a mask may be hazardous to your perception. Psychon Bull Rev 11, 262–8. Taylor, G.A. and Chabot, R.J. (1978). Differential backward masking of words and letters by masks of varying orthographic structure. Mem Cognit 6, 629–35. Taylor, J.L. and McCloskey, D.I. (1990). Triggering of preprogrammed movements as reactions to masked stimuli. J Neurophysiol 63, 439–46. Teller, D.Y. (1971). Sensitization by annular surrounds: temporal (masking) properties. Vision Res 11, 1325–35. Teller, D.Y., Matthews, C., Phillips, W.D., and Alexander, K. (1971). Sensitization by annular surrounds: sensitization and masking. Vision Res 11, 1445–58. Temme, L.A. and Frumkes, T.E. (1977). Rod–cone interaction in human scotopic vision. III: Rods influence cone increment thresholds. Vision Res 17, 681–5. Teuber, H.-L. (1955). Physiological psychology. Annu Rev Psychol 6, 267–96. Thomas, G.J. (1954). The effect of critical flicker frequency of interocular differences in intensity and in phase relations of flashes of light. Am J Psychol 67, 632–46. Thompson, K.G. and Schall, J.D. (1999). The detection of visual signals by macaque frontal eye field during masking. Nat Neurosci 2, 283–8. Thompson, K.G. and Schall, J.D. (2000). Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Res 40, 1523–38. Tigerstedt, R. and Bergqvist, J. (1883). Zur Kenntniss der Apperceptionsdauer zusammengesetzter Gesichtsvorstellungen. Zeitschr Biol 19, 5–44. Tiitinen, H., Sinkkonen, J., Reinikainen, K., Alho, K., Lavikainen, J., and Näätänen, R. (1993).
Selective attention enhances the auditory 40-Hz transient response in humans. Nature 364, 59–60. Toch, H.H. (1956). The perceptual elaboration of stroboscopic presentations. Am J Psychol 69, 345–58. Tolhurst, D.J. (1973). Separate channels for the analysis of the shape and the movement of moving visual stimulus. J Physiol 231, 385–402. Tolhurst, D.J. (1975). Sustained and transient channels in human vision. Vision Res 15, 1151–5. Tong, F. (2001). Competing theories of binocular rivalry: a possible resolution. Brain Mind 2, 55–83. Tong, F. and Engel, S.A. (2001). Interocular rivalry revealed in the human cortical blind-spot representation. Nature 411, 195–9. Treisman, A. (1988). Features and objects. The 14th Bartlett Memorial Lecture. Q J Exp Psychol 40A, 201–37. Treisman, A.M. and Gelade, G. (1980). A feature integration theory of attention. Cogn Psychol 12, 97–136. Tresselt, M.E., Mayzner, M.S., Schoenberg, K.M., and Waxman, J. (1970). A study of sequential blanking and overprinting combined. Percept Psychophys 8, 261–4.
Treue, S. (2001). Neural correlates of attention in primate visual cortex. Trends Neurosci 24, 295–300. Troxler, D. (1804). Über das Verschwinden gegebener Gegenstände innerhalb unseres Gesichtskreises. In Ophthalmologische Bibliothek, vol 2 (eds. K. Himly and J.A. Schmidt). Jena: Fromann, pp. 1–119. Tsunoda, K., Yamane, Y., Nishizaki, M., and Tanifuji, M. (2001). Complex objects are represented in macaque inferotemporal cortex by the combination of feature columns. Nat Neurosci 4, 832–8. Tulunay-Keesey, U. (1972). Flicker and pattern detection: a comparison of thresholds. J Opt Soc Am 62, 446–8. Tulunay-Keesey, U. and Bennis, B.J. (1979). Effects of stimulus onset and image motion on contrast sensitivity. Vision Res 19, 767–74. Turvey, M. (1973). Contrasting orientations to the theory of visual information processing. Psychol Rev 84, 67–88. Turvey, M. (1977). On peripheral and central processes in vision: inferences from an information-processing analysis of masking with patterned stimuli. Psychol Rev 80, 1–52. Turvey, M. (1978). Visual processing and short-term memory. In Handbook of Learning and Cognitive Processes. Vol. 5, Human Information Processing (ed W.K. Estes). Hillsdale, NJ: Erlbaum, pp. 91–142. Turvey, M.T., Michaels, C.F., and Kewley-Port, D. (1974). Visual storage or visual masking? An analysis of the ‘retroactive contour enhancement’ effect. Q J Exp Psychol 26, 72–81. Tytla, M.E. and McAdie, P.J. (1981). Metacontrast masking in amblyopia. Paper presented at the Annual Meeting of the Association for Research in Vision and Ophthalmology, Sarasota, FL, May 1981. Tytla, M.E. and Steinbach, M.J. (1984). Metacontrast masking in amblyopia. Can J Psychol 38, 369–85. Ueno, T. (1977). Temporal characteristics of the human visual system as revealed by reaction time to double pulses of light. Vision Res 17, 591–6. Ullman, S. (1989). Aligning pictorial descriptions: an approach to object recognition. Cognition 32, 193–254. Ullman, S. (1996). High-Level Vision.
Cambridge, MA: MIT Press. Ungerleider, L.G. (1985). The corticocortical pathways for object recognition and spatial perception. In Pattern Recognition Mechanisms (ed C. Chagas, R. Gattas, and C. Gross). Vatican City: Pontifical Academy of Sciences, pp. 21–7. Ungerleider, L.G. and Mishkin, M. (1982). Two cortical visual systems. In Analysis of Visual Behavior (ed D.J. Ingle, M.A. Goodale, and R.J.W. Mansfield). Cambridge, MA: MIT Press, pp. 549–86. Uttal, W. (1973). The Psychobiology of Sensory Coding. New York: Harper and Row. van der Wildt, G.J. and Vrolijk, P.C. (1981). Propagation of inhibition. Vision Res 21, 1765–71. van Essen, D.C. (1985). Functional organization of primate visual cortex. In Cerebral Cortex, Vol. 3 (ed A. Peters and E.G. Jones). New York: Plenum Press, pp. 259–329. van Essen, D.C., Anderson, C.H., and Felleman, D.J. (1992). Information processing in the primate visual system: an integrated systems perspective. Science 255, 419–23.
VanRullen, R. and Koch, C. (2003). Visual selective behavior can be triggered by a feed-forward process. J Cogn Neurosci 15, 209–17. VanRullen, R. and Thorpe, S.J. (2001). The time course of visual processing: from early perception to decision making. J Cogn Neurosci 13, 454–61. van Santen, J.P.H. and Jonides, J. (1978). A replication of the face-superiority effect. Bull Psychon Soc 12, 378–80. van Santen, J.P.H. and Sperling, G. (1985). Elaborated Reichardt detectors. J Opt Soc Am A 2, 300–21. Vassilev, A. and Mitov, D. (1976). Perception time and spatial frequency. Vision Res 16, 89–92. Vassilev, A. and Strashimirov, D. (1979). On the latency of human visually evoked response to sinusoidal gratings. Vision Res 19, 843–5. Vaughan, H.G., Jr. and Silverstein, L. (1968). Metacontrast and evoked potentials: a reappraisal. Science 160, 207–8. Ventura, J. (1980). Foveal metacontrast. I: Criterion content and practice effects. J Exp Psychol Hum Percept Perform 6, 473–85. Vernoy, M.W. (1976). Masking by pattern in random-dot stereograms. Vision Res 16, 1183–4. Victor, J.D., Conte, M.M., Burton, L., and Nass, R.D. (1993). Visual evoked potentials in dyslexics and normals: failure to find a difference in transient or steady-state responses. Vis Neurosci 10, 939–46. Vidnyánszky, Z., Papathomas, T.V., and Julesz, B. (2001). Contextual modulation of orientation discrimination is independent of stimulus processing time. Vision Res 41, 2813–17. Vierordt, K. (1868). Der Zeitsinn. Tübingen: Universität Tübingen. Virsu, V., Lee, B.B., and Creutzfeldt, O.D. (1977). Dark adaptation and receptive field organisation of cells in the cat lateral geniculate nucleus. Exp Brain Res 27, 35–50. Volkmann, F.C. (1986). Human visual suppression. Vision Res 26, 1401–16. von Békésy, G. (1969). Mach- and Hering-type lateral inhibition in vision. Vision Res 9, 1483–99. von der Heydt, R., Friedman, H.S., Zhou, H., Komatsu, H., Hanazawa, A., and Murakami, I. (1997).
Neuronal responses in monkey V1 and V2 unaffected by metacontrast. Invest Ophthalmol Vis Sci 38 (Suppl), 2146. von Grünau, M.W. (1976). The ‘fluttering heart’ and spatio-temporal characteristics of color processing. III: Interactions between the systems of the rods and the long-wavelength cones. Vision Res 16, 397–401. von Grünau, M.W. (1978a). Interactions between sustained and transient channels: form inhibits motion in the human visual system. Vision Res 18, 197–201. von Grünau, M.W. (1978b). Dissociation and interaction of form and motion information in the human visual system. Vision Res 18, 1485–9. von Grünau, M.W. (1979). Form information is necessary for the perception of motion. Vision Res 19, 839–41. von Grünau, M.W. (1981). The origin of pattern information of an apparently moving object during stroboscopic motion. Percept Psychophys 30, 357–61. Vorberg, D., Mattler, U., Heinecke, A., Schmidt, T., and Schwarzbach, J. (2003). Different time courses for visual perception and action priming. Proc Natl Acad Sci USA 100, 6275–80.
Vorberg, D., Mattler, U., Heinecke, A., Schmidt, T., and Schwarzbach, J. (2004). Invariant time course of priming with and without awareness. In Psychophysics Beyond Sensation. Laws and Invariants of Human Cognition (ed C. Kaernbach, E. Schröger, and H. Müller). Mahwah, NJ: Erlbaum, pp. 271–88. Vrolijk, P.C. and van der Wildt, G.J. (1982). Propagating inhibition as a function of flash diameter and duration. Vision Res 22, 401–6. Vrolijk, P.C. and van der Wildt, G.J. (1985). Foveal inhibition measured with suprathreshold stimuli. Vision Res 25, 1413–21. Wachtler, T., Sejnowski, T.J., and Albright, T.D. (2003). Representation of color stimuli in awake macaque primary visual cortex. Neuron 37, 681–91. Wald, G. (1961). Retinal chemistry and the physiology of vision. In Visual Problems of Color, Symposium, Vol. 1. New York: Chemical Publishing, pp. 15–67. Wascher, E., Schatz, U., Kuder, T., and Vorleger, R. (2001). Validity and boundary conditions of automatic response activation in the Simon task. J Exp Psychol Hum Percept Perform 27, 731–51. Watamaniuk, S.N. (1992). Visible persistence is reduced by fixed-trajectory motion but not by random motion. Perception 21, 791–802. Watanabe, M. and Rodieck, R.W. (1989). Parasol and midget ganglion cells. J Comp Neurol 361, 537–51. Watanabe, T., Sasaki, Y., Miyauchi, S., et al. (1998). Attention-regulated activity in human primary visual cortex. J Neurophysiol 79, 2218–21. Watson, A.B. and Ahumada, A.J. (1985). Model of human visual-motion sensing. J Opt Soc Am A 2, 322–41. Watson, A.B. and Nachmias, J. (1977). Patterns of temporal interaction in the detection of gratings. Vision Res 17, 893–902. Watt, T. (1988). Visual Processing. Hove: Erlbaum. Wauschkuhn, B., Vorleger, R., Wascher, E., et al. (1998). Lateralised human cortical activity for shifting visuospatial attention and initiating saccades. J Neurophysiol 80, 2900–10. Wehrhahn, C. and Dresp, B. (1998).
Detection facilitation by collinear stimuli in humans: dependence on strength and sign of contrast. Vision Res 38, 423–8. Weiskrantz, L. (1997). Consciousness Lost and Found. Oxford: Oxford University Press. Weisstein, N. (1966). Backward masking and models of perceptual processing. J Exp Psychol 72, 232–40. Weisstein, N. (1968). A Rashevsky–Landahl neural net: simulation of metacontrast. Psychol Rev 75, 494–521. Weisstein, N. (1971). W-shaped and U-shaped functions obtained for monoptic and dichoptic disk–disk masking. Percept Psychophys 9, 275–8. Weisstein, N. (1972). Metacontrast. In Handbook of Sensory Physiology. Vol. 7/4, Visual Psychophysics (ed D. Jameson and L.M. Hurvich). New York: Springer, pp. 233–72. Weisstein, N. and Growney, R. (1969). Apparent movement and metacontrast: a note on Kahneman’s formulation. Percept Psychophys 5, 321–8. Weisstein, N. and Haber, R.N. (1965). A U-shaped backward masking function. Psychon Sci 2, 75–6.
Weisstein, N. and Harris, C. (1974). Visual detection of line segments: an object superiority effect. Science 186, 752–5. Weisstein, N. and Maguire, W. (1978). Computing the next step: psychophysical measures of representation and interpretation. In Computer Vision Systems (ed A.R. Hanson and E.M. Riseman). New York: Academic Press, pp. 243–60. Weisstein, N., Jurkens, T., and Onderisin, T. (1970). Effect of forced-choice vs. magnitude-estimation measures on the waveform of metacontrast functions. J Opt Soc Am 60, 978–80. Weisstein, N., Harris, C., and Ruddy, M. (1973). An object superiority effect. Bull Psychon Soc 2, 324. Weisstein, N., Ozog, G., and Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychol Rev 82, 325–43. Werner, H. (1935). Studies of contour. I: Qualitative analysis. Am J Psychol 47, 40–64. Werner, H. (1940). Studies on contour strobostereoscopic phenomena. Am J Psychol 53, 418–22. Wertheim, T. (1894). Über die indirekte Sehschärfe. Zeitschr Psychol Physiol Sinnesorg 7, 112–89. Wertheimer, M. (1912). Experimentelle Studien über das Sehen von Bewegung. Zeitschr Psychol 61, 161–265. Westendorf, D.H. (1989). Binocular rivalry and dichoptic masking: suppressed stimuli do not mask stimuli in a dominating eye. J Exp Psychol Hum Percept Perform 15, 485–92. Westerink, J. and Teunissen, K. (1995). Perceived sharpness in complex moving images. Displays 2, 89–97. Westheimer, G. (1967). Spatial interaction in human cone vision. J Physiol 190, 139–54. Westheimer, G. (1968). Bleached rhodopsin and retinal interaction. J Physiol 195, 97–106. Westheimer, G. (1970). Rod–cone independence for sensitizing interaction in the human retina. J Physiol 206, 109–16. Westheimer, G. and Hauske, G. (1975). Temporal and spatial interference with Vernier acuity. Vision Res 15, 1137–41. Whalen, P.J., Rauch, S.L., Etcoff, N.L., McInerney, S.C., Lee, M.B., and Jenike, M.A. (1998).
Masked presentations of emotional facial expressions modulate amygdala activity without explicit knowledge. J Neurosci 18, 411–18. Wheeler, D.D. (1970). Processes in word recognition. Cogn Psychol 1, 59–85. Wiesel, T.N. and Hubel, D.H. (1966). Spatial and chromatic interactions in the lateral geniculate body of the rhesus monkey. J Neurophysiol 29, 1115–56. Wilke, M., Logothetis, N.K., and Leopold, D.A. (2003). Generalized flash suppression of salient visual targets. Neuron 39, 1043–52. Williams, A. and Weisstein, N. (1978). Line segments are perceived better in a coherent context than alone: an object-line effect in visual perception. Mem Cognit 6, 85–90. Williams, M.A., Morris, A.P., McGlone, F., Abbott, D.F., and Mattingley, J.B. (2004). Amygdala responses to fearful and happy facial expressions under conditions of binocular suppression. J Neurosci 24, 2898–904.
Williams, M.C. and LeCluyse, K. (1990). Perceptual consequences of a temporal processing deficit in reading disabled children. J Am Optom Assoc 61, 111–21. Williams, M.C. and Weisstein, N. (1980). Perceptual grouping produces spatial frequency specific effects on metacontrast. Paper presented at the Annual Meeting of the Association for Research in Vision and Ophthalmology, Orlando, FL, May 1980. Williams, M.C. and Weisstein, N. (1981). Spatial frequency response and perceived depth in the time-course of object superiority. Vision Res 21, 631–46. Williams, M.C., Molinet, K., and LeCluyse, K. (1989). Visual masking as a measure of temporal processing in normal and disabled readers. Clin Vision Sci 4, 137–44. Williams, M.C., LeCluyse, K., and Bologna, N. (1990). Masking by light as a measure of visual integration time in normal and disabled readers. Clin Vision Sci 5, 335–43. Williams, M.C., Breitmeyer, B.G., Lovegrove, W.J., and Gutierrez, C. (1991). Metacontrast with masks varying in spatial frequency and wavelength. Vision Res 31, 2017–23. Williams, M.J., Stuart, G.W., Castles, A., and McAnnaly, K.I. (2003). Contrast sensitivity in subgroups of developmental dyslexia. Vision Res 43, 467–77. Williamson, S.J., Kaufman, L., and Brenner, D. (1977). Magnetic fields of the human brain. Naval Res Rev 30, 1–18. Williamson, S.J., Kaufman, L., and Brenner, D. (1978). Latency of the neuromagnetic response of the human visual cortex. Vision Res 18, 107–10. Wilson, H.R. (1980). Spatiotemporal characterization of a transient mechanism in the human visual system. Vision Res 20, 443–52. Wilson, A.E. and Johnson, R.M. (1985). Transposition in backward masking: the case of the traveling gap. Vision Res 25, 283–8. Wolfe, J.M., Chun, M.M., and Friedman-Hill, S.R. (1995). Making use of texton gradients: visual search and perceptual grouping exploit the same parallel processes in different ways. In Early Vision and Beyond (ed T.V. Papathomas, C. Chubb, A. Gorea, and E. Kowler).
Cambridge, MA: MIT Press, pp. 189–97. Wong, P.S. and Root, J.C. (2003). Dynamic variations in affective priming. Conscious Cogn 12, 147–68. Woodman, G.F. and Luck, S.J. (1999). Electrophysiological measurement of rapid shifts of attention during visual search. Nature 400, 867–9. Woodman, G.F. and Luck, S.J. (2003). Dissociations among attention, perception, and awareness during object-substitution masking. Psychol Sci 14, 605–11. Wundt, W. (1899). Zur Kritik tachistoskopischer Versuche I. Philos Stud 15, 287–317. Wundt, W. (1900). Zur Kritik tachistoskopischer Versuche II. Philos Stud 16, 61–81. Wynn, J.K., Light, G.K., Breitmeyer, B., Nuechterlein, K.H., and Green, M.F. (2005). Event-related gamma activity in schizophrenia patients during a visual backward masking task. Am J Psychiatry 162, 2330–6. Xiao, Y., Wang, Y., and Felleman, D.J. (2003). A spatially organized representation of colour in macaque cortical area V2. Nature 421, 535–9. Yabuta, N.H. and Callaway, E.M. (1998). Functional streams and local connections of layer 4C neurons in primary visual cortex of the macaque monkey. J Neurosci 18, 9489–99. Yantis, S. and Nakama, T. (1998). Visual interactions in the path of apparent motion. Nat Neurosci 1, 508–12.
Yellott, J.I., Jr. and Wandell, B.A. (1976). Color properties of the contrast flash effect: monoptic vs. dichoptic comparisons. Vision Res 16, 1275–80. Yeshurun, Y. (2004). Isoluminant stimuli and red background attenuate the effects of transient spatial attention on temporal resolution. Vision Res 44, 1375–87. Yeshurun, Y. and Levy, L. (2003). Transient spatial attention degrades temporal resolution. Psychol Sci 14, 225–31. Zeki, S. (1993). A Vision of the Brain. Oxford: Blackwell Science. Zeki, S. (1997). The color and motion systems as guides to conscious visual perception. In Cerebral Cortex. Vol. 12, Extrastriate Cortex of Primates (ed K.S. Rockland, J.H. Kaas, and A. Peters). New York: Plenum Press, pp. 777–809. Zeki, S. (1999). Inner Vision. Oxford: Oxford University Press. Zimba, L.D. and Blake, R. (1983). Binocular rivalry and semantic processing: out of sight, out of mind. J Exp Psychol Hum Percept Perform 9, 807–15. Zöllner, F. (1862). Über eine neue Art anorthoskopischer Zerrbilder. Ann Phys Chem 27, 477–84.
Index
access consciousness 247, 272 activity normalization 180 adaptation 20–21, 22 level 53–54 see also background intensity temporal 181 additive equations 179, 184, 185, 301–303 additive (leaky-integrator) model 179 after-image, positive 11, 12, 20, 23 homophotic/metaphotic 11 amblyopia 37, 288–289, 295 and metacontrast 288 Anbar and Anbar, visual masking model 304–305, 306 apparent motion 102, 109, 259 see also phi motion; stroboscopic motion apperception 28 area suppression 44 attention 18, 27, 28, 37, 72 and consciousness 253, 269 feature-based 246, 250 figural context and 235–251 and figural grouping 206, 248–249 and masking 130 and metacontrast 19, 243, 248–249 object-based 243, 246, 248, 250 space-based 243–246, 248, 250 spatial 19, 131, 133, 208, 249 unconscious processing and 269–272 attentional blink (AB) 37, 247, 272 attentional interruption 247 awareness 37, 272 Bachmann’s perceptual retouch model 124–129, 177 background color 59–66, 205 and magnocellular (M) activity 65 and metacontrast 63 background intensity see also adaptation, level and metacontrast 53 and two-process model of metacontrast 115 background luminance 53–54, 77, 154, 204 background wavelength 205
backward masking 5, 19–20, 24, 26–27, 34, 52, 66–73 by light 20, 29 by noise 196 by overlapping patterns 94–95 by pattern 198 by structure 196 object substitution and type B 129–132 types of 166 backward masking functions inverted type A 114 type B 114, 126, 197 backward pattern masking 36, 245 effects 95 in frontal eye field 89–90 functional magnetic resonance imaging (fMRI) studies 93–95, 97 and noise masking 76 in temporal cortex 90–92 backward structure masking 79 behavioral target visibilities 86f Bidwell’s ghost 9, 10, 11, 17 mechanism for suppression of 100 and visual persistence 24 binocular rivalry, metacontrast and 272–275 binocular vision 26, 58 biochemical dynamics 180 blindness 246 metacontrast induced 272–277 motion-induced 275–277 blindsight 254, 258, 259, 264, 270, 271 Bloch’s law 48, 153 blocking mask 306 blocking target 306 blur, image 203 blurring 203, 215, 219, 224 boundary contour system (BCS) 44, 134–138, 139, 264 Breitmeyer and Ganz’s sustained/transient dual-channel model 164–168, 212, 215–216 Bridgeman’s Hartline-Ratliff inhibitory network model 116–121, 139, 214, 302 brightness 8, 23, 63, 83 brightness contrast 110 brightness discrimination trials 82
brightness suppression 12, 56, 113, 199 in metacontrast 69, 202 in paracontrast 40, 69, 112 in stroboscopic motion 199 Broca-Sulzer effect 20 Burr’s spatiotemporal receptive-field model 107–109 center-surround organization 181–182 see also receptive fields central attentional mechanisms 246–248 channels sustained 124, 203 transient 124, 203 Charpentier bands 9, 10f, 17, 100, 220 and visual persistence 24 chromatic interactions see cone-cone interactions; rod-cone interactions cognitive masking 28 color 25 and metacontrast 13, 63 perception and identification of objects 264–269 and unconscious priming 267–269, 284 see also wavelength common-onset masking 71–73, 208, 247, 248 object substitution and 132–133 computer simulations 185 conduction velocity 3 cone–cone interactions 59–66, 100 see also rod–cone interactions cone–rod interactions see rod–cone interactions congruency effects 267 connectedness 238, 239f, 250 constancy assumption 6 context effects 250 see also connectedness; figural context; object superiority contextual modulation 70, 241 contour interactions 130, 174–178 contour masking, type B 199 contour proximity 130, 131 contour visibility 43 contrast simultaneous 7, 110 spatial 7 successive 7 see also stimulus contrast contrast reversal 79, 128, 156 contrast sensitivity 149, 153 flicker and pattern thresholds 150–151 spatiotemporal 147–155 contrast threshold 147, 149 contrast visibility 43 cortical contour networks 185 cortical mechanisms 93
cortical pathways 143–144 cortical visually evoked potentials (CVEPs) 92, 150, 159, 293 Crawford effect 20 criterion content 40–41, 43, 46, 47, 77, 79n, 90 criterion-free (thresholds) 216n critical flicker fusion (CFF) 8, 9, 25, 186, 187 see also flicker cyclopean (vision), and metacontrast 57–58 dark adaptation 20, 21, 22, 53 delay hypothesis (overtake) 11, 100, 101, 124–129 depth rating 238, 239f detection criterion 46 dichoptic metacontrast 12, 16, 30n, 57–58, 62, 100, 273, 274 dichoptic viewing 34 and metacontrast 48 and paracontrast 43 difference functions 87 direct parameter specification (DPS) theory 265 disinhibition see target recovery display parameters 32 dorsal pathway 143, 144, 145, 146f double dissociation 258, 259 drugs 38 dual-channel activation hypothesis 124–129 duration see stimulus duration duration–contrast reciprocity 153, 161 dyslexia 287, 292 evoked potentials 92–93 excitatory input 179, 180, 301, 302 eye dominance 73 FACADE model 264 facial superiority 242 facilitatory effects 175, 177, 242 feature contour system (FCS) 44, 264 feature inheritance 19, 47, 69–71, 79 see also shine-through feature migration 19, 47 feature transposition 19 feature-based attention 246 feedback 87, 168 feedback dominant phase 169 feedback inhibition 208 feedforward 87, 168, 169 feedforward dominant phase 169 Ferry–Porter law 187, 204 figural context, and attention 235–251 figural grouping, role in metacontrast 248–249 figure–ground segmentation 37
figure-ground segregation 85 flash suppression 275–277, 298 see also blindness flicker 8, 9, 150 and paracontrast 9 and persistence 8 see also critical flicker fusion flicker adaptation 206 transient and 161–164 flicker channels, spatial response profiles 151–152 flicker threshold 217n flicker-detectors 9 forced-choice contour discrimination 42 form perception and identification of objects 264–269 unconscious priming 265–267 forward lateral masking effects 106 forward masking 5, 34, 40–43 dichoptic type A 196 types of 165–166 four-dot masking 130, 131, 132, 208 fovea 199 and metacontrast 199 frontal eye field (FEF), backward pattern masking in 89–90 functional hierarchy 299 functional magnetic resonance imaging (fMRI) 93–95, 97 fusion velocity 204 gain control 180–181, 182 ganglion cells 142 Ganz’s interactive trace decay and random encoding time model 110–112, 116 gap inheritance 69 gap location tasks 94 generalized flash suppression 276 gestalt 70, 108, 249, 250 formation 109 grouping 37, 133, 244f, 248 rules 235 grouping 242 gestalt 37, 133, 244f, 248 perceptual 37 Hartline-Ratliff model 116–121, 139, 214, 302 Hering-type lateral inhibition 44 history of masking by light 19–23 of meta- and paracontrast 5–19 and perceptual stage analysis 3 of visual masking 1–30 of visual persistence 24 Hodgkin-Huxley equation 179, 301 homophotic image 11
horizontal cells 100 and metacontrast 12 hue substitution 63, 64, 66, 78, 204 see also isoluminance; isoluminant borders hypercomplex cells 136 icon 28 see also persistence iconic memory 28, 247 iconic persistence 23–24, 28, 207 impulse response 154, 155 inattentional blindness 246 inferior temporal cortex (IT) 90, 91 inhibition 125 inhibition-based models 227–231 inhibitory input 179, 301 integration 179 temporal 180 see also persistence integration hypothesis 30n integration and interruption model 114–115 integration–inhibition hypothesis 12, 14 intensity see background intensity; stimulus intensity inter-channel inhibition 109, 165, 166, 167, 198 see also sustained-on-transient inhibition; transient-on-sustained inhibition interactive trace decay and random encoding time model 110–112 interference effects 262 interocular effects see dichoptic viewing interruption 114, 247 interstimulus interval (ISI) 26, 47, 48 intra-channel inhibition 165, 166, 173, 191, 194 see also sustained-on-sustained inhibition isoluminance 65 isoluminant borders 66, 78 Kahneman’s impossible stroboscopic motion model 102–104, 205 latency lateral inhibition 214 of neuromagnetic response 160 perceptual 3 processing 144–145 and spatial frequency 160, 203 of sustained/transient channels 165 transduction 3 transmission 144–145 lateral geniculate nucleus (LGN) 81, 125, 142 lateral inhibition 55, 116, 133, 135, 139, 214, 304–305 Hering-type 44 Mach-type 44
lateral inhibitory interactions and horizontal cells 100 lateral masking 29, 99 leaky-integrator (additive) model 179 learning 47 light adaptation 20, 21, 22, 23, 25, 120, 187 limits of consciousness 29 Lissajous figure 26, 27 local signs 6 location retinal 18, 56 stimulus 56–57 luminance 20, 53–54, 77, 83, 115, 154, 204 Mach-type lateral inhibition 44 magnocellular (M) pathways 65, 108, 183, 184 mask, definition 2 mask blocking 306 mask-to-target (M/T) contrast ratio 42, 262 duration ratio 48, 49 energy ratio 48, 49, 50, 51–52, 53, 74 masking attentional effects 243–248 backward see backward masking by gain control 180–181, 182 by inhibition 179, 182 by integration 179, 182, 279 by light 19–23, 29, 188, 189 by noise 33, 73–76, 77, 78 by normalization 179 by structure 33f, 34, 73–76, 77, 78, 82–88 common-onset 71–73, 132–133, 208 dichoptic 34, 78, 101 forward 34, 40–43 models 99–140, 301–306 monoptic 60–61 phenomenology of 38–39 masking by pattern 33, 85, 188, 189 applications and uses of 36–38 noise and structure masks 73–76, 115 masking functions 47 backward 34, 35f bimodal 35f, 41 forward 34, 35f monotonic 34 monotonic type A 34, 89 multimodal 35f non-monotonic 34 oscillatory 35f type A 34, 35f, 78, 96 type B 34, 35f, 96 type B U-shaped 48, 114 U-shaped 16, 34, 53, 248, 306 U-shaped backward 305, 306 unimodal 35f masking of light by light 188
masking models, mathematical aspects of 301–306 masking of pattern by light 188 masking threshold 194 Matin’s three-neuron model 104–107, 204, 213 metacontrast 38, 43–66, 297 and amblyopia 288 and attention 243, 248 and background intensity 53 and backward masking 66–73 and binocular rivalry suppression 272–275 and color 63 and cortical visual evoked potential (CVEP) 92 criterion content in 260 cyclopean 57–58 definition 5 dichoptic 12, 16, 30n, 57–58, 62, 100, 273 effects of stimulus wavelength on 64–65 flash suppression and motion-induced blindness 275–277 and flicker adaptation 206 foveal 16, 56, 57, 199 as functions of SOA 50, 51f history of 5–19 and hue substitution 63, 66 and induced blindness 272–277 masking performance in SRD and normal readers 294–295, 296 monoptic 57 motion deblurring and 220–231 and motion perception 219–233 and paracontrast 5–19 parafoveal 15, 56 and pattern masking by structure in V1 82–88 and reaction time 46, 201 role of attention and figural grouping in 248–249 sequential 67 and simple detection 44, 46 stereoscopic 58 and stimulus onset asynchrony 45 and stimulus size 55, 78 and stroboscopic motion 18, 102, 103 suppression 56, 186, 202, 254–259, 272, 274 sustained/transient channels 65 target recovery 255 and target-mask energy ratio 48, 49 and target-mask spatial separation 56–57, 199, 223 task parameters and criterion content 43–47, 55 transition from type B to type A 194–195 trials 82, 83 type A 105
type B 16, 64, 65, 198 in V4 88 VEP studies 92–93 metacontrast contour masking 202 metacontrast functions 240f for foveally and peripherally presented stimuli 294 in normal and schizophrenic subjects 291, 295 oscillatory 53, 119 type B 44, 45, 56, 57, 62, 77, 79n, 105, 203 U-shaped 54, 57 metaphotic image 11, 12, 23 microgenesis 29, 32, 33, 277 microtomization see temporal slicing models lumping 181 unlumping 173, 174–178, 181, 192 models of visual masking 99–140 Anbar and Anbar’s 304–305, 306 Bachmann’s perceptual retouch 124–129, 177 boundary contour system (BCS) 44, 134–138, 139 Breitmeyer and Ganz’s sustained/transient dual-channels 164–168, 212, 215–216 Bridgeman’s Hartline-Ratliff inhibitory network 116–121, 139, 214, 302 Burr’s spatiotemporal receptive-fields 107–109 Di Lollo and Enns’s object substitution 129–132 Ganz’s interactive trace decay and random encoding time 110–112, 116 Kahneman’s impossible stroboscopic motion 102–104, 205 Matin’s three-neuron model 104–107, 204, 213 Navon and Purcell’s integration and interruption model 114–115, 116 object-substitution models 129–133 Öğmen’s retino-cortical dynamics (RECOD) neural network see RECOD model Reeves’s temporal integration and segregation model 112–114, 115 spatiotemporal sequence 102–109 Weisstein’s Rashevsky–Landahl two-factor neural network 121–123, 139, 212, 302 monocular vision 26 monoptic viewing 57–58, 197 motion blurring 219, 224 explosive 18, 38, 39f, 259 form-cue invariant 232 phi 17–18, 68, 232 split 38, 39, 102, 103, 104
stroboscopic 17, 18, 68, 102–103, 104, 199, 202 motion deblurring 219 effect of exposure duration on 224 inhibition-based models 227–231 mechanisms 223–231 and metacontrast 220–231 motion estimation–compensation models 224–227 motion estimation–compensation models 224–227 motion perception, metacontrast and 219–233 motion smear 220–223 motion-induced blindness 275–277, 299 multi-unit activity (MUA) 86, 87 multiplicative inhibition 180 multiplicative (shunting) equation 179, 183, 301–303 multiplicative model see shunting model Navon and Purcell’s integration and interruption model 114–115, 116 neural activity 284–285n neural-network models 214 neuroelectric response 159 neurological patients, visual masking in 289–292 neuromagnetic response 160 normalization 179 object location, unconscious processing of 259–264 object substitution 101, 247 and common-onset masking 132–133 and type B backward masking 129–132 object substitution models 129–133 object superiority 236f, 242, 250 see also connectedness; context effects; figural context object-based attention 243, 246, 248, 250 Öğmen’s retino-cortical dynamics neural network model see RECOD model onset–onset law 13, 47 oscillations, spatiotemporal 117 overtake hypothesis 11, 100, 101, 124–129 overtake–inhibition hypothesis 12, 14 paracontrast 40–43 criterion content in 40–41 dichoptic 43 as flicker suppression 9 history of 5–19 and intra-channel inhibition 191 metacontrast and 5–19 and reaction time 201 reaction times for target localization 201
target localization 260 and target-mask spatial separation 40 type A 40 type B 40, 106, 191 in V4 88 VEP studies 92–93 paracontrast suppression 176, 186 parvocellular/magnocellular pathways 141–164 afferent 142–143 parvocellular/P pathway 65, 108, 181–183 pattern channels, spatial response profiles 151–152 pattern discrimination tasks 200 pattern masking see masking by pattern pattern recognition 238 pattern thresholds 150, 216n Pearson r coefficient 118f percept-dependent activity 284n perception 4, 19, 28, 37 perception time 29 perceptual context, and grouping 236–243 perceptual latency 3 perceptual retouch model 124–129, 177, 249 perceptual stages 3 persistence 23, 24, 25, 26, 28, 113, 134, 162, 207, 219, 220 see also icon; iconic memory phenomenal consciousness 247, 272 phi motion 17–18, 68, 232 see also apparent motion; stroboscopic motion photopic luminance 190 positive after-image 11, 12, 20, 23 post-iconic processing 247 post-retinal cells 183 sustained 183 transient 184 post-retinal inhibitory interneurons 184 post-retinal network, parameters 308t posterior contralateral negativity (PCN) 271 preattentive process 28 previsible persistence 24, 26 primary sensation 10 psychiatric patients, visual masking in 289–292 pulvinar 258 Purkinje image see Bidwell’s ghost quadratic non-linearity, with threshold and persistence 182–183 rapid serial visual presentation (RSVP) 37 Rashevsky-Landahl equations 302 Rashevsky’s two-factor neuron model equations 122–123
reaction times 36, 66, 103, 154, 157f, 262 changes due to contour interactions 201 and cortical visual evoked potential (CVEP) 159 and metacontrast 45, 46 and neuromagnetic response 160 and paracontrast 262 and sustained/transient channels 154 to gratings 162, 163f to stimuli activating achromatic and chromatic systems 158–159 to target detection during metacontrast 260 to vertical sine wave gratings 158 unconscious priming effects 270 reading 27 reading disability, visual masking in subjects with specific 292–295 receptive fields 107, 175 see also center-surround organization receptor responses, interactions among 100 RECOD model 96, 124, 139, 168–185, 213–214, 216, 258, 259 architecture of 170, 172f explanations 186–211 mathematical basis 178–185 parameters of 307–308 parameters for the retinal network 307t post-retinal network 308t response to single moving dot stimulus 227–229 response to two-dot paradigm 229–231 simulation study 227–231 sub-cortical network 308t target recovery 260 unlumping 174, 175f recurrent see feedback reentrant see feedback Reeves’s temporal integration and segregation model 112–114, 115 reset mechanisms 215, 218n reset phase 169, 173 reticular complex 127 retinal cells with sustained activities 181–183 with transient activities 183 retinal locus 56, 57 retinal network parameters 307t retino-cortical dynamics neural network model see RECOD model retino-geniculate-cortical pathway 164, 216n ripple effect 217n rod–cone interactions 17, 59–66, 100 see also cone–cone interactions saccadic suppression 260, 298 schizophrenia 38, 287, 288, 290–291
scotoma 271, 298 scotopic luminance 190 seeing-more-than-there-is 25 sensation time 29 sensory persistence hypothesis 20 sensory response persistence, and temporal integration in vision 23–29 sequential blanking 15f, 30f, 67–68, 93, 207, 223 shape discrimination tasks 90 shifter-circuit model 224 shine-through 53, 69–71, 79 see also feature inheritance shunting (multiplicative) equation 179, 183, 184, 301–303 shunting model 301 signal-to-noise ratio 126, 214 single-transient masking 107, 203 size see stimulus size SOA law see onset–onset law space-based attention 243–246, 248, 250 spatial center–surround organization 181–182 spatial frequency 148, 149, 158, 203 and latency 160, 203 see also stimulus size spatial frequency contrast sensitivity 153 spatial inhibition see lateral inhibition spatiotemporal contrast sensitivity 147–155 spatiotemporal receptive-field model 107–109 spatiotemporal sequence models 102–109 specific flash suppression 276 specific reading disability (SRD) 287, 288 standing-wave illusion 19, 57, 68–69 step visual response function (VRF) 304 stimulus contrast 47–53 polarity 54–55 stimulus duration 26, 47–53 stimulus energy 48 stimulus intensity 47–53 stimulus location 56–57 stimulus luminances 77, 115, 204 stimulus onset asynchrony (SOA) 8, 32, 42, 46, 48, 49, 52 and metacontrast 38, 45 and paracontrast 40, 41 stimulus orientation 55–56 stimulus parameters 32, 82 stimulus separation 56–57 stimulus size 55–56, 57, 78 stimulus termination asynchrony (STA) 47, 48 stimulus wavelength effects on metacontrast 64–65 variables 59–66
stroboscopic motion 17, 18, 68, 102–103, 104, 199, 202 see also apparent motion; phi motion sub-cortical network 184–185 sub-cortical network, parameters 308t subjective contrast matching 42 superior colliculus 258, 264, 271 suppressive effects 105 surface dynamics 174–178 surface networks 185 sustained, and transient channels 65, 145–164 sustained-on-sustained inhibition 166 see also intra-channel inhibition sustained-on-transient inhibition 184, 209 see also inter-channel inhibition sustained/transient dual-channel model, Breitmeyer and Ganz 164–168, 212, 215–216 tachistoscopic techniques 4 Talbot brightness 187, 188 target blocking 306 target localization, stimulus configurations used in experiments 261 target recovery 137, 187, 208, 254–259, 278–281 target suppression 107, 203 target visibility 209–211, 255 target–mask asynchrony 166 target–mask configuration 39f target–mask percepts 39f target-mask spatial separation 40, 56–57, 77 and metacontrast 56–57 and paracontrast 40 and retinal locus 57 target-to-mask (T/M) energy ratio 123 targets definition 2 spatially embedding within a larger target gestalt 241–243 spatiotemporally grouping with mask display 236–241 TMS and the disinhibition or recovery of visually masked targets 278–281 task parameters 32, 55 and criterion content 43–47, 55 techniques brightness-matching 41, 154 forced-choice 42 rating 41 temporal adaptation 181 temporal cortex, backward pattern masking in 90–92 temporal integration 27, 91, 96, 180, 304 see also Bloch’s law temporal integration and segregation model 112–114
temporal parameters 3 temporal resolution 66, 97, 115 temporal slicing 3 time perception 29 time sensation 29 timing parameters 32 transcranial magnetic stimulation (TMS) 298 and disinhibition or recovery of visually masked targets 278–281 masking of visual targets 277–281 relation between visual masking and 282–283 suppression of visual targets 277–278 transduction latency 3 transient, and flicker adaptation 161–164 transient channels 131, 187, 203, 207 sustained and 145–164 transient M-channel deficit, in specific disability 292–294 transient-on-sustained inhibition 66, 173, 197, 208, 209 see also inter-channel inhibition transient-on-transient inhibition 201 Troxler effect 277, 297 two-factor neural network model, Weisstein’s Rashevsky–Landahl 121–123, 139, 212 two-process models, general criticism of 115–116 two-pulse fusion threshold 206 two-transient paradigm 203 type of masking see masking function U-shaped masking see masking function unconscious priming by color 267–269, 284 by form 265–267, 284 unconscious processing 253–285 and attention 269–272
of object location 259–264 of object-identity information 264–269 unlumped model, with contour and surface networks 184 V1 see visual cortex, primary V4, paracontrast and metacontrast in 88 ventral pathway 143, 264 vision, conscious and unconscious 299 visual cortex 253 extrastriate 93 primary 82–88, 208, 242, 243, 254, 266 striate 93 visual evoked potential (VEP) 289 visual masking definition 2 methods of 33–35 in neurological and psychiatric patients 289–292 relation between transcranial magnetic stimulation (TMS) and 282–283 visual pattern masking, neurobiological correlates of 81–97 visual reaction time 155–161 visual response function (VRF) 304, 305 visual targets, masking by transcranial magnetic stimulation 277–281 visually evoked potentials (VEPs) 92–93 wavelength 25, 205, 268 see also color Weber ratio 189 Weisstein’s Rashevsky–Landahl two-factor neural network model 121–123, 139, 212, 302 word-superiority effect 236 X neurons (sustained) 104, 140n Y neurons (transient) 104, 140n
Note: Locators followed by f indicate figures (86f), n indicates chapter notes (79n), and t indicates tables (308t).