Advances in Cognitive Science Volume 2
Edited by
Narayanan Srinivasan
Bhoomika R. Kar
Janak Pandey
Copyright © Narayanan Srinivasan, Bhoomika R. Kar and Janak Pandey, 2010 All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage or retrieval system, without permission in writing from the publisher. First published in 2010 by Sage Publications India Pvt Ltd B1/I-1, Mohan Cooperative Industrial Area Mathura Road, New Delhi 110044, India www.sagepub.in Sage Publications Inc 2455 Teller Road Thousand Oaks, California 91320, USA Sage Publications Ltd 1 Oliver’s Yard, 55 City Road London EC1Y 1SP, United Kingdom Sage Publications Asia-Pacific Pte Ltd 33 Pekin Street #02-01 Far East Square Singapore 048763 Published by Vivek Mehra for Sage Publications India Pvt Ltd, typeset in 10/13 pt. ITC Stone Serif by Star Compugraphics Private Limited, Delhi and printed at Chaman Enterprises, New Delhi. Library of Congress Cataloging-in-Publication Data Available
ISBN: 978-81-321-0444-5 (HB) The Sage Team: Rekha Natarajan, Meena Chakravorty, Amrita Saha and Trinankur Banerjee
Contents

List of Figures
List of Abbreviations
Preface

Section I Learning and Memory
Introduction
Chapter 1 Study of Basic Associative Processes Contributes to Our Understanding in Cognitive Science
J. Bruce Overmier and John M. Holden
Chapter 2 Minimizing Cognitive Load in Map-based Navigation: The Role of Landmarks
Kazuhiro Tamura, Bipin Indurkhya, Kazuko Shinohara, Barbara Tversky, and Cees van Leeuwen
Chapter 3 Quantitative and Qualitative Differences between Implicit and Explicit Sequence Learning
Arnaud Destrebecqz
Chapter 4 Behavioural Study of the Effect of Trial and Error versus Supervised Learning of Visuo-motor Skills
Ahmed, Raju S. Bapi, V. S. Chandrasekhar Pammi, K. P. Miyapuram and Kenji Doya
Chapter 5 ACE (Actor–Critic–Explorer) Paradigm for Reinforcement Learning in Basal Ganglia: Highlighting the Role of the Indirect Pathway
Denny Joseph, Garipelli Gangadhar, and V. Srinivasa Chakravarthy

Section II Perception and Attention
Introduction
Chapter 6 Peripersonal Space Representation in Humans: Proprieties, Functions, and Plasticity
Elisabetta Làdavas and Andrea Serino
Chapter 7 A Neurophysiological Correlate and Model of Reflexive Spatial Attention
Anne B. Sereno, Sidney R. Lehky, Saumil Patel, and Xinmiao Peng
Chapter 8 Effects of Emotions on Selective Attention and Control
Narayanan Srinivasan, Shruti Baijal, and Neha Khetrapal
Chapter 9 Modelling Neuropsychological Deficits with a Spiking Neural Network
Eirini Mavritsaki, Glyn W. Humphreys, Dietmar Heinke, and Gustavo Deco

Section III Time Perception
Introduction
Chapter 10 Continuity of Subjective Experience Across Eye Movements: Temporal Antedating Following Small, Large, and Sequential Saccades
Kielan Yarrow
Chapter 11 Duration Illusions and What They Tell us about the Brain
Vani Pariyadath and David M. Eagleman
Chapter 12 Implicit Timing
Trevor B. Penney, Latha Vaitilingam, and Siwei Liu
Chapter 13 Localization and Dynamics of Cerebral Activations Involved in Time Estimation: Studies Combining PET, fMRI, and EEG Data
Viviane Pouthas

Section IV Language, Cognition, and Development
Introduction
Chapter 14 Effects of Remediation on Auditory Temporal Processing in Dyslexia: An Overview
Bhoomika R. Kar and Malini Shukla
Chapter 15 Brain Networks of Attention and Preparing for School Subjects
Michael I. Posner and Bhoomika R. Kar

About the Editors and Contributors
Subject Index
Name Index
List of Figures

Chapter 1
1.1 Illustration of behavioural stream and three theories of action of reinforcers. SD = discriminative stimulus, R = response, SR = reinforcer
1.2 Illustration of a transfer-of-control procedure
1.3 Latency to perform an avoidance response as a function of stimulus type and test day
1.4 Illustration of a standard discriminative conditional choice task employing common outcomes, and a similar task employing differential outcomes
1.5 Performance in a bi-conditional discrimination task under differential and common outcomes procedures
1.6 Illustration of the transfer-of-control procedure employed by Kruse et al. (1983)
1.7 Illustration of an experimental design for showing inter-problem transfer of control of choice
1.8 Illustration of a stimulus equivalence training procedure similar to that employed in Joseph et al. (1997)
1.9 Data on the short-term working memory of normal older men and older men with Korsakoff’s disease

Chapter 2
2.1 Examples of map displays used in Experiment 1: The global landmark conditions (global–inside: A, global–outside: B, and without–global landmark: C)
2.2 Example sequences of displays used in Experiment 1 (Aligned and Misaligned versions of a global landmark inside two local landmark conditions)
2.3 Mean response speed and standard errors for the interaction of the global landmark and Alignment of the test display in Experiment 1
2.4 Mean response speed and standard errors for the interaction between local landmark and alignment test display conditions in Experiment 1
2.5 Mean response speed and standard errors for the interaction between the arrangement of landmarks (fixed versus random), number of local landmarks, and the Alignment of the test display in Experiment 1
2.6 Example of maps used as stimuli in Experiment 2
2.7 Mean response speed and standard errors for the interaction between type of landmark and Alignment of the test display in Experiment 2
2.8 Mean response speeds and standard errors between Arrangement and type of landmark conditions in Experiment 2

Chapter 3
3.1 Mean reaction times during the 15 blocks of the SRT task plotted separately for participants trained with a 0 ms or 1000 ms RSI. Block 13 is the transfer block during which another sequence was used.
3.2 Mean recognition scores in RSI 0 and RSI 1000 conditions when either a constant (CST) or variable (VAR) RSI was used at test.
3.3 RTs recorded for old and new fragments presented in the recognition task to participants trained with a 0 ms or a 1000 ms RSI and plotted separately for fragments presented at test with either a constant (CST) or variable (VAR) RSI

Chapter 4
4.1 Sequence Learning Tasks
4.2 Summary snapshot of a subject (AS) from “mild-learning” group of 1 × 12 experiment
4.3 Summary snapshot of a subject (AS) from “mild-learning” group of 2 × 6 experiment
4.4 Summary snapshot of a subject (ST) from “mild-learning” group of 1 × 12 experiment
4.5 Summary snapshot of a subject (JY) from “mild-learning” group of 2 × 6 experiment
4.6 Summary snapshot of a subject (BN) from “continued-learning” group of 1 × 12 experiment
4.7 Summary snapshot of a subject (BN) from “continued-learning” group of 2 × 6 experiment

Chapter 5
5.1 ACE Architecture
5.2 Simple muscle model system
5.3 (a) Architecture of the actor network (b) The 2D arm and the targets to be reached
5.4 Architecture of the critic network
5.5 Architecture of the explorer
5.6 (a) STN–GPe neuron pair illustrating the excitatory and inhibitory connections, (b) Network model of STN–GPe loop with lateral connections
5.7 Dynamics of the STN–GPe Loop: Three characteristic patterns of activity in the STN–GPe layer – (a) Uncorrelated activity, (b) Travelling waves, and (c) Clustering
5.8 Snapshots of STN activity for various values of DNe: (a) DNe = 50; observed E-dim ~ 96, (b) DNe = 20; observed E-dim ~ 48, (c) DNe = 5; observed E-dim ~ 15. There is a consistent decrease in E-dim with decreasing DNe
5.9 Output of the Critic for different targets, with the x–y plane representing [ga, gb] values and the z-axis representing the Value, Q
5.10 (a) The dynamics of the model before learning, E-dim = 93, (b) The dynamics of the model after learning for eight epochs, E-dim = 57, (c) The dynamics of the model at the end of learning, E-dim = 5
5.11 The relation between norepinephrine (DNe) in the STN–GPe layer and d
5.12 Changes due to dopamine reduction

Chapter 7
7.1 Typical reflexive spatial attention task used to elicit IOR
7.2 Typical behavioural results obtained in a reflexive spatial attention task
7.3 Behavioural results of monkeys in a reflexive spatial attention task
7.4 Activity of SC neurons during a reflexive spatial attention task
7.5 Stimulus repetition suppression effects in the ventral stream during a serial recognition task
7.6 Stimulus repetition suppression effects in the ventral stream during delayed match-to-sample tasks
7.7 Schematic localization of visual pathways in the macaque brain
7.8 Schematic diagram of the fixation task (one location, eight shapes)
7.9 Activity of AIT and LIP shape selective neurons during a passive fixation task with repeated stimulus presentations within a trial
7.10 Repetition suppression effects in AIT and LIP
7.11 Repetition suppression effects in LIP across six blocks of trials
7.12 Proposed network model of reflexive spatial attention
7.13 Simulated outputs of the model at three different CTOAs during a reflexive attention task
7.14 Simulation of reflexive spatial attention and the influence of shape

Chapter 8
8.1 Search times for sad and happy schematic faces in a detection task
8.2 Search times for sad and happy schematic faces in a discrimination task
8.3 The magnitude of flanker compatibility
8.4 N100 component (90 to 140 ms; frontal sites) depicts increased amplitude for happy target faces compared to threatening target faces
8.5 N2 component (220 to 260 ms; central midline sites) is locked to stimulus onset and reflects processing of conflict
8.6 ERN component (60 to 90 ms; central midline sites) is locked to onset of response and reflects processing of errors

Chapter 9
9.1 The architecture of the sSoTS model
9.2 The mean correct reaction times (RTs, in ms) for the unlesioned (dotted lines) and lesioned versions (solid lines) of sSoTS
9.3 The mean percentage miss responses for the lesioned versions of sSoTS (data from Simulation 1)
9.4 Example displays from Simulation 2
9.5 The mean correct RTs (ms) from Simulation 2
9.6 (a) The mean correct RTs (ms) and (b) the mean percentage miss responses for Simulation 3 (variation in the NMDA parameter)

Chapter 10
10.1 Schematic of the experimental task in saccade and control conditions
10.2 Time estimation data

Chapter 11
11.1 The debut effect
11.2 Repeated stimuli subjectively proliferate less than random stimuli
11.3 Proposed repetition suppression diagnostic tool

Chapter 12
12.1 to 12.5 SOA Polyrhythmic Sequence Mean Stop-RT (ms) averaged across 20 participants in a bimodal single SOA experiment from our lab Performance on two SOA Polyrhythmic Sequence Examples of typical passive oddball paradigms used in mismatch negativity (MMN) studies of interval timing

Chapter 13
13.1 Contingent Negative Variation (CNV)
13.2 Generalization gradients
13.3 Peak latency
13.4 Mean CNVs
13.5 Three hypothetical models of the neural mechanisms for timing
13.6 Attention conditions
13.7 Estimation of Activations due to length of duration
13.8 Waveforms
13.9 Time course activity
List of Abbreviations

AIT  anterior inferotemporal cortex
ANOVA  Analysis of Variance
ANT  Attention Network Test
AR  Adapted Response
ARP  Associative Reward Prediction
BG  Basal Ganglia
BOLD  Blood Oxygenation Level-Dependent
CBCS  Centre for Behavioural and Cognitive Sciences
CS  Cognitive Science
CO  Common Outcomes
CNV  Contingent Negative Variation
CFFT  Critical Flicker Fusion Threshold
CTOAs  Cue–Target Onset Asynchronies
DO  Differential Outcomes
DRD4  Dopamine 4 Receptor Gene
DLPFC  Dorso–Lateral Prefrontal Cortex
EEG  Electroencephalography
ERN  Error-Related Negativity
ERPs  Event-Related Potentials
fMRI  functional Magnetic Resonance Imaging
GPe  Globus Pallidus externa
GPi  Globus Pallidus interna
IOR  Inhibition of Return
IFG  Inferior Frontal Gyrus
ISI  Inter-Stimulus Interval
LIP  Lateral Intraparietal Cortex
LED  Light-Emitting Diode
MMN  Mismatch Negativity
MOBS  Modified Binary Search
mRT  modulatory Response Time
NE  Norepinephrine
OP  Omission Potential
PREP  PASS Reading Enhancement Program
PA  Performance Accuracy
PSE  Point of Subjective Equality
PPC  Posterior Parietal Cortex
PMC  premotor cortex
pre-SMA  pre-Supplementary Motor Area
RBF  Radial Basis Function
RT  Reaction Time
RF  Receptive Field
RL  Reinforcement Learning
thr  relative threshold
RT  Response Time
RSI  Response-Stimulus Interval
RBD  Right Brain Damaged
SOC  Second Order Conditional
SAIM  Selective Attention for Identification Model
SRT  Serial Reaction Time
SCRs  Skin Conductance Responses
SLI  Speech Language Impairment
sSoTS  spiking Search over Time and Space
SOA  Stimulus Onset Asynchrony
S-R  Stimulus-Response
Stop-RT  Stop-Reaction Time
SNc  Substantia Nigra pars compacta
SNr  Substantia Nigra pars reticulata
STN–GPe  Subthalamic Nucleus and Globus Pallidus externa
SC  Superior Colliculus
STG  Superior Temporal Gyrus
TD  Temporal Difference
TOJ  Temporal Order Judgement
VTA  Ventral Tegmental Area
Preface
Cognitive Science is an interdisciplinary enterprise interfacing with psychology, neuroscience, computer science, philosophy, and linguistics. Cognitive science seeks to answer many fundamental and long-standing questions about the nature of mind and mental processes. In the last few decades, it has established itself as a truly interdisciplinary science. Given the current advances, it is expected that it will become even more interdisciplinary.

Cognitive Science is not yet a flourishing discipline in India. Recently, the Department of Science and Technology of the Government of India has designated Cognitive Science as a fourth pillar of knowledge along with nano-, bio-, and information technologies and has started major research initiatives in Cognitive Science. Under the UGC Scheme of Universities with Potential for Excellence, the University of Allahabad was selected for developing “Behavioural and Cognitive Sciences” as an Island of Excellence. As a follow-up, the University established the Centre of Behavioural and Cognitive Sciences (CBCS) in 2002 to provide education of merit and distinction in line with new developments and challenges, and to advance scientific knowledge through basic and applied research, teaching, and outreach programmes. The objectives of the academic programme are to provide comprehensive training, to prepare students for professional, research, or academic careers, and to develop a richer understanding of the mental processes and neural mechanisms underlying cognition using behavioural, computational and neurophysiological techniques. The faculty and students at the Centre are involved in research programmes pertaining to vision, attention, perception, linguistics, cognitive neuroscience, consciousness, cognitive disorders, cognitive modelling and human computer interactions.
There is a strong emphasis on research projects and exposure to various theoretical and experimental studies in Cognitive Science. The Centre and the University provide an ideal environment for study and research in Cognitive Science.

The first International Conference on Cognitive Science at the Centre of Behavioural and Cognitive Sciences was held in December 2004, followed by the Second International Conference on Cognitive Science on December 10 to 12, 2006. The mission of the conferences was to explore the truly interdisciplinary nature of cognitive science and create awareness of cognitive science among the interested students and researchers. The conferences served as the meeting point for scientists from interfacing disciplines like psychology, neuroscience, computer science, linguistics, and philosophy. The selected papers of the first conference are published by SAGE, New Delhi in a book titled “Advances in Cognitive Science: Volume 1”. The first volume consisted of twenty-seven chapters organized into six sections (Cognitive Processes, Cognitive Neuroscience, Computational Modelling, Culture and Cognition, Cognitive Development and Intervention, and Consciousness).

The Second International Conference in 2006 comprised three keynote lectures, twenty-three oral presentations and thirty-four poster presentations. The conference was inaugurated by Prof. K. Ramakrishna Rao, Chairman, Indian Council of Philosophical Research, Delhi, and chaired by Prof. R. G. Harshe, Vice-Chancellor, University of Allahabad. The keynote lectures were presented by three prominent experts in cognitive science: Prof. Bruce Overmier, University of Minnesota, USA, Prof. Ira Noveck, Centre National de la Recherche Scientifique, France, and Prof. James Georgas, University of Athens, Greece. There were oral sessions and symposia on cognitive neuroscience, computational modelling, language and cognition, time perception, mind and consciousness, and attention. In addition to faculty members from various institutions in India and abroad, doctoral research scholars and master’s students attended the conference.

Based on initial review of abstracts and papers, the editors requested the selected authors to submit full papers for the volume. All the editors reviewed the papers and fifteen contributions were selected for publication in the current volume. The contributors are senior as well as young cognitive scientists from various countries including USA, Canada, UK, France, Belgium, Italy, Japan, Spain, Singapore, and India. The volume contains research articles addressing the challenges faced in cognitive science requiring cross-linking of different interfacing disciplines like psychology, neuroscience, and computer science. The recent findings from cognitive science presented in the volume will serve as a useful resource for scientists working in the area.
The volume represents a good sample of the current trends in major sub-disciplines in cognitive science. It contains four sections: (a) Learning and Memory, (b) Vision and Attention, (c) Time Perception, and (d) Language, Cognition, and Development. The first section focuses on basic cognitive processes of learning and memory including simple associative processes, spatial memory, sequence learning, and implicit learning. The second section focuses on vision and attention with contributions on multisensory spatial perception, basic attentional processes, and modelling of attention. The third section contains four chapters focusing on time perception. The fourth section on language, cognition, and development includes two chapters on both normal and abnormal development especially focusing on language development and related cognitive abilities.

We would like to acknowledge the efforts and support of office staff especially M. P. Srivastav, Puneet Srivastav, and Shabeeh Abbas who have contributed to the conference and preparation of this volume. We would like to thank all the colleagues for their enthusiastic support. We would like to thank all the CBCS students as well as research scholars from the Department of Psychology who worked very hard for the conference. We thank Sage Publications for bringing out this volume.

Narayanan Srinivasan
Bhoomika Kar
Janak Pandey
Section I
Learning and Memory
Introduction
Learning and memory are critical for adaptive behaviour. Learning refers to the acquisition of information and skill. Memory is a process by which learned information is stored for later use. The chapters in this section focus on different aspects of learning and memory.

One basic aspect of learning and memory is associative learning, which is based on the assumption that ideas and experiences can be linked to each other to enhance the learning process. Reinforcement learning involves activation of brain mechanisms that increase the likelihood that a response will occur. Overmier and Holden’s chapter focuses on associative processes that underlie choice and decision-making, which can be influenced by simple associative mechanisms. Theories like associative theories of learning, stimulus-response theory, Mowrer’s two-process theory, and modern expectancy theory are discussed. The authors propose that the expectancies of reward control choice behaviour and may probe separate memory mechanisms. They also argue that animal studies on learning have implications for the role of associative processes in the treatment of disorders like Down’s syndrome and alcohol-related dementia. Overmier and Holden discuss the relevance of the study of associative processes for understanding human cognition and behaviour in the context of choice behaviour and decision-making.

An important aspect of learning and memory is their role in spatial navigation. Navigational patterns depend on the learning and formation of cognitive maps (Solso et al., 2005). The chapter by Tamura and colleagues discusses the role of learning landmarks in navigation decisions. They discuss the results of two experiments designed to study how landmarks facilitate map reading. Such an inquiry also gives an insight into the mental processes that are involved in navigation decisions in the context of mental rotation and reorientation. Left-right orientation is an important factor in navigation.
The authors have investigated the differences between local and global landmarks in their efficacy in facilitating navigation decisions. They discuss the effect of misalignment between the orientation of the map and the navigator on navigation decisions and the role of landmarks in reducing the misalignment. They also find that global landmarks, but not local landmarks, facilitate navigation decisions, and that global landmarks provide orientation cues to facilitate left-right orientation. The study has implications for improving navigation by adding more global landmarks that facilitate mental rotation and reorientation.

Distinctions about how we learn and retain information have been made in terms of implicit and explicit acquisition of information. Implicit learning has been characterized as a passive process, whereas explicit learning has been characterized as an active process in which people seek out the structure of any information that is presented to them. Experimental and neuroimaging studies suggest that implicit and explicit learning and memory operate through distinct mechanisms (Kluwe et al., 2003; Gazzaniga et al., 2002). Research on implicit learning and memory has embodied three fundamental issues in cognitive science, namely consciousness, mental representation, and modularity of the cognitive system. The chapter by Destrebecqz raises methodological and theoretical issues associated with implicit learning. The notion of implicit learning as unconscious learning is discussed in the context of sequence learning studies based on serial reaction time tasks. Tasks that have been used in earlier studies to study implicit learning or to measure conscious knowledge were actually dependent on both implicit and explicit components. Fluency, for example, constitutes a potential bias in conscious knowledge assessment, for it may reflect implicit rather than explicit influences. The author discusses these issues on the basis of the results of a sequence learning experiment in which the implicit influence of perceptual and motor fluency was controlled in a recognition task. Results of this experiment were found to be consistent with the idea that slowing the pace of the learning phase increases explicit knowledge acquisition. The dissociation between priming and recognition shown by previous studies was not replicated in this study. The author does not argue for two independent learning systems but rather for an interaction between cortical (primarily involving the anterior cingulate) and subcortical (primarily involving the striatum) structures. The author concludes that conscious knowledge affects performance and influences behaviour, and does not support the notion that sequence learning is based on unconscious learning mechanisms.
Skill learning involves the acquisition and improvement of mental or physical abilities through practice, and human skill learning has been studied extensively in behavioural and neuroimaging studies (Perez et al., 2007; Grafton et al., 1998; Hikosaka et al., 1996). The brain bases of implicit and explicit learning have been examined in neuroimaging studies focusing on medial temporal structures for explicit learning and memory and on the basal ganglia for implicit learning. Skill learning studies have focused on issues such as inter-manual transfer, chunking and co-articulation effects. The chapter by Ahmed and his colleagues focuses on mechanisms of skill learning. However, the effects of the learning paradigm adopted during sequence learning, such as explicit guidance and trial and error, have not been specifically investigated. The authors designed a task to tap supervised learning and trial and error learning. The authors propose that the time course of chunk formation could be a viable measure for demarcating a subject’s extent of learning.

Learning by reinforcement has been investigated in the context of the neural mechanisms, focusing on the neural pathways that modulate the reward information. The basal ganglia and their connections have been identified as the prominent neural substrate for reinforcement learning. The chapter by Joseph, Gangadhar and Chakravarthy highlights the role of the basal ganglia in exploration, a component of reinforcement learning. They present a comprehensive model of the basal ganglia involving every nucleus of the basal ganglia, and the anatomical substrates of the various components of reinforcement learning are discussed. The model is proposed to identify and explain the neural basis of exploratory behaviour. The chapter describes the architecture of the proposed model with its three components: actor, critic and explorer. In the model, “Actor” represents the sensorimotor cortical pathway, “Critic” represents the cortico-striatal pathway and “Explorer” represents the subthalamic nucleus-globus pallidus pathway. The model was trained to perform a simple behaviour, that is, to learn to reach the target. Overall, the model explains the neural basis for the acquisition of motor skill and highlights the dynamic nature of the neural circuit, which enables the organism to respond to changes in the environment. The implications of this model with respect to disorders like Parkinson’s disease are also discussed.
References

Gazzaniga, M. S., R. B. Ivry, and G. R. Mangun. 2002. Cognitive Neuroscience. New York: W. W. Norton & Company.
Grafton, S. T., E. Hazeltine, and R. B. Ivry. 1998. “Abstract and effector-specific representations of motor sequences identified with PET”, Journal of Neuroscience, 18, 22: 9420–428.
Hikosaka, O., S. Miyachi, K. Miyashita, and M. K. Rand. 1996. “Learning of sequential procedures in monkeys”, in J. R. Bloedel, T. J. Ebner, and S. P. Wise (eds), The Acquisition of Motor Behaviour in Vertebrates, pp. 303–17. Cambridge: MIT Press.
Kluwe, R. H., G. Luer, and F. Rosler. 2003. Principles of learning and memory. Boston: Birkhauser Verlag.
Perez, M. A., S. Tanaka, S. P. Wise, N. Sadato, H. C. Tanabe, D. T. Willingham, and L. G. Cohen. 2007. “Neural substrates of intermanual transfer of a newly acquired motor skill”, Current Biology, 17: 1896–902.
Solso, R. L., M. K. MacLin, and O. H. MacLin. 2005. Cognitive Psychology. New York: Pearson Education Inc.
Chapter 1
Study of Basic Associative Processes Contributes to Our Understanding in Cognitive Science*
J. Bruce Overmier and John M. Holden

* Study of Basic Associative Processes Contributes to Our Understanding in Cognitive Science. Supported by grants from NSF and NICHHD to the Center for Cognitive Science, University of Minnesota.

Introduction

We would like to illustrate, from our work and that of our colleagues, a basic research finding that shows that the cognitive processes presumed to underlie choice and decision-making can be dramatically influenced by simple associative mechanisms. Moreover, we want to show that this same basic animal research can be translated into applications with human patients and that such translation is taking place today. It is a long and complicated story, but not uninteresting because it reflects how our psychological science is self-correcting and how with that self-correction come new insights and new treatment options. In this story, we shall go from learning theory to the animal laboratory, to tests with normal persons, to applications with clients. We will skip some of the steps and details, but all the links are there.

Let us begin our research presentation with some reflections on early theory and its transformation. The behaviouristic associationism that so dominated Western research and thinking in the first half of the 20th century springs from the research and theorizing of Thorndike (1911). Thorndike argued that learning was the development of associations between a stimulus (environment) and a response (action) that was “stamped in” because the sequence was followed by a reinforcer (see Figure 1.1A). For Thorndike, the reinforcer was a catalyst establishing the stimulus-response (S-R) learning, but the reinforcer was not itself part of what was learned (see Figure 1.1B). According to the theory, it really did not matter what the particular reinforcer was, or even if the same reinforcer was used all the time; the behaviour in question just had to be reinforced.

Figure 1.1 Illustration of behavioural stream and three theories of action of reinforcers. SD = discriminative stimulus, R = response, SR = reinforcer

One fascinating thing about this theory is its dominance despite the fact that it conflicts with our private conceptions of “why” we do things; introspection suggests that we do them to get to a particular goal, rather than as goalless automatons. Nonetheless, Thorndike’s theory, with Spence’s (1937) extension, was very successful in accounting for many observed phenomena of learning and choice behaviour and made interesting predictions (for example, both transposition and when it would fail).

Theorists like Tolman (1945) tried to incorporate learning about goals (“cathexes”) into the then-current theories of learning. They were not very successful in this in their time, but they did get later theorists thinking about the functions of reinforcers and the outcomes of choices. Perhaps the best known of these attempts is Mowrer’s two-process theory (1947). In part, Mowrer invoked this theory in an attempt to explain avoidance behaviour, that is, behaviour which prevented the occurrence of an aversive event (for example, a rat learning to jump a barrier in a shuttle-box in order to avoid an electrical shock, the upcoming presentation of which was signalled by a tone). The theoretical question of interest was this: since the result of a successful avoidance response was a non-event (for example, the tone is turned off and there is no delivery of shock), what was motivating avoidance behaviour?
Basic Associative Processes
9
The two-process theory invokes a classically conditioned mediating state between the stimulus and the response. Mowrer argued that behaviour was the product of two parallel learning processes. The first was a Pavlovian association between the stimulus (environment) and the scheduled outcome event that established an anticipatory state (the anticipatory state standing between the environment and action was thought to motivate behaviour). The second was a Thorndikian strengthening of the response either by the reinforcer outcome or a change in the outcome-based anticipatory state. Thus, in the example mentioned above, the subject learns two things: (a) the tone signals upcoming shock, and thus the tone evokes an unpleasant anticipatory state (that is, fear) through classical conditioning; and (b) making the response of jumping the barrier turns off the tone and reduces the fear, thereby reinforcing that instrumental response. There is another way to talk about this theory and what it accomplished. One of the basic dissatisfactions with Thorndikian S-R behaviourism was that it clashed with our own “causal explanations” as to why we do things. That is, we think we do things in order to get certain goals or achieve certain outcomes, while the Thorndikian view has us as goalless automatons, who do not know why we do things. In one sense, Mowrer’s theory and its extensions brought the goals back into the picture. The anticipatory motivational state, brought about through Pavlovian conditioning, was based on the goal. Moreover, this motivational state mediated between the stimulus and the behavioural act. That is, we are talking about anticipations of outcomes “causing” the behavioural act (see Figure 1.1C). Mowrer thought that the key property of these mediating anticipations was one of “energizing” behaviour in a relatively non-specific manner. 
Tests of this idea—that it was a Pavlovian-conditioned mediating state that was generating the behaviour—used what is called a transfer of control design that had three phases (see Figure 1.2):
1. An Instrumental Phase: in which an instrumental response was learned.
2. A Pavlovian Phase: in which a stimulus–outcome relation was learned. These first two phases can be in either order.
3. A Test Phase: in which the Pavlovian conditioned stimulus (CS) was presented in the instrumental context. The question of interest was whether the Pavlovian CS could evoke or control the instrumental response.
Aspects of this two-process theory are still popular today, especially as they account for relations among trauma, fears, phobias, and avoidant defensive behaviours. For Mowrer, the key property of the anticipatory state was as a behavioural mediator that provided non-specific motivation for actions. Figure 1.3 contains data from an experiment that indicates that a separately established “fear-evoking” CS can immediately evoke the trained instrumental response—even after the original discriminative stimulus has been extinguished.
Figure 1.2 Illustration of a transfer-of-control procedure
Some years ago within this Mowrerian tradition taught by R. L. Solomon, the primary author was led to ask: “Are the conditioned fears of different things different?” Not quantitatively different, as in Mowrer’s theory, but rather, qualitatively different? And if so, what would be the implications of the qualitative difference? At the same time, our colleague Milton Trapold, one of Kenneth Spence’s students, asked a similar question about the then hypothesized “fractional anticipatory responses” that Hull (1951) and Spence (1956) had argued antedated rewards as a result of a conditioned association between the discriminative stimulus and the reward. Together, we theorized that the hypothesized, association-based, conditioned anticipatory mediating state was not merely motivating as Mowrer suggested but rather guiding the selection of the behaviour. That is, we speculated that the mediator had cue properties (Trapold and Overmier, 1972). Indeed, we thought these cue properties likely to be more important than any motivational properties (see Figure 1.1D). Although the idea was not entirely new (for example, the “sg” in Hull’s proposed “rg-sg” mechanism), we pushed the idea to its logical conclusion. For example, we argued that the conditioned mediating state was specific to the particular reinforcer or “outcome” anticipated and that it was as distinctive as that outcome. We even argued that it was possible—even likely—that the mediator had only these “cue” properties, rather than motivating properties. For this reason, we actually referred to the conditioned anticipatory state as an expectancy—with the quasi-cognitive connotations intended. Now, this little change in thinking may not seem significant, but we propose to show that it is quite significant for research and practice. So, how would one test this new conception about the possible cue properties of conditioned anticipatory mediating states or, as we called them, “expectancies”? 
Figure 1.3 Latency to perform an avoidance response as a function of stimulus type and test day
Source: From Bull, J. A. and Overmier, J. B. 1968. Transfer of control of avoidance is not dependent upon the maintenance of the original discriminative response. Proceedings 76th Annual Convention of the American Psychological Association.
Note: Each point represents 40 responses.
If expectancies of outcomes have cue properties, then we should be able to show that the
supposed cue properties can guide behaviour. The best task in which to show the existence of cue properties is the conditional discriminative choice task. An example would be in a T-maze or Skinner’s operant chamber with two or more alternative responses. Let us describe the traditional way that instrumental discriminative choice learning tasks are structured, using S for discriminative stimulus, R for choice response, and O for the outcome event. Then, we will contrast that traditional method with our test for cue properties of expectancies of particular reinforcers—a test procedure that we call the differential outcomes (DO) procedure. In the traditional conditional discriminative choice task (see Figure 1.4A), in the presence of one stimulus, S1, choice of a response to the left, R1, results in the usual common reinforcer—perhaps a sweet pellet for a rat; choices of R2 yield no reinforcer. In the presence of S2, choices of a response to the right, R2, also result in a reinforcer, and it is the same sweet pellet reinforcer, while now choices of R1 yield nothing. Note that following either discriminative stimulus, correct choices produce the same common reward. We call this the common outcomes (CO) procedure because the reward is common to either correct choice. And, typically, animals, children, and even college students can learn conditional discriminations this way, although not always easily when the stimuli are complex.
Figure 1.4 Illustration of a standard discriminative conditional choice task employing common outcomes, and a similar task employing differential outcomes
In our proposed DO procedure (see Figure 1.4B), the organism is required to learn exactly the same S-R relations. That is, the choice problem that must be solved is identical. But, in contrast, there is a difference after the choosing is done. The difference is that each type of correct stimulus-response relation is followed by its own, unique reward. Thus, in the DO procedure, in the presence of one stimulus, S1, choice of a response to the left, R1, results in one reinforcer—perhaps an unsweetened pellet, while in the presence of S2, choices of a response to the right, R2, result in getting a different reinforcer—one unique to that response, perhaps sweet water. That is, correct discriminative choices following the different discriminative stimuli produce different rewards—rewards unique for each association—hence our label DO. Why is this apparently trivial design feature important? In the CO procedure, the organism only has the presence of the discriminative stimulus to guide its choice. In contrast, in the DO procedure, if there are unique, specific, anticipations or expectations of rewards or outcomes, and if these expectations of these different rewards have cue properties, then the organism has these extra cues from the expectancies to guide the choices as well.
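The contrast between the CO and DO procedures can be written down as two small contingency tables. The sketch below is only an illustration under stated assumptions: the names `COMMON_OUTCOMES`, `DIFFERENTIAL_OUTCOMES`, `outcome` and `outcomes_are_distinctive` are hypothetical, and the reinforcer labels simply follow the examples given in the text. It shows that the S-R problem is identical in both procedures, and that only under DO does the outcome distinguish between the discriminative stimuli.

```python
from typing import Dict, Optional, Tuple

# A schedule maps each CORRECT (stimulus, response) pair to its reinforcer.
# Any pair that is absent from the schedule earns nothing.
Schedule = Dict[Tuple[str, str], str]

# Hypothetical labels that follow the examples in the text.
COMMON_OUTCOMES: Schedule = {
    ("S1", "R1"): "sweet pellet",
    ("S2", "R2"): "sweet pellet",
}
DIFFERENTIAL_OUTCOMES: Schedule = {
    ("S1", "R1"): "unsweetened pellet",
    ("S2", "R2"): "sweet water",
}


def outcome(schedule: Schedule, stimulus: str, response: str) -> Optional[str]:
    """Return the reinforcer for a choice, or None if the choice is incorrect."""
    return schedule.get((stimulus, response))


def outcomes_are_distinctive(schedule: Schedule) -> bool:
    """True when every stimulus is followed by its own unique outcome.

    Only in that case could an expectancy of the outcome act as an
    additional cue that tells the stimuli apart.
    """
    outcome_for_stimulus = {s: o for (s, _r), o in schedule.items()}
    return len(set(outcome_for_stimulus.values())) == len(outcome_for_stimulus)
```

In this sketch both schedules have the same keys, which mirrors the point that the choice problem to be solved is identical; the procedures differ only in the values, that is, in what follows a correct choice.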
In essence, we are asking: What is in the organism’s “mind” at the time of choice? Is it thinking retrospectively of the recent discriminative stimulus, or is it thinking prospectively of the expected reward? Or is it thinking perhaps of both? Functionally, if the organism has more than one source of guiding information, then it should learn faster and better. Let us now compare rates of learning under these two different training paradigms (see Figure 1.5). Comparisons of groups learning conditional discriminative tasks, wherein one group was trained using the CO procedure and the other using the DO procedure, reveal that the DO procedure produces significantly faster learning—and commonly to a higher asymptote (Overmier, Bull, and Trapold, 1971; Trapold, 1970). Several experiments using different species of animals, from birds to horses, and different kinds of reinforcers have confirmed this new phenomenon (for example, Edwards et al., 1982; Miyashita et al., 2000). Yet, it is a basic fact completely unanticipated within the traditional Thorndikian behaviourist tradition (and, as yet, is also rarely noted in texts). Of course, there are a number of ways that this procedural difference could induce the differences in rates of learning. But, we argued that it was the Pavlovian conditioned association between each discriminative stimulus and its distinctive outcome that was responsible.
Figure 1.5 Performance in a bi-conditional discrimination task under differential and common outcomes procedures
To show that it is the Pavlovian conditioned mediator that controls choosing requires a somewhat different experiment—a variant on the transfer of control design in which we can separate out the Pavlovian relation to isolate its choice controlling function. In this three-stage transfer of control experiment (outlined in Figure 1.6), we (Kruse et al., 1983) began by training a conditional discriminated choice using DO such that each cue-choice sequence resulted in a different, unique outcome like that just described. Then in a second stage, which took place outside of the choice arena, we took a new neutral stimulus and associated it with one of the two reinforcers. Finally, in the third test stage, the animal was returned to the choice situation and probes of the Pavlovian CS were introduced. This was a test of the CS’s power to directly induce the animal to make the specific choice for the signalled outcome—even though such choices had never before occurred in the presence of the CS. If the particular outcome with which the CS was associated were irrelevant, then choosing should be random. On the other hand, if the CS elicits a specific expectancy which in turn has unique response-cueing properties, then the CS should induce the animal to make the choice response that had previously produced that specific outcome in the original discriminative training. Such choices we would call “correct”.
Figure 1.6 Illustration of the transfer-of-control procedure employed by Kruse et al. (1983)
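The logic of the test stage in this transfer-of-control design can be sketched in a few lines. This is a minimal illustration of the prediction made by the expectancy account, not a model of the actual experiment: the function name `predicted_choice`, the training tables and the reinforcer labels are all hypothetical. Given the outcome that a new CS has been paired with, the sketch looks up the response that earned that outcome during the original training.

```python
from typing import Dict, Optional, Tuple

# Stage 1 (hypothetical labels): correct (stimulus, response) pairs and outcomes.
DO_TRAINING: Dict[Tuple[str, str], str] = {
    ("S1", "R1"): "unsweetened pellet",
    ("S2", "R2"): "sweet water",
}
CO_TRAINING: Dict[Tuple[str, str], str] = {
    ("S1", "R1"): "sweet pellet",
    ("S2", "R2"): "sweet pellet",
}


def predicted_choice(training: Dict[Tuple[str, str], str],
                     cs_outcome: str) -> Optional[str]:
    """Response that the expectancy account predicts a Pavlovian CS will evoke.

    Stage 2 pairs a new CS with `cs_outcome`. In the stage 3 test, the CS
    should cue the response that previously produced that outcome. If the
    outcome does not single out one response, the function returns None,
    meaning no specific choice is predicted (choosing at chance).
    """
    responses = {r for (_s, r), o in training.items() if o == cs_outcome}
    if len(responses) == 1:
        return next(iter(responses))
    return None
```

For example, `predicted_choice(DO_TRAINING, "sweet water")` picks out `"R2"`, whereas after common-outcomes training the same lookup returns `None` because the shared outcome is linked to both responses.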
We found that the Pavlovian stimulus, in the presence of which the animal had never before made any choice responses, immediately and reliably substituted for the instrumental discriminative cue to elicit the outcome-specific “correct” choice responses. This is consistent with the view that embedded simple Pavlovian associations in conditional discriminated choice tasks can and do guide choices. We recognize that most readers are likely cognitive or clinical psychologists and wonder what this can tell you about humans and patients. So let us address this question. Recall that a very large proportion of learned human behaviours are in fact conditional discriminative choices. Deciding on the correct name for the person standing before you is a conditional discriminative choice. So is deciding on proper clothing each day. For example, in the northern United States where we live, when choosing our clothing for the day, we always first check to see what the temperature is. The weather is the discriminative stimulus and choices of clothing must be conditional upon that stimulus. Wrong clothing choices can lead to death—and do each year. This illustrates how pervasive conditional discriminative choices are in daily life. Does the DO procedure have a facilitating effect on learning by humans? In our lab at Minnesota, we have tested this (Maki et al., 1995), and colleagues around the world (for example, Estevez et al., 2001) have confirmed our findings. We have found that in nearly every task we have tested, using DO facilitates learning or performance—sometimes very modestly, sometimes dramatically—depending on the task difficulty and the particular outcomes used. This is true for normal five- to six-year-old children learning to point to correct pictures or learning symbolic relationships. And, as we will note later, it is even true for persons who have learning disabilities.
The experiments with humans are more complicated conditional discriminative choice experiments than the ones we have illustrated with animals, but they are essentially the same. Estevez, Fuentes and their associates (Estevez et al., 2003) have extended tests of this teaching method to adults with Down’s syndrome. These clients show exactly the same pattern of greater success in learning using the DO procedure. This success with learning-disabled populations has been found by other groups as well. But does this parallel effect in humans mean that the same simple Pavlovian associative processes underlie the enhanced choice behaviour? Well, we can apply the same transfer-of-control paradigm as in the animal experiments to test this. First, children are trained on a conditional discriminative choice task, either with CO or with DO. Then, new stimuli are separately and selectively paired with the outcomes in a Pavlovian procedure. Finally, the children are tested to determine if the Pavlovian “CSs” will selectively control the choosing behaviour of the children. The results are straightforward. Data from these experiments, comparing rates of learning by children on conditional discriminative choice tasks trained either with the traditional CO procedure or with the new DO procedure, show that the DO method yielded faster and better learning than the CO procedure. Moreover, when tested with the Pavlovian signals for ability to control outcome-specific responses, the children trained under DO made specific choices with great accuracy,
while those trained with CO performed at chance. Moreover, just as in our experiment with rats, here too the Pavlovian stimuli evoke specific selective choices of the “correct” response—the response that would produce the expected outcome (Maki et al., 1995). As a variation of the Pavlovian transfer, we can also use the conditioned expectancy model to show inter-problem transfer of control of choice (see Figure 1.7). Some would prefer to describe this as learning equivalences or categories. But here, it is based on signalled outcomes. After learning two different discriminations using the same outcomes, we test for a “crossover” of control, by presenting the subjects with the samples from one problem and the choice stimuli from another. We would expect that the sample from one problem would evoke responding for the choice stimulus from the other problem associated with the same outcome. The results indicated that there was substantial inter-problem transfer based on the signalled outcome. This effect is true for children and for animals.
Figure 1.7 Illustration of an experimental design for showing inter-problem transfer of control of choice
Acquired stimulus equivalence (Sidman et al., 1985) is essentially a form of complex conceptual category learning in which new untrained controlling relationships emerge. Subjects are taught a series of conditional discrimination problems in order to establish separate stimulus categories. For example, in Figure 1.8, in the course of successfully teaching four separate conditional discriminations, our subjects are explicitly taught relationships between sample stimuli and choice alternatives such that when S1 is presented, S3 is the correct choice (S1→S3), with training continuing through S3→S5, S5→S7, and S7→S9, while S2→S4, S4→S6, S6→S8, and S8→S10. This should lead to the establishment of separate stimulus categories—S1, S3, S5, S7, and S9 should belong to one category, whereas S2, S4, S6, S8, and S10 should belong to another.
Figure 1.8 Illustration of a stimulus equivalence training procedure similar to that employed in Joseph et al. (1997)
Note: Correct choices are marked with an asterisk (*).
The Prader–Willi syndrome is an eating disorder that is accompanied by mental retardation. In our laboratory, we have extended this paradigm to teaching sets of acquired stimulus equivalences to patients with Prader–Willi syndrome. The clients were actually trained on a succession of four conditional discriminations, each with two cues and four alternative choices. Each pair is taught after the prior pair is mastered. Testing for ‘transitivity’ and ‘symmetry’ involves testing of stimulus control of choice alternatives that are in the chain but are relations that were not directly trained. These are sometimes referred to as emergent relations. For example, if such training has been successful in establishing stimulus categories, then we should see the emergence of such untrained associations as S1→S5, S3→S7, S5→S9, S1→S7, S3→S9 or even S1→S9 on the one hand, and S2→S6, S4→S8, S6→S10, S2→S8, S4→S10 and S2→S10 on the other. These are all examples of a kind of emergent relationship called ‘transitivity’, and based on previous research,
we should expect to find less transitivity as the ‘nodal distance’ between stimuli in a class increases (for example, S1→S7 involves greater nodal distance than S1→S5). What’s more, we should also see the emergence of ‘symmetrical relationships’; for example, S9→S7, S7→S5, S5→S3, or S3→S1 on the one hand, and S10→S8, S8→S6, S6→S4, and S4→S2 on the other. Again, the learning and mastery of such equivalence relations by the learning-impaired clients with Prader–Willi syndrome is dramatically more accurate when they are taught using DO than with CO (Joseph et al., 1997). Interestingly, in the DO training conditions, accuracy of transitivity and emergence of equivalences is independent of nodal distance along the chain of possible relations, while in contrast, CO training results in decreasing accuracy as nodal distance increases. Thus, these adult retarded clients not only learned the basic relations faster when taught using DO, but they showed more reliable generative use of the new relational equivalences. We have begun work on demonstrating that we may well use the DO procedure to teach useful basic life skills to clients with Down’s syndrome. We have used newspaper symbols as cues for the selection of items of apparel that they should take with them to their workshop. Correct choices of weather-appropriate clothing received unique token reinforcers exchangeable for unique items. The early results from this new teaching method have been very promising, suggesting this is a useful training tool in the real world. But, have we learned all we can from our animal experiments? No. We can gain more. Given that outcomes are important in learning, perhaps they are important for memory as well. Consider that if animals have to learn a conditional discriminative task but are not allowed to make choices until some time after the discriminative stimulus is removed, then how do they choose?
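Returning briefly to the stimulus equivalence procedure described above, the trained and emergent relations can be enumerated mechanically from a training chain. The following is a minimal sketch under stated assumptions: the function name `emergent_relations` is hypothetical, and nodal distance is taken here to be the number of stimuli lying between the two members of a relation in the chain, a convention that agrees with the text's remark that S1→S7 involves greater nodal distance than S1→S5.

```python
from typing import Dict, List, Tuple

Relation = Tuple[str, str]


def emergent_relations(chain: List[str]):
    """Derive relations from a chain trained as chain[0]->chain[1]->...

    Returns three items:
      trained    - directly taught relations between adjacent stimuli
      symmetric  - untrained reversals of the trained relations
      transitive - untrained forward relations that skip at least one
                   stimulus, mapped to their nodal distance (assumed here
                   to be the number of intervening stimuli)
    """
    trained: List[Relation] = [
        (chain[i], chain[i + 1]) for i in range(len(chain) - 1)
    ]
    symmetric: List[Relation] = [(b, a) for (a, b) in trained]
    transitive: Dict[Relation, int] = {
        (chain[i], chain[j]): j - i - 1
        for i in range(len(chain))
        for j in range(i + 2, len(chain))
    }
    return trained, symmetric, transitive


# The two stimulus classes named in the text.
CLASS_ODD = ["S1", "S3", "S5", "S7", "S9"]
CLASS_EVEN = ["S2", "S4", "S6", "S8", "S10"]

trained, symmetric, transitive = emergent_relations(CLASS_ODD)
```

For the odd-numbered class this yields the four trained relations, their four symmetrical counterparts, and the six transitive relations listed in the text, each tagged with a nodal distance that could then be compared against accuracy under CO and DO training.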
This simple, delayed choice procedure is the prototypic way for testing short-term working memory. In the traditional CO procedure, participants have only their memory of the stimulus to rely on. However, if we use the DO procedure with such a delayed choice task, there is an additional source of information or cueing: The expectancy of the reinforcer could help to bridge the time delay gap because Pavlovian conditioned responses typically persist until the typical time of reward. Does the DO procedure help in such memory tasks? The answer is a resounding “yes”. Let us describe sample experiments, first with pigeons, then with patients. Consider an example of a conditional symbolic discriminative choice task for pigeons arranged for testing short-term working memory function. First, a colour is presented for a few seconds in the centre of a display panel in front of the pigeon. Then the colour cue is removed. After a variable delay (anywhere from zero to eight seconds), the bird must choose between two alternatives presented, one on each side of the display panel. Here, the choice is between alternatives of a vertical line and a horizontal line. If red is remembered, the vertical is correct; if green is remembered, then horizontal is correct. Correct choices are reinforced. The delay between the cue and the opportunity to choose is the “memory load”. If we arrange the sequence of events such that the reinforcer is the same for both selecting vertical after a red cue, and for selecting horizontal after a
green cue, that is the CO procedure. But when the reinforcers for correct choices of the different lines are themselves different, then this is the DO procedure. Does this difference in reinforcement method after the choice change the way the animals cope with the memory load? Undoubtedly, as memory-based performance established under CO quickly drops to chance after only a few seconds of delay in this task. In contrast, memory performance established using DO remains at near perfect levels even at delay intervals at which subjects trained under CO have dropped to chance levels (Linwick et al., 1988; Peterson et al., 1987). This is an effect of great significance. And, it implies that different cognitive processes are engaged under DO than under CO. Let us give you one last example of our research work with humans that grows out of the animal laboratory work we have been discussing—one that we believe has practical applications. Long-term excessive consumption of alcohol (and resulting thiamine deficiency) can lead to brain damage and a disorder historically referred to as Korsakoff’s disease, but now more generally called simply alcohol-related dementia. These patients are relatively intact cognitively but do suffer a specific problem. They have impaired short-term working memory—especially for faces and names. Oliver Sacks (1985) vividly describes just such a patient in his chapter, “The Lost Mariner”. This memory disability for recognizing faces and remembering the names that go with faces has the sad effect of socially isolating these individuals. Cognitive impairments similar in nature to those of Korsakoff’s disease can be produced in laboratory animals through the use of pyrithiamine, which lesions brain areas important in memory (for example, the mammillary bodies). Savage and Langlais (1995) discovered that our DO procedure seems to aid memory in these animal models of Korsakoff’s disease.
That is, our DO procedure provides remediation for the diseased memory of these rats. This body of work by Savage won for her an APA (American Psychological Association) award for early career contributions in 2002. Once again, you must wonder whether there is anything here that has meaning for your human clients. And, again, we believe the answer is “yes”. We have tested use of the DO procedure to help Korsakoff patients to more readily learn to recognize faces and even learn the names that go with the faces. After all, learning to recognize a recently seen face or to name someone after seeing their face is a discriminative conditional symbolic choice task very much like those we have been discussing. Our work here is relatively new, but the results are very promising (Hochhalter et al., 2001). To test whether our newly discovered knowledge about the power of DO to improve learning and memory could be applied to these patients, we set up an artificial task that was similar to those we have previously described. First, we would show the patient a picture of one person’s face. Then we would hide the picture. After a variable delay, we would then show a page of pictures of two faces or a page with two names on it. The patient’s task was to report or point to the face or the name of the person they had seen a few seconds earlier. This seems easy, but it is quite difficult for Korsakoff patients.
We rewarded the patients for correct choices with money, or tokens for coffee, or points—whatever was small but valuable to them. For one set of faces, all correct identifications received the same reward—the CO procedure. For another set of faces, the reward was unique to each particular face—the DO procedure. This within-subject comparison allowed us to see the effects of the different teaching procedures. Figure 1.9 shows the working memory for faces of normal age-matched controls and of Korsakoff patients taught under CO and those same patients taught under DO. Clearly, normal age-matched individuals have no problem with their recognition memory under either condition. Equally clearly, Korsakoff patients taught with CO have a serious recognition memory impairment—with declines in memory showing up with delays of as little as five seconds. But those same patients taught with DO (bottom) show markedly improved recognition memory—not differing from normal individuals until after 25 seconds, but even at 25 seconds they are substantially improved.
Figure 1.9 Data on the short-term working memory of normal older men and older men with Korsakoff’s disease
Note: The patients, diagnosed with alcohol-related dementia, were taught face recognition using the traditional common outcomes procedure or the new differential outcomes procedure.
In summary, we think that we have shown that simple associative processes—like those of Pavlovian conditioning—can and do play important roles in choice behaviours and
decision tasks. These examples arose from a reconceptualization of traditional learning theories. But we did not abandon associative accounts of learning to derive these complex choice phenomena. Although the examples were mostly from simpler conditional discriminative choice tasks, there are data from colleagues that suggest the same can be found in college students learning difficult types of equations (Estevez, personal communication) and even word equivalences across languages (Mahoney, 1991). Our research examples are not unique. They were meant to open up readers to the message that contemporary basic science research with animals on fundamental associative mechanisms continues to produce results that are of potential interest to cognitive scientists and certainly important and helpful to practitioners. We can even expand to normal aging phenomena. Now it turns out, that as animals get old, they, like humans, experience difficulties with working memory in delayed discriminative choice tasks when trained by the traditional CO training procedures. That is, when old, the rats cannot remember correct choices for more than a few seconds of delay. However, Lisa Savage, whom the primary author had the good fortune to work with in his laboratory some years ago, has recently shown that use of the DO training procedure can help these old animals to perform the memory-based task as well as young animals (Savage et al., 1999). Moreover, basic research with laboratory animals can enable us to discover things not possible through research with humans. For example, Savage and Parsons (1997) uncovered data in a double dissociation that suggests that there are different neurochemistries for memories and for expectancies. 
It appears that in conditional discriminated choice tasks, retrospective memories of the cue or sample stimulus are encoded through cholinergic-dependent processes because the muscarinic antagonist scopolamine disrupts memory-based choosing more in the CO procedure than in the DO procedure. Meanwhile, the expectations of reinforcer outcomes appear to be encoded through glutamatergic-dependent processes because dizocilpine (MK-801) disrupted memory-based choice more in the DO procedure than in the CO procedure. This should suggest to cognitive scientists that retrospective memories and prospective expectancies have different neural substrates and, perhaps, different brain modules. And indeed, we are also using fMRI (functional magnetic resonance imaging) to see if we might localize these different modules for memories and expectancies (see Mok et al., 2009). Our simple associative conditioning has taken us far.
References
Edwards, C. A., J. A. Jagielo, T. R. Zentall, and D. E. Hogan. 1982. ‘Delayed matching-to-sample by pigeons: Mediation by reinforcer-specific expectancies’, Journal of Experimental Psychology: Animal Behavior Processes, 8 (3): 244–59.
Estevez, A. F., L. J. Fuentes, P. Mari-Beffa, C. Gonzalez, and D. Alvarez. 2001. ‘The differential outcome effect as a useful tool to improve conditional discrimination learning in children’, Learning & Motivation, 32 (1): 48–64.
Estevez, A. F., L. J. Fuentes, J. B. Overmier, and C. Gonzalez. 2003. ‘Differential outcomes effect in children and adults with Down syndrome’, American Journal on Mental Retardation, 108 (2): 108–16.
Hochhalter, A. K., W. A. Sweeney, B. L. Bakke, R. J. Holub, and J. B. Overmier. 2000. ‘Improving face recognition in alcohol dementia’, Clinical Gerontologist, 22: 3–18.
———. 2001. ‘Using animal models to address the memory deficits of Wernicke–Korsakoff syndrome’, in M. E. Carroll and J. B. Overmier (eds), Animal Research and Human Health: Advancing Human Welfare through Behavioral Science (pp. 281–92). Washington DC, US: American Psychological Association, xviii, 386.
Hull, C. L. 1951. Essentials of Behavior. New Haven, CT, US: Yale University Press.
Joseph, B., J. B. Overmier, and T. I. Thompson. 1997. ‘Food and nonfood related differential outcomes in equivalence learning by adults with Prader–Willi syndrome’, American Journal on Mental Retardation, 4 (4): 374–86.
Kruse, J. M., J. B. Overmier, W. A. Konz, and E. Rokke. 1983. ‘Pavlovian conditioned stimulus effects upon instrumental choice behavior are reinforcer specific’, Learning & Motivation, 14 (2): 165–81.
Linwick, D., J. B. Overmier, G. B. Peterson, and M. Mertens. 1988. ‘The interactions of memories and expectancies, as mediators of choice behavior’, American Journal of Psychology, 101 (3): 313–34.
Mahoney, J. L. 1991. ‘An expansion of expectancy theory: Reaction time as a test of relative expectancy strength and forward vs. backward associations’, Proceedings: Undergraduate Research Opportunities Program in Behavioral Sciences, 43–74. Technical Report from the Center for Research in Learning, Perception & Cognition, University of Minnesota.
Maki, P., J. B. Overmier, S. Delos, and A. Gutmann. 1995. ‘Expectancies as factors influencing conditional discrimination performance of children’, Psychological Record, 45 (1): 45–71.
Miyashita, Y., S. Nakajima, and H. Imada. 2000. ‘Differential outcome effect in the horse’, Journal of the Experimental Analysis of Behavior, 74 (2): 245–54.
Mok, L. W., K. M. Thomas, O. V. Lungu, and J. B. Overmier. 2009. ‘Neural correlates of cue-unique outcome expectations under differential outcomes training: An fMRI study’, Brain Research, 1265: 111–27.
Mowrer, O. H. 1947. ‘On the dual nature of learning—A reinterpretation of “conditioning” and “problem solving”’, Harvard Educational Review, 17: 102–48.
Overmier, J. B., J. A. Bull, and M. A. Trapold. 1971. ‘Discriminative cue properties of different fears and their role in response selection in dogs’, Journal of Comparative & Physiological Psychology, 76 (3): 478–82.
Peterson, G. B., D. Linwick, and J. B. Overmier. 1987. ‘On the comparative efficacy of memories and expectancies as cues for choice behavior in pigeons’, Learning & Motivation, 18 (1): 1–21.
Sacks, O. 1985. The Man Who Mistook His Wife for a Hat. New York: Touchstone.
Savage, L. M., and J. Parsons. 1997. ‘The effects of delay interval, intertrial interval, amnestic drugs, and differential outcomes on matching to position in rats’, Psychobiology, 25: 303–12.
Savage, L. M., and P. J. Langlais. 1995. ‘Differential outcomes attenuates spatial memory impairments on matching to position following pyrithiamine-induced thiamine deficiency in rats’, Psychobiology, 23 (4): 153–60.
Savage, L. M., S. R. Pitkin, and J. M. Careri. 1999. ‘Memory enhancement in aged rats: The differential outcomes effect’, Developmental Psychobiology, 35 (4): 318–27.
Sidman, M., B. Kirk, and M. Willson-Morris. 1985. ‘Six-member stimulus classes generated by conditional-discrimination procedures’, Journal of the Experimental Analysis of Behavior, 43 (1): 21–42.
Spence, K. W. 1937. ‘The differential response in animals to stimuli varying within a single dimension’, Psychological Review, 44 (5): 430–44.
———. 1956. Behavior Theory and Conditioning, vii, p. 262. New Haven, CT, US: Yale University Press.
Basic Associative Processes
23
Thorndike, E. L. 1911. Animal Intelligence. New York: Macmillan. Tolman, E. C. 1945. ‘A stimulus-expectancy need-cathexis psychology’, Science, 101 (2616): 160–66. Trapold, M. A. 1970. ‘Are expectancies based upon different positive reinforcer events discriminably different?’, Learning & Motivation, 1 (2): 129–40. Trapold, M. A. and J. B. Overmier. 1972. ‘The second learning process in instrumental learning’, in A. H. Black and W. F. Prokasy (eds), Classical Conditioning II: Current Research and Theory (427–52). New York: Appleton-Century-Crofts.
Chapter 2
Minimizing Cognitive Load in Map-based Navigation: The Role of Landmarks
Kazuhiro Tamura, Bipin Indurkhya, Kazuko Shinohara, Barbara Tversky, and Cees van Leeuwen
Introduction
Reading maps can be tricky, particularly while driving. We would like to minimize any unnecessary effort that navigators experience in using maps to locate themselves in their environment or to figure out a route to their destination. This motivates efforts to find the optimal way of representing the information on display.
People’s spontaneous sketches of environments they have experienced primarily through navigation are oriented as if they imagined themselves entering the environment from the bottom of the page (for example, Tversky, 1981). Navigators typically prefer maps that are oriented “heads up”, that is, maps in which the “up” direction, the top of the page, corresponds to the direction the person is facing (for example, Levine, 1982). A map oriented this way is called aligned, and the preference for such maps is called the alignment effect.
Modern technology offers ways to present maps that are always aligned. This is typically done in on-board navigation systems based on Global Positioning. Such systems indicate the current position on the map; with the destination of the individual user known, the orientation of the display can easily be adjusted to align the map. Maps that are placed on public display, however, have to indicate possible destinations in several directions, making it impossible to present them in an aligned fashion. This raises the question of whether there are other ways to facilitate map usage and, if so, how they relate to the alignment effect. Specifically, is there a natural way to annotate maps that decreases the cognitive effort required to navigate mentally, in particular in misaligned conditions?
For meaningless shapes, salient “landmarks” are known to facilitate mental rotation (Hochberg and Gellman, 1977). Landmarks are a natural element in spontaneous sketch
maps and in mental representations of environments (for example, Denis, 1997; Taylor and Tversky, 1992a, 1992b; Tversky and Lee, 1998, 1999). Properly placed and designed, landmarks might provide cues for mental navigation decisions that facilitate map use. The study of whether and when landmarks facilitate map reading has fundamental as well as practical implications, as it allows us to investigate which types of mental processes are used in navigation.
Facilities to assist navigation may address either of two types of information processing. The first is referred to alternatively as mental rotation or reorientation (for example, Shepard and Metzler, 1971; Corballis and Nagourney, 1978; Eley, 1982, 1988; Evans and Pezdek, 1980; Aretz and Wickens, 1992) because it can be done in at least two ways: either by mentally rotating the map, or by reorienting one’s own direction within the map. These two possibilities correspond, respectively, to two major ways of experiencing an environment: from a survey or overview perspective, or from a route or embedded perspective (for example, Taylor and Tversky, 1992; Tversky, 1996; Zacks et al., 2002). The present project does not attempt to distinguish the conditions inducing people to adopt each transformation, though other research has made efforts to do so (for example, Bryant and Tversky, 1999; Zacks et al., 2002). We will, however, be able to determine whether an overview or embedded perspective is chosen in our experiments. With regard to processes such as mental rotation and reorientation we may predict the following: if landmarks facilitate these processes, they will reduce the alignment effect; that is, we would expect the difference in navigation efficiency between aligned and misaligned conditions to become smaller when landmarks are added to the map.
The second major process, making the correct navigation decision, predominantly involves left–right judgements.
These are notoriously difficult (for example, Farrell, 1979; Franklin and Tversky, 1990; Maki and Braine, 1985). Map reading requires navigators to follow a route on a map and indicate whether the next turn is a right or left turn. Landmarks may facilitate left–right decisions, because they confer salient asymmetries on configurations, which provide cues for which way to turn. If landmarks facilitate left–right decisions, placing landmarks is likely to facilitate navigation in aligned as well as in misaligned conditions.
Do landmarks facilitate realigning misaligned maps or making left–right navigation decisions? To test the contrasting predictions for these two processes, our first experiment independently varied the number of landmarks placed on a map and the alignment of the map with respect to the destination. If landmarks facilitate mental rotation and/or reorientation, we expect an interaction between these two factors. If the effect is due to left–right orientation, we expect independent main effects of the alignment and landmark conditions. If both processes play a role, we may expect a main effect in combination with an interaction.
A second way in which this issue was addressed in our experiments was related to the question: what kind of landmarks are most effective? We examined this question in both
our experiments. We compared global and local landmarks. Global landmarks confer asymmetries on the entire trajectory. They are, therefore, assumed to facilitate the processes of mental rotation or reorientation. We can distinguish overview versus embedded perspectives by comparing trajectories that curve around a global landmark with ones in which it is placed outside the trajectory. In the first case, the global landmark is always on the same (left- or right-hand) side from an embedded perspective; in the second, the landmark is always on the same side of the trajectory in overview. Thus, we expect the first to facilitate an embedded perspective and the second an overview perspective. Local landmarks are placed as markers at points in the trajectory where a left- or right-hand turn needs to be chosen. They may, therefore, facilitate left–right decisions.
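The contrasting predictions can be made concrete with a small numerical sketch. In a 2 × 2 layout (alignment by landmark presence), the interaction is the difference between the landmark benefit in the misaligned condition and the benefit in the aligned condition. The cell means below are invented for illustration and are not data from the experiments; the function name and the dictionary layout are likewise hypothetical.

```python
# Hypothetical cell means of response speed (1/RT); not data from the study.

def landmark_effects(cells):
    """Return the landmark benefit per alignment condition and their difference.

    cells maps (alignment, landmark) -> mean response speed.
    """
    benefit_aligned = cells[("aligned", "with")] - cells[("aligned", "without")]
    benefit_misaligned = (cells[("misaligned", "with")]
                          - cells[("misaligned", "without")])
    interaction = benefit_misaligned - benefit_aligned
    return benefit_aligned, benefit_misaligned, interaction

# Pattern expected if landmarks only help left-right decisions:
# the same benefit in both alignment conditions, hence no interaction.
additive = {
    ("aligned", "with"): 1.00, ("aligned", "without"): 0.75,
    ("misaligned", "with"): 0.75, ("misaligned", "without"): 0.50,
}

# Pattern expected if landmarks only help mental rotation or reorientation:
# a benefit only when the test display is misaligned.
rotation_only = {
    ("aligned", "with"): 1.00, ("aligned", "without"): 1.00,
    ("misaligned", "with"): 0.75, ("misaligned", "without"): 0.50,
}

for name, cells in [("additive", additive), ("rotation only", rotation_only)]:
    print(name, landmark_effects(cells))
```

In the first pattern the interaction term is zero and only a main effect of landmarks remains; in the second the alignment effect shrinks when landmarks are present, which is the interaction the rotation account predicts.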
Experiment 1
Method
Participants
Twenty-eight undergraduates or adults from various cities in the Tokyo area with normal or corrected-to-normal vision took part in the experiment (14 male, 14 female; age range 18–36 years). They were paid an hourly fee of 1000 yen for participation. Participants in this experiment also took part in Experiment 2, which was performed first.
Stimuli and Design
Maps were constructed using the Java two-dimensional graphics library created by Sun Microsystems. Each map was rendered as an 800 by 800 pixel image on an 18.1 inch liquid crystal display of an AT-compatible personal computer. Each map subtended about 7 degrees of visual angle. The maps were based on those of Levine et al. (1982); they consisted of a route, presented as a segmented black line, connecting a destination and a target. Examples of the maps appear in Figure 2.1. Half of the test maps were aligned and half were misaligned (meaning rotated by 180° with respect to the aligned orientation).
We distinguished maps that belong to a fixed-order condition from those that belong to a variable-order condition. In the fixed-order condition, local landmarks were always in the same order: {building, tower, house, and windmill}. Maps could appear in three different global landmark conditions. In one-third of the cases, a global landmark was placed inside the street pattern. In this condition, the landmark was always in the same orientation from an embedded perspective (always to the left or always to the right-hand side in mental navigation). In another one-third of the cases, a global landmark was placed outside the street pattern. In this condition, the landmark was always in the same orientation from an external or overview position (always to the east or always to the west, irrespective of whether this was on the left-hand or right-hand side in mental navigation). In the third condition, no global landmark was added.
Figure 2.1 Examples of map displays used in Experiment 1: The global landmark conditions (global–inside: A, global–outside: B, and without–global landmark: C)
Two types of local landmark conditions could occur. In two-landmark maps, local landmarks were placed only to mark the beginning and ending points. In four-landmark maps, landmarks were additionally placed near each individual turn. Left- and right-hand turns were balanced across the maps, yielding 2 (Local landmarks: two or four) × 3 (Global landmarks: inside, outside, and without) × 2 (Reflections: left or right) = 12 unique maps.
The second set of maps belonged to the variable-order condition: they used the same landmarks as in the fixed-order condition, but rearranged in random order. Of these, there were 2 (Local landmarks: two or four) × 2 (Reflections: left or right) = 4 map conditions in which the order of the local landmarks was randomly determined at the time of presentation.
Procedure
Each trial consisted of a sequence of three displays, shown in Figure 2.2. The first display consisted of one of the maps selected from Figure 2.1. Participants studied the map for three seconds and were asked to remember the configuration during a period of approximately one second in which the screen remained blank.
Figure 2.2 Example sequences of displays used in Experiment 1 (Aligned and Misaligned versions of a global landmark inside two local landmark conditions)
Note: Blue arrows in test displays indicate the current location and orientation in the initial segment of the preceding map display, from which navigators must respond whether the next turn on the route to their destination, shown in the upper-left corner of the test display, requires a right or left turn. Blue arrows in feedback displays indicate the correct response.
Next, a test display appeared showing an initial segment of the map from which they were to navigate to their destination. The destination was indicated by a landmark (for example, “windmill”; see the centre of Figure 2.2) shown at the top left of the test display. The test was a two-alternative forced choice; participants determined whether the first turn along the route was a left- or right-hand turn. Participants responded by pressing one of two arrow keys on the computer keyboard indicating left or right. After each response, the original map was displayed with the correct response superimposed on the participant’s response.
For the fixed-order conditions, the initial segment of the test display was always the long part of the trajectory. In the random-order conditions, the initial part of the trajectory was the short segment in half of the cases. This was done for two reasons. The first was to prevent solutions based on reasoning such as “If the first turn is on the right-hand side of the map display my answer must be ‘right’ when the test display is aligned and ‘left’ when it is misaligned”. The second was to enable a check on differences between short and long initial segments. These should not be large if the processes currently under investigation capture a substantial part of the variability in display difficulty.
Each participant received the following trials: 2 (Aligned or Misaligned test display) × 3 (Global landmarks: Inside, Outside, and Without) × 2 (Local landmarks: Two or Four in same order arrangement) × 2 (Map reflections: Left or Right) = 24 trials with fixed-order local landmarks, each repeated two times (48 trials), and 2 (Aligned or Misaligned test display) × 2 (Local landmarks: Two or Four) × 2 (First segment of the test display: Long or Short) × 2 (Map reflections: Left or Right) = 16 different trials with random-order local landmarks, each repeated three times (48 trials). This yielded a total of 96 trials, which were presented in randomized order during the experiment.
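The trial structure described above can be sketched as a short enumeration. This is only an illustration of how the counts arise; the level names, the tuple layout and the random seed are assumptions, and the sketch is not the authors' presentation software (the maps were built with a Java library).

```python
import itertools
import random

# Factor levels as described in the Procedure; the names are illustrative.
fixed_order = list(itertools.product(
    ["aligned", "misaligned"],          # test display
    ["inside", "outside", "without"],   # global landmark
    [2, 4],                             # number of local landmarks
    ["left", "right"],                  # map reflection
))
random_order = list(itertools.product(
    ["aligned", "misaligned"],          # test display
    [2, 4],                             # number of local landmarks
    ["long", "short"],                  # first segment of the test display
    ["left", "right"],                  # map reflection
))

# Fixed-order trials are repeated twice, random-order trials three times.
trials = ([("fixed",) + t for t in fixed_order] * 2
          + [("random",) + t for t in random_order] * 3)

rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
rng.shuffle(trials)

print(len(fixed_order), len(random_order), len(trials))
```

The counts come out as 24 unique fixed-order trials, 16 unique random-order trials, and 96 trials in the shuffled schedule, matching the totals given in the text.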
Results and Discussion
Responses were analyzed for speed (1/RT, with RT in seconds) and accuracy (number of errors) for the fixed- and random-order trials separately, after which a comparison between the relevant subsets of both conditions was made.
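A minimal sketch of the two dependent measures follows. The record layout (a list of response time and correctness pairs) is an assumption, and the response times are invented. The text does not state whether error trials were excluded from the speed measure, so this sketch includes all trials; it also returns errors as a proportion, on the assumption that the reported error values (for example, 0.04) are proportions rather than raw counts.

```python
from statistics import mean

def summarize(trials):
    """Return (mean response speed, error rate) for one participant and condition.

    trials is a list of (rt_seconds, correct) pairs; this layout is assumed.
    """
    speed = mean(1.0 / rt for rt, _ in trials)  # mean of 1/RT, in 1/s
    error_rate = mean(0 if correct else 1 for _, correct in trials)
    return speed, error_rate

# Invented response times in seconds; not data from the experiment.
example = [(1.0, True), (2.0, True), (0.5, False), (1.0, True)]
speed, error_rate = summarize(example)
print(speed, error_rate)
```

Because speed is a mean of reciprocals, a larger value means faster responding, and the reciprocal of a mean speed is in general not the same as the mean response time.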
Fixed-order Trials
The fixed-order trials were evaluated in a 2 × 3 × 2 factorial Analysis of Variance (ANOVA) with the within-subjects factors Alignment of the test display, global landmark condition and local landmark condition. The effect of Alignment reached significance for both speed, F(1, 27) = 75.3, p < 0.01, and accuracy, F(1, 27) = 12.8, p < 0.01. Responses to aligned test displays were significantly faster (Aligned = 1.05; Misaligned = 0.76) and produced fewer errors (Aligned = 0.04; Misaligned = 0.13) than responses to misaligned ones. These results reproduce the well-known alignment effect in map navigation mentioned in the introduction.
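The degrees of freedom attached to the F ratios in this section follow from the design. As a hedged sketch, assuming a standard univariate repeated-measures ANOVA with no sphericity correction (the text does not say which variant was used), the effect and error degrees of freedom can be computed as follows; the function name is hypothetical.

```python
from math import prod

def within_subject_df(levels, n_participants):
    """Degrees of freedom for an effect in a fully within-subjects ANOVA.

    levels lists the number of levels of each factor involved in the effect.
    """
    df_effect = prod(k - 1 for k in levels)
    df_error = df_effect * (n_participants - 1)
    return df_effect, df_error

print(within_subject_df([2], 28))      # Alignment, 28 participants
print(within_subject_df([3], 28))      # Global landmarks (three levels)
print(within_subject_df([3, 2], 28))   # Global landmarks x Alignment
```

With 28 participants this gives (1, 27) for a two-level factor and (2, 54) for the three-level global landmark factor and its interaction with Alignment, in line with the values reported for those effects.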
Global landmarks had an effect on speed, F(2, 54) = 6.7, p < 0.01. Tukey HSD post hoc tests showed that responses in the Global–Inside (0.95) and Global–Outside (0.96) landmark conditions were both faster than in the Global–Without landmark condition (0.81), p < 0.01. There was no difference between the Global–Inside and Global–Outside landmarks. These results show that global landmarks can ease the load on navigation. They do so to the same extent, irrespective of whether they are more helpful with respect to an overview (landmarks outside) or an embedded perspective (landmarks inside the terrain). The result, therefore, indicates that embedded and overview perspectives are equally facilitated by landmarks, and that the corresponding strategies of reorientation and mental rotation, respectively, are approximately equally predominant.
There was an interaction between the factors global landmark and Alignment of the test display, F(2, 54) = 5.9, p < 0.01 (see Figure 2.3). Tukey HSD post hoc tests revealed that for the Aligned condition, the Global–Inside landmark and the Global–Outside landmark were faster than the Global–Without landmark (p < 0.01), and there were no differences in speed between the Global–Inside and Global–Outside landmarks (p > 0.1). This was also the case for the misaligned condition. However, the size of the differences observed depended on the alignment condition: the effect of landmarks was greater in Misaligned than in Aligned conditions. This result is consistent with the notion that the landmarks facilitate processes such as mental rotation and reorientation.
Figure 2.3 Mean response speed and standard errors for the interaction of the global landmark and Alignment of the test display in Experiment 1
Global landmarks had no effect on accuracy, F(2, 54) = 1.8, p > 0.1. The error rates were 0.07 (Global–Inside), 0.08 (Global–Outside) and 0.11 (Global–Without).
The effect of local landmarks reached significance; the two-landmark condition (0.98) was faster than the four-landmark condition (0.84), F(1, 27) = 39.2, p < 0.01. Surprisingly, however, the four-landmark condition (0.07) produced fewer errors than the two-landmark condition (0.10), F(1, 27) = 8.3, p < 0.01. The effect of the local landmarks condition can be regarded as a speed–accuracy trade-off, in that fewer local landmarks induce faster but less accurate responses. Local landmarks do not ease the mental load of navigation; they increase it. But the extra effort they require pays off by reducing the number of mistakes made. In other words, local landmarks are a double-edged sword: they are helpful when accuracy matters, but involve cognitive costs which become a burden when speed is crucial. This is the case in particular on misaligned test displays, where participants lose their orientation without local landmarks at turning points. Because the local landmarks were presented prior to navigation, the cognitive load required must involve processes such as reading off information from memory, associating all landmarks with positions and realigning them with the test display.
Overall, the verdict on landmarks cannot simply be “more is better”.
Global landmarks facilitate navigation, but landmarks placed at turns, although they increase accuracy, reduce processing speed. So the mere addition of iconic landmarks as such does not suffice to improve performance. However, landmarks strategically placed to help orient mental navigation globally are effective in improving the speed of navigation.
Random-order Trials
The second analysis was performed on the Random-order local landmark conditions. The 2 × 2 × 2 ANOVA on response speed and accuracy used as within-subjects factors: Alignment of the test display, number of local landmarks and length of the initial segment (Long versus Short) in the test display. The effect of Alignment of the test display reached significance for both speed, F(1, 27) = 118.6, p < 0.01, and accuracy, F(1, 27) = 22.7, p < 0.01. Responses to aligned test displays were significantly faster (Aligned = 0.88; Misaligned = 0.55) and produced fewer errors (Aligned = 0.08; Misaligned = 0.22) than responses to misaligned ones. The random-order condition, therefore, reproduces the alignment effect just as the fixed-order condition does. It may be concluded that this effect does not depend on a certain fixed arrangement of landmarks.
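For a within-subjects factor with only two levels, such as Alignment, the F ratio for the main effect equals the square of a paired t statistic computed on each participant's condition means. The sketch below illustrates this relation with invented per-participant speeds; it is not a reanalysis of the reported data, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_f(cond_a, cond_b):
    """F(1, n - 1) for a two-level within-subjects factor, computed as t squared."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # paired t statistic
    return t * t, (1, n - 1)

# Invented per-participant mean speeds (1/RT); not data from the experiment.
aligned = [0.9, 1.0, 0.8, 0.7, 1.1]
misaligned = [0.6, 0.5, 0.6, 0.4, 0.7]
f_value, df = paired_f(aligned, misaligned)
print(round(f_value, 2), df)
```

With 28 participants the same computation would yield the (1, 27) degrees of freedom reported for the alignment effect.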
Local landmarks reached significance for response speed, F(1, 27) = 86.1, p < 0.01. The two-landmark condition (0.82) was faster than the four-landmark condition (0.60). This result is consistent with the fixed-order condition. No significant effect on accuracy was obtained, however, although the direction of the errors is consistent with a speed–accuracy trade-off (two-landmark = 0.16; four-landmark = 0.15), F(1, 27) = 0.3, p > 0.1. There was an appreciable interaction between local landmarks and Alignment of the test display for response speed, F(1, 27) = 39.6, p < 0.01 (see Figure 2.4). These results are consistent with the previous analysis.
The effect of Length of the initial segment was significant neither for speed (Long = 0.72; Short = 0.70), F(1, 27) = 1.2, p > 0.1, nor for accuracy (Long = 0.14; Short = 0.16), F(1, 27) = 0.9, p > 0.1. No further interactions reached significance. The absence of a Length effect provides us with a reality check: whereas factors considered relevant contribute variability to the design, the ones considered irrelevant do not.
Figure 2.4 Mean response speed and standard errors for the interaction between local landmark and alignment test display conditions in Experiment 1
Fixed- versus Random-order Trials
The comparison between Fixed- and Random-order local landmark conditions in a factorial design can only be done between subsets of each condition: fixed-order trajectories without global landmarks (excluding the two-thirds of trials with Global–Inside and Global–Outside landmarks) and random-order trials that start from the Long first segment (excluding the ones that start from the Short segment). Even so, this comparison should be taken with a grain of salt, as the frequency of occurrence of the two types of trials is not identical.
With the fixed-order trials restricted to the Global–Without landmark condition and the random-order trials to the long first-segment condition, response speed and accuracy were evaluated in a within-subjects 2 × 2 × 2 factorial ANOVA with the factors Alignment of the test display, local landmarks and Arrangement of landmarks (Fixed versus Random). The effect of Alignment of the test display reached significance for both speed, F(1, 27) = 109.5, p < 0.01, and accuracy, F(1, 27) = 15.7, p < 0.01. Responses to aligned test displays were significantly faster (Aligned = 0.94; Misaligned = 0.60) and produced fewer errors (Aligned = 0.06; Misaligned = 0.20) than responses to misaligned ones. This replicated previous analyses. Local landmarks reached significance for response speed, F(1, 27) = 51.7, p < 0.01. The two-landmark condition (0.87) was faster than the four-landmark condition (0.67). There was an appreciable interaction between local landmarks and Alignment of the test display for response speed, F(1, 27) = 37.4, p < 0.01, all consistent with earlier observations.
Arrangement of landmarks had an effect on speed, F(1, 27) = 13.7, p < 0.01. The Fixed-order landmarks were faster (0.80) than Random-order landmarks (0.72). The Fixed-order landmarks had fewer errors (0.11) than Random-order landmarks (0.14).
There was an interaction between Arrangement of landmarks and local landmark conditions for response speed, F(1, 27) = 10.9, p < 0.01, indicating that Random-order landmarks are more difficult than Fixed-order ones, particularly in the four-local-landmark condition. There was also a three-way interaction, F(1, 49) = 8.5, p < 0.01. This result indicates that fixed-order local landmarks facilitate navigation decisions more prominently in misaligned test display conditions (Figure 2.5). All these results are consistent with the observation that landmarks at choice points complicate map navigation. None of the other main effects or interactions reached significance.
The fact that global landmarks also facilitate navigation in aligned conditions shows that they support the process of left–right orientation. The observation that they have greater effects in misaligned conditions, however, indicates that they also play a role in supporting processes such as mental rotation and reorientation.
Figure 2.5 Mean response speed and standard errors for the interaction between the arrangement of landmarks (fixed versus random), number of local landmarks, and the Alignment of the test display in Experiment 1
Experiment 2
In the previous experiment, we observed that global landmarks facilitate navigation generally. We concluded that these effects are based on mental rotation and/or reorientation. Local landmarks, by contrast, did not facilitate navigation, suggesting that processes relating to left–right orientation played a less predominant role. Interestingly, we observed that adding local landmarks made navigation harder. We may ask what it is in the nature of these landmarks that makes them hard to process. One possibility is that they clutter, and therefore disrupt, the global order of the maps.
We may, therefore, ask whether this could be remedied by a different type of landmark. In some environments, for example, those with freeway exits and numbered streets, numeric landmarks appear at turns. This raises the issue of whether these are more helpful than iconic landmarks. In the present experiment, we therefore compare the iconic landmarks of the previous experiment with numeric ones. Iconic landmarks placed at turns were called “local” in the previous experiment, because they do not provide salient cues to the configuration of an environment. Numeric landmarks placed at turns provide salient configural information. For these reasons, we expected numeric landmarks to be superior to iconic ones.
Method
Participants
Fifty undergraduates or adults with normal or corrected-to-normal vision from various cities in the Tokyo area (25 male, 25 female; age range 18–36 years) took part in the experiment.
Stimuli and Apparatus
Stimuli and apparatus were the same as in Experiment 1, with an additional factor, of which examples appear in Figure 2.6. In one condition, iconic landmarks were used, and in the other, numeric ones. Both were shown in different rotation and reflection versions, yielding 2 (Numeric versus Iconic) × 2 (0° and 180° Rotations) × 2 (Reflections) = 8 unique maps. As in the previous experiment, there were fixed and variable landmark conditions. In the first, landmarks were always in the fixed order: {1, 2, 3, and 4} for the numeric maps and a corresponding fixed order for the iconic maps {building, tower, house, and windmill}. Variable-order displays used the same landmarks as in the fixed-order condition, but with random rearrangements of the landmarks in both numeric and iconic conditions.
Procedure and Design
The first display consisted of one of the maps selected from Figure 2.6. Participants studied this map for 5 seconds. This is longer than in the previous experiment, because the map could occur in two different rotations (0° and 180°). They were asked to remember the configuration during a period of approximately 1 second in which the screen remained blank. Next, a test display appeared showing the initial segment of the map. The test display could be presented in an aligned or misaligned orientation. The test was a two-alternative forced choice; participants determined whether the first turn along the route was a left- or right-hand turn. Participants responded by pressing one of two arrow keys on the computer keyboard indicating left or right. After each response, the original map was displayed with the correct response superimposed on the participant’s response.
Figure 2.6 Example of maps used as stimuli in Experiment 2
Note: Two different types of landmarks were used: iconic (A) and numeric (B). Rotation versions 0° (B) versus 180° (C), reflection (B versus D) and short versus long initial segment version (B versus E) of both landmark types were used in map displays.
Each participant received 64 trials with fixed-order landmarks: 2 (Orientations of the map display: 0° or 180°) × 2 (Aligned or Misaligned test display) × 2 (Destination 3 or Destination 4) × 2 (First segment: long or short) × 2 (Type of landmark: numeric or iconic) × 2 (Map reflections: left or right) = 64 unique trials. Each participant also received 32 variable-order trials: 2 (Orientations of the map display: 0° or 180°) × 2 (Aligned or Misaligned test display) × 2 (Destination 3 or Destination 4) × 2 (Type of landmark: numeric or iconic) × 2 (Map reflections: left or right) = 32 trials. First segments (long and short) were compounded in the variable-order (catch) trials. The 64 experimental and 32 catch trials yielded a total of 96 trials, which were presented in randomized order during the experiment.
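As a quick arithmetic check on the design of Experiment 2, the trial counts can be recomputed from the number of levels of each factor. The dictionary keys are illustrative labels for the factors listed in the text, not names used by the authors.

```python
from math import prod

# Number of levels per factor, as listed in the Procedure and Design section.
fixed_factors = {
    "map orientation": 2, "test alignment": 2, "destination": 2,
    "first segment": 2, "landmark type": 2, "reflection": 2,
}
# Variable-order trials do not cross the first-segment factor.
variable_factors = {
    "map orientation": 2, "test alignment": 2, "destination": 2,
    "landmark type": 2, "reflection": 2,
}

n_fixed = prod(fixed_factors.values())
n_variable = prod(variable_factors.values())
print(n_fixed, n_variable, n_fixed + n_variable)
```

Six two-level factors give 64 fixed-order trials and five give 32 variable-order trials, for 96 trials in total, the same session length as in Experiment 1.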
Results and Discussion
Fixed-order Trials
The fixed-order landmark conditions were analyzed for response speed (1/RT, with RT in seconds) and accuracy (number of errors) in a 2 × 2 × 2 × 2 × 2 factorial ANOVA with the five within-subjects factors: Orientation of the map display, Alignment of the test display, Destination, First-segment length and type of landmark. There were no effects of map Rotation on response speed (0° = 0.68; 180° = 0.66), F(1, 49) = 2.3, p > 0.1, nor on accuracy (0° = 0.13; 180° = 0.11), F(1, 49) = 0.9, p > 0.1. We may, therefore, assume that our maps were memorized in an orientation-independent manner.
Alignment of the test display affected both speed and accuracy. Responses to aligned test displays were faster (Aligned = 0.82; Misaligned = 0.52), F(1, 49) = 243.1, p < 0.01, and produced fewer errors (Aligned = 0.05; Misaligned = 0.19), F(1, 49) = 52.6, p < 0.01, than responses to misaligned ones, reproducing the alignment effect.
Distance from the target to the destination in the test displays did not affect speed (Destination 3 = 0.67 versus Destination 4 = 0.66), F(1, 49) = 2.0, p > 0.1, nor accuracy (Destination 3 = 0.13 versus Destination 4 = 0.11), F(1, 49) = 1.8, p > 0.1. Neither did the length of the first segment influence speed (Long = 0.67 versus Short = 0.67), F(1, 49) = 0.0, p > 0.1, or accuracy (Long = 0.11 versus Short = 0.13), F(1, 49) = 2.1, p > 0.1. The absence of distance and length effects again provides a reality check that variables expected not to influence difficulty indeed do not.
There was a large effect of the type of landmark, both on response speed (Numeric = 0.74; Iconic = 0.60) and accuracy (Numeric = 0.09; Iconic = 0.15). The numeric landmarks were faster, F(1, 49) = 88.3, p < 0.01, and elicited fewer errors, F(1, 49) = 25.6, p < 0.01, than the iconic landmarks. These results show that numeric landmarks at choice points are less intrusive than iconic ones. There was an appreciable interaction between type of landmark and Alignment of the test display for response speed, F(1, 49) = 37.7, p < 0.01 (Figure 2.7). The interaction shows that the alignment effect is larger for iconic landmarks. This is the opposite of what would be expected if iconic landmarks were helpful in overcoming the alignment effect.
Fixed versus Variable Order Conditions The second set of analysis was performed to compare fixed- and variable-order conditions. We did this, after having assured in the previous analysis that the factor “length of initial segment” had no effect whatsoever. Consequently, these conditions of the fixed-order landmark trials were subsequently pooled. We investigated speed and accuracy in ANOVAs with the within–subject factors: Orientation of the map display, Alignment of the test display, Type of landmark, and Fixed versus Variable arrangement of landmarks. No effect of map Orientation was observed in response speed (0° = 0.63; 180° = 0.61), F(1, 49) = 0.9, p > 0.1, nor accuracy, (0° = 0.15; 180° = 0.14), F(1, 49) = 0.27, p > 0.1, replicating the previous analysis. as before, an effect of Alignment test display was observed in the response speed (Aligned = 0.78; Misaligned = 0.47), F(1, 49) = 247.5, p < 0.01, and accuracy, (Aligned = 0.06; Misaligned = 0.22), F(1, 49) = 62.1, p < 0.01. With respect to type of landmarks, numeric landmarks were faster than the Iconic ones (Numeric = 0.67; Iconic = 0.57), F(1, 49) = 53.2, p < 0.01. A two-way interaction was observed, as in the previous analysis, between type of landmark and Alignment test display conditions on response speed, F(1, 49) = 24.9, p < 0.01. There was an effect of the Arrangement landmarks on both response speed (Fixed order = 0.67; Random order = 0.57)
Figure 2.7 Mean response speed and standard errors for the interaction between type of landmark and Alignment of the test display in Experiment 2
and accuracy (Fixed order = 0.12; Variable order = 0.17); responses with Fixed order landmarks were faster, F(1, 49) = 60.9, p < 0.01, and elicited fewer errors, F(1, 49) = 13.7, p < 0.01, than responses with Variable order landmarks. A two-way interaction was found between Arrangement of landmarks and type of landmark on response speed, F(1, 49) = 20.0, p < 0.01 (Figure 2.8). Numeric landmarks show a greater advantage of fixed order than iconic ones. This confirms the interpretation that numeric order provides a navigation cue. None of the other main effects or interactions reached significance. We reproduced the well-known alignment effect in both speed and accuracy of navigation. On the assumption that processes such as mental rotation or reorientation are facilitated by any of these landmarks, we would have expected greater facilitation, relatively speaking, in misaligned than in aligned conditions. Although an interaction was found, it pointed in the opposite direction, in that the
Figure 2.8 Mean response speeds and standard errors between Arrangement and type of landmark conditions in Experiment 2
fastest landmark conditions showed the greatest alignment effect. Thus, the effects do not point to facilitation but to interference; the results show that numeric landmarks are less intrusive than others, in particular when they are presented in a fixed order.
General Discussion
Maps misaligned with the direction of travel cause problems for navigators. The difficulty stems from a mismatch between the orientation of the map and the orientation of the navigator. As a consequence, such maps require navigators either to mentally rotate the map to fit the navigator’s location and direction or to mentally reorient their location and direction within the map. Both of these processes require mental gymnastics that take time and may induce errors. Can the cognitive difficulty of mental rotation or reorientation
be alleviated, in particular, by the placement of landmarks? The present experiments examined the cognitive costs of misalignment for both stimulus and test maps. They also examined whether strategically placed landmarks could ameliorate the alignment effect. The findings in short: first, there were cognitive costs of misalignment of the test map but not of the stimulus map; second, global landmarks did serve to alleviate the cognitive costs of realigning either map or orientation. The alignment effect is a particularly robust phenomenon. People’s spontaneous sketches of environments they have experienced primarily through navigation are “heads up”, as if they imagined themselves entering the environment from the bottom of the page (Tversky, 1981). When maps are aligned, users are less likely to make navigational errors than when maps are misaligned (Evans and Pezdek, 1980; Levine, Jankovic and Palij, 1982; Thorndyke and Hayes-Roth, 1982; Levine, Marchon and Hanley, 1984; Presson and Hazelrigg, 1984; Shepard and Hurwitz, 1984; Presson, Delange and Hazelrigg, 1989; Rossano and Warren, 1989; Rossano, Warren and Kenan, 1995; May, Peruch and Savoyant, 1995; Tlauka and Wilson, 1996; Richardson, Montello and Hegarty, 1999; Rossano et al., 1999; Wilson, Tlauka and Wildbur, 1999). How does the addition of landmarks reduce the effects of misalignment? People make extensive use of landmarks in navigating, especially at choice points, in deciding which way to turn (for example, Denis, 1997). Landmarks have also been shown to facilitate mental rotation (Hochberg and Gellman, 1977). They should therefore also facilitate judgements made with misaligned maps. This hypothesis was tested in the first experiment. Global landmarks do indeed facilitate navigation. They do so irrespective of whether they are optimally placed from an embedded point of view or an overview. 
On half the trials, the global landmark preserved turning direction regardless of the alignment of the map; on the other half, the global landmark preserved the cardinal directions, so that they appeared mirror-imaged on misaligned maps, along with the rest of the map. Both cues facilitated navigation, and there was no advantage of one cue over the other. It may be concluded that global landmarks confer asymmetry on the configurations of the environments and that navigators can equally benefit from any such asymmetry, regardless of whether it supports an embedded or an overview perspective. By contrast, local landmarks, that is, landmarks placed at decision points for left- or right-hand turns, did not facilitate navigation. Rather, they confused the navigators. In some conditions, this has the beneficial side effect of forcing navigators to look longer at the display, thereby avoiding some mistakes. Depending on the costs and benefits of speed and accuracy, one might consider placing such landmarks in certain cases. In cases of on-board navigation systems, such landmarks should probably not be used, given the importance of avoiding an accident, which is directly proportional to the speed of navigation but not to accuracy. In the second experiment, we compared the iconic local landmarks of the first experiment to numeric landmarks, in order to determine whether a fixed numerical order
in the landmarks could confer a similar global asymmetry, and whether this would facilitate navigation. We found that fixed numerical conditions did, indeed, yield better performance than iconic or variable landmarks. However, this effect turned out to be one of least interference rather than facilitation. In sum, we observed a role for global landmarks in breaking the symmetry of the display and providing orientation cues for processes of left-right orientation, both from an embedded and an overview perspective. The local landmarks were introduced in an effort to produce facilitatory effects on left-right orientation. Left-right orientation appears to be an important factor in navigation, but the present results show unambiguously that adding landmarks at local choice points can only interfere with it. In many situations in the world, such as shopping centres and subway exits, misaligned maps are inevitable, as navigators using them are oriented in different directions. A simple design principle for augmenting performance with misaligned maps follows from this research: to improve navigation, global landmarks should be added to provide a mental handle for mental rotation or for mental reorientation; local landmarks, however, are to be avoided.
References
Aretz, A. J. and C. D. Wickens. 1992. ‘The mental rotation of map displays’, Human Performance, 5 (4): 303–28.
Bryant, D. J. and B. Tversky. 1999. ‘Mental representations of spatial relations from diagrams and models’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 25 (1): 137–56.
Corballis, M. C. and B. A. Nagourney. 1978. ‘Latency to categorize disoriented alphanumeric characters as letters or digits’, Canadian Journal of Psychology, 32 (3): 186–88.
Denis, M. 1997. ‘The description of routes: A cognitive approach to the production of spatial discourse’, Current Psychology of Cognition, 16 (4): 409–58.
Eley, M. G. 1982. ‘Identifying rotated letter-like symbols’, Memory & Cognition, 10 (1): 24–32.
———. 1988. ‘Determining the shapes of land surfaces from topographic map’, Ergonomics, 31 (3): 355–76.
Evans, G. W. and K. Pezdek. 1980. ‘Cognitive mapping: Knowledge of real-world distance and location information’, Journal of Experimental Psychology: Human Learning and Memory, 6 (1): 13–24.
Farrell, W. S. J. 1979. ‘Coding left and right’, Journal of Experimental Psychology: Human Perception and Performance, 5 (1): 42–51.
Franklin, N. and B. Tversky. 1990. ‘Searching imagined environments’, Journal of Experimental Psychology: General, 119 (1): 63–76.
Hochberg, J. and L. Gellman. 1977. ‘The effect of landmark features on mental rotation times’, Memory & Cognition, 5 (1): 23–26.
Lee, P. and B. Tversky. In press. ‘Interplay between visual and spatial: The effect of landmark descriptions on comprehension of route/survey spatial descriptions’, Spatial Cognition and Computation.
Levine, M. 1982. ‘You are here maps’, Environment and Behavior, 14 (2): 221–37.
Levine, M., I. H. Jankovic, and M. Palij. 1982. ‘Principles of spatial problem solving’, Journal of Experimental Psychology: General, 111 (2): 157–75.
Levine, M., I. Marchon, and G. Hanley. 1984. ‘The placement and misplacement of you-are-here maps’, Environment and Behavior, 16 (2): 139–57.
Maki, R. H. and L. G. Braine. 1985. ‘The role of verbal labels in the judgment of orientation and location’, Perception, 14 (1): 67–80.
May, M., P. Peruch, and A. Savoyant. 1995. ‘Navigating in a virtual environment with map-acquired knowledge: Encoding and alignment effects’, Ecological Psychology, 7 (1): 21–36.
Presson, C. C. and M. D. Hazelrigg. 1984. ‘Building spatial representations through primary and secondary learning’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 10 (4): 716–22.
Presson, C. C., N. Delange, and M. D. Hazelrigg. 1989. ‘Orientation specificity in spatial memory: What makes a path different from a map of the path?’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 15 (5): 887–97.
Rossano, M. J. and D. H. Warren. 1989. ‘Misaligned maps lead to predictable errors’, Perception, 18 (2): 215–29.
Rossano, M. J., D. H. Warren, and A. Kenan. 1995. ‘Orientation specificity: How general is it?’, American Journal of Psychology, 108 (3): 359–80.
Rossano, M. J., S. O. West, T. J. Robertson, M. C. Wayne, and R. B. Chase. 1999. ‘The acquisition of route and survey knowledge from computer models’, Journal of Environmental Psychology, 19 (2): 101–15.
Richardson, A. E., D. R. Montello, and M. Hegarty. 1999. ‘Spatial knowledge acquisition from maps and from navigation in real and virtual environments’, Memory & Cognition, 27 (4): 741–50.
Shepard, R. N. and J. Metzler. 1971. ‘Mental rotation of three-dimensional objects’, Science, 171: 701–03.
Shepard, R. N. and S. Hurwitz. 1984. ‘Upward direction, mental rotation, and discrimination of left and right turns in maps’, Cognition, 18 (1–3): 161–93.
Taylor, H. A. and B. Tversky. 1992a. ‘Descriptions and depictions of environments’, Memory & Cognition, 20 (5): 483–96.
———. 1992b. ‘Spatial mental models derived from survey and route descriptions’, Journal of Memory and Language, 31 (2): 261–82.
Thorndyke, P. W. and B. Hayes-Roth. 1982. 
‘Differences in spatial knowledge acquired from maps and navigation’, Cognitive Psychology, 14 (4): 560–89.
Tlauka, M. and P. N. Wilson. 1996. ‘Orientation-free representations from navigation through a computer-simulated environment’, Environment and Behavior, 28 (5): 647–64.
Tversky, B. 1981. ‘Distortions in memory for maps’, Cognitive Psychology, 13 (3): 407–33.
———. 1992. ‘Distortions in cognitive maps’, Geoforum, 23 (2): 131–38.
———. 1996. ‘Spatial perspective in descriptions’, in P. Bloom, M. A. Peterson, L. Nadel, and M. Garrett (eds), Language and Space, pp. 463–91. Cambridge: MIT Press.
Tversky, B. and P. U. Lee. 1998. ‘How space structures language’, in C. Freksa, C. Habel, and K. F. Wender (eds), Spatial Cognition: An Interdisciplinary Approach to Representation and Processing of Spatial Knowledge, pp. 157–75. Berlin: Springer-Verlag.
———. 1999. ‘Pictorial and verbal tools for conveying routes’, in C. Freksa and D. M. Mark (eds), Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science, pp. 51–64. Berlin: Springer.
Wilson, P. N., M. Tlauka, and D. Wildbur. 1999. ‘Orientation specificity occurs in both small- and large-scale imagined routes presented as verbal descriptions’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 25 (3): 664–79.
Zacks, J. M., J. Mires, B. Tversky, and E. Hazeltine. 2002. ‘Mental spatial transformations of objects and perspective’, Journal of Spatial Cognition and Computation, 2 (4): 315–32.
Chapter 3 Quantitative and Qualitative Differences between Implicit and Explicit Sequence Learning Arnaud Destrebecqz
Introduction
Since the beginning of implicit learning research, the interpretation of available empirical evidence has flipped between a strong endorsement of an unconscious learning system and the denial of non-conscious acquisition of new information (see, for instance, Reber, 1969; Wilkinson and Shanks, 2004). In this chapter, we would like to reflect on the origin of this contradiction and on the possible methodological and theoretical additions that can be made to the field in order to work this conflict out. First of all, we would like to clarify the way in which we refer to the notion of implicit learning in this chapter. This notion can indeed be understood in different ways depending on whether one focuses on the processes or on the resulting knowledge involved during a learning episode. According to the first perspective, implicit learning refers to those situations in which some knowledge is acquired without intention to do so (unintentional learning), while the second meaning of the term relates to the acquisition of knowledge that is difficult to verbalize (unconscious learning). In this chapter, we will focus on this latter notion and discuss the conditions under which the knowledge acquired in sequence learning studies could or could not be described as unconscious. Now that this clarification has been made, the next question arises: How can one demonstrate the acquisition of unconscious knowledge? Most studies have used a procedure consisting of (a) measuring learning in a first part of the experiment and (b) assessing conscious knowledge with another task in a second part of the experiment. In a sequence learning experiment, for instance, participants are first presented with a serial reaction time (SRT) task in which they have to press, as fast as possible, the key corresponding to the spatial location of a visual target presented on a computer screen. Unknown to them,
the sequence of locations follows a repeating pattern or some other sort of sequential regularity. Numerous studies have shown that participants can learn such a sequential pattern, as their reaction time (RT) improves more when the target follows the pattern rather than when it is random or when it follows a different pattern, in which case it increases dramatically (Cleeremans and McClelland, 1991; Reed and Johnson, 1994). In a second phase, participants are informed of the existence of a sequential pattern and asked either to identify smaller sequences of the training material in a recognition task, or to reproduce this pattern in a generation task. These latter, direct tasks are used as measures of conscious knowledge. The rationale is that if participants are able to deploy their knowledge in these tasks, this knowledge must be described as conscious. This procedure is based on quantitative dissociation logic: implicit sequence learning would be demonstrated whenever a successful discrimination between regular and random trials is observed in the training phase while performance in generation or recognition remains at chance. Importantly, this logic depends on the exclusivity assumption (Reingold and Merikle, 1988), that is, the notion that the tasks used to measure awareness are only sensitive to conscious knowledge. This assumption has been recently challenged by several sequence learning studies (Destrebecqz and Cleeremans, 2001; Fu et al., 2008; Jiménez et al., 2006). Adapting a procedure proposed by Tunney and Shanks (2003), Vandenberghe et al. (submitted) trained participants in an SRT task in which a 12-event repeated sequence was presented. Subsequently, participants were confronted with a recognition task in which they were presented with fragments consisting of three trials. Half of these fragments were part of their training sequence, and the other half were not. 
Participants were asked to react to the stimuli as fast and as accurately as possible, just as in the learning phase, and then to answer the two questions “Have you seen this short sequence before?” and “Are you confident in your response?” by pressing one of two buttons marked Yes or No. Participants were trained with either a 0 ms or a 1000 ms Response-Stimulus Interval (RSI, that is, the time interval between the response of the participant and the occurrence of the following stimulus in the sequence). Both groups exhibited learning effects during the learning phase, as evidenced by the decrease in RTs with practice and by the increase in RTs when the training sequence was modified. The same RSI was used in both training and recognition phases. This temporal parameter, which actually determines the pace of the task, seems to influence the extent to which the acquired knowledge is implicit or explicit (Destrebecqz and Cleeremans, 2001). Indeed, when the RSI is set to a minimum of zero milliseconds, learning tends to be essentially implicit, that is, participants lack control over their knowledge or have difficulty recognizing sequence fragments. When higher values of RSI are used, they generally do well in these different tasks and they are even able to offer accurate metacognitive judgements of their own performance (Destrebecqz and Cleeremans, 2003). Vandenberghe et al. (submitted) computed two d’ indexes for each participant to analyze recognition performance. A first d’ indexed participants’ ability to discriminate between triplets from the training sequence and triplets from the transfer sequence. Hits were
calculated from “yes” responses to training triplets, and false alarms were calculated from “yes” responses to transfer triplets. Results showed that participants reliably discriminated between old and new triplets in both RSI groups and that there was no significant difference between the two groups. This suggests that RSI duration did not influence the ability to discriminate between old and new triplets. Importantly, the second d’ measured the participants’ ability to discriminate between their own correct and incorrect responses. Hits and false alarms are computed as follows in this situation: considering participants’ responses to the question “Are you confident in your response?”, hits are defined as “yes” responses to correct discrimination decisions (including both endorsements and rejections), and false alarms are defined as “yes” responses to incorrect discrimination decisions (again including both endorsements and rejections). As proposed by Tunney and Shanks (2003), if participants are aware of the information used to discriminate between old and new triplets, they should be more confident in correct than in incorrect discrimination responses. On the other hand, if participants are unaware of the information used to discriminate between old and new triplets, then “yes” and “no” confidence responses should be distributed equally between correct and incorrect discriminations. Thus, explicit knowledge would result in d’ values greater than zero while implicit knowledge would result in d’ values close to zero. Vandenberghe et al. (submitted) observed that the second d’ was reliably greater than zero in subjects trained with a 1000 ms RSI. This was not the case, however, in the 0 ms RSI group. Participants trained with a 0 ms RSI were not more confident in their correct discriminations than in their errors. Following Tunney and Shanks’ proposal, Vandenberghe et al. 
concluded that the knowledge acquired by participants in the 0 ms RSI group can be described as implicit. The important result here is that performance can be above chance level in a recognition task, which is usually used as an awareness index, even though an additional test indicates that the knowledge on which recognition was based remains difficult to access consciously. As a consequence, recognition performance cannot be considered as an exclusive test of awareness, as it can also be influenced by implicit knowledge. Why can a direct task such as a recognition task not be considered a pure measure of conscious knowledge? A central reason is that perceptual and motor fluency may influence recognition judgements. Jacoby and Dallas (1981) proposed that the feeling of familiarity finds its origin in experiencing fluency. Fluency can be defined as the ease or speed with which people process perceptual information (Benjamin et al., 1988; Jacoby and Dallas, 1981; Oppenheimer and Frank, 2008; Reber et al., 2004). It can be seen as the conscious subjective experience “that a cognitive process is running smoothly” (Oppenheimer and Frank, 2008: 1180). While there are good reasons to believe that perceptual fluency plays a central role in the tasks used to study implicit learning, its specific influence on performance and its relationship with implicit and explicit processes have seldom been assessed systematically (Buchner, 1994; Scott and Dienes, 2007; Shanks and Perruchet, 2002; Shanks et al., 2003). In the context of a sequence recognition task following an SRT task, it is often observed that participants react faster to the trained than to the untrained sequence fragments. They might
then experience a feeling of fluency when reacting to those fragments that were part of the training sequence. In the absence of any other knowledge, they might then decide to attribute this feeling of fluency to the “oldness” of the fragment, even when they do not explicitly recollect that this particular fragment had been presented during the SRT task (see Cohen and Curran, 1993; Perruchet and Amorim, 1992; and Reber et al., 1985 for further discussion). Other fragments, by contrast, can be consciously recollected during the recognition task, the results of which, therefore, reflect a mixture of implicit and explicit components. In the following experiment, we tried to control for the potential implicit influence of perceptual and motor fluency by varying the temporal delay between any two trials during recognition. This procedure was used so as to reduce the likelihood of a reactivation, during the recognition test phase, of motor routines learned during the SRT task. To explore whether the pace of the SRT task exerts differential effects on implicit and explicit sequence learning, we compared performance in two conditions differing only in the value of the RSI. If lowering the pace of the learning phase indeed contributes to the acquisition of explicit knowledge, we expect participants trained with a 1000 ms RSI to be better at recognizing fragments of the training sequence than participants trained with a 0 ms RSI.
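The two d’ indexes described above for the recognition data of Vandenberghe et al. can be made concrete with a short sketch. The Python code below is only an illustration under stated assumptions: the function names, the trial format (is_old, said_old, confident) and the log-linear correction for extreme rates are hypothetical and are not taken from the original study, which is not described here in enough detail to know how rates of 0 or 1 were handled.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Return d' = z(hit rate) - z(false-alarm rate).

    Assumption (not from the chapter): 0.5 is added to each count so that
    rates of exactly 0 or 1 still give a finite z score.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def recognition_d_primes(trials):
    """Compute both indexes from (is_old, said_old, confident) triples.

    First d': "yes" to old (training) fragments counts as a hit, "yes" to
    new (transfer) fragments as a false alarm.
    Second d': a confident "yes" after a correct decision counts as a hit,
    a confident "yes" after an incorrect decision as a false alarm.
    """
    h1 = m1 = f1 = c1 = 0
    h2 = m2 = f2 = c2 = 0
    for is_old, said_old, confident in trials:
        if is_old:
            h1 += said_old
            m1 += not said_old
        else:
            f1 += said_old
            c1 += not said_old
        if is_old == said_old:  # correct endorsement or correct rejection
            h2 += confident
            m2 += not confident
        else:  # incorrect endorsement or incorrect rejection
            f2 += confident
            c2 += not confident
    return d_prime(h1, m1, f1, c1), d_prime(h2, m2, f2, c2)
```

On this sketch, a participant who discriminates old from new fragments but whose confidence responses are split evenly between correct and incorrect decisions obtains a positive first index and a second index of zero, the pattern that Tunney and Shanks (2003) take to indicate implicit knowledge.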
Method
Subjects
Sixty-four participants aged 18–26 years, all undergraduate students at the Université libre de Bruxelles, were randomly assigned to one of the four experimental conditions.
Material
The experiment was run on Macintosh computers. The display consisted of four dots arranged in a horizontal line on the computer’s screen and separated by intervals of 3 cm. Each screen position corresponded to a key on the computer’s keyboard. The spatial configuration of the keys was fully compatible with the screen positions. The stimulus was a small black circle 0.35 cm in diameter that appeared on a white background, centred 1 cm above one of the four dots.
Procedure
The experiment consisted of 15 training blocks during which subjects were exposed to a serial four-choice RT task. Each block consisted of 96 trials, for a total of 1440 trials. On each trial, a stimulus appeared at one of the four possible screen locations. Participants were
instructed to respond as fast and as accurately as possible by pressing the corresponding key. The target was removed as soon as a key had been pressed, and the next stimulus appeared after either 0 ms (no RSI condition) or 1000 ms (RSI condition), depending on the condition. Erroneous responses were signalled to participants by means of a tone. Short rest breaks occurred between any two experimental blocks. Participants were presented with one of the following 12-element sequences: 342312143241 (SOC1), 341243142132 (SOC2). Each experimental block consisted of eight repetitions of the sequence. These sequences consisted entirely of so-called “second order conditional” transitions or SOCs (Reed and Johnson, 1994), that is, transitions in which the next location can be predicted only from the two preceding locations taken together. These sequences were identical to those used by Destrebecqz and Cleeremans (2001) and Wilkinson and Shanks (2004). In each condition, half of the subjects were trained on SOC1 during the first 12 blocks and during blocks 14 and 15, and on SOC2 during block 13. This design was reversed for the other half of the subjects. Participants were then asked to perform a recognition task. Here, we used a procedure similar to that initially described in Shanks and Johnstone (1999) and later applied in Destrebecqz and Cleeremans (2001). Participants were presented with 24 fragments of six trials. Twelve were part of SOC1 and 12 were part of SOC2. Participants were asked to respond to the stimuli as in the SRT task, and then to provide a rating of how confident they were that the fragment was part of the training sequence. 
Ratings involved a six-point scale where 1 = “I am certain that this fragment was part of the training sequence”, 2 = “I am fairly certain that this fragment was part of the training sequence”, 3 = “I believe that this fragment was part of the training sequence”, 4 = “I believe that this fragment was not part of the training sequence”, 5 = “I am fairly certain that this fragment was not part of the training sequence”, and 6 = “I am certain that this fragment was not part of the training sequence”. It was emphasized to participants that they had to respond as fast as possible to the dots and that the person achieving the best recognition score would receive a $10 reward. Both ratings and reaction times were recorded. In each RSI condition, the value of the RSI in the recognition task was identical to the one used during training for half of the participants. For the other half of the participants, the RSI between any two trials in the recognition task varied randomly between 0 and 1000 ms in steps of 200 ms.
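As a check on the materials described above, the following Python sketch verifies that SOC1 and SOC2 each contain every ordered pair of distinct locations exactly once, so that a pair of consecutive locations determines the next one, and that the same pair leads to a different location in the two sequences. It also enumerates six-element fragments and draws a variable RSI. The function names are hypothetical, and treating the 12 fragments per sequence as the windows starting at each serial position of the repeating sequence is an assumption; the text does not spell out how the fragments were built.

```python
import random

SOC1 = "342312143241"
SOC2 = "341243142132"


def successors(seq):
    """Map each pair of consecutive locations to the set of locations that
    follow it, treating the sequence as an endlessly repeating cycle."""
    n = len(seq)
    table = {}
    for i in range(n):
        pair = seq[i] + seq[(i + 1) % n]
        table.setdefault(pair, set()).add(seq[(i + 2) % n])
    return table


def is_second_order_conditional(seq):
    """True if all 12 ordered pairs of distinct locations occur and each
    pair is always followed by the same single location."""
    table = successors(seq)
    return (
        len(table) == 12
        and all(pair[0] != pair[1] for pair in table)
        and all(len(nxt) == 1 for nxt in table.values())
    )


def fragments(seq, length=6):
    """Assumed construction: one fragment per starting position in the cycle."""
    n = len(seq)
    return ["".join(seq[(i + k) % n] for k in range(length)) for i in range(n)]


def variable_rsi(rng=random):
    """Draw an RSI between 0 and 1000 ms in steps of 200 ms."""
    return rng.choice(range(0, 1001, 200))
```

Because every pair is followed by a different location in the two sequences, no three-element run of SOC1 occurs in SOC2, which is what makes the fragments of one sequence "new" for participants trained on the other.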
Results and Discussion
Reaction Times
Figure 3.1 shows the average RTs obtained over the entire experiment, plotted separately for participants trained with a 0 ms or 1000 ms RSI. To analyze the data, we performed an analysis of variance (ANOVA) with Blocks (15 levels) as a within-subjects variable and with RSI (2 levels, 0 ms vs. 1000 ms) and Variability (2 levels, constant and variable RSI in the recognition task) as between-subjects variables.
Figure 3.1 Mean reaction times during the 15 blocks of the SRT task plotted separately for participants trained with a 0 ms or 1000 ms RSI. Block 13 is the transfer block during which another sequence was used
Inspection of the figure suggests that participants responded faster in the RSI 1000 condition, but also that sequence learning took place in both conditions, as suggested by the increase in RT at transfer (Block 13). These impressions are confirmed by the results of the ANOVA, which indicated a significant main effect of Blocks [F(14, 840) = 23.86, p < 0.0001, Mse = 37152.2] and RSI [F(1, 60) = 30.09, p < 0.0001, Mse = 3218925.86]. The Blocks X RSI interaction also reached significance [F(14, 840) = 6.74, p < 0.0001, Mse = 146844.6]. This interaction seems to be attributable to the different pattern of RT obtained over the first five blocks of training and during the transfer block (Block 13). When these five blocks are removed from the analysis, the interaction is no longer significant. This result suggests that the initial difference in learning rate resulting from the use of different RSIs tends to disappear after 500 trials of practice in the SRT task.
The main effect of Variability in this analysis and in the following analysis of variance, as well as all the interactions involving this factor, did not reach significance and will not be further reported. Potential differences between groups presented with constant or variable RSI in the recognition task cannot therefore be attributed to different levels of sequence learning achieved during training. Participants trained with a 1000 ms RSI responded faster than participants in the RSI 0 condition. However, the cost in RT due to replacing the training sequence with the transfer sequence in Block 13 seems similar in both RSI conditions. This impression is confirmed by an ANOVA performed on the differences between the transfer block (Block 13) and the average of RTs obtained for blocks 12 and 14. The ANOVA with RSI (2 levels) as a between-subject variable showed that the magnitude of transfer did not differ significantly between the RSI 0 (mean = 76.26 ms, Std. deviation = 61.84) and RSI 1000 conditions (mean = 59.76 ms, Std. deviation = 62.82) [F(1, 60) = 1.16, p = 0.285, Mse = 4357.3].
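The transfer cost analysed above is the RT in the transfer block minus the average RT of the two surrounding blocks. A minimal Python sketch follows; the function name is hypothetical and the block means are made up purely for illustration.

```python
def transfer_cost(block_means, transfer_block=13):
    """RT at the transfer block minus the mean RT of the blocks just
    before and just after it (blocks 12 and 14 in this experiment)."""
    neighbours = (
        block_means[transfer_block - 1] + block_means[transfer_block + 1]
    ) / 2.0
    return block_means[transfer_block] - neighbours


# Illustrative values only (ms); these are not data from the experiment.
example = {12: 400.0, 13: 480.0, 14: 410.0}
print(transfer_cost(example))  # 75.0
```

Computing this difference score per participant gives one value per subject, which is what allows the two RSI groups to be compared in a single between-subject analysis.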
Recognition Task
Figure 3.2 shows the mean recognition scores obtained for old and new six-element fragments presented to participants in the RSI 0 and RSI 1000 conditions. Recall that a low score (between 1 and 3) was expected for old fragments and a high score (between 4 and 6) was expected for new fragments. On average, responses ranged between 2.9 and 3.7. They were therefore at the lowest possible level of confidence for both correct endorsements and rejections. To analyze these data, we performed an ANOVA with Type of fragment (2 levels, old versus new) as a within-subject variable and Variability (2 levels) and RSI (2 levels) as between-subjects variables. This analysis revealed a significant main effect of Type of fragment [F(1, 60) = 15.9, p < 0.0001, MSe = 7.11], as well as a significant Type of fragment X RSI interaction [F(1, 60) = 1.79, p = 0.05, MSe = 4.01]. The main effect of Variability and all the other interactions did not reach significance. Inspection of Figure 3.2 suggests that recognition performance was better when the pace of the training task was slow rather than fast. However, performance was not affected by whether a constant or a random RSI was used at test. These impressions are confirmed by a series of two-tailed paired t-tests comparing the mean scores obtained for old and new fragments in each of the four groups of participants. When a fixed RSI was used, the difference between recognition scores obtained for old and new fragments was significant in the RSI 1000 condition [t(15) = 2.52, p < 0.05] but not in the RSI 0 condition [t(15) = 0.85, p > 0.4]. When a variable RSI was used, the same pattern of results was observed: participants recognized the old fragments in the RSI 1000 condition [t(15) = 3.31, p < 0.005], but not in the RSI 0 condition [t(15) = 1.4, p = 0.2]. In sum, we observed that participants trained with a 0 ms RSI were not able to differentiate between old and new six-element sequences. Participants trained with a 1000 ms
Figure 3.2 Mean recognition scores in RSI 0 and RSI 1000 conditions when either a constant (CST) or variable (VAR) RSI was used at test
RSI were able to make the distinction. On average, however, participants tended to respond at the centre of the scale, and the mean difference between scores given to the two types of fragments was only 0.47, suggesting that they lacked confidence in their ability to recognize sequence fragments. As recognition performance might also be influenced by perceptual and motor fluency, we analyzed RTs recorded in responding to old and new fragments in the recognition tasks. Examination of Figure 3.3 suggests that participants tended to respond faster to old than to new sequences in all groups. These impressions are confirmed by an ANOVA with RSI (2 levels) and Variability (2 levels) as between-subjects variables and Type of fragment (2 levels) as a within-subject factor. This analysis revealed a significant main effect of Type of fragment [F(1, 60) = 16.03, p < 0.001, MSe = 29668.96], RSI [F(1, 60) = 5.84, p = 0.019, MSe = 99967.36], and Variability [F(1, 60) = 15.28, p < 0.001, MSe = 261270.36]. No interaction reached significance. These results indicate that participants responded faster to old than to new fragments. They were also faster in the RSI 1000 (mean RT = 472.65 ms) than in the RSI 0 condition (mean RT = 535.85 ms) and responded faster when a constant (mean RT = 461.90 ms) rather than a variable RSI (mean RT = 546.65 ms) was used.
Figure 3.3 RTs recorded for old and new fragments presented in the recognition task to participants trained with a 0 ms or a 1000 ms RSI and plotted separately for fragments presented at test with either a constant (CST) or variable (VAR) RSI
The introduction of a variable RSI had a general slowing effect on RT but, contrary to what we had expected, did not interfere with the influence of perceptual and motor fluency in the recognition task. Participants responded faster to old than to new fragments even with a variable RSI. Interestingly, even though they responded faster to old fragments, participants in the RSI 0 condition did not identify them in the recognition task.
Discussion
The aim of the experiment presented here was twofold. First, we wanted to replicate the effect of slowing the pace of the SRT task on explicit sequence learning. Second, we wanted to control for the potential bias of perceptual and motor fluency by using random time intervals between any two trials in the recognition task. We were not successful in our attempt to prevent a perceptual-motor fluency effect during the recognition task. The mean RT to old fragments was reliably faster than to new fragments in all conditions. Therefore, we cannot exclude the possibility that participants based their responses in the recognition task on a feeling of perceptual-motor fluency, that is, on the perception that they responded faster to some fragments, and decided to classify these as “old” even though they did not recognize them consciously. Importantly, even though RSI 0 participants responded faster to old than to new fragments, they were unable to use the perception of fluency as a basis for identifying old fragments. This result suggests that, when the pace of the learning task is particularly sustained, the acquired knowledge is more implicit than in the RSI 1000 condition, in which participants had no difficulty in differentiating between old and new 6-element sequences. This result is in line with the notion that more time is needed in order to acquire “high quality” or strong representations of the sequential constraints that are under cognitive control and that can later be used in different experimental contexts and with different task demands such as recognition or generation (Cleeremans, 2008; Cleeremans and Jiménez, 2002; Destrebecqz and Cleeremans, 2001). It remains to be determined whether RSI 0 participants were aware of the fact that they responded faster to some of the fragments but were unable to use this conscious feeling as a basis for responding, or whether they did not consciously notice the fluency effect.
It also remains to be ascertained whether RSI 1000 participants noticed the fluency effect consciously and therefore strategically decided to classify the “fluent” fragments as “old”, or whether they automatically classified as “old” the fragments for which they consciously felt an effect of fluency. Previous studies have shown that participants could differentiate between old and new 6-element sequences and that recognition was usually superior to that obtained with three-trial sequences (Shanks, 2003; Shanks and Johnstone, 1999; Shanks and Perruchet, 2002; Shanks et al., 2003). Reliable recognition has also been reported with even longer sequence fragments or when the complete sequence was used in the recognition task (Curran, 1997). In our experiment, however, recognition of 6-element fragments remained at chance in the RSI 0 condition. We must also mention that, just like other researchers (Shanks et al., 2003), we observed successful recognition in other experiments in which a 0 ms RSI was used (Destrebecqz and Cleeremans, 2003). In this respect, the non-significant difference observed in this experiment between old and new 6-element sequences in the RSI 0 condition might appear surprising. A potential explanation could be that, if the representations developed in this latter condition are “weak”, increasing the length of the recognition fragments might bring about low quality, unreliable information and paradoxically result in lower performance. The dissociation we observed between priming and recognition has previously been reported by Shanks (2003). In his study, he showed that when old and new sequences
received the same recognition rating, old fragments were nevertheless executed more rapidly than new ones. Importantly, significant priming was observed even for fragments that were not recognized as old. Shanks insists, however, that such a dissociation cannot be interpreted as an indication that there exists a form of implicit learning which is independent of and neurally distinct from explicit learning. Accordingly, Shanks and Perruchet (2002) have shown that a dissociation between priming and recognition is not inconsistent with a model in which both processes depend on the same memory base (with the addition of a random quantity of noise). In accordance with Shanks and Perruchet’s demonstration, we do not argue that our results favour the existence of two independent learning systems. Rather, we argue for a functional interaction between different cortical (the anterior cingulate cortex) and sub-cortical (the striatum) brain regions subserving different computational objectives, in which information processing appears to be either accompanied by awareness or not (Destrebecqz et al., 2005). Our study is based on the notion that no cognitive ability can be considered process-pure; we therefore do not claim that sequence learning is exclusively implicit in RSI 0 conditions and exclusively explicit with higher RSI values. However, we argue that increasing the RSI results in more explicit knowledge because participants are given more time during the SRT task to develop explicit strategies (such as recoding and/or trial-and-error strategies) concerning the sequential regularities. In other words, we do not defend a position according to which sequence learning can be based on powerful but unconscious learning mechanisms. On the contrary, we believe that conscious knowledge is systematically associated with better performance.
However, while performance and consciousness are associated, they may also vary on a graded and continuous dimension. Learning conditions interact with participants’ ability to flexibly and easily recollect, control, and describe the knowledge they have acquired during a learning episode. One may debate what the two ends of this continuum should be called. Namely, one could argue that even the weakest and lowest-quality representation of some sequential regularity cannot be described as unconscious, as it is nevertheless represented in the cognitive system. While this might indeed be the case, we think that our results nevertheless suggest that some knowledge might influence behaviour while remaining undetected by demonstrably sensitive tests of awareness.
Acknowledgements
The author would like to thank Axel Cleeremans for his comments on a previous version of this chapter and Olivier Deville for his help in acquiring the data.
References
Benjamin, A., R. Bjork, and E. Hirshman. 1998. ‘Predicting the future and reconstructing the past: A Bayesian characterization of the utility of subjective fluency’, Acta Psychologica, 98 (2–3): 267–90.
Buchner, A. 1994. ‘Indirect effects of synthetic grammar learning in an identification task’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 20 (3): 550–66.
Cleeremans, A. 2008. ‘Consciousness: The radical plasticity thesis’, in R. Banerjee and B. Chakrabarti (eds), Models of Brain and Mind: Physical, Computational and Psychological Approaches, Progress in Brain Research, 168: 19–33.
Cleeremans, A. and J. L. McClelland. 1991. ‘Learning the structure of event sequences’, Journal of Experimental Psychology: General, 120 (3): 235–53.
Cleeremans, A. and L. Jiménez. 2002. ‘Implicit learning and consciousness: A graded dynamic perspective’, in R. M. French and A. Cleeremans (eds), Implicit learning and consciousness: An empirical, computational and philosophical consensus in the making? pp. 1–40. Hove, UK: Psychology Press.
Cohen, A. and T. Curran. 1993. ‘On tasks, knowledge, correlations, and dissociations: Comment on Perruchet and Amorim (1992)’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 19 (6): 1431–437.
Curran, T. 1997. ‘Effects of aging on implicit sequence learning: Accounting for sequence structure and explicit knowledge’, Psychological Research, 60 (1–2): 24–41.
Destrebecqz, A. and A. Cleeremans. 2001. ‘Can sequence learning be implicit? New evidence with the process dissociation procedure’, Psychonomic Bulletin & Review, 8 (2): 343–50.
Destrebecqz, A. and A. Cleeremans. 2003. ‘Temporal factors in sequence learning’, in L. Jiménez (ed.), Attention and implicit learning, pp. 181–213. Amsterdam and Philadelphia: John Benjamins.
Destrebecqz, A., P. Peigneux, S. Laureys, C. Degueldre, G. Del Fiore, J. Aerts, et al. 2005. ‘The neural correlates of implicit and explicit sequence learning: Interacting networks revealed by the process dissociation procedure’, Learning & Memory, 12 (5): 480–90.
Fu, Q., X. Fu, and Z. Dienes. 2008. ‘Implicit sequence learning and conscious awareness’, Consciousness and Cognition, 17 (1): 185–202.
Jacoby, L. and M. Dallas. 1981. ‘On the relationship between autobiographical memory and perceptual learning’, Journal of Experimental Psychology: General, 110 (3): 306–40.
Jiménez, L., J. M. M. Vaquero, and J. Lupiáñez. 2006. ‘Qualitative differences between implicit and explicit sequence learning’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 32 (3): 475–90.
Oppenheimer, D. and M. Frank. 2008. ‘A rose in any other font would not smell as sweet: Effects of perceptual fluency on categorization’, Cognition, 106 (3): 1178–194.
Perruchet, P. and M. A. Amorim. 1992. ‘Conscious knowledge and changes in performance in sequence learning: Evidence against dissociation’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 18 (4): 785–800.
Reber, A. S. 1969. ‘Transfer of syntactic structure in synthetic languages’, Journal of Experimental Psychology, 81 (1): 115–19.
Reber, A. S., R. Allen, and S. Reagan. 1985. ‘Syntactical learning and judgements, still unconscious and still abstract’, Journal of Experimental Psychology: General, 114 (1): 17–24.
Reber, R., P. Wurtz, and T. Zimmerman. 2004. ‘Exploring ‘fringe’ consciousness: The subjective experience of perceptual fluency and its objective bases’, Consciousness and Cognition, 13 (1): 47–60.
Reed, J. and P. Johnson. 1994. ‘Assessing implicit learning with indirect tests: Determining what is learned about sequence structure’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 20 (3): 585–94.
Reingold, E. M. and P. M. Merikle. 1988. ‘Using direct and indirect measures to study perception without awareness’, Perception and Psychophysics, 44 (6): 563–75.
Scott, R. and Z. Dienes. 2007. ‘No role for perceptual fluency in artificial grammar learning’, paper presented at the ESCOP, Marseille, France.
Shanks, D. R. 2003. ‘Attention, awareness, and implicit learning’, in L. Jiménez (ed.), Attention and implicit learning, Vol. 48, pp. 11–42. Amsterdam and Philadelphia: John Benjamins.
Shanks, D. R. and T. Johnstone. 1999. ‘Evaluating the relationship between explicit and implicit knowledge in a serial reaction time task’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 25 (6): 1435–451.
Shanks, D. R. and P. Perruchet. 2002. ‘Dissociating between priming and recognition in the expression of sequential knowledge’, Psychonomic Bulletin & Review, 9 (2): 362–67.
Shanks, D. R., L. Wilkinson, and S. Channon. 2003. ‘Relationship between priming and recognition in deterministic and probabilistic sequence learning’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 29 (2): 248–61.
Tunney, R. J. and D. R. Shanks. 2003. ‘Subjective measures of awareness and implicit cognition’, Memory & Cognition, 31 (7): 1060–071.
Vandenberghe, M., V. Gaillard, A. Destrebecqz, P. Fery, et al. submitted. ‘Is slowing down better than speeding up? Exploring the effects of temporal factors on age differences in sequence learning’.
Wilkinson, L. and D. R. Shanks. 2004. ‘Intentional control and implicit sequence learning’, Journal of Experimental Psychology: Learning, Memory, and Cognition, 30 (2): 354–69.
Chapter 4 Behavioural Study of the Effect of Trial and Error versus Supervised Learning of Visuo-motor Skills Ahmed, Raju S. Bapi, V. S. Chandrasekhar Pammi, K. P. Miyapuram, and Kenji Doya
INTRODUCTION
Acquiring a sequential skill typically requires compiling a number of elementary actions into a unique chain that forms the complete sequence. Human skill learning has been extensively studied (Thut et al., 1996; Perez et al., 2007; Rosenbaum, Kenny and Derr, 1983; Grafton et al., 1998; Jueptner et al., 1997a, 1997b; Sakai et al., 1998; Hikosaka et al., 1996, 2000; Bapi et al., 2000). As early as 1964, Fitts observed that subjects are more attentive in the initial phase of learning a skill and become more automatic in the later stages of learning, when attention can be engaged in other tasks, such as performance of dual tasks. Various aspects of skill learning have been studied in monkeys using a 2 × 5 sequence learning paradigm (summarized in Hikosaka et al., 1995, 1996) in which a sequence of 10 button presses is learned by trial and error. These experiments showed that as training progressed, monkeys improved on two measures of performance: errors (monkeys made fewer errors before attaining a success criterion) and response time (RT; the time taken to perform a sequence decreased with training). While errors reached a minimum level within a shorter period of training, RTs continued to improve over longer periods (Hikosaka et al., 1995, 1996). Bapi et al. (2000, 2006) followed this up in humans, studying whether skills are tied to the effectors or are abstract. They concluded that in the initial stage of learning an effector-independent representation in visual/spatial coordinates is formed, that this transforms into an effector-dependent representation in motor coordinates by the late stage, and that distinct brain areas subserve these two representations.
Another issue of importance in skill learning is the learning strategy or paradigm employed. Learning paradigms can be grouped broadly into two main categories: supervised and unsupervised, based on whether evaluative feedback was provided or not. Feedback signal provides an assessment of the performance of the system during the learning process. In supervised learning, we assume that the teacher provides the desired response at each instant of time that can be used to calculate the errors and make appropriate corrections in order to eventually achieve the desired target. In a variation of supervised learning called reinforcement learning, a coarse feedback indicating the quality of the output is provided without specifying the desired response itself. In unsupervised learning, the desired response is not known. Thus explicit error information cannot be used to improve behaviour in unsupervised learning. The system needs to discover the inherent regularities present in the inputs and self-organize the information. Thus adaptive sequential skills could be acquired using any one of the strategies, that is, supervised, reinforcement, or unsupervised learning. The existing studies were not explicitly designed to tease out differences between trial and error and guided modes of learning motor sequences. In the current study we investigated the effects of two learning paradigms: supervised (explicit guidance) and reinforcement (trial and error). We adopted a modified version of 2 × 5 visuo-motor sequence learning task (Hikosaka et al., 1995; Bapi et al., 2000) wherein subjects learned the same number of finger movements in two learning modes. Supervised learning mode involves learning the sequential dependency structure by following a series of visual cues. On the other hand, trial and error learning requires an active exploration of visual cues, working memory for previous choice of responses, evaluation of responses and learning the sequence structure (Figure 4.1). 
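The contrast between the two learning modes can be made concrete with a toy simulation. This is only a sketch under strong assumptions, not the authors' model: each of six sets is reduced to a binary choice between two key orders, the trial-and-error learner has perfect memory, and the function names are hypothetical.

```python
import random


def guided_attempts(sequence):
    """Supervised mode: every cue names the correct response directly."""
    responses = list(sequence)          # the learner just follows the cues
    assert responses == sequence
    return 1                            # one pass is enough to see the sequence


def trial_and_error_attempts(sequence, rng):
    """Reinforcement mode: feedback only says whether a response was right.

    Each set has two possible key orders, coded 0 or 1. The learner guesses,
    remembers what the feedback implies, and restarts after every error.
    Returns the number of attempts up to the first error-free pass.
    """
    memory = {}                         # set index -> order known to be correct
    attempts = 0
    while True:
        attempts += 1
        for index, correct in enumerate(sequence):
            guess = memory.get(index, rng.choice([0, 1]))
            if guess != correct:
                memory[index] = 1 - guess   # binary choice: the other order is right
                break                       # error: back to the first set
            memory[index] = guess
        else:
            return attempts


rng = random.Random(0)                  # fixed seed, for reproducibility only
hyper_set = [rng.choice([0, 1]) for _ in range(6)]   # hypothetical hyper set
print(guided_attempts(hyper_set), trial_and_error_attempts(hyper_set, rng))
```

With perfect memory, each failed attempt settles one more set, so this idealized learner needs at most seven attempts for six sets. Human learners, who must hold earlier choices in working memory, would be expected to need more.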
In this chapter, we emphasize issues related to analysis of learning related effects in the behavioural data. The aim is to recommend a methodology for characterizing a subject’s stage and extent of learning of a skill. We have also acquired functional MR images while subjects engaged in the sequence learning tasks. The imaging results will be discussed elsewhere.
MATERIALS AND METHODS
Subjects
Eighteen right-handed normal human subjects (16 males and 2 females, aged 23 to 28 years) participated in this study. Informed consent was obtained from all the subjects prior to the experiment. The ethics committee of the Brain Activity Imaging Center, ATRI, Japan approved the experimental protocol.
Behavioural Paradigm Subjects alternated between two conditions—test and control conditions. In the control condition subjects followed randomly generated visual targets and thus there was no
FIGURE 4.1 Sequence Learning Tasks
Note: Procedure for one trial in the sequence tasks. Top panel, 1 × 12 task. Bottom panel, 2 × 6 task. Numbers indicate the order of responses. Identical sequences are shown for clarity but subjects used different hyper sets for each task.
learning involved. In the test condition they learned a sequence of 12 key-presses arranged either as a 1 × 12 or as a 2 × 6 sequence. In the 1 × 12 task, subjects learned a fixed sequence of 12 finger movements explicitly guided by a series of visual cues presented one at a time (Figure 4.1a). In the 2 × 6 task (Figure 4.1b), subjects learned, by trial and error, the correct order of pressing two keys (called a set) successively for six times (called a hyper set). Subjects were given 0.8 seconds on average per key-press. However, they were
allowed to proceed immediately to the next set as soon as they completed one set. The presentation was reset to the beginning of the hyper set upon an error. Subjects performed two experiments, one with the 2 × 6 task and the other with the 1 × 12 task. The order of experiments was counterbalanced across subjects. Each experiment consisted of four sessions, and a session comprised 13 epochs of alternating control and test conditions, lasting 18 and 36 seconds, respectively. Functional images were acquired in a 1.5 tesla whole-body scanner while subjects performed the experiments. The focus of this chapter is on the analysis of behavioural results; the imaging results will be discussed elsewhere. The time taken to execute a set was recorded for all the trials in all of the epochs. Two performance measures, performance accuracy (PA) and response time (RT), were calculated from the recorded set completion times. PA is the percentage of trials in which all the sets of a trial (hyper set) were correctly executed. RT is the average key-press time over all the successful trials.
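The two measures defined above can be sketched in a few lines. The data layout (a list of dicts with `set_times` and `success`) and the function name `performance` are assumptions made for illustration; the chapter does not describe how the recordings were stored.

```python
def performance(trials, keys_per_set):
    """Return (PA in percent, mean key-press RT) for a list of trials.

    Each trial is a dict with "set_times" (seconds taken per set) and
    "success" (True if every set of the hyper set was executed correctly).
    """
    if not trials:
        raise ValueError("need at least one trial")
    successful = [t for t in trials if t["success"]]
    pa = 100.0 * len(successful) / len(trials)
    total_time = sum(sum(t["set_times"]) for t in successful)
    total_keys = sum(len(t["set_times"]) * keys_per_set for t in successful)
    # RT is undefined when no trial in the epoch succeeded.
    rt = total_time / total_keys if total_keys else float("nan")
    return pa, rt


# Hypothetical 2 x 6 epoch: three complete hyper sets and one aborted trial.
epoch = [
    {"set_times": [0.6] * 6, "success": True},
    {"set_times": [0.5] * 6, "success": True},
    {"set_times": [0.4] * 6, "success": True},
    {"set_times": [0.7, 0.9], "success": False},   # reset after an error
]
pa, rt = performance(epoch, keys_per_set=2)
print(pa, round(rt, 3))
```

For the 1 × 12 task the same function would be called with `keys_per_set=1` and twelve set times per trial.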
Behavioural Analysis
Performance improvement of subjects was assessed with two parameters, PA and RT, as discussed above. A summary of the performance measures recorded for the subjects is given in Table 4.1. A one-way ANOVA was performed on RTs from both the experiments to assess learning-related improvements (see Table 4.2 for a summary). Dendrograms were generated from the cumulative RTs of successful hyper sets using the complete-linkage clustering method with a Euclidean distance metric (Duda et al., 2001).
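As an illustration of the one-way ANOVA step, the sketch below computes the F statistic across sessions and compares it with the tabulated 5% critical value for df = (3, 20), which is approximately 3.10. The data are hypothetical. Treating each session as six epoch-wise mean RTs is an assumption that happens to give df = (3, 20); the chapter does not state the unit of analysis. The function names are illustrative.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


F_CRIT_3_20 = 3.10   # tabulated critical F for alpha = 0.05 and df = (3, 20)


def learning_group(sessions):
    """Label a subject from the session effect on RT (p < 0.05 criterion)."""
    f, df_b, df_w = one_way_anova(sessions)
    assert (df_b, df_w) == (3, 20), "critical value above only fits df = (3, 20)"
    return "continued-learning" if f > F_CRIT_3_20 else "mild-learning"


# Hypothetical epoch-wise mean RTs (seconds): 4 sessions x 6 test epochs.
improving = [[0.42, 0.40, 0.39, 0.41, 0.38, 0.40],
             [0.27, 0.26, 0.25, 0.27, 0.26, 0.25],
             [0.23, 0.22, 0.22, 0.23, 0.21, 0.22],
             [0.20, 0.19, 0.19, 0.20, 0.18, 0.19]]
flat = [[0.43, 0.41, 0.44, 0.42, 0.43, 0.41],
        [0.42, 0.44, 0.41, 0.43, 0.42, 0.44],
        [0.44, 0.42, 0.43, 0.41, 0.43, 0.42],
        [0.41, 0.43, 0.42, 0.44, 0.42, 0.43]]
print(learning_group(improving), learning_group(flat))
```

Comparing F with a critical value is equivalent to the p < 0.05 criterion only for the stated degrees of freedom; a statistics library would be needed to obtain exact p-values.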
RESULTS
A repeated-measures ANOVA indicated steady performance by all the subjects in the control task in both the experiments (results not shown here). Using the RT ANOVA results from Table 4.2, subjects were grouped into either a “continued-learning” (p < 0.05) or a “mild-learning” (p ≥ 0.05) group. The continued-learning group exhibited significant learning-related behavioural improvements across sessions. The mild-learning group, on the other hand, comprised those subjects who either did not learn at all or reached an asymptotic stage of learning too early. Based on these criteria, four subjects (AS, FB, PV, and ST) were classified as the mild-learning group in the 1 × 12 experiment; the other 14 subjects were in the continued-learning group. Similarly, 15 subjects showed continued learning in the 2 × 6 experiment, whereas the other three subjects (AS, JY, and KU) were in the mild-learning group. The RT and PA profiles and session-wise dendrograms of one of the subjects (AS) from the mild-learning group in the 1 × 12 experiment are shown in Figure 4.2. The RT profile seems to reach saturation in the first session itself, an indication that the subject might have reached an asymptotic performance level quite early in the task, and it can be seen
TABLE 4.1 Session-wise summary of performance of subjects for 1 × 12 and 2 × 6 experiment
1 × 12 experiment
Subject  Measure  S1               S2               S3               S4
AS       RT       0.130 ± 0.082    0.101 ± 0.043    0.070 ± 0.007    0.069 ± 0.003
AS       PA       61.76 ± 17.26    65.01 ± 27.82    86.81 ± 17.78    100.00 ± 0.00
BN       RT       0.408 ± 0.064    0.258 ± 0.017    0.224 ± 0.033    0.193 ± 0.024
BN       PA       81.67 ± 15.71    83.33 ± 15.06    65.63 ± 12.35    70.56 ± 17.44
CH       RT       0.285 ± 0.064    0.169 ± 0.115    0.096 ± 0.028    0.076 ± 0.012
CH       PA       50.00 ± 13.69    74.03 ± 23.22    65.67 ± 11.57    71.00 ± 17.29
EC       RT       0.336 ± 0.024    0.236 ± 0.041    0.219 ± 0.064    0.167 ± 0.114
EC       PA       56.11 ± 18.88    71.67 ± 25.24    83.97 ± 19.00    88.00 ± 10.86
FB       RT       0.213 ± 0.089    0.107 ± 0.025    0.164 ± 0.113    0.142 ± 0.124
FB       PA       75.40 ± 19.84    40.03 ± 18.39    67.53 ± 19.36    66.95 ± 22.95
HU       RT       0.419 ± 0.025    0.403 ± 0.006    0.396 ± 0.016    0.370 ± 0.007
HU       PA       81.94 ± 21.35    87.50 ± 20.92    87.50 ± 20.92    85.00 ± 23.45
JC       RT       0.372 ± 0.071    0.131 ± 0.053    0.097 ± 0.010    0.093 ± 0.009
JC       PA       58.61 ± 10.56    54.86 ± 17.35    71.83 ± 18.47    73.81 ± 13.82
JY       RT       0.537 ± 0.061    0.488 ± 0.025    0.448 ± 0.020    0.422 ± 0.029
JY       PA       88.89 ± 17.21    94.44 ± 13.61    95.83 ± 10.21    94.44 ± 13.61
KU       RT       0.256 ± 0.103    0.157 ± 0.098    0.128 ± 0.036    0.122 ± 0.025
KU       PA       52.36 ± 11.84    41.77 ± 24.65    57.37 ± 15.00    58.10 ± 19.01
NS       RT       0.216 ± 0.131    0.110 ± 0.017    0.091 ± 0.014    0.076 ± 0.012
NS       PA       77.48 ± 14.61    93.45 ± 7.20     75.00 ± 20.92    77.71 ± 14.65
PV       RT       0.131 ± 0.125    0.046 ± 0.002    0.050 ± 0.009    0.048 ± 0.005
PV       PA       62.73 ± 14.28    71.76 ± 11.44    81.48 ± 18.14    75.69 ± 18.96
RC       RT       0.197 ± 0.118    0.083 ± 0.012    0.068 ± 0.012    0.058 ± 0.012
RC       PA       74.25 ± 12.99    72.92 ± 12.29    78.15 ± 11.37    83.50 ± 5.62
ST       RT       0.426 ± 0.018    0.417 ± 0.028    0.463 ± 0.045    0.433 ± 0.025
ST       PA       86.11 ± 15.52    95.83 ± 10.21    94.44 ± 13.61    94.44 ± 13.61
TF       RT       0.331 ± 0.067    0.191 ± 0.037    0.146 ± 0.013    0.118 ± 0.015
TF       PA       64.80 ± 11.31    64.92 ± 28.73    67.06 ± 9.81     60.32 ± 26.59
TG       RT       0.175 ± 0.074    0.100 ± 0.031    0.099 ± 0.053    0.080 ± 0.010
TG       PA       39.86 ± 17.19    60.19 ± 11.17    60.19 ± 15.65    64.54 ± 23.77
WP       RT       0.234 ± 0.136    0.098 ± 0.016    0.105 ± 0.017    0.104 ± 0.003
WP       PA       51.51 ± 13.60    67.56 ± 16.23    77.68 ± 17.34    95.24 ± 7.38
WY       RT       0.307 ± 0.127    0.113 ± 0.011    0.086 ± 0.009    0.069 ± 0.006
WY       PA       59.72 ± 19.31    66.67 ± 11.66    70.27 ± 16.47    77.94 ± 17.12
YU       RT       0.167 ± 0.142    0.062 ± 0.015    0.058 ± 0.008    0.065 ± 0.010
YU       PA       60.19 ± 15.80    58.33 ± 6.80     69.68 ± 12.80    82.41 ± 11.71
2 × 6 experiment
Subject  Measure  S1               S2               S3               S4
AS       RT       0.233 ± 0.087    0.193 ± 0.023    0.179 ± 0.014    0.185 ± 0.018
AS       PA       59.40 ± 30.96    54.17 ± 7.88     70.24 ± 6.15     68.22 ± 12.26
BN       RT       0.470 ± 0.036    0.287 ± 0.032    0.227 ± 0.025    0.192 ± 0.014
BN       PA       33.70 ± 28.03    91.67 ± 13.94    100.00 ± 0.00    93.45 ± 10.69
CH       RT       0.342 ± 0.036    0.247 ± 0.032    0.212 ± 0.020    0.180 ± 0.014
CH       PA       40.83 ± 46.95    76.98 ± 15.11    86.21 ± 13.79    83.73 ± 16.06
EC       RT       0.345 ± 0.069    0.275 ± 0.040    0.208 ± 0.014    0.202 ± 0.020
EC       PA       43.25 ± 20.85    65.54 ± 30.42    72.95 ± 18.60    80.95 ± 12.96
FB       RT       0.370 ± 0.073    0.251 ± 0.013    0.238 ± 0.025    0.200 ± 0.015
FB       PA       37.30 ± 27.01    76.29 ± 15.76    74.21 ± 21.32    69.64 ± 21.16
HU       RT       0.423 ± 0.070    0.299 ± 0.022    0.244 ± 0.025    0.212 ± 0.028
HU       PA       45.71 ± 31.19    93.89 ± 9.53     72.64 ± 18.03    81.88 ± 18.82
JC       RT       0.250 ± 0.038    0.223 ± 0.012    0.219 ± 0.021    0.201 ± 0.016
JC       PA       57.64 ± 14.51    73.21 ± 16.52    79.66 ± 13.02    83.43 ± 15.99
JY       RT       0.443 ± 0.069    0.438 ± 0.038    0.419 ± 0.058    0.404 ± 0.022
JY       PA       43.06 ± 40.28    65.56 ± 23.13    83.33 ± 23.38    83.61 ± 13.60
KU       RT       0.269 ± 0.091    0.227 ± 0.015    0.215 ± 0.031    0.192 ± 0.028
KU       PA       58.43 ± 41.97    66.67 ± 21.03    79.46 ± 13.54    66.93 ± 13.13
NS       RT       0.322 ± 0.074    0.230 ± 0.030    0.203 ± 0.021    0.185 ± 0.029
NS       PA       58.89 ± 22.38    74.74 ± 12.32    69.68 ± 10.17    60.45 ± 19.53
PV       RT       0.172 ± 0.093    0.097 ± 0.006    0.091 ± 0.005    0.091 ± 0.005
PV       PA       47.20 ± 33.88    57.27 ± 15.10    70.63 ± 16.29    77.47 ± 10.32
RC       RT       0.291 ± 0.051    0.196 ± 0.022    0.153 ± 0.010    0.138 ± 0.016
RC       PA       35.32 ± 25.13    65.77 ± 7.86     73.15 ± 9.45     77.31 ± 15.75
ST       RT       0.379 ± 0.064    0.322 ± 0.020    0.295 ± 0.024    0.301 ± 0.020
ST       PA       68.33 ± 40.21    96.67 ± 8.16     96.67 ± 8.16     97.22 ± 6.80
TF       RT       0.226 ± 0.067    0.180 ± 0.022    0.144 ± 0.026    0.109 ± 0.011
TF       PA       54.59 ± 20.82    74.34 ± 18.76    77.73 ± 14.00    76.97 ± 12.23
TG       RT       0.263 ± 0.070    0.196 ± 0.033    0.185 ± 0.022    0.159 ± 0.015
TG       PA       33.76 ± 25.73    54.76 ± 18.36    51.72 ± 18.37    52.84 ± 12.07
WP       RT       0.241 ± 0.053    0.129 ± 0.017    0.111 ± 0.012    0.098 ± 0.006
WP       PA       60.99 ± 12.55    68.54 ± 15.77    81.11 ± 18.31    83.94 ± 10.13
WY       RT       0.272 ± 0.057    0.200 ± 0.018    0.177 ± 0.009    0.156 ± 0.014
WY       PA       63.00 ± 24.11    69.94 ± 9.74     83.33 ± 15.14    83.80 ± 6.84
YU       RT       0.199 ± 0.072    0.156 ± 0.007    0.155 ± 0.010    0.122 ± 0.005
YU       PA       66.77 ± 18.10    85.88 ± 12.40    97.92 ± 5.10     96.48 ± 5.46
Note: Each cell in the table gives the mean and standard deviation of RT (response time) and PA (performance accuracy) for the corresponding subject/session/expt.
TABLE 4.2 Summary of ANOVA results for 1 × 12 and 2 × 6 experiment. Each cell gives the p-value and, in parentheses, the degrees of freedom
Subject  1 × 12 RT        1 × 12 PA        2 × 6 RT         2 × 6 PA
AS       0.099884 (3,20)  0.005368 (3,20)  0.203223 (3,19)  0.361027 (3,20)
BN       0 (3,20)         0.161921 (3,20)  0 (3,19)         0.000002 (3,20)
CH       0.000117 (3,20)  0.101432 (3,20)  0 (3,18)         0.028548 (3,20)
EC       0.003686 (3,20)  0.03945 (3,20)   0.000017 (3,20)  0.03893 (3,20)
FB       0.308297 (3,20)  0.034222 (3,20)  0.000004 (3,19)  0.017736 (3,20)
HU       0.000226 (3,20)  0.965338 (3,20)  0 (3,19)         0.00516 (3,20)
JC       0 (3,20)         0.111977 (3,20)  0.018082 (3,20)  0.037187 (3,20)
JY       0.000176 (3,20)  0.828328 (3,20)  0.531735 (3,18)  0.052021 (3,20)
KU       0.018418 (3,20)  0.404083 (3,20)  0.091658 (3,19)  0.559732 (3,20)
NS       0.006942 (3,20)  0.167589 (3,20)  0.000129 (3,20)  0.333918 (3,20)
PV       0.080513 (3,20)  0.256828 (3,20)  0.014931 (3,19)  0.087273 (3,20)
RC       0.002192 (3,20)  0.362283 (3,20)  0 (3,19)         0.000823 (3,20)
ST       0.090693 (3,20)  0.584222 (3,20)  0.003513 (3,19)  0.071874 (3,20)
TF       0 (3,20)         0.953318 (3,20)  0.000231 (3,20)  0.084032 (3,20)
TG       0.015063 (3,20)  0.098586 (3,20)  0.00251 (3,19)   0.234985 (3,20)
WP       0.006546 (3,20)  0.000303 (3,20)  0 (3,20)         0.04156 (3,20)
WY       0.000008 (3,20)  0.306127 (3,20)  0.000015 (3,20)  0.076073 (3,20)
YU       0.041708 (3,20)  0.010461 (3,20)  0.014691 (3,20)  0.000491 (3,20)
that the PA profile continues to improve without any deterioration in the RT profile. The dendrograms do not show any interesting reorganization pattern, except that the average depth of the dendrogram decreases from 3.91 to 3.75. The same subject also happens to be in the mild-learning group in the 2 × 6 experiment. Figure 4.3 shows the RT and PA profiles and session-wise dendrograms for subject AS. The RT
FIGURE 4.2 Summary snapshot of a subject (AS) from “mild-learning” group of 1 × 12 experiment
Note: Top left: Epoch-wise mean response time with standard deviation for successful hyper sets. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hyper sets of total trials presented. Dendrograms for session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
FIGURE 4.3 Summary snapshot of a subject (AS) from “mild-learning” group of 2 × 6 experiment
Note: Top left: Epoch-wise mean response time with standard deviation for successful hyper sets. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hyper sets of total trials presented. Dendrograms for session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
profile is the same as discussed in the previous case above. The dendrograms for all four sessions are exactly the same, indicating perhaps no further reorganization of sequence information. Subjects in the mild-learning group were of two sub-types. The first sub-type reached an asymptotic level of RT very early (as in the previous two examples), and the second exhibited only a very mild improvement in RT, with RTs not significantly different from their baseline (control task) values. Figures 4.4 and 4.5 show examples of the second sub-type for the 1 × 12 and 2 × 6 experiments, respectively. One of the subjects (BN) is taken as a case study for continued learning in the 1 × 12 experiment. Figure 4.6 shows a snapshot of the performance. The dendrogram for session one has a tree-depth of five whereas that for session four has a depth of four, which is an indicator of internal reorganization and a change in the chunking pattern of the successive movements. Subject BN is also in the continued-learning group in the 2 × 6 experiment. Figure 4.7 shows the profiles. There is a gradual improvement in the RT and PA profiles. The dendrogram for session one has a tree-depth of four whereas those for sessions two to four have a depth of three. Again, this is an indication of internal reorganization. Although we have shown the dendrogram profiles of selected case studies only, we observed similar profiles among subjects within the same group.
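To make the tree-depth comparison concrete, the following sketch builds a complete-linkage merge tree over cumulative key-press times and reports its depth. It is an illustration only: the values are hypothetical, in one dimension the Euclidean distance reduces to an absolute difference, and "depth" is taken here to mean the number of merge levels between the root and the deepest leaf, which may differ from the authors' exact definition.

```python
from itertools import combinations


def complete_linkage_tree(values):
    """Cluster 1-D values agglomeratively; return the merge tree.

    Leaves are indices into `values`; each merge is a 2-tuple of subtrees.
    """
    clusters = [([i], i) for i in range(len(values))]   # (members, subtree)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: distance between the two farthest members.
            dist = max(abs(values[i] - values[j])
                       for i in clusters[a][0] for j in clusters[b][0])
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        merged = (clusters[a][0] + clusters[b][0],
                  (clusters[a][1], clusters[b][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][1]


def tree_depth(tree):
    """Number of merge levels between the root and the deepest leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(tree_depth(tree[0]), tree_depth(tree[1]))


# Hypothetical cumulative key-press times (seconds) for four key presses.
chunked = [0.0, 0.1, 1.0, 1.3]      # two fast pairs separated by a pause
unchunked = [0.0, 1.0, 3.0, 7.0]    # steadily growing gaps, no pairing
print(tree_depth(complete_linkage_tree(chunked)),
      tree_depth(complete_linkage_tree(unchunked)))   # prints "2 3"
```

In this toy case, key presses grouped into chunks yield a shallower, more balanced tree than key presses with no grouping, which is the kind of change the depth comparison across sessions is meant to capture.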
DISCUSSION
In the present study we investigated differences between learning a sequence of finger movements in the supervised and trial-and-error paradigms. The former involves learning by following a sequence of explicit visual cues. In contrast, the latter involves exploring visual cues to make a choice and then evaluating the choice with the help of external feedback. Both learning paradigms, however, require working memory and executive processes for learning the sequential dependencies and establishing the motor sequence. The focus of this chapter is to highlight issues related to the processing of behavioural results of a learning experiment. We performed a one-way ANOVA on the RT measure and grouped the subjects into “continued-learning” and “mild-learning” categories (see Table 4.2). However, the ANOVA results do not conclusively establish the underlying learning profile of the subjects, as discussed below. To bring out the aspects of information organization, we performed clustering analysis on the cumulative RT values from successful (complete) hyper sets (see Figures 4.2–4.7). We have selected two kinds of cases, one that belongs to the “mild-learning” group (Figures 4.2–4.5) and the other that belongs to the “continued-learning” group (Figures 4.6–4.7), from both the 1 × 12 and 2 × 6 experiments. Figures 4.2 and 4.3 show the case where a non-significant ANOVA result is due to the subject exhibiting asymptotic level of
FIGURE 4.4 Summary snapshot of a subject (ST) from “mild-learning” group of 1 × 12 experiment
Note: Top left: Epoch-wise mean response time with standard deviation for successful hyper sets. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hyper sets of total trials presented. Dendrograms for session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
66 FIGURE 4.5
Note:
Ahmed et al.
Summary snapshot of a subject (JY) from “mild-learning” group of 2 × 6 experiment
Top left: Epoch-wise mean response time with standard deviation for successful hypersets are plotted. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hypersets of total trials presented. Dendograms for, session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
Trial and Error versus Supervised Learning FIGURE 4.6
Note:
67
Summary snapshot of a subject (BN) from “continued-learning” group of 1 × 12 experiment
Top left: Epoch-wise mean response time with standard deviation for successful hypersets are plotted. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hypersets of total trials presented. Dendograms for, session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
68 FIGURE 4.7
Note:
Ahmed et al.
Summary snapshot of a subject (BN) from “continued-learning” group of 2 × 6 experiment
Top left: Epoch-wise mean response time with standard deviation for successful hypersets are plotted. The vertical dotted lines are session markers. The horizontal dotted line is plotted at 0.8 sec (the max allowed key-press time). Top right: Epoch-wise performance accuracy plotted as percentage successful hypersets of total trials presented. Dendograms for, session 1 (middle left), session 2 (middle right), session 3 (bottom left), and session 4 (bottom right).
Trial and Error versus Supervised Learning
69
performance right from the very early stage (from session one itself) in 1 × 12 and 2 × 6 tasks, respectively. On the other hand, Figures 4.4 and 4.5 show the case where subjects’ RT performance does not seem to indicate improvement as compared to baseline performance (baseline values not shown here). Interestingly, the results in Figure 4.5 indicate that the subject traded off RT for accuracy. Consequently, ANOVA results indicate nonsignificant improvement for RT but near-significant improvements for PA for subject JY (see Table 4.2). The dendogram analysis of the “continued-learning” group as shown in Figures 4.6 and 4.7 clearly establishes the fact that learning progressed well in these subjects. In this case, both the ANOVA and dendogram results coincide. In summary, it is clear that dendogram analysis gives meta-information related to learning dynamics that is not easily decipherable from ANOVA results alone. Thus we recommend this supplementary analysis approach for characterizing learning-related effects in behavioural data. It has been demonstrated earlier that dendogram-based clustering analysis brings out the hierarchical structure that subjects have internalized while learning sequential motor skills (Pammi et al., 2004; Miyapuram et al., 2006). In addition, in this chapter we propose that cluster analysis enables demarcating learning stages as well as grouping subjects based on the learning dynamics. It is expected that such grouping criteria will enable proper interpretation of the functional imaging results.
REFERENCES
Bapi, R. S., K. Doya, and A. M. Harner. 2000. ‘Evidence for effector independent representations and their differential time course of acquisition during motor sequence learning’, Experimental Brain Research, 132: 149–62.
Bapi, R. S., K. P. Miyapuram, F. X. Graydon, and K. Doya. 2006. ‘fMRI investigation of cortical and subcortical networks in the learning of abstract and effector-specific representations of motor sequences’, NeuroImage, 32 (2): 714–27.
Doya, K. 1999. ‘What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex’, Neural Networks, 12: 961–74.
Duda, R. O., P. E. Hart, and D. G. Stork. 2001. Pattern Classification, 2nd edition. New York: John Wiley & Sons.
Fitts, P. M. 1964. ‘Perceptual-motor skill learning’, in A. W. Melton (ed.), Categories of Human Learning, pp. 243–85. New York: Academic Press.
Grafton, S. T., E. Hazeltine, and R. B. Ivry. 1998. ‘Abstract and effector-specific representations of motor sequences identified with PET’, The Journal of Neuroscience, 18: 9420–428.
Hikosaka, O., M. K. Rand, S. Miyachi, and K. Miyashita. 1995. ‘Learning of sequential movements in the monkey-process of learning and retention of memory’, Journal of Neurophysiology, 74 (4): 1652–661.
Hikosaka, O., S. Miyachi, K. Miyashita, and M. K. Rand. 1996. ‘Learning of sequential procedures in monkeys’, in J. R. Bloedel, T. J. Ebner, and S. P. Wise (eds), The Acquisition of Motor Behaviour in Vertebrates, pp. 303–17. Cambridge, MA: MIT Press.
Hikosaka, O., K. Sakai, H. Nakahara, X. Lu, S. Miyachi, K. Nakamura, and M. K. Rand. 2000. ‘Neural mechanisms for learning of sequential procedures’, in M. Gazzaniga (ed.), The New Cognitive Neurosciences, 2nd edition, pp. 553–72. Cambridge, MA: MIT Press.
Jueptner, M., K. M. Stephan, C. D. Frith, D. J. Brooks, R. S. J. Frackowiak, and R. E. Passingham. 1997a. ‘Anatomy of motor learning. I. Frontal cortex and attention and action’, Journal of Neurophysiology, 77 (3): 1313–324.
Jueptner, M., C. D. Frith, D. J. Brooks, R. S. J. Frackowiak, and R. E. Passingham. 1997b. ‘Anatomy of motor learning. II. Subcortical structures and learning by trial and error’, Journal of Neurophysiology, 77 (3): 1325–337.
Miyapuram, K. P., R. S. Bapi, V. S. C. Pammi, Ahmed, and K. Doya. 2006. ‘Hierarchical Chunking during Learning of Visuomotor Sequences’, Proceedings of the IEEE World Congress on Computational Intelligence (WCCI ’06), Vancouver, Canada, IJCNN 2006, pp. 249–53.
Pammi, V. S. C., K. P. Miyapuram, R. S. Bapi, and K. Doya. 2004. ‘Chunking phenomenon in complex sequential skill learning in humans’, Lecture Notes in Computer Science, Vol. 3316: Proceedings of the 11th International Conference on Neural Information Processing (ICONIP 2004), N. R. Pal, N. Kasabov, R. K. Mudi, S. Pal, and S. K. Parui (eds), Heidelberg: Springer-Verlag, pp. 294–99.
Perez, M. A., S. Tanaka, S. P. Wise, N. Sadato, H. C. Tanabe, D. T. Willingham, and L. G. Cohen. 2007. ‘Neural substrates of intermanual transfer of a newly acquired motor skill’, Current Biology, 17 (21): 1896–902.
Rosenbaum, D. A., S. B. Kenny, and M. A. Derr. 1983. ‘Hierarchical control of rapid movement sequences’, Journal of Experimental Psychology: Human Perception and Performance, 9 (1): 86–102.
Sakai, K., O. Hikosaka, S. Miyauchi, R. Takino, Y. Sasaki, and B. Putz. 1998. ‘Transition of brain activation from frontal to parietal areas in visuomotor sequence learning’, The Journal of Neuroscience, 18 (5): 1827–840.
Thut, G., N. D. Cook, M. Regard, K. L. Leenders, U. Halsband, and T. Landis. 1996. ‘Intermanual transfer of proximal and distal motor engrams in humans’, Experimental Brain Research, 108 (2): 321–27.
Chapter 5
ACE (Actor–Critic–Explorer) Paradigm for Reinforcement Learning in Basal Ganglia: Highlighting the Role of the Indirect Pathway
Denny Joseph, Garipelli Gangadhar, and V. Srinivasa Chakravarthy
Introduction
In the mammalian brain, learning by reinforcement is a function of deep brain nuclei known as the Basal Ganglia (BG). It is now believed that the BG uses reward information to modulate sensory–motor pathways so as to render future behaviours more rewarding. Although exploration and exploitation are equally important in a Reinforcement Learning (RL) framework, the literature concerned with the role of the BG in RL seems to focus mainly on the reward signal (its chemical messenger, its anatomical site, and its consequences in learning) and presents only a summary treatment of exploration. Even in studies where exploration is discussed, no anatomical site for the exploratory signal is hypothesized (Montague et al., 1995). It is often said that the activity of the dopaminergic cells in the Substantia Nigra pars compacta (SNc) and/or the Ventral Tegmental Area (VTA) signals reward (Montague et al., 1995). But which part of the BG generates the stochastic signal necessary for exploration? This chapter puts forward the view that the Subthalamic Nucleus and Globus Pallidus externa (STN–GPe) loop in the BG is the site for exploratory behaviour.

In this chapter we present a model of the BG in which nearly every nucleus of the BG is incorporated. In particular, the model highlights the role of the STN–GPe loop in exploration. The model is attached to a simple arm model that is trained to reach a small number of targets. It will be shown that complex activity of the STN–GPe loop is essential for the arm to learn to make a successful reach. It will also be shown that training the model under dopamine-deficient conditions progressively results in a defective reach, with the arm displaying a Parkinsonian-like tremor.
The chapter is outlined as follows: Section 2 describes the proposed BG model; Section 3 describes the results of simulations of the model. A discussion of the results is presented in the following section.
Basal Ganglia Model
The BG consists of five extensively connected subcortical nuclei: the Caudate nucleus, the Putamen, the Globus Pallidus (externa, GPe; interna, GPi), the STN, and the Substantia Nigra (pars compacta, SNc; pars reticulata, SNr). The input nucleus of the BG is the Striatum (Caudate + Putamen). Axons of dopaminergic neurons in the SNc project onto the striatum. There are two pathways from the striatum to the Globus Pallidus interna (GPi), one of the output nuclei of the BG. In the direct pathway, neurons in the striatum project directly onto the GPi. The other pathway, namely the indirect pathway, connects the striatum, GPe, STN, and GPi in that order. There also exist excitatory and inhibitory recurrent connections between the STN and the GPe.
Model Architecture
The proposed BG model (Figure 5.1) consists of three main components: the Actor, the Critic, and the Explorer. The Actor represents the sensory–motor cortical pathway, where the results of learning a motor task are consolidated. The Critic represents the striatum or the corticostriatal network. The Explorer, which is the new element in our BG model, represents the STN–GPe loop. The ACE model is coupled to a simple two-joint, four-muscle arm model. The goal of the network is to learn to reach a given target, out of eight targets located on a circle within the arm’s workspace, when instructed to do so [Figure 5.2(c)]. The arm model consists of two links with four muscles, as shown in Figure 5.2(b). A single joint controlled by two muscles is shown in Figure 5.2(b). For the realization of the reaching task, it is assumed that the muscles of the arm model are driven by the neural activation from the Actor. A simple muscle model is assumed, consisting of a spring and damper system as shown in Figure 5.2(a). A more detailed description of the arm model may be found elsewhere (Joseph, 2008).
Actor, Critic, and Explorer
The Actor performs actions on the environment so as to achieve a desired goal. It receives state information, ξA, from the environment and performs an action, G = [ga, gb]. The Actor is modelled as a Perceptron Neural Network (Haykin, 1998), with eight input nodes representing the eight targets to be reached, and two output nodes, which generate the muscle activations needed for the Arm Model at Joint-A and Joint-B as ga and gb, respectively [Figure 5.3(a)]. The input to the network is given by,
ξA = {1, −1, −1, −1, −1, −1, −1, −1} for target 1, for example, which communicates to the Actor its present goal. The position of the single +1 in the eight-dimensional vector specifies the target to be reached. The output of the Actor is computed as a weighted summation of the input, ξA, and the corrective signal from the BG network, and is given by,

\[ g_a = \frac{1}{1 + e^{-\lambda_a (W_A^a \xi_A + \Delta g_a)}} \tag{1} \]

\[ g_b = \frac{1}{1 + e^{-\lambda_a (W_A^b \xi_A + \Delta g_b)}} \tag{2} \]

where $\Delta g_a$ and $\Delta g_b$ are the corrective signals from the BG network, $W_A^a$ and $W_A^b$ are the weights connecting the input layer, $\xi_A$, to the muscle activation layer, $G = [g_a, g_b]$, and $\lambda_a$ ( = 0.6) is the slope of the nonlinearity. The muscle activations cause the Arm Model to settle at a point in the workspace, which represents the equilibrium position of the arm dynamics.

The function of the Critic is to “assign value”. The Critic assigns each output of the Actor a Value; this Value depends on the goal of the Actor. Accurate training of the Critic is paramount, since it provides the gradient information necessary for guiding the arm as it traverses the workspace searching for the target.

Figure 5.1 ACE Architecture
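The Actor output in Equations 1 and 2 can be illustrated with a minimal Python sketch. The function names, the zero-valued weights and the example corrective signal are assumptions made for illustration only; the slope of 0.6 and the ±1 target coding are taken from the text.

```python
import math

def target_vector(target, n_targets=8):
    """Target selection vector: +1 at the chosen target, -1 elsewhere."""
    return [1.0 if k == target else -1.0 for k in range(1, n_targets + 1)]

def actor_output(weights, xi_a, delta_g, slope=0.6):
    """Sigmoid of the weighted input plus the corrective signal (Equations 1 and 2)."""
    net = sum(w * x for w, x in zip(weights, xi_a)) + delta_g
    return 1.0 / (1.0 + math.exp(-slope * net))

xi_a = target_vector(1)      # input for target 1
w_a = [0.0] * 8              # hypothetical untrained weights for Joint-A
g_a = actor_output(w_a, xi_a, delta_g=0.0)
```

With zero weights and no corrective signal the activation sits at the midpoint of the sigmoid; a positive corrective signal from the Explorer pushes it above that midpoint.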
Figure 5.2 Simple Muscle Model System
Note: (a) A simple muscle model based on a spring–damper system, (b) A single link configuration with an agonist and an antagonist muscle, (c) The shaded region represents the workspace of the arm.
A Radial Basis Function (RBF) network (Kumar, 2004) is used to implement the Critic. It receives as input a concatenated vector consisting of the target selection vector, which is the current input to the Actor, and the muscle activation, G = [ga, gb], output by the Actor when instructed to reach that target. Thus, the input to the Critic network is given by ξC = [ξA, ga, gb]. The network has one hidden layer, consisting of h ( = 7) Radial Basis Functions, each of which is a Gaussian with standard deviation σ ( = 2). The output layer consists of a single neuron, whose output is the “Value”, Q(t) (Figure 5.4).

The BG network has three stages: an input stage, representing the Striatum; a hidden layer (actually a double layer), representing the STN–GPe loop; and an output layer, representing the GPi (linear stage) (Figure 5.5).
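The forward pass of such a Critic can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian is taken as exp(−d²/2σ²), the output neuron as a plain weighted sum, and the single centre and unit weight are hypothetical (the chapter's network uses seven hidden units whose centres are not given here).

```python
import math

def rbf_value(xi_c, centres, out_weights, sigma=2.0):
    """Value estimate: weighted sum of Gaussian responses to the Critic input."""
    q = 0.0
    for centre, w in zip(centres, out_weights):
        dist_sq = sum((x - c) ** 2 for x, c in zip(xi_c, centre))
        q += w * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return q

xi_a = [1.0] + [-1.0] * 7
xi_c = xi_a + [0.4, 0.7]          # Critic input: target vector plus (g_a, g_b)
centres = [xi_c]                  # hypothetical single centre
weights = [1.0]                   # hypothetical output weight
q_at_centre = rbf_value(xi_c, centres, weights)
q_away = rbf_value(xi_a + [4.4, 0.7], centres, weights)
```

The Value is largest when the muscle activations coincide with a centre and falls off smoothly as they move away from it.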
Figure 5.3 (a) Architecture of the Actor Network (b) The 2D Arm and the Targets to be reached

Figure 5.4 Architecture of the Critic Network

Figure 5.5 Architecture of the Explorer
The Explorer network input, $\xi_E$, is given as,

\[ \xi_E = [\xi_A, g_a, g_b] \tag{3} \]

where $\xi_A$ is the input to the Actor and $[g_a, g_b]$ is the output of the Actor. The processing of the input among these stages may be described by the following equations:

\[ I^{GPe} = W_E^1 \xi_E \tag{4} \]

where $I^{GPe}$ is the input to the GPe network (described below) and $W_E^1$ represents the weights connecting the input layer to the GPe network. The networks of STN and GPe are implemented in a 2D grid fashion (Equations 10 to 12). The output, $\Delta g = [\Delta g_a, \Delta g_b]$, of the Explorer is a corrective signal to the Actor and is given by:

\[ \Delta g_a = W_{Ea}^2 U^{STN} \tag{5} \]

\[ \Delta g_b = W_{Eb}^2 U^{STN} \tag{6} \]

where $U^{STN}$ is the output of the STN layer, and $W_{Ea}^2$ and $W_{Eb}^2$ represent the weights connecting the STN layer to the output stage. The outputs $\Delta g_a$ and $\Delta g_b$ are corrective signals used to modulate the output of the Actor.
STN–GPe Layer in the Model
A pair of neuron layers, connected in excitatory–inhibitory fashion, represents the STN–GPe system. A single STN–GPe neuron pair is shown in Figure 5.6(a), with its glutamatergic and GABAergic connections drawn as excitatory and inhibitory connections. The dynamics of the GPe neuron are given by:

\[ \frac{dx}{dt} = -x + V + s + I \tag{7} \]

\[ V = \tanh(\lambda x) \tag{8} \]

where $x$ denotes the state of the GPe neuron, $I$ is the external input to the neuron, $s$ is the state of the STN neuron, $V$ denotes the output of the GPe neuron, and $\lambda$ ( > 1) controls the slope of the sigmoidal activation function. The dynamics of the STN neuron are given by:

\[ \frac{ds}{dt} = -s - V \tag{9} \]
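Equations 7 to 9 can be integrated numerically to see the behaviour of a single neuron pair. The sketch below uses forward Euler integration; the slope λ = 3, the step size, the run length and the initial state are all assumed values chosen for illustration, not parameters reported in the chapter.

```python
import math

def simulate_pair(lam=3.0, ext_input=0.0, dt=0.01, steps=20000, x0=0.1, s0=0.0):
    """Forward-Euler integration of one STN-GPe pair; returns the GPe output V over time."""
    x, s = x0, s0
    outputs = []
    for _ in range(steps):
        v = math.tanh(lam * x)           # Equation 8
        dx = -x + v + s + ext_input      # Equation 7
        ds = -s - v                      # Equation 9
        x += dt * dx
        s += dt * ds
        outputs.append(v)
    return outputs

def count_sign_changes(values):
    """Number of times consecutive samples have opposite signs."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

v_trace = simulate_pair()
n_flips = count_sign_changes(v_trace)
```

With these assumed values the output V keeps switching sign instead of settling, which is the toggling behaviour the equations are meant to produce.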
Note that while x has an inhibitory influence on s, s in turn excites x. Such an excitatory–inhibitory pair is a standard recipe for producing oscillations. Analysis of Equations 7 to 9 shows that, for I = 0, s = 0, λ > 1, V in Equation 8 has two stable states, V = ±1. Moreover, if V is at a negative (positive) stable state, a sufficiently large positive (negative) s (Equation 9) flips V to its positive (negative) stable state. In Equation 9, s simply follows −V with a delay. Therefore, a persistent value of V induces a change in s such that V is toggled periodically. Oscillations are produced by the above system, but only within certain limits of the external input, I.

Figure 5.6 (a) STN–GPe neuron pair illustrating the excitatory and inhibitory connections, (b) Network model of STN–GPe loop with lateral connections

The pair of neurons discussed above is replicated and connected in a 2D grid fashion to realize the STN–GPe layer, as shown in Figure 5.6(b). The connections between these nuclei are assumed to be one-to-one, with lateral connections included in the GPe layer and no lateral connections in the STN layer. There exist one-to-one connections from the STN to the GPe, and vice versa. The lateral connection strengths of the GPe network are calculated using Equation 13. The dynamics of the network model of the STN–GPe loop are given by:

\[ \frac{dx_{ij}}{dt} = -x_{ij} + \sum_{q=1}^{n} \sum_{p=1}^{n} w_{ij,pq} V_{pq} + s_{ij} + I_{ij}^{Ne} \tag{10} \]

\[ V_{ij} = \tanh(\lambda x_{ij}) \tag{11} \]

\[ \frac{ds_{ij}}{dt} = -s_{ij} - V_{ij} \tag{12} \]

where $(i, j)$ and $(p, q)$ denote coordinates of neurons on the 2D grid, $n$ is the size of the 2D grid, $x_{ij}$ is the state of the $(i, j)$th neuron on the GPe grid, $s_{ij}$ is the state of the $(i, j)$th neuron on the STN grid, and $V_{ij}$ is the output of the $(i, j)$th neuron on the GPe grid. The lateral connections within the GPe layer are assumed to be translation invariant and are calculated as:

\[ W_{ij,pq}^{GPe} = \begin{cases} \varepsilon - a \exp(-r_{lat}^2 / \sigma_{lat}^2), & r_{lat} < R \\ 0, & \text{otherwise} \end{cases} \tag{13} \]

where $r_{lat} = [(i - p)^2 + (j - q)^2]^{1/2}$ denotes the distance between neurons on the 2D grid, $a$ controls the depth of the Gaussian bell function, $\sigma_{lat}$ its width, and $R$ is the neighbourhood size. Thus, each unit has a negative centre and a positive surround, with the relative sizes of the centre and surround determined by $\varepsilon$. Smaller $\varepsilon$ implies more negative lateral GPe connections.
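The lateral connection profile of Equation 13 can be written as a small Python function. This is a sketch under assumptions: the distance is taken as the Euclidean distance on the grid, a = 2 follows the value quoted in the text, and the values of ε, σ_lat and R used in the example are hypothetical.

```python
import math

def lateral_weight(i, j, p, q, eps, a=2.0, sigma_lat=1.0, radius=3.0):
    """Lateral GPe weight between grid positions (i, j) and (p, q), following Equation 13."""
    r_lat = math.hypot(i - p, j - q)     # assumed Euclidean grid distance
    if r_lat < radius:
        return eps - a * math.exp(-(r_lat ** 2) / (sigma_lat ** 2))
    return 0.0

w_centre = lateral_weight(0, 0, 0, 0, eps=0.5)    # zero distance: eps - a
w_surround = lateral_weight(0, 0, 0, 2, eps=0.5)  # inside R, Gaussian nearly decayed
w_outside = lateral_weight(0, 0, 3, 4, eps=0.5)   # distance 5, beyond R
```

The example shows the negative centre and positive surround described above: the weight is negative at zero distance, turns positive once the Gaussian term has decayed, and is zero beyond the neighbourhood radius.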
In the absence of input from the input layer (that is, Iij = 0), as ε is varied from 0 to a, the activity of the STN–GPe layer exhibits three different regimes: (a) Uncorrelated activity, (b) Travelling waves, and (c) Clustering (Figure 5.7). Terman et al. (2002) also observed similar dynamic behaviour in their more detailed electrophysiological model of the STN–GPe system. Operation of the network in the first regime, namely uncorrelated activity, is most crucial, since it is uncorrelated activity in the STN–GPe layer that allows the network to explore the space of possible actions extensively. The three activity regimes (from left to right) are obtained by progressively increasing ε from 0 to a ( = 2). Increasing ε increases the percentage of positive lateral connections in the GPe. In the Clustering regime [Figure 5.7(c)], the array splits into a centre and a surround, with neurons in either region forming a synchronized cluster.

In this chapter we put forth an extension to Doya’s (Doya, 2002) suggestion that norepinephrine levels in the GP control the ‘temperature’ parameter in RL exploration. In line with this perspective on the role of norepinephrine in exploration, in the present model we assume that norepinephrine activity controls the activity level of STN–GPe neurons, and that the norepinephrine activity is in turn dependent on dopamine levels. Accordingly, we designate a quantity called DNe, which signifies the level of norepinephrine in the STN–GPe loop and which also depends on the level of dopamine. We also assume that DNe controls the overall activity level (number of active neurons) in the STN. Further, we assume that the precise form of the dependence of DNe on dopamine is given by:
\[ D_{Ne} = A_1 \left( 1 - \frac{1}{1 + e^{-(\alpha \delta(t) - \beta)}} \right) + A_2 \tag{14} \]

The value of $D_{Ne}$ decreases from a maximum value of $(A_1 + A_2)$ to a minimum value of $A_2$ with increase in the value of $\delta$. The parameters $\alpha$ and $\beta$ control the slope and the bias of the non-linearity, respectively.

Figure 5.7 Dynamics of the STN–GPe Loop: Three characteristic patterns of activity in the STN–GPe layer: (a) Uncorrelated activity, (b) Travelling waves, and (c) Clustering
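The mapping from δ to DNe in Equation 14 can be sketched in Python. The sketch assumes that the exponent carries a negative sign, so that DNe falls from A1 + A2 to A2 as δ grows, as the text states; the default parameter values are those quoted for the solid curve in the note to Figure 5.11, and the function name is hypothetical.

```python
import math

def d_ne(delta, a1=45.0, a2=5.0, alpha=3.0, beta=0.33):
    """Norepinephrine level as a decreasing sigmoidal function of the TD error delta."""
    sigmoid = 1.0 / (1.0 + math.exp(-(alpha * delta - beta)))
    return a1 * (1.0 - sigmoid) + a2

levels = [d_ne(d) for d in (-5.0, 0.0, 5.0)]   # sample the curve at three delta values
```

Under this assumption, strongly negative δ drives DNe towards A1 + A2 and strongly positive δ drives it towards A2.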
Now, a negative feedback loop to the STN–GPe layer is designed to control the activity level (number of active neurons in the STN) of the STN–GPe loop using the DNe signal. This is achieved as follows:

\[ D_a = \frac{1}{2} \sum_{ij}^{N} (U_{ij}^{STN} + 1) \tag{15} \]

\[ e = D_{Ne} - D_a \tag{16} \]

\[ \tau \frac{dE}{dt} = \tanh(\lambda_g e) \tag{17} \]

\[ I_{ij}^{Ne} = E - \frac{N}{2} \tag{18} \]

where $D_a$ is the actual number of active units, $N$ is the number of neurons in the STN layer, and $e$ denotes the discrepancy between the actual number of active units, $D_a$, and the $D_{Ne}$ level at any given instant. This discrepancy is accumulated in $E$. The parameter $\lambda_g$ ( = 10) controls the slope of the ‘tanh’ function and $\tau$ is the time constant of the feedback loop. $I_{ij}^{Ne}$ is the input to the GPe neuron, as shown in Equation 10.

The value of DNe controls not only the average activity of the STN layer but also its ‘complexity’. For example, let us assume a 10 × 10 grid of neurons representing the STN–GPe layer and assume DNe = 50. It was mentioned earlier that DNe controls the number of active neurons (neurons in the ‘ON’ state) at any given time. So DNe = 50 results in any 50 neurons out of the 100 neurons being in the ‘ON’ state. The number of different states the loop can assume is highest for this value of DNe, namely 50, because $\binom{100}{50} > \binom{100}{n}$ for any other $n$. The network travels through these states in a pseudo-random fashion. How many of these states are actually visited by the network, however, depends on the dynamic regime in which the network functions. When the network is in uncorrelated mode, it can visit a larger number of states than when it is in travelling wave mode or in clustering mode. Accordingly, $\binom{100}{50}$ is the maximum number of states that the network can access when DNe = 50, though in actual practice it may never visit a large number of those states. Thus with DNe less than 50, the number of possible states the network can visit is less than that possible with DNe = 50. Simulation results showing how DNe influences the complexity of the STN–GPe dynamics are shown in Figure 5.8.
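One step of the feedback loop in Equations 15 to 18, together with the counting argument for DNe = 50, can be sketched as follows. The Euler step size, the time constant, the checkerboard test pattern and the function names are assumptions for illustration; STN outputs are taken to be ±1, and the time constant is assumed to multiply dE/dt.

```python
import math

def active_count(u_stn):
    """Equation 15: number of STN units in the ON (+1) state."""
    return 0.5 * sum(u + 1 for row in u_stn for u in row)

def feedback_step(u_stn, d_ne, e_accum, tau=1.0, slope=10.0, dt=0.01):
    """One Euler step of the feedback loop; returns the updated E and the input I_Ne."""
    n = sum(len(row) for row in u_stn)
    err = d_ne - active_count(u_stn)                 # Equation 16
    e_accum += (dt / tau) * math.tanh(slope * err)   # Equation 17, assumed Euler form
    return e_accum, e_accum - n / 2.0                # Equation 18

# Hypothetical 10 x 10 checkerboard pattern with exactly 50 units ON.
grid = [[1 if (i + j) % 2 == 0 else -1 for j in range(10)] for i in range(10)]
e_new, i_ne = feedback_step(grid, d_ne=50, e_accum=0.0)

# Number of ON units that maximizes the count of distinct ON/OFF patterns.
best_n = max(range(101), key=lambda n: math.comb(100, n))
```

When the number of active units already equals DNe, the error is zero and E is left unchanged; the last line confirms that half the grid being active gives the largest number of possible patterns.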
Figure 5.8 Snapshots of STN activity for various values of DNe: (a) DNe = 50; observed E-dim ~ 96, (b) DNe = 20; observed E-dim ~ 48, (c) DNe = 5; observed E-dim ~ 15. There is a consistent decrease in E-dim with decreasing DNe

Model Description
The input to the Actor is the target selection vector (for example, ξA = {1, −1, −1, −1, −1, −1, −1, −1} for target 1). The Actor is expected to output muscle activations (ga and gb for Joint-A and Joint-B, respectively) [Figure 5.2(b)] that place the arm’s end effector at the target location. Since the Striatum receives both sensory and motor representations, the target selection pattern, ξA, and the estimated muscle activations, (ga, gb), are presented as input to the Critic. The Critic uses this information and estimates the State–Action Value function (Sutton and Barto, 1998). The input ξC ( = [ξA, ga, gb]) received by the Critic is also copied to the Explorer (that is, the input to the Explorer is ξE = ξC). The Explorer generates a corrective output (Δga, Δgb), which represents ‘exploration’ of the state space of possibilities in an attempt to determine the right muscle activations. The corrective output (Δga, Δgb) is sent to modify the estimated output (ga, gb) of the Actor.

Training of the system proceeds as follows. Given a command to reach a specific target, the arm makes exploratory reaching movements. When the end effector happens to stray sufficiently close to the target, the system receives a reward, which is used for training the ACE components. The Critic and the Explorer help the Actor reach the target on command. The Actor requires their help during the training period in order to reach the targets. Once trained, the Actor, on being instructed to do so, can reach the target in an entirely feed-forward manner. However, the continued presence of the Critic and the Explorer is required for the Actor to learn new goals for which it has not been trained. The Actor, Critic, and Explorer thus form a closely-knit team geared to acquire the motor skill at hand efficiently.
Computation of Reward and Temporal Difference Error
Reward is the result of the interaction of the Actor with the environment and plays a crucial role in Actor training. Accordingly, a reward of +1 is administered if the arm settles to within a threshold distance, dthresh, from the target point. Also, a negative reward of −0.3 is given if the arm activations take on extreme values. To speed up the learning process, dthresh is kept sufficiently high in the initial few epochs of training; its value is then reduced as a function of the number of epochs.
Calculation of the Temporal Difference (TD) error, δ(t), is done as follows.

\[ \delta(t) = r(t) + \gamma Q(t) - Q(t - 1) \tag{19} \]

where $r(t)$ is the reward obtained for the action of the Actor at time $t$, $Q(t)$ is the Value assigned to the state–action pair at time $t$, and $Q(t - 1)$ is the Value corresponding to the state–action pair at time $t - 1$. The discount factor, $\gamma$, is assumed to be 1.0. This method of calculating $\delta(t)$ is similar to SARSA(λ) (Sutton and Barto, 1998).
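The reward rule and Equation 19 can be put together in a short Python sketch. Several details are assumptions made for illustration: the test for ‘extreme’ activations is passed in as a flag because its definition is not given here, the reach reward is checked before the penalty, and a reward of 0 is assumed when neither condition holds.

```python
def reward(distance, d_thresh, extreme_activation):
    """+1 for settling within d_thresh of the target, -0.3 for extreme activations."""
    if distance <= d_thresh:
        return 1.0
    if extreme_activation:
        return -0.3
    return 0.0   # assumed default when neither condition applies

def td_error(r_t, q_t, q_prev, gamma=1.0):
    """Temporal Difference error of Equation 19."""
    return r_t + gamma * q_t - q_prev

r = reward(distance=0.05, d_thresh=0.1, extreme_activation=False)
delta = td_error(r, q_t=0.5, q_prev=0.25)
```

A successful reach that is also valued more highly than the previous state–action pair yields a positive δ, which is the signal used in the weight updates.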
Training
We split the process of ACE training into different stages. The ACE components (Actor, Critic, and Explorer) are not all trained simultaneously but are trained in order. During the first stage, only the Critic is trained. In the next stage, the Critic is no longer trained; instead the Actor and the Explorer are trained. Descriptions of the training methodology followed for individual elements of the ACE model are given below. The weights of the Actor are updated as per the following equations.

\[ W_A^a = W_A^a + \eta_a \delta(t) (\Delta g_a(t) \xi_A) \tag{20} \]

\[ W_A^b = W_A^b + \eta_a \delta(t) (\Delta g_b(t) \xi_A) \tag{21} \]
where $\eta_a$ is the learning rate for the weights of the Actor network.

The RBF network is trained offline, with two sets of input–output examples. The first set has as input a concatenated vector consisting of the target selection vector and the approximate muscle activations to reach the target (determined by a trial and error method); this input is paired with a desired output of 1, which represents the case of a ‘high’ value (that is, Q = 1). In the second set, for the same target, muscle activations at a distance of one standard deviation, $\sigma_{tol}$ ( = 2), from the activations of the first set are chosen. These inputs are paired with a desired output of 0.1, which corresponds to a ‘low’ value (that is, Q = 0.1). After training, we observe that the network generalizes well, producing local peaks in the state space (Figure 5.9). The network is trained offline using the Neural Network Toolbox available in MATLAB.

Figure 5.9 Output of the Critic for different targets, with the x–y plane representing [ga, gb] values and the z-axis representing the Value, Q

The Value function modelled by the Critic in our model represents the effectiveness of the reach, that is, the nearness of the end effector of the arm to the target position. We train the Critic offline and use its predictions to train the Actor and the Explorer. Existence of such a ‘pre-trained’ Critic may be justified as follows. It is known that the posterior parietal cortex (PPC) contains neurons that respond when a successful reach coincides with the visual appearance of the hand with the grasped object (Gardner and Kandel, 2000). Based on studies involving transcranial magnetic stimulation, Desmurget et al. (1999) suggest that the PPC generates an internal representation of hand position and provides a dynamic reaching error, which is used by motor cortical areas for real-time correction of movement. Imaging studies on the role of the PPC in reaching also suggest that the visual and motor components of reaching may have different functional organization (Kertzman et al., 1997). Functional magnetic resonance imaging studies of sequential motor learning by Bapi et al. (2006) reveal that there is an early stage of visuo-spatial-based learning subserved by parietal areas, and a late stage of motor-based learning subserved by motor cortical areas. Finally, projections from parietal association areas to the striatum are a well-established feature of the basal ganglia connectivity pattern (Yeterian and Pandya, 1993). Thus, our Critic model is meant to be a summary representation of the parietal visuo-spatial machinery supporting reaching movements.

Only the weights $W_E^1$, $W_{Ea}^2$ and $W_{Eb}^2$ are adjusted during the training of the Explorer network. The internal weights of the STN–GPe loop are assumed to be constant. The connections from the STN to the output layer, $W_{Ea}^2$ and $W_{Eb}^2$, are trained using Equations 23 and 24. The first-layer weights, $W_E^1$, are trained by equations similar to those used in the Associative Reward–Penalty ($A_{R-P}$) algorithm (Barto and Anandan, 1985). The adaptation equations are given as follows. For the first layer,
\[ W_E^1 = \begin{cases} W_E^1 + \eta_1 \xi_E [U^{GPe} - \tanh(U^{GPe})], & \delta > 0 \\ W_E^1 + \eta_2 \xi_E [-U^{GPe} - \tanh(U^{GPe})], & \text{otherwise} \end{cases} \tag{22} \]

where $\eta_1$, $\eta_2$ ($\eta_2 < \eta_1$) denote learning rates, $\xi_E$ is the input to the Explorer and $U^{GPe}$ is the output of the GPe layer. The equation used depends on the sign of $\delta$. For the second layer:

\[ W_{Ea}^2 = W_{Ea}^2 + \eta_3 \delta(t) U^{STN} [\Delta g_a(t) - \Delta g_a(t - 1)] \tag{23} \]

\[ W_{Eb}^2 = W_{Eb}^2 + \eta_3 \delta(t) U^{STN} [\Delta g_b(t) - \Delta g_b(t - 1)] \tag{24} \]

where $\eta_3$ is the learning rate, and $\Delta g_a$ and $\Delta g_b$ are the outputs of the Explorer.
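The Actor update (Equations 20 and 21) and the Explorer output-layer update (Equations 23 and 24) share the same form and can be sketched together. The learning rates, the flattening of the STN output into a list and the example values are assumptions for illustration; each function handles the weights of one joint.

```python
def update_actor(weights, xi_a, delta, dg, eta_a=0.1):
    """Equations 20 and 21: shift the Actor weights along delta * dg * xi_A."""
    return [w + eta_a * delta * dg * x for w, x in zip(weights, xi_a)]

def update_explorer_output(weights, u_stn, delta, dg_now, dg_prev, eta3=0.1):
    """Equations 23 and 24: update the STN-to-output weights for one joint."""
    change = dg_now - dg_prev
    return [w + eta3 * delta * u * change for w, u in zip(weights, u_stn)]

xi_a = [1.0] + [-1.0] * 7                 # target 1
w_a = update_actor([0.0] * 8, xi_a, delta=1.0, dg=0.5)

u_stn = [1.0, -1.0, 1.0]                  # hypothetical STN outputs (flattened)
w_out = update_explorer_output([0.0] * 3, u_stn, delta=1.0, dg_now=0.5, dg_prev=0.25)
```

A positive δ thus consolidates the Explorer's current correction into the Actor weights, so that the same target later produces the corrected activation without exploration.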
Simulations
This section presents the results of simulations in which the BG model drives a simple arm model to reach targets under ‘normal’ and ‘dopamine-deficient’ conditions.
Comparing Explorer Dynamics During and Post Learning
In the next experiment, we compare the Explorer dynamics during and post-learning in terms of neural activity in the STN–GPe layer and in terms of reaching performance. We observe that during the initial stages of training, as the arm explores the state space, the neural activity of the STN–GPe layer is complex, assisting this exploration [Figure 5.10(a)]. As training progresses, the arm makes initial guesses closer and closer to the target and the neural activity becomes less complex [Figure 5.10(b)]. After training, the arm moves to the target in a feed-forward manner and there is no need for exploration. In post-training conditions, the neural activity in the STN–GPe layer is regular and small clusters are observed [Figure 5.10(c)]. However, even after learning, if the radius of tolerance, dthresh, were suddenly changed, or the target point suddenly shifted by a small amount, so that the Actor no longer received reward where it previously did, the output of the STN–GPe layer, which was previously exhibiting clusters, would become uncorrelated so that the Actor could further explore the workspace and find the target. The output of the STN–GPe layer thus changes dynamically in response to changes in the conditions of the environment.
Simulation Results for Dopamine Deficient Conditions
We assume that reduced dopamine affects the activity of the STN–GPe layer neurons via the quantity DNe. Therefore, to simulate dopamine-deficient conditions, we shift and scale down the function that maps δ onto DNe, as shown in Figure 5.11. Recall that changes in DNe affect STN–GPe dynamics according to Equations 15 to 18. Next, we trained the network under such altered conditions of STN–GPe dynamics. The changes observed were divided into two categories: (a) primary changes, seen soon after dopamine reduction, and (b) secondary changes, observed after the network was trained under dopamine-deficient conditions.
Figure 5.10 (a) The dynamics of the model before learning, E-dim = 93; (b) the dynamics of the model after learning for eight epochs, E-dim = 57; (c) the dynamics of the model at the end of learning, E-dim = 5
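The complexity of STN–GPe activity is summarized in this chapter by an effective dimension (E-dim). The definition used by the authors is not reproduced in this section, so the following is only a minimal sketch under one common assumption: the effective dimension is taken to be the number of principal components needed to explain a fixed fraction of the variance in the recorded activity. The function name `effective_dimension` and the 95 per cent threshold are illustrative choices, not taken from the chapter.

```python
import numpy as np


def effective_dimension(activity, variance_fraction=0.95):
    """Count the principal components needed to reach `variance_fraction`.

    `activity` is a (time steps x neurons) array. This PCA-based definition
    is an assumption made for illustration; the chapter's own E-dim
    computation may differ.
    """
    activity = np.asarray(activity, dtype=float)
    # Remove the mean of each neuron so that only fluctuations are counted.
    centered = activity - activity.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to the variance along each
    # principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0.0:
        return 0  # constant activity has no fluctuating dimensions
    cumulative = np.cumsum(variances) / total
    # First index at which the cumulative variance reaches the threshold.
    index = int(np.searchsorted(cumulative, variance_fraction))
    return min(index + 1, len(variances))


# Synthetic illustration only (not model output): uncorrelated activity
# versus fully synchronized activity across 20 units.
rng = np.random.default_rng(0)
uncorrelated = rng.standard_normal((500, 20))
synchronized = np.outer(np.sin(np.linspace(0.0, 10.0, 500)), np.ones(20))
print(effective_dimension(uncorrelated), effective_dimension(synchronized))
```

Under this assumed definition, uncorrelated activity needs many components while synchronized, clustered activity needs very few, which is the qualitative contrast between panels (a) and (c) of Figure 5.10.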
Primary Changes
Previously, when the Actor was untrained, the output of the STN–GPe layer would be complex so as to assist the Actor in finding the target. However, now we observe that the number of neurons in the “ON” state in the STN–GPe layer is much lower; hence the complexity of the oscillatory activity is lower. Consequently, the arm takes small steps as it explores and exploration is confined to a narrow region [Figure 5.12(a)].
Secondary Changes
As the network continues to be trained under low dopamine conditions, the output of the STN–GPe layer begins to lose whatever complexity it had and settles down to clustered activity. This inevitably means that the complexity of the oscillatory activity is low. The Explorer tries to make up for the paucity of STN–GPe activity by increasing its output weights. This means that the output of the Explorer is of high amplitude but low complexity, causing the Explorer to send large values of Vg to the arm, resulting in large, regular, ‘tremor-like’ oscillations of the arm as shown in Figure 5.12(b). Naturally, the arm fails to learn to reach the target.
Figure 5.11 The relation between norepinephrine (DNe) in the STN–GPe layer and δ
Note: The function is shifted and scaled down to simulate dopamine deficient conditions (solid curve: A1 = 45, A2 = 5, α = 3, β = 0.33; dashed curve: A1 = 15, A2 = 5, α = 6, β = −0.07).
Discussion
In an earlier work, we had hypothesized that the STN–GPe layer in the BG was responsible for exploratory behaviour. In this paper, we describe a model of the BG in which the role of the BG in exploration is highlighted. RL-based models of basal ganglia function do not appear to give sufficient significance to the exploratory aspect of the RL framework. Though the contribution of the BG to activities that involve exploration, like foraging, has been described, a precise anatomical substrate in the BG for such exploration has not been given sufficient attention. In an RL perspective of BG function, we believe that the STN–GPe part of the BG is the Explorer. We thus have an Actor–Critic–Explorer model of the BG.
Figure 5.12 Changes due to dopamine reduction
Note: (a) Primary changes due to dopamine reduction. Exploration is confined to a narrow region (top-left). Complexity and activity levels of the STN–GPe layer are reduced, E-dim = 32 (bottom-left). (b) Secondary changes due to dopamine reduction. Exploration is of large amplitude but of poor quality due to large oscillations of the arm (top-right). Dramatic loss of complexity in the activity of the STN–GPe layer, E-dim = 5 (bottom-right).
The ACE model is coupled to an arm model and is trained on a simple reaching task, which consists of reaching a small number of targets on instruction. The Critic, which represents the striatum, supplies Value information; the Explorer, which represents the STN–GPe loop, explores around the prior provided by the Actor. The Actor, which represents the motor cortex, consolidates the results of successful exploration. In the present work, special emphasis is placed on the characterization of the STN–GPe subsystem of the BG. Complexity of this system, it is suggested, provides the stochastic signal necessary for exploration in RL. Loss of complexity in the activity of this system has been reported under PD conditions. Since the STN–GPe subsystem is emphasized, the Direct Pathway is ignored in the present model. The Motor Cortex (Actor) receives the stochastic perturbation, modulated by the dopamine signal (TD error), from the Indirect Pathway (Explorer). The contribution from the Indirect Pathway is inversely related to dopamine levels. In a more comprehensive future version of the model, we will describe the Direct and Indirect Pathways as being involved in action selection and exploration, respectively.
This paper puts forth an extension of Doya’s (2002) suggestion that norepinephrine levels in the GP control the ‘temperature’ parameter in RL exploration. We suggest that norepinephrine levels (DNe) in the GPe control the activity levels of the STN–GPe layer and are in turn controlled by the dopamine level, δ. This assumption gives rise to a mechanism by which dopamine fluctuations can indirectly control the complexity of the STN–GPe layer and hence influence exploration. It also opens ways to understand how the complexity of the STN–GPe activity is reduced under Parkinsonian conditions.
Changes in PD conditions aside, our model predicts learning-dependent changes in the complexity of the STN–GPe activity under normal conditions. Studies in Section 3.1 suggest that both complexity (effective dimension) and average activity are higher in the STN–GPe layer during acquisition of a reaching skill than in the same system post-learning. This is an interesting prediction, which can be examined experimentally.
Interesting scenarios emerge from simulations of our model under dopamine-deficient conditions. The primary changes upon dopamine reduction include a reduction in the activity level and in the complexity of the STN–GPe layer, resulting in inefficient, low-amplitude exploration.
The secondary changes, which develop after extended training under reduced dopamine conditions, include increased activity levels in the STN–GPe layer but with reduced complexity, thereby resulting in large-amplitude, tremor-like oscillations of the arm. Interestingly, hyperactivity in the GPi and STN neurons, secondary to dopamine deficiency, is a well-observed phenomenon in the MPTP model of PD in monkeys (DeLong, 1990). Studies of the olfactory system in rabbits show that when a familiar odour is presented to the animal, the olfactory bulb responds with a rhythmic waveform. However, when the stimulus is novel, or unfamiliar, activity in the bulb exhibits chaotic wandering (Skarda and Freeman, 1987). Bergman et al. (1998) reported dynamic synchronization of Pallidal activity in MPTP-treated monkeys. Recent experimental studies have shown that under dopamine depletion, the STN and GPe do not show much change in mean rate of firing, but reveal prominent, low-frequency periodicity of firing (4 to 30 Hz), and dramatically increased correlations among neurons in STN and GPe (Bergman et al., 1994; Nini et al., 1995; Magnin et al., 2000; Raz et al., 2000; Brown et al., 2001). Terman et al. (2002) have performed computer simulations of conductance-based models of the subthalamopallidal
circuit, fully incorporating existing knowledge about synaptic connections and cellular properties. The simulations reveal a wide variety of oscillatory patterns, which depend on the arrangement and strengths of synaptic connections within and between cellular populations. It was found that the network switched from irregular, uncorrelated spiking to correlated rhythmic patterns upon weakening the intrapallidal inhibitory connections. Based upon this observation, the authors went on to suggest that such changes in synaptic connectivity may underlie the correlated rhythmic activity in the subthalamopallidal circuit observed in pathological states like Parkinson’s disease. Such observations lend support to our contention that the STN–GPe layer is the source of exploratory behaviour required in an RL framework.
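The Discussion identifies the dopamine signal with the TD error and relates exploration to the ‘temperature’ parameter of RL. As a minimal sketch of these two textbook quantities, in the standard formulation of Sutton and Barto (1998) rather than the ACE model's own equations, the temporal-difference error and a temperature-controlled softmax policy can be written as follows. The names `td_error` and `softmax_policy` and the default discount factor `gamma` are illustrative assumptions.

```python
import numpy as np


def td_error(reward, value_next, value_current, gamma=0.9):
    """Temporal-difference error: delta = r + gamma * V(s') - V(s).

    `gamma` is the discount factor; 0.9 is an arbitrary illustrative value.
    """
    return reward + gamma * value_next - value_current


def softmax_policy(action_values, temperature):
    """Boltzmann (softmax) action probabilities.

    A high temperature gives nearly uniform probabilities (more exploration);
    a low temperature concentrates probability on the best-valued action.
    """
    preferences = np.asarray(action_values, dtype=float) / temperature
    preferences -= preferences.max()  # subtract the maximum for numerical stability
    weights = np.exp(preferences)
    return weights / weights.sum()


# The same action values under a low and a high temperature.
values = [1.0, 2.0, 3.0]
print(softmax_policy(values, temperature=0.1))
print(softmax_policy(values, temperature=10.0))
```

In this generic formulation the temperature is a single scalar. The chapter's proposal is different in kind: the source of stochasticity is the complexity of the STN–GPe dynamics, which the sketch does not attempt to model.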
References
Bapi, R. S., K. P. Miyapuram, F. X. Graydon, and K. Doya. 2006. ‘fMRI Investigation of Cortical and Subcortical Networks in the Learning of Abstract and Effector-Specific Representations of Motor Sequences’, NeuroImage, 32: 714–27.
Barto, A. G. and P. Anandan. 1985. ‘Pattern recognizing stochastic learning automata’, IEEE Transactions on Systems, Man and Cybernetics, 15: 360–74.
Bergman, H., A. Feingold, A. Nini, A. Raz, H. Slovin, M. Abeles, and E. Vaadia. 1998. ‘Physiological aspects of information processing in the basal ganglia of normal and parkinsonian primates’, Trends in Neurosciences, 21: 32–38.
Brown, P., A. Oliviero, P. Mazzone, A. Insola, P. Tonali, and V. Di Lazzaro. 2001. ‘Dopamine dependency of Oscillations between Subthalamic Nucleus and Pallidum in Parkinson’s disease’, The Journal of Neuroscience, 21: 1033–038.
DeLong, M. R. 1990. ‘Primate models of movement disorders of basal ganglia origin’, Trends in Neurosciences, 13: 281–85.
Desmurget, M., C. M. Epstein, R. S. Turner, C. Prablanc, G. E. Alexander, and S. T. Grafton. 1999. ‘Role of the posterior parietal cortex in updating reaching movements to a visual target’, Nature Neuroscience, 2: 563–67.
Doya, K. 2002. ‘Metalearning and Neuromodulation’, Neural Networks, 15.
Gardner, E. P. and E. R. Kandel. 2000. ‘Touch’, in E. R. Kandel, J. H. Schwartz, and T. M. Jessell (eds), Principles of Neural Science, Fourth Edition, pp. 451–72. New York: McGraw-Hill.
Haykin, S. 1998. Neural Networks: A Comprehensive Foundation. Upper Saddle River, NJ: Prentice-Hall, PTR.
Joseph, D. 2008. ‘Towards an integral model of the Basal Ganglia: Exploring the multi-faceted role of Basal Ganglia in motor function’, Master’s Thesis, Indian Institute of Technology, Madras, India.
Kertzman, U. Schwarz, T. Zeffiro, and M. Hallett. 1997. ‘The role of posterior parietal cortex in visually guided reaching movements in humans’, Experimental Brain Research, 114: 170–83.
Kumar, S. 2004. Neural Networks: A Classroom Approach. India: Tata McGraw-Hill.
Magnin, M., A. Morel, and D. Jeanmonod. 2000. ‘Single-unit analysis of the pallidum, thalamus and Subthalamic nucleus in Parkinsonian patients’, Neuroscience, 96: 549–64.
Montague, P. R., P. Dayan, C. Person, and T. J. Sejnowski. 1995. ‘Bee foraging in uncertain environments using predictive Hebbian learning’, Nature, 377: 725–28.
Nini, A., A. Feingold, H. Slovin, and H. Bergman. 1995. ‘Neurons in the globus Pallidus do not show correlated activity in the normal monkey, but phase-locked oscillations appear in the MPTP model of Parkinsonism’, Journal of Neurophysiology, 74: 1800–805.
Prashanth, P. S. and V. S. Chakravarthy. (under review). ‘An oscillator theory of motor unit recruitment in skeletal muscle’, Biological Cybernetics.
Raz, A., E. Vaadia, and H. Bergman. 2000. ‘Firing patterns of spontaneous discharge of Pallidal neurons in the model of Parkinsonism’, Journal of Neuroscience, 20: 8559–571.
Skarda, C. A. and W. J. Freeman. 1987. ‘How brains make chaos in order to make sense of the world’, Behavioral and Brain Sciences, 10: 161–95.
Sutton, R. S. and A. G. Barto. 1998. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Terman, D., J. E. Rubin, A. C. Yew, and C. J. Wilson. 2002. ‘Activity patterns in a model for the Subthalamopallidal network of the basal ganglia’, The Journal of Neuroscience, 22: 2963–76.
Yeterian, E. H. and D. N. Pandya. 1993. ‘Striatal connections of the parietal association cortices in rhesus monkeys’, Journal of Comparative Neurology, 332: 175–97.
Section II
Perception and Attention
Introduction
Visual perception and attention have been studied extensively in Psychology, Neuroscience, and Computer Science and they have become a significant focus area in Cognitive Science. While information is processed in different modalities independently to some extent, information from different perceptual systems is integrated to form a coherent percept. For example, tactile and visual information is combined to form representations of space that can be used for action planning and execution. Attentional processes modulate the perceptual system involved in processing of space. Research on selective attention has shown differences between exogenous (stimulus-driven) and endogenous (voluntary) attention. For example, with exogenous attention, early facilitation is followed by later inhibition (Posner and Cohen, 1984) while inhibition is typically absent with endogenous attention. An interesting question is whether the inhibition shown with exogenous attention is general or is dependent on salient stimuli (for example, emotional stimuli). Based on such behavioural and neuropsychological findings, many models of attention have been developed (Heinke and Humphreys, 2003; Itti and Koch, 2000; Rolls and Deco, 2002). This section consists of four chapters focusing on some of these critical issues and developments in perception and attention research.
While vision provides significant information about the world, we perceive a unitary world in which information from multiple sensory modalities is combined. The chapter by Làdavas and Serino focuses on multisensory processing related to peripersonal space, that is, the space immediately surrounding the body. They discuss evidence for representations of peripersonal space from electrophysiological recordings of neurons in parietal and frontal cortical areas in monkeys. They present evidence for peripersonal space representations in humans based on studies of patients with brain damage exhibiting crossmodal extinction. Làdavas and Serino discuss the evidence for a unitary module versus multiple modules for peripersonal space representations around body parts like hands and heads. They conclude, based on neuropsychological evidence, that there are multiple modular peripersonal space representations specific to different body parts. Stimuli presented in peripersonal space seem to activate and control these representations in a rapid manner. An important role of these peripersonal space representations is to enable actions such as reaching for, grasping and withdrawing from objects in the near vicinity of the body. Experience plays a critical role in the formation of these representations, and Làdavas and Serino discuss their experiments on tool use (with a blind person’s cane) and also studies with blind people. The results indicate that peripersonal space representations are dynamically altered by experience and that space itself is remapped by tool use, with far space becoming near through the use of a cane. They conclude that research on peripersonal space indicates a tight coupling between action and perception.
Studies on exogenous attention have shown facilitation at short cue-to-target intervals and inhibition at longer cue-to-target intervals (Posner and Cohen, 1984). The longer reaction times to targets presented at previously attended locations have been labelled inhibition of return (IOR). Various explanations have been proposed for IOR (Berlucchi, 2006; Klein, 2000). It has been argued that processing at a particular location is inhibited once attention shifts from that location, and there has been a debate on whether this inhibition is sensory or motor-based (Berlucchi, 2006; Taylor and Klein, 2000). Serino and her colleagues discuss the phenomenon of IOR based on their neurophysiological findings and their model for IOR based on repetition suppression in their chapter. They discuss their findings based on recordings from the superior colliculus and lateral intraparietal cortex (LIP). They argue that the superior colliculus is not the origin of IOR but only expresses IOR produced by other regions like anterior inferotemporal cortex (AIT) and LIP. Repetition suppression effects are hypothesized to play a critical role in reflexive attention. They propose and implement a neural net model based on repetition suppression that is able to simulate reflexive attention effects including IOR.
Emotion and attention interact with each other (Vuilleumier, 2002). Different emotions interact differently, leading to asymmetries in emotion perception. Many studies have been performed with sad and happy faces with specific attentional manipulations and paradigms to explore the reciprocal relationships between emotion and attention. Facial expressions are an important aspect of emotional information processing (Vuilleumier et al., 2001). Not all emotional expressions have similar interactions with cognitive processes like attention and memory (Frischen et al., 2008; Gupta and Srinivasan, 2009; Srinivasan and Gupta, in press).
Several studies have shown that emotional expressions capture attention and interfere with the ongoing task even though they are not relevant to the current task (Vuilleumier et al., 2001). The chapter by Srinivasan, Baijal and Khetrapal discusses the interaction between emotion and selective attention as well as emotion and control. They discuss findings from different studies on attention focusing on the differences between sad and happy emotional expressions in the way they interact with selective attention. They discuss visual search studies in the context of emotional faces (Frischen et al., 2008). Findings on IOR with emotional faces indicate that IOR magnitude is different for different emotions and hemispherical differences are present for detection of emotions in an IOR paradigm (McAuliffe et al., 2006). In addition to selective attention, emotions interact with control processes. Fenske and Eastwood (2003) showed that flanker effects are different for sad and happy faces. The authors review studies on attentional control with emotional stimuli. They also discuss their findings from a flanker study performed with emotional stimuli (happy and angry schematic faces) presented in either left or right visual field to explore hemispherical asymmetries in emotional processing in the context of control. The chapter discusses both behavioural and electrophysiological results from their flanker study and its implications for emotions and control. Various models have been proposed on attention (Deco and Rolls, 2005; Heinke and Humphreys, 2003; Itti and Koch, 2000; Rolls and Deco, 2002). A significant amount of research on modelling of attention is based on results from patients with neuropsychological
disorders. Many specific deficits in individuals with neurological disorders indicate the many component processes of attention. The chapter by Humphreys and colleagues presents a model of attention based on data from individuals with posterior parietal cortex (PPC) damage. They use a spiking-level neural network to implement their spiking Search over Time and Space (sSoTS) model. The model consists of neural units that code both spatial information processed by posterior parietal cortex and object-based information processed by the ventral pathway in the visual system. The model is used to simulate visual selection deficits found in patients with PPC damage. The simulations show that lesioning the location map affects detection of conjunction targets more than detection of simple targets (defined by a single feature) amongst distractors. Effects of arousal were also simulated by manipulating neurotransmitters, indicating that the model is successful in simulating and understanding deficits shown by neuropsychological patients and in linking them to underlying neural mechanisms.
References
Berlucchi, G. 2006. “Inhibition of return: A phenomenon in search of a mechanism and a better name”, Cognitive Neuropsychology, 23 (7): 1065–074.
Deco, G. and E. Rolls. 2005. “Neurodynamics of biased competition and cooperation for attention: a model with spiking neuron”, Journal of Neurophysiology, 94 (1): 295–313.
Fenske, M. J. and J. D. Eastwood. 2003. “Modulation of focused attention by faces expressing emotion: evidence from flanker tasks”, Emotion, 3 (4): 327–43.
Frischen, A., J. D. Eastwood, and D. Smilek. 2008. “Visual search for faces with emotional expressions”, Psychological Bulletin, 134 (5): 662–76.
Gupta, R. and N. Srinivasan. 2009. “Emotions help memory for faces: Role of whole and parts”, Cognition & Emotion, 23 (4): 807–16.
Heinke, D. and G. Humphreys. 2003. “Attention, spatial representation and visual neglect: Simulating emergent attention and spatial memory in the selective attention for identification model (SAIM)”, Psychological Review, 110 (1): 29–87.
Itti, L. and C. Koch. 2000. “A saliency-based search mechanism for overt and covert shifts of visual attention”, Vision Research, 40 (10–12): 1489–1506.
Klein, R. M. 2000. “Inhibition of return”, Trends in Cognitive Sciences, 4: 138–47.
McAuliffe, J., A. L. Chasteen, and J. Pratt. 2006. “Object- and location-based inhibition of return in younger and older adults”, Psychology and Aging, 21 (2): 406–10.
Posner, M. and Y. Cohen. 1984. “Components of visual orienting”, in H. Bouma and D. G. Bouwhuis (eds), Attention and Performance X, pp. 531–36. Hillsdale, NJ: Lawrence Erlbaum and Associates.
Rolls, E. and G. Deco. 2002. Computational Neuroscience of Vision. New York: Oxford University Press.
Srinivasan, N. and R. Gupta. (in press). “Emotion–attention interactions in recognition memory for distractor faces”, Emotion.
Taylor, T. and R. M. Klein. 2000. “Visual and motor effects in inhibition of return”, Journal of Experimental Psychology: Human Perception and Performance, 26 (5): 1639–656.
Vuilleumier, P. 2002. “Facial expression and selective attention”, Current Opinion in Psychiatry, 15 (3): 291–300.
Vuilleumier, P., J. L. Armony, J. Driver, and R. J. Dolan. 2001. “Effects of attention and emotion on face processing in the human brain: an event-related fMRI study”, Neuron, 30 (3): 829–41.
Chapter 6 Peripersonal Space Representation in Humans: Proprieties, Functions, and Plasticity Elisabetta Làdavas and Andrea Serino
The taxonomy of space representation includes at least three main sectors: personal (that is, the space defined by the body surface), peripersonal (the space just surrounding the body) and extrapersonal space (the space far from the body). The strongest support for the distinction between peripersonal and extrapersonal space was provided by neurophysiological studies in monkeys. Rizzolatti and colleagues (1981; 1983; 1997) used the term “peripersonal” to define a limited sector of space around an animal’s body part whose spatial boundaries are operationally defined by variations in the neuronal firing rate, which mainly depends upon the proximity between a visual stimulus and a given body part. This space appears to be implemented by multisensory neurons within the putamen, parietal and premotor areas (Bremmer et al., 2001; Duhamel et al., 1998; Graziano and Gross, 1993; Graziano et al., 1997; Rizzolatti et al., 1998) that respond, in the case of peri-hand space, both to touches delivered within the hand somatotopic receptive field (RF) and to visual stimuli presented close to the same RF. Most importantly, the neuronal response to visual stimuli (that is, the visual RF) is spatially tuned, being stronger at shorter distances between the body part and the visual source, and vice versa (Graziano et al., 1994; Fogassi et al., 1996; Duhamel et al., 1998; Fogassi et al., 1999). The evidence for the existence of peripersonal space in humans derives mainly from neuropsychological studies (di Pellegrino et al., 1997; Làdavas et al., 1998a; Làdavas et al., 1998b) conducted on right brain damaged (RBD) patients with extinction. Extinction is a phenomenon whereby patients fail to detect contralesional stimuli only under the condition of double (ipsilesional and contralesional) simultaneous stimulation (Bender, 1952).
Extinction can emerge when concurrent stimuli are presented in the same modality (unimodal extinction) or in different modalities (crossmodal extinction) (Mattingley et al., 1997). For example, crossmodal extinction emerges when tactile perception on the contralesional hand is modulated by visual or auditory stimuli presented on the ipsilesional hand. Importantly, crossmodal extinction in RBD patients is modulated by the spatial arrangement of the stimuli with respect to the patient’s body (see Làdavas, 2002; Làdavas and Farnè, 2004a; 2004b, for reviews); visuo–tactile extinction is much stronger when the visual stimulation occurs in the space close (~5 cm) to the patient’s body than when it occurs in the space far (~35 cm) from the body. The finding that multisensory integration may occur in a privileged manner in a limited sector of space surrounding the hand has been taken as evidence for the existence, in humans, of an integrated visuo–tactile system coding peripersonal space around the hand (Làdavas, 2002). In a similar way, visual and tactile information are integrated in other regions of space surrounding different body parts, such as around the head and the face (Làdavas et al., 1998b; Farnè et al., 2005a). Peripersonal space is not organized as a unitary sector of space encompassing the whole body, but as a collection of modules, each coding for the space immediately adjacent to a specific body part (Farnè et al., 2005a). Touches delivered to the left hand are extinguished by visual stimuli presented near the right hand but not by those presented adjacent to the face, and vice versa. This finding rules out a unitary peripersonal space representation, which would predict that touches delivered to the left hand are equally extinguished by any visual stimulus presented near the ipsilesional side of the body, irrespective of the stimulated body part (for example, the right hand or cheek). In contrast, when visual stimuli were presented far from the ipsilesional side of the patients’ body, the amount of visual–tactile extinction obtained in homologous and non-homologous combinations was absolutely comparable.
These results suggest that peripersonal space is an ensemble of modules separately representing the space immediately adjacent to a given body part. The modular organization of near peripersonal space as shown by Farnè et al. (2005a) is also in agreement with previous neurophysiological and neuropsychological findings. Peripersonal space coding operates in body part-centred co-ordinates in human and nonhuman primates (di Pellegrino et al., 1997; Graziano et al., 1994; Duhamel et al., 1998). To date, visual–tactile extinction has been used to assess multisensory integration in the peripersonal space around the hand and head, although, in principle, other peripersonal space representations tuned to different body parts (for example, the inferior limb and the trunk) might exist. Future studies will shed light on the existence of these additional sectors of peripersonal space. A modulation of tactile processing in peripersonal space has also been reported in the interaction between audition and touch, both for the face (Làdavas et al., 2001; Farnè and Làdavas, 2002) and the hand (Serino et al., 2007), showing in this way the existence of an auditory peripersonal space around the face and the hand. The activation of peripersonal space representation can occur rather automatically, following a bottom-up flow of information. In a recent study, Farnè et al. (2003) measured left tactile extinction when patients’ hands were either covered or not by a transparent Plexiglas. The results showed that visual stimuli presented near the ipsilesional hand induced strong
crossmodal extinction of contralesional tactile stimuli. Crucially, the amount of crossmodal extinction was comparable whether or not the patients’ hands were physically protected by the Plexiglas against the approaching visual stimulus. Overall, these findings suggest that visuo–tactile integrative processes can occur automatically. Indeed, although patients were explicitly aware that the transparent barrier would prevent any possibility of the visual stimulus coming into physical contact with their own right hand, the Plexiglas did not block either visual or proprioceptive cues, which provided congruent inputs relative to the spatial proximity of the hand and the visual stimulus. Another piece of evidence supporting the notion that peripersonal space representation is automatically activated by stimulus-driven processes derives from findings showing that visuo–tactile interaction within the peripersonal space can also be triggered solely on the basis of visual information about hand location; the human brain can form visual representations of the peripersonal space of a non-owned body part, like a rubber hand, as if it were a real hand (Farnè et al., 2000). It is important to underline that this result, too, is fully consistent with neurophysiological evidence. In monkeys, when a realistic fake arm is visible instead of the animal’s real arm, the activity of many visuo–tactile neurons in the ventral premotor (Graziano, 1999) and parietal (Graziano et al., 2000) areas is modulated by the congruent or incongruent location of the seen fake arm. In summary, these findings confirm that peripersonal space representation is automatically activated by a rather bottom-up flow of information. Why should our perception of nearby space be so stimulus-driven? One reason may lie in a possible function of such a multisensory system coding visual peripersonal space.
Cells in the putamen, the VIP area and inferior area 6 have motor functions as well as sensory functions. Indeed, the same neurons often have both sensory and motor activity. These areas probably encode the location of nearby sensory stimuli in order to generate an appropriate motor response to those stimuli, such as avoiding a stimulus coming towards the face or the hand (see Graziano and Cooke, 2006 for a review), or reaching to grasp an object, or getting food into the mouth (see Rizzolatti et al., 1997). Many of the movements aimed at avoiding a stimulus coming towards the face or the head have a reflexive quality, that is, they are fast and they can occur without conscious planning or thought. In keeping with this view, electrical stimulation of brain regions coding peripersonal space in monkeys automatically evokes movements aimed at avoiding, withdrawing, or protecting the part of the body on which the neurons’ tactile receptive field is located (Cooke and Graziano, 2004). Thus, it becomes clear that, given the functional role of these multisensory neurons, their activation must be quick and largely independent of higher-level information processing.
Dynamic Changes of Peripersonal Space due to Tool-use
The functional characteristic of the peripersonal space, that is, the ability to localize the stimulus even when the skin is not immediately stimulated and to produce an appropriate movement in response to it, allows the prediction that this space can be modified by a given action aimed at reaching objects in space, such as when individuals use a tool to reach far space. Multidisciplinary evidence widely supports this proposal (Iriki et al., 1996; Farnè and Làdavas, 2000; Maravita et al., 2002). In humans, Farnè and Làdavas (2000) reported behavioural evidence of peripersonal space extension due to tool use by investigating crossmodal extinction in a group of RBD patients. Visual stimuli, presented at the tip of a 38 cm long rake statically held in the patients’ ipsilesional hand, induced more contralesional tactile extinction immediately after tool use (retrieving distant objects with the rake for five minutes) than before tool use. Stronger crossmodal extinction at the same far location after tool use can be considered evidence of the extension of peri-hand space along the tool axis. In the same study, a backward contraction of the extended peri-hand space was also documented, as crossmodal extinction was reduced to pre-tool use levels after a longer interval of tool inactivity. The finding that tool use can change space perception raises several issues about the crucial determinants of peri-hand space extension. Some of these issues can be summarized by the following questions: (a) Is a passive change of the corporeal configuration (hand + tool) sufficient for the extension of peripersonal space, or is some goal-directed activity necessary? and (b) Is the representation of peripersonal space influenced by the long-term experience of the tool user? In other words, can far space become permanently near in subjects who constantly use tools to interact with the environment?
When considering the first issue, that is, the role played by passive or active experience, the results of a recent study on patients with extinction (Farnè et al., 2005b) showed that a relatively prolonged but passive exposure to a visual/proprioceptive change in the spatial characteristics of the patients’ bodies failed to extend the peri-hand space. Indeed, the amount of visuo–tactile extinction obtained in the far location, after a short period during which the patients passively experienced the wielding of a rake, did not change compared to that observed when there was no rake at all. In sharp contrast, a change in spatial representations was found after tool use. Immediately after the use of the rake to retrieve distant objects, crossmodal extinction significantly increased compared to the situation of passive exposure.
As far as the second issue is concerned, most studies on the effect of tool-use on peripersonal space representation (see Maravita and Iriki, 2004, for a review) have shown that the expansion of peripersonal space after tool use lasts only for short time intervals, because visual peri-hand space contracted back to pre-tool use levels a few minutes after the end of the training (Ishibashi et al., 2000). However, tool-use can be quite a common experience in everyday life, and in fact there are some subjects who habitually and functionally use a tool to interact with far space, such as blind people who use a cane every day to navigate in their environment. Therefore another interesting question is whether such a prolonged experience of using a tool might result in a durable expansion of peripersonal space representation. To answer this question, Serino and colleagues (2007) investigated audio–tactile integration in the space around the hand and in far space in blind cane users and in sighted, blindfolded subjects who had never used the cane. Subjects performed a tactile discrimination task at their right hand, while concurrent, task-irrelevant sounds were presented either near the subject’s hand or in far space, at a distance approximating the length of a normal blind cane. The effect on tactile RT of near sounds relative to far sounds was taken as a measure of the extension of the auditory peri-hand space. First, the results showed that in sighted subjects, RT to tactile stimuli was faster when concurrent sounds were presented close to the subject’s hand than when they were presented far from the hand; this shows a form of multisensory integration in a limited space around the hand. Second, it was found that in sighted subjects this space can be dynamically extended by a brief use of a tool to explore the far space. Indeed, after 10 minutes of training with a blind cane, RT associated with far sounds was speeded up, resulting in equal RT for sounds presented in near and far space. However, when sighted subjects were tested after one day without using the cane, a preference for near sounds was again evident, as before any tool use. These findings suggest that auditory peripersonal space also expands and contracts back depending on tool use. By contrast, a long-term everyday experience of using a tool results in a durably expanded representation of peripersonal space. Indeed, in blind cane users holding their cane, RTs associated with far sounds were even faster than those associated with near sounds. However, when they held a 14 cm long handle, the response was as in sighted subjects. Blind people, who continuously use the cane to integrate auditory and tactile information in far space in order to compensate for the lack of visual information, developed a new, extended representation of auditory peri-hand space.
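The near-versus-far measure described above can be made concrete with a short sketch. It assumes the effect is computed as mean tactile RT with far sounds minus mean tactile RT with near sounds; the function name and all reaction times below are hypothetical and are not data from Serino and colleagues (2007).

```python
from statistics import mean


def near_far_effect(rt_near_ms, rt_far_ms):
    """Mean tactile RT with far sounds minus mean tactile RT with near sounds (ms).

    A positive value means near sounds speed up tactile responses,
    i.e. auditory-tactile integration is stronger close to the hand.
    """
    return mean(rt_far_ms) - mean(rt_near_ms)


# Made-up reaction times (ms), for illustration only.
before_training = near_far_effect([410, 420, 430], [450, 460, 470])
after_training = near_far_effect([410, 420, 430], [415, 420, 425])

print(before_training, after_training)
```

With these invented values the effect shrinks to zero after training, which is the pattern the text describes for sighted subjects after brief cane use; a negative value would correspond to the faster far-sound responses reported for blind cane users holding their cane.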
We might say that such a long-term experience with the cane “transformed” far space into near space and near space into far space: in order to prevent collisions with external objects, the space at the tip of the cane becomes much more important than that around the hand and thus assumes all the integrative properties usually seen around the hand. In this sense these plastic changes in processing peripersonal space may have a strong adaptive value in enhancing blind subjects’ proficiency in avoiding harmful collisions. This interpretation of the results, as well as the finding that an extended representation of space is selectively activated when blind people hold their cane, emphasizes the automatic link between action and perception. The action–perception relationship may develop in a reciprocal fashion, such that perceivers come to anticipate the perceptual outcomes whenever an action is possible, planned or executed. Objects that are presented within the action space are perceived to be closer than those that are not. In other words, the tool defines the actual action space: when a hand uses a tool, objects that were outside the reachable space become reachable and, as a consequence, they may be processed as if they were near and not far. In conclusion, the experimental findings reviewed in the present paper show that (a) there are different multisensory representations of the space around the body, and (b) the extent of these representations is dynamically and functionally shaped by action. Thus, the possibility of acting in space contributes to the construction of space perception.
References
Bender, M. B. 1952. Disorders of Perception. Springfield, IL: Charles C. Thomas.
Bremmer, F., A. Schlack, J. R. Duhamel, W. Graf, and G. R. Fink. 2001. “Space coding in primate posterior parietal cortex”, Neuroimage, 14 (1 Part 2): S46–51.
Cooke, D. F. and M. S. Graziano. 2004. “Sensorimotor integration in the precentral gyrus: Polysensory neurons and defensive movements”, Journal of Neurophysiology, 91 (4): 1648–60.
di Pellegrino, G., E. Làdavas, and A. Farnè. 1997. “Seeing where your hands are”, Nature, 388 (6644): 730.
Duhamel, J. R., C. L. Colby, and M. E. Goldberg. 1998. “Ventral intraparietal area of the macaque: congruent visual and somatic response properties”, Journal of Neurophysiology, 79 (1): 126–36.
Farnè, A. and E. Làdavas. 2000. “Dynamic size-change of hand peripersonal space following tool use”, Neuroreport, 11 (8): 1645–49.
Farnè, A. and E. Làdavas. 2002. “Auditory peripersonal space in humans”, Journal of Cognitive Neuroscience, 14 (7): 1030–43.
Farnè, A., F. Pavani, F. Meneghello, and E. Làdavas. 2000. “Left tactile extinction following visual stimulation of a rubber hand”, Brain, 123 (Part 11): 2350–60.
Farnè, A., M. L. Dematte, and E. Làdavas. 2003. “Beyond the window: multisensory representation of peripersonal space across a transparent barrier”, International Journal of Psychophysiology, 50 (1–2): 51–61.
Farnè, A., M. L. Dematte, and E. Làdavas. 2005a. “Neuropsychological evidence of modular organization of the near peripersonal space”, Neurology, 65 (11): 1754–58.
Farnè, A., A. Iriki, and E. Làdavas. 2005b. “Shaping multisensory action-space with tools: evidence from patients with crossmodal extinction”, Neuropsychologia, 43 (2): 238–48.
Fogassi, L., V. Gallese, L. Fadiga, G. Luppino, M. Matelli, and G. Rizzolatti. 1996. “Coding of peripersonal space in inferior premotor cortex (area F4)”, Journal of Neurophysiology, 76 (1): 141–57.
Fogassi, L., V. Raos, G. Franchi, V. Gallese, G. Luppino, and M. Matelli. 1999. “Visual responses in the dorsal premotor area F2 of the macaque monkey”, Experimental Brain Research, 128 (1–2): 194–99.
Graziano, M. S. A. 1999. “Where is my arm? The relative role of vision and proprioception in the neuronal representation of limb position”, Proceedings of the National Academy of Sciences, 96 (18): 10418–21.
Graziano, M. S. and C. G. Gross. 1993. “A bimodal map of space: somatosensory receptive fields in the macaque putamen with corresponding visual receptive fields”, Experimental Brain Research, 97 (1): 96–109.
Graziano, M. S. and D. F. Cooke. 2006. “Parieto-frontal interactions, personal space, and defensive behavior”, Neuropsychologia, 44 (6): 845–59.
Graziano, M. S., C. S. Taylor, and T. Moore. 2002. “Complex movements evoked by microstimulation of precentral cortex”, Neuron, 34: 841–51.
Graziano, M. S., D. F. Cooke, and C. S. Taylor. 2000. “Coding the location of the arm by sight”, Science, 290 (5497): 1782–86.
Graziano, M. S., G. S. Yap, and C. G. Gross. 1994. “Coding of visual space by premotor neurons”, Science, 266 (5187): 1054–57.
Graziano, M. S., X. T. Hu, and C. G. Gross. 1997. “Visuospatial properties of ventral premotor cortex”, Journal of Neurophysiology, 77 (5): 2268–92.
Iriki, A., M. Tanaka, and Y. Iwamura. 1996. “Coding of modified body schema during tool use by macaque postcentral neurones”, Neuroreport, 7 (14): 2325–30.
Ishibashi, H., S. Hihara, and A. Iriki. 2000. “Acquisition and development of monkey tool-use: behavioural and kinematic analyses”, Canadian Journal of Physiology & Pharmacology, 78 (11): 958–66.
Làdavas, E. 2002. “Functional and dynamic properties of visual peripersonal space”, Trends in Cognitive Sciences, 6 (1): 17–22.
Làdavas, E. and A. Farnè. 2004a. “Neuropsychological evidence for multimodal representations of space near specific body parts”, in J. Driver and C. Spence (eds), Crossmodal Space and Crossmodal Attention, pp. 69–98. New York: Oxford University Press.
Làdavas, E. and A. Farnè. 2004b. “Visuo-tactile representation of near-the-body space”, Journal of Physiology Paris, 98 (1–3): 161–70.
Làdavas, E., G. di Pellegrino, A. Farnè, and G. Zeloni. 1998a. “Neuropsychological evidence of an integrated visuotactile representation of peripersonal space in humans”, Journal of Cognitive Neuroscience, 10 (5): 581–89.
Làdavas, E., G. Zeloni, and A. Farnè. 1998b. “Visual peripersonal space centered on the face in humans”, Brain, 121 (Part 12): 2317–26.
Làdavas, E., F. Pavani, and A. Farnè. 2001. “Auditory peripersonal space in humans: A case of auditory–tactile extinction”, Neurocase, 7 (2): 97–103.
Maravita, A. and A. Iriki. 2004. “Tools for the body (schema)”, Trends in Cognitive Sciences, 8 (2): 79–86.
Maravita, A., C. Spence, S. Kennett, and J. Driver. 2002. “Tool-use changes multimodal spatial interactions between vision and touch in normal humans”, Cognition, 83 (2): B25–34.
Mattingley, J. B., J. Driver, N. Beschin, and I. H. Robertson. 1997. “Attentional competition between modalities: extinction between touch and vision after right hemisphere damage”, Neuropsychologia, 35 (6): 867–80.
Rizzolatti, G., C. Scandolara, M. Matelli, and M. Gentilucci. 1981. “Afferent properties of periarcuate neurons in macaque monkeys. II. Visual responses”, Behavioral Brain Research, 2 (2): 147–63.
Rizzolatti, G., G. Luppino, and M. Matelli. 1998. “The organization of the cortical motor system: new concepts”, Electroencephalography and Clinical Neurophysiology, 106 (4): 283–96.
Rizzolatti, G., M. Matelli, and G. Pavesi. 1983. “Deficits in attention and movement following the removal of postarcuate (area 6) and prearcuate (area 8) cortex in macaque monkeys”, Brain, 106 (3): 655–73.
Rizzolatti, G., L. Fadiga, L. Fogassi, and V. Gallese. 1997. “The space around us”, Science, 277 (5323): 190–91.
Serino, A., M. Bassolino, A. Farnè, and E. Làdavas. 2007. “Extended auditory peripersonal space in blind cane users”, Psychological Science, 18 (5): 642–48.
Chapter 7
A Neurophysiological Correlate and Model of Reflexive Spatial Attention
Anne B. Sereno, Sidney R. Lehky, Saumil Patel, and Xinmiao Peng
The importance of the distinction between reflexive and voluntary orienting is often overlooked, despite the fact that much research has documented differences in behaviour, physiology, and anatomical structure that are critically involved in reflexive (passive, bottom-up) and voluntary (active, top-down) saccades (Briand et al., 1999; Klein et al., 1992; Munoz, 2002; Pierrot-Deseilligny et al., 1991; Sereno and Amador, 2006; Sereno and Holzman, 1996). With respect to covert orienting, or spatially selective attention, both a non-predictive peripheral cue and a centrally presented, predictive, symbolic cue will result, for a certain interval of time, in subjects being able to better detect, identify, and discriminate stimuli at the cued location. The peripheral cue reflexively draws attention, whereas the central cue causes a voluntary shift of attention. Much research has also shown that reflexive and voluntary attentional shifts differ in behaviour, including time-course and valence (facilitation versus inhibition), physiology, and brain structures (Corbetta et al., 1993; Rafal et al., 1989; Rosen et al., 1999; Sereno et al., 2006b). In this chapter, we focus on understanding and providing a neurophysiologically plausible mechanism of reflexive spatial attention. We will argue that the neurophysiological expression of inhibition of return (IOR) in the superior colliculus (SC) bears resemblance to repetition suppression effects previously reported in ventral cortical areas and, most recently, in a dorsal stream cortical area, the lateral intraparietal cortex (LIP). With a simple model of neurons that show an adaptive response, we demonstrate that the output of the network mimics the behavioural attentional findings that result after the onset of a non-predictive reflexive cue: namely, facilitation in responding to the target at short cue–target intervals and IOR at longer cue–target intervals.
Given that many neurons in both LIP and ventral stream cortical areas have been shown to be shape selective, the model makes a further prediction that the shape of the cue will influence
spatial attention. In particular, it predicts that a cue with the same shape as the target will suppress spatial attention, due to shape specific adaptation effects. Preliminary behavioural data from our own lab, as well as previously published data from different paradigms support this contention. In sum, we demonstrate that a neurophysiologically plausible model, whose neurons incorporate a simple adaptive mechanism, results in output that mimics the behavioural findings of reflexive attention, including both spatial and shape effects of the cue on the response to a subsequent target. The consequence of these findings for understanding reflexive spatial attention is briefly discussed: (a) A simple adaptive mechanism, similar to what has been previously described as repetition suppression, is sufficient to explain reflexive attentional effects, including both spatial and shape effects; (b) This adaptive mechanism has been demonstrated in the superior colliculus, LIP, and ventral stream areas and hence may be a property of many brain regions, suggesting that reflexive attentional effects are a ubiquitous and distributed property of the brain; (c) Given that different brain regions manifest different properties and selectivities, the adaptive effects are specific and dependent on the stimulus properties and organization of stimulus properties represented in each area. Thus, as we have demonstrated for shape (Patel et al., 2007), we propose that spatial attentional effects may appear in many forms; and finally, (d) Although some have proposed that the facilitation and inhibitory (IOR) effects due to reflexive attention may be independent effects that occur simultaneously, we show that both can be explained by a single adaptive mechanism. We also demonstrate that this single mechanism can account for the modulation of attentional effects that are dependent on the shapes of the cue and target. 
Thus, although we do not preclude independent facilitatory and inhibitory effects, we show that separate mechanisms are not a necessary requirement.
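To make the qualitative claim easier to picture, here is a deliberately simplified toy calculation. It is not the network model the authors refer to, whose details are not given in this section. It assumes, purely for illustration, that a cue leaves behind a fast-decaying residual response and a slower-decaying adaptation of response gain in the same neuron; every parameter name and value is hypothetical.

```python
import math


def cued_minus_uncued(ctoa_ms, drive=1.0,
                      residual_amp=1.0, residual_tau_ms=50.0,
                      adapt_amp=0.3, adapt_tau_ms=800.0):
    """Toy target response on cued trials minus uncued trials.

    Assumptions (illustrative only, not taken from the chapter):
    - the cue leaves residual activity that decays quickly,
    - the cue also adapts the neuron, scaling down its gain for a longer time,
    - an uncued neuron simply responds with `drive`.
    """
    residual = residual_amp * math.exp(-ctoa_ms / residual_tau_ms)
    adaptation = adapt_amp * math.exp(-ctoa_ms / adapt_tau_ms)
    cued = drive * (1.0 - adaptation) + residual
    uncued = drive
    return cued - uncued


for ctoa in (50, 100, 200, 500):
    diff = cued_minus_uncued(ctoa)
    label = "facilitation" if diff > 0 else "inhibition"
    print(f"CTOA {ctoa} ms: {diff:+.3f} ({label})")
```

Under these assumed values the cued response exceeds the uncued response at the shortest interval and falls below it at longer intervals, which is the sign pattern the chapter attributes to its adaptive model. The crossover point depends entirely on the arbitrary time constants chosen here.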
Inhibition of Return (IOR)
Behaviour
Much research has focused on the behavioural effects that occur in a reflexive spatial attention paradigm (for review, see Egeth and Yantis, 1997; Klein, 2000; Wright and Ward, 1998). A typical spatial attention task that elicits IOR is illustrated in Figure 7.1. When a stimulus, S1, is flashed in the visual field, there are two well documented behavioural phenomena that follow. First, within approximately 50 to 150 ms, the response to a second stimulus, S2, that appears in the same location as S1 (cued trial, Figure 7.1) compared to other locations of the visual field (uncued trial, Figure 7.1) is facilitated (light gray section, Figure 7.2). This reflexive spatial attentional facilitation has been documented for a variety of responses, including detection, localization, and discrimination of nonspatial features, and has been observed for both manual and saccadic responses (see Sereno et al., 2006a for a review).
Figure 7.1 Typical reflexive spatial attention task used to elicit IOR
Note: After the subject fixates, one of two boxes brightens (S1 or cue). Following a variable interval of time (CTOA), a second stimulus (S2 or target) appears and the subject responds as quickly as possible. The figure shows two possible target conditions. An uncued trial, where S2 appears in a location other than the location of S1. A cued trial, where S2 appears in the same location as S1. Figure adapted from Klein, 2000.
Second, the facilitation effect is followed at cue–target onset asynchronies (CTOAs) of 150 ms or more by an opposite, inhibitory effect; hence the name, inhibition of return (see dark gray section, Figure 7.2). IOR was first described by Posner (1980) and Posner and Cohen (1984). IOR has also been reported for a variety of classification responses and response modes (see Klein, 2000, for a review; but see also Khatoon et al., 2002). The standard interpretation of these behavioural findings is that a peripheral visual event automatically draws attention to the position of the stimulus. This initial reflexive shift of attention towards the source of stimulation results in facilitation of the processing of all nearby stimuli. However, when the event is not task-relevant, attention shifts away and there is an inhibition of attentional resources returning to previously attended locations. This inhibitory effect is measured as a delayed response to stimuli subsequently displayed at the originally cued location. Whether the slowing of response is due to inhibition of sensory analyses or to inhibition of motor response to previously cued locations remains debatable (Fecteau and Munoz, 2007; Sereno et al., 2006a; Taylor and Klein, 2000).
Figure 7.2 Typical behavioural results obtained in a reflexive spatial attention task
Source: Figures adapted from Klein, 2000.
Note: (A) With short cue–target onset asynchronies (CTOAs), subjects are faster to respond on cued trials, when S2 appears in the same location as S1. At longer CTOAs, subjects are faster to respond on uncued trials, when S2 appears in any location other than S1. Data from a task with manual response. (B) The time course of attentional effects in saccade paradigms. Data combined from eight studies. Data plotted as differences in saccadic reaction times (uncued minus cued) as a function of CTOAs. The period of reflexive spatial attentional facilitation is highlighted in light gray. The period of reflexive spatial attentional inhibition is highlighted in dark gray and referred to as inhibition of return (IOR).
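The cueing effect plotted in Figure 7.2B (uncued minus cued reaction time at each CTOA) can be computed as in the following sketch. The function names and the reaction times are hypothetical and are not taken from Klein (2000) or from any of the studies cited.

```python
def cueing_effects(rt_cued_ms, rt_uncued_ms):
    """Map each CTOA (ms) to uncued minus cued reaction time (ms)."""
    return {ctoa: rt_uncued_ms[ctoa] - rt_cued_ms[ctoa] for ctoa in rt_cued_ms}


def label(effect_ms):
    """Positive difference: cued trials are faster (facilitation).
    Negative difference: cued trials are slower (inhibition of return)."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "IOR"
    return "none"


# Made-up mean reaction times (ms) keyed by CTOA (ms), for illustration only.
rt_cued = {50: 280, 100: 300, 200: 330, 500: 335}
rt_uncued = {50: 300, 100: 300, 200: 310, 500: 315}

effects = cueing_effects(rt_cued, rt_uncued)
labels = {ctoa: label(effect) for ctoa, effect in effects.items()}
print(effects)
print(labels)
```

Plotting `effects` against CTOA would give a curve that starts above zero and crosses below it, matching the light gray and dark gray regions described in the figure note.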
Anatomical Localization
Several lines of evidence have suggested that the SC is involved in the generation of IOR (for review, see Sereno et al., 2006b). Much of this evidence comes from behavioural studies in humans. The first evidence supportive of collicular involvement in IOR came from studies performed in patients with progressive supranuclear palsy. A common early symptom is disruption of eye movements. This deficit is thought to be caused by degeneration in the SC and adjacent midbrain structures. In a series of studies, Posner, Rafal, and colleagues demonstrated that progressive supranuclear palsy patients showed deficits in early facilitation and later IOR (Posner et al., 1982; Posner et al., 1985; Rafal et al., 1988). Other behavioural evidence suggesting that the SC is involved in IOR comes from reports of differences in orienting effects between the temporal and nasal visual fields. The visual pathways leading into the SC include a greater representation of the contralateral nasal hemiretina (temporal visual field) than the ipsilateral temporal hemiretina (nasal visual field). In 1989, Rafal and colleagues demonstrated that subjects performing a spatial attention task monocularly showed larger IOR effects in the temporal versus the nasal hemifield, corresponding to the greater representation of the temporal hemifield in retinotectal projections. Finally, two reports have demonstrated that patients with lesions of the SC fail to show IOR (Sapir et al., 1999; Sereno et al., 2006b). Until recently, no direct neurophysiological evidence had been produced in support of localization of IOR in the SC.
Physiology
A series of recent studies (Bell et al., 2004; Dorris et al., 2002; Fecteau et al., 2004) has examined the responses of single neurons in the SC of monkeys performing a spatial attention task in search of neural correlates of reflexive spatial attention effects. Like humans, monkeys show behavioural facilitation on cued trials (see Figure 7.3, gray line) compared to uncued trials (Figure 7.3, black line) at a short CTOA, and inhibition for cued trials at longer CTOAs (also see Figure 7.4A, top panel). These studies have identified neural correlates of both early facilitation and later IOR effects in the intermediate layers of the SC. At short CTOA intervals that result in behavioural facilitation of response for cued trials, there is relatively stronger target-related activity when the cue and target appear at the same location versus different locations (Bell et al., 2004; Fecteau et al., 2004). In particular, Figure 7.4B (top panel) demonstrates that the response of an example neuron to the target, indicated by the gray box, is greater on cued trials (gray line) than on uncued trials (black line). At longer CTOA intervals that result in behavioural inhibition of response for cued trials, these studies further show that there is relatively weaker target-related activity when the cue and target appear at the same location compared to different locations. Figure 7.4B (bottom panel) demonstrates that the response of this neuron to the target, time period indicated by the gray box, is suppressed on cued trials (gray line) compared to uncued trials (black line). Dorris et al. (2002) showed that the magnitude of this suppressed response was correlated with subsequent slowing in saccadic reaction times. This attenuation of activity during IOR is direct evidence that the SC is involved in the manifestation of IOR.
Figure 7.3 Behavioural results of monkeys in a reflexive spatial attention task
Source: Figure adapted from Fecteau et al., 2004.
Note: The left panel shows that monkeys show a similar pattern to humans. Response times of cued trials when S2 (target) appears on the same side as S1 (cue) are shown by gray line. Response times of uncued trials, when S2 appears on the opposite side as S1, are shown by black line. These results show that monkeys also show facilitation on cued trials (gray line compared to black line) at the shortest CTOA, and inhibition on cued trials at the longer CTOAs (gray line compared to black line). The right panel displays the same data as differences between mean uncued (opposite) and cued (same) conditions, again showing facilitation at the shortest CTOA and inhibition at greater CTOAs.
Figure 7.4 Activity of SC neurons during a reflexive spatial attention task
Source: Figure adapted from Fecteau et al., 2004. Note: (A) Population averages for saccadic response times (top panel, same as right panel of Figure 7.3 showing the 5 CTOAs), pre-target-related activity (middle panel), and target-related activity (bottom panel). Pre-target activity (middle panel) shows maintained activity on cued trials at long CTOAs. Target-related activity (bottom panel) shows that population activity of cued trials compared to uncued trials are greater at the shortest CTOA, but suppressed at CTOAs of 100 and 200. (B) Neural activity of two representative SC neurons. Top panel shows the averaged activity of a neuron at the 50 ms CTOA when S1 and S2 appeared in the response field (gray line, cued trials) or just S2 appeared in the response field (black line, uncued trials) of the neuron. At the 50 ms CTOA, the behavioural response of the monkey to the target is facilitated on cued trials. Note that the neural response to the target (period highlighted in gray box) is greater on cued trials (gray) compared to uncued (black) trials. Bottom panel shows the averaged activity of a second neuron at the 200 ms CTOA when S1 and S2 appeared in the response field (gray line, cued trials) or when just S2 appeared in the response field (black line, uncued trials) of the neuron. At the 200 ms CTOA, the behavioural response of the monkey to the target is inhibited on cued trials. Note that the neural response to the target (period highlighted in gray box) is reduced on cued trials (gray) compared to uncued (black) trials. In addition, the bottom panel illustrates that during the pre-target period (white box immediately preceding the gray boxed target period) that there is maintained activity on cued trials (gray) compared to uncued trials (black).
Dorris et al. (2002), however, also argued that the SC is not the site of inhibition because the observed attenuation of activity during IOR was not caused by active inhibition of those neurons, which were, in fact, more active following the presentation of the first stimulus (cue, S1) in their response field. As shown in Figure 7.4B (bottom panel), the activity of the neuron after the presentation of S1 (cue), but before the presentation of S2 (target), is greater than baseline activity. That is, in the period of time immediately preceding the presentation of S2 (period marked by white box, immediately preceding target period marked in gray), the activity of neurons in SC is greater on cued trials (gray line) than on uncued trials (black line). When they repeated the same experiment and induced saccades by electrical microstimulation of the SC in order to assess the level of excitability of the SC circuitry during the IOR task, they found that faster saccades were elicited from the cued location than the uncued location. Thus, they concluded that in monkeys, the SC participates in the expression of IOR but is not the site of the inhibition. In particular, they suggested that reduced activity in the SC reflects a signal reduction that has taken place upstream, perhaps in posterior parietal cortex. This greater activity in the S1–S2 (cue–target) interval was perplexing because prior studies of motor preparation (for example, Dorris and Munoz, 1998; Dorris et al., 1997) had shown that similar increases in pre-target activity were associated with shorter, not longer, saccade reaction times. Nevertheless, this physiological pattern of suppression for a repeated stimulus and maintained activity between stimulus presentations has been reported before in physiological studies of cortical areas, primarily from recordings in temporal, but not parietal, cortical areas. We briefly review this literature in the next section.
Repetition Suppression
Ventral Stream Physiology
Visual sensory information proceeds through several cortical areas before it reaches inferotemporal cortex (Felleman and Van Essen, 1991). Neurons in inferotemporal cortex are sensitive to complex visual properties such as colour, shape, and facial structure (for reviews, see Logothetis and Sheinberg, 1996; Tanaka, 1996). Previous neurophysiological studies have also demonstrated that many neurons in inferotemporal cortex show a reduced response to a repeated stimulus. The reduction in neuronal response to a repeated stimulus has been variously described or labelled: decremental response (Brown et al., 1987; Fahy et al., 1993); adaptive filtering (Desimone, 1992); stimulus specific adaptation (Ringo, 1996); or response suppression (Desimone, 1996); see Brown and Xiang (1998) for review. These effects have been reported from a number of different laboratories (Brown et al., 1987; Miller et al., 1993; Sobotka and Ringo, 1993) from recordings in various regions of inferotemporal cortex, including area TE, perirhinal cortex, and entorhinal cortex (Fahy et al., 1993). Figure 7.5 illustrates a cell in perirhinal cortex that demonstrates a repetition suppression effect. The average response of the cell to 10 different stimuli is shown for the first (Figure 7.5B) and a repeated presentation (Figure 7.5C). Repetition suppression effects are also evident in population measures. Figure 7.6 illustrates average responses across a population of cells recorded in anterior inferotemporal cortex (AIT) while the animal performed one of two tasks (see Figure 7.6A). Figure 7.6B demonstrates that whether a repeated stimulus was a match (in a standard delayed match to sample task; gray bars with vertical stripes) or a nonmatch (in a delayed match to sample task with intervening nonmatch repeats; gray bars with horizontal stripes), there was a reduced response to the repeated stimulus compared to its first presentation in a trial (sample, black bar; and nonmatch, white bars). Interestingly, a repeated visual stimulus also elicits a reduced neural response in functional magnetic resonance imaging (fMRI). This decrease in blood oxygenation level-dependent (BOLD) response has been variously labelled fMRI-adaptation, repetition attenuation, or repetition suppression (Grill-Spector et al., 2006; Henson, 2003; Schacter and Buckner, 1998; Wiggs and Martin, 1998; Xu et al., 2007). In the last decade, this repetition suppression effect has been used in fMRI studies to measure neuronal selectivity in different regions of ventral visual cortex (for example, Epstein et al., 2003; Grill-Spector et al., 1999; Kourtzi and Kanwisher, 2001).
Figure 7.5 Stimulus repetition suppression effects in the ventral stream during a serial recognition task
Source: Figure adapted from Fahy et al., 1993.
Note: (A) Schematic representation of the task. The monkey was presented with different unfamiliar objects and pictures (indicated by the letters A, B, C, and D) and taught that the correct response was a left reach (L) to the first presentation and a right reach (R) to a subsequent presentation of a stimulus. The first (black box) and repeated (gray box) presentations of stimulus A and D are highlighted with boxes. (B) Histograms of a representative perirhinal neuron with activity averaged either across the first (L response) or repeat (R response) presentations of 10 different unfamiliar stimuli.
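One simple way to put a number on the reduced response to a repeated stimulus is a contrast index between the first and the repeated presentation. The chapter does not define such an index; the formula, the function name, and the firing rates below are assumptions made only for illustration.

```python
def repetition_suppression_index(first_rate_hz, repeat_rate_hz):
    """Contrast index (first - repeat) / (first + repeat).

    Positive values indicate a weaker response to the repeated stimulus.
    This index is an illustrative assumption, not a measure taken from the chapter.
    """
    total = first_rate_hz + repeat_rate_hz
    if total == 0:
        return 0.0  # no response to either presentation
    return (first_rate_hz - repeat_rate_hz) / total


# Hypothetical mean firing rates (spikes/s) for one neuron.
index = repetition_suppression_index(first_rate_hz=30.0, repeat_rate_hz=20.0)
print(round(index, 3))
```

Because the index is normalized by the summed response, it allows areas with different overall firing rates to be compared on the same scale.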
Lateral Intraparietal Cortex (LIP)
Little research has explored stimulus specific repetition suppression effects in the dorsal stream visual areas. In part, this may be due to the widely held presumption that the ventral stream is important for object properties whereas the dorsal stream is important for spatial processing (Ungerleider and Mishkin, 1982; see also Figure 7.7). However, recent reports in the monkey (Sereno and Amador, 2006; Sereno and Maunsell, 1998; M. E. Sereno et al., 2002) demonstrate that shape information is present in neurons at a high level area of the dorsal stream, the lateral intraparietal cortex (LIP). In a comparison of shape encoding between dorsal and ventral visual pathways, Lehky and Sereno (2007) demonstrate that stimulus repetition of a 2D geometric shape in a passive fixation task (Figure 7.8) causes a response decrement in both AIT and LIP. This repetition suppression effect is apparent in plots of the peristimulus responses (Figure 7.9) averaged over all shape stimuli and all shape selective cells in AIT (black lines) and LIP (gray lines). As illustrated in Figure 7.9, averaged population responses of neurons in AIT and LIP are greater in the first presentation of a stimulus in a trial (solid lines) compared to subsequent stimulus repetitions within a trial (dashed lines). These findings document the first report of shape selective repetition suppression effects in LIP (see arrow labelled RS in Figure 7.9). Figure 7.9 also shows that neurons in LIP have significantly higher average firing rates to the various shapes than do neurons in AIT. After normalization, however, Lehky and
114 Anne B. Sereno et al. Figure 7.6 Stimulus repetition suppression effects in the ventral steam during delayed match-to-sample tasks
Source: Figure adapted from Miller and Desimone, 1994. Note:
(A) Schematic representation of the tasks. The monkey was trained to initiate a trial by holding down a lever. After fixating a small fixation target, up to five different familiar objects and pictures were presented (e.g., butterfly, umbrella). The first stimulus of the trial was the sample and the animal was trained to release the lever when the same stimulus (match) was presented again. In the standard task, the intervening nonmatch stimuli were never repeated. In the ABBA design, interleaving nonmatch stimuli could also repeat. (B) Average activity across neurons with a significant repetition suppression effect in inferotemporal cortex. Panel B illustrates that repeated presentation of a stimulus, whether a match (gray bars with vertical stripes) or repeated nonmatch (gray bars with horizontal stripes) was reduced compared to the first presentation of the stimulus in a trial, either as the sample (black bar) or the first presentation of a nonmatch (white bars).
Reflexive Spatial Attention
115
Figure 7.7 Schematic localization of visual pathways in the macaque brain
Source: Figure adapted from Lehky and Sereno, 2007. Note:
This indicates a major visual areas along the dorsal pathway (solid arrows) and ventral pathway (dashed arrows). The lateral intraparietal area (LIP), located on the lateral bank within the intraparietal sulcus (IPS), is a high-level area in the dorsal pathway. Anterior inferotemporal cortex (AIT), including the lower bank of the superior temporal sulcus (STS) and convexity of the middle temporal gyrus, is a high-level visual area in the ventral pathway. The frontal eye field (FEF), including cortex in the rostral bank of the arcuate sulcus (AS), receives projections from both LIP and AIT. Dorsolateral prefrontal cortex (dlPFC), including the cortex of the principal sulcus (PS) and dorsal to the PS, and ventrolateral prefrontal cortex (vlPFC), including cortex ventral to the PS, are prefrontal cortical areas receiving heavy projections from LIP and AIT, respectively. (LuS: Lunate sulcus; LaS: Lateral sulcus; CS: Central sulcus).
Sereno, (2007) demonstrated that repetition suppression effects were not significantly different for the two areas (see Figure 7.10). Miller et al. (1993) and Holscher and Rolls (2002) have argued that there is an active reset mechanism in AIT that restricts the repetition suppression effect to stimuli presented within a single trial such that suppression does not continue even in the short duration until the next trial. This suggests that repetition suppression is an active, task-related cognitive process, not simply a biophysical adaptation effect. In our data, repetition suppression effects in LIP neurons are consistent with reset between short blocks of trials (see Figure 7.11). However, the average time between presentation onsets within a trial in Figure 7.11 was approximately 900 ms whereas the
116 Anne B. Sereno et al. Figure 7.8 Schematic diagram of the fixation task (one location, eight shapes)
Note: The stimulus for each trial in the fixation task is selected from a set of eight different shapes. All shapes are centred on the same position within the cell’s receptive field. For each trial, one randomly selected shape is presented for typically four repetitions before a central fixation spot is extinguished. Stimulus duration and the blank interval following each stimulus repetition are constant (typically 350 ms and 750 ms, respectively). The animal is required to maintain fixation within 0.5° of the central 0.1° spot in the centre of the video display throughout the trial. The animal is rewarded for maintaining fixation of the central spot until it disappears. A minimum of six trials is presented for each shape.
average time between presentations between blocks was minimally 21 sec (average of 4 trials). A relatively fast recovering adaptation function would appear to reset between trials. Hence, it remains for quantitative studies across different brain regions to determine whether repetition suppression is a function of time versus trial and task structure.
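The point that a fast-recovering adaptation function can look like a reset between trials can be illustrated with a small worked example. The sketch below assumes, purely for illustration, that adaptation recovers exponentially with a single time constant; the function name `recovered_fraction` and the value of `TAU_S` are hypothetical and are not taken from the recordings described here. Only the two intervals (about 900 ms within a trial, at least 21 s between blocks) come from the text.

```python
import math

def recovered_fraction(interval_s: float, tau_s: float) -> float:
    """Fraction of the suppressed response that has recovered after interval_s,
    under an ASSUMED exponential recovery with time constant tau_s."""
    return 1.0 - math.exp(-interval_s / tau_s)

# Hypothetical time constant (seconds); chosen only to illustrate the argument.
TAU_S = 3.0

# Intervals reported in the text.
within_trial = recovered_fraction(0.9, TAU_S)     # ~900 ms between onsets within a trial
between_blocks = recovered_fraction(21.0, TAU_S)  # at least 21 s between blocks

print(f"recovered within a trial: {within_trial:.2f}")
print(f"recovered between blocks: {between_blocks:.3f}")
```

Under this assumed time constant, only about a quarter of the suppression has recovered within a trial, whereas recovery is essentially complete between blocks. A purely time-dependent process can therefore mimic an apparent reset, which is why time and task structure need to be dissociated experimentally.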
Figure 7.9 Activity of AIT and LIP shape selective neurons during a passive fixation task with repeated stimulus presentations within a trial
Source: Figure adapted from Lehky and Sereno, 2007. Note:
Solid lines show responses to the first presentation, whereas dashed lines show average response over all subsequent stimulus repetitions within a trial. Gray bar at bottom indicates stimulus presentation period, which varied between 250 (darker gray) and 350 ms in different units. The arrow labelled (RS) indicates repetition suppression effects in LIP. A similar reduction in response can be seen in AIT responses (black lines). Activity preceding the zero on the x-axis shows baseline activity (solid lines) or maintained activity between stimulus repetitions (dashed lines). The arrow labelled (MA) indicates maintained activity in LIP. A similar increase in baseline response can be seen in AIT responses (black lines).
Maintained Activity in AIT, LIP, and SC

Lehky and Sereno (2007) also called attention to another aspect of AIT and LIP responses that is apparent in Figure 7.9, namely, that the activity of neurons in both areas does not return to baseline between stimulus presentations within a trial. This maintained activity between presentations is most clearly seen on the left side of Figure 7.9 (see arrow labelled MA) where the average baseline firing rate before subsequent presentations in a trial (dashed lines) is greater than the firing rate before the first presentation of the stimulus (solid lines). Interestingly, this maintained activity in AIT and LIP resembles the elevated baseline activity reported by Dorris et al. (2002) in the SC. As with repetition suppression effects, maintained activity may be a ubiquitous reflexive response property occurring in many areas of the brain.

Figure 7.10 Repetition suppression effects in AIT and LIP
Source: Figure adapted from Lehky and Sereno, 2007.
Note: In both cortical areas, repeat presentations of the same stimulus within a trial produced a decreased response relative to the first presentation. Responses for each cell have been normalized relative to the first stimulus presentation and then averaged together for each cortical area.
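The normalization described for Figure 7.10 (each cell's responses divided by its response to the first presentation, then averaged within an area) can be sketched as follows. The firing rates are made-up numbers chosen so that the hypothetical LIP cells fire at higher raw rates than the hypothetical AIT cells; the function names are illustrative and do not come from Lehky and Sereno (2007).

```python
from typing import List

def normalize_to_first(responses: List[float]) -> List[float]:
    """Divide every response of one cell by its response to the first presentation."""
    first = responses[0]
    return [r / first for r in responses]

def population_average(cells: List[List[float]]) -> List[float]:
    """Normalize each cell, then average across cells for each presentation number."""
    normalized = [normalize_to_first(cell) for cell in cells]
    return [sum(values) / len(values) for values in zip(*normalized)]

# Hypothetical firing rates (spikes/s) for four presentations within a trial.
ait_cells = [[20.0, 15.0, 14.0, 13.0], [10.0, 8.0, 7.0, 7.0]]
lip_cells = [[40.0, 30.0, 28.0, 26.0], [60.0, 48.0, 42.0, 42.0]]

ait_curve = population_average(ait_cells)
lip_curve = population_average(lip_cells)
print(ait_curve)
print(lip_curve)
```

With these made-up rates the two normalized curves coincide even though the raw LIP rates are two to six times higher, which is the sense in which suppression can be compared across areas with different overall firing rates.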
Figure 7.11 Repetition suppression effects in LIP across six blocks of trials
Note: Repeat presentations of the same stimulus within a trial produced a decreased response relative to the first presentation, but the average activity largely recovered between blocks. Each block consisted of eight trials (one for each shape stimulus) presented in random order. Data were pooled over all neurons showing shape selectivity and included responses for all stimuli.

Model of Reflexive Spatial Attention

Repetition Suppression as a Manifestation of IOR

Recent neurophysiological studies in SC have shown that at longer CTOA intervals that result in IOR, there is relatively weaker target-related activity when cue and target appear at the same, compared to different, locations (Bell et al., 2004; Dorris et al., 2002; Fecteau et al., 2004). These changes in the neural representation of the target were tightly coupled with subsequent saccadic reaction times. This repetition suppression effect is direct evidence that the SC is involved in the manifestation of IOR.
A Model with Repetition Suppression as the Mechanism of Reflexive Spatial Attention

We hypothesize that behavioural reflexive spatial attentional effects can be explained by repetition suppression of neuronal responses. Given that different brain regions are selective for different stimulus properties, we suggest that suppression effects may be specific and dependent on the stimulus properties represented in each area. If repetition suppression effects in other cortical areas are related to the manifestation of reflexive attentional effects, as they are in the SC, there may be a form of reflexive spatial attention that is indeed sensitive to the shape of the cue and target. In many reflexive attention tasks, the cue has a different shape than the target (for example, see Figure 7.1). To model the effects of repetition suppression while allowing for shape selectivity, we created a small network of four neurons (two shape selective neurons for each of two locations), each with an adaptive mechanism, mutually inhibitory spatial interactions, and non-linear dynamics (see Figure 7.12; Patel et al., 2007). In our model neuron, the adaptive mechanism is identical to those utilized in model neurons of retinal computations (Abbott et al., 1997; Grossberg, 1972; Ogmen, 1993). Such models are also employed in perceptual models of blur discrimination (Purushothaman et al., 2002) and visual masking (Ogmen et al., 2003).

Figure 7.12 Proposed network model of reflexive spatial attention
Note: This simple model consists of four neurons. The receptive fields of Neurons 1 and 2 represent Location 1, whereas the receptive fields of Neurons 3 and 4 represent Location 2. In addition, Neurons 1 and 3 prefer one shape (circle), and Neurons 2 and 4 prefer a second shape (triangle). Each neuron can be excited directly by visual stimuli (cue or target). Neurons from one location inhibit neurons representing other locations (thin gray lines) via tonic interneurons (IN1 & IN2). For simplicity, this mutual inhibition is shown to originate after the outputs of all the neurons at one location are summed. The activities of all neurons representing a given location are summed (S) and the larger of the two sums (MAX) is designated as the output of the model. The output of the model is used as the modulatory component of the behavioural response. Greater model output facilitates behavioural response and weaker model output slows behavioural response. Similar to psychophysical studies, the attentional effects of the model are then computed by comparing behavioural responses in the uncued and cued conditions.

We compute the model’s output by first summing the activity of the two shape selective neurons at each location. The model’s output is the larger of the summed activities at the two locations. We did not explicitly model the MAX readout mechanism but have used a scheme that can be implemented by a competitive winner-take-all type neural network (for example, Lo and Wang, 2006). An alternative implementation in which the model’s output is equal to the difference between the summed activities at the two locations, yielded qualitatively similar results. Figure 7.13 illustrates the response of individual neurons in the model (black lines; solid, dashed, and dotted to match neuron pattern in Figure 7.12) and the output of the model (gray line) to the presentation of a cue (S1) and target (S2) under three different cueing conditions (rows) at three different CTOAs (columns). The first two rows (Figures 7.13A and 13B) represent two types of cued trials. The first row (Figure 7.13A) represents a cued trial in which the target has the same shape as the cue (target neuron response shown with solid black line), and the second row (Figure 7.13B), a cued trial in which the target has a different shape than the cue (target neuron response shown with dashed black line). The third row (Figure 7.13C) represents uncued trials in which the target appears in a different location than the cue (target neuron response shown with dotted black line). The period of time highlighted in gray in each of the nine graphs of Figure 7.13 represents the period over which the model’s output is integrated to compute latency modulation of the behavioural response to the target. This period highlights each cell’s activity associated with the target presentation (black lines) as well as the output of the network (gray line). The output of the model to the target in a particular cueing condition was used to compute the modulatory component of the response time (mRT) for that condition (higher output was associated with shortening of the response time).
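The readout stage described above (sum the two shape selective neurons at each location, then take the larger sum) can be sketched in a few lines. This is a minimal illustration of the readout only, not of the neuron dynamics. The activity values are made up, and `difference_readout` uses the absolute difference, which is an assumption, since the sign convention of the alternative readout is not stated in the text.

```python
from typing import Dict

# Neuron -> location, following Figure 7.12 (Neurons 1, 2 at Location 1; 3, 4 at Location 2).
LOCATION_OF = {1: 1, 2: 1, 3: 2, 4: 2}

def location_sums(activity: Dict[int, float]) -> Dict[int, float]:
    """Sum (S) the activity of the shape selective neurons at each location."""
    sums = {1: 0.0, 2: 0.0}
    for neuron, value in activity.items():
        sums[LOCATION_OF[neuron]] += value
    return sums

def max_readout(activity: Dict[int, float]) -> float:
    """Model output as the larger of the two location sums (MAX)."""
    return max(location_sums(activity).values())

def difference_readout(activity: Dict[int, float]) -> float:
    """Alternative readout: difference between the location sums (absolute value ASSUMED)."""
    sums = location_sums(activity)
    return abs(sums[1] - sums[2])

# Hypothetical instantaneous activities during the target period.
activity = {1: 0.6, 2: 0.1, 3: 0.2, 4: 0.0}
print(max_readout(activity), difference_readout(activity))
```

In the full model this output would be integrated over the gray target period before being related to the response time; the sketch evaluates a single time point only.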
Attentional effects were then computed by comparing the model’s output to the target across conditions (uncued mRT minus cued mRT) and are illustrated in Figure 7.14 for the two different spatial cueing conditions (same shape of cue and target, dashed line; different shape of cue and target, solid line).

Figure 7.13 Simulated outputs of the model at three different CTOAs during a reflexive attention task
Note: Panels A and B depict two types of cued conditions (CUED) where the target is presented at the same location as the cue (Location 1 indicated by L1 in the left margin). In Panel A, the target (a circle, activating Neuron 1, solid neuron) has the same shape as the cue, whereas in Panel B the target (a triangle, activating Neuron 2, dashed neuron) has a different shape than the cue. Panel C depicts an uncued condition (UNCUED) where the target is presented at a different location (Location 2 or L2) than the cue (L1). The target in Panel C could be either a circle or triangle, activating either Neuron 3 or 4 (dotted neurons). For each graph, the activity of each model Neuron is indicated by the black lines (solid or broken) (see figure legend in upper left graph). For simplicity of illustration, activities of only 3 of the neurons are depicted (only Neuron 3 is shown at Location 2). Identical results would be obtained if the target were presented to Neuron 4 instead of Neuron 3 in the uncued condition. The gray line in each graph indicates the output of the model. The shaded gray region in each graph represents a 50 ms period of time shortly after target presentation, over which the output is integrated for association with the behavioural response to the target (target related output, see Figure 7.14). (A) Simulated output of the model during a cued trial where the target (S2) has the same shape as the cue (S1). Firing traces (relative response magnitude) of model Neurons 1 (solid), 2 (dashed), and 3 (dotted) in Figure 7.12 in response to a cue (circle in Location 1, presented at time 0) and target (circle in Location 1) presented at either 50 ms in the first column, 250 ms in the second column, or 1000 ms in the third column. Both cue and target are presented briefly for 30 ms. In panel A, the first peak response of Neuron 1 (solid black line), labelled R0 in the third graph, indicates the magnitude of the normal unadapted response of the neuron to a stimulus, and AR indicates the magnitude of an adapted response of Neuron 1 to the repetition of the same stimulus at three different intervals following the initial stimulus (AR50, AR250, AR1000 for CTOAs of 50 ms, 250 ms, and 1000 ms). (B) Output of the model during a cued trial where the target (S2) has a different shape than the cue (S1). Firing traces of model Neurons in response to a cue (circle in Location 1) and target (triangle in Location 1). All conventions remain the same as in Panel A. (C) Firing traces of model Neurons in response to cue (circle) and target (circle or triangle). Here, the cue is presented at Location 1, whereas the target is presented at Location 2, corresponding to an UNCUED condition.

Figure 7.14 Simulation of reflexive spatial attention and the influence of shape
Note: To calculate the effect of reflexive attention at a particular CTOA, we compare the target related output (see gray shaded region in the individual graphs of Figure 7.13) in a cued condition (Figure 7.13, Panel A: cue and target, same shape, or Panel B: cue and target, different shapes) versus an uncued condition (Panel C). The difference between the target related output in a cued and uncued condition as a function of cue–target onset asynchrony (CTOA) simulates the behavioural effect of reflexive spatial attention. The solid curve shows the simulated reflexive spatial attention effects in a standard attention paradigm in which the cue and target have different shapes (comparing model output in Panels B and C in Figure 7.13). The dashed curve shows the spatial attention effects when the cue and target have the same shape (comparing model output in Panels A and C in Figure 7.13). The three black vertical lines denote the three CTOA conditions that are depicted in Figure 7.13, namely CTOAs of 50, 250, and 1000 ms. Note that behavioural responses are slowed when the shapes of cue and target are the same. As a consequence, at a relatively early CTOA of 250 ms, the standard attentional paradigm (different shape of cue and target, solid curve) results in behavioural facilitation, whereas the same shape attentional condition (same shape of cue and target, dashed curve), results in behavioural inhibition.

We now briefly describe in greater detail the simulation results of the model for a cued trial in which the target has the same shape as the cue (Figure 7.13A). Each of the three graphs in Figure 7.13A illustrates the model’s simulation results for one of three CTOAs (50, 250, and 1000 ms). In each graph, the first peak response of Neuron 1 (solid black line; arrow labelled R0 in the third graph) indicates the magnitude of the normal unadapted response of the neuron to a stimulus cue. For each graph, AR indicates the magnitude of an adapted response (AR) of Neuron 1 to the repetition of the same stimulus (target) at three different cue–target intervals following the initial stimulus (labelled AR50, AR250, AR1000 for CTOAs of 50 ms, 250 ms, and 1000 ms). As illustrated in the first graph (CTOA of 50 ms) in Figure 7.13A, the response to the target is the most attenuated, indicated by the relatively small increase in activity associated with presentation of the target (indicated by arrow labelled AR50). In this cued condition, given that the activity of the neuron is high at the time of target presentation, due to temporal proximity of the cue response, even a weak adapted response to the target (second peak of the solid black line) is greater than the initial cue response (first peak of solid black line). More importantly, the output of the model to the target at the shortest CTOA (gray line during the gray highlighted period in the first graph, Figure 7.13A) is greater than the output to the target on uncued trials (gray line during the gray highlighted period in the first graph, Figure 7.13C). This period of time corresponds to the period of behavioural facilitation illustrated in Figure 7.14 by the position of the dashed curved line above 0 at a CTOA of 50 ms (indicated by the first vertical line, labelled 50).

At slightly longer CTOAs (second graph of Figure 7.13A), the response to the target begins to recover. The adapted response, indicated by arrow labelled AR250, increases compared to AR50. Nevertheless, in this cued condition, the response to the target (second peak of solid black line, second graph, Figure 7.13A) is suppressed and smaller than the neuron’s response to the initial cue (first peak of solid black line). However, due to a non-adapted response in the uncued location coupled with inhibitory spatial interactions, the neuronal response to a stimulus in an uncued location is actually slightly greater (gray highlighted region of dotted black line, second graph, Figure 7.13C). Together, these effects result in a weaker output of the model in the cued versus uncued condition and, thus, a slower behavioural response to the target in the cued versus uncued condition. This period of time corresponds to the onset of behavioural inhibition or IOR, as illustrated in Figure 7.14 by the position of the dashed curved line below 0 at a CTOA of 250 ms (indicated by the second vertical line, labelled 250).

At even longer CTOAs (third graph of Figure 7.13A), the response of the neuron to the target has largely recovered from adaptation. The adapted response, indicated by arrow labelled AR1000, is now nearly equivalent to the initial response to the cue, arrow labelled R0.
However, in an uncued location (gray highlighted region of dotted black line, third graph, Figure 7.13C), due to persistent adaptation effects and inhibitory spatial interactions, the response of a neuron to a target continues to be greater than the response to a target in the cued condition (gray highlighted region or second peak of solid black line, third graph, Figure 7.13A). Together, these effects result in a weaker output of the model in the cued versus uncued condition and, thus, a slower response to the target. This period of time corresponds to behavioural inhibition or IOR, as illustrated in Figure 7.14 by the position of the dashed curved line below 0 at a CTOA of 1000 ms (indicated by the third vertical line, labelled 1000). The dynamic properties of these model neurons, including repetition suppression and maintained activity, qualitatively agree with previous reports of neurophysiological recordings in SC, AIT, and LIP. Further, the output of the model to a target presented at a cued location (Figure 7.13A, 13B) compared to an uncued location (Figure 7.13C), at different CTOAs, qualitatively agrees with standard reflexive attentional effects (see curves in Figure 7.14). That is, greater output on cued trials to a repeated stimulus at short intervals results in behavioural facilitation, and reduced output on cued trials to a repeated stimulus at longer intervals results in behavioural inhibition (IOR). Hence, a simple network model composed of shape selective cells with an adaptive mechanism appears to be sufficient to explain reflexive spatial attentional cueing effects.
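The recovery of the adapted response with increasing CTOA (AR50 < AR250 < AR1000, approaching R0) can be illustrated with a deliberately reduced sketch. It assumes a single depressing gain that drops by a fixed fraction when the cue is presented and recovers exponentially, in the spirit of depression-type adaptive mechanisms. The parameter values `depletion` and `tau_ms` are hypothetical, and the sketch omits the residual cue activity, non-linear dynamics, and spatial inhibition of the model described above, so it captures only repetition suppression and not the early facilitation.

```python
import math

def adapted_response(ctoa_ms: float,
                     depletion: float = 0.6,   # hypothetical fraction of gain lost to the cue
                     tau_ms: float = 400.0,    # hypothetical recovery time constant (ms)
                     r0: float = 1.0) -> float:
    """Peak response to a repeated stimulus presented ctoa_ms after the first one.

    The gain falls to (1 - depletion) at cue onset and relaxes back towards 1
    exponentially; the response to the repeat is r0 scaled by the current gain.
    """
    gain = 1.0 - depletion * math.exp(-ctoa_ms / tau_ms)
    return r0 * gain

R0 = 1.0  # unadapted response, in arbitrary units
ar = {ctoa: adapted_response(ctoa, r0=R0) for ctoa in (50, 250, 1000)}
for ctoa, value in ar.items():
    print(f"AR{ctoa}: {value:.2f} of R0")
```

With these assumed parameters the adapted response is strongly attenuated at 50 ms, partly recovered at 250 ms, and close to R0 at 1000 ms, matching the ordering of AR50, AR250, and AR1000 described for Figure 7.13A.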
Prediction of the Model: Effect of Shape on Reflexive Spatial Attention

Although repetition suppression effects appear to be sufficient to explain reflexive spatial attentional cueing effects, cortical brain regions are sensitive to and selective for different properties and previous studies have demonstrated that repetition suppression effects are specific and dependent on the stimulus properties represented, such as shape (Lehky and Sereno, 2007). If such suppression effects in these cortical areas result in reflexive attentional effects, then some form of reflexive spatial attention should be sensitive to the shape of the cue and target. To model the effects of the cue and target shape on reflexive attention, we compared the output of the model when the target had the same shape as the cue (Figure 7.13A) versus when it had a different shape (Figure 7.13B). As illustrated in Figure 7.14, the solid curve shows behavioural results of the model for a standard reflexive attention task, where cue and target shape differ. Output from the model produces facilitation at short CTOAs and inhibition at longer CTOAs. However, when the cue and target have the same shape, early spatial facilitation is reduced, resulting in an earlier onset of IOR (Figure 7.14, dashed curve). The model thus predicts that some spatial attention effects may depend on the visual similarity of the cue and target.
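The comparison behind Figure 7.14 (target related output in a cued condition versus the uncued condition, as a function of CTOA) can be sketched as below. All output values are invented to reproduce the qualitative pattern described in the text and are not simulation results; the function and variable names are illustrative. A positive difference is read as facilitation because higher model output is associated with a shorter response time.

```python
from typing import Dict

def cueing_effect(cued_output: float, uncued_output: float) -> float:
    """Cued minus uncued target related output; positive values mean facilitation."""
    return cued_output - uncued_output

def classify(effect: float, tol: float = 1e-9) -> str:
    """Label the sign of a cueing effect."""
    if effect > tol:
        return "facilitation"
    if effect < -tol:
        return "inhibition"
    return "none"

# Hypothetical target related outputs (arbitrary units), keyed by CTOA in ms.
uncued: Dict[int, float] = {50: 1.00, 250: 1.00, 1000: 1.00}
cued_different_shape: Dict[int, float] = {50: 1.40, 250: 1.10, 1000: 0.90}
cued_same_shape: Dict[int, float] = {50: 1.20, 250: 0.90, 1000: 0.85}

effects_different = {c: classify(cueing_effect(cued_different_shape[c], uncued[c])) for c in uncued}
effects_same = {c: classify(cueing_effect(cued_same_shape[c], uncued[c])) for c in uncued}
print("different shape:", effects_different)
print("same shape:     ", effects_same)
```

With these illustrative numbers, the same-shape condition crosses from facilitation to inhibition earlier (already at 250 ms) than the different-shape condition, which is the earlier onset of IOR that the model predicts.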
Spatial Attention is Influenced by the Shape Similarity of the Cue and Target

Some previous reports have suggested that stimulus attributes of the cue and target, such as colour, orientation, and shape, may affect spatial attentional effects. Kwak and Egeth (1992) showed that response times for CTOAs ranging from 300 to 900 ms were faster when the orientation of cue and target were the same. On the other hand, Riggio, Patteri, and Umilta (2004) showed that response times at 250 and 500 ms were slowed when the shape of cue and target were the same. Studies examining letter (Corballis and Armstrong, 2007) and word (Kanwisher, 1987) repetitions have similarly demonstrated a substantial inhibitory effect on recognition performance for repetition intervals between 100 and 700 ms. In a recent series of experiments, we also showed that the shapes of cue and target influence reflexive spatial selective attention. At some CTOAs, using the same shape for cue and target reduces or even eliminates the early facilitation, in qualitative agreement with the predictions of the model illustrated in Figure 7.14 (Patel et al., 2007).

Implications of Widespread Repetition Suppression Effects

A Simple Adaptive Mechanism as the Basis of Reflexive Spatial Attention

We hypothesize that reflexive spatial attentional effects, including effects of both a cue’s location and its shape on the behavioural response to a subsequent target, can be explained by repetition suppression. We created a model consisting of neurons whose dynamic properties were similar to those of neurons in areas AIT, LIP, and SC, and found that the model’s simulations qualitatively agree with psychophysical data (Patel et al., 2007), suggesting that these properties are sufficient to explain spatial attentional cueing effects.

Repetition Suppression as a Ubiquitous and Distributed Property in the Brain

As reviewed above, repetition suppression effects have been shown repeatedly in several areas in the ventral stream. Such effects for 2D shapes are also present in LIP, a high-level dorsal stream area (Lehky and Sereno, 2007). Repetition suppression appears to be a ubiquitous and distributed reflexive response property occurring in many areas of the brain.

Many Forms of Reflexive Spatial Attention and IOR

Given that different brain regions are sensitive to and selective for different stimulus attributes, we suggest that suppression effects may be specific and dependent on the stimulus properties represented in each area. If repetition suppression is related to the manifestation of reflexive spatial attention as it is in the SC, then there may be many forms of IOR. Attentional effects, then, may depend on the stimulus properties and organization of stimulus properties represented in different brain areas. In support of this contention, in a recent series of experiments, we show that the shape of cue and target can influence reflexive spatial selective attention (Patel et al., 2007).

LIP as the Upstream Source of IOR in SC

As reviewed above, Dorris et al. (2002) suggested that the attenuated activity in the SC to a repeated stimulus reflects a signal reduction that has taken place upstream of the SC, perhaps in posterior parietal cortex. We have demonstrated that neurons in LIP, an area in posterior parietal cortex with heavy projections to the SC, show reduced responses to a repeated stimulus. Accordingly, LIP could be the upstream source of the attenuated signals in SC. Alternatively, as we demonstrate with our model using a simple adaptive mechanism, the neurons need not receive any additional external suppressed signals to exhibit a suppressed or adapted response. Hence, both areas may create unique and independent neural correlates of reflexive attention. Indeed, we have recently demonstrated that cells in LIP represent reflexive attentional and mnemonic properties more robustly than voluntary ones (Sereno and Amador, 2006).

Facilitation versus Inhibition

Because the neural mechanisms underlying reflexive attentional cueing effects were not well understood, some previous investigators suggested that facilitation and IOR effects due to reflexive spatial attention may be independent effects that occur simultaneously, but whose magnitudes follow different time courses (Klein, 2000; Ro and Rafal, 1999; Tipper et al., 1997). We demonstrate that, with our model composed of neurons with a simple adaptive mechanism, we can elicit either increased responses to the target (behavioural facilitation) or suppressed responses to the target (behavioural inhibition). We also show that, by manipulating the shapes of the cue and target, we can induce an additional inhibitory cueing effect. Thus, while not ruling out the possibility of independent facilitatory and inhibitory mechanisms, our model results, showing facilitation, inhibition, and even modulations that appear specific to either facilitation or inhibition, do not necessitate independent mechanisms.
Comparison to Previous Computational Models of Visual Attention

Several elaborate computational models of visual attention attempt to account for many aspects of attention including both reflexive and voluntary processes (for a review, see Itti and Koch, 2002). Our model differs in several respects. First, the model is focused and restricted. We attempt to explain only reflexive spatial attentional effects. Second, most previous computational models of attention propose a unique “saliency” or “master” map (for example, Koch and Ullman, 1985) that is used to control and maintain a single attentional focus. Although many models are based on a saliency map, Desimone and Duncan (1995) have argued that saliency is not explicitly represented by neurons in a specific saliency map, but instead is implicitly encoded in a distributed modulatory manner across various feature maps. We propose here a mechanism for reflexive spatial attention that is present in many areas of the brain and does not require implementation in a unique master saliency map. Additionally, in order to allow a model to rapidly shift attentional focus without being bound to attend only to the location of maximal saliency at any given time, various computational models have implemented IOR as a trigger of transient inhibitory conductances in the saliency map at the currently attended location (Itti and Koch, 2002). Instead, we argue that both excitatory and inhibitory reflexive spatial attentional effects are specific and local to the areas that are responding.
Conclusions

In sum, the first neurophysiological evidence recorded from neurons in the SC suggests that the second presentation of a stimulus results in a reduced neuronal response. The magnitude of the reduced response is correlated with behavioural response and IOR. We suggest here that this reduced response of neurons in SC is similar to repetition suppression effects previously reported in ventral stream areas. We also show that repetition suppression effects are present and of equal magnitude in a dorsal stream area, LIP, that is involved in eye movements and attention and also has dense projections to the SC. Such effects may be a pervasive feature of many brain regions. Further, given that multiple brain regions represent different aspects of visual stimuli, repetition suppression effects are likely specific to particular features that each area represents. For this reason, they have become a powerful tool in fMRI research to explore stimulus-specific neuronal representations. We have developed a simple neural model showing that repetition suppression effects can account for spatial and shape dependent reflexive spatial attentional cueing effects, providing a plausible neurophysiological mechanism for reflexive spatial attention that accounts for the time course and valence (facilitation and inhibition) of both spatial and shape effects.
Acknowledgements

This work was supported in part by grants from the NIH (National Institutes of Health) (R01 MH 065492 and R01 MH 63340), the National Science Foundation, and the NARSAD (National Alliance for Research on Schizophrenia and Depression) Essel Investigator Award. The authors would like to thank Margaret Sereno for comments on earlier versions of this manuscript.
References Abbott, L. F., J. A. Varela, K. Sen, and S. B. Nelson. 1997. “Synaptic depression and cortical gain control”, Science, 275 (5297): 220–24. Bell, A. H., J. H. Fecteau, and D. P. Munoz. 2004. “Using auditory and visual stimuli to investigate the behavioral and neuronal consequences of reflexive covert orienting”, Journal of Neurophysiology, 91 (5): 2172–184. Briand, K. A., D. Strallow, W. Hening, H. Poizner, and A. B. Sereno. 1999. “Control of voluntary and reflexive saccades in Parkinson’s disease”, Experimental Brain Research, 129 (1): 38–48. Brown, M. W. and J.-Z. Xiang. 1998. “Recognition memory: neuronal substrates of the judgement of prior occurrence”, Progress in Neurobiology, 55 (2), 149–89. Brown, M. W., F. A. Wilson, and I. P. Riches. 1987. “Neuronal evidence that inferomedial temporal cortex is more important than hippocampus in certain processes underlying recognition memory”, Brain Research, 409 (1): 158–62. Corballis, M. C., and C. Armstrong. 2007. “Repetition blindness is orientation blind”. Memory and Cognition, 35 (2): 372–80. Corbetta, M., F. M. Miezin, G. L. Shulman, and S. E. Petersen. 1993. “A PET study of visuospatial attention”, Journal of Neuroscience, 13 (3): 1202–26. Desimone, R. 1992. “The physiology of memory: recordings of things past”, Science, 258 (5080): 245–46. ———. 1996. “Neural mechanisms for visual memory and their role in attention”, Proceedings of the National Academy of Sciences of the United States of America, 93 (24): 13494–99. Desimone, R., and J. Duncan. 1995. “Neural mechanisms of selective visual attention”, Annual Review of Neuroscience, 18: 193–222. Dorris, M. C. and D. P. Munoz. 1998. “Saccadic probability influences motor preparation signals and time to saccadic initiation”, Journal of Neuroscience, 18 (17): 7015–26.
Reflexive Spatial Attention
129
Dorris, M. C., M. Pare, and D. P. Munoz. 1997. “Neuronal activity in monkey superior colliculus related to the initiation of saccadic eye movements”, Journal of Neuroscience, 17 (21): 8566–79. Dorris, M. C., R. M. Klein, S. Everling, and D. P. Munoz. 2002. “Contribution of the primate superior colliculus to inhibition of return”, Journal of Cognitive Neuroscience, 14: 1256–63. Egeth, H. E. and S. Yantis. 1997. “Visual attention: control, representation, and time course”, Annual Review of Psychology, 48: 269–97. Epstein, R., K. S. Graham, and P. E. Downing. 2003. “Viewpoint-specific scene representations in human parahippocampal cortex”, Neuron, 37 (5): 865–76. Fahy, F. L., I. P. Riches, and M. W. Brown. 1993. “Neuronal activity related to visual recognition memory: long-term memory and the encoding of recency and familiarity information in the primate anterior and medial inferior temporal and rhinal cortex”, Experimental Brain Research, 96 (3): 457–72. Fecteau, J. H., A. H. Bell, and D. P. Munoz. 2004. “Neural correlates of the automatic and goal-driven biases in orienting spatial attention”, Journal of Neurophysiology, 92 (3): 1728–37. Fecteau, J. H. and D. P. Munoz. 2007. “Warning signals influence motor processing”, Journal of Neurophysiology, 97 (2): 1600–609. Felleman, D. J. and D. C. Van Essen. 1991. “Distributed hierarchical processing in the primate cerebral cortex”, Cerebral Cortex, 1 (1): 1–47. Grill-Spector, K., R. Henson, and A. Martin. 2006. “Repetition and the brain: neural models of stimulusspecific effects”, Trends in Cognitive Sciences, 10 (1): 14–23. Grill-Spector, K., T. Kushnir, S. Edelman, G. Avidan, Y. Itzchak, and R. Malach. 1999. “Differential processing of objects under various viewing conditions in the human lateral occipital complex”, Neuron, 24 (1): 187–203. Grossberg, S. 1972. “A neural theory of punishment and avoidance, II: Quantitative theory”, Mathematical Biosciences, 15: 253–85. Henson, R. N. A. 2003. 
“Neuroimaging studies of priming”. Progress in Neurobiology, 70 (1): 53–81. Holscher, C. and E. T. Rolls. 2002. “Perirhinal cortex neuronal activity is actively related to working memory in the macaque”, Neural Plasticity, 9 (1): 41–51. Itti, L. and C. Koch. 2002. “Computational modeling of visual attention”, Nature Reviews Neuroscience, 2 (3): 194–203. Kanwisher, N. G. 1987. “Repetition blindness: type recognition without token individuation”, Cognition, 27 (2): 117–43. Khatoon, S., K. A. Briand, and A. B. Sereno. 2002. “The role of response in spatial attention: direct versus indirect stimulus–response mappings”, Vision Research, 42 (24): 2693–708. Klein, R. M. 2000. “Inhibition of return”, Trends in Cognitive Sciences, 4 (4): 138–47. Klein, R. M., A. Kingstone, and A. Pontefract. 1992. “Orienting of visual attention”, in K. Rayner (ed.), Eye Movements and Visual Cognition. New York: Springer Verlag. Koch, C. and S. Ullman. 1985. “Shifts in selective visual attention: towards the underlying neural circuitry”, Human Neurobiology, 4 (4): 219–27. Kourtzi, Z. and N. Kanwisher. 2001. “Representation of perceived object shape by the human lateral occipital complex”, Science, 293 (5534): 1506–509. Kwak, H. W. and H. Egeth. 1992. “Consequences of allocating attention to locations and to other attributes”, Perception and Psychophysics, 51 (5): 455–64. Lehky, S. R. and A. B. Sereno. 2007. “Comparison of shape encoding in primate dorsal and ventral visual pathways”, Journal of Neurophysiology, 97 (1): 307–19. Lo, C. C. and X. J. Wang. 2006. “Cortico-basal ganglia circuit mechanism for a decision threshold in reaction time tasks”, Nature Neuroscience, 9 (7): 956–63. Logothetis, N. K. and D. L. Sheinberg. 1996. “Visual object recognition”, Annual Review of Neuroscience, 19: 577–621.
130 Anne B. Sereno et al. Miller, E. K. and R. Desimone. 1994. “Parallel neuronal mechanisms for short-term memory”, Science, 263 (5146): 520–22. Miller, E. K., L. Li, and R. Desimone. 1993. “Activity of neurons in anterior inferior temporal cortex during a short-term memory task”, Journal of Neuroscience, 13 (4): 1460–78. Munoz, D. P. 2002. “Commentary: saccadic eye movements: overview of neural circuitry”, Progress in Brain Research, 140: 89–96. Ogmen, H. 1993. “A neural theory of retino-cortical dynamics”, Neural Networks, 6 (2): 245–73. Ogmen, H., B. G. Breitmeyer, and R. Melvin. 2003. “The what and where in visual masking”, Vision Research, 43 (12): 1337–50. Patel, S. S., X. Peng, and A. B. Sereno. 2007. “Shape effects on reflexive spatial selective attention and inhibition of return”, paper presented at the 37th annual meeting of the Society for Neuroscience. Pierrot-Deseilligny, C., A. Rosa, K. Masmoudi, S. Rivaud, and B. Gaymard. 1991. “Saccade deficits after a unilateral lesion affecting the superior colliculus”, Journal of Neurology, Neurosurgery and Psychiatry, 54 (12): 1106–109. Posner, M. I. 1980. “Orienting of attention”, Quarterly Journal of Experimental Psychology, 32 (1): 3–25. Posner, M. I., Y. Cohen, and R. D. Rafal. 1982. “Neural systems control of spatial orienting”, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 298 (1089): 187–98. Posner, M. I. and Y. A. Cohen. 1984. “Components of visual orienting”, in H. Bouma and D. G. Bouhuis (eds), Attention and Performance X, pp. 531–54. Hillsdale, NJ: Laurence Earlbaum and Associates. Posner, M. I., R. D. Rafal, L. S. Choate, and J. Vaughan. 1985. “Inhibition of return: neural basis and function”, Cognitive Neuropsychology, 2 (3): 211–28. Purushothaman, G., D. Lacassagne, H. E. Bedell, and H. Ogmen. 2002. “Effect of exposure duration, contrast and base blur on coding and discrimination of edges”, Spatial Vision, 15 (3): 341–76. Rafal, R. D., P. A. 
Calabresi, C. W. Brennan, and T. K. Sciolto. 1989. “Saccade preparation inhibits reorienting to recently attended locations”, Journal of Experimental Psychology: Human Perception and Performance, 15 (4): 673–85. Rafal, R. D., M. I. Posner, J. H. Friedman, A. W. Inhoff, and E. Bernstein. 1988. “Orienting of visual attention in progressive supranuclear palsy”, Brain, 111 (Part 2): 267–80. Riggio, L., I. Patteri, and C. Umilta. 2004. “Location and shape in inhibition of return”, Psychological Research, 68 (1): 41–54. Ringo, J. L. 1996. “Stimulus specific adaptation in inferior temporal and medial temporal cortex of the monkey”, Behavioural Brain Research, 76 (1–2): 191–97. Ro, T. and R. D. Rafal. 1999. “Components of reflexive visual orienting to moving objects”, Perception and Psychophysics, 61 (5): 826–36. Rosen, A. C., S. M. Rao, P. Caffarra, A. Scaglioni, J. A. Bobholz, S. J. Woodley, et al. 1999. “Neural basis of endogenous and exogenous spatial orienting: a functional MRI study”, Journal of Cognitive Neuroscience, 11 (2), 135–52. Sapir, A., N. Soroker, A. Berger, and A. Henik. 1999. “Inhibition of return in spatial attention: direct evidence for collicular generation”, Nature Neuroscience, 2 (12): 1053–54. Schacter, D. and R. L. Buckner. 1998. “Priming and the brain”, Neuron, 20 (2): 185–95. Sereno, A. B. and J. H. Maunsell. 1998. “Shape selectivity in primate lateral intraparietal cortex”, Nature, 395 (6701): 500–03. Sereno, A. B. and P. S. Holzman. 1996. “Spatial selective attention in schizophrenic, affective disorder, and normal subjects”, Schizophrenia Research, 20 (1–2): 33–50. Sereno, A. B. and S. C. Amador. 2006. “Attention and memory related responses of neurons in the lateral intraparietal area during spatial and shape delayed match-to-sample tasks”, Journal of Neurophysiology, 95 (2): 1078–98.
Reflexive Spatial Attention
131
Sereno, A. B., C. B. Jeter, V. Pariyadath, and K. A. Briand. 2006a. “Dissociating sensory and motor components of inhibition of return”, Scientific World Journal, 6: 862–87. Sereno, A. B., K. A. Briand, S. C. Amador, and S. V. Szapiel. 2006b. “Disruption of reflexive attention and eye movements in an individual with a collicular lesion”, Journal of Clinical and Experimental Neuropsychology, 28 (1): 145–66. Sereno, M. E., T. Trinath, M. Augath, and N. K. Logothetis. 2002. “Three-dimensional shape representation in monkey cortex”, Neuron, 33 (4): 635–52. Sobotka, S. and J. L. Ringo. 1993. “Investigation of long-term recognition and association memory in unit responses from inferotemporal cortex”, Experimental Brain Research, 96 (1): 28–38. Tanaka, K. 1996. “Inferotemporal cortex and object vision”, Annual Review of Neuroscience, 19: 109–39. Taylor, T. L. and R. M. Klein. 2000. “Visual and motor effects in inhibition of return”, Journal of Experimental Psychology: Human Perception and Performance, 26 (5): 1639–56. Tipper, S. P., R. Rafal, P. A. Reuter-Lorenz, Y. Starrveldt, T. Ro, R. Egly, et al. 1997. “Object-based facilitation and inhibition from visual orienting in the human split-brain”, Journal of Experimental Psychology: Human Perception and Performance, 23 (5): 1522–32. Ungerleider, L. G. and M. Mishkin. 1982. “Two cortical systems”, in D. J. Ingle, M. A. Goodale and R. Mansfield (eds), Analysis of Visual Behavior, pp. 549–86. Cambridge, MA: MIT Press. Wiggs, C. L. and A. Martin. 1998. “Properties and mechanisms of perceptual priming”, Current Opinion in Neurobiology, 8 (2): 227–33. Wright, R. D. and L. M. Ward. 1998. “The control of visual attention”, in R. D. Wright (ed.), Visual Attention, pp. 132–86. New York: Oxford University Press. Xu, Y., N. B. Turk-Browne, and M. M. Chun. 2007. “Dissociating task performance from fMRI repetition attenuation in ventral visual cortex”, Journal of Neuroscience, 27 (22): 5981–85.
Chapter 8 Effects of Emotions on Selective Attention and Control
Narayanan Srinivasan, Shruti Baijal, and Neha Khetrapal
INTRODUCTION
Emotions consist of multiple components that include subjective feelings, a behavioural reaction, cognitive appraisal and accompanying physiological reactions (Cacioppo et al., 1993). William James (1894) regarded emotions as adaptive behavioural and physiological response tendencies that are called forth directly by evolutionarily significant situations. Carver and Scheier (1990) have viewed emotion as the readout of a system that monitors the rate at which the discrepancy between a goal and reality is being decreased. According to them, positive emotion signals a rate of discrepancy reduction that is faster than expected; negative emotion signals a rate that is slower than expected. Certain contrasting positions have been taken with respect to the relationship between cognitive processes and emotion. While some claim emotions to be completely independent of cognition (Zajonc, 1980), others posit that cognitive appraisals are invariably necessary for the production of emotion (Lazarus, 1991). Recently the view that cognitive processes are closely related to emotion has been steadily gaining ground (Izard, 1993; Rolls, 1990, 2005). The close interaction of cognition and emotion makes it difficult to dissociate the two sets of processes (Halgren, 1992). Theories of emotion differ regarding whether emotional evaluation of events is considered to be automatic or a conscious process (Damasio, 1994; Ekman, 1992; LeDoux, 1993; Thompson, 1988). In order to determine a goal for one’s action, every type of cognitive operation needs to be involved. According to Rolls (2005) cognitive operations produce emotions at three levels. The first is the implicit level, where an unlearned reinforcer may lead to emotions. The second level is where a (first order) syntactic symbol processing system performs computation to implement planning for the identification of a rewarding or punishing
outcome. The third level is the higher order linguistic thought level where thinking about and evaluating the operations of the first order linguistic processor may result in a reinforcing outcome. Another way in which cognition influences emotion is through cognitive states at the level of language that modulate subjective and brain responses to affective stimuli. De Araujo et al. (2005) showed that a word label influences the pleasantness ratings and the activations of the secondary olfactory cortex located in the orbitofrontal cortex (where the reinforcing value of rewards and punishers is initially represented). This type of top-down modulation occurs in a way that is analogous to the top-down attentional effects (Rolls and Deco, 2002). This fits in with arguments for reciprocal links between emotion and attention (Vuilleumier, 2002). Attentional processes have been characterized in terms of three different functions subserved by different attentional networks (Jackson et al., 1994). The alerting network (involving right fronto–parietal areas) functions to maintain an organism’s readiness to react to stimuli by suppressing background neural noise through the inhibition of irrelevant mental activity. The orienting network (involving posterior parietal lobe and thalamus) functions to motorize special neural operations needed to bring attention to the relevant visual location (visual search) and thus bind signals. The executive network (involving anterior cingulate, left lateral frontal lobe and basal ganglia) coordinates multiple specialized neural processes, for example, target detection to direct behaviour. The anterior cingulate helps in target detection by boosting activation of signals for feature selection in the extrastriate regions (Posner and DiGirolamo, 1998). These functions of the attentional system facilitate activities in different brain regions to increase performance. 
Given the putative interactions between emotion and attention, we first discuss the interaction between emotions and selective attention in the next section. This is followed by a discussion on emotions and executive control.
EMOTION AND SELECTIVE ATTENTION
Processes of selective attention and emotion operate together in prioritizing thoughts and actions. Abundant evidence suggests that emotionally salient stimuli can determine how visual attention is allocated. Compton (2003) suggests that a primary way to determine the importance of a stimulus is to evaluate its emotional significance. She argues that stimuli deemed emotionally significant receive enhanced processing and that this occurs via the operation of two attentional mechanisms: one that evaluates emotional significance preattentively or “automatically”, and another that gives these significant stimuli priority in the competition for selective attention. Although the emotional value of stimuli may differ between individuals, there are some stimuli, such as snakes, spiders, and human faces, that are emotionally significant to most individuals. Faces are probably the most biologically and socially significant visual stimuli in the human environment, and might therefore be expected to receive enhanced processing. Emotional facial expressions are
effective communicators despite significant language and cultural differences and they provide coarse evaluations of one’s underlying emotional state (Johnson-Laird and Oatley, 1992). They are highly salient, easily detected and are able to communicate social information rapidly (Etcoff, 1984). People are more likely to attend to faces than to other more common objects under some, but not all, circumstances. Converging evidence also strongly suggests that emotional, particularly threatening, facial expressions receive enhanced processing (Eastwood et al., 2001; Fox et al., 2002; Frischen et al., 2008; Smilek et al., 2000). This enhanced processing appears to be mediated by direct re-entrant processing from the amygdala to all cortical visual areas and via interacting prefrontal attentional networks that preferentially allocate spatial attention to emotional, especially threatening, facial expressions (Levesque et al., 2003; Ochsner, 2004; Palermo and Rhodes, 2007). The emotional significance and neural specificity of face processing make faces ideal candidates for automatic or preattentive processing (Öhman, 2002; Öhman and Mineka, 2001).
FACIAL EXPRESSIONS AND AUTOMATICITY
Automatic processes have some or all of the following characteristics: they are rapid (Batty and Taylor, 2003; Öhman, 2002), non-conscious (Öhman, 2002), mandatory (Wojciulik et al., 1998) and capacity-free, requiring minimal attentional resources. Studies in cognitive neuroscience have explored the rapid nature of face processing. Psychophysiological studies using ERP (Event Related Potentials) and MEG (Magnetoencephalography) also suggest that emotional information from faces is rapidly registered and discriminated, from as early as 80 ms after stimulus onset (Oram and Perrett, 1992; Sugase et al., 1999; Liu et al., 2002). Lower thresholds are needed to detect faces when they are unambiguously emotional (Calvo and Esteves, 2005). Basic facial expressions such as fear and happiness appear to be rapidly categorized and identified (Batty and Taylor, 2003). People are able to distinguish between positive and negative facial expressions even when they are not aware of the stimuli. Information from faces that are not consciously perceived or attended may be conveyed via subcortical pathways to the amygdala. This information may only be sufficient to discriminate emotional from unemotional faces (Williams et al., 2004) or more arousing from less arousing facial expressions (Killgore and Yurgelun-Todd, 2004), with attention needed for more precise (Anderson et al., 2003), and perhaps conscious (Pessoa et al., 2006), discrimination. Fearful faces are rapidly, preferentially, non-consciously and mandatorily registered by the amygdala with little or no reliance upon attentional resources. Consistent with the view that emotional stimuli are processed pre-attentively, the detection of threatening stimuli is associated with flat search slopes in visual search tasks (Öhman et al., 2002). The non-conscious recognition of facial expression has also been examined by backward masking paradigms.
In visual backward masking paradigms, a briefly presented
target can be rendered invisible if immediately followed by a second “masking stimulus”. Affective priming studies typically present a brief face prime (15 ms), followed by a neutral stimulus to be evaluated. In situations where the hidden target stimulus is an emotional item, for example a conditioned angry face or a spider, preserved processing can be indexed by differential skin conductance responses (SCRs) to fear-relevant compared with fear-irrelevant targets even though the target stimulus is not perceived (Esteves et al., 1994). The expression depicted by the prime influences judgements of the neutral stimuli; for instance, meaningless symbols are rated as more appealing when they follow brief happy, rather than angry, faces (Rotteveel et al., 2001; Wong and Root, 2003). The participants were also found to be subjectively unaware of the face primes (that is, they report no knowledge of the facial expressions). However, emerging evidence has challenged the view of automaticity of emotions. Recent studies demonstrate that emotional perception cannot proceed when a competing task is made sufficiently demanding, thereby depleting attentional resources (Pessoa et al., 2002, 2006). An fMRI study revealed that a differential response in the amygdala to masked fearful faces relative to masked neutral ones was observed only when participants were aware of the faces. Under conditions of unawareness, no amygdalar response to fearful faces was observed (Pessoa et al., 2006).
EMOTIONAL EFFECTS IN SELECTIVE ATTENTION TASKS
A large number of studies have explored the interactions between emotion and attention (Eastwood et al., 2001; Fenske and Raymond, 2006; Frischen et al., 2008; Vuilleumier et al., 2003). A popular method to investigate the interaction between attention and emotion has been the visual search paradigm. In visual search, the time to detect a given target usually increases as a linear function of the number of distractors, indicating serial attentive inspection of every stimulus in the display. However, if target recognition can occur without attention at earlier parallel processing stages, then attention is automatically drawn to the target (“pop-out”) and detection times are largely independent of the number of distractors. Recent studies (Frischen et al., 2008) using a visual search task showed that schematic faces with a negative expression (angry) were found more quickly than similar faces with a positive expression (happy). A faster detection of negative faces therefore suggests that their potential threatening value can be extracted prior to selective spatial attention and can heighten their saliency. All these studies employed schematic faces instead of real photographs with the aim of minimizing potential confounds due to physical differences in image properties. Although ecologically impoverished, such stimuli are known to effectively convey subjective emotional meaning and produce reliable face-specific neural responses in functional neuroimaging and electrical potentials similar to veridical photographs. Moreover, in some studies, an advantage for detecting angry faces was not obtained when the stimuli were inverted or when only some features were presented in isolation, suggesting that the effect did not result from just the low-level cues
(downward mouth curves) but rather requires some holistic processing of the whole face configuration (Vuilleumier, 2002). A few studies have explored asymmetries in performance with sad faces compared to happy faces. Eastwood et al. (2001) employed a visual search task with 7, 11, 15 and 19 schematic faces in which participants searched for a unique target face (with positive or negative expression) among neutral distractor faces. They measured the slopes and found that the slopes of the search functions for the negative (sad) faces were smaller than the slopes of the search functions for the positive (happy) faces. They also showed that this effect disappeared when the search task was performed with inverted faces. We performed a similar experiment with happy and sad schematic faces and have found similar results with a detection task as well as a discrimination task (see Figures 8.1 and 8.2). In the detection task, participants detected a target emotional face (sad in one block and happy in another block) among neutral distractors. In the discrimination task, observers had to identify the given target face as sad or happy. With both types of task, we found that sad faces were identified faster than happy faces. However, this advantage for schematic faces was present only when there were multiple targets, and there was no difference in response time for a single sad or happy schematic face. Search slopes for the discrimination task were higher than for the detection task but there was no indication of a pop-out like effect even when a sad face was the target among neutral distractors in a detection task.

FIGURE 8.1 Search times for sad and happy schematic faces in a detection task
FIGURE 8.2 Search times for sad and happy schematic faces in a discrimination task

In a recent visual search study with threatening and non-threatening facial expressions (Williams et al., 2005), search times were measured with four and eight distractors for happy, sad, fearful and angry faces. Williams et al. used photographs of faces in their visual search study. Contrary to Eastwood et al. (2001) and our results, they found that sad faces were not detected faster than happy faces. In fact, the effect was reversed, with happy faces detected faster than sad faces. Williams et al. (2005) have argued that this could be due to the presence of certain distinctive features in happy faces that may be responsible for faster search times in their experiment. Further studies are needed to explore the differences in visual search with sad and happy schematic and real faces. Studies in our lab also indicated that emotional information modulates the dynamics of visual attention (Srivastava and Srinivasan, 2010; Srinivasan and Gupta, in press). With an attentional dwell time paradigm, Srivastava and Srinivasan (2010) have shown that emotional faces interact with the time course of attention, with performance being better for happy faces than for sad faces under conditions of less attentional resources. We have explored emotion-attention interactions in the context of attentional load by looking at recognition memory for distractor faces. The results show that happy faces were detected better than sad faces under conditions of distributed attention or less attentional resources. Studies on spatial attention show facilitation at short cue-to-target intervals and inhibition of return (IOR) at longer cue-to-target intervals. Given that negative facial expressions
capture attention much faster than positive facial expressions, Baijal and Srinivasan (submitted) have explored the effect of emotional stimuli on the inhibitory processes of exogenous attention using an ERP study. The study employed a peripheral double cueing paradigm using schematic faces with happy and sad expressions in which the exogenous cue was uninformative and did not predict the location of target appearance. The experiment consisted of a 100 ms exogenous cue (a square frame) appearing on one of the left, right, top or bottom locations (McAuliffe et al., 2006). Following a delay of 350 ms interrupted by a central cue (100 ms) the target face appeared at one of the four mentioned locations with equal likelihood. The participants had to simply detect the presence of the target. The results reveal that sad faces were detected faster than happy faces. There was a hemispheric asymmetry observed for behavioural inhibition of return. The facial affect influenced attentional processes at early stages of processing as seen from the negative deflections appearing at around 120 to 180 ms in the ERPs. The amplitudes for the negative peak whose incidence is predominant at the parietal-occipital sites depended on emotional expression, cuing and location of the target. The inhibition of return was obtained for positive peaks occurring in the range 250 to 350 ms with large peaks in the invalid condition compared to valid condition, the effects being maximal at the frontal sites. The study provides further supporting evidence for the modulation of attentional processing by facial affect. Functional imaging studies have also studied the interaction between emotions and attentional processes in the brain using two different approaches. The first approach has investigated how attentional resources are allocated when there is competition between task-relevant stimulus attributes and concurrent emotional information that elicits a prepotent sensory or response bias. 
Studies have investigated the processing of emotional facial expression under conditions of attention and inattention during spatial cueing tasks in which other neutral stimuli were simultaneously presented. Vuilleumier et al. (2001) showed that amygdala responses to displays of facial affect were not modulated by visuospatial attention, but responses in other brain regions (for example, fusiform gyrus) were. In contrast, Pessoa et al. (2002) showed that responses to emotional faces in both the amygdala and fusiform gyrus were modulated by attention when resources were presumably more exhausted. Both of these studies showed an interaction between emotional and attentional processing in the anterior cingulate gyrus. The second approach has been to investigate how the neural representation of an emotional stimulus changes when participants selectively attend to their emotional responses to the stimulus versus some other stimulus features (for example, spatial attributes). Lane et al. (1997) showed that when participants attended to their own emotional responses (pleasant/neutral/unpleasant) to a stimulus, there was greater anterior cingulate gyrus activity than when participants were attending to the spatial setting of the stimulus. These neuroimaging studies, along with the behavioural studies, show that emotional information clearly interacts with processes involved in selective attention.
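The search-slope measure used in the visual search studies discussed in this section can be made concrete with a short sketch. The Python example below fits a straight line to mean response time as a function of display size by ordinary least squares; the slope (in ms per item) is the quantity compared between sad and happy targets. The function name `search_slope` and all response times are hypothetical and chosen only for illustration; they are not data from Eastwood et al. (2001) or from the experiments reported here.

```python
from statistics import mean


def search_slope(set_sizes, mean_rts_ms):
    """Return (slope, intercept) of the least-squares line RT = slope * set_size + intercept."""
    if len(set_sizes) != len(mean_rts_ms) or len(set_sizes) < 2:
        raise ValueError("need at least two (set size, RT) pairs of equal length")
    x_bar = mean(set_sizes)
    y_bar = mean(mean_rts_ms)
    # Sum of squared deviations of set size, and the cross-product with RT.
    sxx = sum((x - x_bar) ** 2 for x in set_sizes)
    if sxx == 0:
        raise ValueError("set sizes must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(set_sizes, mean_rts_ms))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Display sizes follow the 7, 11, 15 and 19 item design described in the text.
SET_SIZES = [7, 11, 15, 19]
# Hypothetical mean RTs in ms, invented for illustration only.
SAD_RTS = [650.0, 690.0, 730.0, 770.0]
HAPPY_RTS = [700.0, 780.0, 860.0, 940.0]

sad_slope, sad_intercept = search_slope(SET_SIZES, SAD_RTS)
happy_slope, happy_intercept = search_slope(SET_SIZES, HAPPY_RTS)
print(f"sad: {sad_slope:.1f} ms/item, happy: {happy_slope:.1f} ms/item")
```

With these made-up values the slope for sad targets is shallower than the slope for happy targets, which is the pattern described above; a slope close to zero would correspond to pop-out.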
EMOTIONS AND COGNITIVE CONTROL
Cognitive control refers to the ability to guide thought and action in accord with internal intentions (Botvinick et al., 2001; Cohen et al., 2000) and encompasses those processes necessary for controlled information processing and coordinated actions. Current cognitive neuroscience theories distinguish between at least two important components of cognitive control: a regulative or strategic component responsible for activation and implementation of control processes, and an evaluative component responsible for monitoring the need for control and signalling when adjustments in control are necessary (Botvinick et al., 2001). Regulative processes are those that are involved in the top-down control of cognition and include functions such as representing and maintaining task context and goals, the allocation of attention, preparation to execute cognitive tasks and processes that override prepotent response tendencies (Cohen et al., 1999). An example of a task that requires cognitive control is the Stroop colour-word task (MacLeod, 1991), where the participants must report one stimulus attribute (that is, word or printed colour) and ignore the other. The participants show robust interference effects, wherein there are increased reaction times (RT) when participants are required to name the colour of the word in the incongruent condition (conflict condition, for example, the word GREEN printed in blue) compared to the congruent condition (no conflict, for example, word GREEN printed in green). The ability to successfully complete this task and overcome the “conflict” illustrates efficient allocation of attentional resources thereby selecting a task-relevant response in the face of competition from an otherwise more automatic but task-irrelevant option, a fundamental aspect of cognitive control (Miller and Cohen, 2001). The emotion-based version of the Stroop task is the affective Stroop task.
The affective Stroop task is an adaptation of the number Stroop task of Pansky and Algom (2002). In this task, the participants are presented with two numerical displays and their task is to determine which of the two contained a greater number of targets. During the affective version of the task, positive, negative or neutral picture distracters are inserted between the numerical displays. The presence of emotional distracters (positive and negative) disrupts performance on the task compared to non-emotional distracters (neutral) (Blair et al., 2007; Mitchell et al., 2006). Moreover, the activity in brain regions such as the amygdala and inferior frontal gyrus that participate in emotional responding and re-appraisal (Levesque et al., 2003; Ochsner, 2004) is suppressed during concurrent task performance (Blair et al., 2007). Interestingly, the emotional distracters did not differentially disrupt the two levels of conflict (congruent and incongruent trials), nor did the executive task performance differentially affect the positive and negative emotions. The finding was also supported by fMRI results that showed that regions implicated in task performance were unaffected by emotional relative to neutral distracters (Blair et al., 2007). Stroop interference tasks appear to engage different subdivisions of the anterior cingulate gyrus
depending on the context created by the target stimuli (aversive or neutral); during an emotional Stroop task, the effects are localized to more ventral regions of the anterior cingulate gyrus (Bush et al., 2000). Also, when subjects are presented with threat-related colour words, automatic processing of word meaning delays naming of the words’ colour (Mathews and MacLeod, 1985). A second set of processes essential for cognitive control are those involved in the evaluation of one’s performance and include functions such as detection of processing conflicts and error monitoring. These processes are believed to play a crucial role in signalling for adjustments of top-down control needed to adapt to constantly changing task demands (Kerns et al., 2004). Some of these processes have been observed in the Stroop task itself as discussed earlier. The participants in the Stroop task showed greater interference on the initial trials compared to subsequent trials in a series (Botvinick et al., 2001). Additionally, participants performing a modified Stroop task showed less interference on conflict trials when such trials were frequent than when they were rare (Lindsay and Jacoby, 1994). During maintenance of the context of a task, the dorsolateral prefrontal cortex (DLPFC) was more active, which is consistent with its role in preparing to perform a demanding task. In contrast, anterior cingulate cortex (ACC) activity was increased upon presentation of conflict trials, consistent with a role in detection of response conflict (MacDonald et al., 2000), which marks a clear distinction between regulative and evaluative processes of cognitive control. West (2003) reported a temporal dissociation between the regulative and evaluative component processes of cognitive control. Following presentation of task instruction, implementation of regulative processes was exhibited by an occipital–parietal slow-wave that differentiated correct and incorrect response trials.
Implementation of control was associated with a frontal slow-wave ERP modulation that differentiated more attentionally demanding trials, where participants prepare to override the more automatic response, from less demanding trials. Conflict detection was associated with a fronto–central N450 with greater amplitude for incongruent than congruent trials. A frontal slow-wave component occurring 500 ms to 600 ms after incongruent stimulus presentation, known as the conflict slow-wave potential (conflict SP), was also reported; it reflects allocation of increased attentional resources in preparation for future incongruent trials. Another common paradigm used for studying the processes underlying cognitive control is the flanker task, in which participants are instructed to make judgements about a target stimulus and ignore other stimuli that flank the target. Reaction times are faster for targets flanked by stimuli that are identical to the target, or that have been assigned the same response, than for targets flanked by different stimuli that have not been assigned the same response. This pattern of results is called the “flanker compatibility effect”. The magnitude of the flanker compatibility effect has been shown to depend on differences between targets and flankers in their relative physical or conceptual features.
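As a minimal illustration of how the flanker compatibility effect can be quantified, the sketch below subtracts the mean reaction time on compatible trials from the mean reaction time on incompatible trials. The function name and the trial values are hypothetical and serve only to make the arithmetic concrete; they are not data from any study cited here.

```python
from statistics import mean


def flanker_compatibility_effect(trials):
    """Return mean RT (ms) on incompatible trials minus mean RT on compatible trials."""
    compatible = [rt for condition, rt in trials if condition == "compatible"]
    incompatible = [rt for condition, rt in trials if condition == "incompatible"]
    return mean(incompatible) - mean(compatible)


# Hypothetical trials: (flanker condition, reaction time in ms)
example_trials = [
    ("compatible", 480), ("compatible", 500), ("compatible", 520),
    ("incompatible", 540), ("incompatible", 560), ("incompatible", 580),
]

effect_ms = flanker_compatibility_effect(example_trials)
```

A positive value indicates slower responses with incompatible flankers, that is, interference from the flankers.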
Selective Attention and Control
Some event-related potential (ERP) components have been identified as critical for observing the temporal course of neural activity underlying cognitive control. ERPs can be used to temporally dissociate two component processes of cognitive control: conflict detection and error monitoring. For instance, using tasks that invoke processing conflicts (for example, the Stroop task or the flanker task), investigators found a reliably evoked late fronto–central ERP component referred to as the N450 or N2 (van Veen and Carter, 2002a, 2002b; West, 2003). These ERP components are larger when cognitive conflict is high than when it is low (Grapperon et al., 1988; Liotti et al., 2000; Rebai et al., 1997). Source localization algorithms have identified regions of the ACC as the neural generators of the N450 and N2 components (van Veen and Carter, 2002a, 2002b; West, 2003). These results implicate the N450 and N2 components as neurobiological indices of conflict detection, and support the role of the ACC as a conflict detection mechanism. The detection of errors has also been investigated using ERPs. It has been associated with a midline fronto–central negative deflection in response-locked ERPs that occurs within 100 ms of committing an error. This negative deflection is known as the error-related negativity (ERN), and is the first identified neurobiological index of performance monitoring (Dehaene et al., 1994; Falkenstein et al., 2000). The ERN has been obtained across various stimulus and response modalities, and following errors of omission as well as errors of commission (Falkenstein et al., 2000). Its amplitude is greatest when participants are aware that an error has been made (Luu et al., 2000; Scheffers and Coles, 2000). This is consistent with the action of an error detector or performance monitor in cognitive control.
The ERN is also observed when participants make “partial” errors (begin to make an error but spontaneously correct themselves), and is greater in amplitude on trials with “partial” errors than on error trials where the conflict is not produced in time to spontaneously correct the response (Gehring et al., 1993). These results implicate the ERN as an on-line index of error detection or performance monitoring, a function that is critical for monitoring behaviour and preventing undesirable actions. There has, however, been debate over the exact function of the ERN. Research has not clarified whether the ERN is caused by the error itself or by processing conflicts that produce uncertainty and predispose to error. Some studies have failed to find a correlation between post-error slowing and the ERN, and some have failed to document such a slowing effect at all (Hajcak et al., 2003). Dipole-modelling techniques have localized the neural activity associated with the ERN to the ACC (Holroyd et al., 1998; van Veen and Carter, 2002b), which coincides with the source of conflict-related activity. Van Veen and Carter (2002a) reviewed multiple studies of the N2 and ERN components and concluded that both components reflect similar conflict detection processes: the N2 reflects pre-response detection of conflict between competing response tendencies, while the ERN reflects post-response detection of incompatible responses on error trials.
The magnitude of the flanker compatibility effect also varies depending upon the emotional information that is conveyed by the stimuli. It has been found that a face expressing negative emotion draws attention to itself more strongly than one that shows positive emotion. Incongruent flankers in this case produce less interference for target faces displaying negative affect than for target faces displaying positive emotion. Thus, responses to target faces showing negative affect show a smaller flanker compatibility effect. These results suggest that negative faces constrict the focus of attention more effectively than positive faces (Fenske and Eastwood, 2003). Some studies examining the relationship between emotional processing and cognitive control do not examine how the two hemispheres interact with targets displaying opposite emotions (Blair et al., 2007). Given the evidence of hemispheric biases for different emotional information (Heller and Nitschke, 1998), it can be inferred that the two hemispheres may be differentially involved in cognitive control of opposing emotions. Baijal, Khetrapal and Srinivasan (2007) further investigated the hemispheric biases in processing emotional information in an executive control task by modifying the basic version of the flanker task. To explore hemispheric asymmetries, both the target and flankers were presented in a vertical arrangement, with the target at the horizontal meridian and flankers above or below it, either in the left visual field or in the right visual field. The flanker compatibility effect observed in the reaction times and error data for happy and threatening schematic faces depended upon the visual field to which they were presented (see Figures 8.3a and 8.3b). The flanker interference effect was pronounced when threatening target faces were presented to the right hemisphere and happy target faces were presented to the left hemisphere.
This is consistent with the proposed view that positive emotions facilitate performance on tasks that rely on the left hemisphere whereas negative emotions show a preference for tasks that rely on the right hemisphere (Heller and Nitschke, 1998). The ERP analysis revealed that emotions could be discriminated at a very early stage of processing, as indexed by the frontal N100 component. Happy targets were preferentially processed compared to threatening targets, as shown by increased N100 amplitudes (Figure 8.4), consistent with the behavioural effect of faster responses for happy than for threatening target faces. The conflict-related ERP (N2), which occurred later at 220 to 260 ms after stimulus onset, dissociated high and low conflict trials, with greater amplitudes for incompatible trials than for compatible trials (Figure 8.5). A response-locked ERP component (ERN), which reflects processing of errors, also showed greater amplitudes for erroneous responses than for correct responses (Figure 8.6). It was concluded that the interference occurred after evaluation of the emotional stimuli. Most importantly, since stimulus evaluation and conflict occurred at temporally distinct stages of processing, they did not interfere with each other. In addition, hemispheric asymmetries were present in the processing of conflict with emotional information. There is an indication that ERN magnitudes on incorrect trials differ for happy and threatening faces. Further studies are needed to explore possible emotional interactions with control processes.
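One common way to compare an ERP component across conditions is to average the waveform within the component's time window. The sketch below does this for the 220 to 260 ms N2 window mentioned above. It is an illustration under stated assumptions, not the analysis used in the study: the function name, the sampling times, and the amplitude values are all hypothetical.

```python
def mean_amplitude(times_ms, amplitudes_uv, start_ms, end_ms):
    """Average the samples whose time stamps fall within [start_ms, end_ms]."""
    window = [a for t, a in zip(times_ms, amplitudes_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples in the requested window")
    return sum(window) / len(window)


# Hypothetical stimulus-locked averages sampled every 20 ms (microvolts).
times_ms = [200, 220, 240, 260, 280]
compatible_uv = [-1.0, -2.0, -2.0, -2.0, -1.0]
incompatible_uv = [-1.0, -3.0, -4.0, -2.0, -1.0]

n2_compatible = mean_amplitude(times_ms, compatible_uv, 220, 260)
n2_incompatible = mean_amplitude(times_ms, incompatible_uv, 220, 260)

# The N2 is negative-going, so a more negative mean corresponds to a larger component.
n2_difference = n2_incompatible - n2_compatible
```

With these made-up values the incompatible condition yields the more negative mean, which is the direction of the effect described in the text.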
FIGURE 8.3
The magnitude of flanker compatibility
Note:
(a): The magnitude of the flanker compatibility effect (mean reaction time) is plotted for target faces depicting happy and threatening facial expressions when presented to the left and the right visual fields. (b): The magnitude of the flanker compatibility effect (error) is plotted for target faces depicting happy and threatening facial expressions when presented to the left and the right visual fields.
FIGURE 8.4
N100 component (90 to 140 ms; frontal sites) depicts increased amplitude for happy target faces compared to threatening target faces
FIGURE 8.5
N2 component (220 to 260 ms; central midline sites) is locked to stimulus onset and reflects processing of conflict
Note:
N2 depicted overall greater amplitudes for incompatible compared to compatible condition.
FIGURE 8.6
ERN component (60 to 90 ms; central midline sites) is locked to onset of response and reflects processing of errors
Note:
ERN showed greater amplitude for erroneous responses compared to correct responses.
CONCLUSIONS
Emotions play a critical role in everyday behaviour and they reciprocally interact with most of the cognitive processes including perception, attention and memory. Emotional processes interact not only with selective attention but also with executive control processes. Given the existence of separate posterior and anterior networks for selection and executive control respectively in the brain (Posner and Dehaene, 1994), further studies will shed light on the neural and cognitive mechanisms involved in the interactions between emotion, attention and control.
REFERENCES
Anderson, A. K., K. Christoff, D. Panitz, E. De Rosa, and J. D. E. Gabrieli. 2003. ‘Neural correlates of the automatic processing of threat facial signals’, The Journal of Neuroscience, 23 (13): 5627–33.
Baijal, S. and N. Srinivasan. (submitted). ‘Inhibition of return for facial expressions with exogenous cuing’.
Baijal, S., N. Khetrapal, and N. Srinivasan. 2007. ‘Neural Correlates of Conflict with Emotional Stimuli in a Flanker Task’, Journal of Psychophysiology, 44 (Supplement 1): 42–43.
Batty, M., and M. J. Taylor. 2003. ‘Early processing of the six basic facial emotional expressions’, Cognitive Brain Research, 17 (3): 613–20.
Blair, K. S., B. W. Smith, D. G. Mitchell, J. Morton, M. Vythilingam, L. Pessoa, D. Fridberg, A. Zametkin, D. Sturman, E. E. Nelson, W. C. Drevets, D. S. Pine, A. Martin, and R. J. Blair. 2007. ‘Modulation of emotion by cognition and cognition by emotion’, Neuroimage, 35 (1):
Botvinick, M. M., T. S. Braver, D. M. Barch, C. S. Carter, and J. D. Cohen. 2001. ‘Conflict monitoring and cognitive control’, Psychological Review, 108 (3): 624–52.
Bush, G., P. Luu, and M. I. Posner. 2000. ‘Cognitive and emotional influences in anterior cingulate cortex’, Trends in Cognitive Sciences, 4 (6): 215–22.
Cacioppo, J. T., D. J. Klein, G. C. Berntson, and E. Hatfield. 1993. ‘The psychophysiology of emotion’, in M. Lewis and J. M. Haviland (eds), Handbook of Emotions (pp. 173–91). New York, NY: Guilford Press.
Calvo, M. G. and F. Esteves. 2005. ‘Detection of emotional faces: low perceptual threshold and wide attentional span’, Visual Cognition, 12 (1): 13–27.
Carver, C. S. and M. F. Scheier. 1990. ‘Origins and functions of positive and negative affect: a control-process view’, Psychological Review, 97 (1): 19–35.
Cohen, J. D., D. M. Barch, C. S. Carter, and D. Servan-Schreiber. 1999. ‘Context-processing deficits in schizophrenia: converging evidence from three theoretically motivated cognitive tasks’, Journal of Abnormal Psychology, 108 (1): 120–33.
Cohen, J. D., M. Botvinick, and C. S. Carter. 2000. ‘Anterior cingulate and prefrontal cortex: who’s in control?’, Nature Neuroscience, 3 (5): 421–23.
Compton, R. 2003. ‘The interface between emotion and attention: a review of evidence from psychology and neuroscience’, Behavioral and Cognitive Neuroscience Reviews, 2 (2): 115–29.
Compton, R. J., K. Feigenson, and P. Widick. 2005. ‘Take it to the bridge: an interhemispheric processing advantage for emotional faces’, Cognitive Brain Research, 24: 66–72.
Dehaene, S., M. I. Posner, and D. M. Tucker. 1994. ‘Localization of a neural system for error detection and compensation’, Psychological Science, 5 (5): 303–05.
Damasio, A. R. 1994. Descartes’ Error. New York, NY: Grosset/Putnam Books.
De Araujo, I. E. T., E. T. Rolls, M. I. Velazco, C. Margot, and I. Cayeux. 2005. ‘Cognitive modulation of olfactory processing’, Neuron, 46 (4): 671–79.
Eastwood, J. D., D. Smilek, and P. M. Merikle. 2001. ‘Differential attentional guidance by unattended faces expressing positive and negative emotion’, Perception & Psychophysics, 63 (6): 1004–13.
Eastwood, J. D., D. Smilek, and P. M. Merikle. 2003. ‘Negative facial expression captures attention and disrupts performance’, Perception & Psychophysics, 65: 352–58.
Ekman, P. 1992a. ‘An argument for basic emotions’, Cognition and Emotion, 6 (3–4): 169–200.
Esteves, F., U. Dimberg, and A. Ohman. 1994. ‘Automatically elicited fear: conditioned skin conductance responses to masked facial expressions’, Cognition and Emotion, 8 (5): 393–413.
Etcoff, N. L. 1984. ‘Selective attention to facial identity and facial emotion’, Neuropsychologia, 22 (3): 281–95.
Falkenstein, M., J. Hoormann, S. Christ, and J. Hohnsbein. 2000. ‘ERP components on reaction errors and their functional significance: a tutorial’, Biological Psychology, 51 (2–3): 87–107.
Fenske, M. J. and J. D. Eastwood. 2003. ‘Modulation of focused attention by faces expressing emotion: evidence from flanker tasks’, Emotion, 3 (4): 324–43.
Fenske, M. J. and J. Raymond. 2006. ‘Affective influences on selective attention’, Current Directions in Psychological Science, 15 (6): 312–16.
Fox, E., R. Russo, and K. Dutton. 2001. ‘Do threatening stimuli draw or hold attention in subclinical anxiety?’, Journal of Experimental Psychology: General, 130: 681–700.
Frischen, A., J. D. Eastwood, and D. Smilek. 2008. ‘Visual search for faces with emotional expressions’, Psychological Bulletin, 134 (5): 662–76.
Gehring, W. J., B. Goss, M. G. H. Coles, D. E. Meyer, and E. Donchin. 1993. ‘A neural system for error detection and compensation’, Psychological Science, 4 (6): 385–90.
Ginsburg, G. and M. Harrington. 1996. ‘Bodily states and context in situated lines of action’, in R. Harre and W. Parrott (eds), The Emotions: Social, Cultural and Biological Dimensions (pp. 229–58). Thousand Oaks: Sage Publications.
Grapperon, J., F. Vidal, and P. Leni. 1988. ‘The contribution of cognitive evoked potentials to knowledge mechanisms of the Stroop color-word interference effect’, Neuropsychologia, 38 (3): 701–11.
Hajcak, G., N. McDonald, and R. F. Simons. 2003. ‘To err is autonomic: error-related brain potentials, ANS activity, and post-error compensatory behavior’, Psychophysiology, 40 (6): 895–903.
Halgren, E. 1992. ‘Emotional neurophysiology of the amygdala within the context of human cognition’, in J. P. Aggleton (ed.), The Amygdala: Neurobiological Aspects of Emotion, Memory and Mental Dysfunction (pp. 191–228). Toronto, ON: Wiley.
Hess, U., S. Blairy, and R. E. Kleck. 1997. ‘The intensity of emotional facial expressions and decoding accuracy’, Journal of Non Verbal Behaviour, 21: 241–57.
Heller, W. and J. B. Nitschke. 1998. ‘The puzzle of regional brain activity in depression and anxiety’, Cognition and Emotion, 12 (3): 421–47.
Holroyd, C. B., J. Dien, and M. G. Coles. 1998. ‘Error-related scalp potentials elicited by hand and foot movements: evidence for an output-independent error-processing system in humans’, Neuroscience Letters, 242 (2): 65–68.
Izard, C. E. 1993. ‘Four systems for emotion activation: cognitive and neurocognitive processes’, Psychological Review, 100 (1): 68–90.
Jackson, S. R., R. Marrocco, and M. I. Posner. 1994. ‘Networks of anatomical areas controlling visuo–spatial attention’, Neural Networks, 7 (6–7): 925–44.
James, W. 1894. ‘The physical basis of emotion’, Psychological Review, 101 (2): 205–10.
Johnson-Laird, P. N. and K. Oatley. 1992. ‘Basic emotions, rationality and folk theory’, Cognition and Emotion, 6 (3–4): 201–23.
Kerns, J. G., J. D. Cohen, A. W. MacDonald, R. Cho, V. A. Stenger, and C. S. Carter. 2004. ‘Anterior cingulate conflict monitoring and adjustments in control’, Science, 303 (5660): 1023–26.
Killgore, W. D. S. and D. A. Yurgelun-Todd. 2004. ‘Activation of the amygdala and anterior cingulate during nonconscious processing of sad versus happy faces’, Neuroimage, 21 (4): 1215–23.
Lane, R. D., E. M. Reiman, M. M. Bradley, P. J. Lang, G. L. Ahern, R. J. Davidson, and G. E. Schwartz. 1997. ‘Neuroanatomical correlates of pleasant and unpleasant emotion’, Neuropsychologia, 35 (11): 1437–44.
Lazarus, R. S. 1991. Emotion and Adaptation. New York: Oxford University Press.
LeDoux, J. E. 1993. ‘Emotional memory systems in the brain’, Behavioural Brain Research, 58 (1–2): 69–79.
Levesque, J., F. Eugene, Y. Joanette, V. Paquette, B. Mensour, and G. Beaudoin. 2003. ‘Neural circuitry underlying voluntary suppression of sadness’, Biological Psychiatry, 53 (6): 502–10.
Lewis, M. 1992. Shame: The Exposed Self. New York: Free Press.
Lindsay, D. S. and L. L. Jacoby. 1994. ‘Stroop process dissociations: the relationship between facilitation and interference’, Journal of Experimental Psychology: Human Perception and Performance, 20 (2): 219–34.
Liotti, M., M. G. Woldorff, R. Perez, and H. S. Mayberg. 2000. ‘An ERP study of the temporal course of the Stroop color-word interference effect’, Neuropsychologia, 38 (5): 701–11.
Liu, J., A. Harris, and N. Kanwisher. 2002. ‘Stages of processing in face perception: an MEG study’, Nature Neuroscience, 5 (9): 910–16.
Luu, P., P. Collins, and D. M. Tucker. 2000. ‘Mood, personality, and self-monitoring: negative affect and emotionality in relation to frontal lobe mechanisms of error monitoring’, Journal of Experimental Psychology: General, 129 (1): 43–60.
MacLeod, C. M. 1991. ‘Half a century of research on the Stroop effect: an integrative review’, Psychological Bulletin, 109 (2): 163–203.
Mathews, A. and C. MacLeod. 1985. ‘Selective processing of threat cues in anxiety states’, Behaviour Research and Therapy, 23 (5): 563–69.
McAuliffe, J., A. L. Chasteen, and J. Pratt. 2006. ‘Object- and location-based inhibition of return in younger and older adults’, Psychology and Aging, 21 (2): 406–10.
MacDonald, A. W., J. D. Cohen, V. A. Stenger, and C. S. Carter. 2000. ‘Dissociating the role of the dorsolateral prefrontal cortex in cognitive control’, Science, 288: 1835–38.
Miller, E. K. and J. D. Cohen. 2001. ‘An integrative theory of prefrontal cortex function’, Annual Review of Neuroscience, 24: 167–202.
Ochsner, K. N. 2004. ‘Current directions in social cognitive neuroscience’, Current Opinion in Neurobiology, 14 (2): 254–58.
Öhman, A. 2002. ‘Automaticity and the amygdala: nonconscious responses to emotional faces’, Current Directions in Psychological Science, 11 (2): 62–66.
Öhman, A. and S. Mineka. 2001. ‘Fears, phobias, and preparedness: toward an evolved module of fear and fear learning’, Psychological Review, 108 (3): 483–522.
Oram, M. W. and D. I. Perrett. 1992. ‘Time course of neural responses discriminating different views of the face and head’, Journal of Neurophysiology, 68 (1): 70–84.
Palermo, R. and G. Rhodes. 2007. ‘Are you always on my mind? A review of how face perception and attention interact’, Neuropsychologia, 45 (1): 75–92.
Pansky, A. and D. Algom. 2002. ‘Comparative judgment of numerosity and numerical magnitude: attention preempts automaticity’, Journal of Experimental Psychology: Learning, Memory and Cognition, 28 (2): 259–74.
Pessoa, L., M. McKenna, E. Gutierrez, and L. G. Ungerleider. 2002. ‘Neural processing of emotional faces requires attention’, Proceedings of the National Academy of Sciences USA, 99 (17): 11458–63.
Pessoa, L., S. Japee, D. Sturman, and L. G. Ungerleider. 2006. ‘Target visibility and visual awareness modulate amygdala responses to fearful faces’, Cerebral Cortex, 16 (3): 366–75.
Posner, M. I. and S. Dehaene. 1994. ‘Attentional Networks’, Trends in the Neurosciences, 17 (2): 75–79.
Posner, M. I. and G. J. Di Girolamo. 1998. ‘Executive attention: conflict, target detection and cognitive control’, in R. Parasuraman (ed.), The Attentive Brain (pp. 401–24). Cambridge, MA: MIT Press.
Rebai, M., C. Bernard, and J. Lannou. 1997. ‘The Stroop test evokes a negative brain potential, the N400’, International Journal of Neuroscience, 91 (1–2): 85–94.
Rolls, E. T. 1990. ‘A theory of emotion and its application to understanding the neural basis of emotion’, Emotion and Cognition, 4 (3): 161–90.
Rolls, E. T. 2005. Emotions Explained. Oxford University Press: Oxford.
Rolls, E. T. and G. Deco. 2002. Computational Neuroscience of Vision. Oxford University Press: Oxford.
Rotteveel, M., P. de Groot, A. Geutskens, and R. H. Phaf. 2001. ‘Stronger suboptimal than optimal affective priming?’, Emotion, 1 (4): 348–64.
Scheffers, M. K. and M. G. H. Coles. 2000. ‘Performance monitoring in a confusing world: error-related brain activity, judgments of response accuracy, and types of errors’, Journal of Experimental Psychology: Human Perception and Performance, 26 (1): 141–51.
Smilek, D., J. D. Eastwood, and P. M. Merikle. 2000. ‘Does unattended information facilitate change detection?’, Journal of Experimental Psychology: Human Perception and Performance, 26 (2): 480–87.
Srinivasan, N. and R. Gupta. (in press). ‘Emotion-attention interactions in recognition memory for distractor faces’.
Srivastava, P. and N. Srinivasan. 2010. ‘Time course of visual attention with emotional faces’, Attention, Perception, and Psychophysics, 72 (2): 369–77.
Strongman, K. T. 1996. The Psychology of Emotion: Theories of Emotion in Perspective (4th ed). Toronto, ON: Wiley.
Sugase, Y., S. Yamane, S. Ueno, and K. Kawano. 1999. ‘Global and fine information coded by single neurons in the temporal visual cortex’, Nature, 400 (6747): 869–72.
Thompson, J. G. 1988. The Psychobiology of Emotions. New York, NY: Plenum Press.
van Veen, V., and C. S. Carter. 2002a. ‘The anterior cingulate as a conflict monitor: fMRI and ERP studies’, Physiology and Behavior, 77 (4–5): 477–82.
van Veen, V., and C. S. Carter. 2002b. ‘The timing of action monitoring process in the anterior cingulate cortex’, Journal of Cognitive Neuroscience, 14 (4): 593–602.
Vuilleumier, P. 2002. ‘Facial expression and selective attention’, Current Opinion in Psychiatry, 15 (3): 291–300.
Vuilleumier, P., J. L. Armony, and R. J. Dolan. 2003. ‘Reciprocal links between emotion and attention’, in R. S. J. Frackowiak, J. T. Ashburner, W. D. Penny et al. (eds), Human Brain Function (2nd edition) (pp. 419–44). San Diego: Academic Press.
Vuilleumier, P., J. L. Armony, J. Driver, and R. J. Dolan. 2001. ‘Effects of attention and emotion on face processing in the human brain: an event-related fMRI study’, Neuron, 30 (3): 829–41.
West, R. 2003. ‘Neural correlates of cognitive control and conflict detection in the Stroop and digit-location tasks’, Neuropsychologia, 41 (8): 1122–35.
Williams, L. M., K. J. Brown, P. Das, W. Boucsein, E. N. Sokolov, M. J. Brammer, G. Olivieri, A. Peduto, and E. Gordon. 2004. ‘The dynamics of cortico-amygdala and autonomic activity over the experimental course of fear perception’, Cognitive Brain Research, 21 (1): 114–23.
Williams, M. A., S. A. Moss, J. L. Bradshaw, and J. B. Mattingley. 2005. ‘Look at me, I’m smiling: visual search for threatening and nonthreatening facial expressions’, Visual Cognition, 12 (1): 29–50.
Wojciulik, E., N. Kanwisher, and J. Driver. 1998. ‘Covert visual attention modulates face-specific activity in the human fusiform gyrus: fMRI study’, Journal of Neurophysiology, 79 (3): 1574–78.
Wong, P. S. and J. C. Root. 2003. ‘Dynamic variations in affective priming’, Consciousness and Cognition, 12 (2): 147–68.
Zajonc, R. B. 1980. ‘Feeling and thinking: preferences need no inferences’, American Psychologist, 35 (2): 151–75.
Chapter 9
Modelling Neuropsychological Deficits with a Spiking Neural Network
Eirini Mavritsaki, Glyn W. Humphreys, Dietmar Heinke, and Gustavo Deco
Introduction
Damage to the posterior parietal cortex (PPC) in humans can lead to a variety of spatial and non-spatial processing disorders including clinical deficits such as Balint’s syndrome (after bilateral lesions), unilateral neglect, and extinction (after unilateral lesions). However, syndromes such as visual neglect are associated with a wide variety of symptoms that may vary across patients and even across different occasions in the same patient. This makes it difficult to develop a detailed account of the disorders without simulating how patterns of behaviour can emerge from interactions between modules in a damaged system. Consequently, it is useful to develop explicit models of such disorders, which can provide a framework for understanding how the complex behavioural syndromes arise. Modellers have usually simulated neurophysiological disorders with connectionist architectures, which approximate neuronal function at relatively high levels (see Ellis and Humphreys, 1999). When applied to the spatial deficits found after PPC damage, such models have been able to capture a wide set of symptoms ranging from the emergent lateralization of spatial selection through to the influence of grouping and top-down knowledge on reducing extinction and neglect (for example, Heinke and Humphreys, 2003; Mozer and Behrmann, 1990; Mozer et al., 1997; Pouget and Sejnowski, 1997). For example, Heinke and Humphreys (2003) simulated PPC damage in their Selective Attention for Identification Model (SAIM) by reducing the connectivity between units on one side of a “selection network”. Humphreys and Riddoch (1994, 1995) reported a patient who, after suffering bilateral damage, showed right neglect of stimuli in retinotopic space along with neglect of the left parts of objects. Heinke and Humphreys (2003) simulated
this by damaging connections between units in the selection network responding to the right side of the retina along with connections between units connecting to the left side of the focus of attention. Although connectionist models have many virtues, they also have some limitations. For example, many such models incorporate learning through back propagation, which is not biologically realistic (see Sejnowski, 1986) and which can give rise to network properties divorced from real neuronal structures (for example, with units acting in either an excitatory or an inhibitory manner, depending on the sign of their connection to other units). Also, many models use simplified activation functions, modified by single parameters (for example, Servan-Schreiber et al., 1990), and so they typically fail to capture more complex neural modulations generated through different neuro-transmitter systems in the human brain. In addition, many connectionist models do not have units operating in a time-based manner that can be explicitly related to the time course of neuronal processes. In such cases, time course functions must be matched to human data either in a purely qualitative manner (for example, Heinke and Humphreys, 2003) or by fitting a measure based on network iterations to one based on real time (for example, Seidenberg and McClelland, 1989). These limitations become important when we wish to simulate neurological disorders at a finer-grained level, where (for example) we wish to model quantitatively the effects of varying the temporal intervals between stimuli along with variations in neuro-transmitter signals. For example, do non-spatial deficits emerge after unilateral lesioning if there is concurrent alteration in neuro-transmitter modulation (cf. Malhotra et al., 2005)?
To capture such effects, it is possible to use models that incorporate some of the biological parameters of real neuronal operations, including values representing the time course of neuro-transmitter operations and of spiking activity in real neuronal systems. Deco and colleagues (Deco and Rolls, 2005; Deco and Zihl, 2001) have simulated human attention using models based on “integrate and fire” neurons, which utilize biologically plausible activation functions and output in terms of neuronal spikes. These authors showed that a model with parallel processing of the input through to high levels could simulate classic “attentional” (serial) aspects of human search (for example, contrasting search when targets are defined by simple features with search when targets are defined by conjunctions of features; cf. Treisman and Gelade, 1980), providing an existence proof that a model incorporating details of neuronal activation functions could capture aspects of human visual attention. The Deco and Zihl (2001) model, which forms the starting point for our own simulations (below), used a “mean-field approximation” to neuronal function, where the actual fluctuating induced local field u_i for each neuron i was replaced by its average, with the result that the model does not capture dynamic operations at the level of individual neurons. Recently we elaborated this account at the level of individual neurons, to simulate in more detail human spatial and temporal selection. The approach is captured in the spiking Search over Time and Space (sSoTS) model (Mavritsaki et al., 2007; Mavritsaki et al., 2006),
which uses a system of spiking neurons modulated by NMDA, AMPA, GABA currents as originally presented by Brunel and Wang (2001) along with an I_AHP current put forward by Deco and Rolls (2005). The architecture of the model is illustrated in Figure 9.1. sSoTS uses a simplified form of feature coding, with two layers of feature maps to encode the characteristics of visual stimuli (their colour and shape). There is in addition a “location map” in which units respond to the presence of any feature at a given location. At each location (in the feature maps and the location map), there is a pool of spiking neurons, providing some redundancy in the coding of visual information. The feature maps may correspond to neurons in ventral cortex (for example, V4) while the location map may correspond to neurons in dorsal (posterior parietal) cortex. Activity in the location map provides an index of “saliency” irrespective of the feature values involved (cf. Itti and Koch, 2000), since the location units represent the strength of evidence for “something” occupying each position, but they are “blind” to the features present (which are summed across the feature maps). There is then also feedback activation from the pool of units corresponding to each position in the map of locations to units at the corresponding location in the feature maps, supporting the selection of features that are linked to the highest saliency value. Over time, the model converges upon a target, with reaction times (RTs) based on the real-time operation of the neurons. Search efficiency in sSoTS is determined by the degree of overlap between the features of the target and those of distractors, with RTs lengthening as overlap increases and competition for selection increases.
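The feature-blind character of the location map can be illustrated with a toy, rate-based sketch. This is a deliberate simplification and not the sSoTS implementation, which uses pools of spiking neurons: here each map is just a list of hypothetical activity values, and the map names, the values, and the function name are assumptions made for illustration.

```python
def location_map(feature_maps):
    """Sum activity across all feature maps at each location (feature-blind saliency)."""
    n_locations = len(next(iter(feature_maps.values())))
    return [
        sum(fmap[loc] for fmap in feature_maps.values())
        for loc in range(n_locations)
    ]


# Hypothetical activity (arbitrary units) at four display locations.
feature_maps = {
    "colour_blue":  [1.0, 0.0, 0.5, 0.0],
    "colour_green": [0.0, 0.5, 0.0, 0.0],
    "shape_H":      [1.0, 0.5, 0.0, 0.0],
    "shape_A":      [0.0, 0.0, 0.5, 0.0],
}

saliency = location_map(feature_maps)

# Index of the location with the strongest evidence that "something" is present.
most_salient = max(range(len(saliency)), key=saliency.__getitem__)
```

The summed values say how much evidence there is at each position but not which colour or shape produced it, which is the sense in which the location units are “blind” to features.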
Consequently, search for a conjunction target (having no unique feature and sharing one feature with each of two distractors) is more difficult than search for a feature-defined target (differing from the distractors by a unique feature). Like Deco and Zihl (2001), Mavritsaki et al. (2006, 2007) showed that search RTs in the conjunction condition also increased linearly as a function of the display size, mimicking “serial” search. In addition to modelling spatial aspects of search, sSoTS also successfully simulated data on human search over time, in the preview search paradigm. In preview search (Watson and Humphreys, 1997; Watson et al., 2003), one set of distractors precedes the other distractors and the target. Provided the interval between the initial items and the search display is over 450 ms or so, the first distractors have little impact on performance (Humphreys et al., 2006a; Humphreys et al., 2006b; Watson and Humphreys, 1997). The sSoTS model generated efficient preview search when there was an interval of over 500 ms between the initial preview and the final search display. sSoTS mimics this time course due to the contribution of two processes: (a) a spike frequency-adaptation mechanism generated from a slow [Ca2+]-activated K+ current, which reduces the probability of spiking after an input has activated a neuron for a prolonged period (Madison and Nicoll, 1984), and (b) a top-down inhibitory input that forms an active bias against known distractors. The slow action of frequency adaptation simulates the time course of preview search. The top-down inhibitory bias matches data from human psychophysical studies where the detection of probes has been shown to be impaired when they fall at the
Figure 9.1 The architecture of the sSoTS model
Note: At the bottom of the figure we show the time periods for the displays [single feature (SF), conjunction (CJ) and preview (PV)]. The dotted lines indicate the top-down inhibition applied to the features and locations of distractors during the preview period.
locations of old, ignored distractors (Agter and Donk, 2005; Allen and Humphreys, 2007; Humphreys et al., 2004; Watson and Humphreys, 2000). In addition, in explorations of the parameter space for sSoTS, Mavritsaki et al. (2006, 2007) found that this top-down inhibitory bias was a necessary component for approximating the behavioural data on preview search. These results, using the sSoTS model, indicate that processes of co-operation and competition between processing units may not be sufficient to account for the full range of data on human selective attention, and that factors such as frequency adaptation are required in order to simulate the temporal dynamics of visual attention. In the present chapter we show how the sSoTS model can be used to simulate not only normal performance but also the effects of PPC damage on human visual selection. To mimic unilateral PPC lesions, we reduced the number of neurons in the pools on one side of the location map. We report five simulations. In Simulation 1, we examined whether a unilateral lesion would generate the selective disturbances in spatial and temporal search found in patients with PPC damage, who find both conjunction search and preview search abnormally difficult (compared with a single feature baseline condition). We also assessed whether this selective difficulty was most pronounced for targets falling on the contralesional side of space. In Simulation 2, we investigated a particular variation of preview search used by Olivers and Humphreys (2004), in which the target and distractors appeared either within the same field or across different fields. This simulation served to distinguish predictions made by sSoTS from predictions made by an account that attributes impairments after PPC damage to problems in the spatial disengagement of attention (cf. Posner and Cohen, 1984).
In Simulation 3 we report the results when, in addition to lesioning the model, we alter the level of NMDA current, mimicking changes associated with damage to the norepinephrine (NE) system (Posner and Petersen, 1990). Here we ask whether we find emergent deficits on the ipsi- as well as the contralesional side of space, consistent with non-spatial components of the neglect syndrome. Taken together, the results indicate that: (a) both parallel and serial aspects of human visual selection can emerge from a model with a purely parallel processing architecture, (b) a model operating at the spiking level can capture crucial aspects of a neuropsychological disorder, including factors such as whether non-spatial as well as spatial deficits occur, and (c) a model of this type can generate non-intuitive predictions (for example, on the effects of spatial overlap between targets and distractors in preview search, in Simulation 2), distinguishing the account proposed from others in the field.
sSoTS Model
The Architecture of the Model
The model consists of spiking neurons organized into pools containing a number of units with similar biophysical properties and inputs. The network of the model is based on the network suggested by Deco and Zihl (2001). Deco and Zihl’s (2001) model is an extended
version of the Usher and Niebur (1996) model, which is based on neurophysiological data from the inferior temporal cortex (Chelazzi et al., 1993). The network represents the location of each item in the retina using an n × m matrix, where n represents the row and m the column of the item. The simulations presented here were based on a highly simplified case where there were six positions in the visual field,¹ with 3 rows and 2 columns. sSoTS has three layers of retinotopically-organized units, each containing neurons that are activated on the basis of a stimulus falling at the appropriate spatial position. There is one layer for each feature dimension (“colour” and “letter shape”) and one layer for the location map (Figure 9.1). The feature dimension “colour” encodes information on the basis of whether a blue or green colour is presented in the visual field at a given position i (i = 1, ..., 6), creating activity in the blue and green feature maps. For simplicity, and given that many visual search experiments have utilized letter displays, we label the form maps as representing “letter shape” (there are two feature maps, one responding to the letter H and the other to the letter A). Note that we are not proposing that “letter maps” of this type exist in the brain; all that is crucial for the present simulations is that one form in the search task activates one map and the other form activates a second map. The form maps here could equally well correspond to edge orientations (for example, a map coding vertical edges would be differentially activated by H stimuli in the search task, and a map coding oblique edges would be differentially activated by A stimuli). However, for ease in relating the simulations to typical search experiments (for example, see Watson and Humphreys, 1997), we use the labels “H” and “A” maps. The third layer contains the location map.
The pools in the location map sum activity from the different feature maps to represent the overall activity for the corresponding positions in the visual field. Each of the layers contains one inhibitory pool (see also Deco and Rolls, 2005; Mavritsaki et al., 2006) and one non-specific pool, along with the feature maps. The inhibitory pools are modelled following Dale’s hypothesis, which states that a neuron is either excitatory or inhibitory in all of its connections with other cells; although this law is not absolute, it provides a useful classification of cell populations (Tuckwell, 1998). These neurons can have spontaneous activation and are connected with the corresponding feature maps for each layer. The ratio between inhibitory and excitatory neurons is the same for all the layers and is based on a ratio of 20 to 80, derived from populations of (inhibitory) interneurons and (excitatory) pyramidal neurons in the brain (Abeles, 1991; Rolls and Deco, 2002). The model also simulates signals that the feature and location maps receive from other brain areas. These signals can be characterized as noise and are simulated using a Poisson noise distribution. The system receives this signal as external spontaneous activity with a value of 3 Hz for each of the 800 neurons that convey the signal to the system, consistent with activity values observed in the cerebral cortex (Rolls and Treves, 1998; Wilson et al., 1994).
¹ Simulations with increased numbers of units can take an inordinately long time to run.
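The external background input described above can be sketched in a few lines. The following is a minimal illustration, not the sSoTS code: it approximates each of the 800 external neurons as an independent Poisson source firing at 3 Hz, using one Bernoulli draw per neuron per time step. The function names, the 0.1 ms time step and the 200 ms duration are assumptions made for this example.

```python
import random


def poisson_input_spikes(n_ext=800, rate_hz=3.0, dt_ms=0.1, rng=None):
    """Count the external spikes reaching one neuron in a single time step.

    Each external neuron fires with probability rate_hz * dt in the step,
    a Bernoulli approximation of a Poisson process that is reasonable
    when rate_hz * dt is much smaller than 1.
    """
    rng = rng or random.Random()
    p_spike = rate_hz * dt_ms / 1000.0
    return sum(1 for _ in range(n_ext) if rng.random() < p_spike)


def estimate_total_rate_hz(duration_ms=200.0, dt_ms=0.1, n_ext=800,
                           rate_hz=3.0, seed=0):
    """Empirical summed input rate (Hz) over a simulated interval."""
    rng = random.Random(seed)
    n_steps = round(duration_ms / dt_ms)
    total = sum(poisson_input_spikes(n_ext, rate_hz, dt_ms, rng)
                for _ in range(n_steps))
    return total / (duration_ms / 1000.0)


if __name__ == "__main__":
    # The expected summed rate is n_ext * rate_hz = 2400 Hz.
    print(estimate_total_rate_hz())
```

Because the draws are random, the estimate fluctuates around 2400 Hz from run to run; this trial-to-trial variability is what allows different runs of a spiking model to be treated as different trials.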
For each layer the inhibitory and non-specific pools are connected with the pools in the feature maps. The neurons within each pool in the feature map are mutually excitatory and have strong coupling (w+); the neurons from different pools within each map are also excitatory but have low coupling (w−). The inhibitory process within each layer operates through the inhibitory pool of neurons. In addition, each pool in the location map is connected in an excitatory manner with the pools in the feature maps that represent the same position in the visual field. These excitatory connections feed back activity to enhance the competition at the feature level. The system used and the connections are illustrated in Figure 9.1.
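The connectivity pattern just described can be summarized in a small data-structure sketch. This is an illustration under stated assumptions rather than the model's implementation: the numerical values passed for w+ and w− are placeholders (the chapter does not give them here), and the map labels and function names are hypothetical.

```python
N_POSITIONS = 6                           # 3 rows x 2 columns in the visual field
FEATURE_MAPS = ("blue", "green", "H", "A")  # two colour maps, two form maps


def within_map_weights(w_plus, w_minus, n=N_POSITIONS):
    """Excitatory coupling between the pools of one feature map.

    Entry [i][j] is the coupling from pool j to pool i: strong (w_plus)
    within a pool, weak (w_minus) between different pools of the map.
    """
    return [[w_plus if i == j else w_minus for j in range(n)]
            for i in range(n)]


def location_feedback_targets(position):
    """Feature-map pools receiving feedback from one location pool.

    A location pool projects to the pool at the same position in every
    feature map, and to no other position.
    """
    return [(name, position) for name in FEATURE_MAPS]


if __name__ == "__main__":
    weights = within_map_weights(w_plus=2.0, w_minus=0.8)  # placeholder values
    print(weights[0])
    print(location_feedback_targets(3))
```

Keeping the feedback strictly position-specific is what lets activity in the location map bias the feature maps toward the features present at the most salient position.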
Spiking Characteristics
Spiking activity in the system can be described by a set of coupled differential equations that give the time evolution of each neuron as a function of the other neurons. The neurons use integrate-and-fire functions (Tuckwell, 1998), which can be represented by a circuit with a capacitance and a resistance in parallel. The formulation of the integrate-and-fire neurons was taken from Deco and Rolls (2005) and Brunel and Wang (2001). Each neuron receives recurrent excitatory postsynaptic currents with two components: (a) a fast component that is mediated by AMPA-like dynamics; and (b) a slow component mediated by NMDA-like dynamics. The external neurons are modelled by AMPA-like connections. Inhibition is modelled using GABA-like dynamics. The sub-threshold membrane potential is given by the following equation:
\[ C_m \frac{dV(t)}{dt} = -g_m \left( V(t) - V_L \right) - I_{syn}(t) + I_{AHP} \]
where C_m is the membrane capacitance, g_m is the membrane leak conductance, V_L is the resting potential, I_syn is the synaptic current and I_AHP is the current term for the frequency adaptation mechanism (for more details see Mavritsaki et al., 2006). In order to investigate the system’s behaviour efficiently, a mean field approach was initially used (for more details see Mavritsaki et al., 2006; Deco and Rolls, 2005; Brunel and Wang, 2001). Simulations at the mean field level can thus be used to define the limits on parameters in the system, which can then be explored more systematically at the level of spiking neurons in order to model the relevant data more precisely.
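As a rough illustration of the sub-threshold equation, the sketch below integrates a single leaky integrate-and-fire unit with the forward Euler method. It is not the sSoTS implementation: the full model computes the synaptic current from AMPA, NMDA and GABA conductances, whereas here it is a constant supplied by the caller. The firing threshold, reset potential, refractory period, membrane constants and the function name simulate_lif are all assumptions chosen for illustration.

```python
def simulate_lif(i_syn_nA, duration_ms=500.0, dt_ms=0.1, i_ahp_nA=0.0):
    """Return spike times (ms) of one leaky integrate-and-fire neuron.

    Integrates C_m dV/dt = -g_m (V - V_L) - I_syn + I_AHP by forward Euler.
    With this sign convention a negative I_syn depolarizes the cell.
    Units: nF, uS, mV, nA and ms, which are mutually consistent.
    """
    c_m, g_m = 0.5, 0.025                       # assumed: time constant 20 ms
    v_l, v_thr, v_reset = -70.0, -50.0, -55.0   # assumed potentials (mV)
    t_ref = 2.0                                 # assumed refractory period (ms)

    v = v_l
    refractory_left = 0.0
    spikes = []
    for step in range(round(duration_ms / dt_ms)):
        if refractory_left > 0.0:
            refractory_left -= dt_ms            # hold V at reset after a spike
            continue
        dv_dt = (-g_m * (v - v_l) - i_syn_nA + i_ahp_nA) / c_m
        v += dv_dt * dt_ms
        if v >= v_thr:
            spikes.append(step * dt_ms)
            v = v_reset
            refractory_left = t_ref
    return spikes


if __name__ == "__main__":
    for current in (-0.4, -0.6, -0.8):
        print(current, len(simulate_lif(current)))
```

With these assumed constants the steady-state potential is V_L − I_syn/g_m, so a drive of −0.4 nA settles at −54 mV and never reaches threshold, whereas −0.6 nA and −0.8 nA produce sustained firing at increasing rates.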
Inhibitory Mechanisms in the Model
Frequency Adaptation (Passive Inhibition)
In addition to the currents described in the previous section, the model also includes a [Ca2+]-sensitive K+ current, which provides the model with a mechanism of frequency adaptation.
Adaptation of firing is known to be a common property of spiking neurons (Ahmed et al., 1998), whereby after firing there is a decrease in the probability of the neuron spiking again, down to some steady state. Spike frequency adaptation is typically ascribed to either the voltage-activated M-type current or the slow [Ca2+]-activated K+ current (I_AHP). However, it is believed that during the first 300 ms of adaptation the main effect stems from a slow [Ca2+]-activated K+ current (Madison and Nicoll, 1984). In addition, it has been suggested that I_AHP is responsible for slowing the spikes but not stopping them (Prescott and Sejnowski, 2007). This might be useful for simulating aspects of human search over time (in preview search), where there is evidence that “old” distractors are still activated; for example, old distractors are available to compete with new targets if a secondary task is introduced while the old distractors are initially present in the field (Humphreys et al., 2002; Watson and Humphreys, 2007). Frequency adaptation has been modelled by Liu and Wang (2001), and their formulation is employed here. The frequency adaptation function provides, “for free”, a passive component to preview search based on the length of time that items have been in the field. In the present simulation, the average firing rate of the neurons in each pool is calculated. The slow [Ca2+]-activated K+ current acts more quickly on the pools with a higher firing frequency, since their increased firing leads to quicker increases in the intracellular levels of [Ca2+]. Following this, the frequency of firing within these higher firing pools will decrease. In sSoTS the spike frequency adaptation mechanism takes the form of inhibition applied proportionately to the pools that are active for some period, where an active pool is one where the frequency of firing is relatively high compared with the other pools.
An active pool in the feature maps indicates that there is an item in the corresponding position in the visual field. Under conditions of preview search the I_AHP current leads to a decrease in activation in pools that represent the positions of the first set of distractors, because these items are active for a period before the presentation of the search display.
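The passive mechanism described above can be illustrated with a single adapting neuron. The chapter states that the Liu and Wang (2001) formulation is employed but does not give its equations in this section, so the form used below is an assumption: a calcium variable that jumps by a fixed amount at each spike and decays exponentially, gating a potassium current I_AHP = −g_AHP [Ca2+] (V − V_K). Every parameter value and the function name are likewise assumed for illustration.

```python
def simulate_adapting_lif(i_syn_nA=-0.8, g_ahp=0.0075,
                          duration_ms=1000.0, dt_ms=0.1):
    """Spike times (ms) of an integrate-and-fire neuron with a Ca-gated K current.

    Assumed dynamics: d[Ca]/dt = -[Ca]/tau_ca, with [Ca] increased by
    alpha_ca at each spike, and I_AHP = -g_ahp * [Ca] * (V - V_K).
    Setting g_ahp to 0 switches adaptation off.
    """
    c_m, g_m = 0.5, 0.025                       # nF, uS (assumed)
    v_l, v_thr, v_reset, v_k = -70.0, -50.0, -55.0, -80.0  # mV (assumed)
    t_ref = 2.0                                 # ms (assumed)
    alpha_ca, tau_ca = 0.2, 300.0               # per-spike Ca step, decay in ms (assumed)

    v, ca, refractory_left = v_l, 0.0, 0.0
    spikes = []
    for step in range(round(duration_ms / dt_ms)):
        ca -= (ca / tau_ca) * dt_ms             # calcium decays at every step
        if refractory_left > 0.0:
            refractory_left -= dt_ms
            continue
        i_ahp = -g_ahp * ca * (v - v_k)         # hyperpolarizing while V > V_K
        v += ((-g_m * (v - v_l) - i_syn_nA + i_ahp) / c_m) * dt_ms
        if v >= v_thr:
            spikes.append(step * dt_ms)
            v = v_reset
            ca += alpha_ca                      # each spike admits calcium
            refractory_left = t_ref
    return spikes


if __name__ == "__main__":
    adapted = simulate_adapting_lif()
    intervals = [b - a for a, b in zip(adapted, adapted[1:])]
    print(len(adapted), intervals[0], intervals[-1])
```

Under these assumptions the inter-spike intervals lengthen over the first few hundred milliseconds and then stabilize, so firing is slowed rather than stopped, which is the property the text attributes to I_AHP.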
Active Ignoring (Active Inhibition)
In addition to this passive mechanism of adaptation to the preview, we captured the “active” ignoring of old distractors (Agter and Donk, 2005; Humphreys et al., 2004; Watson and Humphreys, 2000) by adding a small increase in inhibition to the maps representing the features of old distractors (inhibitory weights increased from 1 to 1.15).
Parameter Setting
The parameters for the simulations were established in baseline conditions in the unlesioned version of the model with “single feature” and “conjunction” search tasks as
reported by Watson and Humphreys (1997) (conjunction search: blue H target versus green H and blue A distractors; feature search: blue H target versus blue A distractors). The presence of an object in the visual field was signified by adding a value λ_in to the external input that the system received. The target also benefited from an extra top-down input λ_att given to those feature maps that represent the target’s characteristics (that is, the colour blue and the letter H). This conforms to an expectation of the target’s features, as suggested by psychological theories of search (for example, the biased competition model, Duncan et al., 1997; the Guided Search model, Wolfe, 1994). Overall, the input that a pool could receive was ν_ext = ν_ext + (λ_in + λ_att)/N_ext. The parameters for the [Ca2+]-sensitive K+ current were selected in order to be able to simulate the preview effect in addition to conjunction and single feature search when the model was unlesioned (search efficiency in the preview condition matching that in the single feature baseline; Watson and Humphreys, 1997). The parameters used for the system can be found in Deco and Rolls (2005) and Mavritsaki et al. (2006, 2007). Reaction times (RTs) were based on the time taken for the firing rate of the pool in the location map to cross a relative threshold (thr). If the selected pool corresponded to the target then the search was successful (a hit trial). If the pool that crossed the threshold corresponded to a distractor rather than the target then the target was “missed”. Note, however, that if the parameters were set so that the target’s pool was the winner on every trial, only small differences in the slopes were observed between conjunction and single feature search, due to target activation saturating the system. Accordingly, search was run under conditions in which some errors occurred, mimicking human data. Only target present trials were simulated.
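The input rule and the read-out of RTs can be sketched as follows. Two points are assumptions made for this example and are not specified in the text: the stimulus and attentional input terms are passed in as lambda_in and lambda_att with arbitrary values, and the “relative threshold” is interpreted as one location pool exceeding every other pool by at least thr. The function names and the firing-rate traces are hypothetical.

```python
def pool_input_rate(nu_ext_hz, lambda_in=0.0, lambda_att=0.0, n_ext=800):
    """External rate per input neuron after adding stimulus and attention terms."""
    return nu_ext_hz + (lambda_in + lambda_att) / n_ext


def read_out_trial(rates_by_pool, target, thr, dt_ms):
    """Classify one trial from location-map firing-rate traces.

    rates_by_pool maps a pool label to its firing rate at each time step.
    The first pool to exceed all other pools by thr is selected (assumed
    meaning of the relative threshold). Returns (outcome, rt_ms); rt_ms is
    None when no pool wins within the trace.
    """
    n_steps = len(next(iter(rates_by_pool.values())))
    for step in range(n_steps):
        for pool, trace in rates_by_pool.items():
            rivals = [other[step] for name, other in rates_by_pool.items()
                      if name != pool]
            if trace[step] - max(rivals) >= thr:
                outcome = "hit" if pool == target else "miss"
                return outcome, step * dt_ms
    return "miss", None


if __name__ == "__main__":
    traces = {"target": [5, 10, 20, 40], "distractor": [5, 9, 12, 15]}  # made-up rates
    print(read_out_trial(traces, target="target", thr=10, dt_ms=50.0))
    print(pool_input_rate(3.0, lambda_in=80.0, lambda_att=40.0))
```

A read-out of this kind yields both an RT and an accuracy measure from the same trace, which is why runs in which a distractor pool wins, or no pool wins, count as misses.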
Detailed simulations were run at the spiking level only, to match the experimental results (Watson and Humphreys, 1997; Watson et al., 2003). In simulations run at the spiking level there is noise within the system based on a Poisson distribution of activity in the units. This enables statistics to be calculated where different runs of the model correspond to different trials. In addition, reaction times (RTs) are generated based on the real-time properties of the neurons. After setting the parameters for the unlesioned model, sSoTS was “lesioned” by reducing the number of units in the pools on one side of the location map. In the simulations of search (Simulations 1 to 3) we reduced each pool of location units on the lesioned side by 16.66%. In the simulations of extinction, which may be thought of as a milder form of neglect (though see Karnath et al., 2003), the pools of location units on the “contralesional side” were each reduced by 12.5%. In addition, in studies of extinction, patients may only be asked to decide whether they detected a stimulus on the contralesional side, and they may not have to discriminate a target from distractors (cf. Gilchrist et al., 1996). To simulate this in sSoTS we ran the model without employing any top-down expectation for a particular target (for example, a blue H). Here we examine the model’s ability to discriminate the presence of a stimulus in the affected field purely using bottom-up cues.
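The lesioning procedure amounts to shrinking the location-map pools on one side. The sketch below illustrates the arithmetic only. The pool size of 48 neurons, the numbering of positions and the choice of which positions count as contralesional are assumptions for this example, since the chapter gives none of them in this section.

```python
def lesion_location_map(pool_sizes, lesioned_positions, fraction):
    """Return new pool sizes after removing a fraction of units on one side.

    pool_sizes maps a position to its number of neurons. Only positions in
    lesioned_positions are reduced; the other pools are left intact.
    """
    return {
        pos: size - round(size * fraction) if pos in lesioned_positions else size
        for pos, size in pool_sizes.items()
    }


if __name__ == "__main__":
    intact = {pos: 48 for pos in range(1, 7)}   # assumed pool size, six positions
    contralesional = {1, 3, 5}                  # assumed: one column of the 3 x 2 grid
    print(lesion_location_map(intact, contralesional, 0.1666))  # search simulations
    print(lesion_location_map(intact, contralesional, 0.125))   # extinction simulations
```

Under these assumed numbers, a 16.66% reduction removes 8 of 48 units per lesioned pool and a 12.5% reduction removes 6, so the extinction lesion is the milder of the two.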
Simulations
Simulation 1: Effects of a Unilateral Lesion on Search through Space and Time
PPC patients find it difficult to find a conjunction target or a target defined by a low saliency single feature difference relative to distractors (see Eglin et al., 1989; Friedman-Hill et al., 1995; Humphreys and Price, 1994; Humphreys and Riddoch, 1993; Riddoch and Humphreys, 1987). Such patients also show deficits under conditions of preview search, where one set of distractors is temporally segmented from the other distractors and the search target, even though for normal participants preview search can be as efficient as single feature search (Humphreys et al., 2006b; Olivers and Humphreys, 2004). In this simulation we examine how a unilateral lesion influences single feature, conjunction and preview search.
Method
The display conditions are the same as in Mavritsaki et al. (2006, 2007) and mirror those used by Watson and Humphreys (1997). The target was always the blue letter H. In each case the stimuli were randomly positioned in the field. Performance was averaged across the different permutations of the displays to create data equivalent to the results from one participant.
Results and Discussion
Figure 9.2 gives the mean correct RTs in the unlesioned and lesioned versions of the model as a function of the target field and the display size. The data are plotted separately for comparisons of the preview condition against the single feature and conjunction conditions (following Humphreys et al., 2002). The number of items in the final preview search display matches the number of items in the single feature search condition. If the search is equally efficient in the two conditions then the slopes of the search functions should not differ. The total number of items in the preview search matches the number in the conjunction display.
If the preview search is more efficient than the conjunction search, then the slope of its search function, based on the total number of items present, should be reduced. These different comparisons are most clearly shown by providing plots against the different display sizes. In the comparisons of the preview conditions with the single feature and conjunction conditions, we used display sizes 2 and 3 from single feature search and display sizes 4 and 6 from the conjunction search task. Comparisons between the single feature and conjunction tasks were based on display sizes 4 and 6 in each case, so the number of items in the final displays was matched. Figure 9.3 gives the data for accuracy of report in the lesioned version of the model. Note that errors in the unlesioned version of the model were minimal and resembled those found for ipsilesional targets in the lesioned model.
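The slope comparisons described above reduce to fitting a line to mean RT against display size. The sketch below is a generic least-squares slope, offered as an illustration of the logic rather than the analysis actually used. The RT values are invented placeholders and do not come from the simulations.

```python
def search_slope(display_sizes, mean_rts_ms):
    """Least-squares slope of mean RT against display size (ms per item)."""
    n = len(display_sizes)
    mean_x = sum(display_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    s_xx = sum((x - mean_x) ** 2 for x in display_sizes)
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(display_sizes, mean_rts_ms))
    return s_xy / s_xx


if __name__ == "__main__":
    preview_rts = [480.0, 500.0]        # placeholder RTs, not model output
    # Preview against the single feature baseline: items in the final display.
    print(search_slope([2, 3], preview_rts))
    # Preview against the conjunction baseline: total items in the field.
    print(search_slope([4, 6], preview_rts))
```

The same pair of preview RTs gives a slope half as large when plotted against the full display sizes, which is why the two baselines call for separate plots.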
Figure 9.2 The mean correct reaction times (RTs, in ms) for the unlesioned (dotted lines) and lesioned versions (solid lines) of sSoTS
Note: Data for the unlesioned model are plotted for convenience in the “contralesional” side, but the data are for targets shown on either side of space. A illustrates the RT data for contralesional targets (in the lesioned model) for preview and single feature search, plotted against the display sizes in the search display. B illustrates RT data for ipsilesional targets for preview compared with single feature search. C shows the data for contralesional targets (in the lesioned model) plotted against the full display sizes for preview and conjunction search. D shows RT data for ipsilesional targets for the preview condition relative to the conjunction condition. The separate plots for preview search against the display sizes in the single feature and conjunction search baselines follow the procedure used by Humphreys et al. (2002) (data from Simulation 1).
Figure 9.3 The mean percentage miss responses for the lesioned versions of sSoTS (data from Simulation 1)
Note: The layout of the figure follows the scheme used for Figure 9.2.
sSoTS was able to simulate normal patterns of search performance over space and time, when in an unlesioned state (Mavritsaki et al., 2006, 2007): search was more efficient in the single feature and baseline conditions than in the conjunction condition, when two sets of distractors appeared simultaneously with the target. It is clear that a pattern of ‘serial’ search can occur in a model employing only a parallel processing architecture, with the linear component to the search function caused by increases in competition between targets and distractors, as the display size increases. In addition, after lesioning there were differential effects on the “easy” and “hard” search tasks, with the “hard” conjunction search showing a greater effect than the “easy” feature search task. The general finding that conjunction search is selectively disrupted, compared with a single feature baseline, replicates neuropsychological results (Eglin et al., 1989; Friedman-Hill et al., 1995; Riddoch and Humphreys, 1987). Additionally, preview search was more disrupted than single feature search, which follows the pattern reported by Humphreys et al. (2006)
and Olivers and Humphreys (2004) with PPC patients. These results are consistent with the effect of lesioning being to increase the internal competition between elements, with targets falling on the contralesional side of space being particularly susceptible to competition from ipsilesional distractors. In terms of accuracy, the deficit was greater for the conjunction than the preview condition, but in both cases there were nevertheless many misses of targets on the contralesional side. This reflected distractor competition with the contralesional target, which meant either that a distractor was sometimes selected instead of the target or that there was no clear “winner” of the competition. One further consequence of the competition between the target and distractors is that units representing target features received weaker feedback as selection was taking place. Thus, even though the feature pools did not suffer direct damage, activation in these pools decreased. This is an emergent property of a model employing strong interactivity between layers. The result is itself of some interest, since it matches functional imaging data in patients with PPC damage (Rees et al., 2000). In these cases, activity in primary visual cortex on the contralesional side can be reduced when an ipsilesional item appears simultaneously, even though the visual cortex is not lesioned. Interestingly, although none of the critical mechanisms contributing to the preview benefit in the model was directly subject to lesioning, they were indirectly affected by the lesion, particularly the process of frequency adaptation. This is due to the reduction of the activity in the contralesional side of the location map, which resulted in passive inhibition (caused by frequency adaptation) being reduced. This in turn meant that old distractors could remain available to compete with new targets, making target selection more difficult.
This argument was further investigated in Simulation 2.
Simulation 2: Preview Search with Old and New Stimuli in the Same or Different Fields
Olivers and Humphreys (2004) reported one result that helps to throw light on the factors leading to impaired preview search in PPC patients. In their Experiment 3, they carried out an orthogonal manipulation of whether the old and new search stimuli fell in the contra- and ipsilesional fields of the patients. One account of why preview search might be disrupted after PPC damage can be couched in terms of impaired disengagement of attention. Posner et al. (1984) originally reported that patients with PPC lesions had difficulty in responding to contralesional targets, particularly when their attention was cued to the ipsilesional side, and they argued that the patients had problems in disengaging attention from the ipsilesional side of space. Now, under preview conditions, there will typically be old distractors in the ipsilesional field. Consequently, patients may have difficulty in responding to new contralesional targets because they are impaired at disengaging attention from the old ipsilesional distractors. According to this disengagement account, preview search should be particularly difficult for PPC patients when the old distractors fall on the ipsilesional side and the new target on the contralesional side. Olivers and
Humphreys (2004) did not find this, though. Instead they found that the patients were most impaired when the old and new stimuli fell in the same hemifield, irrespective of whether the target appeared on the ipsi- or contralesional side. The data argue against a spatial disengagement account of the deficit in preview search. Olivers and Humphreys (2004) put forward an alternative account, which was that PPC damage led to poor spatio-temporal segmentation of stimuli. As a consequence, performance of the patients was most impaired when their poor temporal segmentation (disrupting preview search) combined with conditions under which spatial segmentation was difficult (when the old and new items fell in the same hemifield). The precise mechanisms of spatio-temporal segmentation, however, were not specified. In Simulation 2 we evaluated whether sSoTS would give rise to a similar pattern of deficit to that observed by Olivers and Humphreys (2004), where presenting old and new items in the same hemifield was particularly disruptive to performance. In testing the effects of hemifield, we also assessed whether sSoTS could help us develop a more precise account of why spatio-temporal segmentation might be disrupted in the patients.
Method
The method was the same as for Simulation 1, except that we orthogonally varied whether old and new items appeared in the contra- and ipsilesional hemifields for the model. Due to constraints on the number of items we could present in the displays, we were confined to using displays with just 3 items: 1 old distractor and 1 new distractor plus the target. As in Simulation 1, the target could appear either on the contra- or the ipsilesional side of space, but in each case it could appear either in the same field as the old distractor (the within-field condition) or in the opposite field (the across-field condition). Figure 9.4 gives example displays from the study. There were 20 permutations of the target and distractor locations in each condition, and these were presented four times in order to assemble the data for one “participant”. Simulations were run to generate 20 participants. Only the preview condition was examined.
Figure 9.4 Example displays from Simulation 2
Note: The shaded area of each display indicates the locations that were lesioned.
Results and Discussion
The mean correct RTs (ms) and the percentage of correct trials are depicted in Figure 9.5. The error rates were low in this experiment due to the small display sizes that were presented (see also Simulation 1, Figure 9.5). These data simulate the results reported by Olivers and Humphreys (2004, Experiment 3). There were strong effects of whether old and new items appeared in the same or in opposite hemifields, and presenting targets in the same field as the old distractors was sufficient to overcome any advantage for the targets appearing on the ipsilesional side. Performance was easiest when the preview appeared in the contralesional field and the new targets in the ipsilesional field. As with Olivers and Humphreys (2004), the data contradict an attentional disengagement account of performance (cf. Posner et al., 1984). According to the disengagement account, performance should be most difficult in the contralesional, across-field condition (when the old items appeared on the ipsilesional side and the target fell in the contralesional field).
Figure 9.5 The mean correct RTs (ms) from Simulation 2
Rather than a disengagement account, sSoTS offers a different proposal. According to sSoTS the speed of search is determined by competition between the distractors and the target. The degree to which the old (previewed) distractor competes with the target is influenced by (a) whether this distractor is suppressed at the time when the search display appears, along with (b) the relative magnitude of activation for this distractor compared with the new target. Old distractors falling in the contralesional field will not be fully suppressed because the corresponding neurons will not be activated enough to induce frequency adaptation; in addition, a contralesional item also generates reduced activity compared with an ipsilesional stimulus, and thus any accrual of activation in the location units also takes more time. These different factors can combine to generate the observed pattern of results. Search is easiest for an ipsilesional target following a contralesional preview (the ipsilesional, across-field condition) because the contralesional preview does not provide strong competition for selection, even if it is not suppressed at the time the search display appears, when activation accrues for the ipsilesional target. Search is most difficult for a contralesional target following a contralesional preview (the contralesional, within-field condition) because the target has relatively weak activation and the distractor is not strongly suppressed at the time the target activation is accruing. Search for a contralesional target is better in the across-field condition because the ipsilesional preview is suppressed at the time activation for the target accrues. This situation changes, however, with an ipsilesional target. In this case, a target can sometimes accrue activation before the ipsilesional distractor is suppressed, in which case the target suffers competition and slowed selection. 
In conclusion, for sSoTS there is an interplay of influences that determine search efficiency, based on the relative timing of suppression of the preview and of activation accrual and strength for the target. This interplay would be difficult to hypothesize without the explicit model.
Simulation 3: Effects of Reduced Neuro-transmitter Based Excitation
Patients with unilateral PPC lesions not only manifest problems in selecting stimuli on the contralesional side of space, but can also show impairments in selection on their ipsilesional side (Duncan et al., 1999; Husain et al., 1997; Robertson, 1994). There are also clear demonstrations that ipsilesional items can abnormally increase their saliency (Ladavas et al., 1990; Snow and Mattingley, 2006), altering the way in which these stimuli are selected in patients relative to control participants. These non-spatial deficits after unilateral PPC damage have been linked both to imbalances in spatial competition for selection (for example, in cases of ipsilesional “capture”) and to the presence of additional functional deficits, including impaired visuo-spatial working memory (Husain et al., 2001; Wojciulik et al., 2001) and impaired arousal (Robertson and Manley, 1999). Indeed impairments in arousal and sustained attention are important predictors of the degree of clinical deficit in neglect patients (Husain and Rorden, 2003), and spatial neglect can be
reduced by temporarily increasing non-spatial, phasic arousal (Robertson et al., 1998). Posner and Petersen (1990), for example, propose that arousal is modulated through a norepinephrine (NE) system that is localized largely in the right hemisphere, and that damage to the right PPC can disrupt the operation of the neuro-transmitters modulating arousal. The consequence is that patients can have a conjoint problem both in spatial selection and in aspects of non-spatial selection particularly dependent on maintained arousal. In Simulation 3 we used sSoTS to provide an existence-proof test of this last proposal by reducing spontaneous activity modulated by the NMDA excitatory current. We assumed that this reduction in neuro-transmitter regulation could be caused by a right PPC lesion, in addition to the effect of the lesion on selection mediated by the location map. We examined whether reductions in NMDA led to non-spatial deficits emerging in performance.
Method
The method was the same as in Simulation 1, except that we reduced excitatory NMDA-based activity throughout the model. Performance was tested under conditions of single feature, conjunction and preview search.
Results and Discussion
The mean correct RTs (ms) and the percentage correct responses are presented in Figure 9.6. Note that there were no correct detections of a contralesional target at display size 6 in the conjunction condition; hence data were averaged across the display sizes for the RT analyses. The results show that performance generally decreased when NMDA levels were lowered (in Simulation 3), while spatial biases in selection were exacerbated, compared with when there was only a unilateral lesion to the location map (Simulation 1). This greater impairment was most evident in the conjunction condition in accuracy, though RT costs were also selectively apparent in preview search compared with the single feature baseline.
Under conditions of reduced activity, due to NMDA loss, there is increased competition between targets and distractors, and this is most detrimental to search where the competition is greatest (for example, in conjunction and preview search). One other interesting aspect of performance was that problems increased for ipsi- as well as for contralesional targets, when excitatory modulation decreased (in Simulation 3). This provides evidence for a non-spatial deficit in selection, additional to any effects of the spatially-selective lesion on the location map. As we noted in the Introduction, there is neuropsychological evidence for altered neuro-transmitter modulation in patients showing unilateral neglect, which can be improved by appropriate drug treatment (Malhotra et al., 2005). Simulation 3 illustrates how reduced neuro-transmitter modulation may impact on the mechanisms of visual selection.
Figure 9.6 (a) The mean correct RTs (ms) and (b) the mean percentage miss responses for Simulation 3 (variation in the NMDA parameter)
General Discussion
In the current work we present data on the effects of lesioning the sSoTS model matching data from human patients with PPC damage. sSoTS models human performance at a neural level, providing a fine-grained account of the microgenesis of visual selection. The model has previously been shown to simulate aspects of normal visual search over space and time (Mavritsaki et al., 2006, 2007). Here we “lesioned” the model by removing neurons at selective locations in the location map. Subsequently, sSoTS was able to simulate a range of visual selection experiments while, in addition, providing novel interpretations of some experimental results (for example, Olivers and Humphreys, 2004). In Simulation 1 there was clear evidence that, after lesioning, conjunction search worsened relative to single feature search, particularly for targets on the contralesional side. This result has been reported in patients (Eglin et al., 1989; Friedman-Hill et al., 1995; Humphreys and Muller, 1993; Riddoch and Humphreys, 1987), where it has been interpreted as a deficit in binding features (for example, Friedman-Hill et al., 1995). In sSoTS the deficit comes about not because of a problem in binding but because there is more competition for selection in conjunction compared with single feature search, so that conjunction search suffered most when the lesion reduced the accrual of activation for the target. The simulation provides an existence proof that a selective problem in conjunction compared with single feature search is not necessarily due to poor binding. Additionally, preview search was selectively disrupted compared with the single feature search task. This pattern of results has been observed in PPC patients (Humphreys et al., 2006b; Olivers and Humphreys, 2004). Interestingly, this deficit in preview search in the model comes about through indirect effects on the factors critical to preview search.
More specifically, preview search is disrupted because there is a breakdown in the process of passive inhibition, caused by the frequency adaptation mechanism. The combined effects of these factors were demonstrated in Simulation 2 where we orthogonally varied the field where the old and the new items appeared. Similarly to Olivers and Humphreys (2004), we found that performance was best when the old and new stimuli appeared in opposite hemifields. For sSoTS these results occurred because (a) old items in the contralesional field were not strong competitors for new items in the ipsilesional field, and (b) old items in the ipsilesional field were subject to frequency adaptation by the time activation accrued for new, contralesional stimuli, so competition was reduced. The data contradict an account of preview search in terms of impaired disengagement of attention from ipsilesional stimuli. In addition to demonstrating selective spatial deficits in attention, the present simulations show how non-spatial deficits can emerge after unilateral right PPC damage, assuming that right PPC damage disrupts neuro-transmitter regulation (cf. Posner and Petersen, 1990). When there was a reduction in the excitatory NMDA channel, impairments in selection were generally exacerbated. In addition, non-spatial deficits on selection
occurred, with search for ipsi- as well as contralesional targets being disrupted. The results fit with the effects of low arousal on the performance of right PPC patients (Robertson and Manley, 1999). Overall, the data indicate that the sSoTS model can provide a powerful framework for integrating some of the different symptoms found after PPC damage, covering effects on serial search and on temporal selection, and the effects of reduced neuro-transmitter modulation in neuropsychological patients. In this respect, the model adds to other explicit simulations of neglect and extinction (Heinke and Humphreys, 2003; Mozer et al., 1997; Pouget and Sejnowski, 1997) while offering a first account of temporal as well as spatial aspects of performance. The model indicates how factors such as frequency adaptation may play an important role in visual selection, in addition to competitive and co-operative interactions between processing units. In addition, the simulations of neglect here point to the importance of interactions between different parts of the model in determining output. In addition to contributing to our functional understanding of visual selection, sSoTS advances prior work on neglect by being more closely aligned than previous models to physiological properties of neural systems. As a consequence, sSoTS offers the possibility of a more detailed analysis than before of neuropsychological data in terms of the underlying neural pathology. An example of this here is provided by Simulation 3, where we evaluated the effects of reduced excitatory neuro-transmitter operations and showed that deficits emerged in selecting ipsi- as well as contralesional stimuli. Malhotra and colleagues (Malhotra et al., 2005) assessed the effects on neglect of using a drug that increased excitatory neuro-transmitter function and found that the deficits in spatial selection decreased. sSoTS can provide a theoretical framework for such effects.
It is also possible to use models such as sSoTS to predict the haemodynamic response function measured in fMRI experiments (Deco et al., 2004; Humphreys et al., in press), enabling the model to be tested using neural as well as behavioural data. An exciting possibility will be to examine changes in the haemodynamic response function after neural damage, to provide an account of structure-function relations in patients.
Acknowledgements
This work was supported by grants from the BBSRC, MRC and Stroke Association (UK).
References
Abeles, M. 1991. Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge, UK: Cambridge University Press.
Agter, A. and M. Donk. 2005. ‘Prioritized selection in visual search through onset capture and color inhibition: evidence from a probe-dot detection task’, Journal of Experimental Psychology: Human Perception and Performance, 31 (4): 722–30.
Ahmed, B., J. Anderson, R. Douglas, K. Martin, and D. Whitteridge. 1998. ‘Estimates of the net excitatory currents evoked by visual stimulation of identified neurons in cat visual cortex’, Cerebral Cortex, 8 (5): 462–76.
Allen, H. A. and G. W. Humphreys. 2007. ‘A psychological investigation into the preview benefit in visual search’, Vision Research, 47: 735–45.
Brunel, N. and X. Wang. 2001. ‘Effects of neuromodulation in a cortical network model of object working memory dominated by recurrent inhibition’, Journal of Computational Neuroscience, 11 (1): 63–85.
Chelazzi, L., E. Miller, and J. Duncan. 1993. ‘A neural basis for visual search in inferior temporal cortex’, Nature, 363 (6427): 345–47.
Deco, G. and E. Rolls. 2005. ‘Neurodynamics of biased competition and cooperation for attention: a model with spiking neurons’, Journal of Neurophysiology, 94 (1): 295–313.
Deco, G. and J. Zihl. 2001. ‘Top-down selective visual attention: a neurodynamical approach’, Visual Cognition, 8 (1): 119–40.
Deco, G., E. Rolls, and B. Horwitz. 2004. ‘Integrating fMRI and single-cell data of visual working memory’, Neurocomputing, 58–60: 729–37.
Duncan, J., G. W. Humphreys, and R. Ward. 1997. ‘Competitive brain activity in visual attention’, Current Opinion in Neurobiology, 7 (2): 255–61.
Duncan, J., C. Bundesen, A. Olson, G. W. Humphreys, S. Chavda, and H. Shibuya. 1999. ‘Systematic analysis of deficits in visual attention’, Journal of Experimental Psychology, 128: 1–29.
Eglin, M., L. C. Robertson, and R. D. Rafal. 1989. ‘Visual search performance in the neglect syndrome’, Journal of Cognitive Neuroscience, 1: 372–85.
Ellis, R. and G. W. Humphreys. 1999. Connectionist Psychology. London.
Friedman-Hill, S. R., L. C. Robertson, and A. Treisman. 1995. ‘Parietal contributions to visual feature binding: evidence from a patient with bilateral lesions’, Science, 269 (5225): 853–55.
Gilchrist, I., G. W. Humphreys, and M. J. Riddoch. 1996. ‘Grouping and extinction: evidence for low level modulation of selection’, Cognitive Neuropsychology, 13: 1223–56.
Heinke, D. and G. Humphreys. 2003. ‘Attention, spatial representation and visual neglect: simulating emergent attention and spatial memory in the selective attention for identification model (SAIM)’, Psychological Review, 110 (1): 29–87.
Humphreys, G. 1998. ‘Neural representation of objects in space: a dual coding account’, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 353 (1): 1341–51.
Humphreys, G. W. and C. J. Price. 1994. ‘Visual feature discrimination in simultanagnosia: a study of two cases’, Cognitive Neuropsychology, 11 (4): 393–434.
Humphreys, G. W. and H. J. Muller. 1993. ‘Search via Recursive Rejection (SERR): a connectionist model of visual search’, Cognitive Psychology, 25: 43–110.
Humphreys, G. W. and M. J. Riddoch. 1993. ‘Interactions between object and space vision revealed through neuropsychology’, in D. E. Meyer and S. Kornblum (eds), Attention and Performance XIV, pp. 301–18. Hillsdale, N. J.: Erlbaum.
Humphreys, G. W. and M. J. Riddoch. 1994. ‘Attention to within-object and between-object spatial representations: multiple sites for visual selection’, Cognitive Neuropsychology, 11 (2): 207.
Humphreys, G. W. and M. J. Riddoch. 1995. ‘The old town no longer looks the same: the effects of brain damage on the computation of visual similarity’, in C. Cacciari (ed.), Similarity in Language: Thought and Perception, p. 157. San Marino: Brepols.
Humphreys, G. W., B. Jung Stalmann, and C. N. L. Olivers. 2004. ‘An analysis of the time course of attention in preview search’, Perception and Psychophysics, 66 (5): 713–30.
Humphreys, G. W., E. Mavritsaki, D. G. Heinke, and G. Deco. (in press). ‘The application of neural-level models to human visual search: modelling whole system behaviour, neuropsychological breakdown and BOLD signal activation’, in D. G. Heinke and E. Mavritsaki (eds), Bridging the Gap Between Physiology and Cognition. London: Psychology Press.
Humphreys, G. W., C. N. L. Olivers, and J. J. Braithwaite. 2006a. ‘The time course of preview search with color-defined, not luminance-defined stimuli’, Perception and Psychophysics, 68 (8): 1351–58.
Humphreys, G. W., C. N. L. Olivers, and E. Young Yoon. 2006b. ‘An onset advantage without a preview benefit: neuropsychological evidence separating onset and preview effects in search’, Journal of Cognitive Neuroscience, 18 (1): 110–20.
Humphreys, G. W., D. G. Watson, and P. Jolicoeur. 2002. ‘Fractionating visual marking: dual task decomposition of the marking state by timing and modality’, Journal of Experimental Psychology: Human Perception and Performance, 28: 640–60.
Husain, M. and C. Rorden. 2003. ‘Non-spatially lateralized mechanisms in hemispatial neglect’, Nature Reviews Neuroscience, 4: 26–36.
Husain, M., S. Mannan, T. Hodgson, E. Wojciulik, J. Driver, and C. Kennard. 2001. ‘Impaired spatial working memory across saccades contributes to abnormal search in parietal neglect’, Brain, 124 (5): 941–52.
Husain, M., K. Shapiro, J. Martin, and C. Kennard. 1997. ‘Abnormal temporal dynamics of visual attention in spatial neglect patients’, Nature, 385 (6612): 154–56.
Itti, L. and C. Koch. 2000. ‘A saliency-based search mechanism for overt and covert shifts of visual attention’, Vision Research, 40 (10–12): 1489–506.
Karnath, H. O. 1988. ‘Deficits of attention in acute and recovered visual hemi-neglect’, Neuropsychologia, 26 (3): 27–43.
Karnath, H. O., M. Himmelbach, and W. Kuker. 2003. ‘The cortical substrate of visual extinction’, Neuroreport, 14: 437–42.
Ladavas, E., A. Petronio, and C. Umilta. 1990. ‘The deployment of visual attention in the intact field of hemineglect patients’, Cortex, 26 (3): 353–66.
Liu, Y. and X. Wang. 2001. ‘Spike-frequency adaptation of a generalized leaky integrate-and-fire model neuron’, Journal of Computational Neuroscience, 10 (1): 25–45.
Madison, D. and R. Nicoll. 1984. ‘Control of the repetitive discharge of rat CA1 pyramidal neurons in vitro’, Journal of Physiology, 345 (SEP): 319–31.
Malhotra, P., A. D. Parton, R. Greenwood, and M. Husain. 2005. ‘Noradrenergic modulation of space exploration in visual neglect’, Annals of Neurology, 59: 186–90.
Mavritsaki, E., D. Heinke, G. Humphreys, and G. Deco. 2007. ‘Suppressive effects in visual search: a neurocomputational analysis of preview search’, Neurocomputing, 70 (10–12): 1925–31.
Mavritsaki, E., D. Heinke, G. W. Humphreys, and G. Deco. 2006. ‘A computational model of visual marking using an interconnected network of spiking neurons: the spiking search over time and space model (sSoTS)’, Journal of Physiology Paris, 100 (1–3): 110–24.
Mozer, M. C. and M. Behrmann. 1990. ‘On the interaction of selective attention and lexical knowledge: a connectionist account of neglect dyslexia’, Journal of Cognitive Neuroscience, 2: 96–123.
Mozer, M. C., P. W. Halligan, and J. C. Marshall. 1997. ‘The end of the line for a brain damaged model of unilateral neglect’, Journal of Cognitive Neuroscience, 9 (2): 171–90.
Olivers, C. N. L. and G. W. Humphreys. 2004. ‘Spatiotemporal segregation in visual search: evidence from parietal lesions’, Journal of Experimental Psychology, 30 (4): 667–88.
Posner, M. and Y. Cohen. 1984. ‘Components of visual orienting’, in H. Bouma and D. Bouwhuis (eds), Attention and Performance X: Control of Language Processes, pp. 531–56. New Jersey: Lawrence Erlbaum Assoc.
Posner, M. I. and S. E. Petersen. 1990. ‘The attention system of the human brain’, Annual Review of Neuroscience, 13: 25–42.
Pouget, A. and T. J. Sejnowski. 1997. ‘A new view of hemineglect based on the response properties of parietal neurons’, Philosophical Transactions of the Royal Society, B352: 1449–59.
Prescott, S. A. and T. J. Sejnowski. 2007. ‘Effects of different types of spike frequency adaptation on spike timing and rate coding’, paper presented at the Society for Neuroscience 2007, San Diego, California.
Ress, D., B. Backus, and D. Heeger. 2000. ‘Activity in primary visual cortex predicts performance in a visual detection task’, Nature Neuroscience, 3 (12): 940–45.
Riddoch, M. J. and G. W. Humphreys. 1987. ‘Perceptual and action systems in unilateral neglect’, in M. Jeannerod (ed.), Neurophysiological and Neuropsychological Aspects of Spatial Neglect, pp. 151–83. Amsterdam: Elsevier Science.
Robertson, I. H. 1994. Cognitive Neuropsychology and Cognitive Rehabilitation, in M. J. Riddoch and G. W. Humphreys (eds), pp. 173–86. London: Psychology Press.
Robertson, I. H. and T. Manley. 1999. ‘Sustained attention deficits in time and space’, in G. W. Humphreys, J. Duncan and A. Treisman (eds), Attention, Space and Action: Studies in Cognitive Neuroscience, pp. 279–310. Oxford: Oxford University Press.
Robertson, I. H., J. B. Mattingley, C. Rorden, and J. Driver. 1998. ‘Phasic alerting of neglect patients overcomes their spatial deficit in visual awareness’, Nature, 396 (10): 169–72.
Rolls, E. and G. Deco. 2002. Computational Neuroscience of Vision. Oxford: Oxford University Press.
Rolls, E. and A. Treves. 1998. Neural Networks and Brain Function. Oxford: Oxford University Press.
Seidenberg, M. S. and J. L. McClelland. 1989. ‘A distributed, developmental model of word recognition and naming’, Psychological Review, 96 (4): 523–68.
Sejnowski, T. J. 1986. ‘Open questions about computation in cerebral cortex’, in J. L. McClelland and D. E. Rumelhart (eds), Parallel Distributed Processing: Psychological and Biological Models (Vol. 2), pp. 373–90. Cambridge, MA: MIT Press.
Servan-Schreiber, D., H. Printz, and J. D. Cohen. 1990. ‘A network model of catecholamine effects: gain, signal-to-noise ratio, and behavior’, Science, 249 (4971): 892–95.
Snow, J. and J. B. Mattingley. 2006. ‘Goal-driven selective attention in patients with right hemisphere lesions: how intact is the ipsilesional field?’, Brain Research Reviews, 129 (1): 168–81.
Treisman, A. and G. Gelade. 1980. ‘A feature-integration theory of attention’, Cognitive Psychology, 12 (1): 97–136.
Tuckwell, H. 1998. Introduction to Theoretical Neurobiology. Cambridge: Cambridge University Press.
Usher, M. and E. Niebur. 1996. ‘Modelling the temporal dynamics of IT neurons in visual search: a mechanism of top-down selective attention’, Journal of Cognitive Neuroscience, 8 (4): 311–27.
Watson, D. and G. Humphreys. 1997. ‘Visual marking: prioritizing selection for new objects by top-down attentional inhibition of old objects’, Psychological Review, 104 (1): 90–122.
Watson, D. and G. Humphreys. 2000. ‘Visual marking: evidence for inhibition using probe-dot detection paradigm’, Perception and Psychophysics, 62 (3): 471–80.
Watson, D., G. Humphreys, and C. Olivers. 2003. ‘Visual marking: using time in visual selection’, Trends in Cognitive Sciences, 7 (4): 180–86.
Watson, D. G. and G. W. Humphreys. 2005. ‘Visual marking: the effects of irrelevant changes on preview search’, Perception and Psychophysics, 67 (3): 418–34.
Wilson, F., S. O’Scalaidhe, and P. Goldman-Rakic. 1994. ‘Functional synergism between putative gamma-aminobutyrate-containing neurons and pyramidal neurons in prefrontal cortex’, Proceedings of the National Academy of Science, 91 (99): 4009–13.
Wolfe, J. M. 1994. ‘Guided search 2: a revised model of visual search’, Psychonomic Bulletin and Review, 1 (2): 202–38.
Wojciulik, E., M. Husain, K. Clarke, and J. Driver. 2001. ‘Spatial working memory deficit in unilateral neglect’, Neuropsychologia, 39 (4): 390–96.
Section III
Time Perception
Introduction
Time perception has been studied extensively in Neuroscience as well as Psychology (Block and Zakay, 1997; Ivry and Spencer, 2006; Lewis and Miall, 2002; Mauk and Buonomano, 2004; Walsh, 2003). Neuroscientific studies have focused on identification of neural structures involved in judgements about time (Berlin et al., 2003; Ivry and Spencer, 2004; Lewis and Miall, 2006; Mauk and Buonomano, 2004). In cognitive psychology, the major focus has been on two fundamentally different questions: what mechanisms are involved when one has to estimate time retrospectively versus when one knows in advance (prospective condition) that time has to be estimated. In the first case, memory is emphasized, while in the second case attention-based models are reported to be more useful. Recently, studies in cognitive neuroscience have focused on identifying the neural mechanisms underlying the cognitive processes involved in time perception. Many factors affect time perception including attention, sensory modality, memory, age, gender, and task demands (Block et al., 2000; Block and Zakay, 1997; Grondin and Rammsayer, 2003; Tse et al., 2004). This section consists of four chapters on time perception that discuss important developments in time perception in recent years. The chapter by Kielan Yarrow discusses the continuous nature of our subjective experience across eye-movements. Saccadic chronostasis is an illusion in which the duration of a visual stimulus to which an eye-movement is made is perceived to be longer than the actual duration. Many explanations have been proposed for the illusion. The author discusses the antedating account, in which the perceived duration is estimated such that the onset of the stimulus is fixed at a point prior to the initiation of the saccade, together with experimental evidence supporting the account. He presents new findings from an experiment on saccadic chronostasis that are consistent with the antedating account.
In the final section, he discusses various methodological issues involved in the study of saccadic chronostasis.
It is known that perception of time is distorted due to many factors (Tse et al., 2004; Pariyadath and Eagleman, 2007). Sometimes we feel that the passage of time is slow and sometimes we feel that time flies quickly. One common example comes from people who have experienced a traumatic event such as an accident. They report that time seemed to move in slow motion during the accident. Many studies have explored this subjective expansion of time (Tse et al., 2004). In their chapter, Pariyadath and Eagleman discuss the theoretical implications of subjective expansion of time based on their experiments involving repetitive stimuli. They argue that many mechanisms play a critical role in time perception and these mechanisms can be dissociated using appropriate experiments.
An important aspect of time perception is duration judgement. Many studies have shown that repeated presentations of a stimulus lead to reduction in response. Pariyadath and Eagleman argue that predictability plays a critical role in duration judgements. Using repetitive stimuli, they have performed multiple experiments (Pariyadath and Eagleman, 2007) to explore the role of predictability. They have argued that unpredictability, rather than the salience of the stimulus or attentional factors, produces subjective expansion of time. These predictability computations are performed in higher cortical areas. They also discuss potential diagnostic applications based on the repetition paradigm.
The experiments on subjective expansion of time as well as many other studies explore duration judgements made explicitly by human observers. In contrast, temporal judgements are made implicitly in many situations and tasks. The chapter by Trevor Penney and colleagues discusses their findings on implicit timing and their importance in understanding time perception. They discuss behavioural paradigms in which the duration between successive stimuli is manipulated. Participants depend on estimating the inter-stimulus interval for better performance even though the task is not directly dependent on the inter-stimulus interval. They also discuss findings from a stop-reaction time task involving stimuli from both auditory and visual modalities, indicating differences in parallel time judgements between the auditory and visual modalities. This is followed by a discussion on electrophysiological measures related to implicit timing based on findings from negative omission potential and mismatch negativity paradigms. In the final section, they discuss differences between explicit and implicit timing based on continuous circle drawing and tapping tasks and argue that different mechanisms are involved in implicit and explicit timing.
Many studies have explored the neural structures involved in time perception (Ivry and Spencer, 2004; Lewis and Miall, 2003; Mauk and Buonomano, 2004). Viviane Pouthas reviews the findings on brain areas involved in duration judgements based on findings from electroencephalography (EEG), functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). She discusses the link between contingent negative variation (CNV) and duration judgements using normal human participants as well as Parkinson’s disease patients. This is followed by a discussion on the brain areas involved in time perception based on fMRI studies. In the final section, she discusses a study based on combining PET and EEG indicating that the right frontal area plays a critical role in temporal processing. The four chapters together critically discuss and review the recent literature on time perception (both explicit and implicit) as well as the processes and neural structures involved in time perception.
References
Berlin, H. A., E. T. Rolls, and U. Kischka. 2004. “Impulsivity, time perception, emotion and reinforcement sensitivity in patients with orbitofrontal cortex lesions”, Brain, 127 (5): 1108–126.
Block, R. A., P. A. Hancock, and D. Zakay. 2000. “Sex differences in duration judgments: a meta-analytic review”, Memory and Cognition, 28 (8): 1333–346.
Block, R. A. and D. Zakay. 1997. “Prospective and retrospective duration judgments: a meta-analytic review”, Psychonomic Bulletin and Review, 4 (2): 184–97.
Grondin, S. and T. Rammsayer. 2003. “Variable fore-periods and temporal discrimination”, The Quarterly Journal of Experimental Psychology A, 56 (4): 731–65.
Ivry, R. B. and R. M. C. Spencer. 2004. “The neural representation of time”, Current Opinion in Neurobiology, 14 (2): 225–32.
Lewis, P. A. and R. C. Miall. 2003. “Overview: An image of human neural timing”, in W. H. Meck (ed.), Functional and Neural Mechanisms of Interval Timing, pp. 515–32. Boca Raton: CRC Press.
Lewis, P. A. and R. C. Miall. 2006. “Remembering the time: a continuous clock”, Trends in Cognitive Sciences, 10 (9): 401–06.
Mauk, M. D. and D. V. Buonomano. 2004. “The neural basis of temporal processing”, Annual Review of Neuroscience, 27: 307–40.
Pariyadath, V. and D. Eagleman. 2007. “The effects of predictability on subjective duration”, PLoS ONE, 2 (11): e1264.
Tse, P., J. Intriligator, J. Rivest, and P. Cavanagh. 2004. “Attention and the subjective expansion of time”, Perception and Psychophysics, 66 (7): 1171–189.
van Wassenhove, V., D. V. Buonomano, S. Shimojo, and L. Shams. 2008. “Distortions of subjective time perception within and across senses”, PLoS ONE, 3 (1): e1437.
Walsh, V. 2003. “Time: the back-door of perception”, Trends in Cognitive Sciences, 7 (8): 335–38.
Zakay, D. and R. A. Block. 1996. “The role of attention in time estimation processes”, in M. A. Pastor and J. Artieda (eds), Time, Internal Clocks and Movement, pp. 143–64. Amsterdam: Elsevier Science.
Zakay, D. and R. A. Block. 1997. “Temporal cognition”, Current Directions in Psychological Science, 6: 12–16.
Chapter 10
Continuity of Subjective Experience Across Eye Movements: Temporal Antedating Following Small, Large, and Sequential Saccades
Kielan Yarrow
Introduction
When people make a saccadic eye movement to fixate a new visual target, they overestimate the duration for which that target is perceived (Yarrow et al., 2001). This illusion, known as saccadic chronostasis, has been demonstrated using the following basic procedure. Subjects make a saccade to a target that changes form or colour during the saccade. They judge the duration of the new target stimulus relative to subsequently presented reference stimuli, and these judgements are used to determine a point of subjective equality (PSE). This is the point at which target and reference stimuli are perceived to have identical durations. The same task performed at fixation forms a control. Reduced PSEs in saccade compared to control conditions imply temporal overestimation of the post-saccadic stimulus. A similar effect can also be observed in a more everyday setting. The “stopped clock” illusion occurs when we glance at a watch with a moving second hand and feel, just for a moment, that it has stopped working. This experience is one that many people recognize, and prompted the first investigations of saccadic chronostasis. It does not occur every time we look at our watch, but only on those occasions when the watch hand (or a digital counter) changes just before or during the saccade (Rothwell, 1997). In these cases, the next interval seems to exceed one second in duration. Aside from explaining this common perceptual experience, why study saccadic chronostasis? In this paper, I will suggest that the illusion helps explain how it is that our visual experience consists of a seamless progression of fixations without any intervening
saccadic gaps. I will develop this account in the following manner. Firstly, I will briefly describe and interpret various published saccadic chronostasis experiments. Next, I will report a previously unpublished experiment which illustrates some key attributes of the basic phenomenon. To conclude, I will discuss some methodological points that bear on the interpretation of saccadic chronostasis experiments.
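The PSE logic described in the Introduction can be made concrete with a small sketch. The response proportions below are invented purely for illustration and are not data from any experiment. The function name, the 1000 ms reference, and the use of linear interpolation to find the 50% point are all assumptions of this sketch; fitting a psychometric function would be a common alternative.

```python
def point_of_subjective_equality(durations_ms, p_judged_longer):
    """Return the target duration at which 'longer than reference' responses reach 50%.

    Uses linear interpolation between the two tested durations that bracket
    50% (an assumption of this sketch, not the published analysis).
    """
    pairs = sorted(zip(durations_ms, p_judged_longer))
    for (d0, p0), (d1, p1) in zip(pairs, pairs[1:]):
        if p0 <= 0.5 <= p1 and p1 > p0:
            return d0 + (0.5 - p0) * (d1 - d0) / (p1 - p0)
    raise ValueError("response proportions never cross 50%")


# Hypothetical proportions of "target longer than a 1000 ms reference" responses.
durations = [400, 600, 800, 1000, 1200, 1400, 1600]
control = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]  # judged at fixation (no saccade)
saccade = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0]  # target fixated via a saccade

pse_control = point_of_subjective_equality(durations, control)
pse_saccade = point_of_subjective_equality(durations, saccade)

# A lower PSE after a saccade means a physically shorter target already feels
# as long as the reference, i.e. its duration is overestimated.
overestimate_ms = pse_control - pse_saccade
print(pse_control, pse_saccade, overestimate_ms)
```

In this toy data set the saccade curve is simply the control curve shifted towards shorter durations, so the difference between the two PSEs gives the size of the illusory extension.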
The antedating account: we feel like we see the target of a saccade from the moment we move our eyes towards it
Using the basic methodology described above, Yarrow et al. (2001) found that subjects overestimated the duration of a stimulus they had just fixated with a rapid eye movement. Subjects made saccades of either 22° or 55° extent and judged the duration of a post-saccadic stimulus. They made the same judgement in two control conditions involving fixation at an identical orbital eccentricity. The size of the resultant bias was found to depend upon the duration of the saccade. The bias was greater in the large saccade condition than in the small saccade condition, and this difference was very similar to the difference in saccade durations. This finding is consistent with the hypothesis that an illusory timeline of events is being recalled following a saccade. Observers do not report durations consistent with having perceived the post-saccadic stimulus at the moment it was foveated (the end of the saccade) or even at the moment it first appeared (during the saccade). Instead, they report durations consistent with having seen this stimulus approximately 50 ms before they moved their eyes. I will refer to this as the antedating hypothesis. The saccade length effect is consistent with the antedating hypothesis, but is not conclusive on its own. Problems of interpretation arise because the measure that is being used (the perceived duration of the post-saccadic stimulus) cannot be unambiguously related to the perceptual event that the hypothesis suggests is being temporally repositioned (the onset of the post-saccadic target). In physics, the duration of an interval can only be changed by adjusting the time at which the events that bound that interval occur. Psychologically, however, this is not the case: perceived duration can be adjusted by a number of non-temporal factors (Allan, 1979).
Many theorists relate these changes in perceived time to the rate at which some hypothetical internal clock is functioning (Treisman et al., 1990; Wearden et al., 1998). Hence saccadic chronostasis effects could reflect a change in clock rate rather than temporal antedating (Hodinott-Hill et al., 2002). This issue has been addressed in two ways. Firstly, if saccadic chronostasis is the result of a change in clock rate, the size of the effect should depend on the duration of the post-saccadic interval that is being judged. This follows because subjective time will equal objective time multiplied by clock rate. In fact, the size of the effect is constant and independent of stimulus duration (Yarrow et al., 2004a). This finding does not rule out a transient increase in clock rate, but it would need to be very brief and very intense.
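The contrast between the two accounts can be sketched numerically. Under a sustained clock-rate change, subjective time equals objective time multiplied by clock rate, so the overestimate should grow with the judged interval. Under antedating, the onset is shifted back by a fixed amount, so the overestimate should be constant. The clock rate, interval lengths and saccade durations below are hypothetical values chosen for illustration; the 50 ms lead is the approximate figure quoted in the text, and measuring the extension from the end of the saccade is an assumption of this sketch.

```python
def clock_rate_overestimate(interval_ms, clock_rate):
    """Clock-rate account: subjective time = objective time * clock rate."""
    return interval_ms * clock_rate - interval_ms


def antedating_overestimate(saccade_ms, lead_ms=50.0):
    """Antedating account: perceived onset is moved back to lead_ms before
    saccade onset, so the extension (measured from the end of the saccade)
    does not depend on how long the stimulus remains visible."""
    return saccade_ms + lead_ms


intervals = [500, 1000, 1500]   # hypothetical post-saccadic intervals (ms)
small, large = 70.0, 140.0      # hypothetical saccade durations (ms)

# Clock-rate prediction: the bias scales with the interval being judged.
clock = [clock_rate_overestimate(t, clock_rate=1.1) for t in intervals]

# Antedating prediction: the same bias at every interval.
ante = [antedating_overestimate(small) for _ in intervals]

# Antedating also predicts that the difference in bias between large and
# small saccades matches the difference in saccade durations.
size_effect = antedating_overestimate(large) - antedating_overestimate(small)
print(clock, ante, size_effect)
```

Only the second pattern, a constant bias across intervals together with a saccade-size effect equal to the difference in saccade durations, corresponds to the findings described in the text.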
A second set of experiments provided more direct evidence supporting the antedating account. Yarrow et al. (2006a) carried out a typical saccadic chronostasis experiment in which subjects made saccades of either 10° or 50° extent. The same subjects also completed an experiment in which a brief auditory stimulus (a beep) was sounded around the time they moved their eyes. In this case, their task was to judge whether the auditory stimulus came before or after they first saw the post-saccadic visual stimulus (that is, a cross-modal temporal order judgement). A large bias emerged in both experiments, compared to control conditions requiring a similar cross-modal temporal order judgement or duration judgement, but no saccade. In the standard chronostasis experiment, the post-saccadic stimulus had an extended subjective duration compared to control conditions. In the temporal order judgement experiment, the beep had to be sounded before the post-saccadic target was foveated in order to appear synchronous with it. In both experiments, effects were larger following large saccades than following small saccades. Hence two completely different tasks applied to the same experimental situation provided consistent evidence that an illusory timeline was being recalled for the period containing the saccade. The temporal order judgement task is explicitly an event judgement task. It therefore circumvents the problem that the chronostasis effect has previously been measured using interval judgements as an implicit index of subjective events. There are a number of other observations that can inform, or be interpreted within, the antedating framework. During a saccade, visual input is highly degraded. High spatial frequency visual information is smeared by the rapid movement of the eye, while low spatial frequency visual information is subjected to an active process of saccadic suppression (Ross et al., 2001). 
The visual input is further suppressed as a result of backwards masking by the post-saccadic image (Campbell and Wurtz, 1978). These results explain our failure to perceive motion during a saccade, but not our failure to experience any interruption of normal vision during this interval. The recollection of an illusory timeline across a saccade seems to provide the final piece in this puzzle, explaining why we have no visual experience at all corresponding to the period our eyes are in motion. This account seems to imply a retrospective interpretative process operating on uncertain (degraded) sensory data from the period of a saccade. Such high-level processes depend upon prior expectations (for example, Yang and Purves, 2003). One might therefore expect the illusion to be modulated when sensory evidence is available that contradicts these expectations. Yarrow et al. (2001) found that saccadic chronostasis does not occur when the saccade target is noticeably displaced (that is, jumped horizontally by around 3°) at the moment it changes form during the saccade. This manipulation may have violated the brain’s expectations about the stability of the external world across eye movements and therefore undermined the illusion. Experiments such as those reported above reveal the timeline of events the brain typically infers across a saccade, but a question remains about exactly what we recall having actually seen in the saccadic interval. Recent experiments begin to address this
issue. Chronostasis was observed following saccades to a moving object, but subjects did not perceive a corresponding period of stimulus motion filling the saccadic gap (Yarrow et al., 2006b), at least to the extent that their percepts could be inferred from judgements about the position of the moving stimulus. It appears, then, that our perception of the timing of events can be adjusted without requiring a complementary adjustment to spatial vision, an example of the way different stimulus properties can become dissociated in conscious perception.
In another set of experiments, Yarrow et al. (2004b) found that the saccadic chronostasis effect could be obtained with a similar magnitude for many different kinds of saccade (self-timed saccades, pro- and anti-saccades, even express saccades). These experiments offer some insight into the possible neural locus of the effect. An extensive network of brain areas is involved in the production of saccades, but express saccades (those elicited in a gap paradigm with a latency of 70 to 130 ms; Fischer and Ramsperger, 1984) are generally held to be generated in exclusively subcortical regions (Hopp and Fuchs, 2002). The antedating hypothesis predicts that an efference copy signal relating to the saccade must be transmitted to brain regions that can generate an illusory timeline. The finding of saccadic chronostasis following express saccades suggests that this signal may originate in the superior colliculus.
While a subcortical signal may trigger chronostasis, the retrospective adjustment of perceptual content is presumably generated elsewhere. So where is this signal transmitted to? The experience of saccadic chronostasis may perhaps reflect receptive field shifts of visual neurones. 
These were first described in the lateral intraparietal area (LIP) of behaving monkeys (Duhamel et al., 1992) and have been found more recently in a number of other brain areas (Walker et al., 1995; Umeno and Goldberg, 1997; Nakamura and Colby, 2002). Some apparently retinocentric cells in these areas begin to respond before a saccade has been initiated to stimuli at locations that the saccade will bring into their receptive fields. The timing of this pre-saccadic activity varies widely across cells, but a brain region capable of averaging these neurones’ initial responses to a post-saccadic stimulus could produce the illusory temporal memory underlying saccadic chronostasis. The idea that receptive field shifts arise in response to an efference copy signal from the superior colliculus is physiologically plausible (Sommer and Wurtz, 2002) but the part played by such cells in producing saccadic chronostasis remains hypothetical.
An Interval Estimation Experiment Involving Small, Large, and Double Saccades
The following experiment is presented with three purposes in mind. First, because the experiment tested small and large saccades, it provides a replication of the key finding that prompted us to formulate the antedating hypothesis. Second, the experiment includes a new double saccade condition. This condition will permit some additional inferences
about the nature of the efference copy signal that may trigger the mental operations which yield saccadic chronostasis. Third, presenting the methods for a typical (early) saccadic chronostasis experiment here will facilitate the subsequent discussion of methodological issues.
Materials and Methods
Participants
The experiment was completed by four subjects (three male, mean age = 32.8, SD = 9.7) including the author (KY) and one participant with detailed knowledge of saccadic chronostasis (JR). All participants had completed a number of previous experiments investigating saccadic chronostasis.
Apparatus
Subjects faced a 22” CRT colour monitor refreshing at 60 Hz. A chin rest ensured an eye to screen distance of 41 cm. Horizontal eye movements were recorded using DC electro-oculography (7A22 amp: Tektronix; low-pass filtered at 100 Hz) and sampled at 200 Hz. Stimuli (circles and crosses) were black or red on a white background, subtending 1.1°.
Design
A repeated-measures design was employed with four conditions (short saccade, long saccade, double saccade, control). Trials from each condition were presented in separate blocks, with ten blocks per condition. Testing spanned five sessions on separate days with two blocks completed from each condition in each session. The eight blocks completed in a session were presented in a random order.
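The block structure described in the design can be sketched as follows. The text states only that the eight blocks in a session were presented in a random order, so the use of a seeded shuffle here is an assumption, as are the function names.

```python
import random

CONDITIONS = ["short saccade", "long saccade", "double saccade", "control"]


def session_blocks(rng, blocks_per_condition=2):
    """Return one session: two blocks per condition, shuffled (assumed method)."""
    blocks = [c for c in CONDITIONS for _ in range(blocks_per_condition)]
    rng.shuffle(blocks)
    return blocks


def full_schedule(n_sessions=5, seed=0):
    """Return the block order for every session of the experiment."""
    rng = random.Random(seed)  # seed is arbitrary, for reproducibility only
    return [session_blocks(rng) for _ in range(n_sessions)]


schedule = full_schedule()
for number, blocks in enumerate(schedule, start=1):
    print(f"Session {number}: {blocks}")
```

Five sessions of eight blocks give the ten blocks per condition stated above.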
Procedure (see Figure 10.1)
Figure 10.1 Schematic of the experimental task in saccade and control conditions
Note: Stimuli shown in grey were actually displayed in red.
In long saccade blocks, subjects fixated a red cross on one side of the screen, initiated the trial with a mouse key press, waited at least 500 ms, then made a 50° voluntary saccade towards a black cross on the far side of the screen. Eye movement triggered the black cross to be replaced with a circle when the saccade had travelled one third of the distance to target. The circle remained on screen for a variable duration (200 to 1800 ms). It then disappeared, to be replaced by an identical circle (the comparison stimulus) after 500 ms. This stimulus was displayed for 1000 ms. Subjects indicated whether they saw the first circle for a longer or shorter time than they saw the comparison circle. The duration of the first circle was controlled by two modified binary search procedures (MOBS, Tyrrell and Owens, 1988: low boundary 200 ms, high boundary 1800 ms, first presentation random 600 to 1400 ms, five reversals to terminate). MOBS procedures are adaptive, effectively homing in on a PSE based on the responses subjects make. The two independent MOBS procedures were run in a concurrent interleaved fashion, with random switching between runs until one MOBS had terminated. The direction of the saccade (leftwards or rightwards) alternated every trial. Blocks finished when both MOBS had terminated.
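The general logic of an adaptive, interleaved procedure of this kind can be illustrated with a deliberately simplified sketch. It is not the published MOBS algorithm of Tyrrell and Owens (1988): it uses plain bisection between the stated boundaries, takes the stated random first presentation and five-reversal stopping rule, and adds an assumed safety cap on trials. The simulated observer, who always answers "longer" above a fixed PSE, is also hypothetical; real observers respond with noise.

```python
import random


class BisectionStaircase:
    """Plain bisection between fixed bounds (a stand-in, not the MOBS rules)."""

    def __init__(self, rng, low=200.0, high=1800.0, max_reversals=5, max_trials=60):
        self.low, self.high = low, high
        self.current = rng.uniform(600.0, 1400.0)  # random first presentation
        self.max_reversals = max_reversals
        self.max_trials = max_trials  # assumed safety cap, not in the text
        self.reversals = 0
        self.trials = 0
        self.last_response = None

    @property
    def finished(self):
        return self.reversals >= self.max_reversals or self.trials >= self.max_trials

    def update(self, judged_longer):
        """Record one response, then move to the midpoint of the remaining range."""
        if self.last_response is not None and judged_longer != self.last_response:
            self.reversals += 1
        self.last_response = judged_longer
        self.trials += 1
        if judged_longer:
            self.high = self.current
        else:
            self.low = self.current
        self.current = (self.low + self.high) / 2.0


def run_block(observer_pse_ms, seed=0):
    """Interleave two staircases at random until both have terminated."""
    rng = random.Random(seed)
    staircases = [BisectionStaircase(rng), BisectionStaircase(rng)]
    while True:
        active = [s for s in staircases if not s.finished]
        if not active:
            break
        staircase = rng.choice(active)
        # Hypothetical noise-free observer: "longer" whenever above their PSE.
        staircase.update(judged_longer=staircase.current > observer_pse_ms)
    return [s.current for s in staircases]


estimates = run_block(observer_pse_ms=800.0)
print(estimates)
```

Each block therefore yields two termination values, which is why ten blocks per condition produce twenty estimates.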
Saccade start/end points were calculated automatically using a velocity criterion. Trials where the first saccade recorded did not exceed 80 per cent of the total distance recorded (summed across all detected saccades) were excluded online and repeated immediately. Trials in which subjects initiated their saccade within 500 ms of their key press were also rejected. In a few blocks in which EOG noise was problematic an optional bidirectional second order 30 Hz low-pass Butterworth digital filter was incorporated prior to start/end point detection. For short saccade trials, an identical procedure was followed except that the initial fixation cross appeared 15° from the saccade target cross (consequently on the same side of the subject’s midline). For double saccade trials, initial fixation was as in the long saccade condition, but two target crosses were displayed. Subjects were required to make two saccades in rapid succession, pausing only very briefly at the intermediate target. The first saccade was 35° in extent, the second 15°. The change of target stimulus was triggered one third of the way into the second saccade (after four-fifths of the total saccadic distance). Trials where the first and second saccades detected did not exceed 56 per cent and 24 per cent respectively of the total distance recorded were excluded. In control (constant fixation) trials, subjects initially fixated a cross at equivalent eccentricity to the target cross used in all other conditions. It was blanked 400 ms after the subject’s mouse key press, then replaced after a further 100 ms by the to-be-judged circle, with subsequent stimulus presentation and subject responses as per saccade trials. The position of the fixation cross (left or right of screen) alternated every trial. Blocks were of variable length, typically 12 to 40 trials (excluding those rejected). Twenty subjective second estimates per condition were recorded as the experiment proceeded (one for each MOBS termination). 
In the saccade conditions, each estimate was corrected post hoc to match the time the first circle was on screen following target foveation by subtracting the average time the eye was in motion following the triggered change to a circle (averaged across all blocks).
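The online trial screening described above can be sketched as follows. The 200 Hz sampling rate and the 80 per cent rule come from the text; the velocity threshold of 30° per second, the function names and the example traces are assumptions made for illustration, since the actual criterion value is not reported.

```python
SAMPLE_RATE_HZ = 200  # sampling rate given in the apparatus description


def detect_saccades(positions_deg, velocity_threshold=30.0):
    """Return (start, end) sample indices of supra-threshold movement.

    velocity_threshold (deg/s) is an assumed value, not taken from the text.
    """
    dt = 1.0 / SAMPLE_RATE_HZ
    saccades, start = [], None
    for i in range(1, len(positions_deg)):
        speed = abs(positions_deg[i] - positions_deg[i - 1]) / dt
        if speed > velocity_threshold and start is None:
            start = i - 1                      # movement begins
        elif speed <= velocity_threshold and start is not None:
            saccades.append((start, i - 1))    # movement has ended
            start = None
    if start is not None:
        saccades.append((start, len(positions_deg) - 1))
    return saccades


def trial_is_valid(positions_deg, min_first_fraction=0.8):
    """Accept a trial only if the first saccade exceeds 80 per cent of the
    total distance summed across all detected saccades."""
    amplitudes = [abs(positions_deg[end] - positions_deg[start])
                  for start, end in detect_saccades(positions_deg)]
    total = sum(amplitudes)
    return bool(amplitudes) and amplitudes[0] > min_first_fraction * total


# Hypothetical traces: a 45 deg saccade plus a 5 deg correction (accepted),
# and a 35 deg saccade plus a 15 deg second movement (rejected).
good_trace = [0.0] * 10 + [5.0, 15.0, 25.0, 35.0, 45.0] + [45.0] * 10 + [47.5, 50.0] + [50.0] * 10
bad_trace = [0.0] * 10 + [5.0, 15.0, 25.0, 35.0] + [35.0] * 10 + [40.0, 50.0] + [50.0] * 10
print(trial_is_valid(good_trace), trial_is_valid(bad_trace))
```

The double saccade thresholds appear to follow the same rule: 80 per cent of each saccade's share of the 50° total gives 0.8 × 35/50 = 56 per cent and 0.8 × 15/50 = 24 per cent.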
Results
Saccade Characteristics
In the short saccade condition, saccades lasted an average of 53 ms. In the long saccade condition, they lasted an average of 145 ms. For the double saccade condition, the two saccades took an average of 98 and 50 ms, respectively, with an average intermediary fixation time of 190 ms (total sequence = 338 ms).
Time Estimates
Figure 10.2 shows the corrected MOBS termination points across all ten blocks (20 MOBS values in all) in the four conditions. All four subjects showed considerable variability in their MOBS-derived estimates. Lower subjective second values in the three saccade conditions compared to the fixation control condition demonstrate the saccadic chronostasis effect. The effect is evident right across the duration of the experiment.
Figure 10.2 Time estimation data
Note: Left: Time estimation data presented separately for four subjects over the course of five testing sessions. Separate lines show the four experimental conditions (constant fixation, a 15° saccade, a 50° saccade or a double [35° + 15°] saccade). Each data point represents the corrected termination value for a single MOBS procedure, representing the duration for which the post-saccadic stimulus had to be presented to seem equivalent to the subsequent one-second long reference stimulus (two per block, two blocks per testing session). Points of Subjective Equality calculated across all MOBS procedures in each condition are displayed at the right hand side of each subject’s graph.
Right: Mean corrected PSEs across subjects for the same four conditions. Error bars show standard errors.
Corrected MOBS termination points were averaged to produce mean PSEs for each subject in each condition. For three out of four subjects, the magnitude of saccadic chronostasis is substantially larger in the long saccade condition compared to the short saccade condition. The double saccade condition generally shows an effect size similar to that obtained for short saccades. To assess the statistical reliability of these observations for each observer, their individual corrected MOBS termination values were entered into a two-way (2 × 4) independent measures ANOVA, with practice and condition as fixed factors. Practice was assessed by dividing data between the first and second halves of the experiment.
Subject KY, the most experienced observer, showed a main effect of condition (F = 36.8, df = 3, 72, p < 0.001) but no effect of practice (F = 2.3, df = 1, 72, p = 0.138) and no interaction (F = 0.5, df = 3, 72, p = 0.688). Follow-ups showed a significant difference between the control condition and all three saccade conditions (all p < 0.001) and between the long saccade condition and each of the other two saccade conditions (both p < 0.001). Subject JM also showed a main effect of condition (F = 12.7, df = 3, 72, p < 0.001) and no effect of practice (F = 2.1, df = 1, 72, p = 0.152). For JM the condition × practice interaction approached significance (F = 2.5, df = 3, 72, p = 0.066) because PSEs in the double saccade condition, but not other conditions, fell as the experiment progressed. Follow-ups distinguished the control condition from all three saccade conditions (all p < 0.01) but lacked the power to differentiate the long saccade condition from the short and double saccade conditions. 
Subject NF again showed a main effect of condition (F = 3.5, df = 3, 72, p = 0.02) but she also showed a main effect of practice (F = 7.8, df = 1, 72, p = 0.007). There was no interaction (F = 1.0, df = 3, 72, p = 0.39). Her PSEs fell in the second half of the experiment; the difference between the control condition and the three saccade conditions actually grew numerically, but this was not a reliable feature of the data. The only significant difference in follow-up testing was between the control and long saccade conditions (p = 0.014). Finally, Subject JR did not show a main effect of condition (F = 1.9, df = 3, 72, p = 0.141) but did show a main effect of practice (F = 7.6, df = 1, 72, p = 0.007) with an interaction that approached significance (F = 2.4, df = 3, 72, p = 0.079). His PSEs rose in the second half of the experiment for saccade conditions, reducing (and, for the double saccade condition, actually reversing) the difference between control and saccade conditions. No follow-ups were significant.
Figure 10.2 also shows mean PSEs averaged across subjects for each of the four experimental conditions. A large difference is evident when comparing the saccade conditions to the constant fixation control. The control estimate itself is quite low, mainly reflecting the contribution of the highly biased subject NF in this small sample. Additionally, the long saccade condition shows a reduced subjective second relative to short and
double saccade conditions. To assess reliability when generalising to the population at large, these data were entered into a one-way repeated-measures ANOVA which narrowly missed significance (F = 8.0, Greenhouse–Geisser corrected, df = 1.3, p = 0.055). Given the small sample, Bonferroni-corrected post-hoc tests lacked sufficient power to further discriminate between conditions.
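The degrees of freedom reported for the individual-subject analyses follow from the design: each subject contributed 20 MOBS values in each of four conditions (80 values), split into a 2 (practice) × 4 (condition) independent measures layout. The short check below uses the standard formulae for a two-way between-subjects ANOVA; the function name is illustrative.

```python
def two_way_anova_df(levels_a, levels_b, n_observations):
    """Degrees of freedom for a two-way independent measures ANOVA."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "practice": df_a,
        "condition": df_b,
        "interaction": df_a * df_b,
        # error df = observations minus the number of design cells
        "error": n_observations - levels_a * levels_b,
    }


# 2 practice halves x 4 conditions, 20 MOBS values per condition
df = two_way_anova_df(levels_a=2, levels_b=4, n_observations=20 * 4)
print(df)
```

These values agree with the df of 3, 72 (condition and interaction) and 1, 72 (practice) given for each subject.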
Discussion
Looking only at the patterns of means, the data tell a fairly straightforward story. For all four subjects, post-saccadic stimuli had to be presented for a shorter time after foveation than similar stimuli viewed at fixation in order to appear to be of the same duration. This implies that the duration of the post-saccadic stimulus was subjectively extended. For three out of four subjects, the magnitude of this effect was larger following a large saccade than following a small saccade. This difference was in approximate agreement with the extra time needed to complete the large saccade. Statistical support for this interpretation is equivocal, but the data are in complete agreement with the statistically significant findings from previous published experiments (Yarrow et al., 2001; Yarrow et al., 2006a).
The general approach in other chronostasis experiments has been to use fairly large groups of subjects, whereas in the current experiment more data were collected from a smaller number of participants. The development of the effect over time could thus be investigated (although note that all the subjects had at least some previous experience in similar kinds of tasks). The subjects showed idiosyncratic changes in their judgements across the experiment. KY showed no changes. JM was stable for control, short saccade and long saccade conditions, but his chronostasis effect grew in the double saccade condition. For NF practice seemed to exacerbate a bias to overestimate the first of two intervals in both control and saccade conditions, and if anything enhanced the chronostasis effect. By contrast, JR’s chronostasis effect seemed to diminish with time. Overall, then, the only conclusion that can be drawn about the development of the chronostasis effect is that judgements were not uniformly consistent, suggesting that the subjects may have experienced criterion shifts of some kind during the experiment. 
The other new feature of this experiment was the introduction of a double saccade condition. According to the antedating hypothesis, an efference copy signal relating to programming of the saccadic response initiates processes that later give rise to saccadic chronostasis. Two alternative predictions can be made for the double saccade condition. The first is that both saccades are planned as a single action, and that the efference copy signal for the first movement of this complete sequential action serves as the trigger for the chronostasis effect. This implies that antedating would be towards the beginning of the first (long) saccade, giving rise to a greatly enhanced illusory bias. The second prediction assumes that each saccade is planned individually, or that the relevant efference copy signal is one that carries information about each individual saccade within a sequence,
regardless of the overall action plan. This implies that antedating would be towards the beginning of the final (short) saccade, giving rise to an illusory bias of similar size to that obtained in the short single saccade condition. The current data support this second prediction. This is broadly consistent with the account presented above, in which saccadic chronostasis is linked to receptive field shifts. To our knowledge, these have only been demonstrated prior to single saccades, and not for more complex sequences. A previous experiment found that chronostasis occurs with a similar magnitude for pro- and anti-saccades, where action planning processes differ markedly. This finding also suggests that a late efference copy signal is critical; motor preparation takes longer, and therefore starts earlier, for anti-saccades compared to pro-saccades, but this early activity does not give rise to a larger chronostasis effect (Yarrow et al., 2004b).
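The two predictions for the double saccade condition can be put in rough numerical terms using the mean durations reported in the Results. The sketch considers only the portion of the effect that scales with the antedated interval, which is a simplifying assumption, and the labels "sequence" and "individual" are ours.

```python
# Mean durations from the Results section (ms)
DOUBLE_SACCADE = {"first_saccade": 98, "fixation": 190, "second_saccade": 50}
SHORT_SINGLE_MS = 53
LONG_SINGLE_MS = 145


def antedated_interval(plan):
    """Interval between the antedated onset and target foveation.

    'sequence':   the efference copy for the whole two-saccade action is the
                  trigger, so antedating reaches back to the first saccade.
    'individual': each saccade carries its own signal, so antedating reaches
                  back only to the start of the final (short) saccade.
    """
    if plan == "sequence":
        return sum(DOUBLE_SACCADE.values())
    if plan == "individual":
        return DOUBLE_SACCADE["second_saccade"]
    raise ValueError(f"unknown plan: {plan}")


for plan in ("sequence", "individual"):
    print(f"{plan}: about {antedated_interval(plan)} ms antedated")
print(f"single saccades: short {SHORT_SINGLE_MS} ms, long {LONG_SINGLE_MS} ms")
```

On this reckoning the first prediction implies an interval well beyond that of the long single saccade, whereas the second implies one close to the short single saccade, which is the pattern the data resemble.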
Methodological Issues
Is Saccadic Chronostasis Simply an Order Effect?
When two or more intervals are presented in sequence, participants often exhibit biases in their temporal judgements. The best known example is the time order error (see Hellstroem, 1985, and Allan, 1979, for reviews). Subjects’ judgements are often biased such that two identical consecutively presented intervals do not appear of equal duration. Either interval can appear prolonged, and the direction and magnitude of the bias is difficult to predict. There are also other examples of specific biases arising as a result of sequential presentation of stimuli. Rose and Summers (1995) reported that when four squares are presented with intervening blank periods, the first and fourth square seem prolonged compared to the middle two. It is also possible to observe the influence of one interval on another when one of these intervals is evaluated with a comparison stimulus that is presented much later (for example, Sasaki et al., 2002). However, none of these biases are directly relevant to saccadic chronostasis. Chronostasis is always evaluated relative to a control condition with identical sequential properties. Demonstrations of saccadic chronostasis therefore reveal a bias in subjective duration over and above any order effects present with the particular procedure employed.
Do Constant Fixation Conditions Provide a Suitable Control?
The purpose of the constant fixation conditions in saccadic chronostasis experiments is to provide a match for the pattern of visual stimulation experienced in saccadic conditions. Three different kinds of control condition have been used. The first type matches sequence effects (see above) but provides only an approximate match for visual stimulation. For example, Yarrow et al. (2001) and Park et al. (2003) used a numeric
counter (“0”, “1”, “2”, “3”, “4”) in fixation conditions (judge the “1”, relative to the “2”, and the “3”). In saccade conditions, subjects fixated a cross, then saccaded to the same counter, which changed to display a “1” mid saccade. Hence foveal stimulation differed somewhat between conditions. In saccade conditions, subjects foveated a cross, then had a brief period of smeared foveal input during the saccade itself, then foveated the target stimulus (a “1”). This was compared to control conditions in which they foveated a “0” immediately followed by a “1”.
The second type of control condition better approximates foveal stimulation by matching the first foveal stimulus (usually a cross) and introducing a brief blank period between it and the target stimulus. In the current experiment, for example, the control condition consisted of a cross, followed by a 100 ms blank period, followed by the target stimulus. The blank period was intended to approximate the time the eyes were in motion in saccade conditions. Yarrow et al. (2006a) and Yarrow et al. (2006b) went one step further. In these experiments, running averages were calculated for saccade duration, and these were used to make sure that the blank period was precisely matched to the duration of the saccade. In fact, this level of precision is probably not required. Yarrow et al. (2004a) ran an experiment evaluating perceived duration in four variants of the standard control condition. The cross changed to the target stimulus either immediately, after 50 ms, after 100 ms, or after 500 ms. Duration estimates were very similar in all conditions, so the presence of a gap does not seem to affect perceived duration (although it does affect temporal order judgements; Yarrow et al., 2006a). 
Overall, these sorts of control condition do a reasonable job of matching foveal stimulation under an assumption of saccadic suppression, but leave open the issue of whether the visual motion experienced during the saccade might yield a chronostasis effect. A third type of control condition attempts to answer this concern by having the critical visual objects in the control condition move in a way that approximates their motion on the retina in the saccade condition. In a recent example, Yarrow et al. (2004a, Experiment 3) had subjects fixate a cross, while a second cross was displayed 20° away across the screen. Both crosses were reduced in contrast, then moved with near saccadic velocity (200° per second) such that the second cross moved towards fixation and the first cross moved away from fixation in a consistent manner. Half way through this movement, the second cross changed into the target stimulus (a circle). At the end of the movement, subjects were left fixating this circle (now at full contrast) and made a judgement about the duration for which they had fixated it. This condition was compared with two variants of the more typical control condition, and yielded very similar PSEs. Taken collectively, these results make it unlikely that saccadic chronostasis arises from visual factors. However, it is currently uncertain whether full field visual motion exactly matching that experienced during a saccade could yield a chronostasis effect. For this reason, one further experiment is required. If stimuli are presented via a mirror which can be rapidly rotated, it is possible to produce full field motion with a saccadic time
course (for example, Diamond et al., 2000). Duration estimates could be assessed for a stimulus brought to fixation using this approach, and compared with a matched saccadic condition, so that chronostasis can be positively demonstrated over and above full visual field stimulation.
Is it Really the First Interval that is Being Affected?
The standard chronostasis procedure involves comparing one interval with one or more subsequent intervals. This procedure cannot distinguish between biases that affect the first interval, and biases that affect later intervals but in the opposite direction. Our assertion that the first interval is subjectively lengthened is, however, supported by our results using a temporal order judgement procedure (Yarrow et al., 2006a). It is further supported by an experiment in which a different kind of duration judgement was required. Yarrow et al. (2006b, Experiment 5) presented only a single post-saccadic stimulus (in these experiments a moving circle) and had subjects make absolute duration estimates, in ms, to evaluate its perceived duration. As expected, estimates were higher in a saccade condition compared to a control condition.
Is Saccadic Chronostasis an Artefact of Correcting Presentation Times in Order to Calculate Points of Subjective Equality Relative to the Moment of Foveation?
In the standard saccadic chronostasis procedure, the PSEs reported in saccade conditions are not simply calculated using the duration for which the target stimulus appeared on the screen in each trial. These PSEs incorporate an additional correction to display times. The rationale for this correction is as follows. The target stimulus changes into its target state during the saccade, at a time when perception is degraded (Ross et al., 2001). It is therefore probably not perceived to a degree compatible with the initiation of a mental timing operation until it is actually foveated. Hence the time for which the stimulus was on screen during the saccade (the period from stimulus change to saccade termination) is subtracted from presentation times before PSEs are calculated. The effects reported in most saccadic chronostasis experiments (the difference between control and saccade PSEs) can therefore be broken down into two components: (a) an increase in perceived duration relative to the on-screen presentation time, and (b) this correction. If this correction is not justified there are two implications. Firstly, the magnitude of the saccadic chronostasis effect will be overestimated. Note, however, that in all saccadic chronostasis experiments reported to date, omitting the correction would not have eliminated or reversed the direction of the effect. Put another way, an increase in perceived duration relative to on-screen presentation time is always obtained, even before the correction is applied.
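The decomposition of the reported effect into components (a) and (b) is simple arithmetic, sketched below. All numerical values are hypothetical and chosen only to make the sums easy to follow; the function names are likewise illustrative.

```python
def corrected_pse(on_screen_pse_ms, in_flight_ms):
    """Subtract the time the stimulus was on screen while the eye was still
    moving (stimulus change to saccade termination)."""
    return on_screen_pse_ms - in_flight_ms


def decompose_effect(control_pse_ms, on_screen_pse_ms, in_flight_ms):
    """Split the reported effect into (a) the uncorrected part and (b) the correction."""
    total = control_pse_ms - corrected_pse(on_screen_pse_ms, in_flight_ms)
    return {
        "uncorrected": control_pse_ms - on_screen_pse_ms,  # component (a)
        "correction": in_flight_ms,                        # component (b)
        "total": total,
    }


# Hypothetical values: control PSE 1000 ms, on-screen saccade PSE 900 ms,
# and 100 ms between the triggered change and the end of the saccade.
example = decompose_effect(control_pse_ms=1000, on_screen_pse_ms=900, in_flight_ms=100)
print(example)
```

In this invented case the effect would remain (at 100 ms) even if the correction were dropped, which mirrors the point made above that omitting the correction has never reversed the direction of the effect.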
A second implication, however, is more critical for the antedating hypothesis. The finding that the magnitude of saccadic chronostasis increases with saccade duration provides an important foundation for this account. In both the original experiment reporting this effect (Yarrow et al., 2001, Experiment 1) and the experiment reported here, the change to the target stimulus was triggered a set proportion of the distance into the saccade. This means that the size of the correction varied in the short and long saccade conditions, being larger in the latter case. Hence, if the correction is unwarranted, the saccade size difference may have been artificially enhanced.
Because of this issue, the appropriateness of the correction was tested by Yarrow et al. (2001, Experiment 1c). They compared two saccadic conditions, both of which employed a very large eye movement. In one condition, the change to the target stimulus was triggered very near the beginning of the saccade. In a second condition, it was triggered very near the end of the saccade. Recall that the correction equals the interval from the change trigger to the end of the saccade. This means that the size of the correction was large in the first condition and small in the second condition. Consider first the hypothesis that subjects did not perceive the mid-saccadic change of stimulus, or were uncertain about its timing, and antedated their subsequent percept to a moment just before saccade initiation regardless of this event. In this case, one might expect corrected PSEs to be identical in both conditions, but uncorrected PSEs to vary by the same interval that separated the trigger times in the two conditions. Now consider the alternative hypothesis that subjects perceived the mid-saccadic change of stimulus and used it as the start point in estimating the duration of the post-saccadic stimulus, with chronostasis yielding some constant addition to this estimate. 
In this case, one might expect corrected PSEs to differ by an amount equal to the temporal separation between the two trigger times, but uncorrected PSEs should not differ. In this experiment, the interval between trigger times was 85 ms. Originally, corrected PSEs were reported, which differed by only 11 ms. This difference in PSEs was not significant, supporting the antedating view. There is an interpretational issue here because the conclusion depends upon a negative result (power = 0.71 two tailed, 0.8 one tailed). However, a reanalysis of the data from this control experiment using uncorrected PSEs shows a significant 75 ms difference (t = 2.0, df = 9, one-tailed p = 0.036), positively supporting the antedating account. Hunt et al. (2008) have recently challenged the validity of the correction procedure based on a different kind of experiment. Their subjects made a 25° saccade from a cross to a counter initially showing a “0”. The counter changed to a “1” mid saccade, but only after the very brief (25 ms) presentation of either a “×” or a “+” at the same location. Subjects were asked to discriminate between these two symbols, and indeed were able to do so. Hunt et al. therefore conclude that in saccadic chronostasis experiments, subjects are able to see the mid-saccadic change to the target stimulus, and that the correction is therefore flawed, undermining the saccade size difference effect. Their conclusion is probably unwarranted because their subjects were performing a very different task to the
one typically required in chronostasis experiments. They were asked to discriminate a brief mid-saccadic event rather than judge the duration of a post-saccadic stimulus. This difference implies attending to the stimuli in different ways. The impact of the mid-saccadic stimulus change is better assessed in the same context used to demonstrate chronostasis in the first place, as in the trigger time experiment reported above. The question is not whether a mid-saccadic stimulus change can be perceived. The question is whether it is perceived in saccadic chronostasis experiments. The debate about whether the saccade size effect is an artefact has recently been addressed in a new experiment comparing saccades of different sizes (Yarrow et al., 2006a). This experiment included a critical procedural change. Instead of triggering the change to the target stimulus a set proportion of the distance into the saccade, this change was triggered at a similar time relative to the end of the saccade. Hence for both long and short saccades, the change was triggered around 30 ms before the target was fixated. The correction applied to PSEs was therefore virtually identical in both conditions. A significant difference between PSEs in long and short saccade conditions was still obtained. This finding provides clear evidence for a saccade size effect in saccadic chronostasis that cannot have arisen because of the correction technique. These data are in accord with the antedating account.
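The logic of the trigger time experiment discussed earlier in this section can be summarised in a short sketch. The 85 ms separation between trigger times and the observed differences (11 ms corrected, 75 ms uncorrected) are taken from the text; the individual correction sizes, the 1000 ms reference and the 50 ms constant are hypothetical placeholders that do not affect the predicted differences.

```python
def predicted_pses(hypothesis, correction_ms, reference_ms=1000, constant_ms=50):
    """Return (corrected, uncorrected) PSEs predicted for one condition.

    'antedate':        timing starts from a fixed moment before the saccade,
                       whenever the change occurred.
    'perceive_change': timing starts from the mid-saccadic change itself.
    reference_ms and constant_ms are hypothetical placeholders.
    """
    if hypothesis == "antedate":
        corrected = reference_ms - constant_ms
        uncorrected = corrected + correction_ms
    elif hypothesis == "perceive_change":
        uncorrected = reference_ms - constant_ms
        corrected = uncorrected - correction_ms
    else:
        raise ValueError(f"unknown hypothesis: {hypothesis}")
    return corrected, uncorrected


def predicted_differences(hypothesis, early_ms=95, late_ms=10):
    """Absolute PSE differences between early- and late-trigger conditions.

    Only the 85 ms gap between early_ms and late_ms comes from the text.
    """
    c_early, u_early = predicted_pses(hypothesis, early_ms)
    c_late, u_late = predicted_pses(hypothesis, late_ms)
    return abs(c_early - c_late), abs(u_early - u_late)


observed = (11, 75)  # corrected and uncorrected differences reported above
for hypothesis in ("antedate", "perceive_change"):
    print(hypothesis, predicted_differences(hypothesis), "observed:", observed)
```

The observed pair lies much nearer to the pattern predicted by antedating (0 and 85 ms) than to the alternative (85 and 0 ms).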
Is Saccadic Chronostasis Really a Perceptual Phenomenon?
Do we really see (or recall seeing) an extended interval following a saccade, or could saccadic chronostasis result from some kind of response bias? In most chronostasis experiments, subjects judge the first interval relative to subsequent intervals, so a simple bias to respond “longer” would yield reduced PSEs. However, saccadic chronostasis is measured relative to a control condition so any bias would have to be specific to saccade conditions. Perhaps, then, the presence of a saccade biases subjects towards making a “longer” response for some non-perceptual reason? This is also unlikely, because the effect has been demonstrated when judgements are made about whether the second interval is longer or shorter than the first (Yarrow et al., 2004a, Experiment 4). In this case, subjects tended to respond “shorter” with equal display durations. The methods employed thus far, however, cannot be said to be “criterion free” in the sense derived from signal detection theory. For example, it is possible that subjects employed some sort of high level reasoning strategy in reaching their decisions. Subjects were asked to judge how long they saw the post-saccadic stimulus for, but might have reasoned that this stimulus appeared during their saccade, and therefore added on some time to compensate. This possibility cannot be discounted completely, but the differences obtained for saccades of different sizes imply that this strategy would have to be extremely sophisticated. Moreover, this account does not fit with the phenomenology of the task. There is no sense of “adding time”, only of accurately reporting a percept.
194 Kielan Yarrow Conclusions When observers saccade towards a visual target, they overestimate the duration for which it is presented. Extensive investigations of this illusory bias appear to favour an antedating account in which the saccade target object is subjectively experienced as having been fixated since before the eye movement began. This account may explain why we have no temporal experience corresponding to the period of our saccades, and therefore helps explain our conscious experience during active vision. Hopefully, further research will allow the mechanisms underlying this bias to be better understood.
References
Allan, L. G. 1979. ‘The perception of time’, Perception and Psychophysics, 26 (5): 340–54.
Brown, P. and J. C. E. Rothwell. 1997. ‘Illusions of time’, Society for Neuroscience Abstracts, 27th Annual Meeting, 23: 1119.
Campbell, F. W. and R. H. Wurtz. 1978. ‘Saccadic omission: why we do not see a grey-out during a saccadic eye movement’, Vision Research, 18 (10): 1297–303.
Diamond, M. R., J. Ross, and M. C. Morrone. 2000. ‘Extraretinal control of saccadic suppression’, Journal of Neuroscience, 20 (9): 3449–55.
Duhamel, J. R., C. L. Colby, and M. E. Goldberg. 1992. ‘The updating of the representation of visual space in parietal cortex by intended eye movements’, Science, 255 (5040): 90–92.
Fischer, B. and E. Ramsperger. 1984. ‘Human express saccades: extremely short reaction times of goal directed eye movements’, Experimental Brain Research, 57 (1): 191–95.
Hellstroem, A. 1985. ‘The time-order error and its relatives: mirrors of cognitive processes in comparing’, Psychological Bulletin, 97 (1): 35–61.
Hodinott-Hill, I., K. V. Thilo, A. Cowey, and V. Walsh. 2002. ‘Auditory chronostasis: hanging on the telephone’, Current Biology, 12 (20): 1779–81.
Hopp, J. J. and A. F. Fuchs. 2002. ‘Investigating the site of human saccadic adaptation with express and targeting saccades’, Experimental Brain Research, 144 (4): 538–48.
Hunt, A. R., C. S. Chapman, and A. Kingstone. 2008. ‘Taking a long look at action and time perception’, Journal of Experimental Psychology: Human Perception and Performance, 34 (1): 125–36.
Nakamura, K. and C. L. Colby. 2002. ‘Updating of the visual representation in monkey striate and extrastriate cortex during saccades’, Proceedings of the National Academy of Sciences of the United States of America, 99 (6): 4026–31.
Park, J., M. Schlag-Rey, and J. Schlag. 2003. ‘Voluntary action expands perceived duration of its sensory consequence’, Experimental Brain Research, 149 (4): 527–29.
Rose, D. and J. Summers. 1995. ‘Duration illusions in a train of visual stimuli’, Perception, 24 (10): 1177–87.
Ross, J., M. C. Morrone, M. E. Goldberg, and D. C. Burr. 2001. ‘Changes in visual perception at the time of saccades’, Trends in Neurosciences, 24 (2): 113–21.
Sasaki, T., D. Suetomi, Y. Nakajima, and G. ten Hoopen. 2002. ‘Time-shrinking, its propagation, and Gestalt principles’, Perception and Psychophysics, 64 (6): 919–31.
Sommer, M. A. and R. H. Wurtz. 2002. ‘A pathway in primate brain for internal monitoring of movements’, Science, 296 (5572): 1480–82.
Treisman, M., A. Faulkner, P. L. Naish, and D. Brogan. 1990. ‘The internal clock: evidence for a temporal oscillator underlying time perception with some estimates of its characteristic frequency’, Perception, 19 (6): 705–43.
Tyrrell, R. A. and D. A. Owens. 1988. ‘A rapid technique to assess the resting states of the eyes and other threshold phenomena: the Modified Binary Search (MOBS)’, Behavior Research Methods, Instruments and Computers, 20 (2): 137–41.
Umeno, M. M. and M. E. Goldberg. 1997. ‘Spatial processing in the monkey frontal eye field. I. Predictive visual responses’, Journal of Neurophysiology, 78 (3): 1373–83.
Walker, M. F., E. J. Fitzgibbon, and M. E. Goldberg. 1995. ‘Neurons in the monkey superior colliculus predict the visual result of impending saccadic eye movements’, Journal of Neurophysiology, 73 (5): 1988–2003.
Wearden, J. H., H. Edwards, M. Fakhri, and A. Percival. 1998. ‘Why “sounds are judged longer than lights”: Application of a model of the internal clock in humans’, Quarterly Journal of Experimental Psychology B: Comparative and Physiological Psychology, 51 (2): 97–120.
Yang, Z. and D. Purves. 2003. ‘A statistical explanation of visual space’, Nature Neuroscience, 6 (6): 632–40.
Yarrow, K., P. Haggard, and J. C. Rothwell. 2004a. ‘Action, arousal, and subjective time’, Consciousness and Cognition, 13 (2): 373–90.
Yarrow, K., L. Whiteley, P. Haggard, and J. C. Rothwell. 2006a. ‘Biases in the perceived timing of perisaccadic visual and motor events’, Perception and Psychophysics, 68 (7): 1217–26.
Yarrow, K., L. Whiteley, J. C. Rothwell, and P. Haggard. 2006b. ‘Spatial consequences of bridging the saccadic gap’, Vision Research, 46 (4): 545–55.
Yarrow, K., P. Haggard, R. Heal, P. Brown, and J. C. E. Rothwell. 2001. ‘Illusory perceptions of space and time preserve cross-saccadic perceptual continuity’, Nature, 414 (6861): 302–05.
Yarrow, K., H. Johnson, P. Haggard, and J. C. E. Rothwell. 2004b. ‘Consistent chronostasis effects across saccade categories imply a subcortical efferent trigger’, Journal of Cognitive Neuroscience, 16 (5): 839–47.
Chapter 11 Duration Illusions and What They Tell Us about the Brain Vani Pariyadath and David M. Eagleman
Introduction
Anecdotally, time has frequently been reported to speed up or slow down. For example, people often report that events appeared to have taken place in slow motion during car accidents. Similar illusions can be reproduced in the laboratory. Take for example the illusion of the frozen clock: upon first glance, the second hand of a clock sometimes seems to be stopped in position momentarily before it continues to tick at a normal pace (Yarrow et al., 2001; Park et al., 2003). Similarly, perceived duration can be warped by saccades (Eagleman, 2005; Morrone et al., 2005), flicker (Kanai et al., 2006), and life-threatening events (Eagleman et al., 2005; Stetson et al., under review). The neural bases of such distortions remain unclear. To better understand time representation and its plasticity, we have taken advantage of such illusions of time in the laboratory. One example is the “oddball” illusion: an oddball stimulus in a train of repeated presentations is often perceived to have a longer duration than the repeated stimuli (Tse et al., 2004; Kanai and Watanabe, 2006; Ulrich et al., 2006; Pariyadath and Eagleman, 2007b). This is also true for the first stimulus in a repeated stimulus train, an effect we term the “debut” illusion (Rose and Summers, 1995; Pariyadath and Eagleman, 2007b). Why do these duration illusions occur, and what do they tell us about time?
Oddballs, Attention, and Predictability
Several authors have suggested that the duration distortion is a consequence of increased attention or arousal triggered by the oddball or the first stimulus (Rose and Summers, 1995; Ranganath and Rainer, 2003; Tse et al., 2004; Ulrich et al., 2006). These authors make use of the pacemaker-accumulator model of timing (Treisman, 1963; Gibbon et al., 1984) to explain the duration dilation. In this framework, an increase in arousal caused by the appearance of an unexpected (“oddball”) stimulus leads to a transient increase in the “tick rate” of an internal clock. Thus, the accumulator collects a larger number of ticks in the same time period, and the oddball is judged to have lasted longer. We were able to address this hypothesis in two ways. First, if the duration dilation represented a general speeding of an internal clock, then other temporal judgements such as the pitch of an auditory tone or the rate of a flickering stimulus should be expected to change concomitantly with the oddball. However, when measured carefully, pitches and flicker rates do not change during the oddball duration distortion (Pariyadath and Eagleman, 2007b). This indicates that time is not one thing, but is instead composed of separate neural mechanisms that usually work together but can be teased apart in the laboratory. Second, the attentional theory suggests that a more salient oddball would bring about a larger duration distortion. We tested this hypothesis by using more emotionally salient oddballs (that is, stimuli which activate the amygdala and attract attention more quickly; Holland and Gallagher, 1999; Davis and Whalen, 2001) and found that when the oddball stimuli were replaced with emotionally salient images such as pointed guns and aggressive dogs, the oddball effect was essentially unchanged (Pariyadath and Eagleman, 2007b). This surprising result suggests two possibilities. First, since increasing the salience of the oddball failed to increase the duration distortion, it may be that the duration dilation is caused by attentional mechanisms but saturates. Alternatively, the oddball effect may not be, fundamentally, an attentional effect, but may instead be caused solely by the oddball’s unpredictability.
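The logic of the pacemaker-accumulator account, and of the first test applied to it, can be made concrete with a minimal sketch. Everything numerical here is an assumption chosen for illustration (a 100 Hz baseline tick rate, a 10 per cent arousal-driven speed-up, a 500 ms stimulus, a 10 Hz flicker); none of these values comes from the experiments described above, and the function names are hypothetical.

```python
def judged_duration_ms(physical_ms, tick_rate_hz, baseline_rate_hz):
    """Duration reported by an observer who counts pacemaker ticks and
    converts them back to time using the baseline (unaroused) tick rate."""
    ticks = tick_rate_hz * physical_ms / 1000.0  # accumulator contents
    return 1000.0 * ticks / baseline_rate_hz


def judged_flicker_hz(physical_hz, tick_rate_hz, baseline_rate_hz):
    """Flicker rate reported if each cycle is timed with the same clock.
    A faster clock makes every cycle seem longer, so the rate seems lower."""
    judged_period_ms = judged_duration_ms(
        1000.0 / physical_hz, tick_rate_hz, baseline_rate_hz
    )
    return 1000.0 / judged_period_ms


BASELINE_HZ = 100.0   # hypothetical pacemaker rate, not from the source
AROUSAL_GAIN = 1.10   # hypothetical 10% speed-up on oddball trials

standard_ms = judged_duration_ms(500.0, BASELINE_HZ, BASELINE_HZ)
oddball_ms = judged_duration_ms(500.0, BASELINE_HZ * AROUSAL_GAIN, BASELINE_HZ)
oddball_flicker_hz = judged_flicker_hz(
    10.0, BASELINE_HZ * AROUSAL_GAIN, BASELINE_HZ
)

print(f"standard judged as {standard_ms:.0f} ms, oddball as {oddball_ms:.0f} ms")
print(f"10 Hz flicker judged as {oddball_flicker_hz:.2f} Hz during the oddball")
```

Under these assumptions a single sped-up clock lengthens the judged duration of the oddball and, by the same token, shifts the judged flicker rate. The empirical finding that flicker rate and pitch do not shift is therefore what a single shared clock fails to predict.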
To test whether the predictability of the stimulus is responsible for its perceived duration, we turned to the fact that the first stimulus in a repeated series, like an oddball, appears to last longer than the subsequent presentations (Rose and Summers, 1995; Hodinott-Hill et al., 2002; Kanai and Watanabe, 2006). We hypothesized that this duration distortion would disappear if different, and thus unpredictable, stimuli were presented serially. So, we presented participants with two types of trials: in one, the same stimulus was presented five times; in the other, five random stimuli were presented. Participants judged the duration of the first stimulus with respect to successive stimuli. We found that the first stimulus in the different or “random” condition did not undergo duration distortion, unlike in the “repeat” condition (Figure 11.1A). This suggests that the use of random stimuli allows each stimulus to act as an unpredicted oddball. Our findings raised an interesting question: if predictability plays a role in the perceived duration of the stimulus, does this prediction have to be violated at a low level (for example, involving the visual form of the stimulus) or a high level (for example, number sequences such as 4…5…6…)? To address this question we tested participants using the same paradigm as in the previous experiment. This time, however, we presented them with (a) a repeated presentation of the number 1 five times, (b) the sequence 1, 2, 3, 4, 5, or (c) a “scrambled” sequence that began with 1 and did not have 2 in its second position (such as 1, 4, 3, 5, 2). Participants judged whether the duration of the first “1” appeared longer or shorter than the stimuli that followed. We found a similar duration distortion in both the repeated and sequential stimuli, but not in the scrambled version (Figure 11.1A). We hypothesize that the duration distortion in the “sequential” condition stems from the successive stimuli being predicted in sequence, even though they differ from the first stimulus in shape and form. This suggests that the predictability of successive stimuli involves higher cortical areas than the primary visual cortex. Our results are in agreement with the finding of Tse et al. (2004) that in a series of visually similar stimuli (for example, figurines of female bodies in different natural poses), a stimulus that belongs to a different category (for example, figurine of a male body) is dilated in duration. Our findings go beyond that result, however, by demonstrating that even abstract sequences which share little visual similarity can produce such duration distortions. This suggests that areas higher than the primary visual cortex are involved in computing predictability.
Figure 11.1 The debut effect
Note: The debut effect occurs with repeated or predictable stimuli. (A) Participants report that the first stimulus appears to be longer in duration in comparison to successive stimuli in a serial visual stream of repeated stimuli or a learned sequence. This duration distortion is not witnessed in a scrambled learned sequence or a presentation of random stimuli. (B) We hypothesize that these duration distortions are a result of a contraction in the duration of repeated or predictable stimuli.
Novelty and the Brain
Figure 11.1B illustrates the relative durations of the first and oddball stimuli in a train of repeated stimuli. However, note that we have put no units or landmarks on the y-axis. This is on purpose: although the conventional view is that the oddball or unexpected stimulus is expanded in duration as compared with the standard, we suggest here an alternative hypothesis: the oddball’s duration may be closer to the physical duration while the standards are contracted in duration due to their predictability. We base this suggestion in part on decades of research into repetition effects on the brain, which we turn to now. Neuronal firing rates in higher cortical areas quickly become suppressed after repeated presentations of a stimulus (Fahy et al., 1993; Rainer and Miller, 2000; Sereno et al., 2010), an effect generally known as repetition suppression. In humans, differential responses to familiar and novel stimuli are shown using EEG and neuroimaging (Henson and Rugg, 2003; Grill-Spector et al., 2006). For example, there is a decrease in amplitude in the ERP signal in response to the second presentation of a stimulus (Grill-Spector et al., 2006). Using fMRI, a decrease in amplitude is seen in the hemodynamic response to a second presentation of a face in the midfusiform response (Henson and Rugg, 2001). Finally, repetition suppression is seen in the cortical response in positron emission tomography studies (Buckner et al., 1995) and MEG experiments (Noguchi et al., 2004; Ishai et al., 2006). Behaviourally, repetition suppression has been linked to repetition priming or a decrease in reaction time to respond to repeated or familiar stimuli (Dehaene et al., 2001; Orfanidou et al., 2006). The exact mechanism by which a neural suppression is caused by repetition is not agreed upon and several models have been proposed so far (Grill-Spector et al., 2006). One such model is that it exists as a mechanism for increasing efficiency of representation (Desimone and Duncan, 1995; Wiggs and Martin, 1998). In that view, with repeated presentations of a stimulus, a sharpened representation or a more efficient encoding is achieved in the neural network that codes for the object. Such a representation affords lower metabolic costs and possibly faster processing (Grill-Spector et al., 2006). We hypothesize that this differential response to novel versus repeated stimuli maps on to perceived duration: a suppressed neural response corresponds to a shorter perceived duration. If true, this would encourage speaking of the duration distortions not as an expansion of the oddball, but instead as a contraction of the standards. Our findings with number sequences (Figure 11.1A) suggest that the neural responses to not only repeated but also predicted stimuli become suppressed (Rao and Ballard, 1999; Pariyadath and Eagleman, 2007b). This implies that repetition suppression may be a special case of a more general phenomenon that we will term prediction suppression.
Repetition and Implicit Duration Judgements
The above studies used stimuli lasting hundreds of milliseconds; we next explored duration judgements of stimuli briefer than 100 ms. By employing brief stimuli, we were able to design a paradigm in which explicit judgements regarding perceived duration did not have to be made. We achieved this by deriving a novel variation of the traditional flicker fusion frequency paradigm. In flicker fusion experiments, a light is rapidly turned on and off: at a low frequency, flicker is perceived, while at a high frequency, the light appears to be steady. The frequency at which the perception switches from flicker to a steady light is called the critical flicker fusion threshold (CFFT). Critically, CFFT experiments make use of a single stimulus that is presented repeatedly. Because there are subjective duration differences when viewing familiar versus novel stimuli, we hypothesized that the CFFT would change if the rapid stimulus could somehow be made novel each time it appeared. To accomplish this we presented stimuli serially at different locations on a computer screen. Although only one stimulus was present at any moment, more than one appeared to temporally overlap on screen due to visual persistence, the phenomenon that a briefly presented stimulus appears to last longer than the time it was physically presented (Efron, 1970; Bowen et al., 1974; Di Lollo, 1977). We refer to this perceived multiplicity of stimuli as the proliferation effect (Pariyadath and Eagleman, 2007a). We employed two conditions: in the first, the same stimulus was presented (Figure 11.2A); in the second, different stimuli were presented (Figure 11.2A). Participants were required to report the number of stimuli subjectively present on screen at any one moment, that is, how many characters appeared to share screen time. Participants’ estimates of how many characters they perceived on screen simultaneously varied significantly between the repeated and random conditions (Figure 11.2B). At a 50 Hz presentation rate, for example, observers reported an average of 3.4 characters on screen in the “repeat” condition and 4.2 in the “random” condition. The difference between the two conditions holds across different stimulus frequencies. These results suggest that repetition contracts the duration of visual persistence in the same manner that it contracts durations at longer timescales. A contraction in the visual persistence of repeated stimuli leads to less temporal overlap, and hence fewer items are perceived to be present at once. The differential proliferation effect generalizes to other types of stimuli such as pictures of everyday objects or faces (Figure 11.2B), consistent with the wide-ranging stimuli that lead to repetition suppression. To further test our hypothesis of contractions in visual persistence, we next had participants report perceived numerosity in conditions of different duty cycles. We again found a difference in the perceived numerosity for repeated and random conditions, but no difference in numerosity across stimulus durations: despite physical variation in the duration of stimulus presentation, participants reported the same numerosity for a given stimulus presentation frequency. This can be understood in light of the fact that brief stimuli (< 100 ms) will be perceptually expanded to ~100 ms, irrespective of their physical durations (for example, a 10 ms stimulus and a 60 ms stimulus will appear to last the same duration). This result further strengthens our hypothesis that the visual persistence of a brief stimulus contracts with repetition.
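The link between perceived numerosity and visual persistence can be illustrated with a back-of-the-envelope sketch. It assumes, as a deliberate simplification that is not stated in the text, that “50 Hz” means 50 stimulus onsets per second and that the number of items seen at once is roughly the persistence divided by the interval between onsets. The function names are hypothetical, and only the 3.4 and 4.2 averages are taken from the results above.

```python
def implied_persistence_ms(perceived_count, rate_hz):
    """Persistence implied if `perceived_count` items overlap in time.

    Simplifying assumption: items appear at evenly spaced onsets and each
    one remains visible for the same persistence, so
    count ~= persistence / onset interval.
    """
    onset_interval_ms = 1000.0 / rate_hz
    return perceived_count * onset_interval_ms


def predicted_count(persistence_ms, rate_hz):
    """Inverse mapping: how many items a given persistence would overlap."""
    return persistence_ms * rate_hz / 1000.0


RATE_HZ = 50.0  # presentation rate quoted in the text
repeat_ms = implied_persistence_ms(3.4, RATE_HZ)   # "repeat" condition
random_ms = implied_persistence_ms(4.2, RATE_HZ)   # "random" condition

print(f"implied persistence: repeat {repeat_ms:.0f} ms, random {random_ms:.0f} ms")

# Under the same assumption, the repeat/random gap should persist at other
# rates (the rates below are illustrative, not taken from the experiment).
for rate in (25.0, 50.0, 100.0):
    gap = predicted_count(random_ms, rate) - predicted_count(repeat_ms, rate)
    print(f"{rate:.0f} Hz: predicted numerosity gap {gap:.2f}")
```

On this reading, a shorter persistence for repeated stimuli is enough to produce fewer simultaneously visible items, without any explicit duration judgement being requested.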
Figure 11.2 Repeated stimuli subjectively proliferate less than random stimuli
Note: (A) Example sequences of stimulus presentation and perceived numerosity for repeated and random stimuli. (B) Number of stimuli perceived to be present for various repeated and random stimuli. Participants report more stimuli present on screen when the stimuli are different than when they are repeated (* indicates p < 0.05, paired t-tests). Error bars show S.E.M.
To quantify visual persistence for repeated and random stimuli, we asked participants to match the offset of a train of stimuli with the onset of a surrounding frame. In one condition, the same stimulus flashed repeatedly before the onset of the surrounding frame; in the second, different stimuli flashed one after the other. The adjusted ISI between the final image and the frame is an indication of how long the image perceptually persisted. This visual persistence was found to be 69.5 ms for a repeated stimulus and 78.5 ms for a novel stimulus. These values are in line with previous reports of visual persistence; however, given some inter-subject variability and the subjectively reported difficulty of aligning offsets with onsets, we highlight the difference in the visual persistence between the two conditions rather than the magnitudes. Note that an attention-shifting explanation for duration distortion requires sufficiently long-lasting stimuli (> 150 ms) to allow for attentional shifts (Tse et al., 2004); such an explanation is therefore untenable for the present duration distortions. Next, to address whether higher order predictability was involved even at such brief timescales, we measured the proliferation effect using ordinal sequences which, as we highlighted earlier, have the advantage of predictability but not repeated shapes. In one condition, letters of the alphabet were serially flashed in sequential order; in the second condition, letters of the alphabet were serially flashed in scrambled order. Participants perceived more characters on screen simultaneously when characters were presented in scrambled order as opposed to when they were presented in sequence (Figure 11.2B, “sequential letters”). This indicates that the predictability of a stimulus plays a role in its visual persistence, independent of repeated form. Interestingly, participants did not report noticing any difference between the two types of trials, hinting that conscious appreciation of ordinality and its violation does not play a role in this result. The above set of findings suggests a new theoretical framework for duration judgements, one in which subjective duration is a reflection of neural energy usage: the bigger the neural response, the longer an event seems to have lasted. This is currently being tested by directly correlating neural responses to repeated and random stimuli with their subjective durations.
A Potential Diagnostic Tool for Schizophrenia
Our new framework for duration has a direct application in rapid screening for schizophrenia. An impaired novelty response is a hallmark characteristic of schizophrenia, as evidenced by an impaired pre-pulse inhibition of the startle response (Hong et al., 2007), impaired mismatch negativity (Javitt et al., 1998; Light and Braff, 2005) and poor oddball detection (Kiehl and Liddle, 2001). Relatedly, schizophrenics show a lowered CFFT (Black et al., 1975), and generally a lower sensitivity for detecting flicker (Slaghuis and Bishop, 2001). These findings are consistent with electrophysiological measures that show a reduced or absent repetition suppression in schizophrenics, presumably because of a deficit in cortical inhibition (Daskalakis et al., 2002). Roughly speaking, to a schizophrenic brain everything appears novel. Because schizophrenics have reduced repetition suppression, the proliferation effect may act as a diagnostic tool: while a healthy control perceives a clear difference in numerosity for random (novel) and repeated (familiar) stimuli, our preliminary data show that schizophrenics fail to perceive this differential numerosity (Figure 11.3; Gandhi et al., 2007). A schizophrenic patient thus reports the same numerosity in both presentations and can thereby, in the span of seconds, be referred for further examination. We hope this non-invasive, rapid method will serve as a screening tool for early diagnosis, allowing early treatment and reducing cognitive deficits. Our current research aims to determine how sensitive and specific our proposed tool is for the diagnosis of schizophrenia.
Figure 11.3 Proposed repetition suppression diagnostic tool
Note: When presented with the proliferation effect, healthy controls will perceive more stimuli on screen in the random condition than in the repeated condition; schizophrenics however have impaired repetition effects and will thus perceive no difference in the repeat or random conditions. By comparing the perceived numerosity for novel and familiar stimuli, an early diagnosis could be made for schizophrenia.
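The screening logic described above, together with the sensitivity and specificity question it raises, can be sketched as follows. This is an illustration only: the referral threshold, the field names, and every data value are invented for the example and are not results from the authors' study.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    numerosity_random: float   # mean items reported, random (novel) stimuli
    numerosity_repeat: float   # mean items reported, repeated stimuli
    has_diagnosis: bool        # ground truth from an independent assessment


def flag_for_referral(p, min_difference=0.3):
    """Flag a participant whose random-minus-repeat numerosity difference
    is smaller than `min_difference` (a hypothetical cut-off)."""
    return (p.numerosity_random - p.numerosity_repeat) < min_difference


def sensitivity_specificity(participants, min_difference=0.3):
    """Return (sensitivity, specificity) of the referral rule.

    Assumes the sample contains at least one participant with and one
    without a diagnosis.
    """
    tp = fn = tn = fp = 0
    for p in participants:
        flagged = flag_for_referral(p, min_difference)
        if p.has_diagnosis:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Invented illustrative sample, NOT data from the study.
sample = [
    Participant(3.5, 3.5, True),
    Participant(3.6, 3.4, True),
    Participant(4.1, 3.5, True),
    Participant(4.2, 3.4, False),
    Participant(4.0, 3.5, False),
    Participant(3.6, 3.5, False),
]

sens, spec = sensitivity_specificity(sample)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Varying `min_difference` trades sensitivity against specificity, which is one reason a tool of this kind would need validation on clinical samples before any cut-off could be recommended.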
Conclusions
Subjective duration appears to contract with repetition and, more generally, with increased predictability, suggesting the possibility that it may reflect the amplitude of the neural response. We have derived a method with which we can measure the effect of repetition on duration in the absence of explicit temporal judgements. Our preliminary results with schizophrenic patients suggest that we can use the proliferation paradigm to test the robustness of repetition suppression in a rapid, non-invasive manner.
Acknowledgements
We are indebted to Anne B. Sereno for pointing us to relevant references on repetition suppression.
References
Black, S., L. M. Franklin, F. P. de Silva, and H. S. Wijewickrama. 1975. ‘The flicker–fusion threshold in schizophrenia and depression’, The New Zealand Medical Journal, 81 (535): 244–46.
Bowen, R. W., J. Pola, and L. Matin. 1974. ‘Visual persistence: effects of flash luminance, duration and energy’, Vision Research, 14 (4): 295–303.
Buckner, R. L., S. E. Petersen, J. G. Ojemann, F. M. Miezin, L. R. Squire, and M. E. Raichle. 1995. ‘Functional anatomical studies of explicit and implicit memory retrieval tasks’, The Journal of Neuroscience, 15 (1): 12–29.
Daskalakis, Z. J., B. K. Christensen, R. Chen, P. B. Fitzgerald, R. B. Zipursky, and S. Kapur. 2002. ‘Evidence for impaired cortical inhibition in schizophrenia using transcranial magnetic stimulation’, Archives of General Psychiatry, 59 (4): 347–54.
Davis, M. and P. J. Whalen. 2001. ‘The amygdala: vigilance and emotion’, Molecular Psychiatry, 6: 13–34.
Dehaene, S., L. Naccache, L. Cohen, D. Le Bihan, J. Mangin, J. B. Poline, and D. Riviere. 2001. ‘Cerebral mechanisms of word masking and unconscious repetition priming’, Nature Neuroscience, 4 (7): 727–58.
Desimone, R. and J. Duncan. 1995. ‘Neural mechanisms of selective visual attention’, Annual Review of Neuroscience, 18: 193–222.
Di Lollo, V. 1977. ‘Temporal characteristics of iconic memory’, Nature, 267 (5608): 241–43.
Eagleman, D. M. 2005. ‘Distortions of time during rapid eye movements’, Nature Neuroscience, 8 (7): 850–51.
Eagleman, D. M., P. U. Tse, D. Buonomano, P. Janssen, A. C. Nobre, and A. O. Holcombe. 2005. ‘Time and the brain: how subjective time relates to neural time’, Journal of Neuroscience, 25 (45): 10369–71.
Efron, R. 1970. ‘The minimum duration of a perception’, Neuropsychologia, 8: 57–63.
Fahy, F. L., I. P. Riches, and M. W. Brown. 1993. ‘Neuronal activity related to visual recognition memory: long-term memory and the encoding of recency and familiarity information in the primate anterior and medial inferior temporal and rhinal cortex’, Experimental Brain Research, 96: 457–72.
Gandhi, S. K., A. A. Wassef, and D. M. Eagleman. 2007. ‘Timing judgements in schizophrenia’, 806 13. Neuroscience Meeting Planner. San Diego, CA: Society for Neuroscience. Online.
Gibbon, J., R. M. Church, and W. H. Meck. 1984. ‘Scalar timing in memory’, Annals of the New York Academy of Sciences, 423: 52–77.
Grill-Spector, K., R. Henson, and A. Martin. 2006. ‘Repetition and the brain: neural models of stimulus-specific effects’, Trends in Cognitive Sciences, 10 (1): 14–23.
Henson, R. and M. Rugg. 2001. ‘Effects of stimulus repetition on latency of the BOLD impulse response’, NeuroImage, 13: 683.
Henson, R. N. A. and M. D. Rugg. 2003. ‘Neural response suppression, haemodynamic repetition effects, and behavioural priming’, Neuropsychologia, 41: 263–70.
Hodinott-Hill, I., K. V. Thilo, A. Cowey, and V. Walsh. 2002. ‘Auditory chronostasis: hanging on the telephone’, Current Biology, 12 (20): 1779–81.
Holland, P. C. and M. Gallagher. 1999. ‘Amygdala circuitry in attentional and representational processes’, Trends in Cognitive Sciences, 3 (2): 65–73.
Hong, L. E., A. Summerfelt, I. Wonodi, H. Adami, R. W. Buchanan, and G. K. Thaker. 2007. ‘Independent domains of inhibitory gating in schizophrenia and the effect of stimulus interval’, American Journal of Psychiatry, 164: 61–65.
Ishai, A., P. C. Bikle, and L. G. Ungerleider. 2006. ‘Temporal dynamics of face repetition suppression’, Brain Research Bulletin, 70: 289–95.
Javitt, D. C., S. Grochowski, A. Shelley, and W. Ritter. 1998. ‘Impaired mismatch negativity (MMN) generation in schizophrenia as a function of stimulus deviance, probability, and interstimulus/interdeviant interval’, Electroencephalography and Clinical Neurophysiology, 108: 143–53.
Kanai, R. and M. Watanabe. 2006. ‘Visual onset expands subjective time’, Perception and Psychophysics, 68 (7): 1113–23.
Kanai, R., C. L. Paffen, H. Hogendoorn, and F. A. Verstraten. 2006. ‘Time dilation in dynamic visual display’, Journal of Vision, 6 (2): 1421–30.
Kiehl, K. A. and P. F. Liddle. 2001. ‘An event-related functional magnetic resonance imaging study of an auditory oddball task in schizophrenia’, Schizophrenia Research, 48: 159–71.
Light, G. A. and D. L. Braff. 2005. ‘Mismatch negativity deficits are associated with poor functioning in schizophrenia patients’, Archives of General Psychiatry, 62: 127–36.
Morrone, M. C., J. Ross, and D. Burr. 2005. ‘Saccadic eye movements cause compression of time as well as space’, Nature Neuroscience, 8 (7): 950–54.
Noguchi, Y., K. Inui, and R. Kakigi. 2004. ‘Temporal dynamics of neural adaptation effect in the human visual ventral stream’, The Journal of Neuroscience, 24 (28): 6283–90.
Orfanidou, E., W. D. Marslen-Wilson, and M. H. Davis. 2006. ‘Neural response suppression predicts repetition priming of spoken words and pseudowords’, Journal of Cognitive Neuroscience, 18: 1237–52.
Pariyadath, V. and D. M. Eagleman. 2007. ‘The effect of predictability on subjective duration’, PLoS ONE, 2 (11): e1264. doi:10.1371/journal.pone.0001264.
Pariyadath, V. and D. M. Eagleman. 2008. ‘Brief subjective durations contract with repetition’, Journal of Vision.
Park, J., M. Schlag-Rey, and J. Schlag. 2003. ‘Voluntary action expands perceived duration of its sensory consequence’, Experimental Brain Research, 149: 527–29.
Rainer, G. and E. K. Miller. 2000. ‘Effects of visual experience on the representation of objects in the prefrontal cortex’, Neuron, 27: 179–89.
Ranganath, C. and G. Rainer. 2003. ‘Neural mechanisms for detecting and remembering novel events’, Nature Reviews Neuroscience, 4: 193–204.
Rao, R. P. and D. H. Ballard. 1999. ‘Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects’, Nature Neuroscience, 2 (1): 79–87.
Rose, D. and J. Summers. 1995. ‘Duration illusions in a train of visual stimuli’, Perception, 24: 1177–87. Sereno, A. B., S. R. Lehky, S. Patel, and X. Peng. 2010. ‘A neurophysiological correlate and model of reflexive spatial attention’, in N. Srinivasan, B. R. Kar, and J. Pandey (eds), Advances in Cognitive Science, Volume 2, pp. 104–31. New Delhi: Sage. Slaghuis, W. L. and A. M. Bishop. 2001. ‘Luminance flicker sensitivity in positive- and negative-symptom schizophrenia’, Experimental Brain Research, 138: 88–99. Stetson, C., M. P. Fiesta, and D. M. Eagleman. 2007. ‘Does time really slow down during a frightening event?’, PLoS ONE, 2 (12): e1295. doi:10.1371/journal.pone.0001295. Treisman, M. 1963. ‘Temporal discrimination and the indifference interval: Implications for a model of the “internal clock”’, Psychological Monographs, 77: 1–31. Tse, P. U., J. Intriligator, J. Rivest, and P. Cavanagh. 2004. ‘Attention and the subjective expansion of time’, Perception and Psychophysics, 66 (7): 1171–89. Ulrich, R., J. Nitschke, and T. Rammsayer. 2006. ‘Perceived duration of expected and unexpected stimuli’, Psychological Research, 70: 77–87. Wiggs, C. L. and A. Martin. 1998. ‘Properties and mechanisms of perceptual priming’, Current Opinion in Neurobiology, 8 (2): 227–33. Yarrow, K., P. Haggard, R. Heal, P. Brown, and J. C. Rothwell. 2001. ‘Illusory perceptions of space and time preserve cross-saccadic perceptual continuity’, Nature, 414 (6861): 302–05.
Chapter 12 Implicit Timing Trevor B. Penney, Latha Vaitilingam, and Siwei Liu
Introduction
In a laboratory setting, human time perception is frequently studied using tasks with instructions that require the participant to time a stimulus and to make a subsequent judgement about the stimulus duration. Sometimes, the subject is not told in advance that a duration judgement will be required, but is informed after the duration of interest has elapsed. In both cases, however, the time judgement is explicit because the participant indicates the value of the experienced duration by explicitly categorizing, reproducing, or verbally estimating the duration of the stimulus. In contrast, if a participant is not told to attend to the passage of time, and timing judgements, estimations, or categorizations are not explicitly required, but task performance shows sensitivity to the passage of time, then timing is implicit. Examples include reaction time tasks that have a consistent delay between a cue stimulus and a subsequent imperative stimulus. In other paradigms, participants experience a stream of repeated stimuli. In the attentive version of the paradigm, the participant is told that a stimulus stream will be presented and that he or she must indicate when a stimulus is omitted or the stimulus sequence ends. Instructions for such tasks do not require any reference to interval timing to be made, but successful task completion requires the participant to be sensitive to when a stimulus should occur and to respond appropriately when that does not happen. Electroencephalography (EEG) allows one to investigate timing mechanisms when the participant is not required to attend to the stimulus stream (that is, pre-attentively). For example, in one version of such an approach a stream of repeated stimuli is presented at a consistent inter-stimulus interval (ISI). Occasionally, a stimulus violates this pattern by occurring too early, too late, or by being omitted entirely. 
Such violation stimuli elicit a specific brain response and, as discussed below, this brain measure can be used as an investigative tool.
In everyday life, we are sensitive to and make judgements about the temporal relations between events without necessarily being aware that we are using temporal information. That is to say, much of our use of temporal information is implicit. Moreover, an important practical concern for laboratory studies of temporal processing, particularly experiments with very young or patient populations, is whether the task instructions influence task performance. For example, it is possible that a particular behavioural result, such as a performance deficit in one group versus another, is actually a consequence of a failure to understand and follow task instructions rather than a specific time perception deficit. Consequently, measures that permit examination of interval timing in the absence of explicit timing instructions are potentially of great value to the researcher. These measures can rely entirely on behavioural responses, or take advantage of brain activity measures to reveal processing either before or in the absence of an overt behavioural response. In the following, we review some of the approaches to measuring implicit timing that we believe offer promise both in terms of revealing the details of the interval timing system (that is, the internal clock) as well as serving as useful tools for comparing interval timing ability in various participant groups. This chapter is not intended to be an exhaustive review of the literature, but rather a limited review of an area of time perception research that we believe deserves greater attention in the future. We begin by considering a task in which the participants are not explicitly asked to make a judgement about time, but that requires sensitivity to temporal information for successful completion. Second, we consider the use of electrophysiological measures as markers for both attentive and pre-attentive timing within this explicit/implicit timing framework. 
Finally, we briefly consider implicit timing as an emergent property of continuous motor responses rather than as defined by explicit versus implicit timing instructions.
Behavioural Reaction Time Tasks
It has long been known that in simple reaction time (RT) tasks, the RTs vary proportionally with the length of the delay between stimuli (for example, Kornblum, 1973; Ollman and Billington, 1972). This effect also extends to choice RT tasks. For example, Grosjean et al. (2001) examined RTs in a serial choice reaction time task. Participants were not instructed about the timing of the stimuli, but stimuli were presented at a constant rate before the critical trial. On critical trials, the stimulus occurred earlier than usual, later than usual, or at the appropriate time given the preceding stimulus rate. Compared to stimuli presented at the typical time, RT increased and errors decreased when stimuli were presented earlier than usual and RT decreased and errors increased when the stimuli were presented later than usual. Clearly, performance was sensitive to the delays between stimuli in this experiment, thereby indicating implicit use of temporal information in the task. However, our primary interest here is in RT tasks that are designed to extract more specific information about the interval timing system than is available from a standard
simple or choice RT task. One such implicit timing task, illustrated in Figure 12.1, is the Stop-reaction time (Stop-RT) task (for example, Ten Hoopen, 1985; Rousseau and Rousseau, 1996). In Experiment 1 of Rousseau and Rousseau (1996), participants experienced unimodal, variable length sequences of brief auditory or visual stimuli that had a constant stimulus onset asynchrony (SOA). Participants were instructed to respond as quickly as possible when the sequence ended. The variable sequence length rendered counting stimuli
Figure 12.1 SOA
Note: The top two rows illustrate unimodal single SOA (500 ms) sequences for the auditory (Tone–T) and visual (Light–L) modalities in the Stop-RT task. Sequence length ranged randomly from 10 to 15 stimuli across trials to ensure that participants did not solve the task by counting stimuli. The third and fourth rows illustrate formation of a bimodal single SOA (250 ms) sequence by combining the 500 ms SOA auditory sequence with the 500 ms SOA visual sequence.
to determine sequence end unhelpful. Moreover, the instructions did not include any specific mention of a time judgement. However, in order to perform the task successfully, a participant had to be sensitive to the delay between stimuli and respond when that delay exceeded some criterion value. Evidence from these unimodal single SOA experiments suggests that participants build up a representation of the SOA, which is compared with the interval following each stimulus. If the interval exceeds the SOA by a criterion amount, then an end-of-sequence response is emitted. In other words, a stop response is determined by the duration that has elapsed since the last presented signal. Rousseau and Rousseau (1996) defined the criterion for the response by the formula $C = t + n\,(\mathrm{var}[t])^{1/2}$, which specifies that the stop-response is triggered when the elapsed duration exceeds the sum of the mean of the SOA representation and some multiple of the standard deviation from that mean SOA representation. This decision rule means that the Stop-RT increases as the variance of the mean SOA representation increases; in other words, Weber’s Law holds for timing in the Stop-RT task. However, some of the most interesting results reported by Rousseau and Rousseau (1996) were obtained from experiments that used bimodal single SOA and unimodal multiple SOA (polyrhythmic) sequences. Bimodal single SOA sequences comprise alternating auditory and visual stimuli (see Figure 12.1). To determine whether a bimodal sequence has ended, participants could represent the delay between stimuli, independent of stimulus modality, and respond when that value has been exceeded by the criterion amount. In the example presented in Figure 12.1, the SOA would be 250 ms if the participant merely timed from one stimulus to another independent of stimulus modality. Alternatively, the sequence could be treated as two modality specific subsequences that are timed separately. 
In the example in Figure 12.1, the participant would separately represent 500 ms SOAs for the auditory and visual subsequences. The participant would start a modality specific timing process at the offset of a given signal. If a light was the last signal (that is, a tone was missing), then the participant would respond that the sequence had ended based upon the amount of time elapsed since the last tone stimulus. If a tone was the last stimulus (that is, a light was missing), then the participant would respond based on the amount of time elapsed since the last light stimulus. The SOA represented in memory under the first possibility would be half the length of that represented under the second possibility. Given Weber’s law for time, variance should be smaller with a smaller SOA representation and therefore the Stop-RTs should be smaller and less variable given the decision rule described above. Their results indicated that participants processed these bimodal sequences as two concurrent unimodal sub-sequences. Rousseau and Rousseau (1996) next examined whether or not participants could extract separate SOA subsequences from a complex unimodal polyrhythmic sequence. To create unimodal polyrhythmic sequences, two sequences that had different SOAs (for example, 450 ms and 750 ms) were started at the same time (see Figure 12.2). However, because the signals comprising
Figure 12.2 Polyrhythmic Sequence
Note: A unimodal polyrhythmic sequence (top panel, third row) is formed by combining two same-modality single SOA sequences (top panel, first and second rows). The stimuli forming the two subsequences are physically identical. A bimodal polyrhythmic sequence (bottom panel, third row) is formed by combining two different modality single SOA sequences (bottom panel, first and second rows).
the sequence were identical (for example, the same duration, frequency and amplitude for the auditory signals) it would not necessarily be obvious to the participant that the sequence contained two sub-sequences. Interestingly, the behavioural data indicated that participants did extract subsequences from auditory polyrhythmic sequences, but were not able to do so for visual polyrhythmic sequences (that is, they could parallel time
auditory, but not visual sequences). See Rousseau and Rousseau (1996) and Penney (2003) for a discussion of the implication of these results for information processing models of interval timing. We recently partially replicated and extended the findings of Rousseau and Rousseau (1996) using both behavioural and combined behavioural/electrophysiological measures. Similar to Rousseau and Rousseau (1996), we found superior detection of missing auditory signals as compared to missing visual signals (Figure 12.3). In our bimodal single SOA experiment, Stop-RTs were much faster when an auditory stimulus was missing (that is, the last signal was visual) as compared to when a visual stimulus was missing (that is, the
Figure 12.3 Mean Stop-RT (ms) averaged across 20 participants in a bimodal single SOA experiment from our lab
Note: Sequences alternating from Light to Tone (L-T) are indicated by open diamonds, whereas sequences alternating from Tone to Light (T-L) are indicated by closed squares. This means, for example, that a tone was presented in position 10 for the L-T sequence and the next light stimulus was omitted, whereas a light was presented in position 10 for the T-L sequence and the next tone was omitted. Stop-RTs were slower when the visual signal was missing (that is, the sequence ended with a tone) as compared to when the auditory signal was missing (that is, the sequence ended with a light). This sawtooth pattern is similar in form to that found by Rousseau and Rousseau (1996).
last signal was auditory). More importantly, the overall sawtooth pattern of the results was consistent with Rousseau and Rousseau’s (1996) data and interpretation that participants extract and time the auditory and visual subsequences in parallel when presented with a bimodal single SOA stimulus stream. Moreover, unimodal polyrhythmic (450 ms and 750 ms) experiments from our lab also supported parallel timing of two asynchronous auditory modality signals because Stop-RT magnitude was modulated by sequence position in the expected manner, but did not indicate parallel timing of visual modality signals (see Figure 12.4). As an extension of the findings of Rousseau and Rousseau (1996), we tested parallel timing of three auditory subsequences in two experiments. The SOAs were 450 ms, 562.5 ms, and 750 ms in one experiment and 450 ms, 562.5 ms, and 1125 ms in the other. Stop-RTs did not systematically differ depending on sequence position of the missing stimulus, indicating that participants did not extract the subsequence SOAs from the stimulus stream. These results suggest that when the event markers are spatially and physically identical, then it is not possible for participants to automatically extract and separately time (that is, parallel time) three or more auditory subsequences embedded within an overall sequence. Although the visual subsequences were not timed in parallel, it is unknown at present whether there are circumstances under which visual sequences can be timed in parallel. Indeed, recent evidence showing visual hemi-field specific interference effects in short interval timing (Johnston et al., 2006) suggests that visual parallel timing might be possible when spatially distinct stimuli are used. Moreover, one might expect parallel timing of three or more subsequences if more than one modality is used (for example, two auditory SOAs and one visual). 
We have also used the Stop-RT task to investigate interval timing in both young participants and patient populations (individuals with Sz) because we viewed the absence of explicit reference to timing as critical from a task comprehension perspective. For example, in one study, we compared the unimodal (auditory) Stop-RT performance of good and poor readers (Penney et al., 2005). Our rationale was that if one cause of poor reading ability is a loss of temporal resolution of perceptual processing, then perceptual processing difficulties could extend into the range of hundreds of milliseconds rather than merely being limited to the time range (that is, tens of milliseconds) that is critical for speech perception. We found that poor readers had larger Stop-RTs than good readers, and that the magnitude of the difference was not a function of the particular SOA in effect. This outcome indicated that the difference between good and poor readers was not due to differences in clock or memory function, because such effects would be expected to elicit Stop-RT differences that are proportional to the SOA in use (cf. Penney et al., 2000; Penney, 2003). Rather, the absolute difference in Stop-RT independent of SOA duration could be due to different efficiencies in terminating interval timing between the two groups. If the poor readers were slower to terminate timing, then their representation of the SOA should be larger by an absolute amount. Indeed, generally slower mental operations would impact task completion. These operations could include detection
Figure 12.4 Performance on two SOA polyrhythmic sequence
Note: The top panel illustrates performance on a unimodal (auditory) two SOA polyrhythmic sequence as illustrated in Figure 12.2. The mean stop-RTs (in milliseconds) are averaged across 20 participants, and depicted as a function of the end signal position (1 to 7) since there were seven last signal positions in this experiment. The data are consistent with extraction and parallel timing of the two sub-sequences. In contrast, the data from a unimodal (visual) two SOA polyrhythmic sequence (bottom panel) are not consistent with extraction and parallel timing of the two subsequences.
and representation of the timing signal, comparison with a memory representation of previous SOA durations, and initiation of the behavioural response. In sum, the Stop-RT difference could be due to an overall sensory-motor slowing in the poor reader group as compared to the good reader group. A clear advantage of the Stop-RT task as a tool for investigating interval-timing behaviour is that it is not necessary for researchers to explicitly instruct the participants to time. The experiments described above demonstrate the value of the task in outlining the structure and capacity of the internal clock in the absence of task instructions about how to time stimuli, as well as the task’s usefulness as an easily explained, conceptually simple task for a range of participant populations. Whether the same stimulus stream could be used to extract information about the timing system if timing were explicitly mentioned in the instructions is unclear. It is likely that the instructions would influence how participants treated the stimuli. For example, if a participant were told to time the delay between stimuli, then it is possible he or she would try to time from one stimulus to another independent of stimulus modality rather than to extract single modality subsequences from the overall sequence. How participants respond in this situation is an empirical question, but it is clear the outcome could be determined or at least influenced by the instructions.
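The decision rule attributed to Rousseau and Rousseau (1996) earlier in this section can be sketched numerically. The following is a minimal illustration only, not the authors' model code: the multiplier `N_SD` and the `WEBER_FRACTION` (the assumed ratio of the standard deviation to the mean of the SOA representation) are hypothetical values chosen to make the arithmetic easy to follow, and the function name is ours.

```python
import math


def stop_criterion(mean_soa_ms, n_sd, weber_fraction):
    """Elapsed time (ms) at which a stop response is triggered.

    Follows C = t + n * (var[t]) ** 0.5 from the text. The variance is
    derived here from an assumed Weber fraction (SD / mean), which is an
    illustrative simplification, not a fitted value.
    """
    sd = weber_fraction * mean_soa_ms
    variance = sd ** 2
    return mean_soa_ms + n_sd * math.sqrt(variance)


# Hypothetical parameter values chosen only for illustration.
N_SD = 2.0
WEBER_FRACTION = 0.1

# Two candidate representations of the bimodal sequence in Figure 12.1.
candidates = {
    "one modality-independent 250 ms SOA": 250.0,
    "two modality-specific 500 ms SOAs": 500.0,
}

for label, soa in candidates.items():
    criterion = stop_criterion(soa, N_SD, WEBER_FRACTION)
    margin = criterion - soa  # waiting time beyond the expected SOA
    print(f"{label}: criterion = {criterion:.0f} ms, margin = {margin:.0f} ms")
```

Under these assumed values, the waiting time beyond the expected SOA doubles when the represented SOA doubles, which is the sense in which a shorter SOA representation should yield smaller and less variable Stop-RTs.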
Electrophysiological Measures of Implicit Timing
Numerous studies during the past 20 years have examined interval timing with electrophysiological measures in deviance detection paradigms that either manipulated the duration of a stimulus or the time of occurrence of a stimulus. In some cases, participants were required to attend to the stimuli and respond overtly with a motor response or covertly by maintaining a count of the number of deviants. In other cases, participants were not required to attend to the auditory stimuli and instead focused on reading, watching a movie, or performing some other foreground task (for example, Tse and Penney, 2006). Therefore, processing of the auditory stimuli was assumed to be pre-attentive. Attentive detection of changes in ISI or SOA has also been regularly featured in the electrophysiology of timing literature. In one of the earliest studies, Simson et al. (1976) examined ERP responses to both visual and auditory stimulus sequences. In the auditory experiment, a stimulus was presented every second and approximately 5 per cent of expected stimuli were omitted. Participants indicated that a stimulus was missing by responding after the occurrence of the following stimulus. Stimulus omissions elicited a negative omission potential (OP) that reached its maximum amplitude approximately 230 ms after the time point when the missing stimulus should have occurred and a positive OP that reached its maximum amplitude approximately 465 ms after the missing stimulus should have occurred. The negative OP had a frontal focus, whereas
the positive OP had a parietal maximum. The authors interpreted the negative OPs as ‘in part, anticipatory events activated by an internal representation of the temporal rhythm of stimulus presentation’. In our lab we have used electrophysiological measures in conjunction with the Stop-RT task described above. For example, participants listened to a sequence of brief tones (Penney, 2004; Experiment 1) that had a constant SOA of either 470 or 770 ms, and were required to respond as soon as the sequence ended. For both the 470 ms and the 770 ms conditions, a biphasic negative-positive event-related potential (ERP) response was emitted following the point in time when a stimulus would have occurred had the sequence continued. The negative omission potential (OP) was interpreted as a correlate of an interval timing process that precedes response initiation, whereas the positive OP was interpreted as a target detection response (that is, a P300). A major benefit of such electrophysiological measures is that they potentially provide a window on the processing that occurs before the response is emitted. For example, Penney (2004) compared omitted stimulus ERP responses for both auditory and visual stimulus streams in an effort to tease apart some of the differences in auditory and visual processing described above. However, a particularly attractive aspect of using electrophysiological measures to examine interval timing is that these measures also allow one to differentiate timing processes in the absence of any behavioural response. Indeed, it is not even necessary for the participant to direct attention towards the timing signal; rather the participant can be focused on the completion of a foreground task while background stimuli occur with some temporally defined regularity. Occasionally, a stimulus that deviates from this regularity is presented. 
The primary electrophysiological measure of passive detection of these deviations in environmental stimulation is the ERP component termed the mismatch negativity (MMN; see Näätänen, 2000 and Näätänen et al., 2001, for reviews). Typically, in an ERP study of passive deviance detection, a stream of auditory stimuli is presented, but the participant is not required to attend to the stimulus stream. Most of the stimuli in the stream, termed the standard stimuli, are identical to each other, whereas some stimuli, termed deviants or oddballs, deviate from the standard stimuli on a particular stimulus dimension, such as frequency, intensity, duration, or location. The deviant stimuli elicit an MMN, which usually peaks between 100 and 200 ms after the onset of deviance and is observed as a frontal-central negativity when the reference electrode is placed on the nose. The amplitude and latency of the MMN are related to the degree of deviation from the standard stimulus, such that the larger the degree of deviance, the shorter the MMN latency and the larger the MMN amplitude (see Näätänen et al., 2001; Näätänen and Winkler, 1999). This change detection mechanism is considered to be an automatic pre-attentive process that also sets the stage for further attentive processing of the deviant stimulus. There are at least two approaches to using the MMN as a measure of passive processing of temporal stimuli. First, one may manipulate the duration of a signal and examine
whether or not the subject is sensitive to the changes in the signal duration (for example, Jacobsen and Schröger, 2003). For example, the standard stimulus could be a 400 ms tone and the deviant could be a 350 ms tone. If an MMN is elicited, then the participant, in this example, passively discriminated between the 400 ms standard and the 350 ms deviant stimulus. The second approach is to define the duration of interest as the inter-stimulus interval (ISI) between successive markers (for example, Yabe et al., 1997). This allows one to examine whether or not the participant is sensitive to changes in the time of occurrence of stimulus presentation. A deviant stimulus could be presented earlier than the standard ISI, later than the standard ISI, or omitted entirely. Examples of these passive deviance detection paradigms are presented in Figure 12.5.
Figure 12.5 Examples of typical passive oddball paradigms used in mismatch negativity (MMN) studies of interval timing
Note: The top three rows illustrate time of occurrence oddballs. Tones (for example, 10 ms) are presented at a regular SOA (for example, 100 ms). Occasionally, the regular SOA pattern is violated by presenting the tone too early (for example, after 75 ms), too late (for example, after 125 ms), or by omitting it. The bottom panel illustrates a stimulus duration oddball. In this example, most stimuli are 75 ms tones, but occasionally a 15 ms tone is presented.
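A time of occurrence oddball stream of the kind illustrated in Figure 12.5 can be sketched as a simple schedule generator. This is a hedged illustration, not a published stimulus protocol: the function name, the 10 per cent deviant probability, the rule that two deviants never occur in a row, and the assumption that the stream returns to the regular 100 ms grid after an early or late tone are all ours.

```python
import random


def time_of_occurrence_stream(n_slots, soa_ms=100.0, shift_ms=25.0,
                              deviant_prob=0.1, seed=0):
    """Return a list of (kind, onset_ms) pairs, one per stimulus slot.

    kind is 'standard', 'early', 'late', or 'omitted'; onset_ms is None
    for omitted tones. Slots lie on a regular grid of soa_ms (assumption).
    """
    rng = random.Random(seed)
    events = []
    previous_was_deviant = True  # forces the first slot to be a standard
    for slot in range(n_slots):
        expected = slot * soa_ms
        if not previous_was_deviant and rng.random() < deviant_prob:
            kind = rng.choice(["early", "late", "omitted"])
        else:
            kind = "standard"
        if kind == "early":
            onset = expected - shift_ms   # e.g. 75 ms after the previous tone
        elif kind == "late":
            onset = expected + shift_ms   # e.g. 125 ms after the previous tone
        elif kind == "omitted":
            onset = None                  # no tone is presented in this slot
        else:
            onset = expected
        events.append((kind, onset))
        previous_was_deviant = kind != "standard"
    return events


stream = time_of_occurrence_stream(20)
for kind, onset in stream:
    print(kind, onset)
```

Because each deviant is always preceded by a standard on the grid, an early tone follows the previous tone by 75 ms and a late tone by 125 ms, matching the example values in the figure note.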
Duration of Stimulus
Pre-attentive processing of stimulus duration has been the focus of a number of studies in both normal (for example, Jacobsen and Schröger, 2003; Tervaniemi et al., 1999) and patient (for example, Michie et al., 2000) populations. Of most relevance here are two recent studies that yielded somewhat contradictory results. Näätänen et al. (2004) used standard durations ranging from 400 ms to 1600 ms with deviant durations equal to 50 per cent of the standard. They found that all deviants elicited an MMN, although the magnitude of the MMN was smaller with standard signal durations larger than 800 ms. These findings contrast with those of Grimm et al. (2004), who failed to obtain evidence of pre-attentive timing for a 1000 ms standard/600 ms deviant condition, but were able to show that participants discriminated these durations in an attentive task. It seems unlikely that the differential effect is due to the proportional change in the respective deviants across the two studies (40 per cent versus 50 per cent), although it is possible that the variable SOAs in the Grimm et al. (2004) study had an influence on the results obtained. The authors used nine levels of SOA ranging between 1400 and 2200 ms, meaning the SOA was 1.4 to 2.2 times the standard duration. In contrast, Näätänen et al. (2004) used SOAs that were proportional to the standard duration (1.25 times the standard duration). There is evidence in the literature that the timing context influences timing sensitivity (for example, Kidd and Watson, 1992; Suprenant, 2001). In any case, the results of these studies indicate that the range of operation of the pre-attentive timing system and the parameters that influence this range remain to be firmly established.
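The proportions quoted in this comparison can be checked with a few lines of arithmetic. The sketch below uses only the values stated above; the helper names are ours, and for Näätänen et al. (2004) only the two endpoint standards (400 ms and 1600 ms) are evaluated because the intermediate standards are not listed here.

```python
def proportional_change(standard_ms, deviant_ms):
    """Fraction by which the deviant is shorter than the standard."""
    return (standard_ms - deviant_ms) / standard_ms


def soa_ratio(soa_ms, standard_ms):
    """SOA expressed as a multiple of the standard duration."""
    return soa_ms / standard_ms


# Grimm et al. (2004): 1000 ms standard, 600 ms deviant, SOA 1400 to 2200 ms.
grimm_change = proportional_change(1000, 600)
grimm_ratios = (soa_ratio(1400, 1000), soa_ratio(2200, 1000))
print("Grimm:", grimm_change, grimm_ratios)

# Naatanen et al. (2004): deviant = 50 per cent of standard, SOA = 1.25 x standard.
naatanen = {}
for standard in (400, 1600):  # endpoints of the reported range only
    deviant = 0.5 * standard
    soa = 1.25 * standard
    naatanen[standard] = (proportional_change(standard, deviant),
                          soa_ratio(soa, standard))
print("Naatanen:", naatanen)
```

The calculation makes the contrast explicit: the deviant change differs only modestly between the studies, whereas the SOA is a fixed multiple of the standard in one design and varies over a wide range in the other.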
Time of Occurrence
Pre-attentive processing of time of occurrence has also been a focus of study. For example, Ford and Hillyard (1981) presented tone sequences that had a standard ISI of 300 ms, and occasionally (5 per cent) tones were delayed by 300 ms, but participants were not required to attend to the tones. Interestingly, these tone omissions failed to elicit an omission potential. Indeed, more recent work showed an omission potential could be elicited by unattended stimulus omissions, but only when the stimulus onset asynchrony (SOA) was less than about 150 ms (Yabe et al., 1997). Yabe and colleagues (1997) examined seven SOA durations (100, 125, 150, 200, 250, 300, and 350 ms), but obtained a clear omission effect, in this case an MMN, for the 100 and 125 ms SOA conditions only. These results, combined with the earlier attentive omission results described above, suggest that the failure to obtain an omission effect in the Ford and Hillyard (1981) study was due to a combination of the specific SOA used and the absence of an attentive stimulus-processing requirement in the task. However, in an event-related optical imaging study of pre-attentive timing, we showed that omitting a stimulus when the standard ISI was 1500 ms or presenting a stimulus
500 ms too early elicited brain activity in the superior temporal gyrus (STG) in the former case and the STG and the inferior frontal gyrus (IFG) in the latter case (Tse and Penney, 2007). These findings corroborate other optical imaging work from our lab (Tse et al., 2006) in which omission deviants in a stream of stimuli with a standard SOA of 87 ms elicited both STG and IFG activity. The results reviewed above are particularly interesting because they suggest a potential distinction between pre-attentively timing the duration of a signal and timing when a signal will occur, as well as corroborating the evidence that attending to a timing signal, whether how long or when, extends the range of duration sensitivity. The former possibility is surprising because time of occurrence is effectively equivalent to a duration measure when one considers that the ISI in a time of occurrence paradigm may be considered equivalent to the timing signal in a regular duration paradigm. For example, in a time of occurrence paradigm, 50 ms tone pips could be presented with an ISI of 500 ms 90 per cent of the time, but occasionally the ISI could be 600 ms. Here, the delay between stimuli is critical and this is what the participants are assumed to be timing. The participant has a representation of the standard ISI and compares each experienced ISI to it. How is this different from timing a 500 ms tone that is presented repeatedly with an occasional 600 ms tone interceding? From the perspective of pre-attentive brain activity measures, however, signal duration and time of occurrence paradigms may engage different processing mechanisms. As noted above, when participants are presented with sequences of filled durations, an MMN response is elicited when the standard is up to 1600 ms in duration and the deviant is 800 ms in duration (Näätänen et al., 2004). 
In contrast, when short tone pips are presented and the crucial factor is when stimuli occur, an MMN response is consistently obtained with ISIs up to about 200 ms only (Yabe et al., 1997, 1998) and results for ISIs greater than that are less consistent. However, consistent omission effects are obtained for longer ISI durations when participants are required to attend to the stimulus stream (for example, Penney, 2004). These differences raise the possibility that the two situations, although quite similar in some respects, actually recruit different neural mechanisms.
Implicit Timing as an Emergent Property of a Task
There is an additional sense in which explicit and implicit timing may be distinguished. Although detailed discussion of this view is beyond the scope of the present chapter, we briefly present the critical ideas here and refer the reader to the relevant literature for further information. Within this framework, rather than focusing on whether or not the participant is required to explicitly make a timing judgement, the focus is on the nature of the temporal representation used to complete the task. According to this viewpoint, any task that requires an explicit representation of time is an explicit timing task, whereas when timing behaviour emerges from the participant’s continuous response
220 Trevor B. Penney et al. related movement it is an implicit timing task. For example, tapping in synchrony with a metronome and then continuing tapping at the same rate after metronome offset would be considered an explicit timing task within this framework because representation of temporal intervals (for example, the delay between taps) is necessary to perform the task effectively, even though the participant is not explicitly instructed to time (Biberstine et al., 2005; Spencer and Zelaznik, 2003; Zelaznik et al, 2002; Zelaznik et al., 2000). In contrast, a task like continuous circle-drawing is considered an implicit timing task within this framework because the timing behaviour emerges from the participant’s continuous movement. Specifically, timing is considered a by-product of the processes that control the movement such as the modulation of limb stiffness in order to constrain the trajectory of the movement (Robertson et al., 1999; Zelaznik et al., 2002). Even though the target interval within which each circle must be completed is externally specified at the beginning of the task, it is assumed that once the continuous drawing behaviour has been initiated, the movement is no longer dependent on an explicit representation of time (Zelaznik et al., 2005; Zelaznik et al., 2002). These authors have argued that the temporal representations of tasks that are continuous in nature, such as circle drawing, differ from those of tasks of a discrete nature, such as duration perception and tapping. This view is based on variability analyses of duration perception, tapping, and continuous movement tasks. Given tasks that share common timing mechanisms should show similar variability patterns, the similar variability patterns for tapping and duration perception suggest common timing mechanisms (for example, Ivry and Hazeltine, 1995; Keele et al., 1985). In line with this approach, Robertson et al. 
(1999) asked participants to perform tapping and drawing tasks paced initially by a metronome and then to maintain that rate of tapping or circle drawing following disengagement of the metronome. Variability measures in the continuation phase were correlated across different durations within each task, but no significant correlation was found between tasks. Participants were three times more consistent in circle drawing than in tapping even at the same prescribed rate of movement. The authors suggested that the results reflect a distinction between explicit and implicit timing processes. In order to achieve synchrony with the metronome, it is essential for participants to estimate time in tapping tasks so that they will know when to produce the tap. With the exception of the production of the tap, there is no other movement trajectory involved in this task. In contrast, consistent performance in continuous drawing tasks depends upon and is driven by the movement trajectory and motor processes. Therefore, implicit timing in continuous drawing tasks is likely an emergent property of the circle drawing movement itself because the movement dynamic is not controlled by an event separate from the trajectory of the movement. Systematic variation of the demands of the circle drawing and tapping tasks provided further confirmation of the difference in timing variance between tapping and circle drawing (Zelaznik et al., 2000) when movement rate was and was not set by a metronome beat. While within task correlations in timing variability were found for the paced
and unpaced tapping tasks, no significant correlations were evident for the paced and unpaced circle drawing tasks, and no significant between task correlations were found. The lack of a relationship between tapping and circle drawing performance regardless of whether the movement was paced by a metronome or at the participant’s own preferred rate of responding is crucial because it corroborates the claim that the timing processes underlying tapping and circle drawing differ and that this difference is not driven by preferred rate of movement. Comparison of duration discrimination (a non-motor task), tapping, intermittent circle drawing (hybrid of tapping and continuous circle drawing) and continuous circle drawing tasks revealed that temporal precision in tapping, duration discrimination and intermittent circle drawing tasks were significantly correlated with one another, whereas performance on these tasks was not correlated with that on the continuous circle drawing task (Zelaznik et al., 2002). The lack of a correlation between the intermittent and continuous circle drawing tasks in spite of task similarity suggests that the requirement to pause for 500 ms after each complete circle in the intermittent task significantly influenced the temporal representation used. Rather, the temporal information in the intermittent circle-drawing task appears to be represented in a manner more related to that of the tapping task. The lack of a correlation between performance in the continuous circle drawing task and the other three tasks, which were significantly correlated with one another, implies that while the latter tasks share a common timing mechanism, the continuous circle drawing is controlled by a different timing process. In sum, the studies described above support the implicit-explicit distinction in timing processes using discrete tasks like tapping versus continuous drawing tasks. 
This distinction has clear impact on whether or not particular tasks, independently of whether or not timing is explicitly mentioned, can be used to investigate the interval timing system.
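The correlational logic behind these studies can be sketched in a few lines of Python. The sketch assumes that timing variability is summarized per participant as a coefficient of variation of the produced intervals and that tasks are compared with a Pearson correlation across participants; the function names are hypothetical, and the cited studies may have used different variability measures.

```python
from statistics import mean, stdev


def coefficient_of_variation(intervals_ms):
    """Timing variability of one participant in one task: SD / mean."""
    return stdev(intervals_ms) / mean(intervals_ms)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def between_task_correlation(task_a, task_b):
    """Correlate timing variability across participants for two tasks.

    task_a, task_b: dicts mapping a participant id to the list of
    intervals (ms) that participant produced in that task.
    Only participants present in both tasks are used.
    """
    ids = sorted(set(task_a) & set(task_b))
    cv_a = [coefficient_of_variation(task_a[i]) for i in ids]
    cv_b = [coefficient_of_variation(task_b[i]) for i in ids]
    return pearson_r(cv_a, cv_b)
```

On this reasoning, a reliable correlation between two tasks (for example, tapping and duration discrimination) is read as evidence for a shared timing mechanism, whereas the absence of a correlation (for example, tapping and continuous circle drawing) is read as evidence for distinct processes.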
Conclusion
Although human interval timing studies in which the participant is explicitly asked to time a stimulus have revealed many features of the cognitive and neural mechanisms underlying interval timing, it is clearly the case that much of our use of interval timing in daily life occurs in situations where we have not been explicitly asked to time. Moreover, some individuals may have difficulty understanding the task instructions of a typical explicit timing task or the instructions may influence the behaviour observed in the participants. Consequently, it is of value to examine implicit timing behaviour in a controlled laboratory setting. Behavioural and electrophysiological methods, as well as the combination of the two approaches, offer much promise for the study of implicit timing. Here, we have reviewed the Stop-RT task as well as electrophysiological passive deviance detection paradigms for the study of the interval timing system in its own right, and as a tool for understanding information processing in general. Although good progress has been made towards revealing the capacity and constraints on implicit timing, as well as its relationship to explicit timing, many questions remain to be answered.
References
Biberstine, J., H. N. Zelaznik, L. Kennedy, and E. Whetter. 2005. ‘Timing precision in circle drawing does not depend on spatial precision of the timing target’, Journal of Motor Behavior, 37 (6): 447–53.
Ford, J. M. and S. A. Hillyard. 1981. ‘Event-related potentials (ERPs) to interruptions of a steady rhythm’, Psychophysiology, 18 (3): 322–30.
Grimm, S., A. Widmann, and E. Schröger. 2004. ‘Differential processing of duration changes within short and long sounds in humans’, Neuroscience Letters, 356 (2): 83–86.
Grosjean, M., D. A. Rosenbaum, and C. Elsinger. 2001. ‘Timing and reaction time’, Journal of Experimental Psychology: General, 130 (2): 256–72.
Ivry, R. B. and R. E. Hazeltine. 1995. ‘The perception and production of temporal intervals across a range of durations: evidence for a common timing mechanism’, Journal of Experimental Psychology: Human Perception and Performance, 21 (1): 1–12.
Jacobsen, T. and E. Schröger. 2003. ‘Measuring duration mismatch-negativity’, Clinical Neurophysiology, 114 (6): 1133–43.
Johnston, A., D. H. Arnold, and S. Nishida. 2006. ‘Spatially localized distortions of event time’, Current Biology, 16 (5): 472–79.
Keele, S. W., R. A. Pokorny, D. M. Corcos, and R. B. Ivry. 1985. ‘Do perception and motor production share common timing mechanisms: a correlational analysis’, Acta Psychologica, 60 (2–3): 173–91.
Kidd, G. R. and C. S. Watson. 1992. ‘The ‘proportion of the total duration rule’ for the discrimination of auditory patterns’, Journal of the Acoustical Society of America, 92 (6): 3109–18.
Kornblum, S. 1973. ‘Simple reaction time as a race between signal detection and time estimation: a paradigm and model’, Perception and Psychophysics, 13: 108–12.
Michie, T. T., T. W. Budd, J. Todd, D. Rock, H. Wichmann, J. Box, and A. V. Jablensky. 2000. ‘Duration and frequency mismatch negativity in schizophrenia’, Clinical Neurophysiology, 111 (6): 1054–65.
Näätänen, R. 2000. ‘The perception of speech sounds by the human brain as reflected by the mismatch negativity (MMN) and its magnetic equivalent (MMNm)’, Psychophysiology, 38 (1): 1–21.
Näätänen, R., O. Syssoeva, and R. Takegata. 2004. ‘Automatic time perception in the human brain for intervals ranging from milliseconds to seconds’, Psychophysiology, 41 (4): 660–63.
Näätänen, R., M. Tervaniemi, E. Sussman, P. Paavilainen, and I. Winkler. 2001. ‘‘Primitive intelligence’ in the auditory cortex’, Trends in Neurosciences, 24 (5): 283–88.
Näätänen, R. and I. Winkler. 1999. ‘The concept of auditory stimulus representation in cognitive neuroscience’, Psychological Bulletin, 125 (6): 826–59.
Ollman, R. T. and M. J. Billington. 1972. ‘The deadline model for simple reaction time’, Cognitive Psychology, 3 (2): 311–36.
Penney, T. B., K. Man. Leung, P. Chi. Chan, X. Meng, and C. A. McBride-Chang. 2005. ‘Poor readers of Chinese respond slower than good readers in phonological, rapid naming, and interval timing tasks’, Annals of Dyslexia, 55 (1): 9–27.
Penney, T. B. 2004. ‘Electrophysiological correlates of interval timing in the Stop RT task’, Cognitive Brain Research, 21 (2): 234–49.
Penney, T. B. 2003. ‘Modality differences in interval timing: attention, clock speed, and memory’, in W. H. Meck (ed.), Functional and Neural Mechanisms of Interval Timing, pp. 209–34. Boca Raton, FL: CRC Press.
Penney, T. B., J. Gibbon, and W. H. Meck. 2000. ‘Differential effects of auditory and visual signals on clock speed and temporal memory’, Journal of Experimental Psychology: Human Perception and Performance, 26 (6): 1770–87.
Robertson, S. D., H. N. Zelaznik, D. Lantero, K. G. Bojczyk, R. M. Spencer, J. G. Doffin, and T. Schneidt. 1999. ‘Correlations for timing consistency among tapping and drawing tasks: evidence against a single timing process for motor control’, Journal of Experimental Psychology: Human Perception and Performance, 25 (5): 1316–30.
Rousseau, L. and R. Rousseau. 1996. ‘Stop-reaction time and the internal clock’, Perception and Psychophysics, 58 (3): 434–48.
Simson, R., H. G. Vaughan, and W. Ritter. 1976. ‘The scalp topography associated with missing visual or auditory stimuli’, Electroencephalography and Clinical Neurophysiology, 40 (1): 33–42.
Spencer, R. M. C. and H. N. Zelaznik. 2003. ‘Weber (slope) analyses of timing variability in tapping and drawing tasks’, Journal of Motor Behavior, 35 (4): 371–81.
Suprenant, A. M. 2001. ‘Distinctiveness and serial position effects in tonal sequences’, Perception and Psychophysics, 63 (4): 737–45.
Ten Hoopen, G. 1985. ‘The detection of anisochrony in monaural and interaural sequences’, in J. A. Michon and J. L. Jackson (eds), Time, Mind and Behavior, pp. 140–50. Berlin: Springer-Verlag.
Tervaniemi, M., A. Lehtokoski, J. Sinkkonen, J. Virtanen, R. J. Ilmoniemi, and R. Näätänen. 1999. ‘Test-retest reliability of mismatch negativity for duration, frequency, and intensity changes’, Clinical Neurophysiology, 110 (8): 1388–93.
Tse, C. Y. and T. B. Penney. 2006. ‘Pre-attentive timing of empty intervals is from marker offset to onset’, Psychophysiology, 43 (2): 172–79.
Tse, C. Y., K. R. Tien, and T. B. Penney. 2006. ‘Event-related optical imaging reveals the temporal dynamics of right temporal and frontal cortex activation in pre-attentive change detection’, NeuroImage, 29 (1): 314–20.
Tse, C. Y. and T. B. Penney. 2007. ‘Optical imaging of cortical activity elicited by unattended temporal deviants’, IEEE Engineering in Medicine and Biology Magazine, 26: 52–58.
Yabe, H., M. Tervaniemi, K. Reinikainen, and R. Näätänen. 1997. ‘Temporal window of integration revealed by MMN to sound omission’, NeuroReport, 8 (8): 1971–74.
Yabe, H., M. Tervaniemi, J. Sinkkonen, M. Huotilainen, R. J. Ilmoniemi, and R. Näätänen. 1998. ‘Temporal window of integration of auditory information in the human brain’, Psychophysiology, 35 (5): 615–19.
Zelaznik, H. N., R. M. C. Spencer, R. B. Ivry, A. Baria, M. Bloom, L. Dolansky, S. Justice, K. Patterson, and E. Whetter. 2005. ‘Timing variability in circle drawing and tapping: probing the relationship between event and emergent timing’, Journal of Motor Behavior, 37 (5): 395–403.
Zelaznik, H. N., R. M. C. Spencer, and R. B. Ivry. 2002. ‘Dissociation of explicit and implicit timing in repetitive tapping and drawing movements’, Journal of Experimental Psychology: Human Perception and Performance, 28 (3): 575–88.
Zelaznik, H. N., R. M. Spencer, and J. G. Doffin. 2000. ‘Temporal precision in tapping and circle drawing movements at preferred rates is not correlated: further evidence against timing as a general purpose ability’, Journal of Motor Behavior, 32 (2): 193–99.
Chapter 13
Localization and Dynamics of Cerebral Activations Involved in Time Estimation: Studies Combining PET, fMRI, and EEG Data
Viviane Pouthas
Introduction
Time is crucial for our everyday activities. Humans, like other animals, process temporal information over different timescales (for example, Buhusi and Meck, 2005; Mauk and Buonomano, 2004). This chapter concerns the scale of tens to hundreds of milliseconds, which is fundamental for speech and motor coordination, and the scale of seconds to minutes, which is generally seen as the conscious perception of time. Without the ability to discriminate differences in duration and appreciate time, other cognitive functions, such as visual and auditory awareness, would be severely impaired. In the range of seconds, we estimate the duration of traffic lights in order to cross the road when cars have stopped, and when we drive a car we are able to anticipate the right moment to start again. In the millisecond range, timing is crucial for speech, music and motor control. For example, we have to discriminate the duration of linguistic sounds and accurately pause between sentences and speaking turns to achieve optimal communication. Yet, the neural mechanisms involved in the processing of these duration ranges remain largely unknown, in contrast with the mechanisms involved in other timescales, such as those controlling the circadian sleep–wake activity. Whereas time is clearly a source of information that shapes one’s behaviour, nobody has yet identified any sense or sense organ by which time can be directly perceived. The stimulus “time” is not a stimulus per se, but could correspond to an internally generated activity in the nervous system (thus increased neural activity with time could code for elapsed duration). The register of this internal activity, the internal clock dedicated to interval timing (hundreds of ms to several seconds), has long been thought to be centralized, using the same brain circuitry
for motor and perceptual timing, as well as for estimating the duration of an auditory or visual stimulus. However, from a behavioural point of view, a distributed (rather than unified) view of psychological time has recently been proposed (for example, Grondin, 2001), suggesting that more than one central timekeeping system may contribute to time perception. In addition, thanks to brain imaging studies, neuroscientists have recently shown that multiple brain areas are involved in the judgement or production of brief intervals. But two questions remain a matter of debate: How do neurons in those regions measure time? Is the observed pattern of activation specific to time processing? This chapter will give a bird’s eye view of EEG (electrophysiological), fMRI (Functional Magnetic Resonance Imaging) and PET (Positron Emission Tomography) studies, which have contributed to answering these questions.
EEG Studies: How does the Brain Code for Target Times?
In the 1960s, electrophysiologists described a slow negative wave, called the Contingent Negative Variation (CNV), that develops between two stimuli: S1 corresponding to a warning stimulus and S2 to an imperative stimulus (for example, Walter et al., 1964). One of the processes reflected by this wave was thought to be the estimation of time between S1 and S2 (Figure 13.1). The CNV is generally observed between two stimuli, but this wave also occurs during a continuous stimulus where onset and offset correspond to warning and imperative stimulus, respectively. EEG studies have shown that: (a) this slow wave develops over wide areas of the scalp, mainly over frontal and central areas; (b) it reflects the preparation and/or anticipation of a response; (c) it is associated with several cognitive processes including expectancy and attention. Since then, the relationship between time estimation and CNV has been well documented. It has been shown that its amplitude and time course vary depending on performance when participants have to estimate the duration of a signal or to time an action. For example, in the study by Macar et al. (1999), subjects learnt the duration of a 2.5 s target interval. Thereafter, they had to judge an interval as equal to, shorter than or longer than the memorized target. There was a strong bias towards equality. CNV amplitude at the fronto-central (FCZ) electrode was found to reflect the judged interval duration despite the fact that its objective duration was strictly the same. The results showed that the longer the interval was judged to be, the larger was the CNV amplitude. However, the picture of CNV variations given by the literature is rather complex. The progressive development of the CNV might reflect the elaboration of an internal time reference, corresponding to the target duration. Once this reference is constituted, it may be used more automatically.
The results of a recent experiment conducted by my research group illustrate this point (for example, Pfeuty et al., 2003). The aim of the study was to specify the relationship between the time course of the CNV and perceived duration. The paradigm we used was a delayed matching-to-sample task. A 700 ms standard tone was presented six times; then five test tones
Figure 13.1 Contingent Negative Variation (CNV)
Source: Adapted from Walter et al. 1964.
Note: The CNV develops between the warning stimulus (S1) and the imperative stimulus (S2). Its amplitude is maximal over frontal-central areas.
of different durations were presented 20 times each in random order, making 100 trials. The task of the participant was to decide whether the duration of the tone was equal to (right button press) or different from (left button press) the previously memorized standard duration. Figure 13.2 displays the average generalization gradient, that is, the percentage of “standard responses” for the five test durations. The highest percentage was obtained for the standard test duration (68.5 per cent). Individual generalization gradients showed a rather large amount of inter-individual variability. For example, subject 12 (S12) gave more standard responses for long than for short durations, while subject 2 (S2) gave more standard responses for short than for long durations. For each participant, we calculated an index that we named the gradient asymmetry coefficient: a positive value means that the standard response percentage was higher for long than for short durations. Conversely, a negative value means that the standard response percentage was higher for short than for long durations. A positive value was obtained for eight subjects and a negative value was obtained for four subjects (Figure 13.2). ERP analyses were performed over the vertex (FCZ electrode site), where CNV amplitude was maximal. For the short and standard durations, CNV amplitude continued to increase until the end of the stimulus, whereas for the long durations, the activity had already reached its maximum before the end. Thus, CNV amplitude kept increasing between the short and standard durations but not between the standard and long durations. This showed that CNV amplitude increased until the standard duration had elapsed (Figure 13.3a). Then, we investigated whether or not there were relationships between the subjective standard duration and the time course of CNV activity. The analysis was facilitated by
Figure 13.2 Generalization gradients
Note: (Left) Average generalization gradient: mean and standard error of the standard response percentage (ordinate) as a function of test duration (abscissa). (Right) Individual generalization gradients for the 12 subjects.
the fact that there was a large variability between the individual subjective standard durations. Therefore, for the long test duration, we looked at the correlation between the CNV peak latency and the gradient asymmetry coefficient. Based on the assumption that inter-individual variability in the gradient reflected individual subjective standard duration, results showed that, for long duration trials, the CNV peaked earlier when the subjective standard was shortened and later when the memorized standard was lengthened (Figure 13.3b). These results provide an argument that CNV activity at medial sites is related to the memorized standard duration. In conclusion, the analysis of the relationship between CNV activity and test duration suggests that medial frontal activity increased with the current test duration, then stopped when the current duration reached the memorized (subjective) standard duration. Medial frontal activity may be linked to an accumulation process (time code) that stops once the memorized duration has elapsed. This interpretation is consistent with one of the prominent information-processing models of temporal cognition (for example, Gibbon et al., 1984). According to this model, a hypothetical clock mechanism, which consists of an oscillatory pacemaker emitting pulses and of an accumulator counting them during the signal duration, provides the raw material for measuring time (clock stage). The output from the accumulator corresponding to the current time is stored transiently in a working memory system. Finally, a mechanism compares the current duration values with those
Figure 13.3 Peak latency
Source: Adapted from Pfeuty et al. 2003.
Note: (a) Grand mean event-related potentials (CNVs) over FCZ elicited by the presentation of the 490 ms (short), 700 ms (standard), and 910 ms (long) test durations. (b) Diagram of correlation between the gradient asymmetry coefficient and the CNV peak latency over FCZ for the long test duration.
in reference memory (the target durations) to decide on the adequate temporal response (decision stage): in our study participants had to decide whether or not the current test duration was different from the standard (target) value. This leads us to the question of the CNV generators and, further, of the cerebral regions involved in timing. Multiple cortical as well as subcortical regions have been proposed to participate in the generation of the CNV. Studies in patients with Parkinson’s disease (PD) provide some clues on the possible CNV generators. It has been shown that the amplitude of this ERP component is significantly reduced in PD patients, particularly over the frontal sites. Furthermore, the degree of CNV reduction in PD patients seems to be directly related to the severity of the disease, and CNV amplitude seems to increase following treatment with levodopa (for example, Amabile et al., 1986). In the study by Gerschlager et al. (1999), the CNV was recorded from patients with Parkinson’s disease when on and off bilateral subthalamic nucleus stimulation. Without subthalamic nucleus stimulation, Parkinson’s disease patients showed reduced CNV amplitudes over frontal and frontal–central regions compared with control subjects, probably reflecting impaired frontal activation. With bilateral subthalamic nucleus stimulation, CNV amplitudes over frontal and frontal–central regions were significantly increased.
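The three stages of the pacemaker-accumulator model outlined earlier in this section (clock, memory, decision) can be illustrated with a small Python simulation. This is a minimal sketch under stated assumptions, not the model as specified by Gibbon et al. (1984): the pacemaker is assumed to emit pulses as a Poisson process, the reference memory is idealized as the expected pulse count for the standard duration, and the decision rule is a relative-difference threshold. The function names, the 50 Hz rate and the 0.15 threshold are arbitrary illustrative choices.

```python
import random


def accumulate_pulses(duration_ms, rate_hz, rng):
    """Clock stage: count pulses emitted during a signal of duration_ms.

    Assumption: pulses follow a Poisson process with mean rate rate_hz,
    so inter-pulse intervals are exponentially distributed.
    """
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate_hz) * 1000.0  # next inter-pulse gap in ms
        if t > duration_ms:
            return count
        count += 1


def proportion_standard(test_ms, standard_ms=700, rate_hz=50.0,
                        threshold=0.15, n_trials=500, seed=0):
    """Proportion of simulated trials judged 'equal to the standard'."""
    rng = random.Random(seed)
    # Memory stage (idealized): expected pulse count for the standard.
    reference = rate_hz * standard_ms / 1000.0
    hits = 0
    for _ in range(n_trials):
        current = accumulate_pulses(test_ms, rate_hz, rng)
        # Decision stage: respond 'standard' when the relative difference
        # between current and remembered counts is below the threshold.
        if abs(current - reference) / reference < threshold:
            hits += 1
    return hits / n_trials


# Simulated generalization gradient for the short, standard and long
# test durations named in the text.
gradient = {d: proportion_standard(d) for d in (490, 700, 910)}
```

Under these assumptions the simulated gradient peaks at the standard duration, which is the qualitative pattern a generalization task is designed to reveal; shifting the reference count up or down would skew the gradient towards long or short durations, analogous to a lengthened or shortened subjective standard.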
Impaired frontal cortex function probably stems from impaired basal ganglia outflow, which normally provides important input to frontal regions via motor and dorsolateral prefrontal basal ganglia–thalamo–cortical loops (Figure 13.4). Brain imaging studies as well as pharmacological and lesion studies corroborate the involvement of the frontal cortex and basal ganglia. But where in the brain time is processed remains an open issue.
Functional Imaging Studies: Where is Time Processed?
Different hypotheses can be found in the literature, nicely synthesized by Ivry and Spencer (2004) (Figure 13.5): (a) only one cerebral region would be dedicated to time processing, for example the cerebellum; (b) a large distributed network of cerebral areas would be involved in timing; (c) temporal information would be processed in a specific region depending on the temporal task (duration discrimination, duration production or reproduction, et cetera), on the sensory modality (visual, auditory) or on the duration range (for example, sub-second versus supra-second, as discussed by Lewis and Miall, 2003).
Figure 13.4 Mean CNVs
Source: Adapted from Gerschlager et al. 1999.
Note: Mean CNVs for Parkinson’s disease patients on (thick line) compared with off (thin line) subthalamic nucleus stimulation.
Figure 13.5 Three hypothetical models of the neural mechanisms for timing
Source: Ivry and Spencer, 2004.
The purpose of recent studies has been to answer the crucial question of the specificity of brain structures in temporal processing. Indeed, basal ganglia are also involved in motor functions, and frontal-parietal networks are involved in attention and memory independently of the stimulus dimension to be processed. Two of these fMRI studies are reported here. The less we attend to the temporal properties of a stimulus, the more likely we are to misestimate its duration (for example, Brown, 1997; Macar et al., 1994). Using a dual-task paradigm in which attention to duration versus colour was parametrically varied, Coull et al. (2004) showed that decreasing attention to time progressively increased
errors in time trials but reduced them in colour trials (Figure 13.6a). More interestingly, increasing attention to time selectively increased activity in a corticostriatal network involving the pre-supplementary motor area (pre-SMA) (Figure 13.6b), dorsal premotor cortex (PMC), putamen and frontal operculum. Increasing attention to colour selectively increased activity in area V4 (Figure 13.6b). Having demonstrated that activation of these brain areas was sensitive to parametric attentional modulation of timing, the authors concluded that these areas form the core neuroanatomical substrates of timing behaviour.
Figure 13.6 Attention conditions
Source: Adapted from Coull et al. 2004.
Note: (a) Behavioural data for the five attention conditions. (b) Amplitude of specific cortical activity, in V4 for colour and in pre-SMA for timing, as a function of the attentional instructions.
A strategy for specifying structures involved in timing arose from results of EEG studies, which suggest that activity in the brain increases with increasing time. Therefore, in our own study (for example, Pouthas et al., 2005), we manipulated the stimulus duration to be discriminated. We assumed that regions within a time estimation network, in which activity is modulated when the duration to be estimated is varied, play a crucial role in the perception of time. Before scanning, six young adults were trained to discriminate two standard durations from slightly non-standard durations: a 450 ms interval (short duration) from shorter (325 ms) and longer (575 ms) intervals, and a 1300 ms interval (long duration) from shorter (975 ms) and longer (1625 ms) intervals. Two brief light-emitting diode (LED) flashes delineated the intervals. Participants gave their responses by left (non-standard) and right (standard) button presses. The letter C (for court, ‘short’ in French) was presented before the short estimation trials and the letter L (for long, ‘long’ in French) was presented before the long estimation trials. Control trials with the letter C or the letter L and requiring no estimation were also included. The activation revealed by the comparison of all estimation trials with all control trials showed that the pre-SMA, the anterior cingulate, the prefrontal and parietal cortices and the basal ganglia were involved in the estimation trials whatever the duration to be estimated. Then, we examined the effect of the duration to be estimated by comparing long and short duration estimation trials. To restrict this analysis to the regions constituting the time estimation network revealed by the comparison of estimation trials with control trials, we used this former comparison as an inclusive mask. Moreover, in order to ensure that the duration effect could not be attributed to the presentation of the L/C letter initiating the trials, we also used as an inclusive mask the contrast testing for the interaction between the estimation/control and L/C cues. Only a subset of the regions involved in the ‘estimation network’ yielded increased activation with increasing time: the pre-SMA, the anterior cingulate, the right inferior frontal gyrus, the bilateral premotor cortex and the right caudate nucleus (Figure 13.7).
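The logic of inclusive masking used in this analysis can be sketched in Python as a set intersection. The sketch is a deliberate simplification: it works on named regions rather than voxels, uses a single threshold for all contrasts, and the function names are hypothetical; it is not the analysis pipeline of the study itself.

```python
def suprathreshold(stat_map, threshold):
    """Return the set of regions whose statistic exceeds the threshold.

    stat_map: dict mapping a region label to a test statistic.
    """
    return {region for region, value in stat_map.items() if value > threshold}


def inclusive_mask(contrast_of_interest, mask_contrasts, threshold):
    """Keep regions significant in the contrast of interest AND in every mask.

    contrast_of_interest: dict of statistics (e.g. long versus short trials).
    mask_contrasts: list of dicts (e.g. estimation versus control trials,
    and the estimation/control by cue interaction).
    """
    regions = suprathreshold(contrast_of_interest, threshold)
    for mask in mask_contrasts:
        regions &= suprathreshold(mask, threshold)
    return regions
```

A region therefore counts as showing a duration effect only if it also belongs to the estimation network and shows the interaction with the cue, which is why only a subset of the network survives.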
In sum, there was no greater activation for long durations in either the right dorsolateral prefrontal cortex or the right inferior parietal cortex. These areas probably subserve attentional resources and mnemonic operations not uniquely dedicated to the perception of time. The greater activation that we observed in pre-SMA for long durations in comparison with short ones provides a strong argument for giving this structure a prominent role in temporal processing. Moreover, the caudate nucleus showed reliably enhanced activation. These results provide evidence that the pre-SMA, together with part of the basal ganglia, plays a crucial role in time coding. This role could be the accumulation function postulated by pacemaker-accumulator models as described above. The results as a whole are promising, but much remains to be understood about the functional neuroanatomy of temporal information processing. The hemodynamic methods, thanks to their good spatial resolution, allow identification of brain regions involved in timing. But because they integrate activity over a period of at least one second, they cannot discriminate between the different stages in cognitive processing, as is possible
Figure 13.7 Estimation of activations due to length of duration
Note: The size of the effect for each is subject-specific.
with event-related potentials (ERPs), which reflect the rapidly changing electrical activity in the brain evoked by a stimulus or a cognitive event. The problem with EEG, however, is the poor spatial resolution of the method. Indeed, the most common ERP component associated with timing, the CNV, probably has several origins: multiple cortical as well as sub-cortical regions have been thought to participate in its generation. Therefore, to go further in the localization of cerebral areas activated when time is processed and to better characterize the dynamics of these areas, studies must now combine EEG and fMRI data. We initiated such a line of research a few years ago by combining PET and EEG data (for example, Pouthas et al., 2000). This study is reported in the next section.
Combining PET and EEG Data: Where and When Time is Processed
Different strategies have been chosen in experiments using PET and fMRI methods. One is to contrast perception tasks based either on the discrimination of a stimulus duration or of another parameter of the stimulus, such as pitch, intensity or colour (for example, Ferrandez et al., 2003; Maquet et al., 1996; Rao et al., 2001). Another strategy is to design experiments that allow for the identification of brain regions whose activity varies in line with controlled changes in the stimulus feature under investigation, such as the level of attention to duration or the length of the duration (for example, Coull et al., 2004; Jech et al., 2005; Pouthas et al., 2005). Moreover, event-related potential studies have provided very important data on the electrophysiological correlates of the various complex operations required by time estimation, because of the very good temporal resolution of the EEG method. These studies have mainly examined the amplitude and time course of the slow negative wave, the CNV. Both CNV parameters have been shown to vary depending on different factors: level of training, accuracy and precision of temporal performance (for example, Ladanyi and Dubrovsky, 1985; Macar et al., 1999; Pfeuty et al., 2003), as well as age (for example, Ferrandez and Pouthas, 2001), mental state or disease (for example, Ikeda et al., 1997; Praamstra et al., 1996). However, the picture of the variations is rather complex. The CNV recorded from the scalp is probably the summation of several cortical potentials that have different origins and different functions. In a PET study (Maquet et al., 1996), we contrasted two visual discrimination tasks, one based on the duration of a visual stimulus (illumination of a light emitting diode, LED) and the other on the luminance of the same stimulus.
Results showed that the same network was activated in both tasks: right prefrontal cortex, right inferior parietal lobule, anterior cingulate cortex, left fusiform gyrus and vermis. We also recorded ERPs from subjects performing duration and intensity discrimination tasks identical to those used in our PET study. Inspection of the waveforms suggests that there was no difference
between the early and middle latency components observed during the two discrimination tasks. A positive wave developed between 90 ms and 130 ms over parietal-occipital areas, and a negative component maximal on parietal electrodes peaked around 180 ms. Prominent differences emerged between the two tasks for the late latency ERP components. The intensity task elicited a large positive wave at posterior sites that peaked around 500 ms following stimulus onset and a negativity of small amplitude over frontal-central sites. The parietal-occipital positive component (late P3b) probably reflects the end of the stimulus evaluation and the decision process. In contrast, the duration task elicited a large negative wave, the CNV, distributed over frontal-central areas that peaked around 600 ms following stimulus onset (Figure 13.8a). In order to examine the CNV in more detail, we analyzed ERP waveforms at the frontal-central electrode site, FCZ, for the five test durations (Figure 13.8b). The waveforms indicate a co-variation between the duration of the LED illumination and the latency of the CNV zero-crossing (return to baseline) at FCZ. Differences in the latency of the zero-crossing would reflect differences in the timing of the cognitive processes associated with the participants’ temporal judgements and response selection. The strategy used to carry out the fusion of PET and ERP data was to derive a generator model accounting for the ERPs recorded in a large time window (100 to 1500 ms),
Figure 13.8 Waveforms
Source: Adapted from Pouthas et al. 2000.
Note: (a) Mean ERP waveforms recorded for the duration (thick line) and intensity (thin line) tasks; (b) CNV waveforms recorded on the FCZ electrode for the five stimulus durations.
corresponding to the time window during which participants perform the tasks. The generators of the cerebral activity were modelled with a “PET seeded model”: prefrontal cortex, cingulate cortex, inferior parietal lobule and fusiform gyrus. I will focus on the right frontal dipole. Figure 13.9a displays the time course of activity of this dipole in the two tasks for the standard stimulus, which had the same duration and the same intensity in both tasks. What changed were the instructions given to the participants, that is, discriminate duration or discriminate intensity. In the duration task, this dipole yielded its largest activity between 450 and 950 ms following stimulus onset, which corresponds to the CNV time window. The activity of the dipole ends when the CNV resolves. By contrast, in the intensity task the right frontal dipole shows a low level of activity during the CNV time range. Moreover, the time course of dipole activity for the five test durations paralleled the CNV waveforms on the frontal-central site (FCZ electrode) for these durations (Figure 13.9b). In sum, by demonstrating differences in the time course of activation between duration and intensity processing of a visual stimulus, we were able to determine which areas are activated when participants actually perform the duration task. Moreover, our results provide evidence that, in such a matching-to-sample task, the right frontal area has an essential role in making a decision about the stimulus duration. However, in order to assess the specificity of the spatio-temporal organization of cerebral areas involved when time is processed, more studies combining PET or fMRI and EEG data (for example, Nagaï et al., 2004) are needed.
Figure 13.9 Time course activity
Source: Adapted from Pouthas et al. 2000.
Note: Time course of activity of the right frontal dipole in the 100–1500 ms window following stimulus onset for the standard in the duration and the intensity tasks (a) and for the five test durations in the duration task (b).
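The zero-crossing (return to baseline) latency of the CNV discussed in this section can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name, the linear interpolation between samples, the `search_from_ms` parameter and the sample waveform are all hypothetical and are not taken from Pouthas et al. (2000).

```python
from typing import Optional, Sequence


def zero_crossing_latency(
    times_ms: Sequence[float],
    amplitudes_uv: Sequence[float],
    search_from_ms: float = 0.0,
) -> Optional[float]:
    """Latency (ms) at which a negative wave returns to baseline.

    Looks for the first pair of consecutive samples, starting at or after
    `search_from_ms`, where the amplitude goes from negative to zero or
    positive, and linearly interpolates the crossing time between them.
    Returns None when no such crossing exists.
    """
    for i in range(1, len(times_ms)):
        t0, t1 = times_ms[i - 1], times_ms[i]
        a0, a1 = amplitudes_uv[i - 1], amplitudes_uv[i]
        if t0 < search_from_ms:
            continue
        if a0 < 0.0 <= a1:
            # Linear interpolation between the two samples around zero.
            return t0 + (t1 - t0) * (0.0 - a0) / (a1 - a0)
    return None


# Hypothetical waveform (made-up values, not data from the study):
# a negativity peaking at 600 ms that returns to baseline near 950 ms.
times = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
amps = [0.0, 0.5, -1.0, -2.0, -3.0, -4.0, -5.0, -4.0, -2.0, -1.0, 1.0]

latency = zero_crossing_latency(times, amps, search_from_ms=300)
print(latency)  # 950.0
```

Applied to one averaged waveform per test duration, such a measure would give one latency per duration, which is the kind of co-variation with stimulus duration described for the FCZ electrode.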
References
Amabile, G., F. Fattapposta, G. Pozzessere, G. Albani, L. Sanarelli, P. A. Rizzo, and C. Morocutti. 1986. ‘Parkinson disease: electrophysiological (CNV) analysis related to pharmacological treatment’, Electroencephalography and Clinical Neurophysiology, 64 (6): 521–24.
Brown, S. W. 1997. ‘Attentional resources in timing: interference effects in concurrent temporal and nontemporal working memory tasks’, Perception and Psychophysics, 59 (7): 1118–40.
Buhusi, C. V. and W. H. Meck. 2005. ‘What makes us tick? Functional and neural mechanisms of interval timing’, Nature Reviews Neuroscience, 6 (10): 755–65.
Coull, J. T., F. Vidal, B. Nazarian, and F. Macar. 2004. ‘Functional anatomy of the attentional modulation of time estimation’, Science, 303 (5663): 1506–08.
Ferrandez, A. M., L. Hugueville, S. Lehéricy, J. B. Poline, C. Marsault, and V. Pouthas. 2003. ‘Basal ganglia and supplementary motor area subtend duration perception: an fMRI study’, Neuroimage, 19 (4): 1532–44.
Ferrandez, A. M. and V. Pouthas. 2001. ‘Does cerebral activity change in middle-aged adults in a visual discrimination task?’, Neurobiology of Aging, 22 (4): 645–57.
Gerschlager, W., F. Alesch, R. Cunnington, L. Deecke, G. Dirnberger, W. Endl, W. G. Lindinger, and W. Lang. 1999. ‘Bilateral subthalamic nucleus stimulation improves frontal cortex function in Parkinson’s disease: An electrophysiological study of the contingent negative variation’, Brain, 122 (Pt 12): 2365–73.
Gibbon, J., R. M. Church, and W. H. Meck. 1984. ‘Scalar Timing Theory’. In J. Gibbon and L. Allan (eds), Timing and Time Perception, pp. 52–77. New York: The New York Academy of Sciences.
Grondin, S. 2001. ‘From physical time to the first and second moments of psychological time’, Psychological Bulletin, 127 (1): 22–44.
Ikeda, A., H. Shibasaki, R. Kaji, K. Terada, T. Nagamine, M. Honda, and J. Kimura. 1997. ‘Dissociation between contingent negative variation (CNV) and Bereitschaftspotential (BP) in patients with parkinsonism’, Electroencephalography and Clinical Neurophysiology, 102 (2): 142–51.
Ivry, R. B. and R. M. Spencer. 2004. ‘The neural representation of time’, Current Opinion in Neurobiology, 14 (2): 225–32.
Jech, R., P. Dusek, J. Wackermann, and J. Vymazal. 2005. ‘Cumulative blood oxygenation-level-dependent signal changes support the “time accumulator” hypothesis’, Neuroreport, 16 (13): 1467–71.
Ladanyi, M. and B. Dubrovsky. 1985. ‘CNV and time estimation’, International Journal of Neuroscience, 26 (3–4): 253–57.
Lewis, P. and C. Miall. 2003. ‘Distinct systems for automatic and cognitively controlled time measurement: evidence from neuroimaging’, Current Opinion in Neurobiology, 13 (2): 250–55.
Macar, F., F. Vidal, and L. Casini. 1999. ‘The supplementary motor area in motor and sensory timing: evidence from slow brain potential changes’, Experimental Brain Research, 125 (3): 271–80.
Macar, F., S. Grondin, and L. Casini. 1994. ‘Controlled attention sharing influences time estimation’, Memory and Cognition, 22 (6): 673–86.
Maquet, P., H. Lejeune, V. Pouthas, M. Bonnet, L. Casini, F. Macar, M. Timsit-Berthier, F. Vidal, A. Ferrara, C. Degueldre, L. Quaglia, G. Delfiore, A. Luxen, R. Woods, J. C. Mazziotta, and D. Comar. 1996. ‘Brain activation induced by estimation of duration: a PET study’, Neuroimage, 3 (2): 119–26.
Mauk, M. D. and D. V. Buonomano. 2004. ‘The neural basis of temporal processing’, Annual Review of Neuroscience, 27: 307–40.
Nagaï, Y., H. D. Critchley, E. Featherstone, P. B. Fenwick, M. R. Trimble, and R. J. Dolan. 2004. ‘Brain activity relating to the contingent negative variation: an fMRI investigation’, Neuroimage, 21 (4): 1232–41.
Pfeuty, M., R. Ragot, and V. Pouthas. 2003. ‘When time is up: CNV time course differentiates the roles of the hemispheres in the discrimination of short tone durations’, Experimental Brain Research, 151 (3): 372–79.
Pouthas, V., L. Garnero, A. M. Ferrandez, and B. Renault. 2000. ‘ERPs and PET analysis of time perception: spatial and temporal brain mapping during visual discrimination tasks’, Human Brain Mapping, 10 (2): 49–60.
Pouthas, V., N. George, J. B. Poline, M. Pfeuty, P. F. Vandemoorteele, L. Hugueville, A. M. Ferrandez, S. Lehéricy, D. Lebihan, and B. Renault. 2005. ‘Neural network involved in time perception: an fMRI study comparing long and short interval estimation’, Human Brain Mapping, 25 (4): 433–41.
Praamstra, P., A. S. Meyer, A. R. Cools, M. W. Horstink, and D. F. Stegeman. 1996. ‘Movement preparation in Parkinson’s disease: Time course and distribution of movement-related potentials in a movement precueing task’, Brain, 119 (5): 1689–704.
Rao, S. M., A. R. Mayer, and D. L. Harrington. 2001. ‘The evolution of brain activation during temporal processing’, Nature Neuroscience, 4 (3): 317–23.
Walter, W. G., R. Cooper, V. J. Aldridge, W. C. McCallum, and A. L. Winter. 1964. ‘Contingent negative variation: an electric sign of sensory-motor association and expectancy in the human brain’, Nature, 25 (203): 380–84.
Section IV
Language, Cognition, and Development
240 Bhoomika R. Kar and Malini Shukla
Introduction
The field of cognitive development traces its roots back to Jean Piaget’s theory of cognitive development. Advances in the study of cognitive development have come from the convergence of developmental psychology and cognitive neuroscience (Johnson and Munakata, 2005). In recent years, cognitive development has been widely studied using behavioural methods and neuroimaging techniques. It is known that brain development and cognitive maturation occur concurrently during childhood and adolescence, but much less is known about the direct relationship between neural and cognitive development. Detailed accounts of embryonic brain development and of cognitive development from infancy to early childhood are available (Matsuzawa et al., 2001; Sowell et al., 2004). Only recently has developmental neuroimaging provided data on postnatal structural and functional maturation of the brain from childhood to adolescence (Paus, 2005).
Attention involves separable networks that compute different functions. The development of the neural networks underlying attentional control has been explained by Posner and his colleagues. Substantial development of executive attention occurs between three and seven years of age. Genetic factors, school and home environment, socialization and culture influence the ongoing development of attention (Posner and Rothbart, 2005). Attention training and its effects on the executive attention network are determined by gene and experience interactions.
The way children learn language follows a specific pattern and is inherently systemic in nature. Even though young children are not formally taught language, language acquisition is part of the overall physical, social and cognitive development of children. There is strong evidence that children may never acquire a language if they have not been exposed to a language before they reach the age of six or seven.
Children naturally attain communicative competence, intrinsically understand the rules of grammar, and gain knowledge of the rules of using language. After three decades of research on language and education, Tucker (1999) concluded that the language of school and the language of home are different in almost all countries around the world. He also found that we develop literacy skills easily in our first language. According to Tucker, the best predictor of cognitive-academic language development in a second language is proficiency in the first language. There are also individual differences in the way a second language is acquired, and these differences are related to one’s culture, one’s group, and one’s personality. While acquiring literacy in a particular language, children may show difficulties in learning to read, write, and spell. Dyslexia is a specific disability related to reading due to faulty information processing. A consensus has emerged that dyslexia is associated with a marked deficit in phonological processing. Since dyslexics have deficits in speech perception, they may have
less precise phonological representations. One needs to know whether the speech perception deficit occurs at the level of sensory perception involving pre-attentive auditory processing. Electrophysiological studies have supported the presence of auditory processing deficits in dyslexia. The attenuated Mismatch Negativity (MMN) for speech stimuli in dyslexics reveals deficits in pre-attentive and automatic information processing, which can be considered a cause of dyslexia (Schulte-Körne et al., 1998). Another ERP study on Indian languages reported that children with dyslexia have basic non-linguistic information processing deficits despite the good phoneme-to-grapheme correspondence in Indian languages (Shankarnarayan and Maruthy, 2007). Both chapters in this section focus on the interaction between cognitive development and literacy acquisition in normally developing children as well as those with a developmental disorder like dyslexia. The chapter by Posner and Kar highlights the interaction between education and the development of attention and of skills like literacy and numeracy. It points to the fact that the brain is prepared to some extent for school education, based on evidence of signs of the development of literacy skills as early as infancy. The processes that are responsible for school success may have common roots in the experiences of infancy. The infant also shows the ability to discriminate phonemes in various languages. The phonemic structure is shaped by the environmental input even before the infant begins to speak. Similarly, the ability to appreciate quantity is present in infancy. The authors also discuss the relationship between the early development of an executive attention brain network and the acquisition of literacy skills. The complex scenario of the multilingual context in India and its implications for the development of literacy skills are also discussed.
The authors raise issues about the acquisition of literacy in more than one language and the demands this imposes on the executive attention system, particularly in a multilingual context. The chapter by Kar and Shukla focuses on the ongoing debate on the presence of auditory processing deficits in dyslexia. They argue for the presence of auditory processing deficits along with deficient phonological awareness in dyslexia. Focusing on Paula Tallal’s temporal order judgement paradigm of rapid temporal processing, the authors argue for a relationship between auditory temporal processing, phonological awareness and reading. Dyslexics have difficulty in discriminating sounds that contain rapid transitions (Tallal, 1984). A fundamental temporal processing problem leads to speech processing impairments, which in turn have a deleterious impact on reading development. Auditory processing of rapid acoustic transitions is lateralized to the left hemisphere. Neuroimaging studies have revealed that left temporo-parietal and inferior frontal regions are involved in phonological processing, and hypo-activation of posterior regions and hyper-activation of Broca’s area have been reported (Rumsey et al., 1992). Electrophysiological data have shown reduced MMN, P300 and N400 components related to phonological processing deficits in dyslexia. The authors also discuss how auditory temporal processing deficits in dyslexia relate to poor phonological awareness in
the context of remediation programmes focusing on phonological skills and temporal processing. They emphasize the need to understand the mechanisms that determine the positive effects of remediation programmes like the PASS Reading Enhancement Programme (PREP) and the Fast ForWord training programme. The authors describe these remediation programmes in the context of auditory temporal processing. Recent neuroimaging evidence on the effects of remediation in dyslexia has shown neural plasticity effects in terms of changes in brain activation before and after remediation, particularly in the left posterior temporal-parietal regions. The authors highlight the importance of electrophysiological measurement in looking at the temporally mediated changes in brain activity with respect to phonological processing. The temporal dynamics underlying auditory temporal processing and phonological deficits can be better understood through ERP studies. However, the authors also raise certain methodological concerns related to the electrophysiological measurement of auditory temporal processing using the temporal order judgement paradigm. Finally, the authors discuss the possibilities for investigating near and far transfer from remediation programmes based on phonological and reading skills training versus programmes based on training in processing sounds and sound discrimination.
References
Johnson, M. H. and Y. Munakata. 2005. “Cognitive development: at the crossroads?”, Trends in Cognitive Sciences, 9, 3: 91.
Matsuzawa, J., M. Matsui, T. Konishi, K. Noguchi, R. C. Gur, W. Bilker. 2001. “Age related volumetric changes of brain gray and white matter in healthy infants and children”, Cerebral Cortex, 11, 4: 335–42.
Paus, T. 2005. “Mapping brain maturation and cognitive development during adolescence”, Trends in Cognitive Sciences, 9, 2: 60–68.
Posner, M. I. and M. K. Rothbart. 2005. “Influencing brain networks: Implications for education”, Trends in Cognitive Sciences, 9, 3: 99–103.
Posner, M. I. and M. K. Rothbart. 2007. Educating the Human Brain. Washington, DC: APA Books.
Rumsey, J. M., P. Andreason, and A. J. Zametkin. 1992. Archives of Neurology, 49, 5: 527–34.
Schulte-Körne, G., W. Deimel, J. Bartling, and H. Remschmidt. 1998. “Auditory processing and dyslexia: evidence for a specific speech processing deficit”, NeuroReport, 9, 2: 337–40.
Shankarnarayan, V. C. and S. Maruthy. 2007. “Mismatch negativity in children with dyslexia speaking Indian languages”, Behavioural and Brain Functions, 3: 36.
Sowell, E. R., P. M. Thompson, C. M. Leonard, S. E. Welcome, E. Kan, and A. W. Toga. 2004. “Longitudinal mapping of cortical thickness and brain growth in normal children”, Journal of Neuroscience, 24, 38: 8223–231.
Tallal, P. 1984. “Temporal or phonetic processing deficits in dyslexia? That is the question”, Applied Psycholinguistics, 5, 2: 167–69.
Tucker, G. R. 1999. “Georgetown University round table on languages and linguistics”, Georgetown University Press, 333–40.
Chapter 14
Effects of Remediation on Auditory Temporal Processing in Dyslexia: An Overview
Bhoomika R. Kar and Malini Shukla
Dyslexia involves a functional coordination deficit with respect to complex cognitive processes such as visual and semantic decoding, temporal processing, phonological processing, and orthographic, syntactic, and contextual analysis (Lachmann, 2002). Dyslexia is thought to be a failure in learning to optimize the coordination of the sub-processes involved in reading, with the consequence of errors in integrating reading-related information represented in working memory. The causal connection between phonological skills and reading acquisition is well established (Richardson et al., 2004). The relationship between deficits in auditory processing and phonological representations has also been proposed and debated. The extent to which auditory processing deficits are important in the genesis of language disorders like dyslexia and specific language impairment (SLI) has been a much debated issue in the dyslexia literature (Rosen, 2003). Low-level auditory processes might affect the development of phonological processing in children. The quality of phonological representations is important for literacy acquisition, and this relationship has been observed across many languages for normal readers as well as children with dyslexia.
Auditory Temporal Processing in Dyslexia
Auditory temporal processing refers to the processing of temporal properties of the acoustic signal (Nittrouer, 1999). A fundamental temporal processing problem leads to speech processing impairments, which in turn have a deleterious impact on reading development. A deficit in temporal processing speed across modalities might be the cause of difficulties in dyslexia. Dyslexic children have language problems that result from their
inability to perceive the rapid acoustic elements included in human speech. Phonetic processing deficits might result from deficiencies of the mechanisms crucial for processing rapidly changing acoustic signals (Tallal, 1984). A large body of evidence indicates that children and adults with language problems have difficulties in phonological processing. Phonological processing refers to awareness and manipulation of the phonological structure of language. Children with reading difficulties have problems in segmenting syllables into phonemes, whereas good readers do not. It has also been proposed that phonetic distance, and not difficulty with temporal order, may be reflected in dyslexics’ poor performance on temporal order judgement tasks (Mody et al., 1997). Children with reading difficulties have problems in recalling the order of a series of non-speech sounds if they are presented rapidly and also if the duration of the sounds is about 100 ms or shorter (Tallal, 1980). These findings provided the initial evidence that children with reading difficulties have problems in identifying or sequencing short-duration stimuli presented in rapid succession (Merzenich et al., 1996). The relationship between temporal processing and phonological processing ability has led to the hypothesis that temporal processing deficits have their effect on phonological processing at the level of speech perception. Phonological processing deficits are one of the fundamental deficits in reading-impaired children. Children with reading disability make more errors in recalling the correct order of rapidly presented sounds. Dyslexics perform worse on tasks that need rapid sequential processing, and this impairment could involve both the visual and auditory modalities and might cause dyslexia (Eden et al., 1995).
In a study on 19 dyslexics, Mismatch Negativity (MMN) was determined for tones and speech stimuli. The results showed no differences between dyslexics and non-dyslexics for tones but a significantly attenuated MMN for speech stimuli, thereby confirming a specific speech processing deficit at the sensory level (Schulte-Körne et al., 1998). MMN could be taken as a tool for early identification of dyslexia, but studies are required to show whether MMN could be a predictor of later reading disability.
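As an illustration of how an MMN amplitude is commonly quantified, the sketch below subtracts the averaged response to standard stimuli from the averaged response to deviant stimuli and takes the mean of the difference wave within a time window. This is a minimal sketch under stated assumptions: the function names, the 100 to 250 ms window and the waveform values are hypothetical and do not reproduce the analysis of the study cited above.

```python
from typing import List, Sequence


def difference_wave(
    standard_uv: Sequence[float], deviant_uv: Sequence[float]
) -> List[float]:
    """Deviant-minus-standard difference wave, sample by sample."""
    if len(standard_uv) != len(deviant_uv):
        raise ValueError("standard and deviant ERPs must have equal length")
    return [d - s for s, d in zip(standard_uv, deviant_uv)]


def mean_amplitude(
    times_ms: Sequence[float],
    wave_uv: Sequence[float],
    start_ms: float,
    end_ms: float,
) -> float:
    """Mean amplitude of a wave within [start_ms, end_ms] (inclusive)."""
    values = [v for t, v in zip(times_ms, wave_uv) if start_ms <= t <= end_ms]
    if not values:
        raise ValueError("no samples fall inside the requested window")
    return sum(values) / len(values)


# Hypothetical averaged ERPs in microvolts (made-up values for illustration).
times = [0, 50, 100, 150, 200, 250, 300]
standard = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
deviant = [0.0, 1.0, 0.5, -1.0, -1.5, -0.5, 0.0]

diff = difference_wave(standard, deviant)
# The 100-250 ms window is an assumption chosen for this example only.
mmn = mean_amplitude(times, diff, start_ms=100, end_ms=250)
print(mmn)  # -1.375
```

A more negative value indicates a larger MMN; an attenuated MMN, as reported for speech stimuli in dyslexics, would correspond to a value closer to zero.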
Temporal Order Judgement Paradigm
Tallal and her colleagues proposed the auditory temporal processing deficit hypothesis in the 1980s. Tallal introduced the temporal order judgement (TOJ) task and, later, the auditory temporal-order threshold (Tallal, 1980). Rapid temporal processing abilities have been examined with speech and non-speech stimuli. Difficulty with TOJ for briefly and rapidly presented stimuli with short and long inter-stimulus intervals (ISIs) has been well documented (Wittmann and Fink, 2004). It has been found that children with language impairments need longer ISIs between tones or phonemes to identify them as separate stimuli. Whether TOJ performance is related to phoneme discrimination is still controversial (Rey et al., 2002; Berthenton and Holmes, 2003). However, the popular hypothesis in neuropsychology treats a central auditory temporal deficit as the underlying deficit in dyslexia.
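The idea of a temporal-order threshold can be made concrete with a small sketch that estimates the ISI at which a listener's order judgements reach a criterion level of accuracy. The 75 per cent criterion, the linear interpolation between tested ISIs, the function name and the data are all assumptions made for illustration; they are not the procedure used in the studies cited above.

```python
from typing import Optional, Sequence


def toj_threshold(
    isis_ms: Sequence[float],
    prop_correct: Sequence[float],
    criterion: float = 0.75,
) -> Optional[float]:
    """Shortest ISI (ms) at which accuracy reaches `criterion`.

    Accuracy is linearly interpolated between neighbouring tested ISIs.
    Returns None if the criterion is never reached.
    """
    pairs = sorted(zip(isis_ms, prop_correct))
    if not pairs:
        return None
    if pairs[0][1] >= criterion:
        return pairs[0][0]
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 < criterion <= p1:
            return x0 + (x1 - x0) * (criterion - p0) / (p1 - p0)
    return None


# Hypothetical proportions of correct order judgements at each ISI
# (made-up values, not data from any cited study).
isis = [10, 50, 100, 200, 400]
accuracy = [0.5, 0.5, 0.625, 0.875, 1.0]

threshold = toj_threshold(isis, accuracy)
print(threshold)  # 150.0
```

On this kind of measure, a listener who needs longer ISIs to judge order correctly would show a higher threshold.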
Neural Basis of Phonological and Temporal Processing
Neuroimaging studies have revealed that left temporo-parietal and inferior frontal regions are involved in phonological processing, and hypo-activation of posterior regions and hyper-activation of Broca’s area have been reported (Habib, 2000). The inter-hemispheric transfer hypothesis suggests that, due to the lack of asymmetry in the planum temporale, there are excessive cortical connections between the two hemispheres, which create a state of constant conflict for speech/non-speech processing. Event-related potentials have been used with tones, phonemes, words and comprehension tasks to study reading disorders. Electrophysiological data have shown reduced MMN, P300 and N400 components related to phonological processing deficits in dyslexia. Decreased accuracy in auditory processing, as measured by tone and phoneme discrimination tasks, has been found to correlate with poor phonological skills (Kujala and Näätänen, 2001). Auditory processing of rapid acoustic transitions is lateralized to the left hemisphere. Most studies on the electrophysiological correlates of phonological deficits in dyslexia have been done on alphabetic scripts like English and other western languages. No studies have been done on semi-syllabic scripts like Hindi, particularly in cases where Hindi is the first language, English is the second language, and a child with reading disability has difficulties with both languages. Recently, one study examined the MMN component using tones and phonemes varying in duration and spectral properties in children with dyslexia speaking Indian languages (Kannada) (Shankarnarayan and Maruthy, 2007). It reported abnormalities in the processing of speech as well as non-speech stimuli, with greater abnormalities for speech stimuli. In spite of the good phoneme-to-grapheme correspondence in Indian languages, dyslexics showed central auditory processing deficits.
MMN studies have reported smaller MMN in dyslexic children for speech and non-speech sounds using consonant, vowel and tone changes, with the most pronounced differences in the case of consonants (Csepe, 2003). Dyslexics have difficulty in discriminating sounds that contain rapid transitions. In a study on auditory and visual processing in children with language impairment and dyslexia, the N140 component also showed lower amplitude and longer latency on a visual recognition task (Neville et al., 1993). In another study looking at the N200 and P300 components during alphabetic, non-alphabetic and lexical decision tasks, dyslexics showed longer latencies and lower amplitudes for P300 (Taylor and Keenan, 1990). Lower P300 amplitude indicates a greater demand on working memory and attentional resources. Based on ERP waveforms for the N100, P200, N200 and P300 components, Breznitz (2002) reported that dyslexics are slower on most linguistic or nonlinguistic auditory and visual low-level tasks and on higher-level orthographic and phonological tasks. In a recent study using tones, phonemes, pictures and words to compare low, average and expert readers, Given et al. (2006) found no significant difference across the three groups with respect to tones. However, differences in terms of lower MMN were observed for phonemes, words heard and words seen. It has also been suggested that
only some of the components reflecting early stage auditory processing are affected in dyslexia (Taylor et al., 2003).
Remediation in Dyslexia
Remediation in dyslexia aims to induce normalizing and compensatory effects in brain function, language processing and reading skills. It is mostly based on the use of intact areas of higher cortical functioning in the development of remedial strategies, while minimizing the emphasis placed upon dysfunctional cortical areas. A remedial programme that helps dyslexic children read better also improves activity in the part of the brain linked to the learning disorder, bringing the activity of that region closer to that of normal children. Remediation is the method of improving the process or processes in which the child has deficits. Many children with reading disability are deficient in successive processing and lag behind in word decoding, working memory, attentional skill and visuo-spatial processing.
PASS Reading Enhancement Programme (PREP)
The PASS Reading Enhancement Programme (PREP) aims at improving the information processing strategies, namely, simultaneous and successive processing, that underlie reading. Cognitive remediation of the decoding deficit was attempted by following a theoretically based programme, PREP (Das et al., 1994), which is based on well-accepted theories of child development and cognitive psychology. It also involves training in planning and promotes selective attention, while avoiding the direct teaching of word reading skills. PREP is also founded on the premise that the transfer of principles can be facilitated through inductive rather than deductive inferences (Carlson and Das, 1997). An integral part of the structure of each task is to develop strategies such as rehearsal, categorization, monitoring of performance, prediction, revision of prediction, sounding and sound blending. Thus children develop their ability to use these strategies through experience with the task. The programme consists of ten tasks that vary considerably in content and in what they require of the student. Each task involves both a ‘global’ process training form and a content-related ‘bridging’ form. The global component includes structured, non-reading tasks that require the application of simultaneous or successive strategies. The bridging component involves the same cognitive demands as the global component, but these are closely linked to reading and spelling. The global tasks begin with content that is familiar and non-threatening so that strategy acquisition occurs in stages. Complexity is introduced gradually and only after a return to easier content. Through verbal mediation (that occurs
through specific discussions of strategies used), the global and bridging components of PREP encourage children to apply their strategies to academic tasks such as word decoding. The global and bridging components are further divided into three levels of difficulty. A criterion of 80 per cent correct responses is required before a child can proceed to the next level of difficulty. If this criterion is not met, an alternate set of tasks, at the same difficulty level, is used to provide the additional training required. The programme is typically given for 15 to 20 hours, once or twice a week, and spread over 12 weeks or more. Improvement within the session can be objectively scored. Records are maintained for every session. Each child also takes a pre-test for reading, spelling or comprehension that needs to be remediated along with a post-test. Selected PASS tests may be administered in the package of pre and post-test. The behavioural effects of PREP in dyslexic children are evident in the speed of reading and phonological processing. It provides alternatives for children who cannot use the process of simultaneous and successive processing in reading very well. Theoretically, successive and simultaneous processing are both important for word reading. Dual-route theories of word recognition, for example, suggest that a word is recognized either through direct visual access, or through phonological coding of its sounds. The first should relate to mainly simultaneous processing via orthographic processing, and the second primarily to successive processing via phonological processing. Thus, the two processes should show correlations with word reading. Early studies using experimental versions of PREP produced positive results in both cognitive processing tasks and reading performance. The notion that poor reading may coexist with cognitive processing deficits that go beyond phonological processing became apparent with the first intervention study. Brailsford et al. 
(1984) provided 15 hours of remedial training in simultaneous and successive processes to a group of learning-disabled children, aged nine to 12 years, enrolled in reading resource room programmes. The matched control group received the same amount of remedial reading instruction. The results showed a significant group by time interaction in one simultaneous task, in all successive processing tasks, and in the Standard Reading Inventory scores (McCracken, 1966), all in favour of the remediation group. Similar gains were not evident in the Gates–MacGinitie reading comprehension subtest. This result could be interpreted to suggest that the simultaneous and successive processing strategies emphasized in training generalized to a reading task requiring active organizational strategies but not to the more structured multiple-choice format reading task. The training tasks used in this early study could be best described as global in terms of their relation to reading, that is, they did not include training in reading-related proximal processes. Therefore, the positive results are even more surprising and offer strong support for cognitive remediation. Later studies have replicated the positive results with a combination of global and bridging tasks, whereas training in either component alone has not necessarily been successful. Das et al. (1995) used PREP with a group of Grade three and four students with reading disabilities who exhibited delays of at least 12 months on
either the Word Identification or Word Attack subtest of the WRMT-R (Woodcock Reading Mastery Test-Revised). Participants were first divided into two groups, a PREP remediation group and a no-intervention control group. The PREP group received 15 sessions of training, in groups of two students, over a period of two and a half months. Children in the control group participated in regular classroom activities. After the intervention, both groups were tested again using the Word Identification and Word Attack subtests. The results indicated that while both groups gained during the intervention period, the PREP group gained significantly more on both Word Identification and Word Attack, as evidenced by a significant Group × Time interaction. In the second part of this study, children from the control group received either the global or the bridging component of PREP for the same length of time. Neither of these groups benefited from the programme to the same extent as the original PREP group that received both components. Similarly, Molina et al. (1997) found that when a Spanish version of the full PREP (20 hours of remediation) was given to a group of nine- and 10-year-old Spanish children with reading difficulties, they did significantly better than a matched control group in reading, planning, simultaneous processing, and successive processing tasks. Similar gains were not evident in the group that received only the bridging component of PREP (10 hours of training). The lack of positive results in this case, however, could also have resulted from the shorter intervention that the bridging group received. Finally, the effectiveness of a modified PREP (for an older group of children) was studied by Boden and Kirby (1995). A group of fifth- and sixth-grade students, who had been identified a year earlier as poor readers, were randomly assigned to either a control or an experimental group. 
The control group received regular classroom instruction and the experimental group received PREP, in groups of four students, for approximately 14 hours. As in previous studies, the results showed differences between the control and PREP groups on the Word Identification and Word Attack tests after treatment. In relation to the previous year’s reading scores, the PREP group performed significantly better than the control group. Taken together, these studies provide sufficient evidence for the effectiveness of PREP for remediation of deficient reading skills during the elementary school years. Moreover, most of the studies included participants experiencing rather severe difficulties in learning to read and thus qualifying as ‘reading disabled’ rather than just poor readers.
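The progression rule that PREP applies within each component (three levels of difficulty, an 80 per cent criterion, and an alternate task set at the same level when the criterion is missed) can be illustrated with a minimal sketch. The function name, the task-set labels and the reading of "80 per cent" as "at least 80 per cent" are assumptions made for illustration; they are not part of the published programme, and what follows a failed alternate set is not specified in the description above.

```python
CRITERION_PERCENT = 80  # from the text: 80 per cent correct responses
N_LEVELS = 3            # from the text: three levels of difficulty


def next_step(level, n_correct, n_items):
    """Decide what follows one completed task set (hypothetical helper).

    Returns (level, task_set) for the next set, or None when the highest
    level has been passed. Treating the criterion as "at least 80 per
    cent" is an assumption of this sketch.
    """
    if not 1 <= level <= N_LEVELS:
        raise ValueError("level must be between 1 and N_LEVELS")
    if n_items <= 0 or not 0 <= n_correct <= n_items:
        raise ValueError("need 0 <= n_correct <= n_items and n_items > 0")

    # Integer arithmetic avoids floating-point rounding at exactly 80%.
    passed = n_correct * 100 >= CRITERION_PERCENT * n_items
    if passed:
        # Move up one level, or finish if this was the highest level.
        return (level + 1, "standard") if level < N_LEVELS else None
    # Criterion not met: alternate set of tasks at the same difficulty level.
    return (level, "alternate")
```

In this sketch a child who scores 7 out of 10 at level 1 stays at level 1 on the alternate set, whereas 8 out of 10 moves the child to level 2; the same rule is simply re-applied after every set.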
Fast ForWord Training Programme
Another programme used effectively for remediation in speech-language impairments and dyslexia is the Fast ForWord training programme (Tallal et al., 1996). The Fast ForWord training programme has shown improvements in speech/language skills, such as auditory memory, phonemic awareness and analysis, in just six to eight weeks on the programme
for 100 minutes a day for five days a week. Based on two decades of research on how the brain learns, Fast ForWord also makes use of multimedia and Internet software technology to address the underlying causes of these language-learning impairments. Fast ForWord Language is appropriate for children aged five to twelve, while Fast ForWord Middle and High School is for adolescents and adults. Fast ForWord Language to Reading is the subsequent programme offered to certain participants after mastering Fast ForWord Language or Middle and High School. Fast ForWord Reading is a curriculum-based reading programme that is appropriate for children with at least a third-grade reading level. Participants usually display improvement in auditory processing speed, working memory, serial order processing, phonological awareness, listening comprehension, syntax, and morphology. In addition to the measurable gains in language, it is often reported that children who have completed Fast ForWord are better able to interact with parents, teachers, and peers. New language skills often empower children to participate in the world in ways that were not possible before Fast ForWord training. Participants also display improvement in overall communication skills, as well as listening, thinking, and reading skills. In one of the first studies, Temple et al. (2003) reported that the language module of the Fast ForWord training programme improved phonological awareness and reading skills in dyslexic children after training on the processing of rapidly changing sounds. These behavioural effects were associated with differences in brain activity before and after remediation. 
Apart from dyslexia, the programme has also been studied in children with sub-average reading skills and poor academic performance. In a study of intensive training based on the Fast ForWord programme, positive effects were found on auditory temporal discrimination, but the results did not generalize to reading skills as tested through non-word reading tasks and phonological awareness tasks (Agnew et al., 2004).
Neural Mechanisms
Studies on remediation in dyslexia have mostly looked at the behavioural effects on cognitive processes such as phonological awareness and comprehension. Neuroimaging studies have recently reported neural plasticity effects in terms of changes in brain activation before and after remediation. One such study, by Temple et al. (2003), reported increased activation in left posterior temporo-parietal regions after remediation. Since reading relies on temporally mediated cognitive processes, it would be more informative to look at ERP-related changes after remediation. An electrophysiological measurement of the changes in brain activity as an effect of remediation, using EEG/ERP, would provide an understanding of the temporally mediated changes in brain activity with respect to phonological processing. Dyslexic readers have shown under-activation in posterior regions
(Wernicke’s area, the angular gyrus, and striate cortex) and over-activation in the inferior frontal gyrus (Shaywitz et al., 1998). The left temporo-parietal cortex shows a relationship between increased activity after remediation and improvement in oral language ability and word blending (Eden and Moats, 2002). Remediation in dyslexia has shown behavioural effects in terms of improvement in phonological awareness, speed of reading and reading comprehension. Remediation also results in certain compensatory effects based on the principles of brain plasticity. Remediation improves reading because of the reorganization of the functional circuits in the brain that mediate the reading process. This mechanism is supported by evidence of changes in brain activity after remediation in dyslexics obtained with functional magnetic resonance imaging (fMRI), with studies reporting increased activation in left posterior regions correlated with an improvement in reading skills. fMRI studies inform about the changes in regional activation in the brain following remediation (Temple et al., 2001). Electrophysiological data can provide temporal information about on-line brain function by computing event-related potentials during a phonological, rhyming or orthographic task. Temporal processing contributes to reading, particularly with reference to the processes of sensory integration and phonological decoding. An electrophysiological measurement of the changes in brain activity, using EEG/ERP, would provide a better understanding of the temporally mediated changes in brain activity with respect to the phonological and temporal processing involved in reading.
Methodological Concerns: ERP Paradigm on Rapid Temporal Auditory Processing
There are certain methodological difficulties with an ERP paradigm in which stimuli are presented in rapid succession. Overlapping ERPs from previous and subsequent stimuli distort averaged waveforms in subtle ways, which happens when inter-stimulus intervals (ISIs) between stimulus events are as short as 100 to 150 ms. Overlap arises when the response to the previous stimulus has not ended before the baseline period prior to the current stimulus, or when the subsequent stimulus is presented before the ERP response to the present stimulus ends. The problem is acute when stimuli are presented rapidly. ERP waveforms can last for several seconds, and overlap can distort the data and lead to misinterpretation. This is particularly important to consider in pre- and post-remediation studies, where pre and post ERP waveforms have to be compared on a temporal order judgement paradigm. Overlap is especially problematic when it differs between experimental conditions. Some ways to minimize overlap effects have been suggested by Luck (2005). One of them is to average ERPs across stimuli altogether; another is to introduce no-stimulus trials or to use longer ISIs. High-pass filters could also be used, but they can distort the waveforms in
other ways. Still another approach is to estimate the overlap and subtract the estimated overlap from the averaged waveforms. Current theories of dyslexia emphasize difficulties in phonological processing or in processing rapidly presented temporal stimuli such as speech. ERPs could be a powerful means of studying temporal processing in dyslexia and of examining the levels of processing in cognitive and perceptual tasks. Mismatch negativity (MMN) is one component that can reflect functional changes in auditory processing and has been applied to the assessment of training effects. Most studies on electrophysiological correlates in dyslexia have taken MMN as a suitable paradigm for auditory processing. An increase in MMN amplitude after training reflects greater ease in detecting differences (Kujala and Näätänen, 2001). ERP-related changes in brain activity as an effect of remediation in dyslexia have been of recent interest. Most of the remediation programmes in dyslexia are based on training in phonological awareness, word recognition, and reading skills. Hence, it would be interesting to compare near and far transfer to reading skills from remediation programmes based on phonological and reading skills training, like PREP, with transfer from programmes based on training in processing sounds and sound discrimination, like the Fast ForWord training programme.
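The overlap problem described in this section can be made concrete with a small simulation. The sketch below is illustrative only: the response shape, the 10 ms sampling step, the 400 ms response duration and the function names are assumptions chosen for brevity, not parameters from any study cited here (real ERP responses can last much longer). It builds a continuous signal by summing identical responses to a train of stimuli, averages the epochs time-locked to each stimulus, and measures how far the average departs from the true single-stimulus response.

```python
import math


def erp_kernel(n_samples=40):
    """Hypothetical single-stimulus response, 10 ms per sample (400 ms)."""
    return [math.sin(2 * math.pi * i / n_samples) * math.exp(-i / 15)
            for i in range(n_samples)]


def averaged_erp(kernel, isi, n_stimuli=50):
    """Average of stimulus-locked epochs for a fixed ISI (in samples)."""
    onsets = [k * isi for k in range(n_stimuli)]
    eeg = [0.0] * (onsets[-1] + len(kernel))
    # Continuous recording: responses to successive stimuli simply add up.
    for onset in onsets:
        for i, value in enumerate(kernel):
            eeg[onset + i] += value
    # Conventional averaging of epochs time-locked to each stimulus onset.
    avg = [0.0] * len(kernel)
    for onset in onsets:
        for i in range(len(kernel)):
            avg[i] += eeg[onset + i] / n_stimuli
    return avg


def overlap_distortion(kernel, isi):
    """Largest absolute difference between the average and the true response."""
    avg = averaged_erp(kernel, isi)
    return max(abs(a - k) for a, k in zip(avg, kernel))


if __name__ == "__main__":
    kernel = erp_kernel()
    print("ISI 120 ms:", overlap_distortion(kernel, 12))  # responses overlap
    print("ISI 400 ms:", overlap_distortion(kernel, 40))  # no overlap
```

With an ISI of 120 ms, inside the 100 to 150 ms range mentioned above, the averaged waveform departs substantially from the true response, whereas an ISI at least as long as the response leaves the average essentially undistorted. This corresponds to the longer-ISI remedy; estimating and subtracting the overlap is a separate technique that the sketch does not attempt.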
Conclusion
A popular hypothesis in neuropsychology has emphasized a central auditory temporal processing deficit as the underlying deficit in dyslexia. Auditory processing deficits in terms of detection of temporal order may be associated with, but not causally related to, dyslexia. There is a need to look at the deficits in auditory temporal processing across subgroups of dyslexics. Mechanisms or sub-processes mediating the deficits in auditory processing related to phonological processing deficits should also be explored.
References
Agnew, J. A., C. Dorn, and G. F. Eden. 2004. ‘Effect of intensive training on auditory processing and reading skills’, Brain and Language, 88: 121–25.
Boden, C. and J. R. Kirby. 1995. ‘Successive processing, phonological coding, and the remediation of reading’, Journal of Cognitive Education, 4 (2&3): 19–32.
Brailsford, A., F. Snart, and J. P. Das. 1984. ‘Strategy training and reading comprehension’, Journal of Learning Disabilities, 17 (5): 287–90.
Bretherton, L. and V. M. Holmes. 2003. ‘The relationship between auditory temporal processing, phonemic awareness, and reading disability’, Journal of Experimental Psychology, 84 (3): 218–43.
Breznitz, Z. 2002. ‘Asynchrony of visual–orthographic and auditory–phonological word recognition processes: an underlying factor in dyslexia’, Reading and Writing: An Interdisciplinary Journal, 15: 15–42.
Carlson, J. S. and J. P. Das. 1997. ‘A process approach to remediation of word-decoding deficiencies in Chapter 1 Children’, Learning Disability Quarterly, 20 (1): 93–125.
Csepe, V. 2003. ‘Auditory event-related potentials in studying developmental dyslexia’, in V. Csepe (ed.), Dyslexia: Different Brain, Different Behavior, pp. 81–112. New York: Kluwer Academic/Plenum Publishers.
Das, J. P., J. A. Naglieri, and J. R. Kirby. 1994. Assessment of Cognitive Processes. Needham Heights, MA: Allyn and Bacon.
Das, J. P., R. K. Mishra, and J. E. Pool. 1995. ‘An experiment on cognitive remediation of word-reading difficulty’, Journal of Learning Disabilities, 28: 66–79.
Eden, G. F., J. F. Stein, H. M. Wood, and F. B. Wood. 1995. ‘Temporal and spatial processing in reading disabled and normal children’, Cortex, 31: 451–68.
Eden, G. F. and L. Moats. 2002. ‘The role of neuroscience in the remediation of students with dyslexia’, Nature Neuroscience, 5 (Suppl): 1080–84.
Given, B. K., S. Chari, J. Ennis, L. Kirby, O. Firoozi, E. Liptak, G. Isnardi, and C. Balakrishna. 2006. ‘Differences in low average and expert readers as measured by EEG/ERPs: preliminary findings and challenges of an in-school psychophysiological research project’. Research report submitted to NIIDR, USA.
Habib, M. 2000. ‘The neurological basis of developmental dyslexia: An overview and working hypothesis’, Brain, 123: 2373–99.
Khan, S. C., V. Frisk, and M. J. Taylor. 1999. ‘Neurophysiological measures of reading difficulty in very low birth weight children’, Psychophysiology, 36: 1–10.
Kujala, T. and R. Näätänen. 2001. ‘The mismatch negativity in evaluating central auditory dysfunction in dyslexia’, Neuroscience and Biobehavioral Reviews, 25 (6): 535–43.
Lachman, T. 2002. ‘Reading disability as deficit in functional coordination’, in E. Witruk, A. D. Friederici, and T. Lachman (eds), Basic Functions of Language, Reading, and Reading Disability, pp. 165–98. Netherlands: Kluwer Academic Publishers.
Luck, S. J. 2005. An Introduction to the Event-Related Potential Technique. London: The MIT Press.
McCracken, R. A. 1966. The Standard Reading Inventory Manual. Klamath Falls, OR: Klamath Printing Company.
Merzenich, M. M., J. W. Jenkins, P. Johnston, C. Schreiner, S. L. Miller, and P. Tallal. 1996. ‘Temporal processing deficits in language learning impaired children ameliorated by training’, Science, 271 (5245): 77–81.
Mody, M., M. Studdert-Kennedy, and S. Brady. 1997. ‘Speech perception deficits in poor readers: auditory processing or phonological coding?’, Journal of Experimental Child Psychology, 64 (2): 199–231.
Molina, S., M. A. Garrido, and J. P. Das. 1997. ‘Process-based enhancement of reading: an empirical study’, Developmental Disabilities Bulletin, 25 (1): 68–76.
Neville, H. J., S. A. Coffey, P. J. Holcomb, and P. Tallal. 1993. ‘The neurobiology of sensory and language processing in language-impaired children’, Journal of Cognitive Neuroscience, 5 (2): 235–53.
Nittrouer, S. 1999. ‘Do temporal processing deficits cause phonological processing problems?’, Journal of Speech, Language and Hearing Research, 42 (4): 925–42.
Rey, V., S. De Martino, R. Espesser, and M. Habib. 2002. ‘Temporal processing and phonological impairment in dyslexia: effect of phoneme lengthening on the order judgement of two consonants’, Brain and Language, 80 (3): 576–91.
Richardson, U., J. M. Thompson, S. K. Scott, and U. Goswami. 2004. ‘Auditory processing skills and phonological representation in dyslexic children’, Dyslexia, 10 (3): 215–33.
Rosen, S. 2003. ‘Auditory processing in dyslexia and specific language impairment: is there a deficit? What is its nature? Does it explain anything?’, Journal of Phonetics, 31 (3–4): 509–27.
Schulte-Korne, G., W. Deimel, J. Bartling, and H. Remschmidt. 1998. ‘Auditory processing and dyslexia: Evidence for specific speech processing deficit’, Neuroreport, 9 (2): 337–40.
Shankarnarayan, V. C. and S. Maruthy. 2007. ‘Mismatch negativity in children with dyslexia speaking Indian languages’, Behavioral and Brain Functions, 3: 1–11.
Shaywitz, S. E., B. A. Shaywitz, K. R. Pugh, R. K. Fulbright, R. T. Constable, W. E. Mencl, D. P. Shankweiler, A. M. Liberman, P. Skudlarski, J. M. Fletcher, L. Katz, K. E. Marchione, C. Lacadie, C. Gatenby, and J. C. Gore. 1998. ‘Functional disruption in the organization of the brain for reading in dyslexia’, Proceedings of the National Academy of Sciences USA, 95 (5): 2636–41.
Tallal, P. 1980. ‘Auditory temporal perception, phonics and reading disabilities in children’, Brain and Language, 9 (2): 182–98.
Tallal, P. 1984. ‘Temporal or phonetic processing deficits in dyslexia? That is the question’, Applied Psycholinguistics, 5 (2): 167–69.
Tallal, P., S. L. Miller, G. Bedi, G. Byma, X. Wang, S. Nagarajan, C. Schreiner, W. M. Jenkins, and M. M. Merzenich. 1996. ‘Language comprehension in language-learning impaired children improved with acoustically modified speech’, Science, 271 (5245): 81–84.
Taylor, M. J., M. Batty, Y. Chaix, and J. Demonet. 2003. ‘Neurophysiological measures and developmental dyslexia: auditory segregation analysis’, Current Psychology Letters, 10.
Taylor, M. J. and N. K. Keenan. 1990. ‘Event-related potentials to visual and language stimuli in normal and dyslexic children’, Journal of Psychophysiology, 27 (3): 318–27.
Temple, E., G. K. Deutsch, R. A. Poldrack, S. L. Miller, P. Tallal, M. M. Merzenich, and D. E. Gabrieli. 2003. ‘Neural deficits in children with dyslexia ameliorated by behavioral remediation: evidence from functional MRI’, Proceedings of the National Academy of Sciences USA, 100 (5): 2860–65.
Temple, E., R. A. Poldrack, J. Salidis, G. K. Deutsch, P. Tallal, M. M. Merzenich, and J. Gabrieli. 2001. ‘Disrupted neural responses to phonological and orthographic processing in dyslexic children: an fMRI study’, Neuroreport, 12 (2): 299–307.
Wittmann, M. and M. Fink. 2004. ‘Time and language: critical remarks on diagnosis and training methods of temporal–order judgment’, Acta Neurobiol Exp. (Wars), 64: 341–48.
Chapter 15 Brain Networks of Attention and Preparing for School Subjects Michael I. Posner and Bhoomika R. Kar
Studies of adults have revealed brain networks used to process spoken and written language and numbers, and to carry out the attentional functions of orienting to sensory stimuli, maintaining an alert state and regulating thoughts and emotions (Posner and Rothbart, 2007). All of these functions are crucial for success in school. In this chapter we try to trace the development of some of these functions in infancy and early childhood. We outline some implications for the complex multilingual environment of contemporary India. Our goal is to inform the reader about the surprisingly early start of this development, in the hope of furthering methods to augment it.
Language
In the 1970s, behavioural studies using habituation to a repeated stimulus provided evidence that from birth infants are able to discriminate basic phonemic units not only in their own language but also in other languages to which they have never been exposed (Eimas et al., 1971; Streeter, 1976). Studies using these behavioural methods together with electrical recording from the scalp have probed some of the early development of the phonemic structure underlying language. More recently, infants have been exposed to language while resting in an fMRI scanner to examine the brain mechanisms activated by language (Dehaene-Lambertz et al., 2006). The infant language system appears to involve the same left hemisphere language structures found in adults (Dehaene-Lambertz et al., 2006). In one study infants listened to sentences presented aurally in their language. Brain areas in the superior temporal lobe (Wernicke’s area) and in Broca’s area were active. When the same sentence was presented after a delay of 14 seconds Broca’s area activity increased, as though this area was involved
in the memory trace. It has long been supposed that the early acquisition of language might involve mechanisms very different from those active in adults (Vicari et al., 2000). Left hemisphere lesions in infancy do not produce a permanent loss of language function as they might do in adults. Nonetheless, the new fMRI data suggest that the left hemisphere speech areas are involved in receptive language even at three months of age.
Phonemes
It has been possible to study changes in phonemic discrimination due to exposure to the native language at least by ten months of age (Kuhl, 1994; Saffran, 2002). Infants appear to acquire a sharpened representation of the native phonemic distinctions (Kuhl et al., 2006). During this same period they also lose the ability to make distinctions not made in their own language (Werker et al., 1981). At least a part of the loss occurs when the non-native language requires a distinction that is within a single phonemic category in the native language. An example is the ra-la distinction, important in English; it is lost because it falls within a single category in Japanese (McClelland et al., 2002). It is as though Japanese speakers no longer hear this distinction, and even when exposed to English they may not improve in distinguishing ra from la. Training by several methods (Iverson et al., 2005; McClelland et al., 2002) seems to improve this ability even in adults, although it is not known how well this knowledge can be incorporated into normal daily life communication. It might be useful to find a way to preserve the distinctions originally made for the non-native language during infancy. One study showed that 12 sessions of exposure to a Mandarin speaker during the first year of life helped to preserve a Mandarin phoneme in children whose native language was English (Kuhl et al., 2003). A similar amount of exposure to a computerized version of the speaker was not effective, suggesting the importance of social interaction in this early form of learning. More needs to be learned about how and whether media presentation can be effective in learning. There are many reasons why it is useful to know more than one language. Much of the world population lives in places where speakers of two or more languages live in close proximity. 
In addition, there is some reason to believe that proficiency in a second language improves the ability to exercise executive control over thoughts (Bialystok and Martin, 2004). We believe that this is one form of attention training, and it is discussed in the section of this chapter on Attention. There is also some reason to believe that the process of phonemic discrimination being developed in infancy is important for later efficient use of spoken and written language (Guttorm et al., 2005; Molfese, 2000). Electrical recordings taken in infancy during phonemic discrimination (Guttorm et al., 2005; Molfese, 2000) have been useful in predicting later difficulties in language and reading. There is a history of using event related potentials to assess infant deafness early in life, and being able to do so reliably has been very useful in the development of sign language and other interventions to hasten
the infants’ ability at communication. Perhaps a similar role will prove to be possible for ERPs in the development of methods to ensure a successful phonemic structure in the native language. There have been efforts to develop appropriate interventions in later childhood for difficulties in the use of language and reading, such as the widely used Fast ForWord programmes (http://www.scilearn.com). Although there are disputes about exactly why and for what populations this method works, it remains important to develop remedies for language difficulties based upon research.
Reading
Reading is a high-level skill and in alphabetic languages such as English it has properties related to the phonemic structure of language. There have been many studies of adult reading and much more is known than can be reviewed here (see Posner and Rothbart, 2007; Schlaggar and McCandliss, 2007, for reviews). The child’s ability in phonemic awareness, for example rhyming of auditory words, is a good predictor of their being able to learn to read alphabetic languages such as English. Adult studies of reading have revealed a complex neural network involved in the translation of words into meaning. Two important nodes of this network are the visual word form area of the left fusiform gyrus and an area of the left temporal-parietal junction for translating visual letters into sounds. The visual word form area is involved in integrating or chunking visual letters into units of words (McCandliss et al., 2003). It has been localized to the fusiform gyrus of the left hemisphere’s visual system. Although there has been some dispute about its operation, it appears to be a part of the visual system that becomes expert in dealing with letters as reading skill develops in later childhood. It is thought that without the functioning of this area reading cannot become fluent. For example, a patient with a lesion that interrupted the flow of information from the right hemisphere to the visual word form area used letter-by-letter reading when words were presented left of fixation (going to the right hemisphere), but read words normally when the word was projected to the left hemisphere and thus reached the visual word form area (Cohen et al., 2004). Children from seven to 18 who were deficient in reading skill failed to activate this area, but were able to do so after extensive training (Shaywitz et al., 2007). 
Languages like English that are highly irregular at the visual level are heavily dependent upon brain areas that translate visual words to sound. These areas are at the temporal parietal boundary of the left hemisphere. Children who have difficulty in learning to read show little activation in these phonological areas. However, phonics training even after 20 sessions can produce relatively normal activity in these areas and also improve reading by several grade levels. The time course of development of the visual word form area in English is important for the development of fluent reading. Phonics training often leaves the child with
improved decoding skill, but with a lack of fluency. Evidence that the visual word form area develops rather late, and at first only for words with which the child is familiar (Posner and McCandliss, 1999), suggests the importance of continuous practice in reading to develop fluency (Shaywitz et al., 2007). More research is needed on the best methods for developing fluency, particularly in non-alphabetic languages.
Numeracy
The human infant, like other animals, seems to have an inborn skill to recognize quantity. At least by a few months of age the infant seems to be able to discriminate changes in the number of events presented, provided that the numbers are small. Wynn (1992) showed that infants of seven to nine months looked longer when simple addition problems (presented as puppets) were in error than when they were correct. Berger and colleagues (2006) compared this ability in seven to nine month old children and adults. Using high-density electrical recording from the scalp they found that the same electrodes over frontal midline areas discriminated between errors and correct answers in the two groups. The adult brain made the discrimination by about 250 milliseconds and the infant brain was delayed by only about 50 milliseconds. The authors showed that error detection was signalled by an increase in theta rhythm in both groups. The electrodes in question had been related to activity in the dorsal anterior cingulate by previous studies (Dehaene et al., 1994). The overall network of brain activity in processing number has been studied in children and adults by high-density electrical recording in a task that required the person to indicate by pressing keys whether a digit was above or below five (Dehaene, 1996; Temple and Posner, 1998). In children as young as five years of age, brain mechanisms similar to those found in adults underlie the decision, suggesting that the number line can be used by this age. There has been some dispute concerning the developmental course of the number line, as some studies have suggested that frontal structures (Ansari et al., 2005), rather than parietal structures (Cantlon et al., 2006), mediate this decision. Whatever the final decision about the localization of the mechanisms underlying knowledge of numeracy, there is wide agreement that later learning of arithmetic operations depends upon it. 
Studies using a programme called Rightstart (Griffin et al., 1995) indicated that children from low SES homes were at high risk for failure in elementary school arithmetic, but training in numerical quantity before the start of school could greatly reduce this deficit. Computerized exercises based on this concept have been developed for young children (Wilson et al., 2006). Apparently there are important linguistic and cultural differences in the use of Arabic digits in the performance of calculations that could have important consequences for the acquisition of language by children in different parts of the world. Using Arabic digits
commonly employed by both cultures, the ability to make simple numerical comparisons was compared for Chinese and English native speakers. Despite the identical input and task, quite different networks of brain areas were used by the two groups. English native speakers used the network of parietal and frontal areas discussed above. However, Chinese native speakers relied on premotor areas not found active for English speakers (Tang et al., 2006). As the tasks were increased in difficulty by requiring addition as well as comparison, the English native speakers used language areas, as had been reported previously for exact calculation (Dehaene et al., 1999). In addition, English speakers activated limbic areas related to anxiety. However, Chinese native speakers did not show activation of language areas, nor of areas related to anxiety and negative affect.
Attention
The role of attention in infancy and early childhood has been covered extensively in the recent volume Educating the Human Brain (Posner and Rothbart, 2007). In that volume we report evidence that the executive attention system appears to be active during error detection in infants as young as seven to nine months (Berger et al., 2006). Recently we have been carrying out a longitudinal study starting with infants at seven months and eventually going to four years and perhaps into early schooling (Sheese et al., 2008). We have so far completed work with the infants and 18-month-olds. Several findings stand out as important for the general theme of preparing the brain for school.
The executive attention system does appear to be active during infancy (Sheese et al., 2008). We used anticipatory looking towards visual locations as a possible measure of executive attention because this measure was correlated with performance on spatial conflict tasks at age three (Rothbart et al., 2003). We found that infants with a higher frequency of anticipatory looks were also more likely to exercise caution in reaching towards novel objects, a behaviour which has been shown at later ages to be related to effortful control (Aksan and Kochanska, 2004; Rothbart et al., 2000). In addition, infants with more anticipatory looks also showed more self-regulatory activity when presented with distressing stimuli, suggesting that cognitive and emotional control emerges from infant regulation. At 18 months we genotyped the children and their caregivers and coded their interaction with parents during a free play task (Sheese et al., 2009). We first examined the dopamine D4 receptor gene (DRD4), which had been associated with attention and sensation seeking in previous work.
We found a gene-environment interaction: several key symptoms of ADHD (activity level, impulsivity and risk taking) were greatly affected by parenting only for infants with one form of the DRD4 gene (the seven-repeat allele), whereas parenting had little influence on the behaviour of children without this allele. The seven-repeat allele has been shown to be under positive selective influence during human evolution. We speculated that alleles of genes might be under positive selection
because they make the child more likely to be influenced by the dominant culture, including parenting (Posner, in press). The importance of parenting for behaviours such as activity level and impulsivity shows that the ability of children to handle the school situation may depend upon the joint interaction of genes and environment.
Rueda et al. (2004) have traced the development of executive attention by use of the Attention Network Test (ANT). They found strong development during the preschool and early school years. Rueda et al. (2005) have shown that the brain network underlying executive control and self-regulation can be influenced by training exercises introduced during the preschool period. The findings of superior performance of multilinguals on executive function mentioned previously suggest to us a strong link between executive attention and later ability in language learning and reading acquisition. The ties to attention have been supported by use of the ANT to assess differences in executive attention between monolinguals and bilinguals. One study (Yang and Lust, 2007) compared Korean and Chinese native speakers who were bilingual in English, and French and Spanish speakers who were bilingual in English, with English monolinguals. Both bilingual groups showed better executive attention than monolinguals, and the Asian group, whose languages differed the most from English, was superior to the Romance bilinguals. This study shows the close ties between language and attention. It also suggests that the need to select among languages may form one important basis for improved executive attention.
The Indian Scene
There is a good amount of research on reading acquisition and the bilingual/multilingual scene of contemporary India. In this section we discuss the complexities related to literacy acquisition and relate the development of executive function to the complex needs of literacy acquisition in the multilingual context of India. The process of acquisition of literacy becomes complicated when there is a need to acquire languages following different writing systems, as is prevalent in India. There are many languages which are spoken, written and read in India, but all the four different orthographic families of modern India, namely Indo–Aryan, Dravidian, Austro–Asiatic (Munda, Santali), and Tibeto–Burman, have a common source in Brahmi and therefore share the same salient features. An Indian child’s first language could be one of the Indo–Aryan languages like Hindi, Marathi, Gujarati, or Punjabi, or one of the Dravidian languages like Kannada, Tamil, Telugu, Malayalam, etc., which form the two major groups, and the second language is mostly English. English, being the second language, is acquired once the child starts school at four years of age, when children have already acquired considerable skill in their first language. Reading is a complex skill and fluency in reading is a goal of literacy acquisition (Posner and Rothbart, 2007). Speed has also been added as one of the components (Joshi
and Aaron, 2000), which could be a sensitive marker for studying fluency in languages following different writing systems. This is particularly relevant for a writing system where phonological decoding is not a prerequisite for acquiring reading skills. Bi-, tri-, or multilingualism is a socio–cultural condition and cannot be ignored in India. Cross-linguistic studies suggest that reading skill develops at a different pace in different orthographies (Karanth, 2003). Less is known about how the first language or mother tongue interacts with second language acquisition. Much needs to be learned concerning social factors such as the number of languages spoken by the child, relative fluency in all the languages spoken, literacy level of parents, and the extent of preschool exposure to literacy (Karanth, 2001).
The nature of an orthography, its transparency and its form of representation should influence the pattern of reading development. English follows an alphabetic script and depends heavily on grapheme–phoneme correspondence, whereas languages like Italian, Spanish, and German have transparent orthographies, and Indian languages, which are also transparent, are considered alphasyllabaries. Most of the research and theory building in reading has focused on alphabetic scripts, and these theories do not fully apply to the process of reading acquisition in languages with transparent orthographies. One difference between writing systems related to reading acquisition is that spelling-to-sound consistency varies across orthographies (Frost et al., 1987). In some orthographies, one letter or letter cluster can have multiple pronunciations (for example, English, Danish), whereas in others it is always pronounced the same way (for example, Hindi, Greek, Italian, Spanish). Similarly, in some orthographies, a phoneme can have multiple spellings (for example, English, French, Hebrew), whereas in others it is almost always spelled the same way (for example, Hindi, Italian).
It has been demonstrated that grapheme–phoneme recoding skills take longer to develop in less transparent orthographies like English, taking about two years of reading experience, as compared to more transparent orthographies like Spanish, Greek, and Finnish, for which word and non-word reading is acquired in the middle of first grade (Seymour et al., 2003). Indian scripts, derived from Brahmi, fall in between syllabic and alphabetic writing systems. The alphasyllabaries of India share some characteristics of alphabetic scripts yet are distinct, since the basic unit of the script is the syllable and not the phoneme. The basic written unit in Indian scripts is the akshara, which consists of one of three possibilities: (a) an independent vowel, (b) a consonant symbol with inherent or attached diacritic vowel, and (c) two or three consonants plus a vowel (Padakannaya and Mohanty, 2004). The transparency of the akshara makes decoding simpler, but the spatial configuration of the akshara makes it time consuming to master. A transparent orthography is believed to facilitate comprehension, as decoding is less demanding; for example, the reading comprehension of an Italian child is higher than that of an English child. But this cannot be generalized to the Indian context, as Indian children have more aksharas to learn and they need to master the akshara principle. Akshara awareness has been a good criterion for identification of good and poor readers. Writing systems which are alphabetic in nature with a small set of graphemes often have a high proportion of
irregular words as compared to alphasyllabaries, which have a larger number of graphemes with close correspondence to the phonemes. Script-specific components are involved in literacy acquisition. Phonological awareness is crucial for reading alphabetic scripts. It is neither crucial nor necessary for successful reading acquisition in transparent writing systems. In a study of an Indian population of monoliterates, nonliterates and biliterates (Hindi and English, or Kannada and English) on tasks like rhyme recognition, syllable deletion, and phoneme deletion, it was observed that only biliterates performed well on phoneme awareness tasks; the others performed well on syllable deletion and rhyme recognition tasks (Karanth, 1998; Prakash and Rekha, 1992). In one of the studies at the Centre of Behavioural and Cognitive Sciences in Allahabad it was observed that poor readers (first grade children) outperformed good readers on syllable awareness tasks in Hindi and English, whereas poor readers performed very poorly on phoneme deletion and reversal tasks in English. We also found that performance on phonemic tasks in English was better than on phonemic tasks in Hindi (Rimzhim et al., 2007). These findings suggest that the reader, while learning English and an Indian script, may incorporate different psycho-linguistic processes. Transparent orthographies may demand different strategies when, as in Hindi, the basic unit is a syllable and not a phoneme. In another study on bi- or multilingual adults it was observed that differences in phonological awareness are linked to whether the particular language being tested is L1, L2, or L3. A neuroimaging study on multilingual adults showed an increase in activation in the inferior parietal and inferior frontal regions while progressing from known to unknown language processing. The increase in activation for less familiar scripts, for example L2 or L3 as compared to L1, could be due to the requirement of greater effortful control.
Literacy acquisition in children has been studied as a sequence of three stages: logographic, alphabetic, and orthographic phases of development. Frith (1985) proposed that children go through the logographic stage of reading while acquiring literacy in English, but phonologically transparent orthographies such as German, Spanish, or Hindi do not depend on logographic reading (Wimmer and Goswami, 1994; Karanth and Prakash, 1996). Orthographic sensitivity is a crucial factor in fluent reading. In English, children before age 10 show activation of the visual word form area only by familiar words, but by adulthood even unfamiliar but orthographically correct non-words will activate the visual word form system. Orthographic sensitivity does not seem to be achieved below a certain age or extent of exposure to the language. This has important implications in terms of literacy acquisition and cross-linguistic transfer, as most reading acquisition for first and second language starts simultaneously, at least in the case of Hindi and English in North India.
There is a difference between the processes of reading acquisition in transparent orthographies, including Indian scripts, as opposed to opaque orthographies like English. Children have been observed to make more spelling errors on vowels than consonants in English. But in transparent scripts such as Italian and German, more spelling errors are committed on consonants than vowels. Generalizing Italian and German findings
to the Indian context is not appropriate because studies conducted in India show that children commit more mistakes on the vowel part of the akshara in Indian orthographies like Gujarati (Patel, 2004). The component of fluency, including speed of reading, is also important. Recent studies suggest that speed is a crucial factor, particularly in transparent orthographies. The akshara presents a complex graphemic symbol for processing and hence may hinder the speed of processing. If we take speed as a variable in reading acquisition, it has been reported that factors such as word frequency, type (concrete versus abstract), and class (nouns versus verbs) did not affect reading speed, but length of the word emerged as a major factor related to rapid reading of words as well as non-words in Kannada. This was in contrast to the results obtained for English, where all the factors listed above showed a direct correlation with speed of reading (Kurian, 1996).
The teaching of reading calls for script-specific methods (Karanth, 2006). The teaching methods followed for alphabetic scripts as opposed to transparent orthographies could be different. Models derived from studies of English have proposed phonics as the method of teaching how to read, and this may not be appropriate for transparent orthographies, as it would mean teaching aksharas like alphabets. A reader of an Indian script does not learn the vowel component and consonant component separately and then combine them to form a syllable. Rather, the child first learns the basic syllabary with primary forms of vowels and consonants, and then the entire syllabary containing all possible CV combinations is taught by rote (Karanth, 2006). These aspects are specific to literacy acquisition in the Indian context as far as Indian scripts are concerned. An Indian reader also acquires an alphabetic script, namely English, from the time the child enters school at four years of age.
English, being alphabetic, may require a different method of teaching reading. Moreover, one must also consider the fact that the child has the vocabulary and mental lexicon for his/her first language.
The neural mechanisms involved in reading acquisition in opaque and transparent orthographies have also shown some variation. Proficient reading in English has been associated with a left hemisphere network of frontal, temporoparietal and occipitotemporal regions (Fiez and Petersen, 1998). The visual word form area (fusiform gyrus) is crucial for chunking letters into a unit and seems to develop rather late. Cross-language differences depend on precisely how the orthography of a language represents its phonology. When learners of transparent writing systems like Italian were compared with learners of non-transparent systems like English or character-based systems like Chinese, similar brain regions were found active during reading (Paulesu et al., 2001; Siok et al., 2004). However, proficient readers of transparent orthographies showed greater activation in the left planum temporale, involved in phonological representation, while English readers showed greater activation in the visual word form area in the left inferior occipital temporal region. Developmental neuroimaging studies have confirmed involvement of the left posterior superior temporal cortex in phonological decoding. As literacy is acquired the visual word
form area shows greater activation (Goswami and Ziegler, 2006; Turkeltaub et al., 2003). The developmental imaging studies show that we can begin to pinpoint the neural systems responsible for reading acquisition. However, these studies do not tell us what would work in the classroom (Goswami, 2006).
The differences between writing systems at the behavioural and neural levels highlight the complex situation faced by bi-, tri-, or multilingual children in India when it comes to literacy acquisition. Differences between languages in multilingual persons could be specifically related to the script of the respective languages (Chengappa et al., 2004). In the case of biliteracy, that is, literacy acquisition in a first and second language, the child is put in a situation of cross-linguistic switching, which would involve inhibition and conflict resolution between competing languages when it comes to speaking, reading or comprehension. Bilinguals require a cognitive mechanism to direct attention to the target language and inhibit attention to the other (Bialystok et al., 2006). Attention shifts are always required during normal language use when speaking, listening, and reading (Rodriguez-Fornells et al., 2006). Bilingualism likely increases this demand. Results from a study on Korean-English bilinguals using the Attention Network Test have shown a positive relation between early childhood bilingualism and executive attention (Yang and Lust, 2007). Although, as we have suggested above, there is some information on the relative improvement in attention induced by bilingualism, much more needs to be done to understand these improvements and to determine how such issues as the nature of the different writing systems, the speed of acquisition of the two languages and other factors affect the training of attention.
Summary
Studies of literacy, numeracy, and attention all point to common roots of school success in the experiences of infancy (Blair and Razza, 2007). Of course, it has always been thought that the preschool period was important for preparing the child for a successful school experience. Explicit or implicit training in attention at the preschool level may foster the learning of a wide variety of skills acquired in school, primarily literacy and numeracy (Posner et al., in press). Brain-oriented research points to the specific experiences needed and to methods to assay whether they have been achieved. This may encourage both parents and those responsible for public education to put more emphasis on the preschool period to foster their task of ensuring the educational future of the world’s children.
References
Aksan, N. and G. Kochanska. 2004. ‘Links between systems of inhibition from infancy to preschool years’, Child Development, 75 (5): 1477–90.
Ansari, D., N. Garcia, E. Lucas, K. Hamon, and B. Dhital. 2005. ‘Neural correlates of symbolic number processing in children and adults’, NeuroReport, 16 (16): 1769–73.
Berger, A., G. Tzur, and M. I. Posner. 2006. ‘Infant brains detect arithmetic errors’, Proceedings of the National Academy of Sciences of the USA, 103: 12649–53.
Bialystok, E., F. I. M. Craik, and A. C. Ruocco. 2006. ‘Dual-modality monitoring in a classification task: the effects of bilingualism and ageing’, The Quarterly Journal of Experimental Psychology, 59 (11): 1968–83.
Bialystok, E. 2006. ‘Second language acquisition and bilingualism at an early age and the impact on early cognitive development’, Encyclopedia on Early Childhood Development. Centre of Excellence for Early Childhood Development.
Bialystok, E. and M. M. Martin. 2004. ‘Attention and inhibition in bilingual children: evidence from the dimensional change card task’, Developmental Science, 7 (3): 325–39.
Blair, C. and R. P. Razza. 2007. ‘Relating effortful control, executive function and false belief understanding to emerging math and literacy ability in kindergarten’, Child Development, 78 (2): 647–63.
Cantlon, J. F., E. N. Brannon, E. J. Carter, and K. A. Pelphrey. 2006. ‘Functional imaging of numerical processing in adults and four-year-old children’, PLoS Biology, 4 (5): 844–54.
Chengappa, S., S. Bhatt, and P. Padakannaya. 2004. ‘Reading and writing skills in multilingual/multiliterate aphasics: two case studies’, Reading and Writing, 17 (1–2): 121–35.
Cohen, L., C. Henry, S. Dehaene, O. Martinaud, S. Lehericy, C. Lemer, and S. Ferrieux. 2004. ‘The pathophysiology of letter-by-letter reading’, Neuropsychologia, 42 (13): 1768–80.
Dehaene, S. 1996. ‘The organization of brain activations in number comparison: event-related potentials and the additive-factors method’, Journal of Cognitive Neuroscience, 8: 147–68.
Dehaene, S., E. Spelke, P. Pinel, R. Stanescu, and S. Tsivkin. 1999. ‘Sources of mathematical thinking: behavioral and brain-imaging evidence’, Science, 284 (5416): 970–74.
Dehaene-Lambertz, G., L. Hertz-Pannier, J. Dubois, S. 
Meriaux, A. Roche, M. Sigman, and S. Dehaene. 2006. ‘Functional organization of perisylvian activation during presentation of sentences in preverbal infants’, Proceedings of the National Academy of Sciences of the USA, 103 (38): 14240–45.
Dehaene, S., M. I. Posner, and D. M. Tucker. 1994. ‘Localization of a neural system for error detection and compensation’, Psychological Science, 5: 303–05.
Eimas, P. D., E. R. Siqueland, P. Jusczyk, and J. Vigorito. 1971. ‘Speech perception in infants’, Science, 171 (3968): 303–06.
Fiez, J. A. and S. E. Petersen. 1998. ‘Neuroimaging studies of word reading’, Proceedings of the National Academy of Sciences of the USA, 95 (3): 914–21.
Frith, U. 1985. ‘Beneath the surface of developmental dyslexia’, in K. E. Patterson, J. C. Marshall, and M. Coltheart (eds), Surface Dyslexia: Neuropsychological and Cognitive Studies of Phonological Reading, pp. 301–27. London: Lawrence Erlbaum.
Frost, R., L. Katz, and S. Bentin. 1987. ‘Strategies for visual word recognition and orthographical depth: a multilingual comparison’, Journal of Experimental Psychology: Human Perception and Performance, 13 (1): 104–15.
Goswami, U. 2006. ‘Neuroscience and education: from research to practice?’, Nature Reviews Neuroscience, 7: 406–13.
Goswami, U. and J. C. Ziegler. 2006. ‘A developmental perspective on the neural code for written words’, Trends in Cognitive Sciences, 10: 142–43.
Griffin, S. A., R. Case, and R. S. Siegler. 1995. ‘Rightstart: providing the central conceptual prerequisites for first formal learning of arithmetic to students at risk for school failure’, in K. McGilly (ed.), Classroom Lessons: Integrating Cognitive Theory, pp. 25–50. Cambridge MA: MIT Press.
Guttorm, T. K., P. H. T. Leppanen, A. M. Poikkeus, K. M. Eklund, P. Lyytinen, and H. Lyytinen. 2005. ‘Brain event-related potentials (ERPs) measured at birth predict later language development in children with and without familial risk for dyslexia’, Cortex, 41 (3): 291–303.
Iverson, P., V. Hazan, and K. Bannister. 2005. ‘Phonetic training with acoustic cue manipulations: a comparison of methods for teaching English r-l to Japanese adults’, Journal of the Acoustical Society of America, 118 (5): 3267–78.
Joshi, M. and P. G. Aaron. 2000. ‘The component model of reading: simple view of reading made a little more complex’, Reading Psychology, 21: 285–97.
Karanth, P. and P. Prakash. 1996. ‘Developmental investigation on onset, progress and stages of literacy acquisition: its implication for instruction processes’. Project report. Delhi: NCERT.
Karanth, P. 1998. ‘Literacy and language processes: orthographic and structural effects’, in Marta Kohl de Oliviera and J. Valsiner (eds), Literacy in Human Development, pp. 145–60. New York: Ablex.
Karanth, P. 2001. ‘Reading into reading research through nonalphabetic lenses: evidence from the Indian languages’, Topics in Language Disorders, 23: 16–27.
Karanth, P. 2003. Cross-linguistic Study of Acquired Reading Disorders: Implications for Reading Models, Disorders, Acquisition and Teaching. New York: Kluwer Academic Publishers.
Karanth, P. 2006. ‘The Kagunita of Kannada-learning to read and write an Indian alphasyllabary’, in R. M. Joshi and P. G. Aaron (eds), Handbook of Orthography and Literacy, pp. 389–404. New York: Academic Press.
Kuhl, P. K. 1994. ‘Learning and representation in speech and language’, Current Opinion in Neurobiology, 4 (6): 812–22.
Kuhl, P. K., E. Stevens, A. Hayashi, T. Deguchi, S. Kiritani, and P. Iverson. 2006. ‘Infants show a facilitation effect for native language phonetic perception between 6 and 12 months’, Developmental Science, 9 (2): F13–F21.
Kuhl, P. K., F. M. Tsao, and H. M. Liu. 2003. ‘Foreign-language experience in infancy: effects of short-term exposure and social interaction on phonetic learning’, Proceedings of the National Academy of Sciences of the USA, 100 (15): 9096–9101.
Kurian, P. 1996. ‘Variables affecting rapid reading: an experimental study’. 
Unpublished master’s dissertation, University of Mysore, India.
McCandliss, B. D., L. Cohen, and S. Dehaene. 2003. ‘The visual word form area: expertise for reading in the fusiform gyrus’, Trends in Cognitive Sciences, 7 (7): 293–99.
McClelland, J. L., J. A. Fiez, and B. D. McCandliss. 2002. ‘Teaching the /r/ -/l/ discrimination to Japanese adults: behavioral and neural aspects’, Physiology and Behavior, 77 (4): 657–62.
Molfese, D. L. 2000. ‘Predicting dyslexia at eight years of age using neonatal brain responses’, Brain and Language, 72: 238–45.
Padakannaya, P. and A. K. Mohanty. 2004. ‘Indian orthography and teaching how to read: a psycholinguistic framework’, Psychological Studies, 49: 262–71.
Patel, P. G. 2004. Exploring Reading Acquisition and Dyslexia in India. New Delhi: Sage Publications.
Paulesu, E., J. F. Demonet, F. Fazio, E. McCrory, V. Chanoine, N. Brunswick, S. F. Cappa, G. Cossu, M. Habib, C. D. Frith, and U. Frith. 2001. ‘Dyslexia: cultural diversity and biological unity’, Science, 291 (5511): 2165–67.
Posner, M. I. In press. Evolution and Development of Self-Regulation.
Posner, M. I. and B. D. McCandliss. 1999. ‘Brain circuitry during reading’, in R. Klein and P. McMullen (eds), Converging Methods for Understanding Reading and Dyslexia, pp. 305–37. Cambridge MA: MIT Press.
Posner, M. I. and M. K. Rothbart. 2007. Educating the Human Brain. Washington DC: APA Books.
Posner, M. I., M. K. Rothbart, and M. R. Rueda. In press. ‘Brain mechanisms of high level skills’, in A. M. Battro, K. W. Fischer, and P. Léna (eds), Mind, Brain, and Education. Cambridge, UK: Cambridge University Press.
Prakash, P. and B. Rekha. 1992. ‘Phonological awareness and reading acquisition in Kannada’, in A. K. Srivastava (ed.), Researches in Child and Adolescent Psychology. New Delhi: NCERT.
Rimzhim, A., B. R. Kar, and N. Srinivasan. 2007. ‘Psycholinguistic and perceptual processes in good and poor readers’. Unpublished master’s dissertation, Centre of Behavioural and Cognitive Sciences, University of Allahabad, India.
Rodriguez-Fornells, A., R. D. Balaguer, and T. F. Munte. 2006. ‘Executive control in bilingual language processing’, Language Learning, 56: 133–90. Available online at www.blackwellsynergy.com/doi/pdf.
Rothbart, M. K., D. Derryberry, and K. Hershey. 2000. ‘Stability of temperament in childhood: laboratory infant assessment to parent report at seven years’, in V. J. Molfese and D. L. Molfese (eds), Temperament and Personality Development Across the Life Span, p. 85. New York: Guilford Press.
Rothbart, M. K., L. K. Ellis, M. R. Rueda, and M. I. Posner. 2003. ‘Developing mechanisms of effortful control’, Journal of Personality, 71: 1113–43.
Rueda, M. R., M. K. Rothbart, B. D. McCandliss, L. Saccomanno, and M. I. Posner. 2005. ‘Training, maturation and genetic influences on the development of executive attention’, Proceedings of the National Academy of Sciences of the USA, 102 (41): 14931–36.
Rueda, M. R., J. Fan, J. Halparin, D. Gruber, L. P. Lercari, B. D. McCandliss, and M. I. Posner. 2004. ‘Development of attention during childhood’, Neuropsychologia, 42 (8): 1029–40.
Saffran, J. R. 2002. ‘Constraints on statistical language learning’, Journal of Memory and Language, 47 (1): 172–96.
Schlaggar, B. L. and B. D. McCandliss. 2007. 
‘Development of neural systems for reading’, Annual Review of Neuroscience, 30: 475–503.
Seymour, P. H. K., M. Aro, and J. M. Erskine. 2003. ‘Foundation literacy acquisition in European orthographies’, British Journal of Psychology, 94 (2): 143–74.
Shaywitz, B. A., P. Skudlarski, J. M. Holahan, R. N. Marchione, R. T. Constable, R. K. Fullbright, D. Zelterman, B. S. Lacadie, and S. E. Shaywitz. 2007. ‘Age-related changes in reading systems of dyslexic children’, Annals of Neurology, 61 (4): 363–70.
Sheese, B. E., M. K. Rothbart, M. I. Posner, L. K. White, and S. H. Fraundorf. 2008. ‘Executive attention and self-regulation in infancy’, Infant Behaviour and Development, 31: 501–10.
Sheese, B. E., P. M. Voelker, M. K. Rothbart, and M. I. Posner. 2007. ‘Parenting quality interacts with genetic variation in Dopamine Receptor DRD4 to influence temperament in early childhood’, Development and Psychopathology, 19: 1039–46.
Sheese, B. E., P. M. Voelker, M. I. Posner, and M. K. Rothbart. 2009. ‘Genetic variation influences on the early development of reactive emotions and their regulation by attention’, Cognitive Neuropsychiatry, 14 (4–5): 332–55.
Siok, W. T., C. A. Perfetti, Z. Jin, and L. H. Tan. 2004. ‘Biological abnormality of impaired reading is constrained by culture’, Nature, 431 (7004): 71–76.
Sreekumar, Y. U. and S. S. Kumaran. 2006. ‘Development of paradigms for neural mapping of physiological changes associated with lingual variance using functional magnetic resonance imaging’. Unpublished PhD thesis, Institute of Nuclear Medicine and Allied Sciences, New Delhi.
Streeter, L. A. 1976. ‘Language perception of two-month-old infants shows effects of both innate mechanisms and experience’, Nature, 259 (5538): 39–41.
Tang, Y. Y., W. T. Zhang, K. W. Chen, S. H. Feng, Y. Ji, J. X. Shen, E. M. Reiman, and Y. Y. Liu. 2006. ‘Arithmetic processing in the brain shaped by cultures’, Proceedings of the National Academy of Sciences of the USA, 103 (26): 10775–80.
Temple, E. and M. I. Posner. 1998. ‘Brain mechanisms of quantity are similar in 5-year-olds and adults’, Proceedings of the National Academy of Sciences of the USA, 95 (13): 7836–41.
Turkeltaub, P., L. Gareau, D. L. Flowers, T. A. Zeffiro, and G. F. Eden. 2003. ‘Development of neural mechanisms for reading’, Nature Neuroscience, 6 (6): 767–73.
Vicari, S., A. Albertoni, A. M. Chilosi, P. Cipriani, G. Cioni, and E. Bates. 2000. ‘Plasticity and reorganization during language development in children with early brain injury’, Cortex, 36 (1): 31–46.
Werker, J. F., J. H. V. Gilbert, K. Humphrey, and R. C. Tees. 1981. ‘Developmental aspects of cross-language speech-perception’, Child Development, 52: 349–55.
Wilson, A. J., S. K. Revkin, D. Cohen, L. Cohen, and S. Dehaene. 2006. ‘An open trial assessment of “The Number Race”, an adaptive computer game for remediation of dyscalculia’, Behavioral and Brain Functions, 2: 20.
Wimmer, H. and U. Goswami. 1994. ‘The influence of orthographic consistency on reading development: word recognition in English and German children’, Cognition, 51 (1): 91–103.
Wynn, K. 1992. ‘Addition and subtraction by human infants’, Nature, 358: 749–50.
Yang, S. and B. Lust. 2005. ‘Testing effects of bilingualism on executive attention: comparison of cognitive performance on two non-verbal tests’, BUCLD 29: Proceedings Online Supplement of the 29th Boston University Conference on Language Development. Somerville, MA: Cascadilla Press.
Yang, S. and B. Lust. 2007. ‘Cross-linguistic differences in cognitive effects due to bilingualism: experimental study of lexicon and executive attention in two typologically distinct language groups’, Proceedings of the Boston University Conference on Language Development (BUCLD) 31. Somerville, MA: Cascadilla Press.
About the Editors and Contributors
The Editors
Narayanan Srinivasan is currently a professor at the Centre of Behavioural and Cognitive Sciences (CBCS), University of Allahabad. Since 2006, Dr Srinivasan has also been a visiting scientist at the Riken Brain Science Institute, Japan. He has a Master’s degree in Electrical Engineering from the Indian Institute of Science and subsequently earned his PhD in Psychology from the University of Georgia, USA in 1996. He worked as a postdoctoral fellow at the University of Louisville in areas including visual perception and ophthalmology for two years (1996–98). Prior to joining the Centre of Behavioural and Cognitive Sciences in 2003, he worked at the Nanyang Technological University, Singapore. He employs different methodologies to study cognitive processes and is interested in the study of perception, attention, actions, emotions, and language processing. Dr Srinivasan has numerous publications in international journals and has edited books. He has made many presentations at national and international conferences. Dr Srinivasan is the Convenor of the Cognitive Division of the National Academy of Psychology (India). His hobby is reading books. Bhoomika R. Kar has been a faculty member at the Centre of Behavioural and Cognitive Sciences, University of Allahabad since December 2002. After completing her post graduation in Psychology from the University of Lucknow, she completed an MPhil in Medical and Social Psychology from the Central Institute of Psychiatry, Ranchi. She did her PhD at the National Institute of Mental Health and Neurosciences (NIMHANS), Bangalore. She specializes in neuropsychology, particularly developmental neuropsychology. Her areas of research interest include cognitive development, dyslexia and bilingualism (cross linguistic transfer). Dr Kar has developed and standardized the NIMHANS neuropsychological battery for children (five to 15 years of age). 
This battery is being employed for child neuropsychological assessment and for research on cognitive disorders in children at various academic institutions and clinical settings in India. She has studied the pattern of growth trends of different cognitive processes. She has published her work in international and national journals and has made numerous presentations at international and national conferences. Janak Pandey has served as Professor of Psychology, University of Allahabad since 1978. He is at present the Head of the Centre of Behavioural and Cognitive Sciences.
Earlier, after his return from Kansas State University, where he earned his PhD degree as a Fulbright Scholar, he served as Assistant Professor of Psychology at the Indian Institute of Technology, Kanpur. He has also been a scholar-in-residence and Visiting Professor at the Wake Forest University, Professional Associate at the East-West Centre, Hawaii, Visiting Senior Commonwealth Fellow at the University of Manitoba and Visiting Professor, University of Leiden. Prof. Pandey has wide experience of academic–administrative positions (as HOD, Psychology; Dean, Faculty of Arts; Pro-Vice Chancellor; Vice-Chancellor for interim period at Allahabad University, and Director, G. B. Pant Social Science Institute, Allahabad), membership of academic committees (UGC, ICSSR, NCERT, and Universities), and international experience (leader/member of delegation, invited addresses in conferences). Prof. Pandey’s contributions to the discipline have received recognition in the form of several awards, including the ICSSR Professor V. K. R. V. Rao Award in Psychology in 1989, National Fellowship in 1998, and Honorary Fellow of the International Association for Cross-Cultural Psychology in 2006. He is associated with a number of professional organizations and has held significant positions, such as President of the International Association for Cross-Cultural Psychology, and Associate Editor of The Journal of Cross-Cultural Psychology. His tireless efforts over two decades as the editor of the Third and Fourth Surveys of Psychology in India have resulted in the publication of six volumes known as “Psychology in India: State of the Art”, providing international recognition to psychology research and profession. Prof. Pandey’s work on social influence processes is well known and widely cited. His recent work demonstrates how the subjective construction of economic hardship and environmental degradation is related to coping and health. 
His emphasis on social–cultural context variables in the study of human nature makes his contributions highly meaningful and relevant.
The Contributors
Ahmed is a PhD student at the University of Hyderabad. His interests include sequence learning and brain imaging. Raju S. Bapi holds a PhD from the University of Texas at Arlington. He obtained his post-doctoral training at the University of Plymouth. Later he was Research Scientist at ATR labs, Kyoto. Currently he is Reader in the Department of Computer and Information Sciences at the University of Hyderabad. His interests include cognitive and computational modelling of sequence and skill learning. Shruti Baijal graduated with Honours in Zoology and obtained an MSc in Cognitive Science. She is currently doing a PhD in Cognitive Science and uses an integrative approach involving psychophysics and electrophysiology to study attention and awareness. She has been awarded the Studentship by the International Brain Research Organization (IBRO). She has also worked on neural and cognitive changes due to meditation.
V. S. Chandrasekhar Pammi holds a PhD from the University of Hyderabad. He obtained his Postdoctoral training at Gregory Berns’s lab, Emory University, Atlanta, USA. Currently he is a Research Scientist at Cognitive Neuroimaging Group, Max Planck Institute for Biological Cybernetics, Tuebingen, Germany. His interests include modelling complexity in learning. Gustavo Deco is Research Professor and leader of the Computational Neuroscience group at the Institució Catalana de Recerca i Estudis Avançats at the Pompeu Fabra University, Spain. He received his PhD in Physics in 1987 from National University of Rosario, Argentina. From 1990 to 2003, he led the Computational Neuroscience Group at the Siemens Corporate Research Center in Munich. In 1997, he obtained his habilitation in Computer Science at the Technical University of Munich. In 2001, he received his PhD in Psychology for his thesis on Visual Attention at the Ludwig-Maximilian-University of Munich. Arnaud Destrebecqz is Assistant professor at the Psychology Department of the Université Libre de Bruxelles (ULB). In collaboration with Axel Cleeremans and Pierre Perruchet, he is working on the role of elementary associative mechanisms in learning and social cognition. Much of his work makes use of connectionist networks as theoretical models of human cognition. Kenji Doya holds a PhD from University of Tokyo, Japan. He obtained his postdoctoral training at A. Selverston lab, UCSD, LA, USA and later at T. Sejnowski lab, The Salk Institute, USA. Currently he heads the Neural Computation Unit in the Initial Research Project at the Okinawa Institute of Science and Technology (OIST), Okinawa, Japan. His interests include sequence learning, neuro-modulators and meta-learning. David M. Eagleman, PhD, holds joint appointments in Neuroscience and Psychiatry at Baylor College of Medicine in Houston, Texas. 
He directs the Laboratory for Perception and Action as well as the Initiative on Neuroscience and Law. His laboratory researches time perception, synesthesia, visual illusions, and how neuroscience will affect the legal system. Garipelli Gangadhar is a research assistant at Idiap Research Institute where he explores the use of electroencephalogram correlates of cognitive states such as alarms and anticipation for brain-computer interaction. He is currently pursuing a doctoral degree at EPFL in Switzerland. He received a master’s degree in computational neuroscience from IIT Madras. Dietmar Heinke is a senior lecturer at the School of Psychology, University of Birmingham. Before joining the School of Psychology he completed an MSc in electrical
engineering (Technical University of Darmstadt, Germany) and a PhD in Neuroinformatics (Technical University of Ilmenau, Germany). His research interests include developing computational models for a broad range of psychological phenomena. His recent models cover empirical findings on visual attention, action selection and affordances. In particular, his Selective Attention for Identification model (SAIM) explains a large set of experimental evidence on attention and its disorders. John M. Holden is an assistant professor in the department of psychology at Winona State University in the USA. His research interests include reward-specific expectancies, psychoneuroimmunology as it relates to Alzheimer’s and related disorders, addictive drugs and their effects on learning, and animal learning, psychopharmacology, and ethology in general. Glyn W. Humphreys is Professor of Cognitive Psychology at the University of Birmingham. He has research interests in the cognitive neuroscience of vision, attention, and action. He has received the British Psychological Society’s President’s Award and its Cognitive Psychology Prize, and he is a former President of the Experimental Psychology Society. He was the founding editor of Visual Cognition and is currently editor of the Journal of Experimental Psychology: Human Perception and Performance. Bipin Indurkhya did his ME (Electronics) from Philips International Institute of Technological Studies, Eindhoven and PhD (Computer Science) from University of Massachusetts, Amherst. He spent 20 years teaching at various universities including Boston University and Tokyo University of Agriculture and Technology, Japan. Since 2004, he has been with IIT-Hyderabad. His current research includes studying and modelling creativity, usability studies, and developing tools for assisting cognition and communication for autistic and dyslexic children. Denny Joseph was awarded a degree in Electrical Engineering in 2005. 
At the Indian Institute of Technology-Madras, he pursued his research interests in computational neuroscience and reinforcement learning, and was awarded an MS degree. He recently filed a patent application at the European Patent Office. He is currently employed with the Emirates Airline and Group in Dubai, as a Strategic Research and Innovation Officer. Neha Khetrapal has completed her master’s in Cognitive Science from the Centre of Behavioural and Cognitive Sciences, University of Allahabad, India and a BA in Psychology (Honours) from Delhi University. She holds a German Research Foundation award at the newly established Excellence Cluster “Cognitive Interaction Technology” at the Bielefeld University, Germany. She is interested in spatial cognition as well as emotions.
Elisabetta Làdavas is a Professor of Physiological Psychology, Faculty of Psychology, University of Bologna, Italy. She is the director of the International PhD programme in Cognitive Neuroscience, president of Italian Neuropsychological Society and the scientific director of Cognitive Neuroscience Centre, University of Bologna. She is interested in cognitive neuroscience and cognitive neuropsychology of human selective attention in vision, hearing and touch, representation of space and crossmodal integration. Sidney R. Lehky has a PhD in Biophysics and Theoretical Biology from the University of Chicago. He did his thesis work on human visual psychophysics and did postdoctoral studies on neural network modelling at Johns Hopkins University and monkey neurophysiology at NIH. Since then he has worked at a variety of research facilities on issues related to visual perception. Eirini Mavritsaki is a postdoctoral fellow at the University of Birmingham, Department of Psychology, investigating cognitive functions using models of spiking neurons. She studied Mathematics at the University of Crete, where she received her diploma in Applied Mathematics. She then continued as a research associate in the FORTH, IACM in Crete where she worked in mathematical modelling in underwater acoustics. She obtained her PhD degree in Computational Neuroscience from the University of Sheffield, studying the underlying mechanism in nictitating membrane response during classical conditioning. K. P. Miyapuram holds a PhD from University of Cambridge, UK. Currently he is a Research Scientist at Unilever, Netherlands. His interests include sequence learning and reward based learning. J. Bruce Overmier is Professor of Psychology at the University of Minnesota, USA (Graduate Faculties of Psychology, Neuroscience, and Cognitive Science). 
Overmier has been president of a number of psychological organizations as well as of the International Union of Psychological Science (2004–08). Overmier’s research spans specialties of learning, memory, stress, psychosomatic disorders, and their biological substrates. Vani Pariyadath is a graduate student in Neuroscience at Baylor College of Medicine in Houston, Texas. She obtained her master’s degree in Cognitive Science from the Centre of Behavioural and Cognitive Sciences. Her research interests include time perception and vision. Saumil Patel received his PhD in Electrical Engineering from the University of Houston and is a Senior Research Scientist in the Sereno laboratory. His research is directed towards understanding fundamental mechanisms of vision and visual perception. It includes
empirical and theoretical studies of stereoscopic depth perception, eye movements, position and motion perception, and visual attention. He is also involved in the development of instrumentation for optical recording of neural activity. Xinmiao Peng received her PhD in Neuroscience from Washington University School of Medicine, St. Louis in 2004. She then joined Dr John Maunsell’s lab as a research associate at Baylor College of Medicine. In 2006 Xinmiao became a postdoctoral fellow in Dr Anne Sereno’s lab at the School of Medicine, University of Texas, Houston. Trevor B. Penney received his PhD in psychology from Columbia University, New York, in 1997. He was a postdoctoral fellow at the Max Planck Institute of Cognitive Neuroscience in Leipzig, Germany from 1997 to 2000. Subsequently, he was a faculty member in the Department of Psychology at the Chinese University of Hong Kong. Since July 2006, he has been an associate professor in the Department of Psychology at the National University of Singapore. His research interests include the cognitive and neural substrates underlying interval timing and memory. Michael I. Posner is currently Professor Emeritus at the University of Oregon and Adjunct Prof. of Psychology in Psychiatry at the Weill Medical College of Cornell University, where he served as founding director of the Sackler Institute. Posner is best known for his work with Marcus Raichle on imaging the human brain during cognitive tasks. He has worked on the anatomy, circuitry, development and genetics of three attentional networks underlying alertness, orienting and voluntary control of thoughts and ideas. His methods for measuring these networks have been applied to neurological, psychiatric, and developmental disorders. His current research involves training of attention in young children to understand the interaction of specific experience and genes in shaping attention. Viviane Pouthas obtained her PhD in Psychology from University of Paris in 1979. 
She is currently the Director of Research at the CNRS, France and heads the team on “Perception of time and of rhythms: mechanisms and dysfunction”. She is a member of the specialists committee at the universities of Tours and Lille. She teaches at the master’s level at the University of Paris. She is interested in the cognitive neuroscience of time perception. Anne B. Sereno is an American neuroscientist whose discovery of object and spatial information in unexpected areas of the brain challenges our current understanding of visual processing. In other research, she advanced methods to measure treatment efficacy on cognitive function and define subtypes in human clinical disorders. She received her PhD from Harvard University and is an Associate Professor at the University of Texas Medical School in Houston.
Andrea Serino is Assistant Professor at the Faculty of Psychology and is also associated with the Centre for Studies and Researches in Cognitive Neuroscience, University of Bologna. His main research interests focus on integration of information from different sensory modalities in order to represent the body and peripersonal space. He also works with brain damaged patients with the aim of developing and evaluating rehabilitative programmes for neuropsychological deficits following stroke and traumatic brain injuries. Kazuko Shinohara is Associate Professor at Institution of Symbiotic Science and Technology, Tokyo University of Agriculture and Technology, Japan. Her major fields of interest are the cognitive linguistic study of spatial and temporal concepts, emotion metaphors in language and their multimodal representation, and the embodied nature of sound symbolism. Malini Shukla was a graduate student at the Centre of Behavioural and Cognitive Sciences. She is interested in learning disabilities. V. Srinivasa Chakravarthy received the BTech degree from the Indian Institute of Technology, Madras in 1989, and MS and PhD degrees from the University of Texas at Austin in 1991 and 1996, respectively. He is currently an associate professor in the Department of Biotechnology, Indian Institute of Technology, Madras, India. Siwei Liu received a BSc in Applied Psychology from Sun Yat-Sen University, China in 2005, and an MSc degree in Cognitive Neuroscience from the University of York, UK in 2006. She is now a PhD student in the Department of Psychology at the National University of Singapore. Her research interest lies in the cognitive and brain basis of categorization and expectation. Kazuhiro Tamura received BE and ME degrees in computer science at the Tokyo University of Agriculture and Technology, Japan. He was technical staff of the Laboratory for Perceptual Dynamics at the RIKEN Brain Science Institute. 
His research interests are in cognitive processes and spatial navigation. Barbara Tversky is Professor of Psychology at Columbia Teachers College and Professor Emerita of Psychology at Stanford University. Her research interests have included memory, categorization, spatial language and thinking, event perception and cognition, diagrammatic reasoning, gesture, and visual communication, including applications to design, human-computer interaction, and education. Latha Vaitilingam received a BSoc Sci (Hons) degree in Psychology from the National University of Singapore in 2007. Since February 2008, she has been pursuing an MSc degree in Neuropsychology at Macquarie University in Sydney, Australia.
Cees van Leeuwen did his PhD in experimental psychology at the University of Nijmegen on the processing of visual object structure. Currently he is head of the Laboratory for Perceptual Dynamics at the RIKEN Brain Science Institute. His research interests are in mid-level vision, brain dynamics, and creative processes. Kielan Yarrow currently works as a lecturer at the Department of Psychology, City University, London. He began his research career at the MRC Human Movement and Balance Unit in Queen Square, London, and subsequently completed his PhD and postdoc at the Institute of Neurology, UCL, and the Institute of Cognitive Neuroscience, UCL, respectively. Kielan’s research interests include multisensory perception (particularly temporal perception), attention, and action. His preferred research methods include psychophysics and transcranial magnetic stimulation.
Subject Index
acquired stimulus equivalence, 17 Actor–Critic–Explorer (ACE) model of BG, 4–5, 71–72, 86–89 architecture and components of, 72, 73 Actor, 72–73, 75 Critic, 72–73, 75 Explorer, 76–77 description and functioning of, 80–81 and muscle arm model, 72, 74 and Parkinson’s disease, 88–89 reward and temporal difference error, computation of, 81–82 simulation results for dopamine deficient conditions, 84–86 explorer dynamics during and post learning, comparison of, 84 STN–GPe layer in, 77–80 training process in, 82–84 affective Stroop task, 139 akshara, in Indian script, 262 alcohol related dementia. see Korsakoff’s disease alignment effect, 24, 25, 31, 37–40 flanker compatibility effect, 140 anterior cingulate, 53, 133, 138 anterior inferotemporal cortex (AIT), 94, 113, 115, 117, 118 associative reward prediction (ARP), 83 associative theories of learning, 3, 7 expectancy theory, 10–11 stimulus-response (S-R) theory, 7–8 goalless automatons in, 9 two-process theory, 8–10 anticipatory motivational state, based on goal, 9 transfer of control design, phases in, 9, 10 attentional system, functions of, 133 attention network test (ANT), 261 auditory peripersonal space, 98, 101 auditory temporal processing, 245 avoidance behaviour, and Mowrer’s two-process theory, 8–9 awareness index, 43
basal ganglia (BG), 71. see also Actor–Critic–Explorer (ACE) model of BG bilingualism, and demand on executive attention system, 265 blind cane users, 100 choice RT tasks, 208 circle drawing, as implicit timing task, 220–21 conditioned expectancy model, 16 connectionist models, for neurophysiological disorders, 150–51 conscious knowledge, 44 contingent negative variation (CNV), 225–28 critical flicker fusion threshold (CFFT), 200 crossmodal extinction, 97–98, 100 cue–target onset asynchronies (CTOAs). see reflexive spatial attention deviant stimulus, 216 differential outcomes (DO) procedure Down’s syndrome and, 18 facilitating effect on learning, 15 and training in Prader–Willi syndrome, 17–18 use in memory tasks, 18–19 experiments with pigeons, 18–19 in Korsakoff’s disease, 19–20 discriminative choice learning tasks, 21 common outcomes (CO) procedure, 11–12 comparisons of groups learning in procedures, 13 DO procedure, 12 (see also differential outcomes (DO) procedure) dizocilpine, 21 Dopamine 4 receptor gene (DRD4), 260 duration illusions, 196 brain, repetition effects on, 198–99 duration judgements, role of predictability in, 199–202 duration of stimulus and predictability, 196–98 screening tool for schizophrenia, 202–03 dyslexia, 241–42, 245, 253
auditory temporal processing in, 245–46 temporal order judgement paradigm, 246 phonological and temporal processing, neural basis of, 247–48 remediation in, 248 ERP paradigm with stimuli in rapid succession, 252–53 Fast ForWord training programme, 250–51 and neural mechanisms, 251–52 PASS Reading Enhancement Programme, 248–50 Educating the Human Brain, 260 electroencephalography (EEG), 176, 207, 225–29, 234 emergent relations, 17 emotional facial expressions, 133–34 emotions, 132 and attention, interaction between, 133, 135–38 ERP study on, 138 functional imaging studies on, 138 search times for sad and happy schematic faces, in detection task, 136 search times for sad and happy schematic faces, in discrimination task, 137 visual search task, use of, 135–36 and cognitive control, 139 conflict related ERP (N2) and incompatible trials, 144 detection of errors, by ERPs, 141 evaluative component, 140 flanker task, 140 increased N 100 amplitudes for happy target faces, 144 magnitude of flanker compatibility, 143 regulative or strategic component, 139–40 role of ACC, 141 facial expressions and automaticity, 134–35 greater ERN amplitude for erroneous responses, 145 and selective attention, interactions between, 133–34 error-related negativity (ERN), 141 executive attention system, development of, 260–61 explicit and implicit timing processes, distinction between, 220–21
exploration, role of BG in. see Actor–Critic–Explorer (ACE) model of BG express saccades, 182 extinction, 97 Fast ForWord training programme, 250–51 fearful faces, 134–35 flicker fusion experiments, 200 fluency, 45 functional magnetic resonance imaging (fMRI), 21, 113, 135, 139, 169, 176, 199, 230, 234, 257 global landmarks vs. local landmarks, 26 gradient asymmetry coefficient, 226 implicit learning, 3–4, 43. see also unconscious learning, acquisition of implicit timing, 207–08 electrophysiological measures of, 215–17 as emergent property of continuous motor responses, 219–21 stimulus duration, pre-attentive processing of, 218 Stop-RT task, for study of interval-timing behaviour, 208–15 bimodal single SOA sequences, use of, 210 good and poor readers, comparison of, 213, 215 parallel timing of auditory subsequences, 213 performance on two SOA polyrhythmic sequence, 214 unimodal polyrhythmic sequences, creation of, 210–11 unimodal single SOA experiments, 209–10 time of occurrence, pre-attentive processing of, 218–19 India, multilingual environment and development of literacy skills in, 261–65 infancy, development of literacy skills in attention, role of, 260–61 language, 256–57 numeracy, 259–60 phonemes, 257–58 reading, 258–59 inhibition of return (IOR), 93–94, 104–09, 118–19, 126–28
integrate and fire neurons, 151, 156 inter-problem transfer, of control of choice, 16 Korsakoff’s disease, 19–20 landmarks, role in map reading, 3, 24–25, 39–41 global landmarks, 26, 40–41 and local landmarks, comparison of, 26 iconic and numeric landmarks, experiment on comparison of, 34–35 method used, 35 procedure, 35–36 result and discussion, 36–39 local landmarks, 26, 40 processes used in navigation left–right judgements, 25 mental rotation or reorientation, 25 role of landmarks in reducing misalignment, experiment on method of, 26–28 procedure, 28–29 result and discussion, 29–34 language system, in infants, 256–57 lateral intraparietal area (LIP), 104, 182 literacy acquisition in children, stages in study of, 263 medial frontal activity, 227 mismatch negativity (MMN), 216, 246 negative omission potential (OP), 215–16 numeracy, brain mechanisms in, 259–60 orthography, and pattern of reading development, 262 Parkinson disease (PD) patients CNV reduction in, 228 STN–GPe activity, reduction in complexity of, 88–89 passive oddball paradigms, used in MMN studies of interval timing, 217 PASS Reading Enhancement Programme, 248–50 peripersonal space, 93 action and perception, link between, 101–02 automatic activation of, 98–99 blind-cane users, auditory peripersonal space in, 100–01
changes in space perception by tool-use, 99–101 definition of, 97 evidence for existence, in humans, 97–98 modular organization of, 98 multisensory neurons, role of, 99 phonological processing, 246 point of subjective equality (PSE), 179 positron emission tomography (PET), 234–36 posterior parietal cortex (PPC), 82, 83, 95, 150 PPC damage in humans, 150. see also spiking Search over Time and Space (sSoTS) model Prader–Willi syndrome, and DO training, 17–18 progressive supranuclear palsy patients, 108 proliferation effect, 200 Pyrithiamine, 19 reading, 261–62 brain areas involved in, 258–59 influence of different orthographies on, 262–64 speed of, 264 teaching of, 264 reflexive spatial attention, 104–05 activity of SC neurons during, 110 behavioural results in reflexive spatial attention task, 107 and inhibition of return (IOR), 105 behavioural effects, 105–08 localization of, 108 and role of SC, 108–11 spatial attention task used to elicit IOR, 106 models of, 118 effects of cue and target shape on reflexive attention, 125 neural net model based on repetition suppression, 119–25 SC involvement in manifestation of IOR, 118–19 spatial attention, shapes of cue and target influence on, 125–26 repetition suppression implications of, 126–28 in lateral intraparietal cortex, 113–17 and maintained activity in AIT and LIP, 117–18 in ventral stream, 111–13 reinforcement learning (RL), 3, 4, 57, 71
and BG model (see Actor-Critic-Explorer (ACE) model of BG) repetition suppression, 199 right brain damaged (RBD) patients, 97, 100 Rightstart programme, 259 saccade length effect, 180 saccadic chronostasis, 175, 179–80 and antedating account, 180–82 change in clock rate and, 180 illusory timeline across saccade, recollection of, 181 receptive field shifts, of visual neurones, 182 superior colliculus, efference copy signal from, 182 interval estimation experiment on, 182–83 discussion, 188–89 double saccade condition in, 188–89 materials and design used, 183 procedure in saccade and control conditions, 183–85 results obtained, 185–88 time estimation data, 186 methodological issues, in study of, 189 affect on first interval, 191–93 constant fixation conditions, suitable control by, 189–91 as perceptual phenomenon, 193 saccadic chronostasis as order effect, 189 schizophrenia, repetition suppression diagnostic tool for, 203 Selective Attention for Identification Model (SAIM), 150 serial reaction time (SRT) task, 43, 44, 453 short-term working memory, testing of, 18 skill learning, 4, 56 learning paradigm, study on effect of (see supervised and trial and error learning, study on effect of) studies on monkeys, 56 skin conductance responses (SCRs), 135 space representation, 97 spiking Search over Time and Space (sSoTS) model, 95, 151, 168–69 architecture of, 152–56 feature maps and location map, 152 inhibitory mechanisms in
active ignoring, 157 frequency adaptation, 156–57 parameter settings for simulations, 157–58 reaction times (RTs), generation of, 158 and preview search paradigm, 152 search efficiency in, 152 simulations in, 154, 159 effects of reduced neuro-transmitter based excitation, 165–67 preview search with old and new stimuli in same or different fields, 162–65, 168 unilateral lesion on search through space and time, effect of, 159–62, 168 spiking characteristics in, 156 standard stimuli, 216 stimulus onset asynchrony (SOA), 209 stopped clock illusion, 179 Stop-reaction time (Stop-RT) task, 209–15 Stroop colour-word task, 139 Subthalamic Nucleus and Globus Pallidus externa (STN–GPe), 71. see also Actor-Critic-Explorer (ACE) model of BG superior colliculus (SC), 94, 104, 108–11, 118–19, 127, 182 supervised and trial and error learning, study on effect of ANOVA results, summary of, 61 behavioural paradigm, 57–59 continued-learning group, 59 mild-learning group, 59 performance measures, calculation of, 59 results and session-wise dendrograms of subjects, 59–64 sequence learning tasks, use of, 57, 58 session-wise summary of performance of subjects, 60 subjects, in study, 57 supervised learning, 57 symmetrical relationships, 18 tapping tasks, as explicit timing task, 220–21 temporal order judgement (TOJ) task, 246 time estimation, involvement of brain areas in, 224–25 EEG studies, 225–29 functional imaging studies, 229–34 studies combining PET and EEG data, 234–36
unconscious learning, acquisition of, 43–44 implicit influence of perceptual and motor fluency, experiment on control of, 46 discussion and conclusion, 51–53 method, 46 procedure, 46–47 reaction times, 47–49 recognition task, 49–51 results, 47–51
sequence learning studies, based on serial reaction time tasks, 44–46 unsupervised learning, 57 visual backward masking paradigms, 134–35 visual neglect, 150 visual search paradigm, 135 visual–tactile extinction, 98 visual word form area, 258–59
Name Index
Aaron, P.G., 261 Abbott, L.F., 120 Abeles, A., 155 Abeles, M., 88 Aerts, J., 53 Agnew, J.A., 251 Agter, A., 154, 157 Ahern, G.L., 138 Ahmed, 56–69 Ahmed, B., 157 Aksan, N., 260 Albani, G., 228 Albertoni, A., 257 Aldridge, V.J., 225, 226 Alesch, F., 228, 229 Alexander, G.E., 82 Algom, D., 139 Allan, L.G., 180, 189 Allen, H.A., 154 Allen, R., 46 Alvarez, D., 15 Amabile, G., 228 Amador, S.C., 104, 108, 113, 127 Amorim, M.A., 46 Anandan, P., 83 Anderson, A.K., 134 Anderson, J., 157 Andreason, P., 242 Ansari, D., 259 Aretz, A.J., 25 Armony, J.L., 94, 135, 138 Armstrong, C., 126 Arnold, D.H., 213 Aro, M., 262 Augath, M., 113 Avidan, G., 113 Backus, B., 162 Baijal, S., 132–45 Bakke, B.L., 19 Balaguer, R.D., 265
Balakrishna, C., 247 Ballard, D.H., 199 Bannister, K., 257 Bapi, R.S., 56–69, 83 Barch, D.M., 139, 140 Baria, A., 220 Bartling, J., 242, 246 Barto, A.G., 81–83 Bassolino, M., 98, 101 Bates, E., 257 Batty, M., 134, 248 Beaudoin, G., 134, 139 Bedell, H.E., 120 Bedi, G., 250 Behramann, M., 150 Bell, A.H., 108, 109, 118 Benjamin, A., 45 Bentin, S., 262 Berger, A., 108, 259, 260 Bergman, H., 88 Berlin, H.A., 175 Berlucchi, G., 94 Bernstein, E., 108 Bernston, G.C., 132 Bhatt, S., 265 Bialystok, E., 257, 265 Biberstine, J., 220 Bikle, P.C., 199 Bilker, W., 241 Billington, M.J., 208 Bjork, R., 45 Black, S., 202 Blair, C., 265 Blair, K.S., 139, 142 Blair, R.J., 139, 142 Block, R.A., 175 Bloom, M., 220 Bobholz, J.A., 104 Boden, C., 250 Bojczyk, K.G., 220 Bonnet, M., 234
Botvinick, M.M., 139, 140 Boucsein, W., 134 Bowen, R.W., 200 Box, J., 218 Bradley, M.M., 138 Bradshaw, J.L., 137 Brady, S., 246 Braff, D.L., 202 Brailsford, A., 249 Braine, L.G., 25 Braithwaite, J.J., 152 Brammer, M.J., 134 Brannon, E.N., 259 Braver, T.S., 139, 140 Breitmeyer, B.G., 120 Bremmer, F., 97 Brennan, C.W., 104 Bretherton, L., 246 Breznitz, Z., 247 Briand, K.A., 104–06, 108 Brogan, D., 180 Brooks, D.J., 56 Brown, K.J., 134 Brown, M.W., 111, 112, 199 Brown, P., 88, 179–81, 188, 189, 192, 196 Brown, S.W., 230 Brunel, N., 152, 156 Brunswick, N., 264 Bryant, D.J., 25 Buchner, A., 45 Buckner, R.L., 113, 199 Budd, T.W., 218 Buhusi, C.V., 224 Bull, J.A., 11, 13 Buonomano, D., 196 Buonomano, D.V., 175, 176, 224 Burr, D., 196 Burr, D.C., 181, 191 Byma, G., 250 Cacioppo, J.T., 132 Caffarra, P., 104 Calabresi, P.A., 104 Calvo, M.G., 134 Campbell, F.W., 181 Cantlon, J.F., 259 Careri, J.M., 21 Carlson, J.S., 248
Carter, C.S., 139, 140 Carter, E.J., 259 Carver, C.S., 132 Case, R., 259 Casini, L., 225, 230, 234 Cavanagh, P., 196, 198, 202 Cavangh, P., 175 Cayeux, I., 133 Chaix, Y., 248 Chakravarthy, V.S., 71–89 Channon, S., 45, 52 Chanoine, V., 264 Chan, P. Chi, 213 Chari, S., 247 Chase, R.B., 40 Chasteen, A.L., 94, 138 Chelazzi, L., 155 Chengappa, S., 265 Chen, K.W., 260 Chen, R., 203 Chilosi, A.M., 257 Choate, L.S., 108 Christensen, B.K., 203 Christoff, K., 134 Chun, M.M., 113 Church, R.M., 197, 227 Cioni, G., 257 Cipriani, P., 257 Clarke, K., 165 Cleeremans, A., 44, 47, 52 Coffey, S.A., 247 Cohen, A., 46 Cohen, J.D., 139, 140, 151 Cohen, L., 199, 258, 259 Cohen, L.G., 56 Cohen, Y., 15, 93, 108, 162, 164 Cohen, Y.A., 106 Colby, C.L., 97, 98, 182 Compton, R., 133 Constable, R.T., 252, 258 Cooke, D.F., 99 Cook, N.D., 56 Cools, A.R., 234 Cooper, R., 225, 226 Corballis, M.C., 25, 126 Corbetta, M., 104 Corcos, D.M., 220 Coull, J.T., 230, 231, 234
Cowey, A., 180, 197 Critchley, H.D., 236 Csepe, V., 247 Cunnington, R., 228, 229 Curran, T., 46, 52 Dallas, M., 45 Damasio, A.R., 132 Das, J.P., 248–250 Daskalakis, Z.J., 203 Das, P., 134 Davidson, R.J., 138 Davis, M., 197 Davis, M.H., 199 Dayan, P., 71 De Araujo, I.E.T., 133 Deco, G., 93, 94, 133, 150–69 Deecke, L., 228, 229 De Groot, P., 135 Deguchi, T., 257 Degueldre, C., 53 Dehaene-Lambertz, G., 256 Dehaene, S., 145, 199, 256, 258–60 Deimel, W., 242, 246 Delange, N., 40 Del Fiore, G., 53 DeLong, M.R., 88 Delos, S., 15, 16 Demonet, J., 248 Demonet, J.F., 264 Denis, M., 25, 40 De Rosa, E., 134 Derr, M.A., 56 Derryberry, D., 260 de Silva, F.P., 202 Desimone, R., 111, 114, 127, 199 Desmurget, M., 82 Destrebecqz, A., 43–53 Deutsch, G.K., 251, 252 Dhital, B., 259 Diamond, M.R., 191 Dienes, Z., 44, 45 Di Girolamo, G.J., 133 Di Lazzaro, V., 88 Di Lollo, V., 200 Dimberg, U., 135 Di Pellegrino, G., 97, 98
Dirnberger, G., 228, 229 Doffin, J.G., 220 Dolan, R.J., 94, 135, 138, 236 Dolansky, L., 220 Donk, M., 154, 157 Dorn, C., 251 Dorris, M.C., 108, 109, 111, 118, 127 Douglas, R., 157 Downing, P.E., 113 Doya, K., 56–69, 79, 83, 88 Drevets, W.C., 139, 142 Driver, J., 94, 100, 134, 138, 165, 166 Dubois, J., 256 Dubrovsky, B., 234 Duda, R.O., 59 Duhamel, J.R., 97, 98, 182 Duncan, J., 127, 155, 199 Duncan J., 158 Dusek, P., 234 Dutton, K., 134 Eagleman, D.M., 175, 196–204 Eastwood, J.D., 94, 134–37 Edelman, S., 113 Eden, G.F., 246, 251, 252, 265 Edwards, C.A., 13 Edwards, H., 180 Efron, R., 200 Egeth, H., 125 Egeth, H.E., 105 Eglin, M., 159, 161, 168 Egly, R., 127 Eimas, P.D., 256 Eklund, K.M., 257 Ekman, P., 132 Eley, M.G., 25 Ellis, L.K., 260 Ellis, R., 150 Elsinger, C., 208 Endl, W., 228, 229 Ennis, J., 247 Epstein, C.M., 82 Epstein, R., 113 Erskine, J.M., 262 Esteves, F., 134, 135 Estevez, A.F., 15 Etcoff, N.L., 134
Eugene, F., 134, 139 Evans, G.W., 25, 40 Everling, S., 108, 109, 111, 118, 127 Fadiga, L., 97, 99 Fahy, F.L., 111, 112, 199 Fakhri, M., 180 Fan, J., 261 Farnè, A., 97–101 Farrell, W.S.J., 25 Fattapposta, F., 228 Faulkner, A., 180 Fazio, F., 264 Featherstone, E., 236 Fecteau, J.H., 108–10, 118, 119 Feingold, A., 88 Felleman, D.J., 111 Fenske, M.J., 94, 135 Fenwick, P.B., 236 Ferrandez, A.M., 234 Fery, P., 44, 45 Fiesta, M.P., 196 Fiez, J.A., 257, 264 Fink, G.R., 97 Firoozi, O., 247 Fischer, B., 182 Fitzgerald, P.B., 203 Fitzgibbon, E.J., 182 Flowers, D.L., 265 Fogassi, L., 97, 99 Ford, J.M., 218 Fox, E., 134 Frackowiak, R.S.J., 56 Franchi, G., 97 Franck, M., 45 Franklin, L.M., 202 Franklin, N., 25 Freeman, W.J., 88 Fridberg, D., 139, 142 Friedman-Hill, S.R., 159, 161, 168 Friedman, J.H., 108 Frischen, A., 94, 134, 135 Frith, C.D., 56 Frith, U., 263 Frost, R., 262 Fuchs, A.F., 182 Fuentes, L.J., 15 Fulbright, R.K., 252, 258
Fu, Q., 44 Fu, X., 44 Gabrieli, D.E., 251 Gabrieli, J.D.E., 134 Gaillard, V., 44, 45 Gallagher, M., 197 Gallese, V., 97, 99 Gandhi, S.K., 203 Gangadhar, G., 71–89 Garcia, N., 259 Gardner, E.P., 82 Gareau, L., 265 Garnero, L., 234 Garrido, M.A., 250 Gaymard, B., 104 Gelade, G., 151 Gellman, L., 24, 40 Gentilucci, M., 97 George, N., 234 Gerschlager, W., 228, 229 Geutskens, A., 135 Gibbon, J., 197, 213, 227 Gilbert, J.H.V., 257 Gilchrist, I., 158 Given, B.K., 247 Goldberg, M.E., 97, 98, 181, 182, 191 Goldman-Rakic, P., 155 Gonzalez, C., 15 Gordon, E., 134 Goswami, U., 245, 263, 265 Grafton, S.T., 56, 82 Graf, W., 97 Graham, K.S., 113 Graydon, F.X., 56, 83 Graziano, M.S., 97, 99 Graziano, M.S.A., 99 Greenwood, R., 151, 166, 169 Griffin, S.A., 259 Grill-Spector, K., 113, 199 Grimm, S., 218 Grochowski, S., 202 Grondin, S., 175, 225, 230 Grosjean, M., 208 Grossberg, S., 120 Gross, C.G., 97 Gruber, D., 261 Gupta, G., 137
Gupta, R., 94 Gur, R.C., 241 Gutierrez, E., 135, 138 Gutmann, A., 15, 16 Guttorm, T.K., 257 Habib, M., 247 Haggard, P., 179–82, 188–93, 196 Halgren, E., 132 Hallett, M., 83 Halligan, P.W., 150, 169 Halparin, J., 261 Halsband, U., 56 Hamon, K., 259 Hancock, P.A., 175 Hanley, G., 40 Harner, A.M., 56, 57 Harrington, D.L., 234 Harris, A., 134 Hart, P.E., 59 Hatfield, E., 132 Hayashi, A., 257 Hayes-Roth, B., 40 Haykin, S., 72 Hazan, V., 257 Hazelrigg, M.D., 40 Hazeltine, E., 25, 56 Hazeltine, R.E., 220 Heal, R., 179–81, 188, 189, 192, 196 Heeger, D., 162 Hegarty, M., 40 Heinke, D., 93, 94, 150–69 Heinke, D.G., 169 Heller, W., 142 Hellstroem, A., 189 Henik, A., 108 Hening, W., 104 Henry, C., 258, 259 Henson, R., 113, 199 Henson, R.N.A., 113, 199 Hershey, K., 260 Hertz-Pannier, L., 256 Hiharara, S., 100 Hikosaka, O., 56, 57 Hillyard, S.A., 218 Himmelbach, M., 158 Hirshman, E., 45
Hochberg, J., 24, 40 Hochhalter, A.K., 19 Hodgson, T., 165 Hodinott-Hill, I., 180, 197 Hogan, D.E., 13 Hogendoorn, H., 196 Holahan, J.M., 258 Holcombe, A.O., 196 Holcomb, P.J., 247 Holden, J.M., 7–21 Holland, P.C., 197 Holmes, V.M., 246 Holscher, C., 115 Holub, R.J., 19 Holzman, P.S., 104 Hoopen, G. Ten, 189 Hopp, J.J., 182 Horstink, M.W., 234 Horwitz, B., 169 Hugueville, L., 234 Hull, C.L., 10 Humphrey, K., 257 Humphreys, G.W., 15, 93, 94, 150–69 Huotilainen, M., 219 Hurwitz, S., 40 Husain, M., 151, 165, 166, 169 Hu, X.T., 97 Ikeda, A., 234 Ilmoniemi, R.J., 219 Imada, H., 13 Indurkhya, B., 24–41 Inhoff, A.W., 108 Insola, A., 88 Intriligator, J., 175, 196, 198, 202 Inui, K., 199 Iriki, A., 100 Ishai, A., 199 Ishibashi, H., 100 Isnardi, G., 247 Itti, L., 93, 94, 127, 128, 152 Itzchak, Y., 113 Iverson, P., 257 Ivry, R.B., 56, 220, 221, 229, 230 Ivy, R.B., 175, 176 Iwamura, Y., 100 Izard, C.E., 132
Jablensky, A.V., 218 Jackson, S.R., 133 Jacobsen, T., 217, 218 Jacoby, L., 45 Jagielo, J.A., 13 James, W., 132 Jankovic, I.H., 26, 40 Janssen, P., 196 Japee, S., 134, 135 Javitt, D.C., 202 Jeanmonod, D., 88 Jech, R., 234 Jenkins, J.W., 246 Jeter, C.B., 105, 108 Jiménez, L., 44, 52 Jin, Z., 264 Joanette, Y., 134, 139 Johnson, H., 182, 189 Johnson-Laird, P.N., 134 Johnson, M.H., 241 Johnson, P., 44, 47 Johnston, A., 213 Johnstone, T., 47, 52 Johnston, P., 246 Joliceour, P., 157, 159, 160 Joseph, B., 17, 18 Joseph, D., 71–89 Joshi, M., 261 Jueptner, M., 56 Jung Stalmann, B., 154, 157 Jusczyk, P., 256 Kaji, R., 234 Kakigi, R., 199 Kanai, R., 196, 197 Kandel, E.R., 82 Kan, E., 241 Kanwisher, N., 113, 134 Kanwisher, N.G., 126 Kapur, S., 203 Karanth, P., 262–64 Kar, B.R., 245–53, 256–65 Karnath, H.O., 158 Katz, L., 262 Kawano, K., 134 Keele, S.W., 220 Keenan, N.K., 247 Kenan, A., 40
Kennard, C., 165 Kennedy, L., 220 Kennett, S., 100 Kenny, S.B., 56 Kertzman, 83 Khatoon, S., 106 Khetrapal, N., 132–45 Kidd, G.R., 218 Kiehl, K.A., 202 Killgore, W.D.S., 134 Kingstone, A., 104 Kirby, J.R., 248, 250 Kirby, L., 247 Kiritani, S., 257 Kirk, B., 17 Kischka, U., 175 Klein, D.J., 132 Klein, R.M., 94, 104–09, 111, 118, 127 Kochanska, G., 260 Koch, C., 93, 94, 127, 128, 152 Konishi, T., 241 Konz, W.A, 14 Kornblum, S., 208 Kourtzi, Z., 113 Kruse, J.M., 14 Kuhl, P.K., 257 Kujala, T., 247, 253 Kuker, W., 158 Kumar, S., 74 Kurian, P., 264 Kushnir, T., 113 Kwak, H.W., 125 Lacassagne, D., 120 Lachman, T., 245 Ladanyi, M., 234 Làdavas, E., 97–102, 165 Landis, T., 56 Lane, R.D., 138 Langlais, P.J., 19 Lang, P.J., 138 Lang, W., 228, 229 Lantero, D., 220 Laureys, S., 53 Lazarus, R.S., 132 Le Bihan, D., 199 LeDoux, J.E., 132 Leenders, K.L., 56
Lee, P.U., 25 Lehericy, S., 258, 259 Lehéricy, S., 234 Lehky, S.R., 104–28 Lejeune, H., 234 Leonard, C.M., 241 Leppanen, P.H.T., 257 Lercari, L.P., 261 Leung, K. Man, 213 Levesque, J., 134, 139 Levine, M., 24, 26, 40 Lewis, P., 229 Lewis, P.A., 175, 176 Liddle, P.F., 202 Light, G.A., 202 Li, L., 111 Lindinger, W.G., 228, 229 Linwick, D., 19 Liptak, E., 247 Liu, H.M., 257 Liu, J., 134 Liu, S., 207–21 Liu, Y., 157 Lo, C.C., 121 Logothetis, N.K., 111, 113 Lucas, E., 259 Luck, S.J., 252 Lungu, O.V., 21 Lupianez, J., 44 Luppino, G., 97 Lust, B., 261 Lu, X., 56 Lyytinen, H., 257 Lyytinen, P., 257 Macar, F., 225, 230, 231, 234 MacLeod, C., 140 MacLeod, C.M., 139 Madison, D., 152, 157 Magnin, M., 88 Mahoney, J.L., 21 Maki, P., 15, 16 Maki, R.H., 25 Malach, R., 113 Malhotra, P., 151, 166, 169 Mangin, J., 199 Manley, T., 165, 169 Mannan, S., 165
Maquet, P., 234 Maravita, A., 100 Marchione, R.N., 258 Marchon, I., 40 Margot, C., 133 Mari-Beffa, P., 15 Marrocco, R., 133 Marsault, C., 234 Marshal, J.C., 150, 169 Marslen-Wilson, W.D., 199 Martin, A., 113, 139, 142, 199 Martinaud, O., 258, 259 Martin, J., 165 Martin, K., 157 Martin, M.M., 257 Maruthy, S., 242 Masmoudi, K., 104 Matelli, M., 97 Mathews, A., 140 Matin, L., 200 Matsui, M., 241 Matsuzawa, J., 241 Mattingley, J.B., 137, 165, 166 Mauk, M.D., 175, 176, 224 Maunsell, J.H., 113 Mavritsaki, E., 150–69 Mayer, A.R., 234 May, M., 40 Mazzone, P., 88 McAuliffe, J., 94, 138 McBride-Chang, C.A., 213 McCallum, W.C., 225, 226 McCandliss, B.D., 257–59, 261 McClelland, J.L., 44, 151, 257 McCracken, R.A., 249 McCrory, E., 264 McKenna, M., 135, 138 Meck, Wh, 197 Meck, W.H., 213, 224, 227 Melvin, R., 120 Mencl, W.E., 252 Meneghello, F., 99 Meng, X., 213 Mensour, B., 134, 139 Meriaux, S., 256 Merikle, P.M., 44, 134–37 Mertens, M., 19 Merzenich, M.M., 246, 251, 252
Metzler, J., 25 Meyer, A.S., 234 Miall, C., 229 Miall, R.C., 175, 176 Michie, T.T., 218 Miezin, F.M., 104, 199 Miller, E., 155 Miller, E.K., 111, 114, 115, 139, 199 Miller, S.L., 246, 250, 251 Mineka, S., 134 Mires, J., 25 Mishkin, M., 113 Mishra, R.K., 250 Mitchell, D.G., 139, 142 Miyachi, S., 56, 57 Miyapuram, K.P., 56–69, 83 Miyashita, K., 56, 57 Miyashita, Y., 13 Miyauchi, S., 56 Moats, L., 252 Mody, M., 246 Mohanty, A.K., 262 Mok, L.W., 21 Molfese, D.L., 257 Molina, S., 250 Montague, P.R., 71 Montello, D.R., 40 Morel, A., 88 Morocutti, C., 228 Morrone, M.C., 181, 191, 196 Morton, J., 139, 142 Moss, S.A., 137 Mowrer, O.H., 8–10 Mozer, M.C., 150, 169 Muller, H.J., 168 Munakata, Y., 241 Munoz, D.P., 104, 108–11, 118, 119, 127 Munte, T.F., 265 Näätänen, R., 216–19, 247, 253 Naccache, L., 199 Nagaï, Y., 236 Nagarajam, S., 250 Naglieri, J.A., 248 Nagourney, B.A., 25 Naish, P.L., 180 Nakahara, H., 56 Nakajima, S., 13
Nakajima, Y., 189 Nakamura, K., 56, 182 Nazarian, B., 230, 231, 234 Nelson, E.E., 139, 142 Nelson, S.B., 120 Neville, H.J., 247 Nicoll, R., 152, 157 Niebur, E., 155 Nini, A., 88 Nishida, S., 213 Nitschke, J., 196 Nitschke, J.B., 142 Nittrouer, S., 245 Nobre, A.C., 196 Noguchi, K., 241 Noguchi, Y., 199 Oatley, K., 134 Ochsner, K.N., 134, 139 Ogmen, H., 120 Öhman, A., 134, 135 Ojemann, J.G., 199 Olivers, C., 152, 158 Olivers, C.N.L., 15, 152, 154, 157, 159, 162–64, 168 Olivieri, G., 134 Oliviero, A., 88 Ollman, R.T., 208 Oppenheimer, D., 45 Oram, M.W., 134 Orfanidou, E., 199 O’Scalaidhe, S., 155 Overmier, J.B., 7–21 Owens, D.A., 184 Paavilainen, P., 216 Padakannaya, P., 262, 265 Paffen, C.L., 196 Palermo, R., 134 Palij, M., 26, 40 Pammi, V.S.C., 56–69 Pandya, D.N., 83 Panitz, D., 134 Pansky, A., 139 Paquette, V., 134, 139 Pare, M., 111 Pariyadath, V., 105, 108, 175, 196–204 Park, J., 189, 196
Parsons, J., 21 Parton, A.D., 151, 166, 169 Passingham, R.E., 56 Patel, P.G., 264 Patel, S., 104–28 Patteri, I., 126 Paulesu, E., 264 Paus, T., 241 Pavani, F., 98, 99 Pavesi, G., 97 Peduto, A., 134 Peigneux, P., 53 Pelphrey, K.A., 259 Peng, X., 104–28 Penney, T., 176 Penney, T.B., 207–21 Percival, A., 180 Perez, M.A., 56 Perfetti, C.A., 264 Perrett, D.I., 134 Perruchet, P., 45, 46, 52, 53 Person, C., 71 Peruch, P., 40 Pessoa, L., 134, 135, 138, 139, 142 Petersen, S.E., 15, 104, 168, 199 Peterson, G.B., 19 Peterson, S.E., 264 Petronio, A., 165 Pezdek, K., 25, 40 Pfeuty, M., 225, 228, 234 Phaf, R.H., 135 Pierrot-Deseilligny, C., 104 Pine, D.S., 139, 142 Pinel, P., 260 Pitkin, S.R., 21 Poikkeus, A.M., 257 Poizner, H., 104 Pokorny, R.A., 220 Pola, J., 200 Poldrack, R.A., 251, 252 Poline, J.B., 199, 234 Pontefract, A., 104 Pool, J.E., 250 Posner, M., 15, 93, 162, 164 Posner, M.I., 15, 106, 108, 133, 145, 166, 168, 241, 256–65 Pouget, A., 150, 169
Pouthas, V., 224–36 Pouthas, Vivane, 176 Pozzessere, G., 228 Praamstra, P., 234 Prablanc, C., 82 Prakash, P., 263 Pratt, J., 94, 138 Prescott, S.A., 157 Presson, C.C., 40 Price, C.J., 159 Printz, H., 151 Pugh, K.R., 252 Purushothaman, G., 120 Purves, D., 181 Putz, B., 56 Rafal, R., 127 Rafal, R.D., 104, 108, 127, 159, 161, 168 Ragot, R., 225, 228, 234 Raichle, M.E., 199 Rainer, G., 196, 199 Rammsayer, T., 175, 196 Ramsperger, E., 182 Rand, M.K., 56, 57 Ranganath, C., 196 Rao, R.P., 199 Rao, S.M., 104, 234 Raos, V., 97 Raymond, J., 135 Raz, A., 88 Razza, R.P., 265 Reagan, S., 46 Reber, A.S., 43, 46 Reber, R., 45 Reed, J., 44, 47 Rees, D., 162 Regard, M., 56 Reiman, E.M., 138 Reingold, E.M., 44 Reinikainen, K., 217–19 Rekha, B., 263 Remschmidt, H., 242, 246 Renault, B., 234 Reuter-Lorenz, P. A., 127 Rey, V., 246 Rhodes, G., 134 Richardson, A.E., 40
Richardson, U., 245 Riches, I.P., 111, 112, 199 Riddoch, M.J., 150, 158, 159, 161, 168 Riggio, L., 126 Rimzhim, A., 263 Ringo, J.L., 111 Ritter, W., 202, 215 Rivaud, S., 104 Rivest, J., 175, 196, 198, 202 Riviere, D., 199 Rizzolatti, G., 97, 99 Rizzo, P.A., 228 Robertson, I.H., 165, 166, 169 Robertson, L.C., 159, 161, 168 Robertson, S.D., 220 Robertson, T.J., 40 Roche, A., 256 Rock, D., 218 Rodriguez-Fornells, A., 265 Rokke, E., 14 Rolls, E., 93, 94, 151, 152, 155, 156, 158, 169 Rolls, E.T., 115, 132, 133, 175 Root, J.C., 135 Rorden, C., 165, 166 Rosa, A., 104 Rose, D., 189, 196, 197 Rosen, A.C., 104 Rosenbaum, D.A., 56, 208 Rosen, S., 245 Rossano, M.J., 40 Ross, J., 181, 191, 196 Ro, T., 127 Rothbart, M.K., 241, 256, 258, 260, 261 Rothwell, J.C., 180–82, 190, 191, 193, 196 Rothwell, J.C.E., 179–82, 188, 189, 192 Rotteveel, M., 135 Rousseau, L., 209, 210, 212, 213 Rousseau, R., 209, 210, 212, 213 Rubin, J.E., 79, 88 Rueda, M.R., 260, 261 Rugg, M., 199 Rugg, M.D., 199 Rumsey, J.M., 242 Russo, R., 134 Saccamanno, L., 261 Sacks, O., 19
Sadato, N., 56 Saffran, J.R., 257 Sakai, K., 56 Salidis, J., 252 Sanarelli, L., 228 Sapir, A., 108 Sasaki, T., 189 Sasaki, Y., 56 Savage, L.M., 19, 21 Savoyant, A., 40 Scaglioni, A., 104 Scandolara, C., 97 Schacter, D., 113 Scheier, M.F., 132 Schlack, A., 97 Schlaggar, B.L., 258 Schlag, J., 189, 196 Schlag-Rey, M., 189, 196 Schneidt, T., 220 Schreiner, C., 246 Schroger, E., 217, 218 Schulte-Körne, G., 242, 246 Schwartz, G.E., 138 Schwarz, U., 83 Sciolto, T.K., 104 Scott, R., 45 Scott, S.K., 245 Seidenberg, M.S., 151 Sejnowski, T.J., 71, 150, 151, 157, 169 Sen, K., 120 Sereno, A.B., 104–28 Sereno, M.E., 113 Serino, A., 97–102 Servan-Schreiber, D., 139, 151 Seymour, P.H.K., 262 Shankarnarayan, V.C., 242, 247 Shanks, D.R., 43–45, 47, 52, 53 Shapiro, K., 165 Shaywitz, B.A., 252, 258 Shaywitz, S.E., 252 Sheese, B.E., 260 Sheinberg, D.L., 111 Shelley, A., 202 Shen, J.X., 260 Shepard, R.N., 25, 40 S.H. Feng, Y. Ji, 260 Shibasaki, H., 234
Shinohara, K., 24–41 Shukla, M., 245–53 Shulman, G.L., 104 Sidman, M., 17 Siegler, R.S., 259 Sigman, M., 256 Simson, R., 215 Sinkkonen, J., 219 Siok, W.T., 264 Siqueland, E.R., 256 Skarda, C.A., 88 Skudlarski, P., 258 Slovin, H., 88 Smilek, D., 94, 134–37 Smith, B.W., 139, 142 Snart, F., 249 Snow, J., 165 Sobotka, S., 111 Sokolov, E.N., 134 Solomon, R.L., 10 Sommer, M.A., 182 Soroker, N., 108 Sowell, E.R., 241 Spelke, E., 260 Spence, C., 100 Spence, K.W., 8, 10 Spencer, R.M., 220, 229, 230 Spencer, R.M.C., 175, 176, 220, 221 Squire, L.R., 199 Srinivasan, N., 94, 132–45, 263 Srivastava, P., 137 Stanescu, R., 260 Starrveldt, Y., 127 Stegeman, D.F., 234 Stein, J.F., 246 Stephan, K.M., 56 Stetson, C., 196 Stevens, E., 257 Stork, D.G., 59 Strallow, D., 104 Streeter, L.A., 256 Studdert-Kennedy, M., 246 Sturman, D., 134, 135, 139, 142 Suetomi, D., 189 Sugase, Y., 134 Summers, J., 189, 196, 197 Suprenant, A.M., 218 Sussman, E., 216
Sutton, R.S., 81, 82 Sweeney, W.A., 19 Syssoeva, O., 218, 219 Szapiel, S.V., 104, 108 Takegata, R., 218, 219 Takino, R., 56 Tallal, P., 242, 246, 250–52 Talla, P., 247 Tamura, K., 24–41 Tanabe, H.C., 56 Tanaka, K., 111 Tanaka, M., 100 Tanaka, S., 56 Tang, Y.Y., 260 Tan, L.H., 264 Taylor, C.S., 99 Taylor, H.A., 25 Taylor, M.J., 134, 247, 248 Taylor, T., 94 Taylor, T.L., 108 Tees, R.C., 257 Temple, E., 251, 252, 259 Ten Hoopen, G., 209 Terada, K., 234 Terman, D., 79, 88 Tervaniemi, M., 216–19 Thilo, K.V., 180, 197 Thomas, K.M., 21 Thompson J.G., 132 Thompson, J.M., 245 Thompson, P.M., 241 Thompson, T.I., 17, 18 Thorndike, E.L., 7 Thorndyke, P.W., 40 Thut, G., 56 Tien, K.R., 219 Tipper, S.P., 127 Tlauka, M., 40 Todd, J., 218 Toga, A.W., 241 Tolman, E.C., 8 Tonali, P., 88 Trapold, M.A., 10, 13 Treisman, A., 151, 159, 161, 168 Treisman, M., 180, 197 Treves, A., 155 Trimble, M.R., 236
Trinath, T., 113 Tsao, F.M., 257 Tse, C.Y., 215, 219 Tse, P., 175 Tse, P.U., 196, 198, 202 Tsivkin, S., 260 Tucker, D.M., 259 Tucker, G.R., 241 Tuckwell, H., 155, 156 Tunney, R.J., 44, 45 Turk-Browne, N.B., 113 Turkeltaub, P., 265 Turner, R.S., 82 Tversky, B., 24–41 Tyrrell, R.A., 183 Tzur, G., 259, 260 Ueno, S., 134 Ullman, S., 127 Ulrich, R., 196 Umeno, M.M., 182 Umilta, C., 126, 165 Ungerleider, L.G., 113, 134, 135, 138, 199 Usher, M., 155 Vaadia, E., 88 Vaitilingam, L., 207–21 Vandemoorteele, P.F., 234 Vandenberghe, M., 44, 45 Van Essen, D.C., 111 van Leeuwen, C., 24–41 Vaquero, J.M.M., 44 Varela, J.A., 120 Vaughan, H.G., 215 Vaughan, J., 108 Velazco, M.I., 133 Verstraten, F.A., 196 Vicari, S., 257 Vidal, F., 225, 230, 231, 234 Vigorito, J., 256 Voelker, P.M., 260 Vuilleumier, P., 94, 133, 135, 136, 138 Vymazal, J., 234 Vythilingam, M., 139, 142 Wackermann, J., 234 Walker, M.F., 182
Walsh, V., 175, 180, 197 Walter, W.G., 225, 226 Wang, X., 152, 156, 157, 250 Wang, X.J., 121 Ward, L.M., 105 Ward, R., 158 Warren, D.H., 40 Wassef, A.A., 203 Watanabe, M., 196, 197 Watson, C.S., 218 Watson, D., 152, 154, 155, 158, 159 Watson, D.G., 157, 159, 160 Wayne, M.C., 40 Wearden, J.H., 180 Welcome, S.E., 241 Werker, J.F., 257 West, S.O., 40 Whalen, P.J., 197 Whetter, E., 220 Whiteley, L., 181, 182, 188, 190, 191, 193 Whitteridge, D., 157 Wichmann, H., 218 Wickens, C.D., 25 Wiggs, C.L., 113, 199 Wijewickrama, H.S., 202 Wildbur, D., 40 Wilkinson, L., 43, 45, 47, 52 Williams, L.M., 134 Williams, M.A., 137 Willingham, D.T., 56 Willson-Morris, M., 17 Wilson, A.J., 259 Wilson, C.J., 79, 88 Wilson, F., 155 Wilson, F.A., 111 Wilson, P.N., 40 Wimmer, H., 263 Winkler, I., 216 Winter, A.L., 225, 226 Wise, S.P., 56 Wojciulik, E., 134, 165 Wolfe, J.E., 158 Wong, P.S., 135 Wood, F.B., 246 Wood, H.M., 246 Woodley, S.J., 104 Wright, R.D., 105 Wurtz, P., 45
Wurtz, R.H., 181, 182 Wynn, K., 259 Xiang, J.Z, 111 Xu, Y., 113 Yabe, H., 217–19 Yamane, S., 134 Yang, S., 261, 265 Yang, Z., 181 Yantis, S., 105 Yap, G.S., 97 Yarrow, K., 175, 179–93, 196 Yeterian, E.H., 83 Yew, A.C., 79, 88 Young Yoon, E., 152, 159, 168 Yurgelun-Todd, D.A., 134
Zacks, J.M., 25 Zajonc, R.B., 132 Zakay, D., 175 Zakay, D.A.N., 175 Zametkin, A., 139, 142 Zametkin, A.J., 242 Zeffiro, T., 83 Zeffiro, T.A., 265 Zelaznik, H.N., 220, 221 Zeloni, G., 97 Zentall, T.R., 13 Zhang, W.T., 260 Ziegler, J.C., 265 Zihl, J., 151, 152, 154 Zimmerman, T., 45 Zipursky, R.B., 203