TIME-TO-CONTACT
Heiko Hecht and Geert J.P. Savelsbergh, Editors
Elsevier
TIME-TO-CONTACT
ADVANCES IN PSYCHOLOGY 135
Editor: G.E. STELMACH
ELSEVIER Amsterdam – Boston – Heidelberg – London – New York – Oxford – Paris – San Diego – San Francisco – Singapore – Sydney – Tokyo
TIME-TO-CONTACT
Edited by
Heiko Hecht, Department of Psychology, University of Mainz, Germany
Geert J.P. Savelsbergh, Vrije Universiteit, Amsterdam, The Netherlands
2004
ELSEVIER Amsterdam – Boston – Heidelberg – London – New York – Oxford – Paris – San Diego – San Francisco – Singapore – Sydney – Tokyo
ELSEVIER B.V., Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ELSEVIER Inc., 525 B Street, Suite 1900, San Diego, CA 92101-4495, USA
ELSEVIER Ltd, The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
ELSEVIER Ltd, 84 Theobalds Road, London WC1X 8RR, UK
© 2004 Elsevier B.V. All rights reserved. This work is protected under copyright by Elsevier B.V., and the following terms and conditions apply to its use:
Photocopying: Single photocopies of single chapters may be made for personal use as allowed by national copyright laws. Permission of the Publisher and payment of a fee is required for all other photocopying, including multiple or systematic copying, copying for advertising or promotional purposes, resale, and all forms of document delivery. Special rates are available for educational institutions that wish to make photocopies for non-profit educational classroom use.
Permissions may be sought directly from Elsevier's Rights Department in Oxford, UK: phone (+44) 1865 843830, fax (+44) 1865 853333, e-mail: [email protected]. Requests may also be completed on-line via the Elsevier homepage (http://www.elsevier.com/locate/permissions). In the USA, users may clear permissions and make payments through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA; phone: (+1) (978) 7508400, fax: (+1) (978) 7504744, and in the UK through the Copyright Licensing Agency Rapid Clearance Service (CLARCS), 90 Tottenham Court Road, London W1P 0LP, UK; phone: (+44) 20 7631 5555; fax: (+44) 20 7631 5500. Other countries may have a local reprographic rights agency for payments.
Derivative Works: Tables of contents may be reproduced for internal circulation, but permission of the Publisher is required for external resale or distribution of such material. Permission of the Publisher is required for all other derivative works, including compilations and translations.
Electronic Storage or Usage: Permission of the Publisher is required to store or use electronically any material contained in this work, including any chapter or part of a chapter.
Except as outlined above, no part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the Publisher. Address permissions requests to: Elsevier's Rights Department, at the fax and e-mail addresses noted above.
Notice: No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.
First edition 2004
Library of Congress Cataloging in Publication Data: A catalog record is available from the Library of Congress.
British Library Cataloguing in Publication Data: A catalogue record is available from the British Library.
ISBN: 0-444-51045-1
ISSN: 0166-4115 (Series)
The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).
Printed in The Netherlands.
Foreword: Time for Tau
A catcher needs to anticipate the exact moment the ball will hit the hand in order to successfully close the fingers around the ball. The time remaining before the ball reaches the catcher, the time-to-contact (TTC), is determined by its distance, speed and acceleration. Lee (1976) described mathematically that expanding optical patterns can indeed contain accurate predictive temporal information about time to contact. That is, when the relative speed between object and observer remains constant, the inverse of the relative rate of dilation of the closed optical contour generated by an approaching object in the optic array specifies the remaining time to contact of the approaching ball. This optical variable was denoted τ (tau), and its discovery initiated an incredibly large body of research into TTC judgments, which it is high time to evaluate. In particular, it is time to make contact with the different theoretical approaches underlying the τ concept. The purpose of the book is to explore and discuss the concept of TTC (not only τ!) from very different theoretical perspectives. The first chapter is intended as an introduction and to outline the structure of the book. Thereafter, the book is divided into four sections, that is, foundations of TTC, different behavioral approaches to TTC estimation, TTC as perception and strategy, and TTC and action regulation. Each section contains several chapters that provide up-to-date overviews of the respective sub-fields, critically evaluate the existing approaches and formulate alternative views. The 33 contributors to the book come from Australia, Canada, Europe, New Zealand, Japan and the United States of America. They include those with established reputations and young authors who are just beginning to make their mark. Together, they bring a wide-ranging perspective to bear on this rich and expanding field of study, and we thank them for their contribution to this project. In Chapter 1, Heiko Hecht and Geert Savelsbergh summarize the epistemological state of TTC theory. Then Barrie Frost and Hongjin Sun describe the functional biological bases of time-to-collision computation. Markus Lappe reviews the neural building blocks likely to be involved in this computation. Lucia Vaina and Franco Giulianini report an investigation of neural behavior in the face of approaching objects that disappear behind an occluder. The chapter by John Flach, Matthew Smith, Terry Stanard and Scott Dittman shows that control strategies, say in driving behind another car, are changed from a simple constant criterion to a relative criterion, say, if retinal size increases brake, if it decreases accelerate. When the approach velocity of the object is not constant, as in the case of controlled braking, the observer could make TTC estimates based on the rate of change of τ, the so-called tau-dot. Chapter 6 by George Andersen and Craig Sauer is dedicated to such situations. James Tresilian calls for abandoning
research on tau-hypotheses and recommends that researchers concentrate on how temporal control is achieved in interceptive tasks. Paulien van Hof, John van der Kamp and Geert Savelsbergh provide a developmental perspective on the control of interceptive timing. In Chapter 9, David Regan and Rob Gray make the case that physiology cannot (yet) replace or render obsolete psychophysics. They suggest that tau-like information is used when available and no other information leads to the successful action more quickly. For this tau-like information edges and textures are particularly important. The latter are the topic of Klaus Landwehr's chapter. Patricia DeLucia considers tau-like information, including expansion rate, to be merely a heuristic. The heuristics serve to accommodate limitations of sensory processes. In a similar fashion, Mary Kaiser and Walt Johnson suggest limiting the domain of τ theory. Chapter 13 by Rob Gray and David Regan clinches the case for the use of binocular information in TTC judgment. Simon Rushton takes the next step and sketches a general theory about the use of extraretinal information in interceptive timing. Chris Button and Keith Davids give an assessment of the state of the art of acoustic tau. They show that sound intensity seems to be the effective stimulus in most cases. In Part IV of the book, Geoffrey Bingham and Frank Zaal argue that τ cannot provide the online visual guidance that is required for catching at closer range. In Chapter 17, Frank Zaal and Reinoud Bootsma provide a review of the potential involvement of τ in prehension. Geoffrey Bingham then explores the perception of phase stability and reminds us that timing variables have to be understood as part of a perception-action cycle. Simone Caljouw, John van der Kamp and Geert Savelsbergh show that interceptive actions are regulated by multiple sources of information and in constant need of recalibration. Finally, Gilles Montagne, Aymar de Rugy, Martinus Buekers, Alain Durey, Gentaro Taga and Michel Laurent present a model of how τ may be involved in the regulation of locomotion. We think the book will be of great interest to researchers, teachers and students in various fundamental and applied fields, including perception, human movement sciences, kinesiology, neuroscience, and developmental and biological psychology. Our special thanks go to several anonymous reviewers for their helpful comments on the original drafts. We would also like to express our gratitude to Fiona Barren, the technical editor at Elsevier, for assisting in the preparation of this volume. Finally, we are most grateful to Petra Glaubitz who worked tirelessly copyediting and indexing the volume.
Heiko Hecht and Geert J. P. Savelsbergh Mainz and Amsterdam, October 2003
Table of contents
1. Theories of time-to-contact judgment
1
Heiko Hecht and Geert J. P. Savelsbergh
Part I: Foundations of Time-to-Contact 2. The biological bases of time-to-collision computation
13
Barrie J. Frost and Hongjin Sun
3. Building blocks for time-to-contact estimation by the brain
39
Markus Lappe
4. Predicting motion: A psychophysical study
53
Lucia M. Vaina and Franco Giulianini
5. Collisions: Getting them under control
67
John M. Flach, Matthew R. H. Smith, Terry Stanard and Scott M. Dittman
Part II: Different behavioral approaches to Time-to-Contact estimation 6. Optical information for collision detection during deceleration
93
George J. Andersen and Craig W. Sauer
7. Interceptive action: What's time-to-contact got to do with it?
109
James R. Tresilian
8. The information-based control of interceptive timing: A developmental perspective
141
Paulien van Hof, John van der Kamp and Geert J. P. Savelsbergh
9. A step by step approach to research on time-to-contact and time-to-passage
173
David Regan and Rob Gray
10. Textured Tau
229
Klaus Landwehr
Part III: Time-to-Contact as Perception and Strategy 11. Multiple sources of information influence time-to-contact judgments: Do heuristics accommodate limits in sensory and cognitive processes?
243
Patricia R. DeLucia
12. How now, broad Tau?
287
Mary K. Kaiser and Walter W. Johnson
13. The use of binocular time-to-contact information
303
Rob Gray and David Regan
14. Interception of projectiles, from when & where to where once
327
Simon K. Rushton
15. Acoustic information for timing
355
Chris Button and Keith Davids
Part IV: Time-to-Contact and Action Regulation 16. Why tau is probably not used to guide reaches
371
Geoffrey P. Bingham and Frank T. J. M. Zaal
17. The use of time-to-contact information for the initiation of hand closure in natural prehension
389
Frank T. J. M. Zaal and Reinoud J. Bootsma
18. Another timing variable composed of state variables: Phase perception and phase driven oscillators
421
Geoffrey P. Bingham
19. The fallacious assumption of time-to-contact perception in the regulation of catching and hitting
443
Simone Caljouw, John van der Kamp and Geert J. P. Savelsbergh
20. How time-to-contact is involved in the regulation of goal-directed locomotion
475
Gilles Montagne, Aymar De Rugy, Martinus Buekers, Alain Durey, Gentaro Taga and Michel Laurent
21. Subject Index
493
22. Author Index
497
List of contributors
George J. Andersen Department of Psychology, University of California, Riverside, CA, USA
Geoffrey P. Bingham Department of Psychology, Indiana University, Bloomington, IN, USA
Reinoud J. Bootsma UMR Mouvement et Perception, Université de la Méditerranée, Marseille, France
Martinus Buekers Faculty of Physiotherapy and Physical Education, Motor Learning Lab, Katholieke Universiteit Leuven, Belgium
Chris Button School of Physical Education, Sport and Leisure Studies, University of Otago, Dunedin, New Zealand, University of Edinburgh, St. Leonard's Land, Edinburgh, UK
Simone Caljouw Perceptual-Motor Control: Development, Learning and Performance, Institute for Fundamental and Clinical Human Movement Studies, Vrije Universiteit, Amsterdam, The Netherlands
Keith Davids School of Physical Education, Sport and Leisure Studies, University of Otago, Dunedin, New Zealand
Patricia R. DeLucia Texas Tech University, Lubbock, TX, USA
Scott M. Dittman Visteon Inc., Detroit, MI, USA
Alain Durey (†) Movement and Perception Lab, Faculty of Sport Science, Université de la Méditerranée, Marseille, France
John M. Flach Wright State University, Dayton, OH, USA
Barrie J. Frost Department of Psychology, Queen's University, Kingston, Ontario, Canada
Rob Gray Department of Applied Psychology, Arizona State University, Mesa, AZ, USA
Franco Giulianini Brain and Vision Research Laboratory, Department of Biomedical Engineering, Boston University, Boston, MA, USA
Heiko Hecht Institut für Psychologie, Johannes Gutenberg-Universität Mainz, Mainz, Germany Man-Vehicle Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
Walter W. Johnson NASA Ames Research Center, Moffett Field, CA, USA
Mary K. Kaiser NASA Ames Research Center, Moffett Field, CA, USA
Klaus Landwehr Psychologisches Institut, Westfälische Wilhelms-Universität Münster, Münster, Germany Bergische Universität-Gesamthochschule Wuppertal, Wuppertal, Germany
Markus Lappe Psychologisches Institut, Westfälische Wilhelms-Universität Münster, Münster, Germany
Michel Laurent Movement and Perception Lab, Faculty of Sport Science, Université de la Méditerranée, Marseille, France
Gilles Montagne Movement and Perception Lab, Faculty of Sport Science, Université de la Méditerranée, Marseille, France
David Regan Center for Vision Research, Department of Psychology BSB, York University, North York, Ontario, Canada
Aymar De Rugy Department of Kinesiology, The Pennsylvania State University, USA
Simon K. Rushton Department of Psychology, University of Cardiff, Cardiff, Wales, UK
Craig Sauer Department of Psychology, University of California, Riverside, CA, USA
Geert J. P. Savelsbergh Institute for Fundamental and Clinical Human Movement Studies, Vrije Universiteit, Amsterdam, The Netherlands Centre for Biophysical and Clinical Research into Human Movement, Manchester Metropolitan University, Manchester, UK
Matthew R. H. Smith Delphi Automotive Systems, Kokomo, IN, USA
Terry Stanard Klein Associates, Dayton, OH, USA
Hongjin Sun Department of Psychology, McMaster University, Hamilton, Ontario, Canada
Gentaro Taga Graduate School of Education, University of Tokyo & PRESTO, JST, Japan
James R. Tresilian School of Human Movement Studies, The University of Queensland, St Lucia, Australia
Lucia M. Vaina Brain and Vision Research Laboratory, Department of Biomedical Engineering, Boston University, Boston, MA, USA Harvard Medical School, Cambridge, MA, USA
John van der Kamp Institute for Fundamental and Clinical Human Movement Studies, Vrije Universiteit, Amsterdam, The Netherlands
Paulien van Hof Institute for Fundamental and Clinical Human Movement Studies, Vrije Universiteit, Amsterdam, The Netherlands
Frank T. J. M. Zaal Institute of Human Movement Sciences, Rijksuniversiteit Groningen, Groningen, The Netherlands
CHAPTER 1 Theories of Time-to-Contact Judgment
Heiko Hecht Universität Mainz, Mainz, Germany Massachusetts Institute of Technology, Cambridge, MA, USA
Geert J. P. Savelsbergh Vrije Universiteit, Amsterdam, The Netherlands Manchester Metropolitan University, UK
ABSTRACT
Tau-theory has become one of the best researched topics in perceptual psychology. A comprehensive look at its accomplishments and failures is long overdue. The current chapter provides a framework designed to help organize the vast literature on the topic. This is done by first outlining the historical roots of the concept and placing it within the context of ecological psychology, which it has long transcended. Then the theoretical significance of τ theory will be assessed. Strangely, it can be regarded as most influential and theoretically productive while being utterly wrong at the same time. We make some suggestions as to how one could further exploit the concept without being trapped by its historical baggage. Finally, we speculate on the necessary ingredients for future theorizing on time-to-contact estimation.
1. A brief history of tau
As an object approaches an observer (or vice versa) the temporal moment of its arrival is specified by optical variables. In other words, the time until the object collides with the observer or passes her eye-plane is given by the relative rate of change of the retinal extent of the object (for the collision case) or by the relative rate of change of the angle between the object and the observer's line of sight (for most passage cases). The remarkable feature of this optical specification, often referred to as tau or by the symbol (τ) typically used to denote it, lies in the fact that neither object distance nor object size is required if one were to base a time-to-contact (TTC) judgment or a time-to-passage (TTP) judgment on this optically specified information. The question became whether human observers do take advantage of this specification or whether they make separate velocity and distance estimates and combine these into a TTC judgment. The idea that the optical expansion pattern is informative about objects and events can be traced back to Gibson (1947/1982). Gibson suggested that in the case of a moving observer 'the rate of expansion of the image of any point or object is inversely proportional to the distance of that point or object from the observer' (Gibson, 1947/1982, p. 41). The often cited first derivation of τ (see e.g. Kaiser & Johnson, this volume) was published as a footnote in Fred Hoyle's (1957) science fiction novel, "The Black Cloud". The much less cited first psychological study on the use of τ (Van der Kamp, 1999) to our knowledge stems from Knowles and Carel (1958). These authors wrote the following abstract for the 66th annual convention of the American Psychological Association (August 27-September 3 in Washington): "An analysis of the geometry of head-on-collision situation shows that the relative change of visual angle per unit time determines the time remaining to collision. This experiment was designed to reveal whether observers could utilize this kind of information in the absence of other cues such as familiar size, distance, speed etc. It was shown that for periods up to about four seconds estimates were surprisingly accurate. Beyond four seconds the times were progressively underestimated." Finally, David Lee and his collaborators initiated a field of research by suggesting that observers will use τ when it is indeed optically specified. This formulation of tau-theory has become one of the most seminal theories in perceptual psychology (see Lee, 1976; Lee & Reddish, 1981; Lee, Young, Reddish, Lough & Clayton, 1983). The τ information for TTC judgments is usually referred to as local τ, as the optical information specifying TTC is available in local object parameters,
such as its retinal area, its retinal width or similar parameters. Accordingly, the τ information for TTP judgments is referred to as global τ because the object's (or the observer's) path of movement has to be determined from the global flow field. Tau is then given by the instantaneous rate of change of the retinal angle between that path and the object's center. A number of assumptions have to hold for τ to yield accurate TTC information. For instance, the object must not change shape or size as it approaches and it must approach at constant velocity. In that case, TTC is given by TTC = τ = θ / (dθ/dt), where θ is the optical variable in question (e.g. object diameter as projected onto the retina). Obviously, additional assumptions about the observer's competencies must be made to the effect that she can register the angle θ and somehow compute its derivative with respect to time. Now tau-theory states that observers are able to do so and base their TTC estimates on τ. Tau-theory further states that in order to make TTC judgments the pickup of optical τ information is necessary and sufficient. That is, other extraneous information, such as the absolute distance of the object, will not be used. It is understood that the TTC judgments in question here are meant to be meaningful, such as estimating when to initiate an arm movement to catch a ball or when to release an object that is to hit a passing target. Tau theory has inspired so many researchers and continues to do so because it has offered an approach that steered clear of behaviorist associationism as well as cognitive theorizing. After the dismissal of behaviorism, the cognitive turn (Neisser, 1967) suggested to many psychologists that visual estimations are brought about in a manner similar to other thought processes. According to rational thought or a physicist's approach, the estimation of the ingredient variables speed and distance would stand at the beginning of TTC estimation. The beauty of tau-theory lies in the fact that these initial estimations are entirely superfluous. The direct availability of TTC through τ elegantly circumnavigates the time-consuming computations that would be required when using the physicist's approach. In this sense, τ theory falls within the domain of ecological psychology as laid out by James Gibson (1979). Tau is an informational invariant to which the visual system has direct access, presumably by means of dedicated neural circuitry. Depending on the makeup of this circuitry, what may look complex and intensely computational to the physicist may be very easy for the dedicated processor. The visual system has accordingly been likened to a smart perceptual device, such as a speedometer or a polar planimeter (Runeson, 1977). Their makeup produces a complex quantity as output although the device is simple. And if the nature of the perceptual system is making use of "smart" devices then research should be geared to finding those invariants that are directly accessed by the smart device. Tau has become the prime example of an invariant that fulfills these criteria.
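The optical specification can be made concrete with a back-of-the-envelope check (a minimal illustrative sketch, not material from any chapter in this volume; the object size, speed and distance below are invented). Under the small-angle approximation θ ≈ s/Z for an object of size s at distance Z approaching at constant speed v, dθ/dt = s·v/Z², so θ/(dθ/dt) = Z/v, the true TTC, even though neither s, Z nor v is estimated separately:

```python
# Minimal illustrative sketch (not from any chapter in this volume): check
# numerically that theta / (d theta / dt) recovers TTC for a constant-velocity
# approach, without the estimate itself using size, distance or speed.
import math

def optical_angle(size, distance):
    """Full visual angle (radians) subtended by an object of a given size."""
    return 2.0 * math.atan(size / (2.0 * distance))

size = 0.22        # ball diameter in metres (invented value)
speed = 10.0       # approach speed in m/s (invented value)
distance = 15.0    # current distance in metres (invented value)
dt = 0.001         # small time step for the numerical derivative

theta = optical_angle(size, distance)
theta_later = optical_angle(size, distance - speed * dt)
theta_dot = (theta_later - theta) / dt

tau = theta / theta_dot          # tau = theta / (d theta / dt)
print(f"tau-based TTC estimate: {tau:.3f} s")
print(f"true TTC, Z / v:        {distance / speed:.3f} s")
```

Running the sketch prints essentially identical values for the τ-based estimate and Z/v.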
The fact that tau-theory emerged at a time when the ecological approach was still in its infancy and its proponents mainly occupied with honing their critique of cognitivist and computational theories, has made for a strange research strategy on the part of ecological psychologists. Rather than testing tau-theory for its own sake, the programme was to prop up the prime example for the validity of the ecological approach. Thus, τ became a test case for direct perception before it had been properly tested itself. Research results were amassed that showed how timed actions were rather accurate when assessed within an ecological context. The good performance was taken as proof for tau-theory. This verificationist attitude, as understandable as it is, has delayed such obvious things as a rigorous psychophysical evaluation of TTC estimation until fairly recently (see e.g. Regan & Hamstra, 1993).
2. The tau hypothesis and its falsification
In essence, τ theory proper has to be considered as falsified. It is irreconcilable with empirical data in a twofold manner, because sometimes performance is worse than predicted and at other instances performance is surprisingly good even in the absence of τ information. Recent work on TTC estimation has revealed that when τ information is available, observers often fail to use it properly. For instance, absolute size of the object, rotation, and contrast all interfere with estimation accuracy when facing a clearly visible target (e.g., DeLucia, 1999; Smith, Flach, Dittman & Stanard, 2001). That is, the visual system often is unable to isolate τ. On the other hand, judgments are remarkably robust when invariant information is no longer available. For instance, when dramatically reducing the density of an optic flow field consisting of single-pixel dots, performance is affected little, suggesting that some image characteristics that covary with τ are used as the basis for TTC judgments (Kerzel, Hecht & Kim, 1999). Observers seem to adjust to many disturbances, which should not be surprising because they are typical rather than exceptional for ecological contexts. That is, the constancy assumptions that are part and parcel of τ theory are more often violated than not. A theory of TTC estimation thus has to explain how such estimates are made in the face of non-rigid size changes during approach, in the face of orientation change of non-spherical objects as well as velocity variations. All of these would fool a τ processor that is based on retinal edge distance. Note, however, that for all practical purposes more than one τ parameter is specified. For instance, the retinal width of a vertical brick rotating around its center (pitch axis) that is approaching head-on at constant velocity would yield accurate τ information, while its retinal length and area would lead to erroneous TTC estimates. Depending on rotation speed, length changes could be nulled or
exaggerated. In terms of tau-theory, then, what happens if several different TTCs are specified simultaneously? As long as the theory does not state upon which of the taus the TTC estimate is based, performance can likely be attributed to one of the many taus that are present in ecological situations.
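How strongly simultaneously specified taus can disagree is easy to illustrate with a toy simulation (our own sketch, with invented parameter values; it is not taken from any of the chapters). For a rectangle approaching at constant speed while rotating about its pitch axis, τ computed from retinal width tracks the true TTC, whereas τ computed from the foreshortened retinal length can be wildly wrong; with the values below it is even negative, momentarily specifying a receding object:

```python
# Illustrative sketch (not from the chapter): conflicting tau values for a
# rectangle that approaches at constant speed while rotating about its pitch
# axis. Width is unaffected by the rotation, so width-based tau is veridical;
# the foreshortened length yields a grossly wrong, here even negative, "TTC".
# All parameter values are invented.
import math

width, length = 0.10, 0.40   # physical size in metres
speed = 5.0                  # approach speed in m/s
omega = 2.0                  # rotation rate in rad/s
distance0 = 10.0             # starting distance in metres
t, dt = 0.5, 0.001           # evaluation time and step for derivatives

def retinal_extents(time):
    z = distance0 - speed * time
    theta_w = width / z                                   # retinal width
    theta_l = length * abs(math.cos(omega * time)) / z    # foreshortened length
    return theta_w, theta_l

(w0, l0), (w1, l1) = retinal_extents(t), retinal_extents(t + dt)
tau_width = w0 / ((w1 - w0) / dt)
tau_length = l0 / ((l1 - l0) / dt)

print(f"true TTC:        {(distance0 - speed * t) / speed:.2f} s")
print(f"tau from width:  {tau_width:.2f} s")
print(f"tau from length: {tau_length:.2f} s")
```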
3. Tau's significance within and outside ecological psychology
In sum, tau-theory can be regarded as most influential, especially within the framework known as ecological psychology. Gibson's (1979) ecological approach to perception is also known as the direct perception perspective. The word 'direct' refers to the fact that objects, places and events in the environment can be perceived without the need of cognitive mediation to make perception meaningful. The importance of the ecological psychology contribution lies in its emphasis on the nature of the information which forms the basis for perception and action. Gibson argued that information is embedded in the optic array, that is, the pattern of light coming from all directions of the environment to a point of observation. Changes in and persistence of patterns in the optic array can be shown to specify the environment. When such information is detected, objects, places and events can be perceived without the need for 'cognitive' processing. Moreover, these optical patterns are determined not only by objects, places and events in the environment, but also by movements of the observer. Examples of structured optical patterns are texture gradients, occlusion patterns (decretion and accretion of texture), motion perspective, and focus of expansion and rate of expansion. Within this perspective, information implies specificity, that is, the information is lawfully related to its source (e.g. objects, events) such that no other source could have generated that particular pattern (Burton & Turvey, 1990). Moreover, not only is the relation between the source and information specific, but so is the relation between information and perception. In other words, the perception of dynamic events entails the detection of a single information variable specifying the event (Michaels & de Vries, 1998). Tau is an informational variable which seems to pass this ecological exam, and it has therefore become very popular and incredibly important in theory building and experiments. For example, evidence that τ is actually used in the control of catching was found in the 'deflating ball experiment' (Savelsbergh, Whiting, & Bootsma, 1991; Savelsbergh, Whiting, Pijpers, & van Santvoord, 1993). Participants were required to catch a ball that was large, small or deflating during approach. The deflating ball results in a lower rate of dilation specifying a larger time to contact, that is, a higher tau-value. The experiments showed that the timing of the hand indeed occurred later for the deflating ball, which indicates that the timing was clearly affected. This was anticipated from a τ perspective, and thus, the deflating ball experiment provides qualitative evidence for the use of τ in
closing the hand. However, it was also found that the influence of the dilation manipulation was much more pronounced in the monocular viewing condition than in the binocular viewing condition (see also Savelsbergh, 1995). This suggests the contribution of binocular information as well. Additional empirical evidence stems from a replication of the 'deflating ball' study by Van der Kamp (1999), who used, next to deflating balls, also inflating balls. The qualitative effects were in agreement with the Savelsbergh et al. (1991) findings, namely, the opening and closing of the hand occurred later (deflating) and earlier (inflating). However, the magnitude of these effects is much smaller than would be predicted on the basis of τ (see Figure 2 of Chapter 19). It is clear that other information, such as optical size and the rate of change of optical size, covaries with τ, and may therefore also be involved. Taken together, instead of an exclusive regulation on the basis of τ, the deflating ball experiments point to the use of other information depending on the constraints of the task. Outside the domain of ecological psychology, τ has received recognition in two very different ways. First, brain scientists have been inspired by tau-theory to search for neural correlates of a τ processor without a theoretical need to find it. They have been successful with animal studies (e.g. Sun & Frost, 1998). Second, un-ecological scientists have ventured with their own theoretical and methodological backgrounds into the ecological terrain of real-world perceptual tasks. This has facilitated behavioral studies on human observers that were sympathetic to the domain but critical of the tenets of tau-theory (e.g. Rushton, this volume). Being a prime example of an invariant, τ featured surprisingly little in the debate on dynamic event perception (see Hecht, 1996). Direct perceptionists favor a direct availability of complex event characteristics through the kinematic motion information available in the optic array (Runeson & Frykholm, 1983), while others favor an information processing approach suggesting that the visual system uses flexible heuristics (Gilden, 1991). It appears that the status of τ theory can considerably strengthen or weaken the direct position. The latter position as a framework for our thinking about timed events is seriously under attack as counterevidence against tau-theory continues to grow.
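The direction of the deflating-ball effects follows directly from the optics. As a short derivation in the style of the τ formula above (a sketch of the standard geometry, not text from the chapters): for a ball of physical radius r(t) at distance Z(t) approached at speed v (so dZ/dt = −v), the optical size is θ ≈ 2r/Z and the rate of dilation is dθ/dt = 2(Z·dr/dt + r·v)/Z². The optically specified time to contact is therefore

τ = θ / (dθ/dt) = r·Z / (Z·dr/dt + r·v).

With a constant-sized ball (dr/dt = 0) this reduces to Z/v, the true TTC. A deflating ball (dr/dt < 0) shrinks the denominator, so τ specifies a longer time to contact and predicts later hand closure; an inflating ball (dr/dt > 0) specifies a shorter time and predicts earlier closure, which is exactly the qualitative pattern described above.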
4. Exhausting the concept
The above verdict that τ theory is falsified does not imply, of course, that the theory should be thrown out without considering how it could be exploited further or even be salvaged (see Holzkamp, 1972, on the exhaustion principle). We can discern four ways to augment τ theory, which is schematically represented in Figure 1.
[Figure 1 schematic: tau processor → TTC estimate → motor response]
Figure 1: Model of tau-theory. If optical τ information is available, the system automatically generates a TTC estimate exclusively based on that information. The motor response is triggered when τ has reached a given critical value. The τ processor is necessary and sufficient for TTC estimates to be generated.
First, it can be amplified to include all possible τ variables. This would be most conservative and basically chain together a number of τ processors. All that needs to be added is a rule that determines which of the parallel tau-parameters is used for TTC estimation. Second, the output of the τ processor could be interpreted and modified by an added processing step. This has been suggested by Tresilian (1995) to the effect that fast actions, such as batting in baseball, would receive little or no added processing while slower actions accommodate such processing. Other visual information, speculation or learned strategies can enter this added stage. This can explain errors in prediction motion (PM) tasks (see Figure 2).
[Figure 2 schematic: tau processor → post-processing → motor response]
Figure 2: Tau theory augmented by a cognitive post-processing step that interprets the output of the τ processor.
Finally, a complex theory of TTC could incorporate the gamut of optical variables as well as heuristical rules that compete for input into the TTC estimate, thereby pulling the ecological footing from underneath τ theory. The list of competitors can include all potential taus as well as simpler optical parameters, such as rate of expansion or retinal velocity, and non-optical information and strategies. This model is illustrated in Figure 3.
[Figure 3 schematic: optical information feeding one or more tau processors, a visual heuristic, a motor strategy and extra-retinal information, which jointly inform the TTC estimate and the motor response]
Figure 3: Alternate model: The output of the tau-processor(s) is neither necessary nor sufficient. It exists together with other units that can inform a TTC estimate. If only a coarse estimate is needed quickly, for instance to make a decision to move out of harm's way, a simple heuristic may suffice to trigger a motor response. In other cases the tau-processor may be engaged.
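As a purely illustrative sketch of what such an architecture might look like computationally (our reading of Figure 3; every function name, rule and threshold below is invented for the example and none of it is part of tau-theory itself), the competing sources can be thought of as a set of candidate estimators with a simple selection rule:

```python
# Purely illustrative sketch of the architecture in Figure 3; every function,
# rule and threshold here is invented for the example and is not part of
# tau-theory or of any chapter in this volume.
from typing import Optional

def tau_estimate(theta: float, theta_dot: float) -> Optional[float]:
    """Tau processor: TTC specified by theta / (d theta / dt), if expanding."""
    return theta / theta_dot if theta_dot > 0 else None

def expansion_heuristic(theta_dot: float, threshold: float = 0.3) -> Optional[float]:
    """Cheap visual heuristic: fast expansion is treated as 'contact imminent'
    and mapped onto a nominal short TTC, good enough to trigger evasion."""
    return 0.5 if theta_dot > threshold else None

def binocular_estimate(disparity_rate: Optional[float]) -> Optional[float]:
    """Stand-in for an extra-retinal or binocular source, when available."""
    return None if disparity_rate is None else 1.0 / disparity_rate

def ttc_judgment(theta, theta_dot, disparity_rate=None, time_pressure=False):
    """Select among candidate sources instead of relying on tau alone."""
    if time_pressure:                          # a coarse, fast answer suffices
        quick = expansion_heuristic(theta_dot)
        if quick is not None:
            return quick
    for candidate in (binocular_estimate(disparity_rate),
                      tau_estimate(theta, theta_dot)):
        if candidate is not None:
            return candidate
    return float("inf")                        # no informative source available

print(ttc_judgment(theta=0.05, theta_dot=0.02))                      # tau-based
print(ttc_judgment(theta=0.05, theta_dot=0.5, time_pressure=True))   # heuristic
```

The point of the sketch is only the structure: the τ processor is one source among several, and which source wins depends on what the task demands.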
The model sketched in Figure 3 appears quite plausible, but it may in fact be too plausible to be of value. It has pretty much covered all the bases for TTC judgments. It will only start to be productive once all τ processors are spelled out and possibly linked to some neural substrate. It also needs to be supplemented with rules stating under what circumstances extra-retinal or heuristic information is considered. Once the model is filled in in those terms it hardly deserves to be called τ theory. Rather, it will be a complex theory of TTC judgment. The current book provides the building blocks for such a theory of TTC. The following paragraph provides a framework suggesting how one might relate the building blocks.
5. The future of tau
Given the multiple ways to exhaust the concept of τ, as a research community we have to decide which strategy is best in the sense that it promises to yield the most innovative results. To do so, we need to take into consideration that the visual system may be considerably more flexible than monolithic theories can capture. The visual system may use different strategies depending on the accuracy that is required for a given action. It might use simple heuristical rules, or cheap tricks, as the utilitarian notion that the visual system is using a bag of tricks would imply (Ramachandran, 1990). However, the system appears to be smarter than that. The visual system might systematically use different strategies at different times. If this were the case, we cannot discard the use of some variable for good based upon evidence that it is not used in some particular cases. This selective use of strategies as a function of task demands has been introduced as satisficing into cognitive theorizing (Simon, 1969, 1982). In analogy, the visual system can be taken to function as an entity bounded by its capacity limitations, which create uncertainty about the future and about costs of information acquisition. It finds one solution to a perceptual problem, among a number of more or less costly solutions, that is sufficiently accurate and as easy to achieve as possible given the required accuracy level. When Regan and Gray (this volume) refer to task-relevant variables they can be interpreted to elaborate the notion of satisficing. A given optical variable that can in principle be used is in fact only used when the desired action elicits it. Only by experimentally holding constant all covariates of a given optical variable can subjects be forced to use the given variable. We need to re-evaluate existing studies under these premises. This way a large body of seemingly contradictory findings can be reconciled, such as some authors reporting excellent performance in natural settings while others find that non-tau variables, such as absolute size, retinal velocity etc. compromise TTC estimation. Likewise, we need to modify the way experiments on TTC judgment are being designed. This re-conceptualization of TTC has already begun, as many of the chapters in this volume will prove.
REFERENCES
Burton, G. & Turvey, M. T. (1990). Perceiving the lengths of rods that are held but not wielded. Ecological Psychology, 2, 295-324.
DeLucia, P. R. (1999). Size-arrival effects: The potential roles of conflicts between monocular and binocular time-to-contact information, and of computer aliasing. Perception & Psychophysics, 61, 1168-1177.
Gibson, J. J. (1947/1982). Perception and judgements of aerial space and distance as potential factors in pilot selection training. In E. Reed & R. Jones (Eds.), Reasons for realism: Selected essays of James J. Gibson (pp. 29-43). Hillsdale, NJ: Lawrence Erlbaum.
Gibson, J. J. (1979). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum.
Gilden, D. L. (1991). On the origins of dynamical awareness. Psychological Review, 98, 554-568.
Hecht, H. (1996). Heuristics and invariants in dynamic event perception: Immunized concepts or non-statements? Psychonomic Bulletin and Review, 3, 61-70.
Hoyle, F. (1957). The black cloud. London: Heinemann.
Holzkamp, K. (1972). Kritische Psychologie: Vorbereitende Arbeiten. Frankfurt a. M.: Fischer Taschenbuch Verlag.
Kerzel, D., Hecht, H. & Kim, N. G. (1999). Image velocity, not tau, explains arrival-time judgments from global optical flow. Journal of Experimental Psychology: Human Perception and Performance, 25, 1540-1555.
Knowles, W. B. & Carel, W. L. (1958). Estimating time-to-collision. American Psychologist, 13, 405-406.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. & Reddish, P. E. (1981). Plummeting gannets: A paradigm of ecological optics. Nature, 293, 293-294.
Lee, D. N., Young, D. S., Reddish, P. E., Lough, S. & Clayton, T. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Michaels, C. F. & de Vries, M. (1998). Higher order and lower order variables in the visual perception of relative pulling force. Journal of Experimental Psychology: Human Perception and Performance, 24, 526-546.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Ramachandran, V. S. (1990). Interaction between motion, depth, color and form: the utilitarian theory of perception. In C. Blakemore (Ed.), Vision: Coding and efficiency (pp. 347-360). Cambridge: Cambridge University Press.
Regan, D. M. & Hamstra, S. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Research, 33, 447-462.
Runeson, S. (1977). On the possibility of "smart" perceptual mechanisms. Scandinavian Journal of Psychology, 18, 172-179.
Runeson, S. & Frykholm, G. (1983). Kinematic specification of dynamics as an informational basis for person-and-action perception: Expectation, gender recognition, and deceptive intention. Journal of Experimental Psychology: General, 112(4), 585-615.
Savelsbergh, G. J. P. (1995). Catching "Grasping tau". Human Movement Science, 14, 125-127.
Savelsbergh, G. J. P., Whiting, H. T. A. & Bootsma, R. J. (1991). 'Grasping' tau. Journal of Experimental Psychology: Human Perception and Performance, 17, 315-322.
Savelsbergh, G. J. P., Whiting, H. T. A., Pijpers, R. I. & van Santvoord, A. A. M. (1993). The visual guidance of catching. Experimental Brain Research, 93, 148-156.
Simon, H. A. (1969). The sciences of the artificial (2nd edition 1981). Cambridge, MA: MIT Press.
Simon, H. A. (1982). Models of bounded rationality. Cambridge, MA: MIT Press.
Smith, M. R., Flach, J. M., Dittman, S. M. & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception and Performance, 27, 395-410.
Sun, H. & Frost, B. J. (1998). Computation of different optical variables of looming objects in pigeon nucleus rotundus neurons. Nature Neuroscience, 1, 296-303.
Tresilian, J. R. (1995). Perceptual and cognitive processes in time-to-contact estimation: Analysis of prediction-motion and relative judgment tasks. Perception & Psychophysics, 57, 231-245.
Van der Kamp, J. (1999). The information-based regulation of interceptive timing. Nieuwegein: Digital Printing Partners Utrecht B.V.
CHAPTER 2 The Biological Bases of Time-to-Collision Computation
Barrie J. Frost Queen's University, Kingston, Ontario, Canada
Hongjin Sun McMaster University, Hamilton, Ontario, Canada
ABSTRACT
We begin the chapter by arguing that there may be several neural mechanisms that have evolved for computing time-to-collision (TTC) information as a way of controlling different classes of action. We then focus on single unit mechanisms responsible for processing the impending collision of an object moving towards a stationary observer. After discussing TTC processing in the invertebrate visual system, we describe our own work involving neurons in the pigeon nucleus rotundus that respond exclusively to visual information relating to objects that are approaching on a direct collision course, but not to visual information simulating the observer's movement towards those same stationary objects. Based on the recorded neuronal responses to various manipulations of the stimuli, we classify these looming sensitive neurons into three different types of looming detectors, distinguished by the timing of their responses relative to the moment of collision. We also describe quantitative models for these looming detectors as a way of explaining their physiological response properties.
1. Introduction
Information about the time-to-collision or time-to-contact (TTC) has important consequences for the survival of countless species and for their skilled interaction with both the inanimate and animate objects in their environments. As a consequence it appears very probable that there may be several neural mechanisms that have evolved to compute TTC information to control different classes of action, and even different mechanisms within the same animal for different functions. For example, it appears unlikely that mechanisms that have evolved in birds for avoidance of rapidly approaching objects such as predators, where critical and rapid evasive maneuvers are required, are the same mechanisms that control pinpoint landing on branches. In the former case the motion of the approaching predator will primarily determine TTC, whereas in the latter it is only the self-motion of the animal approaching the stationary branch that determines TTC. Of course there may be many instances where both self-motion and motion of another animal determine TTC. For an excellent review that puts TTC in a much broader context the reader is referred to a paper by Cutting, Vishton and Braren (1995). One way that may help conceptualize these factors is to subdivide the primary stimulus determinants of TTC on the one hand, and the general nature of responses controlled by the information on the other, and place these in a simple 2 × 2 table as illustrated in Table 1. Here we have divided the world simply into stationary and moving objects on the vertical axis, and the behavioural responses into approach and avoidance on the horizontal axis.
Table 1: Situations requiring TTC information. Rows: source of the looming stimulus; columns: behavioural output.
Cell 1 - Self-motion towards stationary objects / Approach: insect's landing; bird's landing; human or gerbil approaching towards a target.
Cell 2 - Self-motion towards stationary objects / Avoidance: avoiding obstacles; avoiding cliffs and drop-offs.
Cell 3 - Moving objects (towards and away from the observer) / Approach: pursuit (prey capture); pursuing mates; flock formation; ball catching.
Cell 4 - Moving objects (towards and away from the observer) / Avoidance: predator avoidance; avoiding aggressive encounters.
Examples of TTC studies falling in cell 1 (Stationary objects/Approach behaviour) are the landing response of the milkweed bug, Oncopeltus fasciatus (Coggshall, 1972) and the fly (Wagner, 1982). The aerodynamic folding of gannet wings just prior to their entry into water (Lee & Reddish, 1981), birds landing on stationary perches (Lee, Davies, Green & Van der Weel, 1993), human subjects braking to avoid collision with stationary barriers (Sun & Frost, 1997), or gerbils running towards a target (Sun, Carey & Goodale, 1992) are other examples that fall into this category. Examples of behaviour falling in cell 2 (Stationary objects/Avoidance) would involve negotiating paths through a cluttered environment where obstacles have to be avoided. This might include steering around barriers, and avoiding holes or sudden drop-offs. There appear to be few studies of TTC detection in this category, but Cutting et al.'s (1995) study of path interceptions may be relevant. Prey capture by predators and pursuit chasing during mating could well satisfy entry into cell 3 (Moving objects/Approach), although not all studies of this behaviour have focused on TTC information. Ball catching behaviour, and batting in sports, seem also to be appropriate exemplars of this category. Escape from rapidly approaching predators or threatening rivals in territorial mating would be prime examples of cell 4 (Moving objects/Avoidance). Throughout the animal kingdom the sight of a rapidly approaching object almost universally signals danger and elicits an escape or avoidance response. When confronted with such a looming stimulus, the visual system must determine precisely the 3D flight path, and compute the TTC of the object, to provide the information necessary for eliciting and controlling the appropriate evasive action (Fishman & Tallaroco, 1961; Schiff et al., 1962; Schiff, 1965; Hayes & Saiff, 1967; Tronick, 1967; Bower et al., 1970; Ball & Tronick, 1971; Dill, 1974; Ingle & Shook, 1985; Yonas & Granrud, 1985). Our own work on neurons in the nucleus rotundus of pigeons clearly fits in this category because these neurons respond only to the direct collision course of approaching objects (Wang & Frost, 1992; Sun & Frost, 1998), and not to simulation of the movement of pigeons toward the same stationary objects (Sun & Frost, submitted). Also the work on locust looming detectors would fit this category because of the demonstrated elicitation of jumping and flying by the same expanding stimulus patterns that optimally excite the Lobula Giant Movement Detector (LGMD) and the Descending Contralateral Movement Detector (DCMD) neurons (Rind & Simmons, 1992, 1999; Hatsopoulos, Gabbiani & Laurent, 1995). Of course it should be remembered that the necessity to compute TTC first requires that any object or surface be indeed on a collision course if the present path of the observing or approaching animal is maintained. Gibson (1979) in his classic work on ecological optics suggests that symmetrical expansion of the images of objects specifies direct approach along a course that will
ultimately result in collision if the motion continues. The advantage of using this strategy is that TTC can be computed using monocular information alone. It is possible that, for animals with well-developed binocular stereoscopic visual systems, subpopulations of neurons that respond to stereoscopic motion directions specifying object-motion paths directly toward the animal might also be used for TTC computations. In this work and his other writings, Gibson also made the clear distinction between collisions with stationary objects occasioned by the motion of the observer (row 1 in Table 1) and other cases where it is the approaching object's motion itself that will result in collision if it continues along this path (row 2 in Table 1). In this chapter we will focus primarily on research that falls in cell 4, simply because it appears that most of the empirical studies about neural mechanisms of TTC have used stimulus arrangements that simulate events that fall into this category, that is, a rapidly approaching object on a direct collision course with the observing animal, and which might therefore require some sort of evasive action or avoidance response on the part of the animal. In the other category of object motion, where the observing animal is trying to arrange a collision with the moving object, such as the prey capture behaviour of dragonflies (Olberg, Worthington & Venator, 2000), similar processing mechanisms may occur.
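Gibson's criterion of symmetrical expansion can be illustrated with a toy projection (an illustrative sketch with made-up numbers, not material from the chapter): for a pinhole observer, an object whose path passes through the point of observation keeps a fixed image location while its image grows, whereas an object that will pass by drifts across the image as it expands.

```python
# Illustrative sketch with made-up numbers (not material from the chapter):
# for a pinhole observer at the origin, an object whose path passes through
# the point of observation keeps a fixed image position while its image
# expands, whereas a passing object drifts across the image as it expands.

def image_track(x0, z0, vx, vz, radius=0.1, steps=5, dt=0.3):
    """Sample (image position, image size) for an object on a straight path."""
    samples = []
    for k in range(steps):
        x, z = x0 + vx * k * dt, z0 + vz * k * dt
        samples.append((round(x / z, 3),         # image position ~ x / Z
                        round(radius / z, 3)))   # image size ~ r / Z
    return samples

# Direct collision course: velocity points straight at the observer.
print("collision course:", image_track(x0=1.0, z0=10.0, vx=-0.5, vz=-5.0))
# Passing trajectory: same start and approach speed, but the path misses.
print("passing object:  ", image_track(x0=1.0, z0=10.0, vx=0.0, vz=-5.0))
```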
2. TTC in the invertebrate visual system
Flying insects have long been used as model systems because they exhibit spectacular aerial performance and accomplish this with relatively simple neural computational mechanisms. Moreover, since the same neurons can be identified from animal to animal, the neural circuitry is often amenable to analysis. Two such neurons, the LGMD and DCMD in the locust, which are synaptically linked, have been shown to be selectively responsive to approaching objects (Rind & Simmons, 1992; Rind, 1997; Hatsopoulos et al., 1995). Neurons that respond to changes in depth have also been found in the optic lobes of the hawk moth, Manduca sexta (Wicklein & Strausfeld, 2000), but these may be examples of neurons computing approach and recession for the control of self-motion, in this case controlling the hovering flight in front of flowers during nectar collection, rather than for the computation of TTC. Because the DCMD neurons can be readily recorded extracellularly, have very large receptive fields, and respond well to the movement of objects, they have been studied extensively for many years. Schlotterer (1977) was the first to use approaching stimuli to show that DCMD neurons were more responsive to approaching objects than to other 2D patterns of movement. Rind and her colleagues, and Laurent and his colleagues, have extensively studied these neurons
using a variety of stimuli and confirmed that symmetrical expansion generated by an approaching stimulus object is the critical stimulus variable that optimally fires these cells. The allocation of the LGMD-DCMD neurons to cell 4 of our schema presented in Table 1 is justified by their connection to pre-motor interneurons and motor-neurons known to be involved in flying and jumping (Burrows & Rowell, 1973; Pearson et al., 1980; Simmons, 1980). This is further supported by the studies of Robertson and Johnson (1993a, 1993b), who have shown in tethered, flying locusts that approaching objects elicit a steering avoidance response when the approaching object reaches a critical angular size, thus indicating that some thresholding probably occurs in this pathway. What are the critical features of a symmetrically expanding image that these locust DCMD neurons are responding to that generate their specificity to approaching objects? From an analysis of the several possible cues available in the monocular image, Rind and her colleagues (Simmons & Rind, 1992; Rind & Simmons, 1992) have shown that these neurons do not register changes in overall luminance, since they respond in a similar manner when light objects approach as when dark objects approach, and their responses were much smaller to sudden luminance change per se. Divergence of two lines moving in opposite directions, to partially represent the opposite contours of a symmetrically expanding object, also did not adequately stimulate DCMD neurons, but increasing the number of edges in the Receptive Field (RF) and increasing the velocity of edges appeared to be the critical trigger features. Judge and Rind (1997; see also Rind & Simmons, 1999) have shown that these locust looming sensitive neurons are very tightly tuned to the direct collision course. Stimulation of the locust retina in one area suppresses the LGMD response to a second stimulus presented elsewhere in the visual field, thus indicating that there are lateral inhibitory mechanisms operating. Indeed, if the appropriate experiments were to be performed, one might well find the RF characteristics are similar to those found in the tectofugal or collicular-pulvinar pathway of vertebrates, where a directionally specific, double opponent RF organization occurs to ensure that these cells respond to moving objects, and not to the large patterns of optic flow produced by the animal's self-motion (Frost, 1978; Frost, Scilley & Wong, 1981; Frost, Cavanagh & Morgan, 1988; Sun, Zhao, Southall & Xu, 2002). From a functional point of view this also implies that the LGMD neurons might be interested in approaching objects that fall into cell 4 of our matrix, and not in the locust's approach toward stationary features in its environment. According to Rind and Simmons (1999) the specificity of the LGMD for approaching images is generated by a "critical race over the dendrites of the LGMD in the optic lobe". The two competitive forces in the race are the excitation produced by the moving edges of the expanding image, and lateral inhibition mediated by neurons in the medulla that also synapse onto the LGMD. Rind and Bramwell (1996) have produced a neural network model which seems to support
this view and have also shown through electron microscopy that the anatomical arrangement of presynaptic connections to the LGMD is compatible with this interpretation. Hatsopoulos, Gabbiani and Laurent (1995) have also investigated the LGMD of locusts, and shown that this neuron fires with an increasing rate as an object approaches, then peaks, and drops off just before collision occurs. They have shown that the responses are typically brisker for fast moving or smaller objects, but the peak firing rate does not appear to depend solely on the approach speed or object size. They describe the peak as always exhibiting a constant latency after the time at which the object reaches a fixed angular threshold size on the eye (Gabbiani et al., 1999). These authors have suggested that the behaviour of the LGMD is best described by the following equation:

f(t) = C · θ′(t) · e^(−αθ(t))     (1)

Here, θ is the visual angular subtense, C is a proportionality constant, and α is a constant. In contrast to Rind and her associates' view, these authors have suggested in recent papers (Gabbiani et al., 2001; Gabbiani et al., 2002) that the LGMD postsynaptically multiplies an excitatory and an inhibitory input via two different parts of the LGMD neuron's dendritic tree. In order to provide evidence in support of this model these authors (Gabbiani et al., 2002) have selectively activated and deactivated pre- and post-synaptic inhibition, and have found that it is post-synaptic inhibition that plays a critical role in shaping the temporal response profile of these neurons, and this indicates that the multiplication takes place within the LGMD neuron itself. These findings are noteworthy for two reasons: in the first place they show in a detailed way how these computations, which provide information about the looming object, are accomplished within the neural machinery of the LGMD and its synaptic connections, and secondly they provide one of the first pieces of clear evidence for how multiplication (and division) is accomplished in the nervous system.
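The 'fixed angular threshold' property of equation (1) can be checked with a small simulation (our own illustrative sketch, with the response latency omitted and all parameter values invented; this is not code from Gabbiani and colleagues). For an object of half-size l approaching at speed v, θ(t) = 2·arctan(l/(v·(tc − t))), and the peak of f(t) occurs when θ reaches 2·arctan(1/α), regardless of l and v:

```python
# Illustrative sketch of the eta-function model of LGMD firing (equation 1),
# with the neural response latency omitted and invented parameter values; it
# is not code from Gabbiani and colleagues. The point: the modelled firing
# rate peaks at the same angular size for very different approach conditions.
import math

ALPHA = 5.0  # free model parameter (invented value)
PREDICTED_PEAK_DEG = 2.0 * math.degrees(math.atan(1.0 / ALPHA))

def peak_angle_deg(half_size, speed, collision_time=5.0, dt=0.0005):
    """Angular size (deg) at which f(t) = theta'(t) * exp(-alpha*theta(t)) peaks."""
    best_f, best_theta = -1.0, 0.0
    t = 0.0
    while t < collision_time - dt:
        theta = 2.0 * math.atan(half_size / (speed * (collision_time - t)))
        theta_next = 2.0 * math.atan(
            half_size / (speed * (collision_time - (t + dt))))
        f = ((theta_next - theta) / dt) * math.exp(-ALPHA * theta)  # C = 1
        if f > best_f:
            best_f, best_theta = f, theta
        t += dt
    return math.degrees(best_theta)

print(f"predicted peak angle: {PREDICTED_PEAK_DEG:.1f} deg")
for half_size, speed in [(0.05, 1.0), (0.25, 2.0), (0.50, 10.0)]:
    print(f"l = {half_size:.2f} m, v = {speed:4.1f} m/s ->"
          f" peak at {peak_angle_deg(half_size, speed):.1f} deg")
```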
3. Neurons that compute tau in the pigeon brain

A number of behavioral studies have revealed that the tectofugal pathway in vertebrates might be involved in processing the visual information necessary for generating such escape or avoidance actions. Electrical stimulation experiments indicated that the anuran optic tectum is involved in triggering both prey-catching and also various kinds of avoidance behaviours (Ewert, 1984), and ablation of the optic tectum resulted in abolition of all visually guided prey-catching and visual avoidance behaviour (Bechterew, 1884, cf. Grüsser-Cornehls, 1984). Electrical or chemical stimulation of the superior colliculus in
rats also results in defensive-like reactions (Redgrave et al., 1981; Sahibzada et al., 1986; Dean et al., 1988) and is associated with large increases in blood pressure and heart rate (Keay et al., 1988). Pigeons with bilateral lesions of the optic tectum and/or the nucleus rotundus not only showed substantial impairment in intensity, colour, and pattern discrimination, but also exhibited severe deficits in visually guided orientation, escape or avoidance behaviour (Hodos & Karten, 1966; Hodos, 1969; Hodos & Bonbright, 1974; Jarvis, 1974; Bessette & Hodos, 1989). Wild rats with collicular lesions may ignore an approaching human (Blanchard et al., 1981), and similar results have been reported in hamsters and gerbils (Ellard & Goodale, 1986; Northmore et al., 1988). This evidence provides a vivid illustration of the importance of the tectofugal pathway in guiding orientation, detecting approaching objects, and generating escape or avoidance behaviours. Over the years several investigators have claimed that they have encountered cells that respond specifically to objects approaching the eye on a direct collision course. For example, as early as 1976 Grüsser and Grüsser-Cornehls (1976) and later Ewert (1984) reported that some frog and toad tectal neurons respond vigorously to stimuli moving on paths directly towards the eye. However, in these early studies many of the appropriate controls were not performed to conclusively exclude the possibility that these neurons were simply responding to some aspect of the lateral motion of an approaching stimulus. It should be remembered that as an approaching image expands there will obviously be 2D motion of the edges of the object and its textures; if the expansion is placed asymmetrically over the receptive field of a standard 2D directionally specific neuron, it could artifactually stimulate that neuron and give the false impression that it is responsive to approaching stimuli. We also had encountered cells we thought were responding to the direct approach path of moving objects in 1983, but it was only when we had extremely well-controlled stimuli, which we could systematically vary in their simulated 3D paths, that we could finally convince ourselves that these neurons were indeed coding some aspect of 3D motion. In 1992 Wang and Frost showed that some neurons located in the dorsal posterior regions of the nucleus rotundus of pigeons responded specifically to the direct approach direction of a soccer-ball pattern. Using the 3D imaging capabilities of a Silicon Graphics computer we were able to move this soccer-ball stimulus along any trajectory in 3D space. By measuring very time-consuming 3D tuning curves for these cells we were able to show that this rotundal subpopulation would only respond when the soccer-ball stimulus was on a direct collision course with the bird's head. We chose a soccer ball because the space-average mean luminance did not change as this stimulus expanded and contracted in size (especially when the object moved against a stationary background with the same texture pattern), and it provided many moving and expanding/contracting elements that might be necessary for these neurons to respond.
Earlier studies, and often some more recent ones, use a simple expanding square or circle where changes in luminance obviously occur concurrently with the expansion/contraction of stimuli, and this necessitates several other controls to rule out this variable as the major contributor to the responsiveness of the neuron. Also, other studies have not specifically performed 3D tuning curves to quantify the true directional tuning of neurons of this type. Figure 1 shows the typical 3D tuning curves of one of these rotundal neurons.
Figure 1: A. A soccer-ball-like stimulus pattern, consisting of black and white panels, was moved along simulated 3D trajectories 45° apart in spherical coordinates. The diagram illustrates the 4 planes along which stimuli were moved. B. A typical single neuron from the nucleus rotundus of pigeons exhibiting clear selectivity for a looming visual stimulus. Firing rate is plotted for the different directions of motion of the soccer-ball stimulus in 3D space. Each direction of motion was presented 5 times in a randomly interleaved sequence, and the values plotted represent the mean peak firing rate for each 3D direction. Note that in the standard X-Y (tangent screen plane, or Azimuth = 90°) plot, there is no indication of directional preference, and firing rate is quite low. However, for the 0° azimuthal plane (Z-axis) there is a strong preference for stimuli directly approaching the bird (0°). Polar tuning plots for directions specifying the 45° and 135° azimuthal planes likewise show no strong preference for any direction. Thus it is only the direct collision course or looming direction that produces an increased response in these neurons, and this pattern of activity was typical of the 27 neurons studied in the dorsal posterior area of the nucleus. (© Wang and Frost, 1992, Nature).
Even with 26 directions, 45 degrees apart, these are relatively crude tuning curves, so in a few cases we have used a much narrower range of directions after having first performed the broad 3D tuning, and found them to be very tightly tuned indeed (Sun & Frost, submitted). In fact the half-width at half-height of detailed tuning curves like those illustrated in Figure 2 is about 4 degrees, where the rotation is around a point halfway along the simulated 15 meter path taken by the approaching stimulus of 30 cm in size.
[Figure 2 appears here; x-axis: azimuthal direction of object motion (degrees).]
Figure 2: Fine tuning of two cells located in the pigeon dorsal nucleus rotundus. These cells were first presented with a soccer-ball stimulus that moved in 26 directions 45 degrees apart in 3D space, and they only responded to the direct collision course direction. The graphs shown here are the fine-grained tuning curves and show that when the soccer ball, which traveled along a simulated path of 15 meters, was rotated by small amounts, each time passing through the center of the path, the cells reduced their firing substantially. A few degrees of rotation of the path mean that the soccer ball would now travel on a "near miss" trajectory and not collide with the bird.
In simple terms, this means that a stimulus that depicts a very "near miss" of the bird's head will only fire the neurons minimally, and one that is a clear "miss" will not evoke any response at all. Perhaps the most important defining characteristic of these neurons' responses, in addition to their sensitivity to the direct collision course direction, was the constant time before collision at which they began to fire, irrespective of the size of the simulated approaching soccer-ball, or of its approach velocity (Wang & Frost, 1992). This indicated to us that these neurons might well be computing the optical variable tau that had been suggested by Lee (1976) to provide important information about the TTC with approaching objects.
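This size and speed invariance is exactly what the tau variable predicts. As a brief reminder (a standard small-angle derivation in the spirit of Lee, 1976; the notation is ours and is not taken from this chapter), for an object of physical size S at distance D(t) approaching at constant speed v:

```latex
% theta/theta' approximates time-to-contact independently of S and v
\begin{aligned}
\theta(t) &\approx \frac{S}{D(t)}, &
\dot{\theta}(t) &\approx \frac{S\,v}{D(t)^{2}}, \qquad v = -\dot{D}(t),\\[4pt]
\tau(t) &= \frac{\theta(t)}{\dot{\theta}(t)} \approx \frac{D(t)}{v} = \mathrm{TTC}(t).
\end{aligned}
```

A neuron that begins to fire when τ(t) drops below a fixed value therefore begins to fire at a fixed time before contact, whatever the object's size or speed.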
We also found that the population of neurons exhibiting these characteristics showed a variety of times before collision at which they began to respond. This can be seen in Figure 3A.

[Figure 3 appears here: panel A, "Distribution of Tc values"; panel B, "Single-cell variability"; x-axis, Tc from 800 to 1400 ms.]
Figure 3: A. Distribution of response onset times for 27 looming cells from the dorsal posterior zone of nucleus rotundus of pigeons. Although different cells exhibit different values of TTC, individual cells (B) show remarkable consistency even when the velocity or size of the stimulus is varied.
But for a particular neuron, the response onset was remarkably constant (see Figure 3B) across repeated trials and with stimuli of different sizes and velocities. This variation across the population is precisely what is needed if other factors, such as recognition of what the incoming object is, are to jointly influence the time selected to perform escape responses of different sorts, each of which may have characteristic time requirements for its optimal deployment. Clearly the characteristics of these rotundal neurons suggested to us that they might be involved, among other things, in predator avoidance and thus fall into cell 4 of our classification system shown in Table 1.
To provide some evidence for this we tested a few birds under a lightened anesthesia at the conclusion of their recording session. Under deep anesthesia no electromyographic signals (EMGs) are obtained from animals, but as the anesthetic is lightened, and clearly before any pain stimuli can be experienced, it is possible to obtain good clear muscle responses. These responses do not result in any overt movement of the animal, but can be very useful in indicating what major muscle groups might be involved in a response system normally associated with a stimulus. In this case we recorded from the large pectoralis muscles that power the wings for flight. When we presented the approaching soccer-ball stimulus under these conditions we found that first the tau cells responded with their characteristic maintained burst of firing, then the pectoralis EMGs occurred some 200 milliseconds later, and then finally the heart rate rose more slowly, to levels near 300% of the resting rate. These responses again were incredibly specific, and only occurred when the soccer-ball was on a direct collision course with the bird. Near misses and directions 180 degrees away produced no increase in EMGs or heart rate. Data typical of these experimental findings can be seen in Figure 4. Although only correlative, we feel this constellation of activity in these tau neurons, together with the increased wing EMG and heart rate, is indicative of a flight response elicited by the rapidly approaching ball. In a more detailed recent study, Sun and Frost (1998) have again confirmed the presence of a population of neurons in nucleus rotundus of pigeons that only respond when the soccer-ball stimulus is on a direct collision course with the bird's head. Additionally, we found that these neurons only responded when our computer simulated the approach of a moving object towards the bird (stimuli falling into cell 4 of the matrix), and not when the complex stimulus pattern was configured to simulate the bird moving towards the same stationary soccer-ball (stimuli falling into cell 1, or possibly 2, of the matrix) (Sun & Frost, submitted).
[Figure 4 appears here: panels A and B, showing the visual response, pectoralis EMG, and heart rate traces.]
Figure 4: Heart rate and pectoralis muscle EMGs recorded concurrently with the single-cell response rate from a looming-selective rotundal neuron. Note that the "looming cell" begins firing first, then the muscle response occurs, and then heart rate increases dramatically when the soccer-ball looms toward the bird (A). B. No responses occur when the ball moves along the same path but in the opposite direction, directly away from the bird. Bars under the visual response histograms indicate the duration of the visual stimulus. Data collection for the looming-selective neuron was terminated with stimulus offset. The neuronal data and EMGs represent the summed activity over 5 sweeps of the stimulus, whereas the heart rate data represent the means and standard errors for the same 5 sweeps. Simulated size of stimulus was 30 cm, path length 15 m, and velocity 375 cm/s. (© Wang and Frost, 1992, Nature).
To do this we placed the soccer ball against a stationary background, which consisted of a checkerboard pattern. When the soccer-ball was moved in a trajectory towards the bird (symmetrical expansion) while the background remained stationary, the neurons responded in the typical way and identically to the case where no background was present. However, when the background was moved along the same trajectory as the soccer-ball, so as to show a similar but delayed expanding pattern, the neurons did not respond. This latter configuration formed the precise simulation of the bird approaching a stationary soccer-ball that remained a constant distance in front of the background "brick wall". The stimulus conditions simulating a soccer ball approaching the bird, and the bird approaching a stationary soccer ball are shown in Figure 5.
[Figure 5 appears here: rows showing the object against a white stationary background, against a stationary checkerboard background, and with the background approaching together with the object; columns show the image change on the screen at Time 1, Time 2, and Time 3.]
Figure 5: Schematic diagram of stimulus conditions presented to pigeons. The left portion of the figure illustrates the kind of object movement and its background in the simulated display, while the right portion represents the image change on the screen for the corresponding simulation illustrated on the left. In A, the soccer ball object is presented against a blank background, and the "path" simulates direct approach toward the bird. In B, the same looming object is presented against a stationary textured checkerboard background. In C, both object and textured background move (at the same speed) toward the bird, which simulates the bird's self-motion toward a stationary soccer ball. Note that the expansion pattern of motion of the object is identical in the three conditions. (© modified after Sun and Frost, 1998, Chapter).
It must be emphasized that the expansion pattern of the soccer-ball was identical in these two cases of moving-object and moving-bird simulations, yet in the former the cells responded vigorously, while in the latter they were essentially silent. Figure 6 shows the responses of a tau neuron to an approaching soccer-ball, and also to a simulation of the bird approaching a stationary soccer-ball where the rate of expansion is identical in both cases.
[Figure 6 appears here: PSTHs for conditions A, B, and C for object sizes of 10, 20, 30, 40, and 50 cm; x-axis, 3 to 0 s before contact.]
Figure 6: Comparison of the response Peri-Stimulus Time Histograms (PSTHs) of a single tau neuron for a series of stimuli (soccer-ball) of varying sizes swept along the direct collision course towards the bird. Responses are the sum of 5 sweeps and are referenced to time zero, which is the time when the stimulus would have contacted the bird. The looming object was presented against a white non-textured background (A), a stationary textured checkerboard background (object motion) (B), and a looming background moving at the same speed behind the object (C), as shown in Figure 5. The latter condition simulated the approach of the animal toward the stationary ball and background (self-induced motion). Responses to the looming object were similar against the blank background and the stationary checkerboard background, and the magnitude of the responses (maximal firing rate) was similar across different object sizes. Note that the neuron did not fire to the self-motion display, even though the soccer ball's image was expanding in the same way in B and C. This implies that this tau neuron is exclusively selective for "object motion in depth". The simulated path for the object was 15 m in length and the simulated object size varied from a diameter of 10 cm to 50 cm. Velocity was 500 cm/s. (© modified after Sun and Frost, 1998, Chapter).
We have also found that not all of the neurons in nucleus rotundus appear to be computing the tau function (Sun & Frost, 1998). Histological examination indicated that these neurons were distributed over a larger anatomical region (dorsal rotundus) than the dorsal posterior rotundus of our earlier study (Wang & Frost, 1992). In fact, half of the neurons seem to be clearly responding in this fashion, that is, they suddenly start firing at a particular and constant time before the collision event and maintain this high firing rate throughout the remainder of the approach sequence. Roughly a quarter of the neurons that show selectivity to an approaching object show a response that begins earlier for larger objects, or for soccer-ball stimuli approaching at slower velocities. Through detailed mathematical arguments and quantification of the timing of the response, Sun and Frost (1998) show that these neurons are computing the rate of expansion (ROE), rho, of the approaching object. Finally, the remaining quarter of the looming neurons appear to be computing the very same function which best describes the locust looming detector (Hatsopoulos et al., 1995; Gabbiani et al., 1999). An example of the response pattern of each of these three classes of neuron is shown in Figure 7. Sun and Frost (1998) show that on several multidimensional plots these three classes of neurons form very distinct and tight clusters, which indicates that there is not some simple underlying continuum that we have arbitrarily divided into three separate groups, but that these are genuine classes of neurons, each computing one of the following three functions.
(1) Rho:   \rho(t) = \theta'(t)    (2)

(2) Tau:   \tau(t) = \theta(t) / \theta'(t)    (3)

(3) Eta:   \eta(t) = C \, \theta'(t) \, e^{-\alpha \theta(t)}    (4)
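To make the different timing signatures of these three functions concrete, the sketch below (our own illustration under simplifying assumptions, not the authors' analysis; the thresholds, α, and the object sizes and speeds are arbitrary example values) evaluates ρ, 1/τ, and η for constant-velocity approaches and compares when simple threshold units driven by each of them would begin to respond.

```python
# A sketch (our own illustration with arbitrary thresholds, alpha, object sizes and
# speeds; not the authors' analysis) of how threshold units driven by rho, 1/tau and
# eta differ in the timing of their responses during constant-velocity approaches.
import numpy as np

def approach(half_size, speed, dt=0.001, t_max=5.0):
    T = np.arange(t_max, dt, -dt)                      # time to contact (s)
    theta = 2 * np.arctan(half_size / (speed * T))     # optical angle (rad)
    theta_dot = np.gradient(theta, -T)                 # expansion rate (rad/s)
    return T, theta, theta_dot

def onset(T, signal, threshold):
    """Time to contact at which the signal first exceeds the threshold."""
    return T[np.argmax(signal > threshold)]

alpha, C = 3.0, 1.0
for half_size, speed in [(0.15, 5.0), (0.25, 5.0), (0.15, 2.5)]:
    T, th, thd = approach(half_size, speed)
    rho, inv_tau, eta = thd, thd / th, C * thd * np.exp(-alpha * th)
    print(f"half-size {half_size} m, speed {speed} m/s: "
          f"rho unit onset {onset(T, rho, 0.5)*1000:4.0f} ms, "
          f"tau unit onset {onset(T, inv_tau, 1.25)*1000:4.0f} ms, "
          f"eta peak {T[np.argmax(eta)]*1000:4.0f} ms before contact")
# The unit thresholding 1/tau starts firing at roughly the same time before contact
# (about 800 ms for a threshold of 1.25 per second) for every size and speed, whereas
# the rho unit and the eta peak occur earlier for larger or slower objects, matching
# the timing signatures used to distinguish the three cell classes.
```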
[Figure 7 appears here: PSTHs for three example neurons, a (τ), b (ρ), and c (η), for objects of different sizes (A) and approach velocities (B); x-axis, time to collision.]
Figure 7: Based on the differences in the time course of the neuronal responses relative to the moment of collision, the looming-sensitive neurons in nucleus rotundus have been classified into three distinct classes. This figure shows the response pattern (PSTHs) of a typical neuron from each of the three classes (neurons a, b, and c for tau, rho, and eta, respectively) to a series of stimuli (a simulated moving sphere with a soccer-ball pattern) of varying sizes (A) and of varying velocities (B), moving along the direct collision course path toward the bird. Responses are the sum of 5 trials and are referenced to time zero, which is the time when the stimulus would have collided with the bird. The simulated path was 15 m in length. In (A) the velocity for neuron b was 375 cm/s and for neurons a and c was 500 cm/s. In (B), object size was 30 cm for all three neurons. Note that for neuron a the timing of the response remains invariant despite substantial changes in size and velocity, whereas for neuron b and neuron c the timing depends on object size and velocity, with larger or slower objects evoking an earlier response. (© Sun and Frost, 1998, Nature Neurosci).
An example of such clustering is shown in Figure 8 taken from Sun and Frost (1998).
[Figure 8 appears here: scatter plot with legend TAU, ROE, and ETA neurons; y-axis, drop-off in firing rate at collision (%); x-axis, variance in response onset Tc.]
Figure 8: Quantitative examination of the timing of the response for the population of nucleus rotundus looming-sensitive neurons when presented with approaching objects that varied in size or velocity. The variance (standard deviation) of Tc onset is plotted along the x axis, and the average drop-off in firing rate at the time of collision, relative to the response peak (%), is plotted along the y axis. The data points are clustered in three separate regions; therefore this population of neurons can be classified into three distinct groups (tau neurons, rho neurons, and eta neurons). (© Sun and Frost, 1998, Nature Neurosci).
What is rather amazing about these findings is that all looming cells are accounted for. Each rotundal neuron that responds specifically to an object approaching on a collision course with the bird is either a rho neuron, a tau neuron or an eta neuron. Usually in single unit recording studies there are "junk" or "intermediate" categories required for neurons that don't seem to fit the major classifications that apply to the other cells. But here we have no residual cells to account for!
The function of the tau neurons is quite clear; they can provide the animal with useful information about the TTC of an object that is approaching on a direct collision course. The rho cells obviously compute the rate of expansion information that is required to compute tau, that is, the denominator in the tau equation. The eta function also contains a rate of expansion term, so rho neurons could provide essential input into this computation as well. Indeed, the multiplication and division required in these three equations that describe the total set of looming detectors we have found in the pigeon nucleus rotundus may well be performed by biophysical operations similar to those described in the recent papers by Peña and Konishi (2001) and Gabbiani et al. (2002). In this latter paper Gabbiani et al. (2002) suggest that the eta operation is implemented within single LGMD neurons by exponentiation of the sum of a positive excitatory postsynaptic potential representing the logarithm of angular velocity and a negative postsynaptic potential representing angular size.
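Written out (this is simply the algebraic identity behind that proposal, in the notation of equation (4)), the scheme works because exponentiating a sum turns it into a product:

```latex
% exponentiation of (log angular velocity - alpha * angular size) yields eta
\eta(t) \;=\; C\,\theta'(t)\,e^{-\alpha\,\theta(t)}
\;=\; C\,\exp\!\bigl(\log\theta'(t)\;-\;\alpha\,\theta(t)\bigr),
```

so a dendritic sum of a log-encoded excitatory term and a size-proportional inhibitory term, passed through an approximately exponential spike-generating nonlinearity, multiplies the two quantities without requiring an explicit multiplication stage.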
4. Model

We have developed neuronal models to explain the physiological response properties of our pigeon looming-sensitive neurons. These models were created on the basis of the physiological responses (both qualitative and quantitative data) recorded under various stimulus conditions (including direct manipulation of the various optical variables that could be specified by a looming object). The models also take into account the physiological response properties and anatomical connections of the optic tectum, which sends a major input to nucleus rotundus. For these rotundal looming detectors, the RF could be composed of a radial arrangement of concentric arrays of RFs of simple local motion detectors (possibly tectal neurons), with the centre of expansion overlapping the centre of the radial RF layout. These tectal neurons would respond to movements that are oriented radially from the centre of the concentric array, and their outputs would then converge onto the rotundal looming detectors. This arrangement is shown in Figure 9.
[Figure 9 ("General Model") appears here: the RF of a rotundal neuron is shown as concentric rings of RFs of individual tectal neurons.]
Figure 9: General model of the receptive field (RF) organization of rotundal looming-sensitive neurons. The RF is composed of a number of RF subunits, each of which corresponds to the RF of a tectal local-motion-sensitive neuron. These tectal RFs are arranged on the circumference of a series of concentric circles (or rings) with different radii. The converging input from each concentric ring of RF subunits enables both spatial and temporal summation to signal symmetrical image expansion. Note that this general model explains qualitatively the vigorous firing of rotundal looming neurons to symmetrical image expansion.
The spatial and temporal summation of the activity of these small RFs could provide a basis for the rotundal neurons' large receptive field size and their strong directional selectivity for motion along the direct collision course (indicated by symmetrical expansion from a stationary centre). The opponent-direction centre-surround organization of these tectal units (Frost & Nakayama, 1983) could contribute to the silence of these rotundal neurons to stimulation with a self-motion display, in which local segments of the background move in the same direction as the object in the adjacent region across the object boundary. A series of quantitative models were also formulated to account for the specific properties of the time course of the rotundal neurons' response to impending collision. When an object approaches the eye, the visual angular subtense θ(t) and the rate of change of visual angle θ'(t) form the basic building blocks for the
calculation of those optical variables that could signal impending collision. For example, TTC can be signaled by the optic variable tau, whose reciprocal is specified by the ratio of θ'(t) over θ(t) (equal to 1/tau). One way to generate this ratio is for two sets of neurons to respond to the instantaneous values of θ'(t) and θ(t) individually, and then to generate the ratio through neuronal interaction. Alternatively, if the instantaneous value of θ'(t) can be encoded (perhaps through the velocity selectivity of local motion detectors, e.g. tectal neurons), then the visual angle θ(t) could be registered through the spatial location of the RF of the responding local motion detector relative to the centre of the concentric array of RFs of these local motion detectors. With the centre of image expansion overlapping the centre of such a concentric array of RFs, a fixed ratio of θ'(t) over θ(t) could be hardwired in the brain to create a threshold for the optical variable 1/tau, so that rotundal neurons would only start to fire when this predefined ratio increases to a fixed level. Consequently, such neurons would only fire when TTC decreases to a threshold level as the object approaches (see Figure 10).
[Figure 10 appears here: panels A and B; in B the preferred velocity of each subunit is plotted against the radius of its circle of subunits.]
Figure 10: Computation of tau. In this model, the rate of expansion can be signalled by the spatial summation of a ring of subunits, each with a certain local velocity selectivity (indicated by the size of the arrows in A). The instantaneous visual angle could at the same time be registered by the distance of this ring of RFs from the centre of the circle (the radius of expansion). To calculate tau, the preferred velocity of each local tectal unit should be linearly related to the radius of the concentric ring of RF subunits located at the same distance (shown in B).
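The sketch below is a highly simplified simulation of this ring scheme (our own illustration, not the authors' implementation; the hard-wired ratio K, the ring radii, and the object sizes and speeds are arbitrary example values, and each subunit is idealized as a point detector with a radius-proportional speed threshold). Its output switches on at approximately the same time before contact for objects of different size and speed, as the tau cells do.

```python
# A highly simplified simulation of the ring scheme just described (our own sketch,
# not the authors' implementation). Each subunit ring has a speed threshold that
# grows linearly with its radius (K * radius); the edge of the looming image drives
# only the ring it is currently crossing, so the summed output switches on once the
# local edge speed exceeds K times the current image radius, i.e. once 1/tau > K.
import numpy as np

K = 1.25                                        # hard-wired ratio (1/s): onset at tau < 0.8 s
RING_RADII = np.linspace(0.005, 0.5, 500)       # angular radii of the subunit rings (rad)

def rotundal_output(theta, theta_dot):
    """1 if the subunit ring currently crossed by the edge is driven above its
    radius-proportional speed threshold, 0 otherwise."""
    r, r_dot = theta / 2.0, theta_dot / 2.0     # image radius (rad) and edge speed (rad/s)
    if r > RING_RADII[-1]:
        return 0.0                              # edge has left the modelled field
    nearest = np.argmin(np.abs(RING_RADII - r)) # ring whose RF the edge is crossing
    return float(r_dot > K * RING_RADII[nearest])

dt = 0.001
for half_size, speed in [(0.15, 5.0), (0.25, 2.5)]:
    T = np.arange(4.0, dt, -dt)                 # time to contact (s)
    theta = 2 * np.arctan(half_size / (speed * T))
    theta_dot = np.gradient(theta, -T)
    out = np.array([rotundal_output(a, b) for a, b in zip(theta, theta_dot)])
    print(f"half-size {half_size} m at {speed} m/s: output switches on "
          f"{T[np.argmax(out > 0)]*1000:.0f} ms before contact")
# Both objects switch the unit on close to 1/K = 800 ms before contact, despite their
# different sizes, speeds and raw expansion rates, mimicking the constant-TTC onset
# of the tau cells (small deviations reflect the coarse ring spacing used here).
```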
A similar quantitative model of the other two types of rotundal cells has been provided in a recently submitted paper (Sun & Frost, submitted). Our models include alternative one- and two-component versions in which size encoding is realized either from the intrinsic pattern of RF organization (one-component model)
or from the simultaneous coding of visual angle in spatially overlapping neurons, which then converges onto the rotundal neuron. These neuronal models provide explanations for the various physiological responses found in the different classes of rotundal looming detectors. Not only are they potentially informative for understanding the neuronal coding of motion in depth, but these models could also provide important insights for robotics and machine vision.

A neural net model

In an interesting recent paper Karanka-Ahonen, Luque-Ruiz and Lopez-Zamora (2002) present the results of a neural network employing backpropagation, implemented in Tlearn, that learned to predict the collision of objects of different sizes that moved towards it. The network consisted of 3 layers with 47 nodes: 40 were input nodes, three were hidden nodes, one was the output node, and there were three context nodes. The nodes of each layer were completely connected to those of the preceding layer, and the connections between the hidden nodes and the context nodes were one-to-one and reciprocal. After training, the network was tested with new objects that differed in size and distance from the original training set, and it was found that 75% of collisions were correctly predicted. Interestingly, the behaviour of the hidden units appeared to be very similar to some of the units we found in the pigeon nucleus rotundus, in that the output unit's predictive response came earlier for larger objects, like our eta cells.
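For readers unfamiliar with this architecture, the sketch below shows a simple recurrent network of that shape (40 input, 3 hidden, 1 output, and 3 context units holding a copy of the previous hidden state). It is our own reconstruction for illustration, not the authors' Tlearn network: the weights are random rather than trained, the toy looming input is invented, and the backpropagation training step is omitted.

```python
# A sketch of the simple recurrent architecture described above (40 input, 3 hidden,
# 1 output and 3 context units), reconstructed by us for illustration only: this is
# not the authors' Tlearn network, the weights are random rather than trained, the
# toy looming input is invented, and the backpropagation training step is omitted.
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID = 40, 3

W_in = rng.normal(0.0, 0.5, (N_HID, N_IN))     # input layer -> hidden layer (full)
w_context = rng.normal(0.0, 0.5, N_HID)        # context -> hidden, one-to-one weights
W_out = rng.normal(0.0, 0.5, (1, N_HID))       # hidden layer -> output node (full)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_sequence(frames):
    """Feed a sequence of 40-element input frames through the network; the context
    units hold a one-to-one copy of the previous hidden state (reciprocal wiring)."""
    context = np.zeros(N_HID)
    outputs = []
    for frame in frames:
        hidden = sigmoid(W_in @ frame + w_context * context)
        outputs.append(sigmoid(W_out @ hidden)[0])
        context = hidden.copy()
    return np.array(outputs)

def looming_frames(half_size, speed, n_steps=60, t_max=3.0):
    """Toy 'retina' of 40 units on which the image of an approaching object grows."""
    T = np.linspace(t_max, 0.05, n_steps)                  # time to contact
    radius = np.clip(20.0 * half_size / (speed * T), 0, 20)
    frames = np.zeros((n_steps, N_IN))
    for k, r in enumerate(radius):                         # units covered by the image
        frames[k, int(round(20 - r)):int(round(20 + r))] = 1.0
    return frames

print(run_sequence(looming_frames(0.3, 1.0))[-5:])         # untrained output over time
```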
5. Conclusions

In this chapter we suggest there may be several different classes of neurons that are each specialized to compute TTC information for different subsets of tasks critical for the survival of animals in their natural environments. We then examine one class of these neurons in detail and find there may be several subroutines required before the tau or eta computations can be performed. What is interesting is that there appears to be convergence in solutions between looming sensitive neurons found in invertebrates and vertebrates in that the eta function describes well some of the neurons in both locust and pigeon visual systems. The functional significance of having both tau and eta neurons is still not obvious, although it is provocative that learning networks appear to give rise to both types of hidden neurons.
REFERENCES

Ball, W. & Tronick, E. (1971). Infant responses to impending collision. Science, 171, 818-820.
Bechterew, W. (1884). Über die Function der Vierhügel. Pflügers Archiv, 33, 413.
Bessette, B. B. & Hodos, W. (1989). Intensity, colour, and pattern discrimination deficits after lesions of the core and belt regions of the ectostriatum. Visual Neuroscience, 2, 27-34.
Blanchard, D. C., William, G., Lee, E. M. C. & Blanchard, R. J. (1981). Taming of wild Rattus norvegicus by lesions of the mesencephalic central gray. Physiological Psychology, 9, 157-163.
Bower, T. G. R., Broughton, J. M. & Moore, M. K. (1970). Infant responses to approaching objects: an indicator of response to distal variables. Perception & Psychophysics, 9, 193-196.
Burrows, M. & Rowell, C. H. F. (1973). Connections between descending visual interneurons and metathoracic motoneurons in the locust. Journal of Comparative Physiology, 85, 221-234.
Coggshall, J. C. (1972). The landing response and visual processing in the milkweed bug, Oncopeltus fasciatus. Journal of Experimental Biology, 57, 401-413.
Cutting, J. E., Vishton, P. M. & Braren, P. A. (1995). How we avoid collisions with stationary and moving obstacles. Psychological Review, 102, 627-651.
Dean, P., Mitchell, I. J. & Redgrave, P. (1988). Responses resembling defensive behaviour produced by microinjection of glutamate into superior colliculus of rats. Neuroscience, 24, 501-510.
Dill, L. M. (1974). The escape response of the zebra danio (Brachydanio rerio). I. The stimulus for escape. Animal Behaviour, 22, 711-722.
Ellard, C. G. & Goodale, M. A. (1986). The role of the predorsal bundle in head and body movements elicited by electrical stimulation of the superior colliculus in the Mongolian gerbil. Experimental Brain Research, 71, 307-319.
Ewert, J.-P. (1984). Tectal mechanisms that underlie prey-catching and avoidance behaviours in toads. In H. Vanegas (Ed.), Comparative neurology of the optic tectum (pp. 247-416). New York, NY: Plenum Press.
Fishman, R. & Tallaroco, R. B. (1961). Studies of visual depth perception. II. Avoidance reaction as an indicator response in chicks. Perceptual and Motor Skills, 12, 251-257.
Frost, B. J. (1978). Moving background patterns alter directionally specific responses of pigeon tectal neurons. Brain Research, 151, 599-603.
Frost, B. J., Cavanagh, P. & Morgan, B. (1988). Deep tectal cells in pigeons respond to kinematograms. Journal of Comparative Physiology A, 162, 639-647.
Frost, B. J. & Nakayama, K. (1983). Single visual neurons code opposing motion independent of direction. Science, 220, 744-745.
Frost, B. J., Scilley, P. L. & Wong, S. C. P. (1981). Moving background patterns reveal double opponency of directionally specific pigeon tectal neurons. Experimental Brain Research, 43, 173-185.
Gabbiani, F., Krapp, H. G., Koch, C. & Laurent, G. (2002). Multiplicative computation in a looming-sensitive neuron. Nature, 420, 320-324.
Gabbiani, F., Krapp, H. G. & Laurent, G. (1999). Computation of object approach by a wide-field, motion-sensitive neuron. Journal of Neuroscience, 19, 1122-1141.
Gabbiani, F., Mo, C. H. & Laurent, G. (2001). Invariance of angular threshold computation in a wide-field looming-sensitive neuron. Journal of Neuroscience, 21, 314-329.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Grüsser, O. J. & Grüsser-Cornehls, U. (1976). Neurophysiology of the anuran visual system. In R. Llinás & W. Precht (Eds.), Frog neurobiology (pp. 297-385). New York, NY: Plenum Press.
Hatsopoulos, N., Gabbiani, F. & Laurent, G. (1995). Elementary computation of object approach by a wide-field visual neuron. Science, 270, 1000-1003.
Hayes, W. N. & Saiff, E. I. (1967). Visual alarm reactions in turtles. Animal Behaviour, 15, 102-108.
Hodos, W. (1969). Color discrimination deficits after lesions of the nucleus rotundus in pigeons. Brain, Behavior and Evolution, 2, 185-200.
Hodos, W. & Bonbright, J. C. (1974). Intensity difference thresholds in pigeons after lesions of the tectofugal and thalamofugal visual pathways. Journal of Comparative and Physiological Psychology, 87, 1013-1031.
Hodos, W. & Karten, H. J. (1966). Brightness and pattern discrimination deficits in the pigeon after lesions of nucleus rotundus. Experimental Brain Research, 2, 151-167.
Ingle, D. J. & Shook, B. L. (1985). Action-oriented approaches to visuo-spatial brain functions. In D. Ingle, M. Jeannerod & D. Lee (Eds.), Brain mechanisms of spatial vision (pp. 229-258). Dordrecht: Martinus Nijhoff.
Jarvis, C. D. (1974). Visual discrimination and spatial location deficits after lesions of the tectofugal pathway in pigeons. Brain, Behaviour and Evolution, 9, 195-228.
Judge, S. & Rind, F. C. (1997). The locust DCMD, a movement-detecting neurone tightly tuned to collision trajectories. Journal of Experimental Biology, 200, 2209-2216.
Karanka-Ahonen, J. T., Luque-Ruiz, D. & Lopez-Zamora, M. (2002). A neural network that acquires the capacity of predicting collision from the increase in the retinal size of objects. XIV Congreso de la Sociedad Española de Psicología Comparada.
Keay, K. A., Redgrave, P. & Dean, P. (1988). Cardiovascular and respiratory changes elicited by stimulation of rat superior colliculus. Brain Research Bulletin, 20, 13-26.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N., Davies, M. N. O., Green, P. R. & Van der Weel, F. R. (1993). Visual control of velocity of approach by pigeons when landing. Journal of Experimental Biology, 180, 85-104.
Lee, D. N. & Reddish, P. E. (1981). Plummeting gannets: a paradigm of ecological optics. Nature, 293, 293-294.
Northmore, D. P. M., Levine, E. S. & Schneider, G. E. (1988). Behaviour evoked by electrical stimulation of the hamster superior colliculus. Experimental Brain Research, 73, 595-605.
Olberg, R. M., Worthington, A. H. & Venator, K. R. (2000). Prey pursuit and interception in dragonflies. Journal of Comparative Physiology A, 186, 155-162.
Pearson, K. G., Heitler, W. J. & Steeves, J. D. (1980). Triggering of locust jump by multimodal inhibitory interneurons. Journal of Neurophysiology, 43, 257-278.
Peña, J. L. & Konishi, M. (2001). Auditory spatial receptive fields created by multiplication. Science, 292, 249-252.
Redgrave, P., Dean, P., Souki, W. & Lewis, G. (1981). Gnawing and changes in reactivity produced by microinjections of picrotoxin into the superior colliculus of rats. Psychopharmacology, 75, 198-203.
Rind, F. C. (1997). Collision avoidance: from the locust eye to a seeing machine. In M. V. Srinivasan & S. Venkatesh (Eds.), From Living Eyes to Seeing Machines (pp. 105-125). Oxford: Oxford University Press.
Rind, F. C. & Bramwell, D. I. (1996). Neural network based on the input organization of an identified neuron signaling impending collision. Journal of Neurophysiology, 75, 967-985.
Rind, F. C. & Simmons, P. J. (1992). Orthopteran DCMD neuron: A reevaluation of responses to moving objects. I. Selective responses to approaching objects. Journal of Neurophysiology, 68, 1654-1666.
Rind, F. C. & Simmons, P. J. (1999). Seeing what is coming: building collision-sensitive neurones. Trends in Neurosciences, 22, 215-220.
Robertson, R. M. & Johnson, A. G. (1993a). Retinal image size triggers obstacle avoidance in flying locusts. Naturwissenschaften, 80, 176-178.
Robertson, R. M. & Johnson, A. G. (1993b). Collision avoidance of flying locusts: Steering torques and behaviour. Journal of Experimental Biology, 183, 35-60.
Sahibzada, N., Dean, P. & Redgrave, P. (1986). Movements resembling orientation or avoidance elicited by electrical stimulation of superior colliculus in rats. Journal of Neuroscience, 6, 723-733.
Schiff, W. (1965). Perception of impending collision: A study of visually directed avoidant behaviour. Psychological Monographs: General and Applied, 79, 1-26.
Schiff, W., Caviness, J. A. & Gibson, J. J. (1962). Persistent fear responses in rhesus monkeys to the optical stimulus of 'looming'. Science, 136, 982-983.
Schlotterer, G. R. (1977). Response of the locust descending movement detector neuron to rapidly approaching and withdrawing visual stimuli. Canadian Journal of Zoology, 55, 1372-1376.
Simmons, P. (1980). Connexions between a movement-detecting interneurone and flight motoneurones of a locust. Journal of Experimental Biology, 86, 87-97.
Simmons, P. J. & Rind, F. C. (1992). Orthopteran DCMD neuron: A reevaluation of responses to moving objects. II. Critical cues for detecting approaching objects. Journal of Neurophysiology, 68, 1667-1682.
Sun, H.-J., Carey, D. P. & Goodale, M. A. (1992). A mammalian model of optic-flow utilization in the control of locomotion. Experimental Brain Research, 91, 171-175.
Sun, H.-J. & Frost, B. J. (1997). The effect of image expansion on a human target-directed locomotion task tested in virtual reality. Society for Neuroscience Abstracts.
Sun, H.-J. & Frost, B. J. (1998). Computation of different optical variables of looming objects in pigeon nucleus rotundus neurons. Nature Neuroscience, 1, 296-303.
Sun, H.-J., Zhao, J., Southall, T. L. & Xu, B. (2002). Contextual influences on the directional responses of tectal cells in pigeons. Visual Neuroscience, 19, 133-144.
Tronick, E. (1967). Approach response of domestic chicks to an optical display. Journal of Comparative and Physiological Psychology, 64, 529-531.
Wang, Y. & Frost, B. J. (1992). Time to collision is signalled by neurons in the nucleus rotundus of pigeons. Nature, 356, 236-238.
Wagner, H. (1982). Flow-field variables trigger landing in flies. Nature, 297, 147-148.
Wicklein, M. & Strausfeld, N. J. (2000). Organization and significance of neurons that detect change of visual depth in the hawk moth Manduca sexta. Journal of Comparative Neurology, 424, 356-376.
Yonas, A. & Granrud, C. (1985). The development of sensitivity to kinetic, binocular and pictorial depth information in human infants. In D. Ingle, M. Jeannerod & D. Lee (Eds.), Brain Mechanisms and Spatial Vision (pp. 113-145). Amsterdam: Martinus Nijhoff Press.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 3
Building Blocks for Time-to-Contact Estimation by the Brain
Markus Lappe
Universität Münster, Münster, Germany
ABSTRACT

The ability to estimate impending collisions is a vital requirement for any organism. Behavioral studies in several animal species have demonstrated sensitivity to time-to-contact (TTC). The proven capability to estimate TTC in humans and animals must have its origin in brain processes that extract or compute correlates of TTC from the visual input. Unfortunately, not many studies have looked specifically at neuronal sensitivity for TTC. Specific investigations of this question in the primate brain (including humans) are still lacking. However, a great deal is known about the neuronal selectivity for the building blocks of TTC mechanisms: visual motion, looming, binocular disparity, motion in depth, etc. This chapter presents a discussion of the properties of neurons that encode these cues and hopefully will be helpful for homing in on the biological foundations of TTC.
1. Introduction

Since we are able to precisely estimate TTC, our brains must contain mechanisms that conduct the required sensory analysis. Candidates for neural TTC mechanisms have been described in several animal species (pigeon: Frost & Sun, this volume; locust: Rind & Simmons, 1997; Gabbiani, Krapp & Laurent, 1999; gerbil: Shankar & Ellard, 2000). The topic has not yet been specifically addressed, however, for the brains of higher mammals and especially primates. But a number of the sensory cues that are used to derive TTC measurements have been studied in great detail. These include visual motion, motion parallax, binocular disparity, looming and optic flow. After a brief overview of the motion analysis pathway of the primate brain I will review some of these studies and show how the building blocks for TTC estimation are represented in the brain. The presentation will sometimes have to divert a bit towards the analysis of optic flow rather than time-to-collision, mainly because much more is known about the brain's representation of optic flow. As I believe the two tasks to have some commonalities at the basic level, however, I think it might be useful to consider them together.
2. The motion pathway

In the cerebral cortex, the processing of visual motion is attributed to a successive series of areas in the so-called dorsal stream pathway. The dorsal stream is specialized in the analysis of spatial relationships and the generation of spatially directed action. Within the dorsal stream, motion information proceeds from the primary visual cortex (V1) to the middle temporal area (MT, also called V5) and the medial superior temporal area (MST) and to several higher areas in the parietal cortex and the anterior part of the superior temporal sulcus. Area V1 already contains cells that respond selectively to visual motion in a preferred direction. But area MT is the first cortical area that is dedicated specifically to the processing of motion. It contains a very high proportion of direction selective neurons and it has been linked to behavioral responses to motion stimuli in lesion and microstimulation studies. Typical MT neurons respond selectively to small moving stimuli with a preferred direction and a preferred speed of motion (Maunsell & Van Essen, 1983a). Area MT contains a topographic representation of the visual field, that is, the receptive fields of neighboring MT cells correspond to neighboring locations of the visual field (Albright & Desimone, 1987). Receptive field sizes in MT vary with eccentricity and range from 1 to more than 10 degrees of visual angle in diameter (Albright & Desimone, 1987). The properties of MT neurons are well suited for
establishing a cortical representation of motion in the visual field, solving computational problems in the estimation of visual motion such as the aperture problem (Movshon et al., 1985; Pack & Born, 2001). Area MT projects heavily to area MST, a further specialized motion area in the superior temporal sulcus (Boussaoud, Ungerleider, Desimone, 1990). Like MT, MST contains a large proportion of motion selective neurons, but retinotopy in MST is much coarser than in MT and the receptive fields of MST neurons are much larger (Desimone & Ungerleider, 1986). Neurons in MST respond to motion patterns such as optical expansions or rotations (Tanaka & Saito, 1989; Duffy & Wurtz, 1991). The selective responses of MST neurons to motion patterns depend mainly on the spatial arrangement (directions) of motion vectors and on the size of the stimulated area. Other parameters such as shape and size of the individual motion elements, contrast, or speed gradients did not have much influence on neuronal responses (Tanaka & Saito, 1989; Geesaman & Andersen, 1996; Pekel, Lappe, Bremmer, Thiele, Hoffmann, 1996). Area MST is believed to be involved in the analysis of optic flow and the control of self-motion (Duffy & Wurtz, 1991; Lappe, Bremmer, Pekel, Thiele & Hoffmann, 1996). Area MT also projects to the ventral intraparietal area, or area VIP. Like cells in area MST, VIP cells respond to patterns of visual motion (Schaafsma & Duysens, 1996; Bremmer, Duhamel, Ben Hamed, Graf, 2000). However, they also respond to small objects moving toward the animal, especially when they are near to the head (Colby, Duhamel, Goldberg, 1993).
3. Optical speed

Selectivity for the direction of visual motion is a well-documented feature of neurons in many areas of the brain. Direction selectivity is found as early as V1 and is the main feature of neural selectivity in areas MT, MST, and VIP. When tested with moving bars, gratings, or random dot fields of variable speed, most neurons in these areas also display selectivity for the speed of the stimulus. Typically, the speed tuning is independent of other tuning properties of the neuron. For instance, the speed tuning of an MT neuron is independent of its direction tuning (Rodman & Albright, 1987). MT responses have been modeled as the multiplication of a speed factor and a direction factor (e.g. Lappe et al., 1996). Whereas neurons in area V1 are influenced by the spatial frequency content of the stimulus and thus react to spatiotemporal frequency rather than visual speed, neurons in MT already display true speed selectivity (Perrone & Thiele, 2001). The most prevalent type of speed selectivity in MT is band-pass tuning with optimal speeds ranging from 0.5 to 256 degrees per second (Maunsell & Van Essen, 1983a; Mikami, Newsome, Wurtz, 1986; Lagae, Raiguel, Orban, 1993).
Smaller proportions of cells have either low-pass or high-pass characteristics (Lagae et al., 1993). With tuned neurons, the representation of visual speed is not possible from the responses of individual neurons but requires some kind of population code (Lappe et al., 1996; Perrone & Thiele, 2001). The global organization of the representation of visual motion in MT reflects a certain relation to the optic flow patterns experienced during forward self-motion. Preferred speeds increase with eccentricity (Maunsell & Van Essen, 1983a), similar to the way optic flow speeds naturally do. Also, the number of direction sensitive neurons preferring motion away from the fovea is significantly higher than the number of neurons preferring motion towards the fovea (Albright, 1989).
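The sketch below illustrates this kind of description (the tuning forms, parameter values, and the simple rate-weighted readout are illustrative assumptions of ours, not a fitted model of MT data): a unit whose firing rate is the product of a direction factor and a speed factor, and a small population of band-pass speed-tuned units from which stimulus speed is recovered by a population code.

```python
# An illustrative sketch (tuning forms, parameters and the readout are our own
# assumptions, not a fitted model of MT data) of a unit whose response is the product
# of a direction factor and a speed factor, and of a population-code readout of speed
# from a bank of band-pass speed-tuned units.
import numpy as np

def mt_response(direction_deg, speed_dps, pref_dir_deg, pref_speed_dps,
                kappa=2.0, sigma_oct=1.0, r_max=60.0):
    """Rate = r_max * von Mises direction tuning * Gaussian tuning on log2 speed."""
    d = np.deg2rad(direction_deg - pref_dir_deg)
    dir_factor = np.exp(kappa * (np.cos(d) - 1.0))
    speed_factor = np.exp(-0.5 * (np.log2(speed_dps / pref_speed_dps) / sigma_oct) ** 2)
    return r_max * dir_factor * speed_factor

# preferred speeds spanning roughly the band-pass range reported for MT (0.5-256 deg/s)
pref_speeds = 2.0 ** np.arange(-1, 9)

def decode_speed(direction_deg, speed_dps):
    """Recover stimulus speed as the rate-weighted average of preferred log speeds."""
    rates = np.array([mt_response(direction_deg, speed_dps, 0.0, p) for p in pref_speeds])
    return 2.0 ** (np.sum(rates * np.log2(pref_speeds)) / np.sum(rates))

for s in [2.0, 8.0, 32.0, 64.0]:
    print(f"true speed {s:6.1f} deg/s -> decoded {decode_speed(0.0, s):6.1f} deg/s")
```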
4. Local differential motion and motion parallax

Differences in the motion of neighboring image elements can often be more informative than the visual motion of a single location itself (Nakayama & Loomis, 1974). On the one hand, local differential motion is important to segregate moving objects from the background. On the other hand, local differential motion provides local motion parallax information that can be useful to estimate distance and may indirectly contribute to the estimation of TTC. Many neurons in area MT are sensitive to local differential motion by virtue of the antagonistic center-surround organization of their receptive fields (Allman, Miezin, McGuinness, 1985). The receptive field of such a neuron can be described as composed of two parts. The center is an excitatory region in which the primary direction and speed selectivity of the neuron is established. Stimulation of the center is required to elicit a response from the neuron. The surround is a modulatory region of the receptive field. Stimulation of the surround increases or decreases the response to stimulation in the center, depending on the stimulation and selectivity of the surround (Born, 2000). Stimulation of the surround alone elicits no response. The area covered by the surround is often many times larger than the area covered by the center (Allman et al., 1985; Raiguel, Van Hulle, Xiao, Marcar, Orban, 1995). Despite what the name suggests, the surround need not lie concentrically around the center of a neuron's receptive field. Rather, the surround is often asymmetric and its midpoint is displaced from the center of the receptive field (Raiguel et al., 1995). This suggests that the spatial organization of center and surround enables MT neurons to respond to differences in the motion of adjacent image locations. A neuron's selectivity for local differential motion is shaped by the direction and speed tuning of the surround in comparison to the preferred
velocity in the receptive field center. The direction preference of the antagonistic surround is usually opposite to the direction preference of the center, and neurons with a surround preferred direction orthogonal to the center preferred direction are rare (Born, 2000). The response rate of the neuron decreases if the surround is stimulated with the same motion as the center. The response might be enhanced, however, if center and surround are stimulated with motion in opposite directions. The speed tuning properties of the surround can be used to distinguish three classes of neurons with antagonistic surrounds. In one type of neuron, stimulation of the surround with the same speed as the speed in the center leads to maximum suppression of activity (Allman et al., 1985). Both lower and higher speeds are less effective in suppressing the center response. These neurons therefore encode absolute speed differences. In other neurons, the speed influence is monotonic, either increasing or decreasing with increasing surround speed (Allman et al., 1985; Tanaka, Hikosaka, Saito, Yukie, Fukada, Iwai, 1986). Therefore, these neurons also encode the sign of the speed difference between center and surround. There is also an interaction between the speed preference and the spatial location of the surround (Xiao et al., 1998). The center-surround architecture of MT receptive fields generates a variety of selectivities for local differential motion. These selectivities may be used for visual tasks like foreground/background segregation, optic flow analysis, or three-dimensional shape recognition (as theoretical analyses have shown (Nakayama & Loomis, 1974; Buracas & Albright, 1996; Royden, 1997; Beintema, van den Berg, Lappe, 2002)) and could also be useful for the estimation of TTC.
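As a toy illustration of this surround antagonism (our own sketch with made-up receptive field sizes and gains, not a model fitted to MT data), the following unit is driven by motion in its center and suppressed by same-direction motion in its surround, so it responds to an isolated moving patch but not to uniform, self-motion-like flow.

```python
# A minimal sketch (illustrative, with made-up receptive field sizes and gains) of an
# MT-like centre-surround opponency computation on a 1-D velocity profile: the unit
# is excited by motion in its preferred direction inside the centre and suppressed
# when the surround moves with the same velocity, so it signals local differential
# motion rather than uniform, self-motion-like flow.
import numpy as np

def centre_surround_response(velocity_field, centre_idx, centre_halfwidth=2,
                             surround_halfwidth=8, pref_velocity=4.0,
                             surround_gain=1.0):
    v = np.asarray(velocity_field, dtype=float)
    c_lo, c_hi = centre_idx - centre_halfwidth, centre_idx + centre_halfwidth + 1
    s_lo, s_hi = centre_idx - surround_halfwidth, centre_idx + surround_halfwidth + 1
    centre = v[max(c_lo, 0):c_hi]
    surround = np.concatenate([v[max(s_lo, 0):max(c_lo, 0)], v[c_hi:s_hi]])
    # excitation: how well centre motion matches the preferred velocity
    drive = np.mean(np.clip(centre / pref_velocity, 0, 1))
    # antagonism: surround motion in the same direction suppresses the response
    suppression = surround_gain * np.mean(np.clip(surround / pref_velocity, 0, 1))
    return max(drive - suppression, 0.0)

n = 41
uniform_flow = np.full(n, 4.0)                            # whole field moves together
object_motion = np.zeros(n); object_motion[18:23] = 4.0   # only a small patch moves

print("uniform flow :", centre_surround_response(uniform_flow, 20))
print("object motion:", centre_surround_response(object_motion, 20))
# The unit is silenced by uniform motion but responds to the isolated patch, the kind
# of figure-ground signal attributed to MT surround antagonism.
```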
5. Optic flow and looming

Theories of TTC often involve the analysis of motion patterns, especially looming. Selectivity to motion patterns is first found in parietal areas beyond area MT (Lagae et al., 1994). Early studies occasionally identified neurons in the superior temporal sulcus that responded to rotations in depth or to optical expansions (Bruce, Desimone, Gross, 1981; Saito et al., 1986; Sakata et al., 1986). Later, systematic investigation of the response properties of neurons in area MST revealed selectivity to unidirectional motion, rotation, expansion, and contraction patterns (Tanaka & Saito, 1989; Duffy & Wurtz, 1991). The selectivity of these neurons clearly relies on the spatial arrangement of motion vectors. Shape, size, number, or contrast of the individual motion elements did not have much influence on the response (Tanaka & Saito, 1989; Duffy & Wurtz, 1991; Geesaman & Andersen, 1996; Pekel et al., 1996). The same is true for the size change of individual motion elements that normally accompanies
approaching motion. MST neurons respond to optical expansion patterns in which the size change is removed in the same way as when the patterns contain size change (Pekel et al., 1996). Although some MST neurons are selective for only one particular pattern of the set of expansion, contraction, rotation and uniform motion, the majority of MST neurons respond to several of these stimuli (Duffy & Wurtz, 1991). Many neurons respond similarly to unidirectional motion in a preferred direction, rotation in one of the two principal directions (clockwise, counterclockwise), and either expansion or contraction. Other cells respond to unidirectional motion and to either rotation or expansion/contraction. These two response types form the majority of cells. Very few neurons respond to rotation and expansion/contraction but lack direction selectivity. The smallest group is formed by neurons that respond to only one type of motion pattern. At first sight, these response properties seem to open up the possibility that neurons in MST perform a decomposition of the visual motion field into a set of basic invariants. Mathematically, any motion vector field can be locally approximated by a set of four differential invariants: divergence (related to expansion/contraction), curl (related to rotation), and two components of deformation (Koenderink & van Doorn, 1976). A closer look, however, shows that the properties of MST cells are incompatible with such a mathematical decomposition. To extract divergence in a mathematical sense a cell would have to respond with equal activity to pure expansion and to a stimulus where the same expansion is superimposed on another motion component, for instance rotation. This is not true for MST cells (Graziano et al., 1994; Orban et al., 1992). Instead, some MST cells even prefer vectorial combinations of rotation and expansion/contraction over the two individual patterns and display a selectivity for spiral motion (Graziano et al., 1994). Secondly, mathematical decomposition predicts selectivity for deformation, since a full local description requires deformation in addition to divergence and curl (Koenderink & van Doorn, 1976). Cells responding exclusively to deformation are rare in MST (Lagae et al., 1994). Thirdly, decomposition is a local linearization of the flow field (Koenderink & van Doorn, 1976) and valid only within a small neighborhood of any point in the field. Because the receptive fields in MST are quite large they are less likely to signal local properties. Altogether, MST does not appear to form a set of channels selective for expansion, contraction and rotation, but rather a continuum of selectivities in which single neurons respond to several different motion patterns. Area MST is believed to contribute to the analysis of optic flow for the control of self-motion and in particular to the computation of heading (Duffy & Wurtz, 1991; Lappe & Rauschecker, 1993). A number of computational models have shown how the response properties of MST neurons can be linked to optic flow based heading estimation (overview in Lappe (2000)). These models share
the assumption that area MT serves as a representation of the optic flow field, while area MST performs computations on this representation that generate selectivity for the heading inherent in the flow field. In the output stage, area MST forms a computational map of heading directions. This map could be represented either implicitly by the population, or explicitly by dedicated neurons that individually encode specific headings. Consistent with model predictions, MST neurons have been found to exhibit selectivity for heading and the location of the focus of expansion in a flow field stimulus (Lappe et al., 1996). In these experiments, a monkey was presented with large-field (90 by 90 deg) computer generated optic flow stimuli simulating approaching (expansion) and receding (contraction) self-motion with respect to a random cloud of dots in three-dimensional space. The stimuli displayed realistic flow fields in which dots accelerated with eccentricity, grew larger in size as they approached the animal, and moved with motion parallax according to their distance in depth. To determine the dependence of neuronal responses on the singular point in the optic flow, either a focus of expansion or a focus of contraction, different stimuli depicted self-motion in different directions. Neural responses in MST varied smoothly with the position of the singular point. They often showed a sigmoidal response profile that saturated as the focus of expansion moved into the visual periphery. These experimental findings could be recreated in computer simulations of neurons from the heading model in which the focus of expansion is recovered from the population response. Similarly, the location of the focus of expansion could be retrieved from a population analysis of the neuronal activities in MST with a precision close to that obtained in human psychophysical studies of heading estimation. One of the main problems in heading estimation from optic flow is the combination of translational and rotational self-motion components and the disturbance of the retinal flow field by eye movements. Superimposed visual rotation because of eye movements can drastically change the appearance of the optic flow field on the retina and thus complicate the task of its analysis (overview in Lappe et al. (1999)). Neurons in MST appear to be able to cope with the problem of eye movements and represent heading also during eye pursuit, at least at the population level (Bradley et al., 1996; Page & Duffy, 1999; Krekelberg et al., 2001). Combinations of translational and rotational self-motion components also occur during movement on a curved path. MST neurons faithfully encode momentary heading even when it changes because of a curve in the path (Paolini et al., 2000; Krekelberg et al., 2001). How can MST's selectivity to optic flow be related to the analysis of TTC? On the one hand, several factors argue against an involvement of MST in TTC estimation. MST neurons generally respond best to very large motion fields, ideally several tens of degrees in diameter. MST neurons respond not only to optical expansions but also to optical rotation, uniform motion fields, or
These patterns are not directly linked to TTC mechanisms. On the other hand, MST neurons can also be driven by small stimuli, albeit to a lesser degree. The computational mechanisms for heading detection could in principle also be used to compute the direction of approach of a moving object, at least if the image area that is analyzed is restricted to the size of the object. The capability to deal with combinations of translation and rotation might be useful in estimating the approach of a rotating object such as a flying ball. If MST provides the machinery to perform a complex analysis of patterns of image motion that result from translational and rotational movements, this machinery might be put to use for both self-motion and object-motion tasks. A point that is more directly related to time-to-collision analysis is MST's ability to extract the speed of an optical expansion. Early work on MST had focused mainly on pattern selectivity (Saito et al., 1986; Sakata et al., 1986; Duffy & Wurtz, 1991). Speed effects were disregarded or, after cursory inspection of the data, declared non-existent. Later studies have specifically addressed the influence of speed on the responses to expansion and rotation patterns (Orban et al., 1995; Duffy & Wurtz, 1997). Like MT, MST contains neurons with band-pass, low-pass, or high-pass speed characteristics. Duffy & Wurtz (1997) also described neurons that respond well to low and to high speeds but less to intermediate speeds. The preferred speeds of band-pass neurons in MST are typically higher than in MT, however, and the speed selectivity curves are shallower. Typical speed tuning curves appear quite linear over a large range up to the preferred speed. The response drops quickly for speeds that exceed the preferred speed. Speed or rate of expansion might be retrieved from these responses by a population code. In addition to selectivity for the average speed of a motion pattern, MST neurons also display selectivity for the speed gradient within the pattern (Duffy & Wurtz, 1997). Such selectivity might be useful to encode properties of the three-dimensional structure of the environment.
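For readers who want to see the decomposition discussed earlier in this section in concrete terms, the short sketch below computes the four differential invariants of Koenderink and van Doorn (1976) by finite differences on a synthetic flow field that combines expansion with rotation. The particular field, the numerical method, and all names are our own illustrative choices and are not part of the physiological argument.

```python
# Illustrative sketch (not from the chapter): the four differential invariants
# of a 2-D flow field, computed by finite differences on a synthetic field
# that combines expansion (divergence) with rotation (curl).
import numpy as np

h = 0.01
x, y = np.meshgrid(np.arange(-1, 1, h), np.arange(-1, 1, h))
u = 0.5 * x - 0.2 * y        # horizontal velocity: expansion plus rotation
v = 0.2 * x + 0.5 * y        # vertical velocity

du_dy, du_dx = np.gradient(u, h)   # axis 0 varies in y, axis 1 varies in x
dv_dy, dv_dx = np.gradient(v, h)

divergence = du_dx + dv_dy    # expansion/contraction component (here ~1.0)
curl = dv_dx - du_dy          # rotation component (here ~0.4)
def1 = du_dx - dv_dy          # first deformation component (here ~0.0)
def2 = du_dy + dv_dx          # second deformation component (here ~0.0)

print(divergence.mean(), curl.mean(), def1.mean(), def2.mean())
```

A cell that extracted divergence in this mathematical sense would have to report the first quantity regardless of how much curl or deformation is superimposed, which, as noted above, is not what MST neurons do.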
6. Binocular disparity and motion in depth

Neural selectivity for binocular disparity is a common feature of many areas in the visual motion pathway (overview in Poggio (1995)). For instance, the response to visual motion in many neurons of area MT is modulated by the disparity of the stimulus (Maunsell & Van Essen, 1983b). Most disparity-selective MT neurons prefer disparities near zero, but some also prefer stimuli placed nearer or farther than the horopter (Maunsell & Van Essen, 1983b; Bradley et al., 1995; DeAngelis et al., 1998). MT appears to be involved in the perception of depth from disparity (DeAngelis et al., 1998; Bradley et al., 1998). It is therefore conceivable that the disparity of a stimulus can be retrieved from a population code of MT responses to be used in time-to-collision judgments as well.
However, the disparity selectivity of MT neurons may also be used to enhance the representation of the visual motion field, particularly in relation to self-motion (Lappe, 1996). Disparity selectivity is also found in area MST, but in MST most neurons prefer disparities different from zero (Roy & Wurtz, 1992; Takemura et al., 2001). A population code can retrieve stimulus disparity from MST responses and suggests a link to the control of vergence eye movements (Takemura et al., 2001). The disparity selectivity of MST neurons together with their selectivity for optic flow patterns also appears well-suited for the analysis of self-motion (Roy & Wurtz, 1992; Lappe & Grigo, 1999). With respect to the measurement of TTC, disparity selectivity is mainly interesting as a contribution to mechanisms that estimate motion in depth. While early work had suggested that the combination of motion and disparity selectivity could lead to selectivity to motion in depth (Zeki, 1974), a later study found no evidence for genuine motion-in-depth selectivity in MT (Maunsell & Van Essen, 1983b). Rather, it appears that MT neurons possess selectivity for a certain preferred range of fixed disparities such that their response is determined by the two-dimensional visual motion present within this disparity range (Bradley et al., 1995; Lappe, 1996). However, selectivity for binocular motion in depth has been described in studies of the lateral suprasylvian cortex of the cat, which may contain homologues of primate cortical areas MT or MST (Sherk & Fowler, 2000). Akase et al. (1998) investigated the responses of neurons in the posteromedial lateral suprasylvian cortex (area PMLS) to object motion in three dimensions. Their stimulus was a small cube presented by a computer graphics setup in which motion, disparity, looming, texture, and shading cues could be included in or removed from the stimulus. A substantial proportion of the PMLS cells they studied preferred movement towards or away from the animal over lateral motion. Testing with stimuli in which depth cues were partially eliminated revealed that the selectivity to approaching or receding motion depended mostly on binocular disparity and looming and less on texture and shading cues. Removal of disparity cues affected selectivity in ninety percent of the cells and removal of size change (looming) in forty percent. Interestingly, the response rates of cells selective to approaching movements usually increased during the approach of the object and were maximal when the object was nearest to the animal. Although Akase et al. did not discuss this observation in relation to TTC, one might expect that this type of response behavior could serve as a basic mechanism for TTC analysis. Somewhat similar behavior has been described, albeit briefly, in the ventral intraparietal area (area VIP) of the rhesus monkey. Area VIP is part of the motion pathway and receives input from area MT and area MST. Several properties of neurons in area VIP suggest that area VIP is especially concerned with the representation of motion in the space near to the animal (Bremmer et al., 1997):
In addition to being selective to visual motion patterns (Colby et al., 1993; Schaafsma & Duysens, 1996; Bremmer et al., 2000), VIP neurons also often possess somatosensory responses to tactile stimulation on the head (Colby et al., 1993). VIP neurons are also selective for the distance of a visual stimulus and often prefer near or ultra-near (< 5 cm) stimuli (Colby et al., 1993). VIP neurons are disparity sensitive, and the number of near-selective neurons is larger than the number of far-selective neurons (Bremmer et al., 1999). Some neurons in VIP are selective for stimuli moving in depth towards the animal. Colby et al. (1993) reported that the critical variable for these neurons is the projected point of impact of the object on the body. The impact point that elicits the strongest response is associated with the tactile receptive field of the neuron. The response of the neuron is independent of the trajectory along which the object is approaching the point of impact as long as the point of impact remains the same. This type of response behavior seems to suggest a connection to time-to-collision estimation, although little is known yet about the temporal response properties of these neurons.
7. Conclusion

It is obvious from the above that our current knowledge of the neural mechanisms of TTC estimation is only cursory and comes from a few occasional observations. This is surprising when one considers that the basic suggestion of using tau as a simple optical variable for powerful behavioral control is more than 25 years old. But as theories of TTC have become more varied in recent years, and as the necessity to use many different optical cues has become clear, the knowledge we have about the encoding of general visual cues to motion and depth may become more valuable for research on TTC. With the neural representations of optical velocity, motion parallax, looming, disparity, and motion-in-depth, the building blocks for a variety of TTC mechanisms are there. The question that still remains is how these encodings are used and combined to form appropriate representations for TTC judgments.
REFERENCES

Akase, E., Inokawa, H. & Toyama, K. (1998). Neuronal responsiveness to three-dimensional motion in cat posteromedial lateral suprasylvian cortex. Experimental brain research, 122, 214-226.
Albright, T. D. (1989). Centrifugal directionality bias in the middle temporal visual area (MT) of the macaque. Visual neuroscience, 2, 177-188.
Albright, T. D. & Desimone, R. (1987). Local precision of visuotopic organization in the middle temporal area (MT) of the macaque. Experimental brain research, 65, 582-592.
Allman, J. M., Miezin, F. & McGuinness, E. (1985). Stimulus specific responses from beyond the classical receptive field: Neurophysiological mechanisms for local-global comparisons in visual neurons. Annual Reviews of Neuroscience, 8, 407-430.
Beintema, J., van den Berg, A. V. & Lappe, M. (2002). Receptive field structure of flow detectors for heading perception. In: T. G. Dietterich, S. Becker & Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems 14. MIT Press.
Born, R. T. (2000). Center-surround interactions in the middle temporal visual area of the owl monkey. Journal of neurophysiology, 84, 2658-2669.
Boussaoud, D., Ungerleider, L. G. & Desimone, R. (1990). Pathways for motion analysis: Cortical connections of the medial superior temporal visual areas in the macaque. The Journal of comparative neurology, 296, 462-495.
Bradley, D., Chang, G. C. & Andersen, R. (1998). Encoding of three-dimensional structure-from-motion by primate area MT neurons. Nature, 392, 714-717.
Bradley, D., Maxwell, M., Andersen, R., Banks, M. S. & Shenoy, K. V. (1996). Mechanisms of heading perception in primate visual cortex. Science, 273, 1544-1547.
Bradley, D., Qian, N. & Andersen, R. (1995). Integration of motion and stereopsis in middle temporal cortical area of macaques. Nature, 373, 609-611.
Bremmer, F. & Kubischick, M. (1999). Representation of near extrapersonal space in the macaque ventral intraparietal area (VIP). Abstracts - Society for Neuroscience, 27, 1164.
Bremmer, F., Duhamel, J.-R., Ben Hamed, S. & Graf, W. (1997). The representation of movement in near extrapersonal space in the macaque ventral intraparietal area (VIP). In: Thier, P., Karnath, H.-O. (Eds.), Parietal Lobe Contributions to Orientation in 3D-Space. Experimental brain research, Vol. 25. Springer, Heidelberg, pp. 619-630.
Bremmer, F., Duhamel, J.-R., Ben Hamed, S. & Graf, W. (2000). Stages of self-motion processing in primate posterior parietal cortex. In: Lappe, M. (Ed.), Neuronal Processing of Optic Flow. Academic Press.
Bruce, C., Desimone, R. & Gross, C. G. (1981). Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. Journal of neurophysiology, 46, 369-384.
Buracas, G. & Albright, T. (1996). Contribution of area MT to perception of three-dimensional shape: A computational study. Vision Research, 36, 869-887.
Colby, C. L., Duhamel, J.-R. & Goldberg, M. E. (1993). The ventral intraparietal area (VIP) of the macaque: anatomical location and visual properties. Journal of neurophysiology, 69, 902-914.
DeAngelis, G., Cumming, B. & Newsome, W. (1998). Cortical area MT and the perception of stereoscopic depth. Nature, 394, 677-680.
Desimone, R. & Ungerleider, L. G. (1986). Multiple visual areas in the caudal superior temporal sulcus of the macaque. The Journal of comparative neurology, 248, 164-189.
Duffy, C. J. & Wurtz, R. H. (1991). Sensitivity of MST neurons to optic flow stimuli. I. A continuum of response selectivity to large-field stimuli. Journal of neurophysiology, 65, 1329-1345.
Duffy, C. J. & Wurtz, R. H. (1997). Medial superior temporal neurons respond to speed patterns in optic flow. Neuroscience, 17, 2839-2851.
Frost, B. J. & Sun, H.-J. (2002). Neurons that detect looming objects and compute time to collision: Neuronal response properties and models. In: Hecht, H., Savelsbergh, G. (Eds.), Theories of Time-to-Contact. Elsevier.
Gabbiani, F., Krapp, H. G. & Laurent, G. (1999). Computation of object approach by a wide-field, motion-sensitive neuron. Neuroscience, 19, 1122-1141.
Geesaman, B. & Andersen, R. (1996). The analysis of complex motion patterns by form/cue invariant MSTd neurons. Neuroscience, 16, 4716-4732.
Graziano, M. S. A., Andersen, R. A. & Snowden, R. (1994). Tuning of MST neurons to spiral motions. Neuroscience, 14, 54-67.
Koenderink, J. J. & van Doorn, A. J. (1976). Local structure of movement parallax of the plane. Journal of the Optical Society of America, 66, 717-723.
Krekelberg, B., Paolini, M., Bremmer, F., Lappe, M. & Hoffmann, K.-P. (2001). Deconstructing the receptive field: information coding in macaque area MST. Neurocomputing, 38, 249-254.
Lagae, L., Maes, H., Raiguel, S., Xiao, D.-K. & Orban, G. A. (1994). Responses of macaque STS neurons to optic flow components: A comparison of areas MT and MST. Journal of neurophysiology, 71, 1597-1626.
Lagae, L., Raiguel, S. & Orban, G. A. (1993). Speed and direction selectivity of macaque middle temporal neurons. Journal of neurophysiology, 69, 19-39.
Lappe, M. (1996). Functional consequences of an integration of motion and stereopsis in area MT of monkey extrastriate visual cortex. Neural Computation, 8, 1449-1461.
Lappe, M. (2000). Computational mechanisms for optic flow analysis in primate cortex. In: Lappe, M. (Ed.), Neuronal Processing of Optic Flow. International review of neurobiology, 44. Academic Press, pp. 235-268.
Lappe, M., Bremmer, F., Pekel, M., Thiele, A. & Hoffmann, K.-P. (1996). Optic flow processing in monkey STS: A theoretical and experimental approach. Neuroscience, 16, 6265-6285.
Lappe, M., Bremmer, F. & van den Berg, A. V. (1999). Perception of self-motion from visual flow. Trends in cognitive sciences, 3, 329-336.
Lappe, M. & Grigo, A. (1999). How stereo vision interacts with optic flow perception: Neural mechanisms. Neural Networks, 12, 1325-1329.
Lappe, M. & Rauschecker, J. P. (1993). A neural network for the processing of optic flow from ego-motion in man and higher mammals. Neural Computation, 5, 374-391.
Maunsell, J. H. R. & Van Essen, D. C. (1983a). Functional properties of neurons in middle temporal visual area of the macaque monkey. I. Selectivity for stimulus direction, speed, and orientation. Journal of neurophysiology, 49, 1127-1147.
Maunsell, J. H. R. & Van Essen, D. C. (1983b). Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and sensitivity to binocular disparity. Journal of neurophysiology, 49, 1148-1167.
Mikami, A., Newsome, W. T. & Wurtz, R. H. (1986). Motion selectivity in macaque visual cortex. I. Mechanisms of direction and speed selectivity in extrastriate area MT. Journal of neurophysiology, 55, 1308-1327.
Movshon, J. A., Adelson, E. H., Gizzi, M. S. & Newsome, W. T. (1985). The analysis of moving visual patterns. In: Chagas, C., Gattass, R., Gross, C. (Eds.), Pattern Recognition Mechanisms. Springer, Heidelberg.
Nakayama, K. & Loomis, J. M. (1974). Optical velocity patterns, velocity sensitive neurons, and space perception: a hypothesis. Perception, 3, 63-80.
Orban, G. A., Lagae, L., Raiguel, S., Xiao, D. & Maes, H. (1995). The speed tuning of medial superior temporal (MST) cell responses to optic-flow components. Perception, 24, 269-285.
Orban, G. A., Lagae, L., Verri, A., Raiguel, S., Xiao, D., Maes, H. & Torre, V. (1992). First-order analysis of optical flow in monkey brain. Proceedings of the National Academy of Sciences of the United States of America, 89, 2595-2599.
Pack, C. & Born, R. (2001). Temporal dynamics of a neural solution to the aperture problem in visual area MT of macaque brain. Nature, 409, 1040-1042.
Page, W. K. & Duffy, C. J. (1999). MST neuronal responses to heading direction during pursuit eye movements. Journal of neurophysiology, 81, 596-610.
Paolini, M., Distler, C., Bremmer, F., Lappe, M. & Hoffmann, K.-P. (2000). Responses to continuously changing optic flow in area MST. Journal of neurophysiology, 84, 730-743.
Pekel, M., Lappe, M., Bremmer, F., Thiele, A. & Hoffmann, K.-P. (1996). Neuronal responses in the motion pathway of the macaque monkey to natural optic flow stimuli. Neuroreport, 7, 884-888.
Perrone, J. A. & Thiele, A. (2001). Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nature neuroscience, 4, 519.
Poggio, G. (1995). Mechanisms of stereopsis in monkey visual cortex. Cerebral Cortex, 5, 193-204.
Raiguel, S., Van Hulle, M., Xiao, D., Marcar, V. & Orban, G. (1995). Shape and spatial distribution of receptive fields and antagonistic motion surrounds in the middle temporal area (V5) of the macaque. The European journal of neuroscience, 7, 2064-2082.
Rind, F. & Simmons, P. J. (1997). Signaling of object approach by the DCMD neuron of the locust. Journal of neurophysiology, 76, 1029-1033.
Rodman, H. R. & Albright, T. D. (1987). Coding of visual stimulus velocity in area MT of the macaque. Vision research, 27, 2035-2048.
Roy, J.-P. & Wurtz, R. H. (1992). Disparity sensitivity of neurons in monkey extrastriate area MST. The Journal of neuroscience: the official journal of the Society for Neuroscience, 12, 2478-2492.
Royden, C. S. (1997). Mathematical analysis of motion-opponent mechanisms used in the determination of heading and depth. Journal of the Optical Society of America. A, Optics and image science, 14, 2128-2143.
Saito, H.-A., Yukie, M., Tanaka, K., Hikosaka, K., Fukada, Y. & Iwai, E. (1986). Integration of direction signals of image motion in the superior temporal sulcus of the macaque monkey. The Journal of neuroscience: the official journal of the Society for Neuroscience, 6, 145-157.
Sakata, H., Shibutani, H., Ito, Y. & Tsurugai, K. (1986). Parietal cortical neurons responding to rotatory movement of visual stimulus in space. Experimental brain research, 61, 658-663.
Schaafsma, S. & Duysens, J. (1996). Neurons in the ventral intraparietal area of awake macaque monkey closely resemble neurons in the dorsal part of the medial superior temporal area in their responses to optic flow patterns. Journal of neurophysiology, 76, 4056-4068.
Shankar, S. & Ellard, C. (2000). Visually guided locomotion and computation of time-to-collision in the Mongolian gerbil (Meriones unguiculatus): the effects of frontal and visual cortical lesions. Behavioural brain research, 108, 21-37.
Sherk, H. & Fowler, G. (2000). Optic flow and the visual guidance of locomotion in the cat. In: Lappe, M. (Ed.), Neuronal Processing of Optic Flow. Academic Press.
Takemura, A., Inoue, Y., Kawano, K., Quaia, C. & Miles, F. A. (2001). Single-unit activity in cortical area MST associated with disparity-vergence eye movements: Evidence for population coding. Journal of neurophysiology, 85, 2245-2266.
Tanaka, K., Hikosaka, K., Saito, H.-A., Yukie, M., Fukada, Y. & Iwai, E. (1986). Analysis of local and wide-field movements in the superior temporal visual areas of the macaque monkey. The Journal of neuroscience: the official journal of the Society for Neuroscience, 6, 134-144.
Tanaka, K. & Saito, H.-A. (1989). Analysis of motion of the visual field by direction, expansion/contraction, and rotation cells clustered in the dorsal part of the medial superior temporal area of the macaque monkey. Journal of neurophysiology, 62, 626-641.
Xiao, D.-K., Raiguel, S., Marcar, V. & Orban, G. A. (1998). Influence of stimulus speed upon the antagonistic surrounds of area MT/V5 neurons. Neuroreport, 9, 1321-1326.
Zeki, S. (1974). Cells responding to changing image size and disparity in cortex of the rhesus monkey. The Journal of physiology, 242, 827-841.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 4

Predicting Motion: A Psychophysical Study
Lucia M. Vaina Boston University, Boston, MA, USA Harvard Medical School, Cambridge, MA, USA
Franco Giulianini Boston University, Boston, MA, USA
ABSTRACT

Assad and Maunsell (1995) found neurons in the primate posterior parietal cortex (PPC) whose response seems to be tuned to the animal's inference of the motion of a visual target. These neurons maintain an appreciable response in the absence of retinal stimulation when the visual target is inferred to move behind an occluder. In the present study we investigated psychophysically the ability of human observers to predict the position of a visual target after it disappears behind an imaginary occluder, using a task similar to that of Assad and Maunsell. The accuracy of the predictions was measured psychophysically as a function of the velocity of the target and the time the target was visible, using a forced-choice paradigm. The results were interpreted in the context of a simple model that reflects known properties of neurons in PPC.
1. Introduction

The 2-dimensional retinal images of objects and surfaces that make up our visual environment are affected by a high degree of spatio-temporal discontinuity due to the occlusions of objects and surfaces with each other. Occlusion is an obvious consequence of the fact that we live in a 3-dimensional world. However, despite the fragmented nature of the retinal images, our visual system is able to "unconsciously infer" (Helmholtz, 1910) the parts of an object that are occluded by another object, thus preserving its continuity and integrity. The inference of parts of objects that are occluded in space by another object is referred to as "amodal completion" (Michotte, Thines & Crabbe, 1964). Amodal integration (Yantis, 1995) is the temporal analogue of amodal completion; it refers to the capability of observers to perceive an object as continuing behind a surface through time. Amodal integration occurs when a moving object is occluded for a short time by a surface: despite the absence of a retinal signal, the object is still perceived as continuing behind the surface. There is emerging evidence that neurons in the posterior parietal cortex (PPC) use extra-retinal information to infer the motion of an object that is temporally occluded by a real or imaginary occluder (Assad & Maunsell, 1995; Snyder et al., 1997). Assad and Maunsell (1995) reported that neurons in PPC maintain an appreciable activity after the disappearance of a moving target that then reappears at a position consistent with the target continuously moving during the time it was not visible (like, for example, a moving object that enters and exits a tunnel). A straightforward sensory-off response could not account for this phenomenon, since almost half of these neurons were significantly more active in this task than in a condition where the target disappeared and then reappeared in the same location after some time, as if it did not move at all during the time it was not visible. The authors suggest that the difference in activity between the two tasks is related to the animal's inference of the motion of the target. The sustained activity of these neurons may provide the neural substrate for constructing an abstract representation of the motion of a visual target. The capability to create and maintain such a representation is of crucial importance for accurate and stable visual guidance, such as reaching for a moving object, motor planning, or control of driving. By using a task similar to that of Assad and Maunsell (1995), in this study we explored psychophysically the ability of human observers to infer the motion of a visual target. The display consisted of a dot moving along a circular trajectory for a fixed time and then continuing its motion behind an imaginary occluder. Observers were asked to make judgements on the position of the dot at a time Δt after it disappeared (see 2. Methods). The measured accuracy of observers' judgements can be explained by a simple integrator that collects the information on the target motion while it is visible.
2. Methods

2.1 Apparatus and stimuli

Stimuli were created and displayed on a Macintosh computer (832 × 624 pixel resolution, 75 Hz screen refresh rate). The display consisted of a 0.24° diameter white dot that moved along a black circular trajectory 8° in diameter (see Figure 1a) at a constant angular velocity.
Figure 1: (a) Stimulus setup: the circular trajectory is 8° in diameter, the fixation square in the center of the circle is 0.4° wide, and the moving dot is 0.24° in diameter. (b) Experimental procedure: the dot moves for t0 before disappearing. After a time Δt, a tag is displayed on the circumference. The tag is displaced from the position where the dot should be by a displacement angle ±α, which is adaptively adjusted from trial to trial (see text). The observer's task is to report whether the dot is ahead of or behind the tag.
The circle was displayed throughout the entire experimental run. A circular trajectory makes it possible to keep the eccentricity of the moving dot constant, avoiding the complications resulting from the retinal inhomogeneity of motion sensitivity with eccentricity. Observers were instructed to fixate the center of the display, indicated by a 0.4° wide white square. Viewing distance was 60 cm, and the angular velocities used in the experiment ranged from 37 deg/sec to 131 deg/sec. Angular velocity is expressed in degrees/sec, where "degrees" refer to angles measured on the monitor screen with respect to the center of the circle (not visual angle).

2.2 Procedure

The basic experimental procedure is schematically illustrated in Figure 1b. At time t = 0, the dot starts moving in a counter-clockwise direction along the circle with angular velocity ω and, after a time t0, it suddenly disappears. The observer is asked to assume that while the dot is not visible it continues to move as if it were behind an imaginary occluder. After a time interval Δt, a small tag appears perpendicular to the circle (see Figure 1a & b) and the observer's task is to judge whether the invisible dot is ahead of or behind the tag by pressing one of two keys. In each trial, the tag is randomly placed ahead of or behind the position where the dot should be. During the experimental run the amount of displacement is determined by an adaptive staircase, which automatically terminates when nine reversals are reached. The staircase increases the angular displacement α (see Figure 1b) after one incorrect response and decreases it after three consecutive correct responses. This procedure tracks the 74% correct response level of the psychometric function (Wetherill & Levitt, 1965). The step sizes are linearly spaced within each decade of the staircase. The "minimum detectable displacement angle" (mdda) estimated in a single experimental run was the mean of the last six reversals. The data reported show mean and standard error based upon six independent runs (i.e., the standard error plotted reflects between-runs variability). The measured mdda is an estimate of the observer's uncertainty on the angular position of the moving dot at the time Δt after it is no longer visible. The observer can only tell that the invisible dot is somewhere between "ω(Δt + t0) (the position where the dot should be) ± mdda" (see Figure 1b). Two different Δts were used: 1250 and 1850 ms. At the beginning of each trial the dot initiated its motion from a randomly chosen position on the circle. Three observers participated in the experiment, one of the authors (FG) and two naive observers (MI and DH). Mddas were measured for two different conditions: (i) the time the dot is visible (t0) and the time it is occluded (Δt) were kept constant while the angular velocity ω was varied, and (ii) ω and Δt were kept constant while t0 was varied.
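To make the adaptive procedure concrete, the sketch below implements a generic one-up/three-down staircase of the kind described above: one incorrect response makes the judgment easier (larger displacement angle), three consecutive correct responses make it harder, and the mean of the last reversals estimates the mdda. The starting value, the fixed step size (the actual experiment used step sizes linearly spaced within each decade), the simulated observer, and all names are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of a 1-up/3-down adaptive staircase; parameters and
# step-size rule are simplified assumptions, not the authors' actual code.
import random

def run_staircase(respond, start=20.0, step=2.0, max_reversals=9):
    """Track ~74% correct: increase the displacement angle after one error,
    decrease it after three consecutive correct responses.
    `respond(angle)` must return True for a correct response."""
    angle = start
    correct_in_a_row = 0
    last_direction = None                 # +1 = made harder, -1 = made easier
    reversals = []
    while len(reversals) < max_reversals:
        if respond(angle):
            correct_in_a_row += 1
            if correct_in_a_row == 3:     # three correct -> smaller (harder) angle
                correct_in_a_row = 0
                direction = +1
                angle = max(angle - step, 0.1)
            else:
                continue                  # no change yet
        else:                             # one error -> larger (easier) angle
            correct_in_a_row = 0
            direction = -1
            angle += step
        if last_direction is not None and direction != last_direction:
            reversals.append(angle)       # direction flipped: a reversal
        last_direction = direction
    return sum(reversals[-6:]) / 6.0      # mean of the last six reversals

# Example with a simulated observer whose accuracy grows with the angle:
observer = lambda a: random.random() < (0.5 + 0.5 * min(a / 20.0, 1.0))
print(run_staircase(observer))            # an estimate of the mdda (in deg)
```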
In the experiments reported here, the observers were asked to predict the position of the dot that disappears behind an imaginary occluder. The effect of a real occluder, like a surface defined by illusory contours (Kanizsa, 1979), on the accuracy of predictions will be measured in a future experiment.

2.3 Results and model

2.3.1 Results

Figures 2a & b show mddas from two observers measured as a function of angular velocity ω, for Δt = 1250 ms (filled symbols) and Δt = 1850 ms (open symbols). The graphs in Figure 2 show that for both observers the uncertainty on the angular position of the dot increases linearly with angular velocity. The uncertainties in the angular positions of the dot are higher for the longer Δt. The lines in the figure represent the best least-square fit to the data of the "mdda = m·ω" line. As we will elaborate in the next section, these lines represent the predictions of a simple integrator model and the slope m can be related to the time-decay parameter τ of the model (see next section). For observer FG the slopes are: m1250 = 0.23 (the correlation r between the data and the predictions of the best fitting line is 0.97) and m1850 = 0.37 (r = 0.94). For observer MI: m1250 = 0.27 (r = 0.99) and m1850 = 0.39 (r = 0.96). Figures 3a & b illustrate the data from two observers (DH and FG) for the condition in which the mddas were measured as a function of t0 (the time during which the moving dot was visible). The angular velocity was fixed at 75 deg/sec, Δt was 1250 ms, and t0 varied between 200 and 1750 ms. The plots show relative uncertainties, that is "mdda/α", where α = ω·Δt. For both observers, the relative uncertainty did not vary with the time t0. The horizontal dotted lines in Figure 3 represent the mean relative uncertainty for the two observers. The data presented in Figures 2 and 3 are consistent with the predictions of an integrator model discussed in the next section.
Figure 2: Mdda as a function of angular velocity for observers FG and MI. The time the dot is visible was kept constant at 750 ms and two different Δts were used: 1250 ms (solid symbols) and 1850 ms (open symbols). (a) Data from observer FG. (b) Data from observer MI. For both observers the mdda increases linearly with angular velocity. The dotted lines represent the best fitting model to the data (see text).
Figure 3: Relative uncertainty as a function of the time the dot is visible (t0) for observers DH (a) and FG (b). The angular velocity is fixed at 75 deg/sec and Δt = 1250 ms.
2.3.2 Model

Figure 4 illustrates the integrator model used to interpret the psychophysical data. The model involves two distinct sequential processes: (i) a build-up process during which the visual system collects and integrates the information about the motion of the dot and (ii) a decay process during which this information fades. The basic architecture of the model consists of a group of motion-selective units that respond to the velocity of the dot in the Middle Temporal (MT) area (Maunsell & Van Essen, 1983) and a neural unit, presumably in PPC, that integrates the signals collected by the MT neurons.
Figure 4: Schematic representation of the model used to interpret the psychophysical data. A group of MT units respond to the moving dot. The signals from these neurons are the input to a neural integrator. The response of the integrator is proportional to the angular distance covered by the dot during the time t0 while it is visible and provides a neural representation of the velocity of the target. Such a representation decays exponentially with time.
The higher the angular velocity of the dot, the larger the angle it covers and, consequently, the larger the signal collected and integrated during the time the dot is visible. Since this signal is proportional to the angle covered by the dot's motion during the time t0 while it is visible, we suggest that it provides a representation of the angular velocity ω of the moving dot. Immediately after the moving dot becomes invisible, the strength of this representation decays exponentially with time constant τ. The linking hypothesis between the behavior of the model and the observers' performance is that when, at time Δt after the dot disappears, observers make a judgment on the angular position of the dot, their response is based on the accuracy of the representation of the angular velocity at the decision time. The observers' uncertainty Δω in judging the angular velocity of the target is reflected in the psychophysically measured mddas. Let ω0 be the angular velocity of the dot. We can formulate the model as follows: at time t0, when the dot disappears, the neural integrator has collected a signal proportional to ω0. Immediately after the dot disappears, the signal starts to decay exponentially, and so does the representation of the angular velocity:

ω(Δt) = ω0 e^(-Δt/τ)

where τ is the time constant that characterizes the decay process. The model assumes that immediately after the dot disappears, that is for Δt = 0, the observer has no uncertainty on the angular velocity of the target and that her/his uncertainty increases with Δt. The observer's uncertainty Δω as a function of Δt can then be expressed as:

Δω(Δt) = ω0 - ω0 e^(-Δt/τ) = ω0 (1 - e^(-Δt/τ))     (1)
From (1) we have that at Δt = 0, Δω = 0 (i.e., no uncertainty) and for Δt → ∞, Δω = ω0. The latter defines the level of total uncertainty of the observer: when Δω = ω0, the observer's estimate of the angular velocity of the target is "ω0 ± ω0"; in other words, the observer cannot tell whether the target is moving or not. From equation (1), we can obtain the observer's uncertainty on the angular position of the dot (when its motion is masked by an illusory occluder) as a function of Δt:

Δα(Δt) = Δω(Δt) · Δt = ω0 Δt (1 - e^(-Δt/τ)) = α (1 - e^(-Δt/τ))     (2)
where α = ω0 Δt. Δα is the "minimum detectable displacement" measured in our experiment. Equation 2 makes three predictions: (i) for a fixed value of Δt there should be a linear relationship between angular uncertainty on the dot position
and angular velocity; (ii) for a fixed value of ω the uncertainty on the angular position should increase with Δt (since Δt(1 - e^(-Δt/τ)) is a monotonically increasing function of Δt); (iii) the relative uncertainty Δα/α does not depend on t0 and is constant for a given value of Δt and τ. These predictions are qualitatively confirmed by the experimental data shown in Figures 2 and 3. For both observers, the data in Figure 2 illustrate a clear linear relationship between uncertainty in angular position and angular velocity (prediction (i)), the angular uncertainties are larger for Δt = 1850 ms (prediction (ii)), and Figure 3 shows that the relative uncertainty on the angular position of the dot does not depend on t0 (prediction (iii)). The value of the time-decay parameter τ can be estimated from the data in Figure 2. The lines in Figure 2 represent the best least-square fit to the data of the Δα = m·ω line, where m is the slope. From (2), we have that m = Δt(1 - e^(-Δt/τ)), which entails that the time constant τ is related to the slope m by τ = -Δt / ln(1 - m/Δt). Table 1 reports estimates of τ for two observers and for the two different Δts tested.
        Δt = 1250 ms    Δt = 1850 ms
FG      τ = 6.15 sec    τ = 8.29 sec
MI      τ = 5.14 sec    τ = 7.81 sec

Table 1
Data from Figure 3 also allow us to estimate τ. The dotted line in Figure 3 is drawn by the equation Δα/α = k, where k is the mean relative uncertainty. From Equation (2) we have Δα/α = 1 - e^(-Δt/τ). Substituting Δα/α = k in this expression gives k = 1 - e^(-Δt/τ), from which we get τ = -Δt / ln(1 - k). For observer MI the mean τ, from the data in Figure 2, is 6.47 s ± 1.33 s; for FG the mean τ, from the data in Figures 2 and 3, is 6.74 s ± 0.78 s. In the context of the proposed model, this means that it takes several seconds for the observer's uncertainty on the angular velocity to reach 63% of its asymptotic value ω0 (when Δt = τ, Δω/ω0 = 1 - e^(-1) = 0.63). For observer FG, for example, a time constant of 6.74 sec means that his uncertainty on the angular velocity of the dot is about 14% of the total angular velocity 1 second after the dot disappeared (for Δt = 1 s, Δω/ω0 = 1 - e^(-1/6.74) = 0.14).
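As a check on the arithmetic above, the following short sketch recovers the time constants in Table 1 from the fitted slopes and reproduces the 14% figure quoted for observer FG. It simply evaluates the relations derived from Equations (1) and (2); the variable names and structure are our own and it is not the authors' analysis code.

```python
# Illustrative check of the relations derived from Equations (1) and (2);
# names and structure are our own, not the authors' analysis code.
import math

def tau_from_slope(m, dt):
    """Invert m = dt * (1 - exp(-dt/tau)) to get tau (all times in seconds)."""
    return -dt / math.log(1.0 - m / dt)

# Fitted slopes (in seconds) reported in the text for observers FG and MI.
slopes = {"FG": {1.25: 0.23, 1.85: 0.37},
          "MI": {1.25: 0.27, 1.85: 0.39}}

for observer, fits in slopes.items():
    for dt, m in fits.items():
        print(observer, dt, round(tau_from_slope(m, dt), 2))
        # Close to Table 1 (FG: ~6.14 and 8.29 s; MI: ~5.14 and 7.81 s);
        # small differences reflect rounding of the reported slopes.

# Relative uncertainty 1 s after disappearance for FG's mean time constant:
tau_fg = 6.74
print(round(1.0 - math.exp(-1.0 / tau_fg), 2))   # ~0.14, i.e. about 14%
```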
3. Discussion

The purpose of this study was to characterize psychophysically the ability of human observers to infer the motion of a moving object when it disappears behind an imaginary occluder and to relate this capability to possible neural mechanisms underlying this process. The main result of the study is that the observer's uncertainty on the position of the invisibly moving dot increases linearly with the dot's angular velocity and that the slope of the best fitting line to the data increases with the time Δt during which the dot motion is invisible (Figure 2). The second result (Figure 3) is that the relative uncertainty on the angular position of the dot is independent of the time the motion of the dot is visible (at least for values of t0 > 200 ms). We suggested that a simple neural integrator model provides a plausible neural substrate for the mechanisms mediating this task. Since MT neurons are sensitive to both the speed and the direction of the stimulus (Maunsell & Van Essen, 1983), they are a suitable candidate for providing the input to the integrator unit in the model. This unit creates a representation of the target motion that is presumably used by observers to make decisions on the target position. As defined previously, t0 is the time during which the visual system can integrate the motion information. It is important to determine the critical t0 value necessary for creating a robust representation of the target motion. In order to address this question we must measure mddas for a larger range of t0 values, including values shorter than 200 ms (the shortest t0 used in this study). Furthermore, we will also measure the minimum Δt value for which observers can make reliable predictions of the target position. Preliminary data showed that for very short Δts (of the order of about 100 ms), observers cannot do the task. This suggests that perhaps the building up of the representation of target motion continues for some time after the retinal signal is no longer available. It is possible that, during this time, the observer cannot use the representation to make decisions on the putative position of the (invisibly moving) target. It is important to point out that while the representation of the target motion is derived from retinal information, the decision process is based on extra-retinal information. That is, when observers make a decision on the target position at time (t0 + Δt), the only information available is that acquired during the time t0 (when the dot is visible). The use of extra-retinal information is ubiquitous in PPC. This region is part of the neural pathway involved in the generation of the smooth pursuit reflex, and activity in this cortical area can be modulated by attention or the intentionality of making an eye movement (Barash, Bracewell, Fogassi, Gnadt & Andersen, 1991a, b; Bushnell, Goldberg & Robinson, 1981; Goldberg, Colby & Duhamel, 1990). In the experiments reported here, the observers were
instructed to fixate at the center of the circular trajectory and not to make eye movements. However, this would not exclude the possibility that, unconsciously, one pursues the moving dot by attentionally tracking it. The pursuit reflex is a form of predictive behaviour (Barnes, 1993) and is highly dependent on the target motion: for regular, predictable motions (like periodic motions or the motion used in our task) the eye displacement closely follows the target displacement. For target motions that are not predictable, the pursuit reflex performance deteriorates (Barnes, 1993). It is thus possible to speculate that the same information is used for inferring the target position and for regulating the pursuit response. If this were the case, smooth pursuit of the visual target should not significantly improve the accuracy of the prediction of the target position in the psychophysical task we described here. In planned experiments comparing the performance of observers in the two conditions, we will specifically address this speculation. Alternatively, it is possible that the spatial position of the target is represented more abstractly by extraretinal signals in some PPC neurons. We refer to the representation as abstract because the signals contributing to it are neither sensory (the target is not visible) nor coupled with eye movements. Colby and collaborators (1995) suggested that these signals may reflect the observer's expectation that the target is at a particular spatial location or that it is moving at a specific speed, even when the target motion is actually not seen.
Acknowledgment

This work was supported in part by the NIH grant EY-2RO1-07861 to L.M.V.
REFERENCES

Assad, J. A. & Maunsell, J. H. (1995). Neuronal correlates of inferred motion in primate posterior parietal cortex. Nature, 373, 518-521.
Assad, J. A. (1998). Personal communication to L. M. V.
Barnes, G. R. (1993). Visual-vestibular interaction in the control of head and eye movement: the role of feedback and predictive mechanisms. Progress in Neurobiology, 41, 435-472.
Barash, S., Bracewell, R. M., Fogassi, L., Gnadt, J. W. & Andersen, R. A. (1991a). Saccade related activity in the lateral intraparietal area. I. Temporal properties. Journal of Neurophysiology, 66, 1176-1196.
Barash, S., Bracewell, R. M., Fogassi, L., Gnadt, J. W. & Andersen, R. A. (1991b). Saccade related activity in the lateral intraparietal area. II. Spatial properties. Journal of Neurophysiology, 66, 1176-1196.
Bushnell, M. C., Goldberg, M. E. & Robinson, D. L. (1981). Behavioural enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. Journal of Neurophysiology, 46, 755-772.
Colby, C. L., Duhamel, J. & Goldberg, M. E. (1995). Oculocentric spatial representation in parietal cortex. Cerebral Cortex, 5, 470-481.
Goldberg, M. E., Colby, C. L. & Duhamel, J. (1990). Representation of visuomotor space in the parietal lobe of the monkey. Cold Spring Harbor Symposia on Quantitative Biology, 55, 729-740.
Helmholtz, H. L. F. von (1910). Treatise on Physiological Optics. Translated by J. P. Southall, 1925. New York: Dover.
Kanizsa, G. (1979). Organization in Vision. New York: Praeger.
Maunsell, J. H. R. & Van Essen, D. (1983). Functional properties of neurons in the middle temporal visual area of the macaque monkey. I. Selectivity for stimulus direction, speed, and orientation. Journal of Neurophysiology, 49, 1127-1147.
Michotte, A., Thines, G. & Crabbe, G. (1964). Les complements amodaux des structures perceptives. Louvain: Publications Universitaires de Louvain.
Snyder, L. H., Batista, A. P. & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386, 167-170.
Wetherill, G. B. & Levitt, H. (1965). Sequential estimation of points on a psychometric function. British Journal of Mathematical and Statistical Psychology, 18, 1-10.
Yantis, S. (1995). Perceived continuity of occluded visual objects. Psychological Science, 6, 182-185.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 5

Collisions: Getting them under Control
John M. Flach Wright State University, Dayton, OH, USA
Matthew R. H. Smith Delphi Automotive Systems, Kokomo, IN, USA
Terry Stanard Klein Associates, Dayton, OH, USA
Scott M. Dittman Visteon Inc., Detroit, MI, USA
ABSTRACT

In a control system, information about a current state is compared to a reference state in order to specify an action that will bring the current state and reference state closer together. This comparison process requires a common currency among the three sources of constraint: intention, action, and information. This chapter considers the possibility that structure in an optic array might be that currency. Performance for several tasks is represented in an optical state space that helps to illustrate the confluence of the multiple sources of constraints. The results suggest that the optical criteria for collision control vary to reflect the different sources of constraint.
1. Introduction

Approach to a solid surface is specified by a centrifugal flow of the texture of the optic array. Approach to an object is specified by a magnification of the closed contour in the array corresponding to the edges of the object. A uniform rate of approach is accompanied by an accelerated rate of magnification. At the theoretical point where the eye touches the object the latter will intercept a visual angle of 180°; the magnification reaches an explosive rate in the last moments before contact. The accelerated expansion in the field of view specifies imminent collision, and it is unquestionably an effective stimulus for behavior in animals with well developed visual systems....the fact is that animals need to make contact without collision with many solid objects of their environment: food objects, sex objects, and the landing surfaces on which insects and birds alight (not to mention helicopter pilots). Locomotor action must be balanced between approach and aversion. The governing stimulation must be a balance between flow and non-flow of the optic array. The formula is as follows: contact without collision is achieved by so moving as to cancel the centrifugal flow of the optic array at the moment when the contour of the object or the texture surface reaches that angular magnification at which contact is made. (Gibson, 1958/1982, pp. 155-156)

The opening quote is from an article titled "Visually controlled locomotion and visual orientation in animals" by J.J. Gibson. That paper was one of the earliest and clearest examples where Gibson began to frame the problem of perception as part of a closed-loop system where perception and action were dynamically coupled to support goal-directed behavior. The goal of the present chapter is to show the value of a control theoretic perspective for understanding how animals solve collision problems. The following section will introduce the "comparator problem." For us, this is the "heart" of a control systems perspective. It provides a framework for conceptually parsing a control problem into three categories of constraint: value, information, and action. This chapter will discuss each of these categories of constraint and will explore some hypotheses about how animals address these constraints in solving the collision control problem.
2. The comparator problem

In a control system, the comparator is a junction with reference and feedback signals coming in and error signals coming out. In a simple system, such as the servomechanism illustrated in Figure 1, the comparator is analogous to the simple mathematical operation of subtraction. The feedback signal (specifying the current state of the system) is subtracted from the reference signal (specifying the desired state of the system) in order to get an error signal (the deviation from the goal). The error signal then drives action in the direction that will reduce the difference (bring the system state closer to the goal state). In engineering control systems an important step (that is critical in practice, but rarely explicitly acknowledged) is to convert the various signals (reference, feedback, and error) into a common (comparable) medium (e.g., electrical current). Once this is accomplished, the operation of the comparator is directly analogous to the simple mathematical operation of subtraction.
Figure 1: How is it possible for a biological system to compare perceptual feedback to intentions in order to specify appropriate corrective actions? Classically, this has been thought to require translation into a symbolic neural representation. The idea of direct perception suggests that lawful relations in perceptual arrays may support an indexical coding in the nervous system that can be fully described in terms of the perceptual referents.
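To make the subtraction idea concrete, here is a minimal closed-loop sketch. The proportional gain, the deliberately trivial "plant", and all variable names are illustrative assumptions of ours, not a model proposed in the chapter; the point is only that error = reference − feedback is what drives the corrective action.

```python
# Minimal closed-loop sketch: the comparator subtracts feedback from the
# reference, and the resulting error drives the action. The simple plant
# and the gain value are illustrative assumptions.

def simulate(reference=1.0, gain=0.5, steps=20):
    state = 0.0                      # current (fed-back) state of the system
    for _ in range(steps):
        error = reference - state   # comparator: reference minus feedback
        action = gain * error       # action proportional to the error
        state += action             # trivial plant: the action nudges the state
        print(round(state, 3))      # the state converges toward the reference

simulate()
```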
However, for animals, the three signals associated with the comparator rarely come nicely packaged in a common medium or currency that would allow simple subtraction of one from the other to produce the third. For example, the reference or intention may be to get to a meeting across town as quickly as possible without collision. The information may be patterns in an optical flow field. What does it mean to subtract the patterns from the intention? The difference from this subtraction would have to specify the actions of muscles (perhaps on control devices - steering wheel, accelerator, and brake pedal). The natural units for each of these "signals" converging at the comparator are different - desire not to be late for an important meeting and to avoid collisions, a transforming pattern of texture transduced through a retina, and a force or motion of a limb perhaps transduced through a vehicle. How does an animal translate from one medium to another in order to behave appropriately - that is, in order to behave in a way so that errors from intentions are kept within acceptable limits? Psychology has conventionally assumed that the comparator problem was solved "in the head." That is, the general notion was that the three signals (intention, feedback/perception, and error/motor command) were converted to some common symbolic neural code (reflected in terms like program, schema, mental map, mental model, gestalt, etc.). Thus, the neural symbols associated with perceptions could be "compared" with the neural symbols associated with intentions in a way that would specify the appropriate neural symbols to guide actions. Gibson, however, suggested an alternative position. The radical notion of "direct perception" suggests that the comparator problem can be solved in the light. Gibson (1958/1982) wrote:

To begin locomotion, therefore, is so to contract the muscles as to make the forward optic array flow outward. To stop locomotion is to make the flow cease. To reverse locomotion is to make it flow inward. To speed up locomotion is to make the rate of flow increase and to slow down is to make it decrease. An animal who is behaving in these ways is optically stimulated in the corresponding ways, or, equally, an animal who so acts as to obtain these kinds of optical stimulation is behaving in the corresponding ways (p. 155).

The radical aspect of Gibson's theory relative to more conventional wisdom about perception and cognition is that he was the first to imagine how intentions and actions could be specified in optical terms. For example, the intention of contacting an object can be specified as "the state in which the optical contour associated with that object fills the field of view." The action is specified as "do something to make the optical contour associated with that object expand."
And the current state is specified as the "instantaneous size and location of the optical contour in the field of view". Gibson's ideas are less radical when considered in the context of a control perspective. Perhaps the most comprehensive application of a control theoretic perspective to psychology is Powers' (1973, 1998) Perceptual Control Theory (PCT). The parallels between how Powers and Gibson describe the locomotion problem are striking:

When you learn to drive, the first thing you learn after getting the car into motion is how the road should look relative to the front of the car as you see it from the driver's seat. Somehow this image remains in your head and as you drive along you are continually comparing how the scene does look with how it should look. If the way it does look is shifted to the right of how it should look, you turn the wheel leftward until there is a match. A left shift leads to a rightward turn of the wheel. Once you learn this relationship it becomes automatic; you don't have to think it out any more. You have constructed an automatic control system that will, as long as you're looking, keep the actual perception matching the appearance you know it should have. When you do that, the car settles down into its lane and stays there. The "appearance you know it should have" is called a reference perception, or reference condition, or reference state because it is with reference to this internal information that you judge the perception as too little, too much, or just right; too far left, too far right, or dead on; too hot, too cold, or perfect. If differences exist, we call them "errors" in PCT. Error doesn't mean "blunder" or "mistake"; it means a difference between what is being perceived and what is intended to be perceived (Powers, 1998, pp. 8-9).

The implication of Powers' description of learning to drive and his choice of the term "perceptual" control theory suggest that the currency exchange that allows feedback to inform action relative to intentions is typically negotiated in the medium of perception. An important point of the PCT approach is that the variable that drives a control system (e.g., a thermostat) is not the "output" variable per se (e.g., the actual room temperature), but the measured or perceived temperature. Ideally, in a designed control system there would be a close correspondence between the output variable and the measured variable. However, for the control system, the measured temperature is the only temperature. Analogously, the animal knows nothing of the world except its own perceptions.
Of course, if those perceptions are not fairly well tuned to the actual situations in the world, the animal is not likely to survive for very long. Although Gibson and Powers both focus on perception, Powers treats perception as an "image . . . in your head," whereas Gibson focused on the structure in an optical array outside the head. Alternatively, the question of perceptual control of action has been framed as trying to learn "how the light gets into the muscles." A problem with this way of framing the problem is that it invites a simple stimulus-response (SR) view of causality. In the simple SR view of causality, perception (light) causes action (muscles). An important implication of a closed-loop coupling of perception and action is that neither perception nor action is causally prior to the other. Thus, action and perception are locked in a circular causality. This opens up the possibility that in many cases (not all) the problem is solved by putting muscles and intentions into the light. In Gibson's words from the earlier citation, "an animal so acts as to obtain these kinds of optical stimulation." Muscles can be put into the light by moving to create an optical flow field. Analogously, the dynamics of a vehicle can be put into the light by manipulating the controls, transforming the optical flow (e.g., jiggling the steering wheel or tapping the brakes). Further, we believe that by attending to the consequences of motion, animals can learn to associate those consequences with patterns in the optical flow field. This creates the possibility for intentions to be specified in optical terms. This is our understanding of "direct perception" - that the three components of the comparator problem can be specified as signals within a perceptual array (e.g., the optical flow field). Thus, there is no need to translate to or from a symbolic medium in order to close the perception-action loop. In semiotic terms, we interpret "direct perception" to be the claim that there is an "indexical" relation between properties of the optics and properties of any neural code, as opposed to a "symbolic" relation requiring some form of interpretation or logical inference. We think it would be a mistake to argue that it is always possible to close the loop within a perceptual array, that is, that there is always an indexical relation. However, we believe that there are many situations where the relation is indexical, and we believe that collision events typically fall within this category. Collision events are typically well specified in the optical flow field - creating the possibility for direct perception for an animal that is appropriately "attuned" to the optical structure. Interactions where signals in the ecology have an indexical relation to the action requirements correspond to what Rasmussen (1986) called "skill-based" processing. It is important to note that the optical states are expected to have correlated structure in the neural medium (e.g., weights in a neural network). However, the critical point of "direct perception" is that the neural medium does not introduce additional constraints, at least when we are dealing with supra-threshold phenomena - which is generally the case for control of locomotion.
which is generally the case for control of locomotion. In other words, the claim of direct perception is that the "comparison" of intention with ongoing perception to specify action can be described in terms of lawful relations (e.g., physical laws) that exist independent of any symbolic neural or cognitive process. The neural structure can be tuned to these external constraints, but the constraints exist independently of whether or not they are detected by any cognitive process. Following the lead of Gibson, the goal of this chapter is to explore the possibility that the collision problem can be solved by a control system in which the relevant constraints are specified in optical terms.
3. The intentional constraints: References, values, and consequences

Perhaps, the most significant intentional constraint dimension associated with collisions is whether the goal is to create collisions or to avoid collisions. In many sports, the goal is to create collisions, and in many cases the more violent the collision the better (e.g., an overhand smash in racquet sports, the home run swing in baseball, or the rocket volley past the goalie in soccer or hockey). In these contexts, the "value" of a collision typically increases with the velocity of the effector (racquet, bat, stick, or foot) at the point of contact. Of course, value depends on many situational factors and it is easy to imagine exceptions where soft contact can have high value (e.g., a drop volley in tennis, a bunt in baseball, or a kiss). In most vehicle control contexts, the goal is generally to avoid collisions altogether (e.g., normal driving) or to make collisions that are as soft as possible (e.g., landing an aircraft). In these contexts, the "value" of a collision is typically inversely related to the velocity of the effector at the point of contact. Another dimension of the intentional dynamics of collisions is the cost of different types of errors - for example, the costs associated with responding earlier or later than some normative ideal of the "right time." For creating collisions, as in racquet sports, there tends to be a precise space-time window associated with success. Responses that are "too early" or "too late" are relatively symmetric in terms of the negative consequences (e.g., hitting the ball out of play or missing completely). For avoiding collisions, as in vehicular control, the consequences of timing errors are typically asymmetric. That is, beginning to brake too early in response to an obstacle in the road rarely leads to serious consequences and this "error" is easily corrected once detected. Initiating the braking too late, however, can have catastrophic consequences that might not be easily corrected. The point is that the costs associated with errors are often not uniform. This may have important implications for whether a control strategy is satisfactory or not.
A wide spectrum of tasks can be found in the collision literature - from the forehand drive in table tennis to the left-hand (cross-traffic) turn in driving. It is not unusual to find all of these tasks lumped together in discussions of how animals and people control collisions, creating the implication that these are examples of the same control problem. However, from a control systems perspective, it may make a big difference whether the goal is to create a high velocity collision or to avoid collision. It would not be surprising if there were satisfactory solutions to one problem that were not satisfactory for the other. It might be valuable to consider the optical specification of collision in light of some of these different intentional constraints - the information value of an optical pattern may, in part, depend upon the intentional value against which "success" is being scored.
4. The action constraints: The plant

The term "plant" is control theoretic jargon for the physical processes that are being controlled. In the context of collision, this might be the dynamics of the human motor system or the dynamics of a vehicle such as an automobile or aircraft. These dynamics constrain the action possibilities in ways that have important implications for solutions to any control problem. This is clearly illustrated by early research on manual control, which shows that the transfer function of the manual controller varies systematically as a function of variations in the plant dynamics (McRuer & Jex, 1967; Flach, 1990; Jagacinski & Flach, 2003). In the literature on the collision problem, a wide range of "plant dynamics" has been studied (from simple head movements, to arm movements, to braking automobiles, to flare maneuvers for landing aircraft). While there are obvious quantitative differences across the range of plants studied, there is one important quality associated with many of the plants studied - inertial dynamics. Inertial dynamics reflect Newton's Second Law of Motion. That is, the acceleration or deceleration of a body in response to a force is directly proportional to the force and inversely proportional to the body's mass. In the language of control systems, the plant dynamics are second-order. For example, to stop a truck in response to a red light, a braking force is applied which causes the truck to decelerate. In order to bring the truck to zero velocity at a point in front of the red light, the braking force must be initiated well before reaching the target stopping point. How far before depends on the velocity of the truck when the deceleration is initiated and the weight of the truck (whether it is loaded or not). For a given speed, braking would need to be initiated at a farther distance from the target stopping point for a heavier truck. For a given weight, braking would need to be initiated at a farther distance from the target stopping point for faster
speeds. Of course, the state of the brakes and the state of contact with the road (reflecting surface and wheel properties) would also be important variables. The primary implication of inertial dynamics is that feedback based on the single dimension of position is inadequate. That is, there is no functional (one-to-one) relation between position and action. In other words, the control (or problem) space for inertial systems requires consideration of variables in addition to position (or distance). Typically, control engineers add velocity as a second dimension of this state or problem space. However, other dimensions might also be considered in place of or in addition to velocity (e.g., time, or higher derivatives of the motion). It is also important to understand how the inertial constraints shape the response of a system. One way to represent this is in terms of the action system's open-loop response to a fixed (constant) input. This is typically called the "step response." The logic here is that since the input is fixed [not modulated to reflect goals (value constraints) or feedback (information constraints)], the pattern of response allows inferences about the action constraints, independent of the other two sources of constraint (i.e., value or information). For example, the step response of the braking system of a car would be the response to a constant deflection of the brake. Or the step response of a bird braking would be the effects of it extending its wings and holding them in a fixed position. In both cases, the pattern of slowing down that resulted would reflect the physical laws of motion. For the same step input, given the same parameters (e.g., weight) and the same initial conditions (e.g., speed), the response would be similar, independent of what the driver or bird was intending and independent of whether the driver's or bird's eyes were open or closed. In control theoretic terms this response is open-loop. Figure 2 illustrates the response produced by a step braking input to a simple second-order process. The step input creates a constant deceleration. In terms of the analytical problem of identifying the logic of a control system, it is important that the analyst know the step response for the physical plant being investigated, so that the action constraints are not confused with properties of the control logic. So, for a simple second-order process, a constant deceleration may not have to be explained in terms of a continuous adjustment of a brake in reference to some information feedback (e.g., τ̇). The constant deceleration may come for "free" as a result of the inertial dynamics.
Figure 2: The step response of a simple second-order system.
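To make the open-loop step response concrete, here is a minimal simulation sketch (our illustration, not from the chapter; the mass, braking force, and initial speed are arbitrary assumed values). It integrates Newton's second law for a constant (step) braking force and shows that the resulting constant deceleration requires no feedback at all:

    # Minimal sketch (illustrative values only) of the open-loop step response of a
    # simple second-order (inertial) plant: a constant braking force applied to a mass.

    def step_response(mass_kg=2000.0, brake_force_n=4000.0, v0_mps=20.0, dt=0.01):
        """Integrate Newton's second law for a constant (step) braking force.

        Returns a list of (time, position, velocity) samples. Because a = F/m is
        constant, velocity falls linearly and position follows a parabola; the
        constant deceleration comes "for free" from the inertial dynamics, with no
        feedback involved.
        """
        a = brake_force_n / mass_kg            # constant deceleration (m/s^2)
        t, x, v = 0.0, 0.0, v0_mps
        history = [(t, x, v)]
        while v > 0.0:                         # integrate until the vehicle stops
            v = max(v - a * dt, 0.0)
            x += v * dt
            t += dt
            history.append((t, x, v))
        return history

    if __name__ == "__main__":
        t_stop, x_stop, _ = step_response()[-1]
        print(f"stopped after {t_stop:.1f} s and {x_stop:.1f} m")

The same constant-deceleration profile would be produced whether or not the driver was watching the road, which is exactly what makes the step response a useful baseline for separating action constraints from control logic.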
People interact with many different types of dynamic processes. Inertial dynamics are just one common example. The major point of this section is that the nature of the physical processes being controlled constrains actions and creates demands for the type of information that might provide useful guidance for directing those actions. For researchers who are trying to understand the logic of the comparator process, it will often be useful to know the step response of the physical process. This will help the researchers to differentiate the purely physical (open-loop) constraints from the constraints of the closed-loop control logic. Also, knowledge of the dynamics can be useful for generating hypotheses about the information requirements.
5. The information constraints: Optical flow fields

The action dynamics of a system create information requirements for a control system. For example, inertial dynamics mean that information about position will generally be inadequate for solving the collision problem. That is, there is no one-to-one mapping of position (i.e., distance to an obstacle) and action. In other words, the answer to the question "At what distance should I initiate braking in order to stop at the threshold of the intersection?" is "It depends, among other things, on how fast you are going!" So, the fact of inertial dynamics means that information about position is inadequate for solving the collision problem. The inertial dynamics also mean that, in principle, for a given "plant" (e.g., a given weight, given state of brakes, given road conditions, etc.), information about position and velocity is typically adequate for solving the collision problem. If the comparator problem were solved in the head, then a logical approach to the information/perception problem would be to study the psychophysical problems of distance and velocity perception. And many have taken this route. However, we will assume that the comparator problem is solved in the light. Thus, the question becomes: how are distance and velocity specified optically? A reasonable hypothesis is that distance is specified by the optical angle (θ) associated with relevant textures. For example, the distance to a lead car when driving would be specified by the optical angle associated with the taillights of that car. When the distance is great, the optical angle will be small. As the distance becomes less, the optical angle will grow. Similarly, when approaching an intersection the relevant texture elements would be the margins of the intersection - the nearer the intersection, the larger would be the optical angle associated with the margin textures. Note that the angular size is not one-to-one with distance - it also varies with the size of the reference object. However, in ecologies where size is relatively constant (e.g., the size of the lead car or the intersection does not change during an approach event), it may provide a reasonably robust indexical referent for relative distance. The speed or closure rate to a lead car might be specified as the rate of change of the optical angle, or the angular expansion rate (θ'). High angular expansion rates would be associated with high rates of closure. And low angular expansion rates would be associated with low rates of closure. Again, as with optical angle, angular expansion rate is not one-to-one with closure rate. It varies
as a function of distance, size, and velocity relative to the reference texture. Thus, for example, a constant velocity approach results in an increasing expansion rate (i.e., looming). Although the two optical variables are not perfectly correlated with position and velocity, they do span the two dimensions (position and velocity) required by the action constraints of a second-order process. Thus, it is plausible that they might provide a satisfactory indexical reference for solving the collision control problem. Whether the "noise" associated with the mapping (e.g., variability due to size) is a problem or not may depend on the degree of consistency of that dimension in the control ecology and the nature of the control errors associated with this noise.
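As a concrete illustration of these two optical variables, the following sketch (our illustration; the object size, approach speed, and starting distance are assumed values, not taken from the chapter) computes the optical angle θ and expansion rate θ' for a constant-velocity approach, along with their ratio, which approximates time-to-contact for small angles:

    import math

    # Sketch (assumed object size, speed, and starting distance) of looming under a
    # constant-velocity approach: the optical angle grows at an accelerating rate
    # even though the physical approach speed is constant.

    def optical_angle(size_m, distance_m):
        return 2.0 * math.atan(size_m / (2.0 * distance_m))

    def approach_profile(size_m=1.5, speed_mps=20.0, z0_m=100.0, dt=0.05):
        """Return (time, theta, theta_dot, theta/theta_dot) samples for the approach."""
        samples = []
        n_steps = int(z0_m / (speed_mps * dt)) - 1     # stop just short of contact
        theta_prev = optical_angle(size_m, z0_m)
        for k in range(1, n_steps + 1):
            t = k * dt
            z = z0_m - speed_mps * t                   # remaining distance
            theta = optical_angle(size_m, z)
            theta_dot = (theta - theta_prev) / dt      # expansion rate (looming)
            samples.append((t, theta, theta_dot, theta / theta_dot))
            theta_prev = theta
        return samples

    if __name__ == "__main__":
        for t, th, thd, tau in approach_profile()[::20]:   # roughly one line per second
            print(f"t={t:4.1f}s  theta={th:6.4f} rad  theta'={thd:7.5f} rad/s  tau~={tau:5.2f}s")

Running the sketch shows θ' increasing sharply as the distance shrinks, while the ratio θ/θ' tracks the remaining travel time, which is the relation exploited by the τ hypothesis discussed next.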
6. Some hypotheses about the control logic

Lee (1974, 1976, 1980) was one of the first to demonstrate how the collision problem might be solved by closing the loop in the optical array. Lee (1976) suggested that braking performance might be contingent on the optical parameter τ and its derivative τ̇. Tau is the inverse fractional rate of expansion of the optical contour associated with a collision obstacle. That is, it is the ratio of the angular projection of an object and the first derivative of this projection (angular expansion rate). Thus, although τ is a single optical invariant, it integrates the two dimensions (optical angle/distance, expansion rate/closure velocity) required by the inertial dynamics. Tau has units of time and is thus often referred to as time-to-contact. Lee's work was groundbreaking in that it provided, for the first time, a plausible control algorithm based completely on optical constraints, and it provided clear predictions that could be evaluated empirically. Unfortunately, although there has been a plethora of research papers showing patterns of performance in which control actions are initiated at the right time (as predicted by the τ hypothesis), other predictions of the τ hypothesis have not been consistent with empirical evidence. In particular, if τ and its derivatives are the optical primitives, then performance should be independent of both the speed of approach and the size of the approaching object. The reason for this is that the dimensions of speed and size cancel in the ratio of angle to expansion rate, leaving only time as the critical variable. Thus, for example, the τ hypothesis suggests that actions based on τ should occur at the same time-to-arrival independent of the approach speed to the collision obstacle. To date, every published study that we are aware of that has manipulated speed as an independent variable has found significant performance differences. In general, people respond earlier (at longer time-to-contacts) for slow speeds than for faster speeds. Likewise, every published study that we are aware of that has manipulated size of the collision obstacle has found significant
performance differences associated with that variable. In general, people respond earlier (at longer time-to-contacts) for large objects. In order to account for the consistent effects of speed and size on human performance, Smith, Flach, Dittman, and Stanard (2001) proposed an alternative to the τ hypothesis. They proposed that optical angle and its derivative, expansion rate, were independent optical primitives that provided the "direct" indexes to the distance and closure rate associated with a collision event. They proposed that control of collision could be accomplished by weighting these two sources of information to reflect the action constraints of an inertial control system. There are three significant advantages of the Smith et al. model over the τ hypothesis:

1. It can account for both a high correlation with time-to-contact and the significant effects of both speed and size.

2. It is consistent with the dynamic demands associated with control of inertial systems. It has long been known that optimal control of an inertial system can be accomplished using weighted feedback of position and velocity estimates (e.g., see Kirk, 1970). With angle providing an estimate of position and expansion rate providing an estimate of velocity, the Smith et al. model provides a nice bridge between standard control models and the perceptual analysis.

3. The weighting of the two optical primitives can provide a model for both individual differences and skill development. Both can be modeled as differential weightings of the two primitives. For example, Smith et al. showed that practice, negative transfer, and positive transfer could be predicted as a function of differential weightings of the optical primitives. Similarly, different styles of driving can be modeled as different weightings of angle and expansion rate.

Figure 3 illustrates the logic of a simple control system tuned to angle and expansion rate. Note that in this simple control system, angle and expansion rate are input to a comparison process. Typically, the other input to the comparator is called a reference or simply input. In Figure 3 the term "criterion" is used to emphasize that the reference input is a criterion for a decision process. The output of the comparator is typically called the error signal. This makes sense when the criterion function is a "target" value or state and the output of the comparator process is the difference between a current value and a target value. However, as noted in the earlier quote from Powers, the term error has a connotation that can be misleading. The term "command" is used here to emphasize that the output of the comparator process is a call to action (i.e., act in a way to make the optical feedback congruent with some criterion).
[Figure 3 block diagram labels: Criterion, Comparator, Command; feedback variables: Position (Separation) and Velocity (Closure Rate).]
Figure 3: A simple control system in which optical angle (θ) and expansion rate (θ') are fed back through a comparator which in turn modifies actions in order to satisfy some task criterion.
A useful way to visualize the comparison process is a state space diagram, as illustrated in Figure 4. The two feedback variables (angle and expansion rate) are considered the states of this system. They are the coordinates for the state space graph. Any behavior of the system can be illustrated as an event line through this state space. For example, the gray dashed lines show the optical state trajectories for balls approaching at seven different constant speeds (actually radial travel times; see the next section for explanation). An event begins near the origin (the object projects a very small optical angle and the expansion rate is also small). As the object approaches, its projected angle will grow at an increasing expansion rate. The projected angles for faster balls will grow at higher rates than for slower balls. Time is not explicitly represented in the state space. However, a diagonal line with positive slope and zero intercept represents a constant time-to-contact. Time-to-contact would be the inverse of the slope. Shallow slopes represent longer time-to-contacts and steeper slopes represent shorter time-to-contacts. For the constant speed trajectories shown by the dashed lines, points closest to the origin represent longer time-to-contacts, and time-to-contact becomes shorter as the event unfolds (moving up and to the right in state space).
Figure 4: An optical state space. Data are from Smith et al. (2001).
The control task illustrated in the state space diagram was a ball-hitting task used by Smith et al. In this task, a ball approached at a constant velocity and the participant had to release a pendulum at the "right time" to hit the oncoming ball. Velocity varied unpredictably from trial to trial. The white region running diagonally through the center of the space shows the "hit zone." If the pendulum is released when the optical variables are in this region of state space, then a collision will result. This region represents the "right time" to release the pendulum. If the pendulum is released at states corresponding to points below this region, then the swing will be too early. If the pendulum is released at states corresponding to points above this region, then the swing will be too late. The circles in Figure 4 represent average data from human participants attempting to create collisions by releasing the pendulum in order to make contact with the oncoming ball. The open circles represent performance early in practice and the solid circles represent performance later in practice. Note that the circles fall on the event trajectories for the different ball speeds. The circles represent the "states" where the pendulum was released for the different speed events. As you can see, early in practice there was a tendency for participants to release the pendulum "too early" for the slower balls. With practice, however, the participants learned to release the pendulum at states that fell within the hit zone for balls at all speeds. The solid and dashed black lines in Figure 4 represent functions fit to the performance data. The dashed lines represent simple linear functions
(θ' = λ + kθ), which we will call the λ-function. These functions may reflect the criterion that participants were using for this task. For the performance obtained early in training, the data were fit using a horizontal line. This suggests that the criterion that was being used in the comparator process was a constant value of expansion rate. That is, when the expansion rate reached a criterion value, the pendulum was released. Of course, this criterion did not completely satisfy the goal to hit all balls. Participants using this criterion would miss the slower balls. The comparator process can be described logically as follows:

If θ'(t) < λ, then wait.
If θ'(t) = λ, then swing.
If θ'(t) > λ, then too late.

where λ is a critical expansion rate value. Later in practice, the data tended to fall along a diagonal line with a positive slope and nonzero intercept. Again, this linear function may represent the criterion for pendulum release - when the state is below this line, wait; when the state reaches the line, release the pendulum:

If θ'(t) < λ + kθ(t), then wait.
If θ'(t) = λ + kθ(t), then swing.
If θ'(t) > λ + kθ(t), then too late.

where λ and k are constants. This criterion is more successful than the criterion adopted early in training, as it results in hits at all speeds. Thus, it appears that practice led to a change in which the criterion for the comparison process became better tuned to the objective demands of hitting the ball. Note that a simple τ criterion would be represented as a diagonal line with positive slope and zero intercept (λ = 0):
If θ'(t) < kθ(t), then wait.
If θ'(t) = kθ(t), then swing.
If θ'(t) > kθ(t), then too late.

where k would be the inverse of time-to-contact. This would be an excellent strategy for this task, since such a criterion could be nicely matched to the objective criterion reflected by the white region. The fact that participants used an expansion rate criterion early in practice and then adopted a criterion that was a function of both angle and expansion rate later in practice led Smith et al. (2001) to propose that angle and expansion rate are independent optical primitives. Note that this does not preclude a τ-like strategy. That is, with practice it seems possible that participants might continue to adjust their criterion until the criterion was a boundary with zero intercept that bisected the hit region (closely matching the objective for a perfect hit reflected by the grey line in the center of the hit region). Note that the optical state space provides a powerful tool for visualizing the different constraints contributing to the control problem. The dimensions of the space represent hypotheses about the relevant perceptual variables (reflecting information and action constraints). The criterion functions in this space represent the "reference" in terms of the perceptual variables (reflecting value and information constraints). Commands for action (e.g., wait or release the pendulum) can be logically associated with regions or functions in this state space. Also, the normative criteria for success can be represented as functions or regions in this space (in Figure 4, the hit window). This reflects all three sources of constraint intrinsic to the comparator problem and allows those constraints to be evaluated relative to extrinsic norms for success. Note that for the control system to be successful, the constraints intrinsic to the comparator process must correspond in a functional way to the extrinsic factors that determine success (e.g., whether the ball is actually hit). The fact that the dimensions of the state space are "perceptual" variables is consistent with the spirit of Gibson's direct perception and Powers' PCT. Figure 5 shows an optical state space with a criterion that has a negative slope. In the context of the current literature, which assumes that control of collision means responding at a "right time," this function will seem somewhat odd. However, what is the "right time" to contact when the goal is to avoid contact altogether, as is typically the case in driving? Instead of representing a
fixed time to contact, the negatively sloped function represents how imminent a collision is relative to the rate of approach (reflected in the expansion rate). For example, when the angular separation between taillights on the leading car is small and the expansion rate is low to moderate, there is plenty of time to brake. As the angular separation grows larger, the range of acceptable expansion rates diminishes. The x-intercept, where expansion rate is zero, represents the angular size of the obstacle at which you would like to be completely stopped (zero expansion means no relative motion). Thus, the criterion function in Figure 5 reflects a different value system than the criterion function in Figure 4. A control system using this criterion would initiate braking when the optical state approached the criterion function. Pressure on the brake might be adjusted so that the current state stayed at or below the critical margin. For a stationary obstacle, the x-intercept would be the angular size of the object at which the car would be stopped. For a moving object, the x-intercept would correspond to the target following distance when the following car is traveling at the same speed as the lead car.
Figure 5: Hypothetical optical criterion for braking to avoid collisions when driving. (The axes are optical angle θ and expansion rate θ', as in Figure 4.)
The criterion function in Figure 5 could be used to parse the collision space into different regions requiring qualitatively different modes of action. In the region well below and to the left of the criterion function, no braking is required:

If θ'(t) ≪ λ - kθ(t), then no braking is required.

In the region surrounding the criterion function, controlled (proportional) braking might be required to keep the state close to the criterion function:

If θ'(t) ≈ λ - kθ(t), then brake in a way to make θ'(t) = λ - kθ(t):
If θ'(t) < λ - kθ(t), then reduce brake pressure.
If θ'(t) > λ - kθ(t), then increase brake pressure.
If θ'(t) = λ - kθ(t), then maintain pressure at the current level.

In the region above and to the right of the criterion function, full (hard) braking might be required:

If θ'(t) ≫ λ - kθ(t), then apply maximum pressure to the brakes.

At the extreme of this region, it will be impossible to avoid collision through braking, and perhaps the best response would be to try an evasive maneuver or to prepare for the inevitable collision:

If θ'(t) ≫≫ λ - kθ(t), then pray.

When using the state space diagrams in Figures 4 and 5, it is easy to fall back into classical ways of thinking about causality. That is, the states and criterion functions can easily be thought of as external stimuli that cause the actions identified with the different regions (specified by the if-then statements). This is a mistake! It is important to keep in mind the circular coupling illustrated in Figure 3. The state values aren't imposed on the system from without. The states are functions of the relative motion of an observer and an obstacle - they are dynamic variables (functions of time). Motion through the state space is the consequence of action (e.g., motion of the car), so the current state (what region of the state space the system is in) is a result of previous actions (e.g., braking or not). Again, causality is circular - the states are simultaneously the input to action and the product of that action. To give causal priority to either the perceptual or the action system is to miss the whole point of the control theoretic perspective.
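The region-based logic above can be captured in a few lines of code. The following is a minimal sketch (our illustration; the criterion parameters and region margins are assumed values, not estimates from any experiment):

    # Sketch of the region-based braking logic: compare the optical state
    # (theta, theta_dot) against the hypothetical criterion line
    # theta_dot = lam - k * theta. All parameter values are assumed.

    def braking_command(theta, theta_dot, lam=0.5, k=2.0, band=0.05, hard=0.25, extreme=0.6):
        """Map the optical state onto a qualitative braking command."""
        gap = theta_dot - (lam - k * theta)    # how far above (+) or below (-) the criterion
        if gap < -band:
            return "no braking required"           # well below the criterion line
        if gap > extreme:
            return "evasive maneuver / brace"      # braking alone will not avoid contact
        if gap > hard:
            return "apply maximum brake pressure"  # well above the criterion line
        # near the criterion: proportional adjustment toward the line
        if gap > 0:
            return "increase brake pressure"
        if gap < 0:
            return "reduce brake pressure"
        return "maintain current brake pressure"

    # Example: a modest angle with a high expansion rate calls for more braking.
    # braking_command(theta=0.10, theta_dot=0.45) returns "increase brake pressure"

As in the text, the point of such a sketch is not that these rules cause the action; the optical state that enters the function is itself a product of previous actions.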
7. A look inside the head

In the previous section, the control system was designed using properties of an optical array as the primitives (state variables) for describing the control problem. This was in part motivated by our belief in the possibility of direct perception, and our confidence in this approach has been bolstered by empirical studies of collision control (e.g., Smith et al., 2001). By direct perception, we simply mean that the control problem is dominated by constraints that are outside the nervous system. This does not deny the participation of the nervous system; it simply means that it should be possible to describe the contribution of the nervous system in purely optical terms. In other words, the behavior of any neural networks involved in collision control should be highly correlated with properties of the optic array, so that a complete functional representation of the control problem can be made without reference to uniquely neural constraints. Thus, we tend to look to constraints external to the nervous system when developing models of the control task. However, that does not mean that research on neural processing is irrelevant. The discovery of neural components that are tuned to specific optical relations can be an important convergent operation in the search to identify the appropriate optical state variables for framing the collision control problem. Sun and Frost (1998) recently reported the discovery of three types of looming-selective neurons in the nucleus rotundus of pigeons. One class of neurons (τ) showed response patterns that were independent of size and speed variation. These neurons showed firing patterns (onset and peak rates) that were an invariant function of the time-to-collision. This pattern of firing suggests that the neurons were tuned to a critical ratio between the visual angle (θ) and the expansion rate (θ') of the approaching object:

θ(t)/θ'(t) = τ    or    θ'(t) = (1/τ)θ(t)    (1)

A second class of neurons (ρ) showed firing patterns that changed with both the size and speed of the approaching object. These neurons showed onsets and peak firing rates that were earlier (longer time-to-contacts) for larger objects and for slower approaches. This firing pattern suggests that these neurons were tuned to expansion rate:

θ'(t) = ρ(t)    (2)
A third class of neurons (η) also showed firing rates that changed with size and speed. Again, onset and peak firing rates were earlier for larger objects and for slower approaches. However, this class showed a decline in firing rate immediately prior to contact (when the visual angle becomes very large), suggesting that the firing rate resulted from a competition between an excitatory input associated with expansion rate (θ') and an inhibitory input associated with visual angle (θ). This combination can be described using the following function:

C θ'(t) / e^(αθ(t)) = η(t)    (3)

This equation is not as intuitive as Equations 1 and 2. However, there is a fairly simple logic to this function. The numerator reflects the excitatory contribution associated with expansion rate (θ') and the denominator reflects the inhibitory contribution associated with visual angle (θ). The exponential function in the denominator reflects the difference in time course for the excitation and inhibition. The excitation associated with expansion rate grows linearly with a gain equal to C. However, the inhibition associated with angle grows exponentially with a gain equal to α. The result is that expansion rate dominates early but is eventually overtaken by the exponentially increasing inhibition associated with visual angle. While there appear to be at least three different classes of neurons, note that there are only two "optical" primitives involved - visual angle (θ) and visual expansion rate (θ'). Thus, the different neural mechanisms may not reflect different "optical variables," but different "control laws." That is, the different neural mechanisms might represent different solutions to the comparator problem that reflect constraints in addition to the optics (i.e., intentional and dynamic constraints). The pigeons in Sun and Frost's (1998) experiment were being stimulated passively (i.e., no action was possible). A virtual soccer ball was launched at the pigeons, which were constrained by a stereotaxic device. In control theoretic terms, the loop had been cut, so in the experimental context the response had no impact on the state of the system. In this context, it is difficult to know what the pigeons were trying to do. However, one guess might be that there was a spreading activation, so that several different control circuits related to collisions were stimulated. In other words, since the control task was ambiguous, all possible control circuits were activated. Figure 6 shows the peak response data reported by Sun and Frost plotted in the optical state space. The curved gray lines extending from near the origin and rising with increasing slopes upward to the right represent the event lines for "balls" approaching at constant velocities. At the origin of each trajectory [near (0, 0)] the balls are small and not expanding (long time-to-contact); as the balls approach (moving up the trajectory), they grow larger and expand at a greater rate as the time-to-contact
gets smaller and smaller. Each of the gray curves represents a different radial travel time, where radial travel time refers to the ratio of ball radius to ball speed (radius/velocity). The optical path is uniquely determined by this ratio. Thus, a ball of radius 15 cm traveling at 250 cm/s will have a trajectory in optical state space that is identical to that for a ball of radius 30 cm traveling at 500 cm/s. The five curves reflect trajectories for a ball of 30 cm diameter approaching at five different speeds (750 cm/s, 500 cm/s, 300 cm/s, 250 cm/s, 150 cm/s). These were chosen to reflect a subset of the stimulus conditions used by Sun and Frost. The steeper (leftmost) curve represents the fastest approach (smallest radial travel time) and the shallower (rightmost) curve represents the slowest approach (largest radial travel time). The data points in Figure 6 are estimates from the data reported by Sun and Frost (1998) for the response onsets for the three classes of neurons [τ (open squares), ρ (filled triangles), and η (open circles)] that they found. The response criteria corresponding to these three mechanisms are represented by solid lines. The criterion for τ is a diagonal line with zero intercept. Thus, it is a constant time from contact. Wagner (1982) plotted the point of initiation of deceleration in approach-to-landing for house flies in an optical state space and found that the points fell along a diagonal line with a zero intercept and a slope with an inverse of 60 ms. These data are consistent with the simple τ control system. This might reflect a neural circuit tuned to intercept a moving object. The criterion corresponding to ρ is a horizontal line that reflects a constant expansion rate (θ'). Note that a critical expansion rate criterion will result in responding earlier in time to slower and/or larger objects (larger radial travel times). This trend has been observed repeatedly in studies of human performance in collision judgment and control tasks. This might reflect a control circuit tuned to evade an approaching predator. Perhaps, responding early, particularly to larger predators, might have an adaptive advantage. In this case, the consequences may not be symmetric - responding sooner may be a lot better than responding later. The criterion corresponding to η is an exponential curve in optical state space. However, the exponential curve can be well approximated as a line with nonzero intercept (dashed line). Note that the line fit to these data has a negative slope, as in Figure 5. Perhaps, this response might reflect a neural circuit tuned for braking with the goal of soft contact, as in landing. The solid black curves fit to the response data from Smith et al. (Figure 4) are η-functions. Note that for the range of events used in both Figures 4 and 6, it is difficult to differentiate between the linear (λ) and exponential (η) fits.
Figure 6: Neural response data from Sun and Frost (1998) plotted in an optical state space.
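For concreteness, the following sketch (our illustration; the gain constants C and alpha are arbitrary assumed values, not the fitted values from Sun and Frost, 1998) evaluates τ-, ρ-, and η-type quantities along a constant-velocity approach:

    import math

    # Sketch of the three response-related quantities discussed above, evaluated
    # along a constant-velocity approach. C and alpha are arbitrary assumed gains,
    # not values fitted to the pigeon data.

    def optical_state(radius_m, distance_m, speed_mps):
        """Optical angle and its analytic rate of change for a head-on approach."""
        theta = 2.0 * math.atan(radius_m / distance_m)
        theta_dot = 2.0 * radius_m * speed_mps / (distance_m**2 + radius_m**2)
        return theta, theta_dot

    def neuron_like_signals(theta, theta_dot, C=1.0, alpha=5.0):
        tau = theta / theta_dot                          # tau-type: approx. time-to-contact
        rho = theta_dot                                  # rho-type: expansion rate
        eta = C * theta_dot * math.exp(-alpha * theta)   # eta-type: excitation vs. inhibition
        return tau, rho, eta

    if __name__ == "__main__":
        radius, speed = 0.15, 2.5                        # 15 cm ball radius at 250 cm/s
        for z in (6.0, 4.0, 2.0, 1.0, 0.5, 0.25):        # remaining distances (m)
            th, thd = optical_state(radius, z, speed)
            tau, rho, eta = neuron_like_signals(th, thd)
            print(f"Z={z:5.2f} m  tau={tau:5.2f} s  rho={rho:6.3f} rad/s  eta={eta:6.3f}")

With these assumed gains, the η-type quantity rises and then declines as the ball closes in, mirroring the excitation-inhibition competition described above, while the τ-type quantity approximately tracks the remaining travel time.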
8. Summary and conclusions

Organismic thought ... is simply the reliance on analogy, something that every physicist has done from Newton forward and back. Every significant thinker in science has drawn upon useful analogies for simplifying certain stages of thought. What is important is not to stop with rough analogy when the occasion demands that we go on, but to render the analogy into a precise, explicit, and predictive model. (Weinberg, 1975, p. 31)

In this chapter, we have tried to illustrate how control theory can be used when studying the phenomena of collision control. In this regard, control theory is considered a language for building precise models and for making predictions and experimental data more explicit. As with any other modeling language or analogy, the language of control theory may illuminate some aspects of the phenomena and obscure other aspects. The analogy is not right or wrong. Rather, it is simply useful or not. Thus, we do not go so far as to claim that animals are control systems or to offer control theory as an absolute truth relative to the collision problem. We simply offer control theory as one tool for exploring the phenomena of collision control. We have found it to be a useful tool and hope that others will also find it useful. At a theoretical level, control theory has been useful to us as we struggle to understand the fundamental nature of causality in biological and
cognitive systems. Control theory provides a rich language for thinking about how perception and action might be linked through circular causal relations. We also believe this language offers unique insights into Gibson's curious notion of direct perception and how this notion might be realized in concrete models. The control diagrams (e.g., Figure 3) and if-then logical statements describing the comparator process may be useful in building models and simulations of collision control. Simulations and mathematical models can be important tools for making precise predictions to be evaluated against performance data. Finally, the state space diagrams (Figures 4, 5, & 6) provide a useful framework for visualizing data in relation to the comparator problem. These diagrams allow the relations among information, action, and value constraints to be visualized. If the dimensions for this space are chosen correctly, then performance data plotted in this space should help to make the functional relations involved in a comparator process visible. Currently, we have found the dimensions of angle and expansion rate to be useful. However, it might also be interesting to examine other optical (e.g., τ × τ̇) or non-optical (e.g., distance × velocity) state spaces to see whether these spaces add additional insights into the process. At the end of the day, it is important to remember that in this case the phenomenon of natural collision behavior is king (how animals act to create and avoid collisions). No analogy should usurp this throne. The language of control theory is simply offered as a tool to help researchers better serve this king. It is offered not as an "answer" to the collision problem, but as a productive way to frame interesting questions about these phenomena.
REFERENCES

Flach, J. M. (1990). Control with an eye for perception: Precursors to an active psychophysics. Ecological Psychology, 2, 83-111.
Gibson, J. J. (1958/1982). Visually controlled locomotion and visual orientation in animals. British Journal of Psychology, 49, 182-194.
Jagacinski, R. J. & Flach, J. M. (2003). Control theory for humans. Mahwah, NJ: Erlbaum.
Kirk, D. E. (1970). Optimal control theory: An introduction. Englewood Cliffs, NJ: Prentice-Hall.
Lee, D. N. (1974). Visual information during locomotion. In R. B. McLeod & H. Pick (Eds.), Perception: Essays in honor of J. J. Gibson (pp. 250-267). Ithaca, NY: Cornell University Press.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. (1980). Visuo-motor coordination in space-time. In G. E. Stelmach & J. Requin (Eds.), Tutorials in motor behavior (pp. 281-293). Amsterdam: North-Holland.
McRuer, D. T. & Jex, H. R. (1967). A review of quasi-linear pilot models. IEEE Transactions on Human Factors in Electronics, 8, 231-249.
Powers, W. (1973). Behavior: The control of perception. New York, NY: Aldine.
Powers, W. (1998). Making sense of behavior. New Canaan, CT: Benchmark.
Rasmussen, J. (1986). Information processing and human-machine interaction: An approach to cognitive engineering. New York, NY: North-Holland.
Smith, M. R. H., Flach, J. M., Dittman, S. M., & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception and Performance, 27, 395-410.
Sun, H. & Frost, B. J. (1998). Computation of differential optical variables of looming objects in pigeon nucleus rotundus neurons. Nature Neuroscience, 1, 296-303.
Wagner, H. (1982). Flow-field variables trigger landing in flies. Nature, 296, 147-148.
Weinberg, G. M. (1975). An introduction to general systems thinking. New York, NY: Wiley.
Time-to-Contact - H. Hecht and G.J.P. Savelsbergh (Editors)
© 2004 Elsevier B.V. All rights reserved
CHAPTER 6

Optical Information for Collision Detection during Deceleration
George J. Andersen University of California, Riverside, CA, USA
Craig W. Sauer University of California, Riverside, CA, USA
ABSTRACT

In this chapter we will review the empirical and theoretical research on collision detection during deceleration. Most research on this issue has examined the time derivative of τ. An alternative to τ̇ is presented that utilizes perceived speed and distance information for estimating collisions during deceleration. Finally, a general model for collision detection is presented that accounts for the detection of collision events for linear or non-linear paths and for constant or varying speeds.
An important perceptual task is detecting and perceiving collision events. Many of the chapters in the present volume focus on the usefulness of the optical variable τ, which specifies the time to contact during an impending collision. The focus of this chapter is to review and discuss the time derivative of τ, or τ̇, which can specify a collision during deceleration. We will describe the theoretical usefulness of this information for regulating speed during braking, and review the empirical research on this optical variable. We will also present an alternative model to τ̇, the constant deceleration analysis. Finally, we will discuss the broader issue of optical information for detecting collisions, including a general model for collision detection.
1. The optical variable τ and TTC

One source of information that has been the focus of research is the optical variable τ. Consider an observer approaching a static object at a fixed speed. Under these conditions the visual angle of the object will expand. Lee (1976, 1980) provided a formal analysis of this situation and showed that information indicating the time to contact (TTC) was specified by the inverse rate of expansion of the object. Formally, this information, which is referred to as τ, is specified as

τ(t) = Z(t)/V(t) = r(t)/v(t)    (1)

where Z(t) is the distance between the observer and object, V(t) is the velocity of the approaching object, r(t) is the optical projection of the object, and v(t) is the rate of optical expansion. τ information is useful for specifying collisions in which the observer or object is approaching at a constant rate of speed. In addition to time to contact, Lee also determined a functional relationship potentially useful for collision avoidance during deceleration (see Figure 1). The time derivative of τ, referred to as τ̇, defines a critical value for accurate deceleration to avoid a collision with an approaching object. If τ̇ is less than -0.5, then the locomoting observer will collide with the object. If τ̇ is greater than -0.5, then the observer will come to a stop (velocity will go to zero) before colliding with the object. Thus, the regulation of τ̇ could be used to control braking behavior, as well as provide information that may be potentially useful for determining the severity of an impending collision.
Figure 1: Relationship between deceleration and time (in arbitrary frame units) for different values of τ̇.
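To make the τ̇ criterion concrete, here is a minimal sketch (our illustration; the speeds, distances, and candidate decelerations are arbitrary assumed values). For a constant deceleration a, τ(t) = Z(t)/V(t) has the time derivative τ̇ = aZ/V² - 1, so τ̇ = -0.5 exactly when the current deceleration is just sufficient to stop at the object:

    # Sketch of Lee's tau-dot criterion under constant deceleration (assumed values).

    def tau_dot(distance_m, speed_mps, decel_mps2):
        """Instantaneous tau-dot for the current state under constant deceleration a."""
        return decel_mps2 * distance_m / speed_mps**2 - 1.0

    def collision_predicted(distance_m, speed_mps, decel_mps2):
        """tau-dot < -0.5 implies contact before velocity reaches zero."""
        return tau_dot(distance_m, speed_mps, decel_mps2) < -0.5

    if __name__ == "__main__":
        Z, V = 50.0, 20.0                    # 50 m from the obstacle, closing at 20 m/s
        for a in (2.0, 4.0, 6.0):            # candidate constant decelerations (m/s^2)
            td = tau_dot(Z, V, a)
            outcome = "collision" if collision_predicted(Z, V, a) else "safe stop"
            print(f"a={a:.1f} m/s^2  tau_dot={td:+.2f}  ->  {outcome}")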
2. Empirical studies on τ̇

Recent empirical studies on τ̇ have focused on three different methodologies to examine the usefulness of τ̇. One methodology is to present observers with a display simulating approach to an impending collision with a static object, and require the observer to decelerate the approach such that observer velocity reaches zero at the object. Thus the observer actively controls the reduction in speed to a stop. Two other methodologies involve the presentation of displays simulating an impending collision with the observer's speed decelerating. During the trial the display disappears prior to the end of the event (either collision with a static object or reaching zero velocity before reaching the object). In one methodology observers are asked to judge the severity of an impending collision. In a second methodology observers are asked to detect whether or not a collision will occur. All three of these methodologies have been used to determine whether observers use τ̇ information in judging collision events during deceleration. In a series of experiments Yilmaz and Warren (1995) examined the ability of subjects to accurately regulate τ̇ when decelerating to a stop. Subjects
were presented displays that simulated forward motion over a ground plane towards three signs located in the path of locomotion. As soon as the display started, subjects were asked to immediately reduce their forward speed using a computer mouse configured as a braking system. Subjects were asked to stop as close as possible to the signs and to continuously regulate their braking. It is important to note that τ̇ is a dimensionless variable that is not dependent on the size of an approaching object or the ground texture of the surrounding scene. To examine whether the size of the approaching object and the ground texture are important, Yilmaz and Warren examined four different display conditions. In one condition, referred to as the air variable-size condition, the simulated approach was in empty space and the size of the signs was varied across trials. In a second condition, referred to as the air constant-size condition, the simulated approach was in empty space and the size of the signs was constant. A third condition, the ground variable-size condition, consisted of a ground checkerboard texture pattern with the size of the signs varied across trials. The final condition consisted of the checkerboard texture pattern and the size of the signs was constant (referred to as the ground constant-size condition). If τ̇ is the only source of information used for the detection of collision events, then performance should not vary according to the size of objects or the texture of the ground. In addition to ground texture and object size, initial TTC varied between 3.0 and 5.0 sec. An analysis of the pattern of braking performance indicated that subjects continuously regulated braking (i.e., did not slam the brake at the end of the trial or undergo a series of rapid decelerations or bang-bang control) on 83% of the trials. With the exception of very short TTC (3.0 sec), there was no effect of variations in the size of the objects or in the presence of a ground plane on the mean stopping distance or in the standard deviation of stopping distance. An analysis of observers' regulation of τ̇ during each trial indicated that the mean τ̇ value regulated was close to the critical value of -0.5, with no significant difference across variations in the size of the objects or in the presence of a ground texture. These results suggest that subjects regulated their speed consistent with τ̇. Kim, Turvey, and Carello (1993) examined whether τ̇ could be used to judge the severity of an impending collision (also see Kaiser & Phatak, 1993). Subjects were shown displays simulating approach to a square defined by 300 dots. Following a period of approach to the square, the display disappeared and subjects were asked to judge whether the impending event was an upcoming soft collision (τ̇ ≥ -0.5) or an upcoming hard collision (τ̇ between -0.5 and -1.0). A number of variables were examined, including initial velocity, starting distance from the square, display duration, and the number of frames. The results indicate that categorical judgments of soft and hard collision were determined by whether the value of τ̇ was greater than or less than the critical
value of -0.5, suggesting that observers use τ̇ in judging the severity of an impending collision. Andersen and colleagues (Andersen, Cisneros, Atchley & Saidpour, 1999; Andersen, Saidpour, Cisneros & Atchley, 2000) examined the use of τ̇ in detecting an impending collision. Observers were presented with a computer simulation of a roadway scene with 3 stop signs located in the center of the roadway at an initial fixed distance from the observer's viewpoint. The displays simulated constant velocity towards the signs for 6 sec followed by a deceleration. The rate of deceleration was systematically varied to result in reaching zero velocity at 2 eyeheights in front of the signs, 1 eyeheight in front of the signs, zero velocity at the signs, zero velocity 1 eyeheight behind the signs, or 2 eyeheights behind the signs. The display was blacked out prior to the end of the event, and observers were asked to indicate whether or not a collision would occur (prediction motion task). The results indicated that the proportion of collision judgments varied according to the different rates of deceleration. Since these results were mathematically related to different τ̇ values, the results support the use of τ̇ for collision detection. However, in follow-up experiments the effects of other perceptual factors were examined. In one experiment the size of the signs was varied, as well as the initial speed of approach. Since τ̇ is not dependent on these variables, neither of these factors should affect performance if observers use only τ̇ information to detect a collision. The results indicated a significant shift in the likelihood of reporting a collision, with larger signs resulting in an increased probability of reporting a collision. In another experiment, the initial velocity and rate of deceleration were constant but the edge rate of the display, information previously shown to be used to judge egospeed (Larish & Flach, 1990; also see Denton, 1980), was varied. The results indicated a significant shift in the likelihood of reporting a collision - a result not predicted by an analysis based solely on τ̇. Andersen et al. (1999) proposed an alternative analysis to τ̇ based on an analysis of available distance information during deceleration. They noted that the critical value of τ̇ = -0.5 is the only value at which deceleration is constant, and suggested that observers might use an analysis based on constant deceleration. According to the constant deceleration model, an observer would compare estimates of the perceived distance from constant deceleration and velocity with estimates of the perceived distance from perceived size. Formally, they refer to this derivation as d_diff:

d_diff = d_v - d_s    (2)
where d_v is the distance estimate derived from velocity and deceleration (the distance at which velocity, given the current rate of deceleration, reaches zero) and is specified by
d_v = 0.5 v_0²/a    (3)
where v_0 is the instantaneous velocity of observer motion and a is the deceleration of the observer. d_s is the distance estimate derived from size and visual angle and is specified by

d_s = s/tan(θ)    (4)
where s is the size of the object in the scene and θ is the visual angle of the object. How might this information be used for the regulation of braking? If d_diff > 0, then the distance based on velocity and constant deceleration exceeds the distance based on size and visual angle. Values greater than zero thus specify that the observer will not stop in time at the current rate of deceleration (i.e., a collision will occur) and that the observer should increase the rate of deceleration. If d_diff < 0, then the distance based on velocity and constant deceleration is less than the distance based on size and visual angle. Values less than zero thus specify that the observer will reach zero velocity before the object (a collision will not occur) and that maintaining the current rate of deceleration will result in a safe stop. If this information were used, then variations in the size of objects and in forward speed would influence collision judgments, as increased size of objects and increased speed of approach affect the perceived distance to those objects. In a series of experiments Andersen and colleagues found that collision objects with greater size and motion at higher speeds resulted in an increase in the proportion of collision judgments, providing support for the constant deceleration analysis. It is important to note that τ̇ is based on the rate of expansion of the object and thus is independent of size and speed information. Since τ̇ and the deceleration analysis are based on independent sources of information, observers could use information from an analysis of τ̇, an analysis of deceleration, or both.
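The constant deceleration analysis can be summarized in a few lines of code. The following is our sketch under assumed units and values; the perceptual estimates of speed, deceleration, size, and visual angle are treated here as noiseless inputs:

    import math

    # Sketch of the constant deceleration analysis with assumed, noiseless inputs.
    # d_v is the distance needed to reach zero velocity at the current deceleration;
    # d_s is the distance to the object recovered from its size and visual angle.

    def stopping_distance(v0_mps, decel_mps2):
        return 0.5 * v0_mps**2 / decel_mps2          # d_v = 0.5 v_0^2 / a

    def distance_from_angle(size_m, theta_rad):
        return size_m / math.tan(theta_rad)          # d_s = s / tan(theta)

    def d_diff(v0_mps, decel_mps2, size_m, theta_rad):
        """Positive values predict a collision at the current rate of deceleration."""
        return stopping_distance(v0_mps, decel_mps2) - distance_from_angle(size_m, theta_rad)

    if __name__ == "__main__":
        v0, a, size = 20.0, 3.0, 0.6                 # 20 m/s, 3 m/s^2, 0.6 m wide sign
        theta = math.atan(size / 50.0)               # sign currently 50 m away
        diff = d_diff(v0, a, size, theta)
        print("collision predicted" if diff > 0 else "safe stop predicted",
              f"(d_diff = {diff:+.1f} m)")

In this example the stopping distance (about 67 m) exceeds the recovered object distance (50 m), so the sketch predicts a collision unless deceleration is increased.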
3. Distribution analysis of τ̇

The research by Yilmaz and Warren (1995) suggested that observers regulate deceleration to produce τ̇ values around the critical value of -0.5. This conclusion was based on examining the mean value of τ̇ used to regulate speed. An alternative analysis is to examine histograms of the proportion of time spent regulating at each τ̇ value. If drivers regulate deceleration using the critical value of τ̇, then a
large proportion of time should be spent regulating near the critical τ̇ value of -0.5. In addition, if observers use only τ̇ information, then braking histograms should not be affected by other factors such as the size of the signs. In contrast, the constant deceleration model depends on distance estimates derived from factors such as size, and on information such as edge rate for specifying observer speed. Thus, any effect of these factors on the braking histograms (e.g., a shift in the distribution of τ̇ values) would be evidence in support of the use of the deceleration model. We recently conducted an experiment to determine whether variations in object size would alter the use of τ̇ in a closed-loop control paradigm. The stimuli were similar to those used in Andersen et al. (1999). Observers were presented with displays simulating forward motion through a 3D scene consisting of a ground plane and stop signs. Each display consisted of two phases. During the first phase the simulated initial speed was constant for a 6.0 sec period. During the second phase, indicated by a warning tone, the observer was asked to use a foot pedal to stop just before the stop sign. The control dynamics of the brake were the same as those used in the Yilmaz and Warren (1995) study.
Figure 2: Percentage of total time braking was regulated as a function of τ̇ and sign size (stop sign sizes of 200 and 400 units).
τ̇ values were derived for each subject in each condition at a sampling rate of 30 Hz. These values were then plotted as a relative frequency histogram to determine the percentage of time controlling at the critical value of -0.5 and to determine whether the distribution was altered by the size of the signs. The histograms are shown in Figure 2. The results revealed two important findings. First, although the greatest proportion of time regulating τ̇ was at the critical value of -0.5, it accounted for only approximately 17% of the total control initiated. The second important finding is that variations in the size of the signs shifted the distribution, with a larger sign shifting the control of deceleration via τ̇ to more positive values. These results suggest that other factors, such as the size of approaching objects, can alter the use of τ̇.
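A minimal sketch of this kind of analysis (hypothetical variable names, not the authors' code) is shown below: τ̇ is estimated by numerically differentiating τ samples taken at 30 Hz, and the estimates are binned into a relative frequency histogram.

```python
import numpy as np

def tau_dot_histogram(tau_samples, sample_rate=30.0, bins=None):
    """Relative-frequency histogram of tau-dot from tau samples (a sketch).

    tau_samples : 1-D array of tau estimates recorded during braking (seconds)
    sample_rate : sampling rate in Hz (30 Hz in the experiment described)
    """
    dt = 1.0 / sample_rate
    tau_dot = np.diff(tau_samples) / dt          # numerical derivative of tau
    if bins is None:
        bins = np.arange(-1.0, 0.11, 0.11)       # bins roughly matching Figure 2
    counts, edges = np.histogram(tau_dot, bins=bins)
    rel_freq = counts / counts.sum()             # proportion of total control time per bin
    return edges, rel_freq

# Example with synthetic data: tau decreasing at a constant rate of -0.5
tau = 4.0 - 0.5 * np.arange(0, 180) / 30.0
edges, rel_freq = tau_dot_histogram(tau)
print(np.round(rel_freq, 2))
```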
4. Visual information and attentional factors in assessing collision threat
It is important to note that τ and the deceleration analysis specify information for detecting only one type of collision event - an impending collision along a linear trajectory during deceleration. These conditions represent only a small number of the circumstances that can result in a collision. For the remainder of the chapter we will discuss the other collision events that an observer must be able to detect and the information available to the observer for detecting these events. We will present an information processing model of how an observer might use this information to detect collisions. The perceptual and attentional factors for the detection and avoidance of collisions can best be understood in terms of the variables that define potential collisions. One variable is whether the collision target is stationary or moving. For example, a potential collision threat can be present when static objects are in the path of locomotion (e.g., objects that have fallen on the roadway on which one is driving) or when other vehicles are stopped on the roadway (e.g., stopped at a traffic light in the driver's lane). Collisions can also occur when objects are approaching the observer (e.g., when a vehicle traveling in the opposite direction on a two lane road crosses the center dividing line). A second variable is whether the motion of the driver's vehicle and/or the collision object is at a constant speed or a variable speed (accelerating or decelerating). For example, during freeway driving under congested conditions a driver may need to regulate speed to avoid colliding with the car ahead, or a driver may need to maintain a minimum safe distance when performing a lane change to pass a vehicle. A third variable is whether the path of motion of the driver's vehicle and/or the collision object is linear or curvilinear. For example, initiating a left hand turn when traffic in the opposing lane is present requires perceptual judgments of speed and path of motion to avoid a collision with oncoming
traffic. Finally, a fourth variable is whether there is a single collision threat or multiple collision threats within the scene. For example, when driving in heavy rush hour traffic multiple vehicles are present and speeds vary slightly; under these conditions the driver must determine whether he or she is on a potential collision path with any moving vehicle in the immediate vicinity. These four variables (stationary or moving objects, constant or variable speed, linear or curvilinear paths, and single or multiple collision threats) define all the parameters that specify whether or not a collision is imminent under driving conditions. In order to model the perceptual and attentional information specifying an impending collision one must determine the visual information available to the observer for detecting collisions under these parameters.
5. Visual information for detecting collisions with multiple moving objects, linear trajectories and constant velocity
Figures 3 and 4: Visual information for detecting collisions when multiple moving objects are present and the objects are undergoing motion along a linear trajectory at a constant velocity.
In this section we consider the visual information for detecting collisions when multiple moving objects are present and the objects are undergoing motion along a linear trajectory at a constant velocity. Under these conditions a collision event is defined by two sources of information. The first source is that the collision object will be expanding or looming. The rate of expansion is directly specified by the projective geometry of the scene and is a result of the decrease in distance between the observer and the collision object. As discussed above, this source of information has been studied extensively in specifying TTC. The second source of information is that the object will maintain a constant angular direction (θ), or fixed position in the visual field (see Kaiser & Mowafy, 1993;
and Bootsma & Oudejans, 1993). This information also is a consequence of projective geometry and results from the ratio of the 3D horizontal velocity component of the object to the 3D velocity of the observer remaining constant over time. All other, non-collision objects will have an angular direction that varies over time. It is important to note that the detection of collision events when multiple moving objects are present is specified by the combination of both sources of information being present. Objects in the scene can have a fixed position but not be on a collision path - for example, an object whose motion has the same speed and trajectory as the observer's motion. Similarly, an object can be expanding but not be on a collision path - for example, an object that is decreasing in distance from the observer but is moving laterally across the observer's field of view. These observations suggest that successful performance in detecting collisions when multiple moving objects are present involves determining that a particular object is expanding and that its angular direction is constant. What processes might be involved in performing this task? One model that might account for detection performance is that the visual system has separate mechanisms for recovering expansion and angular position information. The output of these processes could provide input to a specialized mechanism for detecting a collision when multiple moving objects are present. Such a specialized detector would not be activated unless both sources of information are present. If such a process exists early in the visual system its activation could occur automatically and in parallel. If collision detection is based on the activation of a low-level mechanism then we would expect an object combining both sources of information to "pop out" of the visual field, with rapid detection of the object regardless of the number of objects in the scene. The presence of such a low-level "automatic" mechanism would be of great benefit to an organism for tasks involving collision avoidance.
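The two defining properties - image expansion together with a constant angular direction - can be checked jointly, as in the following sketch (hypothetical thresholds and variable names; the chapter does not specify an implementation):

```python
def on_collision_course(angles, visual_angles, eps_bearing=1e-3, eps_expansion=1e-4):
    """Flag an object as a collision threat if its bearing is (near) constant
    while its image is expanding. A sketch, not the authors' model.

    angles        : sequence of angular directions of the object over time (radians)
    visual_angles : sequence of visual angles (image sizes) over time (radians)
    """
    bearing_change = max(angles) - min(angles)          # ~0 for a constant bearing
    expansion = visual_angles[-1] - visual_angles[0]    # > 0 for a looming object
    return bearing_change < eps_bearing and expansion > eps_expansion

# Constant bearing (0.2 rad) with a growing image -> collision threat
print(on_collision_course([0.2, 0.2, 0.2], [0.010, 0.012, 0.015]))    # True
# Growing image but changing bearing (object passing by) -> no threat
print(on_collision_course([0.2, 0.25, 0.31], [0.010, 0.012, 0.015]))  # False
```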
Figure 5: Sensitivity of observers (d') as a function of the number of objects and display duration (2.0, 3.25, 4.5, and 5.75 sec). Performance decreased linearly as a function of the number of moving objects, suggesting that drivers use a serial visual search to detect collisions.
An alternative model is that the visual system has separate mechanisms for recovering expansion and angular position, but no early-level detector sensitive to both types of information. Rather, a central process may exist that searches for the presence of objects that are expanding and that have a constant angular position. Such a search would likely involve a process of attending to a particular object and determining whether that object is both expanding and maintaining a fixed angular position. If a central limited-capacity process is needed, then the detection of collision objects would not be automatic and performance would decrease with an increase in the number of non-collision objects present in the scene. Thus, an important distinction between the two models is whether collision detection, when multiple moving objects are present, is based on the activation of low-level automatic processes or requires the activation of a central process that involves a limited-capacity visual search. What evidence exists for the role of a central process in visual search? We recently conducted an experiment that examined the detection of collision events when multiple moving objects were present in the scene. Observers were presented with displays simulating a 3D environment consisting of a roadway with multiple moving objects (see Figure 5). On some trials one of the objects
was on a collision path with the observer. The observer's task was to determine whether any object was on a collision path. We examined the ability of observers to detect collisions using a signal detection paradigm. The results indicated that sensitivity in detecting collisions, for nine subjects, decreased with an increase in the number of non-collision objects in the display. In addition, sensitivity increased with a decrease in time to contact. These results indicate that the detection of collision events with multiple moving objects is not performed in parallel but requires a visual search.
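In a yes/no signal detection paradigm of this kind, sensitivity is typically computed from hit and false alarm rates. The calculation below is a standard one and is not taken from the chapter, which does not give formulas:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Standard signal-detection sensitivity index d' = z(H) - z(FA).

    A log-linear correction is applied so that rates of 0 or 1 do not
    produce infinite z-scores. This is a generic calculation, not the
    authors' analysis code.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hit_rate) - z(fa_rate)

# Example: 40 collision-present trials (34 hits) and 40 collision-absent trials (6 false alarms)
print(round(d_prime(34, 6, 6, 34), 2))
```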
6. Visual information for detecting vehicle collisions with multiple moving objects, curvilinear trajectories, and constant velocity
The information for detecting vehicle collisions under these conditions is similar to the information available for multiple moving objects, linear trajectories, and constant velocity. A collision event will contain image expansion information but the angular direction will decrease over time, approaching zero in the limit. When expansion of the object is present and the angular direction is zero a collision will occur. Thus, the critical information in this situation is the detection of the angular direction going to zero.
7. Visual information for detecting vehicle collisions with multiple moving objects, curvilinear trajectories, and variable velocity
The information for detecting vehicle collisions under these conditions is similar to the information available for multiple moving objects, curvilinear trajectories, and constant velocity. Under these conditions a collision event is defined as an angular direction that will go to zero in the limit and a τ̇ value that is greater than -0.5. Any object containing both sources of information will be a collision event. Objects can contain either source of information alone and not be a collision event.
8. General collision detection analysis: The REACT model
The scenarios reviewed above indicate that there are a variety of conditions in which collisions can occur. We propose the following model for the assessment of collision threat, which we call Respond to Assessment of Collision Threat (REACT). This model can detect collisions under all the scenarios discussed above. A framework of the REACT model is presented in Figure 6. According to the model, the observer selects an object in the scene and derives its angular direction α in the visual field. If α or dα/dt does not equal zero then the object is not on a collision path with the vehicle. If α or dα/dt equals zero then the object is a potential collision threat and further analysis is conducted. The additional analyses include deriving τ, τ̇, and ddiff. If τ is not greater than zero the object is not a collision object. If τ is greater than zero then two parallel analyses are performed (τ̇ and ddiff) to provide the system with redundant information regarding collisions. This is an important step in the model, as human observers have been shown to use both sources of information for collision detection. If τ̇ and ddiff are zero then the object is on a collision path at a constant velocity. If τ̇ is greater than -0.5 and ddiff is greater than zero then the object is accelerating or decelerating but the rate of speed change is such that a collision will occur (e.g., the rate of deceleration is such that the object will not reach zero velocity before colliding with the vehicle). If τ̇ is not greater than or equal to -0.5 and ddiff is not greater than zero then the potential collision object is decelerating at a rate such that it will stop before colliding with the vehicle.
Figure 6: The REACT model, an information processing model indicating the specific sources of information that an observer may use for the detection of collision events. [Flowchart: select object → derive α and dα/dt (angular direction) → is α = 0 or dα/dt = 0? (constant angular direction) → is τ̇ = 0? (constant expansion) → is dv - ds = 0?]
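The decision logic described above can be summarized in a short sketch. The thresholds, ordering, and return labels below are an interpretation of the text, not the authors' specification:

```python
def react_assessment(alpha, d_alpha_dt, tau, tau_dot, d_diff, eps=1e-3):
    """Sketch of the REACT decision sequence for a single selected object.

    alpha      : angular direction of the object (radians)
    d_alpha_dt : rate of change of angular direction (radians/s)
    tau        : time-to-contact estimate from relative expansion (s)
    tau_dot    : rate of change of tau
    d_diff     : dv - ds, stopping distance minus perceived distance
    """
    if abs(alpha) > eps and abs(d_alpha_dt) > eps:
        return "not on a collision path"
    if tau <= 0:
        return "not a collision object"
    # Parallel, redundant analyses of tau_dot and d_diff
    if abs(tau_dot) < eps and abs(d_diff) < eps:
        return "collision path at constant velocity"
    if tau_dot > -0.5 and d_diff > 0:
        return "speed is changing but a collision will still occur"
    return "object will stop before a collision occurs"

print(react_assessment(alpha=0.0, d_alpha_dt=0.0, tau=3.0, tau_dot=-0.3, d_diff=5.0))
```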
9. Conclusions
In this chapter we have focused on the theoretical and empirical research on detecting collisions during deceleration. We have reviewed two different analyses for collision detection under these conditions - τ and the constant deceleration analysis. We also reviewed the empirical research examining collision detection during deceleration. It is important to note that the use of τ or the constant deceleration analysis is relevant only for detecting collisions during linear trajectories and when speed is varied. These conditions represent only a small subset of the scenarios in which collisions can occur. An exhaustive analysis of all impending collisions suggests that collision events are defined by four different variables - stationary or moving objects, constant or variable speed, linear or curvilinear paths, and single or multiple collision threats. A general model, the REACT model, was proposed that describes the information available to an observer for detecting collisions under these circumstances. An important goal for future research will be to assess observers' sensitivity to the different sources of information proposed in the model.
REFERENCES
Andersen, G. J., Cisneros, J., Atchley, P. & Saidpour, A. (1999). Effects of speed and edge rate in the detection of collision events. Journal of Experimental Psychology: Human Perception and Performance, 25, 256-269.
Andersen, G. J., Cisneros, J., Saidpour, A. & Atchley, P. (2000). Age-related differences in collision detection during deceleration. Psychology and Aging, 15, 241-252.
Atchley, P. & Andersen, G. J. (1998). The effects of age, retinal eccentricity, and speed on the detection of optic flow components. Psychology and Aging, 13, 297-308.
Bootsma, R. J. & Oudejans, R. R. (1993). Visual information about time-to-collision between two objects. Journal of Experimental Psychology: Human Perception and Performance, 19, 1041-1052.
Denton, G. G. (1980). The influence of visual pattern on perceived speed. Perception, 9, 393-402.
Kaiser, M. K. & Mowafy, L. (1993). Optical specification of time-to-passage: Observers' sensitivity to global tau. Journal of Experimental Psychology: Human Perception and Performance, 19, 1028-1040.
Kaiser, M. K. & Phatak, A. N. (1993). Things that go bump in the light: On the optical specification of contact severity. Journal of Experimental Psychology: Human Perception and Performance, 19, 194-202.
Kim, N., Turvey, M. T. & Carello, C. (1993). Optical information about the severity of upcoming contacts. Journal of Experimental Psychology: Human Perception and Performance, 19, 179-193.
Larish, J. F. & Flach, J. M. (1990). Sources of information useful for perception of speed of rectilinear self-motion. Journal of Experimental Psychology: Human Perception and Performance, 16, 295-302.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. (1980). The optic flow field: The foundation of vision. Philosophical Transactions of the Royal Society of London, Series B, 290, 169-179.
Yilmaz, E. H. & Warren, W. H. (1995). Visual control of braking: A test of the τ̇ hypothesis. Journal of Experimental Psychology: Human Perception and Performance, 21, 996-1014.
Time-to-Contact - H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 7 Interceptive Action: What's Time-to-Contact got to do with it?
James R. Tresilian University of Queensland, St Lucia, Australia
ABSTRACT The time remaining before an object arrives somewhere - its time-to-contact (TTC) with that place - is a quantity that could be used to control the timing of interceptive and evasive actions directed at that object. This potential use for TTC information is the primary motivation for studying how people and animals perceive it. However, studies of interceptive actions and the role TTC information might play in controlling them have been very limited until recently - studies of the perceptual process of TTC estimation have been more popular. Unfortunately, TTC perception depends upon the task being performed and it seems impossible to understand TTC perception without understanding interceptive tasks. This chapter argues this case and discusses recent experiments that have attempted to determine how performance of interceptive actions - particularly their timing - depends upon task variables including the speed of the target, the distance to be moved to intercept it, the viewing time, the size of the target and of the intercepting manipulandum. Results demonstrate that the duration and velocity of interceptive movements are systematically and consistently affected by all these variables. The relationship between movement duration and the task variables derived from experimental results is interpreted as an empirical reflection of the 'rule' used by the central nervous system to preprogram movement duration. The role of TTC in the programming and initiation of interceptive movements is explicated.
1. Introduction
Time-to-contact - generically understood to mean the time remaining before some event occurs, whether this event involves a real physical contact or not - is a quantity assumed to be critical for achieving the temporal control necessary either for bringing about a collision that would otherwise not occur (e.g., interception of a moving target) or for avoiding one that probably would. This 'time-to-contact hypothesis' seems a reasonable one but it is not the only possibility - TTC is sufficient but not always necessary for accurate temporal control of collisions (for discussion see Tresilian, 1993). In this chapter I will limit discussion to that class of collision control tasks for which the TTC hypothesis is likely to hold. The TTC hypothesis raises two important questions. First, how does a person or an animal estimate the TTC that allows adequate temporal control in a particular task? In other words, what information is used? Second, how is the TTC estimate actually employed in the control and coordination of movement? In other words, what is the nature of the control strategy used to time an action? These questions are not independent of one another, in the sense that the information one type of control strategy requires to achieve a specific temporal accuracy may be different from the information required by another strategy. Conversely, limits on the availability of TTC information will place constraints on the possibility of using different types of control strategy. The question of what information is used for temporal control has been the focus of most research effort; the question of how it is used has received much less attention. The major focus of this chapter will be on issues associated with this second question; but first it is important to make some introductory remarks concerning the other.
2. Information
The issue of what information about TTC is used by people and animals was first approached from the perspective of J. J. Gibson's ecological psychology (e.g., Lee, 1974, 1976). Ecological psychology seeks in part to discover laws of perceiving and acting (see e.g., Turvey et al., 1981) and one component of this enterprise is the identification of specificational laws that define sources of perceptual information. A specificational law can be defined as a function (in the mathematical sense) that maps a distal stimulus quantity (such as the size of an object, its shape, its distance or its TTC with something) into a proximal stimulus quantity. Such a function could be one-to-one - a single proximal stimulus quantity associated with a particular distal stimulus quantity (Figure 1A) - or one-to-many: many proximal stimulus quantities associated
with a particular distal quantity (Figure 1B). In either case, the proximal stimulus quantity constitutes a source of specific stimulus information about the distal quantity if no additional functions exist that map other distal stimulus quantities to the same proximal quantities (no perceptual ambiguity exists). In Figure 1C many distal quantities are associated with the same proximal quantity, which is therefore ambiguous concerning the distal state of affairs that is actually present. In the absence of ambiguity the presence of a particular quantity in the proximal stimulus means that a particular distal stimulus condition exists.
Figure 1: Different relationships between distal (left) and proximal (right) stimulus quantities. A) One-to-one: each distal quantity is associated with a single proximal quantity. B) One-to-many: each distal quantity is associated with several proximal quantities. C) Many-to-one: several distal quantities are associated with the same proximal quantity.
The basic proposal of the ecological approach to perception is that under ecological conditions a specificational relationship exists between proximal and distal quantities (Gibson, 1979). That is, proximal quantities are related to distal quantities by relationships that are either one-to-one (Figure 1A) or one-to-many (Figure 1B). It seems that early formulations only recognized the existence of functions of the one-to-one type (Figure 1A) and failed to recognize those of the one-to-many type (Figure 1B; Cutting, 1986). Functions of the latter type introduce problems concerning selection between different information sources and combination of these sources (see Cutting, 1986; Tresilian, 1999 for discussion). Although established previously (e.g., Hoyle, 1957), it was Lee's (1974, 1976) well known geometrical analysis showing how TTC mapped to a simple quantity in the visual stimulus that served as the primary example of the ecological approach applied to a real perceptual problem (e.g., Turvey et al., 1981; Turvey & Carello, 1987). It was shown that the TTC of an approaching object maps to the reciprocal of the rate of expansion of the object's image (e) relative to the size of this image (s). That is, the distal stimulus (TTC) maps to the proximal stimulus variable s/e. As is well known, Lee denoted quantities of the type s/e with the symbol τ (tau). A simple mathematical argument establishes that the optical geometry of collision situations leads to relationships of the form

TTC = τ    (1)
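A compact version of that argument, using the notation of the chapter (image size s, image expansion rate e, object size S, distance D, constant approach speed V) together with an assumed focal length f for the imaging system, runs as follows; the intermediate steps are a reconstruction of the standard derivation rather than a quotation from the text:

```latex
% Pinhole projection with focal length f gives the image size
s = \frac{fS}{D}.
% Differentiating with respect to time (S constant, \dot{D} = -V) gives the expansion rate
e = \dot{s} = \frac{fSV}{D^{2}},
% and therefore
\tau \equiv \frac{s}{e} = \frac{D}{V} = \mathrm{TTC}.
```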
Obviously, the existence of such a relationship means that τ provides TTC information. The interpretation of such relationships from the perspective of ecological psychology led to the development of what I have called the 'tau-hypothesis' (see Tresilian, 1999 for review). The essence of this hypothesis is the uniqueness and universality of τ: τ is the only source of TTC information used by all animal species for the timing of movement in interception and avoidance behaviors. This hypothesis denies the existence of many-to-one 'mappings' from distal to proximal stimulus variables (Figure 1C) and is thus 'ecological' in Gibson's sense. However, it either overlooks the possibility of one-to-many mappings from distal to proximal stimulus quantities (Figure 1B), or, if such mappings exist, it asserts that they are not exploited by organisms. In addition, the hypothesis denies the task-specificity of information usage - it asserts that the same information is used for temporal control irrespective of the temporal accuracy or precision demanded by the task. Related to this is its denial of any species-specificity of information usage - all species use the same information. The τ-hypothesis just described is truly extraordinary - it is putatively a completely general biological 'law' of perceptually guided behavior. Such a law certainly simplifies enormously what could, in principle, be an extremely
complicated taxonomy detailing different tasks, different species and the information used in each case. This is perhaps the reason why the τ-hypothesis has been attractive and influential. It is therefore somewhat disappointing to find that neither empirical research nor logical analysis has provided any support for the τ-hypothesis (see, e.g., Tresilian, 1993; Wann, 1996). Quite the opposite in fact: research over the past decade or so has been almost unanimous in its refutation of the τ-hypothesis (see Tresilian, 1999, for review). None of this is to deny that tau is involved in the perception of TTC, or to claim that there are no generally applicable laws of perceptual guidance. However, if tau plays a role in TTC perception it is not the role proposed for it by the τ-hypothesis, and if there are general laws of perceptuo-motor behavior, the τ-hypothesis is not one of them. I suggest that it is now time to abandon research into the τ-hypothesis and concentrate on the problem of how temporal control is achieved in interceptive tasks. In the past, proposals concerning control have been closely bound up with the τ-hypothesis and often confounded with it. This confounding has sometimes prevented independent evaluation of the τ-hypothesis - which gives an answer to the question 'what information is used for temporal control?' - and of hypotheses that answer the question 'how is information used in temporal control?' This chapter aims to present a discussion of recent research directed towards understanding how interceptive actions are temporally controlled.
3. Temporal control
It is generally accepted that skilled voluntary movements involve both generative processes that can function without sensory information being available during execution and sensory guidance processes that rely on information being available during execution. A central generative process is conceived as a pattern generator - traditionally referred to as a 'motor program' - the outputs of which constitute command signals for the peripheral neuromuscular system. The generative process will be called a motor pattern generator, abbreviated MPG in what follows. In this context, much research on interception has been directed at two complementary questions: First, what determines the time at which a person initiates the MPG? Second, how is movement execution guided by sensory information such that interception is achieved? The first of these questions will be the major focus here. The basic requirement for successful timing of an interception is that the intercepting effector (hand) arrive at an interception location at the same time as the object to be intercepted (target), give or take some temporal margin of error, which will depend upon the task but can be as small as a few milliseconds (e.g., Regan, 1992; Watts & Bahill, 1990). Thus, the condition for successful
interception at any point in time may be stated as follows: The time remaining until the hand reaches the interception location (its time-to-arrival or time-to-contact, TTC) must be equal to the time to arrival of the target at the interception location (to within the error tolerance of the task). If the action is not controlled on-line, then this condition must be met by initiating the act at just the right moment to ensure that its duration matches the target's TTC with the interception location. Many of the arguments that follow are developed under the assumption that the temporal evolution of the action is not controlled on-line. Although this assumption is unrealistic for acts lasting longer than 200 to 300 milliseconds, it is suggested that many of the conclusions reached are largely independent of whether or not movement timing is controlled on-line. An early proposal for the control of program initiation in interceptive actions was the operational timing (OT) hypothesis put forward by Tyldesley and Whiting (1975). The OT-hypothesis proposes that the temporal accuracy required for successful execution of interceptive actions is achieved by executing movement patterns of pre-determined (pre-programmed) duration, following an earlier suggestion put forward by Poulton (1950). Thus, for every interceptive task that a person can perform (catching, playing a tennis stroke, hitting a baseball and so forth), there is an MPG that is responsible for producing the basic movement pattern that characterizes performance. The MPG is capable of determining the duration of any particular performance of the task. Temporal control is a matter of starting the pattern generating process at the right moment, the temporal structure of the movement having been determined in advance. Starting the process requires an initiating signal or 'GO' signal that activates the MPG at the appropriate moment. The GO signal is derived from some perceived quantity: when the value of this quantity exceeds some criterion value the GO signal is sent to the MPG. The central idea of the OT hypothesis is that the duration of the movement pattern (between its initiation and the time of target interception) is pre-programmed. Two versions of this type of OT hypothesis have been considered. In one version the criterion value of the perceptual variable used for activating the MPG associated with a particular interceptive task is assumed fixed. That is, the criterion is not something that can be varied by the nervous system so as to adapt performance to different task conditions. This version of the hypothesis seems to be the one initially favored by D. N. Lee (see, e.g., Lee, 1980) and later by others (e.g., Michaels et al., 2001; Smith et al., 2001; van Donkelaar et al., 1992). In the other version the criterion value can be varied by the nervous system so as to adapt performance to different task conditions.
3.1 The fixed criterion OT-hypothesis
The basic functional structure of the fixed-criterion version is shown in Figure 2: the perceptual variable used for determining when to activate the MPG is derived from the stimulus input (perceptual processing). The value of this variable is then compared to the criterion; if it exceeds the criterion, the GO signal is passed to the MPG. This activates the MPG, which then outputs signals (motor commands) that descend to the peripheral neuromuscular system.

Figure 2: Functional structure of the fixed-criterion version of the operational timing hypothesis described in the text. [Block diagram: stimulus input → perceptual processing → initiating perceptual variable, P → 'P > criterion? If yes, issue GO signal' → GO signal → motor pattern generator → central motor commands.]
Lee (1980) proposed that the GO signal is derived from information about the TTC of a moving target with an interception location (specifically from the variable τ). Thus, the MPG is activated when the perceived value of TTC is equal to some criterion value. The basic idea would work as follows (refer to Figure 3). The TTC information in the stimulus reaches a criterion value (Tc) at time t1. It takes some time for the nervous system to detect that the criterion value has been reached and to activate the MPG - this is called the processing time in Figure 3. After the processing time has elapsed (at t2) the MPG begins to send signals to the neuromuscular system. It takes some amount of time for the effects of these signals to cause force generation in the muscles - called the transmission time in Figure 3. Time t3 marks the start of the movement pattern and the total time between t1 and t3 is the time the system takes to react to the fact that the stimulus TTC information has reached the criterion - called the reaction time in the figure (RT = processing time + transmission time). The movement is completed (the intercepting effector reaches the interception location) at time t4 and so the time period (t4 - t3) is the movement time (MT).
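The sequence of events can be written out numerically; the values below are illustrative only (the chapter does not give specific processing or transmission times):

```python
# Illustrative timeline for the fixed-criterion account (all times in seconds).
Tc = 0.45                 # criterion TTC value at which the GO decision is triggered (t1)
processing_time = 0.08    # assumed time to detect the criterion crossing and activate the MPG
transmission_time = 0.07  # assumed time for motor commands to produce force in the muscles
RT = processing_time + transmission_time   # reaction time (t1 to t3)
MT = Tc - RT              # movement time needed for the effector to arrive on time

t1 = 0.0                  # criterion reached
t2 = t1 + processing_time # MPG begins issuing commands
t3 = t1 + RT              # movement starts
t4 = t3 + MT              # movement completed; should coincide with target arrival
print(f"RT={RT:.3f}s, MT={MT:.3f}s, contact at t4={t4:.3f}s (target TTC was {Tc:.3f}s)")
```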
Figure 3: Sequence of events associated with the fixed criterion version of the operational timing hypothesis (see text for details). [Timeline: the stimulus TTC information equals Tc at t1; processing time runs from t1 to t2; transmission time runs from t2 to t3, at which the movement starts; the reaction time (RT) is the interval from t1 to t3; the movement is completed at t4, so the movement time (MT) is the interval from t3 to t4.]
In order for the interception to be successful the intercepting effector must reach the interception location at the same time as the target. This will happen if the criterion value of the TTC (Tc) is equal to RT + MT in Figure 3 (assuming that the perceived TTC is an accurate estimate of the true TTC). Since RT and Tc are both invariant, this strategy will only work if MT is also invariant: if MT is free to vary then RT + MT cannot always be equal to Tc. Thus, the fixed criterion OT hypothesis would seem to predict invariant MTs for specific interceptive tasks. Note, however, that this ignores the possibility that movement timing might be corrected during execution if errors are detected. The prediction of invariant MTs is, therefore, restricted to very rapid interceptive actions where there is insufficient time to modify the movement during execution (MTs less than about 150 to 200 milliseconds). A second type of prediction also follows from the idea that the criterion value is fixed. If this is true, then the value of the initiating variable at the time the movement starts should vary in a predictable way with variations in task conditions. Let us consider the example of variations in target speed. If TTC is used as the initiating variable, then the prediction is straightforward. Refer back to Figure 3: the criterion value of TTC is fixed at some value (Tc), so the TTC at the time of movement initiation is Tc - RT. Thus, the value of TTC at the moment the movement starts should be approximately constant since both Tc and RT are constant. If the initiating variable is not TTC but distance (of the target from the interception location), as some have suggested (e.g., Wann, 1996; see also van Donkelaar et al., 1992, for a related idea), then the prediction is not
quite as simple. Refer to Figure 3 but assume the initiating variable is distance (D) and the criterion value is Dc. During the RT period the distance of the target from the interception location will have changed by an amount equal to the target's speed (V) multiplied by the RT. The distance at the time the movement starts is therefore equal to Dc - V × RT. Thus, the distance of the target from the interception location at the time of movement initiation depends upon the speed of the target. As another example, if the initiating variable is the rate of expansion of the target's image then it can be shown that the expansion rate at the time of movement initiation should vary as the square root of target speed (assuming constant size targets; see Mattocks, Wallis & Tresilian, 2002).

3.2 Empirical investigation of the fixed criterion hypothesis
A type of change in task conditions that has been found to lead to systematic changes in interception performance is variation of the speed of the moving target. It has been reported many times that people move more rapidly (shorter MTs and greater movement speeds) when intercepting fast moving targets than when intercepting slower moving targets (e.g., Bairstow, 1987; Brenner & Smeets, 1996; Carnahan & MacFadyen, 1996; van Donkelaar et al., 1992; Wallace, Stevenson, Weeks & Kelso, 1992). There are two possible reasons for this. It could be a result of there being less time available to complete the interceptive movement when the target moves faster: the person is constrained to move more rapidly when there is less time available. Alternatively, it could be that faster moving targets elicit a more rapid response regardless of the available time. Mason and Carnahan (1999) recently observed that most of the studies that had reported an effect of target speed on the speed of interceptive movement had confounded target speed with the time for which the target was visible prior to interception (the viewing time), and hence with the time available to make the interception. The confounding was such that the faster the target moved, the shorter the viewing time. Mason and Carnahan (1999) reported the results of an experiment that unconfounded target speed and viewing time and found no effects of target speed on the interceptive movement. It was concluded that viewing time and not target speed was the determinant of the rapidity of the interceptive response. As we will see later (section 4), this conclusion is premature - both target speed and viewing time affect the speed of the response. Although it is difficult to explain more rapid responses when less time is available using the fixed criterion idea, the decrease in MT associated with faster moving targets could be explained by the hypothesis that a fixed criterion value of the target's image expansion rate is used. This hypothesis has been used to interpret recent experimental results (Michaels et al., 2001; Smith et al., 2001).
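The contrasting predictions for the value of the initiating variable at movement onset can be made concrete with a small sketch (all numbers are illustrative assumptions, not values from the chapter):

```python
# Predicted state of the target at movement onset under two fixed-criterion rules.
RT = 0.15          # assumed reaction time (s)
Tc = 0.45          # assumed fixed TTC criterion (s)
Dc = 0.60          # assumed fixed distance criterion (m)

for V in (0.5, 1.0, 2.0):                  # target speeds (m/s)
    ttc_at_onset = Tc - RT                 # constant, independent of target speed
    dist_at_onset = Dc - V * RT            # decreases with target speed
    print(f"V={V:.1f} m/s: TTC at onset={ttc_at_onset:.2f} s, "
          f"distance at onset={dist_at_onset:.2f} m")
```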
Figure 4: Simple two-dimensional optical geometry of direct approach to an imaging system (eye) with focal length f. An object of size S (constant) approaches the stationary imaging system at constant speed V. The object is instantaneously at a distance D from the eye and the image of the object at the same moment has a size s and is growing in size at a rate e. The equation given in the figure follows from the fact that the two triangles in the diagram are similar. Differentiation of this equation with respect to time establishes that e is proportional to SV/D².
To see how a fixed criterion expansion rate can explain the variation in MT with target speed, consider the simple geometry shown in Figure 4. In these conditions it is easy to show that the expansion rate is proportional to SV/D². If the criterion value of the expansion rate is constant, SV/D² must take the same value irrespective of speed, and so for a given target, higher target speeds (V) must be associated with greater distances D at the moment interceptive movements are initiated. However, D at initiation need only increase in proportion to the square root of V to maintain the same expansion rate value. This implies that the TTC at initiation (= distance at initiation/V) is inversely proportional to the square root of V and therefore smaller when V is larger. If the interception is to be successful, the duration of the interceptive movement needs to be shorter when the target moves faster. More precisely, the MT should vary in inverse proportion to the target speed. As will be described later, this is exactly what is found (Tresilian & Lonergan, 2001; Tresilian et al., 2001). However, to explain this finding using the fixed criterion hypothesis requires an additional hypothesis concerning the mechanism for modulating MT. It would appear that TTC information is necessary for correctly setting the MT even if expansion rate is the initiating variable.
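The steps behind this argument can be written out explicitly. This is a reconstruction under the stated assumptions (constant object size S, constant speed V, a fixed expansion-rate criterion denoted e_c), not a derivation quoted from the chapter:

```latex
% From the geometry of Figure 4 the image expansion rate is
e \propto \frac{SV}{D^{2}}.
% If movements are initiated when e reaches a fixed criterion value e_{c}, then at initiation
D_{\mathrm{init}} \propto \sqrt{\frac{SV}{e_{c}}} \propto \sqrt{V},
% so the time-to-contact at initiation is
\mathrm{TTC}_{\mathrm{init}} = \frac{D_{\mathrm{init}}}{V} \propto \frac{1}{\sqrt{V}},
% which is smaller for faster targets and so requires a shorter pre-programmed movement time.
```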
Finally, Lee (1980) cited a study by Schmidt (1969) and another by Schmidt and McGowan (1972) that reported results on MT consistency as support for the fixed criterion hypothesis. Schmidt and McGowan's study demonstrated that MT variability in an interceptive task became increasingly smaller with practice: MT seemed to be getting progressively closer to a constant value as participants practiced. Schmidt's study (Schmidt, 1969) showed that MT variability in a simple coincidence anticipation task was very small - between about 4 and 13 milliseconds. Unfortunately, Lee (1980) failed to mention a major feature of Schmidt's data. The participants in Schmidt's experiment were required to slide a pointer through different distances from a start position to a target position. They had to move so that the arrival of the pointer at the target location was temporally coincident with the arrival of a moving object at a specified point on its path. Schmidt studied movements through four different distances (15, 30, 45, 60 cm) and found that MT increased reliably with the distance to be moved. Although this result may seem unsurprising it is nevertheless quite inconsistent with the fixed criterion hypothesis. Overall, there is reason to suggest that the fixed criterion hypothesis is inadequate to provide a convincing account of the flexibility of human performance in interceptive tasks. In the next section, a simple and rather obvious modification is described that may be all that is required to provide an adequate account.

3.3 The variable criterion OT-hypothesis
The basic idea of the variable criterion version is that the nervous system can vary the duration of a movement pattern in order to adapt it to different conditions of execution. It is assumed that the duration (MT) of a particular interceptive act is pre-programmed based on factors related to the circumstances of performance. Based on the discussion presented in the last section, these factors include the distance to be moved to make the interception, the time available to make the interception and the speed of the target. The pre-programmed MT based on some set of factors (φ) will be denoted MT(φ). Once the MT has been pre-programmed, the MPG needs to be activated so that the intercepting effector arrives at the interception location coincident with the arrival of the target. This requires, of course, that the MPG be activated at the right moment such that the time remaining before the target reaches the interception location (TTC) is equal to MT(φ) + RT. Thus, the criterion value of the TTC information (Tc) used to activate the program needs to be set based on MT(φ). If a veridical TTC estimate is used then Tc = MT(φ) + RT.
Figure 5: Functional structure of the hypothetical motor programming process for interception described in the text. [Block diagram: internal factors and situational information feed a 'compute MT' stage that yields MT(φ); stimulus input undergoes perceptual processing, yielding TTC information; the perceived TTC is compared with the criterion derived from MT(φ) and a GO signal is issued to the motor pattern generator.]
A block diagram illustrating schematically the idea just described is presented in Figure 5. The MT is 'computed' based on situational information derived either from perceptual processing of the stimulus or from internal factors. The computed MT is then used to set the parameters of the pattern generator and to form the criterion TTC value (Tc) that is compared to the perceptual TTC estimate; the GO signal is issued when the perceived value of TTC crosses the threshold defined by the criterion. The variable criterion hypothesis raises the following question: what situational factors determine the value of the criterion? This is not an issue in the context of a fixed criterion hypothesis. In the next section, a series of recent studies is described that investigated systematically how performance on an interceptive task varied with changes in the task constraints. Research of this kind bears directly on the problem of how MT is determined in interceptive tasks.
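A minimal simulation of this programming scheme is sketched below. The form of the MT 'rule' and the reaction time are assumptions for illustration; the chapter's own empirical rule is discussed in section 4.

```python
def preprogram_movement(distance, time_window, rt=0.15, a=0.10, b=0.40, c=0.05):
    """Variable-criterion operational timing, sketched for illustration.

    MT is pre-programmed from task factors (here: movement distance and time
    window, via an assumed linear rule) and the TTC criterion for issuing the
    GO signal is set to MT + RT. All coefficients are arbitrary.
    """
    mt = a + b * distance + c * time_window   # assumed MT(phi) rule, illustration only
    tc = mt + rt                              # criterion TTC value for the GO signal
    return mt, tc

def should_issue_go(perceived_ttc, tc):
    """Issue the GO signal once the perceived TTC falls to (or below) the criterion."""
    return perceived_ttc <= tc

mt, tc = preprogram_movement(distance=0.20, time_window=0.05)
print(f"MT(phi)={mt:.3f}s, criterion Tc={tc:.3f}s, GO now? {should_issue_go(0.30, tc)}")
```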
4. Variation in performance as a function of changes in task constraints
There is an extensive body of experimental literature documenting the way in which the performance of target directed aiming tasks depends upon task constraints when the target is stationary. This literature can probably be said to begin with the classic work of Woodworth (1899) and the later work of Fitts (Fitts, 1954; Fitts & Peterson, 1964). Since then research has studied in considerable detail how aimed movements depend upon the distance to be moved, the direction, the size of the target, the shape of the target, the availability of visual feedback, the size of the pointing device and several other task parameters (for reviews of different parts of this literature see, e.g., Elliott, Helsen & Chua, 2001; Plamondon & Alimi, 1997; Flanders, Tillery & Soechting, 1992). There is no correspondingly systematic literature documenting how performance of interceptive tasks depends upon task constraints. Indeed, the literature that does exist appears to have confounded task constraints in a manner that makes it impossible to determine exactly how performance is influenced by them (cf. Mason & Carnahan, 1999). In order to see why this is so, it is necessary to identify what these constraints are. Once task constraints have been identified it is possible to manipulate them experimentally and observe how they influence performance. This is the approach that we have taken recently and it is described in what follows.

4.1 Identifying temporal task constraints in interceptive aiming
In order to vary task constraints experimentally, a task is required that does not allow a person to determine for themselves the constraints they are subject to. As an example, consider the task of grasping a moving target under the conditions shown in Figure 6. There are two different ways in which a person might perform the task of grasping the target in this situation. First, they might move their hand with the target so that the hand moves parallel to, but not relative to, the target for a short time, and then during this period grasp the target. Alternatively, they might move their hand perpendicular to the path of the target and grab the target as the hand intersects the target's path. In the latter case, the target and hand must arrive at the same location at the same time and the task potentially demands a high level of temporal accuracy and precision (see below). The other method does not require such a high degree of temporal precision, as will be explained further below.
Figure 6: Reaching to grasp a moving target. The reach has been resolved into two components of motion, one directly towards the path of the target and one parallel to the path of the target.
The task in which the intercepting effector is constrained to move along a path perpendicular to the path of the target allows systematic manipulation of the temporal constraints of the task. This claim can be explained with reference to Figure 7 (Tresilian, 1999). The figure shows a view from above (plan view) of a task in which a person moves an effector along a straight path to intercept a target moving along a perpendicular straight path. Both the target and the intercepting effector move in the plane of the page, thus both are constrained to move along a single spatial coordinate. There are five parameters of the task that can be manipulated: 1) the target speed, V; 2) the target length, L; 3) the effector width, W; 4) the viewing distance, Z; 5) the distance to be moved, D. The viewing time (VT) - identified earlier as an important temporal constraint - can be varied by manipulating either Z or V, but cannot be manipulated independently of these two parameters. Another temporal constraint that can be manipulated in this task is temporal precision or tolerance. The leading edge of the target is shown as a thick black line and this will reach the near edge of the strike zone at a particular moment in time t1. The trailing edge of the target will leave the strike zone at some later time t2. Thus the target is within the strike zone for a period of time equal to t2 - t1 that will be referred to as the time window (McLeod et al., 1985; Tresilian, 1999) and denoted by the symbol ω. During this period the target has moved through a distance equal to L + W and so the time window has a duration of (L + W)/V. Referring back to Figure 6, it should now be easier to see why moving with the target (parallel to it) decreases the temporal precision demand of interception: the speed of the target relative to the hand is equal to V - VH (where VH is the speed of the hand parallel to the target's motion) and so the time window is now (L + W)/(V - VH).
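The time window is simple to compute, and a small sketch makes the effect of moving with the target explicit (variable names are assumptions for illustration):

```python
def time_window(target_length, bat_width, target_speed, hand_parallel_speed=0.0):
    """Time window omega = (L + W) / (V - VH), in seconds.

    With hand_parallel_speed = 0 this is the perpendicular-interception case;
    a hand moving with the target (VH > 0) enlarges the window.
    """
    relative_speed = target_speed - hand_parallel_speed
    if relative_speed <= 0:
        raise ValueError("hand must be slower than the target along its path")
    return (target_length + bat_width) / relative_speed

# Example: 4 cm target, 1 cm bat, target moving at 1.0 m/s
print(f"{time_window(0.04, 0.01, 1.0) * 1000:.0f} ms")          # ~50 ms
print(f"{time_window(0.04, 0.01, 1.0, 0.5) * 1000:.0f} ms")     # ~100 ms when moving with the target
```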
Figure 7: Schematic diagram of the hitting task used in the experiments reviewed in the text. The target (length L) moves along a linear track with instantaneous speed V; the target's leading edge is shown thickened. The bat (width W) moves along a linear track perpendicular to that of the target. The bat moves through the region shaded light gray, starting a distance D from the target's path. The target can be hit when any part of it is within this region. The target becomes visible when its leading edge is a distance Z from the bat.
The time window can be considered to define the temporal precision required to perform the task: if the centre of the target is in the middle of the strike zone at time tm then the target will be intercepted provided that the time at which the intercepting effector reaches the strike zone is within the interval from tm - ω/2 to tm + ω/2. One useful way to visualize this constraint is to use a space-time diagram as illustrated in Figure 8. In the diagrams in the figure, the abscissa represents spatial position along the direction of target motion and the ordinate represents time. In both diagrams the position of the leading edge of the target over time is indicated by a thick diagonal line and the position of the trailing edge by a thin diagonal line. The region of space occupied by the target at any instant is just the horizontal region between the points on the two diagonal lines corresponding to that instant. This region is shaded light gray for every instant and so the trajectory of the target is represented as a shaded parallelogram with width L and a slope equal to 1/V. A slower moving target (Figure 8, left panel) is associated with a more steeply oriented parallelogram than a faster moving target (Figure 8, right panel).
Figure 8: Space-time diagram for the task illustrated in Figure 7 for the case of constant target speed; thick and thin diagonal lines mark the target's leading and trailing edges. The speed in the left panel is slower than that in the right panel. The target becomes visible at time t0 and sweeps out the parallelograms shown shaded. The spatial position of the bat along the direction of target motion does not change over time and so the vertical dashed lines define the region occupied by the bat. The bat and target can contact one another in the region shaded dark gray. The height of this region is the time window (ω).
The strike zone defined by an effector of width W is shown bounded by vertical dashed lines and the region of the target's trajectory within the strike zone is shown as the darker shaded regions of the oriented parallelograms. The time window is just the height of this darker shaded region (ω) and is smaller for the faster moving target (Figure 8, right panel), given that L and W are the same in both panels. If the abscissas are located at the time t0 at which the target becomes visible, then the viewing times are as indicated (VT). In the examples shown in the figure the viewing times are the same, since the target becomes visible at a greater distance from the strike zone when the target moves faster (Z in Figure 8, right). The above shows that manipulations of L, W or V (independent of VT) act to vary the temporal precision demanded by the task (Figure 7) as defined by the time window. It can then be asked whether a person performing the task treats such manipulations as equivalent. For example, if the time window is
decreased by a certain amount, does it matter whether the decrease is due to a change of bat size (W), of target length (L) or of target speed (V)? In sections 4.2 and 4.4, the results of some recent experiments that examined how performance variables were influenced by changes in W, L, V and D independent of VT are reviewed.

4.2 The effects of manipulating task constraints on interceptive aiming
In two series of three experiments we have studied performance of the constrained aiming task illustrated in Figure 6. In the first series (Tresilian & Lonergan, 2002) the target accelerated from rest on a slightly inclined slope at approximately 1.2 m/s². The task was to move a hand-held bat along a linear slide (tilted at the same angle as the slope down which the target moved) through one of three different distances (7.5, 20 & 38.5 cm) to intercept the target. The time for which the target was visible prior to reaching the strike zone was held approximately constant. The three experiments examined the effects of varying the time window in each of the three possible ways. In experiment 1 the bat width was varied and the target length, speed and acceleration were constant. In experiment 2 the target length was varied with the bat width, target speed and acceleration held constant. In experiment 3 the target speed upon entering the strike zone was varied, with the target length and bat width held constant. In each experiment the time windows were always approximately 35, 50 and 65 milliseconds. There were small errors in achieving these time windows but both the constant and variable errors were always less than 3 milliseconds. In each experiment participants were required to hit the target on at least 80% of trials in any given condition (15 trials in each of the nine conditions) and only trials on which the target was struck were analyzed. In the actual experiments the majority of participants (6 in each experiment) struck the target on over 90% of trials. Participants were instructed not to stop or make any reversals of direction when intercepting the target and all complied with this instruction. The main dependent variables in the experiments were the duration of the striking movements from initiation to the time the target was hit (MT) and the maximum speed reached during the movement (Vmax). The results in each experiment were fairly clear as all participants performed in essentially the same way. The results are summarized schematically in Figure 9. The results of experiments 1 and 2 were almost identical and displayed the pattern shown in the top panels of Figure 9, in which the effect of time window on MT was very small and not statistically reliable. Experiment 3 showed a similar, but more exaggerated, pattern in which the effect of time window on MT was statistically reliable.
Figure 9: Schematic illustration of the pattern of results obtained in the experiments reported by Tresilian and Lonergan (2002) for time windows (ω) of 35, 50 and 65 ms. The top panels show the results from experiments 1 and 2; the bottom panels show the results of experiment 3. The panels on the left show movement time as a function of the amplitude of the movement for the three different time windows. The panels on the right show maximum speed (Vmax) as a function of movement amplitude.
Since the maximum speed of the movement was reached at about the time the target was struck, the results of the three experiments show that people hit targets that demand a high degree of temporal precision with a greater speed than they hit targets requiring less temporal precision. This effect was small and translated into negligible variations in MT when the temporal precision demands were defined by changes in the target size or bat size. When defined by changes in target speed, the effect was much larger and led to corresponding changes in MT. In experiment 3 it was found that MT was inversely proportional to target speed and directly proportional to the size of the time window. In all experiments, MT increased linearly with distance to be moved regardless of time window (Figure 9).
A model of the following type can describe the results of the experiments: MT = a + bD + f([L + W]/V)
(2)
Where a and b are fitting parameters and f(·) is some function. Little can be said about f(·) except that when (L + W) is constant (experiment 3) f([L + W]/V) takes the form c/V where c is a constant, and when V is constant the value of f(·) is very small. In addition, f(·) does appear to be an increasing function of the time window. Equation 2 is a relationship that has many similarities to Fitts' law as it relates MT as a dependent measure to two task parameters that are independent variables: movement amplitude (D) and a precision constraint. In Fitts-type experiments the precision constraint is a spatial one (a measure of target size or width) whereas here it is a temporal one, the time window. Fitts' law takes a logarithmic form, that is, MT = α + β·log₂(2D) − β·log₂(S)
(3)
where α and β are fitting parameters and S is a measure of target size. Note that in Fitts' law (equation 3), as the precision constraint gets more demanding (S decreases), MT increases. In equation 2, as the temporal precision constraint gets more demanding (the time window decreases), MT tends to decrease. The results of this first series of experiments raised at least four questions. First, why do people hit more temporally demanding moving targets harder? Clearly Mason and Carnahan's (1999) conclusion that people only move faster when there is less time available cannot explain the results we have reported, so this is not the only factor responsible for faster movements. Second, why does varying the time window by manipulating target size or bat width have very little if any effect on MT, whereas the same variations in time window achieved by varying target speed have a relatively large effect on MT? Third, do the results we have obtained have anything to do with the fact that the target was accelerating rather than moving with constant speed as in most other studies? Finally, what is the nature of the functional dependency f that appears in equation 2? In the following sections answers to these questions are proposed based on our recent experimental work (Tresilian, Oliver & Carroll, 2003; Tresilian & Houseman, 2003).
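To make the contrast between equations 2 and 3 concrete, the following sketch evaluates both rules numerically. It is only an illustration: the parameter values and the linear form assumed for f are not taken from the experiments reported in this chapter.

```python
# Illustrative comparison of the two movement-time (MT) rules discussed above.
# All parameter values, and the linear form of f, are assumptions for the
# example; they are not fitted to the data reported in the chapter.

import math

def mt_fitts(D, S, alpha=0.05, beta=0.1):
    """Fitts' law (equation 3): MT rises as the spatial constraint S shrinks."""
    return alpha + beta * math.log2(2 * D) - beta * math.log2(S)

def mt_interceptive(D, L, W, V, a=0.05, b=0.005, f=lambda w: 0.5 * w):
    """Equation 2 with an assumed linear f: MT falls as the time window shrinks."""
    time_window = (L + W) / V          # seconds
    return a + b * D + f(time_window)

# Halving the spatial target size S lengthens MT under Fitts' law ...
print(mt_fitts(D=20, S=2.0), mt_fitts(D=20, S=1.0))
# ... whereas halving the time window (here by doubling target speed V)
# shortens MT under equation 2.
print(mt_interceptive(D=20, L=4, W=2, V=100),
      mt_interceptive(D=20, L=4, W=2, V=200))
```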
4.3 Why do people move faster to hit more temporally demanding targets?

It has been frequently observed that people move faster to intercept faster targets but no generally accepted explanation has been provided. It is obvious that people must move faster if they are to intercept a target when the time available to make the interception is reduced (Mason & Carnahan, 1999), but this cannot explain the results of the experiments described in the last section. These results indicate that people move faster when the temporal precision demands of the task increase. Thus, it appears that people respond to increasing temporal precision demands by moving faster. This in turn suggests that faster movements may be more temporally precise than slower ones, which is consistent with results on temporally constrained aiming at stationary targets reported by Schmidt and colleagues (e.g., Schmidt et al., 1979) and by Newell and colleagues (e.g., Newell et al., 1979, 1993). What these authors discovered was that when people were required to move over a certain distance in an MT prescribed by the experimenter, the shorter the MT the smaller the variability in MT. In other words, the faster the person was required to move, the less variable was their timing. Thus, a strategy for interception of targets demanding a high level of temporal precision would be to move quickly to intercept such targets so as to exploit the greater temporal precision of rapid movements (see Schmidt, 1988, for an earlier statement of this idea). Note that according to this idea performance parameters such as MT and maximum speed are measures of how large the person has estimated the time window to be. Thus, the assumption is made that target size, bat size and target speed influence performance parameters (MT and speed) through the time window quantity.

It might reasonably be asked why a person would bother to vary the speed of their movements with the perceived temporal precision demands of the task. Why not always move quickly? The answer to this question may be provided by an appeal to what might be called "the principle of maximum laziness" - people and other animals move in a way that keeps the energy expended in performance as small as possible (see Sparrow, 2000, for recent reviews). Moving quickly requires the production of greater muscular effort, and therefore a greater metabolic energy cost, than moving more slowly.

If this explanation for why people move faster is correct, then we need to explain why changes in temporal precision demands have relatively large effects when due to variations in the speed of the target but small effects when due to variation in the size of the target or of the intercepting effector (bat). A possible explanation was suggested in Tresilian and Lonergan (2002). Although changes in sizes (L and W) or changes in speed (V) can produce the same change in the physical time window ([L+W]/V), the effects of these changes on the perceived time window (ω′) are different: ω′ is relatively insensitive to changes in size as compared to changes in speed. One reason for this might be
that people aim to hit close to the middle of the target with close to the middle of the bat. The target at which a person is effectively aiming would, for targets above a certain size, be smaller than the actual size of the target and could remain fairly constant despite increases in the actual size of the target (and similarly for the bat).

4.4 Why does target speed have a greater effect on performance than target size?

In a second series of three experiments (Tresilian et al., 2003) we tested the hypothesis that the small effect of target size on MT in the hitting task of Figure 7 could be explained if relatively large variations in the physical size of targets resulted in only small variations in the size of the target region a person attempts to strike (the effective target size). That is, the size of the target that people attempt to hit is a region of the physical target that does not vary very much across targets of different physical sizes. In the first of the three experiments target size (L) was constant and target speed (V) was manipulated (4 different constant speeds were used). In the second experiment, the same 4 speeds were used but the target size was changed so that the time window ([L+W]/V) was held constant at approximately the average of the time windows in experiment 1. In the third experiment, the target speed was always the same (the average of the 4 speeds used in experiment 1) but the target size was varied so that the time windows were the same as those in experiment 1. Consistent with the results reported in the earlier study (Tresilian & Lonergan, 2002; Section 4.3), the variation in MT across experimental conditions was largest in experiment 1, smallest in experiment 3 and intermediate in experiment 2 (Tresilian et al., 2003). The effect in experiment 1 was more than five times larger than that in experiment 3 even though the time windows were the same in both experiments. To determine whether or not these differences could be explained by the hypothesis that the effective target size did not change (or changed very little) across changes in the physical size of the target, the variability of target strike locations was used as an empirical measure of effective target size (following Schmidt et al., 1979). Two measures of variability were calculated - the SD and the range of strike locations - on two sets of data: 1) all trials, including those on which a participant failed to strike the target, and 2) only those trials on which the participant successfully struck the target. Clearly, the value of the effective size depends upon the variability measure computed, so the pattern of variation across experimental conditions is relevant for evaluating the hypothesis rather than the actual values. The pattern was basically the same regardless of the data set used or the measure computed and demonstrated that the differences in effective size were smaller than the actual physical size differences between targets in experiments 2 and 3.
However, this turned out to be insufficient to account for the greater than fivefold difference in effect size between experiments 1 and 3. One simple way to appreciate this conclusion is as follows: the original hypothesis we sought to test asserts that MT is proportional to the effective time window (= [effective size of target and manipulandum]/V) rather than the physical time window ([L+W]/V). If the effective time window is approximated as [(effective target size) + W]/V, then plots of MT against effective time window in experiments 1 and 3 should be straight lines with the same slope (originally the slope in experiment 1 was some five times steeper than that in experiment 3). The difference in slopes between the two experiments was reduced when the effective time window was used but the slope in experiment 1 was still about twice that in experiment 3. Note that the effective time window is a measure of the temporal precision actually achieved by participants in the experiment. The results showed that the temporal precision achieved co-varied with the temporal precision required and hence with MT and movement speed. Although this may not seem surprising, it is nevertheless an important result: it directly demonstrates that temporal precision was better when people made briefer, faster movements. Since the hypothesis that effective target size does not change very much with changes in physical size could not completely account for the greater effect of target speed, it is possible that target speed has an effect on MT that is independent of the time window (Tresilian et al., 2003). Indeed, the results of the experiments reported in Tresilian et al. (2003) were also consistent with the hypothesis that the effects of target size and speed on MT are independent of one another. To put this in formal notation, the results were consistent with both of the following relationships between dependent and independent variables:

MT ∝ [L + W]/V − V

and

MT ∝ [L + W] − V
In a final experiment (Tresilian & Houseman, 2003) we sought to distinguish between these two possibilities. The experiment involved targets of four different sizes moving at four different speeds, giving a total of 16 size-speed combinations. The pattern of results predicted by the two possibilities in such an experiment is shown schematically in Figure 10. The results obtained showed exactly the pattern predicted by the first possibility (Figure 10a,b) and this was confirmed by detailed statistical analysis.
Figure 10: Qualitative pattern of results in the experiment (4 target lengths by 4 target speeds) predicted by two hypotheses. Different symbols are associated with different target speeds. a) Pattern expected when MT is plotted against target size if MT ∝ [L + W]/V − V. b) Pattern of the MT results in (a) plotted against time window. c) Pattern of the MT against target size plot expected if MT ∝ [L + W] − V. d) Pattern of the data in (c) expected when MT is plotted against the time window. (Arbitrary units on both axes.)
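As a rough illustration of how the 4 by 4 design separates the two candidate relationships, the sketch below generates the kind of qualitative prediction patterns shown in Figure 10. The coefficients, the bat width and the particular lengths and speeds are assumed for illustration only and do not come from the experiments.

```python
# Rough sketch (arbitrary units, assumed coefficients) of the two qualitative
# prediction patterns for a 4 target-length by 4 target-speed design.
# Hypothesis A: MT ~ base + k1*(L + W)/V - k2*V  (time window plus a speed effect)
# Hypothesis B: MT ~ base + k1*(L + W) - k2*V    (size and speed, no time-window effect)

W = 2.0                                   # bat width (assumed)
lengths = [2.0, 4.0, 6.0, 8.0]            # four target lengths
speeds = [50.0, 100.0, 150.0, 200.0]      # four target speeds

def mt_a(L, V, base=0.2, k1=20.0, k2=0.001):
    return base + k1 * (L + W) / V - k2 * V

def mt_b(L, V, base=0.2, k1=0.02, k2=0.001):
    return base + k1 * (L + W) - k2 * V

for V in speeds:
    print(f"V={V:g}",
          "A:", [round(mt_a(L, V), 3) for L in lengths],
          "B:", [round(mt_b(L, V), 3) for L in lengths])

# Plotted against the time window (L + W)/V, hypothesis A gives lines with a
# common slope (k1) that are vertically offset by speed, whereas under
# hypothesis B the slope of MT against the time window scales with speed (k1*V).
```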
Overall, the data from the three series of experiments that we have undertaken (Tresilian & Lonergan, 2002; Tresilian et al., 2003; Tresilian & Houseman, 2003) are consistent with the following relationship between MT and the task variables L, W, V and D (Figure 7): MT = a + bD - cV + d[L + W]/V
(4)
where a-d are positive empirical parameters. The linear effect of distance, D, on MT is further supported by data reported by Schmidt (1969) and by Zaal et al. (1999) as well as that of Tresilian and Lonergan (2002). The results of Tresilian et al. (2003) described earlier might seem to suggest that the actual size of the target (and possibly of the manipulandum) in equation 4 should be replaced with
the effective size. However, we found that the effective target size measure varied linearly with the physical size: if effective target size is zero when the actual size is zero then the coefficient 'd' in equation 4 can (in part) embody the transformation from physical to effective size. There is one caveat that should be mentioned (Tresilian & Houseman, 2003). Experimental studies of interception invariably involve three quantities: the distance over which the target moves prior to interception (viewing distance, Z in Figure 7), the time it takes to traverse this distance (viewing time, VT) and the speed with which it moves (V). Since these variables are related by the formula VT = Z/V, if one is held constant the other two inevitably co-vary. As Mason and Carnahan (1999) pointed out (see section 3.2), if Z is held constant then faster speeds are associated with shorter VTs, and shorter VTs will often require briefer, faster movements simply because there is less time available, not because the target is moving faster. Empirical results confirm this (Ball & Glencross, 1985; Laurent et al., 1994; Mason & Carnahan, 1999). Thus, it is critical to keep VT constant and sufficiently large if effects of target speed independent of VT are to be observed (section 4.3), and this was done in all our experiments described above. The consequence of this strategy is that target speed co-varies with viewing distance and so these two variables are confounded, raising the possibility that it is viewing distance rather than target speed that is the critical variable. It is not clear, however, why a greater viewing distance would be associated with briefer, faster movements as found in the experiments, whereas it is clear why shorter VTs and smaller time windows would be associated with briefer movements. Furthermore, previous studies have demonstrated effects of VT and target speed (Ball & Glencross, 1985; Brouwer et al., 2000; De Lussanet, 2001; Laurent et al., 1994; Mason & Carnahan, 1999; Montagne et al., 2000) but none have demonstrated an independent effect of viewing distance. The next section (4.5) addresses the question of what equation 4 is telling us about the control of interceptive actions.

4.5 Implications for the control of interception

The results described in sections 4.2 and 4.4 clearly support the variable criterion version of the OT hypothesis (section 3.3) over the fixed criterion version (section 3.1) since MT was found to vary substantially with the task parameters (equation 4). If it is assumed that the interception movements were performed in a visually open-loop mode so that the observed MTs were the direct result of a pre-programming process (Figure 5), equation 4 can be interpreted as an empirical description of the 'rule' used to pre-program MT. In section 3.3 the pre-programming process was conceived as taking various situational factors as input (Figure 5) and producing a desired MT as output. The list (or vector) of situational factors was denoted φ and the pre-programmed MT
as a function of φ, MT(φ). If equation 4 is interpreted as an empirical estimate of the form of this function, then φ = (D, L, W, V) and MT(φ) has the form of the right hand side of equation 4. To what extent is the interpretation of equation 4 given in the last paragraph justified? An important assumption is that the movements are executed in a visually open-loop mode or, more precisely, that if any on-line adjustments occur they do not significantly affect the pre-programmed MT. The question of whether or not on-line adjustment or control of movement timing is a feature of human interceptive action has been an issue for a long time (e.g., Bootsma & van Wieringen, 1990; Fitch & Turvey, 1978; Lee et al., 1983), but there are no experimental data that allow the matter to be conclusively resolved in the present context. There are plenty of data that show on-line adjustment and/or control of movement direction in interception (e.g., Brenner & Smeets, 1996; Brouwer et al., 2002; De Lussanet, 2001; Port et al., 1997), but little or no direct evidence of on-line adjustment/control of timing. Evidence for the latter has typically been claimed if the data fit some model that involves such control (e.g., Lee et al., 1983; Lee et al., 1999; Peper et al., 1994), but open-loop models can typically be formulated that explain the data equally well (Tresilian, 1994, 1997). In our experiments, however, we did find evidence of at least one correction or change to the movement trajectory for many trials of larger amplitude (D > 30 cm), but for smaller amplitude movements (<15 cm) these changes or corrections were almost never observed (Tresilian & Lonergan, 2002). Another argument for on-line timing control was made by Bootsma and van Wieringen (1990) on the basis of their finding that the temporal variability (across repeated trials) systematically declined over the course of striking movements, being much smaller at the moment of striking the moving target than at the moment the striking movements began. This finding certainly contradicts the notion that MTs are invariant and initiated at a fixed value of some perceptual variable (section 3.1) but is completely consistent with the variable criterion OT hypothesis (section 3.3). Overall, therefore, there is no conclusive evidence for on-line correction or control of movement timing, particularly for smaller amplitude movements. Of course, lack of evidence does not necessarily mean that such control processes are not operative. Two further reasons for supposing that they might not be are as follows. 1) The MTs in our experiments were very short (averaging between about 80 ms for the shortest amplitude movements up to about 400 ms for the longest amplitudes). Given that it takes in excess of 100 ms for visually based corrections to affect movements (for a recent review see, e.g., Paillard, 1996), many of the movements would have been too brief for effective on-line correction. 2) Using a temporally constrained aiming task involving stationary targets, Carlton (1994) showed that when people tried to make an aimed movement last 400 ms there was no evidence of corrections. Aimed movements
of the same amplitude and spatial accuracy demands executed 'as fast as possible' almost always showed evidence of at least one correction despite the average MT being less than 400 ms. Carlton concluded that the requirement for temporal control results in the use of an open-loop control strategy. The most parsimonious position given the arguments presented in this section is that the timing of anticipatory interceptive actions of the type studied in our experiments is pre-programmed and not adjusted during execution (at least for shorter amplitude movements, D < 30 cm). If so, then equation 4 can be interpreted as an empirical description of the form of the rule used by the central nervous system to pre-determine MT, perhaps in the manner suggested by the variable criterion OT hypothesis (section 3.3). Open-loop control of timing suggests another reason for executing interceptions rapidly. The briefer your MT, the longer you wait before initiating the movement and so you are able to watch the target object for longer. Watching the target object for longer will likely give you better perceptual estimates of the variables used for preprogramming and/or initiating the movement and hence more accurate movement timing. This has previously been suggested to be the reason that professional baseball batters with the highest batting averages tend to be those with the briefest, fastest swings (Breen, 1967). It may be the reason for the independent effect of speed observed in our experiments (Tresilian et al. 2003; Tresilian & Houseman, 2003; see section 4.4): if it is more difficult to visually estimate the quantities required for control when the target moves faster, then it will be advantageous to watch it for longer - a strategy that will lead to briefer movements to intercept faster targets in restricted viewing time conditions.
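The pre-programming account sketched in this section can be summarized in a few lines of code. This is only an illustrative sketch: the parameter values, units and assumed reaction time are not taken from the experiments, and the initiation criterion Tc = MT(φ) + RT is applied here as stated in the conclusions below.

```python
# Illustrative sketch of the variable-criterion pre-programming scheme.
# Parameter values, units and the reaction time are assumptions for the example,
# not fitted values from the experiments reported in this chapter.

def preprogrammed_mt(D, L, W, V, a=0.08, b=0.006, c=0.0004, d=0.9):
    """Equation 4: MT = a + b*D - c*V + d*(L + W)/V  (D, L, W in cm; V in cm/s)."""
    return a + b * D - c * V + d * (L + W) / V

def criterion_ttc(D, L, W, V, reaction_time=0.15):
    """Criterion value of time-to-contact at which the GO signal is released:
    Tc = MT(phi) + RT, so that the movement arrives at the target on time."""
    return preprogrammed_mt(D, L, W, V) + reaction_time

# Example: 20 cm movement to hit a 4 cm target with a 2 cm bat moving at 150 cm/s.
mt = preprogrammed_mt(D=20, L=4, W=2, V=150)
tc = criterion_ttc(D=20, L=4, W=2, V=150)
print(round(mt, 3), round(tc, 3))  # initiate when perceived TTC falls to tc
```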
5. Conclusions

In this chapter I have argued against the idea that performance of particular interceptive tasks involves initiating the movement pattern when some perceptual quantity reaches a fixed criterion value - a value that is invariant over changes in the conditions in which the task is performed. It is not necessary, however, to drop the idea that initiation of interceptive actions depends upon some perceptual quantity reaching a criterion value. Indeed, it is difficult to conceive of a realistic alternative to this basic idea. In order to account for the available data it is necessary to suppose only that the criterion value used for initiation depends upon the conditions under which the task is performed. These conditions will likely include not only the external task constraints that can be determined from the stimulus (typically visual) input but also factors associated with the internal state of the performer that do not depend upon the current stimulus input (see Figure 5).
We have conducted some experiments in order to determine what effect external, temporal task constraints have on the performance of interceptive actions (Tresilian & Lonergan, 2002; Tresilian et al., 2003; Tresilian & Houseman, 2003). In order to do this a constrained interceptive aiming task was used in which the task constraints can be manipulated experimentally (Figure 7). The results of experiments completed and analyzed at the time of writing (reviewed in section 4) showed that people move faster to intercept targets when the time window was smaller (targets were hit harder and MTs were generally shorter). This can be interpreted as a strategy to exploit the greater temporal controllability of more rapidly executed movements. In addition, people's MTs and speeds increased when they had to move further to intercept the target, and they moved faster (with briefer MTs) when the target moved faster, independently of the time window. The independent effect of target speed may, in part, reflect a strategy of waiting longer when targets move faster so as to obtain better perceptual estimates of relevant quantities. Overall the results indicate a kind of "Fitts' law" for interceptive aiming that relates MT to the independent variables: movement amplitude (D), time window ([L+W]/V) and target speed (V). In our experiments this relationship takes the form of equation 4 above. An interpretation of the relationship between MT and the task constraints expressed by equation 4 in terms of the variable criterion OT hypothesis (Figure 5) can be given as follows. The nervous system computes a pre-programmed MT, MT(φ), using information about the distance to be moved (D), the target and bat sizes and the target speed according to a rule that takes the form of equation 4. The constant parameters depend upon the viewing time, internal factors and other contingencies of performance. A criterion value for TTC information that releases the GO signal is then set as Tc = MT(φ) + RT. If the movement is ballistic then the TTC information must be accurate if the target is to be hit. If there is the possibility of correcting timing errors then it is possible to use information that provides an approximation to the true TTC, such as tau (Lee, 1980; Tresilian, 1994). TTC information plays a basic role in this scheme but other perceptual information about the performance constraints is required as well. It has been proposed here that people move more rapidly to intercept targets that impose tighter temporal constraints on performance in order to achieve the temporal precision required. It was argued that a person aims to hit a region of the target (the virtual target) that is often less than the whole target and that a change in physical target size produces a smaller change in the size of the virtual target. This hypothesis was used to explain the results reported by Tresilian and Lonergan (2002) and was supported by a later set of experiments (Tresilian et al., 2003). This is not, however, the whole story: people move faster to intercept faster moving targets irrespective of the time window (Tresilian et al., 2003; Tresilian & Houseman, 2003).
It should be emphasized that the account summarized in the last paragraph is hypothetical and it is not claimed to provide a complete explanation of the timing of interceptive action. It applies only to the type of interception task in which the effector intercepts the target in an anticipatory fashion and does not follow the target in the manner discussed in relation to Figure 6. When the effector follows or tracks the target as a component of interception, behavior may be quite different (see Jagacinski et al., 1980; Tresilian & Lonergan, 2002). In addition, perceptual control of movement during execution has hardly been touched upon. Within its domain of applicability, the account offered makes a number of assumptions that should be subject to direct empirical test. The two most important are these:

1. Task constraints are used to pre-program the movement time rather than some other aspect(s) of performance.

2. Target size, bat size and target speed are used to produce an estimate of the time window and through this quantity influence performance parameters. Thus, MT and maximum speed can be interpreted as measures of the perceived time window.

With respect to the first of these it is worth noting that MT does not vary as reliably with changes in task constraints as does the index of movement speed used (maximum speed). It may be, therefore, that it is movement speed and not MT that is the controlled variable in these tasks. It is not clear how a strategy based on the control of speed could work unless there were continuous control of the movement by a perceptual estimate of the target's speed, as in the model described by Brenner and Smeets (1995). However, this model cannot account for recent data (Brouwer, Brenner & Smeets, 2000). Furthermore, even if the movement is being continuously controlled, it is still necessary to account for how the action is initiated. It is clear that further empirical work is needed to clarify how task constraints affect performance of interceptive aiming tasks and how perceptual variables contribute to the control of the associated movements and forces. Given the considerable research effort currently being directed at these matters, it is to be hoped that the control of interceptive actions will be better understood in the future.
Acknowledgement

The author's research reviewed in this chapter was supported by the Australian Research Council and the University of Queensland Foundation.
REFERENCES

Bairstow, P. J. (1987). Analysis of hand movement to moving targets. Human Movement Science, 6, 205-231.
Ball, C. T. & Glencross, D. (1985). Developmental differences in a coincident timing task under speed and time constraints. Human Movement Science, 4, 1-15.
Bootsma, R. J. & van Wieringen, P. C. W. (1990). Timing an attacking forehand drive in table tennis. Journal of Experimental Psychology: Human Perception & Performance, 16, 21-29.
Breen, J. L. (1967). What makes a good hitter? Journal of Health, Physical Education and Recreation, 38, 36-39.
Brenner, E. & Smeets, J. B. J. (1996). Hitting moving targets: Co-operative control of 'when' and 'where'. Human Movement Science, 15, 39-53.
Brouwer, A-M., Brenner, E. & Smeets, J. B. J. (2000). Hitting moving objects: The dependency of hand velocity on the speed of the target. Experimental Brain Research, 133, 242-248.
Brouwer, A-M., Brenner, E. & Smeets, J. B. J. (2002). Hitting moving objects: is target speed used in guiding the hand? Experimental Brain Research, 143, 198-211.
Carlton, L. G. (1994). The effects of temporal precision and time minimization constraints on the spatial and temporal accuracy of aimed hand movements. Journal of Motor Behavior, 26, 43-50.
Carnahan, H. & McFadyen, B. J. (1996). Visuomotor control when reaching toward and grasping moving targets. Acta Psychologica, 92, 17-32.
Cutting, J. E. (1986). Perception with an eye for motion. Bradford Books, MIT Press.
De Lussanet, M. H. E. (2001). The control of interceptive arm movements. Erasmus University Press, Rotterdam.
Elliott, D., Helsen, W. F. & Chua, R. (2001). A century later: Woodworth's (1899) two-component model of goal-directed aiming. Psychological Bulletin, 127, 342-357.
Fitch, H. & Turvey, M. T. (1978). In D. Landers & R. Christina (Eds), Psychology of motor behavior & sport. Human Kinetics.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381-391.
Fitts, P. M. & Peterson, J. R. (1964). Information capacity and discrete motor responses. Journal of Experimental Psychology, 67, 103-112.
Flanders, M., Tillery, S. I. H. & Soechting, J. F. (1992). Early stages in a sensorimotor transformation. Behavioral & Brain Sciences, 15, 309-320.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton-Mifflin.
Hoyle, F. (1957). The Black Cloud. Harmondsworth: Penguin.
Jagacinski, R. J., Repperger, D. W., Ward, S. L. & Moran, M. S. (1980). A test of Fitts' law with moving targets. Human Factors, 22, 225-233.
Latash, M. L. & Gutman, S. R. (1992). Variability of fast single-joint movements and the equilibrium-point hypothesis. In K. Newell & D. Corcos (Eds.), Variability and motor control. Champaign, IL: Human Kinetics.
Laurent, M., Montagne, G. & Savelsbergh, G. J. P. (1994). The control and coordination of one-handed catching - the effect of temporal constraints. Experimental Brain Research, 101, 314-322.
Lee, D. N. (1974). Visual information during locomotion. In R. B. MacLeod & H. L. Pick (Eds), Perception: Essays in honor of James Gibson (pp. 250-267). Ithaca NY: Cornell University Press.
Lee, D. N. (1976). A theory of visual control of braking based on information about time to collision. Perception, 5, 437-459.
Lee, D. N. (1980). Visuo-motor coordination in space-time. In G. Stelmach & J. Requin (Eds.), Tutorials in Motor Behavior (pp. 281-295). Amsterdam: North Holland.
Lee, D. N., Craig, C. & Grealy, M. A. (1999). Sensory and intrinsic coordination of movement. Proceedings of the Royal Society of London, B266, 2029-2035.
Lee, D. N., Young, D. S., Reddish, P. E., Lough, S. & Clayton, T. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Mason, A. H. & Carnahan, H. (1999). Target viewing time and velocity effects on prehension. Experimental Brain Research, 127, 83-94.
Mattocks, C., Wallis, G. & Tresilian, J. R. (2002). Obstacle avoidance strategies in driving. In A. G. Gale (Ed.), Vision in Vehicles XI (in press).
McLeod, P., McGlaughlin, C. & Nimmo-Smith, I. (1985). Information encapsulation and automaticity: evidence from the control of finely timed actions. In J. Long & A. D. Baddeley (Eds), Attention and Performance IX. Lawrence Erlbaum.
Michaels, C. F., Zeinstra, E. & Oudejans, R. R. D. (2001). Information and action in timing the punch of a falling ball. Quarterly Journal of Experimental Psychology, 54A, 69-93.
Montagne, G., Fraisse, F., Ripoll, H. & Laurent, M. (2000). Perception-action coupling in an interceptive task: first-order time-to-contact as an input variable. Human Movement Science, 19, 59-72.
Newell, K. M., Hoshizaki, L. E. F., Carlton, M. J. & Halbert, J. A. (1979). Movement time and velocity as determinants of movement timing accuracy. Journal of Motor Behavior, 11, 49-58.
Newell, K. M., Carlton, L. G., Kim, S. & Chung, C. H. (1993). The accuracy of movement in space-time. Journal of Motor Behavior, 25, 33-44.
Paillard, J. (1996). Fast and slow feedback loops for the visual correction of spatial errors in a pointing task: a reappraisal. Canadian Journal of Physiology & Pharmacology, 74, 401-417.
Peper, C. L., Bootsma, R. J., Mestre, D. & Bakker, F. C. (1994). Catching balls - how to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception & Performance, 20, 591-612.
Plamondon, R. & Alimi, A. M. (1997). Speed/accuracy trade-offs in target-directed movements. Behavioral & Brain Sciences, 20, 279-296.
Poulton, E. C. (1950). Perceptual anticipation and reaction time. Quarterly Journal of Experimental Psychology, 2, 99-112.
Regan, D. (1992). Visual judgments and misjudgments in cricket, and the art of flight. Perception, 21, 91-115.
Schmidt, R. A. (1969). Movement time as a determiner of timing accuracy. Journal of Experimental Psychology, 79, 43-47.
Schmidt, R. A. (1988). Motor control and learning: A behavioral emphasis (2nd edition). Human Kinetics.
Schmidt, R. A., Zelaznik, H. N., Hawkins, B., Frank, J. S. & Quinn, J. T. (1979). Motor output variability: A theory for the accuracy of rapid motor acts. Psychological Review, 86, 415-451.
Smith, M. R. H., Flach, J. M., Dittman, S. & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception & Performance, 27, 395-410.
Sparrow, W. A. (Ed.) (2000). Energetics of human activity. Human Kinetics.
Tresilian, J. R. (1993). Four questions of time to contact: a critical examination of research on interceptive timing. Perception, 22, 653-680.
Tresilian, J. R. (1994). Perceptual and motor processes in interceptive timing. Human Movement Science, 13, 335-373.
Tresilian, J. R. (1997). The revised tau-hypothesis: A consideration of Wann's (1996) analyses. Journal of Experimental Psychology: Human Perception & Performance, 23, 1272-1281.
Tresilian, J. R. (1999). Visually timed action: time out for tau? Trends in Cognitive Sciences, 3, 301-310.
Tresilian, J. R. & Houseman, J. H. (2003). A simple empirical relationship describes performance for a class of interceptive action. Manuscript submitted for publication.
Tresilian, J. R. & Lonergan, A. (2002). Intercepting a moving target: effects of temporal precision constraints and movement amplitude. Experimental Brain Research, 142, 193-207.
Tresilian, J. R., Oliver, J. & Carroll, T. J. (2003). Temporal precision of interceptive action: differential effects of target speed and size. Experimental Brain Research, 148, 425-438.
Turvey, M. T. & Carello, C. (1986). The ecological approach to perceiving-acting. Acta Psychologica, 63, 133-155.
Turvey, M. T., Shaw, R. E., Reed, E. S. & Mace, W. (1981). Ecological laws of perceiving-acting: A reply to Fodor and Pylyshyn. Cognition, 9, 237-304.
Tyldesley, D. A. & Whiting, H. T. A. (1975). Operational timing. Journal of Human Movement Studies, 1, 172-177.
Van Donkelaar, P., Lee, R. G. & Gellman, R. S. (1992). Control strategies in directing the hand at moving targets. Experimental Brain Research, 91, 151-161.
Wallace, S. A., Stevenson, E., Weeks, D. & Kelso, J. A. S. (1992). The perceptual guidance of grasping a moving object. Human Movement Science, 11, 691-716.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception & Performance, 22, 1031-1048.
Watts, R. G. & Bahill, A. T. (1990). Keep your eye on the ball: the science and folklore of baseball. New York: Freeman.
Woodworth, R. W. (1899). The accuracy of voluntary movement. Psychological Review Monograph Supplements, 3 (whole number 13).
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1999). Dynamics of reaching for stationary and moving objects: Data and model. Journal of Experimental Psychology: Human Perception & Performance, 25, 149-161.
Time-to-Contact - H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 8
The Information-based Control of Interceptive Timing: A Developmental Perspective
Paulien van Hof
Vrije Universiteit, Amsterdam, The Netherlands
John van der Kamp Vrije Universiteit, Amsterdam, The Netherlands
Geert J. P. Savelsbergh Vrije Universiteit, Amsterdam, The Netherlands Manchester Metropolitan University, Manchester, U.K.
ABSTRACT

The present chapter deals with the development of the informational basis of infants' timing of interceptive actions such as eye blinking to avoid an approaching object or catching to intercept one. We show that during the first year of life infants discover the affordances for avoidance and interception and further improve the perception of these affordances. Concurrently, the movements performed to actualize these affordances appear to get better attuned to the situation. That is, experimental findings suggest that the first rudimentary movements appear to be based on non-specifying optical variables (e.g., optical angle), which result in sub-optimal performance. However, with age the infants appear to converge onto specifying variables (e.g., τ), and hence become more successful in performing the task. It is concluded therefore that changes in the development of the information-based regulation of interceptive actions can be understood as an education of attention, that is, a convergence to the more useful information.
1. Introduction

Acting in response to objects that move in our environment is an everyday occupation. These interceptive actions must be geared to information about ourselves and our surroundings. This information is mainly obtained visually. Timing is the defining characteristic of these interceptive actions. For example, braking the car when an animal suddenly darts across the road has to be done in time to avoid an accident. This chapter deals with the control of interceptive timing from a developmental perspective. To this end, we consider the development and performance of time demanding actions, such as catching and avoiding moving objects, during the first months after birth. We provide an overview of how infants time and guide their actions in relation to the optical information sources they use. We show that infants are quite capable of coping with certain timing demands, even shortly after birth. But before addressing that, several other issues have to be pointed out.
2. Trends in research on the development of infant behavior

Avoiding or making contact with moving objects is a useful vehicle to study how and which sources of information contribute to the control of movement timing in infancy. However, rather than just describing the development of time demanding actions, such as avoiding or grasping a moving object, the emphasis of the present chapter is on the (changing) role of the sources of information that are involved in the timing of these actions. Although the informational basis of interceptive timing is well studied in adults, this has hardly been done in infancy.

2.1 Toward the processes of change underlying development

During the nineteen thirties and forties, research on infant behavior exploded. The pioneering work of Myrtle McGraw, Arnold Gesell, and Henry Halverson established the so-called 'milestones' of development (Pick, 2003). The primary mechanism of developmental change in motor behavior was believed to be neuromuscular maturation. The widespread acceptance of this view brought further research on the underlying mechanisms of motor development almost to a standstill until the eighties (see e.g. Thelen, 1995). Over the past 25 years, research on infant behavior has been rejuvenated. Developmental psychologists realized that newborns were not totally disabled organisms, and eventually this realization pushed aside the decades-long reigning belief that newborns were equipped with only a few sensori-motor reflexes (e.g. Piaget,
1952). In spite of this renewed attention, the majority of the studies were limited to demonstrating the age of onset and the further development of remarkable and formerly unexpected abilities, and, as a result, remained primarily descriptive. Hence, it has provided extensive knowledge about the emergence and development of certain capabilities, but understanding of the underlying mechanisms of development proved to be more difficult to achieve. Nevertheless, more recently a new trend in the field of developmental psychology can be observed. Research topics are shifting from the demonstration of remarkable adult-like behavior in infancy to the processes underlying change. For example, Johnson and Johnson (2000) recently proposed that the development of object perception and object knowledge is dependent on infants' visual search strategies. By recording the eye-movements of 2- to 3.5-month-old infants, they showed that infants' search strategies changed with age. The reason why younger infants are limited in their perception of object unity appears to be correlated with their inefficient search strategies: younger infants do not scan the entire stimulus and fixations are limited to uninformative regions of the display. Or, in other words, the 3.5-month-old infants in this experiment were able to direct their attention to relevant information, whereas the 2-month-olds were not. Also using detailed analysis of infants' search strategies, Taga, Ikejiri, Tachibana, Shimojo, Soeda, Takeuchi, and Konishi (2002) revealed that prolonged fixation on a single object at 2 months (also called 'obligatory attention' (Stechler & Latz, 1966)) may be responsible for the inability to discriminate changes in other objects. In short, changes in infants' control of attention may be one of the factors that induce changes in performance.

2.2 Multi-factorial nature of development

Interceptive action is a useful yardstick to study how and which sources of information contribute to the control of timing. At least this is feasible in adults; research in infancy has the additional difficulty that large changes may take place in a multitude of factors, which cannot be easily controlled experimentally. For instance, muscular strength and physical strength (e.g., Wimmers, Savelsbergh, van der Kamp & Hartelman, 1998), control over the mechanically unstable arm (e.g., Out, Savelsbergh, van Soest & Hopkins, 1997) and postural control (e.g., Rochat, 1992), and traditionally the maturation of the brain, have been indicated as factors in the development of interceptive actions. A small change in only one of these factors might induce new behavior (see Thelen & Smith, 1994). The informational basis may be one of these factors that contribute to the development of controlling interceptive timing, although not exclusively so. For example, different skill levels in interceptive action may be related to the exploitation of different sources of information at different ages.
However, the precise role of the informational basis in the development of interceptive actions is yet to be established.

2.3 Optical information for action vs. optical information for perception

Another issue is that a distinction has to be made between the use of optical information to obtain perceptual knowledge about moving objects, and the use of optical information to control interceptive actions. That is, it has recently been proposed that two functionally dissociable systems might underlie vision for action and vision for perception (Milner & Goodale, 1995). Analogously, during development the informational bases for perception and action might be different and follow separate trajectories (Van der Kamp & Savelsbergh, 2000, 2002; Berthier, Bertenthal, Seaks, Sylvia, Johnson & Clifton, 2001; Newman, Atkinson & Braddick, 2001). Therefore, it should be realized that observations on the development of perception, such as those indicated by habituation and preferential looking studies, cannot be automatically generalized to the informational regulation of (interceptive) actions. The experimental settings of the perception and action studies differ in most cases, making a comparison difficult, and in disparate studies differences in the available sources of optical information are likely to occur (e.g. Newman, Atkinson & Braddick, 2001; Van der Kamp & Savelsbergh, 2000). For instance, Von Hofsten (1982) found that newborns' arm movements appear to be adapted to the movement direction of a passing object, indicating that newborns use optical information about the movement direction of the object to guide their arm movements. In contrast, with the use of habituation and preferential looking methods, Wattam-Bell showed that only from 10 weeks onwards do infants begin to discriminate the direction of slowly moving stimuli, and that it is not before 4 months that infants' perception of directional information becomes more robust (1992, 1996; cf. Van der Kamp & Savelsbergh, 2000). This example illustrates that when aiming to understand which informational variables might give rise to changes in the control of infants' movements, and how, one has to study infants' actions themselves and should not take the findings from perception studies as the only starting point.
3. Ecological psychology approach

Optical information is primary in guiding the spatio-temporal characteristics of our movements; it provides the most detailed information about objects, places and events that can be used for movement control. Traditionally, optical information was held to be inherently ambiguous because of the loss of information in projecting a three-dimensional world onto a two-dimensional
retina. Given this ambiguity, visual perception of the outside world could only be achieved by combining visual sensations with extra-visual information and cognitive processes such as interpretation and association based on past experiences (Berkeley, 1709/1910; Helmholtz, 1867/1925; Piaget, 1952, 1954; Rock, 1983, 1997; Gregory, 1993). An alternative to this traditional point of view is the more recent, ecological psychology approach to perception and action of James J. Gibson (1966, 1979). Gibson (1966, 1979) stressed that the world, and not the retinal image, is the starting point for visual perception. He proposed that changes in the spatial patterns of light reaching a person are sufficient to specify the nature of the surrounding world. In other words, the optic array is structured and contains information about the layout of the environment. This optical information is specific to the objects and events in the environment and can be used to directly guide movement. Knowledge or complicated cognitive processes are therefore not needed for perception and action; perception and action require only the 'pick-up' or detection of available information (Gibson, 1979; Michaels & Carello, 1981). Therefore, Gibson's ecological view is also known as the direct perception perspective. Prominent in the ecological view is Gibson's notion of affordance (Gibson, 1966, 1979). Affordances express the relation between the animal and its environment, as they link perceiving to acting. "The affordances of the environment are what it offers animals, what it provides or furnishes, either for the good or ill" (Gibson, 1979, p. 127). An affordance for a particular organism is a property of the environment that affords relevant behavior to the organism. In other words, it relates the action possibilities of an actor to its environment. Affordances are properties taken with reference to the organism. For example, a large object may afford one-handed grasping for an adult, but not for a child whose small hand will not fit around it; for the child it affords two-handed grasping (e.g. Van der Kamp, Savelsbergh & Davids, 1998). To discover an affordance for action, one has to detect the information specifying it (Michaels & Carello, 1981). Our interest is in when and how infants come to discover affordances of objects and events. According to Eleanor J. Gibson (1982, 1988), affordances are discovered with the aid of the perceptual systems and exploratory behavior. Given the perception of an affordance, an infant has the possibility to act. Furthermore, for the regulation of that action, detection of optical variables useful for movement control is needed. The next section discusses the development of perception and action from an ecological viewpoint and its key concepts, exploration and differentiation.
4. The ecological approach to development

Nowadays, we assume that infants learn to detect relevant information from the optical flow, at least in particular situations, but this assumption has not always been accepted, and strictly speaking it is a relatively new idea. In line with the classical inferential point of view that a meaningful image of the outside world can only be achieved by enriching the incoming bare sensations with cognitive inferences, the world of the newborn was long characterized as a blooming, buzzing confusion. Because infants seem to do so little during their first months after birth, it was long thought that the young infant was nearly blind and that it was this 'handicap' that hindered the young infant from learning about the world. But research over the past three to four decades has radically restructured the ideas about the origins of space and motion perception. The old but durable view that infant development begins with sensory systems providing meaningless, disorganized sensations no longer seems tenable. Although human infants do have a smaller perceptual repertoire than adults, infants can visually perceive their environment with the objects and events occurring in it. It even seems that looking is the major means of information gathering in the relatively immobile newborn. By actively exploring the environment - moving the eyes, head, and hands - infants detect information and discover affordances (E. J. Gibson, 1988; Van der Kamp & Savelsbergh, 1994; Savelsbergh, Wimmers, Van der Kamp & Davids, 1999). The discovery of affordances implies that an infant discovers his or her possibilities for action in a particular environment. Even in young infants the urge to explore their environments is impressive. However, since perception and action systems develop, exploratory possibilities change with age, which provides new possibilities to discover affordances. Exploration is not a blind random search; it is specific and adapted to the properties of the environment (E. J. Gibson, 1988; Palmer, 1989). In short, the discovery of affordances depends both on the information available to the infant and on the infant's action capabilities. Eleanor Gibson (1988) outlined the development of exploratory behavior in three overlapping phases. The first phase extends from birth to about four months. Although limited mobility and poor visual acuity constrain an infant's scope of exploration to his or her immediate field of vision, discovery of some basic properties of objects is made possible by visual attention to events and in particular to motion. Stationary objects are less attended to. In phase two, beginning around the fifth month, infants start to attend to objects, facilitated by more frequent reaching behavior, increasing visual acuity and the availability of stereoscopic information for depth. For exploration, the infant is not only dependent on motion of the object; his or her own mastering of actions also leads to the discovery of new affordances. For
example, an object may be graspable or throwable. In the third phase, beginning around the eighth or ninth month of age, further development of the visual system and self-initiated locomotion such as crawling enable the infant to enlarge the scope of exploration. Properties of the extended environment around corners, behind obstacles and behind oneself can be discovered. Affordances of places for hiding and playing are open for investigation. The aforementioned phases show the relation between the available action capabilities, exploratory activity and the discovery of affordances. Exploration serves to gather information relevant for choosing between future actions, which implies that, besides being closely related to the infant's present action capabilities, the discovery of affordances is also partly determined by the infant's current goals of behavior. Infants have to discover what they want to perceive or what action they want to perform. As such, the discovery of affordances can be regarded as a broadening of intention. Many perceptions and actions may be possible in any situation. Improvements in which of the possible perceptions or actions one intends to actualize can be interpreted as a first robust channeling of attention to detect optical variables that are correlated to the to-be-perceived affordance or environmental property. However, the optical variable whose detection is entailed by a particular intention might change. The process of coming to attend to the optical variable that specifies the to-be-perceived environmental property is what J. J. Gibson (1966) referred to as the education of attention. Eleanor Gibson (1969) has used the term differentiation or selection to refer to this education of attention. She argued that perception is a matter of differentiating what is in the available information, that is, selecting useful information from a large variety of optical variables. In other words, perceivers and actors may change from detecting non-specifying variables to the more useful variables. Another process that may underlie development and learning is calibration. By calibration perception or action is tuned, that is, the same information is detected or used but its relation with perception or action is modified. Given the large variation in perceptual-motor behavior during infancy, the process of calibration in infancy is perhaps difficult to assess experimentally. Hence, it will not be discussed in this chapter. The education of attention and calibration entail convergence towards more useful non-specifying or even specifying optical variables (Jacobs, 2002; Jacobs & Michaels, 2002).¹ The following example serves to illustrate both processes in infancy. Infants actively explore their environment in order to discover affordances and to detect information. But an infant is not only prepared to discover affordances in a general sense, he (or she) is also able to act upon them adaptively. For example, a moving object affords grasping when it fits into the infant's hand and when its approach velocity is manageable. It will elicit reaching and grasping behavior when information specifying the affordance graspable is detected. Given the affordance, multiple information sources may contribute to the control of the infant's actions. Possibly depending on the action outcome, the infant may select more relevant information sources from the optical flow (see E. J. Gibson, 1988). That is, with experience attention gets educated to more useful information (e.g. in the case of timing an eye blink, from information specifying distance to information specifying time). Finally, it has to be mentioned that the discovery of affordances and the detection of useful information, and improvements herein, are not necessarily chronological processes, but are more likely to be intertwined. For instance, education of attention not only underlies changes in the detection of information to improve the control of action, but also the improvement of the perception of affordances. Possibly, using the more useful optical variable to control action (due to education of attention) might induce an improvement in the perception of the affordance. An interesting question, then, would be whether improvement of the perception of an affordance is related to improvements in movement control and vice versa.

¹ A perceiver's or actor's ability to accurately perceive or perform an action in a particular environment does not necessarily imply that the perceiver or actor has relied on a specifying variable. In some situations a variable may correlate highly with a particular property even when it does not completely specify that property. This lack of specificity may prove to be deleterious for performance in other situations. Hence the phrase useful non-specifying variables.
5. The informational basis of interceptive actions
A striking feature of the large adult literature on the control of interceptive timing is that multiple information sources have been shown to contribute to the timing of interceptive actions. A key source of information in this literature is τ. Tau (τ(φ)) is an optical variable that directly, without the need for any further knowledge about object size or speed, specifies the exact time-to-contact (TTC) (Lee, 1976). Tau is the ratio of the optical angle subtended at the point of observation by the edges of the object at a given instant to the rate of change of that optical angle. Both are present in the expansion pattern that the approaching object generates on the retina of each eye. Lee formally demonstrated that the inverse of the relative rate of expansion gives an accurate measure of the TTC of the object with the eye, as long as the object is not too large and provided that the velocity of the object relative to the observer is constant.
Figure 1: A ball of diameter D approaching on a collision course with the eye at speed V. The object subtends an optical angle φ(t) and is at a distance Z(t) from the point of observation. For small values of φ (i.e., the angle subtended at the point of observation by the edges of the object), the organism-environment property time-to-contact (TTC) is specified by the optical variable τ(φ).
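For concreteness, the relation can be written out as follows (a sketch using the quantities defined in Figure 1; it assumes a small optical angle and a constant approach speed V, i.e., exactly the two provisos mentioned above):

```latex
\varphi(t) = 2\arctan\!\left(\frac{D}{2Z(t)}\right) \approx \frac{D}{Z(t)},
\qquad
\dot{Z}(t) = -V \;\Rightarrow\; \dot{\varphi}(t) \approx \frac{D\,V}{Z(t)^{2}},
\qquad
\tau(\varphi) = \frac{\varphi(t)}{\dot{\varphi}(t)} \approx \frac{D/Z(t)}{D\,V/Z(t)^{2}} = \frac{Z(t)}{V} = \mathrm{TTC}(t).
```

The object's size D and its speed V cancel in the ratio, which is why τ specifies TTC without any further knowledge of either.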
Lee's seminal article on τ set the stage for a research paradigm in ecological psychology (e.g., Lee & Reddish, 1981; Lee, Young, Reddish, Lough & Clayton, 1983; Todd, 1981; Savelsbergh, Whiting & Bootsma, 1991; Savelsbergh, Whiting, Pijpers & Van Santwoord, 1993). An important reason for this great interest in the role of τ in timing interceptive actions has been its status as a paradigm example of J. J. Gibson's ecological approach: the optical variable τ directly specifies the actor-environment property TTC without the need for complex computations, logical inference or other constructive processes. Detecting τ is perceiving TTC. Most studies during the 1980s and early 1990s focused on demonstrating that human and animal movements are timed at a constant TTC. Since TTC was thought to be uniquely specified by the optical variable τ, it was assumed that showing that (interceptive) movements are initiated at a particular time before contact sufficiently proved the exclusive involvement of τ in the control of the temporal aspects of interceptive actions. But in spite of the elegance of Lee's τ-hypothesis, there are theoretical and empirical limitations to its explanatory value. Binocular vision enhances performance, the interception point often does not coincide with the point of observation, and objects do not always move at a constant velocity. Besides, much of the empirical evidence related to τ is indirect and open to alternative interpretations. This indicates that τ cannot be the sole optical variable used in the control of interceptive actions. For example, as an object approaches, other informational variables, such as optical size and the rate of change of optical size, co-vary with τ and may therefore also be involved.
The finding that timing is indeed dependent on object size and approach velocity eventually resulted in the conviction that interceptive actions may be based on the pick-up of sources of information that do not always specify TTC. For an elaboration of this argument, the reader is referred to the chapter by Caljouw, van der Kamp, and Savelsbergh (this volume). Hence, besides τ, other monocular information sources, such as optical size and its absolute rate of change (Michaels, Zeinstra & Oudejans, 2001; Smith, Flach & Dittman, 2001; Van der Kamp, Savelsbergh & Smeets, 1997), relative size, and texture gradients (DeLucia, 1991), have been suggested to contribute to the regulation of interceptive timing tasks. In addition, oculomotor information, such as accommodation (Gogel, 1977; Leibowitz, Shiina & Hennesy, 1972) and vergence (Heuer, 1993), and binocular sources of information, such as disparity, its rate of change (Bennett, van der Kamp, Savelsbergh & Davids, 1999, 2001; Judge & Bradford, 1988; Mon-Williams & Dijkerman, 1999; Van der Kamp, Bennett, Savelsbergh & Davids, 1999) and the τ-function of disparity (Laurent, Montagne & Duray, 1996; Van der Kamp et al., 1997), have been implicated in the regulation of interceptive timing.
6. Potential optical variables contributing to the timing of interceptive actions during infancy
The observations on the information-based control of timing in adults may serve as a starting point for gaining insight into the informational basis of infants' movements. As illustrated in Figure 2, habituation, preferential looking, and VEP (visual evoked potential) studies have shown that the different types of optical variables implicated in adult timing become detectable along different developmental trajectories. Roughly, the sequence runs from monocular kinetic information, to binocular information, to monocular static information (Atkinson, 2000; Kellman & Arterberry, 2000; Yonas & Granrud, 1985). Note, however, that this summary is based on studies that were mostly concerned with habituation and preferential looking methods, that is, with perceptual judgments, and did not directly test whether these information sources are actually used to control movement. To what extent this sequence accounts for changes in infants' timing of movements directed at moving objects is reviewed in the next sections.
[Figure 2: monocular kinetic information (optical expansion) is detected earliest; binocular information follows (accommodation and vergence from about 4 weeks, disparity from about 11-13 weeks); monocular static information (pictorial cues) is detected from about 7 months; the age axis runs to about 30 weeks post term.]
Figure 2: Schematic representation of the developmental trajectory of the infant's detection of monocular kinetic, binocular, and monocular static information sources (adapted from Atkinson, 2000, p. 89).
7. Infants' movements for moving objects
Moving objects are among the first things that attract infants' attention. From about one month onwards infants track them with their eyes, and older infants also act on them, that is, they try to obtain an object that is moving right in front of them. From four or five months of age most infants are capable of grasping the moving object, which illustrates that they can use optical variables to control the timing of their actions. Research on eye blinking to looming stimuli has shown that the detection and use of optical variables for the control of actions exist from early on (e.g., Yonas, 1981). In the remainder of this chapter, the development of the control of timing of avoidance behavior and interceptive actions is reviewed. Coming up for discussion are questions like: What affordances do infants discover, and at what age? What optical variables do infants detect, and to which do they gear their movements? Does this change with development, and if so, how (cf. education of attention)?
7.1 Approaching objects: Avoidance or defensive behaviour
The optical projection of an approaching object expands symmetrically as the object comes closer to collision with the observer. Because the magnitude of the optical angle subtended by the object is inversely related to the distance from the point of observation, this growth is not linear: during the last part of the approach the increase is more rapid than during the first part. This explosion, or rapid increase, has been called looming, and it provides information that collision is imminent (Gibson, 1966, 1979). Even viewing a virtual impending collision is often convincing enough to elicit a response. Take, for example, the first film shown in public by the Lumière brothers (Paris, 1895): a large train on the screen came rushing towards the audience. The train seemed so real that it frightened most onlookers; they recoiled or withdrew their heads from the screen. Some of them even yelled or fainted. This panic reaction was believed to provide sufficient proof of the power of this illusion. Nowadays we are more accustomed to such movie scenes, but research has shown rather clearly that a visual illusion of an approaching object on a collision course is convincing enough to elicit specific reactions in humans and several other species (Schiff, 1965; Schiff, Caviness & Gibson, 1962). In the majority of this research a shadow-casting device is used: a movable object between a point-light source and a projection screen serves as the shadow-casting occluder. When screen, object and lamp are aligned, movement of the object towards the point source produces a symmetrically expanding shadow on the rear of the screen. For an adult observer this creates the visual experience of an approaching object on a collision path, which is convincing enough to elicit a number of responses: the person may blink his eyes, dodge out of its path, or attempt to block the object's approach. How about the development of such responses to an approaching object? When one considers the action repertoire of an infant, what could an optically expanding shadow signify? In other words, what does an optically expanding shadow afford an infant? It might be that infants perceive that something is actually coming towards them. Nevertheless, infants are likely to be limited in their abilities to execute full avoidance behavior early in life. What responses to an expanding shadow do infants exhibit during the first months after birth? And are these responses indicative of infants' ability to detect information for impending collision? What sources of information are effective in eliciting these responses in infants, and what information sources are used in controlling the timing of these responses? The next section will deal with these questions.
7.2 The discovery of the affordance of impending collision
After Schiff (Schiff, 1965; Schiff, Caviness & Gibson, 1962), who was the first to use 'looming stimuli' to elicit a defensive response in several animal species, Bower, Broughton and Moore (1970; but see also Ball and Tronick, 1971) reported that human infants as young as 6 days old exhibited defensive behavior, such as head retraction, raising of the arms, and blinking, when confronted with optically expanding shadow patterns. Moreover, the intensity or likelihood of the responses was not different for a virtual and a real object (Ball & Tronick, 1971), despite the fact that a real approaching object also provides additional information such as air displacement. This has led to the conclusion that animals and even very young human infants can use the optical information contained in the expanding pattern to perceive impending collision. A couple of years thereafter, however, a debate arose as to whether infants really perceived the affordance for avoidance. Infants' responses to looming stimuli appeared open to various interpretations. Yonas, Bechtold, Frankel, Gordon, McRoberts, Norcia, and Sternfels (1977) showed that the head and arm movements could be interpreted more properly as instances of tracking behavior: infants would move their heads because they visually track the rising top contour of the pattern, and their relatively undifferentiated motor behavior may have led to the arms following along. However, young infants do seem to be aware of collision information available in the expanding optical pattern. Blinking at virtual impact may serve as an indicator of this. Blinking might prevent damage to the sensitive cornea of the eyes; therefore, it would be adaptive for the infant to be able to pick up when something is on a collision course and make a blink to protect the eyes. On this assumption, eye blinking can be interpreted as an indicator of the infant's ability to detect collision information available in the expanding optical pattern2. Eye blinking to looming stimuli is not attributable to tracking behavior, since it occurred significantly more frequently to displays specifying approach than to displays specifying a non-expanding rising contour (Yonas et al., 1977). Even a 3- to 4-week-old infant blinks in the presence of an approaching object (Nanez, 1988), but blinking frequency increases dramatically during the first months. Furthermore, infants appear to discriminate between the affordances of approaching obstacles and apertures. Reliable effects of blinking to approach displays, more than to control displays such as stimuli indicating approaching apertures, have been found in several studies with infants from about 1 month on.
2 The defensive character of blinking is questionable, since it seems more logical to keep certain parts of the stimulus array in view in order to react properly to an approaching object. Nevertheless, blinking is considered the best indicator of awareness of stimuli on a collision course in infancy (Yonas, 1981).
For instance, Caroll and Gibson (1981) found that three-month-old infants pushed their head backward when faced with impending collision with an approaching solid, looming panel, but leaned forward when faced with an approaching panel with a window cut out. The solid panel and the window were identical in size to equate visual expansion rate. These results suggest that infants recognized the functional consequences of contact with these objects, or, in other words, differentiated the affordances of these various objects (see also Yonas, 1981; Yonas & Granrud, 1985; Yonas, Pettersen & Lockman, 1978; Nanez & Yonas, 1994; Li & Schmuckler, 1996). A general conclusion that can be drawn from the aforementioned studies on impending collision is that from one month of age infants perceive the affordance of impending collision, that is, the looming patterns have come to afford defensive behavior. With increasing age, blinking becomes more and more consistent and is elicited more frequently. At about six months, consistency and frequency are such that defensive blinking is considered established (Yonas et al., 1977). Speed of approach and object size seem to have an effect on the defensive behavior, but conclusive proof of these effects is lacking. Although Bower et al. (1970) argued that a slowly approaching object is more effective in eliciting an avoidance response in very young infants, Yonas et al. (1977) did not find evidence to support this claim: the amount of eye blinking in 1- to 2-month-old infants did not vary with the speed of the virtual impending collision. Unfortunately, no older infants were presented with varying approach speed conditions. Therefore, the effect of approach speed on defensive behavior remains unclear. The same uncertainty holds for the effect of object size. In some of the aforementioned studies it was found that the nearer the object, the more likely it was to elicit a defensive response (e.g., Bower et al., 1970; Yonas, Oberg & Norcia, 1978). Bower and colleagues (1971) tried to relate the effectiveness of closeness to retinal image size. They presented 8- to 17-day-old infants with different-sized objects at different distances, but their attempt failed because of the distress it produced, particularly when the objects were very close to the infants' faces. In sum, blinking at virtual impact suggests that the infant has learned to perceive the affordance of impending collision and to make accurate movements based on information available in the expanding optical pattern. This opens the possibility of examining which optical variables infants detect and use to regulate the timing of defensive blinking, and whether this changes during development (i.e., education of attention).
7.3 Changes in the optical variables used to control the timing of defensive movements
7.3.1 Monocular dynamic information
The optical expansion pattern of a virtual or real approaching object contains various optical variables that may or may not be picked up and used by infants to make a blink or control other adjustments. Provided that approach velocity is constant, the inverse of the relative rate of expansion gives an accurate measure of the TTC of the object with the eye, as long as the object is not too large. But other variables, such as the absolute rate of expansion, may be used as well. Bower, Broughton, and Moore (1970) demonstrated that the specific responses to the illusion of approaching objects shown by 1- to 2-week-old infants were affected by approach speed, suggesting that infants attended to the absolute rate of change of retinal image size rather than to the relative rate of expansion. However, Bower et al. did not provide any data on the temporal characteristics of the infants' responses; they only reported that the frequency of responses to virtual impending collisions decreased with increasing approach speed (cf. Yonas et al., 1977). Recently, Kayed and Van der Meer (2000) have measured the temporal characteristics of infants' eye blinking. They used the phenomenon that infants blink when confronted with an approaching object as a vehicle to examine which optical information 5- to 7-month-old infants use when timing that eye blink. Infants were seated 40 cm in front of a large screen on which a virtual approaching object was projected. The experiment was based on the shadow-caster experiments (e.g., Schiff et al., 1962; Schiff, 1965). In timing the blink, not only must the fact that something is virtually approaching be picked up; the infant also needs information about when to initiate the action. The involvement of three optical variables in timing the defensive blink was examined3: the optical angle, its absolute rate of change, and τ. To distinguish between the use of the optical angle, its absolute rate of change, or τ as the source of information in the control of timing, the 5- to 7-month-old infants were presented with various fast and slow approach velocities and constant acceleration conditions.
3 In fact, Kayed and Van der Meer (2000) compared four possible timing strategies, namely strategies based on the optical angle or its absolute rate of change, and time strategies based on τ or on actual TTC. Because the latter strategy is based on a property of the actor-environment system and not, unlike the other strategies, on an optical variable, it is not discussed here. However, it can be argued that actual TTC during accelerative approaches can also be specified by optical variables. For instance, Stewart, Cudworth, and Lishman (1993) have derived an optical variable that specifies actual TTC in the case of constant acceleration: the ratio of angular speed to angular acceleration of the approaching object.
For every blink made by each infant, the corresponding values of the optical angle, its absolute rate of change, and τ were determined. When exploring which optical variable infants used in timing the blink, it was assumed that the variable that an infant kept constant across the different approach conditions was the one used. None of the infants appeared to gear their blinks to a critical value of the absolute rate of change of the optical angle, but both the optical angle and τ were used to trigger the blink. The authors found that the infants who used the optical angle were significantly younger than the infants who used τ. In other words, the different age groups used different optical variables. Importantly, the use of an angle strategy led to difficulties in the case of accelerative collision courses. The infants who kept the optical angle constant across the different approach conditions were too late on the fastest accelerating approach, and blinked after the object would have hit them. The older infants used the more sophisticated τ-strategy, selecting a more useful optical variable that allowed them to deal not only with constant-velocity approaches, but also with the fastest accelerative collision courses. In other words, attention gets educated to an optical variable that specifies the property the infant intended to perceive. This difference in the use of optical variables with age is suggestive of a shift in the informational basis of blinking during development. Obviously, longitudinal observations are needed to ascertain this conclusion.
7.3.2 Binocular information
We already mentioned that Bower et al. (1970) and Ball and Tronick (1971) reported that the intensity of infants' responses to an optically expanding shadow and to a real approaching object is not that different, despite the fact that real approach events also provide additional information such as increasing disparity and convergence information, as well as changes in eye lens accommodation. But this was found for very young infants (6- to 20-day-olds and 2- to 11-week-olds), who most likely cannot yet detect binocular information. Yonas, Oberg, and Norcia (1978) investigated the development of sensitivity to binocular information for an object approaching on a collision course in older infants, namely 14- and 20-week-olds. They used a stereoscopic shadow caster, in which two oppositely polarized beams of light cast a double shadow of an object on a rear-projection screen. The infant viewed this screen through polarizing goggles, so that a single shadow was presented to each eye. For an infant with normal binocular vision, the two retinal images are fused by the convergence of the eyes, resulting in a stereoscopic percept of the approaching object. But for an infant without stereopsis, two objects appear to approach on a 'miss' path, one object veering off to the right, the other to the left. In the non-stereoscopic conditions the polarized filters in front of the point-light sources were removed, so that both shadows were available to both eyes of the infant.
Yonas et al. (1978) found that infants in both age groups remained oriented to the center of the screen for a longer period in the stereoscopic condition than in the control condition. Maintaining fixation in the stereoscopic condition does suggest that the eyes of both the 14-week-olds and the 20-week-olds were converging on the approaching virtual target. Nonetheless, only the 20-week-old infants showed meaningful responses to binocular depth information, as indicated by the increased frequency of blinking and amount of head withdrawal in the stereoscopic condition compared to the control condition.
7.3.3 Monocular static information
It has been shown that 3-month-old infants differentiate approaching apertures from approaching obstacles (Caroll & Gibson, 1981). Li and Schmuckler (1996) found that varying the salience of the background did not affect the looming responses to approaching obstacles in 3- to 5-month-old infants. One interpretation of this result is that background information (occlusion), and hence monocular static information sources, do not play a role in the regulation of defensive actions. In sum, the discovery of the affordance of impending collision seems to occur in early infancy; infants perceive the affordance of impending collision and will blink at virtual impact from about 1 month. Defensive blinking develops into a more consistent response during the first months after birth and is considered established at about 6 months of age. This age coincides with the age at which the informational basis of timing the defensive blink is suggested to change. Kayed and Van der Meer (2000) found that infants older than 6 months detected and used a different optical variable than the younger infants, namely the specifying variable τ, which may have allowed them to deal with the more demanding accelerative approaches. However, other sources of information, for example binocular disparity, are likely to be involved in the regulation of defensive actions as well. Kayed and Van der Meer's (2000) finding emphasizes the concept of differentiation (E. J. Gibson, 1988) or education of attention (Jacobs, 2002), since it indicates a convergence towards more useful information in the control of movements. Whether the same optical sources, or changes therein, underlie improvements in the perception of the affordance of impending collision remains unknown.
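To make the contrast between an angle-based and a τ-based timing strategy concrete, the following Python sketch (an illustration only: object size, distances, thresholds and the assumed blink latency are hypothetical, and it is not a reconstruction of Kayed and Van der Meer's stimuli) computes how much time would actually remain at the moment each strategy triggers the blink, for a slow, a fast, and an accelerating approach.

```python
# Illustrative sketch, not the authors' analysis: compare a blink triggered at a
# fixed critical optical angle with one triggered at a fixed critical tau value.
import numpy as np

D = 0.10              # object diameter (m), hypothetical
Z0 = 2.0              # starting distance (m), hypothetical
BLINK_LATENCY = 0.15  # assumed time needed to complete the blink (s)
TAU_CRIT = 0.40       # hypothetical critical tau value (s)

def approach(v0, a, dt=1e-4):
    """Optical angle, tau and actual TTC for an object approaching the eye."""
    t = np.arange(0.0, 10.0, dt)
    z = Z0 - v0 * t - 0.5 * a * t ** 2
    keep = z > 1e-3
    t, z = t[keep], z[keep]
    phi = 2.0 * np.arctan(D / (2.0 * z))   # optical angle (rad)
    phi_dot = np.gradient(phi, t)          # rate of optical expansion (rad/s)
    tau = phi / phi_dot                    # Lee's tau
    ttc = t[-1] - t                        # actual time remaining until contact
    return phi, tau, ttc

def ttc_when(condition, ttc):
    """Actual time remaining when the trigger condition first becomes true."""
    idx = np.argmax(condition)
    return ttc[idx] if condition.any() else float("nan")

# Calibrate the angle threshold so that it works for the slow approach:
phi_slow, tau_slow, ttc_slow = approach(v0=0.5, a=0.0)
PHI_CRIT = phi_slow[np.argmax(ttc_slow <= 0.40)]  # angle seen 0.4 s before contact

for label, v0, a in [("slow constant velocity", 0.5, 0.0),
                     ("fast constant velocity", 1.5, 0.0),
                     ("accelerating approach ", 0.2, 2.0)]:
    phi, tau, ttc = approach(v0, a)
    left_angle = ttc_when(phi >= PHI_CRIT, ttc)
    left_tau = ttc_when(tau <= TAU_CRIT, ttc)
    print(f"{label}: angle strategy leaves {left_angle:.2f} s, "
          f"tau strategy leaves {left_tau:.2f} s "
          f"(blink needs about {BLINK_LATENCY:.2f} s)")
```

With these hypothetical settings the angle criterion, calibrated on the slow approach, leaves roughly 0.13 s on the fast approach and less than 0.1 s on the accelerating one, too little for the assumed blink latency, whereas the τ criterion keeps roughly 0.3-0.4 s in reserve in all three cases; this mirrors the pattern of late blinks under the angle strategy described above.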
7.4 Approaching objects: Catching behaviour
For a successful catch, both spatial and temporal information about where and when to intercept the object is needed. It has long been thought that the timing of interceptive actions, like reaching for and catching a moving object, is an advanced perceptual-motor skill of which young infants are not capable (Kay, 1969). Infants' first actions were considered to be not much more than reflexes or fixed action patterns (DiFranco, Muir & Dodwell, 1978; White, Castle & Held, 1964). But a number of studies have shown that infants' actions are goal-directed and spatially and temporally coordinated, even though they are performed in an erratic way. For example, Von Hofsten (1982) found that when a newborn fixated a toy, his or her reaching movements terminated closer to the object than when the object was not fixated, even when the toy was moving slowly. This implies that infants' first arm movements are not just random or reflexive, but are (partly) under visual control. Although the newborns did not succeed in real interceptions, catching a moving object is an ability that seems to develop early in life. Strikingly, infants are capable of catching moving objects successfully as soon as the first reaches for stationary objects can be observed, at about 18 weeks of age (Out, Savelsbergh & Van Soest, 2001; Von Hofsten & Lindhagen, 1979). From that age onwards, infants' readiness to reach for fast-moving objects increases, as does the success of the grasps. Catching proficiency and the visual control of catching improve with age. By 18 weeks of age infants can catch an object moving at 30 cm/s, and by 8 months infants catch objects moving at 125 cm/s (Von Hofsten, 1980). By the age of 9 months infants have been found to achieve a 50 ms precision in catching a target moving at 120 cm/s (Von Hofsten, 1983). When reaching for a moving object the hand should be aimed at the perceived future interception point, rather than at the point where the object is seen when the reach is initiated. In the latter case, the infant would be too late to reach the interception point when an object is moving fast, due to the inertia of the limbs and the time lags involved in the conduction of neural signals (Bertenthal, 1996; Haith, 1994; Von Hofsten, 1993; Von Hofsten, Vishton, Spelke, Feng & Rosander, 1998). A number of findings demonstrate that from about 18 weeks of age infants' movements are directly aimed at the perceived future interception point (Out, Savelsbergh & Van Soest, 2001; Robin, Berthier & Clifton, 1996; Von Hofsten, 1979, 1980, 1983; Van der Meer, Van der Weel & Lee, 1994).
7.5 The discovery of the affordance catchableness
In the avoidance studies (see section 7.2) it was also observed that the virtual or real approach of an object not only afforded avoidance behaviour, but also reaching behaviour. For example, Yonas and colleagues (1977) presented 1- to 2-month-old infants with real objects approaching on collision and non-collision courses in order to investigate whether infants perceived the affordance of impending collision. Although they found that these young infants exhibited no full avoidance behaviour (as characterized by head withdrawal, raising of the arms and blinking), the 1- to 2-month-olds did discriminate between objects on a non-collision course and objects on a collision course. The infants showed more tracking behaviour during the non-collision trials than during the collision trials, indicating that the same object has different affordances when it approaches on different courses. In addition, in a pilot study, Yonas et al. (1977) tested 4- to 6-month-olds, who were more skilled reachers, in a similar experimental setting. The infants in this age group typically reached for the approaching object and grasped it as soon as it was within reach. The same event, an approaching object, thus has different affordances for 1- to 2-month-old and for 4- to 6-month-old infants: for the former the moving object is trackable, for the latter it is graspable. Similar observations were made in a study by Yonas et al. (1978) in which a stereoscopic shadow caster was used. It was found that in the initial seconds of the approach period, in which the expanding shadow does not yet cover a large part of the retina, 20-week-old infants, but not 14-week-olds, extended their hands outward in a reaching posture rather than upward to protect the face. From about 5 months onwards infants almost compulsively reach for stationary objects that are presented to them. Moving objects also interest infants, but they will not always elicit grasping behavior, since object size and approach speed appear to affect the likelihood of reaching. This can be deduced from several studies that investigated how the prospective control of reaching develops. For example, in the study by Van der Meer et al. (1994), in which the object's approach velocity influenced the occurrence of reaching behavior in 20-week-old infants, the fastest speed (8 cm/s) did not elicit reaching behaviour, whereas the lower object speeds did. Similar observations were made by Von Hofsten (1980, 1983), who found that reaching frequency decreased with increasing ball velocity. Also, in the Out et al. (2001) study 15- to 19-week-old infants were less likely to reach for the approaching ball when its velocity was high (range: 20-70 cm/s). Moreover, it was shown that in reaching for fast laterally moving objects (15 and 30 cm/s), 18- to 36-week-old infants often chose the hand contralateral to the side from which the toy approached (Von Hofsten, 1980). In doing this, the infants may have created more time to reach for the moving target, a strategy that becomes more and more convenient as object velocity increases. These examples illustrate the development of infants' perception of the affordance 'catchableness'.
From the emergence of catching, infants appear to discover more and more precisely what objects afford catching. Is this improvement of the perception of the affordance catchableness related to improvements in the information-based regulation of the timing of the catch?
7.6 The development of the information-based regulation of catching
In this section, studies that assessed the informational basis of infants' reaching for moving objects are discussed. What types of information do infants use in controlling the temporal aspect of their reaching actions, does this change with age, and if so, how? As in the previous sections, we will use the developmental sequence depicted in Figure 2 as a guide.
7.6.1 Monocular kinetic information
Van der Meer, Van der Weel and Lee (1994) carried out a study in which infants' catching behaviour was tested in a situation where a moving target could not be grasped simply by chasing it. A target object was moved in a straight line, from left to right and the reverse, at various speeds (6.5, 8, 11.5, and 13 cm/s) in front of the infant but behind a transparent screen, so that the infants could only grasp it as it passed a gap in the screen. Just before the object reached the gap, it was occluded from view by an occluder attached to the screen on the side of the gap. The 43- to 50-week-old infants all looked and reached towards the point where the toy would reappear, even before it disappeared. Grasping movements were always directed towards the interception point, rather than at the current position of the toy. Just as Von Hofsten (1979, 1980, 1983; Von Hofsten et al., 1998) found, infants do not simply chase the object or wait for it at the interception point, but show prospective control of catching. By varying the speed of the toy, it was also shown that both the gaze shift and the hand movement were initiated a certain time before contact (i.e., before the toy reached the interception point), and not when the toy was at a certain distance from the interception point. Van der Meer et al. (1994) argued that these movements could be timed on the basis of the ratio of the angular separation between the toy and the interception point to its rate of change over time. This τ-function of the angular separation between the toy and the interception point is another, mathematically equivalent, description of the original τ-hypothesis (the inverse of the relative rate of dilation; see e.g. Lee, 1976, 1980), in which the projection plane is perpendicular to the trajectory instead of parallel to it. Notice also that this τ-function specifies the time remaining before the object reaches the interception point, provided the object approaches at constant speed.
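Written out in symbols (our notation, not that of Van der Meer et al.; the approximation assumes a small angular separation, an approximately constant viewing distance, and a constant toy speed):

```latex
\tau(\theta) \;=\; \frac{\theta(t)}{-\dot{\theta}(t)}
\;\approx\; \frac{x(t)}{u}
\;=\; \text{time remaining until the toy reaches the interception point},
```

where θ(t) is the optical angle between the toy and the interception point, x(t) the distance the toy still has to cover, and u its constant speed; the minus sign appears because θ shrinks as the toy approaches the gap.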
In a second experiment, Van der Meer et al. (1994) tested younger infants, from the age of 16 weeks onwards, and showed that gaze is controlled prospectively as early as 24 weeks, and that prospective control of hand movements follows at 32 weeks (see also Berthier, Bertenthal, Seaks, Sylvia, Johnson & Clifton, 2001). From these ages onwards, infants gear their eye and hand movements to a critical value of a time-specifying variable such as τ. But before that age, infants were found to initiate their eye and hand movements when the toy was at a particular distance from the interception place. Hence, younger infants appear to use the optical angle, whereas older infants behave consistently with the use of the more useful specifying variable τ as a source of information in controlling the timing of the catch. This difference with age is suggestive of a convergence towards specifying optical variables. Attention gets educated to optical variables that are more useful for regulating the timing of interceptive movements (i.e., infants select an optical variable that allows them to cope with more situations)4. Again, the optical information used by the infants for the timing of their catching movements appeared to change with age, just as was found in the previously mentioned blinking study (Kayed & Van der Meer, 2000), only it seemed to occur at a later age: at about 32 weeks, compared to 6 months of age in the case of blinking. Note, however, that Van der Meer et al. (1994) also found that the informational basis of infants' gaze shifts appears to change at 24 weeks of age. These changes in the informational basis are not without benefit. Gearing interceptive movements to a critical value of the optical angle is not always efficient, since it only works when the object is moving slowly; in that case the infant has enough time to execute the catching movement. However, when the same critical value of the optical angle is applied at higher approach velocities, the infant has less and less time to move his or her hand to the interception point. And although the mean velocity of the wrist has been found to increase with object velocity (Out et al., 2001), eventually the hand cannot move any faster, causing the interception to fail (see also Van der Meer et al., 1995). The tendency to initiate catching movements at first at a critical value of the optical angle, and to shift later to the use of a different optical variable, was also found by Out, Savelsbergh and Van Soest (2001). In that study, infants were observed while catching an object that approached on a collision course with different velocities, at 15 weeks of age and again at 19 weeks.
4 However, it is important to note that the younger infants in these experiments were presented with lower object speeds than the older infants. This difference in object speeds potentially confounds the interpretation of the results. That is, it cannot be ruled out that all infants used looming, the use of which would predict smaller, and possibly more difficult to determine, timing differences between the higher object speeds than between the lower speeds.
As was found earlier, infants' catching movements were not reactive; that is, they did not move their hands towards the ball as soon as they saw it and then wait until the ball arrived at their hands. Catching movements started when the ball was still out of reach, and their initiation was geared to distance (as in Van der Meer et al., 1994). However, the distance at which a catching attempt was initiated was slightly larger at a ball speed of 0.4 m/s than at 0.2 m/s, which may suggest the absolute rate of change of the optical angle as a source of information, or perhaps indicate that the young infants are in transition from the use of the optical angle to a τ-strategy. The latter would be indicative of a developmental trend toward timing at a constant TTC. However, whether such a tightly constrained time strategy will be achieved remains a question, since even in adults the timing of catching is affected by object velocity and object size (Caljouw, Van der Kamp & Savelsbergh, 2003; Van der Kamp, 1999; Li & Laurent, 1995; Sidaway, Fairweather, Sekiya & McNitt-Gray, 1996).
7.6.2 Binocular information
The co-occurrence of the development of binocularity around 4 months and the emergence of successful catching suggests that binocular information sources may play a key role in the regulation of catching movements. The contribution of binocular information has so far been studied only in reaching for stationary objects (e.g., Von Hofsten, 1977; Yonas & Granrud, 1985; Yonas, Arterberry & Granrud, 1987), not for real moving objects. However, the role of binocular information in reaching has been tested with virtual moving objects. In the aforementioned stereoscopic shadow-caster experiment of Yonas, Oberg and Norcia (1978) it was found that 5-month-old infants, but not 3.5-month-old infants, reached significantly more in the stereoscopic condition than in the control condition, suggestive of the detection and use of binocular depth information at 5 months of age but not at 3.5 months of age, at least when deciding to make the reach. It remains to be seen whether binocular information also plays a role in movement control.
7.6.3 Monocular static information
Yonas and colleagues (see Yonas & Granrud, 1985, for a review) report a series of studies on the development of pictorial depth perception. Many of these studies used reaching as an index of infants' pictorial depth perception. Paired displays in which pictorial information specified that one object was nearer to the subject than the other were presented to monocularly viewing infants. Preferential reaching to the nearer display was taken to indicate the effectiveness of the pictorial cue. It was found that 7-month-old infants, but not 5-month-old infants, perceived pictorial depth information and used it when deciding to make a reach for a stationary object.
To what extent this information accounts for infants' movements for a moving object has yet to be investigated. However, findings in a study by Von Hofsten (1983) suggest that background information does not play an important role in the control of interceptive actions of 34- to 36-week-old infants. The study examined both how the reaches were aimed and how they were timed. An object moved in a horizontal circular path with various velocities (30, 45, and 60 cm/s) and starting positions. It was found that infants' reaches were closely aimed at the future interception point with remarkably precise timing. The tested infants appeared to use a flexible strategy: the moving object itself, instead of the static background, was shown to serve as the reference for the control of the catch. The infants moved their hands both toward and with the object at the same time, so that the resultant movement is automatically directed at the meeting point with the object (a simple sketch of this strategy is given at the end of this section). Hence, the object is not simply intercepted at a fixed position or timed with regard to its surroundings (cf. Van der Meer et al., 1994). Von Hofsten therefore concluded that the infants reached in reference to a coordinate system fixed to the moving object instead of to the static background; that is, simultaneously with the forward movement, the hand moved sideward with the object (Von Hofsten, 1983, pp. 83-84). Whether other monocular static information sources contribute to the regulation of the timing of catching remains unknown. In sum, from about 18 weeks of age infants discover what objects afford catching, and with increasing age the perception of the affordance catchableness improves (i.e., infants take object size and speed of approach into account when deciding whether or not to make a catching movement). Whether this improvement in the perception of catchableness is related to improvements in movement control remains unclear, since the precise informational basis of catching in infancy, and its development, is unknown. The findings in the reviewed studies, however, suggest that the optical information used by infants to time the interceptive action changes with age: young infants gear their catching movements to a critical value of the optical angle, while older infants use a time-specifying variable, such as τ, to time their catching movements. In addition, binocular information is likely to be involved in the regulation of catching movements, but its precise role is unknown.
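To make the object-centred strategy concrete, here is a minimal sketch (our own construction with hypothetical positions, speeds and closing speed; it is not a model fitted to Von Hofsten's data): the hand's velocity combines a component directed at the object's current position with the object's own velocity, so that in the object-fixed frame the gap simply closes along a straight line, while in the room-fixed frame the hand ends up at the meeting point rather than at the object's starting position.

```python
# Minimal sketch of reaching in a coordinate system fixed to the moving object:
# the hand moves toward the object's current position and, at the same time,
# with the object; the resultant path is aimed at the meeting point.
import numpy as np

dt = 0.01                       # time step (s)
obj = np.array([-0.30, 0.40])   # object start position (m), hypothetical
v_obj = np.array([0.30, 0.0])   # object velocity: 30 cm/s sideways, hypothetical
hand = np.array([0.0, 0.0])     # hand start position (m), hypothetical
closing_speed = 0.6             # hand speed toward the object (m/s), hypothetical

for step in range(1000):
    gap = obj - hand
    dist = np.linalg.norm(gap)
    if dist < 0.01:             # within 1 cm: treat as an interception
        print(f"intercepted after {step * dt:.2f} s "
              f"at x = {hand[0]:.2f} m, y = {hand[1]:.2f} m")
        break
    # hand velocity = 'toward the object' + 'with the object'
    v_hand = closing_speed * gap / dist + v_obj
    hand = hand + v_hand * dt
    obj = obj + v_obj * dt
```

With these hypothetical values the hand meets the object well to the right of the object's starting position, which is the signature of aiming at the meeting point rather than chasing the object itself.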
8. Summary and conclusions
The present chapter reviewed studies on the development and informational basis of infants' interceptive movements. The ecological approach to perception and action was taken as a framework for understanding how the development of movement coordination is constrained by optical information (cf. E. J. Gibson, 1988; J. J. Gibson, 1979; Michaels & Carello, 1981). Furthermore, principles of perceptual learning and development (cf. Jacobs & Michaels, 2002; E. J. Gibson, 1988) have been applied. The goal of the present chapter was to illustrate that developmental changes in the information-based regulation of interceptive actions can be understood as an education of attention (i.e., a convergence on the more useful information). We have shown that improvements in the perception of affordances and changes in the visual control of the movements that realize these affordances go hand in hand. However, whether these processes are informationally linked (in other words, whether the same optical sources, or changes therein, are involved) has not yet been examined. Once an infant has discovered an affordance and is able to actualize it by means of an effective information-movement coupling, does this transfer to other action systems? Given the resemblance in the temporal characteristics of avoiding a moving object and catching it, it seems plausible to assume a similarity in the information sources relevant for performing both actions successfully. Every source of information that is in some way tied to the approach of the object might be used to regulate the timing of avoidance or interceptive actions, and with age the infant improves in detecting the more specific information. To a certain extent this is what the literature shows. Yet it does not follow automatically that an infant who is able to detect and use the most relevant optical variable to time the initiation of his or her movements in one action is also able to detect and use the same optical variable in a different action. A major developmental process is the progress in the coupling between optical information and movement, and there is evidence that this process may have to occur for each action system as it emerges (Bremner, 2000). New action systems bring new affordances, and new affordances bring new information-movement couplings. To establish these information-movement couplings and subsequently select the most appropriate one for each situation, some exploratory practice seems essential (E. J. Gibson, 1988). For example, Adolph (2000) demonstrated that the affordance 'avoiding risky gaps' did not generalize across changes in posture. She studied the reaching and avoidance behavior of 9-month-old infants facing safe and risky real gaps while in a sitting and in a crawling posture. Because the infants were tested within the same session, they had more experience in keeping balance in the sitting posture than in the crawling posture. Adolph found that the proportion of trials with adaptive avoidance responses (i.e., the percentage of avoided risky gaps) was larger in the sitting posture than in the crawling posture.
Moreover, infants more closely matched their avoidance responses to the probability of falling in the sitting posture than in the crawling posture. These findings, together with Adolph's earlier finding (Adolph, 1997; Adolph, Eppler & Gibson, 1993) that infants who have learned which slopes pose a threat to stability while crawling have to relearn this when they begin to walk, demonstrate a specificity of learning at the level of the perception of affordances. That is, there was no transfer between the tasks. Since the realization of an affordance also concerns the coupling between information and movement, an important question is whether specificity of learning also holds for the convergence on the more useful optical variable in the control of movement (i.e., the education of attention). When an infant is able to detect and use a specifying optical variable in one task, is he or she able to detect and use it in another, similar task? Results in the reviewed studies seem to suggest that convergence on more useful specifying variables is domain-specific. In timing a defensive blink, infants of about 6 months of age seem to detect and use the time-specifying optical variable τ. In timing the catch, however, infants of the same age appear to detect and use a different optical variable, one that appears less useful for regulating the catching movements. If so, and that remains to be proven, this would be in agreement with recent proposals suggesting that different perceptual-motor skills are controlled independently (Milner & Goodale, 1995; cf. Fodor, 1983) and probably follow independent developmental trajectories (Atkinson, 2000; Berthier, Bertenthal, Seaks, Sylvia, Johnson & Clifton, 2001). Although the informational basis of interceptive actions in adults is well studied, this is hardly the case for infants' actions. More infant research is needed to gain insight into the informational basis of interceptive movements in infancy and into how changes therein come about. The identification of the optical variables that are actually used to control the action would be a first and indispensable step towards a better understanding of the informational basis of the control of time-demanding tasks, such as avoiding or catching moving objects, during development. This might set the stage for a more systematic formulation of the processes underlying perceptual-motor development.
REFERENCES
Adolph, K. E. (1997). Learning in the development of infant locomotion. Monographs of the Society for Research in Child Development, 62 (3, Serial No. 251).
Adolph, K. E. (2000). Specificity of learning: Why infants fall over a veritable cliff. Psychological Science, 11, 290-295.
Adolph, K. E., Eppler, M. A. & Gibson, E. J. (1993). Crawling versus walking infants' perception of affordances for locomotion over sloping surfaces. Child Development, 64, 1158-1174.
Atkinson, J. (2000). The developing visual brain. Oxford, UK: Oxford University Press.
Ball, W. & Tronick, E. (1971). Infant responses to impending collision: Optical and real. Science, 171, 812-820.
Bennett, S. J., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch I. Effects of telestereoscopic viewing. Experimental Brain Research, 129(3), 362-368.
Bennett, S. J., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (2000). Discriminating the role of binocular information in the timing of a one-handed catch: The effects of telestereoscopic viewing and ball size. Experimental Brain Research, 135(3), 341-347.
Berkeley, G. (1910). Essay toward a new theory of vision. London: Dutton. (Original work published in 1709).
Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431-459.
Berthier, N. E., Bertenthal, B. I., Seaks, J. D., Sylvia, M. R., Johnson, R. L. & Clifton, R. K. (2001). Using object knowledge in visual tracking and reaching. Infancy, 2, 257-284.
Bower, T. G. R., Broughton, J. M. & Moore, M. K. (1970). Infant responses to approaching objects: An indicator of response to distal variables. Perception and Psychophysics, 9, 193-196.
Bremner, J. G. (2000). Developmental relationships between perception and action in infancy. Infant Behavior & Development, 23, 567-582.
Caljouw, S. R., Van der Kamp, J. & Savelsbergh, G. J. P. (2003). Catching optical information for the regulation of timing (submitted).
Caroll, J. J. & Gibson, E. J. (1981). Differentiation of an aperture from an obstacle under conditions of motion by three-month-old infants. Paper presented at the meetings of the Society for Research in Child Development, Boston, MA.
DeLucia, P. R. (1991). Pictorial and motion-based information for depth perception. Journal of Experimental Psychology: Human Perception and Performance, 17, 738-748.
DiFranco, D., Muir, D. W. & Dodwell, D. C. (1978). Reaching in young infants. Perception, 7, 385-392.
Fodor, J. (1983). The modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press.
Gibson, E. J. (1982). The concept of affordances in development: The renascence of functionalism. In W. A. Collins (Ed.), The concept of development: The Minnesota symposia on child psychology, Vol. 15. Hillsdale, NJ: Erlbaum.
Gibson, E. J. (1988). Exploratory behavior in the development of perceiving, acting and the acquiring of knowledge. Annual Review of Psychology, 39, 1-41.
Gibson, E. J. & Pick, A. D. (2000). An ecological approach to perceptual learning and development. Oxford, UK: Oxford University Press.
Gibson, J. J. (1966). The senses considered as perceptual systems. Boston, MA: Houghton Mifflin.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.
Gogel, W. C. (1977). The metric of visual space. In W. Epstein (Ed.), Stability and constancy in visual perception: Mechanisms and processes (pp. 129-181). New York: Wiley.
Gregory, R. L. (1993). Seeing and thinking. Giornale Italiano di Psicologia, 20(5), 749-769.
Helmholtz, H. von (1925). Handbook of physiological optics, Vol. 3, J. P. S. Southall (Ed. and Trans.). New York: Dover. (Original work published in 1867).
Heuer, H. (1993). Estimates of time to contact based on changing size and changing target vergence. Perception, 22(5), 549-563.
Jacobs, D. M. (2002). On perceiving, acting and learning: Toward an ecological approach anchored in convergence. (Doctoral dissertation, Vrije Universiteit, Amsterdam).
Jacobs, D. M. & Michaels, C. F. (2002). On the apparent paradox of learning and realism. Ecological Psychology, 14, 127-139.
Johnson, S. P. & Johnson, K. L. (2000). Early perception-action coupling: Eye movements and the development of object perception. Infant Behavior & Development, 23, 461-483.
Judge, S. J. & Bradford, C. M. (1988). Adaptation to telestereoscopic viewing measured by one-handed ball-catching performance. Perception, 17(6), 783-802.
Kay, H. (1969). The development of motor skills from birth to adolescence. In E. A. Bilodeau (Ed.), Principles of skill acquisition. New York: Academic Press.
Kayed, N. S. & Van der Meer, A. (2000). Timing strategies used in defensive blinking to optical collisions in five- to seven-month-old infants. Infant Behavior & Development, 23, 253-270.
Kellman, P. J. & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge, MA: The MIT Press.
Laurent, M., Montagne, G. & Duray, A. (1996). Binocular invariants in interceptive tasks: A directed perception approach. Perception, 25(12), 1437-1450.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5(4), 437-459.
Lee, D. N. (1980). The optic flow field: The foundation of vision. Philosophical Transactions of the Royal Society of London B, 290, 169-179.
Lee, D. N. & Reddish, D. E. (1981). Plummeting gannets: A paradigm of ecological optics. Nature, 293, 293-294.
Lee, D. N., Young, D. S., Reddish, D. E., Lough, S. & Clayton, T. M. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Leibowitz, H., Shiina, K. & Hennesy, R. T. (1972). Oculomotor adjustments and size constancy. Perception and Psychophysics, 12, 497-500.
Li, F. X. & Laurent, M. (1995). Occlusion rate of ball texture as a source of velocity information. Perceptual and Motor Skills, 81(3, Pt 1), 871-880.
Li, N. S. & Schmuckler, M. A. (1996). Looming responses to obstacles and apertures: The role of accretion and deletion of background texture. Poster presented at the 10th Biennial International Conference for Infant Studies, April 17-21, 1996, Providence, RI.
Michaels, C. F. & Carello, C. (1981). Direct perception. Englewood Cliffs, NJ: Prentice-Hall.
Michaels, C. F., Zeinstra, E. B. & Oudejans, R. R. D. (2001). Information and action in punching a falling ball. Quarterly Journal of Experimental Psychology, 54A(1), 69-93.
Milner, A. D. & Goodale, M. A. (1995). The visual brain in action. Oxford, UK: Oxford University Press.
Mon-Williams, M. & Dijkerman, H. C. (1999). The use of vergence information in the programming of prehension. Experimental Brain Research, 128, 578-582.
Nanez, J. E. (1988). Perception of impending collision in 3- to 6-week-old human infants. Infant Behavior and Development, 11, 447-463.
Nanez, J. E. & Yonas, A. (1994). Effects of luminance and texture motion on infant defensive reactions to optical collision. Infant Behavior and Development, 17, 165-174.
Out, L., Savelsbergh, G. J. P. & Van Soest, A. J. (2001). Interceptive timing in early infancy. Journal of Human Movement Studies, 40, 185-206.
Out, L., Savelsbergh, G. J. P., Van Soest, A. J. & Hopkins, B. (1997). Influence of mechanical factors on movement units in infant reaching. Human Movement Science, 16, 733-748.
Out, L., Van Soest, A. J., Savelsbergh, G. J. P. & Hopkins, B. (1998). The effect of posture on early reaching movement. Journal of Motor Behavior, 30, 260-272.
Palmer, C. F. (1989). The discriminating nature of infants' exploratory actions. Developmental Psychology, 25(6), 885-893.
Peper, C. L., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: How to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20(3), 591-612.
Piaget, J. (1952). The origins of intelligence in children. New York: Basic Books.
Piaget, J. (1954). The construction of reality in the child. New York: Free Press.
Pick, H. L. (2003). Development and learning: An historical perspective on acquisition of motor control. Infant Behavior and Development, in press.
Regan, D. & Beverly, K. I. (1979). Binocular and monocular stimuli for motion in depth: Changing disparity and changing size feed the same motion-in-depth stage. Vision Research, 19(12), 1331-1342.
Richardson, K. (2000). Developmental psychology: How nature and nurture interact. London: Macmillan Press.
Robin, D. J., Berthier, N. E. & Clifton, R. K. (1996). Infants' predictive reaching for moving objects in the dark. Developmental Psychology, 32, 824-835.
Rochat, P. (1992). Self-sitting and reaching in 5-8 month old infants: The impact of posture and its development on early eye-hand coordination. Journal of Motor Behavior, 24, 210-220.
Rock, I. (1983). The logic of perception. Cambridge, MA: MIT Press.
Rock, I. (1997). Indirect perception. Cambridge, MA: MIT Press.
Savelsbergh, G. J. P. & Van der Kamp, J. (1993). The coordination of infants' reaching, grasping, catching and posture: A natural physical approach. In G. J. P. Savelsbergh (Ed.), The development of coordination in infancy (pp. 289-317). Amsterdam: Elsevier Science Publishers.
Savelsbergh, G. J. P., Whiting, H. T. A. & Bootsma, R. J. (1991). Grasping tau. Journal of Experimental Psychology: Human Perception and Performance, 17(2), 315-322.
Savelsbergh, G. J. P., Whiting, H. T. A., Pijpers, J. R. & Van Santvoord, A. M. M. (1993). The visual guidance of catching. Experimental Brain Research, 93, 146-156.
Savelsbergh, G. J. P., Wimmers, R. H., Van der Kamp, J. & Davids, K. (1999). The development of motor control and coordination. In A. F. Kalverboer, M. L. Genta & B. Hopkins (Eds.), Current issues in developmental psychology: Biopsychological perspectives (pp. 107-136). Amsterdam: Kluwer Academic Publishers.
Schiff, W. (1965). Perception of impending collision: A study of visually directed avoidant behavior. Psychological Monographs, 79, 1-26.
Schiff, W., Caviness, J. A. & Gibson, J. J. (1962). Persistent fear responses in rhesus monkeys to the optical stimulus of "looming". Science, 136, 982-983.
Sidaway, B., Fairweather, M., Sekiya, H. & McNitt-Gray, J. (1996). Time-to-collision estimation in a simulated driving task. Human Factors, 38, 101-113.
Smeets, J. B. & Brenner, E. (1995). Perception and action are based on the same visual information: Distinction between position and velocity. Journal of Experimental Psychology: Human Perception and Performance, 21, 19-31.
Smith, M. R. H., Flach, J. M., Dittman, S. M. & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception and Performance, 27(2), 395-410.
Stechler, G. & Latz, E. (1966). Some observations on attention and arousal in the human infant. Journal of the American Academy of Child Psychiatry, 5, 517-525.
Stewart, D., Cudworth, C. J. & Lishman, J. R. (1993). Misperception of time-to-collision by drivers in pedestrian accidents. Perception, 22, 1227-1244.
Taga, G., Ikejiri, T., Tachibana, T., Shimojo, S., Soeda, A., Takeuchi, K. & Konishi, Y. (2002). Visual feature binding in early infancy. Perception, 31, 273-286.
Thelen, E. (1995). Motor development: A new synthesis. American Psychologist, 50, 79-95.
Thelen, E. & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Todd, J. T. (1981). Visual information about moving objects. Journal of Experimental Psychology: Human Perception and Performance, 7(4), 795-810.
Tresilian, J. R. (1994). Approximate information sources and perceptual variables in interceptive timing. Journal of Experimental Psychology: Human Perception and Performance, 20(1), 154-173.
Tresilian, J. R. (1994). Perceptual and motor processes in interceptive timing. Human Movement Science, 13, 335-373.
Tresilian, J. R. (1999). Analysis of recent empirical challenges to an account of interceptive timing. Perception and Psychophysics, 61(3), 515-528.
Tresilian, J. R. (1999). Visually timed action: Time-out for 'tau'? Trends in Cognitive Sciences, 3(8), 301-310.
Van der Kamp, J. (1999). The information-based regulation of interceptive action. Doctoral dissertation, Vrije Universiteit, Amsterdam.
Van der Kamp, J., Bennett, S. J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch II. Adaptation to telestereoscopic viewing. Experimental Brain Research, 129(3), 369-377.
Van der Kamp, J. & Savelsbergh, G. J. P. (1994). Exploring exploration in the development of action. Research and Clinical Center for Child Development Report, 16, 131-139.
Van der Kamp, J. & Savelsbergh, G. J. P. (2000). Action and perception in infancy. Infant Behavior & Development, 23, 237-251.
Van der Kamp, J., Savelsbergh, G. J. P. & Davis, W. E. (1998). Body-scaled ratio as a control parameter for prehension in 5- to 9-year-old children. Developmental Psychobiology, 33, 351-361.
Van der Kamp, J., Savelsbergh, G. J. P. & Smeets, J. B. (1997). Multiple information sources in interceptive timing. Human Movement Science, 16(6), 787-821.
Van der Meer, A. L. M., Van der Weel, F. R. & Lee, D. N. (1994). Prospective control in catching by infants. Perception, 23, 287-302.
Van der Meer, A. L. M., Van der Weel, F. R., Lee, D. N., Laing, I. A. & Lin, J. P. (1995). Development of prospective control of catching moving objects in preterm infants. Developmental Medicine and Child Neurology, 37, 145-158.
Von Hofsten, C. (1977). Binocular convergence as a determinant of reaching behavior in infancy. Perception, 6, 139-144.
Von Hofsten, C. (1979). Development of visually directed reaching: The approach phase. Journal of Human Movement Studies, 5, 160-178.
Von Hofsten, C. (1980). Predictive reaching for moving objects by human infants. Journal of Experimental Child Psychology, 30, 369-382.
Von Hofsten, C. (1982). Eye-hand coordination in the newborn. Developmental Psychology, 18, 450-461.
Von Hofsten, C. (1983). Catching skills in infancy. Journal of Experimental Psychology: Human Perception and Performance, 9(1), 75-85.
Von Hofsten, C. (1993). Prospective control: A basic aspect of action development. Human Development, 36, 253-270.
Von Hofsten, C. & Lindhagen, K. (1979). Observations on the development of reaching for moving objects. Journal of Experimental Child Psychology, 28, 158-173.
Von Hofsten, C., Vishton, P., Spelke, E. S., Feng, Q. & Rosander, K. (1998). Predictive action in infancy: Tracking and reaching for moving objects. Cognition, 67, 255-285.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22(4), 1031-1048.
Wattam-Bell, J. (1992). The development of maximum displacement limits for discrimination of motion direction in infancy. Vision Research, 32, 621-630.
Wattam-Bell, J. (1996). Infants' discrimination of absolute direction of motion. Investigative Ophthalmology and Visual Science, 37, S917.
Wattam-Bell, J. (1996). Development of visual motion processing. In F. Vital-Durand, J. Atkinson & O. J. Braddick (Eds.), Infant vision. Oxford: Oxford University Press.
Wimmers, R. H. & Savelsbergh, G. J. P. (2001). Variability in the emergence of early reaching. Journal of Human Movement Studies, 40, 65-81.
Wimmers, R. H., Savelsbergh, G. J. P., Van der Kamp, J. & Hartelman, P. (1998). A cusp catastrophe model as a model for transition in the development of prehension. Developmental Psychobiology, 32, 23-35.
White, B. L., Castle, P. & Held, R. (1964). Observations on the development of visually-directed reaching. Child Development, 35, 349-364.
Yonas, A., Bechtold, A. G., Frankel, D., Gordon, R. F., McRoberts, G., Norcia, A. & Sternfels, S. (1977). Development of sensitivity to information for impending collision. Perception and Psychophysics, 21, 97-104.
Yonas, A. (1981). Infants' responses to optical information for collision. In R. N. Aslin, J. R. Alberts & M. R. Peterson (Eds.), Development of perception: Vol. 2. The visual system (pp. 313-334). New York: Academic Press.
Yonas, A. & Granrud, C. A. (1985). Reaching as a measure of infants' spatial perception. In G. Gottlieb & N. A. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life: A methodological overview (pp. 301-322). Norwood, NJ: Ablex Publishing Corp.
Yonas, A., Oberg, C. & Norcia, A. (1978). Development of sensitivity to binocular information for the approach of an object. Developmental Psychology, 14, 147-152.
Yonas, A., Pettersen, L. & Lockman, J. J. (1978). Young infants' sensitivity to optical information for collision. Canadian Journal of Psychology, 33, 268-276.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors)
© 2004 Elsevier B.V. All rights reserved
CHAPTER 9

A Step by Step Approach to Research on Time-to-Contact and Time-to-Passage

David Regan
York University, North York, Ontario, Canada
Rob Gray
Arizona State University East, Mesa, AZ, USA
ABSTRACT

We discuss the following three-step approach to testing the hypothesis that a given task of collision avoidance or achievement is carried out by predicting the location of the approaching object at some time in the future: (1) derive theoretical equations that embody candidate hypotheses; (2) find whether the human visual system contains a mechanism that processes the designated retinal image variable independently of other retinal image variables; (3) carry out field studies to find whether individuals base performance on the designated retinal image variable when performing the task in question. We review equations that relate monocular and binocular retinal information to the direction of an object's motion in three-dimensional space and to its time to collision or passage. We discuss a psychophysical technique for finding whether the human visual system is selectively sensitive to the information in question and can unconfound co-varying retinal image variables. We present a quantitative psychophysical model of the early processing of changing-size, changing-disparity, motion in depth and time-to-contact (TTC) that takes dynamic characteristics into account and that accounts for a wide range of psychophysical data including the common case of an object whose retinal image changes shape as the object grows closer to the observing eye.
1. A three-step approach to behavioral research on time-to-contact and time-to-passage estimation

1.1 Collision avoidance and collision achievement

The effective use of visual information to avoid collisions is crucial in maintaining a tolerably low level of risk, not only in highway driving1 but also in the crowded skies over major airports. To play ballgames such as baseball demands accurate, precise, and reliable collision achievement – judgements that again depend on visual information. Visual information takes at least 50 msec to reach primary visual cortex, and the execution of a motor response may take an additional 100 msec at the least. Yet successfully evading or intercepting a moving object is an everyday event: our motor responses to such challenges are not too late all, or even most, of the time. How can that be? One proposed solution to this problem is that performance in many collision avoidance and collision achievement tasks is based on predicting, several hundred milliseconds into the future, when an approaching object will be at some sharply defined location (where)2. These predictions must sometimes be made when the observing eye is in translational motion (e.g. when driving) and sometimes when the observing eye is stationary (e.g. when hitting a baseball). This when/where prediction hypothesis implies that collision avoidance (e.g. when flying a medical emergency helicopter between high-rise buildings) is achieved by always steering away from the external object with the shortest TTC (as distinct from judging the distances and velocities of external objects) (Regan et al., 1998). Most research on TTC has been restricted to the use of monocularly-available information in the special case of a nonrotating or spherical object on a direct collision course. In addition to covering this restricted topic, in this review we extend the discussion to objects whose retinal images change shape as they expand, and also to the use of binocular information. We also discuss visual judgements of the time to passage of an object that is approaching along an oblique trajectory as, for example, when a ball must be caught or hit wide of the body or a car steered so as to pass wide of an irregularly shaped object such as the side of an underpass.
1 In the U.S.A. (which publishes detailed data on road accidents) "tolerably" means 41,907 killed and 3.1 million injured in a typical year (1996) on the highways (NHTSA). Clearly "tolerable" has different meanings for road travel and, for example, air travel and children's playgrounds.
2 It may be that a quite different stratagem is used alongside the predictive stratagem just described. This alternative stratagem is reviewed by Schoner (1994), Peper et al. (1994), and Montagne et al. (1999).
1.2 Theoretical equations, laboratory psychophysics, and field studies

Only field studies can reveal which visual information humans actually use to perform a given visually guided motor task in the everyday world. But in field studies the visual environment is generally far more complex than can be arranged in laboratory conditions. As we describe later, in laboratory psychophysics it is possible to dissociate multiple kinds of visual information that co-vary in everyday life, and to measure the degree to which each kind of visual information contributes to an individual's judgement. This is not usually possible in field studies, where it is more difficult to demonstrate that the participant is not responding to a variable other than the one targeted by the designer of the field study. Laboratory psychophysics, however, is a hypothesis-driven endeavor, and one first needs competing hypotheses to test. This is the role of theoretical work in which equations are derived that relate retinal image variables to environmental variables (e.g. time to collision, direction of motion in depth) in a quantitative manner. Thus, a step-by-step rational approach to studies on visually guided motor action in everyday life would be as follows.
(1) Derive theoretical equations that embody candidate hypotheses.
(2) Carry out psychophysical experiments to find whether the human visual system contains a neural mechanism3 that processes the designated retinal image variable independently of other retinal image variables (e.g., Figures 6, 8 & 13).
(3) Carry out field studies to find whether individuals do in practice base performance on the designated retinal image variable when performing the task in question.
In Section 1.4 we discuss the crucial importance of the requirement, in step (2), that the designated variable be processed independently of other retinal image variables. But first an aside on the unique relevance of psychophysics to research on visually-guided motor action, and on the sometimes-overlooked severity of the problem of linking psychophysical models to physiological models that are framed in terms of single neurons or single small populations of neurons.
3 We use "neural mechanism" or "mechanism" merely as shorthand for the assumption that there is a physiological basis for any given percept. We do not distinguish between the activity of a single neuron, the activity of a population of neurons or of a cortical area, or the spatio-temporal pattern of activity within and between visual areas supported by profuse long-distance connections (see Kohly & Regan, 2002, Appendix).
1.3 The psychophysical and the physiological descriptions of the visual system are, at the present time, best regarded as complementary: They exist in different worlds

The human visual system consists of a large number of nonlinear parts (neurons) that interact nonlinearly via profuse connections. In such a system the connectivity can create system properties (i.e. properties of the system as a whole) that cannot be predicted from a knowledge of the system's parts; see Marmarelis & Marmarelis (1978) and Mountcastle (1979). In other words, such a system may be qualitatively different from any of its parts. The relevance to human vision, and especially to the relation between psychophysics and physiology, is as follows.
1. Psychophysically based mathematical models of the visual system as-a-whole may be exceedingly difficult to relate to the physiology of the visual system.
2. The sequence of processing in a psychophysical model may have little relation to the peripheral-to-central sequence of cortical areas.
3. The question of where in the visual system a particular process is carried out may not be meaningful: the physical basis of a system property cannot be assigned any discrete location within the system.4
The science of nonlinear systems analysis, developed for application to human-designed nonlinear systems, has been widely used in attempts to advance understanding of biological systems, including the human visual system. Formal systems analysis falls under one of two headings: structural analysis and mathematical (or functional) analysis (Blaquiere, 1966). In mathematical (functional) systems analysis the procedure is to compare the system's output with the known input. In the simplest case of a system that has only one input and only one output the investigator compares the system's output (O) with the input (I) and then identifies function F in equation (1).
O(t) = F[I(t)]          (1)
4 In the context of fMRI subtractive brain-imaging maps, this point implies that unless the brain function of interest is shown not to be a system property, the finding that location L in the brain shows greater activity when this particular function is being exercised than when it is not cannot be taken to indicate that the physiological basis of the particular function is sited only at location L.
The point of this endeavor is that, given F, we can predict the output for any given input and in this sense we understand the system (F may be a far from simple function of I). Many systems, however, are not so elementary. Figure 1 depicts a general system with multiple inputs and multiple outputs. For any given output Oi(t) the objective of mathematical (functional) systems analysis is to obtain function F in equation (2).

Oi(t) = F[I1(t), I2(t), ..., In(t)]          (2)
Equation (2) indicates that, in general, any given output Oi(t) may depend on many, even every one, of the n inputs, and that some or even all of the dependencies may be nonlinear.
Figure 1: A general system with multiple inputs and multiple outputs.
1.4 A testable psychophysical hypothesis about early visual processing

By its nature, classical visual psychophysics commonly reduces the human visual system to a single output, thereby presenting a less formidable problem than that posed by Figure 1. But the visual system does have multiple inputs at any given location in the visual field (e.g. luminance, colour, motion, changing-size), and if the psychophysical response to a given input did depend nonlinearly on one or more of the other inputs then there would be no generally-applicable psychophysical model; each of the indefinitely large number of visual environments would require a separate psychophysical investigation.
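The force of this point can be illustrated with a minimal numerical sketch (a hypothetical two-input toy channel invented for illustration, not a model of real visual neurons): when the response to one input is nonlinearly gated by a second input, a transfer function measured with the second input held fixed mispredicts behaviour as soon as that input changes, so the characterization is not generally applicable.

    import numpy as np

    def toy_channel(size_rate, luminance):
        # Hypothetical two-input "channel": the response to the rate of change of
        # size is multiplicatively (i.e. nonlinearly) gated by a second input.
        return np.tanh(size_rate) * (1.0 + 0.8 * np.log1p(luminance))

    size_rate = np.linspace(0.0, 3.0, 7)

    characterised = toy_channel(size_rate, luminance=1.0)  # measured with the second input fixed
    probed = toy_channel(size_rate, luminance=5.0)         # same inputs in a different "visual environment"

    print("max prediction error:", np.max(np.abs(probed - characterised)))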
Figure 2: The "sets of filters" concept. Retinal image information passes through a limited number of parallel sets of filters, each of which is selectively sensitive to either a visual submodality or an abstract feature of the retinal image (e.g. a ratio). The output carries a label (A, B, C and so on). For example, sub-modality A is wavelength. The operation of any given set of filters is independent of the operation of all others. Fine-grain information is represented by the relative activity of filters within a set and is recovered later at an opponent stage of processing. The dashed lines with arrowheads indicate a possibility that descending task-dependent signals might affect the properties of some sets of filters. From Human Perception of Objects: Early Visual Processing of Spatial Form Defined by Luminance, Color, Texture, Motion, and Binocular Disparity (p.31), by D. Regan, 2000, Sunderland, MA: Sinauer. Copyright 2000 by Sinauer Associates, Inc. Reprinted with permission.
Psychophysical models of visually-guided motor action that are valid over a wide range of visual environments might be possible if the working hypothesis illustrated in Figure 2 were at least approximately correct. The hypothesis is that the various kinds of visual information that are important for visually guided motor action are processed (almost) independently of one another and (almost) independently of all other kinds of visual information (Regan, 1982, 2000, pp. 30-34; Regan, Beverley & Cynader, 1979). If the visual system is organized functionally in this way we can understand why, once learned, a skill of eye-limb coordination transfers readily from one visual
environment to another. A corollary to the hypothesis is that the degree to which a key kind of information is processed independently of all others would assume much greater importance in a complex visual environment than in a simple one, and might determine individual differences in task performance in field conditions such as, for example, flying high-performance aircraft (Beverley & Regan, 1980a; Kruk & Regan, 1983).5 In the next sections we describe psychophysical methods for testing whether a given neural mechanism obeys the crucial "independent processing" requirement.
2. Neural mechanisms involved in estimating time to collision on the basis of retinal image expansion and the "Independent Processing" requirement

2.1 A mechanism specialized for the rate of change of retinal image size

In this and the following sections we will point out the specific features in the Figure 3 model that correspond to the experimental findings described, but will defer until Section 4 a full description of the model. When a rigid sphere moves directly towards an observing eye, its retinal image expands isotropically (i.e. without changing shape). That is a consequence of simple geometrical optics. But as pointed out by Poincaré (1913), retinal image expansion does not necessarily mean that the object is approaching: it could, for example, be stationary and growing larger. There is empirical evidence, however, that the human visual system is biased so that isotropic image expansion produces an impression that the object is approaching (Beverley & Regan, 1979a). This is a "safety first" bias: if the price for generating an immediate response to an approaching predator is occasional embarrassment occasioned by flight from a stationary expanding predator, the price would seem worth paying. We will see later that the perceived speed of motion-in-depth is scaled by the object's instantaneous angular subtense so that the perceived speed of approach is inversely proportional to TTC rather than being determined by the approaching object's actual speed (Regan & Hamstra, 1993). And as we also see later, a nonspherical object advancing on other than a collision course does not
5 This chapter discusses visual factors in highly skilled visually guided motor action. At the other extreme, research based on the hypothesis illustrated in Figure 2 can reveal visual loss caused by neuro-ophthalmological disorders that, even though gross, is hidden to standard visual tests. For a brief review, see Regan (2002a).
generate so effectively a sensation of motion in depth when only monocular information is available: the generation of a motion in depth sensation is impeded by the shape change that accompanies expansion for nonspherical objects that are on other than a collision course (Beverley & Regan, 1979a, 1980b). It makes perfect sense that the visual system should be organized in this way. Only when an approaching object's trajectory offers a threat does the visual system generate a compelling sensation of impending collision. This bias is captured by the model of the early processing of TTC and motion in depth depicted in Figure 3.
Figure 3: Psychophysical model of the early processing of changing-size, changing-disparity, motion in depth and TTC. In this schematic the edges of a solid untextured rectangular retinal image are shown by the dashed line. LM: filters that respond best to an edge's local motion along the direction of the arrowed line. The outputs of the LMs (a, b, c, d) assume a magnitude that is linearly proportional to local speed, and a sign that corresponds to the direction of local motion. RM: one-dimensional filters whose output signals both the absolute and relative rates of expansion of the target's image along different meridians [height (RMV) and width (RMW) in this illustration]. S: Subtraction stage. G1, G2, G3: gain stages. COMPARATOR GENERATES τ: this stage compares the fractional rates of expansion of height and width and, if they are equal, creates a monocular signal (τ) that is inversely proportional to TTC. The DYNAMIC STEREO and differentiating (DIFF) stages correspond to Figure 17, and generate a stereo signal (τB) that is inversely proportional to TTC. COM: The stereo (τB) and monocular (τ) signals are combined. The main part of the figure is for approaching motion. A mirror-image section processes receding motion. The insert shows how the motion in depth signal is determined by the difference between the "approaching" and "receding" signals. See text for details.
Although the rate of expansion of an object's retinal image does not unambiguously signal TTC, the rate of expansion per se is a powerful stimulus in peripheral vision that triggers a rapid involuntary foveation of the approaching object with a consequent possibility of accurately estimating its TTC. It is, therefore, not surprising that the human visual system contains a mechanism that is specifically sensitive to rate of expansion (dθ/dt) independently of angular size (θ) and of the ratio θ/(dθ/dt) (Regan & Hamstra, 1993). Beverley and Regan (1979b) showed that the human visual system contains a filter that is sensitive to the difference in the velocities of the opposite edges of a bar, that is, a relative motion detector (RM in Figure 3). This mechanism is quite distinct from the well-known local motion detector (LM in Figure 3) that is responsible for the classical motion aftereffect. Figure 4 shows evidence that the human visual system contains a filter that is sensitive to isotropically changing size, but insensitive to motion within a frontoparallel plane. The inserts in Figure 4A illustrate the two adaptation conditions: squares of constant size whose locations oscillated along a diagonal, and squares whose size oscillated while their locations remained fixed. The essential point is that any given edge oscillated identically in the two cases. The difference between the conditions was that a square's opposite edges oscillated in antiphase to give changing-size and inphase to give changing-location. Therefore, any difference in the adapting effects could not be caused by local motion detectors: such a difference must be evidence for a mechanism sensitive to the difference in the velocities of opposite edges. The continuous line in Figure 4B shows that adapting to size oscillations (adapt size) produced a fivefold elevation of threshold for detecting size oscillations (test size), while having little effect on threshold for detecting oscillations of location (test movement). Furthermore, adapting to oscillations of location (adapt movement) had little effect on either threshold (Figure 4C). We concluded that the human visual system contains a filter that is sensitive to isotropically changing size while being insensitive to motion in a frontoparallel plane (Regan & Beverley, 1978a). Receptive field size for this filter does not exceed ca. 1.5 deg (Beverley & Regan, 1979b; Regan & Beverley, 1979a). There are separate filters for isotropic expansion and isotropic contraction (Regan & Beverley, 1978b). This point is incorporated into the Figure 3 model.
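The logic of the Figure 4 comparison can be sketched in a few lines of code (a schematic illustration under assumed definitions of the two signals, not the published stimulus or model code): treat the local-motion (LM) signal as the mean of the two edge velocities and the relative-motion (RM) signal as their difference; antiphase edge motion (changing size) then drives RM but not LM, while inphase edge motion (changing location) does the reverse, even though each individual edge moves identically in the two conditions.

    import numpy as np

    t = np.linspace(0.0, 1.0, 1000)            # one second sampled at 1 kHz
    f = 2.0                                    # 2 Hz oscillation, as in the adaptation experiment
    edge_vel = np.cos(2.0 * np.pi * f * t)     # velocity waveform common to both edges

    def motion_signals(left_vel, right_vel):
        lm = 0.5 * (left_vel + right_vel)      # local (common) motion of the bar
        rm = right_vel - left_vel              # relative motion: rate of change of width
        return np.max(np.abs(lm)), np.max(np.abs(rm))

    # Antiphase edge motion (changing size) drives RM only ...
    print("changing size:     peak LM, RM =", motion_signals(-edge_vel, edge_vel))
    # ... whereas inphase edge motion (changing location) drives LM only.
    print("changing location: peak LM, RM =", motion_signals(edge_vel, edge_vel))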
Figure 4: Evidence for "looming" or changing-size filters. (A) The stimulus was two identical squares on either side of a fixation spot. Threshold elevations produced by 25 mins adaptation to 2Hz oscillations of size (B) or by 25 mins adaptation to 2Hz motion within a frontoparallel plane (C) are plotted as ordinate. Test frequencies are plotted along the abscissas. Vertical lines show 1 SE. The large filled star and open star plot threshold elevations for detecting size oscillations and frontoparallel motion respectively following adaptation to 2Hz flicker. From "Looming detectors in the human visual pathway", by D. Regan and K.I. Beverley, 1978, Vision Research, 18, p.416. Copyright Elsevier Science Ltd. Reprinted with permission.
When catching or hitting a ball, it is often the case that the ball is not moving directly towards the observer. For example, in cricket, many catches are achieved with outstretched arm, and in baseball, squash and tennis the ball is commonly hit when it is some distance away from the head. When estimating the time to arrival of the ball in this situation it is important that the Vz
component of object motion (Figure 5A) is processed independently of the Vx component (Figure 5B). Therefore, we next discuss the early visual processing of the motion of an object moving obliquely rather than directly towards the observing eye. Figure 5C illustrates that an object's velocity vector V can be decomposed into orthogonal components Vx and Vz. Conversely, any direction of motion can be synthesized by combining the two orthogonal velocity components. The trajectories depicted in Figure 5D were generated in this way. A simulated object of constant shape oscillated along one of 11 trajectories, all of which had the same component of motion along the z axis (i.e., oscillations along a line through the eye), but different x-axis components (i.e. oscillations within a frontoparallel plane). After adapting to one of these trajectories, observers measured detection threshold for pure size oscillations, that is for the Vz component of motion along a line through the observing eye. The observer's head was fixed in a head restraint (Regan & Beverley, 1980).
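A minimal sketch of how such a trajectory set can be constructed (the antiphase amplitude is taken from the Figure 6 legend; the inphase amplitudes listed here are illustrative, since not all eleven values are recoverable from the figure): every trajectory shares the same antiphase (z-axis) edge oscillation and differs only in the inphase (x-axis) component, and for an inphase amplitude of half the antiphase amplitude one edge of the image is stationary, i.e. the "grazing" trajectory discussed below.

    import numpy as np

    z_amp = 6.0   # antiphase (changing-size) component, arc min pk-pk (Figure 6 legend)
    x_amps = np.array([-48.0, -12.0, -6.0, -3.0, 0.0, 3.0, 6.0, 12.0, 48.0])  # illustrative inphase amplitudes

    for x_amp in x_amps:
        # The inphase component displaces both edges together; the antiphase
        # component moves the square's opposite edges in opposite directions.
        left_edge = x_amp - 0.5 * z_amp
        right_edge = x_amp + 0.5 * z_amp
        note = "  <- one edge stationary (grazing trajectory)" if 0.0 in (left_edge, right_edge) else ""
        print(f"x-component {x_amp:+6.1f} arc min: edge amplitudes {left_edge:+6.1f}, {right_edge:+6.1f}{note}")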
Figure 5: Decomposition of a velocity vector. A: Pure antiphase oscillations of a square's opposite edges are equivalent to motion along the z-axis in C. B: Pure inphase oscillations of a square's edges are equivalent to motion along the x-axis in C. C: Motion along an arbitrary direction θ in depth can be decomposed into an x-axis component Vx and a z-axis component Vz. D: The several trajectories shown have different components in the x-direction but identical components in the z-direction. After "Visual responses to changing-size and to sideways motion for different directions of motion in depth: linearization of visual response", by D. Regan and K.I. Beverley, 1980, Journal of the Optical Society of America, 11, p.1290. Copyright 1980 by the Optical Society of America. Reprinted with permission.
Figure 6: The z-axis component of velocity is processed independently of the x-axis component. Open symbols plot threshold elevations produced by separately adapting to the 11 different trajectories in depth of an adapting 2Hz oscillation depicted in Figure 5D. Peak-to-peak amplitudes of the inphase (x-axis) component of the adapting oscillation are plotted as abscissa. All of the 11 adapting oscillations had the same antiphase (z-axis) component of oscillation (6 arc min peak-to-peak). Filled symbols: the experiment was repeated with an 8Hz inphase (x-axis) "jitter" oscillation added to every adapting oscillation. Adaptation of the z-axis component was now independent of the x-axis component. From "Visual responses to changing-size and to sideways motion for different directions of motion in depth: linearization of visual response", by D. Regan and K.I. Beverley, 1980, Journal of the Optical Society of America, 11, p.1293. Copyright 1980 by the Optical Society of America. Reprinted with permission.
Open symbols in Figure 6 show that our expectation of independent processing was denied: the threshold elevation depended markedly on the adapting trajectory, being lowest when one edge of the simulated object's retinal image was stationary, that is, the trajectory for which the object would just graze the observer's eye. However, note that the data in Figure 6 were collected with the observer's head on a chin rest. Filled symbols in Figure 6 show that threshold elevation became independent of adapting trajectory when an x-axis jitter oscillation was added to the adapting stimuli. Interestingly, the amplitude and frequency of the jitter that was maximally effective were similar to the amplitude and frequency of the noisy variations of eye direction when the head is moving freely rather than being restrained. Regan and Beverley (1980) concluded that, in everyday viewing conditions, the visual system contains a filter sensitive to pure changing-size that abstracts the Vz (line of sight) component of an object's velocity independently of the object's trajectory. Figure 6 illustrates a case in which the effect of noise is to improve the performance of the visual system by increasing the independence of processing of the Vz component of an object's velocity. The crosstalk evident in the open symbols in Figure 6 can be attributed to a nonlinearity (e.g. a threshold) in the processing of near-zero local speeds by the local motion detectors labelled LM in Figure 3 (Regan & Beverley, 1981).6
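The linearizing effect of added jitter noted in footnote 6 can be illustrated with a minimal sketch (a generic dead-zone nonlinearity with invented parameters, not a model of the LM detectors themselves): without jitter, sub-threshold inputs produce no output at all, whereas averaged over the jitter they are transmitted roughly in proportion to their size.

    import numpy as np

    rng = np.random.default_rng(1)

    def dead_zone(x, threshold=1.0):
        # Hard threshold: inputs smaller in magnitude than the threshold give no output.
        return np.where(np.abs(x) > threshold, x - np.sign(x) * threshold, 0.0)

    signals = np.linspace(0.0, 0.8, 5)              # sub-threshold input levels
    jitter = rng.normal(0.0, 1.0, size=100_000)     # added high-frequency "jitter"

    no_jitter = dead_zone(signals)                  # every sub-threshold input gives zero
    with_jitter = [np.mean(dead_zone(s + jitter)) for s in signals]

    for s, a, b in zip(signals, no_jitter, with_jitter):
        print(f"input {s:.2f}: output without jitter {a:.3f}, averaged with jitter {b:.3f}")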
2.2 Two percepts generated by a single stimulus: Changing-size and motion in depth

Almost all research on TTC has been restricted to the case of an approaching object whose retinal image does not change shape as it expands (this is called isotropic expansion). The results of such research are, in general, not valid for an approaching object whose retinal image expands nonisotropically. This is an important caveat. For example, when an individual walks, drives, or flies through an environment cluttered with nonspherical objects (e.g. chairs, cars, trees), the retinal images of these objects expand nonisotropically (except for objects on a collision course closing on the observer in a straight line at eye height). We will see below that one result of the change in image shape is that monocular information about time to passage is corrupted. A rate of change of retinal image size can produce two quite different percepts: a percept of changing-size, and a percept of motion in depth. We will see below that the processing of a changing-size stimulus is, to a considerable extent, organized on an either/or basis. For example, retinal image expansion with no shape change (isotropic expansion) produces an impression of motion in depth with little or no impression of size change.
6 That noise can cause a system to behave more linearly has long been known in engineering. For example, the servomechanisms that control the movements of an aircraft's control surfaces are subjected to a constant high-frequency jitter to create smooth operation and avoid the jerks that would be caused by stiction. Again, the magnetic material in a tape recorder is excited by a constant high-frequency jitter oscillation to distance the auditory waveform to be recorded from the nonlinearity at the zero-input point. This concept was introduced to physiology by Spekreijse (1966) – see also Spekreijse, van Norren and van der Berg (1971) – and, inspired by Spekreijse, to psychophysics by Regan and Beverley (1980, 1981).
In addition, after adapting to isotropic expansion of the retinal image a subsequently-viewed static target appears to be moving in depth with no change of size (Regan & Beverley, 1978b). We will see below that the situation is different when shape change accompanies expansion. So far as estimating TTC is concerned, it has been proposed that the motion in depth percept is crucial and, in particular, that the perceived speed of the motion in depth of an approaching object is inversely proportional to TTC rather than being determined by the object's actual speed (Regan & Hamstra, 1993). However, for a complete understanding of the processing of time to collision and time to passage in real-world conditions it is necessary to take retinal image shape changes into account. Sensitivities to the percept of motion in depth and the percept of changing-size produced by oscillating size have different dynamic characteristics. Sensitivity to motion in depth is bandpass with a peak at ca. 1 Hz, and does not extend above 2-3 Hz. Sensitivity to changing-size is also bandpass, but the peak is at ca. 3 Hz and sensitivity extends to ca. 15 Hz (Regan & Beverley, 1979b). The experimental results shown in Figure 7 indicate that the mechanisms that support the perception of motion in depth and changing-size have different temporal decay characteristics. Observers adapted to a bright square whose vertical edges ramped towards each other repetitively. When a static test square was viewed immediately following the cessation of adaptation the static square appeared to be increasing in width, but no motion in depth was evident. This changing-width aftereffect decayed relatively quickly, and when it had died away the static test square appeared to be moving in depth towards the head but not changing in width. This motion-in-depth aftereffect decayed relatively slowly. Observers were provided with a knob that caused the vertical edges of the square to ramp apart so that the changing-width effect could be nulled. All reported that, when the changing-width aftereffect was nulled shortly after the cessation of adaptation, a motion-in-depth aftereffect took its place and could be cancelled in its turn. Cancellation data are shown in Figure 7. This either/or phenomenon is discussed in Section 4.3. As shown in Figure 7, the decays of both aftereffects were exponential. The time constants for the changing-width and motion-in-depth aftereffects were, for four observers respectively: 9.5 and 54; 7.5 and 24; 6.4 and 24; 8.1 and 30 s (Beverley & Regan, 1979a).
Figure 7: Changing-width stimulation produces two percepts: changing-width and motion in depth. Data points plot the decay timecourses of the motion-in-depth aftereffect and the changing-width aftereffect produced by adapting to a 1.0 deg stimulus square whose width decreased. Rates of movement of the vertical edges of a test square that just nulled the aftereffect are plotted versus time elapsed after the cessation of adaptation. After "Separable aftereffects of changing-size and motion in depth: different neural mechanisms?" by K.I. Beverley and D. Regan, 1979, Vision Research, 19, p.729. Copyright 1979 by Elsevier Science Ltd. Reprinted with permission.
2.3 Tau

In section 2.1 we reviewed evidence that the human visual system contains a mechanism specialized for isotropic expansion of an approaching object's retinal image. However, although isotropic expansion can signal that the object is approaching it does not unequivocally signal TTC. Equation (3) expresses a relation between, on the one hand, time to collision and, on the other hand, monocularly-available retinal image information. It was derived by Hoyle (1957) in an astronomical context. Lee (1976) subsequently gave the label τ to the right side of the equation and suggested that close-range estimations of TTC were based on equation (3)
TTC = θ/(dθ/dt)          (3)
where θ is the instantaneous angular subtense of a rigid spherical object moving at constant speed directly towards an observing eye, and θ is small. Following Hoyle's initial theoretical work7, research on τ did not, however, follow the step-by-step rational path described in Section 1.2. Before establishing that the human visual system processes the ratio θ/(dθ/dt) independently of both θ and dθ/dt, researchers immediately embarked on field studies. As pointed out by Wann (1996), many of the early experimental designs did not effectively test the τ hypothesis or indeed any rival hypothesis: in effect, the τ hypothesis was regarded as an axiom rather than an hypothesis. For example, some authors allowed participants to view the approaching object binocularly, but ignored the possible role of stereo information in TTC estimates. A demonstration of the basic requirement that observers can discriminate trial-to-trial variations in the ratio θ/(dθ/dt) while ignoring simultaneous trial-to-trial variations in dθ/dt and θ was not provided until 1993 (Regan & Hamstra, 1993; Regan & Vincent, 1995). This was done as follows. An approaching object was simulated on a monitor. The object disappeared at some variable instant. Following each trial the observer was instructed to signal: (a) whether TTC was longer or shorter than the mean of the stimulus set; (b) whether the initial rate of expansion was larger or smaller than the mean of the stimulus set; (c) whether starting size was larger or smaller than the mean of the stimulus set. There were 8 values of the ratio θ/(dθ/dt), 8 values of initial θ, and 8 values of initial dθ/dt. The stimulus set consisted of combinations of θ/(dθ/dt), θ and dθ/dt that were presented in random order. The set of 192 stimuli comprised three 8 x 8 arrays. In the first array θ/(dθ/dt) and dθ/dt were orthogonal, and values of θ were assigned randomly from the 8 possible values. In the second array θ/(dθ/dt) and θ were orthogonal, and values of dθ/dt were assigned randomly from the 8 possible values. In the third array dθ/dt and θ were orthogonal, and values of θ/(dθ/dt) were assigned randomly from the 8 possible values. It was not possible for the observer to tell from which array any given stimulus had been drawn.
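Equation (3) can be checked numerically with a minimal sketch (all object and viewing values are invented for illustration): for a sphere of diameter s approaching at constant speed v from distance D, the angular subtense is θ = 2·arctan(s/2D), and the ratio θ/(dθ/dt) recovers the true time to contact D/v to a good approximation while θ remains small.

    import numpy as np

    s = 0.07        # physical diameter of the approaching object, m (illustrative)
    v = 20.0        # constant approach speed, m/s (illustrative)
    D = 15.0        # current distance from the eye, m (illustrative)
    dt = 1e-4       # small time step for a numerical derivative, s

    def subtense(distance):
        # Angular subtense of the object at the given distance, in radians.
        return 2.0 * np.arctan(s / (2.0 * distance))

    theta = subtense(D)
    dtheta_dt = (subtense(D - v * dt) - subtense(D)) / dt   # rate of expansion of the retinal image

    print("true time to contact, D/v       :", D / v)
    print("tau estimate, theta/(dtheta/dt) :", theta / dtheta_dt)

The small residual discrepancy reflects the small-angle approximation implicit in equation (3); it grows as the object draws close and θ becomes large.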
7 Actually a footnote in a science fiction novel. This derivation was hardly "work" to one of the world's leading astrophysicists.
Figure 8: Two-task discriminations of simultaneous trial-to-trial variations in the TTC of a simulated approaching object and of its rate of angular expansion. A, B: When the task was to discriminate TTC the observer ignored simultaneous trial-to-trial variations of rate of expansion. C,D: When the task was to discriminate rate of expansion the observer ignored simultaneous trial-to-trial variations of TTC. After "Dissociation of discrimination thresholds for TTC and for rate of angular expansion", by D. Regan and S.J. Hamstra, 1993, Vision Research, 33, pp. 453 & 454. Copyright 1993 by Elsevier Science Ltd. Reprinted with permission.
Each set of responses gave 3 plots with the observer's responses as ordinate, so that each experimental run provided 9 plots. Figure 8 shows only the plots that bear on an observer's ability to dissociate TTC [i.e. θ/(dθ/dt)] and rate of expansion (i.e. dθ/dt). When the task was to discriminate TTC, a plot of response probability versus TTC was steep (Figure 8A) while a plot of the same responses versus starting rate of expansion was flat (Figure 8B). This steep/flat dichotomy indicated that the observer based his responses on the task-relevant variable (θ/(dθ/dt)), and ignored the task-irrelevant rate of expansion. A plot of response probability versus starting size was also flat (not shown). Conversely,
when the task was to discriminate rate of expansion, a plot of response probability versus rate of expansion was steep (Figure 8D), while a plot of the same responses versus TTC was flat (Figure 8C). Again, this steep/flat dichotomy indicated that the observer based his responses on the task-relevant rate of expansion and ignored the task-irrelevant θ/(dθ/dt). A plot of response probability versus size was also flat (not shown). Regan and Hamstra (1993) concluded: (a) that the human visual system contains a mechanism that is specifically sensitive to the ratio θ/(dθ/dt) and is approximately insensitive to at least small variations in dθ/dt and θ; (b) that the visual system contains a mechanism that is specifically sensitive to dθ/dt and is approximately insensitive to θ/(dθ/dt) and to θ. Finally, discrimination thresholds for trial-to-trial variations in θ/(dθ/dt), dθ/dt and θ were similar whether measured in the 3-task situation just described or when measured one-by-one. This indicated: (1) that the three tasks did not appreciably load attentional or memory resources, and (2) that the processing that supported any given discrimination was independent of the processing that supported the other two discriminations. The conclusions just set out hold only for central vision. In peripheral vision the ability to dissociate the ratio θ/(dθ/dt) from rate of expansion is degraded: an observer experiences a compelling and threatening sensation that an approaching object is on a collision course, but TTC and rate of expansion are partially confounded so that a large object with the same TTC as a small object seems to be approaching at greater speed than the small object (Regan & Vincent, 1995). The finding that judgements of TTC have high precision (i.e. discrimination threshold is low) by no means implies that judgements of TTC are necessarily accurate. To investigate the accuracy with which observers estimate TTC it was necessary to devise a procedure that was influenced neither by the observer's motor reaction time nor by the observer's criterion (Gray & Regan, 1998; Vincent & Regan, 1996). Observers viewed a simulated approaching object. Each trial consisted of a presentation of fixed duration. The stimulus was switched off partway through its flight and some time later (at the designated TTC) there was a brief auditory tone whose timing accuracy was 0.001 sec. The observer was instructed to signal whether the approaching object would have arrived before or after the tone. The result of a "before" signal was that θ/(dθ/dt) would be longer on the next presentation, while the result of an "after" signal was that θ/(dθ/dt) would be shorter. A tracking procedure (Levitt, 1971) caused θ/(dθ/dt) to converge onto the designated TTC. Nine staircases were interleaved, comprising three values of TTC and three starting sizes. By subjecting the tracking data to stepwise discriminant analysis, the 9-track organization allowed us to confirm that observers based their responses on the task-relevant variable (θ/(dθ/dt)) and ignored the following task-irrelevant variables: initial rate of expansion; final rate of expansion; change of size during the presentation (Gray & Regan, 1998).
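The logic of such a tracking procedure can be sketched as follows (a generic one-up/one-down staircase with an invented simulated observer, step size and bias; this is not the authors' code, and the real experiment interleaved nine such staircases): the presented θ/(dθ/dt) converges on the value that the observer judges to match the designated TTC, so any systematic error in TTC estimation appears as an offset of the converged value from the designated TTC, uncontaminated by reaction time or response criterion.

    import numpy as np

    rng = np.random.default_rng(2)

    designated_ttc = 2.0     # time of the reference tone after stimulus offset, s
    perceptual_bias = 0.92   # simulated observer underestimates TTC by 8% (illustrative)
    noise_sd = 0.10          # trial-to-trial judgement noise, s (illustrative)
    step = 0.05              # staircase step size, s (assumed)

    presented_tau = 3.0      # starting value of theta/(d theta/dt), s
    for trial in range(80):
        perceived_arrival = perceptual_bias * presented_tau + rng.normal(0.0, noise_sd)
        if perceived_arrival < designated_ttc:
            presented_tau += step    # "before the tone": lengthen the presented tau
        else:
            presented_tau -= step    # "after the tone": shorten the presented tau

    print("presented tau converged near", round(presented_tau, 2),
          "s for a designated TTC of", designated_ttc, "s")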
In brief, we found that, providing the target is sufficiently large, observers can make tolerably accurate estimates of absolute TTC. Confirming most previous reports we found that, when estimates were based entirely on monocular information, observers slightly underestimated the actual TTC (by 2% to 12% depending on the observer).

2.4 Does perceived distance play a role in monocular estimates of TTC?

So far we have assumed that monocular estimates of TTC are based on equation (3), that is TTC = θ/(dθ/dt), an equation that does not involve distance. We have also assumed that the perceived speed of motion in depth is inversely proportional to TTC rather than being determined by the object's actual speed (Regan & Hamstra, 1993). These assumptions are encapsulated in the Figure 3 model. Adopting a quite different approach, some authors have proposed that human observers have access to accurate information about an object's absolute distance and its linear speed of approach, and can estimate TTC by dividing absolute distance by linear speed. A number of authors have attempted to distinguish between the two approaches but, according to Abernethy and Burgess-Limerick (1992), failed to find unequivocal support for one or other approach (reviewed in Gray & Regan, 1999b). In an attempt to resolve this question we recently measured accuracy in estimating TTC at two viewing distances (1 m and 5 m) using the "match arrival time to the brief tone" method described earlier. The 5 m display was made five times larger than the 1 m display so that retinal image information about θ/(dθ/dt) was identical at the two distances and corresponded to the same TTC. All monocular and binocular cues to the distance of the displays were available. Differences in estimated absolute TTC for the 5 m and 1 m viewing distances were very small and nonsignificant (3%, 0.3%, and 2.7% for three observers). We concluded that the 5:1 variation in distance had essentially no effect on TTC estimation so that, at least in our experimental conditions, observers ignore distance when estimating TTC on the basis of τ.
2.5 Estimation of TTC by stationary and by moving observers

Almost all laboratory investigations of the role of θ/(dθ/dt) in estimating TTC are restricted to the case of a stationary observer and a moving object. (All published animal experiments also suffer from this limitation.) On the face of it, this limitation should not be important because θ/(dθ/dt) signals TTC whether the closing motion consists of self-motion, object motion, or a combination of the two. This expectation was confounded by our experimental investigation in which we simulated self-motion by using a wide-field (39° horizontal x 29°) flow field of texture elements that appeared to be located at a large distance. This was achieved by means of the optics of an F-18 aircraft simulator (Figure 9). A simulated approaching spherical object was separately generated and displayed within a blank square area at the centre of the flow pattern (Figure 9C). The accuracy of estimating absolute TTC was measured by means of the "match arrival time to a brief auditory tone" technique described earlier. Figure 10 shows those errors. The white bar in Figure 10A shows that, consistent with previous findings (e.g. Gray & Regan, 1998), observers made small (3% to 14%) underestimations of TTC in the baseline condition, that is after adapting to static texture elements as illustrated in Figure 9C. The black bar in Figure 10A shows that, when forward self-motion was simulated by creating a radially-expanding flow pattern, underestimation increased considerably. The gray bar in Figure 10A shows that, when backwards self-motion was simulated by creating a radially-contracting flow pattern, the underestimation of TTC became an overestimation.
Figure 9: Judging TTC while moving. A: The radially expanding or contracting flow field consisted of a randomly scattered pattern of squares whose size and instantaneous speed increased radially outwards. They were displayed on MONITOR 1 that was viewed through the optics of an F-18 aircraft flight simulator. A large glass sheet (LG) reflected the display onto a large (75 cm horizontal x 90 cm) high-quality parabolic mirror so that the display seemed to be at a great distance, though it subtended 39 deg horiz. x 27 deg. B: An approaching spherical object was simulated on a second monitor (MONITOR 2). A small thin sheet of glass (SG) reflected this second display into the parabolic mirror so that it also seemed to be at a great distance. Note that, for clarity, the glass sheet LG is omitted from panel B. C: The observer's view of the approaching object (gray circle) and flow field (black squares). The dashed square (not present in the actual display) indicates the central area in which no flow elements were presented. From "Simulated self-motion alters perceived time to collision", by R. Gray and D. Regan, 2000, Current Biology, 10, p.588. Copyright 2000 Elsevier Science Ltd. Reprinted with permission.
Figure 10: Mean % TTC estimation error for the different flow conditions. Filled bars are for FORWARD flow, grey bars are for BACKWARD flow and open bars are for the STATIC condition. Error bars show one standard error. A: Flow pattern dots expanded as they moved radially outward and contracted as they moved inward. B: Dot size constant. From "Simulated self-motion alters perceived time to collision", by R. Gray and D. Regan, 2000, Current Biology, 10, p.588. Copyright 2000 Elsevier Science Ltd. Reprinted with permission.
This effect of the peripheral flow pattern did not change when the gap between the outer edge of the approaching object and the inner edge of the flow pattern was increased by up to 5°, indicating that the effect could not have been caused by a direct effect of the flow pattern upon the local changing-size filter in Figure 3 because, as mentioned earlier, the receptive field of this filter is less than ca. 1.5 deg wide (see also Figure 11). Rather, the effect of the flow pattern on estimates of TTC was caused by a long-range interaction. A possible ecological role for this long-range interaction is as follows. When a stationary observer attempts to catch an approaching object there is a clear advantage in having an estimate of TTC based on θ/(dθ/dt) be an underestimation. In that case, the unavoidable variability in the estimate will never create a situation in which there is no time left to act on the binocular information about TTC that is available only when the object draws sufficiently close. This binocular information is required to time the finger flexions necessary to hold a catch (Alderson, Sully & Sully, 1974). When the whole body is moving forward (e.g. a monkey swinging from branch to branch) the inertial mass that must be controlled to make fine corrective adjustments when binocular information becomes available is very much greater than in the case of a stationary observer. A simple solution would be a lateral neural interaction that allows the radially-expanding flow field to increase the underestimation of TTC based on θ/(dθ/dt), and that is what we found.

2.6 Realistic and unrealistic simulations of self-motion by optic flow: The importance of getting it right

A secondary point is brought out in Figure 10B. In Figure 10A the flow pattern's squares grew larger as they moved radially outwards and also accelerated so as to simulate self-motion through a three-dimensional world. In Figure 10B the size of the squares was held constant, and the TTC effect was far less than in Figure 10A. This, taken together with other evidence described in Section 2.7 below, brings into question the relevance to everyday life of the considerable literature on optic flow and TTC in which texture element size (e.g. dot size) was held constant.

2.7 A failure to match the expansion of an image's internal texture to its size can disrupt the processing of TTC: A drawback of constant-size dot displays

Further to this last point, in everyday life the relative rate of expansion of the size of an approaching textured object is uniquely locked to the relative rate of expansion of the texture elements within the object's image. In this
situation the presence or absence of surface texture negligibly changes the effectiveness of the approaching object as a stimulus for motion-in-depth perception (Beverley & Regan, 1983). Nevertheless, the early visual processing of the rate of expansion of a textured approaching object's retinal image does take into account any mismatch between the rate of expansion of texture elements and the rate of expansion of the image. In particular, when texture element size expands more slowly than image size (thus indicating different TTCs for the texture elements and the boundary of the object) the generation of a motion-in-depth signal is severely impaired (Beverley & Regan, 1983). Consequently, a mismatch of this kind also affects the accuracy of estimating TTC, whether the texture covers the surface continuously (Vincent & Regan, 1997) or whether the texture consists of dots (Gray & Regan, 1999a). In the case of dotted texture, large errors in TTC are produced when dot size is held constant and dots are larger than 2.2-4.4 arc min. This last finding calls further into question the applicability to everyday life of TTC and motion-in-depth studies that have used dotted targets and in which dot size has been held constant. For completeness we should add that the reason for embarking on this line of research was to investigate whether limitations in achieving realistic texture dynamics in flight simulator displays might restrict the effectiveness of training in landing and low-level flight.

2.8 TTC and optic flow

Figure 11 shows that the looming detectors of Figure 4 can be desensitized by a radial pattern of optic flow of the kind produced on the retina by self-motion through a cluttered 3-dimensional environment. This effect may be a possible cause of highway accidents (see Gray & Regan, in this volume). Here we will only note that the data shown in Figure 11 are consistent with the hypothesis that, because their receptive fields are small, the outputs of looming detectors provide a rough physiological measure of the variable8 div V. The significance of that point is as follows. Two-dimensional optic flow on the retina is a vector field, and as such can be described by assigning a magnitude and direction to the instantaneous velocity V of contour at every point on the retina, or alternatively by assigning a value to divV, curlV, and gradV at every retinal location. From the definition of divV

div V = ∇ · V          (4)
⁸ For a minimally mathematical account of vector calculus I recommend Schey (1973). Appendix I in Regan (2000) discusses evidence that the human visual system contains mechanisms sensitive to div V, curl V and grad V, where V denotes the local velocity vector and V the local speed.
We have, in the case of a two-dimensional flow pattern,

div V = lim(Δa → 0) (1/Δa) ∮ V sin θ dl    (5)

where Δa is an arbitrarily small retinal area, dl is an element of the closed path enclosing the area Δa, and V sin θ is the component of velocity normal to dl (Green, 1967; Schey, 1973). This definition is framed in terms of the limiting case with the area Δa tending to zero, but if we approximate for our case of an isotropically-expanding square retinal image of finite but small side length 2θ, then it can be straightforwardly shown that
div V ≈ 2(dθ/dt)/θ    (6)

to a first approximation. Hence

TTC = 2/div V    (7)
(Regan & Hamstra, 1993). In principle, therefore, for a moving observer the monocular changing-size mechanism could signal the time of arrival at an external-world destination by extracting local div V from the retinal image flow pattern created by self-motion through the three-dimensional environment. Whether this occurs in practice is not known. Nevertheless, it may be worth noting that, by definition, the variation of div V within the retinal image is independent of the translational velocity produced by eye rotation, and there is psychophysical evidence that this independence is also shown (at least to a first approximation) by physiological relative motion detectors (see Figure 6).
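To make the relation between equations (6) and (7) concrete, the following minimal sketch (our illustration, not part of the original studies; the grid size, image size and expansion rate are assumed values) builds the two-dimensional velocity field of an isotropically expanding retinal image, estimates div V numerically, and recovers TTC as 2/div V. By construction the true TTC is θ/(dθ/dt).

```python
import numpy as np

# Isotropically expanding retinal image: every point moves radially with a
# speed proportional to its eccentricity, v(r) = (dtheta_dt / theta) * r.
theta = 2.0        # half-width of the image (deg); assumed value
dtheta_dt = 0.5    # rate of expansion of the half-width (deg/s); assumed value
true_ttc = theta / dtheta_dt            # monocular TTC, theta/(dtheta/dt)

# Sample the flow field on a grid of retinal positions (deg).
x, y = np.meshgrid(np.linspace(-theta, theta, 201),
                   np.linspace(-theta, theta, 201))
k = dtheta_dt / theta                   # fractional rate of expansion (1/s)
vx, vy = k * x, k * y                   # velocity components (deg/s)

# Numerical divergence: div V = dvx/dx + dvy/dy.
dx = x[0, 1] - x[0, 0]
div_v = np.gradient(vx, dx, axis=1) + np.gradient(vy, dx, axis=0)

print("mean div V          :", div_v.mean())        # ~2*(dtheta/dt)/theta, equation (6)
print("TTC from 2/div V    :", 2.0 / div_v.mean())  # equation (7)
print("TTC from theta ratio:", true_ttc)            # identical, as expected
```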
Figure 11: Looming detectors can be activated by the focus of expansion of a flow pattern. (A) Sensitivity to size oscillations of a 0.5 deg test square located a variable distance (X) from the point of fixation (M) was measured before and after adaptation. (B) Observers adapted to a radially expanding flow pattern for 10 min. This particular flow pattern had a sharp maximum of div V at the focus of expansion. (C) Depression of changing-size sensitivity as a function of distance X. Threshold elevations occurred only for points very close to the former location of the flow pattern's focus. From "Visually guided locomotion: psychophysical evidence for a neural mechanism sensitive to flow patterns", by D. Regan and K. I. Beverley, 1979, Science, 205, p. 312. Copyright 1979 by the American Association for the Advancement of Science. Reprinted with permission.
2.9 A developmental hypothesis

A predator's most efficient and least bothersome tactic is to kill before the prey is aware of the danger, and a silent approach from the rear is a simple way of achieving this aim when the prey has front-facing eyes. So it is not surprising that an expanding image, especially when located in the peripheral visual field, triggers a fast involuntary motor reflex in humans. The reflex is most evident when attention is distracted and the stimulus is unexpected, and includes a rapid eye movement to foveate the expanding image so as to allow estimation of TTC and identification of the approaching object. It has been proposed that, unlike sensitivity to a looming stimulus (i.e., to a rate of expansion dθ/dt), selective visual sensitivity to the ratio θ/(dθ/dt), that is to τ, is not present from birth. Rather, exposure to the radially-expanding pattern of optic flow associated with self-locomotion in early life progressively develops a neural mechanism whose sensitivity to θ/(dθ/dt) is combined with insensitivity to both θ and dθ/dt (Regan & Beverley, 1979a; Regan & Vincent, 1995). This proposal is consistent with evidence that, in adults, the independence of the mechanisms for θ/(dθ/dt) and for dθ/dt is progressively lost as retinal eccentricity is increased (Regan & Vincent, 1995), because the stimulus necessary for the development of selective sensitivity to θ/(dθ/dt) is much weaker in the peripheral visual field. In particular, if the direction of gaze roughly coincides with the direction of self-motion through a three-dimensional world cluttered with objects, the predominant effects of self-motion are that retinal images of nearby objects in the central visual field expand with comparatively little translational motion and with a value of θ/(dθ/dt) that corresponds to TTC, whereas retinal images of nearby objects in far peripheral vision predominantly translate.
3. Neural mechanisms involved in estimating TTC on the basis of binocular information and the "Independent Processing" requirement

3.1 The binocular motion-in-depth mechanism and the importance of reference marks

Wheatstone (1852) was the first to demonstrate that a rate of change of horizontal binocular disparity can, by itself, generate a compelling sensation of motion in depth such that the observer has the impression of collision at some future instant. In this section we review evidence that the human visual system contains a specialized mechanism that generates a sensation of motion in depth
when stimulated by changing-disparity, a mechanism that is almost totally insensitive to all the other variables so far tested. This last point complies with the "independent processing" requirement for a set of filters (Regan, 1982). Regan, Erkelens and Collewijn (1986a) reported that changing the target's disparity when no reference mark is visible produces only a weak sensation that the target is moving in depth, or even no sensation of motion in depth at all. In other words, the effective binocular stimulus for motion-in-depth perception is a rate of change of relative rather than absolute disparity. A rate of change of absolute disparity more than 100 times larger than the rate of change of relative disparity that produced a just-noticeable sensation of motion in depth produced no motion-in-depth sensation at all for a large target, and only a weak sensation for a point target. (For a discussion of absolute and relative disparity see Regan, 2000, pp. 348-351.) In Figure 12 the stationary reference was a plane covered with random dots. The observer fixated on this plane, and the accuracy of fixation was monitored by nonius lines. The target was a pair of identical bars, one viewed by the left eye and one by the right eye. The bars were seen in binocular fusion. The mean depth of the fused bar could be varied relative to the reference marks. Figure 12 shows that sensitivity to motion in depth (STEREO, continuous line) is greatest when the stationary marks are at the same depth as the changing-disparity target, and that sensitivity falls off steeply as the depth of the bar departs from that of the reference. Many observers have areas of the binocular visual field within which changing-disparity produces no sensation of motion in depth. These are called stereoscotomata (Richards and Regan, 1973). This total loss of motion-in-depth sensation is not accompanied by any loss of sensitivity to motion within a frontoparallel plane, nor by any loss in sensitivity to a difference in static disparity (i.e., stereoacuity) (Regan, Erkelens & Collewijn, 1986b; Hong & Regan, 1989). These findings indicate that the binocular mechanism that supports the perception of motion in depth is separate from both the mechanism sensitive to frontal-plane motion and the mechanism sensitive to static relative disparity. The finding that some observers can be blind to approaching motion in depth but sensitive to receding motion in depth, or vice versa, is evidence that the binocular motion-in-depth mechanism consists of separate "approaching" and "receding" submechanisms. This point is important for the model shown as Figure 3.
Figure 12: Sensitivity to changing disparity is greatest when the target and the stationary reference are at the same depth. A bar viewed by the left eye and an identical bar viewed by the right eye each executed 0.1 Hz sinusoidal oscillations either in antiphase (Stereo), giving a percept of motion in depth, or in phase (Binoc), giving a percept of motion within a frontoparallel plane. Ordinates plot the amplitudes of oscillation (of either bar) when the observer just detected motion. The observer fixated on a plane of random dots at zero disparity. Abscissae plot the mean disparity of the binocularly-fused bars. The horizontal dashed line shows the monocular threshold (one eye occluded). Adapted from "Some dynamic features of depth perception" by D. Regan and K. I. Beverley, 1973, Vision Research, 13, p. 2371. Copyright 1973 by Elsevier Science Ltd. Reprinted with permission.
Observers can discriminate trial-to-trial variations in the speed of the motion-in-depth sensation evoked by changing-disparity while ignoring simultaneous changes in the direction of motion in depth and the distance moved (i.e., the change of disparity Δδ), for both monocularly visible and cyclopean targets (Portfors-Yeomans & Regan, 1996, 1997). In Figure 13A-I the observer was instructed to signal after each single trial the target's speed, direction, and distance moved. Figure 13A-I shows that, in each of the three tasks, the observer's responses were based on the task-relevant variable, and the two task-irrelevant variables were ignored.
[Figure 13 panels: percentage of responses (0-100%, ordinates) plotted against the task-relevant variable; the abscissae of the three columns are labelled Direction, Speed, and Excursion, normalized to the reference value of 1.0.]
Figure 13: The direction, speed, and distance moved in depth by an approaching object are processed independently. The target was a bright square. The 216 test stimuli appeared to move in depth at different speeds and in different directions. The reference stimulus had the mean speed, mean direction and mean distance moved for the set of test stimuli (1.0 on the abscissae). Following each presentation of the reference and of one of the test stimuli the observer made three judgements. A-C: The percentage of "wider of the head than the reference trajectory" responses is plotted against the task-relevant variable.
For our present purpose the relevant conclusion is that the human visual system contains a binocular cyclopean mechanism that supports acute discriminations of the speed of stereomotion-in-depth while being insensitive to Δδ and to the direction of motion⁹. A candidate physiological basis for this proposed mechanism is provided by the finding that the visual cortex of cat and monkey contains one class of neuron that is sensitive to motion in depth while being comparatively insensitive to disparity, and a second class of neuron that prefers motion within a frontoparallel plane and is sharply tuned to static disparity (Cynader & Regan, 1978, 1982; Poggio & Talbot, 1981; Regan & Cynader, 1982; Regan, Beverley & Cynader, 1979; Spileers, Orban, Gulyas, & Maes, 1990).

3.2 Binocular information about TTC

In Figure 14, O is an object moving in a straight line at constant speed Vz whose instantaneous distance from the observer is D. P is a stationary reference object whose distance (S) is fixed. The observer's interpupillary separation is I. In Figure 14 the relative horizontal binocular disparity of O with respect to the stationary point object P is δ, where

δ = γ₁ − γ₂ ≈ I/D − I/S    (8)
provided¹⁰ that D ≫ I. Equation (8) is valid independently of the ocular vergence angle. However, the vergence angle is such that the reference (P) and object (O) are both seen in binocular single vision. Since I and S are constant, we have from equation (8)
dδ/dt = I · d(1/D)/dt    (9)
⁹ Harris and Watamaniuk (1995) made the general claim that the human visual system does not contain a cyclopean mechanism specialized for the speed of the motion-in-depth sensation evoked by dδ/dt, and that speed discriminations were based on the total disparity change rather than on speed. This claim was based on data from two observers in the special case that the target disappeared and reappeared partway through the stimulus presentation, thus requiring the visual system to solve the correspondence problem twice. Portfors-Yeomans and Regan (1996) repeated their experiment and obtained the same result, then went on to show that, when the target did not disappear during a presentation, results similar to those shown in Figure 13 were obtained, i.e. there was clear evidence for a specialized cyclopean mechanism for motion in depth (see also Portfors & Regan, 1997).
Hence

dδ/dt ≈ I·Vz/D²    (10)

since I is constant. Given that TTC = D/Vz, we have from equation (10)

TTC ≈ I/(D(dδ/dt))    (11)

Rewriting equation (11),

TTC ≈ γ₁/(dδ/dt)    (12)

where γ₁ ≈ I/D is the angle subtended at object O by the interpupillary separation I. Using a different mathematical procedure, equations (11) and (12) were previously derived by Regan (1995). This derivation is replicated in Gray & Regan (1998). Equation (12) is quite different from equation (13), subsequently published by Rushton and Wann (1999):

TTC = δ/(dδ/dt)    (13)
(Note that in their paper they used α instead of δ to represent relative horizontal disparity.) Angle γ₁ in Figure 14 is not, of course, the relative horizontal disparity of object O as in the Rushton and Wann equation, except in the special situation assumed by Rushton and Wann that the reference object P is at infinity (so that γ₂ in equation 8 is zero). However, if the reference (P) is at infinity and the observer gazes at it, as in the Rushton and Wann (1999) paper, the approaching object would be seen in double vision by the time it reached the distances at which stereoscopic vision is effective. In that case the observer would see two images moving away from each other within a frontoparallel plane rather than a single approaching object. Further to this point, sensitivity to motion in depth is greatest when the reference is at the same depth as the moving object, and falls off steeply as the difference in depth is increased (see Figure 12). In this optimal condition for perceiving motion in depth, with the reference at the same distance as the object, the Rushton and Wann (1999) equation gives TTC = 0 whatever the true value of TTC.
¹⁰ Evasive or interceptive action will have been initiated long before the D ≫ I condition is violated, so that in practice equation (8) is a good approximation.
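The preceding point can be illustrated numerically. The sketch below (our example; the values of I, D and Vz are assumed) compares the TTC estimate of equation (12) with the Rushton and Wann formulation (equation 13) for a reference at the object's own distance and for a reference effectively at infinity.

```python
# Assumed values, chosen only for illustration.
I  = 0.065      # interpupillary separation (m)
D  = 5.0        # instantaneous distance of the approaching object O (m)
Vz = 2.0        # approach speed (m/s); true TTC = D/Vz = 2.5 s

def ttc_estimates(S):
    """TTC estimates for a stationary reference P at distance S (m)."""
    delta     = I / D - I / S      # relative disparity, equation (8)
    ddelta_dt = I * Vz / D ** 2    # rate of change of disparity, equation (10)
    gamma1    = I / D              # angle subtended at O by the interpupillary separation
    return gamma1 / ddelta_dt, delta / ddelta_dt   # equation (12), equation (13)

for S in (5.0, 1e9):               # reference at the object's depth, then effectively at infinity
    eq12, eq13 = ttc_estimates(S)
    print(f"S = {S:12.1f} m   eq.(12): {eq12:.2f} s   eq.(13): {eq13:.2f} s")
# Equation (12) returns 2.5 s in both cases. Equation (13) returns 0 s when the
# reference lies at the object's depth and approaches 2.5 s only as the reference
# recedes towards optical infinity.
```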
Figure 14: Binocular information about TTC. A point object O is moving at constant speed Vz along a straight line passing midway between an observer's eyes. P is a fixed reference object. From "Binocular information about time to collision and time to passage", by D. Regan, 2002, Vision Research, 42, p. 2481. Copyright 2002 by Elsevier Science Ltd. Reprinted with permission.
3.3 The problem of distance in equation (11)

At first sight the presence of the instantaneous distance D(t) in equation (11) suggests that binocular information could not provide accurate estimates of TTC, because a human's ability to estimate distance is poor, especially when distance exceeds a few metres. A further problem is that a candidate basis for estimating distance, namely the angle of binocular convergence, would seem to be rejected by the finding that, for an object of angular size (θ), a large change in ocular convergence (0 to 24 prism dioptres) had no effect on the value of dδ/dt required to cancel the sensation of motion in depth produced by a fixed rate of change of size (dθ/dt) in one observer, and only a small effect in a second (Regan & Beverley, 1979b).
[Figure 15 plot: logarithmic ordinates showing dδ/dt and d²δ/dt² against distance (metres); approach speed 5 m/s (11.2 m.p.h.).]
Figure 15: Timecourses of the rate of change of disparity (dδ/dt) and of its rate of change (d²δ/dt²) associated with an object approaching at constant speed. The ordinates are logarithmic to bring out the point that the ratio (d²δ/dt²)/(dδ/dt) increases as the object approaches. As shown in the text, the ratio is inversely proportional to TTC. From "Binocular information about time to collision and time to passage" by D. Regan, 2002, Vision Research, 42, p. 2482. Copyright 2002 by Elsevier Science Ltd. Reprinted with permission.
One possible solution to this problem can be understood by reference to Figure 15. The two curves show how the instantaneous values of dδ/dt and d²δ/dt² vary with distance for a point object approaching the head at constant speed Vz. The numbers on the ordinate are for Vz = 5.0 m/s, but the important point is that the shape of the curves is the same independently of both Vz and object size. Figure 15 shows that, after the value of dδ/dt passes detection threshold (at a few tens of metres in this example), its magnitude rises at a rapidly accelerating rate. The timecourse of this increase of dδ/dt allows TTC to be estimated without any information as to the instantaneous distance of the approaching object (Regan, 2002b). We have:
dδ/dt ≈ I·Vz/D²    (14)
Hence

d²δ/dt² ≈ 2·I·Vz²/D³    (15)

Therefore

(d²δ/dt²)/(dδ/dt) ≈ 2·Vz/D = 2/TTC    (16)
In words, the ratio between the first temporal derivative of the rate of change of disparity and the rate of change of disparity is inversely proportional to the time to collision. A signal proportional to (d²δ/dt²)/(dδ/dt) would grow larger as TTC diminished, thus indicating the growing urgency for evasive or interceptive action. Whether the human visual system contains a mechanism that is sensitive to (d²δ/dt²)/(dδ/dt) could be tested by using an experimental design analogous to that used to obtain the data shown in Figure 13. This test is currently being carried out.

3.4 Generation of a changing-disparity signal

For a monocularly-visible approaching object a changing-disparity signal could, in principle, be obtained directly from the angular velocities of the object's retinal images in the right and left eyes without an intermediate stage of computing static disparity and, as shown by equation (18), the direction of motion can be obtained from the ratio of these angular velocities (Beverley & Regan, 1973; Regan et al., 1986). However, although this "velocity ratio" cue provides directional information about trajectories within the horizontal meridian (Figure 16A), it provides no directional information about trajectories within the vertical meridian (Figure 16B); and it has been shown that directional discrimination thresholds for trajectories within the vertical and horizontal meridians are identical (Portfors-Yeomans & Regan, 1997). More direct evidence that the human visual system contains a cyclopean mechanism that supports acute discriminations of the direction and speed of motion in depth was reported by Portfors-Yeomans and Regan (1996). They
found that observers can unconfound and discriminate both the speed and the direction of motion in depth of a target that cannot be seen through one eye alone, that is, a target camouflaged in dynamic random noise.
Figure 16: Motion in depth contained within the horizontal and vertical meridians. A: Different directions of motion in depth contained within a plane that contains the left and right eyes and is normal to the frontal plane. For brevity we will refer to this as motion within the horizontal meridian. These trajectories can be discriminated using either equation (17) or equation (18). B: Different directions of motion in depth within the vertical meridian. These trajectories can be discriminated using equation (17) but not equation (18). From "Cyclopean discrimination thresholds for the direction and speed of motion in depth", by C. V. Portfors-Yeomans and D. Regan, 1996, Vision Research, 36, p. 3627. Copyright 1996 by Elsevier Science Ltd. Reprinted with permission.
Figure 17: Model of the early processing of binocular information about time to collision.
Figure 18: The gain of a comparator in Figure 17 increases quasi-logarithmically with its preferred value of dδ/dt.
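The scheme shown in Figures 17 and 18, and described in the paragraphs that follow, can be summarized computationally: a quasi-logarithmic encoding of dδ/dt followed by temporal differentiation gives d[ln(dδ/dt)]/dt = (d²δ/dt²)/(dδ/dt), which by equation (16) equals 2/TTC, so that TTC can be read out with no estimate of distance. A minimal numerical sketch of this reading (our illustration; the approach speed, distances and sampling rate are assumed values, and the code is not a simulation of the neural stages themselves):

```python
import numpy as np

I, S   = 0.065, 20.0            # interpupillary separation (m), reference distance (m); assumed
Vz, D0 = 5.0, 30.0              # approach speed (m/s), starting distance (m); assumed
t = np.arange(0.0, 2.0, 0.01)   # time samples (s)
D = D0 - Vz * t                 # instantaneous distance of the object (m)

delta    = I / D - I / S                    # relative disparity, equation (8)
log_rate = np.log(np.gradient(delta, t))    # ln(d delta/dt): logarithmically encoded disparity rate
ratio    = np.gradient(log_rate, t)         # d/dt ln(d delta/dt) = (d2delta/dt2)/(d delta/dt)

ttc_from_ratio = 2.0 / ratio                # equation (16)
true_ttc       = D / Vz
print(ttc_from_ratio[50], true_ttc[50])     # at t = 0.5 s both are close to 5.5 s
```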
Figure 17 depicts one model of the cyclopean motion-in-depth mechanism. In the upper half of the figure two disparity detectors feed three comparators (C1, C2, C3) whose outputs (x₁, x₂, x₃) are largest when the signals from the two disparity detectors arrive simultaneously. Since T₁ > T₂ > T₃, comparator C3 prefers a larger dδ/dt than C2, and C2 prefers a larger dδ/dt than C1. (The number of parallel delay/comparator arrangements may be more than three.) In addition, the peak response of a comparator increases quasi-logarithmically with its preferred value of dδ/dt (Figure 18). The outputs of the three comparators are summed to give a signal that is proportional to the log of the mean value of dδ/dt between disparities δ₁ and δ₂ [ln(dδ/dt) at t = 0]. A similar arrangement shown in the lower half of Figure 17 gives a signal that is proportional to the log of the mean value of dδ/dt between disparities δ₂ and δ₃ [ln(dδ/dt) at t = Δt]. These signals are summed and fed to a differentiating stage whose output, the first temporal derivative of its input, is proportional to (d²δ/dt²)/(dδ/dt) and therefore, according to equation (16), inversely proportional to TTC. Differentiation can be achieved by, for example, a high-pass temporal filter (see Regan, 1989, pp. 22-23). If we assume that the delays T₁, T₂ and T₃ are produced by high-pass temporal filtering, and that a further stage of high-pass temporal filtering converts the ln(dδ/dt) signal to a d²δ/dt² signal, then we would expect that at any given instant the (d²δ/dt²)/(dδ/dt) signal would correspond to the situation some time in the past, and that this delay would be greater than the delay associated with the θ/(dθ/dt) signal because only one stage of high-pass filtering is required in the latter case (Regan, 2002b). This prediction is in accord with the report that judgements of absolute TTC based on binocular information only are overestimates (Gray & Regan, 1998).

3.5 Objects moving along oblique trajectories: Time to passage

In everyday life, collision avoidance/achievement commonly calls for judgements of both the direction of motion in depth of an approaching object (to predict where it will be at some future time) and the time to collision (TTC) or time to passage (TTP) for that object (to predict when). The early visual processing of the direction of an approaching object's motion in depth is discussed elsewhere (Beverley & Regan, 1973, 1975; Portfors & Regan, 1997; Portfors-Yeomans & Regan, 1997; Regan, 1993; Regan & Kaushal, 1994; Regan, Beverley & Cynader, 1979). In brief, an equation relating (a) the direction of an object's motion to (b) the ratio between the angular speed of the object within a frontoparallel plane in cyclopean space (dφ/dt) and the rate of change of relative disparity dδ/dt was derived by Regan (1993). The equation can be rewritten as
L ≈ (dφ/dt)/(dδ/dt)    (17)

where L is the distance (expressed in units of the interpupillary separation I) by which the approaching object will miss a point midway between the eyes. The trajectory can also be expressed in terms of the ratio of the retinal image velocities in the two eyes; it is specified by the quantity

[(dφR/dt)/(dφL/dt) + 1] / [(dφR/dt)/(dφL/dt) − 1]    (18)

where (dφR/dt) and (dφL/dt) are the angular velocities of the object's retinal images in the right and left eyes, respectively.
It has been shown that the human visual system contains a binocular mechanism that is specifically sensitive to the ratio (dφ/dt)/(dδ/dt). One
distinction between equations (17) and (18) is that they imply quite different perceived geometries of three-dimensional dynamic space. Equation (18) implies a geometry with heavy emphasis on trajectories that pass close to the head (Beverley & Regan, 1973; Cynader & Regan, 1978), while equation (17) gives a more similar weight to different trajectories. This distinction is illustrated graphically in Figure 19.
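The contrast can be illustrated numerically with a sketch of the underlying geometry (our illustration; distance, speed and interpupillary separation are assumed values, and small-angle approximations are used throughout). The ratio in equation (17) grows in proportion to the miss distance, whereas the ratio of the two retinal image velocities changes steeply only for trajectories that pass within roughly the interocular separation of the head, as in Figure 19A.

```python
I  = 0.065    # interpupillary separation (m); assumed value
D  = 10.0     # current distance of the object (m); assumed value
Vz = 5.0      # approach speed (m/s); assumed value

print(f"{'miss X (m)':>10} {'(dphi/dt)/(ddelta/dt)':>22} {'(dphiR/dt)/(dphiL/dt)':>22}")
for X in (0.0, 0.02, 0.05, 0.5, 1.0):       # miss distance relative to the cyclopean point
    dphi_dt   = Vz * X / D ** 2             # cyclopean azimuthal velocity (rad/s)
    ddelta_dt = Vz * I / D ** 2             # rate of change of relative disparity (rad/s)
    dphiR_dt  = Vz * (X - I / 2) / D ** 2   # right-eye image velocity (rad/s)
    dphiL_dt  = Vz * (X + I / 2) / D ** 2   # left-eye image velocity (rad/s)
    print(f"{X:>10.2f} {dphi_dt / ddelta_dt:>22.2f} {dphiR_dt / dphiL_dt:>22.2f}")
# The second column equals X/I (equation 17) and so changes at the same rate for near
# and for wide misses. The third column swings from -1 through 0 as the trajectory
# shifts from the midpoint of the eyes to the right eye, then creeps towards +1 for
# trajectories that miss the head widely.
```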
Figure 19: A: Direction of an object's motion in depth plotted versus the ratio between the velocities of the retinal images in right and left eyes. The left-hand ordinate and continuous line are for the narrow range of trajectories that pass between the left and right eyes. The right-hand ordinate and dashed line are for trajectories that pass to the left or right of the head. B: Direction of an object's motion in depth plotted versus the ratio (dφ/dt)/(dδ/dt).
Finally, there is also monocularly-available information about an object's direction of motion in depth (Regan & Beverley, 1980; Regan, 1986; Bootsma, 1991; Regan & Kaushal, 1994). The equation
D ≈ S · (dφ/dt)/(dθ/dt)    (19)
relates the distance (D) by which the object will miss the eye to the diameter of the object (S) multiplied by the ratio between the object's angular speed within a frontoparallel plane (dφ/dt) and its rate of expansion (dθ/dt).
Figure 20: Time to passage and location at passage. A car (A) travelling at a speed of 30 m/s (ca. 67 m.p.h.) is steered so as to enter an underpass with the wall of the underpass (B) passing 3m from the driver's head. The time to passage and the 3m location at passage of object B can be obtained from retinal image information with high accuracy. From "Binocular information about time to collision and time to passage" by D. Regan, 2002, Vision Research, 42, p.2482. Copyright 2002 by Elsevier Science Ltd. Reprinted with permission.
A car (A) is being driven at 30 m/s (67 m.p.h.) towards an underpass so that the point midway between the driver's eyes will pass 3 m from the wall of the underpass (B). In Figure 20 a velocity V equal and opposite to that of the car has been impressed on both the car and object B. Figure 20 shows that the rate of change of disparity (dδ/dt) will rise above 5 arc min/s when the car is 35 m from the underpass. The V·cosθ component of relative motion will give dδ/dt = 0.0014639 radians/s and d²δ/dt² = 0.0025005 radians/s². From equation (16) this retinal image information gives a time to passage of the side of the underpass of 1.171 sec. The correct time to passage is 1.167 sec, so the error in the time to passage given by retinal image information is ca. 0.4% at 35 m from passage (θ = 4.9° at that point). The V·sinθ component of relative motion will give an angular speed across the retina of 0.0732 radians/s (4.2 deg/s). From equation (17) it can be seen that the retinal image information dφ/dt and dδ/dt signals that the side of the underpass (B in Figure 20) will pass 3.0 m to the right of the observer's head, provided the car's velocity remains constant.

A second numerical example applies to catching a ball. In the game of cricket a fielder is commonly stationed behind the batsman and, for a right-handed batsman, slightly to the right of the batsman. When a fast bowler is operating with a delivery speed of ca. 40 m/s (90 m.p.h.) the slip fielder may stand 15 m from the batsman. If the ball hits the edge of the batsman's bat the ball may fly towards the fielder, but the trajectory is not known until the ball leaves the bat's edge. The fielder faces the bat and fixates its outer edge rather than following the flight of the ball from the bowler's hand. If the ball deflects from the edge of the bat, the fielder has 0.375 sec to judge the flight of the ball and execute the catch (bare-handed). A catch is often made wide of the body. The correct location of the hand is given by equation (17). If the outstretched arm is 1.0 m long, the obliquity of the trajectory is 3.8 deg, and equation (16) gives the time to passage with an accuracy better than 0.4%.
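The arithmetic of the underpass example can be checked directly from the values quoted in the text (a check of the worked example; the interpupillary separation of 0.060 m is an assumed value consistent with the quoted disparity rates):

```python
import math

I        = 0.060        # interpupillary separation (m); assumed
d_delta  = 0.0014639    # d(delta)/dt at 35 m (rad/s), from the text
d2_delta = 0.0025005    # d2(delta)/dt2 at 35 m (rad/s^2), from the text
d_phi    = 0.0732       # angular speed within the frontoparallel plane (rad/s), from the text

ttp  = 2.0 * d_delta / d2_delta   # equation (16): time to passage
miss = I * d_phi / d_delta        # equation (17): lateral distance at passage (L expressed in metres)
print(round(ttp, 3), "s")         # ~1.171 s, versus the correct value of 1.167 s
print(round(miss, 2), "m")        # ~3.0 m

# Geometry check: range to the wall divided by speed, for a 30 m/s approach with the
# wall 3 m to one side at 35 m ahead.
print(round(math.hypot(35.0, 3.0) / 30.0, 3), "s")   # 1.171 s
```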
4. A model of the early processing of information about motion in depth and TTC

4.1 Overview

Over the last several decades a considerable amount of quantitative data has been published on the following topics: (1) the early processing of the changing-size and changing-disparity information associated with an object that is moving in depth; (2) the processing that results in the perception of motion in depth; (3) the perceived speed of approach of an object as a basis for estimating TTC. The model outlined in Figure 3 can account for the greater part of these
psychophysical data. The chief features of the model were developed over a series of psychophysical studies (Beverley & Regan, 1979a,b, 1980b; Regan & Beverley, 1978a,b, 1979a,b, 1980; Regan, Beverley & Cynader, 1979; Regan & Hamstra, 1993). The previous and present versions of the model contain both linear and nonlinear stages whose different dynamic characteristics determine the relative weighting attached to the changing-size and changing-disparity information about TTC. Previously published versions of Figure 3 were less complex: the intent was to focus on the main points. In Figure 3 the details required to account for a more complete range of observations have been made explicit in the model rather than implicit as in previous versions. The model depicted in Figure 3 consists of two mirror-image sections that signal, respectively, approaching and receding motion in depth. These signals feed an opponent stage (inset) so that the final motion-in-depth signal is the difference between the approaching and receding signals. This kind of arrangement was called "rein control" by Clynes (1965), who pointed out that its advantages include stability of the zero point and metabolic efficiency. The magnitude of the final motion-in-depth signal (i.e. the perceived speed of motion in depth) is assumed to be inversely proportional to TTC, so that a larger signal means greater urgency for evasive or interceptive action (Regan & Hamstra, 1993). The separation of the expanding-size and contracting-size subsystems in Figure 3 is consistent with evidence that the two subsystems have different time constants for the buildup and decay of aftereffects (Regan & Beverley, 1978b). The separation of the binocular subsystems for approaching and receding motion in depth in Figure 3 is consistent with the following evidence: (1) some observers have a region of the binocular visual field that is blind to approaching stereomotion-in-depth but sensitive to receding stereomotion-in-depth that traverses the same range of disparities; (2) other regions of the visual field have the opposite loss of sensitivity (Hong & Regan, 1989). We will first review the chief features of the model and show how they relate to experimental data. Finally we will describe several secondary features included in Figure 3 and show how they relate to secondary findings.

4.2 The chief features of the model

The dashed line represents the boundaries of the retinal image of a square untextured object. The velocity of any given boundary is transduced by a local motion (LM) filter that responds best to motion along the arrowed line. At the time the model was proposed Regan and Beverley (1979b) had in mind that the LM filters were the theoretical construct called Reichardt detectors
(Reichardt, 1961), but they can now be taken to be the kind of theoretical construct subsequently created by, for example, Watson and Ahumada (1985), Adelson and Bergen (1985) or the Elaborated Reichardt Detector of van Santen and Sperling (1985); see the review by Mather (1994). The LM filter outputs (a, b, c, d) assume a magnitude that is linearly proportional to local speed, and a sign that corresponds to the local direction of motion. (There is psychophysical evidence that linearity of transduction requires the presence of retinal image temporal jitter of about the power associated with an unrestrained head; see Figure 6.) The vertical pair of LM filters feeds a one-dimensional relative motion filter RMV that signals both an absolute rate of expansion [(a−b)] and a fractional rate of expansion [(a−b)/θ₁] of the height (θ₁).¹² Experimental evidence that the absolute and fractional rates of expansion are signaled in parallel is that observers can unconfound and discriminate simultaneous rates of change of absolute size, fractional rate of change of size, and starting size (Regan & Hamstra, 1993; Regan & Vincent, 1995). Similarly, relative motion filter RMW signals both the absolute and fractional rates of expansion of the width θ₂. (Although the RMV and RMW filters in Figure 3 signal only a rate of increase, the LM filters must signal motion in either direction in order to explain how a rate of change of the target's size is processed independently of bodily motion of the target within a frontoparallel plane; see Figure 6.) The receptive field size of RM filters is no more than ca. 1.5 deg (Beverley & Regan, 1979b). The motion-in-depth signal is most effectively excited by isotropic retinal image expansion, that is, expansion that is not accompanied by a change of shape, so that the TTC of the approaching object is identical across all meridians (Beverley & Regan, 1979a, 1980b). This is achieved at the "COMPARATOR GENERATES xθ" stage by a strong nonlinear interaction across orthogonal meridians that is selective for the condition
(a−b)/θ₁ ≈ (c−d)/θ₂    (20)
(Beverley & Regan, 1980b). In other words, TTC is computed before a motion-in-depth signal is generated. If the condition specified in equation (20) is far from being satisfied, the feedforward connection reduces gain G3 and attenuates the signal to considerably less than the greater of (a−b)/θ₁ and (c−d)/θ₂ (Beverley & Regan, 1979a, 1980b). If the condition specified in equation (20) is satisfied, the value of xθ is proportional to (dθ/dt)/θ, where θ is the angular
diameter across any meridian. In other words, xθ is inversely proportional to TTC. Figure 3 shows that a dynamic stereo signal proportional to dδ/dt is processed to yield a signal xδ that is inversely proportional to TTC (see Section 3.3 above). The boxes labelled DYNAMIC STEREO in Figure 3 correspond to the Σ boxes and everything to the left of them in Figure 17. The box labelled DIFF in Figure 3 corresponds to the DIFF box in Figure 17. The xθ and xδ signals are combined at an averaging stage (COM) and the combined signal passes on to a stage that subtracts the "approaching" and "receding" signals and generates a motion-in-depth signal whose speed (for approaching motion) is inversely proportional to TTC. The combination stage accounts for the finding that, although TTC is underestimated when only monocular information is available, and overestimated when only binocular information is available, accuracy is considerably better when both monocular and binocular cues are available (Gray & Regan, 1998). The final subtraction stage explains the finding that the sensation of motion in depth caused by a rate of change of angular subtense can be cancelled by pitting against it an appropriate rate of change of disparity (Regan & Beverley, 1979b).

¹² One method for obtaining the fractional rate of increase of θ₁ is to compute the rate of change of the logarithm of θ₁, since d(ln θ₁)/dt = (dθ₁/dt)/θ₁.

4.3 Secondary features of the model

In addition to the fractional rate of change of height (that feeds the comparator stage in Figure 3), the RMV stage outputs an absolute rate of change of height signal [i.e. (a−b)] from which is subtracted a fraction (k₁) of the dδ/dt signal. The resulting signal eventually creates a sensation of a rate of increase of height. A corresponding output from the RMW stage eventually creates a sensation of a rate of increase of width. The k₁(dδ/dt) term explains why, for a target of constant angular subtense (θ), a rate of change of disparity creates an illusory rate of change of angular subtense (Beverley & Regan, 1973; Regan & Beverley, 1979b). Further to this point, when the signal xθ from the comparator is allowed to pass G3 it reduces gains G1 and G2, so that when xθ is large the observer perceives zero rate of change of height and width. This explains why a rate of expansion accompanied by zero rate of change of shape creates the sensation that the target is moving in depth but not changing appreciably in size (Regan & Beverley, 1978b). The data shown in Figure 7 were explained as follows. The adapting stimulus strongly fatigued the LM filters driven by the target's vertical edges and also fatigued the RMW filter, and (more weakly) fatigued the stage that generated xθ. After adaptation ceased, the fatigued RMW filter generated one output [labelled (c−d) in Figure 3] that created the sensation of a contracting-width aftereffect. The
second postadaptation output from the fatigued RMW filter [labelled (c−d)/θ₂ in Figure 3] activated the comparator (because there was no corresponding signal from RMV), and the comparator then generated a signal that reduced gain G3 so as to block the xθ signal generated by the fatigued xθ signal generator. Thus, immediately after the cessation of adaptation the observer experienced a changing-width aftereffect, but no motion-in-depth aftereffect. The postadaptation output from RMW decayed rapidly and this eventually allowed gain G3 to revert to its default value, so that the slowly-decaying postadaptation xθ signal generated by the fatigued comparator was allowed to pass and to generate a motion-in-depth signal. As soon as the xθ signal was allowed to pass beyond G3, gain G2 was reduced, thus switching off the changing-width aftereffect. The cancellation data shown in Figure 7 confirmed this interpretation. By stimulating the eye with an opposed rate of change of bar width it was possible to exactly cancel the decreasing-width aftereffect. This reduced the output of the fatigued RMW to zero, thus eliminating the "mismatch" output from the comparator to G3, consequently allowing the G3 gain to revert to its default value so that the postadaptation xθ signal generated by the fatigued comparator passed through the G3 stage and generated a motion-in-depth signal (Beverley & Regan, 1979).

4.4 Dynamic characteristics and the consequent weighting of binocular and monocular contributions to the estimation of TTC

The relative weighting of monocular and binocular information about TTC depends on the linear width of the approaching object (S) expressed in terms of the observer's interpupillary separation (I). Using geometrical optics it has been shown that
(dδ/dt)/(dθ/dt) ≈ I/S    (21)
(Regan & Beverley, 1979b). In words, the ratio between the rate of change of disparity and the rate of increase of angular size associated with an approaching object is equal to the ratio of interpupillary separation to the object's linear width. Note that equation (21) does not involve the object's distance or speed. Equation (21) predicts that binocular information about TTC will be far more important than monocular information for small objects, a prediction verified by Gray and Regan (1998). The dynamic characteristics of the RM filters, comparator, and binocularly-driven stages differ considerably. Consequently the relative
weighting of binocular and monocular information depends strongly on the viewing conditions. This issue was discussed in detail by Regan and Beverley (1979b) and can be summarized as follows so far as unidirectional motion in depth is concerned. Binocular information becomes relatively more effective in generating a sensation of motion in depth as approach speed is increased, and relatively less effective as viewing time decreases. This effect of neural dynamics upon weighting adds to the large effect of the approaching object's linear size, expressed by equation (21). In addition, there is marked intersubject variability (up to 80:1 within the sample of five observers studied by Regan and Beverley, 1979b) in relative sensitivity to dθ/dt and dδ/dt.¹³ For a further discussion of weighting, with numerical examples, see Gray & Regan in this volume.

4.5 Is the earliest stage of processing based on the local motion of boundaries or directly on changing-size?

The LM and RM stages in Figure 3 combine to signal a rate of change in the height and width of the target's image. An alternative way of modeling this response was proposed by Regan (2000), and is depicted in Figure 21. It is widely supposed that the first stage of processing the spatial aspects of the retinal image is a parallel array of spatial filters, each of which prefers a particular target width and orientation (Graham, 1989). According to the theoretical construct depicted in Figure 21 a narrow spatial filter feeds a comparator (e.g. a multiplier) through a delay stage and a wider spatial filter feeds the comparator with no intervening delay. Suppose that the narrow spatial filter responds best to a target of angular width θ, the wider spatial filter responds best to a target of angular width (θ + Δθ), and that the delay is T sec. Because the output of the comparator is strongest when the two inputs are identical, it will respond most strongly to a rate of expansion of Δθ/T deg/s. (The "delay and compare" principle of this changing-size, or z-axis motion, detector is by analogy with the principle of the classical Reichardt (1961) detector for motion within a frontoparallel plane.) The model depicted in Figure 21 directly senses a rate of change of size of the image and is, therefore, quite different from the LM and RM stages of the Figure 3 model, which infer a rate of change of size from the difference in the velocities of opposite edges.
¹³ Rushton and Wann (1999) proposed a model of relative weighting that does not take account of the dynamic characteristics of the mechanisms sensitive to monocular and binocular information about TTC.
Figure 21: Hypothetical local filter sensitive to expansion of an object's retinal image. Two filters for luminance-defined form whose widths differ but are driven from the same retinal location feed a comparator (e.g., a multiplier). The output of the narrower filter is delayed before it reaches the comparator. If a bright bar within an expanding grating is centered on the filters, the output of the comparator will be some function of the rate of decrease of grating spatial frequency. From Human Perception of Objects: Early Visual Processing of Spatial Form Defined by Luminance, Colour, Texture, Motion, and Binocular Disparity (p.342), by D. Regan, 2000, Sunderland, MA: Sinauer.
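A minimal numerical sketch of the "delay and compare" idea in Figure 21 (our illustration; the filter widths, bandwidth, delay and stimulus values are all assumed): the output of a narrow size-tuned filter is delayed by T seconds and multiplied with the undelayed output of a wider filter, so the product peaks when the image width grows from θ to θ + Δθ in T seconds, i.e. for a rate of expansion near Δθ/T.

```python
import numpy as np

def size_filter(width, preferred, bandwidth=0.2):
    """Response of a spatial filter tuned to a preferred target width (deg)."""
    return np.exp(-0.5 * ((width - preferred) / bandwidth) ** 2)

theta, dtheta, T = 1.0, 0.2, 0.1   # preferred widths 1.0 and 1.2 deg; delay 0.1 s (assumed)
dt = 0.01
t = np.arange(0.0, 2.0, dt)

def comparator_peak(rate):
    """Peak of the product of the delayed narrow filter and the undelayed wide filter."""
    w = 0.8 + rate * t                         # image width expanding at `rate` deg/s
    narrow = size_filter(w, theta)
    wide   = size_filter(w, theta + dtheta)
    lag = int(round(T / dt))
    return np.max(narrow[:-lag] * wide[lag:])  # narrow output delayed by T relative to wide

for rate in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"expansion {rate:.1f} deg/s -> comparator {comparator_peak(rate):.3f}")
# The comparator responds most strongly near rate = dtheta/T = 2.0 deg/s, the
# detector's preferred rate of expansion.
```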
We prefer to retain the original formulation of the Figure 3 model (Regan & Beverley, 1978a, 1979b) because the data shown in Figure 6 are more straightforwardly explained in terms of LM and RM filters than in terms of the Figure 21 scheme.
5. An example of the application of the step-by-step approach to field research using real aircraft

The hypothesis illustrated in Figure 2 led to the following suggestions. (1) In order to predict the visual aspects of an individual's performance over a range of very many skilled tasks it is not necessary to develop a correspondingly large number of visual tests, because the number of independent test results is determined by the number of sets of filters in Figure 2. On the other hand, a single "shotgun" test that predicts a wide range of performances is unlikely to exist. (2) Only a few sets of filters may determine performance in some everyday tasks (only one in some cases). (3) Different sets of filters will determine performance in different tasks. Therefore intersubject differences in performance in a particular task will be predicted by testing the particular sets (or set) of filters that determine performance in that particular task. Different tests will predict intersubject differences of performance in different tasks: the visual test must be chosen for the particular task, bearing in mind Figure 2 (Beverley & Regan, 1980a; Kruk & Regan, 1983). (4) Tests that assess sets of filters that are not used in a particular task will not predict performance in that task (e.g. visual acuity fails to predict intersubject differences in the performance of many tasks). (5) Individuals whose sets of filters operate very independently of one another and of all other visual variables will perform better in a complex visual environment than individuals subject to "crosstalk" between sets of filters, even though the difference in performance might not be evident in a simple visual environment (Beverley & Regan, 1980a).

With this rationale in mind, studies were conducted using telemetry-tracked high-performance jet aircraft (F15 and A4). The purpose was to find whether intersubject differences in the performance of several flying tasks could be predicted as follows: (a) identify theoretically the set (or sets) of filters that would determine performance in a given flying task; (b) carry out laboratory tests on the set (or sets) of filters so identified; (c) quantify performance in the given flying task using telemetry-tracked aircraft. The laboratory tests included errors in manually tracking the expansion and contraction of a target, and the discrimination of changes in the speed of a radially expanding flow pattern. (Figure 11 indicates that this task assesses sensitivity to TTC signaled by the flow pattern associated with self-motion.) Inter-pilot differences in the performance of the flying tasks gave encouraging
correlations with inter-pilot differences in laboratory tests of sets of filters. Correlations with standard tests (e.g. visual acuity, stereoacuity) were negligible. The flying tasks studied using real aircraft were formidable but, of course, fell within the safety limits of high-level training for top pilots. We wished to go on to study flying performance in extreme situations of the kind in which a proportion of flying accidents occur. Such studies could not, of course, use real aircraft. Pilots kindly agreed to fly extreme situations in simulators (a most unpleasant experience for professional pilots). Our aim was to identify crucial visual sensitivities in accident avoidance. Simulator studies included visually-guided landing in low visibility on a shortened runway (low on fuel, with defective instruments), accuracy in holding station in close formation, and performance in air-to-air refueling. Inter-pilot differences in the performance of flying tasks (as measured, for example, by the number of crashes in the landing task) gave encouraging correlations with the results of laboratory tests. These findings suggested candidates for feedback variables used by pilots in their exploitation of lawful relationships between action and perception (Kruk et al., 1981, 1983; Kruk & Regan, 1983; Kruk & Regan, 1996). The chapter by Gray & Regan in this volume reviews a study on driving performance that followed the rationale described in this section.
6. A case when the available visual information is insufficient to perform the designated task

So far we have discussed situations in which an accurate knowledge of the absolute distance of an approaching object is not necessary for collision avoidance or collision achievement. It is fortunate that successful motor action can be achieved entirely on the basis of dynamic retinal image information in so many situations (e.g. equations 3, 16 and 17), because judgements of absolute distance are notoriously inaccurate, especially for distances greater than a very few metres. There are, however, situations that call for accurate estimates of absolute distance, some of which occur in sport. Accomplished games players have long taken advantage of this deficiency in human vision. One example is in cricket, where a slow so-called flight bowler is able to capitalize on human deficiencies in estimating the distance and line-of-sight speed of the ball so as to create errors in estimating where the ball will hit the ground. The problem faced by the batsman can, perhaps, be compared with the problem familiar to the communications engineer who receives a signal that is heavily contaminated by noise. A single presentation of the distorted signal does not allow the engineer to recover the signal. However, several adaptive filter techniques are available
for progressively building up the capability to 'recognize' a signal, given that the engineer has some prior knowledge of the signal. For example, Woody's adaptive filter technique is to cross-correlate a template of the signal with samples of signal-plus-noise. This technique can identify the unknown time of occurrence of the signal (Harris & Woody, 1969; Regan, 1989, p. 68; Woody, 1967); a minimal illustration of the cross-correlation idea is sketched at the end of this section. It is well recognized that batsmen use the immediately preceding few deliveries to help judge the trajectory of the next delivery. The batsman creates prior knowledge; in other words, he builds up a mental model. (It is a fact well known to bowlers that a batsman is particularly vulnerable to the first few deliveries received.) The skilled slow bowler strives to create in the batsman's mind an erroneous mental model (Regan, 1992). In baseball, the illusion that a delivery is rising rather than falling as it approaches may be achieved similarly (Karnavas, Bahill & Regan, 1990). In cricket the bowler delivers several balls with identical speeds and trajectories, and the batsman learns to anticipate where the ball will hit the ground. The next delivery will be different. For example, it may have the same initial trajectory and speed but (because it is spinning backwards or forwards) hit the ground closer to or further away from the batsman than expected. In addition, the trick of projecting the ball slightly upwards can ensure that the vertical angular speed of the ball is below detection threshold at a time when the batsman must commit himself to playing either forward or back, thus forcing even good batsmen into embarrassing and ungainly changes of intention when faced with a master flight bowler (Regan, 1992).
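As an illustration of the cross-correlation idea mentioned above (a generic sketch only, not Woody's full adaptive procedure, in which the template itself is refined iteratively; all waveform and noise parameters are assumed), a template of the expected waveform can be slid along a noisy record and the lag of maximum correlation taken as the estimated time of occurrence of the signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# A known waveform (the "template") buried in noise at an unknown latency.
fs = 1000                                     # samples per second
template = np.sin(2 * np.pi * 10 * np.arange(0, 0.1, 1 / fs))   # 100 ms, 10 Hz burst
record = rng.normal(0.0, 1.0, 2000)           # 2 s of noise
true_onset = 700                              # sample index at which the signal is hidden
record[true_onset:true_onset + template.size] += template

# Cross-correlate the template with the record; the peak marks the best-matching lag.
corr = np.correlate(record, template, mode="valid")
estimated_onset = int(np.argmax(corr))
print("true onset:", true_onset, " estimated onset:", estimated_onset)
```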
REFERENCES

Abernethy, B., & Burgess-Limerick, R. (1992). Visual information for the timing of skilled movements: a review. In J. J. Summers (Ed.), Approaches to the study of motor control and learning (pp. 27-37). New York: Elsevier.
Adelson, E. H. & Bergen, J. R. (1985). Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A, 2, 284-299.
Alderson, G. J. K., Sully, D. J. & Sully, H. G. (1974). An operational analysis of a one-handed catching task using high-speed photography. Journal of Motor Behaviour, 6, 217-226.
Beverley, K. I. & Regan, D. (1973). Evidence for the existence of neural mechanisms selectively sensitive to the direction of movement in space. Journal of Physiology, 235, 17-29.
Beverley, K. I. & Regan, D. (1975). The relation between discrimination and sensitivity in the perception of motion in depth. Journal of Physiology, 249, 387-398.
Beverley, K. I. & Regan, D. (1979a). Separable aftereffects of changing-size and motion-in-depth: different neural mechanisms? Vision Research, 19, 727-732.
Beverley, K. I. & Regan, D. (1979b). Visual perception of changing-size: the effect of object size. Vision Research, 19, 1093-1104.
Beverley, K. I. & Regan, D. (1980a). Device for measuring the precision of eye-hand coordination while tracking changing size. Aviation, Space and Environmental Medicine, 51, 688-693.
Beverley, K. I. & Regan, D. (1980b). Visual sensitivity to the shape and size of a moving object: implications for models of object perception. Perception, 9, 151-160.
Beverley, K. I. & Regan, D. (1982). Adaptation to incomplete flow patterns: no evidence for "filling in" the perception of flow patterns. Perception, 11, 275-278.
Beverley, K. I. & Regan, D. (1983). Texture changes versus size changes as stimuli for motion in depth. Vision Research, 23, 1387-1400.
Blaquiere, A. (1966). Nonlinear Systems Analysis. New York: Academic Press.
Bootsma, R. J. (1991). Predictive information and the control of action: what you see is what you get. International Journal of Sports Psychology, 22, 271-278.
Clynes, M. (1969). Symposium on rein control, or unidirectional rate sensitivity, a fundamental dynamic and organizing function in biology. Annals of the New York Academy of Science, 156, 627-698.
Cynader, M. & Regan, D. (1978). Neurons in cat parastriate cortex sensitive to the direction of motion in three-dimensional space. Journal of Physiology, 274, 549-569.
Cynader, M. & Regan, D. (1982). Neurons in cat visual cortex tuned to the direction of motion in depth: effect of positional disparity. Vision Research, 22, 967-982.
Graham, N. (1989). Visual Pattern Analyzers. New York: Oxford University Press.
Gray, R. & Regan, D. (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499-512.
Gray, R. & Regan, D. (1999a). Motion in depth: adequate and inadequate simulation. Perception and Psychophysics, 61, 236-245.
Gray, R. & Regan, D. (1999b). Do monocular time to collision estimates necessarily involve perceived distance? Perception, 28, 1257-1264.
Gray, R. & Regan, D. (1999c). Adapting to expansion increases perceived time to collision. Vision Research, 39, 3602-3607.
Gray, R. & Regan, D. (2000a). Estimating time to collision with a rotating nonspherical object. Vision Research, 40, 49-63.
Gray, R. & Regan, D. (2000b). Simulated self-motion alters perceived time to collision. Current Biology, 10, 587-590.
Green, B. A. Jr. (1967). Vector Calculus. New York: Appleton-Century-Crofts.
Harris, E. K. & Woody, C. D. (1969). Use of an adaptive filter to characterize signal-noise relationships. Computers in Biomedical Research, 2, 242-273.
Harris, J. & Watamaniuk, S. N. (1995). Speed discrimination of motion in depth using binocular cues. Vision Research, 35, 885-896.
Hong, X. H. & Regan, D. (1989). Visual field defects for unidirectional and oscillatory motion in depth. Vision Research, 29, 809-819.
Hoyle, F. (1957). The Black Cloud. Middlesex, England: Penguin.
Karnavas, W. J., Bahill, A. T. & Regan, D. (1990). Sensitivity analysis of a model for the rising fastball and breaking curveball. Proc. IEEE Systems, Man and Cybernetics Conference, Los Angeles.
Kohly, R. P. & Regan, D. (2000). Long-distance interactions in cyclopean vision. Proceedings of the Royal Society B, 268, 213-219.
Kohly, R. & Regan, D. (2002). Fast long-range interactions in the early processing of luminance-defined form. Vision Research, 42, 49-63.
Kruk, R. & Regan, D. (1983). Visual test results compared with flying performance in telemetry-tracked aircraft. Aviation, Space and Environmental Medicine, 54, 906-911.
Kruk, R. & Regan, D. (1996). Collision avoidance: A helicopter simulator study. Aviation, Space and Environmental Medicine, 67, 111-114.
Kruk, R., Regan, D., Beverley, K. I. & Longridge, T. (1981). Correlations between visual test results and flying performance on the Advanced Simulator for Pilot Training (ASPT). Aviation, Space and Environmental Medicine, 52, 455-460.
Kruk, R., Regan, D., Beverley, K. I. & Longridge, T. (1983). Flying performance on the Advanced Simulator for Pilot Training and laboratory tests of vision. Human Factors, 25, 457-466.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Levitt, H. (1971). Transformed up-down methods in psychoacoustics. Journal of the Acoustical Society of America, 49, 467-477.
Marmarelis, P. Z. & Marmarelis, V. Z. (1978). Analysis of Physiological Systems: the White Noise Approach. New York: Plenum Press.
Mather, G. (1994). Motion detector models: psychophysical evidence. In A. T. Smith & R. J. Snowden (Eds.), Visual detection of motion (pp. 117-143). London: Academic Press.
Mountcastle, V. B. (1979). An organizing principle for cerebral functions: The unit module and the distributed system. In F. O. Schmitt & F. G. Worden (Eds.), The Neurosciences: Fourth Study Program (pp. 21-42). Cambridge, Mass.: M.I.T. Press.
National Highway Traffic Safety Administration (NHTSA). Traffic safety facts 1996. Washington, D.C.: NHTSA.
Peper, L., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: How to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20, 591-612.
Poggio, G. F. & Talbot, W. H. (1981). Mechanisms of static and dynamic stereopsis in foveal cortex of the rhesus monkey. Journal of Physiology, 315, 469-492.
Poincaré, H. (1913). The Value of Science. New York: Science Press.
Portfors, C. V. & Regan, D. (1997). Just-noticeable difference in the speed of cyclopean motion in depth and of cyclopean motion within a frontoparallel plane. Journal of Experimental Psychology: Human Perception and Performance, 23, 1074-1086.
Portfors-Yeomans, C. V. & Regan, D. (1996). Cyclopean discrimination thresholds for the direction and speed of motion in depth. Vision Research, 36, 3625-3279.
Portfors-Yeomans, C. V. & Regan, D. (1997). Discrimination of the direction and speed of a monocularly-visible target from binocular information alone. Journal of Experimental Psychology: Human Perception and Performance, 23, 227-243.
Regan, D. (1982). Visual information channeling in normal and disordered vision. Psychological Review, 89, 407-444.
Regan, D. (1986). Visual processing of four kinds of relative motion. Vision Research, 26, 127-145.
Regan, D. (1989). Human Brain Electrophysiology. New York: Elsevier.
Regan, D. (1992). Visual judgements and misjudgements in cricket, and the art of flight. Perception, 21, 91-115.
Regan, D. (1993). Binocular correlates of the direction of motion in depth. Vision Research, 33, 2359-2360.
Regan, D. (1995). Spatial orientation in aviation: Visual contributions. Journal of Vestibular Research, 5, 455-471.
Regan, D. (1997). Visual factors in catching and hitting. Journal of Sports Sciences, 15, 533-558.
Regan, D. (2000). Human Perception of Objects: Early Visual Processing of Spatial Form Defined by Luminance, Color, Texture, Motion, and Binocular Disparity. Sunderland, MA: Sinauer.
Regan, D. (2002a). The Proctor Lecture: An hypothesis-based approach to clinical psychophysics and to the design of visual tests. Investigative Ophthalmology and Visual Science, 43(5), 1311-1323.
Regan, D. (2002b). Binocular information about time to collision and time to passage. Vision Research, 42, 2479-2484.
Regan, D. & Beverley, K. I. (1973). Some dynamic features of depth perception. Vision Research, 13, 2369-2379.
Regan, D. & Beverley, K. I. (1978a). Looming detectors in the human visual pathway. Vision Research, 18, 415-421.
Regan, D. & Beverley, K. I. (1978b). Illusory motion in depth: aftereffect of adaptation to changing size. Vision Research, 18, 209-212.
Regan, D. & Beverley, K. I. (1979a). Visually guided locomotion: psychophysical evidence for a neural mechanism sensitive to flow patterns. Science, 205, 311-313.
Regan, D. & Beverley, K. I. (1979b). Binocular and monocular stimuli for motion in depth: changing-disparity and changing-size feed the same motion-in-depth stage. Vision Research, 19, 1331-1342.
Regan, D. & Beverley, K. I. (1980). Visual responses to changing size and to sideways motion for different directions of motion in depth: linearization of visual responses. Journal of the Optical Society of America, 11, 1289-1296.
Regan, D. & Beverley, K. I. (1981). Motion sensitivity measured by a psychophysical linearizing technique. Journal of the Optical Society of America, 71, 958-965.
Regan, D. & Cynader, M. (1982). Neurons in cat visual cortex tuned to the direction of motion in depth: effect of stimulus speed. Investigative Ophthalmology and Visual Science, 22, 535-550.
Regan, D., Gray, R., Portfors, C. V., Hamstra, S. J., Vincent, A., Hong, X. H., Kohly, R. & Beverley, K. (1998). Catching, hitting, and collision avoidance. In L. R. Harris & M. Jenkin (Eds.), Vision and Action (pp. 181-214). Cambridge: Cambridge University Press.
Regan, D. & Hamstra, S. J. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Research, 33, 447-462.
Regan, D. & Kaushal, S. (1994). Monocular judgement of the direction of motion in depth. Vision Research, 34, 163-177.
Regan, D. & Vincent, A. (1995). Visual processing of looming and time to contact throughout the visual field. Vision Research, 35, 1845-1857.
Regan, D., Beverley, K. I. & Cynader, M. (1979). The visual perception of motion in depth. Scientific American, 241, 136-151.
Regan, D., Erkelens, C. J. & Collewijn, H. (1986a). Necessary conditions for the perception of motion in depth. Investigative Ophthalmology and Visual Science, 27, 584-597.
Regan, D., Erkelens, C. J. & Collewijn, H. (1986b). Visual field defects for vergence eye movements and for stereomotion perception. Investigative Ophthalmology and Visual Science, 27, 806-819.
Reichardt, W. (1961). Autocorrelation, a principle for the evaluation of sensory information by the central nervous system. In W. Rosenblith (Ed.), Sensory Communication (pp. 160-175). New York: Wiley.
Richards, W. & Regan, D. (1973). A stereo field map with implications for disparity processing. Investigative Ophthalmology and Visual Science, 12, 904-909.
Rushton, S. K. & Wann, J. P. (1999). Weighted combination of size and disparity: A computational model for timing a ball catch. Nature Neuroscience, 2, 186-190.
Schey, H. M. (1973). Div, Grad, Curl, and All That. New York: Norton.
Schöner, G. (1994). Dynamic theory of action-perception patterns: The time-before-contact paradigm. Human Movement Science, 13, 415-439.
Spekreijse, H. (1966). Analysis of EEG Responses in Man. Thesis. The Hague: Junk Publishers.
Spekreijse, H., van Norren, D. & van den Berg, T. J. T. P. (1971). Flicker responses in monkey lateral geniculate nucleus and human perception of flicker. Proceedings of the National Academy of Sciences of the U.S.A., 68, 2802-2805.
Spileers, W., Orban, G. A., Gulyas, B. & Maes, H. (1990). Selectivity of cat area 18 neurons for direction and speed in depth. Journal of Neurophysiology, 63, 936-954.
van Santen, J. P. & Sperling, G. (1985). Elaborated Reichardt detectors. Journal of the Optical Society of America, A, 2, 300-320.
Vincent, A. & Regan, D. (1997). Judging the time to collision with a simulated textured object: effect of mismatching rates of expansion of size and of texture elements. Perception and Psychophysics, 59, 32-36.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22, 1031-1048.
Watson, A. B. & Ahumada, A. J. (1985). Model of visual-motion sensing. Journal of the Optical Society of America, A, 2, 322-342.
Wheatstone, C. (1838). Contributions to the physiology of vision. Philosophical Transactions of the Royal Society, 13, 371-394.
Woody, C. D. (1967). Characterization of an adaptive filter for the analysis of variable latency neuroelectric signals. Medical and Biological Engineering, 5, 539-553.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 10 Textured Tau
Klaus Landwehr
Universität Münster, Münster, Germany
Universität Wuppertal, Wuppertal, Germany
ABSTRACT
A brief review of research on texture as an independent variable in visual motion perception reveals only minor effects of texture in the context of object motion but major ones in the context of ego-motion. In order to explain this asymmetry, the notion of texture is thoroughly reanalyzed in geometrical terms, and conditions where it can reasonably be expected to exert an influence on motion-critical behavior are identified. A research program, focused on visual object-motion perception, is sketched in outline, and first preliminary results from two pilot studies are communicated.
1. Introduction
This chapter provides a programmatic evaluation of the role of texture in time-to-contact estimation, and in visual motion perception more generally. The notion of "texture", which nowadays is often used in a narrow sense, is restored to its original meaning, and a mathematical analysis in terms of tilings, patterns, and fractal-like structures is introduced. The distinction between material and optical (or visual) texture is emphasized, and it is shown that the two are related not by way of unequivocal specification but rather in a "cueing" manner. Finally, an experimental research program that builds upon these analyses is sketched and illustrated with two pilot studies.
2. Texture and tau
The exploitation of information from changing visual angles, which some authors assume to occur in visual motion perception (ego-motion as well as object motion), presupposes optical discontinuities, that is, texture (Euclid, Optics, §§ 53, 56; Gibson, 1958; 1966, pp. 159, 188-195; Lee, 1974). Visual texture does not necessarily correspond to material texture; yet, besides color, shading and glare, it appears to comprise all that meets our resting eyes! An unavoidable question for research, then, is how, besides color, shading and glare, texture supports discrimination and identification of movable objects, of their and potential observers' trajectories and relative velocities, and, eventually, of respective arrival times. Considering how little color, shading and glare could contribute to the specification of these physical parameters of motion, it would appear that texture has to carry most of the burden. Empirical work, however, has so far not revealed major effects of texture as an independent variable in visual perception of object motion. Schiff and Detwiler (1979), for example, in a truncated-event extrapolation task ("time-to-contact estimation"), found little difference whether a virtually approaching object (a magnifying black square) was presented in front of a slanted rectangular grid or in front of a void. Earlier, Harvey and Michon (1974) had not found any effect of a random lines-and-dots background on detection thresholds for the motion of a single dot towards or away from another, stationary one (simulating recession from, or approach to, the taillights of an automobile). More recently, Gray and Regan (1999) failed to find an effect of real environmental "clutter" (furniture, carpets, walls) on time-to-contact estimates for the image of a textureless sphere, which mimicked optically equivalent approaches of a large object from a far distance or of a smaller one from close by, realized by appropriate image enlargement and by moving the observer farther away from, or closer towards, the computer screen. With regard to surface texture of the moving object, Li and Laurent (1995)
were able to establish a weak and indirect effect of texture - or rather, of the rate of occlusion of two or three stripes, painted on a ball that moved at different (but constant) velocities towards a stationary observer - on one of their dependent measures of an avoidance response (velocity of trunk bending). Unfortunately, the authors' variation of occlusion rate carried velocity as a partial confound, which alone exerted a main effect (there was also a main effect of surface texture as such, though). Eventually, Vincent and Regan (1997) succeeded in demonstrating that, when a semi-regular checkerboard texture of limited, square-shaped extent was magnified faster (slower) than the whole figure, time-to-contact responses were significantly earlier (later) than for the default condition of coordinate magnification - although not wholly compensating for the mismatch. This state of affairs, with texture as an almost ineffective variable in visual object motion perception, contrasts sharply with results from research on ego-motion. Here, specific environmental textures, or their distribution and relative displacement in the field of view, have, inter alia, been demonstrated to affect the steering toward a goal (and the avoidance of an obstacle) during walking (Duchon, 1998), the negotiating of curves while driving a car (Land & Horwood, 1995), and the adjustment of altitude and speed when flying an aeroplane (Larish & Flach, 1990; Owen, 1990; Wolpert, 1990, pp. 111-115; all were simulation studies; cf. Warren, 1998, for a fairly comprehensive overview). Interestingly, ego-motion perception also seems to be influenced by the presence of moving objects (Royden & Hildreth, 1996; Warren & Saunders, 1995), which may be regarded as another, indirect effect of environmental texture. Object motion perception during ego-motion has been investigated intensively for baseball playing, but neither (and understandably) object texture nor (less understandably) environmental texture received much attention in this research (cf. Oudejans, Michaels, Bakker, & Davids, 1999, for a telling example). Why is it that texture does not loom that large in visual object motion perception? One reason might be economy in visual processing: Once an object is discriminated from its surround, the object's surface texture adds but little information to the detection of approach or recession (cf. Beverley & Regan, 1980, on sufficient conditions for visual object discrimination). Also, objects are often rather small, so their surface texture may not resolve well (cf. Regan & Beverley, 1979, Appendix 2, for related calculations of the relative effectiveness of changing apparent size and changing binocular disparity, for objects of different size, approaching from different distances at different velocities). On the other hand, if precision is at a premium, surface texture that informs about an object's shape and rotation may prove valuable (as it did, to a certain degree, in the Li & Laurent (1995) study cited above). Finally, parallactic displacement relative to, and occlusion of, object surface texture as well as environmental texture might be used to infer an object's trajectory (a question not addressed in the references cited up to this point; Schiff & Oldak, 1990, Kebeck & Landwehr, 1992, and Bootsma & Oudejans, 1993,
among others, studied trajectories, but did not vary texture). Before analyzing these possibilities in more detail, let us first become clear about what exactly we mean by "texture".
3. On the notion of texture
According to Webster's New International Dictionary of the English Language, the concept of texture refers to the arrangement of the threads in a textile fabric, or more generally to the arrangement of the parts that make up something. A more specialized reference is to the fineness or coarseness of tissue; and in a figurative sense, texture may also mean an "intricate composition, as of a plan or plot" (Neilson, 1909/1960, p. 2614). Compared to the notion of structure, it would seem that "texture" preferably refers to a material's approximately two-dimensional surface, whereas "structure" makes reference to its internal, n-dimensional organization (cf. "the texture of a mineral" versus "molecular structure"). When Gibson (1950a, pp. 61-67) introduced the notion of texture into the psychological literature, he was thinking of the structure or texture of the retinal image. Later, he preferred to think of visible texture in terms of the structure of ambient illumination (Gibson, 1958; 1979, pp. 65-76). Visual or retinal-image texture can be derived from optical texture by a modulation transfer function (Hecht, 1998; Pugh, 1988); optical texture relates to material texture via surface inclination and reflectance, relative to conditions of illumination (Gibson, 1979, pp. 29-31, 86-92; Hecht, 1998; Todd & Mingolla, 1983). Optical textures, like even surfaces, can be described in two spatial dimensions. If surfaces are uneven or non-rigid, and/or if conditions of illumination or of observation change, optical textures are ever changing, too. This is why Gibson (1966; 1979, pp. 246-249) suggested that invariants of structure be sought to explain the phenomenal persistence ("constancy") of the world. Textures displayed in the visual perception research laboratory usually are pictorial textures - surface pigmentations, cathode-ray modulations, liquid-crystal ionizations, etc. Often, radiance parameters remain unspecified or are taken to be within "normal range", so that the description issue reduces to (marked or colored) plane geometry. What are visual textures composed of? Consider a corrugated, crystalline surface. At the material level, we have a micro-layout of the surface's facets. Optically, this will project as a distribution of (internally shaded) areas of some color, and darker discontinuities ("lines") between them, corresponding to the facets' borders. Geometrically, this may be referred to as a tiling, that is, a complete and non-overlapping covering of the Euclidean plane (Grünbaum & Shephard, 1987, p. 16). Now consider a different case - the starry sky at night - a purely optical texture, as it were (although there are material, radiant light sources believed to be behind). Phenomenally, this texture appears as a distribution of "points" or "dots"
across a homogeneous background (Wertheimer, 1923). Geometrically, we have a pattern, as for the first time precisely defined by Grünbaum and Shephard (1987, p. 204) to be a non-empty family of pairwise disjoint sets of congruent copies of one or several motifs. Both kinds of textures were suggested by Gibson (1950a, pp. 80-90) to be used in psychophysical research on "space perception" (perception of surface layout). Gibson (1950a; cf. Gibson, 1979, pp. 147-169, for his own summary of his work) mainly looked at regular textures, and the optical transformations they undergo with changing distance (continuous distance along the ground, surface slant, etc.). This led to the seminal idea that optical gradients of texture inform about surface layout. In the next section, I shall qualify this conjecture. "Regularity", with Gibson (1950a, pp. 77-78), meant either "strict" regularity (i.e., translatory symmetry) or stochastic regularity. The latter has been extensively elaborated upon by Julesz (1965 et passim) and Beck (1966 et passim), and the former received renewed attention in work on picture perception (e.g., Sedgwick, 1992), more than five decades after its introduction into the art of painting by Alberti (1435; cf. Edgerton, 1975; Kubovy, 1986, and Hagen, 1980, for a collection of readings). It is important to note that regularity exists at more than one level. Basic to all periodic tilings and patterns are five possible lattices of parallelograms, rectangles, squares, rhombs, or equilateral triangles, which can be dissected and marked in 17 different ways if translational symmetry is to be preserved (Schattschneider, 1978; Schoenflies, 1891). Still, an infinity of "surface tiles" or motifs can be created, as impressively realized by the Dutch artist M. C. Escher (1898-1972), who also noted that tile and tiling symmetry as well as tiling and color symmetry may differ (Coxeter, 1986; Grünbaum, 1986, p. 56). Next, there are duals, which are tilings defined over singular points of other, original tilings (Laves, 1931). Finally, there is the procedure of Dirichlet ([1850]; explained by Grünbaum & Shephard, 1987, pp. 250-252) that allows one to retile any tiling or pattern into new tiles that assemble all geometrical points closer to a motif or original tiling vertex than to any other - an important tool in analyzing meta-regularities in patterns (Landwehr, 1998, pp. 251-255). The point for perception is that different levels or layers of regularity might be grasped simultaneously - or interfere. Throughout the preceding discussion it was assumed that textures can be decomposed into elements (cf. Zucker, 1976). This was criticized early on by Gibson (Gibson et al., 1955, p. 2, Footnote 2), and it is obviously untrue for natural optical textures (Metzger, 1966, p. 733). Discontinuities need not be abrupt, and they do not necessarily close to form "figures" or "forms". If the underlying material texture or surface layout is not plane (has "depth" or "3D structure"), shading cannot be eliminated. In terms of fractal geometry (Barnsley, 1988) we have a layout of fuzzily defined "lines", "areas", "patches", etc., that may also intertwine (Gibson, 1979, p. 28; Landwehr, 1998, pp. 223 and 255-259). Here the point for perception is that visual angles are not easily defined and tracked.
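The Dirichlet construction just mentioned corresponds to what is nowadays usually called a Voronoi tessellation, and it can be computed for any dot pattern. The following is a minimal sketch, not part of Landwehr's program; the random "motif" coordinates and the use of SciPy are my own illustrative choices.

```python
import numpy as np
from scipy.spatial import Voronoi

# An arbitrary dot pattern: each point stands for a motif (or for a vertex
# of an original tiling). The coordinates are illustrative only.
rng = np.random.default_rng(seed=1)
motifs = rng.uniform(0.0, 10.0, size=(25, 2))

# Dirichlet/Voronoi retiling: each cell collects all points of the plane
# that lie closer to its motif than to any other motif.
vor = Voronoi(motifs)

print(len(vor.point_region))   # one cell per motif (25)
print(vor.vertices[:3])        # corners of the new "tiles"
```

Plotting the resulting cells over the original dots makes the meta-regularity of a pattern (e.g., near-hexagonal cells for a roughly uniform dot distribution) directly visible.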
4. Specification and ,,cueing" by texture A given layout of surfaces, under specified conditions of illumination, gives rise to a specific optic array (Gibson, 1958; 1979, pp. 51-52, 86-91). The inverse (as believed by Gibson, 1961, p. 260; 1967), does not necessarily hold. Neither does visible texture unequivocally specify material surface layout, nor does it univocally relate to surface texture. A surface may be irregularly textured, or parts of it occluded, such that visual angles do not properly inform about continuous distance, surface slant, curvature, etc. Given a general rigidity constraint - we are not living in a rubber world - the specification is unique, however, for ego-motion, if surfaces are textured sufficiently dense (cf. Gibson, 1950a, pp. 119-120; 1979, pp. 121-126; Johansson, 1970; Warren, 1995, pp. 267-275). With object motion, some reservations remain: Objects may counter-rotate, so that their rear side is never seen (Lee & Young, 1985). Focused on individual surfaces, the specification issue is even more intricate. For a quarter of a century, Gibson (1950 b; 1979, pp. 150-156) bothered about necessary and sufficient conditions of ,,visual surfaceness", without finding any. Sometimes, a single step in light intensity may suffice to evoke the impression of a hard, impenetrable surface (Metzger, 1930), sometimes, a sequence or gradient of such steps is required (Gibson et al., 1955). Photographs of natural surfaces often afford recognition of the material depicted but not independent discriminability of its solidity (,,material hardness"; Gibson & Bridgman, 1987, p. 4). In fact, pictorial textures are not unanimously seen as simulated surfaces (Kebeck, 1987, p. 59). And finally, there are transparent surfaces that do not structure light at all (Gibson, 1979, p. 152)! On the other hand, there is little doubt that regular textures, subjected to a gradient function, are often highly suggestive of even surfaces, seen in what may be considered everyday canonical orientation: Ground, floor, ceiling, or wall, as viewed from average eyeheight. Movement of oneself or of an object relative to such textures is likely to impress as providing information about velocity and trajectory. Something similar may happen with an object's surface texture suggesting shape and rotation. In any case, kinematic optical information is of two types only: Parallax (i.e., relative displacement), or occlusion (deletion and re-accretion of texture at an edge; Gibson et al., 1969), where the latter presupposes the former. Occlusion may affect both environmental and object texture (Gibson, 1979, pp. 80-82). A third type of optical information, ,,optical magnification and minification" (Gibson, 1958), can be considered another special case of parallax (a size, or similarity transformation; Coxeter, 1969, p. 67), although for object motion, as well as ego-motion through a ,,cluttered" environment, it unavoidably goes with progressive occlusion and disocclusion of background texture. It is important to distinguish ,,looming", which is magnification according to a trigonometric function, from ,,zooming", which is linear magnification (Lumsden, 1980). Taking
trigonometric magnification - as it obtains for constant velocities - as the standard, a zooming object appears to accelerate and then decelerate. Physical deceleration and acceleration will "flatten out", or increase, a trigonometric function's positive geometric acceleration. Although time to collision is specified only by the ratio of distance (or apparent size) to instantaneous velocity, approach and recession (= relative distance in a non-deforming world) can always be discriminated (up to a threshold of image resolution). How much and which kind of texture do we and other animals need to exploit this unfailing information? Criticizing his own and his doctoral students' early experiments, in which an expanding dark silhouette had been used, Gibson (1979, pp. 231-232, commenting on Schiff, 1965, and Schiff et al., 1962) suggests that "the magnifying of a nested structure of subordinate forms" was necessary if results were to generalize to approach to a surface instead of approach of an object. The work by Duchon (1998), cited earlier, may be regarded as a first test of this hypothesis. It should also be of interest to look at responses to semi-transparent textures, e.g. branching textures (Stevens, 1974) or wire textures, which for many animal species constitute their common surroundings. Gibson's point, though, may also apply to object motion perception: An object's surface texture, as intimated above, may "cue" the object's shape, spin, etc., and this impression might be used to control avoidance of, or making contact with, the object. In short, visual texture, albeit specific to surface texture and surface layout, does not unequivocally specify the two. At times, however, it can be quite suggestive of object shape, environmental surface layout, ego-motion velocity, and object motion trajectory (especially the object's rotation). It seems worthwhile, then, to look more closely into the ways that texture might affect the timing and distance calibration of motion-related behavior.
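The looming/zooming distinction, and the ratio of distance to velocity just mentioned, can be made concrete with a small formal sketch. The notation (S for physical size, D0 for initial distance, v for approach speed, k for a constant angular rate) is mine, not Lumsden's or Gibson's:

```latex
% Looming: magnification under constant-velocity approach (a trigonometric function of time)
\theta_{\text{loom}}(t) = 2\arctan\!\left(\frac{S}{2\,(D_0 - v\,t)}\right)
\qquad\text{versus}\qquad
% Zooming: linear magnification at a constant angular rate k
\theta_{\text{zoom}}(t) = \theta_0 + k\,t .

% In the small-angle regime, \theta \approx S/D, and the ratio of the visual angle
% to its rate of change recovers the remaining time to collision independently of S:
\tau(t) \;=\; \frac{\theta(t)}{\dot{\theta}(t)} \;\approx\; \frac{D_0 - v\,t}{v} .
```

Under constant physical velocity the looming function accelerates geometrically toward contact, and the ratio of visual angle to its rate of change approximates the remaining time to collision regardless of the object's physical size.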
5. An experimental research program
In traffic engineering, it is common to texture roadways and runways regularly. Many driving schools teach a rule of thumb for calculating safe headway distances, based on the regular, metric spacing of road markings and delineator poles. At least one spectacular implementation of a graded texture has been demonstrated to affect the speeding behavior of car drivers (Denton, 1980). All of this, like the experimental work cited in Section 2 of this paper, does not yet provide an exhaustive, explanatory account of the influence of grand-scale environmental texture on ego-motion or perceived object motion (cf. Watanabe, 1998, for some other missing ingredients of a general theory of motion perception, not dealt with here). Environments, and the optical textures they provide, need not only be sampled but have to be designed according to geometrical combinatorics (Appleyard et al., 1964; Landwehr, 1986;
Thiel, 1981). The same holds for objects and their surface textures (Landwehr, 1988, 1998). A complete, detailed exposition of an experimental research program on "textured motion" is beyond the scope of this contribution. I have space only for a general outline, and I shall confine myself to object motion as viewed by a stationary observer. Variables obviously involved in such a setting include number, shape, size, opacity, color, texture, "meaning", relative distances, trajectories, and velocities of objects, as well as ocularity and eye movements on the part of the observer. A minimum characterization of environments, for the setting defined, still has to consider conditions of illumination (at the extreme: day versus night), atmospheric conditions (especially: fogginess), and the layout of opaque and semi-transparent surfaces along with their textures. In terms of visual angles, this ensemble of variables makes for a quantitatively defined set of (partly interrelated) hypotheses. With regard to dependent measures, the scheme is open to a psychophysical interpretation in terms of thresholds, correspondences, and discriminative responses, or, alternatively, to a cognitive or neurobiological interpretation in terms of neuronal or rational functioning. As an example, let us look at texture as our main concern. An object's surface texture undergoes specific transformations as the object moves relative to an observer (Gibson, 1955). Hence, the object's rigid or non-rigid shape has to be specified first. Visually, the simplest shape, and the one preferentially chosen to begin with, is the sphere, because it invariably projects as a circle (cf. Hilbert & Cohn-Vossen, 1932, § 32). However, the closer it draws, the less of its surface is seen, although the solid angle of its apparent size magnifies (Euclid, Optics, §§ 23-24). It is here that surface texture becomes relevant, because of its progressive occlusion, which is invisible for a zero texture (black, white, or homogeneous color). In terms of regular textures, besides an infinity of tilings obtained by projecting the faces of inscribed polyhedra onto the sphere's surface (Miyazaki, 1987), there are a few interesting solutions related to a fixed diameter, or to two singular, polar points: one is a gradient line (a "spiral" with constant tangent vector) running from one pole to the other, a second one is the well-known tiling by meridians, and a third one is the marking by latitudes. Consider the latter two: viewed binocularly (in canonical orientation), there is parallax in the case of meridians, but not in the case of latitudes (cf. Fig. 1; this is but one geometrical singularity that points to certain interdependencies - in this case, a strict co-implication - between some of the variables listed above; another example, also for spherical objects, is trajectory and apparent acceleration; Kebeck & Landwehr, 1992, p. 152, Footnote 3; cf. Freeman, Harris & Tyler, 1994, on spatial versus temporal speed gradients). If the latitudes texture is tilted by 90° with its polar axis remaining orthogonal to the observer, we have a texture of "vertical latitudes", comparable to the meridian texture with regard to binocular parallax. If either of these textures is rotated towards the observer and viewed monocularly
with the line of gaze centered on the visible polar point, both textures, for fair viewing distances, are hard to discriminate from plane figures painted on a disk. I recently studied some of the variations discussed here. The main finding was an interesting interaction effect between object and environmental texture: Responses (in a truncated-event extrapolation paradigm) were earliest for the vertical latitudes object texture seen against a background of equally spaced horizontal stripes!
Figure 1: A sphere tiled by meridians as seen (a) with the left eye and (b) with the right eye, if oriented normally to the cyclopean line of gaze.
As a final example, and one for which I can also present some preliminary results here, consider the relation of object surface texture and trajectory, again for a sphere with either zero texture or a texture of meridians or vertical latitudes. As was hinted at in the preceding section, self-occlusion of an object, that is, deletion of surface texture at one of its edges and re-accretion at the opposite edge, is highly suggestive of the object's rotation or of its (skewed) trajectory. Three pairs of straight trajectories of equal length were set up in a large room, symmetric to frontal head orientation: one pair of trajectories converging on the observer's station point, another pair converging on a point displaced one trajectory length away from the observer along her cyclopean gaze line, and a third being the mirror image of the first pair, with the objects receding into the distance. The convergence angle was 90° for all pairs of trajectories.
Again, a motion extrapolation paradigm was used. Trajectory length was 2.68 m. Objects were blacked out when they were 0.548 m away from the theoretical collision points. Object diameter was 0.1 m, making for plane angular magnification ranges of 2.15° to 10.48° for the near approach event, 1.16° to 1.85° for the distant approach event, and 2.15° to 1.67° for the recession event. Objects travelled at velocities of 0.378 m/s, 0.548 m/s, or 0.731 m/s, and the related times-to-collision were 1.45 s, 1 s, and 0.75 s. Subjects' responses, after conversion into percent accuracy scores (MacLeod & Ross, 1983; Kebeck & Landwehr, 1992, Appendix), were subjected to a 3-way univariate analysis of variance. There were significant effects of trajectory and texture (F = 25.717, p < .001, and F = 41.681, p < .001, respectively), and a weak, ordinal interaction between the two (F = 2.825, p < .035). Scheffé post-hoc tests identified three subgroups for texture, and two homogeneous groups for trajectory, where the recession pair of trajectories did not differ significantly from the near-approach pair. These findings seem to suggest that object surface texture (at least a binocularly effective one) affords more precise responding to impending contact between objects, or between objects and self. The meridian texture proved more effective than the vertical latitudes texture, suggesting that the perceived shape of the object is also of some concern. Effects were more pronounced for distant collision events, suggesting that texture helped in trajectory identification. In conclusion, then, it appears that texture may also be relevant in the context of visual object motion perception, as it demonstrably is in the context of visual ego-motion perception.
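For readers who want to check the stimulus values, the times-to-collision and the near- and distant-approach magnification ranges follow directly from the parameters given above. The sketch below assumes that the near-approach trajectories end exactly at the cyclopean eye and that each trajectory makes a 45° angle with the gaze line; the recession pair is not modeled, and small discrepancies (e.g., 10.43° rather than 10.48°) presumably reflect rounding or the exact eye position.

```python
import math

def visual_angle_deg(diameter, distance):
    """Plane visual angle (degrees) subtended by an object of the given
    physical diameter at the given viewing distance."""
    return math.degrees(2 * math.atan(diameter / (2 * distance)))

D_OBJ   = 0.10    # object diameter (m)
L_TRAJ  = 2.68    # trajectory length (m)
D_BLACK = 0.548   # distance from the theoretical collision point at blackout (m)

# Times-to-collision at blackout: remaining distance / velocity
for v in (0.378, 0.548, 0.731):
    print(f"v = {v} m/s  ->  TTC = {D_BLACK / v:.2f} s")
# -> 1.45 s, 1.00 s, 0.75 s, matching the values reported in the text.

# Near-approach event: trajectories converge on the observer's station point,
# so viewing distance shrinks from L_TRAJ to D_BLACK.
print(visual_angle_deg(D_OBJ, L_TRAJ), visual_angle_deg(D_OBJ, D_BLACK))
# -> roughly 2.14 to 10.43 deg (text: 2.15 to 10.48 deg).

# Distant-approach event: the collision point lies one trajectory length
# (2.68 m) ahead of the observer; each trajectory makes 45 deg with the
# cyclopean gaze line (90 deg convergence angle for the pair).
def dist_from_observer(d_to_collision):
    x = L_TRAJ + d_to_collision * math.cos(math.radians(45))  # along the gaze line
    y = d_to_collision * math.sin(math.radians(45))           # lateral offset
    return math.hypot(x, y)

print(visual_angle_deg(D_OBJ, dist_from_observer(L_TRAJ)),
      visual_angle_deg(D_OBJ, dist_from_observer(D_BLACK)))
# -> roughly 1.16 to 1.85 deg, matching the reported distant-approach range.
```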
Acknowledgement Part of the empirical work described in this paper was done while I held a fellowship of the Science and Technology Agency of Japan. I am most grateful to my host, the International Association of Traffic and Safety Sciences, Tokyo, Japan, to the Japanese Ministry of Education, and to Kaoru Noguchi, for unfailing support.
REFERENCES
Alberti, L. B. (1435). De pictura [On painting]. Florence.
Appleyard, D., Lynch, K. & Myer, J. R. (1964). The view from the road. Cambridge, MA: MIT Press.
Barnsley, M. (1988). Fractals everywhere. Boston: Academic Press.
Beck, J. (1966). Effect of orientation and of shape similarity on perceptual grouping. Perception & Psychophysics, 1, 300-302.
Beverley, K. I. & Regan, D. (1980). Visual sensitivity to the shape and size of a moving object: implications for models of object perception. Perception, 9, 151-160.
Bootsma, R. J. & Oudejans, R. R. D. (1993). Visual information about time-to-collision between two objects. Journal of Experimental Psychology: Human Perception and Performance, 19, 1041-1052.
Burton, H. E. (Ed.) (1945). The optics of Euclid. Journal of the Optical Society of America, 35, 357-372.
Coxeter, H. S. M. (1969). Introduction to geometry - Second edition. New York: Wiley.
Coxeter, H. S. M. (1986). Coloured symmetry. In H. S. M. Coxeter, M. Emmer, R. Penrose & M. L. Teuber (Eds.), M. C. Escher: Art and science (pp. 15-33). Amsterdam: North-Holland.
Denton, G. G. (1980). The influence of visual pattern on perceived speed. Perception, 9, 393-402.
Duchon, A. P. (1998). Visual strategies for the control of locomotion. Unpublished doctoral dissertation, Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI.
Edgerton, S. Y., Jr. (1975). The Renaissance rediscovery of linear perspective. New York: Basic Books.
Euclid (ca. 300 BC). Optics. See Burton (1945).
Freeman, T. C. A., Harris, M. G. & Tyler, P. A. (1994). Human sensitivity to temporal proximity: The role of spatial and temporal speed gradients. Perception & Psychophysics, 55, 689-699.
Gibson, J. J. (1950a). The perception of the visual world. Boston: Houghton-Mifflin.
Gibson, J. J. (1950b). The perception of visual surfaces. American Journal of Psychology, 63, 367-384.
Gibson, J. J. (1955). Optical motions and transformations as stimuli for visual perception. State College, PA: Psychological Cinema Register.
Gibson, J. J. (1958). Visually controlled locomotion and visual orientation in animals. British Journal of Psychology, 49, 182-194.
Gibson, J. J. (1961). Ecological optics. Vision Research, 1, 253-262.
Gibson, J. J. (1966). The senses considered as perceptual systems. Boston: Houghton-Mifflin.
Gibson, J. J. (1967). New reasons for realism. Synthese, 17, 162-172.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton-Mifflin.
Gibson, J. J. & Bridgman, B. (1987). The visual perception of surface texture in photographs. Psychological Research, 49, 1-5.
Gibson, J. J., Kaplan, G. A., Reynolds, H. N. & Wheeler, K. (1969). The change from visible to invisible: A study of optical transitions. Perception & Psychophysics, 5, 113-116.
Gibson, J. J., Purdy, J. & Lawrence, L. (1955). A method of controlling stimulation for the study of space perception: The optical tunnel. Journal of Experimental Psychology, 50, 1-14.
Gray, R. & Regan, D. (1999). Do monocular time-to-collision estimates necessarily involve perceived distance? Perception, 28, 1257-1264.
Grünbaum, B. (1986). Mathematical challenges in Escher's geometry. In H. S. M. Coxeter, M. Emmer, R. Penrose & M. L. Teuber (Eds.), M. C. Escher: Art and science (pp. 53-67). Amsterdam: North-Holland.
Grünbaum, B. & Shephard, G. C. (1987). Tilings and patterns. New York: Freeman.
Hagen, M. A. (Ed.) (1980). The perception of pictures [2 Vols.]. New York: Academic Press.
Harvey, L. O., Jr. & Michon, J. A. (1974). Detectability of relative motion as a function of exposure duration, angular separation, and background. Journal of Experimental Psychology, 103, 317-325.
Hecht, E. (1998). Optics. 3rd ed. Bonn: Addison-Wesley.
Hilbert, D. & Cohn-Vossen, S. (1932). Anschauliche Geometrie [Geometry and the imagination]. Berlin: Springer.
Johansson, G. (1970). On theories for visual space perception: A letter to Gibson. Scandinavian Journal of Psychology, 11, 67-74.
Julesz, B. (1965). Texture and visual perception. Scientific American, 212, 38-48.
Kebeck, G. (1987). Der relative Einfluss von Größe- und Dichtegradient auf die Wahrnehmung räumlicher Tiefe in Bildern [The relative influence of size and density gradients on perception of depth in pictures]. Gestalt Theory, 9, 57-69.
Kebeck, G. & Landwehr, K. (1992). Optical magnification as event information. Psychological Research - Psychologische Forschung, 54, 146-159.
Kubovy, M. (1986). The psychology of perspective and Renaissance art. Cambridge: Cambridge University Press.
Land, M. & Horwood, J. (1995). Which parts of the road guide steering? Nature, 377, 339-340.
Landwehr, K. (1986). Die ökologische Auffüllung der Welt I - Ein Vergleich der Prinzipien der Analyse optischer Stimulus-Information in der Gestalttheorie und in der ökologischen Optik [The ecological assembling of the world I - A comparison of the principles of analysis of optical stimulus information in Gestalt theory and in ecological optics]. Gestalt Theory, 8, 186-203.
Landwehr, K. (1988). Die ökologische Auffüllung der Welt II - Homogenitäts-Inhomogenitäts-Übergänge im Ganzfeld [The ecological assembling of the world II - Transitions from homogeneity to inhomogeneity in the Ganzfeld]. Gestalt Theory, 10, 21-34.
Landwehr, K. (1998). Die visuelle Wahrnehmung der Welt - Statische Betrachtungsbedingungen [The visual perception of the world - Static conditions of observation]. Egelsbach: Hänsel-Hohenhausen.
Landwehr, K. (submitted). Effects of texture on perceived time-to-collision.
Larish, J. F. & Flach, J. M. (1990). Sources of optical information useful for perception of speed of rectilinear self-motion. Journal of Experimental Psychology: Human Perception and Performance, 16, 295-302.
Laves, F. (1931). Ebenenteilung in Wirkungsbereiche [Dissectioning the plane into generating regions]. Zeitschrift für Kristallographie, 76, 277-284.
Lee, D. N. (1974). Visual information during locomotion. In R. McLeod & H. Pick (Eds.), Perception: Essays in honor of J. J. Gibson (pp. 250-267). Ithaca, NY: Cornell University Press.
Lee, D. N. & Young, D. S. (1985). Visual timing of interceptive action. In D. J. Ingle, M. Jeannerod & D. N. Lee (Eds.), Brain mechanisms and spatial vision (pp. 1-30). Dordrecht: Nijhoff.
Li, F.-X. & Laurent, M. (1995). Occlusion rate of ball texture as a source of velocity information. Perceptual and Motor Skills, 81, 871-880.
Lumsden, E. A. (1980). Problems of magnification and minification: An explanation of the distortions of distance, slant, shape, and velocity. In M. A. Hagen (Ed.), The perception of pictures (Vol. 1, pp. 91-135). New York: Academic Press.
MacLeod, R. W. & Ross, H. E. (1983). Optic-flow and cognitive factors in time-to-collision estimates. Perception, 12, 417-423.
Metzger, W. (1930). Optische Untersuchungen am Ganzfeld. II. Mitteilung: Zur Phänomenologie des homogenen Ganzfelds [Optical investigations of the Ganzfeld. 2nd communication: On the phenomenology of the homogeneous Ganzfeld]. Psychologische Forschung, 13, 6-29.
Metzger, W. (1966). Figural-Wahrnehmung [Form perception]. In W. Metzger & H. Erke (Eds.), Handbuch der Psychologie, Bd. I/1a: Wahrnehmung und Bewusstsein (pp. 693-744). Göttingen: Hogrefe.
Miyazaki, K. (1987). Polyeder und Kosmos. Spuren einer mehrdimensionalen Welt [Polyhedra and cosmos. Traces of a multidimensional world]. Braunschweig: Vieweg. (Original work published in 1983).
Neilson, W. A. (Ed.) (1960). Webster's new international dictionary of the English language - Third edition. Springfield, MA: Merriam. (Original work published in 1909).
Oudejans, R. R. D., Michaels, C. F., Bakker, F. C. & Davids, K. (1999). Shedding some light on catching in the dark: Perceptual mechanisms for catching fly balls. Journal of Experimental Psychology: Human Perception and Performance, 25, 531-542.
Owen, D. H. (1990). Perception & control of changes in self-motion: A functional approach to the study of information and skill. In R. Warren & A. H. Wertheim (Eds.), Perception & control of self-motion (pp. 289-326). Hillsdale, NJ: Erlbaum.
Pugh, E. N. (1988). Vision: Physics and retinal physiology. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey & R. D. Luce (Eds.), Stevens' handbook of experimental psychology - Second edition [2 Vols.] (Vol. 1, pp. 75-163). New York: Wiley.
Regan, D. & Beverley, K. I. (1979). Binocular and monocular stimuli for motion in depth: Changing-disparity and changing-size feed the same motion-in-depth stage. Vision Research, 19, 1331-1342.
Royden, C. S. & Hildreth, E. C. (1996). Human heading judgments in the presence of moving objects. Perception & Psychophysics, 58, 836-856.
Schattschneider, D. (1978). The plane symmetry groups. Their recognition and notation. American Mathematical Monthly, 85, 439-450.
Schiff, W. (1965). Perception of impending collision: A study of visually directed avoidant behavior. Psychological Monographs: General and Applied, 79, No. 11.
Schiff, W., Caviness, J. A. & Gibson, J. J. (1962). Persistent fear responses in rhesus monkeys to the optical stimulus of "looming". Science, 136, 982-983.
Schiff, W. & Detwiler, M. L. (1979). Information used in judging impending collision. Perception, 8, 647-658.
Schiff, W. & Oldak, R. (1990). Accuracy of judging time to arrival: Effects of modality, trajectory, and gender. Journal of Experimental Psychology: Human Perception and Performance, 16, 303-316.
Schoenflies, A. (1891). Krystallsysteme und Krystallstruktur [The structure and systems of crystals]. Leipzig: Teubner.
Sedgwick, H. (1992). Cross-talk between picture surface and pictorial space. Centennial Convention of the American Psychological Association, Washington, DC.
Stevens, P. S. (1974). Patterns in nature. Boston: Little & Brown.
Thiel, P. (1981). Visual awareness and design. Seattle, WA: University of Washington Press.
Todd, J. T. & Mingolla, E. (1983). Perception of surface curvature and direction of illumination from patterns of shading. Journal of Experimental Psychology: Human Perception and Performance, 9, 583-595.
Vincent, A. & Regan, D. (1997). Judging the time to collision with a simulated textured object: Effect of mismatching rate of expansion of object size and of texture element size. Perception & Psychophysics, 59, 32-36.
Warren, W. H., Jr. (1995). Self-motion: Visual perception and visual control. In W. Epstein & S. Rogers (Eds.), Perception of space and motion (pp. 263-325). San Diego, CA: Academic Press.
Warren, W. H., Jr. (1998). Visually controlled locomotion: 40 years later. Ecological Psychology, 10, 177-219.
Warren, W. H. & Saunders, J. A. (1995). Perception of heading in the presence of moving objects. Perception, 24, 315-331.
Watanabe, T. (Ed.) (1998). High-level motion processing: Computational, neurobiological, and psychological perspectives. Cambridge, MA: MIT Press.
Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt. II. [Investigations in the context of Gestalt theory. II.]. Psychologische Forschung, 4, 301-350.
Wolpert, L. (1990). Field-of-view information for self-motion perception. In R. Warren & A. H. Wertheim (Eds.), Perception & control of self-motion (pp. 101-126). Hillsdale, NJ: Erlbaum.
Zucker, S. W. (1976). On the structure of texture. Perception, 5, 419-436.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 11 Multiple Sources of Information Influence Time-to-Contact Judgments: Do Heuristics Accommodate Limits in Sensory and Cognitive Processes? Patricia R. DeLucia Texas Tech University, Lubbock, TX, USA
ABSTRACT There is an increasing number of studies which indicates that judgments about time-to-contact (TTC) are influenced by multiple sources of information and are constrained by threshold factors and cognitive operations. In this chapter, I provide an overview of these studies and attempt to provide a framework for studying TTC judgments based on five hypotheses: (a) TTC judgments are based on multiple sources of information, including heuristics (e.g., pictorial depth cues) and invariants (e.g., tau); (b) The sources of information that influence TTC judgments are determined, in part, by limits in sensory processes and (c) by limits in cognitive processes; (d) The sources of information that influence TTC judgments vary throughout a task or event; (e) Apparent spatial extent and mental structure can influence TTC judgments. Support for these hypotheses is presented. It is concluded that the effectiveness of tau is constrained by limits in sensory and cognitive processes and that it is adaptive for the visual system to rely on other sources of information including heuristics, which serve to accommodate such limits and to provide flexibility in performance. Such limits also play an important role in determining which sources of information are effective throughout a task or event. Future research should consider the role of limitations in sensory and cognitive processes in models of TTC perception, determine the effective sources of information for TTC judgments throughout a task or event, and measure the relative strengths and combinatorial rules of these sources.
There is an increasing number of studies which indicates that judgments about TTC and related tasks are influenced by multiple sources of information and are constrained by threshold factors and cognitive operations. In this chapter, I provide an overview of these studies, focusing on research that my collaborators and I have conducted since 1989. In doing so, I hope to provide a framework for studying judgments of time to contact. This framework is based on five hypotheses around which this chapter is organized. These hypotheses are introduced below, but first a note on terminology is in order. In this chapter, I discuss judgments about collision and non-collision events. These include judgments of time to contact (or time to collision, time to arrival, arrival time, time to passage) and judgments about whether a collision would occur (or potential collision, collision detection) rather than when a collision would occur. Here "TTC" refers generally to the time to completion of any such event rather than specifically to time to contact. The phrase "TTC judgments" includes these related tasks. Also, the generic term optical TTC information refers to visual information that veridically specifies any TTC event, such as local τ or global τ. Unless noted, the studies described in this chapter involved computer simulations of such events. In the five main sections of this chapter, I present an overview of research that supports each of the following hypotheses. In the final section, I conclude that heuristics play an important role in TTC judgments and consider reasons that they would do so. I begin with a list of the hypotheses and a brief introduction to how each is supported. This serves as a preview of what will be detailed in the main sections of the chapter.
Hypothesis 1. TTC judgments are based on multiple sources of information, including heuristics and invariants. Heuristics are implicated by findings that TTC judgments are influenced by pictorial depth information, including relative size (the size-arrival effect), height in field, and occlusion. This makes it necessary to determine how effects combine when multiple sources of information are available. Other relevant data include measures of TTC judgments when optical TTC information or one of its components is nullified (by manipulating optic flow via self motion, or occluders).
Hypothesis 2. The sources of information that influence TTC judgments are determined, in part, by limits in sensory processes. Limits in sensory processes are implicated by findings that (a) TTC judgments are influenced by lower-order motion which does not specify TTC unequivocally, such as optical magnification, expansion rate, and image velocity; (b) TTC judgments are not affected by irregularities in optical TTC information due to computer aliasing. This suggests that TTC information is
not extracted on a frame-by-frame basis, possibly due to limits in the temporal resolution of the visual system; (c) TTC judgments are affected by retinal eccentricity and attention instructions, which suggests that useful information for such judgments may not be accessible to the observer within a single glance or attentional act.
Hypothesis 3. The sources of information that influence TTC judgments are determined, in part, by limits in cognitive processes. Limits in cognitive processes are implicated by findings that TTC judgments are affected by the number of objects in the display (set size) in a manner consistent with limited-capacity processing and limits in memory. Other studies of TTC judgments implicate cognitive motion extrapolation and visual imagery.
Hypothesis 4. The sources of information that influence TTC judgments vary throughout a task or event. Studies indicate that it is difficult to identify a critical value of a single source of information that accounts for TTC judgments. For example, analyses of a collision-avoidance task and relative TTC judgments indicated that optical size, change in optical size, rate of expansion, or TTC, did not consistently account for performance. However, such analyses assume that the information sources that govern performance remain constant throughout a task. This seems unlikely because the quality of some sources of depth information varies with distance. Alternatively, information sources that influence TTC judgments vary as the distances between the observer and objects in the environment change. This is adaptive and provides flexibility in performance.
Hypothesis 5. Apparent spatial extent and mental structure can influence TTC judgments. Apparent spatial extent is implicated by findings that TTC judgments are affected by visual illusions such as the Sander parallelogram illusion. Mental structure is implicated by effects of perceptual set or intention on TTC judgments, particularly a coupling between apparent TTC and apparent depth.
1. TTC judgments are based on multiple sources of information, including heuristics and invariants
Invariants refer to higher-order properties of the optic array that specify properties of a three-dimensional environment (e.g., Gibson, 1979). The information provided by invariants is veridical and reliable 100% of the time (Cutting & Wang, 2000). In contrast, information provided by heuristics is not necessarily veridical and its reliability varies. Heuristics increase efficiency by restricting search through the possible solutions, but do not guarantee the correct solution and can result in error (Braunstein, 1976). Local and global τ are invariants. Pictorial depth cues (e.g., relative size) and lower-order motion (e.g., rate of expansion) are heuristics. I do not adopt specific formalisms or assumptions of perceptual heuristic models described previously (Gilden & Proffitt, 1989; Hecht, 1996; Runeson, Juslin, & Olsson, 2000), but rather the more general description of heuristic processes provided by Braunstein (1976). The studies reviewed in this section support the hypothesis that TTC judgments are based on multiple sources of information, including heuristics and invariants. There are numerous citations of empirical studies supporting the use of invariants such as τ in TTC judgments (Wann, 1996), although the empirical support for and the generality of τ have been questioned (e.g., Tresilian, 1995; Wann, 1996). For brevity, I will not review studies which suggest that invariants influence TTC judgments. Rather, I will focus on studies which suggest that heuristics influence such judgments. In particular, I will focus on the effects of pictorial depth information on TTC judgments. I also will consider what happens to such judgments when optical TTC information or one of its components is "nullified." The results of an increasing number of studies indicate that TTC judgments are influenced by pictorial depth information. Pictorial depth information refers to features that characterize a two-dimensional projection of a three-dimensional scene such as relative size and occlusion. As in my earlier work (DeLucia, 1991a), I distinguish between pictorial depth cues and higher-order motion-based information (e.g., τ) without assumptions of unconscious inferences or mental computations associated with classical theories of depth perception (Hochberg, 1987).
1.1 Pictorial depth cues: Fundamental issues
The pictorial depth cues have been considered relevant only to static or impoverished viewing conditions, and not relevant to moving observers or perception based on motion information (e.g., Gibson, 1979). However, it has been shown that depth cues can affect the perception of distances and motions, and that misperceptions can occur, even when motion information is available.
An example is Ames' rotating "window" illusion (Ames, 1951), represented in Figure 1. Observers misperceived the trapezoid's direction of motion in a manner consistent with the depth cues of linear perspective and relative size, which contradicted motion perspective information (Hochberg, 1978). The illusion occurred when motion information was above threshold (Hochberg, 1987), and when observers moved while the window remained stationary (Hochberg & Beer, 1991). The implication is that motion does not necessarily result in usable depth information (Hochberg, 1986).
[Figure 1 consists of two panels, a frontal view and a top view, contrasting the real rotation of the trapezoidal window with its apparent rotation.]
Figure 1: Schematic representation of Ames' rotating window illusion.
Such findings are relevant to models of TTC perception. Tau is a motion-based optical invariant that specifies TTC veridically. Generally, τ is defined by the ratio of an angular extent (e.g., an object's optical size) to the rate of change of that angular extent (e.g., rate of optical expansion). It is independent of an object's size, speed, or distance from the observer (e.g., Hochberg, 1986). Furthermore, τ specifies TTC directly; it is not necessary to perceive speed or distance to perceive TTC (Lee & Young, 1985). Therefore, τ renders the pictorial depth cues unnecessary in TTC judgments. Nevertheless, it is necessary to demonstrate that such judgments indeed are based on τ and are not influenced by pictorial depth cues. If such influences occur, it becomes important to consider depth cues in models of TTC perception. At the least, limits in sensory and cognitive processes may impose boundaries on the effectiveness of τ. For example, τ would not be effective if an
object's optical expansion is below detection threshold (DeLucia, 1989; Gray & Regan, 1998; Lee, 1976; Todd, 1981). In such cases, other information such as pictorial depth cues may be used. It is argued here that such limits deserve greater consideration in models of TTC perception. For example, it is important to determine whether observers rely on pictorial depth information and other heuristics when higher-order motion-based information is below threshold due to limits in sensory and cognitive processes; and also when such information is above threshold, as implicated by the Ames window illusion (DeLucia, 1989, 1991a). Our research indicates that pictorial depth information can influence TTC judgments even when optical TTC information such as τ is above threshold (e.g., DeLucia, 1991a). The implications are that both pictorial and motion-based depth information must be considered in models of TTC perception and that their limits, relative strengths and combinatorial rules must be specified (DeLucia, 1989, 1991a; DeLucia & Warren, 1994).
1.2 Pictorial relative size: The size-arrival effect
Primary Phenomenon. The effect of pictorial relative size on TTC judgments was described by DeLucia (1989, 1991a). In a representative experiment, a large object and a small object approached the observation plane for about 333 ms. As represented in Figure 2, the objects were suspended above a road and moved at identical speeds. Observers completed a relative TTC judgment. They reported which object they thought would "hit" them (or a location next to them) first, had the objects continued traveling beyond the computer monitor. In virtual space, or the three-dimensional space represented by the simulation, the difference in the objects' sizes was designed so that the smaller object projected the smaller image throughout the approach even though it was closer to the viewpoint. The objects' TTCs with the observation plane, and the difference between their TTCs, were within the range considered usable or above threshold in earlier studies (Schiff & Detwiler, 1979; Simpson, 1988). In short, optical TTC information specified that the smaller object would arrive first, but the pictorial depth cue of relative size indicated that the larger object was closer (assuming equal virtual sizes). Would observers report that the smaller object would arrive first, consistent with τ? Or would they report that the larger object would arrive first, consistent with relative size? Results indicated the latter. Observers typically reported that the larger object would hit them first. This effect of relative size on judgments of arrival time was termed the size-arrival effect (Caird & Hancock, 1994; DeLucia, 1999; DeLucia & Warren, 1994; van der Kamp, Savelsbergh, & Smeets, 1997). We continue to use this term, abbreviated here as SAE, but note that effects of relative size also were demonstrated in other judgments, such as judgments about potential collision or collision detection.
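The geometry of this conflict is easy to reproduce. The sketch below is not DeLucia's display code; the sizes, distances, and speed are illustrative values chosen only so that, as in the experiments described above, the smaller object stays closer, projects the smaller image throughout a 333-ms approach, and nevertheless has the shorter time to contact.

```python
import math

def theta(size, dist):
    """Projected (optical) angular size in degrees."""
    return math.degrees(2 * math.atan(size / (2 * dist)))

V = 3.0          # common approach speed (m/s); illustrative value
T_VIEW = 0.333   # viewing duration (s), as in the text

# (physical size in m, initial distance in m): values chosen for illustration only
small = (0.3, 4.0)   # smaller object, closer to the viewpoint
large = (1.2, 6.0)   # larger object, farther away

for name, (size, d0) in (("small", small), ("large", large)):
    d1 = d0 - V * T_VIEW                 # distance when the display ends
    ttc = d1 / V                         # true time to contact at that moment
    # tau: optical size divided by its rate of change (finite-difference estimate)
    dt = 1e-3
    tau = theta(size, d1) / ((theta(size, d1 - V * dt) - theta(size, d1)) / dt)
    print(f"{name}: image {theta(size, d0):.2f} -> {theta(size, d1):.2f} deg, "
          f"TTC {ttc:.2f} s, tau {tau:.2f} s")

# The small object projects the smaller image throughout (about 4.3 -> 5.7 deg
# versus about 11.4 -> 13.7 deg) yet has the shorter TTC (1.00 s vs. 1.67 s);
# tau tracks the true TTCs, whereas relative image size favours the large
# object -- the size-arrival conflict.
```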
The SAE generalizes to displays with longer durations, high resolution, texture, and self motion. We demonstrated the SAE with scene durations of 3 s (DeLucia, 1991b), with high-resolution photographic animations of real approaching objects (DeLucia, 1989, 1991a), and with textured objects (DeLucia, 1991b; DeLucia, Kaiser, Bush, Meyer, & Sweet, 2000, 2003; DeLucia, Kaiser, Garcia, & Sweet, 2001; DeLucia, Meyer, & Bush, 2002; see also Smith, Flach, Dittman, & Stanard, 2001). We also obtained the SAE with simulations of self motion and object motion during self motion, although in some cases the SAE was reduced significantly (DeLucia, 1991b; DeLucia et al., 2002). Finally, the SAE has been demonstrated in other laboratories. An effect of ball size occurred in a collision control task in which observers released a pendulum to hit an approaching ball (Smith et al., 2001), and in TTC judgments of small and large approaching vehicles with high fidelity computer simulations of traffic scenes (Caird & Hancock, 1994). The SAE is a robust phenomenon and is not an artifact of impoverished viewing conditions, as has been suggested (e.g., Cavallo, Mestre, & Berthelon, 1997). The reliability and magnitude of the SAE can be substantial. For example, in a relative TTC judgment task, 20 of 20 observers reported the SAE (DeLucia, 1991a). In an absolute TTC judgment task, the mean TTC estimate was 142 ms longer for a small object than a larger object even though the small object's actual TTC was 237 ms shorter (DeLucia, 1991a). Similarly, in an active collision-avoidance task, observers waited nearly 1 s longer to "jump" over a small object compared to a large object that had the same TTC (DeLucia & Warren, 1994). Various factors constrain the SAE. The SAE was reduced in several conditions: (a) when the ratio of the small object's size to the large object's size was .60 or greater (DeLucia, 1989, 1991a); (b) when the virtual scene was translated laterally during the approach and provided motion perspective information, although the SAE persisted and even increased in some conditions (DeLucia, 1989, 1991a, 1991b); (c) when both objects started and finished closer to the virtual eye and resulted in faster rates of optical expansion; hereafter, I refer to this as the fast expansion scene (DeLucia, 1989, 1991a, 1991b), which is represented in Figure 2; (d) when the altitude of the virtual eye was increased or decreased or when the objects were located on the ground plane (DeLucia, 1991b); such conditions increased the magnitude of the projected vertical motion components (Schiff, 1988); (e) when ground-intercept information was provided by adding markers on the road directly below each object and when observers were instructed to attend to this information, even when such markers did not expand optically (DeLucia, 1989, 1991a, 1991b). Specifying the boundaries of the SAE will help elucidate the role of pictorial and motion-based depth information in TTC judgments. The SAE does not depend on observers' judgments about the objects' virtual (simulated) sizes and speeds. The depth information provided by relative
size is predicated on the assumptions that the objects are constant and equal in size. The displays that demonstrated the SAE violated this assumption because the virtual objects were unequal in size. Similarly, although the objects moved at the same speeds, the relative rate of expansion was greater for the smaller object because it was closer and possibly appeared to move faster. With the displays described earlier, we examined whether the SAE was related to judgments about whether the virtual objects were the same size, moved at the same speed, and whether the speed of each object was constant (DeLucia, 1991b).
Figure 2: Schematic representation of scenes. A. Frontal views of primary scene which resulted in SAE (top); fast expansion scene (middle); ground-intercept information (bottom). B. Top view of primary 3D scene.
Results indicated that the SAE did not depend on judgments about the objects' virtual sizes or speeds. In some cases, observers reported that the objects were unequal in size or moved at the same speeds, but the SAE occurred nevertheless. In other cases, observers reported that the objects were the same size or moved at different speeds, but the SAE typically did not occur (e.g., fast expansion scene). Furthermore, when observers reported that the smaller object accelerated, the SAE occurred in some scenes but not in others. Observers typically reported that the larger object moved at a constant speed. Finally, we examined whether the SAE occurred because the larger object's bottom edge was lower in the projection plane than the smaller object's bottom edge (DeLucia, 1991b). Observers may have perceived the larger object as resting on the ground plane (see DeLucia & Warren, 1994; Sedgwick, 1983). However, the number of observers who reported that the larger object touched the ground was not significant. The lower optical position of the larger object's bottom edge also was ruled out as the basis of the SAE in subsequent studies of collision-avoidance tasks (DeLucia & Warren, 1994). In summary, the SAE does not depend on judgments about the sizes or speeds of the virtual objects, or on judgments about whether the objects touched the ground. Most important, the SAE does not seem to be based on observers' explicit or verbally-reported assumptions of equal-sized objects. The implication is that there may be a stimulus-bound correlation between retinal size and perceived distance or that pictorial relative size may provide a direct stimulus for the organization of perceived depth and does not rely on assumptions based on past experience (e.g., Hochberg & Hochberg, 1952).
The SAE occurs in active collision-avoidance tasks. In the studies discussed thus far, observers did not control any aspects of the displays. Some studies suggest that performance in active tasks is better than performance in passive judgments (e.g., Gibson, 1962), but equivocal evidence has been reported (Flach, Allen, Brickman, & Hutton, 1992). Therefore, we evaluated whether the SAE can occur with active collision-avoidance tasks (DeLucia & Warren, 1994). In a representative experiment, displays simulated self motion toward an object that was suspended above a textured ground plane. Observers controlled their altitude with a control stick while approaching the object. They were instructed to get as close as possible to the object and then "jump" over it to avoid collision. The time of the jump, or launch time, was measured. Results indicated that observers jumped significantly later for smaller objects, compared with larger objects that were approached from equal distances at equal speeds and were positioned at equal clearance heights. Moreover, the SAE occurred when the object's width and length were varied independently. Our results are difficult to reconcile with τ-based models of TTC perception, which do not predict such effects. In any case, the SAE is not limited to passive judgments.
Potential effects of binocular disparity and familiar size on SAE thresholds. Gray and Regan (1998; see also Gray & Regan, this volume) reported that observers can make accurate TTC judgments when a binocular correlate of TTC information is presented without monocular information. Their results also suggested that the relative effectiveness of monocular and binocular TTC information depends on object size (see also Rushton & Wann, 1999). Furthermore, effects of size on a ball-catching task occurred in monocular, but not in binocular, viewing conditions (van der Kamp et al., 1997). Moreover, performance was better with binocular viewing compared with monocular viewing when observers completed a prediction-motion (PM) task; that is, they judged the absolute TTC of a moving object that disappeared (Cavallo & Laurent, 1988). However, the effectiveness of binocular disparity decreases as viewing distance increases (e.g., Cutting & Vishton, 1995). At the least, other information sources may be used when disparity is ineffective.
We compared SAE thresholds for viewing conditions with disparity to those without disparity. We used scenes in which two objects approached the observer and which were previously shown to result in the SAE. The displays were viewed with Stereographics' CrystalEyes eyewear and provided crossed disparity information. Observers reported which object would reach them first. The smaller object's virtual speed was varied with a staircase procedure. The staircase converged on the 29% and 71% thresholds (i.e., percentage of trials in which the observer selected the smaller object as arriving first), and we used the latter to define the threshold for the elimination of the SAE. When scenes contained line-drawn objects, the effect of disparity was not significant and accounted for less than 1% of the variance. When scenes contained familiar size information, represented in Figure 3, disparity was still not significant but accounted for over 10% of the variance. Also, the mean threshold was smaller when scenes represented familiar spherical objects compared with textured squares. In the former, the approaching objects represented a beach ball and a marble that were shown to observers in the laboratory. In a second experiment, we used the staircase procedure to vary disparity. Results suggested that the mean disparity threshold was substantially greater than that which characterizes typical stereoacuity. Also, the mean threshold was smaller when scenes represented familiar objects compared with line-drawn squares. Although additional experimentation is needed before conclusions can be reached, our initial results suggest that the presence of disparity does not affect thresholds for the SAE, but familiar size does, and that disparity may become more effective when other depth cues such as familiar size are present.
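The chapter does not report the exact staircase rule, so the sketch below (Python, with a simulated observer and hypothetical parameters) shows one standard way to converge on a 71% point: a 1-up/2-down transformed staircase, which tracks approximately the 70.7% level (Levitt, 1971); a complementary 2-up/1-down track converges near 29%.

    import math, random

    def reports_smaller_first(speed, pse=10.0, slope=1.5):
        # Simulated observer: the probability of reporting "the smaller object
        # arrives first" rises with the smaller object's virtual speed
        # (logistic psychometric function; pse and slope are hypothetical).
        p = 1.0 / (1.0 + math.exp(-(speed - pse) / slope))
        return random.random() < p

    def staircase_71(speed=14.0, step=0.5, target_reversals=10):
        # 1-up/2-down rule: after two consecutive "smaller first" reports the
        # smaller object's speed is reduced (weakening its optical TTC advantage);
        # after any other report it is increased.
        in_a_row, last_dir, reversals = 0, None, []
        while len(reversals) < target_reversals:
            if reports_smaller_first(speed):
                in_a_row += 1
                if in_a_row < 2:
                    continue
                in_a_row, direction = 0, -1
            else:
                in_a_row, direction = 0, +1
            speed += direction * step
            if last_dir is not None and direction != last_dir:
                reversals.append(speed)
            last_dir = direction
        return sum(reversals[2:]) / len(reversals[2:])  # mean of later reversals

    print(round(staircase_71(), 2))

In the experiments described above the tracked quantity was the smaller object's virtual speed, and the 71% point defined the threshold for elimination of the SAE; the psychometric function here merely stands in for a real observer.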
Figure 3: Schematic representation of scenes with line-drawn objects (top), textured squares (middle) and familiar objects (bottom). Latter scenes were in color.
Finally, we ruled out an account of the SAE based on a conflict between monocular and binocular TTC information that may have occurred when observers viewed "monocular" displays with two eyes (Tresilian, 1994, 1995). The SAE occurred in our collision-avoidance task when observers viewed our displays with one eye or with two eyes (DeLucia, 1999). Optical size, change in optical size, or rate of expansion? The experiments that we conducted on the SAE were not designed to tease apart the effects of pictorial size (optical size) from the effects of change in optical size or its rate of change. All three parameters were relatively smaller for the smaller objects. We favor explanations of our results in terms of relative size because it is more closely tied to apparent distance. Changes in optical size or rate of expansion are more closely tied to apparent velocity (e.g., DeLucia & Warren, 1994). Furthermore, several studies suggest that rate of expansion alone cannot account for the SAE. For example, we analyzed several optical parameters of
the displays in our collision-avoidance study (DeLucia & Warren, 1994). If observers delayed their jump until an optical parameter reached a critical value, this value would be the same for the small and large objects; the ratio would approach 1.0 at launch time. Although τ and its first derivative were the same for small and large objects, observers jumped sooner for the larger object. This suggests that such information was not the dominant basis for launch time. Based on the mean launch times and launch altitudes, we estimated the magnitude of each object's visual angle or optical size, absolute change in optical size, and rate of change in optical size. For each, we calculated the ratio of large-to-small objects at the beginning of the trial and again at launch time. Analyses indicated that the ratios for optical size, change in optical size, and rate of expansion were closer to 1.0 at launch time compared with the start of the trial. In contrast, the ratio for TTC information began at 1.0 and diverged from 1.0 at launch time (see DeLucia & Warren, 1994, Figure 6). The implication is that optical size and expansion rate may have affected launch time more than did optical TTC information. However, the ratios for optical size and expansion rate still did not reach 1.0 at launch time. Generally, results suggest that rate of expansion alone did not determine launch time. We reached similar conclusions regarding TTC estimates in a PM task (DeLucia et al., 2001). Analyses of relative TTC judgments were consistent with those of our collision-avoidance task. In a pilot study conducted at Wright-Patterson Air Force Base, observers viewed approach scenes that resulted in the SAE and reported which object would hit them first. We used a staircase procedure to vary several parameters including the smaller object's virtual size, initial distance, and speed, and the altitude of the virtual eye. We conducted optical analyses of the scenes that corresponded to the mean 71% threshold for each parameter. We assumed that if observers' judgments were based on a critical value of optical size, expansion rate, or TTC, threshold would occur when a particular optical variable reached a specific value, regardless of which parameter was varied. For example, if judgments were based on the objects' optical sizes, the latter would be the same value at threshold whether we varied the virtual object's size, distance, or speed. However, as in our collision-avoidance analyses, we did not identify an optical parameter that consistently accounted for the thresholds. Finally, results suggest that optical size or rate of expansion alone does not necessarily account for performance in a collision control task (Smith et al., 2001). Some of the results were consistent with a linear weighting of optical size and expansion rate; other results suggested a pure optical size strategy. The authors concluded that optical size and rate of expansion were functionally independent degrees of freedom in their task and that observers used optical size and expansion rate alone or jointly to perform the task.
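The ratio logic behind these launch-time and threshold analyses can be sketched numerically. Under small-angle assumptions, an object of size s at distance D approaching at speed v has optical size s/D, expansion rate sv/D², and τ = D/v. The Python sketch below uses hypothetical values that are only loosely patterned on the collision-avoidance displays; they are not the published data.

    def optics(s, D, v):
        # Small-angle approximations: optical size, expansion rate, and tau.
        theta = s / D
        theta_dot = s * v / D ** 2
        return theta, theta_dot, D / v

    # Hypothetical trials: a small and a large object approached from the same
    # distance at the same speed; launch times are chosen (arbitrarily) so that
    # the observer "jumps" later, at a shorter remaining TTC, for the small object.
    small = dict(s=0.3, D0=30.0, v=10.0, launch=2.4)
    large = dict(s=1.2, D0=30.0, v=10.0, launch=2.0)

    def at_time(obj, t):
        return optics(obj["s"], obj["D0"] - obj["v"] * t, obj["v"])

    for label, t_s, t_l in [("start", 0.0, 0.0),
                            ("launch", small["launch"], large["launch"])]:
        th_s, dth_s, tau_s = at_time(small, t_s)
        th_l, dth_l, tau_l = at_time(large, t_l)
        print("%-6s  size ratio %.2f  expansion ratio %.2f  tau ratio %.2f"
              % (label, th_l / th_s, dth_l / dth_s, tau_l / tau_s))
    # If launch were triggered by a critical value of a single optical variable,
    # that variable's large-to-small ratio would equal 1.0 at launch.

With these numbers the size and expansion ratios move toward 1.0 between the start of the trial and launch, whereas the τ ratio starts at 1.0 and moves away from it, which is the qualitative pattern described above.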
We consider expansion rate a heuristic because it does not specify TTC unequivocally and depends on an object's size, distance, and velocity (DeLucia, 1989, 1991a). For example, relatively fast objects that may have relatively fast expansion rates may be relatively far, and thus do not always arrive first. The same can occur for relatively near, but slow, objects.
Apparent size-distance couplings cannot account for the SAE. One explanation of the SAE is that apparent size is coupled to apparent distance (DeLucia & Warren, 1994). Accordingly, in our collision-avoidance task, observers jumped sooner for the larger object because it appeared closer. This is consistent with traditional studies of apparent size-distance relationships (e.g., Epstein, Park, & Casey, 1961; Hastorf, 1950; Ittelson, 1951a,b; Kilpatrick & Ittelson, 1951). Such an account is based on observers' assumptions about the objects' virtual sizes. If observers assumed that the objects were equal in virtual size, they would infer that the object with the smaller optical size was farther. However, as noted earlier, subjective reports were not consistent with observers' judgments of equal-sized objects (DeLucia, 1991b). Moreover, we ruled out an apparent size-distance account on logical terms (DeLucia & Warren, 1994). Specifically, if observers assumed that the two objects were the same virtual size, the object with the larger optical size would appear closer. However, the perceived change in the larger object's distance over time (speed) also would appear smaller by the same proportion. That is, the larger object would appear not only closer, but slower than the smaller object. Consequently, the TTC of both objects would appear the same (assuming perceived TTC is based on perceived distance and perceived velocity) and observers would begin their jump at the same time for small and large objects. This is contradicted by our data, and suggests that the coupling between apparent size and apparent distance is not necessarily obligatory (Hochberg, 1982).
Pictorial relative size can affect judgments about potential collision. Prior studies of judgments about collision typically measured judgments about TTC or when a collision would occur. However, judgments about when a collision would occur must be preceded by the detection of a potential collision or judgments about whether a collision would occur (DeLucia, 2001; DeLucia et al., 2002). We examined whether the effects of pictorial relative size and ground-intercept information that we obtained in judgments about TTC also would occur in judgments about whether a collision would occur between two objects (DeLucia, 1995).
Figure 4: Left panel. Path angles of cubes (top view). Right panel. Schematic representation of scene with unequal-sized cubes, ground-intercept information, and a 180-degree path angle (actual scenes were in color).
In a representative experiment, two objects moved toward each other at various path angles and disappeared before they collided with, or passed, each other as represented in Figure 4. When the objects were unequal in size, observers judged more often that the objects would miss. When the objects were equal in size, observers judged more often that they would collide. Performance improved when ground-intercept information was present. These effects persisted when displays simulated self motion, when the objects and surrounding environment were textured, and when displays subtended a 74 deg x 90 deg field of view (DeLucia et al., 2002). The implication is that pictorial depth information can affect judgments about potential collision as it affects judgments about TTC. Indeed, a recent study suggests that static depth information also can affect judgments about heading (Best, Crassini, & Day, 2002).
1.3 Other depth cues and information integration
In light of the robust effects of pictorial relative size on TTC judgments, we examined whether other pictorial depth cues can affect such judgments (DeLucia et al., 2000; DeLucia et al., 2001; DeLucia et al., 2003). In one experiment, two objects approached the viewpoint and observers responded when they thought one of the objects would reach the observation plane; we measured the difference between TTC estimates for the two objects. In another experiment, observers reported which of the objects would reach the observation plane first. We varied one object's virtual speed with a staircase procedure and estimated the 50% threshold. Following the method described by Bruno and Cutting (1988), the depth cues of relative size, height in field, occlusion, motion parallax, and texture density were either present or absent, and all factorial combinations were represented. When a cue was present, it suggested that one of the objects was farther away than the other object; however, both objects actually had the same TTC. Generally, our results indicated that relative size, height in field, occlusion, and motion parallax can influence TTC judgments. Relative size accounted for the largest proportion of the variance. If more than one source of information affects TTC judgments, it becomes important to determine how such effects combine when multiple sources are available. We used a modification of functional measurement (Anderson, 1974; Bruno & Cutting, 1988) to examine information integration in the absolute and relative TTC judgments described previously (DeLucia et al., 2000; DeLucia et al., 2001; DeLucia et al., 2003). Our results did not support a cue selection strategy in which observers used only one source of information. Rather, multiple sources of information affected performance. Moreover, the rules by which our information sources combined were not always consistent with a strict additive model and depended on the task. Results of absolute TTC judgments (PM tasks) were mostly consistent with an additive combination rule. Results of relative TTC judgments were more consistent with a nonadditive rule. In summary, it is important to measure the relative strengths and combinatorial rules of multiple sources of information in TTC judgments, including pictorial and motion-based depth information.
1.4 What happens when optical TTC information or one of its components is "nullified"?
Next, I review studies in which TTC judgments were measured when optical TTC information or one of its components was "nullified" by self motion or by occlusion. Such studies have implications for whether observers
can perform TTC judgments on the basis of other factors, that is, when optical TTC information putatively is not available.
Effects of self motion on TTC judgments. When an object moves toward a stationary target with a constant velocity on a path that is perpendicular to the observer's primary line of sight, its TTC with the target is specified by the inverse of the relative rate of constriction of the optical gap between the object and target (e.g., Tresilian, 1990). However, such analyses were based on the assumption that the visual angle formed by the position of the object, target and observation point is constant, as is the case when the observer is stationary; self motion can violate the assumptions of these analyses (DeLucia & Meyer, 1997, 1999). Furthermore, empirical studies of judgments about the TTC between two objects involved simulations of a stationary observer. Therefore, we examined whether simulations of self motion affect such judgments (DeLucia & Meyer, 1997, 1999; see also Gray & Regan, 2000; Grutzmacher, Geri, & Pierce, 2000). In a representative experiment, an object moved toward a target on a path that was perpendicular to the primary line of sight and disappeared before it reached the target. Observers pressed a button when they thought the object would reach the target. Simulations represented a stationary observation point or self motion in several directions. Forward self motion decreased the optical gap constriction and backward self motion increased it. As represented in Figure 5, in some scenes, self motion nullified the constriction of the optical gap between the object and target; the gap was the same on the first and last frames. Results indicated that effects of self motion depended on various factors such as background texture, actual TTC, and the speed of object motion and self motion. Generally, TTC estimates increased as actual TTC increased when scenes represented a stationary or a moving observer. Most interestingly, self motion did not always affect TTC judgments, even when optical gap constriction was nullified. Judgments may have been based on information other than optical gap constriction. What information might observers have used? Tresilian (1999) demonstrated subsequently that τ specifies the TTC between two objects during self motion that nullifies optical gap constriction. In this case, τ is based on the relative rate of optical expansion of either the object or the target that results from forward self motion. This means that the optical invariant that specifies when an approaching object would reach the eye also specifies when an object would reach a target during gap-nulling self motion. With this in mind, we compared TTC judgments of two different events in which TTC was specified by the same invariant. It has been reported that, on the average, observers underestimate the TTC of an approaching object at about 60% of the actual TTC (Tresilian, 1995). If observers used τ to judge TTC in our gap-nulling scenes we would expect a similar degree of error. In contrast, such scenes resulted in substantially smaller errors. Therefore, even when two events are specified by the same optical invariant, the accuracy of observers'
judgments about those events may not be comparable. The implication is that an observer's use of τ depends on the context of the scene (Cavallo & Laurent, 1988; DeLucia, 1991b). Alternatively, observers did not use τ to judge either event, or they used a different source of information for each event.
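In notation introduced here for illustration (not the original papers'), the gap relation for a stationary observer described above can be written as follows. Let σ(t) be the visual angle of the gap between the moving object and the stationary target; for constant-velocity motion perpendicular to the primary line of sight and small angles,

    \mathrm{TTC}(t) \;=\; \left(-\frac{\dot{\sigma}(t)}{\sigma(t)}\right)^{-1} \;=\; -\,\frac{\sigma(t)}{\dot{\sigma}(t)},

that is, the inverse of the relative rate of constriction of the optical gap. When self motion nullifies that constriction, the account of Tresilian (1999) summarized above amounts to applying the same ratio form to the optical size of the object or target expanding under forward self motion, i.e., the familiar \tau = \theta/\dot{\theta}.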
Figure 5: Schematic representation of scenes in which self motion nullified optical gap constriction. Top. First frame. Middle. Last frame. Bottom. Example of a textured scene (actual scenes were in color).
Finally, we compared TTC judgments of scenes in which monocular TTC information has been specified mathematically with judgments of scenes in which monocular TTC information putatively is not available and may require approximations, or binocular TTC information (Tresilian, 1999). In some conditions, performance was comparable nevertheless. These results, and the previously discussed effects of pictorial depth cues on TTC judgments, suggest that optical TTC information may be neither necessary nor sufficient to account for TTC judgments in all contexts (see also Kerzel, Hecht, & Kim, 1999; Tresilian, 1991).
Effects of static and moving occluders on TTC judgments. Prior studies of TTC typically focused on judgments of unoccluded approaching objects (but see Hancock & Manser, 1997) and have not considered the implications of occlusion for derivations of TTC information or for TTC judgments. We demonstrated that a stationary occluder can affect an approaching object's
optical size and a variable similar to a local τ margin that is based on the unoccluded portion of the object. We refer to this as the relative rate of accretion, which is derived in the Appendix (DeLucia et al., 2003; see also DeLucia, 2002). We also obtained differences in TTC judgments of unoccluded and occluded objects. Most recently, we examined the effect of moving occluders on TTC judgments (DeLucia, 2002). A square-shaped or diamond-shaped object approached the observer while partially concealed by a stationary or moving occluder, as represented in Figure 6.
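The formal derivation is given in the Appendix (DeLucia et al., 2003); schematically, and in notation introduced here only for illustration, if α(t) denotes the visual angle of the unoccluded (visible) portion of the approaching object, the quantities at issue are

    \frac{\dot{\alpha}(t)}{\alpha(t)} \;\;\text{(relative rate of accretion)}
    \qquad\text{and its inverse}\qquad
    \frac{\alpha(t)}{\dot{\alpha}(t)} \;\;\text{(a local \tau-like margin)}.

Because a moving occluder changes α̇ independently of the object's approach, occluder motion can alter these quantities without affecting global τ. This is only a schematic reading of the text, not a substitute for the derivation in the Appendix.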
Figure 6: Schematic representation of scenes with moving occluder. Top. First frame. Bottom. Last frame. Actual scenes were in color.
The occluder's rightward motion resulted in a decrease in the relative rate of accretion; leftward motion increased it. In one condition, the occluder nullified the object's optical expansion; the object's optical size (either width or area) was the same on the first and last frames. Global τ was not affected by the occluder's motion. Observers responded when they thought the object would reach them, and TTC estimates were measured. Results indicated significant effects of the occluder's motion and interactions between motion and other variables such as the object's shape and TTC. For diamond-shaped objects, TTC estimates were smaller for rightward-moving occluders compared with stationary occluders; TTC estimates were greater for expansion-nullifying occluders. Although such directional changes in TTC estimates were consistent with the relative rate of accretion, the latter typically predicted larger changes in TTC estimates than we obtained. Our results are difficult to reconcile with accounts based on the relative rate of accretion or on τ.
2. The sources of information that influence TTC judgments are determined, in part, by limits in sensory processes
The studies reviewed here suggest that the sources of information that influence TTC judgments are determined, in part, by limits in sensory processes. Such limits are important for models of TTC perception. Hochberg (1982) argued that theories of visual perception must consider the role of sensory limitations and the size of the effective stimulus. For example, he noted that demonstrations that there are higher-order optical invariants which provide information about the environment, and that observers sometimes use them, alone do not imply anything about perceptual processes; the effectiveness of such information is limited in space and time. Thus, it is critical to determine the size of the effective stimulus and to consider the role of mental structure when sensory limits are exceeded. I summarize several of his main points and discuss their implications for models of TTC perception. The effectiveness of visual information is limited by several factors. First, the effectiveness of higher-order variables depends on whether lower-order variables (which Hochberg called "carrier elements") are above threshold. For example, the texture density gradient and the surface slant which it specifies will not be detected if the texture elements or the change in their density is too small. Second, the effectiveness of visual information depends on the locus of fixation or where one looks. For example, kinetic occlusion information is picked up less effectively when viewed with peripheral vision compared with foveal vision. Third, when observers make eye movements to attain visual information, they cannot encode and store all the information from one glance to the next. Such limits imply that some form of mental structure guides eye movements, which are, after all, elective or purposeful. Finally, sensory limits also exist in the temporal domain. For example, it is not clear how much viewing time is needed before optical invariants become usable or effective. These issues are relevant for models of TTC perception. For example, τ is not effective if an object's optical size and expansion are below threshold (DeLucia, 1989; Gray & Regan, 1998; Lee, 1976; Todd, 1981). Similarly, optical size and expansion may be less effective when viewed with peripheral vision compared with foveal vision. Furthermore, it is unlikely that optical TTC information is perceptually extracted continuously (DeLucia, 1991a, 1999). The minimal time over which temporal information is integrated before it can be used may range from 32 ms (Morgan, 1980) to 350 ms (Smith & Gulick, 1957). Alternatively, observers may use an averaging process to extract TTC. That is, TTC may be derived by averaging optical expansion information over a minimal period of time (DeLucia, 1991a, 1999; Tresilian, 1993).
Finally, it is essential to determine how observers judge TTC when relevant visual information is rendered ineffective by sensory limits. For example, if higher-order invariants are not effective, observers may rely on heuristics. In addition, mental structures, inferences, or cognitive operations may be activated when higher-order information is below threshold or inadequate (Gibson, 1966; Hochberg, 1982). However, if such mechanisms are implicated when sensory limits render higher-order variables below threshold, it becomes important to consider whether such mechanisms are active even when higher-order variables are above threshold. In summary, despite the availability of optical invariants, it is essential to consider sensory limits in models of TTC perception. In Hochberg's terms, it is critical to determine the size of the effective stimulus.
2.1 Effects of lower-order motion on TTC judgments
A distinction was made between lower-order or first-order motion and higher-order motion information (e.g., DeLucia, 1991a). The latter refers to invariants in the optic array that result from three-dimensional motion; an example is τ. Lower-order motion refers to movement per se or motion whose thresholds can be measured whether the motion is due to two-dimensional motion in the projection plane or to the projection of three-dimensional motion. Examples include an object's vertical displacement in the projection plane, or its change in optical size. This distinction is important because lower-order motion is always present in higher-order motion information. However, higher-order motion information is not necessarily present in two-dimensional motions. As noted earlier, the effectiveness of higher-order information (e.g., τ) depends on whether lower-order motion (e.g., optical expansion) is above threshold. Therefore, it is important to determine the influence of lower-order motion on TTC judgments. For example, lower-order motion may constrain the effectiveness of τ or affect TTC judgments even when τ is available. Indeed, results of several studies suggest that TTC judgments are influenced by lower-order motion such as optical magnification, expansion rate, and image velocity (e.g., DeLucia, 1991a; DeLucia & Novak, 1997; DeLucia & Warren, 1994; Gray & Regan, 1999; Kebeck & Landwehr, 1992; Kerzel, Hecht, & Kim, 1999; Regan & Hamstra, 1993; Smith et al., 2001). The implication is that lower-order motion which does not specify TTC unequivocally can affect TTC judgments. Lower-order motion and higher-order motion information both are relevant to models of TTC perception.
2.2 Potential effects of computer aliasing on TTC judgments
We examined whether computer aliasing affects TTC estimates and contributes to the SAE (DeLucia, 1999). Display resolution influences the degree of aliasing and affects the accuracy of computer simulations. The effects of aliasing are not limited to the familiar jagged edges associated with computer-generated diagonal lines. The projected size of an approaching object also is affected by aliasing because the computer rounds to the nearest integer to determine which pixel to illuminate; a fraction of a pixel cannot be illuminated. Consequently, increases in a computer-generated image of an approaching object may be too small or too large compared with an ideal projection medium. In the extreme case, increases that are less than the size of a pixel cannot be represented and the object's image does not change. Consequently, τ does not specify TTC accurately from one "frame," or one point in time, to the next. The value of τ is larger than actual TTC when the change in the object's optical size is too small and is smaller than actual TTC when the change is too large. The potential effects of computer aliasing typically are not addressed in TTC studies but have important methodological and theoretical implications because we do not know how often the visual system samples TTC information. However, we do know that effects of aliasing on TTC judgments can occur only if inaccuracies in the computer-generated images are within the spatiotemporal resolution of the visual system. Therefore, we measured effects of aliasing on TTC judgments. We reasoned that if an approaching object's image does not increase in size from one frame to the next, optical TTC information is not accurate for that period of time. If TTC is extracted from the object's image on those two frames, observers' TTC estimates would be inaccurate. Alternatively, if TTC is derived by averaging optical expansion over several frames (DeLucia, 1991a, 1999; Tresilian, 1993), errors in TTC information due to aliasing may not affect TTC judgments. Moreover, objects with small projected images are affected more by aliasing than objects with larger projected images. For example, simulations of a small approaching object result in smaller changes in the object's image compared with a larger object that has the same virtual velocity and distance. A relatively greater proportion of the smaller object's total optical expansion contains inaccuracies. The implication is that aliasing may contribute to the SAE. We examined these issues with our fast expansion scenes (DeLucia, 1991a). With such scenes, observers reported that the small near object would hit them before the large far object, consistent with optical TTC information. We used animation techniques to create substantial irregularities in optical TTC information in these scenes (see DeLucia, 1991a, Figure 4). Specifically, we kept the object's image size constant on several successive frames of the approach scene. Nevertheless, observers reported that the small object would hit them first.
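The pixel-rounding argument can be made concrete with a minimal sketch; the frame rate, scale factor, object size, speed, and distance below are hypothetical and are not the parameters of the experimental displays.

    # An object of width s approaches at speed v; under a small-angle projection
    # its ideal image width is k * s / D(t) pixels, but the display can draw only
    # whole pixels.  Frame-to-frame tau is (image size) / (change per unit time).
    k = 400.0                  # pixels per radian of visual angle (hypothetical)
    s, v = 0.5, 2.0            # object width (m) and approach speed (m/s)
    D0, dt = 4.0, 1.0 / 60.0   # initial distance (m) and frame duration (60 Hz)

    def ideal_width(t):
        return k * s / (D0 - v * t)

    for frame in range(1, 6):
        t0, t1 = (frame - 1) * dt, frame * dt
        w0, w1 = ideal_width(t0), ideal_width(t1)
        p0, p1 = round(w0), round(w1)              # whole pixels actually drawn
        true_ttc = (D0 - v * t1) / v
        tau_ideal = w1 * dt / (w1 - w0)            # tau from the ideal image
        if p1 == p0:
            tau_drawn = "undefined (image did not change)"
        else:
            tau_drawn = "%.2f s" % (p1 * dt / (p1 - p0))   # tau from rounded image
        print("frame %d  true TTC %.2f s  ideal tau %.2f s  drawn tau %s"
              % (frame, true_ttc, tau_ideal, tau_drawn))

With these values the drawn image on a given pair of frames either fails to change (frame-to-frame τ is undefined because the change was too small to represent) or jumps by a whole pixel (τ is far too short because the represented change was too large), even though the ideal frame-to-frame τ tracks the true TTC; averaging expansion over several frames largely removes such irregularities.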
The implication is that observers do not extract optical TTC information on a frame-by-frame basis. We obtained similar results in a PM task when the object's optical size did not change on 0%, 19%, 45%, or 74% of the frames (DeLucia, 1999). The effect of aliasing was not significant, and estimated TTC increased as actual TTC increased despite disturbances in optical TTC information. Finally, mean TTC estimates were greater for a small object compared with a larger object that contained aliasing on a greater percentage of frames. The implication is that pictorial relative size underlies the SAE rather than artifacts of computer aliasing. In summary, our results suggest that irregularities in optical TTC information due to computer aliasing do not affect TTC judgments or account for the SAE. The implication is that TTC information is not extracted on a frame-by-frame basis, possibly due to limits in the temporal resolution of the visual system. Alternatively, TTC information is derived by averaging optical expansion information over a minimal period of time. Finally, a recent study indicated adverse effects of time or motion sampling on TTC judgments of simulated and real objects (Hecht, Kaiser, Savelsbergh, & van der Kamp, 2002). In this case, approaching objects appeared and disappeared at various rates. Such sampling led to overestimations of TTC compared with continuous viewing. The authors concluded that τ theory does not predict their sampling effects.
2.3 Effects of attention instructions on TTC judgments
In the primary demonstrations of the SAE, we obtained an interesting effect of attention instructions on TTC judgments (DeLucia, 1989, 1991a). The SAE was reduced when ground-intercept information was present and observers were instructed to attend to it rather than to the objects. However, this advantageous effect of ground-intercept information did not occur when observers were instructed to attend to the objects rather than to the ground-intercepts. Such effects of attention instructions are important because they indicate that effects of visual information are not distributed uniformly across the optic flow field and, in particular, that useful information for TTC judgments may not be accessible to the observer within a single glance or attentional act (DeLucia, 1991a). This may be due partly to effects of retinal eccentricity.
2.4 Effects of retinal eccentricity on TTC judgments
Although visual acuity decreases as eccentricity increases, results of several studies suggest that optical TTC information can be extracted from central and peripheral vision. First, in an outdoor setting, batters hit a pitched ball effectively when part of the ball's path was occluded from view; peripheral vision may have been important in extracting optical TTC information (DeLucia & Cochran, 1985). Second, based on measures of collision-avoidance responses, it has been argued that peripheral retina is sensitive to optical TTC information (Stoffregen & Riccio, 1990). Third, analyses suggest that umpires must rely on peripheral vision to judge runners "safe" or "out," which essentially is a relative TTC judgment (DeLucia & Bush, 1999). However, other studies indicated effects of eccentricity on TTC discrimination thresholds (Regan & Vincent, 1995), and on TTC estimations in a PM task (Manser & Hancock, 1996). Recently, it has been reported that eccentricity can affect judgments about absolute and relative TTC, and judgments of approach and lateral motion (Meyer, 2001). In a study of a PM task, observers pressed a button when they thought an approaching object would reach them, and TTC estimates were measured. In a study of relative TTC judgments, observers reported which of two approaching objects would reach them first, and the velocity of one of the objects was varied with a staircase method; difference thresholds were estimated. In both studies, viewing fixation was manipulated so that the center of the judged object(s) was located 6 or 35 degrees from the fovea on the last frame of the approach sequence. Generally, results indicated that eccentricity can affect both types of judgments. Furthermore, in the PM task, the SAE was greater when the object was viewed peripherally compared with centrally, which suggests that peripheral vision may rely more on heuristics compared with central vision. Such findings are consistent with those of Regan and Vincent (1995), who measured TTC discrimination thresholds for approaching objects. They reported that, in foveal vision, TTC was processed independently of optical size and expansion rate. However, this independence decreased as eccentricity increased. The implication is that central and peripheral retina rely on different types of visual information. Meyer examined this issue further in a PM task. An object moved toward a target on a path that was either perpendicular or nonperpendicular to the observer's line of sight. The former display contained lamellar flow; the latter contained both lamellar and radial flow. In some conditions, there was a significant interactive effect of eccentricity and path angle on TTC estimates. The difference in TTC estimates between the two path angles was significant only with peripheral viewing. Analyses of constant error suggested that the central retina is sensitive to radial and lamellar flow, whereas the peripheral retina is not particularly sensitive to radial flow. Meyer
concluded that his results were consistent with the functional sensitivity hypothesis (Warren & Kurtz, 1992). According to the latter, central vision is sensitive to lamellar and radial flow whereas peripheral vision is mainly sensitive to lamellar flow.
3. The sources of information that influence TTC judgments are determined, in part, by limits in cognitive processes
It has been noted that even if observers can use optical TTC information to judge TTC, such judgments may be constrained by cognitive operations, especially in complex situations when several objects are judged or when objects are not continuously in view (DeLucia & Novak, 1997). In this section, I review the results of several studies which suggest that limits in attention and memory influence TTC judgments.
3.1 Limited-capacity processing
In most studies of relative TTC judgments, displays contained only two objects (e.g., DeLucia, 1991a; Law et al., 1993; Simpson, 1988; Todd, 1981). Results of such studies may not reveal limits in cognitive processing such as capacity limits in memory and attention (e.g., Cowan, 1988; Kahneman, 1973; Shiffrin, 1976). Law et al. (1993) suggested that judgments about relative TTC require resource-limited central processing, and that processing demands and visual scanning play a role in such judgments. Therefore, we measured relative TTC judgments of more than two approaching objects (DeLucia & Novak, 1997). In a representative experiment, two, four, six, or eight objects (set size) approached the observer, as represented in Figure 7. Local and global τ provided veridical TTC information. Observers reported which object would "hit" or pass them first; we recorded response time (RT) and accuracy. We obtained several interesting results. First, mean RT was greater with four, six, or eight objects than with two objects. This is consistent with limited-capacity processing (e.g., Townsend, 1974). The implication is that processing load affects TTC judgments and must be considered in models of TTC perception. Second, accuracy was above chance probability regardless of set size. This indicates that observers can effectively judge the relative TTC of as many as eight objects.
Figure 7: Schematic representation of a scene with eight approaching objects (first frame and last frame).
Third, misleading relative size information resulted in chance performance when two objects were present; the SAE occurred. However, misleading relative size information did not degrade performance when more than two objects were present. We hypothesized that observers used multiple sources of visual information more efficiently with relatively smaller set sizes. Consequently, misleading information was more effective with smaller set sizes. The implication is that the number of sources of visual information and the type of visual information that influences TTC judgments varies with the number of objects in the optic array. In summary, our results suggest that limited-capacity processing constrains TTC judgments.
3.2 Does set size affect the detection of TTC (can observers perform a parallel search of TTC)?
Studies suggest that there are mechanisms in the human visual system that are sensitive to optical TTC information independently of optical size and expansion (Regan & Hamstra, 1993). We examined whether set size affects the detection of TTC. The absence of set size effects would be consistent with parallel search for TTC and would implicate a mechanism that is selectively responsive to optical TTC information (DeLucia & Novak, 1997). We again used displays of two, four, six, or eight approaching objects. However, rather than perform an identification task as described in the previous section, observers performed a detection task. On half of the trials, one object would have reached the observation plane before the other objects; a target was present. In the other trials, all of the objects would have arrived at the same time; a target was absent. Observers reported whether a target was present or absent. Accuracy was above chance probability regardless of set size, and mean RT was faster with eight objects than with two, four, or six objects. The
implication is that detection judgments are less constrained by limits in cognitive processes than are identification judgments. To examine whether observers used parallel or serial search, we computed the slope of the relationship between set size and RT. Parallel search has been characterized by search times of less than 10 ms per element in target-present scenes (Treisman & Gormican, 1988). When pictorial relative size information was consistent with optical TTC information, the slope for target-present trials was 1.75 ms per item and the slope for target-absent trials was -33.25 ms per item. The former is consistent with parallel search; the latter is consistent with a degenerate search strategy (Duncan & Humphreys, 1989). When relative size contradicted optical TTC information, respective slopes were 20.69 ms and -9.06 ms. In summary, our results suggest that relative TTC judgments which involve identification are constrained by limited-capacity processing whereas those that involve detection are not so constrained. However, while the detection task performed by our observers involved visual search for a target, it was not identical to the traditional feature-search task; nor did we include a conjunction-search task (e.g., Treisman & Gelade, 1980). More systematic studies of visual search for TTC are required before conclusions can be reached. Results of a recent study suggested that collision detection performance is based on a limited-capacity process that uses serial search (Andersen & Kim, 2001). Therefore, task and event parameters may determine whether judgments are affected by limited-capacity processing.
3.3 Limits in short-term memory
Set size was again manipulated to examine the role of cognitive processes in a PM task (Novak, 1998). Displays consisted of one, three, or six objects that moved rightward toward a target. Observers pressed a button when they thought one of the objects would reach the target. However, they did not know which object to judge until they received a "post-cue" (Law et al., 1993). This cue occurred immediately, 1.5 s, or 3 s after the objects disappeared. Presumably, cognitive processing demands increased as post-cue delay and set size increased. Generally, results indicated that performance declined as set size and post-cue delay increased. Novak concluded that her results were consistent with limits in cognitive processing, possibly in memory capacity and memory duration.
3.4 Cognitive motion extrapolation and visual imagery
We examined whether cognitive extrapolation of motion and visual imagery contribute to PM tasks (DeLucia & Liddell, 1998). In this context, cognitive motion extrapolation (CME) involves a cognitive model of the object's
visible motion that is used to extrapolate the object's motion after it disappears and to estimate TTC (Schiff & Oldak, 1990; Tresilian, 1995). We proposed that in conjunction with such a model, observers may imagine that the object continues to move after it disappears and respond when the imagined object reaches the target (see also Kaiser & Mowafy, 1993; Rosenbaum, 1975). This is consistent with subjective reports from observers in our studies. We examined the potential role of CME and imagery in PM tasks in three ways, summarized from DeLucia and Liddell (1998). First, we measured performance in an "interruption task" (Cooper, 1989) that involved CME and obviated TTC information. We compared performance in this task with performance in a PM task in which observers could use CME or optical TTC information (it is not possible to disentangle these in the PM task). In the interruption task, an object moved at a constant velocity and disappeared for a variable amount of time. It then reappeared at the correct position in its trajectory, or at a position that was less advanced (undershoot) or more advanced (overshoot) than the correct position, as represented in Figure 8.
Figure 8: Schematic representation of scenes with rightward lateral motion. Top. Final visible position. Second. Correct reappearance position. Third. Overshoot. Bottom. Undershoot.
Observers reported whether the object reappeared at the correct position. TTC information was not relevant because observers were not told when or where the object would reappear. If the pattern of response errors in this task and the PM task were similar, it would implicate CME in the PM task. For example, observers typically underestimated the TTC of approaching objects and more accurately judged the TTC of objects that moved perpendicular to the line of sight (e.g., Schiff & Oldak, 1990). If CME was the basis for this pattern of responses, we would expect similar results in the interruption task. The results of our study were complex. Generally, with lateral motion, the pattern of results in the interruption task was mostly consistent with that obtained in PM tasks. We observed less consistency with approach motion.
Second, we measured TTC judgments in a PM task while manipulating visual information about the environment between the observer and the display (DeLucia, 1989, 1991a; DeLucia & Liddell, 1998). We hypothesized that if observers imagined that the object moved through the environment after its disappearance, they would rely on landmarks to keep track of the imagined object's position; performance would be affected when such cues were minimized. Displays of an approaching object were viewed through an aperture that minimized the information about the environment between the observer and the display. Results from this condition were compared with those in which displays were viewed without an aperture in a fully illuminated room. Although the main effect of viewing condition was not statistically significant, it accounted for over 8% of the variance. Moreover, the slope of the relationship between estimated TTC and actual TTC was smaller with aperture viewing; that is, performance deteriorated. Tentatively, such results suggest that observers rely on imagery after the object disappears. Third, we used a selective interference paradigm (e.g., Brooks, 1967; Segal & Fusella, 1970) to examine the processes that underlie the PM task. Participants reported when a rightward-moving object would reach a target. After the object disappeared, two different stimuli appeared, and observers reported which of the two stimuli was longer in duration. The stimuli were presented visually or aurally. In a control condition, observers performed the PM task alone. We reasoned that if observers relied solely on TTC information, performance would be comparable for visual and auditory conditions. However, if the PM task involved CME, we would expect a greater performance decrement when the PM task was performed concurrently with the relative duration task, compared with the PM task alone; and this decrement would be greater for visual stimuli compared with auditory stimuli. Results indicated that the PM task was not affected by the duration judgments. However, the latter were less accurate when visual stimuli were judged concurrently with the PM task than when the stimuli were judged alone. The analogous comparison was not significant with auditory stimuli. The implication is that the TTC judgment and the visual duration judgment demanded common resources. Selective interference was again used to examine the role of cognitive processes in a PM task (Liddell, 1998). An object moved rightward toward a target and disappeared. Observers responded when they thought the object would reach the target and TTC estimates were measured. After the object disappeared, an alphanumeric character appeared in its normal or reflected position, and at one of four orientations about the depth axis. Observers completed a mental-rotation task (Cooper & Shepard, 1973) in which they reported whether the character was in a normal or reflected position, and response times were measured. Liddell hypothesized that the mental rotation task would interrupt the imagery processes that putatively were involved in the
PM task. Furthermore, in half of the trials, a cue occurred before the object began to move and provided information about the character's identity and orientation. In this case, Liddell hypothesized that the imagined rotation involved in the mental rotation task would be performed before the imagery processes were activated in the PM task; thus, the latter would not be interrupted in the cued condition. Generally, mean TTC estimates were not affected by the presence or absence of the cue. However, analyses of RT for the mental rotation task indicated an interaction between cue and orientation. An effect of the latter occurred in the non-cued condition only. Liddell suggested that the PM task and mental rotation task require imagery processes and share limited processing resources.
4. The sources of information that influence TTC judgments vary throughout a task or event
The studies reviewed in this section suggest that the information sources that influence TTC judgments vary during the course of a task or event. This may occur because specific sources of information are below threshold at certain distances; such information exceeds threshold and reaches maximal effectiveness as distance changes. This proposal is consistent with an analysis of the perception of layout, or depth perception, by Cutting and Vishton (1995), which indicates that the quality of some information sources varies with distance. Based on analyses of depth thresholds for different information sources, they proposed that the space around a moving observer can be divided into three regions and that effective sources of depth information can be identified within each region. Within personal space, which immediately surrounds the observer, the effective depth cues are occlusion, retinal disparity, relative size, convergence and accommodation. Within action space, 2 m to 30 m from the observer, the effective depth cues are occlusion, height in field, disparity, motion perspective and relative size. Beyond 30 m, in vista space, only pictorial depth cues are effective: occlusion, height in field, relative size and aerial perspective. Note that occlusion and relative size are effective in all three regions. Another possible reason that effective sources of information vary during the course of a task or event is that observers putatively direct their fixation and attention to various locations in space as an event unfolds and a task progresses. Fixation location determines which sources of information fall in central vision and peripheral vision. As noted earlier, this can influence which sources of information are effective. Similar consequences can arise from changes in visual attention, which can affect psychophysical thresholds and the response of cells in the visual cortex (e.g., Gilbert, 1996). Thus, attention may
enhance or attenuate the effectiveness of an information source. Cognitive processing demands, which may vary during a task or event, also may have similar consequences. Moreover, Previc (1998) proposed a neuropsychological model of how humans interact with the three-dimensional environment. In this model, three-dimensional space is divided into four regions. Each serves a different set of perceptuomotor functions and is mediated by different neuroanatomical systems. For example, in the region most proximal to the observer, reaching and grasping are served by the dorsolateral visual pathway, particularly the areas specialized for global motion analysis and global stereopsis. In a farther region, navigation is served by the ventromedial visual pathway, which putatively has a role in scene memory. The implication for models of TTC perception is that the source of information that affects performance and the visual pathways that process it may vary with different regions of three-dimensional space and the corresponding visual fields. Based on these analyses, it seems unlikely that the information sources that govern performance remain constant throughout a task. Rather, information sources that are processed and that influence TTC judgments probably vary as the distances between the observer and objects in the environment change. Moreover, sensory and cognitive limits and attentional factors likely play an important role in determining which information sources are effective throughout a task or event. If observers sample various sources of information throughout a task, it would be difficult to identify a critical value of a single source of information that accounts for TTC judgments. This is consistent with our analyses of the SAE. Prior accounts of the SAE were based on the assumption that observers used the same source of information throughout the entire viewing period. Alternatively, we proposed that the visual information that guided performance varied over time (DeLucia & Warren, 1994). For example, during the initial part of our collision-avoidance task, observers may have perceived the smaller object as farther due to pictorial relative size. As the object got closer and TTC decreased, the influence of τ may have increased. This hypothesis is consistent with our optical analyses. If either optical size or τ was used alone, launch times would have occurred when such information was the same value for small and large objects. Instead, the ratio of the objects' optical sizes decreased between the start of the trial and launch time; however, it did not reach 1.0. Similarly, the ratio of the objects' TTCs began at 1.0 and increased at launch time. Even so, this ratio was closer to 1.0 than was the ratio for optical size. Tau and pictorial size both may have influenced performance, but the effect of each may have varied over the course of the trial. More generally, the relative contribution of different information sources to TTC judgments may change over time.
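To make the size-arrival reasoning concrete, the following sketch works through a hypothetical configuration; the object sizes, distances, and speeds are invented for illustration and are not the parameters of DeLucia and Warren's displays. It compares what a relative-size heuristic and first-order τ each say about arrival order as an approach unfolds: early in the event the larger, farther object subtends the larger visual angle even though its TTC is longer, and the optical sizes cross only late in the approach.

```python
import math

def optical_angle(radius, distance):
    """Visual angle (rad) subtended by an object of the given half-size at the given distance."""
    return 2.0 * math.atan(radius / distance)

# Hypothetical scene: a small, near, sooner-arriving object and a large, far,
# later-arriving object (the classic size-arrival configuration).
# (half-size in m, initial distance in m, approach speed in m/s)
small = (0.25, 15.0, 5.0)    # actual TTC = 3.0 s
large = (1.50, 40.0, 10.0)   # actual TTC = 4.0 s

print("  t  theta_small  theta_large  size heuristic says   tau says")
for t in [0.0, 1.0, 2.0, 2.5, 2.8]:
    r_s, z0_s, v_s = small
    r_l, z0_l, v_l = large
    z_s, z_l = z0_s - v_s * t, z0_l - v_l * t
    th_s, th_l = optical_angle(r_s, z_s), optical_angle(r_l, z_l)
    tau_s, tau_l = z_s / v_s, z_l / v_l
    size_pick = "small first" if th_s > th_l else "large first"
    tau_pick = "small first" if tau_s < tau_l else "large first"
    print(f"{t:4.1f}  {th_s:11.3f}  {th_l:11.3f}  {size_pick:<20}  {tau_pick}")
```

Under these assumed numbers, the size heuristic mispredicts arrival order for most of the viewing period, which is one way the relative influence of size and τ could shift as the event unfolds.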
A similar hypothesis was offered with respect to a collision control task (Smith et al., 2001). The authors suggested that, across experimental sessions, observers adapted their responses to appropriate perceptual information based on the demands of the task. Such adaptation occurred as a result of feedback and practice (see also van der Kamp et al., 1997) and could lead to positive transfer in similar task environments or negative transfer in different task environments. Specifically, the authors proposed that observers adapted their responses to a specific combination of optical size and rate of expansion (although they cautioned that their task invited if not demanded such a strategy). In their proposal, adaptation takes place across sessions. In contrast, we proposed that observers adjust or tune their responses to different sources of information throughout a single event or task (DeLucia & Warren, 1994). That is, effective information varies over the viewing period. It is proposed further here that limits in sensory and cognitive processes influence which sources of information are effective throughout a task or event. Finally, we note that most studies of TTC measured one judgment at the end of an event (e.g., DeLucia, 1991a; Schiff & Detwiler, 1979; Todd, 1981) and tacitly assumed that observers used the same source of information throughout the event. Such a method cannot determine whether effective information varies during the event. Toward this aim, we devised a methodology in which specific sources of visual information and judgments about spatial layout such as TTC are measured concurrently throughout an approach event. Such continuous response measurement or on-line response monitoring can indicate whether (and when) the basis of performance changes from heuristics to invariants during the event.
5. Apparent spatial extent and mental structure can influence TTC judgments

The studies reviewed in this section support the hypothesis that apparent spatial extent and mental structure can influence TTC judgments. Specifically, I describe effects of visual illusions on TTC judgments, and the possibility of perceptual couplings in such judgments.

5.1 Visual illusions can affect TTC judgments and interceptive actions

According to a τ-based model of TTC perception, perceptual judgments of distance or speed are not necessary to judge TTC or to perform interceptive actions (Lee & Young, 1985). Perceived spatial extent should not affect such tasks. Thus, misperceived spatial extent or visual illusions should not affect such tasks. However, our research suggests that they can.
We investigated the effects of the Sander parallelogram illusion (Coren & Girgus, 1978) on two types of TTC judgments (DeLucia, Tresilian, & Meyer, 2000). In this illusion, represented in Figure 9, two objectively equal diagonals appear to differ in length.
Figure 9: Schematic representation of Sander parallelogram (top), and display used in TTC tasks (bottom). Arrows show motion of object (left) and "cursor" that observers controlled (right).
In a representative experiment, an object moved toward the Sander figure on a path that was collinear with a diagonal. It disappeared before it reached the figure. Observers pressed a button when they thought the object would reach the corner of the figure. Mean TTC estimates were longer when the object moved toward the apparently longer diagonal of the figure than the apparently shorter diagonal, particularly when TTC was relatively long (5.5 s). These results suggest that the (mis)perception of extent contributes to TTC judgments. Similar results occurred with an interceptive action. In this case, observers moved a "cursor" with a control stick so that it reached or intercepted the top corner of the Sander figure at the same time that the moving object would reach the corner. Mean interception time was longer when the object moved toward the apparently longer diagonal than the apparently shorter diagonal. Such results occurred when the object moved and disappeared as in the PM task, or when it was visible until the end of the trial. In summary, our results indicate that the Sander illusion can affect TTC judgments and interceptive actions. The implication is that perceived spatial extent influences such responses, which is difficult to reconcile with τ-based models.
5.2 A preliminary study of perceptual couplings in TTC judgments

In light of the findings that perceptual attributes of the environment can affect TTC judgments (DeLucia, Tresilian, & Meyer, 2000) and that perceptual attributes can be coupled to one another (Hochberg, 1974), Jason M. Bush and I examined whether perceptual couplings can occur in TTC judgments. A perceptual coupling occurs when a change in the perception of one stimulus dimension results in a change in the perception of a second stimulus dimension. Examples include couplings between apparent lightness and apparent spatial orientation (Hochberg & Beck, 1954), between apparent size and apparent distance (Hochberg, 1974), and between apparent relative distance and apparent motion direction (Hochberg & Peterson, 1987). Perceptual couplings are important because they indicate that perceptual responses vary while the optic array remains unchanged. This has been considered clear evidence for mental processes or enrichment in visual perception (Hochberg, 1974). Results of our pilot study suggest that perceptual couplings can occur in TTC judgments. As represented in Figure 10, an object approached the viewpoint and disappeared. The object consisted of a flat hexagonal shape, similar to the Ames window, that was oriented frontoparallel to the observation plane.
Figure 10: Schematic representation of approach scenes. Top. Frontal view. Bottom. Top view.
The small and large sides were the same distance from the virtual eye, and optical TTC information specified that they would arrive simultaneously. However, lines were drawn on the surface so that it appeared as a three-dimensional box-like object whose orientation could be perceptually reversed (as in a Necker cube) with the small side or large side closer (e.g., Hochberg & DeLucia, 1986). In different conditions, we instructed observers to perceive the object as slanted in depth with the right side closer or with the left side closer. That is, we manipulated perceptual set or intention. With an opposed-set procedure in which observers maintained one of these perceptual orientations (Peterson & Hochberg, 1983), observers pressed a button when they thought the left side or the right side would reach them. Mean TTC estimates were greater when observers were instructed to perceive a given side as farther than when they were instructed to perceive it as closer. This effect of intention instructions occurred for both sides. Although more experimentation is necessary before conclusions can be reached, our pilot data suggest that perceptual set or intention can affect TTC judgments. The implication is that apparent TTC is coupled to apparent depth. To the extent that perceptual couplings indeed reflect mental structure, the latter can influence TTC judgments.
6. Conclusions

With the proposal that τ provides information for TTC judgments, Lee stimulated a body of research that has important theoretical and practical implications. Tau is elegant because it provides information that is sufficient for such tasks and does not require perceptual judgments of distance or speed (Lee, 1976, 1980; Lee & Young, 1985). However, research suggests that τ is only one of many factors that contributes to TTC judgments. For example, when optical TTC information such as τ is above threshold, TTC judgments are nevertheless influenced by pictorial depth cues and lower-order motion. Furthermore, such judgments are not necessarily affected by disturbances in τ as can occur with computer aliasing. Even when monocular TTC information putatively is not available, TTC judgments may be comparable to conditions in which such information has been specified. In addition, the use of τ may depend on the context of a scene. Moreover, TTC judgments can be affected by perceived attributes of the environment and by visual illusions. Overall, these findings suggest that τ is neither necessary nor sufficient to account for TTC judgments in all contexts. Multiple sources of information, including heuristics such as pictorial depth cues, must be considered in models of TTC perception. Cutting (1986) also noted the importance of multiple sources of information in his perceptual theory of directed perception, but emphasized that observers select
from multiple invariants that overspecify an environment. The studies reviewed here suggest that observers use heuristics and invariants. In particular, pictorial or optical size appears to be fundamental to TTC judgments independently of τ. An oversimplified example of a size heuristic is that objects with relatively larger optical sizes are coded as relatively closer (independently of assumptions about equal virtual sizes and traditional apparent size-distance couplings discussed earlier). There are several reasons that the visual system would rely on heuristics such as pictorial depth cues rather than on optical TTC information such as τ (DeLucia et al., 2000, 2003). First, τ is available only in restricted conditions. For example, objects must be rigid and move at a constant velocity, and the law of small angles must be upheld. When such conditions are not met, the utility of τ can be compromised (Tresilian, 1991). Relatedly, the visual system may not distinguish between events in which conditions required by optical TTC information are met and those in which they are not. Therefore, it would seem advantageous to use heuristics as an ongoing general strategy even when optical TTC information is available. Such a strategy also may accommodate the changes in fixation and attention that occur throughout a task and the corresponding changes in the information sources that are above threshold and effective. Second, the spatiotemporal resolution of the visual system is limited. For example, τ would not be effective if an object's optical expansion is below threshold. Third, pictorial depth cues are available in any "static" moment and are available before motion information, which takes time to develop (Hochberg, 1987). Fourth, some pictorial depth cues such as occlusion and relative size are effective over a greater range of distances than motion information. Fifth, cognitive processes are limited. For example, effects of set size suggest that capacity limits in memory and attention can influence TTC judgments. Finally, in the context of the perception of rotary motion, Braunstein (1976) proposed that heuristics rather than invariants are used in perceptual judgments because of memory limitations, processing speed, cognitive load, and degraded information. Such factors also are relevant to TTC judgments. In summary, τ is not always effective and its effectiveness is constrained by limits in sensory and cognitive processes. It is adaptive for the visual system to rely on other sources of information including pictorial depth cues and lower-order motion. It is proposed here that heuristics serve to accommodate limitations in sensory and cognitive processes and to provide flexibility in performance. Such limits also play an important role in determining which sources of information are effective throughout a task or event. Results of the studies reviewed here are consistent with Cutting and Wang's (2000) proposal that both heuristics and invariants can guide visual performance. However, they proposed that observers use invariants when they
are available and use heuristics when they are not. The studies reviewed here suggest that heuristics are used even when invariants are available. This has been noted before in the context of perception of rotary motion and has been considered adaptive (Braunstein, 1976). In conclusion, TTC judgments are based on multiple sources of information, including heuristics such as pictorial depth cues and lower-order motion. It is essential to identify the sources of information that affect such judgments and to measure the relative strengths and combinatorial rules of these sources. Furthermore, such judgments are constrained by limits in sensory and cognitive processes and heuristics may serve to accommodate such limits and provide flexibility in performance. Future research should consider the role of such limits in models of TTC perception and determine the effective information for TTC judgments throughout a task or event.
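One way to make the question of combinatorial rules concrete is a toy weighted-averaging scheme in the spirit of algebraic integration models (e.g., Anderson, 1974). The cue values, weights, and the linear rule itself are purely illustrative assumptions, not estimates from any of the experiments above; the point is only that once candidate sources are identified, their relative strengths can be expressed and tested as explicit combination rules.

```python
def combined_ttc_estimate(cues, weights):
    """Weighted-average combination of TTC-relevant cues (a toy linear rule)."""
    total_w = sum(weights[name] for name in cues)
    return sum(weights[name] * value for name, value in cues.items()) / total_w

# Hypothetical trial: tau specifies 3.0 s, but a large optical size and a low
# position in the field each bias the judged arrival time earlier.
cues = {"tau": 3.0, "relative_size": 2.2, "height_in_field": 2.6}   # implied TTC, s
weights = {"tau": 0.6, "relative_size": 0.25, "height_in_field": 0.15}
print(f"combined TTC estimate: {combined_ttc_estimate(cues, weights):.2f} s")
```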
Acknowledgment

This chapter was written while the author was supported by the Texas Advanced Research Program under grant No. 003644-0081-2001.
Appendix: Relative Rate of Accretion*
1. From Lee (1980):
θ = R/Z
θ/(dθ/dt) = Z/(dZ/dt) = TTC

2. From the figure:
ω = θ - α
dω/dt = dθ/dt - dα/dt
dα/dt = 0
ω/(dω/dt) = θ/(dθ/dt) - α/(dθ/dt)
ω/(dω/dt) = [θ/(dθ/dt)] * (1 - α/θ)

3. Result:
ω/(dω/dt) = TTC * (1 - α/θ)
1 - α/θ = 1 - [(R - V)/Z]/[R/Z] = V/R
ω/(dω/dt) = TTC * (V/R)

*for small angles

Appendix from DeLucia, Kaiser, Bush, Meyer, and Sweet (2003). Reprinted by permission of The Experimental Psychology Society.
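The derivation can be checked numerically. The sketch below assumes small angles, a constant closing speed, and a stationary background element at a fixed optical angle α from the approach axis; the particular values of R, v, and α are invented for illustration. The angular gap ω between the object's edge and the element, divided by its rate of change, should match TTC scaled by V/R, with V = ωZ the gap expressed at the object's distance.

```python
# Hypothetical numbers: object of half-size R approaching at constant speed v;
# a stationary background element sits at a fixed optical angle alpha from the
# approach axis. omega is the angular gap between the object's edge and the element.
R = 0.5          # m, object half-size
v = 4.0          # m/s, closing speed
alpha = 0.010    # rad, fixed direction of the background element
dt = 1e-4        # s, step for the numerical derivative

for Z in [40.0, 30.0, 20.0, 10.0]:
    ttc = Z / v
    theta = R / Z                        # small-angle optical half-size of the object
    omega = theta - alpha                # angular gap (step 2 of the derivation)
    omega_next = R / (Z - v * dt) - alpha
    d_omega = (omega_next - omega) / dt  # numerical d(omega)/dt
    V = omega * Z                        # gap expressed in object-plane units
    lhs = omega / d_omega
    rhs = ttc * (V / R)
    print(f"Z={Z:5.1f} m   omega/(d_omega/dt)={lhs:7.3f} s   TTC*(V/R)={rhs:7.3f} s")
```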
REFERENCES

Ames, A. (1951). Visual perception and the rotating trapezoidal window. Psychological Monographs, 65, No. 324.
Anderson, N. H. (1974). Algebraic models in perception. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of perception: Psychophysical judgment and measurement (pp. 215-298). NY: Academic Press.
Best, C. J., Crassini, B. & Day, R. H. (2002). The roles of static depth information and object-image relative motion in perception of heading. Journal of Experimental Psychology: Human Perception and Performance, 28, 884-901.
Braunstein, M. L. (1976). Depth perception through motion. NY: Academic Press.
Brooks, L. R. (1967). The suppression of visualization by reading. Quarterly Journal of Experimental Psychology, 19, 289-299.
Bruno, N. & Cutting, J. E. (1988). Minimodularity and the perception of layout. Journal of Experimental Psychology: General, 117, 161-170.
Caird, J. K. & Hancock, P. A. (1994). The perception of arrival time for different oncoming vehicles at an intersection. Ecological Psychology, 6, 83-109.
Cavallo, V. & Laurent, M. (1988). Visual information and skill level in time-to-collision estimation. Perception, 17, 623-632.
Cavallo, V., Mestre, D. & Berthelon, C. (1997). Time-to-collision judgements: Visual and spatiotemporal factors. In T. Rothengatter & E. C. Vaya (Eds.), Traffic and Transport Psychology: Theory and Application (pp. 97-111). Amsterdam: Pergamon.
Cooper, L. A. (1989). Mental models of the structure of visual objects. In B. E. Shepp & S. Ballesteros (Eds.), Object perception: Structure and process. Hillsdale, NJ: Erlbaum.
Cooper, L. A. & Shepard, R. N. (1973). Chronometric studies of the rotation of mental images. In W. G. Chase (Ed.), Visual Information Processing (pp. 75-176). NY: Academic Press.
Coren, S. & Girgus, J. S. (1978). Seeing is deceiving: The psychology of visual illusions. Hillsdale, NJ: Erlbaum.
Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163-191.
Cutting, J. E. & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), Perception of space and motion (pp. 69-117). Series: Handbook of Perception and Cognition (2nd ed.). San Diego, CA: Academic Press.
Cutting, J. E. & Wang, R. F. (2000). Heading judgements in minimal environments: The value of a heuristic when invariants are rare. Perception & Psychophysics, 62, 1146-1159.
DeLucia, P. R. (1989). Pictorial depth cues and motion-produced information for depth perception. Dissertation Abstracts International, 51(3), 1526B (UMI No. 9020517).
DeLucia, P. R. (1991a). Pictorial and motion-based information for depth perception. Journal of Experimental Psychology: Human Perception and Performance, 17, 738-748.
DeLucia, P. R. (1991b). Small near objects can appear farther than large far objects during object motion and self motion: Judgments of object-self and object-object collisions. In P. J. Beek, R. J. Bootsma, & P. C. W. van Wieringen (Eds.), Studies in perception and action: Posters presented at the VIth international conference on event perception and action (pp. 94-100). Amsterdam: Rodopi.
DeLucia, P. R. (1995). Effects of pictorial relative size and ground-intercept information on judgments about potential collisions in perspective displays. Human Factors, 37, 528-538.
DeLucia, P. R. (1999). Size-arrival effects: The potential roles of conflicts between monocular and binocular time-to-contact information, and of computer aliasing. Perception & Psychophysics, 61, 1168-1177.
DeLucia, P. R. (2001). Age differences in judgments about potential collision and implications for driving. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting. Santa Monica, CA: Human Factors and Ergonomics Society.
DeLucia, P. R. (2002). Judgments of time to contact when an approaching object is partially concealed by a static or moving occluder. Journal of Vision, 2(7), 348a, http://journalofvision.org/2/7/348/. DOI 10.1167/2.7.348.
DeLucia, P. R. & Bush, J. M. (1999). A critical analysis of umpire mechanics: Implications for training and research. Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting. Santa Monica, CA: Human Factors and Ergonomics Society.
DeLucia, P. R. & Cochran, E. L. (1985). Perceptual information for batting can be extracted throughout a ball's trajectory. Perceptual and Motor Skills, 61, 143-150.
DeLucia, P. R., Kaiser, M. K., Bush, J. M., Meyer, L. E. & Sweet, B. T. (2000). Information integration in judgments about time to contact. Investigative Ophthalmology & Visual Science, 41, 798.
DeLucia, P. R., Kaiser, M. K., Bush, J. M., Meyer, L. E. & Sweet, B. T. (2003). Information integration in judgments of time to contact. Quarterly Journal of Experimental Psychology: A, 57A(7), 1165-1189.
DeLucia, P. R., Kaiser, M. K., Garcia, A. & Sweet, B. T. (2001). Effects of relative size and height in field on absolute judgments of time to contact [Abstract]. Journal of Vision, 1(3), 321a, http://journalofvision.org/1/3/312, DOI 10.1167/1.3.312.
DeLucia, P. R. & Liddell, G. W. (1998). Cognitive motion extrapolation and cognitive clocking in prediction motion tasks. Journal of Experimental Psychology: Human Perception and Performance, 24, 901-914.
DeLucia, P. R. & Meyer, L. E. (1999). Judgments about the time to contact between two objects during simulated self-motion. Journal of Experimental Psychology: Human Perception and Performance, 25, 1813-1833.
DeLucia, P. R., Meyer, L. E. & Bush, J. M. (2002). Judgments about collisions in simulations of scenes with textured surfaces and self motion: Do display enhancements affect performance? Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting. Santa Monica, CA: Human Factors and Ergonomics Society.
DeLucia, P. R. & Novak, J. B. (1997). Judgments of relative time-to-contact of more than two approaching objects: Toward a method. Perception & Psychophysics, 59, 913-928.
DeLucia, P. R., Tresilian, J. R. & Meyer, L. E. (2000). Geometrical illusions can affect time-to-contact estimation and mimed prehension. Journal of Experimental Psychology: Human Perception and Performance, 26, 552-567.
DeLucia, P. R. & Warren, R. (1994). Pictorial and motion-based depth information during active control of self-motion: Size-arrival effects on collision avoidance. Journal of Experimental Psychology: Human Perception and Performance, 20, 783-798.
Duncan, J. & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96, 433-458.
Epstein, W., Park, J. & Casey, A. (1961). The current status of the size-distance hypotheses. Psychological Bulletin, 58, 491-514.
Flach, J. M., Allen, B. L., Brickman, B. J. & Hutton, R. J. B. (1992). Dynamic occlusion: Active versus passive observers. Insight: The Visual Performance Technical Group Newsletter, 14, 5-7.
Gibson, J. J. (1962). Observations on active touch. Psychological Review, 69, 477-491.
Gibson, J. J. (1966). The senses considered as perceptual systems. Boston: Houghton-Mifflin.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton-Mifflin.
Gilbert, C. D. (1996). Plasticity in visual perception and physiology. Current Opinion in Neurobiology, 6, 269-274.
Gilden, D. L. & Proffitt, D. R. (1989). Understanding collision dynamics. Journal of Experimental Psychology: Human Perception and Performance, 15, 372-383.
Gray, R. & Regan, D. (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499-512.
Gray, R. & Regan, D. (1999). Do monocular time-to-collision estimates necessarily involve perceived distance? Perception, 28, 1257-1264.
Gray, R. & Regan, D. (2000). Simulated self-motion alters perceived time to collision. Current Biology, 10, 587-590.
Grutzmacher, R. P., Geri, G. A. & Pierce, B. J. (2000). Time-to-contact estimates for observer versus target motion. Proceedings of the XIVth Triennial Congress of the International Ergonomics Association and 44th Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Hancock, P. A. & Manser, M. P. (1997). Time-to-contact: More than tau alone. Ecological Psychology, 9, 265-297.
Hastorf, A. H. (1950). The influence of suggestion on the relationship between stimulus size and perceived distance. Journal of Psychology, 29, 195-217.
Hecht, H. (1996). Heuristics and invariants in dynamic event perception: Immunized concepts or nonstatements? Psychonomic Bulletin & Review, 3, 61-70.
Hecht, H., Kaiser, M. K., Savelsbergh, G. J. P. & van der Kamp, J. (2002). The impact of spatiotemporal sampling on time-to-contact judgements. Perception & Psychophysics, 64, 650-666.
Hochberg, J. (1974). Higher-order stimuli and inter-response coupling in the perception of the visual world. In J. J. Gibson, R. B. MacLeod, & H. L. Pick (Eds.), Perception: Essays in honor of J. J. Gibson (pp. 17-39). Ithaca, NY: Cornell University Press.
Hochberg, J. (1978). Perception (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hochberg, J. (1982). How big is a stimulus? In J. Beck (Ed.), Organization and representation in perception (pp. 191-217). Hillsdale, NJ: Erlbaum.
Hochberg, J. (1986). Representation of motion and space in video and cinematic displays. In K. Boff, J. Thomas & L. Kaufman (Eds.), Handbook of perception and human performance (pp. 22-1 to 22-64). Toronto: Wiley.
Hochberg, J. (1987). Machines should not see as people do, but must know how people see. Computer Vision, Graphics, and Image Processing, 37, 221-237.
Hochberg, J. E. & Beck, J. (1954). Apparent spatial arrangement and perceived brightness. Journal of Experimental Psychology, 47, 263-266.
Hochberg, J. & Beer, J. (1991). Illusory rotations from self-produced motion: The Ames window effect in static objects. Proceedings and Abstracts of the Annual Meeting of the Eastern Psychological Association, 62. Glassboro, NJ: Eastern Psychological Association.
Hochberg, J. & DeLucia, P. (1986). The perception of real surfaces' real motions varies with perceived orientation of reversible objects drawn on them: Real and virtual space interact. Proceedings and Abstracts of the Annual Meeting of the Eastern Psychological Association, 57. Glassboro, NJ: Eastern Psychological Association.
Hochberg, C. B. & Hochberg, J. E. (1952). Familiar size and the perception of depth. Journal of Psychology, 34, 107-114.
Hochberg, J. & Peterson, M. A. (1987). Piecemeal organization and cognitive components in object perception: Perceptually coupled responses to moving objects. Journal of Experimental Psychology: General, 116, 370-380.
Ittelson, W. H. (1951a). Size as a cue to distance: Static localization. American Journal of Psychology, 64, 54-67.
Ittelson, W. H. (1951b). Size as a cue to distance: Radial motion. American Journal of Psychology, 64, 188-202.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall.
Kaiser, M. K. & Mowafy, L. (1993). Optical specification of time-to-passage: Observers' sensitivity to global tau. Journal of Experimental Psychology: Human Perception and Performance, 19, 1028-1040.
Kebeck, G. & Landwehr, K. (1992). Optical magnification as event information. Psychological Research, 54, 146-159.
Kerzel, D., Hecht, H. & Kim, N. (1999). Image velocity, not tau, explains arrival-time judgments from global optical flow. Journal of Experimental Psychology: Human Perception and Performance, 25, 1540-1555.
Kilpatrick, F. P. & Ittelson, W. H. (1951). Three demonstrations involving the visual perception of movement. Journal of Experimental Psychology, 42, 394-402.
Law, D. J., Pellegrino, J. W., Mitchell, S. R., Fischer, S. C., McDonald, T. P. & Hunt, E. B. (1993). Perceptual and cognitive factors governing performance in comparative arrival-time judgments. Journal of Experimental Psychology: Human Perception and Performance, 19, 1183-1199.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. (1980). The optic flow field: The foundation of vision. Philosophical Transactions of the Royal Society of London, B290, 169-179.
Lee, D. N. & Young, D. S. (1985). Visual timing of interceptive action. In D. J. Ingle, M. Jeannerod, and D. N. Lee (Eds.), Brain mechanisms and spatial vision (pp. 1-30). Dordrecht: Martinus Nijhoff.
Liddell, G. W. (1998). Interfering and updating cognitive representations used in judgments of absolute time-to-contact in a prediction motion task. Dissertation Abstracts International, 58(10), 5678B. (UMI No. 9812032).
Manser, M. P. & Hancock, P. A. (1996). Influence of approach angle on estimates of time-to-contact. Ecological Psychology, 8, 71-99.
Meyer, L. E. (2001). The effects of retinal eccentricity on judgments about collisions. Dissertation Abstracts International, 62(05), 2523B. (UMI No. 3015713).
Morgan, M. J. (1980). Analogue models of motion perception. Philosophical Transactions of the Royal Society of London, B290, 117-135.
Novak, J. B. (1998). Judgments of absolute time-to-contact in multiple object displays: Evaluating the role of cognitive process in arrival-time judgements. Dissertation Abstracts International, 58(10), 5679B. (UMI No. 9812047).
Peterson, M. A. & Hochberg, J. (1983). Opposed-set measurement procedure: A quantitative analysis of the role of local cues and attention in form perception. Journal of Experimental Psychology: Human Perception and Performance, 9, 183-193.
Previc, F. H. (1998). The neuropsychology of 3-D space. Psychological Bulletin, 124, 123-164.
Regan, D. & Hamstra, S. J. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Research, 33, 447-462.
Regan, D. & Vincent, A. (1995). Visual processing of looming and time to contact throughout the visual field. Vision Research, 35, 1845-1857.
Rosenbaum, D. A. (1975). Perception and extrapolation of velocity and acceleration. Journal of Experimental Psychology: Human Perception and Performance, 1, 395-403.
Runeson, S., Juslin, P. & Olsson, H. (2000). Visual perception of dynamic properties: Cue heuristics versus direct-perceptual competence. Psychological Review, 107, 525-555.
Rushton, S. K. & Wann, J. P. (1999). Weighted combination of size and disparity: A computational model for timing a ball catch. Nature Neuroscience, 2, 186-190.
Schiff, W. (1988). Accuracy of judging time-to-contact in visual and audiovisual events: Event anisotropy. Unpublished manuscript.
Schiff, W. & Detwiler, M. L. (1979). Information used in judging impending collision. Perception, 8, 647-656.
Schiff, W. & Oldak, R. (1990). Accuracy of judging time to arrival: Effects of modality, trajectory, and gender. Journal of Experimental Psychology: Human Perception and Performance, 16, 303-316.
Sedgwick, H. A. (1983). Environment-centered representation of spatial layout: Available visual information from texture and perspective. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision (pp. 425-458). NY: Academic Press.
Segal, S. J. & Fusella, V. (1970). Influence of imaged pictures and sounds on detection of visual and auditory signals. Journal of Experimental Psychology, 83, 458-464.
Shiffrin, R. M. (1975). Capacity limits in information processing, attention, and memory. In W. K. Estes (Ed.), Handbook of learning and cognitive processes: Vol. 4 (pp. 177-235). Hillsdale, NJ: Erlbaum.
Simpson, W. A. (1988). Depth discrimination from optic flow. Perception, 17, 497-512.
Smith, M. R. H., Flach, J. M., Dittman, S. M. & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception and Performance, 27, 395-410.
Smith, W. M. & Gulick, W. L. (1957). Dynamic contour perception. Journal of Experimental Psychology, 53, 145-152.
Stoffregen, T. A. & Riccio, G. E. (1990). Responses to optical looming in the retinal center and periphery. Ecological Psychology, 2, 251-274.
Todd, J. T. (1981). Visual information about moving objects. Journal of Experimental Psychology: Human Perception and Performance, 7, 795-810.
Townsend, J. T. (1974). Issues and models concerning the processing of a finite number of inputs. In B. H. Kantowitz (Ed.), Human information processing: Tutorials in performance and cognition (pp. 133-183). Potomac, MD: Erlbaum.
Treisman, A. M. & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97-136.
Treisman, A. M. & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95, 15-48.
Tresilian, J. R. (1990). Perceptual information for the timing of interceptive action. Perception, 19, 223-239.
Tresilian, J. R. (1991). Empirical and theoretical issues in the perception of time to contact. Journal of Experimental Psychology: Human Perception and Performance, 17, 865-876.
Tresilian, J. R. (1993). Four questions of time to contact: A critical examination of research on interceptive timing. Perception, 22, 653-680.
Tresilian, J. R. (1994). Perceptual and motor processes in interceptive timing. Human Movement Science, 13, 335-373.
Tresilian, J. R. (1995). Perceptual and cognitive processes in time-to-contact estimation: Analysis of prediction-motion and relative judgement tasks. Perception & Psychophysics, 57, 231-245.
Tresilian, J. R. (1999). Analysis of recent empirical challenges to an account of interceptive timing. Perception & Psychophysics, 61, 515-528.
van der Kamp, J., Savelsbergh, G. & Smeets, J. (1997). Multiple information sources in interceptive timing. Human Movement Science, 16, 787-821.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22, 1034-1048.
Warren, W. H. & Kurtz, K. L. (1992). The role of central and peripheral vision in perceiving the direction of self-motion. Perception & Psychophysics, 51, 443-454.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 12

How Now, Broad Tau?

Mary K. Kaiser
NASA Ames Research Center, CA, USA
Walter W. Johnson
NASA Ames Research Center, CA, USA
ABSTRACT

Since David Lee's initial proposal of tau as an optical variable, there have been multiple extensions of the concept. Some of these extensions are anticipated in Lee's original characterization; some are not. Of late, broad classes of "time-to-completion" variables have been embraced as cases of tau. This raises the question of what makes an optical variable tau-like. Is it simply that the variable involves the ratio of two visual angles? Is it that the ratio approaches zero as the anticipated event occurs? And what of perceptual variables in non-visual modalities? Our chapter advocates a fairly restrictive definition of tau, one that recognizes the uniqueness and elegance of the original formulation. Extensions of a mathematical construct should be embraced when they strengthen the original formulation, but eschewed when they compromise the original insight's power and beauty.
1. Introduction

In this chapter, we will consider how Lee's initial characterization of τ (Lee, 1974) has been broadened and expanded - not always to good effect. But readers seeking a denouncement of τ as an optical variable for action control and planning had best search elsewhere: We come to praise τ, not to bury it. For τ researchers are honorable men (and women). And τ - at least in its initial conceptualization - is a notion of unusual beauty and elegance. Some of us first encountered a mathematical derivation of time-to-contact (TTC) based on image expansion in Fred Hoyle's science fiction novel, "The Black Cloud" (Hoyle, 1957). In Hoyle's story, a killer cosmic cloud is on a collision course with Earth; the best and brightest scientists are called upon to estimate how long it will take the black cloud to reach our blue planet. Discussion focuses on using Doppler shift data to estimate the cloud's speed, until the clever Dr. Weichart pipes up: "Sorry, I don't understand all this," broke in Weichart. "I don't see why you need the speed of the cloud. You can calculate straight away how long the cloud is going to take to reach us." (p. 22) Weichart compares two images taken one month apart, and notes a five percent increase in size. Then, right there on page 23, the "details of Weichart's remarks and work while at the blackboard" provide a derivation of τ. The scientist concludes, "the black cloud will be here by August, 1965." (Fortunately, this being a work of fiction, Earth survived.) By the time the novel ends (167 pages and 58 years later), few readers remember Weichart's clever derivation. Fewer still could imagine its implications for the control and planning of action. That was the amazing insight David Lee provided in his seminal work (Lee, 1974, 1976, 1980). The incredible thing about the optical variable τ is that it completely recast a classical problem of human perception and action. Traditionally, the problem of timing for interception had been characterized in terms of distance and velocity. The challenge for the perceptual psychologist was to determine which visual cues allowed the performer to estimate these parameters. But perhaps this was not the most appropriate characterization of the problem. Perhaps the problem needed to be analyzed along a non-spatial dimension. This is the re-characterization τ provides. Just as Fourier analysis characterizes spatial image information in the frequency domain, τ characterizes optical information in the temporal domain. Distance and speed are no longer the relevant parameters: time is. This redefinition has substantial functional utility for human performance, for while action is spatially executed, it is temporally orchestrated. We can therefore
appreciate what a revelation it was to identify an optical variable (and, potentially, a class of optical variables) that can serve as the basis for the planning and execution of action.

The four temptations of τ

Revelation naturally generates enthusiasm. Enthusiasm, unless properly tempered, can lead to excesses. And this is the crux of our cautionary tale: a construct as elegant and compelling as τ can seduce researchers to commit sins of both theory and experimentation. We will consider four critical dangers to τ: 1) the over-extension of the construct; 2) the blurring of distal and optical variables; 3) the construction of overly complex optical variables; and 4) the assumption of temporal estimation.
2. Over-extension

The initial formulation of optical τ dealt with the expansion of an object's image as it approaches an observer. The TTC is estimated by the ratio of the image size and its rate of expansion. This form of τ is now often referred to as "local τ" (or τL), since the TTC information is specified by features local to the target image. Image size can be defined in terms of area or cross-angle, in polar coordinates or Euclidean. So long as certain underlying assumptions are met (e.g., constant approach velocity, no concurrent change in image size due to other object transformation, adherence to the law of small angles), local τ estimates remain robust. It quickly became apparent that the geometry of τ was extensible to situations where an object would not collide with the observer, but merely pass by as the observer moves through the environment. In such cases, the time-to-passage (TTP) is estimated by the ratio of the angle between the object and the observer's heading/track vector, and the rate of change of that angle. This form of optical τ was termed "global τ" (τG), since it is now necessary to recover information from the global flow field (i.e., one must determine the observer's direction of motion - usually concurrent with the focus of expansion for a linearly moving observer) in order to compute τ. If the observer is stationary (and the target is approaching on a passage course) it is still possible, in principle, to compute TTP: one must simply define an "eye plane" for the observer. The line orthogonal to this plane from the point of observation then serves the same function as the heading/track vector in computing τG.
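As a concrete illustration of the two forms just described, the sketch below computes local τ from an image's relative expansion rate and global τ from the bearing angle off a heading vector. The numbers (growth rate, speed, offsets) are invented for illustration, the small-angle and constant-velocity assumptions of the text are taken for granted, and global τ matches the actual time to passage only approximately outside the small-angle limit.

```python
import math

def local_tau(theta, theta_dot):
    """Local tau: image size divided by its rate of expansion."""
    return theta / theta_dot

def global_tau(bearing, bearing_dot):
    """Global tau (time to passage): angle off the heading vector over its rate of change."""
    return bearing / bearing_dot

# Local tau, Hoyle-style: an image growing 5% per month implies, to first order,
# about 1 / 0.05 = 20 months to contact (constant approach speed assumed).
theta, theta_dot = 1.0, 0.05          # arbitrary units; only the ratio matters
print("local tau:", local_tau(theta, theta_dot), "months")

# Global tau for a passing object (hypothetical geometry): the observer moves along
# +Z at speed v; the object sits at lateral offset X and distance Z ahead.
v, X, Z = 10.0, 5.0, 50.0             # m/s, m, m
bearing = math.atan2(X, Z)            # angle between object and heading direction
bearing_dot = (X * v) / (X**2 + Z**2) # d/dt of atan(X / (Z - v*t)) evaluated at t = 0
print("global tau:", round(global_tau(bearing, bearing_dot), 2),
      "s   actual time to passage:", Z / v, "s")
```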
This sequence of development (from local τ, to global τ for a moving observer, to global τ for a stationary observer) demonstrates both how enticing and how suspect it is to extend the construct of τ. At first glance, these extensions (from local τ to global τ for a moving observer, from global τ for a moving observer to a stationary one) are mathematically straightforward. To extend the τ concept from local to global, simply substitute the angular offset from the heading vector for image size; to extend from a moving observer to a stationary one, simply substitute a sight line for the heading vector. But let us consider the perceptual competencies involved for each case. Local τ, as the name implies, operates on local image expansion. Information need not be integrated over regions of the visual field. Further, there is evidence that the human visual system possesses receptor mechanisms specifically sensitive to local expansions (Wang & Frost, 1992). The fact, then, that people are sensitive to local τ information in no way implies that they have the capability to process global τ information. Similarly, demonstrating that observers are able to make reasonably accurate global τ judgments as they move through the environment does not necessarily imply competence for stationary passage judgments. Whereas a sight line holds mathematical equivalence to a heading vector, it is critical to note that the heading vector is specified by the optical flow field: the sight line is not. Thus, in the progression from local τ to moving global τ to stationary τ, we make trivial mathematical substitutions with profound perceptual assumptions. We first assume a perceptual sensitivity for local image expansion will generalize to the angular distance between target and track vector, then assume further that observers are able to extract a visual location that is not specified directly in the optic array. The construct of τ has also been extended to the acoustic domain (Shaw, McGowan, & Turvey, 1991). Again, the mathematical substitution is straightforward: substitute sound intensity (I) for image size in Lee's formulation. The result specifies the TTC with a sound-emitting source moving towards a listener on a linear trajectory at a constant velocity:

τ = 2I/(dI/dt)    (1)
However, Guski (1992) questions whether the human auditory system can, in fact, utilize this kind of information. Thus, two cautions must be taken in this sort of τ-variable substitution: First, it is critical to ensure that the proposed variable is, in principle, available in the perceptual array; second, one must demonstrate that human observers are, in fact, sensitive to this information.
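A quick numeric check shows why Equation (1) carries the factor of 2: if intensity falls off with the inverse square of distance (an idealized point source with constant power and no reflections - all assumptions of this sketch, with arbitrary numbers), then 2I/(dI/dt) equals distance over closing speed, i.e., TTC.

```python
# Check of the acoustic substitution in Equation (1): with intensity following an
# inverse-square law, I = k / Z**2, the quantity 2I/(dI/dt) recovers Z / closing speed.
k = 1.0        # source constant (arbitrary units)
v = 6.0        # m/s, closing speed of the sound source
dt = 1e-5      # s, step for the numerical derivative

for Z in [60.0, 30.0, 12.0]:
    I_now = k / Z**2
    I_next = k / (Z - v * dt)**2
    dI_dt = (I_next - I_now) / dt
    print(f"Z={Z:5.1f} m   2I/(dI/dt)={2 * I_now / dI_dt:6.3f} s   Z/v={Z / v:6.3f} s")
```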
Some extensions of τ do not maintain the mathematical characteristics of the original construct. One beauty of the τ formulation is that, even though the numerator is an increasing quantity, the denominator's growth rate greatly exceeds the numerator's, permitting the variable to approach zero. A more general form of τ has been proposed to describe gap closure (Lee, 1998): here, time to closure is simply the current gap size divided by the (approximately constant) rate of closure. As gap size decreases, time to closure (naturally) approaches zero. While this characterization is certainly mathematically correct, it is also trivial. Granted, Lee developed this generalization in the context of a theory of action coordination (via coordinating, or coupling, multiple taus). Still, when one extends the notion of τ so broadly, the concept loses its elegance. It is hardly a revelation that a person who has seventy dollars and is spending ten dollars a day will be broke in a week. It is a bit more surprising (and sobering) to realize that a person who increases his income 3% a year in a world with 12% annual inflation will lose half his buying power in eight years. The initial formulation of τ, based on image or angular expansion, offers the elegance of revelation; time-to-closure τ, based on gap-size reduction, does not.
3. Blurring of distal and optical variables

All problems in perceptual psychology must meet the challenge of mapping distal variables to proximal (e.g., optical, auditory) stimuli. In most traditional visual analyses, this mapping is fairly simple. There are, for example, straightforward geometric mappings between image size and object size and distance. The challenge for τ variables is a bit more daunting. Temporal events are characterized by rates of change - the time it takes for an approaching ball to reach you is defined by its current distance and the rate of change of that distance (i.e., the ball's velocity). Fortunately, as Lee's analyses demonstrated, the ball's velocity maps to rate of relative image expansion. In concert with current image size, this optical variable maps to the temporal event of interest: TTC. It is, however, incumbent on researchers to demonstrate that the optical components of the proposed τ variable are, in fact, available in the perceptual array. And it is critical to remember that the optical array is sampled at the eyepoint of the actor. In their paper on movement coordination, Lee and his colleagues proposed that people guide their hand-to-mouth movement by coupling the τ of the hand-mouth gap with the angular gap to be closed by steering the hand (Lee,
Craig, & Grealy, 1999). However, when people demonstrate equivalent behavior with eyes open or closed, Lee et al. postulated that the muscle spindle system must provide sufficient information to specify the taus of movement gaps. The only evidence they offer in this regard is Boyd's study of isolated muscle spindles, which suggests the structure possesses features of a servo-control system (Boyd, 1980). This is a provocative suggestion, but it hardly justifies developing a coordination model based on optical variables only to conclude that the proposed movement-gap taus must also be sensed and regulated via non-visual modalities.
4. Construction of overly complex optical variables

In the section on over-extension, we questioned Lee's development of overly simplistic gap-closure variables, which we argued lacked the mathematical elegance of expansion-based τ. To prove that we are equal-opportunity critics, we now consider whether some τ-like variables that have been proposed suffer from an excess of complexity. As a number of researchers have argued, the initial τ formulation could not serve as an adequate control variable for a number of important activities (e.g., Tresilian, 1991; van der Kamp, Savelsbergh, & Smeets, 1997). Take the example of catching an approaching ball. Typically, people catch a ball with their hand displaced from their eyepoint. Further, some of the catching action must be orchestrated when the ball is so close that the law of small angles (a mathematical assumption of τ) no longer applies. Both the spatial displacement and the non-small angles bias the τ value. While there are numerous ways to compensate for this bias (see "Flying Buttresses" below), the initial ecological approach to tau's inadequacy was to search for a solution within the perceptual array: that is, to identify an optical variable that accurately characterizes time-to-collision in this situation. Bootsma and Oudejans (1993) proposed a candidate variable:

1/τ_margin = d(ln φ)/dt - d(ln θ)/dt    (2)
where φ is the optical angle subtended by the ball at the point of observation, and θ is the angle (at the point of observation) between the ball's center and the catch point. To restate this in a form more akin to Lee's original derivation:

1/τ_margin = (dφ/dt)/φ - (dθ/dt)/θ    (3)
Note that when the offset is zero, the second term of the equation disappears and we're left with the basic τ formula. But when there is an offset, observers must be able to coordinate their actions based on the difference between two τ-like quantities (basically, a local τ and a stationary global τ where the "sight line" is defined by the catch point). Once again, what is proposed is a fairly straightforward mathematical extension, but it presupposes a perceptual competence that is suspect. In general, human perceptual systems recognize equivalence (or its lack) and operate on ratios; this proposed variable requires observers to extract a metric difference. We would suggest that optical variables that require arithmetic operations, or that are based on higher-order derivatives (such as acceleration), are likely of little use to human observers (Hecht, Kaiser, & Banks, 1996); they are beyond our perceptual ken.
5. Assumption of temporal estimation

There is often a very confused stance with regard to the role of τ in timing and time estimation. Timing only requires knowing when to initiate some action or activity. It does not require knowing how long since, or until, another event occurs. Thus, an animal needs no knowledge of the linkage between the value of τ (an optical variable) and a value of time. This is in contrast to time estimation, which does require estimating the duration of some just completed event, or how long it will be before some upcoming event occurs. For example, in the famous 'plummeting gannets' hypothesis (Lee and Reddish, 1981), there was a trigger level of τ at which the gannets were proposed to fold their wings. And in Lee's later work on hitting an accelerating ball (Lee, Young, Reddish, Lough and Clayton, 1983) there was again the use of τ as a trigger at which actions would begin. It is very important to note that neither of these required any form of temporal estimation, nor any understanding of the linkage between the values of the optical variable and the associated TTC. This brings us to a second fact about τ that, while not totally overlooked, is often brushed aside. What aspect of the optical world do animals actually respond to when controlling timing? It is often presumed that it is the TTC predicted by τ proper. In fact, the work of Owen (1987) suggests that the optical variable may be the proportional rate of optical expansion, and not its inverse. In studies of the detection of altitude change, Owen found that sensitivity, indexed by either the time required to detect that an altitude change was in progress (RT), or detection accuracy (Ag), was a linear function of the proportional rate of change in altitude. That is:
Ag = k * (Ż/Z) + b    (4)
RT = k * (Ż/Z) + b    (5)

where k and b are constants, Z is altitude and Ż is vertical speed. Of course, Ż/Z is also proportional to the proportional rate of optical expansion (O), thus:

Ag = k * O + b    (6)
RT = k * O + b    (7)

Thus, for this task, sensitivity scaled with the inverse of τ proper, suggesting a fundamental sensitivity to the optical expansion variable. If it had scaled with τ, then we should expect:

Ag = k * O^(-1) + b    (8)
RT = k * O^(-1) + b    (9)
But this is not what Owen found. Instead he found evidence that people responded to the proportional rate of optical expansion. This, of course, does not mean that temporal estimation is never required. But temporal estimation is only required when an action cannot be initiated concurrently with the trigger level of the optical variable. For example, if a photographer is trying to snap the picture of an approaching runner as she crosses a finish line, the photographer might need to observe the runner, estimate the time until she crosses the finish line, refocus the camera on the finish line (blurring the image of the runner), and begin to snap before the runner comes into focus again. In such a case the photographer might attempt to estimate τ proper (the amount of time needed to reach the finish line), and then count it off. This would involve the two additional operations of time estimation (converting optical τ into a time estimate) and counting (see Johnson, 1986, for an examination of such timing). However, only in such cases where we need to make temporal predictions do we need the inverse of the proportional expansion rate, or τ proper. And even in these cases, we could learn the optical expansion rates associated with the time needed to complete critical physical activities, such as counting to three.
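The distinction between timing and time estimation can be stated compactly in code. In the sketch below (all quantities hypothetical), the "timing" branch simply acts when the relative expansion rate crosses a learned criterion and never converts anything into seconds, whereas the "time estimation" branch reads τ once, converts it to a duration, and would then have to count that duration off.

```python
def approach(Z0, v, dt=0.01):
    """Generate (t, theta, theta_dot) samples for a unit-size object approaching at speed v."""
    t, Z = 0.0, Z0
    while Z > 0.5:
        theta = 1.0 / Z          # small-angle optical size of a unit object
        theta_dot = v / Z**2     # its rate of expansion
        yield t, theta, theta_dot
        t, Z = t + dt, Z - v * dt

Z0, v = 30.0, 5.0

# Timing: act when the relative expansion rate (1/tau) crosses a learned criterion.
CRITERION = 0.5  # 1/s, hypothetical trigger level
for t, theta, theta_dot in approach(Z0, v):
    if theta_dot / theta >= CRITERION:
        print(f"timing: trigger at t={t:.2f} s (no clock required)")
        break

# Time estimation: read tau once, convert it to seconds, and count it off.
_, theta, theta_dot = next(iter(approach(Z0, v)))
tau_estimate = theta / theta_dot   # seconds until contact, to first order
print(f"time estimation: tau read at t=0 is {tau_estimate:.1f} s; act after counting that off")
```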
Flying buttresses for the temple of τ

There's an old adage that reminds us: when you get a shiny new hammer, everything looks like a nail. Likewise, when you discover an optical variable with the shiny elegance of τ, it is tempting to view the world through τ-colored glasses. Tau appears a viable candidate for the coordination of all action. Lee (1998) proposed a general τ theory for prospective guidance of movement, based on the coordination (τ coupling) of several τ variables (e.g., keeping the taus of gaps in a constant ratio during a movement). As we discussed earlier, Lee's proposal requires extending the concept of τ to a class of non-expansion optical variables - τ becomes a broader basis of control because more types of optical information are defined as τ. If all motion to an end-point is characterized as a case of τ, our τ-hammer will indeed be very busy. But identifying these broadly defined taus, and noting their correlation during action, may blind us to people's actual coordination strategies. We proposed that expansion-based taus and optical gap closing are but a subset of the information people use in planning and executing coordinated actions. Humans opportunistically combine their control strategies. In addition to the timing information provided by taus (general and specific), there are myriad ways to orchestrate our actions. We shall consider three: heuristics, closed-loop corrections, and sequential-variable (or piecewise) strategies.
6. Heuristics

In a previous section, we raised the concern of ensuring human observers are sensitive to the optical (or acoustic) variables proposed in τ (and τ-like) formulations. We explore that issue further here and suggest that, in cases where people are not sensitive to the higher-order motion information required for a τ-based solution, they employ simpler heuristics which are "good enough" to guide behavior. One of Lee's first extensions of τ was to propose that its temporal derivative, τ(dot), could be used as an optical variable to control braking (Lee, 1976). Lee proposed that maintaining a τ(dot) of -0.5 ensures a safe and efficient stop, and there is evidence that observers can differentiate between approach events with τ(dot) values above and below that critical value (Kim, Turvey, & Carello, 1993). But further analysis shows that any constant τ(dot) between -1.0 and 0 will result in a velocity of zero at the point of contact (a numerical sketch of this appears at the end of this section). Although some of these approaches would require decelerations that are physically unrealizable, the fact remains that there is nothing "critical" about a
τ(dot) of -0.5, save that it results in both zero velocity and constant deceleration (Kaiser & Phatak, 1993). More to the point, there is little evidence that people utilize braking strategies that maintain a constant τ(dot) value of -0.5, or any other value. In fact, drivers' and pilots' braking behavior is highly idiosyncratic, reflecting not only the dynamics of their vehicle, but also level of experience and control style (Moen, DiCarlo, & Yenni, 1976; Spurr, 1969). Yilmaz and Warren (1995) studied braking behavior in a driving simulation and reported that participants' mean τ(dot) during braking was "close to the expected value" of -0.5. However, individuals' means ranged from -0.35 to -0.61. Further, these means simply reflected the slope of a regression line fitted to a window of the τ by time-before-stopping function. More worrisome, their analyses indicated that a change in τ(dot) of 0.24 (i.e., approximately 50%) was required to elicit a braking adjustment; this hardly seems a sufficiently sensitive response for a proposed control variable. Helicopter pilots are taught a simple rule-of-thumb for visually controlling their speed during a landing: Keep the visual scene moving at a constant rate of a comfortable walk. In other words, they are instructed to maintain a flow rate of two eyeheights per second; as eyeheight decreases, velocity does so proportionately. This strategy, or heuristic, makes landing a helicopter a very different visual experience than landing a fixed-wing aircraft. In the latter case, velocity is held constant as the plane approaches the touchdown point, and pilots experience a "ground rush" as optic flow rate increases. The helicopter pilot, in contrast, seeks to minimize velocity at the point of touchdown. The heuristic of maintaining a constant flow rate helps the pilot "bleed off" velocity (and energy) during approach, such that he will have his craft in the proper state to successfully execute the final landing phase. (We will consider such sequential-variable strategies further in a later section.) The human visual system often makes simplifying assumptions when processing complex motion information. In principle, an observer could use the gravitational acceleration of a falling object to judge its size and distance. But given people's relative insensitivity to observed motion acceleration (Calderone & Kaiser, 1989), it is not surprising that such size/distance estimates are based on average image velocity (Hecht, Kaiser, & Banks, 1996). Similarly, people ignore acceleration when making time-to-passage judgments; their estimates reflect the simplifying assumption of constant velocity (Kaiser & Hecht, 1995; Lee, Young, Reddish, Lough, & Clayton, 1983). Tau-based information is just one of several sources people use to estimate time-to-contact (or passage). Even when τ is the most reliable predictor, other factors influence judgments. Object properties (such as size) have been shown to bias TTC estimates (DeLucia & Warren, 1994), as has object and observer speed (when TTC is kept constant; Saidpour & Andersen,
2002). Nor do people necessarily select the most reliable τ information; when shown passage events in which local τ is corrupted by rotational image transformations, observers still base their judgments on this rather than on the veridical global τ information (Kaiser & Hecht, 1994). People integrate numerous sources of information in order to perceive depth (Bruno & Cutting, 1988; Landy, Maloney, Johnston, & Young, 1995; Massaro & Cohen, 1993). Some of these information sources are not terribly reliable; some (like static occlusion) only specify depth order. But the human perceptual system is sufficiently intelligent to exploit appropriate cues to achieve the speed and precision of depth perception necessary to perform the task at hand (Cutting & Vishton, 1995; Sweet, Kaiser, & Davis, 2003). It is reasonable to assume that we are equally opportunistic in selecting optical information to guide and orchestrate our actions.
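The braking claim made at the start of this section - that any constant τ̇ between -1.0 and 0 brings the closing speed to zero exactly at contact, and that only τ̇ = -0.5 also implies a constant deceleration - can be checked with a few lines of arithmetic. The sketch below is ours, not the authors'; the initial distance and speed are arbitrary illustrative values, and only the kinematic relations in the comments are assumed.

```python
# A minimal numeric sketch (ours, not from the chapter) of the braking claim.
# With tau = D / v held so that d(tau)/dt = c, simple kinematics give
#   v(D) = v0 * (D / D0) ** (1 + c)     closing speed as a function of distance
#   a(D) = (1 + c) * v(D) ** 2 / D      deceleration needed to hold tau-dot at c
# D0 and v0 are arbitrary illustrative starting values.

D0, V0 = 50.0, 10.0   # initial distance (m) and closing speed (m/s)

def speed(D, c):
    return V0 * (D / D0) ** (1.0 + c)

def required_decel(D, c):
    v = speed(D, c)
    return (1.0 + c) * v * v / D

for c in (-0.25, -0.5, -0.75):
    print(f"constant tau-dot = {c}")
    for D in (50.0, 10.0, 1.0, 0.01):
        print(f"  D = {D:6.2f} m   v = {speed(D, c):7.4f} m/s   decel = {required_decel(D, c):8.3f} m/s^2")
```

Speed falls to zero at contact for every c between -1 and 0; for c above -0.5 the required deceleration fades out near contact, for c below -0.5 it grows without bound (the physically unrealizable cases), and only c = -0.5 holds it constant throughout the approach.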
7. Closed-loop corrections
Closely related to the issue of heuristics is that of closed-loop control. There is often an attempt to account for successful behavior by assuming an animal is directly aware of, or perceives, optical variables uniquely specifying kinematic states. This is, of course, not necessary. When controlling behavior, all that is needed is a variable that is sufficiently linked to the goal of the behavior. To return to the case of braking control, the above analysis shows that it is unnecessary to maintain any constant value of τ̇, or τ, in order to achieve safe or efficient braking. In particular, holding τ to a range of acceptable values is almost certainly good enough. For example, when controlling braking behavior the algorithm might be:

Repeat
    Observe the relative optical expansion rate (1/τ)
    If it exceeds an upper critical value, add brakes
    Else if it is less than a lower critical value, reduce brakes
    Else do nothing
Until stopped

The important element of this algorithm is that it will work for a wide range of upper and lower critical values. In fact, the main trick is to select a range which will not result in too much initial braking (leaving a long period of final slow convergence) or too little initial braking (resulting in a rapid and
potentially dangerous deceleration at the end). The principle of closed-loop control that is of greatest theoretical value here is that it avoids requiring hard computations or learning highly complex stimulus relationships by the person or animal. The only thing that must be learned is a general strategy with a loose requirement. This is what makes closed-loop approaches so powerful - there is no need for overly precise control because the animal is allowed to correct and home in over time. Closed-loop control strategies are also related to the value of heuristics: a simpler variable, combined with the guidance provided by closed-loop correction, is frequently, perhaps typically, good enough. The alternative requirement of discovering, or computing, a complex variable which can work without correction (i.e., is ballistic) is probably not worth the effort, unless the exact method of control is as important as the final goal. Such a ballistic solution is also unlikely because, even if the exact optical variable could be determined, errors in execution and outside disturbances (e.g., variable winds) would make closed-loop control necessary in any case.
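A sketch of how such a rule might be realized follows. It is our own illustration of the algorithm described above, not a model from the chapter; the thresholds, brake increment, and vehicle parameters are assumed values, and only the structure - monitor 1/τ, nudge the brakes when it leaves an acceptable band, otherwise do nothing - is the point.

```python
# An illustrative closed-loop braking controller (all parameter values assumed).
def closed_loop_brake(d0=60.0, v0=20.0, dt=0.05,
                      upper=0.40, lower=0.25, brake_step=0.5, max_decel=8.0):
    d, v, decel = d0, v0, 0.0
    while d > 0.1 and v > 0.05:            # "until stopped" (or contact reached)
        expansion_rate = v / d             # 1/tau for a head-on approach
        if expansion_rate > upper:         # closing too fast: add brakes
            decel = min(decel + brake_step, max_decel)
        elif expansion_rate < lower:       # closing too slowly: ease off
            decel = max(decel - brake_step, 0.0)
        # else: do nothing, the current brake setting is acceptable
        v = max(v - decel * dt, 0.0)
        d -= v * dt
    return d, v

d_final, v_final = closed_loop_brake()
print(f"stopped with {d_final:.1f} m to spare at {v_final:.2f} m/s")
```

The same loop works for a wide range of upper and lower critical values; choosing the band mainly trades a long, slow final convergence (too much early braking) against a hard stop at the end (too little).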
8. Sequential-variable (piecewise) strategies
Johnson and Andre (1993) also found evidence for a sequential-variable, or piecewise, strategy in the control of deceleration for a helicopter. In this study helicopter pilots flew simulation scenarios in which they were required to descend from a cruise to a hover just above a specified point on the ground. The data from this study did not support any simple τ or τ̇ control strategy. Instead, it appeared that the pilots were utilizing a piecewise strategy: an initial segment in which τ was not maintained (but which bled off speed at a much slower rate), followed at the very end by a more or less constant τ descent which bled off speed much more rapidly. When debriefed, the pilots explained that they typically decelerate early, and more slowly, and then follow this with a constant "visual speed" segment. The reason for this piecewise strategy was that beginning a constant visual speed segment early leads to too much initial deceleration, and thus a very long time to reach the hover point; beginning the constant visual speed segment much later may lead to too much braking at the last moment. Furthermore, they reported that they normally executed the first part of the strategy by maintaining a constant pitch angle. That is, they exploited a characteristic of the vehicle's dynamics - constant pitch results in the desired deceleration - to control the helicopter. However, even though this heuristic was not available in this simulation (i.e., pitch angle was not allowed to change), pilots demonstrated the same deceleration profiles.
This sequential strategy points out how 1) the nature of the task (needing to slow to a hover, but not take too long to accomplish it, or require too much last-minute braking), and 2) the nature of the vehicle's dynamics (i.e., being able to use a constant pitch angle to bleed off speed at a higher rate early on, but not being able to comfortably decelerate at arbitrarily high rates), generate more complex braking strategies. Thus direct control of τ is just one of a range of possible braking strategies, and is used when it is deemed most useful.
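The two-part profile described above can be caricatured in a few lines. The sketch below is ours, not a model fitted to Johnson and Andre's data: a mild constant deceleration stands in for the constant-pitch segment and is handed off to a constant-flow-rate (constant v/d) segment; all numbers are arbitrary.

```python
# An illustrative two-segment deceleration profile (assumed parameter values).
def piecewise_approach(d0=400.0, v0=30.0, early_decel=0.5, flow_rate=0.25, dt=0.1):
    d, v, t, log = d0, v0, 0.0, []
    while d > 0.5 and v > 0.05:
        if v / d < flow_rate:              # segment 1: gentle constant deceleration
            v = max(v - early_decel * dt, 0.0)
        else:                              # segment 2: hold flow rate, so v = k * d
            v = flow_rate * d
        d -= v * dt
        t += dt
        log.append((t, d, v))
    return log

for t, d, v in piecewise_approach()[::60]:
    print(f"t = {t:5.1f} s   distance = {d:6.1f} m   speed = {v:5.2f} m/s")
```

Speed bleeds off slowly in the first segment and then collapses toward zero at the hover point once the constant-flow-rate segment takes over.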
9. Conclusion
Some readers may feel that we've failed our initial promise to praise the concept of τ, that too much of our chapter dwells on caveats and concerns. It's not the case that we love τ too little, but rather that we fear some researchers love τ too much. Such excessive love can, as we've discussed, lead to the seven deadly sins of τ:
• Extensions of τ that are mathematically trivial but perceptually suspect.
• Extensions of the τ concept to non-expansion situations.
• Assuming rather than demonstrating that τ-relevant variables are present in the perceptual array.
• Assuming rather than demonstrating that the human visual system can process complex and/or higher-order optical variables.
• Assuming rather than demonstrating that people make accurate and reliable temporal estimates.
• Failing to consider simpler heuristics people might employ (which result in performance correlated with proposed τ variables).
• Failing to consider the wealth of additional information sources people use in concert with τ to orchestrate their closed-loop control of action.
Tau is an elegant and powerful construct that should not be loved to death. But it should be celebrated. To that end, we offer this humble ode:
In praise of τ

How long is it now from there until here?
When will we bump or go thump or pass near?
The time to go pow
Is all there in τ.
But do we know how
To use the right τ?
Local or global, are we moving or it's?
Use global for passage, local for hits.
It's coming at us? Hell! Better use τL.
Wait - it's passing us! Whee! We can use τG.
Moving through the world, we use many sources
To tell us what's near and what's far on our courses.
Among these, the taus, general and specific...
Optical variables we deem most terrific.
Acknowledgement
Funding for this work was provided by the Airspace Operations Systems Project of NASA's Airspace Systems Program.
REFERENCES
Boyd, I. A. (1980). The isolated mammalian muscle spindle. Trends in Neurosciences, 3, 258-265.
Bruno, N. & Cutting, J. E. (1988). Minimodularity and the perception of layout. Journal of Experimental Psychology: General, 117, 161-170.
Calderone, J. B. & Kaiser, M. K. (1989). Visual acceleration detection: Effects of sign and motion orientation. Perception & Psychophysics, 45, 391-394.
Cutting, J. E. & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein and S. Rogers (Eds.), Handbook of Perception and Cognition: Volume 5, Perception of Space and Motion. New York: Academic Press.
DeLucia, P. R. & Warren, R. (1994). Pictorial and motion-based information during active control of self-motion: Size-arrival effects on collision avoidance. Journal of Experimental Psychology: Human Perception and Performance, 20, 783-798.
Guski, R. (1992). Acoustic tau: An easy analogue to visual tau? Ecological Psychology, 4, 189-197.
Hecht, H., Kaiser, M. K. & Banks, M. S. (1996). Gravitational acceleration as a cue for absolute size and distance. Perception & Psychophysics, 58, 1066-1075.
Hoyle, F. (1957). The black cloud. London: Heinemann.
Johnson, W. W. (1986). Studies in motion extrapolation. Unpublished doctoral dissertation. The Ohio State University, Columbus, Ohio.
Johnson, W. W. & Andre, A. D. (1993). Visual cueing aids to rotorcraft landings. In C. L. Blanken, J. V. Lebacqz, R. H. Stroub & M. S. Whalley (Eds.), Proceedings of piloting vertical flight aircraft: A conference on flying qualities and human factors (pp. 4.1-4.20). San Francisco, California: American Helicopter Society/NASA.
Kaiser, M. K. & Hecht, H. (1994). Time-to-passage judgments under violations of various constancy assumptions. Presented at the 35th Annual Meeting of the Psychonomic Society.
Kaiser, M. K. & Hecht, H. (1995). Time-to-passage judgments in non-constant optical flow fields. Perception & Psychophysics, 57, 817-825.
Kaiser, M. K. & Phatak, A. V. (1993). Things that go bump in the light: On the optical specification of contact severity. Journal of Experimental Psychology: Human Perception and Performance, 19, 194-202.
Kim, N-G., Turvey, M. T. & Carello, C. (1993). Optical information about the severity of upcoming contacts. Journal of Experimental Psychology: Human Perception and Performance, 19, 179-193.
Landy, M. S., Maloney, L. T., Johnston, E. B. & Young, M. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research, 35, 389-412.
Lee, D. N. (1974). Visual information during locomotion. In R. B. MacLeod & H. Pick (Eds.), Perception: Essays in honor of J. J. Gibson (pp. 250-269). Ithaca, NY: Cornell University Press.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. (1980). The optic flow field: The foundation of vision. Philosophical Transactions of the Royal Society of London B, 290, 169-179.
Lee, D. N. & Reddish, P. E. (1981). Plummeting gannets: a paradigm of ecological optics. Nature, 292, 293-297.
Lee, D. N. (1998). Guiding movements by coupling taus. Ecological Psychology, 10, 221-250.
Lee, D. N., Craig, C. M. & Grealy, M. A. (1999). Sensory and intrinsic coordination of movement. Proceedings of the Royal Society of London B, 266, 2029-2035.
Lee, D. N., Young, D. S., Reddish, P. E., Lough, S. & Clayton, T. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Moen, G. C., DiCarlo, D. J. & Yenni, K. R. (1976). A parametric analysis of visual approaches for helicopters (NASA Technical Note D-8275). Langley, VA: NASA.
Owen, Freeman, Zaff & Wolpert (1987). Perception and control of simulated self motion (AFHRL-TR-87-16). Williams AFB, AZ: Operations Training Division, Air Force Human Resources Laboratory.
Saidpour, A. & Andersen, G. J. (2002). Use of speed information in detecting collision events. Presented at the 2nd Annual Meeting of the Vision Sciences Society.
Spurr, R. T. (1969). Subjective aspects of braking. Automobile Engineer, 59, 58-61.
Sweet, B. T., Kaiser, M. K. & Davis, W. (2003). Modeling of Depth Cue Integration in Manual Control Tasks. NASA Technical Memorandum 211407.
Wang, Y. & Frost, B. J. (1992). Time to collision is signalled by neurons in the nucleus rotundus of pigeons. Nature, 356, 236-238.
Yilmaz, E. H. & Warren, W. H. (1995). Visual control of braking: A test of the tau-dot hypothesis. Journal of Experimental Psychology: Human Perception and Performance, 21, 996-1014.
Time-to-Contact - H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 13 The Use of Binocular Time-to-Contact Information
Rob Gray Arizona State University East, Mesa, AZ, USA
David Regan York University, Toronto, Canada
ABSTRACT
Early research on time-to-contact (TTC) focused primarily on the monocular information provided by an approaching object's fractional rate of change of retinal image size (i.e. τ). More recently, it has been shown that binocular information provided by an approaching object's rate of change of retinal disparity is also used. Furthermore, it has been demonstrated that in some common everyday situations (e.g. catching a tumbling rugby ball) binocular information about TTC can compensate for the ineffectiveness of τ. In this chapter we discuss psychophysical studies on the use of binocular TTC information. We review how the physics of geometrical optics combines with the physiological dynamic properties of the mechanisms sensitive to changing-size and changing-disparity to differentially weight the effectiveness of monocular and binocular information about TTC in different viewing situations. Finally, we discuss anecdotal evidence that supports the idea that binocular information about TTC is important in everyday life for actions such as hitting and catching.
1. Binocular correlates of time-to-contact (TTC)
The fractional rate of increase of subtense of an approaching object has long been recognized as a source of information about the object's time to contact (Hoyle, 1957) and this TTC cue (τ) has been investigated extensively1. A fact that was overlooked in most early studies of TTC is that the rate of change of horizontal relative disparity caused by an object's approach can, even by itself, produce a sensation of motion in depth such that the observer has the impression of collision at some future instant (Wheatstone, 1852). Chapter 9 reviews psychophysical evidence that the human visual system contains a mechanism that is selectively sensitive to changing-disparity, and that this mechanism is distinct from the mechanism sensitive to static disparity. Chapter 9 also reviews evidence, obtained from single-unit studies in animals, that different populations of neurons are sharply tuned to, on one hand, the combination of static disparity and motion within a frontoparallel plane (binocular depth neurons) and, on the other hand, to changing-disparity.
As to the relation between the rate of change of relative disparity (dδ/dt) and TTC, Regan (1995) provided a derivation of equation (1) (reproduced in Gray and Regan, 1998) for a point object approaching an observer at constant speed along a line that is perpendicular to the frontal plane and passes midway between the observer's eyes (the z-direction):

TTC ≈ I / [D(dδ/dt)]          (1)

where I is the observer's interpupillary separation, D is the object's instantaneous distance, δ is the object's instantaneous disparity relative to some fixed reference object2, and D ≫ I (see Chapter 9). To anticipate, in experiments described below (Gray & Regan, 1998, 2000a) we used equation (1) to calculate the actual TTC signaled by the stimulus. We set the instant-by-instant relative disparities according to equation (2).3
1 The conclusions of many of the early studies designed to test the use of τ as a TTC cue are questionable because observers viewed the stimulus binocularly, so that both monocular and binocular cues to TTC were available.
2 Rushton and Wann (1999) claimed that TTC ≈ (relative disparity)/(rate of change of relative disparity). As shown in Chapter 9 this equation is mathematically incorrect.
3 Equation (2) was obtained by integrating the differences in relative disparity (Δδ) associated with small differences of distance (ΔD) between the distance at time zero and time t, using the well-known equation Δδ ≈ IΔD/D² (Gray & Regan, 1998, Appendix).
δ(t) ≈ δ(0) + It / [D₀(T - t)]          (2)
where δ(t) and δ(0) are the relative disparities at time t and at time t = 0 respectively, T is the time to collision and D₀ is the distance at time t = 0. This last point may well have been crucial by allowing observers to estimate TTC without needing to estimate distance D in equation (1) (see Chapter 9).
It has also been proposed that a rate of change of absolute disparity or of ocular vergence could be used to estimate TTC (Heuer, 1993; Laurent, Montagne, & Durey, 1996). However, it is unlikely that these cues are useful sources of information about TTC given that a rate of change of absolute disparity produces (for small objects) only a weak sensation of motion in depth or (for large objects) no sensation at all, and a rate of change of vergence produces neither a sensation of motion in depth nor any effect on sensitivity to a rate of change of relative disparity (Regan, Erkelens, & Collewijn, 1986a).
One possible reason that the role of binocular TTC information has been overlooked in the past is the well-known fact that the effectiveness of static binocular disparity information decreases sharply as viewing distance is increased. The relative disparity for a given depth separation between two objects is inversely proportional to the square of the viewing distance (see Howard & Rogers, 1995, pg. 36-37), and it has been demonstrated psychophysically that sensitivity to differences in relative depth falls off steeply from ca. 10 m. However, while undoubtedly correct, this geometric property of static disparity is irrelevant to the relative importance in judging TTC of the rate of change of angular subtense (dθ/dt) and the rate of change of relative disparity (dδ/dt) associated with an object's approach, because, for an object approaching an observer at either constant or varying speed along a straight line that is perpendicular to the frontal plane and passes midway between the observer's eyes (the z-direction),

(dθ/dt) / (dδ/dt) ≈ S / I          (3)

where dθ/dt and dδ/dt are, respectively, the instantaneous rate of increase of horizontal angular subtense and the instantaneous rate of change of relative disparity, S is the object's linear horizontal width (e.g. in cm), I is the observer's interpupillary separation, and D² ≫ I² and D² ≫ S² (Regan & Beverley, 1979a). For our present purpose, the crucial point is that equation (3) does not involve distance.
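A short numeric sketch (ours; the object size, speed, and distance are arbitrary, and the standard small-angle approximations dδ/dt ≈ IVz/D² and dθ/dt ≈ SVz/D² for a head-on approach are assumed) shows equation (1) recovering the true TTC and the distance-free ratio of equation (3):

```python
# Worked check of equations (1) and (3) for a head-on approach (assumed values).
I = 0.06      # interpupillary separation (m)
S = 0.22      # linear width of the object (m), roughly ball-sized
Vz = 10.0     # closing speed (m/s)
D = 8.0       # current distance (m)

d_delta_dt = I * Vz / D ** 2      # rate of change of relative disparity (rad/s)
d_theta_dt = S * Vz / D ** 2      # rate of change of angular subtense (rad/s)

ttc_binocular = I / (D * d_delta_dt)        # equation (1)
ratio = d_theta_dt / d_delta_dt             # equation (3); equals S/I at any distance

print(f"TTC from equation (1): {ttc_binocular:.2f} s  (true TTC = {D / Vz:.2f} s)")
print(f"(d_theta/dt) / (d_delta/dt) = {ratio:.2f}   S/I = {S / I:.2f}")
```

Changing D rescales both rates by the same factor, which is why the ratio in equation (3) carries no distance term.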
The purpose of the remainder of this chapter will be to consider how well observers can use binocular information to estimate TTC in a variety of viewing conditions. We first turn to a consideration of how performance in estimating TTC can be assessed in the laboratory.
2. A comparison of the utilization of sources of information about TTC
Research on time to contact has been plagued by methodological problems (see Wann, 1996, Gray & Regan, 1999b, and Chapter 9 of this volume). One serious issue has been that in many experiments it is not clear whether the observer's judgments were entirely based on a visual correlate of TTC (i.e., the task-relevant variable) rather than one or more task-irrelevant variables (e.g., initial size, presentation duration, etc.). For example, when the different optical variables are correlated or only partially decorrelated, the expected pattern of results (estimated TTC increases monotonically as a function of τ) can occur even when the observer bases estimates of TTC on a variable other than τ. In some of the early τ studies (e.g. Schiff & Detwiler, 1979) the results might be explained in terms of the use of the total change in object size for a given trial. Further to this point, it may not be intuitively obvious that the method of creating random variations in the task-irrelevant variables (e.g. randomly varying the starting size of the approaching object) does not necessarily solve this problem - see Kohly & Regan, 2002, appendix. It is only by directly evaluating the contribution of these task-irrelevant variables through the use of tools such as an orthogonal stimulus design (e.g. Regan & Hamstra, 1993; Kohly & Regan, 1999) or a stepwise regression analysis (e.g. Portfors-Yeomans & Regan, 1997; Gray & Regan, 1998) that one can be more confident that it is the perception of TTC that is actually being investigated.
In most previous studies of TTC the method of choice has been to briefly present a simulated approaching object and have the observer press a button at the instant when he or she judged that the object would have collided with their head (assuming that it continued to approach after it disappeared). This method (often called the "prediction motion paradigm") has the disadvantage that the estimate of TTC is contaminated by the effect of motor delay, and it also permits the observer to use cognitive strategies for estimating TTC (Gray & Thornton, 2001; Tresilian, 1995). It may be that these problems explain why the underestimates of TTC measured using the alternative procedure described next are consistently and substantially (by roughly 20-30%) smaller than the underestimates derived from the "press button at the perceived TTC" technique (reviewed in Regan & Gray, 2000). In the alternative procedure the observer judges the TTC of the approaching object relative to an accurate (to
1 msec) time reference (an auditory click), and multiple staircases (corresponding to different combinations of optical variables) are randomly interleaved (Gray & Regan, 1998). Another approach is that of simulated actions, in which an observer attempts to "hit" (Gray, 2002) or "catch" (Rushton & Wann, 1999; Smith et al., 2001) a simulated approaching object. However, when simulated or real (e.g. Alderson et al., 1974; Lee, Lishman, & Thomson, 1982; Bootsma & Oudejans, 1993) actions are used rather than "match to the time marker", one cannot directly measure the sensory aspects of TTC estimation; rather, performance errors lump together visual processing and motor action.
Finally, an important distinction is that between judgments of relative and absolute TTC. These two judgments are related in the following way: precise estimation of relative TTC is necessary but not sufficient for accurate estimation of absolute TTC4. Therefore, it is possible that the visual cues that support judgments of relative TTC may not be entirely the same as those used in estimating absolute TTC. Next we discuss studies on relative and absolute TTC estimation using binocular cues.
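The kind of check described above can be sketched in a few lines: regress the responses on the task-relevant variable (TTC) and on candidate task-irrelevant variables, and compare the variance each accounts for. The data below are synthetic, and the simple per-variable regression stands in for a full stepwise or orthogonal-design analysis; none of this is code from the studies cited.

```python
# Schematic variance-accounting check with synthetic data (assumed setup).
import numpy as np

rng = np.random.default_rng(0)
n = 400
ttc = rng.uniform(1.5, 3.0, n)              # task-relevant variable
start_size = rng.uniform(0.5, 1.0, n)       # task-irrelevant variable
duration = rng.uniform(0.6, 0.9, n)         # task-irrelevant variable
response = ttc + rng.normal(0.0, 0.2, n)    # a TTC-based observer, for illustration

def r_squared(predictor):
    X = np.column_stack([predictor, np.ones(n)])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ coef
    return 1.0 - resid.var() / response.var()

for name, x in [("TTC", ttc), ("starting size", start_size), ("duration", duration)]:
    print(f"{name:13s} accounts for {100 * r_squared(x):5.1f}% of response variance")
```

If a task-irrelevant variable soaked up most of the variance instead, the "TTC threshold" extracted from the same data would be spurious.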
3. Estimates of relative TTC based on binocular information alone
It is well established that humans are exquisitely sensitive to differences in the TTC of two objects when judgments are based entirely on τ. TTC discrimination thresholds can range from 5%-13% (Regan & Hamstra, 1993; Todd, 1981); e.g. observers can reliably judge that an object with a TTC of 2.1 sec will arrive later than an object with a TTC of 2.0 sec. Can this level of performance be achieved when estimates are based entirely on binocular information about TTC? Gray & Regan (1998) used an 8x8 stimulus array to measure relative TTC discrimination based on equation (1) alone. Within the array the values of initial I/[D(dδ/dt)] and Δδ (i.e. the total change in disparity within a trial) were varied orthogonally by varying the presentation duration (Δt) by ±40% about a mean value of 650 msec. It had been claimed that speed discrimination of cyclopean motion in depth is in general based on Δδ rather than dδ/dt (Harris & Watamaniuk, 1995). Even though subsequent evidence showed this claim to be
4 For example, consider the case of an observer viewing two approaching objects, object A with an actual TTC of 3.0 sec and object B with an actual TTC of 3.3 sec. If the observer estimated the TTC of object A to be 4.0 sec and the TTC of object B to be 4.4 sec, discrimination of relative TTC would be moderately precise (i.e. they would correctly judge that A arrived sooner than B when the difference was only 10%) but estimation of absolute TTC would be quite inaccurate (roughly 33% error).
erroneous5, we chose to rule out the possibility that observers used this task-irrelevant variable to discriminate trial-to-trial variations in TTC. Object size was held constant at 0.7 deg. Filled symbols in Figure 1 indicate that the observers did use binocular information to discriminate TTC.
Figure 1: Psychometric functions for discrimination of relative TTC when only binocular TTC information was available (filled symbols: large target; open symbols: small target). Data are for 4 observers. Reproduced with permission from R. Gray and D. Regan (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499-512. Copyright 1998 Elsevier Science Ltd.
Discrimination thresholds ranged from 5%-13% and were not significantly different from thresholds for judgments based on τ alone. Stepwise regression analysis showed that the TTC variable [i.e., initial I/[D(dδ/dt)]] accounted for a
5 Harris and Watamaniuk (1995) made the general claim that the human visual system does not contain a cyclopean mechanism specifically sensitive to the speed of motion in depth and that speed discriminations were based on the total change of disparity rather than speed. This claim was based on data from two observers in the special situation that the cyclopean target passed through zero disparity so that it disappeared and reappeared partway through the stimulus presentation, thus requiring the visual system to solve the correspondence problem twice within the presentation. Portfors-Yeomans and Regan (1996) repeated their experiment and obtained the same result, then went on to show that when the target did not disappear during a presentation there was clear evidence for a specialized cyclopean mechanism for motion in depth. Observers could discriminate trial-to-trial variations in the speed of motion in depth while completely ignoring simultaneous variations in the disparity traversed (see also Portfors & Regan, 1997).
high proportion of total response variance (73%-85%), and that responses were not significantly influenced by any of the task-irrelevant variables.
4. Estimates of absolute TTC based on binocular information alone
The role of binocular information in the estimation of absolute TTC was first investigated by Heuer (1993). In this study, an outlined circle that changed in disparity, or expanded, or combined both changes was presented for a short interval on each trial. The observer's task was to press a button when they judged that the object would have collided with their head. When only binocular TTC information was available, observers grossly overestimated TTC: for an actual TTC of 2 sec the estimated TTC was 3.8 sec - a 90% error! Estimates based on τ alone (error of 50% for a TTC of 2 sec) were also large overestimations. This last finding conflicts with most other studies of TTC estimation based entirely on τ, which found underestimations (reviewed in Gray & Thornton, 2001). Although Heuer's observers were more accurate when both sources of information were available than when one source was missing, errors were still greater than 15% (see below). Heuer interpreted his findings as evidence that "changing size is a more powerful determinant of estimates of TTC than is changing target disparity" (pg. 558), though the earlier findings of Regan and Beverley (1979a) indicate that the generality of this statement is dubious (see section 6 below). It should be noted that there are serious methodological limitations with Heuer's study, namely the use of the "press button at the perceived TTC" technique (see above) and the use of a design that did not permit testing for which optical variables were used to make the judgments.
Recently, we (Gray & Regan, 1998) used the staircase tracking procedure described above to measure estimates of absolute TTC based on binocular information. In this experiment, the viewing distance (1.6 m) and ocular vergence angle (through the use of a fixation point and nonius lines) were held constant and TTC values ranging from 1.6 to 2.7 sec were used. The presentation duration was varied randomly between 600 and 900 msec, giving extrapolation times that ranged between 700-2100 msec. We compared results for a large (0.7 deg) and a small (0.03 deg) target. As shown in Figure 2, estimates of TTC based on binocular information alone, i.e. equation (1), were consistent overestimates that ranged from 2.5% to 10% for the large target. We also used the large target to measure errors in estimating TTC when only the changing-size cue was available. Although the sign of the error was opposite, the absolute magnitude of estimation error was not significantly different for TTC estimates based on τ and TTC estimates based on binocular information. Also shown in Figure 2 are estimation errors for a combination of
monocular and binocular TTC cues using the large target. In this simulation of the natural everyday situation where both cues are present, TTC estimates were considerably more accurate (ranging from 1.3%-2.7%) than estimates based on either cue alone. If this 1.3% error rate can be extrapolated to TTC values of
Figure 2: Estimation errors for judgments of absolute TTC in four conditions: disparity alone (large target), disparity alone (small target), size alone (large target), and disparity + size (large target). Reproduced with permission from R. Gray and D. Regan (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499-512. Copyright 1998 Elsevier Science Ltd.
300-500 msec, then the corresponding accuracy (3.9-6.5 msec) approaches that which is required to hit a baseball (Watts & Bahill, 1991) or a cricket ball (Regan et al., 1979; Regan, 1992). The overestimation/underestimation errors observed when only one cue was available are captured in the model of the early processing of TTC described in Chapter 9 (Fig. 3). Next we turn to an important advantage of sensitivity to binocular information about TTC, namely that in some situations τ does not provide a reliable cue to TTC.
5. Viewing situations in which binocular information is the only reliable indicator of TTC
5.1 Estimation of TTC for small approaching objects
Equation (3) indicates that binocular TTC information should become relatively more effective as object size decreases. Gray & Regan (1998) directly tested this prediction by comparing judgments of relative and absolute TTC for large (0.7 deg ±30%) and small (0.03 deg ±30%) simulated approaching objects. Figure 3 plots psychometric functions for relative TTC judgments based on τ alone. For the large target, discrimination thresholds ranged from 6-12% and stepwise regression analysis showed that responses were not influenced by task-irrelevant variables. For the small target, discrimination performance was comparatively poor (thresholds ranged from 17-35%). Performance, however, was much worse than revealed by Figure 3: for the small target, τ only accounted for a small proportion of total variance (15-42%), and for two observers the task-irrelevant starting size accounted for the greatest proportion of total variance.
Figure 3: Psychometric functions for discrimination of relative TTC when only monocular information about TTC was available (filled symbols: large target; open symbols: small target). Data are for 4 observers. Reproduced with permission from R. Gray and D. Regan (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499-512. Copyright 1998 Elsevier Science Ltd.
These small-target findings emphasize the necessity of ensuring that observers are basing their responses on TTC prior to the interpretation of the data. If the stepwise regression analysis had not been carried out in this study, one might conclude from Fig 3 that, although the discrimination is not very precise, observers can use τ to discriminate TTC for small objects. The regression analysis showed, however, that observers cannot perform TTC discrimination at all for the small target. The fact that some kind of a threshold can be estimated from psychophysical data is not enough. It is necessary to show that the threshold is not spurious.
How is the ability to discriminate TTC on the basis of binocular information influenced by object size? The open symbols in Figure 1 plot psychometric functions for the small target. As predicted from equation (2), TTC discrimination performance was not statistically different for the large target (thresholds ranged from 5-12%) and the small target (thresholds ranged from 5-10%). In all cases the task-relevant variable, i.e. equation (1), accounted for a large proportion of total variance (70-81%) and task-irrelevant variables did not significantly influence performance.
Figure 4: Simulation of an approaching and rotating rugby ball (A: "END-SIDE"; B: "SIDE-END") and an approaching sphere (C: "SPHERE"). Reproduced with permission from R. Gray and D. Regan (2000). Estimating the time to collision with a rotating nonspherical object. Vision Res., 40(1), 49-63. Copyright 2000, Elsevier Science Ltd.
Given that τ does not provide a reliable basis for discriminating trial-to-trial variations in TTC for a small object, it follows that τ does not provide a basis for making absolute judgments of TTC for a small object. However, as shown by the open bars in Fig. 2, binocular TTC information can be used to estimate absolute TTC for a small object.
5.2 Estimation of TTC for nonspherical rotating objects
When a nonspherical object such as an American football or a rugby ball tumbles as it approaches the observer, the retinal image expands and changes shape simultaneously. This poses a difficult problem for the visual system because the value of τ is different across the different meridians of the object. The visual system responds by partially or even completely suppressing the motion-in-depth signal (Beverley & Regan, 1979a). This is done automatically and is achieved by comparing θ/(dθ/dt) across orthogonal meridians (θ is the object's angular subtense across any given meridian) and then suppressing the generation of a motion-in-depth signal unless the value of θ/(dθ/dt) is the same across different meridians, thus reducing the perceived speed of motion in depth and thereby lengthening the perceived TTC (Beverley & Regan, 1979a, 1980). (Information about the dynamics of this comparison process was provided by our findings with a rapidly-tumbling (2 Hz) simulated rugby ball; see below.) On the other hand, binocular information is not influenced by the changes in the shape of the retinal image and, therefore, is in this situation a more reliable cue to TTC.
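The meridian problem can be made concrete with a small sketch (ours; a simple ellipse profile, small-angle subtense, and finite differences are assumed, with arbitrary dimensions): once the ball tumbles, τ computed along the horizontal and vertical meridians disagrees both with the true TTC and with itself.

```python
# Meridian-by-meridian tau for a tumbling ellipsoid (illustrative values).
import math

D0, Vz = 20.0, 10.0          # initial distance (m) and closing speed (m/s)
a, b = 0.30, 0.19            # long and short semi-axes of the ball (m)
omega = 2 * math.pi * 0.2    # tumbling rate: 0.2 rotations/s

def subtense(t, vertical):
    D = D0 - Vz * t
    phi = omega * t
    if vertical:                         # projected semi-extent along this meridian
        r = math.hypot(a * math.sin(phi), b * math.cos(phi))
    else:
        r = math.hypot(a * math.cos(phi), b * math.sin(phi))
    return 2 * r / D                     # small-angle subtense (rad)

def tau(t, vertical, h=1e-3):
    theta = subtense(t, vertical)
    d_theta = (subtense(t + h, vertical) - subtense(t - h, vertical)) / (2 * h)
    return theta / d_theta

t = 0.5
print(f"true TTC        : {(D0 - Vz * t) / Vz:.2f} s")
print(f"tau, horizontal : {tau(t, vertical=False):.2f} s")
print(f"tau, vertical   : {tau(t, vertical=True):.2f} s")
```

The rate of change of relative disparity, by contrast, depends only on distance and closing speed, which is why the binocular estimate is unaffected by the tumbling.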
Figure 5: The matched TTC (i.e., point of subjective equality in a relative TTC discrimination task) plotted as ordinate for the "SPHERE", "END-SIDE" and "SIDE-END" stimuli, with monocular information alone and with monocular + binocular information. The reference stimulus was always a SPHERE and had a constant TTC of 2400 msec (horizontal dashed line). Reproduced with permission from R. Gray and D. Regan (2000). Estimating the time to collision with a rotating nonspherical object. Vision Res., 40(1), 49-63. Copyright 2000, Elsevier Science Ltd.
We used a simulation of an approaching and tumbling rugby ball to compare the relative effectiveness of the two TTC cues (Gray & Regan, 2000a). In the first part of the experiment, the rate of tumbling was low (0.2 rotations/sec), and in the second part the rate was high (2 rotations/sec). The three conditions that were used in the first part of the experiment are shown in Figure 4. As predicted, the change in retinal image shape produced by the simulated rotation altered the perceived TTC when estimates were based on τ alone. In a relative TTC discrimination task (Figure 5), analysis of the points of subjective equality (i.e., the 50% point of the psychometric function) revealed that the "END-SIDE" quarter rotation (Fig 4A) was perceived to have a shorter TTC (by up to 470 msec) than the "SPHERE" (Fig 4C), while the "SIDE-END" quarter rotation (Fig 4B) was perceived to have a longer TTC (by up to 280 msec) than the "SPHERE", even though the three simulated approaching objects had the identical actual TTC. This effect can be understood by the fact that the "END-SIDE" rotation caused an increase in the value of dθ/dt for the vertical meridian relative to the "SPHERE", while the "SIDE-END" rotation caused a decrease in the value of dθ/dt for the vertical meridian relative to the "SPHERE". The situation was different when binocular TTC information was added (Fig 5, right side), as there was no significant difference in the points of subjective equality for the three approaching objects.
B-SPHERE" O-END-SOP ••SIDE-ENDSBWOCUUR ALONE MONOCULAR INFORMATION ONLY
BINOCUUS
*DOH>
8 oto.SJH
MONOCUUU) womiKKm ONLY
StNOCUUR INFORMATION ADDED
O
Figure 6: Estimation errors for judgments of absolute TTC. In the binocular alone condition the stimulus was spherical and did not expand. *NR signifies that the measurement was not reliable since observers could not base their response on the task relevant variable. Reproduced with permission from R. Gray and D. Regan (2000). Estimating the time to collision with a rotating nonspherical object. Vision Res., 40(1), 4963. Copyright 2000, Elsevier Science Ltd.
For the most part, a similar pattern of results was obtained for judgments of absolute TTC for the three stimuli shown in Fig 4. Consistent with the relative TTC data, when estimates of TTC were based on τ alone (Fig 6A, left side), the estimation error for the "END-SIDE" rotation was an 8.5% larger underestimation of TTC as compared with "SPHERE". However, contrary to the relative TTC data, there was no significant difference in estimation errors for the "SIDE-END" and "SPHERE" conditions. A stepwise regression analysis performed on these data revealed the reason for this discrepancy. Whereas the task-relevant variable (i.e., τ) accounted for a large proportion (75-91%) of total response variance for the "END-SIDE" and "SPHERE" conditions, it only accounted for a small proportion of variance (31-42%) in the "SIDE-END" condition. In addition, task-irrelevant variables accounted for about the same amount of variance (38-55%). Therefore, the estimates of TTC shown in Figure 6A&C (black bars) are clearly not reliable (NR) since the observers did not judge the TTC of the simulated object. The likely explanation for our observers' inability to estimate absolute TTC for the "SIDE-END" stimulus was that the value of dθ/dt for the vertical dimension was near zero during the first part of the trial (due to the opposing effects of the approach and rotation). It has been demonstrated that the sensation of motion in depth is greatly weakened when one dimension of the target does not expand - the alternative percept of changing object size is perceived more frequently (Beverley & Regan, 1979a). This study on judgements of absolute TTC provides another example where the psychophysical procedure reliably generated a value (the staircase converged on 'an estimate' of TTC) that, on further examination, was meaningless.
The addition of binocular TTC information dramatically improved the accuracy of absolute TTC judgments in all conditions and there were no significant differences between the errors for the three conditions (Figure 6, right side). Furthermore, when binocular information was added observers were better able to ignore task-irrelevant information: the correlates of TTC now accounted for a large proportion of total variance for all conditions. Consistent with these psychophysical findings, Scott, Li, and Davids (1996) have demonstrated that rotation of a nonspherical object impairs catching performance under conditions of monocular viewing but not when the object is viewed binocularly.
It should be noted that, in the experiment discussed above, observers could make fine discriminations of TTC based on τ alone (thresholds were not significantly different for the three stimuli shown in Figure 4 and ranged from 7-11%) and ignore all task-irrelevant variables. This is an example of a case where the accuracy of absolute judgments of TTC (greatly influenced by rotation, unreliable for the "SIDE-END" rotation) cannot be predicted from discrimination thresholds for TTC (which are not influenced by rotation). Finally, in the 2 rotations/sec tumbling condition judgements of TTC based entirely on monocular information were as accurate as in the simulated
sphere condition, indicating that the comparison process mentioned earlier does not respond to 2 Hz oscillations and, therefore, has a time constant >0.5 sec (Gray & Regan, 2000a, Appendix).
6. Estimation of TTC following adaptation
It is not uncommon that an observer must estimate the TTC of an approaching object following exposure to visual stimuli that desensitize local detectors of expansion. Given that such adaptation produces a weakened sensation of motion in depth (Regan & Beverley, 1978), Gray & Regan (1999a) asked whether adaptation to expansion has any effect on estimates of TTC based on either monocular or binocular TTC information. Following 10 min of adapting to a ramped increase in target size, estimates of absolute TTC were measured using the staircase procedure. Following adaptation to expansion, TTC estimates based on τ alone were 15-27% longer as compared with a baseline condition in which observers adapted to a constant-sized target. It may not be intuitively obvious, given that object expansion is not involved in equation (1), why substantial overestimations of TTC (8-16%) occurred when estimates were based on binocular information alone (Figure 7).
Figure 7: Estimation errors for judgments of absolute TTC following 10 min adaptation to an expanding target. Solid bars show errors following adaptation to a target with a constant disparity. Open bars are for a baseline condition in which the adaptation target remained at constant size. Estimates of TTC were based entirely on binocular information. Reproduced with permission from R. Gray and D. Regan (1999). Adapting to expansion increases perceived time to collision. Vision Research, 39, 3602-3607. Copyright 1999, Elsevier Science Ltd.
This cross-adaptation effect can be understood in terms of the model of motion-in-depth processing proposed by Regan & Beverley (1979a) - see Fig 3 - in which the signal generated by changing retinal image size is summed with the changing-disparity signal before generation of the motion-in-depth signal (on the basis of which TTC is estimated). So, unlike the rotation and small-object conditions described above, binocular information about TTC cannot be used to compensate for the inadequacy of τ following adaptation to expansion.
It is known that a radial flow of texture temporarily desensitizes local changing-size detectors located within approximately 0.5 deg of the focus of expansion (Regan & Beverley, 1979b; Beverley & Regan, 1982). Reasoning that a driver who gazes at a textured road while driving along a straight road produces a radial flow of texture on the retina, we used an automobile simulator to find whether adaptation to the flow pattern affected judgements of TTC (Gray & Regan, 2000b). Following simulated highway driving on a straight empty road for 5 min, drivers initiated overtaking of a lead vehicle substantially later (220-510 ms) than comparable maneuvers made following viewing of a static scene. The implication of this lengthening of perceived TTC and consequent change in driving behaviour is that a driver who gazes straight ahead while driving in light traffic along a straight road might be at risk of a rear-end collision when overtaking. As explained earlier, these errors of judgement would be present for judgements based both on binocular information and on monocular information. Shifting the gaze over the scene ahead would reduce these errors (Gray & Regan, 2000b).
7. Processing of combinations of binocular and monocular information about TTC
Regan and Beverley (1979a) pointed out that models of the processing of binocular and monocular information about TTC must take three factors into account. First is the ratio between the rate of expansion (dθ/dt) and the rate of change of binocular horizontal disparity (dδ/dt) in the approaching object's retinal images. This ratio follows straightforwardly from geometrical optics - see equation (3). Second is the weighting given to these two variables by the visual pathway, consequent on the difference between the dynamic operating characteristics of the mechanisms sensitive to dθ/dt and to changing-disparity respectively. Third is intersubject variability. (This is very large. Within the five observers studied by Regan and Beverley (1979a) the relative effectiveness of
dθ/dt and dδ/dt as stimuli for motion in depth varied by 80:1).6 We next discuss these factors.
The crucial point brought out in equation (3) is that the ratio between dθ/dt and dδ/dt does not depend on the object's distance nor on its speed. Equation (3) also indicates that the magnitude of dθ/dt relative to the magnitude of dδ/dt is directly proportional to the approaching object's linear size. Consequently, for small objects the monocular correlate of TTC may be unimportant compared with the binocular correlate.
A second implication of equation (3) is as follows. (For clarity we will defer a discussion of the different dynamic characteristics of the two systems.) The distance at which binocular information starts to contribute to motion-in-depth perception does not depend on the object's linear size, while the distance at which dθ/dt starts to contribute to motion-in-depth perception does depend on the object's linear size (S) scaled in units of I.7 In other words, for objects of different sizes approaching an observer's head in the z-direction at any given speed Vz, the distance at which binocular information starts to contribute to motion-in-depth perception is the same, while the distance at which dθ/dt starts to contribute to motion-in-depth perception depends on the ratio S/I.
This last point can be illustrated by a numerical example. There are intersubject differences in the two thresholds, but for simplicity we will assume that they are both the same, and equal to 5 arc min/sec (see Fig. 3, Beverley & Regan, 1979a). Consider an object 2 m wide (about a car's width) so that, if I = 6 cm, S/I = 33. Suppose that an observer is approaching this object with a closing speed of 5 m/sec (about 11 mph). We have

dδ/dt = I·Vz / D²          (4)

(Regan & Beverley, 1979a). Substituting into equation (4) we find that the rate of change of relative disparity starts to generate a detectable percept of motion in depth when the object is at a distance of ca. 14 m. We have

dθ/dt = S·Vz / D²          (5)
6 Rushton and Wann (1999) proposed a model of relative weighting that takes no account of either intersubject variability or the known (large) differences in the dynamic characteristics of the mechanisms sensitive to monocular and binocular information about TTC.
7 Detection threshold for changing-disparity is not in general the same as the value for which motion in depth is just detected, and it is the second threshold that is relevant for estimating TTC. This distinction is demonstrated in Fig. 3 of Regan and Beverley (1979a) and Fig. 2 of Beverley and Regan (1979a).
(Regan & Beverley, 1979a). Substituting into equation (5) we find that the rate of expansion starts to generate a percept of motion in depth when the object is at a distance of ca. 83 m. Looking back at equations (4) and (5) it can be seen why8 the ratio between these two critical distances is equal to √(S/I).
Having discussed the differences between the distances at which binocular information and dθ/dt start to contribute to the estimation of TTC, we now consider the later stage when both are well above threshold. The crucial point now is that the dynamic characteristics of the mechanisms that process monocular and binocular information about TTC are quite different. The effect of this difference on the relative weighting of monocular and binocular information was measured experimentally by Regan and Beverley (1979a) and can be summarized as follows. Binocular information becomes relatively more effective in generating a sensation of motion in depth as approach speed is increased, and relatively less effective as viewing time decreases. In other words, the monocular cue is weighted more heavily at low speeds and brief viewing, while the binocular cue is weighted more heavily at high speeds and longer viewing durations. Both are large effects. A 64-fold increase of speed can increase the relative effectiveness of the binocular cue 16-fold (Beverley & Regan, 1979a, Fig. 5). A 16-fold increase in viewing time can increase the relative effectiveness of the binocular cue fourfold. In addition, as already mentioned, the linear size of the approaching object determines the relative magnitude of the two cues. On the other hand, provided that both dθ/dt and dδ/dt are well above threshold, there is evidence that the viewing distance has no effect or only a small effect on the relative effectiveness of the binocular and monocular cues to TTC: a very large change of binocular convergence (0 to 24 prism dioptres) had no effect on relative effectiveness in one observer and produced only a twofold difference in a second observer (Regan & Beverley, 1979a, Fig. 6).
In everyday life, there is often a very large difference between the effectiveness of monocular and binocular cues to TTC. Regan and Beverley (1979a) gave three examples, basing numerical calculations on the information present in the retinal images weighted by the measured dynamic characteristics of the relevant mechanisms in an individual observer. For an aircraft approaching a runway 100 ft wide and 2000 ft away at 140 mph, the monocular cue to TTC with the runway would be 76 times more effective than the binocular cue for an inspection duration of 1.0 sec. For a cricket ball 50 ft away approaching a batsman's head at 90 mph the binocular cue would be 2.1 times more effective than the monocular cue for an inspection duration of 0.25 sec. For a fly 50 cm away approaching the observer's head at 0.05 m/sec, the
8 This point is valid, though the situation is somewhat more complex, because relative sensitivity to changing-disparity and dθ/dt depends on Vz (Regan & Beverley, 1979a).
binocular cue would be 72 times more effective than the monocular cue for an inspection duration of 1.0 sec. Note, however, that these calculations are for one of the five observers studied, and intersubject variability was 80:1 within these five normally-sighted observers. Within the general population intersubject variability is presumably greater than 80:1.
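The numerical example given earlier in this section is easy to reproduce. The sketch below (ours) plugs the stated values into equations (4) and (5) with a common 5 arc min/sec threshold; it also makes explicit that the two critical distances differ by the factor √(S/I).

```python
# Critical distances from equations (4) and (5) (values from the example above).
import math

I = 0.06                                   # interpupillary separation (m)
S = 2.0                                    # object width (m), about a car
Vz = 5.0                                   # closing speed (m/s), about 11 mph
threshold = (5.0 / 60.0) * math.pi / 180.0 # 5 arc min/sec expressed in rad/s

D_disparity = math.sqrt(I * Vz / threshold)  # equation (4): d_delta/dt = I*Vz/D**2
D_expansion = math.sqrt(S * Vz / threshold)  # equation (5): d_theta/dt = S*Vz/D**2

print(f"disparity-rate cue reaches threshold at D = {D_disparity:5.1f} m")
print(f"expansion-rate cue reaches threshold at D = {D_expansion:5.1f} m")
print(f"ratio = {D_expansion / D_disparity:.2f}   sqrt(S/I) = {math.sqrt(S / I):.2f}")
```

For a car-sized object the expansion cue thus crosses threshold much farther out than the disparity-rate cue, while for small objects (small S/I) the ordering reverses, consistent with the fly and aircraft examples above.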
8. Anecdotal and circumstantial evidence on the importance of binocular TTC information
Early hints that binocular disparity could be important for judging TTC included the report of Bannister & Blackburn (1931). They designated 258 Cambridge undergraduates as either "poor" or "good" at ball games (including cricket). The group that was ranked as "good" had a larger mean interpupillary distance than the "poor" group. (A larger interpupillary separation results in a larger disparity for a given depth separation between two objects.) Using high-speed photography, Alderson et al. (1974) found that when catching a ball with one hand the temporal order of finger flexions was disrupted when the lights were switched off 275 ms before the ball arrived, that is, when the ball had reached 1.8 m from the hand.
Another kind of circumstantial evidence is the performance of professional baseball batters who have lost the use of one eye at some point during their playing career. Tony Conigliaro of the Boston Red Sox was one of the most promising young ballplayers of the late sixties. In 1964, he hit 24 home runs and batted .290 as a 19-year-old rookie. He reached 100 career home runs at the age of 22 (the youngest AL player to reach that milestone at the time). But Conigliaro's promising career took a downturn after he developed a blind spot in one eye when he was hit by a pitch in 1967. Despite hitting 36 home runs and being named "Comeback Player of the Year" in 1970, Conigliaro could not fully recover. In his last two seasons in the majors (1971 & 1975) he batted only .173 in 95 at-bats.
Hofeldt, Hoefle, & Bonafede (1996) have provided direct evidence that visual processing of a rate of change of binocular information is important in hitting a baseball. Wearing a neutral density filter over one eye alters the perceived trajectory of a pendulum bob (Pulfrich, 1922). It is generally supposed that the effect is caused by delaying signals from one eye, thus distorting dynamic binocular information (perceptual delay increases as luminance decreases). Hofeldt and colleagues reasoned that wearing a filter over one eye might affect batting performance. Indeed it did. Using pitch speeds of 75-85 mph, they found a 55% drop in the number of contacts per swing when the hitter wore a filter over one eye. There was no decline in performance when lenses were worn over both eyes, suggesting that the effect is not produced by a reduction in contrast. However, it was not clear whether the degraded hitting performance was caused by timing error rather than by distortion in binocular
information about the trajectory of the ball (Beverley & Regan, 1973; Portfors-Yeomans & Regan, 1997).
Field experiments have also provided hints that binocular information may be important in estimating TTC. Cavallo & Laurent (1988) had observers estimate the TTC with a stationary object while they sat in a car that was driven towards the object, and found that estimates were more accurate when observers viewed the object binocularly than when one eye was occluded. They reported that this improvement only occurred for near targets (nearer than roughly 75 m), and on this basis concluded that the effect is not simply due to having an extra source of τ information from the other eye. In a catching task, Savelsbergh et al. (1991) found that the precision of grasping, as indexed by the standard error over many trials, was better for binocular than for monocular viewing; and McLeod, McLaughlin, & Nimmo-Smith (1985) reported that an observer's accuracy in hitting a dropped squash ball was better when viewing was binocular instead of monocular. On the other hand, occluding one of the pilots' eyes during the approach to landing a jet or a propeller-driven aircraft had either no detrimental effect on landing performance, or performance even improved (Pfaffman, 1948; Lewis & Kriers, 1969; Lewis et al., 1973; Grosslight et al., 1978).
9. Summary and conclusions
In this chapter we have reviewed several lines of evidence that support the hypothesis that binocular information can be important for judgments of TTC. Previous research has provided clear evidence that the human visual system contains a neural mechanism sensitive to binocular information about TTC, and that observers can use this information on its own to make accurate estimates of absolute TTC with an approaching object and to discriminate small variations in relative TTC. Furthermore, it has been demonstrated that binocular TTC information can compensate for the ineffectiveness of τ in some real-world situations, such as when the approaching object is small or when the approaching object is nonspherical and rotating (e.g., a tumbling American football). Finally, it has been demonstrated that binocular and monocular information about TTC are combined by the visual system to produce a significantly more accurate estimate of absolute TTC than is the case for estimates based on either cue alone. Collectively, these findings suggest that binocular information plays a larger role in collision avoidance and collision achievement than has been previously believed (e.g. Profitt & Kaiser, 1995).
In this chapter we have also highlighted several methodological issues related to the investigation of TTC judgements. We describe several examples of situations in which an observer seemed to be able to make judgments of TTC,
but upon further analysis it was revealed that the judgment was actually based on some perceptual variable other than a visual correlate of TTC (e.g. total change in object size). Such findings raise questions about the validity of the results of TTC experiments that have made no provision for checking that observers ignored task-irrelevant optical variables. Past research in the area of time to contact has been heavily biased towards exploring the role of τ in judging TTC (Tresilian, 1999), and consequently there are many important questions regarding binocular information about TTC that remain unanswered: How is the binocular information in equation (1) converted into a TTC signal? (Chapter 9 offers one possibility that is framed entirely in terms of retinal image information). Does the relative weighting of binocular and monocular TTC information differ for diverse eye-limb coordination tasks such as hitting, catching or running over rough ground? Is the relative weighting different for novice and expert performers? Does it change with practice? Clearly, the role of different information sources (other than τ) in the judgment of TTC and the control of collisions is a fertile and developing research area.
REFERENCES
Alderson, G.J.K., Sully, D.J. & Sully, H.G. (1974). An operational analysis of a one-handed catching task using high-speed photography. Journal of Motor Behaviour, 6, 217-226.
Bannister, H. & Blackburn, J. M. (1931). An eye factor affecting proficiency at ball games. British Journal of Psychology, 21, 382-384.
Beverley, K.I. & Regan, D. (1973). Evidence for the existence of neural mechanisms selectively sensitive to the direction of motion in space. J. Physiol., 235, 17-29.
Beverley, K. I. & Regan, D. (1979a). Separable aftereffects of changing-size and motion-in-depth: different neural mechanisms? Vision Res, 19(6), 727-732.
Beverley, K.I. & Regan, D. (1980). Visual sensitivity to the shape and size of a moving object: implications for models of object perception. Perception, 9, 151-160.
Beverley, K.I. & Regan, D. (1982). Adaptation to incomplete flow patterns: no evidence for "filling in" the perception of flow patterns. Perception, 11, 275-278.
Bootsma, R. J. & Oudejans, R. R. (1993). Visual information about time-to-collision between two objects. J Exp Psychol Hum Percept Perform, 19(5), 1041-1052.
Cavallo, V. & Laurent, M. (1988). Visual information and skill level in time-to-collision estimation. Perception, 17(5), 623-632.
Gray, R. (2001). Behavior of college baseball players in a virtual batting task. Journal of Experimental Psychology: Human Perception and Performance, in press.
Gray, R. & Regan, D. (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Res, 38(4), 499-512.
Gray, R. & Regan, D. (1999a). Adapting to expansion increases perceived time-to-collision. Vision Res, 39(21), 3602-3607.
Gray, R. & Regan, D. (1999b). Do monocular time-to-collision estimates necessarily involve perceived distance? Perception, 28(10), 1257-1264.
Gray, R. & Regan, D. (2000a). Estimating the time to collision with a rotating nonspherical object. Vision Res, 40(1), 49-63.
Gray, R. & Regan, D. (2000b). Risky driving behavior: a consequence of motion adaptation for visually guided motor action. J Exp Psychol Hum Percept Perform, 26(6), 1721-1732.
Gray, R. & Thornton, I. M. (2001). Exploring the link between time to collision and representational momentum. Perception, 30(8), 1007-1022.
Grosslight, J. H., Fletcher, H. J., Masterton, R. B. & Hagen, R. (1978). Monocular vision and landing performance in general aviation pilots: Cyclops revisited. Human Factors, 20, 127-133.
Harris, J. M. & Watamaniuk, S. N. (1995). Speed discrimination of motion-in-depth using binocular cues. Vision Res, 35(7), 885-896.
Heuer, H. (1993). Estimates of time to contact based on changing size and changing target vergence. Perception, 22(5), 549-563.
Hofeldt, A. J., Hoefle, F. B. & Bonafede, B. (1996). Baseball hitting, binocular vision, and the Pulfrich phenomenon. Arch Ophthalmol, 114(12), 1490-1494.
Howard, I. P. & Rogers, B. J. (1995). Binocular Vision and Stereopsis (Vol. 29). Oxford: Oxford University Press.
Hoyle, F. (1957). The Black Cloud. Middlesex, England: Penguin.
Kohly, R. P. & Regan, D. (1999). Evidence for a mechanism sensitive to the speed of cyclopean form. Vision Res, 39, 1011-1024.
Kohly, R. P. & Regan, D. (2002). Fast long-range interactions in the early processing of luminance-defined form. Vision Res, in press.
Laurent, M., Montagne, G. & Durey, A. (1996). Binocular invariants in interceptive tasks: A directed perception approach. Perception, 25(12), 1437-1450.
Lee, D. N., Lishman, J. R. & Thomson, J. A. (1982). Regulation of gait in long jumping. Journal of Experimental Psychology: Human Perception and Performance, 8, 448-459.
Lewis, C. E. Jr., Blakeley, W. R., Swaroop, R., Masters, R. L. & McMurty, T. C. (1973). Landing performance by low-time private pilots after the sudden loss of binocular vision - Cyclops II. Aerospace Med., 44, 1241-1245.
Lewis, C. E. Jr. & Kriers, G. E. (1969). Flight research program: XIV. Landing performance in jet aircraft after the loss of binocular vision. Aerospace Med., 40, 957-963.
McLeod, P., McLaughlin, C. & Nimmo-Smith, I. (1985). Information encapsulation and automaticity: evidence from the visual control of finely tuned actions. In M. I. Posner & O. S. M. Marin (Eds.), Attention and Performance (Vol. 11, pp. 391-406). New Jersey: Lawrence Erlbaum.
Pfaffman, C. (1948). Aircraft landings without binocular cues. A study based on observations made in flight. Amer. J. Psychol., 61, 323-335.
Portfors-Yeomans, C. V. & Regan, D. (1996). Cyclopean discrimination thresholds for the direction and speed of motion in depth. Vision Res, 36(20), 3265-3279.
Portfors-Yeomans, C. V. & Regan, D. (1997). Discrimination of the direction and speed of motion in depth of a monocularly visible target from binocular information alone. J Exp Psychol Hum Percept Perform, 23(1), 227-243.
Proffitt, D. R. & Kaiser, M. K. (1995). Perceiving events. In Perception of Space and Motion. Academic Press.
Pulfrich, C. (1922). Die Stereoscopie im Dienste der isochromen und heterochromen Photometrie. Naturwissenschaften, 10, 553-564.
Regan, D. (1992). Visual judgements and misjudgments in cricket, and the art of flight. Perception, 21(1), 91-115.
Regan, D. (1995). Spatial orientation in aviation: visual contributions. J Vestib Res, 5(6), 455-471.
Regan, D. & Beverley, K. I. (1978). Looming detectors in the human visual pathway. Vision Res, 18, 415-421.
Regan, D. & Beverley, K. I. (1979a). Binocular and monocular stimuli for motion in depth: changing-disparity and changing-size feed the same motion-in-depth stage. Vision Res, 19(12), 1331-1342.
Regan, D. & Beverley, K. I. (1979b). Visually guided locomotion: psychophysical evidence for a neural mechanism sensitive to flow patterns. Science, 205, 311-313.
Regan, D. & Gray, R. (2000). Visually guided collision avoidance and collision achievement. Trends in Cognitive Sciences, 4(3), 99-107.
Regan, D. & Hamstra, S. J. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Res, 33(4), 447-462.
Regan, D., Beverley, K. I. & Cynader, M. (1979). The visual perception of motion in depth. Scientific American, 241, 136-151.
Regan, D., Erkelens, C. J. & Collewijn, H. (1986a). Necessary conditions for the perception of motion in depth. Invest Ophthalmol Vis Sci, 27(4), 584-597.
Rushton, S. K. & Wann, J. P. (1999). Weighted combination of size and disparity: a computational model for timing a ball catch. Nat Neurosci, 2(2), 186-190.
Savelsbergh, G. J., Whiting, H. T. & Bootsma, R. J. (1991). Grasping tau. J Exp Psychol Hum Percept Perform, 17(2), 315-322.
Schiff, W. & Detwiler, M. L. (1979). Information used in judging impending collision. Perception, 8(6), 647-658.
Scott, M. A., Li, F. X. & Davids, K. (1996). The Shape of Things to Come: Effects of Object Shape and Rotation on the Pick-up of Local Tau. Ecological Psychology, 8(4).
Smith, M. R., Flach, J. M., Dittman, S. M. & Stanard, T. (2001). Monocular optical constraints on collision control. J Exp Psychol Hum Percept Perform, 27(2), 395-410.
Todd, J. T. (1981). Visual information about moving objects. J Exp Psychol Hum Percept Perform, 7(4), 795-810.
Tresilian, J. R. (1995). Perceptual and cognitive processes in time-to-contact estimation: analysis of prediction-motion and relative judgment tasks. Percept Psychophys, 57(2), 231-245.
Tresilian, J. R. (1999). Visually timed action: time-out for 'tau'? Trends in Cognitive Sciences, 3(8), 301-310.
Wann, J. P. (1996). Anticipating arrival: is the tau margin a specious theory? J Exp Psychol Hum Percept Perform, 22(4), 1031-1048.
Watts, R. G. & Bahill, A. T. (1991). Keep Your Eye on the Ball: Curve Balls, Knuckleballs, and Fallacies of Baseball. New York: W.H. Freeman and Company.
Wheatstone, C. (1852). Contributions to the physiology of vision II. Philos Trans R Soc Lond, 142, 259-266.
CHAPTER 14 Interception of Projectiles, from When & Where to Where Once
Simon K. Rushton Cardiff University, Cardiff, UK
ABSTRACT In this chapter the where once model is introduced. The where once model is concerned with the prediction of the future egocentric position of a moving object. It is a very simple model based upon four visual variables to which humans have a documented sensitivity. The proposed model is a general purpose predictive algorithm that could be used in conjunction with a variety of perceptuo-motor control strategies and in a variety of circumstances.
1 Introduction & background What visual information is used during interception of an approaching projectile? For the majority of the past thirty years, researchers have attempted to answer this question by addressing themselves to two related questions: how does an observer know where the projectile is going and when it will get there? Work on the former has involved identifying possible sources of information about the trajectory of the projectile or the lateral distance at which it will pass the observer. Work on the latter has been concentrated on estimation of 'time-to-contact' (TTC), that is, the number of seconds remaining before the projectile hits or passes the observer. Although not all researchers would necessarily acknowledge such an inspiration, this approach can be identified with the Gibsonian, or Ecological programme (Gibson, 1979). The Ecological programme has concentrated on finding patterns or relationships within the optic flow field that can be used to directly regulate action. A classic example of this is the 'focus of expansion' and the visual guidance of locomotion: The "flow-field" is the optic array sampled at a moving observation point by an eye or camera. It was noted that when an observation point translates through an environment, motion within the flow-field appears to stream out from a single point. This point is the "focus of expansion" and it indicates the direction of travel of the observer (Grindley, cited in Mollon, 1997; Gibson, 1958). Hence observers can identify the focus of expansion and determine its position relative to the image of their target (for example a doorway). If the two are not coincident then the observer can change their direction of locomotion appropriately. The critical point is that all the information necessary for the solution of the problem of visual guidance of locomotion is found within the optic array. There is no need to consider anything about the observer - the nature or state of their motor system, or visual system, or how the two are connected. Nor is there any need to build any form of representation of the outside world with which the observer interacts. In the case of projectile interception, tau (when) and the crossing distance (where) are the equivalents of the focus of expansion. Tau, τ, is defined as the ratio of the current angular extent of an approaching object to its rate of change of extent (Lee, 1976; see equation [1] below). This ratio 'directly' specifies the number of seconds remaining before the object will collide with the observation point (assuming the velocity of the object remains constant); there is no need to determine distance and velocity. Tau is information that is available in the optic array at the observation point just waiting to be picked up by any creature (or device) with an appropriate perceptual system. In Lee's original example, braking a car, a driver can brake safely by simply monitoring and responding to τ. Again, no knowledge of the perceptual or motor systems is required. So long as the observer can increase or decrease the braking, they can
"couple" their braking action directly to the information they receive from their visual system. Since Lee's original paper attempts have been made to extend the use of tau to other problems including interception of projectiles (eg Lee et al, 1983, ball punching). The crossing distance, X c , ratios are also informative patterns in the optic flow field waiting to be picked up. Several different crossing distance equations have been proposed (described below). Again, as with X and the focus of expansion, all other parameters (such as eye orientation) are either assumed not to be necessary, or in some cases minor details that are "obvious" or that can be specified later. The tau model has come under some attack. It is assumed that other chapters in this book provide an up to date summary, but to give a few selective examples, Wann (1996) questioned the data reported to support the use of X. Tresilian (1999) questioned the assumption that X is perceived 'directly', that is from the retinal ratio and suggested that other cues and strategies might be used in the perception of TTC. Michaels et al (2001) questioned whether interceptive timing is based upon X, they suggested that the initiation of an interception action might be based upon a simpler quantity, looming rate. However none of these researchers have challenged the basic when & where idea. The major challenge to the when & where idea came from Peper et al (1994) who proposed that interception is based upon continuous control. In their model, the observer never needs to explicitly know where a projectile is going. The observer can use a continuous control strategy that couples hand movement to optical information so moving their hand to the right place at the right time to intercept a ball (this strategy is covered in more detail later). The where once proposal outlined in this chapter starts with a consideration of the ecological problem for an observer who wants to interact with an object. The where once model might be termed an egocentric model in that it starts with the assumption that the critical problem that an observer must solve to successfully guide action is to determine where an object is relative to their body1, and if observer or the object is moving, to anticipate where the object will be once a period of time has elapsed. Knowing where an object will be once a given period of time has elapsed would be equally useful whether the observer is picking up a teacup, running to tackle a rugby player on the opposing team or catching a ball. In the first part of this chapter the problems with the when & where solution are reviewed. The when & where solution is found wanting, and a more traditional "secondary-school-physics" solution is considered, and relevant data discussed. In the final section the where once solution is introduced. The where once proposal starts from an 'egocentric' perspective on the problem and then 1
cf. the egocentric account of the visual guidance of locomotion (Rushton et al, 1998)
assembles a solution that draws heavily on data and ideas culled from the existing literature on interception, τ and motion-in-depth.
2. When & where In this chapter I do not provide an extensive review of the when & where literature. If the reader requires a summary they can either consult other chapters within the book or review papers on TTC by Tresilian (1999) or the visual guidance of interception by Regan & Gray (2000). 2.1 When TTC is the number of seconds remaining before the projectile collides with the observation point. τ is a first-order approximation of TTC, based upon the assumption that the velocity of the projectile is constant (Lee, 1976):
\[ \tau = \theta / \dot{\theta} \tag{1} \]

where θ is the size of the retinal image of the projectile, and θ̇ is the rate of change of retinal image size. Related information such as Time-To-Passage (TTP), the number of seconds remaining before the projectile crosses the fronto-parallel plane containing the observation point, has also been identified (Bootsma & Craig, 2002 is the most recent paper on this topic). Unfortunately interception normally occurs in front of the observation point. Therefore neither τ nor TTP information provides sufficient information for interceptive timing under natural circumstances. The estimate of τ or TTP needs to be "corrected" for interception before the observation point. First the distance of the interception point from the observation point must be known. Then either the velocity of the approaching projectile, or its instantaneous distance, is required. However, use of either distance or velocity completely undermines the notion of "direct" perception. There are some cases for which potential "direct" solutions exist. One example would be when the hand and the projectile are fortuitously simultaneously in view. In this situation relative disparity could be used to directly estimate TTC with the hand. But such circumstances are special cases and are not representative of the normal circumstances under which projectiles are intercepted. Tau information is no more "sufficient" for interceptive timing than knowledge of only object distance, or only object speed.
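To make the argument concrete, here is a minimal numerical sketch (not from the original text; the ball size, speed and interception distance are invented for illustration) of why τ alone under-specifies interceptive timing: correcting τ for an interception point in front of the eye requires the object's distance (or, equivalently, its speed).

```python
import math

def tau(theta, theta_dot):
    """First-order TTC with the observation point (equation [1]): tau = theta / theta_dot."""
    return theta / theta_dot

def ttc_at_interception_point(theta, theta_dot, D, d_hand):
    """Time remaining until the projectile reaches a point d_hand metres in front of the
    observation point. Illustrates the text's point: the 'correction' needs the object's
    distance D (or speed), so tau alone is not sufficient for interceptive timing."""
    t = tau(theta, theta_dot)
    closing_speed = D / t                 # constant-velocity assumption
    return t - d_hand / closing_speed     # equals t * (1 - d_hand / D)

# Worked example (hypothetical numbers): a 7 cm ball, 10 m away, approaching at 10 m/s,
# to be intercepted 0.5 m in front of the eye.
D, R, v, d_hand = 10.0, 0.035, 10.0, 0.5
theta = 2 * math.atan(R / D)                     # angular size (rad)
theta_dot = 2 * R * v / (D**2 + R**2)            # rate of change of angular size (rad/s)
print(round(tau(theta, theta_dot), 3))                               # ~1.0 s to the eye
print(round(ttc_at_interception_point(theta, theta_dot, D, d_hand), 3))  # ~0.95 s to the hand
```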
2.2 Where Regan & Beverley (1980) pointed out that if we consider the speed of the lateral movement of an approaching ball within the left-eye retinal image (or optic array) and compare it to the speed in the right-eye retinal image, the ratio of the speeds is correlated with the trajectory of the ball. When the ball is travelling straight towards the nose the ratio will be -1:1; when the ball is travelling towards one of the eyes the ratio will be 0:1 or 1:0. Bootsma (1991) showed that the lateral distance at which a projectile will cross the plane containing the eyes (the 'crossing distance'), X_c, is:

\[ X_c = 2R\,\frac{\dot{\alpha}}{\dot{\theta}} \tag{2} \]

where α̇ is the angular lateral speed of the projectile at the Cyclopean eye and R is the radius of the projectile. Other variants of this equation based upon changing disparity have subsequently been proposed (Laurent et al., 1996; Regan & Kaushall, 1994):

\[ \frac{X_c}{I} = \frac{\dot{\alpha}}{\dot{\phi}} \tag{3} \]

where φ̇ is the rate of change of binocular disparity and I the inter-ocular separation. Lastly, X_c can be calculated from the relative left/right eye image velocities:

\[ \frac{X_c}{I} = \frac{\dot{\alpha}_R/\dot{\alpha}_L + 1}{\dot{\alpha}_R/\dot{\alpha}_L - 1} \tag{4} \]

where α̇_R is the angular speed of the ball within the image in the right eye and α̇_L the speed in the left eye. Note the three equations above are approximations. Below (Figure 1) iso-ratio curves are shown in plan view. The Cyclopean eye (mid-point between the left and right eye) of the observer is located at (0,0). In the three panels the ratio α̇/φ̇ is constant (iso-ratio curves for α̇/θ̇ or α̇_R/α̇_L would look similar). To aid an explanation of how the curves were generated, let us first define β as the angle between straight-ahead (x=0) and the direction of travel of the projectile. We will also assume an inter-ocular distance, I, of 64 mm.
In the left panel the ratio α̇/φ̇ was calculated for a projectile that is at (0,5) and has a trajectory angle, β, of 10°. If the projectile continued along this trajectory it would cross the x-axis 0.88 m to the right of the Cyclopean eye (X_c = 5·tan(10°)). The ratio α̇/φ̇ is 13.8 (α̇/φ̇ = X_c/I = 0.88/0.064). Using this value for α̇/φ̇, iso-ratio lines were generated from a grid of starting points (marked by circles in each panel). At every point, on every line, within a panel, the ratio α̇/φ̇ is the same (left panel α̇/φ̇ = 13.8; middle panel α̇/φ̇ = 0; right panel α̇/φ̇ = -28.4). The consequence of this is that at every point within a panel the value of X_c calculated using equations [2-4] is the same (left panel X_c = 0.88 m; middle panel X_c = 0; right panel X_c = -1.82 m). Recall X_c is the lateral distance at which the projectile will cross the x-axis (the fronto-parallel plane containing the eyes) assuming its velocity, or more specifically its direction of travel, does not change. At any point on one of the iso-ratio lines, the current direction of travel is the tangent to the curve. In the left and right panels the tangents clearly do not converge at a single point on the x-axis. The tangents do cross the x-axis at the estimated position X_c when the projectile is straight ahead. However, the error in estimation of X_c from equations [2-4] increases as a function of eccentricity, nearness and the lateral distance at which the projectile will pass the Cyclopean eye. This is unfortunate as it means an estimate of X_c will in general become more and more inaccurate as it becomes more and more important - just before interception.
Figure 1: Iso-ratio curves (generated by holding the α̇/φ̇ ratio constant). Cyclopean eye at (0,0). Within a panel, at all points along all curves, the α̇/φ̇ ratio is the same. The α̇/φ̇ ratio was calculated for a projectile at (0,5). In the left panel the direction of travel, β, is 10° and the projectile would cross the x-axis 0.88 m to the right of the Cyclopean eye. In the middle panel β = 0°, and the intersection point would be at the Cyclopean eye. In the right panel β = -20° and the projectile would cross 1.82 m to the left of the Cyclopean eye.
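The behaviour summarised in Figure 1 can be checked numerically. The following sketch (my own illustration; the speed and sampling times are assumptions) computes the crossing-distance estimate of equation (3) at successive points along the left-panel trajectory, and shows that the estimate is exact while the ball is straight ahead but degrades as the ball becomes near and eccentric.

```python
import numpy as np

I = 0.064  # inter-ocular separation (m), as assumed in the text

def xc_estimate(x, z, vx, vz):
    """Crossing-distance estimate from equation (3): X_c = I * alpha_dot / phi_dot,
    using small-angle approximations for the Cyclopean azimuth and the disparity."""
    alpha_dot = (vx * z - x * vz) / (x**2 + z**2)   # Cyclopean lateral angular speed
    phi_dot = I * (-vz) / z**2                      # rate of change of binocular disparity
    return I * alpha_dot / phi_dot

# Trajectory from the left panel of Figure 1: start at (0, 5), beta = 10 deg, speed assumed 8 m/s.
beta, speed = np.radians(10.0), 8.0
vx, vz = speed * np.sin(beta), -speed * np.cos(beta)
x0, z0 = 0.0, 5.0
true_xc = x0 + vx * (z0 / -vz)                      # 5 * tan(10 deg) = 0.88 m

for t in np.arange(0.0, 0.6, 0.1):
    x, z = x0 + vx * t, z0 + vz * t
    est = xc_estimate(x, z, vx, vz)
    print(f"z = {z:4.2f} m   estimated X_c = {est:5.3f} m   error = {est - true_xc:+.3f} m")
# The estimate equals 0.88 m while the ball is straight ahead, and the error grows as the
# ball gets nearer and more eccentric, as argued in the text.
```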
The problem of the inaccuracy of X_c can be sidestepped by saying that X_c is only calculated when the projectile is straight ahead. However, that would require that a decision-stage be added prior to the estimation of X_c to check whether the projectile is straight ahead, α = 0. To determine the direction of the projectile, α, would require use of information about eye-orientation. This would appear to conflict with the spirit of the claim that information about trajectory from speed ratios is "independent of direction of gaze and angle of convergence" (e.g. Regan & Gray, 2000). Second, just a single estimate of X_c during the whole time-course of the trajectory is not likely to be especially useful. Another solution would be to modify equations [2-4] to include α, but this would seem to undermine the notion that where can be perceived 'directly' as it would require use of extra-retinal information about eye orientation. However, perhaps the most fundamental problem with the use of X_c is the same problem encountered with use of τ information: X_c is information about where the projectile will cross the fronto-parallel plane containing the observation point. This is not where interception typically occurs. Therefore again, the crossing distance information would need to be supplemented or "corrected" somehow. 2.3 When & where: An appraisal As should be apparent from the above, the when & where solution to interception is incomplete. To make this statement concrete, it would not be possible to write a computer program to guide a robot hand to intercept a projectile using only the where and when information; where and when information is not sufficient. To attempt to construct models of interception with where and when information as the sole or primary source of information is logically no different to trying to construct models with only positional information or only velocity information. In all cases the models would be insufficient to successfully guide interception. This is not a particularly novel argument. Tresilian has made the same point regarding τ on numerous previous occasions (Tresilian, 1991, 1994, 1995). Given that over a decade has elapsed since Tresilian first brought the issue regarding τ to attention, and given that there are problems of a similar kind with the use of X_c, it is no longer tenable to side-step such fundamental problems.
3. A Cartesian solution? If the 'direct' solution is found wanting, is it time to return to a more traditional model, one that might be termed the secondary-school-physics model? It is known that observers can perceive the position of objects close to the body accurately. Studies suggest that the position of objects in near space is perceived metrically (e.g. Frisby, Buckley & Duke, 1996). Therefore it seems intuitive to assume that changing position, or velocity, is also perceived metrically, within limits. If this is so then an observer could anticipate future projectile position and trajectory from a combination of position and velocity information (see Tresilian, 1991 regarding the use of distance and velocity information in the perception of TTC). Let us look at perception of velocity. There are two components to velocity: speed and direction. 3.1 Perception of metric speed McKee & Welch (1989) examined the perception of lateral speed. Observers had to compare the speed of objects moving laterally in different disparity-defined depth planes. They concluded that observers were not able to make accurate judgements of "real" or "3D" lateral speed. This contrasted with their ability to make accurate judgements of retinal or angular lateral speed. Other researchers have used similar tasks but with displays, scenes or stimuli that are considerably "richer" in potential information. They found that in those circumstances observers could judge 3D lateral speed. A review of these studies is to be found in Howard & Rogers (2001). An obvious question is whether the visual information available to an observer attempting to catch a ball or otherwise intercept a projectile is comparable to that available in the McKee & Welch study. If we consider the circumstances under which observers intercept projectiles, a case can be made that such a comparison is valid: often the size of the projectile is not known and, with the projectile flying through empty space, most relative and pictorial cues are missing. If this is so then it follows that observers intercept projectiles without use of veridical information about 3D lateral speed. With Phil Duke, I have been investigating the related question of whether observers can accurately perceive the speed of motion-in-depth. We have concluded that they cannot (see Regan et al., 1998 for a suggestion that perceived speed in depth might be a function of TTC rather than actual speed). This conclusion is based upon the finding (as yet unpublished data) that observers cannot make accurate judgements of the relative speed of motion-in-depth of an approaching (looming, anti-aliased, faceted) ball presented on a stereoscopic computer
display. More distant objects are systematically perceived as travelling at a slower speed than close objects moving at an identical 3D speed. The same finding occurred with a very different experimental setup: we replaced the computer-generated image of a ball with a laser spot that moved over a surface, just below eye-level, towards an observer. We found that when the task was done in the dark (but with numerous luminous references for relative disparity), observers produced a similar under-estimation of the relative speed of distant objects. From our results we infer that in most cases of projectile interception, when the projectile flies through the air, it is very unlikely that the observer has access to 3D speed information (however, we think such information probably is available to a football player intercepting a ball rolling over the pitch). 3.2 Perception of trajectory Given the crossing distance, X_c, it should be apparent that if the instantaneous distance, D, is known then β, the angle between the radial line connecting the observation point to the object, and the trajectory of the ball, can be calculated using simple trigonometry (see Figure 2 below).
Figure 2: Plan view of head and ball, with trajectory and variables indicated. Current (solid circle), previous and future (open circles) positions of the ball are indicated.
Equation (4) provides an estimate of X_c which can be used in the calculation of β, the angle between straight-ahead and the direction of travel of the projectile:

\[ \beta = \tan^{-1}\!\left[\frac{I\,(\dot{\alpha}_R/\dot{\alpha}_L + 1)}{D\,(\dot{\alpha}_R/\dot{\alpha}_L - 1)}\right] \tag{5} \]

where I is the interpupillary separation, D is the instantaneous distance, α̇_R is the retinal speed of the projectile in the image at the right eye and α̇_L is the speed at the left eye. β can also be determined from the ratio of lateral speed to changing disparity (Cumming & Parker, 1994; Regan, 1993):

\[ \beta = \tan^{-1}\!\left[\frac{I\,\dot{\alpha}}{D\,\dot{\phi}}\right] \tag{6} \]
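A quick numerical check of equation (6), under the same small-angle approximations used above (the geometry and speeds are invented for illustration):

```python
import numpy as np

I = 0.064  # interpupillary separation (m)

def beta_from_speed_ratio(alpha_dot, phi_dot, D):
    """Trajectory angle from equation (6): the ratio of lateral angular speed to the rate of
    change of disparity, scaled by the instantaneous distance D and the separation I."""
    return np.degrees(np.arctan(I * alpha_dot / (D * phi_dot)))

# Ball straight ahead at D = 5 m, heading 10 deg to the right of straight ahead at 8 m/s
# (the same situation as the left panel of Figure 1).
beta_true, speed, D = 10.0, 8.0, 5.0
vx, vz = speed * np.sin(np.radians(beta_true)), -speed * np.cos(np.radians(beta_true))
alpha_dot = vx / D              # Cyclopean lateral angular speed (rad/s), object straight ahead
phi_dot = I * (-vz) / D**2      # rate of change of binocular disparity (rad/s)
print(beta_from_speed_ratio(alpha_dot, phi_dot, D))   # ~10 deg, recovering the true angle
```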
Lastly, β can be calculated using D, looming rate and known object diameter. Regan & Gray (2000) provide a recent review of the equations and research related to perception of trajectory, τ, crossing distance, etc. As an aside, the equations above are similar to those found in the literature on slant perception (as logically they should be). Equation (5) is the equivalent of the slant equation that uses horizontal size disparity (e.g. see Backus et al., 1999), and equation (6) can be compared to specifications of surface slant from the flow components (see Meese et al., 1995 for an entry into this literature). 3.2.1 Trajectory and coordinate frames We return to an issue discussed earlier regarding the need to know α. Consider a ball moving along its trajectory. At T₋₁ in Figure 2, β is the angle at which the object trajectory will cross the "nose axis". At T₊₁, β is the angle at which the object trajectory did cross the nose axis. The equations in the previous section only give β when the object is currently straight ahead, or cutting the nose axis (at T₀ in Figure 2). At any other time, to calculate β it is necessary to know the current direction of the object, α. The question that arises is, can observers judge β when the object is not straight ahead? Or, phrased another way, do observers take into account the direction α? Would they be able to recognise that two object trajectories are physically parallel if the objects are not located in the same direction relative to the head?
As there is no data indicating that observers can compare trajectories when the objects are at different distances, we should also explicitly ask the question: can observers take into account the instantaneous distance, D? If two objects' trajectories are physically parallel, would an observer be able to recognise they were parallel if the objects were not at the same distance? If observers do not take into account D and α, and thus are unable to recognise whether two object trajectories are physically parallel when the objects are not at exactly the same position in space, an observer cannot be said to be able to perceive trajectory. Or rather, they are unable to perceive β, which we might describe as the "3D trajectory" or "physical trajectory". 3.2.2 Empirical studies into the perception of trajectory Recently, with Phil Duke, I assessed observers' ability to judge trajectory using a task that identified perceptually parallel trajectories (data as yet unpublished). We found that observers took neither direction, α, nor distance, D, into account when judging relative trajectory. In other words, observers were unable to judge 3D trajectory. Observers were found to be basing their judgements on a speed ratio, the ratio of lateral angular speed, α̇, to a motion-in-depth term. An illustrative plot is shown in Figure 3 below.
Figure 3: Apparently parallel trajectories (for one observer). The reference trajectory starts at (0, 1.25) and travels in the direction of (0.01, 0), (0.02, 0), (0.04, 0) or (0.08, 0). Perceptually parallel trajectories start at (0, 1.9). Note the ball only travelled a small portion of the trajectory (approximately 50 cm from the start); the trajectory lines are extended to the x-axis for illustrative purposes.
Another way of describing the results is to say that the observers' judgements were closer to what would be expected if observers were judging the lateral crossing distance, X_c, rather than β. However, use of X_c could not fully account for the observers' judgements. It appears (data as yet unpublished) that the motion-in-depth term is a sum of changing disparity, φ̇, and changing retinal size, θ̇ (in line with earlier findings by Regan & Beverley, 1979 on perceived speed of motion-in-depth and Rushton & Wann, 1999 on the combination of changing disparity and changing retinal size information in the perception of TTC). The ratio of α̇ to the sum of φ̇ and θ̇ varies as a function of object size and therefore does not specify X_c. 3.3 A Cartesian solution, a summary Observers appear unable to perceive metric speed or 3D trajectory. Therefore they cannot judge velocity, which rules out the use of a standard Cartesian solution. 3.4 A pure positional solution? Could observers base interception solely on instantaneous position? Peper et al. (1994) did propose something like this. They suggested that observers use a continuous control strategy based upon an estimate of "required velocity", Ẋ_req:

\[ \dot{X}_{req} = \frac{X_{proj} - X_{hand}}{\tau} \]
where X_proj is the current lateral position of the projectile and X_hand is the current lateral position of the hand. However, it can be seen that this model has many of the problems of the when & where account: it only works for interception in a fronto-parallel plane, and it does not provide for a hand that might also move in depth. Second, an analysis of time-course interception data shows that in at least one interception task, observers use predictive information about projectile position (Girshick, Rushton & Bradshaw, 2001). Therefore, the required-velocity model as currently formulated does not capture the general action of projectile interception. However, other chapters in this book may describe recent developments of the required-velocity model and so should be consulted.
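For comparison with the where once approach developed below, the following is a minimal sketch of the required-velocity strategy as described above (the flight geometry, time step and the restriction of the hand to a single lateral axis are my own illustrative assumptions, not values from the text):

```python
def simulate_catch(x_ball=1.0, z=6.0, vx=-0.8, vz=-6.0, x_hand=0.0, dt=0.01):
    """Continuous control of lateral hand position after Peper et al. (1994)."""
    while z > 0:
        tau = z / -vz                        # first-order estimate of the time remaining
        v_req = (x_ball - x_hand) / tau      # required velocity: lateral gap / time remaining
        x_hand += v_req * dt                 # hand velocity slaved to the required velocity
        x_ball += vx * dt
        z += vz * dt
    return x_ball, x_hand

x_ball, x_hand = simulate_catch()
print(f"ball crossed at x = {x_ball:.2f} m, hand finished at x = {x_hand:.2f} m")
# The hand homes in on the ball's crossing point without that point ever being computed
# explicitly -- but note that the hand never leaves the fronto-parallel axis.
```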
4. Where once, an egocentric model Here it is suggested that observers attempt to anticipate where the projectile will be (relative to the body) once a period of time has elapsed. The solution offered, the estimation of future position, is simple and general. The where once model is based upon four simple perceptual variables to which humans have a documented sensitivity. The four variables are egocentric distance and direction, tau (see the work of Regan and colleagues that documents psychophysical sensitivity, rather than the naturalistic studies concerned with the use of τ) and a speed ratio (given by the lateral speed and a motion-in-depth signal). Second, the where once model is a general model. It can be used whenever an observer wishes to intercept an object that has motion relative to the body. Existing models treat projectile interception (the interception of a projectile approaching the body) separately from the interception of laterally moving objects or objects that are moving away from the observer. Such a distinction logically requires an additional system, or decision stage, located prior to the "interception systems", that must determine the direction of travel of the object and then assign the appropriate system responsibility for guiding action. Similarly, any decision system would also need to distinguish between a moving or stationary object. The where once model neither makes a distinction between static and moving objects, nor does its function depend on the direction of travel. The simplicity of a solution and the choice of parameters is also important. As is obvious, there are an infinite number of mathematically equivalent solutions to the problem of projectile interception. Many can be ruled out in advance as they involve variables to which humans are not sensitive (such as velocity in the secondary-school-physics solution above), are incomplete (such as the required-velocity model), or are overly complex and case-specific. Below the four variables of the where once model are reviewed. 4.1 Where once and perceptual variables 4.1.1 D and evidence for its use As noted earlier, an observer must use either distance or velocity information during interception. The previous section described findings that are incompatible with a human sensitivity to speed of motion-in-depth in conditions that are similar to those found during projectile interception. Therefore, given that observers clearly can intercept projectiles, it follows that distance information must be implicated.
No published studies to date have directly examined the use of distance information during interception. However, previously reported data can be seen to be compatible with the hypothesis that distance information is used during interception: when the inter-ocular separation is increased through use of telestereoscopes, or reduced to zero through use of Cyclopean glasses, the perceived distance of an object should change. The simplest way to think of the direction of the predicted effects is to consider vergence angle. In the case of telestereoscopes, the vergence angle required to foveate an object at a given distance will increase. Under natural circumstances an increase in vergence angle is associated with an object being brought closer. The opposite is true of Cyclopean glasses. With Cyclopean glasses the observer must align the visual axes of the eyes so that they are parallel. Under normal circumstances an observer views an object with eyes parallel when the object is distant. (Note we do not expect all objects seen through Cyclopean glasses to appear at infinity, because perception of distance is believed to result from cue-combination; the likely effect of Cyclopean glasses is therefore to compress the range of distances in a scene, making close objects appear further away but leaving far objects unchanged.) In the literature, different observers have given different reports about their perception when viewing through telestereoscopes and Cyclopean glasses. However, one consistent result is that when observers attempt to reach for objects when wearing these devices, they do make errors in the predicted directions (e.g. van der Kamp et al., 2001). Usefully, neither telestereoscopes nor Cyclopean glasses should change τ. Consider the dipole model of Rushton & Wann (1998) for computation of τ (but most any other model can be substituted):
(8)
It should be apparent that τ would be unchanged by either device. However, if we look at interceptive timing we find that observers swipe earlier with telestereoscopes (Bennett et al., 1999, 2000; van der Kamp et al., 1999; Mon-Williams, Wann & Rushton, 1996) and later with Cyclopean glasses (Mon-Williams, Wann & Rushton, 1996), which is exactly what would be predicted if a distance term were used, as telestereoscopes should make the projectile appear closer and Cyclopean glasses make it appear more distant. Therefore, these results are compatible with a central role for distance information during projectile interception.
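The vergence logic behind these predictions can be illustrated with a little geometry (the object distance and the doubling of the effective inter-ocular separation are assumptions chosen purely for illustration):

```python
import math

def vergence_deg(D, I):
    """Vergence angle (degrees) needed to fixate a point at distance D metres with an
    effective inter-ocular separation of I metres."""
    return math.degrees(2 * math.atan(I / (2 * D)))

def distance_signified(vergence, I_normal=0.064):
    """Distance that a given vergence angle corresponds to under normal viewing."""
    return (I_normal / 2) / math.tan(math.radians(vergence) / 2)

D = 2.0  # true object distance (m)
for label, I_eff in [("normal", 0.064), ("telestereoscope (x2)", 0.128), ("Cyclopean (0)", 0.0)]:
    v = vergence_deg(D, I_eff)
    d = distance_signified(v) if v > 0 else float("inf")
    print(f"{label:22s} vergence = {v:5.2f} deg  ->  signified distance = {d:5.2f} m")
# Doubling the effective separation roughly doubles the vergence demand, which under normal
# viewing corresponds to an object at about half the distance; zero separation corresponds to
# an object at optical infinity (before other distance cues are combined).
```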
4.1.2 Evidence for the use of direction information Again, logically we can conclude that observers must use direction information during interception. Empirical evidence comes from a study (Rushton & Bradshaw, 1999) in which observers attempted to intercept an approaching projectile whilst viewing through prisms. The observers made systematic errors, an example of which can be seen in Figure 4 below.
Figure 4: Interceptive actions when wearing prisms. Plan view, dimensions measured in millimetres. Approximate position of Cyclopean Eye indicated by square at (0, -250), a fronto-parallel plane also shown. Ball travels from near (0, 500) towards observer. Finger of right hand is moved from left leg, near (-150, -150) up to intercept the ball.
Prisms perturb perceived direction, and the angular error observed during reaching was directly proportional to the optical displacement of the prisms (head position was held constant, so the error was introduced in the eye-orientation signal). Therefore we can conclude that the observers were making use of information about the perceived instantaneous direction of the projectile. 4.1.3 Tau and a speed ratio Extensive work by Regan and colleagues has documented the sensitivity of the visual system to τ. The criticisms of the "tau hypothesis", that tau is used to guide interceptive timing, do not have any relevance to the extensive body of
work by Regan that is concerned with psychophysical sensitivity. Therefore it is taken as demonstrated that the brain has access to τ information. The speed ratio could be any one of the previously identified ratios, α̇/φ̇, α̇_R/α̇_L or α̇/θ̇. Although a lot of data is compatible with a sensitivity to a speed ratio, none of the published research fully discriminates between the three (the recent research on perception of trajectory I have conducted with Phil Duke indicates it is probably α̇/(φ̇ + θ̇)). However, for the sake of simplicity, in the section that follows the ratio α̇/φ̇ will be assumed (note the where once model works with any of the speed ratios).
4.2 Predicting future position
4.2.1 Tau and current distance - anticipating future distance
We assume that the current distance of an object, D₀, is known, and also τ, the number of seconds before it will collide with the observation point. If the object is (or is assumed to be) travelling at a constant velocity then the distance that the object will travel towards the observer (Δd) in a given interval of time, Δt, is

\[ \Delta d = D_0\,\frac{\Delta t}{\tau} \tag{9} \]

and the distance from the observer after the interval Δt has elapsed, D_t, is

\[ D_t = D_0 - \Delta d = D_0\left(1 - \frac{\Delta t}{\tau}\right) \tag{10} \]

As binocular disparity is proportional to the reciprocal of distance, we can substitute 1/φ for D in [9] and [10] to give Δφ and φ_t.
4.2.2 Alpha dot / Phi dot and the current direction - anticipating future direction
The ratio α̇/φ̇ is a term in some of the equations for crossing distance and trajectory angle above. The experiments on the perception of trajectory (3.2.2) show that observers are sensitive to differences in the α̇/φ̇ ratio. We use this ratio to predict the change in direction Δα from the predicted change in disparity:

\[ \Delta\alpha = \frac{\dot{\alpha}}{\dot{\phi}}\,\Delta\phi \tag{11} \]
and by summing this quantity with the current direction, α, we get the predicted direction at time t, α_t:

\[ \alpha_t = \alpha_0 + \Delta\alpha \tag{12} \]
Note there are some errors associated with the above calculation of α_t. The where once model provides approximate information. Full mathematical derivations (including equations for future height or vertical gaze direction and analysis of robustness to noise and characteristic mis-estimations) will be presented elsewhere.
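Putting equations (9)-(12) together, a minimal sketch of the where once prediction step might look as follows (the scene geometry and the use of a noise-free, small-angle world are my own assumptions; the chapter's full derivation is to appear elsewhere):

```python
import math

I = 0.064  # inter-ocular separation (m); used to convert distance to disparity

def predict_once(D0, alpha0, tau, alpha_dot, phi_dot, dt):
    """Predict egocentric distance and direction once an interval dt has elapsed."""
    D_t = D0 * (1 - dt / tau)                 # equation (10): predicted distance
    phi0, phi_t = I / D0, I / D_t             # disparity ~ reciprocal of distance
    d_phi = phi_t - phi0                      # predicted change in disparity
    d_alpha = (alpha_dot / phi_dot) * d_phi   # equation (11): predicted change in direction
    return D_t, alpha0 + d_alpha              # equation (12): predicted direction

# Example: ball at 5 m, straight ahead, approaching at 8 m/s with a 1 m/s rightward component.
D0, vz, vx = 5.0, -8.0, 1.0
tau = D0 / -vz
alpha_dot = vx / D0                           # lateral angular speed (rad/s), object straight ahead
phi_dot = I * -vz / D0**2                     # rate of change of disparity (rad/s)
D_t, alpha_t = predict_once(D0, alpha0=0.0, tau=tau, alpha_dot=alpha_dot,
                            phi_dot=phi_dot, dt=0.2)
print(f"in 0.2 s: distance ~ {D_t:.2f} m, direction ~ {math.degrees(alpha_t):.1f} deg")
# Ground truth for comparison: after 0.2 s the ball is at (0.2, 3.4) in head-centred
# coordinates, i.e. about 3.41 m away and 3.4 deg to the right of straight ahead -- in close
# agreement with the predicted values.
```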
5. Strategies Using the where once equations the observer could predict the position of the object at any time in the future. For example, the position of the object 200 msec into the future could be predicted throughout the trajectory (this could be done to compensate for perceptual processing delays). Alternatively, a running estimate of the position of an approaching projectile 300 msec before contact could be maintained. Other strategies might be to wait until the approaching object is closer than a certain distance, or has a looming rate above a certain critical value, and only then begin prediction. Lastly, if collision is imminent, the observer could simply predict where the projectile will have moved to in the time it takes to make a reaching movement. So how far ahead into the future should an observer look, and when should, or when can, they start the process? Some of these issues are considered in this section. 5.1 Constraints Let us consider the preconditions for visually guiding interception and a likely sequence of events. First, the observer detects that an object is moving. Once movement is detected, she could initiate an interceptive movement. However, discrimination normally takes longer than detection and, therefore, it is likely to be some time later that the observer is able to determine the object's motion characteristics (the particular perceptual variables being hypothesised to be τ and α̇/(φ̇ + θ̇)). If the observer wishes to start an interceptive action as soon as object motion is detected, but before object motion is discriminated, then there are several possible strategies available. The observer could initially move towards the current instantaneous position of the object. Alternatively, they could use knowledge of the constraints on movement of the object (for example consider
an object that moves along a rail, or is at the top of an inclined surface, or is about to be dropped through the air). Another possibility would be to base a prediction on knowledge of the movement of previous objects (e.g. perhaps all other objects moved left at 1 m/s). Once the observer can discriminate object movement, she should be able to make a prediction of the future position of the object. However, the accuracy of the estimate of movement is likely to increase with continued viewing. This is for two reasons. First, the precision of speed discrimination increases as a function of "inspection time", that is, the time for which the object has been viewed. This effect can be seen in Figure 5 below. The figure shows performance in a two-interval speed discrimination task. The speed of object motion in the two intervals differed by 20%; the observers' task was to decide which was faster. Performance (fraction correct) was found to increase as a function of inspection time and object speed.
Figure 5: Fraction correct (0.5 = 50%) in a speed discrimination task as a function of "inspection time" (in seconds). Group average for 5 observers. Three different speeds, fast (diamonds), standard (squares) and slow (circles).
Second, if the object is initially distant then motion is likely to be at or below threshold. If the object is moving towards the observer then the motion will move into the suprathreshold range and the precision of the estimate of motion will improve accordingly.
5.1.1 Detecting and discriminating movement If we assume that an observer wishes to make an interception movement as fast as possible and is bound by the constraints described above, then some object features are likely to influence the initiation of movement. Smeets & Brenner (1994) examined reaction times to detect object movement. They found that detection of fast motion took less time than detection of slow motion. Recently, I have replicated and extended this finding (data as yet unpublished) and a summary is shown in Figure 6 below.
Figure 6: Response times (in seconds) for detection (circles) and discrimination (diamonds) as a function of target speed (in degrees/second). Group average from 5 observers; individual averages showed the same pattern of response times.
It can be seen that response time, for both detection and discrimination, decreases as the speed of the object motion increases. Also, detection is consistently faster than discrimination, but the difference reduces with object speed. Any factors that have previously been reported to influence perceived speed, such as contrast (Thompson, 1982), are likely to have an influence on detection and discrimination time. In a second experiment, response times for detection of looming were found to demonstrate a similar pattern, with response times decreasing with increasing object speed. It was concluded that angular looming rate rather than TTC was the determinant of response times (see also Lopez-Moliner & Bonnet, 2002). A summary of response time as a function of angular looming rate is shown in Figure 7 below.
Figure 7: Response time (in seconds) for discrimination as a function of angular looming rate. Group average from 5 observers; individual averages showed the same pattern of response times. In the standard condition a disc of radius r started at a distance D and approached the observer with a speed of v. In the "low" condition the radius was also r, the distance 2D and the speed 2v. In the "high" condition the radius was r, the distance D/2 and the speed v/2. Therefore TTC and the rate of change of TTC were identical in all three conditions.
The pattern of response times as a function of speed and looming rate is mirrored in the initiation-of-movement times for laterally moving (e.g. Brouwer, Brenner & Smeets, 2002) and dropping objects (e.g. Michaels et al., 2001). Smeets & Brenner (1994) also concluded that "retinal motion" (the relative motion of an object against a visible background) is the primary factor influencing the time it takes to detect object motion. Data I have recently collected (as yet unpublished) contradicts this conclusion, and it appears that both relative motion ("retinal") and absolute motion ("extra-retinal") influence detection and discrimination time. A summary plot is shown in Figure 8 below. In the "standard motion" conditions (SS and SF), a target moves against a stationary background. In the "absolute motion" conditions (ES and EF) the target is the only visible object. In the "relative motion" conditions (RS and RF) the target moves and the background also moves.
Figure 8: Response times (in seconds) for detection (circles) and discrimination (diamonds) as a function of motion condition. Group average from 5 observers; individual averages showed the same pattern of response times. First letter: motion condition, S (relative and absolute motion), E (absolute motion only) and R (relative motion only). Second letter: speed, S (slow) and F (fast). Relative motion is identical in conditions SS and RS and in conditions SF and RF.
In all motion conditions, detection is faster than discrimination. The typical speed/response time relationship is found in the standard and absolute motion conditions. However, in the relative motion conditions this relationship does not hold for either detection or discrimination. It is possible that perceived speed, rather than relative (or "retinal") motion, is a better predictor of response time, but this hypothesis has yet to be tested (though see Tynan & Sekuler, 1982). 5.1.2 Speed of the interceptive movement It has been noted that not only do observers initiate interception earlier with fast-moving objects, but that they move their hands at a higher speed. Several reasons for this have been suggested (for example, see Tresilian et al., 2003 and Brouwer et al., 2002). Here I identify another potential factor. Simulations of the where once model with noise show that there is a higher error associated with faster moving objects and also over longer prediction periods (the further ahead in time you want to look). As an example, see Figure 9 below.
348
Simon K. Rushton
Figure 9: Estimates of future object position from a simulation with Gaussian noise added to the parameters. Current position of the object relative to the observer: (0.25 m, 0.8 m). Estimates marked with crosses, actual future positions with circles. Speed of 0.2 m/s (approximately 14 deg/s, to the left or right; both shown). Look-ahead of 100, 200 and 300 msec.
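The qualitative point of Figure 9 can be reproduced with a simple Monte-Carlo sketch. The noise model below (5% Gaussian noise on τ and on the speed ratio, with distance and direction taken as exact) is my own assumption, not the simulation used for the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
I = 0.064  # inter-ocular separation (m)

def prediction_error(lateral_speed, dt, D0=0.8, vz=-1.0, noise=0.05, n=5000):
    """Mean error of the where-once prediction (equations [9]-[12]) under noisy inputs."""
    tau = D0 / -vz
    ratio = (lateral_speed / D0) / (I * -vz / D0**2)        # alpha_dot / phi_dot
    tau_n = tau * (1 + noise * rng.standard_normal(n))
    ratio_n = ratio * (1 + noise * rng.standard_normal(n))
    D_t = D0 * (1 - dt / tau_n)                             # equation (10)
    alpha_t = ratio_n * (I / D_t - I / D0)                  # equations (11)-(12), alpha0 = 0
    x_pred, z_pred = D_t * np.sin(alpha_t), D_t * np.cos(alpha_t)
    x_true, z_true = lateral_speed * dt, D0 + vz * dt
    return np.hypot(x_pred - x_true, z_pred - z_true).mean()

for speed in (0.2, 0.8):
    for dt in (0.1, 0.2, 0.3):
        e = prediction_error(speed, dt)
        print(f"lateral speed {speed} m/s, look-ahead {dt*1000:.0f} ms: mean error {e*100:.1f} cm")
# The error grows with both the object's speed and the look-ahead interval, which is the basis
# of the suggestion that faster objects force shorter look-aheads and hence faster reaches.
```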
To make a prediction with an error within certain limits (for example, ±0.5 cm) the observer would have to look a shorter period of time ahead when predicting the position of a fast-moving object compared with a slow-moving object. This would have a knock-on effect on the speed of interceptive movements, as interceptive movements would have to be executed within shorter time periods. Simulations of different directions of travel show slightly different error distributions. Therefore, by similar reasoning, it might be expected that the speed of an interceptive movement might also vary as a function of object travel direction. The bias and distribution of errors in prediction of position can be estimated for differing speeds and trajectory directions. These predictions can be tested empirically in a position extrapolation task. It should then be possible to examine reach speeds and determine if they vary as would be expected from errors in prediction of position. 5.1.3 Strategies in slower or less demanding interception tasks If we consider a task that allows free movement of the hand and in which the flight time is short, the observer may use a continuously updated estimate of the interception point. If we assume that it takes 300 msec to move the hand (the time for a complete perception-action cycle), the observer can simply estimate where the ball will be just before collision (e.g. at τ − 300 msec). Thus, the observer can move her hand to the estimated interception point using a Peper-style algorithm (see previous section) but with the estimated interception position being continuously updated, as sketched below.
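A rough sketch of this strategy, with the where once prediction idealised as noise-free and all positions, speeds and gains invented for illustration:

```python
import numpy as np

def intercept(lead=0.3, hand_speed=1.5, dt=0.01):
    ball = np.array([0.6, 4.0])      # ball position (x lateral, z depth) in metres
    vel = np.array([-0.5, -4.0])     # ball velocity (m/s); 1 s of flight to the head plane
    hand = np.array([0.0, 0.4])      # hand start position
    while True:
        tau = ball[1] / -vel[1]                      # time remaining to the head plane
        if tau <= lead:                              # within one perception-action cycle: stop updating
            break
        target = ball + vel * (tau - lead)           # predicted ball position at tau = 300 ms
        gap = target - hand
        dist = np.linalg.norm(gap)
        if dist > 1e-9:
            hand += gap / dist * min(hand_speed * dt, dist)   # head for the (updated) target
        ball = ball + vel * dt
    return ball, hand

ball, hand = intercept()
print("ball at tau = 300 ms:", np.round(ball, 2), "  hand:", np.round(hand, 2))
# With these numbers the hand reaches the predicted meeting point with time to spare; if it
# could not, the text notes that the observer must increase the look-ahead or move the body.
```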
If the estimated position of the ball at τ − 300 msec is too close, the observer can switch to, say, 400 msec before contact. If the ball is too far, the observer will have to move her body to intercept the ball, as she will be unable to reduce the constant much below 300 msec. In some circumstances, it might be desirable or necessary to just predict a sufficient period of time ahead to compensate for any neural/perceptual processing delays (e.g. predict where the ball will be 100 msec beyond where it is currently perceived to be). For example, in the interception task used by Peper et al. (1994) and later investigators (Montagne et al., 1999), the hand was restricted to a fronto-parallel plane. In such a circumstance, it is not possible to move the hand to an arbitrary interception point. Therefore, a simple strategy that closes in on the approaching object and intercepts it before it passes may be appropriate. 5.1.4 Yoking interception movements to gaze movements It has been suggested that actions may be performed by yoking limb or body movements to gaze movements. For example, when walking to a target it has been reported that head orientation leads body orientation (Grasso et al., 1998). The where once model could find a role here. The where once model predicts the change in direction, Δα, and disparity, Δφ (or distance), that will occur during a given period of time. This information could be used for gaze stabilisation. If an object is currently fixated then Δα indicates what change of version is necessary to re-fixate it, and Δφ the necessary change in vergence. Alternatively, the where once algorithm could be used to drive both gaze movements and interceptive limb or body movements in parallel. This may happen during a game of cricket when the batsman attempts to intercept the approaching ball with his bat, but also appears to "intercept" the ball during flight with his eye (Land & McLeod, 2000).
6. Conclusion The when & where and secondary-school-physics solutions have severe shortcomings and are incompatible with empirical data. Second, they cleave the action of interception of approaching projectiles out from the more general problem of intercepting objects, such as those that may be moving laterally, away from the observer, or even stationary. In contrast, the where once solution is a general solution that returns the predicted egocentric position of objects travelling along a range of different trajectories (directions and speeds).
Acknowledgement

Some of the work reported in this chapter was supported by funding from Nissan Technical Center North America. Thanks to Paul Warren, Mei-Ling Huang, Heiko Hecht and David Jacobs for comments on earlier versions of this chapter.
REFERENCES

Backus, B. T., Banks, M. S., van Ee, R. & Crowell, J. A. (1999). Horizontal and vertical disparity, eye position, and stereoscopic slant perception. Vision Research, 39, 1143-1170.
Bennett, S., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch I. Effects of telestereoscopic viewing. Experimental Brain Research, 129, 362-368.
Bennett, S. J., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (2000). Discriminating the role of binocular information in the timing of a one-handed catch - The effects of telestereoscopic viewing and ball size. Experimental Brain Research, 135, 341-347.
Bootsma, R. J. (1991). Predictive information and the control of action: what you see is what you get. International Journal of Sports Psychology, 22, 271-278.
Bootsma, R. J. & Craig, C. M. (2002). Global and local contributions to the optical specification of time to contact: Observer sensitivity to composite tau. Perception, 31, 901-924.
Brouwer, A., Brenner, E. & Smeets, J. B. J. (2002). Hitting moving objects: Is target velocity used in guiding the hand? Experimental Brain Research, 143, 198-211.
Cumming, B. G. & Parker, A. J. (1994). Binocular mechanisms for detecting motion-in-depth. Vision Research, 34, 483-495.
Frisby, J. P., Buckley, D. & Duke, P. A. (1996). Evidence for good recovery of lengths of real objects seen with natural stereo viewing. Perception, 25, 129-154.
Gibson, J. J. (1958). Visually controlled locomotion and visual orientation in animals. British Journal of Psychology, 49, 182-194.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.
Girshick, A. R., Rushton, S. K. & Bradshaw, M. F. (2001). The use of predictive visual information in projectile interception. Investigative Ophthalmology & Visual Science, 42, S3335.
Grasso, R., Prevost, P., Ivanenko, Y. P. & Berthoz, A. (1998). Eye-head coordination for steering of locomotion in humans: an anticipatory synergy. Neuroscience Letters, 253, 115-118.
Howard, I. P. & Rogers, B. J. (2001). Seeing in Depth. Toronto, Canada: I Porteous Publishing.
Van der Kamp, J., Bennett, S. J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch II. Adaptation to telestereoscopic viewing. Experimental Brain Research, 129, 369-377.
Van der Kamp, J., Savelsbergh, G. J. P. & Rosengren, K. S. (2001). The separation of action and perception and the issue of affordances. Ecological Psychology, 13, 167-172.
Land, M. F. & McLeod, P. (2000). From eye movements to actions: how batsmen hit the ball. Nature Neuroscience, 3, 1340-1345.
Laurent, M., Montagne, G. & Durey, A. (1996). Binocular invariants of interceptive tasks: A directed perception approach. Perception, 25, 1437-1450.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N., Young, D. S., Reddish, P. E., Lough, S. & Clayton, T. M. H. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Lopez-Moliner, J. & Bonnet, C. (2002). Speed of response initiation in a time-to-contact discrimination task reflects the use of eta. Vision Research, 42, 2419-2430.
McKee, S. P. & Welch, L. (1989). Is there a constancy for velocity? Vision Research, 29, 553-561.
Meese, T. S., Harris, M. G. & Freeman, T. C. A. (1995). Speed gradients and the perception of surface slant: analysis is two-dimensional not one-dimensional. Vision Research, 35, 2879-2888.
Michaels, C. F., Zeinstra, E. B. & Oudejans, R. R. D. (2001). Information and action in punching a falling ball. Quarterly Journal of Experimental Psychology, 54A, 69-93.
Mollon, J. (1997). "....on the basis of velocity cues alone": some perceptual themes. Quarterly Journal of Experimental Psychology, 50A, 859-878.
Montagne, G., Laurent, M., Durey, A. & Bootsma, R. (1999). Movement reversals in ball catching. Experimental Brain Research, 129, 87-92.
Mon-Williams, M., Wann, J. P. & Rushton, S. (1996). Binocular vision in interceptive timing. Ophthalmic and Physiological Optics, 16, 254.
Peper, L., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: How to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20, 591-612.
Regan, D. (1993). Binocular correlates of the direction of motion in depth. Vision Research, 33, 2359-2360.
Regan, D. & Beverley, K. I. (1979). Binocular and monocular stimuli for motion in depth: changing-disparity and changing-size feed the same motion-in-depth stage. Vision Research, 19, 1331-1342.
Regan, D. & Beverley, K. I. (1980). Visual responses to changing size and to sideways motion for different directions of motion in depth: linearization of visual responses. J. Opt. Soc. Am., 11, 1289-1296.
Regan, D. & Gray, R. (2000). Visually guided collision avoidance and collision achievement. Trends in Cognitive Sciences, 4, 99-107.
Regan, D., Gray, R., Portfors, C. V., Hamstra, S. J., Vincent, A., Hong, X. H., Kohly, R. & Beverley, K. I. (1998). Catching, hitting and collision avoidance. In L. R. Harris & M. Jenkin (Eds.), Vision and action (pp. 181-214). Cambridge, UK: Cambridge University Press.
Regan, D. & Kaushall, S. (1994). Monocular judgements of the direction of motion in depth. Vision Research, 34, 163-177.
Rushton, S. K. & Bradshaw, M. F. (1999). What information do we use during interception of an approaching projectile? Perception, 29, 122.
Rushton, S. K., Harris, J. M., Lloyd, M. R. & Wann, J. P. (1998). Guidance of locomotion on foot uses perceived target location rather than optic flow. Current Biology, 8, 1191-1194.
Rushton, S. K. & Wann, J. P. (1999). Weighted combination of size and disparity: a computational model for timing a ball catch. Nature Neuroscience, 2, 186-190.
Smeets, J. B. J. & Brenner, E. (1994). The difference between the perception of absolute and relative motion: A reaction time study. Vision Research, 34, 191-195.
Thompson, P. (1982). Perceived rate of movement depends on contrast. Vision Research, 22, 377-380.
Tresilian, J. R. (1991). Empirical and theoretical issues in the perception of time to contact. Journal of Experimental Psychology: Human Perception and Performance, 17, 865-876.
Tresilian, J. R. (1994). Approximate information-sources and perceptual variables in interceptive timing. Journal of Experimental Psychology: Human Perception and Performance, 20, 154-173.
Tresilian, J. R. (1995). Perceptual and cognitive processes in time-to-contact estimation: Analysis of prediction-motion and relative judgment tasks. Perception & Psychophysics, 57, 231-245.
Tresilian, J. R. (1999). Visually timed action: time-out for 'tau'? Trends in Cognitive Sciences, 3, 301-310.
Tresilian, J. R., Oliver, J. & Carroll, T. J. (2003). Temporal precision of interceptive action: differential effects of target size and speed. Experimental Brain Research, 148, 425-438.
Tynan, P. D. & Sekuler, R. (1982). Motion processing in peripheral vision: Reaction time and perceived velocity. Vision Research, 22, 61-68.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22, 1031-1048.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors)
© 2004 Elsevier B.V. All rights reserved
CHAPTER 15 Acoustic Information for Timing
Chris Button University of Otago, Dunedin, New Zealand
Keith Davids University of Otago, Dunedin, New Zealand
ABSTRACT In this chapter, we consider how acoustic information contributes to the timing of goal-directed human movements, particularly tasks involving relative motion with objects and surfaces in the environment. We begin by examining whether acoustic information can specify time-to-contact (TTC) during relative approach. Later in the chapter we evaluate the broader role of acoustic information in interceptive timing and describe an investigation of effects of removing acoustic information on co-ordination of one-handed catching. It is concluded that acoustic information has a significant role to play in regulating timing behavior.
1. Introduction

A significant amount of research has examined the role of a variety of optical sources for the provision of time-to-contact (TTC) information, as discussed in many chapters of this book (see also Lee, 1976; Savelsbergh, Whiting & Bootsma, 1991; Savelsbergh & van der Kamp, 2000; Tresilian, 1999). In particular, it has been argued that interceptive actions (or avoidance behaviors) can be regulated by TTC information during relative approach of a performer and an object or surface. Michaels and Carello (1981) proposed that the auditory perceptual system could also be used to regulate action and that information from the acoustic array could lawfully specify motion paths of objects with respect to observers. When an object passes close to an individual it could provide acoustical as well as optical information that can be perceived and used to guide actions such as interception or locomotion (Schiff & Detwiler, 1979; Lee, 1990; Lee et al., 1992). Seifritz et al. (2002) have termed the use of acoustic information for guiding relative motion between performer and object or surface 'auditory motion'.

As we note in this chapter, a number of research studies have revealed that the sense of hearing is well adapted for localizing and identifying sound sources and for guiding interceptive actions, particularly when there is relative motion between performer and an object or surface (e.g., Kunkler-Peck & Turvey, 2000). However, very little experimental work has examined the combined use of acoustical and visual information in judging TTC. In this chapter, we review the literature on perception of acoustic information for TTC and, more broadly, examine the role of acoustic information as a constraint on the co-ordination of movement timing behavior.
2. Rising intensity: A likely candidate for acoustic TTC?

A prevalent proposal in the literature is that there is a perceptual priority for rising intensity of sound, which has biological salience for animals negotiating complex dynamic environments through locomotion (Neuhoff, 1998). Neuhoff (2001) suggested that people might be more sensitive to rising intensities of acoustic information than to falling intensities because of the well-documented facility for perceptual overestimation of intensity change during relative approach. This asymmetry in the perception of egocentric auditory motion was argued to provide an adaptive advantage that facilitates the pick-up of looming acoustic information sources. This explanation has found some support in work with human infants, who demonstrated behavioral signs of preparation for contact with signals of rising intensity (Freiberg, Tually & Crassini, 2001).
According to Schiff and Oldak (1990), the TTC¹ of an object with an observer is related to the inverse square law (the intensity of a sound reaching the auditory perceptual system is inversely proportional to the square of its distance from the source). In several studies reported by Schiff and Oldak (1990), the role of the rate of intensity change of a sound source in judging TTC was examined. In one experiment, 60 participants viewed a total of 16 stimulus films depicting vehicles/people passing by them at a distance of 3-5 metres. Two films showed a speaking person approaching the camera head-on. The events ranged in TTC from 1.5 to 6 seconds. Acoustic information sources involved a spoken phrase repeated over and over by an approaching person and the sound of tyres and engines from vehicles. The speed of people and vehicles was 'approximately constant'. Each film was stopped at various times during the approach and participants were required to predict the time at which each object would have reached or passed by them. In the acoustic condition participants could not see the film, but could hear the events. TTC judgement data were converted to percentage estimates to enable comparison between approaches of different time lengths.

Results showed that the vast majority of judgements (89%) were underestimates. Modality of presentation (visual, auditory-visual or auditory) was a significant factor in the study. Acoustic information alone provided less accuracy in predicting TTC than visual or audiovisual events, although it has to be noted that, because sound was presented monaurally in the studies, interaural differences in changing sound intensity were not present. Films with longer TTC values resulted in greater acoustic-to-visual differences in accuracy of judgements. An accuracy differential was also noted for conditions where stimuli bypassed an observer compared to a head-on approach. Judgements were better in by-pass conditions, although the differential was less noticeable in acoustic conditions. The data suggested that acoustic information does not play a leading role in judging time of arrival, particularly with increasing stimulus trajectories.

There may have been several reasons for the findings of Schiff and Oldak (1990), including the use of sighted participants and inadequacy of sound source presentation. These issues were examined in a second experiment, in which 6 congenitally blind participants attempted to judge TTC in conditions identical to the first experiment. Data comparing judgements of TTC showed that blind participants used acoustic information as well as sighted participants could use visual information, and that blind participants were better than sighted participants in the acoustic conditions only. As with the sighted participants,
¹ Schiff and Detwiler (1990) preferred the term 'time of arrival' to describe TTC.
there was a tendency towards underestimation, but the blind participants estimated TTC well even at longer trajectory values. In conclusion, the experiments reported by Schiff and colleagues (Schiff & Detwiler, 1979; Schiff & Oldak, 1990) suggest that head-on approach and trajectories of longer than 2 s lead to less accurate TTC judgements from acoustic information alone. The prevalence of underestimates reflected conservative judgements by participants. Although the authors argued differently, there was still a question mark over the quality of the acoustic information presented to the participants. Congenitally blind participants performed as well with acoustic sources as sighted participants did with visual information. Deficiencies in the use of acoustic information by sighted participants may have been due to a lack of quality of acoustic sensory stimulation, or to better education of attention to acoustic sources of information by blind participants.

In reviewing the findings of Schiff and Oldak (1990), Guski (1992) indicated how the underestimation of TTC with the point of observation, in the region of 40-77% by sighted participants wearing blindfolds, is in line with research on visual tau and may reflect a kind of built-in 'safety margin' to ensure object avoidance. He also noted that sightless participants performed considerably better than their sighted counterparts in estimating acoustic time to contact. Guski (1992) agreed with the conclusion of Schiff and Oldak (1990) that acoustic information is primarily used to spatially locate objects before vision specifies time-to-contact information. Guski (1992) pointed out that the visual system is constructed in a different way to the auditory system, with diverse ways to pick up visual information on time to contact. The auditory system is designed for unique and rapid pick-up of information, and that is why simple reaction times to acoustic information are faster than those to visual information. Hence, acoustic TTC may be used when the use of visual information is prevented by task constraints, e.g. low visibility, visual handicaps, or divided visual attention. Evidence with artificial sound sources has shown performance on judging the distance of stationary sounds to be quite poor (Guski, 1992; Speigle & Loomis, 1993). It appears that acoustic information can provide the platform for the visual system to pick up more precise information about an approaching object.

In the literature, there is growing evidence in support of the rate of change of acoustic intensity as a key variable for perceiving auditory motion, based on the inverse variation with distance of sound pressure at the point of hearing (sound pressure, like intensity, varies with the distance of the source) (Ashmead, Davis & Northington, 1995). Shaw, McGowan and Turvey (1991) originally proposed an acoustic variable to specify TTC with an object emitting sound, travelling in a straight line, and at a constant speed towards an observer. Time to contact is specified by the relationship of an object's spatial position and velocity with respect to an observer. Velocity can be perceived by the rate of
intensity growth or loudness of an approaching object. Shaw et al. (1991) suggested that the ratio of I to dI/dt (where I is the intensity field emitted by the sound source) could be a candidate property for acoustic tau, similar to the way that optic tau can be derived from the visual flow field (e.g., Lee, 1976).

Multiple acoustic variables contribute to TTC judgements

The great range of acoustic variables that have been proposed as candidates for specifying TTC is notable. For example, Jenison (1997) argued that observer motion structures the ambient acoustic array, demonstrating that higher-order variables such as position, velocity and TTC can be recovered from observations of interaural time delays, Doppler shift and average sound intensity. These variables were considered by Jenison (1997) sufficient to specify auditory motion kinematics. The work of Rosenblum, Wuestefeld and Saldana (1993) showed that a number of acoustical dimensions, which are modulated with changing distance, can potentially support anticipatory looming judgements, including intensity and the overall pattern of spectral change, with the higher-frequency portion of the sound spectrum changing at a disproportionately faster rate during approach. It was also argued that familiarisation with a sound source might form the basis of successful perception of auditory looming information. Using a classical motion prediction paradigm with sighted participants who listened to approaching sound sources on headphones, Rosenblum et al. (1993) showed that performance accuracy decreased with increasing time between signal disappearance and the time of acoustic stimulus passage. Their data also supported a tendency for participants to build in safe anticipatory periods in estimating the time of arrival of the acoustic stimulus. The provision of performance-related feedback greatly improved performance, emphasising the significance of task experience and contradicting suggestions of a selected adaptive advantage conveyed by the capacity to pick up acoustic intensity changes from approaching sound sources.

The issue of familiarisation with a sound source was also addressed by Ashmead et al. (1995), who argued that, although some sources of acoustic information, such as level, reverberation and spectral shifts, require familiarisation with the level or spectral composition of the sound, others, like motion-related information sources, do not. Sound pressure varies with distance and hence familiarity helps in assessing distance information from sound. Ashmead et al. (1995) proposed that the rate of change in sound pressure as one moves towards an object or surface, when expressed proportional to the absolute sound pressure, precisely specifies the distance between perceiver and source. This relationship seems to be independent of absolute sound pressure.
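The logic behind both proposals can be made explicit with a short derivation. This is our own sketch, under the simplifying assumptions of a point source of constant power, a direct (non-reverberant) path, and a constant closing speed; it is not reproduced from Shaw et al. (1991) or Ashmead et al. (1995). Writing r(t) for the source-to-listener distance (with \dot{r} < 0 during approach), the inverse square law gives I = c/r^2 for intensity, so

\[
\frac{dI}{dt} = -\frac{2I\dot{r}}{r}
\quad\Longrightarrow\quad
\frac{I}{dI/dt} = -\frac{r}{2\dot{r}} = \tfrac{1}{2}\,\mathrm{TTC},
\]

i.e. the ratio of intensity to its rate of change specifies TTC (up to a factor of two) without any knowledge of source power or distance. Sound pressure falls off as P = k/r, so

\[
\frac{dP/dt}{P} = -\frac{\dot{r}}{r} = \frac{1}{\mathrm{TTC}},
\qquad
r = -\dot{r}\,\frac{P}{dP/dt},
\]

which is why the proportional rate of change of pressure can specify time, and (given the listener's own movement speed, e.g. in steps per second) distance, independently of the absolute pressure level.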
The analysis provided by Ashmead et al. (1995) focused on information to guide actions, requiring no familiarity with sound sources. The main question they were interested in concerned whether acoustic information could change during locomotion towards a sound source. The issue concerned distance-related changes in sound pressure at a listener's ears during movement. They argued that the rate of change in sound pressure, proportional to the overall level of sound pressure, specifies information on listener-to-source distance. Again, an analogy was drawn with the optic variable tau, although rather than use the term tau, Ashmead et al. (1995) preferred the term proportional change rate. Ashmead et al. (1995) assumed in their experiments that a listener was moving towards a stationary, constant-pressure sound source at constant velocity. Whilst this may seem a simplistic model, Ashmead et al. (1995) wished to ascertain whether people actually used rate of change of sound pressure as an informational constraint to guide action.

Movement towards a sound source can provide further information because the listener is sampling the sound source at more than one distance. Hence, an equation can be formulated to denote that the rate of change in pressure is inversely proportional to the square of the distance, or dP/dr = c·r⁻², where P is the pressure of sound at the listener's ears, c is the intensity of the sound (which varies as the square of pressure in wave motion) and r is the distance between sound source and listener. Because the rate of change of pressure is related to the overall pressure perceived at the listener's position, the use of this informational variable does not require that one knows the value of the sound pressure. This is equivalent to the use of the optic variable tau, since one does not need to know the size of a visually accessible object, for example. According to Ashmead et al. (1995), the proportional rate of change of sound pressure can specify distance or time to an object. Both could be used under different task constraints: for example, distance could be useful under the more static task constraints of reaching for a sound-emitting source, compared to the more typical dynamic task constraints of moving towards a source.

In the experiments reported in their paper, Ashmead et al. (1995) examined whether people could use motion-related distance information when walking towards an unseen sound source. Accuracy and consistency of locomotion was measured in two conditions, stationary and dynamic, with proportional rate of change information being available in the latter. There was a need to include an action component (i.e., locomotion) in the experiment because such information sources are typically calibrated in action-scaled units (e.g., number of steps) (Kim, Turvey & Carello, 1993). In their first experiment, 9 blindfolded participants walked towards a sound source either after listening or during listening. Controls were put in place to limit reverberation and familiarity of the sound sources used in the experiment. Stimulus sound levels during the experiment ranged from 54 dB to 82 dB (sound
pressure and distance to listener were varied to achieve these values). Results showed that participants were more accurate and consistent in walking towards a target when they heard the sound during approach than when they were stationary. During stationary listening participants tended to overshoot the target (constant error). Moving when listening resulted in lower constant error for shorter target distances. This performance differential was also substantial for the longer distances, suggesting a high level of sensitivity to proportional rate of change information (e.g., for the 19 m targets the proportional rate of change information was around 5% of the overall sound pressure). Further, variable error was lower in moving than in stationary conditions (implying greater consistency of performance) at both distances. Unfortunately, it was not clear from the description of the experimental methods whether the obtained differences may have been due to participants in the moving condition listening to the sound source for longer periods of time. There was no indication of how long the sound source was emitted in each condition, or whether the length of emission was comparable across conditions.

A second experiment examined whether the opportunity to listen at more than one location could have accounted for the differences between the moving and stationary listening conditions. In the stationary condition, 8 participants listened from 2 standing locations during each trial. Better performance in the moving condition would provide support for the benefits of proportional rate of change information over the opportunity to hear the sound twice from different locations. Results showed that performance was more accurate in the moving condition than in both other conditions up to 11 m. For shorter distances, constant error was lower in moving conditions. Constant error in the two stationary conditions did not differ significantly, although accuracy was higher in the double stationary condition. For variable error, the moving condition was better than the single stationary condition, but not the double stationary condition. The findings suggested that participants took advantage of the proportional rate of change information for locomoting accurately and consistently towards a target.

The outcomes of the experiments by Ashmead et al. (1995) resulted in similar findings to Speigle and Loomis (1993), in support of an advantage for moving over stationary listening in using proportional rate of change information to locomote towards a target. Ashmead et al. (1995) also ran some computer simulations on distance information to support their experimental field work and concluded that proportional rate of change information was not the only source of information that could be used to support actions. They concluded that a multiple-source strategy could be employed by performers in using acoustic information for supporting various types of action, perhaps involving weightings. The proposal was that "...performance was not consistent
with the idea that participants used either source of information about distance, familiar sound level or proportional change rate, as the sole basis of their actions" (p. 253). The authors acknowledged that their assumption was that participants used proportional change rate information for distance perception, and that there may have been other sources of information available to participants, including motion-related changes in the frequency spectrum of a sound, the ratio of direct to reverberant sound, and acoustic motion parallax.

As with the literature on visual perception for action, data from behavioral studies of auditory looming have been supported by some neuroscientific evidence. For example, Seifritz et al. (2002) used fMRI and dynamic acoustic intensities to examine the neural basis of the apparent selective advantage for increasing sound intensity. They predicted that rising sound intensity would activate a cortical network for space perception and the allocation of visual attention. Different patterns of neural activation were found for perception of rising and falling intensity of acoustic signals. The data showed that rising and falling acoustic intensity resulted in greater activation of the right temporal plane than did constant intensity values. Rising intensity values resulted in activation of a distributed neural network supporting spatial recognition, auditory motion perception and visual attention. The network integrated regions such as the right temporoparietal junction, midbrain areas and right motor and premotor cortices, as well as the left cerebellar cortex. The anisotropic character of the perception of acoustic intensity change was suggested to signal the prominence in action of the rising intensity of stimulation created by looming sources in natural surroundings.
3. Acoustic information and co-ordination of timing behavior

The research reviewed so far in this chapter has shown that acoustic intensity change, amongst other sources such as interaural, spectral and reverberation information, may have a functional role in auditory motion. There seems to be good agreement on the view that the binaural rate of intensity growth (loudness growth) is the main information used about an approaching object's trajectory and velocity of approach. Sounds, like visual information, occur within a background context, and the relation between source intensity and background intensity becomes more distinct with decreasing distance to the point of perception. Although there have been very few comparisons of auditory motion perception between sighted and blind individuals, evidence shows that acoustic information can be picked up and used to judge TTC when the use of visual information is prevented by task or organismic constraints such as low visibility, visual handicaps or divided visual attention (Guski, 1992).
However, acoustic information has a number of functional uses and should not merely be viewed as a 'back-up' for motor performance in conditions where visual information has been degraded. Acoustic information can be used to supplement visual information for the co-ordination and timing of interceptive actions. In the remainder of this chapter we examine the role of acoustic timing in interceptive actions to exemplify its functionality.

Other work on acoustic information over the last 20 years has considered how interlimb co-ordination can exhibit self-organising properties constrained by the sound of a rhythmic metronomic beat (for detailed reviews see Swinnen & Carson, 2002; Kelso, 1995). In a variety of different movement tasks, typically involving the oscillation of two limbs simultaneously, it has been shown that the cycling frequency of the metronome affects the stability of the co-ordination pattern adopted. From a dynamical systems perspective, acoustic information from the metronome acts as a powerful constraint on movement system dynamics. Button, Bennett and Davids (2001) confirmed this idea by examining how the co-ordination of a unimanual prehension task differed when performed in self-paced and externally-paced conditions. The provision of an external pacing sound source contributed to the intrinsic stability of the movement. The effect of the sound source on movement co-ordination is an example of what has been termed 'inherent anchoring' (Byblow, Chua, & Goodman, 1995), whereby a temporary state of behavior is provided that is "expressed as compressions of variability in movement amplitude where information is specified" (p. 124).

There is also an increasing amount of empirical support for the idea that acoustic information is often naturally available to guide interceptive actions (e.g., Warren, Kim & Husney, 1987; Holder, 1998). For example, Warren et al. (1987) showed that acoustic information can help human observers to perceive not only spatial and temporal dimensions of the environment, but also physical properties of objects and surfaces, such as elasticity, weight and rigidity. Runeson and Frykholm (1983) demonstrated that kinematics specify the dynamics of object properties through analysis of motion. For this reason Warren et al. (1987) used a motor task rather than verbal judgements or estimations in their experiments. In the first experiment the task was to bounce a ball to a target height after exposure to information about the two-body system comprising floor and ball. Four informational conditions were used, ranging from no prior information to self-bounce information in which all acoustic, haptic and visual information sources were available. Results showed that bouncing accuracy was highest when more information was available, although there were no differences between audio-visual and auditory conditions. The
data suggested that performers preferred to vary the impulse applied by the hand to the ball in bouncing to target heights.

A more recent study by Holder (1998) considered the co-ordination of table tennis strokes when concurrent acoustic information was removed. The time to peak velocity of the wrist of the striking hand was measured from players performing forehand drives in table tennis to a ball delivered by an expert player. It was found that outcome scores for accuracy decreased in the non-auditory condition. Interestingly, intra-individual analyses showed that the time to peak wrist velocity increased when acoustic information was perturbed, with a concomitant increase in intra-individual levels of performance variability.

An interesting question is whether a perturbation of acoustic informational constraints at the point of release could give rise to similarly precise adjustments of the hand around the ball during one-handed catching. Data which could shed some light on that question were reported by Savelsbergh, Netelenbos and Whiting (1991a), who examined differences in catching ability between deaf and hearing children. In deaf children, acoustic perception of information from ball release cannot aid the co-ordination of the hand transport and prehension phases of catching, unlike in unimpaired children. Savelsbergh et al. (1991a) found that deaf children could satisfy the task constraints of one-handed catching as well as hearing children, although they had problems with balls approaching them from the periphery or outside the field of view. Deaf children showed longer movement initiation times after a tennis ball was released from a ball machine, although no differences were found in overall movement times between the deaf and hearing children. It seems likely that there was a compensation in the co-ordination of the hand transport and grasp phases for the deaf children, due to the lack of acoustic information from the projectile release mechanism of the ball machine. Unfortunately this study lacked a detailed kinematic analysis of the temporal co-ordination of the catching action to verify this conclusion.

Button (2002) attempted to examine the effects of removing acoustic information prior to ball release on the co-ordination of a one-handed catching task in hearing participants, an issue raised by the study of Savelsbergh et al. (1991a). It was expected that removal of acoustic information from a ball projection machine would only partially constrain the timing and co-ordination characteristics of the one-handed catching action, because such information was unlikely to be used concurrently to regulate catching behavior throughout its entirety. Since the acoustic information from ball release was being removed, rather than trajectory- or velocity-related information concerning the ball-path, it was predicted that the influence of such informational constraints would only be observed at the beginning of the action (i.e., at movement initiation).
Figure 1: Adapted from Button (2001). 3-D space plots of wrist displacement during sample trials taken from a skilled catcher. The location of ball-contact is represented relative to the start position of the hand. Acoustic information concerning ball release from the serving machine was prevented with the use of headphones.
Indeed, the most consistent finding in the data was that participants (7 out of 8) generally initiated movements later in the non-auditory condition. This finding is similar to that of Savelsbergh et al. (1991a); however, in that study it was unclear how the deaf children adapted their movement patterns. Button (2002) showed that some catchers (3 out of 8) moved their hand more quickly toward the approaching ball (increased wrist velocity) in order to intercept the ball after the delayed initiation. An alternative strategy employed was to catch the ball nearer to the body. Sample trials depicting the displacement of the wrist during the catching action show this strategy clearly (see Figure 1). Several catchers also tended to open the catching hand earlier to compensate for the delayed reaction. These findings each indicate that the catchers typically used acoustic information to anticipate ball release. In the absence of this information source, adaptation of the overall action was necessary in order to catch the ball.

The research described in this sub-section supports the argument, proposed by many theorists, that perceptual and motor substitution is a hallmark of skilled behavior (Lacquaniti & Maioli, 1989a, b; Schmidt & Fitzpatrick, 1996; Fitzpatrick, 1998; Beek & Bingham, 1991). As constraints on performance
change, perhaps through unforeseen occlusion of typically available information, skilled performers are able to immediately re-assemble a perception-action coupling by substituting other available sources of information to achieve the same task goals. That is, through experience, they might become better attuned to multiple information sources, which can be used to guide the timing and coordination of the arm and hand in catching (van der Kamp et al., 1996; Savelsbergh & van der Kamp, 2000).
4. Conclusion

In this chapter we have reviewed the literature examining the role of acoustic information in making time-to-contact judgements and in supporting interceptive actions. As research on the optical array has also demonstrated, there is a range of potential variables that can be used to help specify TTC acoustically (e.g., rising intensity, interaural time delays, Doppler shift, and proportional change rate). Experiments requiring TTC judgements to be based solely on such acoustic variables typically lead to underestimates. However, in most situations (except for congenitally blind or deaf people) vision and audition are used together, with audition often operating as an early warning system. Skilled performers in sport often exploit acoustic information to help time their responses. For example, the sound of a squash opponent's shoes against a wooden floor may be used to estimate the distance of a player behind him/her and to decide whether to play a drop-shot or a drive to the back court. An interesting line of inquiry for future research may be to address how acoustic information is used when it is continually available to the performer, such as the 'brushing' of skis over snow for a slalom skier. In most situations, however, the quality of acoustic information may be insufficient to completely specify accurate responses. Therefore, visual information is often necessary to provide informational support by 'filling in the gaps'. It can be concluded that the integration, through experience, of several informational sources into a multimodal, global array can facilitate the timing and accuracy of skilled movements.
REFERENCES

Ashmead, D. H., Davis, D. L. & Northington, A. (1995). Contribution of listeners' approaching motion to auditory distance perception. Journal of Experimental Psychology: Human Perception and Performance, 21, 239-256.
Beek, P. J. & Bingham, G. P. (1991). Task-specific dynamics and the study of perception and action: A reaction to von Hofsten (1989). Ecological Psychology, 3, 35-54.
Button, C. (2002). The effect of removing acoustic information of ball projection on the coordination of one-handed ball-catching. In K. Davids, G. Savelsbergh, S. Bennett & J. van der Kamp (Eds.), Interceptive Actions in Sport: Information and Movement (pp. 184-194). London: Taylor & Francis.
Button, C., Bennett, S. J. & Davids, K. (2001). Grasping a better understanding of the intrinsic dynamics of rhythmical and discrete prehension. Journal of Motor Behavior, 33, 27-36.
Byblow, W. D., Chua, R. G. & Goodman, D. (1995). Asymmetries in coupling dynamics of perception and action. Journal of Motor Behavior, 27, 123-137.
Fitzpatrick, P. (1998). Modeling co-ordination dynamics in development. In K. M. Newell & P. C. M. Molenaar (Eds.), Applications of Nonlinear Dynamics to Developmental Process Modeling (pp. 39-62). Mahwah, NJ: LEA.
Freiberg, K., Tually, K. & Crassini, B. (2001). Use of an auditory looming task to test infants' sensitivity to sound pressure level as an auditory distance cue. British Journal of Developmental Psychology, 19, 1-10.
Guski, R. (1992). Acoustic tau: An easy analogue to visual tau. Ecological Psychology, 4, 189-197.
Holder, T. (1998). Sources of information in the acquisition and organisation of interceptive actions. Unpublished Doctoral Dissertation, Chichester Institute of Higher Education, University of Southampton, UK.
Jenison, R. L. (1997). On acoustic information for motion. Ecological Psychology, 9, 131-152.
Kamp, J. van der, Vereijken, B. & Savelsbergh, G. J. P. (1996). Physical and informational constraints in the co-ordination and control of human movement. Corpus, Psyche et Societas: An International Review of Physical Activity, Health and Movement Science, 2, 102-118.
Kelso, J. A. S. (1995). Dynamic patterns: The self-organization of brain and behavior. London: MIT Press.
Kim, N., Turvey, M. T. & Carello, C. (1993). Optical information about the severity of upcoming contacts. Journal of Experimental Psychology: Human Perception and Performance, 19, 179-193.
Kunkler-Peck, A. J. & Turvey, M. T. (2000). Hearing shape. Journal of Experimental Psychology: Human Perception and Performance, 26, 279-294.
Lacquaniti, F. & Maioli, C. (1989a). The role of preparation in tuning anticipatory and reflex responses during catching. Journal of Neuroscience, 9, 134-148.
Lacquaniti, F. & Maioli, C. (1989b). Adaptation to suppression of visual information during catching. Journal of Neuroscience, 9, 149-159.
Lee, D. N., van der Weel, F. R., Hitchcock, T., Matejowsky, E. & Pettigrew, J. D. (1992). Common principle of guidance by echolocation and vision. Journal of Comparative Physiology, 171, 563-571.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N. (1990). Getting around with light and sound. In R. Warren & A. H. Wertheim (Eds.), Perception and Control of Self-Motion. Hillsdale, NJ: LEA.
Neuhoff, J. G. (1998). Perceptual bias for rising tones. Nature, 395, 123-124.
Neuhoff, J. G. (2001). An adaptive bias in the perception of looming auditory motion. Ecological Psychology, 13, 87-110.
Rosenblum, L. D., Wuestefeld, A. P. & Saldana, H. M. (1993). Auditory looming perception: influences on anticipatory judgements. Perception, 22, 1467-1482.
Runeson, S. & Frykholm, G. (1983). Kinematic specification of dynamics as an informational basis for person and action perception: Expectation, gender recognition and deceptive intention. Journal of Experimental Psychology: Human Perception and Performance, 8, 733-740.
Savelsbergh, G. J. P. & van der Kamp, J. (2000). Information in learning to co-ordinate and control movements: Is there a need for specificity of practice? International Journal of Sport Psychology, 31, 467-484.
Savelsbergh, G. J. P., Netelenbos, J. B. & Whiting, H. T. A. (1991a). Auditory perception and the control of spatially coordinated action in deaf and hearing children. Journal of Child Psychology and Psychiatry, 32, 489-500.
Savelsbergh, G. J. P., Whiting, H. T. A. & Bootsma, R. J. (1991b). Grasping tau. Journal of Experimental Psychology: Human Perception and Performance, 17, 315-322.
Schmidt, R. C. & Fitzpatrick, P. (1996). Dynamical perspective on motor learning. In H. N. Zelaznik (Ed.), Advances in Motor Learning and Control (pp. 195-223). Champaign, IL: Human Kinetics.
Schiff, W. & Oldak, R. (1990). Accuracy of judging time to arrival: Effects of modality, trajectory and gender. Journal of Experimental Psychology: Human Perception and Performance, 16, 303-316.
Schiff, W. & Detwiler, M. L. (1979). Information used in judging impending collision. Perception, 8, 647-658.
Seifritz, E., Neuhoff, J. G., Bilecen, D., Scheffler, K., Mustovic, H., Schachinger, H., Elefante, R. & Di Salle, F. (2002). Neural processing of auditory looming in the human brain. Current Biology, 12, 2147-2151.
Shaw, B. K., McGowan, R. S. & Turvey, M. T. (1991). An acoustic variable specifying time-to-contact. Ecological Psychology, 3, 253-261.
Speigle, J. M. & Loomis, J. M. (1993). Auditory distance perception by translating observers. In Proceedings of IEEE 1993 Symposium on Research Frontiers in Virtual Reality (pp. 92-99). Los Alamitos, CA: Institute of Electrical and Electronics Engineers Computer Society.
Swinnen, S. P. & Carson, R. G. (2002). The control and learning of patterns of interlimb coordination: Past and present issues in normal and disordered control. Acta Psychologica, 110, 129-137.
Warren, W. H., Kim, E. E. & Husney, R. (1987). The way the ball bounces: visual and auditory perception of elasticity and control of the bounce pass. Perception, 16, 309-336.
Time-to-Contact – H. Hecht and G.J.P. Savelsbergh (Editors)
© 2004 Elsevier B.V. All rights reserved
CHAPTER 16 Why Tau is Probably Not Used to Guide Reaches

Geoffrey P. Bingham Indiana University, Bloomington, IN, USA
Frank T. J. M. Zaal University of Groningen, Groningen, The Netherlands
ABSTRACT We suggest that τ may not be used to guide reaches. The reasons are: (1) targeted reaching is an intrinsically spatial task that requires information about length dimensioned properties (i.e. object distance, size, and/or velocity); (2) τ is time dimensioned and thus provides no spatial information. A τ variable can be constructed from perceived spatial variables and then used in a control dynamic for reaches. However, this is more complex than simply using the measured spatial variables directly in a dynamic to control reaches. Furthermore, because the control dynamic determines the timing of the behaviors, the expectation that τ might be used specifically for timing is undermined. Finally, (3) τ is unstable because the hand can slow to near-zero velocities while still at significant distances from a target and, as a result, τ grows large without bound. When used as a driver in a control dynamic, τ can send the system spiraling into instability. This problem can be mitigated somewhat, but it is better to simply avoid it by not using τ.
1. The main point

In this chapter, we consider whether τ might be used in the online guidance of reaches. We suggest that it is not used to guide targeted reaches. One of the original arguments for the use of τ was that τ is a temporal variable and, as such, is appropriate for controlling the timing of behaviors, especially interceptive behaviors. If we restrict consideration to interception of target objects by the hand, then we will distinguish two fundamentally different types of situation. In one situation, the target travels towards the observer and the problem is to time the initiation of the grasp to capture the object successfully. This is a prototype situation for the use of τ and we have no argument with this. In the other situation, however, the target object either sits unmoving or travels away from the observer and the problem is to reach with the hand to the location of the target.¹ This is the problem case.

As discussed in chapter 18 by Bingham in this volume, another of the original arguments for the use of τ is related to τ's temporal dimensionality. The ability to use τ means that the classic measurement problem in space perception can be avoided. Optical structure is angular and temporal. The length dimension is lost in the projection from surface structure to optical structure. This is the origin of problems like distance perception and size perception. For the solution of timing problems that only require information about time, the problem of spatial measurement can be avoided by using τ. Following the same argument, however, τ provides no information about spatial properties. In the case of a projectile approaching an observer at constant velocity, τ only provides information about time-to-contact. It provides no information about object size, distance or velocity. The object could be large, far away and moving fast, or small, nearby and moving more slowly.

The problem is that reaching to the location of a target object is an inherently spatial task. Measurement of lengths or length dimensioned properties is unavoidable. This was shown, for instance, by Bingham and Pagano (1998), who measured targeted reaches under various visual conditions. A control condition prevented participants from obtaining visual information about the distance of a target object. In this case, participants brought the hand up immediately in front of the eye so as to occlude the target, and then they moved the hand outwards at a steady slow speed, while keeping the target occluded, until the hand simply contacted the target. The time at which the hand contacted
¹ There is an intermediate situation in which the target is traveling towards the observer so as to pass within a reachable proximity, but not so as to hit the eye. Assuming that the eye is not moved to an interceptive location (as in solutions to the "outfielder problem" (Michaels & Oudejans, 1992; McBeath, Shaffer & Kaiser, 1995; McLeod & Dienes, 1996; Oudejans, Michaels, Bakker & Dolne, 1996)), and that the hand is sent out to intercept the target, then this is essentially the same as the second situation. (See e.g. Peper, Bootsma, Mestre and Bakker (1994).)
the target could not be anticipated and, furthermore, if the target was located beyond reach, participants would only have stopped when they ran out of arm. This manner of reaching is not representative of normal reaching. Bingham (1995) discussed ways that τ might be used in visually guided reaching and showed that length dimensioned variables must be incorporated into the use of τ variables. As also shown by other investigators (e.g. Bootsma & Peper, 1992), the trajectory of a reach to a target can be decomposed into two components: one in a frontoparallel plane and another in depth. It is the depth component that requires the length dimension. Bingham (1995) showed how, for instance, information about the size of the target relative to the size of the hand is required to derive an effective τ variable for the depth component. The requirement for length dimensioned information may be fulfilled via other means (e.g. information about definite distance), but fulfilled it must be.

Consider the case of reaching to intercept a target object that is moving away from the observer. The images of both target and hand would be contracting, so there would be a τ for each one. The signs would be opposite to those for trajectories approaching an observer, but no matter. Under conditions of constant velocity, the τ's would specify time since departure from the eye location, but this situation does not entail constant velocities. In fact, this situation nearly always involves radically changing velocities. But we can put this aside and consider a case in which we presume, unrealistically, that both the hand and target are moving at constant velocities. Given only optical τ's as information, the observer does not know what the velocities are, or how far away the hand and target are. The hand's position and movement may be known via other means (e.g. somatosensory information), but not the target's. This is the crux. The target could be near and moving slowly or far and moving rapidly. Shrinking the difference in τ's (here, the difference in times-since-departure) does not yield convergence of hand position on target position. The τ's could be the same, but the velocities and positions radically different (a numerical illustration is given at the end of this section). Tau could be used to guide the hand to a target moving away from the observer, but the derivation of a suitable τ would entail length dimensioned information, just as in the case of reaching to an unmoving target. But it is worse. Not only would information about target size or distance be required but, in addition, information about target velocity is needed.

A final criticism will be that τ is unstable when used in a control dynamic to drive the hand to a target. Given this, it is reasonable to consider whether τ would be used at all instead of simply using the distance and velocity information which is required in any case. One reason that τ might still be used has been suggested by Zaal, Bootsma and van Wieringen (1998), and that is to coordinate the timing of grasping with reaching. We will return to this possibility after describing some data and some simulations of visually guided reaching that illustrate the points made thus far.
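A concrete pair of numbers may help to fix the argument; the values are our own illustration, not data from the chapter. For an object that departed from the eye and recedes at constant velocity, τ is simply distance over velocity, so

\[
\tau = \frac{x}{\dot{x}}:\qquad
\frac{1\ \text{m}}{0.5\ \text{m/s}} = 2\ \text{s}
\qquad\text{and}\qquad
\frac{4\ \text{m}}{2\ \text{m/s}} = 2\ \text{s}.
\]

A target 1 m away receding at 0.5 m/s and a target 4 m away receding at 2 m/s both yield a time-since-departure of 2 s. Matching the hand's τ to the target's τ therefore places no constraint on whether the hand's position is converging on the target's position.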
2. Mass-spring control and models of visually guided reaching

It is now fairly well established that the control of the limbs in discrete reaching is accomplished by controlling the limb as a tunable damped mass-spring with an adjustable stiffness, k, and equilibrium point, x_ep (Bizzi, Hogan, Mussa-Ivaldi & Giszter, 1992; Feldman, 1980; 1986; Feldman, Adamovich, Ostry & Flanagan, 1990; Hogan, Bizzi, Mussa-Ivaldi & Flash, 1987; Latash, 1993)²:
mẍ + bẋ + k(x - x_ep) = 0

where x, ẋ, and ẍ are position, velocity and acceleration, respectively, m is mass, b is damping, k is stiffness, and x_ep is the position of the equilibrium point. The role of visual information would be to determine the behavior of the equilibrium point by acting as a driver, G(k·x_ep):
mẍ + bẋ + kx = G(k·x_ep)

So, the task becomes to position the equilibrium point at the target, and the dynamics of the mass-spring determines the timing. The G function will also contribute to the timing, but the fact that timing is constrained by the mass-spring dynamics undermines one of the reasons for the use of τ in the driver. We will consider two alternative models that vary in terms of G.³ The first model incorporates τ in the driver. This model essentially instantiates the 'constant τ' strategy hypothesized by Bingham (1995). The strategy is appropriate for controlled approach situations in which others (e.g. Yilmaz and Warren (1997))
² It has also been suggested that reach trajectories are limit cycles, that is, that reaches exhibit trajectory stability as well as endpoint stability (Schöner, 1994; Zaal, et al., 1999). Limit cycles are only generated by nonlinear dynamics. The driven mass-spring could be nonlinear depending on the form of the driver. (See, for instance, the chapter by Bingham in this volume.) However, a driver could also be linear. A linear mass-spring with a moving EP can yield a trajectory stability that is derived from the postural stability of the moving EP. The empirical problem is to discriminate among these possible types of trajectory stability. There is evidence for trajectory stability that comes from the postural stability of a moving EP (Won & Hogan, 1995). On the other hand, if the EP is assumed in this case to be driven externally (that is, by a motor program), then this understanding is undermined by results showing that limb movements exhibit phase resetting in response to perturbations (Kay, Saltzman & Kelso, 1991).
³ Other models are possible and, in fact, exist (e.g. Schöner (1994), Zaal, et al. (1998), Zaal, et al. (1999)). However, the models we use are sufficient to illustrate general points about alternative perceptual variables that might be used in these models. The conclusions extend to the other possible models.
have hypothesized that Tau-dot might be used. The 'constant τ' strategy is just that: to move so as to preserve a constant τ value, for instance, 200 ms. In strictly mathematical terms, this yields a Zeno's-paradox-type situation in which one never reaches the target: "I am going to get there in a fifth of a second.... I am going to get there in a fifth of a second...." and so on ad infinitum. But this also yields a straight trajectory on the phase plane (or state space, i.e. a plot of v versus x) into the origin (that is, the target) with a slope of 1/τ_c. As shown by Bingham (1995), this converges to an arbitrarily small distance from the target in arbitrarily small time, depending on the value of τ_c. Soft contact can be achieved either by physiological tremor or by aiming just inside the target. The model we used is not a pure 'constant τ' model, but it is closely related. The model is:
x_ep = k ∫ (τ - .2) dt                                  (1)

where

τ = (x_T - x)/(ẋ - ẋ_T)

and x_T is target distance, x is the distance of the hand, ẋ_T is target velocity, and ẋ is hand velocity. In cases where the target object is unmoving, ẋ_T is zero and τ reduces to (x_T - x)/ẋ. Essentially, this system tends to drive the integrand to zero (thus, it drives τ to .2). If the integrand were simply equal to τ, then this system would act to drive τ to zero. (Although this strictly qualifies as a 'constant τ' strategy with τ_c = 0, it also corresponds to the more generally expected use of τ for targeting, that is, simply drive τ to zero for soft contact and target acquisition.) The problem in this case is that (x_T - x) (the distance of the hand from the target) is constrained, but ẋ (hand velocity) is not. τ can be made to approach zero simply by driving ẋ arbitrarily large. The result is unstable. Driving τ to a constant non-zero value (e.g. .2) yields another constraint, so that ẋ = (x_T - x)/τ_c. That is, ẋ is proportional to the remaining distance to the target (and therefore does not grow arbitrarily large). Thus, again, we see an approximately straight trajectory in phase space into the origin with a slope of 1/τ_c. The τ in equation (1) cannot be specified optically simply by τ's projected from hand and target. Spatial or length-dimensioned quantities specific to target distance and velocity are required. An apparent exception to this is a version of τ derived by Bootsma and Peper (1992) to model guidance of a hand to a target. However, their derivation required that the hand move along a straight path to the target. This, in turn, entails three conditions. First, reaches must exhibit straight paths. Often they do, but equally often they do not. For instance, the reaches in our experiment shown below did not follow straight
paths. Second, the location of the target must be known, that is, the distance and direction of the target relative to the hand must be known in advance. This, by itself, is not so troubling and simply conforms to the assertion that target distance must be perceived. The third condition is more problematic. It is that the target location must be known accurately. Any error in estimation of the target location will generally require, for correction, departures from a straight reach path. As soon as the reach trajectory departs from a straight path, the τ variable derived by Bootsma and Peper (1992) becomes invalid. One of the reasons some form of online visual guidance is generally expected is that distance perception is known to be inaccurate (e.g. Bingham & Pagano, 1998; Bingham et al., 2000). The τ variable in question fails exactly when it is most needed. Finally, the derivation fails when the hand is traveling directly between the eye and target, but reaches can be performed equally well approaching a target from this direction as from any other direction. Without doubt, reaching requires perception of the target distance (x_T) (even if that perception is inaccurate). In principle, an initial assessment of target distance can be used to obtain target size (s_T) from target image size (I_T): s_T = x_T · I_T. Subsequently, target size can be used to obtain target distance from image size: x_T(t) = s_T · 1/I_T(t). Similarly, target velocity can be obtained from the target's τ: ẋ_T(t) = x_T(t)/τ_T. Hand distance and velocity can be derived similarly (if not obtained from somatosensory sources). But, given the necessity to obtain such length-dimensioned quantities visually, it is not then necessary that they be used to construct a τ variable for control. To the contrary, they could be used directly in a control dynamic. An alternative model entails use both of the relative distance between hand and target and of target velocity:

m ẍ + b ẋ + k x = k [ … (x_T - x) … ]                                  (2)
The assumption that the relative distance between the hand and target is perceived is much less problematic than an assumption that the definite distance must be perceived continuously, rapidly, and accurately. Nevertheless, the model assumes that definite target velocity is perceived and this remains a worrisome assumption. Both of these models (equations (1) and (2)) are realizations of Zaal, Bootsma and van Wieringen's (1999) suggestion that reaching is simply performed in an inertial reference frame determined by the target. The visual information is used effectively to perform a Galilean transformation, that is, a change of coordinate systems.
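To keep the framework itself distinct from the particular drivers, the following minimal Python sketch integrates the driven mass-spring of the equations above with an equilibrium-point trajectory supplied as an arbitrary function of time. The parameter values, the Euler integrator, and the function names are our own assumptions; the τ-based and relative-distance drivers of equations (1) and (2) are deliberately left abstract.

```python
import numpy as np

# A minimal sketch (assumed parameter values, not the authors' simulations) of the
# driven mass-spring framework m*x'' + b*x' + k*(x - x_ep(t)) = 0: the hand, x, is
# pulled toward wherever the driver places the equilibrium point x_ep.
def simulate_mass_spring(x_ep_fn, m=1.0, b=6.0, k=25.0, dt=0.005, duration=3.0):
    n = int(duration / dt)
    t = np.arange(n) * dt
    x = np.zeros(n)      # hand position
    v = np.zeros(n)      # hand velocity
    for i in range(1, n):
        a = (-b * v[i - 1] - k * (x[i - 1] - x_ep_fn(t[i - 1]))) / m
        v[i] = v[i - 1] + a * dt          # simple fixed-step Euler integration
        x[i] = x[i - 1] + v[i] * dt
    return t, x, v

# Stepping the equilibrium point onto a stationary target 5 units away: the hand
# settles on the target with a time course fixed entirely by m, b, and k.
t, x, _ = simulate_mass_spring(lambda t: 5.0)
print(f"hand position at t = 1.5 s: {x[round(1.5 / 0.005)]:.2f}")
```

Passing a moving equilibrium point instead (for instance, lambda t: 5.0 + 5.0 * t) makes the hand in this sketch track the target with a steady lag of (b/k) times the target speed, which is one way of seeing why the drivers discussed above must supply more than instantaneous target position.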
3. Simulations: Reaching to moving targets
Using these two models, we performed simulations of reaches to moving targets. We also tested two closely related models that are the same as those in equations (1) and (2), but with the ẋ_T terms removed from each, respectively. With this, they lose their Galilean character but, relatedly, only information about target distance, not velocity, is required. Perhaps these models might be adequate. We will refer to the latter simplified models as (1a) and (2a), respectively. We also performed simulations of reaches to unmoving targets and 'double-step' targets. The 'double-step' targets changed instantaneously to somewhat nearer or farther locations after a reach was initiated. We will return to an account of these results subsequently. First, we will focus on reaches to moving targets.

Figure 1: Tau driver with ẋ_T (top panel) and without ẋ_T (bottom panel).
In all simulations, the target lay initially at a distance of 5 units and started moving at 5 units per second at the initiation of the reach. Performance of model (1) is shown in the top panel of Figure 1, where position time series are shown for the target, the hand, and the equilibrium point. The hand successfully closed on the target in about 1.5 seconds and then continued to track the target stably. The same pattern occurred with somewhat slower and faster target speeds, although target acquisition occurred a little sooner or later, respectively. The performance of model (1a) is shown in the bottom panel of Figure 1. In this case, the hand never converged to the target, but instead settled at a distance from which it tracked the target. Once hand velocity was equal to target velocity (ẋ = ẋ_T), the distance at which the hand remained from the target was determined by τ_c · ẋ = x_T - x. The faster the target moved, the larger the distance. The tracking distance could be reduced by reducing τ_c, but the smallest value of τ_c that admits stable behavior is limited to about 200 ms by delays within the system. The bottom line is that the τ-based model cannot acquire a moving target unless τ is derived using both target position and velocity to determine a τ for the hand in a target frame of reference. The top panel of Figure 2 illustrates the performance of model (2). The hand closed on the target within 1.25 seconds and then stably tracked the target. Again, the pattern was the same at slower and faster target speeds. The middle and bottom panels show the performance of model (2a). In both cases, the model failed to track the target. This occurred because the relative distance grew smaller (and weaker as a driver) as the absolute distances became larger, so the absolute distance between the hand and target also grew. As shown in the bottom panel, with an increase in the gain of the driver, the hand can catch the target, but the behavior is unstable because the hand also fails to track. So, once again, information about both target position and velocity is required for the hand to be able to stably acquire and track the target. In this case, however, only the relative distance between hand and target (i.e. the proportion of the total target distance) was required. Nevertheless, absolute target velocity was needed.
Figure 2: Relative distance driver with ẋ_T (top panel) and without ẋ_T (middle and bottom panels).
4. Human performance: Reaching to moving targets
How well do people actually perform a task like this? We tested participants in a task illustrated in Figure 3. A seated participant held down a button with his right index finger. The button was located at the front edge of a flat-bed plotter on which was positioned a movable target. The target was an upright wooden dowel with a wooden triangle flag projecting from it. The dowel had velcro on its top, as did the participant's index finger. The dowel sat inserted in a short pipe, from which it was to be removed by the participant by
reaching to contact the velcro on the top and then lifting it. The pipe ensured that the dowel had to be lifted straight upwards for successful acquisition. Hand and target movement were recorded by a WATSMART optoelectronic kinematic measurement system. Infrared emitting diodes (IREDs) were placed on the participant's index finger, on the flag of the target, and immediately below the target dowel on the carriage supporting the target. The positions of the IREDs were sampled at 100 Hz. Participants performed 170 trials in which they reached to acquire the target. On each trial, one of a number of possible events could occur. First, the target was positioned at one of four initial distances: Near (15 cm), Medium (20 cm), Far (25 cm), or Very Far (50 cm). Second, the target might (a) remain unmoving, or (b) jump at 1 meter/second from Near to Medium, from Far to Medium, or from Medium to either the Near or Far distances, or (c) move away from any of the Near, Medium or Far initial locations at one of three possible speeds: 20 cm/s, 30 cm/s, or 40 cm/s. Jumps or movement of the target started 100 ms after the participant initiated the reach and left the button. As shown in Figure 3, targets moved away in the x direction and, when acquired, targets were lifted upwards in the z direction.
Figure 3: Experimental set-up, with IREDs on the participant's index finger, on the target flag, and on the carriage supporting the target.
Representative performance in acquiring a target moving at the fastest velocity (40 cm/s) is shown in Figure 4. The blue line is the finger, the red line is the target position (carriage IRED), and the green line is the flag IRED (which was above and slightly beyond the carriage IRED). In the x, the finger can be seen to acquire and track the target, while in the z, the finger hovers
above the target while tracking it and then moves down to contact the top of the target and then lift it from the pipe. Once the target dowel is clear of the pipe, movement of the finger and target in the x stops. The x component of this performance was well simulated by models (1) and (2).
Figure 4: Finger, carriage, and flag IRED positions over time for a reach to a target moving at 40 cm/s (top panel: x-position, depth; middle panel: y-position, width; bottom panel: z-position, height).
5. 'Double-step' targeting and the instability of τ: A final weakness
The final weakness of the idea that τ is used to drive a reach to a target is revealed in simulations of 'double-step' targeting experiments. In 'double-step' targeting, the position of the target is changed suddenly while the hand is moving towards the target. The problem is that τ is unstable. τ becomes small as the hand approaches a target because both the distance of the hand from the target and the velocity of the hand relative to the target become small. When the target is suddenly moved to a larger or smaller distance from the hand, the distance becomes relatively large but, due to the inertia of the hand, the relative velocity remains small and near zero (after a brief, pulse-like increase in the relative velocity due to the sudden movement of the target). Sometimes the hand must reverse its direction of movement because it has passed the target. Again, hand velocity goes through zero while distance from the target remains relatively large. A very small number divided into a large number yields an exceptionally large number: τ grows without bound. The hand is sent spiraling into instability. This problem is general. Whenever the hand slows to small velocities or stops while still at a significant distance from a target, this behavior will result. We simulated a 'double-step' reach using model (2), which uses relative distance information. The target at distance 5 jumped to a nearer location at distance 3 at 750 ms into the reach. As shown in Figure 5, the hand overshot the nearer distance and so reversed its direction of movement. Then it slightly overshot the target again and settled to the target position.
Figure 5: Relative distance driver responding to a 'double-step' target.
Participants in the experiment described above were also tested in 'double-step' trials in which targets moved suddenly either to a farther or to a nearer location. The reaching response is shown in Figure 6 for a trial in which the target jumped to a nearer location. The response was very like that exhibited by model (2). The hand overshot, reversed, overshot, and settled to the target location.

Figure 6: Hand and target positions over time for a 'double-step' trial in which the target jumped to a nearer location (top panel: x-position, depth; middle panel: y-position, width; bottom panel: z-position, height).
Next, we tested this situation with model (1), which uses τ. As shown in the top panel of Figure 7, the hand fails to acquire the target in this case. The reason is shown in the bottom panel of Figure 7, in which the behavior of τ is shown. τ vacillates and jumps to extreme values as the hand reverses its direction of movement. This sends the hand itself into erratic movement. This problem can be mitigated by placing τ in a saturation function that limits its ability to drive the dynamics with extreme values. We modified model (1) by replacing (τ - .2) by arctan(τ - .2). The arctan function suppresses large (positive or negative) values. The resulting behavior in response to the 'double-step' perturbation was stable. The behavior was much like that of model (2), but with less overshoot, as shown in Figure 8.
Figure 7: Tau driver responding to a 'double-step' target (top panel: hand and target positions; bottom panel: the behavior of τ).
The added complexity of the saturation function can be avoided, however, by simply using model (2), that is, relative distance information instead of τ.
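To make the effect of the saturation step concrete, the short fragment below (illustrative numbers only, not the simulation code) contrasts the raw driver signal (τ - .2) with its arctan-bounded version when the hand has nearly stopped while the gap is still large, as happens after a 'double-step' perturbation.

```python
import numpy as np

gap = 3.0                                   # distance still separating hand and target
for speed in (1.0, 0.1, 0.01, 0.001):       # hand speeds dropping toward a reversal
    tau = gap / speed                       # first-order time-to-contact of the hand
    raw = tau - 0.2                         # raw driver signal: grows without bound
    bounded = np.arctan(tau - 0.2)          # saturated signal: capped near pi/2
    print(f"speed={speed:>6}  tau={tau:>8.1f}  raw={raw:>8.1f}  arctan={bounded:.3f}")
```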
Figure 8: Tau driver with the arctan saturation function responding to a 'double-step' target; arctan(τ) remains bounded while τ itself heads off to extreme values.
6. Conclusions
Spatial, length-dimensioned information about target position and velocity is required to enable reaches to acquire a moving target successfully and track it stably. It is possible to use the information to construct a τ variable that is then used to execute the 'constant τ' strategy for visual guidance. This requires information about both target and hand position and velocity. Alternatively, information about target velocity could be used together with information about the relative distance of the hand with respect to the target, and this information can be used directly. The information in this case is simpler (target velocity and mere relative distance) and the model is simpler, that is, the intervening variable, τ, need not be constructed. Finally, τ is unstable as a control variable for reaches unless its behavior is modified by a saturation function. Again, greater complexity is entailed. Model (2) is the more likely on grounds of parsimony. The question is whether there are timing demands imposed by the higher dimensional aspects of reaching and grasping that might be better served by use of a τ variable. For instance, there is the issue of
coordinating the frontoparallel and depth components, especially if we cannot safely assume a straight path of movement (Bingham, 1995; Bootsma & Peper, 1992). τ variables might be used to achieve the appropriate timing, as suggested by Bingham (1995) and Lee (1998). (See also Lee, Georgopoulos, Clark, Craig & Port (2001).) On the other hand, the respective proportionate distances might also be used to achieve this coordination. Another coordination problem is the timing of grip opening and closing relative to the reach trajectory. Zaal et al. (1998) have suggested that τ variables might be used independently to coordinate the timing of the grip aperture with approach to a target. If τ is being used to control grasping, then perhaps one might use it for the control of reaching as well. On the other hand, if reaching is controlled via perception of relative distance and target velocity, then perhaps grasping could be controlled independently using the same variables. That is, Zaal et al.'s argument for the use of perceptual variables in independent control structures for reaching and grasping, respectively, could simply be extended to different perceptual variables, ones that are spatial, not temporal, but potentially effective in the context of the right control structures nevertheless.
REFERENCES
Bingham, G. P. (1995). The role of perception in timing: Feedback control in motor programming and task dynamics. In E. Covey, H. Hawkins, T. McMullen & R. Port (eds.), Neural Representation of Temporal Patterns (pp. 129-157). New York: Plenum Press.
Bingham, G. P. & Pagano, C. C. (1998). The necessity of a perception/action approach to definite distance perception: Monocular distance perception to guide reaching. Journal of Experimental Psychology: Human Perception and Performance, 24, 145-168.
Bingham, G. P., Zaal, F., Robin, D. & Shull, J. A. (2000). Distortions in definite distance and shape perception as measured by reaching without and with haptic feedback. Journal of Experimental Psychology: Human Perception and Performance, 26(4), 1436-1460.
Bizzi, E., Hogan, N., Mussa-Ivaldi, F. & Giszter, S. (1992). Does the nervous system use equilibrium point control to guide single and multiple joint movements? Behavioral and Brain Sciences, 15, 603-613.
Bootsma, R. J. & Peper, C. E. (1992). Predictive visual information sources for the regulation of action with special emphasis on catching and hitting. In L. Proteau & Elliott (eds.), Vision and Motor Control (pp. 285-314). Amsterdam: North-Holland.
Feldman, A. G. (1980). Superposition of motor programs - I. Rhythmic forearm movements in man. Neuroscience, 5, 81-90.
Feldman, A. G. (1986). Once more on the equilibrium-point hypothesis (λ model) for motor control. Journal of Motor Behavior, 18(1), 17-54.
Feldman, A. G., Adamovich, S. V., Ostry, D. J. & Flanagan, J. R. (1990). The origin of electromyograms: Explanations based on the equilibrium point hypothesis. In J. M. Winters & S. L-Y. Woo (eds.), Multiple Muscle Systems: Biomechanics and Movement Organization. New York: Springer-Verlag.
Hogan, N., Bizzi, E., Mussa-Ivaldi, F. A. & Flash, T. (1987). Controlling multijoint motor behavior. In K. B. Pandolf (ed.), Exercise and Sport Sciences Reviews, Vol. 15 (pp. 153-190, esp. pp. 167-170). New York: MacMillan.
Kay, B. A., Saltzman, E. L. & Kelso, J. A. S. (1991). Steady-state and perturbed rhythmical movements: A dynamical analysis. Journal of Experimental Psychology: Human Perception and Performance, 17, 183-197.
Latash, M. L. (1993). Control of Human Movement (Ch. 1: What muscle parameters are controlled by the nervous system?, pp. 1-37, and Ch. 3: The equilibrium-point hypothesis and movement dynamics, pp. 81-102). Champaign, IL: Human Kinetics.
Lee, D. N. (1998). Guiding movement by coupling taus. Ecological Psychology, 10, 221-250.
Lee, D. N., Georgopoulos, A. P., Clark, M. J. O., Craig, C. M. & Port, N. L. (2001). Guiding contact by coupling the taus of gaps. Experimental Brain Research, 139, 151-159.
Michaels, C. F. & Oudejans, R. R. D. (1992). The optics and actions of catching fly balls: Zeroing out optical acceleration. Ecological Psychology, 4, 199-222.
McBeath, M. K., Shaffer, D. M. & Kaiser, M. K. (1995). How baseball outfielders determine where to run to catch fly balls. Science, 268, 569-573.
McLeod, P. & Dienes, Z. (1996). Do catchers know where to go to catch the ball or only how to get there? Journal of Experimental Psychology: Human Perception and Performance, 22, 531-543.
Oudejans, R., Michaels, C., Bakker, F. & Dolne, M. (1996). The relevance of action in perceiving affordances: Perception of catchableness of fly balls. Journal of Experimental Psychology: Human Perception and Performance, 22, 879-891.
Peper, L., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: How to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20, 591-612.
Schöner, G. (1994). Dynamic theory of action-perception patterns: The time-before-contact paradigm. Human Movement Science, 13, 415-439.
Won, J. & Hogan, N. (1995). Stability properties of human reaching movements. Experimental Brain Research, 125-136.
Yilmaz, E. & Warren, W. (1997). Visual control of braking: A test of the tau-dot hypothesis. Journal of Experimental Psychology: Human Perception and Performance.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1998). Coordination in prehension: Information-based coupling of reaching and grasping. Experimental Brain Research, 119, 427-435.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1999). Dynamics of reaching for stationary and moving objects: Data and model. Journal of Experimental Psychology: Human Perception and Performance, 25, 149-161.
Time-to-Contact - H. Hecht and G.J.P. Savelsbergh (Editors) © 2004 Elsevier B.V. All rights reserved
CHAPTER 17 The Use of Time-to-Contact Information for the Initiation of Hand Closure in Natural Prehension
Frank T. J. M. Zaal University of Groningen, Groningen, The Netherlands
Reinoud J. Bootsma University of the Mediterranean, Marseilles, France
ABSTRACT In prehension, the opening and closing of the hand need to be coordinated with the transport of that same hand. We provide evidence for the hypothesis that this coordination is based on the use of first-order time-to-contact information. This information acts to affect the relative stability of a hand opening regime and a hand closing regime, leading to a smooth and stable transition from the one to the other, rather than simply triggering closure of the hand upon passing a critical value. Whereas manipulations of target width, target size, and target distance led to a pattern of data that could not be captured by a number of competing models for the coordination of reaching and grasping, the appropriately parameterized dynamic timing model that we adopted for this study fitted the data quite well. In this chapter we pursue two goals at the same time. We build a case for the use of first-order time-to-contact information in the coordination in prehension. At the same time, we use that case as a vehicle for discussing various aspects of the concept of first-order time to contact and the application of dynamical-systems models.
1. Introduction
Visual guidance of action implies the existence of information as well as the existence of a control law expressing how that information is used in guiding the action. This chapter will be concerned with the use of τ-like perceptual variables as information for visually guiding the initiation of hand closing during natural prehension. Natural prehension can be considered as the act of coordinated reaching and grasping. The reaching component of prehension refers to the transport of the hand to the target object and the grasping component refers to the opening and closing of the hand. Both processes evolve simultaneously and need to be coordinated for a prehensile act to be successful. The hand needs to open and then close to end up in an appropriate configuration at the time that the hand reaches the object to be grasped. It is this coordination of reaching and grasping that we consider in this chapter. In a previous study (Zaal, Bootsma, & Van Wieringen, 1998), we showed that the coordination of reaching and grasping could well be based on first-order time-to-contact information. The coordination of reaching and grasping, here, and throughout the literature on prehension, for that matter, is operationalized as the timing of the initiation of closing of the hand, a moment that in most cases coincides with the moment of peak hand aperture. During a prehensile movement, the hand first opens to reach a peak aperture somewhat larger than the size of the object to be grasped. When roughly 70% of the movement duration has passed, the hand starts to close in on the target object (e.g., Jeannerod, 1981, 1984). Below, we will discuss a set of alternative accounts for this timing, accounts that can be found in the vast literature on prehension. Indeed, one of the objectives of the present study is to rule out these alternatives and make a more convincing case for the coordination to be based on first-order time-to-contact information. But equally important is our other goal of taking this example of the use of τ-like information in a highly nonlinear control law to address issues pertinent to studies of visually guided action in a much broader context than prehension alone. To do so, we first need to explain where the time-to-contact information that we advocate to be involved in the coordination of prehension can be found. In any manual prehension situation, a hand approaches an object. That is to say, one of the things that happen during prehension is the closing of a gap between the hand and the target object. As this closing of the gap happens at a certain speed, first-order time to contact is defined as the inverse of the relative rate of closing of the gap. Using the notation proposed by Bootsma, Fayt, Zaal, and Laurent (1997), first-order time-to-contact TC₁(D) equals the ratio of D and -Ḋ, in which D stands for the distance between the hand and the target and the dot refers to the derivative with respect to time. In words, at each point in time, TC₁(D) equals the current size of the gap between the hand and the target
divided by the current speed of closing that gap. Two things are worth mentioning at this point. First, TC₁(D) is fully determined by the reaching kinematics, and, thus, in the model that we present, the unfolding of the reaching determines the timing of the grasping. Second, the distance D is defined relative to a hand and a target and not with reference to the point of observation, as it usually is in studies involving τ. As shown by Bootsma and colleagues (Bootsma & Craig, 2002; Bootsma & Oudejans, 1993; Bootsma & Peper, 1992), however, TC₁(D) is optically available. Mathematically, the specification of TC₁(D) boils down to the combination of the relative rate of constriction of the optical gap θ between the hand and the object and the relative rate of contraction of the solid optical angle φ of the image of the hand. Thus, TC₁(D) equals τ(φ,θ), with φ referring to the solid optical angle subtended by the hand at the point of observation and θ referring to the optical angle between the hand, the point of observation and the target object. Note that this notation attempts to be as clear as possible in making the distinction between the optical variable - the specificator τ(φ,θ) - and the physical variable - the specificandum TC₁(D) - that it specifies (cf. Bootsma et al., 1997; Bootsma & Craig, 2002). The specificator is defined in terms of perceptual variables such as optical angles, whereas the specificandum is defined in terms of physical variables such as distances and speeds. Furthermore, note that although τ(φ,θ) mathematically is a combination of τ(φ) and τ(θ), the optical variable τ(φ,θ) […]
first-order time-to-contact); The fact that the speed of approach does change is immaterial to that one-to-one relation. And the fact that TC₁(D) does not equal the actual time to contact, a quantity that can only be known a posteriori, does not rule against the use of that variable in the on-line visual control of the act. Along the same lines, the optical specification of TC₁(D), that is, τ(φ,θ), rests on the condition (not assumption) that the reaching trajectory is a straight line (see Bootsma & Oudejans, 1993; Bootsma & Peper, 1992). In other words, there is a one-to-one relation between the first-order time-to-contact variable pertaining to the physical gap and a first-order time-to-contact variable pertaining to the optical structures related to the hand approaching the target along a linear path. The point to note here is that it is also immaterial whether the actual reaching paths are straight or not for the relation to be true. Thus, straight trajectories or not, the optical variable still specifies the first-order time-to-contact under straight-line conditions. If we are able to show that a control law based on this informational variable would result in an adaptive control of the coordination of reaching and grasping - and that is exactly what we will do later in the chapter - we argue that we have made good progress in identifying the informational variable as well as the control law, and, thus, in uncovering the control. The formulation of the control law would then allow the further testing of the hypothesized use of the optical variable using more precise predictions concerning, for instance, the effects of directly manipulating the optical variable.

1.1 The dynamic timing model
Above, we discussed the time-to-contact variable that we consider in the present study. Next, we will lay out the dynamic model, proposed by Schöner (1994a) and elaborated by Zaal and colleagues (1998), which is the basis of the control law that we propose to be at work in the coordination of reaching and grasping. Since the Schöner (1994a) model is a dynamic-systems model, one of its primary features is stability. The key phenomenon that the model addresses is perceptually driven change. For instance, the gannets studied by Lee and Reddish (1981) are known to fold their wings just before entering the water while diving to catch fish. Lee and colleagues suggested that the timing of this wing folding would be based on first-order time-to-contact information. The control law proposed by Lee and Reddish (1981) - and in the majority of studies on the visually induced initiation of movement, for that matter (cf. Tyldesley & Whiting, 1975) - was simply a triggering of the wing folding upon reaching a critical value of τ, the optical variable specifying first-order time-to-contact. Schöner's (1994a) model captures this folding of the wings as a change in wing postures. These postures are endowed with stability properties, in the sense that the model postulates that both the open-wings posture and the folded-wings posture are stable against perturbations: The bird intentionally assumes one of
both postures and is able to maintain that posture in the face of perturbing forces. In modeling terms, a continuous state variable (x in the equations in the Appendix) represents all possible wing postures; the variable x can attain all values between -1 (an open-wings posture) and +1 (a closed-wings posture), thus allowing for any wing configuration between open and folded wings. As mentioned above, the two extreme postures - open-wings and closed-wings - are endowed with stability properties, modeled as two point attractors (fixed points) in state space. Now, the folding of the gannet's wings can be regarded as a transition from one stable point, representing the open-wings posture, to the other stable point, representing the closed-wings posture. Schöner proposed that the transition path in state space would also need to be stable against perturbations, a feature that is important if several movements need to be coordinated (for details, see Schöner, 1990; 1994a). In modeling terms, this asks for a limit-cycle attractor with a path in state space going through the points representing the stable postures (cf. Schöner, 1994a). The role of the optical variable in the model is in affecting the relative stability of the two point attractors representing the stable postures. In the gannet example, the operative optical variable is first-order time to contact with the water surface, or, more precisely, its inverse. Figure 1 illustrates the working of the model in terms of potential wells.¹ The two point attractors representing the open-wings posture (x = -1) and the closed-wings posture (x = +1) are depicted as two potential wells. Figure 1A shows the situation in which both postures are equally stable: The width and depth of the wells characterize the stability of the point attractors. As said, the role of the optical variable in the model - the rate function in Figure 1D - is in affecting the relative stability of the two point attractors. Since first-order time to contact decreases with time (although not linearly in the case of a gannet falling under gravity), its inverse will grow exponentially. By having the value of the rate function determine the relative stability of the two point attractors - the relative depths of the two potential wells in Figure 1A-C are determined by the value of the rate function in Figure 1D - this change in the value of the rate function leads to a loss of stability of the point attractor representing the open-wings posture (x = -1), giving rise to a transition to the increasingly more stable point attractor representing the closed-wings posture (x = +1). Not captured in Figure 1 is the fact that this transition - x going from -1 to +1 - also has stability properties of its own. In short, a visually induced loss of stability of the open-wings posture in combination with a gain in stability of the closed-wings posture leads to the change from the one posture to the other, and thus, to the folding of the wings.
¹ As explained in Zaal et al. (1998), this representation of the model is mathematically not correct, but we consider it a helpful way of explaining the gist of the model.
Figure 1: Illustration of the Schöner (1994a) model of dynamical perception-action coupling. We represented the fixed-point attractors in the model with potential wells (A-C), the depth and width of which are affected by the value of the optical variable (D). Adapted from: Zaal, Bootsma, & Van Wieringen (1998), Coordination in prehension: Information-based coupling of reaching and grasping, Experimental Brain Research, 119, 427-435, © Springer Verlag.
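Since the model's actual equations are given in the chapter's Appendix and are not reproduced here, the following toy sketch only illustrates the mechanism depicted in Figure 1: a state variable with attractors near -1 (opening) and +1 (closing) whose relative stability is shifted by a growing rate function until the opening attractor disappears and the state switches sign. The potential, the parameter values, and the rate function are all our own stand-ins.

```python
# Toy stand-in (not the Appendix equations): dx/dt = -dV/dx for the tilted double
# well V(x) = x**4/4 - x**2/2 - c*rate(t)*x.  With rate(t) = 0 the attractors sit
# near x = -1 (hand opening) and x = +1 (hand closing); a growing rate term tilts
# the potential until the opening attractor loses stability and x crosses zero.
def switch_time(rate_fn, c=0.5, dt=0.005, duration=2.0):
    x = -1.0                                  # start in the hand-opening regime
    for i in range(int(duration / dt)):
        t = i * dt
        dV = x**3 - x - c * rate_fn(t)        # gradient of the tilted double well
        x -= dV * dt                          # Euler step down the potential
        if x > 0.0:                           # zero crossing = initiation of closure
            return t
    return None

# The rate function stands in for the inverse of first-order time-to-contact, which
# grows as the hand bears down on the object (values are arbitrary).
print(switch_time(lambda t: t / (2.0 - t + 1e-9)))
```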
One might wonder what would be the advantage of using such a seemingly complicated model over a model in which the folding of the wings is triggered upon reaching a critical value of first-order time-to-contact (cf. Lee and Reddish, 1981), the more standard way of approaching such a movement initiation. First, let us mention that the Schöner (1994a) model may be parameterized in such a way that a very quick transition takes place at some specific value of the rate function, thus capturing exactly the latter scenario. Importantly, however, even in this case, the model assumes a continuous coupling of the optics and the dynamics. That is to say, a standard model of triggering movement initiation upon τ passing a threshold would not have a
solution if for some reason τ would pass the same threshold in the opposite direction. In contrast, the Schöner model predicts a transition the other way - from folded wings to unfolded wings - because of the continuous character of the visual guidance. Such situations might occur if, in the case of the diving gannets, the water front would recede as the result of a wave, or in case the wind would throw the bird to a higher altitude. The continuous guidance leads to stable behavior, in which wing posture is geared to the first-order time to hit the surface. It is exactly this type of stability that makes the Schöner model an appealing alternative to the standard discrete-control alternative (also see Wann, 1996, for an empirical critique of such a model). Above, we used the diving gannets to explain the Schöner (1994a) model. In the model, a state variable x was mapped onto the wing postures that the gannet could attain. The model does not apply to postures exclusively, however. This modeling approach would be appropriate for any observable that can be said to be stable in the face of perturbations. For instance, Schöner showed how the landing of flies (Wagner, 1982) could be understood as a visual regulation of their flight motor activation. In this case, point attractors are mapped onto the situation that the flight motor is turned off and onto the situation that the flight motor is turned on with a constant thrust. During landing, then, at sufficiently high rates of the optical variable, the flight motor would be turned on to decelerate the fly, until it would be turned off again just before contact (for more details, see Schöner, 1994a). Other examples would be jumping to hit a falling ball (Lee, Young, Reddish, Lough, & Clayton, 1983; Michaels, Zeinstra, & Oudejans, 2001), modeled as a visually induced switch from an initial posture to a final posture, much like the account for the folding of the gannets' wings. We will apply the Schöner (1994a) model to the coordination of reaching and grasping, following up and improving on our previous study (Zaal et al., 1998), which we take as a promising start to demonstrate the model's applicability and the implication of first-order time-to-contact information for the timing of the initiation of hand closure. In our study on the timing of the coordination of the reaching and grasping components of prehension (Zaal et al., 1998), we considered the grasping movement to be the subsequent opening and closing of the hand. Following Schöner's (1994a) lead, we mapped a state variable onto this opening and closing. More specifically, in the model, x = -1 represents the hand-opening regime and x = +1 represents the hand-closing regime. Thus, the initiation of hand closure would coincide with the point at which the state variable x would be zero, going from a negative to a positive value. The model seemed to capture the pattern in the timing of the initiation of hand closure under different conditions of reaching distance and target speed quite well (see Zaal et al., 1998, Figure 4). First-order time to contact TC₁(D) was not constant across conditions, as would
be predicted by a discrete control scheme. Still, the model simulation nicely mirrored this pattern of varying TC₁(D) values. Considering that we did not go to great lengths attempting to optimize the model, the results reported by Zaal and colleagues would seem convincing. Some reservations seem to be in order, however. Recall that, according to the model, the kinematics of the reaching movement determine the time course of TC₁(D), which, in turn, affects the relative stability of the hand-opening and hand-closing regimes. Consequently, to assess the success of the model, one would need to determine the reaching kinematics, compute the TC₁(D) traces, simulate the model, determine the point in time where the state variable x crosses zero, and compare that time to the observed time of the start of hand closing. In the Zaal et al. (1998) paper, we chose to derive the reaching kinematics from a model that we had developed using the same data set (Zaal, Bootsma, & van Wieringen, 1999). As a consequence, we ignored any variation in reaching kinematics within a condition (or, even, across participants). This meant that the possible resulting variation in initiation times of hand closing was not used to evaluate the model performance. It also meant that for six distance conditions we ran six model simulations. Moreover, model parameters were assumed to be equal for all participants. Finally, this procedure only allowed a visual assessment of the goodness of fit, whereas alternative hypotheses regarding the timing of the coordination in prehension had been tested, and rejected, using powerful repeated-measures ANOVAs. The latter type of analysis requires a measure on each trial rather than a measure averaged across participants, as used in the Zaal et al. (1998) study. Thus, a more powerful evaluation of the model would use the reaching kinematics of individual trials to compute TC₁(D) time series for each trial, run the numerical simulation of the model and determine the point representing the model prediction of the moment of the initiation of hand closure, compare that moment to the observed moment in that trial, and enter these comparisons into a repeated-measures ANOVA. This is exactly what we will do in the analyses reported below. Furthermore, we decided to use a different data set for this exercise, a data set from an experiment in which reaching distance, target size, and target width had been manipulated (Bootsma, Marteniuk, MacKenzie, & Zaal, 1994). Not only did these data allow for a generalization across more conditions than reaching distance and target speed - the parameters manipulated in the Zaal et al. (1998) study - but using a fresh data set, with no information on the possible success of the model simulations, seemed a more appropriate route to take in gathering convincing support for the conclusions and method. Now, before turning to the analyses, we will briefly discuss a number of alternative hypotheses regarding the timing of the initiation of hand closure.
1.2 Coordinated reaching and grasping
Over the last two decades, a fair number of possible mechanisms for the coordination of reaching and grasping have been proposed. Jeannerod's (1981, 1984) early investigations into prehension seemed to suggest that the moment of peak hand aperture occurred at about 70 percent of the total movement time. More specifically, Jeannerod hypothesized that the moment of peak hand aperture coincided with the moment of peak hand deceleration, the operationalization of the start of the so-called low-velocity phase of the reaching component. Several studies from the early 1990s, however, indicated that Jeannerod's thesis did not hold up under experimental scrutiny (Gentilucci, Castiello, Corradini, Scarpa, Umiltà, & Rizzolatti, 1991; Jakobson & Goodale, 1991; Marteniuk, Leavitt, MacKenzie, & Athenes, 1990; Paulignan, MacKenzie, Marteniuk, & Jeannerod, 1991; Wallace, Weeks, & Kelso, 1990). An alternative proposed by Gentilucci and colleagues was that a prehension movement was organized such that the hand closure time - the time from peak hand aperture to object contact - would be constant (Gentilucci, Chieffi, Scarpa, & Castiello, 1992). Indeed, no statistically significant effects of reaching amplitude, under location-perturbation and no-perturbation conditions, were found in the experiments performed by the Gentilucci group. Notwithstanding the earlier findings of target size effects on the duration of hand closure (Von Hofsten & Rönnqvist, 1988; see also Zaal et al., 1998 for more details on the dependency of the duration of hand closure on target size), Hoff and Arbib (1993) adopted the idea of a constant duration of hand closure to feature prominently in their model of the control and coordination of reaching and grasping, the model that, to date, is widely considered to be the state of the art in the understanding of prehension. Around the same time that the Gentilucci et al. (1992) study showed that the duration of hand closure was not affected by reaching amplitude manipulations, Bootsma and van Wieringen (1992) proposed that the coordination of reaching and grasping might well emerge from the use of the same information in both components of prehension. More specifically, they suggested that as soon as the movement was on its way, time-to-contact information (see above) would be available to guide both reaching and grasping. Since the coordination would be the consequence of the common use of an information source, no dedicated mechanisms were needed to instantiate that coordination. While there are profound differences between the theses put forth by Bootsma and van Wieringen (1992) and by Gentilucci and coworkers (1992), the two studies share the notion that the timing of the initiation of hand closure might be controlled based on the time remaining until contact between hand and object. Note, however, that the two proposals capitalize on completely different
notions of time-to-contact. The Bootsma and van Wieringen notion of time-to-contact refers to prospective information on first-order time-to-contact TC₁(D). At each point in time, this information about the time it would take to reach the target under prevailing speed conditions is available. In contrast, the time-to-contact referred to by Gentilucci and colleagues is the actual time it takes from the moment of peak hand aperture - the moment of initiation of hand closure - until the moment of first hand-object contact. First, since reaching speed is not constant, this time, in general, does not equal the first-order time-to-contact: In other words, the times that feature in the Bootsma and van Wieringen (1992) study and in the Gentilucci et al. (1992) study are different times! Second, whereas the Bootsma and van Wieringen model proposes that control and coordination are on the basis of continuously available information, the Gentilucci et al. (1992) account, and with it the Hoff and Arbib (1993) model, assume a priori knowledge about the upcoming duration of hand closure. The moment of peak hand aperture is simply prescribed to occur at hand closure time before contact; times for hand opening and hand closing are allocated such that peak hand aperture should occur at a fixed time before the movement should be finished (which, in turn, assumes a priori knowledge about the duration of the entire movement). Our effort to emphasize the difference between the two time-to-contact notions that feature in the literature on the coordination of reaching and grasping might seem a bit elaborate to some readers. Nevertheless, the confusion surrounding these ideas tempted us to draw the distinction as clearly as possible. Two studies, conducted independently from each other and published around the same time, tested the hypothesis that the timing of the initiation of hand closure was based on time-to-contact information. Watson and Jakobson (1997) had participants pick up objects that moved at different speeds from different directions to a designated interception location. Their data analyses indicated that the duration of hand closure differed significantly among the direction conditions. Clearly this constitutes a violation of the hypothesis formulated in the Gentilucci et al. (1992) study, the study that inspired the Hoff and Arbib (1993) model. Surprisingly, however, Watson and Jakobson interpreted their findings as providing evidence against the thesis proposed by Bootsma and van Wieringen (1992), whose proposal involved the use of prospective first-order time-to-contact information. Unfortunately, the confusion has not ended there. For instance, in the recent paper in which Mon-Williams and Tresilian (2001) propose a new simple rule of thumb for the coordination of reaching and grasping (below, we will discuss this proposal more elaborately), Watson and Jakobson's (1997) conclusion is easily taken at face value, and the idea that the coordination of reaching and grasping might be based on first-order time-to-contact information is simply dismissed. Of course, had Watson and Jakobson been aware of the Zaal et al. (1998) study that features so
prominently in this chapter, the conclusion that they seemingly drew would not have to be any different: The initiation of hand closure is not based on first-order time-to-contact reaching some critical value. However, the same study (i.e., Zaal et al., 1998; see also Zaal, 1995) indicated that under a different control law - the timing model that we discussed above - first-order time-to-contact did seem to be implied in the coordination in prehension. The more general conclusion that seems to be implied by the Watson and Jakobson (1997) study, and cited as such in the Mon-Williams and Tresilian (2001) paper, that first-order time-to-contact information is not involved in the coordination of reaching and grasping, cannot be drawn. True, first-order time-to-contact at the moment of the initiation of hand closure is not constant across conditions. But, under a different control law, the same information might well be functional in exactly that coordination. The point to take here is that showing that a specific informational variable does not seem to work given some control law (e.g., triggering some movement upon first-order time-to-contact passing some critical value) cannot be taken to show that the informational variable per se is not involved in the control of the movement under scrutiny. Do not let hypothesis doubling - one hypothesis regarding the informational variable and another hypothesis regarding the way that this variable might be used - fool you (cf. Bootsma et al., 1997)! Back now to our chronological review. Apparently convinced that no viable model for the coordination of reaching and grasping was currently available, Mon-Williams and Tresilian (2001) proposed what they called a 'simple rule of thumb for elegant prehension'. As was true for the Hoff and Arbib (1993) model, the Mon-Williams and Tresilian model assumes a priori knowledge about the duration of the prehensile movement. Furthermore, the model assumes knowledge of the size of hand aperture at its peak as well as at the moment of contact. Indeed, stable linear relations between target size and peak hand aperture are well documented (e.g., Bootsma et al., 1994; Marteniuk et al., 1990; for a review, see Smeets & Brenner, 1999) and might well be part and parcel of the planning system of prehension. The Mon-Williams and Tresilian model, now, states that the relative timing of hand opening and hand closing is proportional to the relative amplitudes of hand opening and hand closing. In prehension, the difference between peak hand aperture and the initial hand aperture is typically larger than the difference between peak hand aperture and the hand aperture at contact. As a result, hand opening would take longer than hand closing, with the ratio of the two aperture differences determining the proportion of movement duration allocated to hand opening and hand closing, respectively. Empirical results did seem to follow the predictions of the model, although the evidence was based on the results of linear regression. Below, we will put the model to a statistically more rigorous test.
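Our reading of that rule of thumb, used in the test below, can be written down in a few lines. The function name and the example numbers are ours; only the proportional split of movement time between opening and closing comes from the Mon-Williams and Tresilian (2001) proposal as summarized above.

```python
# Movement time is split between hand opening and hand closing in proportion to the
# two aperture amplitudes (peak minus initial aperture, and peak minus contact
# aperture); closing is predicted to start once the opening share has elapsed.
def predicted_closure_onset(movement_time, initial_ap, peak_ap, contact_ap):
    opening_amplitude = peak_ap - initial_ap
    closing_amplitude = peak_ap - contact_ap
    opening_share = opening_amplitude / (opening_amplitude + closing_amplitude)
    return movement_time * opening_share

# Made-up example: apertures in cm, movement time in seconds.
# Opening amplitude 8 cm, closing amplitude 4 cm -> closure onset at 2/3 of 0.6 s.
print(predicted_closure_onset(0.6, initial_ap=1.0, peak_ap=9.0, contact_ap=5.0))
```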
Finally, we would like to mention yet another recently proposed model for the coordination of reaching and grasping within prehension. Wang and Stelmach (1998, 2001) found in their experiments with trunk-assisted reaching-to-grasp movements that the distance between the hand position at the moment of peak hand aperture and the hand position at contact did not vary across conditions. This would seem to be an alternative to the models that we have discussed until now, which all capitalize on timing rather than on spatial variables, although Wang and Stelmach are not clear in indicating whether their model should be regarded as an alternative to existing models. Unfortunately, they did not contrast their account with other proposals or discuss it in the light of competing models.² The analyses to be presented below will also consider Wang and Stelmach's proposal. To sum up this section on the coordination of reaching and grasping, a number of accounts, with differing amounts of support, are currently available in the literature. Using data that we collected more than a decade ago (Bootsma et al., 1994), we will test five alternatives: (i) coordination based on a constant closure time (cf. Gentilucci et al., 1992; Hoff & Arbib, 1993; Watson & Jakobson, 1997), (ii) coordination based on the simple rule of thumb relating hand aperture sizes to the relative timing of hand opening and closing (cf. Mon-Williams & Tresilian, 2001), (iii) coordination based on a constant closure distance (cf. Wang & Stelmach, 1998, 2001), (iv) coordination based on TC₁(D) having a constant value at the moment of peak hand aperture (see Watson & Jakobson, 1997; Zaal et al., 1998), and (v) coordination based on TC₁(D) in combination with the nonlinear control that we discussed above (cf. Zaal et al., 1998).
2. Materials and methods
Five participants, one male and four females, between 21 and 27 years of age, were asked to pick up rectangular objects that were presented on the table at which they were seated. They started their movement from a standardized initial hand position, with their underarm and hand in line with the shoulder in the sagittal plane and their thumb and index finger gently touching. Target objects were positioned either 20 cm or 30 cm away from the starting position along the sagittal plane. Target objects were 16 rectangular wooden blocks of varying size (defined as the extent perpendicular to the (sagittal) plane of movement of the hand: 3, 5, 7, or 9 cm) and of varying width (0.5, 1.0, 1.5, or 2.0 cm). Participants were asked to pick up the object as quickly as possible,
² These authors also did not allude to the fact that Gentilucci et al. (1992), in addition to their finding of constant closure times, also reported constant closure distances, but clearly favored control based on temporal variables rather than on spatial variables.
using a precision grip, bring it to the starting position, and replace it at the target position at a desired orientation. Ten experimental trials per condition were collected, resulting in a total of 320 trials per participant. Movement kinematics were obtained by tracking four IREDs (infrared emitting diodes) using an Optotrak 3010 registration system at a sampling rate of 200 Hz. The IREDs were placed on (i) the target, (ii) the upper medial corner of the thumb nail, (iii) the upper lateral corner of the index-finger nail, and (iv) above the styloid process on the radial side of the wrist. High-frequency noise was removed from the Optotrak records using a double-pass second-order Butterworth filter at a cut-off frequency of 10 Hz. Time derivatives were calculated with a three-point central difference technique. Hand position was defined as the projection of the thumb IRED position on the line connecting the starting position and the target position. Hand aperture was calculated as the three-dimensional distance between the IREDs on the thumb and the index finger. Hand speed and hand opening and closing speed were the time derivatives of hand position and hand aperture, respectively. Movement initiation and movement termination were determined as the times at which hand opening speed exceeded a threshold of 1 cm/s and at which hand closing speed fell below a threshold of 1 cm/s, respectively. Trials in which IREDs had been obscured within an interval of 100 ms around the movement segment were excluded from further analyses. This resulted in the removal of 59 trials, leaving 1541 trials deemed suitable for further processing. The moment of the initiation of hand closing was defined as the moment at which hand closing speed exceeded a threshold of 1 cm/s. Due to occasional double-peaked hand-aperture profiles, the moment of the initiation of hand closing was more than 25 ms later than the moment of peak hand aperture in 8 trials. The dependent variables were (i) the hand closure duration, computed as the time from the initiation of hand closing until movement termination, (ii) the distance between the hand position at the moment of the initiation of hand closing and the target position, operationalized as the hand position at the moment of movement termination, (iii) the difference in time between the actual moment of the initiation of hand closure and the prediction of that moment based on the simple rule that relates hand opening and closing amplitudes to the durations of hand opening and hand closing, as proposed by Mon-Williams and Tresilian (2001), (iv) the first-order time-to-contact at the moment of the initiation of hand closing, and (v) the difference in time between the moment of the initiation of hand closing and the moment predicted by the dynamic timing model; details of the numerical simulations will be given below. To determine the latter two dependent variables, first-order time-to-contact was computed as the current distance between the hand and the target divided by the current hand speed (i.e., TC₁(D)).
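The following sketch mirrors the processing steps just described, applied to hypothetical NumPy arrays of marker positions. The array names, the assumption that positions are expressed in metres, and the use of SciPy's Butterworth and filtfilt routines are ours, not taken from the original analysis code.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 200.0                       # Optotrak sampling rate (Hz)
DT = 1.0 / FS
CLOSING_THRESHOLD = 0.01         # 1 cm/s, assuming positions are expressed in metres

def lowpass(signal):
    # Double-pass (filtfilt) second-order Butterworth filter, 10 Hz cut-off.
    b, a = butter(2, 10.0, btype="low", fs=FS)
    return filtfilt(b, a, signal, axis=0)

def tc1_at_closure_onset(thumb, index_finger, start, target):
    """thumb, index_finger: (n, 3) marker arrays; start, target: 3-vectors."""
    thumb, index_finger = lowpass(thumb), lowpass(index_finger)
    axis = (target - start) / np.linalg.norm(target - start)
    hand_pos = (thumb - start) @ axis                  # projection on start-target line
    hand_speed = np.gradient(hand_pos, DT)             # central differences
    aperture = np.linalg.norm(thumb - index_finger, axis=1)
    closing_speed = -np.gradient(aperture, DT)
    gap = np.linalg.norm(target - start) - hand_pos    # remaining hand-target distance
    tc1 = gap / hand_speed                             # TC1(D) at every sample
    onset = int(np.argmax(closing_speed > CLOSING_THRESHOLD))   # first sample above 1 cm/s
    return tc1[onset], onset * DT
```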
Participant    c_vision
EB             3.11
JM             3.53
SD             4.39
SP             3.05
TW             2.23

Table 1: Parameter values of c_vision used in the numerical simulations of the different participants' data sets. In all simulations, α = 10, ω = 10, γ = 10, β_int = 90, σ = 0.75, and τ_crit = 0.
To assess the success of the dynamic timing model in predicting the moment of the initiation of hand closure, we performed several numerical simulations. All numerical simulations were done using a Runge-Kutta algorithm with a fixed time step of 0.005 s, such that simulation results could be compared directly to the kinematic data, which were acquired at a sample frequency of 200 Hz. The moment that the state variable x changed from a negative number to a positive number was taken as the model prediction of the moment of the initiation of hand closure (cf. Schöner, 1994a; Zaal et al., 1998). In the set of simulations that we report here, we assumed the same dynamics for all participants, with the exception of the strength of the contribution of the optical variable in (changing) the stability of the hand-opening and hand-closing regimes. In mathematical terms, of all parameters in the model only c_vision was allowed to vary across participants (Table 1). For each participant, c_vision was determined such that the sum of squared differences in time between the actual moment of the initiation of hand closure and the model prediction of that moment was minimal across all the trials of that participant. More specifically, we computed for each trial the time difference between the actual moment of the initiation of hand closure and the model prediction of the initiation of hand closure for some value of c_vision. Using a numerical minimization procedure in which c_vision was varied, we determined that value of c_vision for which the sum of the squared differences across all trials of the participant was minimal. Table 1 presents the values of c_vision that we found using this procedure.
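A minimal sketch of this fitting procedure is given below. The model right-hand side is kept abstract (the equations are given in the Appendix); the helper names, the trial dictionary layout, and the use of SciPy's bounded scalar minimizer are our own illustration rather than the authors' code.

```python
import numpy as np
from scipy.optimize import minimize_scalar

DT = 0.005  # fixed simulation time step (s), matching the 200 Hz kinematics

def rk4_step(f, state, t, dt):
    """One classical fourth-order Runge-Kutta step of dstate/dt = f(state, t)."""
    k1 = f(state, t)
    k2 = f(state + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(state + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(state + dt * k3, t + dt)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def predicted_closure_time(model_rhs, x0, y0, n_steps):
    """Integrate the timing model; return the time at which the state variable x
    crosses zero from below, taken as the predicted initiation of hand closure."""
    state = np.array([x0, y0], dtype=float)
    for i in range(n_steps):
        new_state = rk4_step(model_rhs, state, i * DT, DT)
        if state[0] < 0.0 <= new_state[0]:
            return (i + 1) * DT
        state = new_state
    return np.nan  # no switch within the simulated interval

def fit_c_vision(trials, make_rhs, bounds=(0.1, 10.0)):
    """Find the c_vision that minimizes the summed squared differences between
    observed and predicted closure-initiation times over one participant's trials.
    Each trial dict supplies an observed time, an initial state, and a TC1(D)
    series that make_rhs (assumed, user-supplied) turns into a model RHS."""
    def sse(c_vision):
        total = 0.0
        for trial in trials:
            rhs = make_rhs(trial, c_vision)
            pred = predicted_closure_time(rhs, trial["x0"], trial["y0"],
                                          len(trial["tc1"]))
            total += 1e6 if np.isnan(pred) else (trial["t_closure"] - pred) ** 2
        return total
    return minimize_scalar(sse, bounds=bounds, method="bounded").x
```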
3. Results

The analyses focused on the timing of the moment of the initiation of hand closure. In general, this moment coincides with the moment of peak hand aperture, but occasionally the hand-aperture profile is double-peaked and the second peak, associated with the initiation of hand closure, is lower than the first peak. In the latter case, the moment of the initiation of hand closure and the moment of peak hand aperture occur at distinct times. As mentioned above, in this study we chose to consider the moment of the initiation of hand closure rather than the moment of peak hand aperture.

Repeated-measures analyses of variance (ANOVAs) with factors of distance (20 and 30 cm), size (3, 5, 7, and 9 cm), and width (0.5, 1.0, 1.5, and 2.0 cm) demonstrated that the duration of hand closure varied across conditions (Fig. 2A). The ANOVA yielded a significant main effect of width, F(3, 12) = 7.9, p < 0.005, as well as a significant Distance x Width interaction effect, F(3, 12) = 5.1, p < 0.05. The duration of hand closure decreased with increasing width, more so for the longer distance than for the shorter distance. The interaction effect for hand closure duration followed Fitts' law in much the same way as movement duration does (cf. Bootsma et al., 1994). That is to say, the duration of hand closure was linearly related to Fitts' index of difficulty, defined as the base-2 logarithm of the ratio of twice the reaching distance and the object width (Fitts, 1954; Fitts & Peterson, 1964; Bootsma et al., 1994; see the expressions below). The linear regression between hand closure duration and the Fitts index of difficulty was significant, intercept 0.87 s, slope 0.011, r(32) = 0.78, F(1, 30) = 46.8, p < 0.001.

In line with the findings reported by Wang and Stelmach (2001), a repeated-measures ANOVA on the distances between the position at the moment of the initiation of hand closure and the target position did not show significant main effects of distance or size. The effect of width, however, was significant, F(3, 12) = 4.1, p < 0.05: The average distance was 26, 28, 32, and 33 mm for the 0.5, 1.0, 1.5, and 2.0-cm width conditions, respectively (see also Fig. 2B). None of the interaction effects reached significance.

To assess the simple rule of thumb that was proposed by Mon-Williams and Tresilian (2001), we performed a repeated-measures ANOVA on the differences in time between the actual moment of the initiation of hand closure and the moment predicted from that simple rule. The ANOVA indicated a significant effect of size, F(3, 12) = 108.7, p < 0.001, as well as a significant effect of width, F(3, 12) = 6.6, p < 0.01. Furthermore, both the Distance x Size interaction, F(3, 12) = 7.1, p < 0.01, and the Distance x Width interaction, F(3, 12) = 4.3, p < 0.05, turned out to be significant.
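In our notation (with A the reaching distance, W the object width, and a and b the fitted intercept and slope; the symbols are ours), the index of difficulty and the linear relation referred to above read:

$$\mathrm{ID} = \log_2\!\left(\frac{2A}{W}\right), \qquad T_{\mathrm{closure}} = a + b\,\mathrm{ID}.$$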
Figure 2: Inter-subject means of (A) the duration of hand closure, (B) the distance covered by the hand from the moment of initiation of hand closure until the moment of contact with the object, (C) the difference in time between the actual moment of the initiation of hand closure and the time predicted by the Mon-Williams and Tresilian (2001) rule of thumb, (D) the first-order time-to-contact TC1(D) at the moment of initiation of hand closure, and (E) the difference in time between the actual moment of the initiation of hand closure and the time predicted by the dynamic timing model, for the different conditions in the Bootsma et al. (1994) data (D: distance; S: size).
Figure 2C illustrates these effects, with a positive difference meaning that the Mon-Williams and Tresilian rule predicted the moment of the start of hand closure too early. It is clear that especially the size manipulations were not captured adequately by the model. The larger the object size, the more the rule predicted the moment of initiation of hand closure too early. That is not to say that the model did not explain any of the variance in the data. Linear regressions of the kind reported in the Mon-Williams and Tresilian (2001) paper (but now performed on individual trials rather than on pooled data: R²-values are not directly comparable) showed that the model explained on average 70% of the variance in observed hand opening times (with R²-values of .51, .65, .73, .80, and .73 for the individual participants' data, respectively). However, the ANOVA indicated that the model does not capture all the subtleties in the timing of the coordination and, therefore, fails to provide a full account.

The repeated-measures ANOVA on the TC1(D) values demonstrated a significant main effect of distance, F(1, 4) = 10.1, p < 0.05, with mean TC1(D) of 33 and 40 ms for the 20 and 30 cm distances, respectively. The ANOVA also yielded a significant Size x Width interaction effect, F(9, 36) = 2.2, p < 0.05. Inspection of Fig. 2D reveals that this interaction effect did not seem to reflect any systematic pattern in the TC1(D) values.

As mentioned above, we performed numerical simulations of the dynamic timing model, with parameters optimized to fit the present data. Fig. 2E shows the mean differences in time between the actual moment of the initiation of hand closure and the model prediction of that moment (for the parameter values that we used in the simulations, see Table 1). A repeated-measures ANOVA on these time differences did not show any effect to be significant. Thus, whereas all ANOVAs on the dependent measures relating to the competing hypotheses that we considered above resulted in significant effects, no such effects were present when considering the set of optimized simulation data.

Next, we assessed the success of the model on a trial-to-trial basis by also inspecting the individual difference values. Figure 3 presents histograms of the computed differences for each participant. Although there are instances in which the model prediction differs considerably from the observed moment of initiation of hand closure (note that the histograms do not span the entire range of difference values in the data set), the overall picture is very favorable for the model. In roughly 95% of the trials, the model prediction was within 35 ms of the actual moment of initiation of hand closure. Furthermore, linear regressions of the observed hand opening time onto the hand opening time predicted by the model indicated that the model explained on average 88% of the variance in observed hand opening times; R²-values were .84, .93, .78, .90, and .91 for the five participants' data, respectively.
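The trial-by-trial comparison amounts to a very simple computation; a sketch is given below (array and function names are ours, and times are assumed to be in milliseconds).

```python
import numpy as np

def prediction_summary(observed_ms, predicted_ms):
    """Trial-by-trial comparison of observed and model-predicted times (ms)
    of the initiation of hand closure."""
    observed = np.asarray(observed_ms, dtype=float)
    predicted = np.asarray(predicted_ms, dtype=float)
    differences = observed - predicted
    within_35_ms = float(np.mean(np.abs(differences) <= 35.0))
    # R^2 of a simple linear regression of observed onto predicted times
    r_squared = float(np.corrcoef(observed, predicted)[0, 1] ** 2)
    return differences, within_35_ms, r_squared
```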
Figure 3: Distributions of the differences in time between the actual moment of initiation of hand closure and the moment predicted by the dynamic timing model, for each participant in the Bootsma et al. (1994) study. Note that the range of differences was restricted for presentation purposes. In total, 2% of all trials fell outside the range that we used to prepare this figure (4, 1, 10, 9, and 8 trials for participants EB, JM, SD, SP, and TW, respectively).
4. Discussion

We designed the new analyses of the Bootsma et al. (1994) data with two goals in mind. First, they were intended to provide an example of the use of first-order time-to-contact information under a continuous, nonlinear control law. We hope that this example will help change the lingering view that building a theory in which this type of information features prominently might be a specious enterprise (cf. Wann, 1996). Second, focusing more on the topic of prehension per se, the analyses were meant to show that first-order time-to-contact information might well be involved in the coordination of reaching and grasping.

4.1 The coordination of reaching and grasping

Using the Bootsma et al. (1994) data set, we tested several current hypotheses regarding the timing of the initiation of hand closure in prehension. First, the hypothesis that the duration of hand closure is constant across conditions did not hold: The duration of hand closure decreased with increasing width, and the decrease was stronger when reaching for far targets than when reaching for near targets. Interestingly, whereas several studies showed that the duration of hand closure decreases with increasing target size (Von Hofsten & Rönnqvist, 1988; Wang & Stelmach, 2001; Zaal et al., 1998; see also Berthier, Clifton, Gullapalli, McCall, & Robin, 1996, and Marteniuk et al., 1990), with size referring to the diameter of the spherical objects in all those studies, our analyses did not indicate a significant size effect, but now with size referring to the lateral extent of the target objects. Note that increasing the diameter of a spherical object also increases its circumference, such that the surface area available for placing the digits in prehension increases as well (Zaal & Bootsma, 1993). Given the significant size effects reported repeatedly in previous studies and the significant width effect identified here, the conclusion seems justified that it is this area available for contact - determining the required precision of placing the digits - that is the variable affecting the duration of hand closure. That is to say, a variable that has been proven to determine the timing of the reaching component of prehension now seems also to be involved in determining the duration of hand closure - usually thought of as a characteristic of the grasping component of prehension. The next step, relating the duration of hand closure to both target width and target distance by combining these two variables into a Fitts index of difficulty (Fitts, 1954; Fitts & Peterson, 1964), seems an obvious one. Indeed, we found a Fitts-like speed-accuracy trade-off: The duration of hand closure was linearly related to the index of difficulty. The consequence of all this is that reaching amplitude - the distance manipulation in the Bootsma et al. (1994) study - should also affect the duration of hand closure.
Although we did not find such an effect here (cf. Gentilucci et al., 1993; Wang & Stelmach, 2001), we did report such an effect in a study in which the range of reaching amplitudes was much larger (Zaal et al., 1998). In sum, the duration of hand closure is not constant across target-width and distance conditions. This variation in the duration of hand closure puts a heavy burden on the Hoff and Arbib (1993) account, which supposes a priori knowledge of that duration on the part of the person performing the prehensile act.

Next, the analyses indicated that the distance covered by the hand during hand closure varied across conditions, falsifying Wang and Stelmach's (2001) hypothesis that keeping this distance at a constant value leads to the observed coordination patterns of reaching and grasping. Furthermore, a more powerful analysis than the one performed by Mon-Williams and Tresilian (2001) showed that the simple rule of thumb that they proposed does not seem to hold. Both in the Mon-Williams and Tresilian study and in the Bootsma et al. (1994) study that provided the data for the current analyses, sizes and widths of the targets were manipulated independently. Whereas a simple regression using median scores across participants made for a promising case in the Mon-Williams and Tresilian study³, the analysis of similar data with more powerful repeated-measures ANOVAs in the present study clearly showed the limited predictive power of the rule of thumb proposed by Mon-Williams and Tresilian. The rule of thumb clearly captures a fair amount of the variance in the observed timing but is not able to account for the full picture, with all its subtle dependencies on manipulations of target size, target width, and distance. Now, returning to the role of time-to-contact information in the coordination of reaching and grasping, a similar concern with the analyses that we presented in the Zaal et al. (1998) study, which used regression rather than ANOVA techniques, led us to the time-consuming simulation study that we report in this chapter.

³ Mon-Williams and Tresilian (2001) reported that a quadratic regression showed that the rule of thumb predicted the observed times 'extremely well'. We are more hesitant in our assessment of this fit.

The results of the simulations corroborated our earlier findings that first-order time-to-contact information might well be involved in the coordination of reaching and grasping. Clearly, TC1(D) at the moment of initiation of hand closure is not constant (cf. Zaal et al., 1998). Thus, hand closure is not started upon reaching a critical value of TC1(D). The simulations of the dynamic timing model, however, show that the opening and closing of the hand could well be controlled by first-order time-to-contact information. This information acts to stabilize and destabilize the hand-opening and hand-closing regimes, determining the coordination of the two components of prehension. The analyses that we presented here favor the nonlinear-dynamics variant of the use of TC1(D) over a number of competing hypotheses. By subjecting all hypotheses to the same powerful ANOVAs, we have attempted to make fair comparisons among hypotheses. Still, a number of concerns might need to be addressed before the final verdict can be given. We will discuss some of these in turn.

4.2 The dynamics of coordinated reaching and grasping

The present study is a follow-up to the Zaal et al. (1998) study. In the latter study, we introduced the Schöner (1994a) model in the context of prehension and showed that the patterns in the timing of the initiation of hand closure were nicely captured by that model. The case that we made was based on (i) initiation times of hand closure, averaged across trials and participants, (ii) a model of the reaching dynamics, and (iii) as a consequence, for the evaluation of the success of the approach, a visual inspection of the results of just a few numerical simulations. Above, we argued that a fair comparison would, for each individual trial, take the real reaching kinematics to determine the time series of TC1(D), run a numerical simulation of the model, compute the difference between the observed moment of initiation of hand closure and the model prediction, and submit all these differences to the same repeated-measures ANOVA that we used to falsify the other hypotheses. And so we did. As a result, we were able to present a parameter setting of the model that seemed to capture the complex patterning in the timing of the coordination of reaching and grasping. That is to say, the ANOVA did not yield any significant effect, and the differences between the predicted and observed times of initiation of hand closure were quite small. However, a statistically well-educated reader might remark that ANOVAs (or any other inductive statistics, for that matter) are not designed to draw conclusions on the basis of a lack of significant effects. And this reader, of course, is completely right. In reaching this kind of conclusion, we need to control the Type II error - the chance of a real effect not being detected because of a lack of power of the analyses. To keep the chance of making this error at the order of magnitude of the significance level that we used in the ANOVAs, we would need a sample with more participants. A study involving a larger number of participants might thus be a next step in probing the hypothesized coordination mechanism. However, given that the ANOVAs proved rather sensitive in picking up all other effects, we are rather confident about the outcome of such a study. Furthermore, and perhaps in addition to a replication with more participants, a more stringent test of the model would be one in which we would perturb the reaching component by moving the target. The model makes specific predictions about the effects of those perturbations on the timing of the coordination of reaching and grasping. Remember that the key idea of the model is the stability of the relation between reaching and grasping and the ability to deal with perturbations.
A direct test of this ability would be the most convincing next step, we would argue.

One point of critique that a model such as the one we applied here, a model with seven parameters, fairly often receives is that fitting that model to any data set will always be possible. The seven parameters would allow for an easy job. First, note that the apparent number of degrees of freedom is larger than the real number. Parameters in the model are not always independent (e.g., the parameters α and γ are equal in the current model), some parameters were simply set to zero (in the current study, τ_crit), and changing other parameters does not change the model's behavior much. We ended up optimizing three parameters: β_int, σ, and c_vision (below, however, we consider the role of the other parameters when we address the question regarding the need for a limit-cycle component in the model). Relatedly, one could wonder whether the dynamic timing model would work with any exponentially increasing function, and whether rate variables other than the inverse of TC1(D) would render results similar to the ones that we presented above. Unfortunately, a general answer to this question cannot be given. We simply cannot know whether all of the infinite number of possible competing hypotheses prove to be inferior to the one that we defend here. But to get some feeling for the possible danger of finding another (optical) variable that would fit the data equally well or better than TC1(D), we decided to put one such potential competitor to the test. We performed a large number of model simulations in which relative distance (see also Bingham & Zaal, this volume) took the place of TC1(D) in the model equations. We refrain here from a discussion of the qualities of relative distance qua perceptual variable; the main concern at this point is that the inverse of the remaining relative distance between the hand and the target - at each moment in time, relative distance equals the distance between the hand and the target divided by that distance at the start of the reach - grows exponentially in time, in a way similar to the inverse of TC1(D). We started out, using the data of one of our participants, by varying the parameters α, ω, γ, β_int, and σ, and at each combination of parameters, finding a value for c_vision that led to the smallest sum of squared differences between the model predictions and the observed times of initiation of hand closure. This procedure was identical to the one that we used in the model simulations using TC1(D). It turned out that the sums of squared differences in the simulations with relative distance were about five times larger than those in the simulations with TC1(D), meaning that the model with TC1(D) explained more variance in the data than did the model with relative distance - the proportion of explained variance can be computed by dividing the variance in hand opening times minus the sum of squared differences by the variance in hand opening times. Thus, the larger the sum of squared differences, the smaller the percentage of explained variance. Since the variance of the hand opening times was equal in both sets of simulations (the hand opening times were the same), the fact that the sums of squares were larger for the model with relative distance than for the model with first-order time-to-contact indicates a better fit of the data by the latter model.
And because the sum of squared differences did not seem to vary much with all variations in the parameters, we stopped here and concluded that a model with TC1(D) would capture the timing patterns in the data better than the alternative model with relative distance. Less comprehensive simulations of the other participants' data sets confirmed this picture.

The dynamic timing model states that two potentially stable states exist. We endowed the hand-opening and hand-closing processes with stability properties, and made contact with the model through a mapping of an abstract state variable x onto hand opening and closing. In our approach, a negative value of x maps onto the hand-opening regime whereas a positive value of x maps onto the hand-closing regime. According to the Schöner (1994a) model, the switch from hand opening to hand closing, resulting from the visually induced loss of stability of the hand-opening state and the relaxation to the hand-closing state, also has stability properties. That is to say, the limit-cycle component of the model stabilizes the timing of the switch. In this application of the model, however, the rationale for this temporal stability does not seem directly obvious. In other applications of the same model, the limit cycle's role is much clearer. For instance, the same model (or close cousins) has been proposed for the modeling of discrete movements from a dynamic-systems perspective (Schöner, 1990, 1995; Zaal, Bootsma, & van Wieringen, 1999). For this purpose a state variable x is mapped onto the position of the hand. Two stable postures are distinguished: the hand at its initial position and the hand at its final position. In the model, these stable postures are represented by fixed-point attractors. Analogously to the unfolding of the dynamics in the timing model discussed above, the movement from the initial position to the final position results from the vanishing of the fixed-point attractor at the initial-position state, after which a switch is made to the fixed-point attractor at the final-position state. The movement itself is stabilized by a limit-cycle attractor. Thus, from a dynamic-systems perspective, a discrete movement is half a cycle of a periodic attractor (i.e., a limit-cycle attractor), started and stopped at fixed-point solutions of the same dynamic. Importantly, the limit-cycle component of the dynamics determines in large part the timing of the movement (cf. Schöner, 1990, 1994b, 2002). When different timing processes need to be coordinated, such that events of both processes need to occur at the same time, the stability of the timing of each process, taken care of by the limit-cycle properties, might be a necessary condition (cf. Schöner, 1990, 2002). For instance, when both hands make a reaching movement, even of different amplitudes, the hands tend to arrive at their targets at about the same time (e.g., Kelso, Putnam, & Goodman, 1983). This behavior emerges naturally from a system of coupled limit-cycle oscillators of the kind discussed here (Schöner, 1990).
Thus, the limit-cycle component of the dynamics is important for a proper coordination of two (or more) timing processes. Now, this may be true for the coordination of two reaches, but it is hard to think of an instance of coordination of two of the timing processes that are the subject of the present study. In the situation of two reach-and-grasp movements, the coupling of the two reaching components, in the way suggested above, already takes care of any synchronized grasping. But the fact that we have a hard time finding such an example does, of course, not mean that the stability property captured by the limit-cycle attractor is superfluous. Perhaps an extension of the model to a multi-finger grasping movement might reveal the need for a limit-cycle component in the grasping dynamics.⁴ Although the precision grip has been the grip type studied most widely, on many occasions people pick up objects using other digits than only the thumb and index finger (early studies on the development of reaching and grasping include beautiful examples of all the subtleties of the simple act of picking up an object; for instance, see Halverson, 1931). In that case, the digits arrive simultaneously at the target, a pattern of behavior that a system of coupled limit-cycle oscillators would be able to show.

⁴ An interesting discussion is presently going on in the literature on prehension. Some authors argue that prehension should not be seen as coordinated reaching and grasping, but rather as (coordinated) pointing of two digits - usually the thumb and index finger (e.g., Meulenbroek, Rosenbaum, Jansen, Vaughan, & Vogt, 2001; Rosenbaum, Meulenbroek, Vaughan, & Jansen, 2001; Smeets & Brenner, 1999). The issue of which view is correct (assuming, for now, that it cannot be the case that both views are correct) has not been resolved yet. The coordination issue of multiple digits, however, is equally appealing under both views.

As mentioned in passing above, in our search for the optimal parameter set, we found ourselves adjusting the two parameters β_int and σ (while c_vision was allowed to vary across participants), whereas changing the parameter values of α, ω, and γ did not have much of an effect. Thus, to arrive at the optimal solution, we effectively changed the parameters that were related to the fixed-point attractors in the model while leaving unchanged the parameters that were related to the limit-cycle attractor. Given this situation, and referring to the discussion above, we asked whether a model without a limit-cycle component would perform worse than the one with the limit-cycle component. We repeated the numerical simulations, now with the parameters pertaining to the limit-cycle component (i.e., α, ω, and γ) set to zero. We followed the same procedure as outlined above. Thus, with β_int set to a value of 90 and σ set to a value of 0.75, we searched, for each participant's data set, for a value of c_vision such that the sum of squared differences between the observed moment of initiation of hand closure and the model prediction of that moment in each individual trial was minimal. Finally, we submitted the set of differences computed with those parameters to a repeated-measures ANOVA. The results of these analyses of the model without the limit-cycle component were comparable to those obtained earlier with the model with the limit-cycle component. The ANOVA indicated no significant effects (F-ratios were roughly the same in the two sets of numerical simulations). Moreover, the predictions of the two models, the one with a limit-cycle component and the other without the limit-cycle component, were almost equal. With the exception of a single trial in which the model without the limit-cycle component predicted the moment of the initiation of hand closure 155 ms later than the model with the limit-cycle component, the differences between the two model predictions were always in the range of -15 to +25 ms, with 99% of the absolute differences being 5 or 10 ms (i.e., 1 or 2 data samples). The conclusion seems obvious: The model without the limit-cycle component performed as well as the model with the limit-cycle component. Hence, the limit-cycle component did not seem critical in modeling the present data set. However, the situation could be different if, for instance, perturbations were introduced and the timing of the switching process became more significant. As mentioned above, we are not sure how the status of the limit-cycle component should be assessed in modeling timing situations such as the one under study here. Future theoretical and empirical work seems needed to clarify this issue.

Finally, we would like to make a short comment regarding the abstract nature of the Schöner (1994a) model that we adopted in our studies of the coordination of reaching and grasping. Fixed-point attractors and limit cycles are mathematical constructs that might seem far removed from (neurological) reality. First, we would argue that although knowledge about neurological processes undoubtedly provides important ingredients of a full understanding of human movement, this full understanding is still far in the future. At the present state of affairs, kinematics (and perhaps muscle activation) lends itself much better to observation at a temporal resolution relevant to the control and coordination processes of human movement. The stability properties of those observables, stability being related to control (cf. Schöner, 1994b), might be the most promising entrance into understanding human movement. It is exactly those stability properties that the Schöner (1994a) model addresses. The mathematical constructs relate directly to assumed and/or observed stability properties of the timing processes. Interestingly, a number of other models would capture the same stability properties in qualitatively the same but quantitatively slightly different ways (e.g., see Schöner, 1990, 1994b, for other models exhibiting behavior analogous to the model that we adopted here). At present, the model does not make tight contact with the full kinematics. Although the model captures the moment of initiation of hand closure quite well, it does not capture the full hand opening and closing kinematics. An important future step would be to expand the model of the timing dynamics that we studied here into a full model that grasps the detailed kinematics of prehension (cf. Zaal, 1995).
The full model should account for a number of rather strong relations among different kinematic variables of prehension that have been reported. Above, we mentioned the strong and reproducible relation between peak hand aperture and target size (e.g., Bootsma et al., 1994; Marteniuk et al., 1990). But also the simple rule of thumb proposed by Mon-Williams and Tresilian (2001), although not able to capture all the subtleties of the timing of the initiation of hand closure, still accounted for about 70% of the variance in hand opening times. The analyses that we presented also indicated a quite strong relation between the duration of hand closure and the required end-point precision of the reaching component. A final example is the relation between peak hand closing velocity and the amount of finger closing needed to go from the peak hand aperture to the hand aperture at first hand-target contact (Zaal & Bootsma, 1993; Bootsma et al., 1994). To accommodate all those relations, and perhaps more, a full model of prehension would have to be able to make predictions regarding kinematic measures and not only about the timing of some key moments in the act. One modeling route might be to couple the timing dynamics to a load dynamics (cf. Schöner, 1994b), such that the coupled system would show end-effector kinematics that can be compared to experimentally observed kinematics. Even after this extension of the model, the form of the mathematical equations might still look odd and hard to relate to for the more neurologically interested scientist. Perhaps the formulation of the model in terms of (neural) field equations, such as the Amari-type equations used to model movement preparation processes and the effects of prior experience (Erlhagen & Schöner, 2002; Thelen, Schöner, Scheier, & Smith, 2001; Schöner, 2002), would be more appealing to some. A certain parameterization of these equations allows, for instance, for oscillatory behavior (cf. Amari, 1977). The reformulation of the Schöner (1994a) model into a model based on field equations would also unify accounts of movement preparation and movement execution, in such a way that the theoretical distinction between these concepts might simply vanish (cf. Erlhagen & Schöner, 2002).

We hope that the present study helps to appreciate the potential role of first-order time-to-contact information (τ-like perceptual variables) in visually guiding human movement. Especially in the coordination of reaching and grasping, we think that we have added material for a convincing case for a role for this information. To us, the marriage of direct perception and dynamic-systems theory seems to be a healthy one.
Acknowledgements Part of the work reported here was performed while Frank Zaal held a post-doctoral fellowship at the Institute for Fundamental and Clinical Human Movement Sciences (IFKB) at the Vrije Universiteit in Amsterdam. The data used for the current study were collected by Reinoud Bootsma during a stay with Ron Marteniuk and Christie MacKenzie at the University of Waterloo. We thank them for their permission to use the data set.
REFERENCES

Amari, S. (1977). Dynamics of pattern formation in lateral-inhibition type neural fields. Biological Cybernetics, 27, 77-87.
Berthier, N. E., Clifton, R. K., Gullapalli, V., McCall, D. D. & Robin, D. J. (1996). Visual information and object size in the control of reaching. Journal of Motor Behavior, 28, 187-197.
Bingham, G. P. & Zaal, F. T. J. M. (this volume). Why τ is probably not used to guide reaches.
Bootsma, R. J. & Craig, C. M. (2002). Global and local contributions to the optical specification of time to contact: Observer sensitivity to composite tau. Perception, 31, 901-924.
Bootsma, R. J., Fayt, V., Zaal, F. T. J. M. & Laurent, M. (1997). On the information-based regulation of movement: What Wann (1996) may want to consider. Journal of Experimental Psychology: Human Perception and Performance, 23, 1282-1289.
Bootsma, R. J., Marteniuk, R. G., MacKenzie, C. L. & Zaal, F. T. J. M. (1994). The speed-accuracy trade-off in manual prehension: Effects of movement amplitude, object size and object width on kinematic characteristics. Experimental Brain Research, 98, 535-541.
Bootsma, R. J. & Oudejans, R. R. D. (1993). Visual information about time-to-collision between two objects. Journal of Experimental Psychology: Human Perception and Performance, 19, 1041-1052.
Bootsma, R. J. & Peper, C. E. (1992). Predictive visual information sources for the regulation of action with special emphasis on catching and hitting. In L. Proteau & D. Elliott (Eds.), Vision and motor control (pp. 285-314). Amsterdam: North-Holland.
Bootsma, R. J. & Van Wieringen, P. C. W. (1992). Spatiotemporal organization of natural prehension. Human Movement Science, 11, 205-215.
Erlhagen, W. & Schöner, G. (2002). Dynamic field theory of movement preparation. Psychological Review, 109, 545-572.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381-391.
Fitts, P. M. & Peterson, J. R. (1964). Information capacity of discrete motor responses. Journal of Experimental Psychology, 67, 103-112.
Gentilucci, M., Castiello, U., Corradini, M. L., Scarpa, M., Umiltà, C. & Rizzolatti, G. (1991). Influence of different types of grasping on the transport component of prehension movements. Neuropsychologia, 29, 361-378.
Gentilucci, M., Chieffi, S., Scarpa, M. & Castiello, U. (1992). Temporal coupling between transport and grasp components during prehension movements: Effects of visual perturbation. Behavioural Brain Research, 47, 71-82.
Halverson, H. M. (1931). An experimental study of prehension in infants by means of systematic cinema records. Genetic Psychology Monographs, 10, 107-283.
Hoff, B. & Arbib, M. A. (1993). Models of trajectory formation and temporal interaction of reach and grasp. Journal of Motor Behavior, 25, 175-192.
Jakobson, L. S. & Goodale, M. A. (1991). Factors affecting higher-order movement planning: A kinematic analysis of human prehension. Experimental Brain Research, 86, 199-208.
Jeannerod, M. (1981). Intersegmental coordination during reaching at natural visual objects. In J. Long & A. Baddeley (Eds.), Attention and performance IX (pp. 153-168). Hillsdale, NJ: Erlbaum.
Jeannerod, M. (1984). The timing of natural prehension movements. Journal of Motor Behavior, 16, 235-254.
Kelso, J. A. S., Putnam, C. A. & Goodman, D. (1983). On the space-time structure of human interlimb coordination. Quarterly Journal of Experimental Psychology, 35A, 347-375.
Lee, D. & Reddish, P. E. (1981). Plummeting gannets: A paradigm for ecological optics. Nature, 293, 293-294.
Lee, D., Young, D. S., Reddish, P. E., Lough, S. & Clayton, T. M. H. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Marteniuk, R. G., Leavitt, J. L., MacKenzie, C. L. & Athenes, S. (1990). Functional relationships between grasp and transport components in a prehension task. Human Movement Science, 9, 149-176.
Meulenbroek, R. G. J., Rosenbaum, D. A., Jansen, C., Vaughan, J. & Vogt, S. (2001). Multijoint grasping movements: Simulated and observed effects of object location, object size, and initial aperture. Experimental Brain Research, 138, 219-234.
Michaels, C. F., Zeinstra, E. B. & Oudejans, R. R. D. (2001). Information and action in punching a falling ball. Quarterly Journal of Experimental Psychology, 54A, 69-93.
Mon-Williams, M. & Tresilian, J. R. (2001). A simple rule of thumb for elegant prehension. Current Biology, 11, 1058-1061.
Paulignan, Y., MacKenzie, C. L., Marteniuk, R. G. & Jeannerod, M. (1991). Selective perturbation of visual input during prehension movements: 1. The effects of changing object position. Experimental Brain Research, 83, 502-512.
Rosenbaum, D. A., Meulenbroek, R. J., Vaughan, J. & Jansen, C. (2001). Posture-based motion planning: Applications to grasping. Psychological Review, 108, 709-734.
Schöner, G. (1990). A dynamic theory of coordination of discrete movement. Biological Cybernetics, 63, 257-270.
Schöner, G. (1994a). Dynamic theory of action-perception patterns: The time-before-contact paradigm. Human Movement Science, 13, 415-439.
Schöner, G. (1994b). From interlimb coordination to trajectory formation: Common dynamical principles. In S. P. Swinnen, H. Heuer, J. Massion & P. Casaer (Eds.), Interlimb coordination: Neural, dynamical, and cognitive constraints (pp. 339-368). San Diego: Academic Press.
Schöner, G. (2002). Timing, clocks, and dynamical systems. Brain and Cognition, 48, 31-51.
Smeets, J. B. J. & Brenner, E. (1999). A new view on grasping. Motor Control, 3, 237-271.
Thelen, E., Schöner, G., Scheier, C. & Smith, L. B. (2001). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24, 1-33.
Tyldesley, D. A. & Whiting, H. T. A. (1975). Operational timing. Journal of Human Movement Studies, 1, 172-177.
Von Hofsten, C. & Rönnqvist, L. (1988). Preparation for grasping an object: A developmental study. Journal of Experimental Psychology: Human Perception and Performance, 14, 610-621.
Wagner, H. (1982). Flow-field variables trigger landing in flies. Nature, 297, 147-148.
Wallace, S. A., Weeks, D. L. & Kelso, J. A. S. (1990). Temporal constraints in reaching and grasping behavior. Human Movement Science, 9, 69-93.
Wang, J. S. & Stelmach, G. E. (1998). Coordination among the body segments during reach-to-grasp action involving the trunk. Experimental Brain Research, 123, 346-350.
Wang, J. S. & Stelmach, G. E. (2001). Spatial and temporal control of trunk-assisted prehensile actions. Experimental Brain Research, 136, 231-240.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22, 1031-1048.
Watson, M. K. & Jakobson, L. S. (1997). Time to contact and the control of manual prehension. Experimental Brain Research, 117, 273-280.
Zaal, F. T. J. M. (1995). On prehension: Toward a dynamical account of reaching and grasping movements. Unpublished doctoral dissertation, Vrije Universiteit, Amsterdam.
Zaal, F. T. J. M. & Bootsma, R. J. (1993). Accuracy demands in natural prehension. Human Movement Science, 12, 339-345.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1998). Coordination in prehension: Information-based coupling of reaching and grasping. Experimental Brain Research, 119, 427-435.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1999). Dynamics of reaching for stationary and moving objects: Data and model. Journal of Experimental Psychology: Human Perception and Performance, 25, 149-161.
APPENDIX: Model equations

The dynamic timing model that we adopted here is formulated as a set of differential equations. The state variable x is mapped onto the hand-opening and hand-closing regimes of prehension. Since the model includes an oscillatory component, a second variable is needed in these differential equations. This variable could have been the temporal derivative of x, but in line with Schöner (1994a) we chose to take a more abstract route and use an auxiliary variable y, with no specific meaning, as the second variable. The model is built from a combination of so-called intrinsic dynamics - the dynamics that give the state variable x its stability properties - and the contribution of the visual information:
$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = f_{grasp}(x,y) + f_{vision}(x,y) \qquad (1)$$
The model defines three attractors in state space. On the one hand, fixed-point attractors are defined for the hand-opening regime (x_open) and the hand-closing regime (x_close). On the other hand, the model defines a limit-cycle attractor passing through those two fixed-point attractors:

$$f_{grasp}(x,y) = f_{cycle}(x,y) + f_{open}(x,y) + f_{close}(x,y) \qquad (2a)$$

$$f_{cycle}(x,y) = \begin{pmatrix} \alpha x - \omega y - \gamma x (x^2 + y^2) \\ \omega x + \alpha y - \gamma y (x^2 + y^2) \end{pmatrix} \qquad (2b)$$

$$f_{open}(x,y) = -\beta_{int}\, f_{range}(x,y;x_{open},y_{open}) \begin{pmatrix} x - x_{open} \\ y - y_{open} \end{pmatrix} \qquad (2c)$$

$$f_{close}(x,y) = -\beta_{int}\, f_{range}(x,y;x_{close},y_{close}) \begin{pmatrix} x - x_{close} \\ y - y_{close} \end{pmatrix} \qquad (2d)$$

where f_range is a weighting function (of width σ) that is maximal at the corresponding fixed point, so that each fixed-point contribution acts only locally.
Finally, the contribution of the visual variable τ(D) is defined:

$$f_{vision}(x,y;x_{open},y_{open},x_{close},y_{close}) = \beta_{vision}(\tau(D)) \left[ f_{range}(x,y;x_{open},y_{open}) \begin{pmatrix} x - x_{open} \\ y - y_{open} \end{pmatrix} - f_{range}(x,y;x_{close},y_{close}) \begin{pmatrix} x - x_{close} \\ y - y_{close} \end{pmatrix} \right] \qquad (3a)$$

$$\beta_{vision}(\tau(D)) = \frac{c_{vision}}{\tau(D)} \qquad (3b)$$
Note the similarity between Equation (3a) and Equations (2c) and (2d). A closer look at these equations shows that the shrinking value of τ(D) (i.e., the growing value of its inverse), through the variable β_vision, decreases the strength of attraction of the point attractor at x_open and increases the strength of attraction of the point attractor at x_close; β_vision and β_int play the same role in their respective equations. Obviously, a more thorough introduction to the model equations can be found in the original Schöner (1994a) paper.
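For concreteness, the right-hand side of Equations (1)-(3) can be coded compactly. The sketch below is our own reading of the model: it assumes a Hopf-type limit-cycle term and Gaussian range functions of width σ centered on the fixed points, which is how models of this family are commonly written; the exact functional forms in Schöner (1994a) may differ in detail, and all names are ours. The returned function can be passed to the Runge-Kutta routine sketched earlier.

```python
import numpy as np

def make_timing_rhs(tc1_of_t, params):
    """Right-hand side of the dynamic timing model, following eqs. (1)-(3).
    Assumes Gaussian range functions and a Hopf-type limit-cycle term (our
    assumption). tc1_of_t returns TC1(D) at time t, taken from the reaching
    kinematics; params holds alpha, omega, gamma, beta_int, sigma, c_vision,
    and the fixed-point coordinates x_open, y_open, x_close, y_close."""
    p = params

    def f_range(x, y, xf, yf):
        # localized (Gaussian) weighting around a fixed point, width sigma
        return np.exp(-((x - xf) ** 2 + (y - yf) ** 2) / (2 * p["sigma"] ** 2))

    def rhs(state, t):
        x, y = state
        # limit-cycle (Hopf-type) contribution, parameters alpha, omega, gamma
        r2 = x ** 2 + y ** 2
        f_cycle = np.array([p["alpha"] * x - p["omega"] * y - p["gamma"] * x * r2,
                            p["omega"] * x + p["alpha"] * y - p["gamma"] * y * r2])
        # intrinsic fixed-point contributions at x_open and x_close (eqs. 2c, 2d)
        d_open = np.array([x - p["x_open"], y - p["y_open"]])
        d_close = np.array([x - p["x_close"], y - p["y_close"]])
        w_open = f_range(x, y, p["x_open"], p["y_open"])
        w_close = f_range(x, y, p["x_close"], p["y_close"])
        f_grasp = f_cycle - p["beta_int"] * (w_open * d_open + w_close * d_close)
        # visual contribution (eq. 3): destabilizes x_open, stabilizes x_close
        beta_vision = p["c_vision"] / max(tc1_of_t(t), 1e-3)
        f_vision = beta_vision * (w_open * d_open - w_close * d_close)
        return f_grasp + f_vision

    return rhs
```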
CHAPTER 18 Another Timing Variable Composed of State Variables: Phase Perception and Phase Driven Oscillators
Geoffrey P. Bingham Indiana University, Bloomington, IN, USA
ABSTRACT

In this chapter, we consider a perceptible variable that is related to τ, but is different from τ. The variable is phase, φ. φ is similar to τ in that both are timing variables and both are ratios of spatial variables that could be state variables of a dynamical system. As such, either could be used to drive a damped mass-spring system to yield an autonomous dynamical organization. Finally, both τ and φ are perceptible variables. We describe experiments in which we have investigated the perception of relative phase. Then, we describe a phase-driven and phase-coupled dynamical model of bimanual coordination. An important feature of this model is that it can account for both movement-study and judgment-study results. However, the way the perceptible property is used in each case is task-specific.
1. τ as a temporal variable composed of spatial state variables

In perception/action research, there have been two especially salient reasons to hypothesize and investigate τ as a perceptible variable. Both reasons relate to the problem of timing actions and coordinating them with respect to objects in the surroundings that are moving relative to the performer.

The first reason is that a temporal variable is most appropriate for the control of timing. Temporal variables are more commonly studied in audition than in vision, where the prevailing focus has been on spatial variables. The problem in 'space perception' is that the length dimension is lost in optical structure, which is intrinsically angular and temporal. (For extended discussion of the following, see Bingham 1988 and 1995.) Optical extents can be described in radians or degrees, but not in centimeters or inches. Extra-optical variables must be considered when investigating visual perception of length-related properties like size, distance, or velocity. Perhaps related to this fact is the recurrent finding that space perception is rather inaccurate or imprecise when apprehension of lengths (as opposed to length ratios) is required (e.g. Bingham & Pagano, 1998; Bingham, Zaal, Robin & Shull, 2000; Tittle, Todd, Perotti & Norman, 1995; Todd, Tittle & Norman, 1995). As an optical variable, τ can only be angular and temporal, but it is equivalent to the ratio of the distance and velocity of a surface moving towards an observer. In the ratio, the length dimension cancels, leaving only time. If τ is used by performers to scale their actions to the surroundings, then the measurement problem in space perception can be avoided.

The second reason to study τ emerged as perception/action research began to focus on the issue of stability and the need to integrate perceptible variables into the underlying dynamics of action. The progenitor of task-dynamic approaches to perception/action was the λ-model (or Equilibrium Point (EP) model) of limb movement (Feldman, 1980; 1986; Feldman, Adamovich, Ostry & Flanagan, 1990; Latash, 1993). Feldman showed that the muscles and the peripheral nervous system were organized to control joint motion as an abstract mass-spring organization parametrized by stiffness and the EP. Ignoring the differences among competing mass-spring models (e.g. α-model versus λ-model), it is now clear that the damped mass-spring is fundamental to the organization of action (Bizzi, Hogan, Mussa-Ivaldi & Giszter, 1992; Feldman, Adamovich, Ostry & Flanagan, 1990; Hogan, Bizzi, Mussa-Ivaldi & Flash, 1987; Latash, 1993). The advantages of this organization all amount to stability. The relatively autonomous peripheral organization entails both short neural transmission distances and use of intrinsic spring-like muscle properties to yield a nearly linear spring at the joint level (Latash, 1993). The organization combines postural control with the control of movement to yield fixed-point stability or equifinality for end postures in discrete movements.
Mass-spring organization yields stable posture, but the problem of movement stability remains. As fixed-point stability is desirable for posture, limit-cycle stability is desirable for movements. In perturbation experiments, Kay, Saltzman and Kelso (1991) found that rhythmic limb movements exhibit limit-cycle stability. If the mass-spring organization is to be used to account for rhythmic movement, then it is necessary to drive the mass-spring to yield limit-cycle stability (Schöner, 1990; Zaal, Bootsma & van Wieringen, 1999). The presence of limit-cycle stability implies, in turn, that the dynamic is nonlinear (Jordan & Smith, 1977), that is, products or quotients of state variables appear in the dynamical equations. Kay et al. (1991) also found that rhythmic limb movements exhibit phase resetting. This means that the mass-spring is driven in a way that preserves autonomous organization. The driver is not itself external to the oscillator dynamic, but instead must be a function of the behavior of the oscillator itself. The driver cannot be a function of time, but instead must be a function of the state variables of the oscillator, namely, position and velocity (x[t], v[t]). Especially in tasks that require interaction or coordination with events in the surroundings (e.g. catching a ball), the dynamic must be driven perceptually. This can be accomplished using τ, because τ combines position and velocity in a quotient to yield time. This solution was investigated initially by Schöner (1991) and subsequently by others (Bingham, 1995; Zaal, Bootsma & van Wieringen, 1998). (See chapters 16, by Bingham and Zaal, and 17, by Zaal and Bootsma, in the current volume.)

So, the advantages of τ are that it is temporal (not spatial), but is nevertheless composed of spatial variables that can be the state variables in a dynamical (e.g. mass-spring) system. In the following, we introduce another variable that is, like τ, composed of a ratio of position and velocity, spatial variables that can be state variables in a dynamical system. Also like τ, therefore, this other variable can be used to drive a mass-spring system to yield limit-cycle behaviors. The other variable is phase, φ. Unlike τ, however, φ has not been treated as a perceptible variable until fairly recently.
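The parallel can be written out explicitly. With D the distance to the approached surface, v its closing velocity, and x and ẋ the position and velocity of an oscillator of angular frequency ω (our notation, and one common sign convention for φ), both quantities are ratios of spatial state variables in which the length dimension cancels:

$$\tau = \frac{D}{v} \;[\mathrm{s}], \qquad \phi = \arctan\!\left(\frac{\dot{x}/\omega}{x}\right) \;[\mathrm{rad}].$$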
2. Relative phase and the HKB Model

In coordination of rhythmic bimanual movements, relative phase is the relative position of two oscillating limbs within an oscillatory cycle. For people without special skills (e.g. jazz drumming), only two relative phases can be stably produced in free voluntary movement at preferred frequency (Kelso, 1995). They are at 0° and 180°. Other relative phases can be produced on average when people follow metronomes, but the movements exhibit large amounts of phase variability (Tuller & Kelso, 1989). They are unstable.
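A minimal way to compute relative phase from recorded limb positions is sketched below; it uses the position and frequency-normalized velocity of each limb, in line with the state-variable view of phase described above. The function names and the crude frequency estimate are our own illustration, not a method taken from the studies discussed here.

```python
import numpy as np

def oscillator_phase(x, dt):
    """Phase of a rhythmic movement from its position and velocity.
    The velocity is normalized by the dominant angular frequency so that the
    trajectory is roughly circular in the (position, velocity) plane."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    v = np.gradient(x, dt)
    freqs = np.fft.rfftfreq(x.size, dt)
    omega = 2 * np.pi * freqs[np.argmax(np.abs(np.fft.rfft(x)))]
    return np.unwrap(np.arctan2(v / omega, x))

def relative_phase_deg(x1, x2, dt):
    """Continuous relative phase (degrees, wrapped to [-180, 180)) between
    two oscillating position signals sampled at interval dt."""
    dphi = oscillator_phase(x1, dt) - oscillator_phase(x2, dt)
    return np.degrees((dphi + np.pi) % (2 * np.pi) - np.pi)

# Example: two 1 Hz oscillations, nominally 90 deg apart.
t = np.arange(0.0, 10.0, 1.0 / 200.0)
limb_a = np.cos(2 * np.pi * t)
limb_b = np.cos(2 * np.pi * t - np.pi / 2)
print(np.mean(relative_phase_deg(limb_b, limb_a, 1.0 / 200.0)))  # approximately 90
```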
Preferred frequency is near 1 Hz. As frequency is increased beyond preferred frequency, the phase variability increases strongly for movement at 180° relative phase, but not at 0° (Kelso, 1990). If people are given an instruction not to correct if switching occurs, then movement at 180° will switch to movement at 0° when frequency reaches about 3-4 Hz (Kelso, 1984; Kelso, Scholz & Schöner, 1986; Kelso, Schöner, Scholz & Haken, 1987). With the switch, the level of phase variability drops. There is no tendency to switch from 0° to 180° under any changes of frequency. These phenomena have been captured in a dynamical model formulated by Haken, Kelso and Bunz (1985). The HKB model is a first-order dynamic written in terms of the relative phase, φ, as the state variable.
Figure 1: The HKB model: the potential V(φ), from 0° to ±180°, shown for increasing movement frequency.
3. Investigating phase perception The HKB describes the basic phenomena of bimanual coordination intuitively. People asked to oscillate their limbs stably at 0° or 180° know what to do, but if they are asked to oscillate at other phases, they do not. 0° and 180° are clearly delineated by the form of the potential function which represents the relative stability (or inversely, the effort) of oscillating at different relative phases. People seem to know where these phases are in a space of relative stabilities. Nevertheless, we wondered: What is the ultimate origin of the potential function in this model? Why are 0° and 180° the only stable modes and why is 180° less stable than 0° at higher frequencies? To answer these questions, we investigated the perception of relative phase because the bimanual movements are coupled perceptually, not mechanically (Kelso, 1984; 1995). The coupling is kinesthetic when the two limbs are those of a single person. Schmidt, Carello and Turvey (1990) found the same behaviors in a visual coupling of limb movements performed by two different people. Similar results were obtained by Wimmers, Beek, and van Wieringen (1992). To perform these tasks, people must be able to perceive relative phase, if for no other reason, than to comply with the instruction to oscillate at 0° or 180° relative phase. Because the coupling is perceptual and because achievable phase relations seem to be specified in a space of relative stabilities (see Bingham, Zaal, Shull and Collins (2001) for discussion), we investigated the visual perception of mean relative phase and of phase variability using both actual human movements (Bingham, Schmidt & Zaal, 1998) and simulations (Bingham & Collins, submitted; Bingham, et al, 2001; Zaal, Bingham & Schmidt, 2000). Participants observed two disks (1.7° visual angle) oscillating on a computer monitor along straight horizontal paths (8° in length), one above the other. (Zaal, et al. (2000) also investigated motions in depth which yield optical expansion and contraction, and replicated the results for motion in a frontoparallel plane.) In Zaal, et al. (2000) and Bingham, et al. (2001), the manipulated variables included mean relative phase (0°, 45°, 90°, 135°, 180°) and phase variability (0°, 5°, 10°, 15° phase SD). Bingham, et al. (2001) also manipulated frequency (0.75 hz and 1.25 hz). However, we will describe in detail results of Bingham and Collins (submitted) which replicated the previous results, but extended the manipulation of frequency to 1 hz, 2 hz, and 3 hz. Different groups of ten participants each judged either mean relative phase or phase variability on a ten point scale: For mean phase, 1=0° and 10 = 180°; for phase variability, 1 = 'not variable' and 10 = 'highly variable'. Participants received extensive instruction and demonstrations distinguishing mean phase and phase variability. They performed blocked trials in which the variable being judged was manipulated
while the other variable was held constant. Finally, participants were tested in a completely randomized design. Results in the blocked and randomized conditions were comparable. As shown in Figure 2, judgments of mean relative phase varied linearly with actual mean relative phase. However, as phase variability increased, 0° mean phase was increasingly confused with 30° mean phase. Furthermore, as illustrated in Figure 4, although mean judgments tracked actual relative phases very well, the variability of the judgments exhibited an inverted-U pattern. This meant that judgments of 90° relative phase, for instance, were far less reliable than judgments of 0° relative phase.
Figure 2: Judgments of mean phase at 1 Hz, 2 Hz, and 3 Hz (judged mean phase plotted against actual mean relative phase, 0° to 180°).
We found that judgments of phase variability (or of the stability of movement) followed an asymmetric inverted-U function of mean relative phase, even with no phase variability in the movement as shown in Figures 3 and 4. This replicated the shape of the potential function in the HKB model. Movement at 0° relative phase was judged to be most stable. At 180°, movement was judged to be less stable. At intervening relative phases, movement was judged to be relatively unstable and maximally so at 90°. Levels of phase variability were not discriminated at relative phases other than 0° and 180° because those movements were already judged to be highly variable even with no phase variability. The standard deviations of judgments followed this same asymmetric inverted-U pattern as shown in Figure 4.
"8
f 2" O°SD n° 4B° qn°
1 Hz
n° 45° qn° ias°iBn° n° 4B°
2 Hz
isn°
3 Hz
Figure 3: Judgments of phase variability
Finally, we investigated whether phase perception would vary in a way consistent with the finding in bimanual coordination studies of mode switching from 180° to 0° relative phase when the frequency was sufficiently increased. Also, movement studies revealed that increases in the frequency of movement yielded increases in phase variability at 180° relative phase but not at 0° relative phase. Only this latter result might be replicated in the perceptual judgments. As shown in Figure 3, as frequency increased, movements at all mean relative phases other than 0° were judged to be more variable. This was true in particular at 180° relative phase. Furthermore, as shown in Figures 3 and 4, this occurred even when there was no phase variability in the movement. Also in this latter case (i.e. 0° phase SD), frequency had no effect on judged levels of phase variability at 0° mean phase (although in cases of 5°, 10° and 15° phase SD, actual phase variability became less salient at higher frequencies). Again, mean phase was judged accurately on average, but as frequency increased, judgments tended to drop, as shown in Figures 2 and 4. As also shown, the pattern of results for mean judgments of phase variability was replicated in the pattern of results for judgment variability, both for judgments of mean phase and of phase variability.
Figure 4: Mean results for 0° phase SD only (i.e. no actual phase variability). Judged mean relative phase and judged phase variability, together with the variability of those judgments, are plotted against relative phase (deg) at 1 Hz, 2 Hz, and 3 Hz.
These results were all consistent with the findings of the studies on bimanual coordination. The asymmetric inverted-U pattern of the judgments in Figures 3 and 4 is essentially the same as the potential function of the HKB model. The potential represents the relative stability of coordination or the relative effort of maintaining a given relative phase. The two functions match not only in the inverted-U shape centered around 90° relative phase, but also in the asymmetry between 0° and 180°. 180° is perceived to be less stable than 0° and increasingly so as frequency increases. This congruence of the movement and perception results supports the hypothesis that the relative stability of bimanual coordination is a function of phase perception and its stability.
4. Endpoints versus the trajectory

Many rhythmic tasks entail moving in synchrony with a discrete auditory pulse, that is, a metronome. Perhaps this has inspired a common intuition that the perception of rhythmic movements focuses on the endpoints of movement. Indeed, rhythmic movements could be generated using a mass-spring by simply switching the position of the EP discontinuously from one endpoint to the other (although Feldman and his co-authors explicitly reject this possibility (Feldman, Adamovich, Ostry & Flanagan, 1990)). If perception of relative phase depended only on relative positions at endpoints of motion, then stability only at 0° and 180° relative phase would be predicted. We investigated whether perception of relative phase focused on the endpoints of motion or, instead, used the entire trajectory of oscillation. We could not use selective occlusion of portions of the trajectories to address this question because discontinuities at occlusion boundaries would perturb motion perception strongly. Instead, we chose to put phase variability selectively into portions of the oscillatory cycle as follows. See Figure 5.
Figure 5: The four conditions (all at 1 Hz): No Alignment (phase variability everywhere); Endpoint Alignment (100 ms window centered on each endpoint); Peak Velocity Alignment (100 ms window centered on peak velocities); Endpoint & Peak Velocity Alignment (50 ms windows). (Shaded regions of the position trajectories = regions with no phase variability.)
First, we put relative phase variability throughout the cycles (No Alignment), as in previous studies. Second, we put it everywhere but in a 100 ms window around each endpoint (Endpoint Alignment). Oscillation frequency was 1 Hz, so 20% of the cycle was free of phase variability. Third, we put it everywhere but in a 100 ms window around each peak velocity (Velocity Alignment). Fourth, we put it everywhere but in a 50 ms window around the endpoints and peak velocities (Critical Points Alignment). As before, four levels of phase variability were tested. (With the 20% reduction in the Endpoint, Velocity, and Critical Points Alignment conditions, these were 0°, 4°, 8°, and 12° phase SD, as opposed to 0°, 5°, 10°, and 15° phase SD in the No Alignment condition.) Only mean relative phases of 0° and 180° were tested. Ten observers judged phase variability as in the previous experiments. If phase perception uses only the endpoints of oscillatory movement, then the increasing levels of phase variability should have been invisible in the Endpoint Alignment condition, because there was no phase variability at the endpoints. The judgment curves should be flat. The logic was the same in the Velocity and Critical Points Alignment conditions under the assumption that perception focuses on peak velocities or on critical points.
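The following sketch shows how the alignment manipulation can be expressed as a mask over the movement cycle: phase noise is zeroed inside time windows centred on the endpoints (position extremes) or on the peak velocities (position zero crossings). The window placement and the white phase noise are illustrative assumptions; only the window durations come from the text.

```python
import numpy as np

f, dt = 1.0, 0.005                      # 1 Hz movement, as in the experiment
t = np.arange(0.0, 10.0, dt)
cycle_phase = (2 * np.pi * f * t) % (2 * np.pi)

def window_mask(phase, centres, half_width_s, f):
    """True within +/- half_width_s (in time) of each listed cycle phase."""
    half = 2 * np.pi * f * half_width_s
    mask = np.zeros_like(phase, dtype=bool)
    for c in centres:
        d = np.abs((phase - c + np.pi) % (2 * np.pi) - np.pi)
        mask |= d < half
    return mask

# For x = A*sin(phase): endpoints at phase pi/2 and 3pi/2, peak velocities at 0 and pi.
endpoints = window_mask(cycle_phase, [np.pi / 2, 3 * np.pi / 2], 0.05, f)  # 100 ms windows
peak_vels = window_mask(cycle_phase, [0.0, np.pi], 0.05, f)                # 100 ms windows

noise = np.radians(15.0) * np.random.default_rng(1).standard_normal(len(t))
noise_EA = np.where(endpoints, 0.0, noise)           # Endpoint Alignment condition
noise_VA = np.where(peak_vels, 0.0, noise)           # Velocity Alignment condition
print(f"fraction of cycle noise-free (Endpoint Alignment): {endpoints.mean():.2f}")
```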
Figure 6: Judgments of phase variability in the No Alignment, Endpoint Alignment, Velocity Alignment, and Critical Points Alignment conditions, plotted against phase SD (0°, 5°, 10°, 15°) for 0° and 180° mean phase.
As shown in Figure 6, the results did not confirm these predictions. Instead, they revealed that the entire trajectory is used to perceive relative phase. Looking at the 180° mean phase condition, removal of variability from endpoints decreased perceived variability somewhat (as it should have because it was 20% less), but the remaining portions of the cycle were still used to detect variations in phase variability. However, removal of variability from the peak velocities had no effect. Removal from both peak velocity and endpoints in the CPA condition had half the effect of removal from just endpoints, but in the CPA condition, the window around endpoints was half as large. That is, only 10% of the variability was removed around the endpoints, the other 10% was removed at the peak velocities where it had no effect. The conclusion was that the whole trajectory is used, but phase variability at peak velocity is invisible because movement there already looks variable. (The effect is similar to that for phase variability at 90° mean phase shown in Figure 3. Movement at 90° relative phase looks variable intrinsically, so actual phase variability cannot be resolved.) None of the alignment manipulations had any effect for movement at 0° mean phase. Accordingly, we concluded that the entire trajectory is used, but resolution of relative phase varies with relative velocity or the velocity difference between the two oscillators. (With 0° mean phase, relative velocity was always zero, or with noise, nearly so.) Phase perception becomes unstable as the relative velocity becomes large. Next, we developed a model of bimanual coordination in which the role of phase perception is explicit. The goal was to account both for our phase judgment results and for results from previous motor studies.
5. Modeling the single oscillator

Our perception studies had been inspired originally by the HKB model. The HKB model is a first-order dynamical model in which relative phase is the state variable. That is, the model describes relative phase behavior directly, without reference to the behavior of the individual oscillators. However, the model was derived from a model formulated by Kay, Kelso, Saltzman and Schöner (1987) that did describe the oscillation of the limbs explicitly. In this latter model, the state variables are the positions and velocities of the two oscillators. To develop this model, Kay et al. (1987) first modeled the rhythmic behavior of a single limb. In this and a subsequent study (Kay, Saltzman & Kelso, 1991), they showed that human rhythmic limb movements exhibit limit cycle stability, phase resetting, an inverse frequency-amplitude relation, a direct frequency-peak velocity relation, and, in response to perturbation, a rapid return to the limit cycle in a time that was independent of frequency. A dimensionality analysis showed that a second-order dynamic with small amplitude noise is an
appropriate model. The presence of a limit cycle meant the model should be nonlinear, and a capability for phase resetting entailed an autonomous dynamic. Kay et al. (1987) captured these properties in a 'hybrid' model that consisted of a linear damped mass-spring with two nonlinear damping (or escapement) terms, one taken from the van der Pol oscillator and the other taken from the Rayleigh oscillator (hence the 'hybrid'), yielding:

$$\ddot{x} + b\dot{x} + \alpha\dot{x}^{3} + \gamma x^{2}\dot{x} + kx = 0 \qquad (1)$$
where $x$, $\dot{x}$, and $\ddot{x}$ are position, velocity and acceleration, respectively, $k$ is stiffness, $b$ is linear damping, and $\alpha$ and $\gamma$ are the nonlinear damping coefficients. This model was important because it captured the principal dynamical properties exhibited by human rhythmical movements. However, the relation between the terms of the model and known components of the human movement system was unclear. The damped mass-spring was suggestive of Feldman's λ-model, which represents a functional combination of known muscle properties and reflexes. Nevertheless, in the hybrid model, the functional realization of the nonlinear damping terms was unknown. Following a strategy described by Bingham (1988), Bingham (1995) developed an alternative model to the hybrid model. All of the components of the new model explicitly represented functional components of the perception/action system. First, the model incorporated the λ-model, that is, a linear damped mass-spring:
$$\ddot{x} + b\dot{x} + k(x - x_{ep}) = 0$$

This mass-spring must be driven to generate rhythmic movement. This can be achieved by moving the EP:

$$\ddot{x} + b\dot{x} + kx = k\,x_{ep}(t)$$

If the timing is imposed, then the result is the standard forced oscillator:

$$\ddot{x} + b\dot{x} + kx = c\sin(t), \qquad c = f(k)$$
The problem is that this is a nonautonomous dynamic, that is, a dynamic that would not exhibit phase resetting. Another problem is that limb movements are known to exhibit organizations that are both energetically optimal and stable (e.g. Diedrich & Warren, 1995; Margaria, 1976; McMahon, 1984). Both energy optimality and stability are achieved by driving a damped mass-spring at resonance, that is, with the driver leading the oscillator by 90°. Accordingly,
Hatsopoulos and Warren (1996) suggested that this strategy might be used in driving the Feldman mass-spring. Bingham (1995) solved these problems by replacing time in the driver by the perceived phase of the oscillator. That is, instead of $c\sin(t)$, the driver is $c\sin(\phi)$, where $\phi$ is the phase. Because $\phi$ $(= f[x, \dot{x}])$ is a (nonlinear) function of the state variables, that is, the position and velocity of the oscillator, the resulting dynamic is autonomous. The perceptually driven model is:

$$\ddot{x} + b\dot{x} + kx = c\sin(\phi) \qquad (2)$$
where

$$\phi = \arctan\!\left(\frac{\dot{x}/\sqrt{k}}{x}\right) \qquad \text{and} \qquad c = c(k)$$
The amplitude of the driver is a function of the stiffness. Bingham (1995) showed that this oscillator yields a limit cycle. This is also shown in Figure 7 by the rapid return to the limit cycle after a brief perturbing pulse. As also shown, the model exhibits the inverse frequency-amplitude and direct frequency-peak velocity relations as frequency is increased from 1 Hz to 6 Hz. These relations are apparent in the first panel of Figure 7, which shows a phase plot generated by gradually increasing the frequency of the oscillator. In the second and third panels, the model is compared to the human movement data reported by Kay et al. (1987).
Figure 7: Left panel: phase plot (velocity against position) of the perceptually driven oscillator as its frequency is gradually increased from 1 Hz to 6 Hz. Middle and right panels: amplitude and peak velocity as functions of frequency (Hz), compared with the human movement data of Kay et al. (1987).
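A minimal numerical sketch of the perceptually driven oscillator in equation (2) is given below. The perceived phase is computed with atan2 so that it covers the full cycle. The form of c(k) is not specified in the chapter, so c = 0.5·k^(1/4) is an assumption chosen so that the sketch shows both the inverse frequency-amplitude and the direct frequency-peak velocity relations; all numerical values are illustrative.

```python
import numpy as np

def simulate(k, b=1.0, T=20.0, dt=0.001):
    """Integrate x'' + b x' + k x = c sin(phi), the perceptually driven oscillator."""
    w = np.sqrt(k)                       # natural frequency (rad/s)
    c = 0.5 * k ** 0.25                  # assumed form of c(k); see lead-in
    x, v = 0.05, 0.0                     # start near (but not at) the fixed point
    xs, vs = [], []
    for _ in range(int(T / dt)):
        phi = np.arctan2(v / w, x)       # perceived phase of the oscillator
        a = -b * v - k * x + c * np.sin(phi)
        v += a * dt                      # semi-implicit Euler step
        x += v * dt
        xs.append(x)
        vs.append(v)
    xs, vs = np.asarray(xs), np.asarray(vs)
    tail = slice(int(0.75 * len(xs)), None)            # steady-state portion
    return w / (2 * np.pi), np.abs(xs[tail]).max(), np.abs(vs[tail]).max()

for k in (40.0, 160.0, 360.0):           # roughly 1, 2 and 3 Hz
    f, amp, pv = simulate(k)
    print(f"f = {f:.2f} Hz   amplitude = {amp:.3f}   peak velocity = {pv:.2f}")
```

Starting from a small displacement, the trajectory grows out to a stable amplitude (the limit cycle); with the assumed c(k), the printed amplitude decreases and the peak velocity increases with frequency, in line with the relations described above.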
Finally, the model exhibits a pattern of phase resetting that is similar to that exhibited by the hybrid oscillator, as shown in Figure 8. The model is phase delayed by a delaying perturbation and phase advanced by an advancing perturbation. However, as shown in the second panel by the graph taken from Kay et al. (1991), human participants exhibited phase advance in response to all perturbations. Following a suggestion in Kay et al., we modified the phase driven oscillator model to include a momentary increase in stiffness in response to a perceived departure from the limit cycle as a result of perturbation (see the first panel of Figure 7). $k$ in equation (2) above was changed to $k = k_1 + \gamma\,|e_n - 1|$, where $e_n = (v_n^{2} + x^{2})^{1/2}$ is both the radius of the trajectory on the phase plane and a measure of the energy of motion. Thus, stiffness was incremented in proportion to the change in the radial coordinate in phase space. (The other coordinate is the phase angle, $\phi$.) Note that equation (2) remains autonomous because $e_n$ is a function of the state variables $x$ and $v$. $e_n$ is hypothesized to be another perceptual variable in addition to $\phi$. The result, shown in the third panel of Figure 8, was a pattern of phase response similar to the human data.
Figure 8: Phase resetting. New phase plotted against old phase for the perceptually driven model, for the human data from Kay et al. (1991), and for the modified model with the perceptually driven stiffness increment.
Goldfield, Kay and Warren (1993) found that human infants were able to drive a damped mass-spring at resonance. The system consisted of the infant itself suspended from the spring of a "jolly bouncer" which the infant drove by kicking. This essentially instantiates the phase driven oscillator model and shows that even infants can use perceived phase to drive such an oscillator at resonance. We hypothesize that all adult rhythmic limb movements are organized in this way.
6. Modeling coupled oscillators

With this model of a single oscillating limb, we were ready to model the coupled system. Kay et al. (1987) had modeled the coupled system by combining two hybrid oscillators via a nonlinear coupling:
$$\ddot{x}_1 + b\dot{x}_1 + \alpha\dot{x}_1^{3} + \gamma x_1^{2}\dot{x}_1 + kx_1 = (\dot{x}_1 - \dot{x}_2)\left[a + b(x_1 - x_2)^{2}\right]$$
$$\ddot{x}_2 + b\dot{x}_2 + \alpha\dot{x}_2^{3} + \gamma x_2^{2}\dot{x}_2 + kx_2 = (\dot{x}_2 - \dot{x}_1)\left[a + b(x_2 - x_1)^{2}\right] \qquad (3)$$
This model required that people simultaneously perceive the instantaneous velocity difference between the oscillators as well as the instantaneous position difference, so that both could be used in the coupling function. This model did yield the two stable modes (namely, 0° and 180° relative phase) at frequencies near 1 Hz, and mode switching from 180° to 0° relative phase at frequencies between 3 Hz and 4 Hz. We have proposed an alternative model (Bingham, 2001; Bingham & Collins, submitted) in which two phase driven oscillators are coupled by driving each oscillator using the perceived phase of the other oscillator multiplied by a term, P, that represents the perceived relative phase. P is computed as the sign of the product of the two drivers. P simply indicates at each instant whether the two oscillators are moving in the same direction (sgn = +1) or in opposite directions (sgn = -1). The model is:

$$\ddot{x}_1 + b\dot{x}_1 + kx_1 = c\sin(\phi_2)\,P_{ij}$$
$$\ddot{x}_2 + b\dot{x}_2 + kx_2 = c\sin(\phi_1)\,P_{ij} \qquad (4)$$
where

$$P_{ij} = \mathrm{sgn}\!\left(\sin(\phi_1)\sin(\phi_2) + \alpha\,(\dot{x}_i - \dot{x}_j)\,N_t\right) \qquad (5)$$
As shown in equation (5), the product of the two drivers is incremented by a Gaussian noise term with a time constant of 50 ms and a variance that is proportional to the velocity difference between the oscillators. This noise term reflects known sensitivities to the directions of optical velocities (De Bruyn & Orban, 1988; Snowden & Braddick, 1991) and is motivated by the results from the phase perception experiment described above. We found that relative phase appeared more variable as the relative velocity of the two oscillators increased. It is important to note that this noise term does not imply that the velocity difference is perceived, but only that the ability to resolve the relative direction of movement is affected by the relative speeds. This model also yields only two stable modes (at 0° and 180° relative phase) at frequencies near 1 Hz, and yields mode switching from 180° to 0° relative phase at frequencies between 3 Hz and 4 Hz. This is shown in Figure 9, where the oscillators were started at 180° at 1 Hz and then, as the frequency was increased, exhibited increasing variability in relative phase, eventually switching to 0° (360°) at a frequency of about 4 Hz. After the switch, the variability decreased strongly.
"M
400
?
200
a-
0
%
-200 -400
1/V/—
5
10
15
20
Time
Figure 9
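The sketch below integrates the coupled model of equations (4) and (5). The chapter gives no numerical parameter values, so the damping, the noise scaling α, the form of c(k), and the implementation of the 50 ms Gaussian noise (here resampled every 50 ms) are all assumptions; the sketch shows how the equations can be put on a computer rather than reproducing Figure 9 quantitatively. Starting the pair in anti-phase and comparing a low and a high movement frequency, the concentration of relative phase (R, where 1 means no variability) should be visibly lower at the higher frequency, because the noise term grows with the velocity difference.

```python
import numpy as np

def coupled(k, b=1.0, alpha=0.1, T=30.0, dt=0.001, seed=0):
    """Integrate equations (4)-(5): two phase-driven oscillators coupled through P."""
    rng = np.random.default_rng(seed)
    w, c = np.sqrt(k), 0.5 * k ** 0.25           # assumed c(k), as in the previous sketch
    x1, v1, x2, v2 = 0.2, 0.0, -0.2, 0.0         # start (roughly) in anti-phase
    noise, resample = 0.0, int(0.05 / dt)        # Gaussian noise resampled every 50 ms
    rel = []
    for i in range(int(T / dt)):
        if i % resample == 0:
            noise = rng.standard_normal()
        p1 = np.arctan2(v1 / w, x1)              # perceived phase of oscillator 1
        p2 = np.arctan2(v2 / w, x2)              # perceived phase of oscillator 2
        P = np.sign(np.sin(p1) * np.sin(p2) + alpha * (v1 - v2) * noise)
        a1 = -b * v1 - k * x1 + c * np.sin(p2) * P
        a2 = -b * v2 - k * x2 + c * np.sin(p1) * P
        v1 += a1 * dt; x1 += v1 * dt
        v2 += a2 * dt; x2 += v2 * dt
        rel.append(p1 - p2)
    z = np.exp(1j * np.asarray(rel[len(rel) // 2:]))     # second half of the run only
    return np.degrees(np.angle(z.mean())) % 360.0, np.abs(z.mean())

for k in (40.0, 630.0):                          # about 1 Hz and about 4 Hz
    mean_phi, R = coupled(k)
    f = np.sqrt(k) / (2 * np.pi)
    print(f"f = {f:.1f} Hz   mean relative phase = {mean_phi:.0f} deg   R = {R:.2f}")
```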
Furthermore, the model predicts our results for judgments of mean relative phase and of phase variability. Judged mean phase is produced by integrating P over a moving window of width σ (= 2 s) to yield P_JM:
$$P_{JM} = \int_{t-\sigma}^{t} P\,dt \qquad (6)$$
Judged phase variability is predicted by integrating $(P - P_{JM})^{2}$ over the same window to yield P_JV:

$$P_{JV} = \int_{t-\sigma}^{t} \left(P - P_{JM}\right)^{2} dt \qquad (7)$$

P_JM varies linearly with actual mean phase, and P_JV yields an asymmetric inverted-U as a function of actual mean phase, as shown in Figure 10. As also shown, P_JM and P_JV behave as did the respective judgments in response to increases in the frequency of oscillation. Compare Figure 10 to Figure 4. The model reproduced the results from the judgment studies in which observers judged either mean relative phase or phase variability. (These results were obtained using an external forcing to drive the system to phases other than 0° and 180° (Tuller & Kelso, 1989):

$$\ddot{x}_1 + b\dot{x}_1 + kx_1 = c\sin(\phi_2)\,P_{ij}$$
$$\ddot{x}_2 + b\dot{x}_2 + kx_2 = c\sin(\phi_1)\,P_{ij} + d\sin(\sqrt{k}\,t + \phi_R)$$

where $\phi_R$ was manipulated to achieve particular relative phase relations.)
Figure 10: P_JM and P_JV plotted against measured relative phase (compare with Figure 4).
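Equations (6) and (7) can be checked with a toy computation. If the noise term is left out, P is just the sign of sin(ωt)·sin(ωt + Δφ), and its window mean and variance can be computed directly; normalising the integrals by the window length is an assumption of this sketch. The mean falls linearly from +1 at 0° to -1 at 180°, and the variance is an inverted U peaking at 90°; the asymmetry between 0° and 180° seen in the judgments appears only once the velocity-dependent noise term of equation (5) is included.

```python
import numpy as np

f, dt, sigma = 1.0, 0.001, 2.0            # 1 Hz movement, 2 s window (as in the text)
t = np.arange(0.0, 10.0, dt)
n_win = int(sigma / dt)
box = np.ones(n_win) / n_win              # normalised moving window

for delta_deg in (0, 45, 90, 135, 180):
    delta = np.radians(delta_deg)
    P = np.sign(np.sin(2*np.pi*f*t) * np.sin(2*np.pi*f*t + delta))   # noise-free P
    PJM = np.convolve(P, box, mode="valid")                  # eq. (6): window mean
    PJV = np.convolve((P - PJM[-1])**2, box, mode="valid")   # eq. (7): window variance
    print(f"{delta_deg:3d} deg   PJM = {PJM[-1]:+.2f}   PJV = {PJV[-1]:.2f}")
```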
There are two aspects of the perceptual portions of the model that should be emphasized. First, there are actually two perceptible properties entailed in the model. The two are very closely related, but they are distinct. The first is the
phase of a single oscillator. The perception thereof is entailed in the single oscillator model. This is, of course, incorporated into the coupled oscillator model. The second perceptible property is relative phase. This latter property brings us to the second aspect of the model to be noted. This is especially important. This model is being used to model performance in two different tasks: one is a coordinated movement task and the other is a judgment task. Equation (5) represents the way the perception of relative phase plays a role in the coordinated movement task. This is in terms of the momentary value of P, that is, whether the oscillators are perceived to be moving in the same or in opposite directions at a given moment in time. This modifies the driving effect of the respective perceived phases. In contrast, equations (6) and (7) represent the way the perception of relative phase plays a role in the judgment tasks. In this case, the behavior of P is assessed (that is, integrated) over some window of time that is large enough to span one or two cycles of movement. So, the two tasks are connected by a single perceptible property, but the way the property is evaluated and used is task-specific.

The model is representative of nonlinear dynamics: complex behavior emergent from simple dynamic organization. The model captures both the movement results and the results of perceptual judgments. Two relatively simple equations (4) capture the fundamental properties of human rhythmic movements: limit cycle stability, phase resetting, inverse frequency-amplitude and direct frequency-peak velocity relationships, the stable modes and mode transitions, and the increasing patterns of instability leading up to mode transition. With the addition of two more simple equations (6) and (7), computing a mean and a variance, the model accounts for the results for perceptual judgments of mean relative phase and of phase variability and the ways these vary with the frequency of movement. All this from a model with six parameters (k, b, c, α, γ, and σ), four of which are fixed and one of which, k, is varied to generate variations in frequency of movement. (Note: because c = f(k), c varies with k, but once the scaling of c is fixed, this does not represent an extra degree of freedom.) This model builds on the previous results of Kay et al. (1987) and Kay et al. (1991), which revealed fundamental dynamic properties of human movement. Those properties are captured by the new model as they were by previous models. However, unlike the previous models, the new model is an explicit perception/action model. Its components are interpretable in terms of known components of the perception/action system. It explicitly represents the perceptual coupling that is well recognized to be fundamental to the coordination task and the resulting bimanual behaviors.
7. φ and τ
We began by studying the visual perception of relative phase and indeed, we found that the pattern of judgments was consistent with the pattern of results from studies on rhythmic limb movement. We should mention that we have also replicated the visual phase perception results in a haptic phase perception task (Wilson, Craig & Bingham, in press). However, when we turned to modeling, it was necessary to consider a related but different variable, namely, phase φ. Both phase and relative phase are required for the model of bimanual coordination. Phase is required in the context of each single oscillator, while relative phase is required in addition for their coordination. (In fact, for the single oscillator, we ultimately required a third perceptible variable, the 'energy' e.) It is the phase that is most relevant in the current context. As a perceptible variable, phase φ is very similar to τ. However, they are not the same. Both are derived as a ratio of position and velocity, that is, variables that may be state variables in a dynamic system. In terms of these variables,
$$\tau = \frac{x}{\dot{x}}, \qquad \phi = \arctan\!\left[\frac{\dot{x}/f}{x}\right] \qquad (8)$$

but this is not entirely appropriate, because there are constraints on the underlying space for τ that do not apply for φ and vice versa. τ is defined to describe motion relative to (usually an approach to) an origin along the positive half of the x axis, that is, the range of x includes positive values and zero. Alternatively, φ is defined relative to the equilibrium point of an oscillator, and x takes on both positive and negative values around an origin located at the equilibrium point. Then, while both are timing or time-relative variables, only τ is a temporal variable. Dimensionally, φ is dimensionless, more specifically, angular. As shown in equation (8), the ratio of state variables is normalized by frequency, f. Thus, unlike τ, φ is defined relative to a cyclic or periodic structure. The oscillatory dynamic is intrinsic to φ, but not to τ. So, φ is certainly not the same as τ, but they reflect similar strategies for understanding and modeling perception/action systems. Both are derived in terms of ratios of spatial variables, so both avoid the classic measurement problem in space perception. Relatedly, both are intrinsic timing variables and so relate directly to the timing of behavior. Finally, both are derived from variables that can play the role of state variables in a dynamic system. This means that they can be used as drivers to build autonomous dynamical organizations to model stable, self-organizing perception/action systems.
REFERENCES

Bingham, G. P. (1988). Task specific devices and the perceptual bottleneck. Human Movement Science, 7, 225-264.
Bingham, G. P. (1995). The role of perception in timing: Feedback control in motor programming and task dynamics. In E. Covey, H. Hawkins, T. McMullen & R. Port (eds.) Neural Representation of Temporal Patterns, pp. 129-157. New York: Plenum Press.
Bingham, G. P. (2001). A perceptually driven dynamical model of rhythmic limb movement and bimanual coordination. Proceedings of the 23rd Annual Conference of the Cognitive Science Society (pp. 75-79). Hillsdale, NJ: LEA Publishers.
Bingham, G. P. & Pagano, C. C. (1998). The necessity of a perception/action approach to definite distance perception: Monocular distance perception to guide reaching. Journal of Experimental Psychology: Human Perception and Performance, 24, 145-168.
Bingham, G. P., Schmidt, R. C. & Zaal, F. T. J. M. (1998). Visual perception of relative phasing of human limb movements. Perception & Psychophysics, 61, 246-258.
Bingham, G. P., Zaal, F., Robin, D. & Shull, J. A. (2000). Distortions in definite distance and shape perception as measured by reaching without and with haptic feedback. Journal of Experimental Psychology: Human Perception and Performance, 26(4), 1436-1460.
Bingham, G. P., Zaal, F. T. J. M., Shull, J. A. & Collins, D. R. (2001). The effect of frequency on visual perception of relative phase and phase variability. Experimental Brain Research, 136, 543-552.
Bingham, G. P. & Collins, D. R. (submitted). Phase perception and a phase driven and phase coupled dynamical model of bimanual rhythmic movement.
Bizzi, E., Hogan, N., Mussa-Ivaldi, F. & Giszter, S. (1992). Does the nervous system use equilibrium point control to guide single and multiple joint movements? Behavioral and Brain Sciences, 15, 603-613.
Collins, D. R. & Bingham, G. P. (2001). How continuous is the perception of relative phase? InterJournal: Complex Systems, MS #381.
De Bruyn, B. & Orban, G. A. (1988). Human velocity and direction discrimination measured with random dot patterns. Vision Research, 28, 1323-1335.
Diedrich, F. J. & Warren, W. H. (1995). Why change gaits? Dynamics of the walk-run transition. Journal of Experimental Psychology: Human Perception and Performance, 21, 183-202.
Feldman, A. G. (1980). Superposition of motor programs - I. Rhythmic forearm movements in man. Neuroscience, 5, 81-90.
Feldman, A. G. (1986). Once more on the equilibrium-point hypothesis (λ model) for motor control. Journal of Motor Behavior, 18(1), 17-54.
Feldman, A. G., Adamovich, S. V., Ostry, D. J. & Flanagan, J. R. (1990). The origin of electromyograms: Explanations based on the equilibrium point hypothesis. In J. M. Winters & S. L-Y. Woo (eds.) Multiple Muscle Systems: Biomechanics and Movement Organization. New York: Springer-Verlag.
Goldfield, E. C., Kay, B. A. & Warren, W. H. (1993). Infant bouncing: The assembly and tuning of an action system. Child Development, 64, 1128-1142.
Haken, H., Kelso, J. A. S. & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51, 347-356.
Hatsopoulos, N. G. & Warren, W. H. (1996). Resonance tuning in arm swinging. Journal of Motor Behavior, 28, 3-14.
Hogan, N., Bizzi, E., Mussa-Ivaldi, F. A. & Flash, T. (1987). Controlling multijoint motor behavior. In K. B. Pandolf (ed.) Exercise and Sport Sciences Reviews, 15 (pp. 153-190). New York: MacMillan. (esp. pp. 167-170).
Jordan, D. W. & Smith, P. (1977). Nonlinear Ordinary Differential Equations. Oxford, England: Clarendon.
Kay, B. A., Kelso, J. A. S., Saltzman, E. L. & Schöner, G. (1987). Space-time behavior of single and bimanual rhythmical movements: Data and limit cycle model. Journal of Experimental Psychology: Human Perception and Performance, 13, 178-192.
Kay, B. A., Saltzman, E. L. & Kelso, J. A. S. (1991). Steady-state and perturbed rhythmical movements: A dynamical analysis. Journal of Experimental Psychology: Human Perception and Performance, 17, 183-197.
Kelso, J. A. S. (1984). Phase transitions and critical behavior in human bimanual coordination. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 15, R1000-R1004.
Kelso, J. A. S. (1990). Phase transitions: Foundations of behavior. In H. Haken and M. Stadler (eds.), Synergetics of Cognition (pp. 249-268). Berlin: Springer Verlag.
Kelso, J. A. S. (1995). Dynamic Patterns: The Self-Organization of Brain and Behavior. Cambridge, MA: MIT Press.
Kelso, J. A. S., Scholz, J. P. & Schöner, G. (1986). Nonequilibrium phase transitions in coordinated biological motion: Critical fluctuations. Physics Letters A, 118, 279-284.
Kelso, J. A. S., Schöner, G., Scholz, J. P. & Haken, H. (1987). Phase-locked modes, phase transitions and component oscillators in biological motion. Physica Scripta, 35, 79-87.
Latash, M. L. (1993). Control of Human Movement. (Ch. 1: What muscle parameters are controlled by the nervous system? pp. 1-37, and Ch. 3: The equilibrium-point hypothesis and movement dynamics, pp. 81-102.) Champaign, IL: Human Kinetics.
Margaria, R. (1976). Biomechanics and Energetics of Muscular Exercise. Oxford: Clarendon Press.
McMahon, T. A. (1984). Muscles, Reflexes, and Locomotion. Princeton, NJ: Princeton University Press.
Schmidt, R. C., Carello, C. & Turvey, M. T. (1990). Phase transitions and critical fluctuations in the visual coordination of rhythmic movements between people. Journal of Experimental Psychology: Human Perception and Performance, 16, 227-247.
Schöner, G. (1990). A dynamic theory of coordination of discrete movement. Biological Cybernetics, 63, 257-270.
Schöner, G. (1991). Dynamic theory of action-perception patterns: The "moving room" paradigm. Biological Cybernetics, 64, 455-462.
Snowden, R. J. & Braddick, O. J. (1991). The temporal integration and resolution of velocity signals. Vision Research, 31, 907-914.
Tittle, J. S., Todd, J. T., Perotti, V. J. & Norman, J. F. (1995). Systematic distortion of perceived three-dimensional structure from motion and binocular stereopsis. Journal of Experimental Psychology: Human Perception and Performance, 21(3), 663-678.
Todd, J. T., Tittle, J. S. & Norman, J. F. (1995). Distortions of three-dimensional space in the perceptual analysis of motion and stereo. Perception, 24, 75-86.
Tuller, B. & Kelso, J. A. S. (1989). Environmentally specified patterns of movement coordination in normal and split-brain subjects. Experimental Brain Research, 75, 306-316.
Wilson, A. N., Bingham, G. P. & Craig, J. C. (in press). Proprioceptive perception of phase variability. Journal of Experimental Psychology: Human Perception and Performance.
Wimmers, R. H., Beek, P. J. & van Wieringen, P. C. W. (1992). Phase transitions in rhythmic tracking movements: A case of unilateral coupling. Human Movement Science, 11, 217-226.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1998). Coordination in prehension: Information-based coupling of reaching and grasping. Experimental Brain Research, 119, 427-435.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1999). Dynamics of reaching for stationary and moving objects: Data and model. Journal of Experimental Psychology: Human Perception and Performance, 25, 149-161.
Zaal, F. T. J. M., Bingham, G. P. & Schmidt, R. C. (2000). Visual perception of mean relative phase and phase variability. Journal of Experimental Psychology: Human Perception and Performance, 26, 1209-1220.
CHAPTER 19

The Fallacious Assumption of Time-to-Contact Perception in the Regulation of Catching and Hitting

Simone Caljouw
Vrije Universiteit Amsterdam, The Netherlands

John van der Kamp
Vrije Universiteit Amsterdam, The Netherlands

Geert J. P. Savelsbergh
Vrije Universiteit Amsterdam, The Netherlands
Manchester Metropolitan University, Manchester, UK
ABSTRACT

Interceptive actions, such as catching and hitting, were thought to be timed on the basis of a critical value of perceived time-to-contact (TTC), provided by the optical variable τ(φ). This optical variable is a monocular invariant that specifies, given a constant approach velocity, the time-to-contact of the ball with the observer's eye. The present Chapter shows that the original formulations of this TTC model were fatally flawed in many ways and that they are subject to revision. The second part of this Chapter presents alternative models for the information-based regulation of interceptive actions. We no longer present the perception of TTC as an intermediate phase in interceptive timing. Instead, we propose that information directly regulates action. Furthermore, we suggest that not a sole variable, but multiple variables might constrain natural interceptive actions. The question of what information contributes to interceptive timing cannot be posed in isolation from the question of how information is used to regulate timing. The original TTC model assumed a critical timing strategy, but there is also online regulation. The final section of the present chapter discusses different online control models and their informational inputs.
1. The timing of catching and hitting

Skilled athletes perform dynamic interceptive actions such as catching and hitting with ease and a high degree of accuracy. To intercept a ball, actors must contact the ball with the right spatial orientation and velocity of the hand, bat, or racket. The end-effector has to be at a certain location at a particular time. Therefore, precise coordination between visual information about the target's trajectory and the kinematics of the end-effector is necessary to prepare for impact. Timing is very important in these interceptive actions. In catching, the ball will bounce off the hand or crush the fingers when the moment of hand closure is late or early, respectively. In the case of balls travelling with a speed of about 10 m/s, the precision with which the grasping action of the hand must be timed is estimated to be on the order of 15 ms (Alderson, Sully, & Sully, 1974). When striking fast balls, the required temporal precision is even higher. Researchers have reported temporal accuracies of about 6.5 ms for a forehand drive in table tennis (Bootsma & Van Wieringen, 1990) and 2.5 ms for batting in cricket (Regan, 1997).

While hitting is similar to catching in a number of ways, there are some important differences. In catching, the contact with the ball should always be soft, because the fingers have to enclose the ball. In hitting, on the other hand, the main objective is to transfer energy to the ball in order to transport it with a certain velocity in a particular direction. The ball is received and sent away in the same action. Hence, the actor has to ensure that the implement (or hand) travels with the right velocity at the moment of contact to transfer an appropriate amount of kinetic energy to the ball. In a review on batting in men's cricket, Stretch, Bartlett, & Davids (2000) described how a batter sometimes faces extremely severe spatio-temporal constraints. For example, in the leg glance, the frontal plane of the blade of the cricket bat (about 10 cm in width) has to be quickly turned almost 90 degrees to deflect the ball past the wicket keeper. Yet, skilled cricketers can successfully perform this intricate stroke against fast bowling speeds of around 40 m/s (Regan, 1997).

With one of the most remarkable aspects of catching and hitting being the precision with which the actions are timed, it is perhaps not surprising that special attention has been paid to visual information specifying the temporal relation between an observer and the ball. Whiting & Sharp (1973) clearly showed that it is necessary to "keep an eye on the ball" for successful timing of interceptive action: catching performance was better when the ball was visible for the last 320 ms before ball-hand contact rather than only the last 160 ms. An organism-environment property thought to be of particular importance in the
timing of interceptive actions is the time-to-contact (TTC), i.e. the time remaining before the ball reaches the observer. TTC can be defined as the distance of the ball at some instant of time divided by its velocity. Information about TTC is identified in the structure of optical expansion patterns generated by the approaching ball (see Section 2). This optical pattern is visually available and does not have to be computed from independently obtained perceptions of distance and velocity.

In this Chapter we adopt Gibson's theory of direct perception, or the ecological approach, as a starting-point to understand how movement coordination is constrained by information (Gibson, 1979; Michaels & Carello, 1981). Gibson rejected the claim that observers have to construct a representation of the world from ambiguous stimuli (resulting from a 2D projection of the 3D world on the retina) by relying on constructive processes such as inference. In contrast, he argued that objects and events can be directly perceived and that information does not have to be disambiguated. An abundance of information resides in optical patterns that are generated by (moving) objects and events as well as by movements of the observer. Perception is based on the detection of optical patterns that directly specify actor-environment properties. Specificity implies a one-to-one mapping between physical properties and optical information. Given the constraints and regularities of the ecologies under consideration, information is lawfully related to its source (i.e. object or event), such that no other source could have generated the same optical pattern (Runeson, Jacobs, Andersson, & Kreegipuu, 2001).

For many years, research on information-movement coupling in interceptive actions was dominated by the perspective that humans are able to detect a source of information that specifies the time it takes an object to reach an observer (TTC). By detecting the optical information, the actor perceives when an object is close enough to act upon it. In the present Chapter we will begin with an overview of this original TTC-model. We then review recent studies that show the flaws of this model, and we argue that the TTC-model is insufficient to explain the information-based regulation of catching and hitting. In the final Section we will provide empirical evidence for the view that interceptive actions can be regulated by multiple sources of information. We will present alternative models and consider the implications for general issues in information-movement coupling.
2. The TTC-model

Although it was already derived in 1958 that the optical expansion pattern of an approaching object specifies TTC (Knowles & Carel, 1958; Purdy, 1958), it was not until the early eighties that it became widely accepted. Within the ecological psychology framework, Lee (1976) demonstrated that TTC is directly available in the optic array. Hence, there is no need for actors to estimate the distance and velocity of an object. He mathematically showed that, given a constant approach velocity, the inverse of the relative rate of optical expansion directly specifies TTC with the point of observation.¹ This optical variable is denoted tau, i.e. τ(φ).
Figure 1: The original tau model; by detecting the optical information τ(φ), the actor perceives TTC1, and action is initiated when the perceived TTC1 reaches a critical value (Information: τ(φ) → Perception: TTC1 → Action: Timing).
¹ $\tau(\varphi) = \dfrac{\varphi}{\dot{\varphi}} = -\dfrac{Z}{\dot{Z}} = \mathrm{TTC}_1$. Z is an organism-environment property (i.e. the distance between object and observer), which together with its first time derivative defines TTC. The subscript 1 indicates that only the first-order time derivative of Z is considered. For small angles of φ the equality holds to close approximation.
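As a quick illustration of footnote 1, the sketch below computes τ(φ) from the optical angle subtended by a ball approaching at constant speed and compares it with the true first-order TTC; the ball size, speed, and distances are assumed, illustrative values.

```python
import numpy as np

s  = 0.07        # ball diameter (m) -- assumed
v  = 10.0        # constant approach speed (m/s) -- assumed
Z0 = 12.0        # initial ball-eye distance (m) -- assumed

dt = 0.001
t  = np.arange(0.0, 1.0, dt)             # stop well before contact (contact at 1.2 s)
Z  = Z0 - v * t                           # distance over time
phi = 2.0 * np.arctan(s / (2.0 * Z))      # optical angle subtended by the ball

phi_dot = np.gradient(phi, dt)            # rate of optical expansion
tau     = phi / phi_dot                   # tau(phi) = phi / phi_dot
ttc1    = Z / v                           # true first-order TTC

for i in (0, 500, 999):
    print(f"t = {t[i]:.2f} s   TTC1 = {ttc1[i]:.3f} s   tau(phi) = {tau[i]:.3f} s")
```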
Following Lee's seminal article, two lines of research aimed at proving the TTC-model (schematised in Figure 1). The first approach tried to establish whether humans were sensitive to the optical variable τ(φ) (i.e. the relation between the first two boxes in Figure 1). For example, Regan and co-workers addressed, with the help of psychophysical studies, whether observers are able to detect τ(φ) in order to make judgments about the TTC1 of an approaching simulated rigid sphere (Regan & Hamstra, 1993; Regan & Vincent, 1995). They provided evidence that observers were able to discriminate variations in the ratio of φ to φ̇ (i.e. τ(φ)), while totally ignoring the co-varying variables φ and φ̇. This work has been frequently cited in order to show that sports players can discriminate τ(φ) accurately enough to account for the precision with which interceptive actions have to be performed. Although low discrimination thresholds are requisite for utilizing τ(φ) in guiding actions, they do not prove the actual use of τ(φ) in the timing of interceptive actions.
The second line of research examined the relation between the to-be-perceived TTC1 and the timing of action. A classic example is Lee and Reddish's (1981) study of gannets, which streamline their wings just before impact on water. Since the seabird is accelerating under the influence of gravity, the TTC1 (which assumes constant speed) overestimates the real TTC with the water. Film analysis showed that the birds ignored acceleration and therefore started streamlining longer before contact when diving from greater heights. Lee and Reddish (1981) concluded that the gannets folded their wings when TTC1 reached a specific value (but see critical comments from Wann, 1996).

The original TTC model, as illustrated in Figure 1, supposes a critical timing strategy. That is, a relation between information and movement can be achieved when the threshold value of the perceived TTC is equal to the movement duration plus a certain visuomotor interval. With this control strategy, movement adjustments can be made at intervals equal to or larger than the visuomotor interval. Following the approach of Lee, many studies showed that participants behaved in accordance with TTC1 (Bootsma & Van Wieringen, 1990; Cavallo & Laurent, 1988; Lee, Lishman, & Thomson, 1982; Lee, Young, Reddish, Lough, & Clayton, 1983; Savelsbergh, Whiting, Burden, & Bartlett, 1992; Todd, 1981). However, if movement variables are correlated with TTC1, utilization of the optical variable τ(φ) cannot simply be inferred, because other, co-varying sources of information could have produced the same behaviour.
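Returning to the gannet example: how large is the error an actor would make by relying on TTC1 when the object accelerates under gravity? A back-of-the-envelope check with assumed numbers is given below.

```python
import math

# Illustrative values (assumptions): an object dropped from rest under gravity
g  = 9.81
h0 = 5.0                  # release height above the point of contact (m)
z  = 1.0                  # current height above the point of contact (m)

v        = math.sqrt(2 * g * (h0 - z))          # current speed after falling h0 - z
ttc_1    = z / v                                 # first-order TTC (assumes constant speed)
ttc_real = (math.sqrt(v**2 + 2*g*z) - v) / g     # real TTC with constant acceleration

print(f"speed = {v:.2f} m/s   TTC1 = {ttc_1*1000:.0f} ms   real TTC = {ttc_real*1000:.0f} ms")
```

With these values the first-order estimate is roughly 7 ms too long, a difference of the same order as the timing precision required in catching.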
3. Testing the TTC-model

The first round of research aimed at identifying the optical patterns that might account for the timing of interceptive actions. The TTC-model was introduced and research was conducted to affirm this theory. Unfortunately, the actual involvement of τ(φ) in the regulation of interceptive actions was not really put to the test. Part of the research set out to investigate the link between information and perception, that is, the coupling between τ(φ) and the perception of TTC, and part of the research addressed the relation between the to-be-perceived TTC1 and the timing of actions. The direct relation between information and movement was left out of consideration. Therefore, a second round of research aimed at investigating the limitations of the TTC-model. This falsificationist approach assessed the predictive value of τ(φ) for the timing of interceptive actions and considered behaviour that was not in agreement with the use of τ(φ).
3.1 Testing the use of τ(φ)

In the original "grasping tau" experiment, balls that changed in size during their approach were used; with such balls not only τ(φ) is manipulated, but also the retinal image size and its rate of change. To investigate the quantitative effects, we replicated the "grasping tau" study and refined the methodology (Van der Kamp, 1999). Not only balls that decreased in size, but also balls that increased in size were used. In addition, the balls approached at two constant velocities. The qualitative effects were in agreement with the former deflating ball experiment, that is, opening and closing of the hand occurred earlier and later for the inflating and deflating ball, respectively. However, as presented in Figure 2, the magnitude of the effects was much smaller than would be predicted solely on the basis of τ(φ). Evidently, the timing of one-handed catching is not
solely based on τ(φ). Of course this does not suggest that τ(φ) is insignificant, but it does emphasise that rigorous tests are needed to investigate whether other informational sources are exploited as well.
Figure 2: Estimated (grey) and observed (white) differences in timing (TTC difference in ms) of the grasp onset between the constant balloons and the inflating and deflating balloons for both approach velocities (1 and 2 m/s). Note: Minus sign indicates that occurrence is later (Van der Kamp, 1999).
In order for τ(φ) to be accepted as a viable explanation for the regulation of interceptive timing, a number of underlying assumptions must hold. For example, regulation on the basis of τ(φ) is assumed to be monocular, and the points of interception and observation are assumed to coincide. Spelling out the implicit assumptions provides insight into the explanatory scope of τ(φ) and provides a useful framework for the identification of other informational variables that might contribute.

The point of observation does not coincide with the point of interception

Precisely timed interceptive actions can be performed almost anywhere within reach. For example, in tennis you can return a ball with several different techniques: a forehand, a backhand, a lob, etc. All these interceptive actions can be accurately performed at completely different positions relative to the eye. Unfortunately, τ(φ) only specifies TTC with the point of observation. If τ(φ) represented all the information available to an observer, considerable timing
errors would be made when the ball is not on a head-on collision course. The magnitude of these timing errors increases with the velocity of the object and the distance between the point of interception and the point of observation. However, these timing errors are not observed; even the catching skills of young infants are more accurate than would be predicted from the sole use of τ(φ) (Von Hofsten, 1983; cf. Chapter 8 of van Hof, van der Kamp, & Savelsbergh). To regulate interceptive actions, a more general information source is necessary, one that specifies the first-order time remaining until the ball reaches the interception point. An optical variable that specifies the TTC1 between a moving object (i.e. the ball) and any designated target in the action space (i.e. the interception point) was defined (Bootsma & Oudejans, 1993; Tresilian, 1990). Bootsma and Oudejans (1993) mathematically derived that TTC1 can be specified by a combination of the relative rate of expansion of the object's contour (τ(φ)) and the relative rate of constriction of the gap between the ball and the interception point (τ(θ)) (see Figure 3).²
Figure 3: An approaching ball at distance Z from the point of observation and at distance D from the hand. The angle subtended by the ball at the point of observation is φ, and the angle between the hand, the point of observation, and the ball is θ.
² $\dfrac{\dot{D}}{D} = \dfrac{\dot{Z}}{Z} + \dfrac{\dot{\theta}}{\tan\theta} \approx -\dfrac{\dot{\varphi}}{\varphi} + \dfrac{\dot{\theta}}{\tan\theta}$. For small angles, TTC1 is specified by the optical variable τ(φ,θ). This equation reduces to Lee's original τ(φ) function when the ball is on a head-on collision course, and TTC1 is fully specified by the constriction of the gap when the object moves at a constant distance from the observer.
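The relation in footnote 2, as reconstructed here, can be checked numerically for a ball that travels at constant speed along a straight line towards an interception point at arm's length; all geometry, sizes, and speeds below are assumed values.

```python
import numpy as np

s = 0.07                                    # ball diameter (m)
P = np.array([0.5, 0.5, 0.0])               # interception point relative to the eye (m)
a = np.array([0.0, 1.0, 0.0])               # direction from P towards the ball's start
v = 10.0                                    # ball speed (m/s)

dt = 0.0005
t  = np.arange(0.0, 0.55, dt)               # ball starts 6 m from P; stop 0.5 m short
D  = 6.0 - v * t                            # gap between ball and interception point
B  = P[None, :] + D[:, None] * a[None, :]   # ball position over time

Z     = np.linalg.norm(B, axis=1)                                     # ball-eye distance
phi   = 2 * np.arctan(s / (2 * Z))                                    # optical angle of the ball
theta = np.arccos(np.clip((B @ P) / (Z * np.linalg.norm(P)), -1, 1))  # optical gap angle

lhs = -np.gradient(D, dt) / D                                          # 1 / TTC1 of the gap
rhs = np.gradient(phi, dt) / phi - np.gradient(theta, dt) / np.tan(theta)

i = len(t) // 2
print(f"1/TTC_gap = {lhs[i]:.3f} 1/s   optical combination = {rhs[i]:.3f} 1/s")
```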
A forced-choice paradigm provided evidence for participants' sensitivity to τ(φ,θ). Participants had to predict which of two objects, moving laterally in front of the participant, crossed the midline first (Bootsma & Oudejans, 1993). Also, an experiment in which participants had to trap balls rolling down a trackway indicated that participants did not rely on τ(φ) alone, not even in the event that only the ball was visible (Tresilian, 1994a). It is noteworthy that both studies only considered outcome measures (i.e. percentage of correct judgments and number of timing errors, respectively). Consistent with the TTC model presented in Figure 1, it is claimed that τ(φ,θ) uniquely specifies TTC1 and, as such, is the only optical variable involved in timing interceptive actions. However, none of the above-described studies actually proved that τ(φ,θ) was the sole variable exploited to regulate interceptive actions.

Binocular information contributes to the timing of interceptive actions

Regulation on the basis of τ(φ) and τ(φ,θ) assumes that interceptive timing is solely based on monocular information, since the formulated optical patterns are detectable from a single point of observation. Hence, binocular viewing should not result in a modification of timing as compared to monocular viewing. And indeed, some one-eyed individuals have reached the highest levels of achievement. Excellent examples are the Tiger of Pataudi, the leading batsman and captain of the Indian cricket team, and the aviator Wiley Post (Regan, 1997). Also, Jack the one-eyed bullfighter would be doomed if he were not able to avoid collisions solely on the basis of monocular information (cf. homepage Jack Johnson, 2002). Nevertheless, sportsmen with intact stereovision seldom attempt to avoid or hit an object while viewing it with one eye. Already in 1931, Banister and Blackburn studied the influence of binocular vision on efficiency at ball games. They discovered that Cambridge undergraduates with a larger interocular (IO) distance generally performed better in ball sports. They argued that an increase in IO distance results in enhanced stereoscopic vision, and therefore in better performance. Recently it was shown that an important difference between participants with high and low stereopsis resides in the temporal accuracy of the grasp: low stereopsis seems to be associated with poorer catching performance due to a later onset of hand closure (Lenoir, Musch, & La Grange, 1999; Mazyn, Lenoir, Montagne, & Savelsbergh, 2001).

Although binocular information thus seems to be important in the regulation of interceptive actions, many studies implicitly assumed that monocular information rather than binocular information was used to regulate timing (Cavallo & Laurent, 1988; Michaels, Zeinstra, & Oudejans, 2001; Savelsbergh et al., 1991). For example, in some studies subjects performed the interception
task under binocular viewing, but the discussions of these experiments focused entirely on the monocular variable τ(φ). In these studies it cannot be excluded that participants used binocular sources, and that it was predominantly these sources that led to conclusions in favour of the τ(φ)-strategy. The first empirical support for the use of binocular information in the determination of TTC was provided by stereoscopic simulation studies (Gray & Regan, 1998; Heuer, 1993; Rushton & Wann, 1999). Stereoscopic computer simulations have enabled considerable progress in the investigation of binocular and monocular information sources. They are often preferred over natural settings because of their ease of use and high level of experimental control. However, the question is whether prediction-motion tasks and actions requiring actual interception of an approaching object rely on the same information. To test this, Van der Kamp, Savelsbergh, & Smeets (1997) investigated, in a natural interceptive task, what the roles of monocular and binocular information were. They studied the timing of the opening and closing of the hand in response to approaching balls of different diameter. The results showed differences in timing under monocular, but not under binocular, viewing. Recently, Michaels et al. (2001) have found similar timing differences for subjects punching different balls. The results of these experiments will be further discussed in Section 4.2.

Research in which participants had to wear a telestereoscope also shows that binocular information contributes to interceptive timing (Bennett, van der Kamp, Savelsbergh, & Davids, 1999, 2000; Judge & Bradford, 1988; Van der Kamp, Bennett, Savelsbergh, & Davids, 1999). Helmholtz (1962) invented the telestereoscope in order to manipulate binocular information. It consists of two pairs of mirrors positioned parallel to each other that increase the IO distance (see Figure 4). An increased IO distance results in a larger angle subtended by the ball and the two eyes (i.e. Δ) and a shorter perceived object distance. Early research by Judge and Bradford (1988) showed a disruption in catching performance when binocular information was manipulated with the help of the telestereoscope. During the experiment most participants failed to catch the ball. Unfortunately, it was not clear whether the observed effects resulted from spatial or temporal inaccuracy. Recent research with the telestereoscope focussed on the timing of one-handed catching. The results showed that participants closed their hand earlier when wearing the telestereoscope (Bennett et al., 1999, 2000; Van der Kamp et al., 1999).
Figure 4: Schematic arrangement of the mirrors of the telestereoscope. The mirrors increase the interocular separation between the left eye (L) and the right eye (R). As a result, an object B is perceived as object B' at distance Z. The angle subtended between the two points of observation and the ball is Δ.
Theoretically, it is possible to formulate a binocular variable that specifies TTC1, namely the angle between the points of observation and the ball divided by its derivative (Heuer, 1993).³ It is noteworthy that even in the case of imperfect tracking of the ball by the eyes, or in the case of the actor keeping his eyes steady, the angles of interest would still be available through a combination of oculomotor and optical information. Like the τ(φ) function, the τ(Δ) function only specifies TTC in the case of a head-on approach. Laurent, Montagne, & Durey (1996) showed that there exists a more general binocular source which specifies TTC1 with respect to the cyclopean frontal axis (the frontal axis that runs through both eyes). In the case of indirect approaches, the monocular information sources described by Bootsma and Oudejans (1993) and Tresilian (1990) are theoretically more powerful than the invariant described by Laurent
³ $\tau(\Delta) = \dfrac{\Delta}{\dot{\Delta}} \approx -\dfrac{Z}{\dot{Z}} = \mathrm{TTC}_1$. For small angles of Δ (i.e. the angle subtended by the object and the points of observation), τ(Δ) specifies the TTC1 (i.e. first-order time-to-contact with the points of observation) of an object approaching at distance Z with velocity $\dot{Z}$. Michaels (1986) presented an ecologically based formalization, wherein Δ is described as a transformation over two optic arrays.
et al. (1996), because this binocular invariant only specifies the TTC1 with the frontal axis. Recently, however, Tresilian (1999a) mathematically derived a binocular 'gap' invariant that specifies the TTC1 between two objects on a collision course.⁴ Wrapping up, the conclusion that empirical findings are at odds with the exclusive use of τ(φ) prompted the search for other optical quantities that specify TTC1. For example, τ(φ) was combined with the relative rate of constriction of the 'gap' (i.e. τ(θ)) to account for non-head-on collisions. Attempts were also made to discover the role of binocular variables, for example the relative rate of change of target vergence (i.e. τ(Δ)) and the binocular function of the constriction of the 'gap'. Heuer (1993) stated that any variable that is accessible to the visual system and that is proportional to the distance of an object could be used to formalize a tau function, because a tau function is expressed as the ratio between the instantaneous distance of an object and the rate of change of this distance over time. So, the original TTC-model was adjusted in that TTC1 was no longer exclusively specified by τ(φ). According to the new model, multiple sources specify TTC1, and a specific source is selected depending on the task at hand (Cutting, 1986, 1991; Laurent et al., 1996; cf. Figure 5). What remains consistent with the original model is the claim that information about TTC1 is indispensable in the regulation of interceptive timing. This claim will be considered next.
[Figure 5 schematic: Information (tau-functions) → Perception (TTC₁) → Action (Timing), gated by a critical value.]
Figure 5: Schematic representation of the adjusted TTC model. A tau-function is an optical variable, proportional to the distance of the object, divided by its rate of change. Multiple sources that specify TTC₁ are used to regulate the timing of interceptive actions.
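The adjusted model in Figure 5 can be rendered as a short sketch (ours; the function names, the critical value and the constant-velocity inputs are illustrative assumptions, not parameters from the studies cited above): one tau-function is selected depending on the task, its value stands in for the perceived TTC₁, and the action is released when that value falls below a critical margin.

```python
def adjusted_ttc_model(tau_functions, selected, critical_value, t):
    """Sketch of the adjusted TTC model in Figure 5: one tau-function is selected
    depending on the task, its value is taken as the perceived TTC1, and the
    interceptive movement is initiated once that value drops below a critical value."""
    perceived_ttc1 = tau_functions[selected](t)    # perception stage (TTC1)
    return perceived_ttc1 <= critical_value        # action (timing) stage

# Hypothetical optical inputs for a head-on approach from 4 m at 4 m/s, where every
# tau-function specifies the same TTC1 = Z/|Zdot| = 1 - t (in seconds).
taus = {
    "tau_phi":   lambda t: 1.0 - t,   # monocular: phi / phidot
    "tau_delta": lambda t: 1.0 - t,   # binocular: delta / deltadot
}

# Initiate hand closure once the selected source signals TTC1 <= 200 ms.
print([t for t in (0.5, 0.7, 0.81) if adjusted_ttc_model(taus, "tau_delta", 0.2, t)])
```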
⁴ It is rather laborious to show mathematically that an observer can obtain the time to contact between two objects (for example, ball and hand) on a collision course binocularly. In a nutshell, it is a complex expression involving the variables δ1 (the angle subtended between the two points of observation and the first object), δ2 (the angle subtended between the two points of observation and the second object), αR (the angle subtended between the left eye, object 1, and the right eye), and ψR (the angle subtended by the gap between the two objects and the right eye).
3.2 Testing the use of a TTC₁ strategy
At present, there is not much consensus about the use of TTC₁ to control catching and hitting. The experimental findings obtained in this area are quite varied and the theoretical picture is still incomplete. Some experiments suggest that interceptive actions can be regulated on the basis of perceived TTC₁, without taking acceleration into account. For example, Lee et al. (1983) performed an experiment in which participants had to jump up to punch a falling football accelerating due to gravity. They suggested that the knee and elbow angles were better geared to TTC₁ than to information about the real TTC. Since this interpretation received some critical comments (Tresilian, 1993; Wann, 1996), Michaels et al. (2001) replicated the "punching ball" study. Elbow flexion and extension were examined under both monocular and binocular conditions, with two ball sizes, dropped from two heights. The results showed that elbow flexion was influenced by object size, that is, participants responded earlier to larger objects than to smaller objects. These results imply that elbow flexion (the first phase) was not initiated at a constant value of TTC₁. On the other hand, Michaels et al. found that elbow extension (the second phase) could have been coupled to a critical value of TTC₁ in four of the five subjects. Thus, the results are only partly consistent with a critical TTC₁ strategy.
There are also experiments showing that participants do take acceleration into account. Contrary to the abovementioned findings, Lacquaniti and colleagues (Lacquaniti, Borghese, & Carrozzo, 1992; Lacquaniti, Carrozzo, & Borghese, 1993; Lacquaniti & Maioli, 1989) showed that for catching balls that fell from different heights (0.2, 0.4, 0.8, 1.2, and 1.6 m) the anticipatory rise in EMG amplitude of the biceps muscle occurred at the same 'real' TTC across conditions (approximately 150 ms). In this experiment participants used better estimates of TTC than would be expected on the basis of a TTC₁ strategy, which suggests that observers did take the constant acceleration into account. In free-fall situations, where acceleration is solely due to the force of gravity, participants might have used a general rule to take into account the errors resulting from the difference between TTC₁ and real TTC. To test whether the brain models Newton's laws, McIntyre and co-workers examined the way in which astronauts caught balls (McIntyre, Zago, Berthoz, & Lacquaniti, 2001). In microgravity, objects no longer accelerate downward, but move with a constant velocity in the release direction. McIntyre et al. projected the ball downward with three initial speeds (0.7, 1.7, and 2.7 m/s) from a starting position of 1.6 m. In 1g (on earth) they replicated the results of Lacquaniti et al. (Lacquaniti & Maioli, 1989; Lacquaniti et al., 1993). However, in 0g (during the flight) the peak of anticipatory biceps EMG occurred earlier as compared to 1g. After a
quantitative analysis, McIntyre et al. concluded that this shift was best explained by a failure to fully adjust for the lack of ball acceleration in 0g. So, participants might adopt a general rule to compensate for acceleration due to gravity. In summary, recent studies do not unambiguously confirm that actors rely on TTC₁ information when intercepting objects under gravity.
Studies that investigated interceptions of objects approaching with a constant velocity also refuted a TTC₁ strategy. The use of a TTC₁-strategy predicts that actions always occur at a constant time before contact, irrespective of the size of the object and the speed of approach. These variables cancel out of the formula φ/φ̇, which results in time as the sole measurement unit. Van der Kamp et al. (1997) systematically investigated the effects of ball size on the timing of one-handed catches. They found that participants opened and closed their hand at different times before contact: the larger the ball, the earlier the hand opened and closed (see also Michaels et al., 2001). These results are in agreement with the "size-arrival" effect found by DeLucia and co-workers in TTC judgement studies using computer-simulated scenes (see Chapter 10 of this volume). Studies that investigated the effect of approach speed also showed that participants did not initiate their movements at a constant TTC. The TTC at which participants initiated their movement consistently decreased with increasing speed (Bennett et al., 1999; Li & Laurent, 1994, 1995). Neither size nor speed effects are consistent with a TTC₁ strategy.
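The gap between first-order TTC and real TTC that the free-fall studies turn on can be made explicit with a small sketch (ours, not from the chapter; the 1.6 m drop height matches one of the conditions mentioned above, but the evaluation point and everything else is an illustrative assumption). For a ball in free fall, TTC₁ = Z/|Ż| ignores gravity, whereas the real TTC solves the constant-acceleration equation, so an actor relying on TTC₁ alone would systematically misjudge arrival unless a gravity-based correction is applied, as the 0g results suggest.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def first_order_ttc(z, zdot):
    """TTC1: remaining distance over current closing speed (acceleration ignored)."""
    return z / abs(zdot)

def real_ttc_free_fall(z, zdot):
    """Actual time to contact for a ball falling under constant gravity,
    i.e. the positive root of z = |zdot|*t + 0.5*G*t^2."""
    v = abs(zdot)
    return (-v + math.sqrt(v * v + 2 * G * z)) / G

# Ball dropped from 1.6 m, evaluated when it still has 0.5 m to travel.
drop, z = 1.6, 0.5
zdot = -math.sqrt(2 * G * (drop - z))      # speed acquired over the first 1.1 m of the fall
print(first_order_ttc(z, zdot), real_ttc_free_fall(z, zdot))
# TTC1 (about 0.108 s) overestimates the real TTC (about 0.098 s) because it
# ignores the ongoing acceleration.
```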
4. Alternative models for the control of interceptive actions
We must conclude that there is hardly any empirical evidence for the original TTC-model as presented in Figure 1. First, the axiom that exclusively τ(φ) is detected to perceive TTC₁ for the regulation of interceptions is flawed. Furthermore, actors do not solely regulate their movements on the basis of the perceived TTC₁. Most of the research presented suggests that an explanation of the control of timing exclusively on the basis of a critical τ(φ) strategy must be ruled out. Therefore we have to search for other possible models that might explain interceptive timing.
4.1 Direct action
For many years the search for optical variables was constrained to information specifying TTC₁. This was due to the implicit reasoning that the detection of information about an event entails the perception of this event, and that it is the perception of this event that regulates the action (cf. Figure 1). But, as shown in the previous Section, actors do not always rely on perceived TTC₁. This raises the issue whether, in the case of interceptive movements, the information used must specify TTC at all. The intermediate phase of perception may be superfluous. That is, information might directly regulate the timing without the need for perceiving TTC. This perspective leads to the assumption that any information that is to some degree correlated with the approach of the ball, including optical variables that do not specify the environmental property TTC (i.e. lower order variables), might be used to regulate interceptive movements. For instance, several authors have recently proposed that the optical basis for the timing of interceptive action may be the "looming" variable or absolute rate of expansion (Michaels et al., 2001; Smith, Flach, Dittman, & Stanard, 2001; Van der Kamp et al., 1997, 1999). As described in the previous Section, it was found that the timing of catching a ball is affected by ball size and velocity. The larger the object and the faster the approach, the earlier (i.e. at a longer TTC) people initiated their action. Michaels et al. (2001) attempted to push the qualitative agreement between φ and the initiation of the action into a quantitative one. They presented techniques to determine the critical values and corresponding visuomotor intervals of optical variables. The optical variable used to initiate flexion of the arm turned out to be the absolute rate of expansion, rather than the relative rate of expansion. The optical variable that regulated the extension of the arm was less conclusive, since different variables (including φ̇ and τ(φ)) were exploited by different participants. The telestereoscope experiments also showed that informational variables that do not specify TTC₁ contribute to the timing of catching. The telestereoscope manipulates the absolute value of the binocular variable target vergence by increasing the IO separation. Hence, if the relative rate of change of the binocular variable provided the information, wearing a telestereoscope would have no effect on the timing of the catch. The results showed that participants, while wearing the telestereoscope, closed their hand earlier around the ball (Bennett et al., 1999, 2000; Van der Kamp et al., 1999), suggesting that lower order variables contribute to the regulation of catching. In a recent article Michaels (2000) discussed the use of a lower order variable to control the timing of interceptive actions. Ecological psychologists expect a 1:1 mapping between informational variables and to-be-perceived
properties. This relation holds for the optical quantity τ(φ), which specifies TTC₁. On the other hand, φ̇ is ambiguous with respect to object size, distance, velocity, and TTC: it specifies nothing. But perhaps it is not necessary to first perceive an environmental property before acting upon it. The present proposal is that the initiation of interceptive actions may require only the detection of a critical threshold value of an optical quantity; it does not require the perception of TTC. In the direct action model (schematised in Figure 6) the optical variable directly modulates action.
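A brief sketch (ours; the ball diameters, approach speed and threshold value are illustrative assumptions, not parameters reported in the studies cited above) shows how a fixed critical value of the absolute rate of expansion φ̇ reproduces the size-arrival effect qualitatively: larger balls drive φ̇ across the threshold at a longer time before contact, so the action would be initiated earlier.

```python
import math

def ttc1_at_phidot_threshold(diameter, speed, threshold):
    """First-order TTC at which the absolute rate of expansion (phidot, using the
    small-angle approximation phi ~ diameter/Z) of an approaching ball crosses a
    fixed critical value."""
    # phidot = diameter * speed / Z**2, so the threshold is crossed at distance Z_c:
    z_c = math.sqrt(diameter * speed / threshold)
    return z_c / speed                               # TTC1 remaining at that moment

speed, threshold = 6.0, 0.3          # assumed approach speed (m/s) and critical phidot (rad/s)
for diameter in (0.07, 0.15, 0.22):  # assumed ball diameters (m)
    print(diameter, round(ttc1_at_phidot_threshold(diameter, speed, threshold), 3))
# The larger the ball, the longer the TTC at which the critical phidot value is
# reached, i.e. the earlier the hand would be opened or closed before contact.
```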
[Figure 6 schematic: Information → Action (Timing), gated by a critical value.]
Figure 6: Schematic representation of the direct action model. In this model information directly modulates action, without the interposition of an unnecessary perceptual entity.
The direct action model, in line with the original TTC-model, assumes that the initiation of an interceptive movement is based on a critical value of a single informational variable. Most of the research presented in this Chapter violated regulation on the basis of τ(φ), and indicated the exploitation of other informational variables (e.g. τ(φ,θ), τ(Δ), φ, Δ). What remains is a rather fragmented picture; it is impossible to provide a precise formulation of an alternative variable that can account for all of the observed effects. Gibson (1979) introduced the concept of a compound invariant, which is a unique combination of invariants. He argued that when a number of stimuli are completely covariant, when they always go together, they constitute a single stimulus (Gibson, 1979, p. 141). Obviously the question that has to be answered is: how do the variables combine to constitute a compound invariant? At present we have no clear answer to this question. Note, however, that the constituent informational variables by no means have to be chosen such that in combination they inform precisely about TTC.
4.2 Directed action
The original TTC model assumes that the pick-up of informational variables is not flexible. After all, a single variable could be used by many species to regulate many interceptive actions. Based on the evidence presented, we can conclude that multiple optical quantities are involved in the regulation of interceptive actions. But how should the claim that timing is based on multiple information sources be understood? One might assume that temporal control is based on a single source of information, selected out of several available sources depending on the task at hand (Savelsbergh & Van der Kamp, 2000). For example, Van der Kamp et al. (1997) investigated the role of monocular and binocular information in a natural interceptive task. They studied the timing of the opening and closing of the hand in response to an approaching ball. A size-arrival effect was found in the monocular condition but not in the binocular condition (cf. Michaels et al., 2001). This indicates that participants use different variables under different viewing conditions. Van der Kamp et al. (1997) argued that the size-arrival effect is consistent with the use of expansion information. No size-arrival effect was found in the binocular condition, which suggests that participants used a binocular variable that accounted for the constant TTC strategy (i.e. the relative rate of change of target vergence). So, the human visual system is opportunistic and flexible in the pick-up of variables that are present in the optic array (tau, optical expansion, etc.). The idea that different information sources might be needed to control the rich diversity of interceptive tasks is in line with the present neuroscientific view that the visual system is composed of several functional subdivisions. The one-dimensional view of much traditional research, that the only role of the eyes and the associated cortical areas is to build a single representation of the world around us, is now generally rejected. For example, Milner and Goodale's (1995) model of the two visual pathways can help us understand how the brain uses information depending on the task at hand. The dorsal stream, which addresses the motor areas of the brain more frequently, supports the visual control of goal-directed actions. The ventral stream, mainly connecting to higher cortical areas such as memory, is used to identify characteristics of objects and events. These findings suggest that multiple information sources are detected with multiple perceptual systems as a function of task-constraints. That is to say, selection of information depends on constraints such as the available sources of information, environmental regularities and task demands. The process of selection is not an internal constructive process, such as inference, but relies on constraints in the task ecologies.
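Building on the previous sketch, the selection account described above can be caricatured in a few lines (ours; the mapping from viewing condition to variable, the function name and all parameter values are hypothetical and only meant to mirror the Van der Kamp et al., 1997, interpretation): under monocular viewing the timing would be driven by an expansion-based quantity that carries a size-arrival effect, under binocular viewing by a τ(Δ)-like quantity, which does not.

```python
import math

def initiation_ttc(diameter, speed, viewing, phidot_crit=0.3, ttc_crit=0.25):
    """Hypothetical selection rule: under monocular viewing initiation is driven by a
    critical absolute rate of expansion (size-dependent); under binocular viewing by
    a critical value of tau(delta), which equals TTC1 regardless of ball size."""
    if viewing == "monocular":
        z_c = math.sqrt(diameter * speed / phidot_crit)   # phidot = D * v / Z**2
        return z_c / speed
    return ttc_crit                                       # tau(delta) strategy

for viewing in ("monocular", "binocular"):
    print(viewing, [round(initiation_ttc(d, 6.0, viewing), 3) for d in (0.07, 0.22)])
# Only the monocular condition shows a size-arrival effect, mirroring the pattern
# reported by Van der Kamp et al. (1997).
```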
In contrast to the perspective that a single source of information is used depending on the task-circumstances is the view that various binocular and monocular optical information sources, as well as oculomotor information, contribute simultaneously to the regulation of timing. Tresilian (1994b; see also Wann & Rushton, 1995) proposed that timing patterns reflect the combination, by summation or multiplication, of several differently weighted sources of information. The weights are assigned on the basis of prior and current perceptual information about the accuracy of the information sources. For instance, both monocular and binocular information may be used to time interceptive movements when participants view the scene with two eyes. Both Heuer (1993) and Gray and Regan (1998), in line with Regan and Beverley (1979), argued that monocular and binocular information is summed and that binocular information becomes more dominant as object size decreases. Comparably, the dipole model of Rushton and Wann (1999) combines estimates of TTC₁ based on binocular (disparity) and monocular (expansion) information and applies different weights to both cues depending on their immediacy. When disparity information specifies that the ball will hit the point of observation in advance of the temporal estimate provided by expansion information, participants will rely on disparity information. Notice that Rushton and Wann follow the adjusted TTC₁ model (cf. Figure 5), which was found to be flawed in the previous paragraphs. Bennett et al. (2000) used the telestereoscope to test whether the contribution of binocular information is mediated by object size. They predicted an interaction effect between ball size and telestereoscopic viewing. If subjects are more reliant on binocular information when catching a small ball, they should initiate key aspects of the catch earlier under telestereoscopic viewing than with normal viewing. If subjects are less reliant on binocular information when catching a large ball, there should be less or no effect of telestereoscopic viewing on the timing of the catch. This experiment showed, as expected, that participants closed their hand earlier for the small and medium sized balls under telestereoscopic viewing compared to normal viewing. Following this perspective, adaptation to new task-constraints should be based on recalibration of the existing information-movement coupling instead of the selection of an alternative coupling. To investigate whether adaptation occurs through selecting a different information-movement coupling or through recalibration of an existing information-movement coupling, another one-handed catching experiment with the telestereoscope was performed (Van der Kamp et al., 1999). A pre-exposure, exposure, post-exposure design was used. In the pre-exposure and post-exposure phases half of the participants were required to perform monocular catches and the other half had to perform binocular catches.
In the exposure phase both groups had to catch balls under binocular viewing while wearing the telestereoscope. If participants were able to switch easily to the appropriate information-movement coupling in the post-exposure phase (i.e. selection), no after-effects would occur. However, in the case of a recalibration process, after-effects would be expected. If recalibration was restricted to the manipulated information, only the binocular group would show after-effects, whereas after-effects might also occur in the monocular group if recalibration involved multiple information sources. The results showed that the hand was closed later in the first three trials after removal of the telestereoscope. No differences in the after-effects were found between the monocular and binocular group. This experiment suggests that adaptation to the telestereoscope is due to recalibration of the coupling between information and movement, rather than a selection of other information. Moreover, the recalibration is not restricted to the manipulated binocular information but may also encompass monocular information sources. Proponents of the directed perception approach (Cutting, 1986, 1991; Laurent et al., 1996) argued that an environmental property (e.g. TTC) could be perceived through the detection and combination of different optical variables (e.g. τ(φ), τ(Δ), and τ(φ,θ)). We agree with this approach in that interceptive actions may be based on multiple sources of information (cf. Figure 7) and that these variables contribute more or less depending on the task-circumstances. However, we do not agree completely with proponents of the directed perception approach, for several reasons. First, we reject the perspective that perception of the environmental property TTC is a vital step in the timing of an interceptive action. Second, we prefer a self-organized dynamic account of information use over a calculated combination of several differently weighted sources of information. The dynamical systems approach claims that optical information is one of the influences acting on the intrinsic dynamics of a system. So, we may assume that different information-related dynamics emerge from the use of different informational variables. If multiple sources of information are exploited, the entire dynamics of the system is given by a synergy in which the multiple information-related dynamics are assembled. For example, the dynamics of a system intercepting an approaching ball at the right time may be described by a synergy of the dynamics emerging from the use of expansion velocity and change of disparity. This process of assembling depends on different constraints: (i) Organismic constraints determine whether the particular information sources can be detected.
[Figure 7 schematic: Task constraints → Multiple information (φ, Δ, τ(φ), τ(θ), τ(Δ), etc.) → Action (Timing), gated by critical value(s).]
Figure 7: Schematic representation of the directed action model. On the basis of task-constraints several co-varying variables are combined and/or selected that directly modulate action.
The change of disparity will not exert any influence on the dynamics of the system if the actor does not have stereoacuity. (ii) Environmental constraints determine which information is available to the system. If the approaching ball is far away, binocular information is less salient (Collewijn & Erkelens, 1990). (iii) Task-constraints determine which information is relevant to perform a specific task. If a ball moving fronto-parallel, at a constant distance from the point of observation, has to be intercepted, the system will not be attracted to the dynamics of these two information sources; instead, the dynamics emerging from other information sources will contribute to the entire dynamics of the system. A self-organized dynamic account of information use does not suggest that actors weight and combine information sources; it suggests that the system settles into a state depending on several constraints. Although this dynamical process of combining information may be conveniently described in terms of addition and multiplication, it does not require that the brain solve such an equation. To account for the direct influence of a multitude of variables on the timing of interceptive actions we introduce the term "directed action".
5. Another control strategy
Accepting that empirical evidence for the original tau-model is at best scarce, one of two positions can be taken. One can either propose an alternative for the information used, or one can reconsider the way in which the information is used to regulate the movement. The TTC-model was originally a predictive model, that is, it presumed that a movement was regulated on the basis of information that specified when a future event (i.e. the interception) would occur. A critical value of perceived TTC₁ was used to initiate a ballistic movement with an invariant movement time and a constant visuo-motor interval (Tyldesley & Whiting, 1975). Predictive information might be useful in a visual masking task, where participants have to predict when a suddenly occluded object will arrive at a designated target area. In a natural interceptive task, however, participants are able to view the ball until it is received into the hand. Hence, there is no need for information that accurately predicts the place and time of interception. The information continuously informs the actor about the current state of the actor-environment system, and this information is used to update the movement online. In this section, different continuous control models will be presented. The first model is purely phenomenological and describes, with the help of dynamical equations of motion, how the timing of the grasp component in prehension emerges on the basis of optical information generated by the reach component (Zaal, Bootsma, & van Wieringen, 1998; see also Chapter 17 of this volume). The other models described are based on the Vector Integration to Endpoint model of Bullock and Grossberg (1988). They address a lateral catching task in which, besides temporal information, spatial information is also required (Dessing, Bullock, Peper, & Beek, 2002; Michaels, Jacobs, & Bongers, 2003; Peper, Bootsma, Mestre, & Bakker, 1994).
5.1 Dynamical systems approach
Originally, the dynamical systems approach captured stability-related features of rhythmic movements. In the last decade, dynamical systems theory has also been applied to describe the information-driven change in the initiation and trajectory formation of discrete movements (Schoner, 1990, 1994). Schoner (1994) modelled, for instance, the wing retraction of diving gannets (Lee & Reddish, 1981; see Section 2) as a dynamical system continuously guided by τ(φ). The unfolded and folded wing positions are modelled as two stable point attractors, and the stability of the transition is regulated by a limit cycle. During the dive TTC information is generated and the optic quantity
1/τ(φ) regulates the (de)stabilization of the two postures. If 1/τ(φ) is greater than the critical value, the folded wing posture destabilizes and simultaneously the unfolded wing posture stabilizes. As a result of the relative change in stability the system follows a limit cycle until it stabilizes onto the unfolded wing posture. In Chapter 17 of this volume Frank Zaal and Reinoud Bootsma describe how the Schoner model can be applied to the coordination of prehension. Zaal et al. (1998) modelled hand opening and hand closing as point attractors that are connected by a limit cycle. The movement of the hand towards the object generates the optical information that changes the stability of the point attractors. When reaching towards an object, the angle subtended by the hand (φ) and the angle subtended by the gap between hand and object (θ) decrease. So, the aforementioned optical variable τ(φ,θ) might be used to influence the stability of the hand opening and hand closing regimes. In this continuous dynamical information-movement coupling perspective, the initiation of hand closure does not have to occur at a constant value of τ(φ,θ), although the threshold value determines when hand opening becomes unstable and hand closing stable. Note, however, that they did not examine whether τ(φ,θ) was involved; they simply adopted TTC₁ as the relevant actor-environment property.
5.2 The required velocity strategy and vector integration to endpoint models
Peper et al. (1994) formulated a continuous control model in order to account for their data. In Peper et al.'s experiment, participants had to catch balls that were swung towards them while their hand was restricted to move along the lateral axis. According to a predictive strategy the kinematics of the catch would be unaffected as long as the ball trajectories converge at the same time on the same interception point. In contrast, the kinematics of the catch varied systematically with approach angle, even when the future time and place of interception were similar. The authors concluded that hand movements were continuously coupled to a variable that brought the hand to the right place at the right time. This kind of coupling does not demand accurate predictions: "accuracy is achieved during the unfolding of the act" (Peper et al., 1994, p. 610). They proposed a strategy entailing a continuous regulation of hand displacement velocity on the basis of information that specifies the velocity required to ensure interception. The currently required velocity is specified by the ratio of the difference in lateral position between hand and ball to the time remaining before the ball reaches the interception point. So, the model assumes that, besides picking up information that specifies TTC₁, participants can also
pick up (kinaesthetic and/or optical) variables that specify the position and speed of the hand and the lateral position of the ball. The required velocity (RV) model has been elaborated (Bootsma, Fayt, Zaal, & Laurent, 1997) and empirically supported (Montagne, Fraisse, Ripoll, & Laurent, 2000; Montagne, Laurent, Durey, & Bootsma, 1999). Montagne and co-workers were especially interested in the movement reversals that occurred when the hand started at the interception point. There was a pattern to the reversals with respect to the approach angle of the ball, in that the number of left-right reversals was higher for outward ball trajectories and the number of right-left reversals was higher for inward ball trajectories. Recently, however, it was shown that the RV-model and its extension failed to simulate the behaviour they were designed to explain (Dessing et al., 2002). Dessing et al. (2002) remedied this deficiency by reformulating the RV-model in terms of a neural network based on the Vector-Integration-To-Endpoint (VITE) model developed by Bullock and Grossberg (1988). The VITE model was originally constructed to explain hand reaching towards stationary targets. The VITE model in its simplest form consists of four elements and their dynamic interactions. The target position vector stage represents the perceived location of the target. The present position vector stage represents the actual position of the hand. A difference vector is continuously established between the target position vector and the present position vector. It specifies the distance and direction that the hand has to travel to reach the target. A separate GO-signal gates this difference vector command that actuates the movement. The GO-signal is under voluntary control and, if it becomes positive, the present position vector is updated and the movement continues to unfold. A new present position vector is established in the direction of the target position vector at a rate proportional to the difference vector times the GO-signal. In this form, the VITE model explains trajectory formation during reaching movements to stationary targets, in which the velocity of the hand becomes zero as it approaches the target. Intercepting moving targets is fundamentally different from reaching towards stationary targets, as the trajectory of the hand is not only spatially but also temporally constrained; the hand has to be at a certain target location at a particular time. This implies that time-dependent parameters have to be incorporated into the VITE-model. To reproduce the qualitative effects observed in the lateral interception experiments of Montagne et al. (1999, 2000), Dessing et al. (2002) translated the required velocity model of Peper et al. into terms of the VITE model. First, the GO-signal was modulated according to the first-order TTC, in that the gain of the GO-signal increases as TTC becomes smaller (the RVITE-model). Second, to improve the control of the hand velocity, they included, parallel to the difference vector, a relative velocity vector (the RRVITE-model). This final model closes the gap between target position and
hand position in the available time span, while avoiding superfluous movements given the ongoing movement of object and hand. All the above-mentioned continuous control models tried to tackle the question of how information relates to movement trajectories by assuming TTC₁ as an input variable. In the present Chapter it was demonstrated that, besides TTC₁, other (non-specifying) variables might be exploited to regulate interceptive timing. So, more in line with the approach taken here, an alternative route to finding better fits between models and data may be to implement other input variables besides TTC₁, instead of elaborating and adjusting the existing continuous control models. An example of such an approach is the recent work of Michaels, Jacobs, and Bongers (2003). Instead of elaborating the simple VITE model to account for intercepting moving objects (Dessing et al., 2002), Michaels et al. (2003) changed the input variables of the simple VITE model. They implemented the optical variable lateral-velocity-to-expansion ratio to specify continuously changing target positions along the lateral axis. So, in their VITE model, the target position vector changes in accordance with the lateral-velocity-to-expansion ratio. A previous study by Jacobs and Michaels (2003) showed that hand movements in a lateral catching task were better geared to the lateral-velocity-to-expansion ratio than to the momentary lateral ball position and TTC₁. Momentary values of this ratio often do not specify the future passing distance of the ball in a way that permits catchers to make a ballistic movement to the predicted position. This variable can lead to successful interceptive movement only if continuously coupled to movement production with a properly calibrated control law. By implementing this new input variable into the simple VITE model they were able to predict the individual hand trajectories quantitatively and accurately.
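To make the VITE logic described above concrete, here is a minimal discrete-time sketch (ours; the parameter values, the Euler integration step and the simple GO-signal ramp are assumptions, and it implements only the basic Bullock and Grossberg (1988) circuit for a stationary target, not the RVITE/RRVITE extensions):

```python
import numpy as np

def vite_reach(target, x0=0.0, gamma=25.0, go_max=8.0, dt=0.001, duration=1.5):
    """Minimal VITE circuit: a difference vector V tracks (target - present position),
    and the present position P integrates the GO-gated, rectified difference vector."""
    steps = int(duration / dt)
    p, v = x0, 0.0
    trajectory = np.empty(steps)
    for i in range(steps):
        go = go_max * min(i * dt / 0.3, 1.0)     # assumed GO-signal: ramp, then constant
        v += gamma * (-v + target - p) * dt      # dV/dt = gamma * (-V + T - P)
        p += go * max(v, 0.0) * dt               # dP/dt = GO * [V]+
        trajectory[i] = p
    return trajectory

print(round(vite_reach(target=0.4)[-1], 3))      # hand position approaches the target (0.4)
```

On this reading, the RVITE and RRVITE extensions discussed above would modulate the GO-signal by first-order TTC and add a relative velocity channel, respectively; they are not reproduced here.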
6. Summary and conclusions
The present Chapter reviewed studies on the information-based regulation of interceptive timing. We started with a description of the TTC-model. This model fitted well with the dominant view that the process of perception precedes the process of motor programming. From this perspective a performer is dependent on an accurate prediction of when an object arrives at a certain point in order to program the effector movement to the future interception point. So, interceptive timing was thought to be based on the perception of TTC, provided by the optical variable τ(φ). This original model was fatally flawed in many ways. For example, τ(φ) was disqualified as being the only relevant source of information for catching and hitting tasks because (1)
the interception point does not coincide with the point of observation, (2) binocular vision enhances performance, and (3) the target will not always move with a constant velocity. However, the inexorable belief in the "information-perception-movement" sequence has led researchers to assume that information about TTC is indispensable in the regulation of interceptive actions. The emphasis on the process of perception constrained the search for alternative information sources. Besides τ(φ), other sources were proposed that also specified TTC (e.g. τ(Δ) and τ(φ,θ)). It is now well accepted that actors do not regularly control their actions on the basis of a constant TTC strategy. The finding that timing depends on object size and approach velocity led to the conviction that interceptive actions may be based on the pick-up of information sources that do not specify TTC. During natural interceptions actors can continuously view the ball up to and including the point of interception. They do not need to perceptually construct the ball's trajectory for prediction. Therefore, every source of information that is in some way related to the approach of the ball might be used to regulate interceptions. Today, the original TTC-model is rejected and the original formulations are subject to revision. The second part of this Chapter reviewed alternative models for the information-based regulation of interceptive actions. We no longer presented the perception of TTC as an intermediate phase in interceptive timing. Instead, we proposed that information directly regulates action. Furthermore, we suggested that not a single variable, but multiple variables might constrain natural interceptive actions. The use of multiple information sources can be actualised by the processes of selection and combination of sources. In the event of selection, an information source is picked up depending on the task at hand, and actors adapt to different performance settings by a flexible change in information-movement couplings. In the event of combination, all sources contribute conjointly, and recalibration may occur when one of the sources is manipulated. Likely, both processes are involved, but more research is needed to understand the nature of the multiple sources of information used to constrain natural interceptive actions. The question of what information contributes to interceptive timing cannot be posed in isolation from the question of how information is used to regulate timing. The original TTC model assumed a critical timing strategy, but there is also online regulation. The final Section of the present Chapter discussed different online models. Most modellers did not implement sources of information, but assumed TTC₁ as an input variable. Having demonstrated that perception of TTC₁ is not an indispensable phase in the regulation of interceptive timing, it seems challenging to include other information sources in these online control strategies as well.
To account for the direct influence of a multitude of variables on the timing of interceptive actions we introduced the term "directed action". With this denomination we referred to a dynamical systems perspective in which the entire dynamics of the system is given by a synergy in which the multiple information-related dynamics are assembled, depending on organismic, environmental, and task constraints. Taken as a whole the present chapter shows that we have to abandon the view that perception of TTC is an indispensable phase in the regulation of interceptive timing. Instead we have to increase our understanding of the large variety of information sources that support various functional interceptive behaviours. The issue of how information and control strategies differ as a function of task-constraints is highly important.
REFERENCES
Alderson, G. J. K., Sully, D. J. & Sully, H. G. (1974). An operational analysis of a one-handed catching task using high-speed photography. Journal of Motor Behavior, 6(4), 217-226.
Banister, H. & Blackburn, J. M. (1931). An eye factor affecting efficiency at ball games. British Journal of Psychology, 21, 382-384.
Bennett, S. J., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch I. Effects of telestereoscopic viewing. Experimental Brain Research, 129(3), 362-368.
Bennett, S. J., van der Kamp, J., Savelsbergh, G. J. P. & Davids, K. (2000). Discriminating the role of binocular information in the timing of a one-handed catch. The effects of telestereoscopic viewing and ball size. Experimental Brain Research, 135(3), 341-347.
Bootsma, R. J., Fayt, V., Zaal, F. T. J. M. & Laurent, M. (1997). On the information-based regulation of movement: What Wann (1996) may want to consider. Journal of Experimental Psychology: Human Perception and Performance, 23(4), 1282-1289.
Bootsma, R. J. & Oudejans, R. R. D. (1993). Visual information about time-to-collision between two objects. Journal of Experimental Psychology: Human Perception and Performance, 19(5), 1041-1052.
Bootsma, R. J. & Van Wieringen, P. C. W. (1990). Timing an attacking forehand drive in table tennis. Journal of Experimental Psychology: Human Perception and Performance, 16(1), 21-29.
Bullock, D. & Grossberg, S. (1988). Neural dynamics of planned arm movements: emergent invariants and speed-accuracy properties during trajectory formation. Psychological Review, 95(1), 49-90.
Cavallo, V. & Laurent, M. (1988). Visual information and skill level in time-to-collision estimation. Perception, 17(5), 623-632.
Collewijn, H. & Erkelens, C. J. (1990). Binocular eye movements and the perception of depth. Reviews of Oculomotor Research, 4, 213-261.
Cutting, J. E. (1986). Perception with an eye for motion. Cambridge, MA: The MIT Press.
Cutting, J. E. (1991). Four ways to reject directed perception. Ecological Psychology, 3(1), 25-34.
Dessing, J. C., Bullock, D., Peper, C. L. & Beek, P. J. (2002). Prospective control of manual interceptive actions: Comparative simulations of extant and new model constructs. Neural Networks, 15(2), 163-179.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gray, R. & Regan, D. (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38(4), 499-512.
Helmholtz, H. von (1962). Physiological optics. New York: Dover.
Heuer, H. (1993). Estimates of time to contact based on changing size and changing target vergence. Perception, 22(5), 549-563.
Jacobs, D. M. (2001). On perceiving, acting and learning: Toward an ecological approach anchored in convergence. Doctoral dissertation, Vrije Universiteit, Amsterdam.
Johnson, J. (2002). One eyed jack. Retrieved October 2, 2002, from http://oneeyedjack.rodeoannouncer.com/index.htm.
Judge, S. J. & Bradford, C. M. (1988). Adaptation to telestereoscopic viewing measured by one-handed ball-catching performance. Perception, 17(6), 783-802.
Knowles, W. B. & Carel, W. L. (1958). Estimating time to collision. American Psychologist, 13, 405-406.
Lacquaniti, F., Borghese, N. A. & Carrozzo, M. (1992). Internal models of limb geometry in the control of hand compliance. Journal of Neuroscience, 12(5), 1750-1762.
Lacquaniti, F., Carrozzo, M. & Borghese, N. A. (1993). Time-varying mechanical behavior of multijointed arm in man. Journal of Neurophysiology, 69(5), 1443-1464.
Lacquaniti, F. & Maioli, C. (1989). The role of preparation in tuning anticipatory and reflex responses during catching. Journal of Neuroscience, 9(1), 134-148.
Laurent, M., Montagne, G. & Durey, A. (1996). Binocular invariants in interceptive tasks: A directed perception approach. Perception, 25(12), 1437-1450.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5(4), 437-459.
Lee, D. N., Lishman, J. R. & Thomson, J. A. (1982). Regulation of gait in long jumping. Journal of Experimental Psychology: Human Perception and Performance, 8, 448-459.
Lee, D. N. & Reddish, D. E. (1981). Plummeting gannets: A paradigm of ecological optics. Nature, 293, 293-294.
Lee, D. N. & Young, D. S. (1985). Visual timing of interceptive action. In D. J. Ingle, M. Jeannerod & D. N. Lee (Eds.), Brain mechanisms and spatial vision (pp. 1-30). Dordrecht: Martinus Nijhoff.
Lee, D. N., Young, D. S., Reddish, D. E., Lough, S. & Clayton, T. M. (1983). Visual timing in hitting an accelerating ball. Quarterly Journal of Experimental Psychology, 35A, 333-346.
Lenoir, M., Musch, E. & La Grange, N. (1999). Ecological relevance of stereopsis in one-handed ball catching. Perceptual and Motor Skills, 89, 495-508.
Li, F. X. & Laurent, M. (1994). Effect of practice on intensity coupling and economy of avoidance skill. Journal of Human Movement Studies, 27, 189-200.
Li, F. X. & Laurent, M. (1995). Occlusion rate of ball texture as a source of velocity information. Perceptual and Motor Skills, 81(3 Pt 1), 871-880.
Mazyn, L., Lenoir, M., Montagne, G. & Savelsbergh, G. J. P. (2001). Do we need binocular depth vision to control the timing of a catch? In N. Gantchev (Ed.), From basic motor control to functional recovery II: Towards an understanding of the role of motor control from simple systems to human performance. Sofia: Professor Marin Drinov Academic Publishing House.
McIntyre, J., Zago, M., Berthoz, A. & Lacquaniti, F. (2001). Does the brain model Newton's laws? Nature Neuroscience, 4(7), 693-694.
Michaels, C. F. (2000). Information, perception, and action: What should ecological psychologists learn from Milner and Goodale (1995)? Ecological Psychology, 12(3), 241-258.
Michaels, C. F. & Beek, P. J. (1995). The state of ecological psychology. Ecological Psychology, 7(4), 259-278.
Michaels, C. F. & Carello, C. (1981). Direct perception. Englewood Cliffs, NJ: Prentice-Hall.
Michaels, C. F., Jacobs, D. M. & Bongers, R. M. (2003). Predicting lateral interceptive hand movements. (Submitted).
Michaels, C. F., Zeinstra, E. B. & Oudejans, R. R. D. (2001). Information and action in punching a falling ball. Quarterly Journal of Experimental Psychology, 54A(1), 69-93.
Milner, A. D. & Goodale, M. A. (1995). The visual brain in action. Oxford, UK: Oxford University Press.
Montagne, G., Fraisse, F., Ripoll, H. & Laurent, M. (2000). Perception-action coupling in an interceptive task: First-order time-to-contact as an input variable. Human Movement Science, 19, 59-72.
Montagne, G., Laurent, M., Durey, A. & Bootsma, R. J. (1999). Movement reversals in ball catching. Experimental Brain Research, 129(1), 87-92.
Peper, C. L., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: how to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20(3), 591-612.
Purdy, W. C. (1958). The hypothesis of psychophysical correspondence in space perception. (Doctoral dissertation, Cornell University).
Regan, D. (1997). Visual factors in hitting and catching. Journal of Sports Sciences, 15(6), 533-558.
Regan, D. & Beverley, K. I. (1979). Binocular and monocular stimuli for motion in depth: changing-disparity and changing-size feed the same motion-in-depth stage. Vision Research, 19(12), 1331-1342.
Regan, D. & Hamstra, S. J. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Research, 33(4), 447-462.
Regan, D. & Vincent, A. (1995). Visual processing of looming and time to contact throughout the visual field. Vision Research, 35(13), 1845-1857.
Rind, F. C. & Simmons, P. J. (1999). Seeing what is coming: building collision-sensitive neurones. Trends in Neurosciences, 22(5), 215-220.
Runeson, S., Jacobs, D. M., Andersson, I. E. K. & Kreegipuu, K. (2001). Specificity is always contingent on constraints: Global versus individual arrays is not the issue. Behavioral and Brain Sciences, 24(2), 240-241.
Rushton, S. K. & Wann, J. P. (1999). Weighted combination of size and disparity: a computational model for timing a ball catch. Nature Neuroscience, 2(2), 186-190.
Savelsbergh, G. J. P., Whiting, H. T. A. & Bootsma, R. J. (1991). Grasping tau. Journal of Experimental Psychology: Human Perception and Performance, 17(2), 315-322.
Savelsbergh, G. J. P., Whiting, H. T. A., Burden, A. M. & Bartlett, R. M. (1992). The role of predictive visual temporal information in the coordination of muscle activity in catching. Experimental Brain Research, 89(1), 223-228.
Savelsbergh, G. J. P., Whiting, H. T. A., Pijpers, J. R. & Van Santvoord, A. M. M. (1993). The visual guidance of catching. Experimental Brain Research, 93, 146-156.
Savelsbergh, G. J. P. (1995). Catching "Grasping tau": comments on J. R. Tresilian (1994). Human Movement Science, 14, 125-127.
Savelsbergh, G. J. P. & van der Kamp, J. (2000). Information in learning to co-ordinate and control movements: Is there a need for specificity of practice? International Journal of Sport Psychology, 31, 1-18.
Schoner, G. (1990). A dynamic theory of coordination of discrete movement. Biological Cybernetics, 63(4), 257-270.
Schoner, G. (1994). Dynamic theory of action-perception patterns - the time-before-contact paradigm. Human Movement Science, 13(3-4), 415-439.
Shankar, S. & Ellard, C. (2000). Visually guided locomotion and computation of time-to-collision in the mongolian gerbil (Meriones unguiculatus): the effects of frontal and visual cortical lesions. Behavioural Brain Research, 108(1), 21-37.
Smeets, J. B. J. & Brenner, E. (1994). The difference between the perception of absolute and relative motion: A reaction time study. Vision Research, 34(2), 191-195.
Smith, M. R. H., Flach, J. M., Dittman, S. M. & Stanard, T. (2001). Monocular optical constraints on collision control. Journal of Experimental Psychology: Human Perception and Performance, 27(2), 395-410.
Stretch, R. A., Bartlett, R. & Davids, K. (2000). A review of batting in men's cricket. Journal of Sports Sciences, 18(12), 931-949.
Sun, H. J. & Frost, B. J. (1998). Computation of different optical variables of looming objects in pigeon nucleus rotundus neurons. Nature Neuroscience, 1(4), 296-303.
Todd, J. T. (1981). Visual information about moving objects. Journal of Experimental Psychology: Human Perception and Performance, 7(4), 795-810.
Tresilian, J. R. (1990). Perceptual information for the timing of interceptive action. Perception, 19(2), 223-239.
Tresilian, J. R. (1993). Four questions of time to contact: a critical examination of research on interceptive timing. Perception, 22(6), 653-680.
Tresilian, J. R. (1994a). Approximate information sources and perceptual variables in interceptive timing. Journal of Experimental Psychology: Human Perception and Performance, 20(1), 154-173.
Tresilian, J. R. (1994b). Perceptual and motor processes in interceptive timing. Human Movement Science, 13, 335-373.
Tresilian, J. R. (1999a). Analysis of recent empirical challenges to an account of interceptive timing. Perception and Psychophysics, 61(3), 515-528.
Tresilian, J. R. (1999b). Visually timed action: time-out for 'tau'? Trends in Cognitive Sciences, 3(8), 301-310.
Tyldesley, D. A. & Whiting, H. T. A. (1975). Operational timing. Journal of Human Movement Studies, 1, 172-177.
Van der Kamp, J. (1999). The information-based regulation of interceptive action. (Doctoral dissertation, Vrije Universiteit, Amsterdam).
Van der Kamp, J., Bennett, S. J., Savelsbergh, G. J. P. & Davids, K. (1999). Timing a one-handed catch II. Adaptation to telestereoscopic viewing. Experimental Brain Research, 129(3), 369-377.
Van der Kamp, J., Savelsbergh, G. J. P. & Smeets, J. B. (1997). Multiple information sources in interceptive timing. Human Movement Science, 16(6), 787-821.
Von Hofsten, C. (1983). Catching skills in infancy. Journal of Experimental Psychology: Human Perception and Performance, 9(1), 75-85.
Wang, Y. & Frost, B. J. (1992). Time to collision is signalled by neurons in the nucleus rotundus of pigeons. Nature, 356(6366), 236-238.
Wann, J. P. (1996). Anticipating arrival: Is the tau margin a specious theory? Journal of Experimental Psychology: Human Perception and Performance, 22(4), 1031-1048.
Wann, J. P. & Rushton, S. (1995). Grasping the impossible: Stereoscopic virtual balls. In B. G. Bardy, R. J. Bootsma & Y. Guiard (Eds.), Studies in perception and action III (pp. 207-210). Hillsdale, NJ: Lawrence Erlbaum Associates.
Whiting, H. T. A. & Sharp, R. H. (1973). Visual occlusion factors in a discrete ball catching task. Journal of Motor Behavior, 6, 11-16.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1998). Coordination in prehension. Information-based coupling of reaching and grasping. Experimental Brain Research, 119(4), 427-435.
CHAPTER 20 How Time-to-Contact is Involved in the Regulation of Goal-Directed Locomotion
Gilles Montagne Université de la Méditerranée, Marseille, France
Aymar De Rugy The Pennsylvania State University, University Park, USA
Martinus Buekers Katholieke Universiteit Leuven, Leuven, Belgium
Alain Durey Université de la Méditerranée, Marseille, France
Gentaro Taga University of Tokyo & PRESTO, JST, Japan
Michel Laurent Université de la Méditerranée, Marseille, France
ABSTRACT This chapter is designed to show how time-to-contact (TTC) can be involved in the regulation of goal-directed locomotion. Two integrative models relying (at least partly) on the use of TTC are presented. The first one links a perceptual variable to a movement variable and allows an agent to get to the right place at the right time to catch a fly ball. The second one links a perceptual variable to the dynamics of the neuro-musculo-skeletal system and allows an agent to put a foot on a target placed on the floor. The status of TTC in these models as well as the respective contributions of these models to a better understanding of the mechanisms underlying goal-directed actions are discussed.
1. Introduction
Most of the actions we perform daily strive towards a goal. Generally this goal achievement relies on a close dialogue between the perceptual and the motor components of the action. Goal-directed locomotion is one of the paradigms frequently used by researchers interested in discovering the mechanisms underlying this dialogue. This statement is corroborated by the increasing publication rate on this topic and, among others, by a recent special issue of Ecological Psychology dedicated to «Visually Controlled Locomotion and Orientation» (vol. 10, 1998). It is worth mentioning that goal-directed locomotion groups together numerous tasks characterized by various spatio-temporal constraints. We can easily differentiate heading tasks, relying on the perception of self-motion direction (e.g., Warren & Hannon, 1988), from locomotion pointing tasks, relying on the perception of the spatio-temporal vicinity of a target (e.g., Montagne et al., 2000a). This chapter is primarily concerned with the latter aspect, that is, with tasks involving precise positioning of the foot or the body at the right place and the right time. These tasks include, for example, positioning the foot on a visible target on the floor while walking (De Rugy et al., 2001) or positioning the body at the right place and the right time to intercept a fly ball (McLeod & Dienes, 1993). The regulation of goal-directed locomotion relies on the use of control mechanisms linking perceptual input to motor output. It is obvious that the kind of mechanisms described (and consequently the modeling of these mechanisms) depends highly on the metatheoretical background embraced. Warren (1998) differentiates control mechanisms relying on the construction and use of internal representations of the world (model-based control) (Haruno et al., 2001) from those involving the use of information available in the perceptual flow (information-based control) (Warren et al., 2001). While the former rests on a cognitive approach to perception and action (Marr, 1982), the latter rests on the ecological approach to perception and action (Gibson, 1979). Our work comes within the framework of the ecological approach to perception and action (Gibson, 1979). Gibson (e.g., 1958) emphasized the potential richness of the optic flow available to an agent engaged in a goal-directed locomotion task. He showed that each displacement gives rise to a specific optic flow. This specificity translates into optical invariants characterizing the displacement produced in relation to the environment. While underscoring this kinesthetic function of vision, Gibson opened the door to very parsimonious control mechanisms. Since the original proposal of Gibson, numerous theoretical and empirical studies have shown that (i) task-specific control information (Warren, 1998) exists (e.g., Lee, 1976) and that (ii) this information substrate can be used directly to control the action (e.g., Savelsbergh et al., 1991; De Rugy et al.,
2001). The theoretical work was designed to identify which information specifies a given property of the agent-environment system (Lee, 1976; Bootsma & Peper, 1992; Laurent et al., 1996). This specification process allows the agent to be informed about the validity of the movement/displacement produced in relation to the task at hand. The second kind of study was designed to investigate whether the information described is involved in the regulation of action. Some elegant experimental manipulations of the optic flow revealed that information can actually take part in the control of goal-directed action (e.g., Savelsbergh et al., 1991). Nevertheless, the results are controversial because it is not clear how the information participates in the regulation of movement/displacement (Bootsma et al., 1997; Montagne & Laurent, 2002). For example, numerous studies (e.g., McLeod & Dienes, 1993) have shown vertical optical information to be a good candidate for displacement control when one catches a fly ball. Unfortunately, no model indicates how this information could be used in the control process. While there is no doubt that cancelling optical acceleration guarantees success, nothing is certain about how the reference value (optical acceleration equal to zero) is reached and maintained. The present controversies will be overcome only when a testable conception of the way information is used to regulate behavior is developed (see Montagne et al., 1998 for a review). This chapter is designed to present two models showing how TTC can take part in the regulation of locomotion pointing tasks. Note that TTC is a property of the agent-environment system and that several sources of information specifying TTC are available in the optic flow. Consequently the two models presented can be implemented indifferently on the basis of the physical properties or of their optical counterparts.
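The claim that cancelling optical acceleration guarantees success in fly-ball catching can be checked numerically with a small sketch (ours, not from the chapter; the launch parameters and the 5 m offset are illustrative assumptions). It only verifies the kinematic fact behind the strategy, not a control law, which is precisely the gap the authors point out: when the fielder stands at the landing point, the tangent of the gaze elevation angle rises linearly (zero optical acceleration), and it does not when the fielder stands elsewhere.

```python
import numpy as np

g, v_x, v_y = 9.81, 10.0, 15.0       # assumed launch parameters (m/s^2, m/s)
T = 2 * v_y / g                      # flight time of the fly ball
dt = 0.01
t = np.arange(dt, T - dt, dt)
y = v_y * t - 0.5 * g * t**2         # ball height
x_land = v_x * T                     # landing point

for fielder_x in (x_land, x_land + 5.0):      # standing at vs. 5 m beyond the landing point
    horizontal_gap = fielder_x - v_x * t      # ball-to-fielder horizontal separation
    tan_alpha = y / horizontal_gap            # tangent of the gaze elevation angle
    optical_acc = np.gradient(np.gradient(tan_alpha, dt), dt)
    print(round(float(np.max(np.abs(optical_acc[5:-5]))), 4))
# At the landing point tan(alpha) rises linearly (optical acceleration ~ 0);
# away from it the optical acceleration is clearly non-zero, which is why
# cancelling it brings the fielder to the right place.
```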
2. The modeling

2.1 How to catch a fly ball

Consider the situation illustrated in Figure 1. An agent runs to catch a fly ball that has been hit in a field defined by the external coordinate system R. The agent's movement relative to the ball can be most directly controlled within another coordinate system, R', which is anchored on the projection onto the horizontal plane (going through the head) of the ball's current position (Figure 1 and Figure 2) (von Hofsten, 1983; Zaal et al., 1999). Within this coordinate system, axis X' is defined by the origin of R' and the agent's location.
Figure 1: Representation of the displacements of a fly ball and an outfielder. The displacements can be described in reference to a coordinate system (R') anchored to the moving ball or to an external coordinate system (R) anchored to a static background. The angle of gaze elevation is also represented (φ3 at time t3).
In the coordinate system R', the velocity of the agent (V(R')) can be defined (as any vector) by a direction and an amplitude. The "directional problem" is solved as soon as the velocity vector is brought into alignment with axis X'¹ (t7 in Figure 2). That amounts to keeping the direction of axis X' invariant during the displacement. In fact, this is the case when, in the external coordinate system R, the agent adopts the same horizontal velocity as that of the ball (Vball(R)) (Figure 3). The agent is then left with the "amplitude problem": he or she has to determine the amount of acceleration along the axis X' needed to succeed in the task. In this chapter we will assume that the agent is able to keep the direction of axis X' invariant (e.g., Lenoir et al., 2002; Chardenon et al., 2002, under revision) and we will specifically examine the control mechanism underlying the solving of the "amplitude problem".
¹ This leads to an apparent inconsistency that will be removed further on. The velocity of the agent in R corresponds to the sum of his/her velocity along X' and the horizontal velocity of the fly ball (Equation 4). Consequently, a displacement directed towards the ball in R' (Figure 2) can correspond, depending on the amplitude of V(R'), to a displacement directed towards the interception point in R (Figure 3).
Figure 2: Overhead view of the displacements of both the agent and the ball from the initiation of the agent's displacement (t0) to ball interception (t10). In the coordinate system R', the velocity of the agent (V(R')) can be defined (as any vector) by a direction and an amplitude. The "directional problem" is solved as soon as the velocity vector is brought into alignment with axis X' (t7 in this figure). The agent is then left with the "amplitude problem"; he or she has to determine the amount of acceleration along the axis X' needed to succeed in the task.
Our formalization is inspired by a model proposed by Peper and Bootsma (Peper et al., 1994; Bootsma et al., 1997; Bootsma, 1998) and tested empirically by Montagne and collaborators (Montagne et al., 1999, 2000b). According to this model, the control of a given action entails the regulation of the produced acceleration on the basis of a perceived difference between current and required behaviors. The same logic can apply to the task under consideration. The acceleration to be produced along the X' axis (A(R')) corresponds to the difference between the velocity required to succeed in the task (Vreq(R')) and the current velocity of the agent (V(R')) (Equation 1):

A(R') = α·Vreq(R') − β·V(R')    (1)

where α and β are constants.
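For instance, with equal weights α = β = k, Equation 1 reduces to A(R') = k·(Vreq(R') − V(R')): the agent accelerates in proportion to the gap between required and current velocity, and the acceleration command vanishes as soon as the two coincide. Unequal weights bias the agent towards systematically exceeding or lagging behind the required velocity, a point taken up again in the Discussion.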
To arrive at the right place at the right time, the agent's required velocity along the X' axis (Vreq(R')) corresponds to the ratio between the current distance separating the agent from the projection of the ball onto the horizontal plane going through the eye and the time remaining before the ball reaches this plane. This required velocity is optically specified by information present in the optic flow produced by the combined displacements of the agent and the ball (Equation 2):

Vreq(R') = D / TTC = g / (2·d(tan φ)/dt)    (2)
where Vreq(R') is the velocity required along the X' axis, D is the current distance between the agent and the projection of the ball onto the plane going through the eye, TTC is the time remaining before the ball reaches the plane going through the eye, g is the gravitational constant, and φ is the angle of gaze elevation (Figure 1). On the other hand, the current velocity of the agent in the coordinate system R' (V(R')) corresponds to the rate of change of the distance between the agent and the projection of the ball onto the plane going through the eye. Once again, this current velocity (V(R')) is optically specified (Equation 3); the agent just has to know the ball's size (r) in advance.
V(R') = dD/dt = −r·[ cos φ·(1 + tan²θ)/tan²θ · dθ/dt + (sin φ/tan θ) · dφ/dt ]    (3)
where V(R') is the velocity of the agent along the X' axis, dD/dt is the rate of change of the current distance between the agent and the projection of the ball onto the plane going through the eye, r is the radius of the ball, φ is the angle of gaze elevation, and θ is the angle subtended by the ball at the point of observation.

We turn now to the predictions that can be made on the basis of this model. Relative to the ground coordinate system R, the agent's velocity (V(R)) equals the sum of his/her velocity along X' (V(R')) and the horizontal velocity of the fly ball (Vball(R)) (Figure 3A and Equation 4).
Figure 3: Predictions following from the model. A. In the coordinate system R the velocity of the agent (V(R)) corresponds to the algebraic sum of the displacement velocity of the agent along the X' axis (V(R')) and the horizontal velocity of the ball in the coordinate system R (Vball(R)). B. When the velocity along the X' axis (V(R')) is identical to the displacement velocity required (Vreq(R')), the agent reaches the landing point after a rectilinear displacement. C and D. When the velocity along the X' axis (V(R')) differs from the displacement velocity required (Vreq(R')), the agent reaches the landing point after a curvilinear displacement.

V(R) = V(R') + Vball(R)    (4)
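To make the workings of Equations 1, 2 and 4 concrete, the following toy simulation integrates the model for a single approach along X'. It is a minimal sketch, not the implementation behind the original results: the gains, initial conditions, ball parameters and Euler time step are arbitrary illustrative values, and the 2-D geometry of Figures 1-3 is deliberately left out, so the sketch only illustrates how the acceleration command behaves, not the path-shape predictions listed in the next paragraph.

```python
G = 9.81          # gravitational acceleration (m/s^2)
W = 14.0          # vertical launch velocity of the ball (m/s), illustrative
T = 2 * W / G     # flight time before the ball falls back to eye height (s)
U = 6.0           # horizontal velocity of the ball in R (m/s), illustrative

def simulate(v0, d0=12.0, alpha=4.0, beta=4.0, dt=0.01):
    """Euler integration of A(R') = alpha*Vreq(R') - beta*V(R') (Equation 1),
    with Vreq(R') = D / TTC (Equation 2, physical form). Returns samples of
    the agent's velocity in R, obtained via Equation 4."""
    d, v, t, samples = d0, v0, 0.0, []
    while t < T - dt and d > 0.0:
        ttc = T - t                      # time left before the ball reaches eye height
        v_req = d / ttc                  # required velocity (Equation 2)
        a = alpha * v_req - beta * v     # acceleration command (Equation 1)
        v += a * dt
        d -= v * dt                      # remaining gap along X'
        t += dt
        samples.append((t, v + U))       # velocity in the external frame R (Equation 4)
    return samples

# Started at the required velocity, the command stays at zero (up to rounding)
# and the velocity in R remains constant; started at rest, the command keeps
# regulating the displacement until the gap along X' is closed.
matched = simulate(v0=12.0 / T)
late_start = simulate(v0=0.0)
```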
Among other things, this model leads to the following predictions. When the current velocity (V(R')) matches the required velocity (Vreq(R')) (Figure 3B), (i) the agent moves directly toward the interception place and (ii) the displacement velocity is constant. When there is a difference between the current velocity and the required velocity, (i) the agent will follow a curvilinear path (Figure 3C-D) and (ii) the velocity profile will exhibit an inverted-U shape.

2.2 How to put a foot on a target

The role of vision in regulating this type of action has been extensively investigated (e.g., Berg et al., 1994). Some behavioral studies have shown temporal information (i.e., information specifying TTC or a variant of it) to be essential for the regulation process (Lee et al., 1982; Warren et al., 1986; De Rugy et al., 2000a, 2002a).
Nevertheless, to our knowledge, very few models have been proposed that formalize the way information about TTC is used. Warren et al. (1986) showed that continuous locomotor pointing tasks can be performed by simply relating the vertical impulse to be produced to an optical specification of the temporal gap separating two targets. Nevertheless, recent models incorporating neurophysiological data on basic locomotion and its adaptation (Taga, 1991, 1995a, b) have opened the door to much more refined formalizations of the mechanisms underlying perceptual-motor coordination. In a recent study (De Rugy et al., 2002b), we took the model elaborated by Taga (1998) as the starting point of an integrative model of visuo-locomotor pointing tasks. The next paragraphs provide a very general presentation of Taga's model.

The initial model elaborated by Taga (1991, 1995a, b) presents locomotion as emerging from the interplay between neural, muscular, and skeletal components. The musculo-skeletal system is composed of eight segments and twenty single-joint or double-joint muscles, while the neural system contains a rhythmic generator (assumed to reflect the activity of a central pattern generator) composed of seven pairs of neural oscillators, each controlling the movements of a corresponding joint (Figure 4). Moreover, a global state indicates the current position within the locomotor cycle and is used to produce a phase-dependent modulation of the motor command. Finally, a posture controller is in charge of maintaining the stability of the stance limb and an upright posture.

Recently, Taga (1998) integrated into the model a mechanism in charge of step-length regulation. Modification of step length was possible on the very last step preceding obstacle avoidance through modulation of both ankle extensor activity (push-off phase) and hip flexor activity (end of the push-off phase and beginning of the swing phase). The effect of the motor command is thus phase-dependent and perfectly integrated into the rhythmic generator, since it modulates only the gain of the generator's output to the muscles concerned. The regulations produced were under the control of a Discrete Movement Generator, which was in charge of parameterizing the step-length command on the basis of the obstacle location. Our model (De Rugy et al., 2002b) extends Taga's model (1998) in that the step-length modulation command (q), which is determined by a Visuo-Motor Controller (Figure 4), is continuously related to optical information about TTC, which gives rise, if necessary, to regulation of step length several steps from the target (Equation 5). The next section is designed to show how this mechanism operates.
q = −φ·A    (5)
where q is the parameter determining the strength of hip and ankle muscle modulation, φ is a constant, and A is the visual parameter.
Figure 4: Model of the neuro-musculo-skeletal system for basic locomotion and its intentional adaptation (Taga, 1998). a) Outline of the model; our contribution (De Rugy et al., 2002b) lies in the Visuo-Motor Controller (V-M) that replaces the initial Discrete Movement Generator (DM) in Taga (1998). b) The musculo-skeletal model. c) Correspondence between the global states and the swing and stance phases of the right and left legs.
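To give a flavour of the kind of building block such a rhythmic generator rests on, the sketch below integrates a single pair of mutually inhibiting neural units of the Matsuoka type commonly used in CPG models of this family. It is a minimal, generic sketch rather than an excerpt from the model of Figure 4; the parameter values, time step and variable names are illustrative assumptions.

```python
def matsuoka_step(state, dt=0.005, tau_r=0.25, tau_a=0.5,
                  beta=2.5, w=2.5, drive=1.0):
    """One Euler step of a two-unit mutually inhibiting neural oscillator:
    u is the membrane state, v the self-inhibition (adaptation) state and
    y = max(0, u) the output a unit sends to its antagonist (and onward)."""
    u1, v1, u2, v2 = state
    y1, y2 = max(0.0, u1), max(0.0, u2)
    du1 = (-u1 - beta * v1 - w * y2 + drive) / tau_r
    dv1 = (-v1 + y1) / tau_a
    du2 = (-u2 - beta * v2 - w * y1 + drive) / tau_r
    dv2 = (-v2 + y2) / tau_a
    return (u1 + du1 * dt, v1 + dv1 * dt, u2 + du2 * dt, v2 + dv2 * dt)

# Starting from a slightly asymmetric state, the two outputs settle into
# alternating bursts; scaling the output sent to a muscle group by a gain
# (cf. the command q of Equation 5) modulates the movement produced without
# disrupting the underlying rhythm.
state = (0.1, 0.0, 0.0, 0.0)
outputs = [(max(0.0, state[0]), max(0.0, state[2]))]
for _ in range(4000):
    state = matsuoka_step(state)
    outputs.append((max(0.0, state[0]), max(0.0, state[2])))
```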
We propose that information about time to foot (De Rugy et al., 2002a) is sufficient to control step length. Time to foot (TTF), which is available optically (Equation 6), expresses the time remaining before the agent reaches the target with his/her current eye-foot axis.
TTF = [tan ω − tan(ω − α)] / [−(1 + tan²ω)·dω/dt]    (6)
where ω is the angle formed by the eye-target direction and the vertical, and α is the angle formed by the eye-target direction and the eye-foot axis.

We hypothesized again that locomotion could be regulated on the basis of an optically specified difference between current and required behavior. The current behavior can be characterized at a given footfall by the step period of the current step (SPcur.n); this step period corresponds to the difference between time to foot at the previous (TTFn-1) and current (TTFn) footfalls (Equation 7):

SPcur.n = TTFn-1 − TTFn    (7)
Consequently, the required behaviors, expressed as the step periods required for precise pointing (SPreq1.n or SPreq2.n), could emanate from the ratios between TTFn and the remaining step numbers before pointing (Equation 8). These remaining step numbers correspond to the two whole numbers closest to the ratio of time-to-foot to the current step period (TTFn/SPcur.n).
SPreq1.n = TTFn / int(TTFn / SPcur.n)

SPreq2.n = TTFn / [int(TTFn / SPcur.n) + 1]    (8)
where int(x) is the integer part of x. We assumed, for obvious reasons of parsimony, that the smallest difference (dn) between the required (SPreq1.n or SPreq2.n) and the current (SPcur.n) behaviors is then taken into account (Equation 9):

dn = min(|SPreq1.n − SPcur.n|, |SPreq2.n − SPcur.n|)    (9)
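As a worked illustration with numbers of our own choosing: suppose that at footfall n the optically specified time-to-foot is TTFn = 3.3 s and the current step period is SPcur.n = 1.0 s. The ratio TTFn/SPcur.n = 3.3, so the two candidate step numbers are 3 and 4, giving SPreq1.n = 3.3/3 = 1.10 s and SPreq2.n = 3.3/4 = 0.825 s (Equation 8). The corresponding differences from the current step period are 0.10 s and 0.175 s, so dn = 0.10 s (Equation 9): slightly lengthening the remaining steps is the smaller adjustment, and it is this difference that is compared with the perceptual threshold.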
The produced regulations depend on this difference provided that the latter exceeds a perceptual threshold (δ) (Equation 10). Note that even if a given regulation brings the difference back below the threshold, the motor command is nevertheless maintained until pointing (Equation 10).
An = dn if dn > δ or a regulation has already been initiated; An = 0 otherwise    (10)

With the visual parameter (An) of Equation 5 thus defined, our integrative model of the locomotor pointing task is operational. This model allows simulations to be run. These simulations demonstrate that the model is valid: it gives rise to successful pointing behavior when the agent has to produce locomotor regulations (Figure 5). Moreover, the observed regulations of step length in the vicinity of the target (Figure 5) are similar to those produced by subjects performing a locomotor pointing task (e.g., Lee et al., 1982; Montagne et al., 2000a). In particular, the time course of the inter-trial variability of the toe-target distance and the relation between the total amount of regulation and the initiation rank of the regulation were similar in the simulation and in real-life pointing (De Rugy et al., 2002b).
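The following sketch gathers Equations 5 and 7-10 into a single decision rule evaluated at each footfall. It is a minimal sketch that paraphrases the verbal description above rather than reproducing the published implementation (De Rugy et al., 2002b): the function and variable names are ours, the guard on the number of remaining steps is an added assumption, and the values of φ and δ are simply those quoted in Figure 5.

```python
def step_length_command(ttf_prev, ttf_now, regulation_started,
                        phi=2.0, delta=0.035):
    """Visuo-Motor Controller decision rule at a footfall (Equations 5, 7-10).
    ttf_prev and ttf_now are the time-to-foot values picked up at the previous
    and current footfalls; returns the modulation command q and the updated
    'regulation already initiated' flag."""
    sp_cur = ttf_prev - ttf_now                    # current step period (Eq. 7)
    n_steps = max(1, int(ttf_now / sp_cur))        # whole steps left (guarded)
    sp_req1 = ttf_now / n_steps                    # required step periods (Eq. 8)
    sp_req2 = ttf_now / (n_steps + 1)
    d = min(abs(sp_req1 - sp_cur), abs(sp_req2 - sp_cur))   # smallest gap (Eq. 9)
    if d > delta or regulation_started:            # threshold and latch (Eq. 10)
        visual_parameter, regulation_started = d, True
    else:
        visual_parameter = 0.0
    q = -phi * visual_parameter                    # modulation strength (Eq. 5)
    return q, regulation_started

# Example call with the worked numbers used above (TTFn-1 = 4.3 s, TTFn = 3.3 s):
q, started = step_length_command(ttf_prev=4.3, ttf_now=3.3, regulation_started=False)
```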
Figure 5: Stick figures of simulated locomotor pointing. a) Steady-state walking. b) and c) Two trials performed with the visual strategy for step-length modulation activated (δ = 0.035 and φ = 2), for two positions of the target (arrows). The dotted lines indicate the positions of the feet at two consecutive footfalls in steady-state walking, such that the distance between the arrows and the dotted lines represents the locomotor adjustment required to reach the target. In b and c the produced locomotor regulation brings the system close to the target, through lengthening and shortening of step length, respectively (De Rugy et al., 2002b).
3. Discussion

This chapter was designed to show how TTC could be used to regulate locomotion pointing tasks. Two models based (at least partly) on the use of TTC have been presented. In this section we first discuss the status of TTC in these models, and then we present the respective contributions of these models to a better understanding of the mechanisms underlying goal-directed locomotion.

First of all, note that even though the term TTC is a generic term grouping all temporal properties of the agent-environment system, the temporal properties involved in the two models are different. The TTC involved in the first model corresponds to the time remaining before the ball reaches the plane going through the eye. The TTC (time-to-foot) involved in the second model expresses the time remaining before the agent reaches the target with his/her current eye-foot axis. Consequently, the optical correlates of these properties of the agent-environment system are also different. On the other hand, the place taken by TTC in the two models differs in nature: TTC is the only property of the agent-environment system involved in the second model, but it appears as one useful property among others in the first one. Yet, whatever the task constraints, TTC appears essential in the regulation process of goal-directed locomotion tasks.

The second point has to do with the contributions of these models. The first contribution concerns the nature of the models. This kind of integrative model offers a testable conception of the relations between information and movement. By describing not only the information but also the way the information is used to regulate the movement/displacement, these models open the door to new experimental programs designed to test the causal function of the control mechanisms. Let us return to the "outfielder's problem". Debates between the proponents of previously proposed perceptual strategies have focused on the shape of agents' paths of displacement during interception tasks. Specifically, it has been argued that rectilinear paths support the Optical Acceleration Cancellation strategy (e.g., McLeod & Dienes, 1993) while curvilinear paths support the Linearizing Optical Trajectory strategy (McBeath et al., 1995). Our model can account for these two path types (see Figure 3B-D) as well as for differences in agents' velocity and acceleration. Depending on the way the information is taken into account in the displacement regulation process (i.e., depending on the weight put on required and current velocity), the agent can perform the same goal-directed task differently. Thus this model, while showing explicitly how information takes part in the regulation process, offers agents a flexible means of adapting their behavior to the current task constraints.

Furthermore, the two models presented in this chapter are intended to "catch" the control mechanisms involved in different tasks.
Although they lie in the same theoretical framework, they also refer to different levels of analysis. The first kind of model links a perceptual variable to a movement variable; the second links a perceptual variable to the dynamics of the neuro-musculo-skeletal system.

The two models present broad similarities. They constitute two prototypes of information-based control models. They rest on the use of "Gibsonian" information, that is, the current state of the agent-environment system is specified by information picked up in the perceptual flow. Consequently, the regulations are produced in a prospective way on the basis of the available information. That means that, whatever the model under consideration, the agent is informed continuously about the amount of regulation required to perform the task successfully.

On the other hand, these models also have their specificities. Bootsma (1998) differentiates two kinds of coupling: the information-movement coupling and the perception-actuation coupling. The former constitutes the behavioral expression of the control mechanism, while the latter expresses its structural (neuro-musculo-skeletal) foundations. The two models presented in this chapter refer to these different kinds of coupling. The first model (fly balls) focuses on the information-movement coupling without paying attention to the neurophysiological foundations of behavior. Conversely, the second model (locomotion pointing) emphasizes the perception-actuation coupling.

We agree with Bootsma (1998) that the choice of a kind of modeling depends heavily on the level of explanation embraced by the researcher. If the researcher aims at discovering the organizational principles (Bootsma, 1998) underlying the behavior of different species, then the information-movement coupling level is the appropriate one. Our first model (about fly balls) captures a control mechanism that could be useful for the human species independently of the action system involved (locomotion, assisted displacements (wheelchair, car, bike, ...)) but also for different species (humans, insects, robots, ...). The implicit hypothesis is that different kinds of structures are able to instantiate the same mechanism, this instantiation process being considered a constraint acting on the information-movement coupling. Conversely, if the researcher is also interested in the neurophysiological foundations of a given goal-directed behavior or in the similarities between the internal organization of several systems (Beek et al., 1998), then the perception-actuation coupling level is the appropriate one. Our second model (see also Schoner, 1994) illustrates this point, as it focuses on a particular action system (locomotion) and incorporates neurophysiological knowledge on basic locomotion and its modulation. This model shows how goal-directed locomotion emerges as a behavioral attractor through dynamical interactions between the neural, musculo-skeletal, and environmental components.
Yet, rather than emphasizing the (real) limits of these models, it seems important to highlight their complementarity. They constitute two different tools. The information-movement coupling model serves to determine the similarities in the behavioral signature of agents engaged in goal-directed action. The perception-actuation coupling model allows a better understanding of the emergent properties of behavior in close relation with the functioning of the structures in charge of its implementation.

This chapter shows what a clear conception of the relationships between perceptual and motor components can add to the understanding of the mechanisms underlying goal-directed locomotion. Of course, these models are (along with several others) only first attempts to bridge the gap between perception and action, and they leave ample room for improvement. The challenge for the next few years will be to put them to the test through empirical investigations.
REFERENCES

Beek, P. J., Peper, C. E., Daffertshofer, A., van Soest, A. & Meijer, O. (1998). Studying perceptual-motor actions from mutually constraining perspectives. In A. A. Post, J. R. Pijpers, P. Bosch and M. S. J. Boschker (Eds.), Models in Human Movement Science (pp. 93-111). Enschede: Print Partners Ipskamp.
Berg, W. P., Wade, M. G. & Greer, N. L. (1994). Visual regulation of gait in bipedal locomotion: Revisiting Lee, Lishman, and Thomson (1982). Journal of Experimental Psychology: Human Perception and Performance, 20, 854-863.
Bootsma, R. J. & Peper, C. E. (1992). Predictive visual information sources for the regulation of action with special emphasis on catching and hitting. In L. Proteau and D. Elliott (Eds.), Vision and Motor Control (pp. 285-314). Amsterdam: Elsevier Science Publishers.
Bootsma, R. J., Fayt, V., Zaal, F. T. J. M. & Laurent, M. (1997). On the information-based regulation of movement: Things Wann (1996) may want to consider. Journal of Experimental Psychology: Human Perception and Performance, 23, 1282-1289.
Bootsma, R. J. (1998). Ecological movement principles and how much information matters. In A. A. Post, J. R. Pijpers, P. Bosch and M. S. J. Boschker (Eds.), Models in Human Movement Science (pp. 51-63). Enschede: Print Partners Ipskamp.
Chardenon, A., Montagne, G., Buekers, M. J. & Laurent, M. (2002). The visual control of ball interception during human locomotion. Neuroscience Letters, 334, 13-16.
Chardenon, A., Montagne, G., Laurent, M. & Bootsma, R. J. The perceptual control of goal-directed locomotion: A common control architecture for interception and navigation? Manuscript under revision.
De Rugy, A., Montagne, G., Buekers, M. J. & Laurent, M. (2000a). The control of locomotion pointing under restricted informational conditions. Neuroscience Letters, 281, 87-90.
De Rugy, A., Montagne, G., Buekers, M. J. & Laurent, M. (2000b). The study of locomotor pointing in virtual reality: The validation of a test set-up. Behavior Research Methods, Instruments and Computers, 32, 215-220.
De Rugy, A., Montagne, G., Buekers, M. J. & Laurent, M. (2001). Target expansion is used to control locomotor pointing. Behavioural Brain Research, 123, 11-15.
De Rugy, A., Montagne, G., Buekers, M. J. & Laurent, M. (2002a). Visually guided locomotion: evidence for locomotor pointing control based on temporal information. Experimental Brain Research, 146, 129-141.
De Rugy, A., Taga, G., Montagne, G., Buekers, M. J. & Laurent, M. (2002b). Perception-action coupling model for human locomotor pointing. Biological Cybernetics, 87, 141-150.
Gibson, J. J. (1958). Visually controlled locomotion and visual orientation in animals. British Journal of Psychology, 49, 182-194.
Gibson, J. J. (1979). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Haruno, M., Wolpert, D. M. & Kawato, M. (2001). MOSAIC model for sensorimotor learning and control. Neural Computation, 13, 2201-2220.
Hofsten, C. von (1983). Catching skills in infancy. Journal of Experimental Psychology: Human Perception and Performance, 9, 75-85.
Laurent, M., Montagne, G. & Durey, A. (1996). Binocular invariants in interceptive tasks: A directed perception approach. Perception, 25, 1437-1450.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.
Lee, D. N., Lishman, J. R. & Thomson, J. A. (1982). Regulation of gait in long jumping. Journal of Experimental Psychology: Human Perception and Performance, 8, 448-458.
Lenoir, M., Musch, E., Thiery, E. & Savelsbergh, G. J. (2002). Rate of change of angular bearing as the relevant property in a horizontal interception task during locomotion. Journal of Motor Behavior, 34, 385-401.
Marr, D. (1982). Vision. New York: W. H. Freeman and Co.
McBeath, M. K., Shaffer, D. M. & Kaiser, M. K. (1995). How baseball outfielders determine where to run to catch fly balls. Science, 268, 569-572.
McLeod, P. & Dienes, Z. (1993). Running to catch the ball. Nature, 362, 23.
Montagne, G., Laurent, M. & Durey, A. (1998). Visual guidance of goal-oriented locomotor displacements: the example of ball interception tasks. Ecological Psychology, 10, 25-37.
Montagne, G., Laurent, M., Durey, A. & Bootsma, R. J. (1999). Movement reversals in ball catching. Experimental Brain Research, 129, 87-92.
Montagne, G., Cornus, S., Glize, D., Quaine, F. & Laurent, M. (2000a). A 'perception-action coupling' type of control in long-jumping. Journal of Motor Behavior, 32, 37-44.
Montagne, G., Fraisse, F., Ripoll, H. & Laurent, M. (2000b). Perception-action coupling in an interceptive task: First order temporal relation as an input variable. Human Movement Science, 19, 59-72.
Montagne, G. & Laurent, M. (2002). Visual information for one-handed catching. In K. Davids, G. J. P. Savelsbergh, S. Bennett and J. Van der Kamp (Eds.), Interceptive Action in Sport: Information and Movement (pp. 144-157). London: Routledge.
Peper, C. E., Bootsma, R. J., Mestre, D. R. & Bakker, F. C. (1994). Catching balls: How to get the hand to the right place at the right time. Journal of Experimental Psychology: Human Perception and Performance, 20, 591-612.
Savelsbergh, G. J. P., Whiting, H. T. A. & Bootsma, R. J. (1991). 'Grasping' tau. Journal of Experimental Psychology: Human Perception and Performance, 17, 315-322.
Schoner, G. (1994). Dynamic theory of action-perception patterns: The time-before-contact paradigm. Human Movement Science, 13, 415-441.
Taga, G. (1991). Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment. Biological Cybernetics, 65, 190-208.
Taga, G. (1995a). A model of the neuro-musculo-skeletal system for human locomotion. I. Emergence of basic gait. Biological Cybernetics, 73, 97-111.
Taga, G. (1995b). A model of the neuro-musculo-skeletal system for human locomotion. II. Real-time adaptability under various constraints. Biological Cybernetics, 73, 113-121.
Taga, G. (1998). A model of the neuro-musculo-skeletal system for anticipatory adjustment of human locomotion during obstacle avoidance. Biological Cybernetics, 78, 9-17.
Warren, W. H., Young, D. S. & Lee, D. N. (1986). Visual control of step length during running over irregular terrain. Journal of Experimental Psychology: Human Perception and Performance, 12, 259-266.
Warren, W. H. & Hannon, D. J. (1988). Direction of self-motion is perceived from optic flow. Nature, 336, 583-585.
Warren, W. H. (1998). Visually controlled locomotion: 40 years later. Ecological Psychology, 10, 177-219.
Warren, W. H., Kay, B. A., Zosh, W. D., Duchon, A. P. & Sahuc, S. (2001). Optic flow is used to control human walking. Nature Neuroscience, 4, 213-216.
Zaal, F. T. J. M., Bootsma, R. J. & van Wieringen, P. C. W. (1999). Dynamics of reaching for stationary and moving objects: data and model. Journal of Experimental Psychology: Human Perception and Performance, 25, 149-161.
AUTHOR INDEX Abernethy,B. 191,224 Adamovich, S. V. 374, 387,422, 429, 440 Adelson,E. H.51,216, 224 Adolph, K. E. 164, 166 Ahumada, A. J. 216, 228 Akase, E. 47, 49 Alberti, L. B. 233, 239 Albright, T. D. 40,41, 42, 43, 49, 51 Alderson, G. J. K. 195, 307, 320, 323,444, 470 Alimi, A. M. 121, 138 Allen, B.L. 251,281 Altaian, J. M. 42, 43, 49 Amari, S.414, 416 Ames, A. 247, 248, 275, 279, 282, 287 Andersen, G. J. 93, 97, 98, 99, 108, 296, 302 Andersen, R. A. 41, 43, 49, 50, 63, 65 Anderson, G. J. K. 224 Anderson, N. H. 257, 279 Andersson, I. E. K. 445, 472 Andre, A. D. 298, 301 Appleyard, D. 235, 239 Arbib, M. A. 397, 398, 399, 400, 408, 416 Arterberry, M. E. 150, 162, 167 Ashmead, D.H. 358, 359, 360, 361, 367 Assad, J. A. 53, 54, 65 Atchley, P. 97, 108 Athenes, S. 397, 417 Atkinson, J. 144, 150,151,165,166,171, 241 Backus, B.T. 336, 351 Bahill, A.T. 113, 139, 223, 225, 310, 325 Bairstow, P. J. 117,137 Bakker, F. C. 138, 168, 226, 231, 241, 352, 372, 388, 464, 472, 490 Ball, C. T. 132, 137 Ball,W. 15,34, 153,156,166 Banister, H. 320, 323, 452, 470
Banks, M. S. 49, 293, 296, 301, 351 Barash, S. 63, 65 Barnes, G. R. 64, 65 Barnsley, M. 233, 239 Bartlett, R. M. 444, 448, 473 Batista, A. P. 65 Bechterew.W. 18, 34 Bechtold,A. G. 153,171 Beck, J. 233, 239, 275, 282, 284 Becker, S. 49 Beek, P. J. 280, 365, 367,425, 442, 464, 470,472,487,489 Beer, J. 247, 282 Beintema, J. 43,49 BenHamed, S. 41,49 Bennett, S. J. 150, 166, 170, 340, 351, 363, 367,453,457,458, 461, 470, 474, 490 Berg, W. P. 185,481,489 Bergen, J. R. 216, 224 Berkeley, G. 145,166 Berthelon, C. 249, 279 Berthenthal, B. I. 144, 165, 166 Berthier, N. E. 144, 158, 161, 165, 166, 168,407,416 Berthoz, A. 351,456,472 Bessette, B. B. 19,34 Best, C. J. 256, 279 Beverley, K. I. 168, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 196, 198, 199, 201, 203, 205, 207, 210, 211, 215, 216, 217, 218, 219, 220, 221, 224, 225, 226, 227, 231, 239, 241, 305, 309, 313, 315, 316, 317, 318, 319, 321, 324, 325, 331,338,352,461,472 Bilecen, D. 368 Bingham, G. P. 365, 367, 371, 372, 374, 376, 386, 387,410, 416,421, 422, 423, 425,432,433,435, 439, 440,442 Bizzi, E. 374, 387, 422, 440, 441 Blackburn, J. M. 320, 323,452, 470
498
Author Index
Blakeley, W. R. 324 Blanchard, D. C. 19,34 Blanchard, R. J. 19, 34 Blaquiere, A. 176, 224 Bonafede, B. 320, 323 Bonbright, J. C. 19, 35 Bongers, R. M. 464, 467,472 Bonnet, C. 345, 352 Bootsma, R. J. 5, 11, 102, 108, 133, 137, 138, 140, 149, 168, 169, 212, 224, 226, 231, 239, 280, 292, 307, 323, 325, 330, 331, 351, 352, 356, 368, 372, 373, 375, 376, 386, 387, 388, 389, 390, 391, 394, 397, 398, 399, 400,403, 404,406, 407, 408, 411, 414, 415, 416,418, 423, 442, 444, 448,449,451, 452,454, 464, 466, 470, 472,473,474, 477,479, 487, 489, 490,491 Borghese, N. A. 456,471 Born, R.T. 41, 42, 43, 49,51 Boussaoud, D.41,49 Bower, T. G. R. 15, 34, 153, 154, 155, 156, 166 Boyd, I. A. 292, 301 Bracewell, R. M. 63, 65 Braddick, O. J. 144, 171, 436,441 Bradford, C. M. 137, 150, 167, 453, 471 Bradley, D. 45, 46, 47, 49 Bradshaw, M. F. 338, 341, 351,352 Bramwell, D. I. 17, 36 Braren, P. A. 14, 34 Braunstein, M. L. 246, 277, 278, 279 Bremmer, F. 41,47, 49, 50, 51 Bremner, J. G. 164,166 Brenner, E. 117, 133, 137, 169,345,351, 353,399,412,417,473 Brickman, B. J. 251, 281 Bridgman, B. 234, 239 Brooks, L. R. 270, 279 Brougthon, J. M. 153, 166 Brouwer, A-M. 132, 133, 136, 137, 346, 347,351
Bruce, C. 43, 49 Bruno, N. 257, 279, 297, 301 Buckley, D. 334, 351 Buekers, M. 476,489 Bullock, D. 464,466,470 Bunz, H.424,441 Buracas, G. 43, 49 Burden, A. M. 448, 473 Burgess-Limerick, R. 191, 224 Burrows, M. 17, 34 Burton, G. 5, 10 Burton, H. E. 239 Bush, J. M. 249, 265, 275, 278, 280, 281 Bushnell, M. C. 63, 65 Button, C. 355, 363, 364, 365, 367 Byblow, W. D. 363, 367 Caird, J. K. 248, 249, 279 Calderone, J. B. 296, 301 Caljouw, S. R. 150, 162, 166, 443 Carel.W.L. 2,446,471 Carello, C. 96, 108, 112, 139, 145, 164, 168, 295, 301, 356, 360, 425,441, 445, 472 Carey, D. P. 15, 36 Carlton, L. G. 133, 137, 138 Carlton.M. J. 138 Carnahan, H. 127,139 Caroll, J. J. 154,157,166 Carroll, T.J. 127,139,353 Carrozzo, M. 456,471 Carson, R. G. 363,369 Casey, A. 255, 281 Castle, P. 158,171 Cavallo, V. 249, 252, 279, 321, 323, 448, 452, 470 Cavanagh, P. 17, 34 Caviness, J. A. 15, 36, 152, 153, 169, 242 Chang, G. C. 49 Chardenon, A. 478, 489 Chieffi, S. 397, 416 Chua,R.G. 121,137,363, 367
Author Index
Chung, C. H. 138 Cisneros, J. 97, 108 Clark, M. J. O. 386, 387 Clayton, T. 2, 10, 138, 293, 296, 302, 352, 395,417,448,471 Clifton, R. K. 144, 158, 161, 165, 166, 168, 407, 416 Clynes, M. 215, 224 Clyton, T. M. 149, 167 Cochran, E. L. 265, 280 Coggshall, J. C. 15, 34 Cohn-Vossen 236, 240 Colby, C. L. 41, 48, 49, 63, 64, 65 Collewijn, H. 200, 211, 227, 305, 325, 463, 470 Collins, D. R. 166,425,435, 440 Cooper, L. A. 269, 270, 279 Coren, S. 274, 279 Cornus, S. 490 Corradini, M. L. 397,416 Covey, E. 387, 440 Cowan, N. 266, 279 Coxeter, H. S. M. 233, 234, 240 Crabbe, G. 54, 65 Craig, C. M. 93, 138, 292, 302, 330, 351, 386, 387, 391, 416, 439,442 Crassini, B. 256, 279, 356, 367 Crowell, D. 351 Crowell.J. A. 351 Cudworth, C. J. 155, 169 Cumming, B. G. 50, 336, 351 Cutting, J. E. 14, 15, 34, 112, 137, 246, 252, 257, 271, 276, 277, 279, 297, 301, 455, 462, 470 Cynader, M. 178, 203, 210, 212, 215, 224, 227, 325 Daffertshofer, A. 489 Davids, K. 145, 146, 150, 166, 169, 170, 231, 241, 315, 325, 351, 355, 363, 367, 444, 453, 470, 473, 474,490 Davis, D. L. 358, 367
499
Davis, W. E. 170, 297, 302 Day, R. H. 256, 279 DeBruyn, B.436, 440 De Lussanet, M. H. E. 132, 133, 137 De Rugy, A. 475, 476,482, 483, 485, 489 De Vries, M. 5, 10 Dean, P. 19,34,35,36 DeAngelis, G. 46, 50 DeLucia, P. R. 4, 10, 150, 166, 243, 246, 248, 249, 250, 251, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282,296,301,457 Denton, G. G. 97,108, 235, 239 Desimone, R. 40, 41,43, 49, 50 Dessing, J. C. 464,466,467, 470 Detwiler, M. L. 230, 242, 248, 273, 284, 306, 325, 356, 357, 358, 368 Di Salle, F. 368 DiCarlo, D. J. 296, 302 Diedrich, F. J. 432, 440 Dienes, Z. 372, 388, 476,477, 486, 490 Dietterich, T. T. 49 DiFranco, D. 158, 166 Dijkerman, H. C. 150, 168 Dill, L. M. 15,34 Distler, C. 51 Dittman, S. M. 4, 11, 67, 79, 91, 139, 150, 169, 249, 284, 325,458,473 Dodwell, D. C. 158, 166 Duchon, A. P. 231, 235, 239, 491 Duffy, C. J. 41, 43, 44,45, 46, 50, 51 Duhamel,J.-R.41,49, 63, 65 Duke, P. A. 334, 337, 342, 351 Duncan, J. 268, 281 Durey, A. 305, 324, 351, 352, 454, 466, 471,472,465,490 Duysens, J. 41,48, 52 Edgerton, S. Y., Jr. 233, 239 Elefante, R 368
500 500
Author Index
Ellard, C. G. 19, 34,40, 52, 447, 473 Elliott, D. 137, 387, 416,489 Eppler, M. A. 165, 166 Epstein, W. 167, 242, 255, 279, 281, 301 Erkelens, C. J. 200, 211, 227, 305, 325,463, 470 Erlhagen, W.414,416 Ewert, J.-P. 18, 19, 34 Fairweather, M. 162, 169 Fayt, V. 390,416, 466, 470,489 Feldman, A. G. 374, 387,422,429,432, 433, 440 Feng, Q. 158, 170 Fischer, S. C. 283 Fishman, R. 15, 34 Fitch, H. 133, 137 Fitts, P. M. 121,127, 135, 137, 403,407, 416 Fitzpatrick, P. 365, 367, 368 Flach, J. M. 4, 11, 67, 74, 79, 91, 97, 108, 139, 150, 169, 231, 240, 249, 251, 281, 284, 325, 458, 473 Flanagan, J. R. 374, 387, 422, 429, 440 Flanders, M. 121,137 Flash, T. 374, 387,422,441 Fletcher, H. J. 323 Fodor, J. 139, 165, 166 Fogassi, L. 63, 65 Fowler, G. 47, 52 Fraisse, F. 138,466,472,490 Frank, J.S. 139, 371, 389,415,465 Frankel, D. 153, 171 Freeman, T. C. A. 139, 236, 239, 240, 302, 325, 352, 490 Freiberg, K. 356, 367 Frisby.J. P. 334, 351 Frost, B. J. 6, 11, 13, 15, 17, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 31, 32, 34, 36, 37, 40, 50, 86, 87, 88, 89, 91, 290, 302, 447, 473, 474 Frykholm, G. 6, 10, 363, 368
Fukada, Y. 43, 52 Fusella, V. 270, 284 Gabbiani, F. 15, 18, 27, 30, 35,40, 50 Carello, C. 360,367 Garcia, A. 249, 280 Geesaman,B.41,43, 50 Gelade, G. 268, 284 Gellman, R. S. 139 Gentilucci, M. 397, 398, 400, 408, 416 Georgopoulos, A. P. 386, 387 Geri, G. A. 258, 281 Ghahramani, Z. 49 Gibson, E. J. 145, 146, 147, 148, 154, 157, 164, 166, 167 Gibson, J. J. 2, 3, 5, 10,15, 16, 35, 36, 68, 70, 71, 72, 73, 83, 90, 91, 110, 112, 137, 138, 145, 147, 148,149, 152, 153, 154, 164, 167, 169, 230, 232, 234, 235, 236, 240, 241, 242, 246, 251, 262, 281, 282, 301, 328, 351, 445,459, 470, 476,489 Gilden, D.L. 6,10, 246,281 Girgus, J. S. 274, 279 Girshick, A. R. 338,351 Giszter, S. 374, 387, 422,440 Giulianini, F. 53 Gizzi,M. S. 51 Glencross, D. 110, 112, 137, 138 Glize, D. 490 Gnadt, J. W. 63, 65 Gogel.W.C. 150,167 Goldberg, M. E. 41, 49, 63, 65 Goldfield, E. C. 435, 440 Goodale, M. A. 15, 19, 34, 36, 144, 165, 168,397,417,460,472 Goodman, D. 363, 367,411,417 Gordon, R.F. 153,171 Gormican, S. 268, 284 Graf, W. 41,49 Graham, N. 219, 224 Granrud, C. A. 15, 37, 150, 154, 162, 171 Grasso, R. 349, 351
Author Index
Gray, R. 9, 173, 190, 191, 193, 194, 196, 204, 210, 211, 212, 217, 218, 222, 224, 225, 227, 230, 240, 248, 252, 258, 261, 262, 281, 303, 304, 306, 307, 308, 309, 310, 311, 312, 313, 314, 316, 317, 323, 324, 330, 333, 352, 453, 461, 470 Graziano, M. S. A. 44, 50 Grealy, M. A. 138, 292, 302 Green, B. A. Jr. 197, 225 Green, P. R. 15, 35 Greer, N. L. 489 Gregory, R. L. 145, 167 Grigo, A. 47, 50 Gross, C.G. 43,49, 51 Grossberg, S. 464, 466, 470 Grosslight, J. H. 321,323 Grunbaum, B. 232, 233, 240 Griisser, O. J. 18, 19, 35 Grusser-Cornehls, U. 18,19, 35 Grutzmacher, R. P. 258, 281 Gulick, W. L. 261, 284 Gullapalli,V.416 Gulyas, B. 203, 228 Guski, R. 290, 301, 358, 362, 367 Gutman, S. R. 137 Hagen, M. A. 233, 240, 241 Hagen, R. 323 Haken,H.424,441 Halbert,J. A. 138 Halverson, H. M.412,416 Hamstra, S. J. 4, 10, 179, 181, 186, 188, 189, 190, 191, 197, 212, 215, 216, 227, 262, 267, 283, 306,307, 325, 352,447, 472 Hancock, P. A. 248, 249, 259, 265, 279, 281,283 Hannon, D. J. 476,491 Harris, E. K. 223, 225 Harris, J. M. 225, 307, 308, 323, 352 Harris, L. R. 212, 227, 352 Harris, M. G. 236, 239, 352
501
Hartelman.P. 143,171 Haruno, M. 476,489 Harvey, L. O., Jr. 230, 240 Hastorf, A. H. 255, 281 Hatsopoulos, N. G. 15, 16, 27, 35, 433, 441 Hawkins, B. 139, 387, 440 Hayes, W. N. 15, 35 Hecht, E. 232, 240 Hecht, H. 1,4, 6, 10, 50, 246, 259, 262, 264, 281, 282, 293, 296, 297, 301, 350 Heitler, W. J. 36 Held, R. 158, 171 Helmholtz, H. L. F. von 65,145,167,470 Helsen.W.F. 121,137 Hennesy, R. T. 150, 167 Heuer, H. 150, 167, 305, 309, 323, 417, 453,454,455,461,470 Hikosaka, K. 43, 50 Hilbert, D. 236, 240 Hildreth.E.C.231,241 Hochberg, J. 246, 247, 251, 255, 261, 262, 275, 276, 277, 282, 283 Hodos, W. 19, 34, 35 Hoefle, F. B. 320, 323 Hofeldt, A. J. 320, 323 Hoff, B. 397, 398, 399,400, 408, 416 Hoffmann, K.-P. 41,50, 51 Hogan, N. 374, 387, 388, 422,440, 441 Holder, T. 363, 364, 367 Holzkamp, K. 6, 10 Hong, X. H. 200, 212, 215, 225, 227, 352 Horwood.J. 231,240 Hoshizakio, L. E. F. 138 Houseman, J. H. 127, 130, 131, 132, 134, 135,139 Howard, I. P. 305, 324, 334, 351 Hoyle, F. 2,10,112,137,187,188,225, 288,301,304,324 Humphreys, G. W. 268,281 Hunt, E. B. 283 Husney, R. 363, 369 Hutton, R. J. B. 251,281
502 502
Author Index Author Index
Ikejiri, T. 143, 169 Ingle, D. J. 15, 35, 37, 241, 283, 471 Inokawa, H. 49 Inoue, Y. 52 Ito, Y. 52 Ittelson, W. H. 255, 282, 283 Ivanenko, Y. P. 351 Iwai, E. 42, 52 Jacobs, D. M. 147, 157, 164, 167, 350, 445, 464,467,471,472,472 Jagacinski, R. J. 74, 91, 136 Jakobson, L. S. 397, 398,400,417, 418 Jansen.C. 412,417 Jarvis.C. D. 19,35 Jeannerod, M. 35, 37, 241, 283, 390, 397, 417
Jenison, R. L. 359, 367 Jex, H. R. 74, 91 Johansson, G. 17, 36, 234, 240 Johnson, J. 452, 471 Johnson, K. L. 143, 167 Johnson, R. L. 144, 161, 165, 166 Johnson, S. P. 143, 167 Johnson, W. W. 2, 287, 294, 298, 301 Johnston, E. B. 297, 301 Jones, R. 10 Jordan, D.W. 423,441 Judge, S. J. 17, 35, 150, 167, 453, 471 Julesz,B.233, 240 Juslin, P. 246, 283 Kahneman, D. 266, 282 Kaiser, M. K. 2, 96, 101, 108, 249, 264, 269, 278, 280, 282, 287, 293, 296, 297, 301,302,321,324,372,387,490 Kanizsa, G. 65 Kaplan, G. A. 239 Karanka-Ahonen, J. T. 33, 35 Karnavas, W. J. 223, 225 Karten, H. J. 19,35 Kaushall, S. 212, 331,352
Kawano, K. 52 Kawato, M. 489 Kay, B. A. 374, 387, 423,431,433,434, 435,438,440,441,491 Kay, H. 158,167 Kayed.N.S. 155, 157,161,167 Keay, K. A. 19, 35 Kebeck, G. 231, 234, 236, 238, 240, 262, 282 Kellman, P. J. 150, 167 Kelso, J. A. S. 117, 139, 363, 367, 374, 387, 397,411,417,418, 423, 424, 425, 431, 437,441,442 Kerzel, D. 4, 10, 259, 262, 282 Kilpatrick, F. P. 255, 283 Kim, E. E. 363, 369 Kim, N-G. 4,10, 96, 108, 259, 262, 268, 282,295,301,360,367 Kim,S.138 Kirk, D.E. 79,91 Knowles, W. B. 2,446,471 Koch, C. 35 Koenderink, J. J. 44,50 Kohly, R. P. 175, 212, 225, 227, 306, 324, 352 Konishi, M. 30, 36 Konishi, Y. 143,169 Krapp, H. G. 35, 40, 50 Kreegipuu, K. 445, 472 Krekelberg, B.45, 50 Kriers,G.E.321,324 Kruk,R. 179,221,222,225 Kubischick, M. 49 Kubovy, M. 233, 240 Kunkler-Peck, A. J. 356, 367 Kurtz, K. L. 266, 285 La Grange, N. 452, 471 Lacquiniti, F. 365, 368 Lagae.L.41,42,43,44,50,51 Laing, I. A. 170 Land.M.F. 231,240, 349, 351
Author Index
Landwehr, K. 229, 231, 233, 235, 236, 238, 240, 262, 282 Landy,M. S. 297, 301 Lappe, M. 39, 41, 42, 43,44, 45, 47, 49, 50, 51,52,447 Larish, J. F. 97, 108,231,240 Latash, M. L. 137, 374, 387,422,441 Latz, E. 143, 169 Laurent, G. 15, 16,18, 35, 40, 50, 489 Laurent, M. 132, 138, 150, 162, 167, 168, 230, 231, 241, 252, 259, 279, 305, 321, 323, 324, 331, 351, 352, 390, 416, 448, 452, 454, 455, 457,462,466,470, 471, 472, 475,477, 489,490 Laves, F. 233, 241 Law, D. J. 266, 268, 283 Lawrence, L. 138, 240, 324,474 Lee, D. N. 2, 10, 15, 21, 34, 35, 37, 78, 91, 94, 108, 110, 112, 114, 115, 119, 133, 135, 138, 139, 148,149, 158, 160, 167, 170, 187, 225, 230, 234, 241, 247, 248, 261, 273, 276, 283, 287, 288, 290, 291, 292, 293, 295, 296, 301, 302, 307, 324, 328, 330, 351, 352, 356, 359, 368, 386, 387, 392, 394, 395,417,446, 447, 448, 451, 456, 464,471,476, 481, 485, 489, 490,491 Lee, E. M. C. 34 Lee, R. G. 139 Leibowitz, H. 150, 167 Lenoir, M. 452, 471, 471,478, 490 Levine, E. S. 19, 35 Levitt, H. 56, 65, 190, 225 Lewis, C.E. Jr. 321,324 Lewis, G. 36 Li, F. X. 154, 157, 162, 168, 230, 231, 325, 457,471 Liddell, G. W. 268, 270, 280, 283 Lin, J. P. 170 Lindhagen, K. 158, 170 Lishman, J. R. 155, 169, 307, 324, 448, 471, 489, 490
503
Lockman, J.J. 154,171 Lonergan, A. 118, 125, 126, 128, 129, 131, 133, 135, 136, 139 Longridge, T. 225 Loomis, J. M. 42,43, 51, 358, 361, 369 Lopez-Moliner J. 345, 352 Lopez-Zamora, M. 33, 35 Lough, S. 2, 10, 138, 149, 167, 293, 296, 302,352,395,417,448,471 Lumsden, E. A. 234, 241 Luque-Ruiz, D. 33, 35 Lynch, K. 239 Mace, W. 139 MacFadyan, B. J. 137 MacKenzie, C. L. 396, 397, 415, 416,417 MacLeod, R. W. 138, 238, 241, 282, 301 Maes, H. 50, 51,203,228 Maioli,C. 365,368,456,471 Maloney, L. T. 297, 301 Manser, M. P. 259, 265 Marcar,V. 42, 51,52 Margaria, R. 432,441 Marmarelis, P. Z. 176, 225 Marmarelis, V. Z. 176,225 Marr, D. 476,490 Marteniuk, R. G. 396, 397, 399, 407, 414, 416,417 Mason, A. H. 117, 121, 127, 128, 132, 138 Masters, R. L. 324 Masterton, R. B. 323 Matejowsky, E. 368 Mather, G. 216, 225 Mattocks, C. 117, 138 Maunsell, J. H. R. 40, 41, 42, 46, 47, 51, 53, 54, 60, 63, 65 Maxwell, M. 49 Mazyn, L. 471 McBeath, M. K. 372, 387,486,490 McCall.D. D.407,416 McDonald, T. P. 283 McGlaughlin, C. 138
504
Author Index
McGowan, R. S. 119, 290, 358, 368 McGuinness, E. 42,49 Mclntyre, J. 456, 472 McKee, S. P. 334, 352 McLaughlin, C. 321,324 McLeod, P. 91, 122, 138, 241, 321, 324, 349, 351, 372, 388,476,477, 486, 490, McMahon, T. A. 432, 441 McMullen, T 387, 440 McMurty, T. C. 324 McNitt-Gray, J. 162, 169 McRoberts, G. 153, 171 McRuer, D. T. 74, 91 Meese, T. S. 336, 352 Meijer, O. 489 Mestre, D. R. 138, 168, 226, 249, 279, 352, 372, 388,464, 472,490 Metzger, W. 233, 234, 241 Meulenbroek, R. G. J. 412, 417 Meyer, L. E. 249, 258, 265, 274, 275, 278, 280, 281, 283 Michaels, C. F. 5, 10,138, 145,147,150, 164, 167,168, 231, 241, 329, 346, 352, 356, 372, 387, 388, 395, 417, 445, 452, 453,454, 456,457, 458, 460, 464, 467, 472 Michon, J. A. 230, 240 Michotte, A. 54, 65 Miezin, F. 42, 49 Mikami, A. 41,51 Miles, F. A. 52 Milner, A. D. 144, 165, 168, 460, 472 Mingolla, E. 232, 242 Mitchell, I. J. 34 Mitchell, S. R. 283 Miyazaki, K. 236, 241 Mo, C. H. 35 Moen, G. C. 296, 302 Mollon, J. 328, 352 Montagne, G. 132, 138, 150, 167, 174, 305, 324, 349, 351, 352, 452, 454, 466, 471,
472,475,476, 477, 479,485, 489,490 Mon-Williams, M. 150, 168, 340, 352, 398, 399, 400,401, 403,404, 405, 408, 414, 417 Moore, M. K. 34,153, 155, 166 Moran, M. S. 137 Morgan, B. 17, 34 Morgan, M. J. 269, 282 Mountcastle, V. B. 176, 226 Movshon.J. A.41,51 Mowafy, L. 101, 108, 269, 282 Muir, D. W. 158, 166 Musch.E.452,471,490 Mussa-Ivaldi, F. A. 374, 387,422, 440, 441 Mustovic, H. 368 Myer, J. R. 239 Nakayama, K. 31, 34, 42, 43, 51 Nanez, J. E. 153,168 Neilson.W. A. 232, 241 Neisser, U. 3, 10 Netelenbos, J. B. 364, 368 Neuhoff, J. G. 356, 368 Newell, K. M. 128, 137, 138, 367 Newsome,W.T.41,50, 51 Nimmo-Smith, I. 138, 321, 324 Norcia, A. 153, 154, 156, 162, 171 Norman, J. F. 422, 442 Northington, A. 358, 367 Northmore, D. P. M. 19, 35 Novak, J. B. 262, 266, 267, 268, 281, 283 Oberg, C. 154,156,162,171 Olberg, R. M. 16, 36 Oldak, R. 231, 242, 269, 284, 357, 358, 368 Oliver, J. 127, 139, 353 Olsson, H. 246, 283 Orban, G. A. 41, 42, 44, 46, 50, 51, 52, 203, 228, 436,440 Ostry, D. J. 374, 387, 422, 429, 440
Author Index
Oudejans, R. R. D. 102, 108,138, 150, 168, 231, 239, 241, 292, 307, 323, 352, 372, 387, 388, 391, 392, 395,416,417, 451, 452, 454, 470, 472 Out,L. 143,158,159,161,168 Owen, D.H. 231,241,302 Pack, C. 41,51 Pagano, C. C. 372, 376, 387,422, 440 Page,W.K.45,51 Paillard,J. 133,138 Palmer, C. F. 146, 168 Paolini, M. 45, 50,51 Park, J. 255, 281,475 Parker, A. J. 336, 351 Paulignan, Y. 397,417 Pearson, K. G. 17, 36 Pekel,M.41,43,50, 51 Pellegrino, J. W. 283 Pefia, J. L. 30, 36 Peper, C. L. 133, 138, 168, 174, 226, 329, 338, 348, 352, 372, 373, 375, 387, 388, 391, 392, 416,464,465, 470, 472, 477, 479, 489,490 Perotti, V. J. 422, 442 Perrone, J. A. 41,51 Peterson, J. R. 121, 137,403,407, 416 Peterson, M. A. 171, 275, 276, 282, 283 Pettersen.L. 154,171 Pettigrew, J. D. 368 Pfaffman, C. 321, 324 Phatak, A. N. 96, 108 Phatak, A. V. 296, 301 Piaget, J. 142, 145, 168 Pick, A. D. 167 Pick, H. L. 91, 138, 142, 168, 241, 282, 301 Pierce, B.J. 258,281 Pijper, R. I. 5, 11,449,473, 489 Pijpers, J. R. 149, 169 Plamondon, R. 121, 138 Poggio, G. F. 46, 51, 203, 226 Poincare, H. 179, 226
505
Port, N. L. 387
Portfors, C. V. 201, 202, 203, 207, 208, 210, 211, 212, 226, 227, 306, 308, 321, 324, 352
Portfors-Yeomans, C. V. 226
Poulton, E. C. 114, 138
Powers, W. 71, 79, 91
Previc, F. H. 272, 283
Prevost, P. 351
Proffitt, D. R. 281, 321, 324
Pugh, E. N. 232, 241
Pulfrich, C. 320, 323, 324
Purdy, J. 240
Purdy, W. C. 446, 472
Putnam, C. A. 411, 417
Qian, N. 49
Quaia, C. 52
Quaine, F. 490
Quinn, J. T. 139
Raiguel, S. 41, 42, 50, 51, 52
Ramachandran, V. S. 9, 10
Rasmussen, J. 72, 91
Rauschecker, J. P. 44, 50
Reddish, D. E. 149, 167, 447, 448, 464, 471
Reddish, P. E. 2, 10, 15, 35, 138, 293, 296, 302, 352, 392, 394, 395, 417
Redgrave, P. 19, 34, 35, 36
Reed, E. S. 10, 139
Regan, D. M. 4, 9, 10, 113, 139, 168, 173, 174, 175, 178, 179, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 210, 211, 212, 213, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 230, 231, 239, 240, 241, 242, 248, 252, 258, 261, 262, 265, 267, 281, 283, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 321, 323, 324, 325, 330, 331, 333, 334, 336, 338, 339, 341, 352, 444, 447, 452, 453, 461, 470, 472
Reichardt, W. 215, 219, 227, 228
Repperger, D. W. 137
Reynolds, H. N. 239
Riccio, G. E. 265, 284
Richardson, K. 168
Rind, F. C. 15, 16, 17, 18, 35, 36, 40, 51, 447, 472
Ripoll, H. 138, 466, 472, 490
Rizzolatti, G. 397, 416
Robertson, R. M. 17, 36
Robin, D. J. 158, 168, 387, 407, 416, 422, 440
Robinson, D. L. 63, 65
Rochat, P. 143, 168
Rock, I. 145, 168
Rodman, H. R. 41, 51
Rogers, B. J. 242, 279, 301, 305, 324, 334, 351
Rönnqvist, L. 397, 407, 418
Rosander, K. 158, 170
Rosenbaum, D. A. 269, 283, 412, 417
Rosenblum, L. D. 359, 368
Rosengren, K. S. 351
Ross, H. E. 238, 241
Rowell, C. H. F. 17, 34
Roy, J.-P. 47, 52
Royden, C. S. 43, 52, 231, 241
Runeson, S. 3, 6, 10, 11, 246, 283, 363, 368, 445, 472
Rushton, S. K. 6, 204, 219, 227, 252, 283, 304, 307, 318, 325, 327, 329, 338, 340, 341, 351, 352, 353, 453, 461, 472, 474
Sahibzada, N. 19, 36
Sahuc, S. 491
Saidpour, A. 97, 108, 296, 302
Saiff, E. I. 15, 35
Saito, H.-A. 41, 43, 46, 52
Sakata, H. 43, 46, 52
Saldana, H. M. 359, 368
Saltzman, E. L. 374, 387, 423, 431, 441
Santvoord, van A. A. M. 5, 11, 449, 473
Sauer, C. 93
Saunders, J. A. 231, 242
Savelsbergh, G. J. P. 1, 5, 11, 50, 138, 141, 143, 144, 145, 146, 149, 150, 158, 161, 166, 168, 169, 170, 171, 248, 264, 282, 285, 292, 321, 325, 351, 356, 364, 365, 366, 367, 368, 443, 448, 449, 451, 452, 453, 460, 470, 471, 473, 474, 476, 490
Scarpa, M. 397, 416
Schaafsma, S. 41, 48, 52
Schachinger, H. 368
Schattschneider, D. 233, 241
Scheffler, K. 368
Scheier, C. 414, 417
Schey, H. M. 196, 197, 227
Schiff, W. 15, 36, 152, 153, 155, 169, 230, 231, 235, 242, 248, 249, 269, 273, 283, 284, 306, 325, 356, 357, 358, 368
Schlotterer, G. R. 16, 36
Schmidt, R. A. 119, 128, 129, 131, 139
Schmidt, R. C. 365, 368, 425, 440, 441, 442
Schmuckler, M. A. 154, 157, 168
Schneider, G. E. 35
Schoenflies, A. 233, 242
Schöner, G. 174, 228, 374, 388, 392, 394, 395, 402, 409, 411, 413, 416, 417, 419, 420, 423, 424, 431, 441, 464, 465, 473, 487, 490
Scilley, P. L. 17, 34
Scott, M. A. 67, 315, 325
Seaks, J. D. 144, 161, 165, 166
Sedgwick, H. A. 233, 242, 251, 284
Segal, S. J. 270, 284
Seifritz, E. 356, 362, 368
Sekiya, H. 162, 169
Sekuler, R. 347, 353
Shaffer, D. M. 372, 387, 490
Shankar, S. 40, 52, 447, 473
Sharp, R. H. 444, 474
Shaw, B. K. 290, 358, 368
Shaw, R. E. 139
Shenoy, K. V. 49
Shepard, R. N. 270, 279
Shephard, G. C. 232, 233, 240
Sherk, H. 47, 52
Shibutani, H. 52
Shiffrin, R. M. 266, 284
Shiina, K. 150, 167
Shimojo, S. 143, 169
Shook, B. L. 15, 35
Shull, J. A. 387, 422, 425, 440
Sidaway, B. 162, 169
Simmons, P. J. 15, 16, 17, 36, 40, 51, 447, 472
Simon, H. A. 9, 11, 327
Simpson, W. A. 248, 266, 284
Smeets, J. B. J. 117, 133, 136, 137, 150, 169, 170, 248, 285, 292, 345, 346, 351, 353, 399, 412, 417, 453, 473, 474
Smith, A. T. 225
Smith, L. B. 143, 169, 414, 417
Smith, M. R. H. 4, 11, 67, 79, 81, 83, 86, 88, 91, 114, 117, 139, 150, 169, 249, 262, 273, 284, 307, 325, 458, 473
Smith, P. 423, 441
Smith, W. M. 261, 273
Snowden, R. J. 50, 225, 436, 441
Snyder, L. H. 54, 65
Soechting, J. F. 121, 137
Soeda, A. 143, 169
Souki, W. 36
Southall, T. L. 17, 37, 65, 167
Sparrow, W. A. 128, 139
Speigle, J. M. 358, 361, 369
Spekreijse, H. 185, 228
Spelke, E. S. 158, 170
Sperling, G. 216, 228
Spileers, W. 203, 228
Spurr, R. T. 296, 302
Stanard, T. 4, 11, 67, 79, 91, 139, 169, 249, 284, 325, 458, 473
Stechler, G. 143, 169
Steeves, J. D. 36
Stelmach, G. E. 91, 138, 400, 403, 407, 408, 418
Sternfels, S. 153, 171
Stevens, P. S. 235, 241, 242
Stevenson, E. 117, 139
Stewart, D. 155, 169
Stoffregen, T. A. 265, 284
Strausfeld, N. J. 16, 37
Stretch, R. A. 444, 473
Sully, D. J. 195, 224, 323, 444, 470
Sully, H. G. 195, 224, 323, 444, 470
Sun, H.-J. 6, 11, 15, 17, 21, 23, 25, 26, 27, 28, 29, 32, 36, 37, 40, 50, 86, 87, 88, 89, 91, 447, 473
Swaroop, R. 324
Sweet, B. T. 249, 278, 280, 297, 302
Swinnen, S. P. 363, 369, 417
Sylvia, M. R. 144, 161, 165, 166
Tachibana, T. 143, 169
Taga, G. 143, 169, 475, 482, 483, 489, 490
Takemura, A. 47, 52
Takeuchi, K. 143, 169
Talbot, W. H. 203, 226
Tallaroco, R. B. 15, 34
Tanaka, K. 41, 43, 52
Thelen, E. 142, 143, 169, 414, 417
Thiel, P. 236, 242
Thiele, A. 41, 50, 51
Thiery, E. 490
Thines, G. 54, 65
Thompson, P. 345, 353
Thomson, J. A. 307, 324, 448, 471, 489, 490
Tillery, S. I. H. 121, 137
Tittle, J. S. 422, 442
Todd, J. T. 149, 169, 232, 242, 248, 261, 266, 273, 284, 307, 325, 422, 442, 448, 473
Torre, V. 51
Townsend, J. T. 266, 284
Toyama, K. 49
Treisman, A. M. 268, 284
Tresilian, J. R. 7, 11, 109, 110, 112, 113, 117, 118, 122, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 138, 139, 169, 170, 246, 253, 258, 259, 261, 263, 269, 274, 275, 277, 281, 284, 285, 292, 306, 322, 325, 329, 330, 333, 334, 347, 353, 356, 398, 399, 400, 401, 403, 404, 405, 408, 414, 417, 449, 451, 452, 454, 456, 461, 473, 474
Tronick, E. 15, 34, 37, 153, 156, 166
Tsurugai, K. 52
Tually, K. 356, 367
Tuller, B. 423, 437, 442
Turvey, M. T. 5, 10, 96, 108, 110, 112, 133, 137, 139, 290, 295, 301, 356, 358, 360, 367, 368, 425, 441
Tyldesley, D. A. 114, 139, 392, 418, 464, 474
Tyler, P. A. 236, 239
Umiltà, C. 397, 416
Ungerleider, L. G. 41, 49, 50
Vaina, L. M. 53
van den Berg, A. V. 43, 49, 50
van den Berg, T. J. T. P. 228
van der Hitchcock, T. 368
van der Kamp, J. 2, 6, 11, 141, 143, 144, 145, 146, 150, 162, 166, 169, 170, 171, 248, 252, 264, 273, 282, 285, 292, 340, 351, 356, 366, 367, 368, 443, 449, 450, 451, 453, 457, 458, 460, 461, 470, 473, 474, 490
van der Meer, A. L. M. 155, 157, 158, 159, 160, 161, 162, 163, 167, 170
van der Weel, F. R. 15, 35, 158, 160, 170
van Donkelaar, P. 114, 116, 139
van Doorn, A. J. 44, 50
van Ee, R. 351
van Essen, D. C. 40, 41, 42, 46, 47, 51, 60, 63, 65
van Hof, P. 141
van Hulle, M. 42, 51
van Norren, D. 228
van Santen, J. P. 216, 228
van Santvoord, A. M. M. 149, 169
van Soest, A. J. 158, 161, 168
van Wieringen, P. C. W. 133, 137, 140, 280, 388, 396, 411, 418, 423, 442, 464, 474, 491
Vaughan, J. 412, 417
Venator, K. R. 16, 36
Vereijken, B. 367
Verri, A. 51
Vincent, A. 188, 190, 196, 199, 212, 216, 227, 228, 231, 242, 265, 283, 352, 447, 472
Vishton, P. M. 14, 34, 158, 170, 252, 271, 279, 297, 301
Vogt, S. 412, 417
von Hofsten, C. 144, 158, 159, 160, 162, 163, 170, 397, 407, 418, 451, 474
Wade, M. G. 489
Wagner, H. 15, 37, 88, 91, 395, 418
Wallis, G. 117, 138
Wang, J. S. 400, 403, 407, 408, 418
Wang, R. F. 246, 277, 279
Wang, Y. 15, 19, 20, 21, 24, 27, 37, 290, 302, 447, 474
Wann, J. P. 137, 170, 188, 204, 219, 227, 228, 246, 252, 283, 285, 304, 306, 307, 318, 325, 329, 338, 340, 352, 353, 395, 407, 416, 418, 448, 453, 456, 461, 470, 472, 474, 489
Ward, S. L. 137
Warren, P. 350
Warren, R. 241, 242, 248, 249, 251, 253, 254, 255, 262, 272, 273, 281, 296, 301, 368
Warren, W. H. 95, 96, 98, 99, 108, 231, 242, 266, 285, 296, 302, 363, 369, 374, 388, 432, 435, 440, 441, 476, 481, 491
Warren, W. H., Jr. 231, 234, 242
Watamaniuk, S. N. 203, 225, 307, 308, 323
Watanabe, T. 235, 242
Watson, A. B. 216, 228, 398, 400, 418
Wattam-Bell, J. R. B. 170, 171
Watts, R. G. 113, 139, 310, 325
Weeks, D. L. 117, 139, 397, 418
Weel, F. R. 368
Weinberg, G. M. 89, 91
Welch, L. 334, 352
Wertheim, A. H. 241
Wertheimer, M. 233, 242
Wetherill, G. B. 56, 65
Wheatstone, C. 199, 228, 304, 325
Wheeler, K. 239
White, B. L. 158, 171
Whiting, H. T. A. 5, 11, 114, 139, 149, 169, 325, 356, 364, 368, 392, 418, 444, 448, 449, 464, 473, 474, 490
Wicklein, M. 16, 37
Wieringen, P. C. W. 133, 373, 376, 390, 391, 394, 397, 398, 416, 423, 425, 444, 448, 470, 491
William, G. 34
Wilson, A. N. 439, 442
Wimmers, R. H. 143, 146, 169, 171, 302, 425, 442
Wolpert, D. M. 481, 482, 489
Wolpert, L. 231, 242
Won, J. 374, 388
Wong, S. C. P. 17, 34
Woodworth, R. W. 121, 137, 140
Woody, C. D. 223, 225, 228
Worthington, A. H. 16, 36
Wuestefeld, A. P. 359, 368
Wurtz, R. H. 41, 43, 44, 46, 47, 50, 51, 52
Xiao, D.-K. 42, 43, 50, 51, 52
Xu, B. 17, 37
Yantis, S. 54, 65
Yilmaz, E. H. 95, 96, 98, 99, 108, 296, 302, 374, 388
Yonas, A. 15, 37, 150, 151, 153, 154, 155, 156, 159, 162, 168, 171
Young, D. S. 2, 10, 138, 149, 167, 234, 241, 247, 273, 276, 283, 293, 296, 302, 352, 395, 417, 447, 448, 471, 491
Young, M. 297, 301
Yukie, M. 43, 50
Zaal, F. T. J. M. 131, 140, 371, 373, 374, 376, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 397, 398, 400, 402, 407, 408, 409, 410, 411, 413, 415, 416, 418, 422, 423, 425, 440, 442, 464, 465, 466, 470, 474, 477, 489, 491
Zaff 302
Zago, M. 456, 472
Zeinstra, E. B. 138, 150, 168, 352, 395, 417, 452, 472
Zeki, S. 47, 52
Zelaznik, H. N. 139, 368
Zosh, W. D. 491
Zucker, S. W. 233, 242