THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory VOLUME 20
EDITED BY GORDON H. BOWER STANFORD UNIVERSITY, STANFORD, CALIFORNIA
Volume 20 1986
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers
Orlando San Diego New York Austin Boston London Sydney Tokyo Toronto
COPYRIGHT © 1986 BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. Orlando, Florida 32887
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. 24-28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 66-30104
ISBN 0-12-543320-4 (alk. paper)
PRINTED IN THE UNITED STATES OF AMERICA
CONTENTS

RECOGNITION BY COMPONENTS: A THEORY OF VISUAL PATTERN RECOGNITION
Irving Biederman
I. Introduction
II. An Analogy between Speech and Object Perception
III. Theoretical Domain: Primal Access to Contour-Based Perceptual Categories
IV. Basic Phenomena of Object Recognition
V. Recognition by Components: An Overview
VI. Nonaccidentalness: A Perceptual Basis for a Componential Representation
VII. A Set of 36 Components Generated from Differences in Nonaccidental ... among Generalized Cones
VIII. Relation of RBC to Principles of Perceptual Organization
IX. A Limited Number of Components?
X. ...tion
XI. Componential Recovery Principle
XII. Conclusion
References

ASSOCIATIVE STRUCTURES IN INSTRUMENTAL LEARNING
Ruth M. Colwill and Robert A. Rescorla
I. Introduction
II. Evidence for Response-Reinforcer Associations
III. Separation of R-Reinforcer from S-Reinforcer Learning
IV. The Role of the Stimulus in Instrumental Behavior
V. Conclusion
References

THE STRUCTURE OF SUBJECTIVE TIME: HOW TIME FLIES
John Gibbon
I. Introduction
II. The Temporal Middle
III. Experiment 1: Baseline Time Left
IV. Time-Left Mixture: The Harmonic Mean
V. Experiment 2: Arithmetic and Harmonic Mean Standards
VI. Experiment 3: Harmonic Mean Asymptote
VII. Concluding Remarks
Appendix: Double Standard ...
References

THE COMPUTATION OF CONTINGENCY IN CLASSICAL CONDITIONING
Richard H. Granger, Jr. and Jeffrey C. Schlimmer
I. Introduction: Theory and Experiment in Classical Conditioning
II. A Three-Level Analysis of Classical Conditioning
III. Background: Historical Perspective on Contingency
IV. Detail: The Contingency Computation, Algorithm, and Implementation
V. Breadth of the Theory: Blocking, Latency, Tracking, Learned Irrelevance
VI. Summary: Limitations and Contributions of the Theory
Appendix A: Derivation of Contingency Surface
Appendix B: Comparative Analysis of Performance of Contingency A...
References

BASEBALL: AN EXAMPLE OF KNOWLEDGE-DIRECTED MACHINE LEARNING
Elliot Soloway
I. Introduction: Motivation and Goals
II. Representing the Game of ...
III. Interpretation Process
IV. Generalization Process
V. Evaluation Process
VI. Experiments
VII. Concluding Remarks
References

MENTAL CUES AND VERBAL REPORTS IN LEARNING
Francis S. Bellezza
I. Introduction
II. Mental Cues and the Computer Metaphor
IV. Properties of Mental Cues Important in Learning
V. Mental Cues Formed under Different Task Sets
References

MEMORY MECHANISMS IN TEXT COMPREHENSION
Murray Glanzer and Suzanne Donnenwerth Nolan
I. Introduction: Restrictions
II. Background: Preceding Work
III. Text Comprehension Studies
IV. Theoretical Analysis of Thematic Information Carryover
V. General Theoretical Statement
... Text: Abstraction Paradigm
...

Index
RECOGNITION BY COMPONENTS: A THEORY OF VISUAL PATTERN RECOGNITION Irving Biederman DEPARTMENT OF PSYCHOLOGY STATE UNIVERSITY OF NEW YORK AT BUFFALO BUFFALO, NEW YORK 14260
I. Introduction
This article describes recent research and theory on the human’s ability to recognize visual entities. The fundamental problem of object recognition is that any single object can project an infinity of image configurations to the retina. The orientation of the object to the viewer can vary continuously, each giving rise to a different two-dimensional projection. The object can be occluded by other objects or texture fields, as when viewed behind foliage. The object need not be presented as a full-colored, textured image, but instead can be a simplified line drawing. Moreover, the object can even be missing some of its parts or be a novel exemplar of its particular category. But it is only with rare exceptions that an image fails to be rapidly and readily classified, either as an instance of a familiar object category or as an instance that cannot be so classified (itself a form of classification).
A Do-It-Yourself Example
Consider the object shown in Fig. 1. We readily recognize it as one of those objects that cannot be classified into a familiar category. Despite its overall unfamiliarity, there is near unanimity in its descriptions. We parse (or segment) its parts at regions of deep concavity and describe those parts with common, simple volumetric terms, such as “a block,” “a cylinder,” “a funnel or truncated cone.” We can look at the zigzag horizontal brace as a texture region or zoom in and interpret it as a series of connected blocks. The same is true of the mass at the lower left: we can see it as a texture area or zoom in and parse it into its various bumps. Although we know that it is not a familiar object, after a while we can say what it resembles: a New York City hot dog cart, with the large block being the central food storage and cooking area, the rounded part underneath as a wheel, the large arc on the right as a handle, the funnel as an orange juice squeezer, and the various vertical pipes as vents or umbrella supports. It is not a good cart, but we can see how it might be related to one. It is like a 10-letter word with 4 wrong letters.
Fig. 1. A do-it-yourself object. There is a strong consensus in the segmentation loci of this configuration and in the description of its parts.
We readily conduct the same process for any object, familiar or unfamiliar, in our foveal field of view. The manner of segmentation and analysis into components does not appear to depend on our familiarity with the particular object being identified. The naive realism that emerges in descriptions of nonsense objects may be reflecting the workings of a representational system by which objects are identified.
II. An Analogy between Speech and Object Perception
As will be argued in a later section, the number of categories into which we can classify objects rivals the number of words that can be readily identified when listening to speech. Lexical access during speech perception can be successfully modeled as a process mediated by the identification of individual primitive elements, the phonemes, from a relatively small set of primitives (Marslen-Wilson, 1980). We only need about 38 phonemes to code all the words in English, 15 in Hawaiian, and 55 to represent virtually all the words in all the languages spoken on earth. Because the set of primitives is so small and each phoneme specifiable by dichotomous (or trichotomous) contrasts (e.g., voiced vs unvoiced, nasal vs oral) on a handful of attributes, one need not make particularly fine discriminations in the speech stream. The representational power of the system derives from its permissiveness in allowing relatively free combinations of its primitives.
The hypothesis explored here is that a roughly analogous system may account for our capacities for object recognition. In the visual domain, however, the primitive elements would not be phonemes, but a modest number of simple volumes such as cylinders, blocks, wedges, and cones. Objects are segmented, typically at regions of sharp concavity, and the resultant parts matched against the best-fitting primitive. The set of primitives derives from combinations of contrastive characteristics of the edges in a two-dimensional image (e.g., straight vs curved, symmetrical vs asymmetrical) that define differences among a set of simple volumes (viz., those that tend to be symmetrical and lack sharp concavities). The particular properties of edges that are postulated to be relevant to the generation of the volumetric primitives have the desirable properties that they are invariant over changes in orientation and can be determined from just a few points on each edge. Consequently, they allow a primitive to be extracted with great tolerance for variations of viewpoint and noise.
Just as the relations among the phonemes are critical in lexical access (“fur” and “rough” have the same phonemes, but are not the same words), the relations among the volumes are critical for object recognition: Two different arrangements of the same components could produce different objects. In both cases, the representational power derives from the enormous number of combinations that can arise from a modest number of primitives. The relations in speech are limited to left-to-right (sequential) orderings; in the visual domain a richer set of possible relations allows a far greater representational capacity from a comparable number of primitives.
The matching of objects in recognition is hypothesized to be a process in which the perceptual input is matched against a representation that can be described by a few simple volumes in specified relations to each other.
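One way to picture such a representation is as a small data structure: a collection of volumes plus the relations that join them. The sketch below is illustrative only and is not part of the theory's formal statement. It borrows the cup and pail that the chapter later uses with Fig. 2; the function name `structural_description` and the relation labels are assumptions made for this example.

```python
from collections import Counter


def structural_description(parts, relations):
    """Pair a bag of component labels with the relations that join them.

    Hypothetical helper: labels are plain strings chosen for this sketch.
    """
    return (frozenset(Counter(parts).items()), frozenset(relations))


# The same two volumes, attached in different ways (labels are illustrative).
cup = structural_description(
    ["cylinder", "arc"], [("arc", "side-connected", "cylinder")]
)
pail = structural_description(
    ["cylinder", "arc"], [("arc", "top-connected", "cylinder")]
)

same_components = cup[0] == pail[0]  # identical collections of volumes
same_object = cup == pail            # the relations differ, so the objects differ

print(same_components, same_object)
```

Keeping the relations in the description is what lets two objects with identical components remain distinct, in the same way that phoneme order separates “fur” from “rough.”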
III. Theoretical Domain: Primal Access to Contour-Based Perceptual Categories
Our theoretical goal is to account for the initial categorization of isolated objects. Often, but not always, this categorization will be at a basic level, for example, when we know that a given object is a typewriter, banana, or giraffe (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). Much of our knowledge about objects is organized at this level of categorization, the level at which there is typically some readily available name to describe that category (Rosch et al., 1976). The hypothesis explored here predicts that in certain cases subordinate categorizations can be made initially, so that we might know that a given object is a floor lamp, sports car, or dachshund, more rapidly than we know that it is a lamp, car, or dog (e.g., Jolicoeur, Gluck, & Kosslyn, 1984).
THE ROLE OF SURFACE CHARACTERISTICS
There is a restriction on the scope of this approach of volumetric modeling that should be noted. The modeling has been limited to concrete entities of the kind typically designated by English count nouns. These are concrete objects that have specified boundaries and to which we can apply the indefinite article and number. For example, for a count noun such as chair we can say “a chair” or “three chairs.” By contrast, mass nouns are concrete entities to which the indefinite article or number cannot be applied, such as water, sand, or snow. So we cannot say “a water” or “three waters” unless we refer to a count noun shape as in “a drop of water,” “a bucket of water,” or “a grain of sand,” each of which does have a simple volumetric description. We conjecture that mass nouns are identified primarily through surface characteristics such as texture and color rather than through volumetric primitives.
Under restricted viewing conditions, as when an object is partially occluded, texture, color, and other cues (such as position in the scene and labels) may contribute to the identification of count nouns, as, for example, when we identify a particular shirt in the laundry pile from just a bit of fabric. Such identifications are indirect, typically the result of inference over a limited set of possible objects. The goal of the present effort is to account for what can be called primal access: the first contact of a perceptual input from an isolated, unanticipated object to a representation in memory.
IV. Basic Phenomena of Object Recognition
Independent of laboratory research, the phenomena of everyday object identification provide strong constraints on possible models of recognition. In addition to the fundamental phenomenon that objects can be recognized at all (not an altogether obvious conclusion), at least five facts are evident. Typically, an object can be recognized (1) rapidly, (2) when viewed from novel orientations, (3) under moderate levels of visual noise, (4) when partially occluded, and (5) when it is a new exemplar of a category.
Implications
The preceding five phenomena constrain theorizing about object interpretation in the following ways.
1. Access to the mental representation of an object should not be dependent on absolute judgments of quantitative detail because such judgments are slow and error prone (Miller, 1956; Garner, 1966). For example, distinguishing among just several levels of the degree of curvature or length of an object typically requires more time than that required for the identification of the object itself. Consequently, such quantitative processing cannot be the controlling factor by which recognition is achieved.
2. The information that is the basis of recognition should be relatively invariant with respect to orientation and modest degradation.
3. Partial matches should be computable. A theory of object interpretation should have some principled means for computing a match for occluded, partial, or new exemplars of a given category. We should be able to account for the human’s ability to identify, for example, a chair when it is partially occluded by other furniture, or when it is missing a leg, or when it is a new model.
V. Recognition by Components: An Overview
Our hypothesis, recognition by components (RBC), bears some relation to several prior conjectures for representing objects by parts or modules (e.g., Binford, 1971; Guzman, 1971; Marr, 1977; Marr & Nishihara, 1978; Tversky & Hemenway, 1984). RBC’s contribution lies in its proposal for a particular vocabulary of components derived from perceptual mechanisms and its account of how an arrangement of these components can access a representation of an object in memory.
When an image of an object is painted across the retina, RBC assumes that a representation of the image is segmented (or parsed) into separate regions at points of deep concavity, particularly at cusps where there are discontinuities in curvature (Hoffman & Richards, 1985). In general, concavities will arise whenever convex volumes are joined, a principle that Hoffman and Richards (1985) call transversality. Such segmentation conforms well with human intuitions about the boundaries of object parts and does not depend on familiarity with the object, as was demonstrated with the nonsense object in Fig. 1.
The resultant parsed regions are then approximated by simple volumetric components that can be modeled by generalized cones (Binford, 1971; Marr, 1977, 1982). A generalized cone is the volume swept out by a cross section moving along an axis (as illustrated later in Fig. 5). [Marr (1977, 1982) showed that the contours generated by any smooth surface could be modeled by a generalized cone with a convex cross section.] The cross section is typically hypothesized to be at right angles to the axis. Secondary segmentation criteria (and criteria for determining the axis of a component) are those that afford descriptions of volumes that maximize symmetry, length, and constancy of the size and curvature of the cross section of the component. Of these, symmetry often provides the most compelling subjective basis for selecting subparts (Brady & Asada, 1984; Connell, 1985). These secondary bases for segmentation and component identification are discussed below.
The primitive components are hypothesized to be simple, typically symmetrical volumes lacking sharp concavities, such as blocks, cylinders, spheres, and wedges. The fundamental perceptual assumption of RBC is that the components can be differentiated on the basis of perceptual properties in the two-dimensional image that are readily detectable and relatively independent of viewing position and degradation. These perceptual properties include several that have traditionally been thought of as principles of perceptual organization, such as good continuation, symmetry, and Prägnanz. RBC thus provides a principled account of the relation between the classic phenomena of perceptual organization and pattern recognition: Although objects can be highly complex and irregular, the units by which objects are identified are simple and regular. The constraints toward regularization (Prägnanz) are thus assumed to characterize not the complete object, but the object’s components.
By the preceding account, surface characteristics such as color and texture will typically have only secondary roles in primal access. This should not be interpreted as suggesting that the perception of surface characteristics per se is delayed relative to the perception of the components, but merely that in most cases the surface characteristics are generally less efficient routes for accessing the classification of a count object; that is, we may know that a chair has a particular color and texture simultaneously with its volumetric description, but it is only the volumetric description that provides efficient access to the mental representation of “chair.”¹
Relations among the Components
Although the components themselves are the focus of this article, as noted previously, the arrangement of primitives is necessary for representing a particular object. Thus, an arc side-connected to a cylinder can yield a cup, as shown in Fig. 2. Different arrangements of the same components can readily lead to different objects, as when an arc is connected to the top of the cylinder to produce a pail in Fig. 2. Whether a component is attached to a long or short surface can also affect classification, as with the arc producing either an attaché case or a strongbox in Fig. 2. The identical situation between primitives and their arrangement exists in the phonemic representation of words, where a given subset of phonemes can be rearranged to produce different words.
The representation of an object would thus be a structural description that expressed the relations among the components (Winston, 1975; Brooks, 1981; Ballard & Brown, 1982). A suggested (minimal) set of relations is described in Table I and would include specification of the relative sizes of the components and their points of attachment.
Fig. 2. Different arrangements of the same components can produce different objects. [Panels (a) through (d).]
¹There are, however, objects that would seem to require both a volumetric description and a texture region for an adequate representation, such as hairbrushes, typewriter keyboards, and corkscrews. It is unlikely that many of the individual bristles, keys, or coils are parsed and identified prior to the identification of the object. Instead, those regions are represented through the statistical processing that characterizes their texture (e.g., Beck, Prazdny, & Rosenfeld, 1983; Julesz, 1981), although we retain a capacity to zoom down and attend to the volumetric nature of the individual elements. The structural description that would serve as a representation of such objects would include a statistical specification of the texture field along with a specification of the larger volumetric components. These compound texture-componential objects have not been studied, but it is possible that the characteristics of their identification would differ from objects that are readily defined solely by their arrangement of volumetric components.
STAGES OF PROCESSING
Figure 3 presents a schematic of the presumed subprocesses by which an object is recognized. An early edge extraction stage provides a line drawing description of the object. From this description, nonaccidental properties of the image, described below, are detected. Parsing is performed at concave regions simultaneously with a detection of nonaccidental properties. The nonaccidental properties of the parsed regions provide critical constraints on the identity of the components. Within the temporal and contextual constraints of primal access, the stages up to and including the identification of components are assumed to be bottom up. A delay in the determination of an object's components should have a direct effect on the identification latency of the object.
The arrangement of the components is then matched against a representation in memory. It is assumed that the matching of the components occurs in parallel, with unlimited capacity. Partial matches are possible, with the degree of match assumed to be proportional to the similarity in the components between the image and the representation.² This stage model is presented to provide an overall theoretical context. The focus of this article is on the nature of the units of the representation.
[Figure: flow diagram with boxes labeled Edge Extraction; Parsing at Regions of Concavity; Detection of Nonaccidental Properties; Determination of Components; Matching of Components; Object Identification.]
Fig. 3. Presumed processing stages in object recognition.
VI. Nonaccidentalness: A Perceptual Basis for a Componential Representation
Recent theoretical analyses of perceptual organization (Binford, 1981; Lowe, 1984; Witkin & Tenenbaum, 1983) suggest a perceptual basis for RBC. The central organizational principle is that certain properties of the two-dimensional image are taken by the visual system as strong evidence that the three-dimensional object contains those same properties. For example, if there is a straight line in the image, the visual system infers that the edge producing that line in the three-dimensional world is also straight. Images that are symmetrical only under reflection are interpreted as arising from objects with that property. The visual system ignores the possibility that the property in the image is merely a result of a (highly unlikely) accidental alignment of eye and a curved edge.
²Modeling the matching of an object image to a mental representation is a rich, relatively neglected problem area. Tversky's (1977) contrast model provides a useful framework with which to consider this similarity problem in that it readily allows distinctive features (i.e., components) of the image to be considered separately from the distinctive components of the representation. This allows principled assessments of similarity for partial objects (components in the representation, but not in the image) and novel objects (containing components in the image that are not in the representation). It may be possible to construct a dynamic model based on a parallel distributed process as a modification of the kind proposed by McClelland and Rumelhart (1981) for word perception, with components playing the role of letters. One difficulty facing such an effort is that the neighbors for a given word are well specified and readily available from a dictionary; the set of neighbors for a given object is not.
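The footnote on Tversky's (1977) contrast model can be made concrete with a small sketch. In that model, similarity is a weighted count of shared features minus weighted counts of the features distinctive to each side. The code below is an illustration under stated assumptions, not the chapter's own matching procedure: components are reduced to hypothetical string labels, set size stands in for Tversky's salience function, and the weights `theta`, `alpha`, and `beta` are arbitrary values picked for the example.

```python
def contrast_similarity(image, stored, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky-style contrast score between two sets of component labels.

    score = theta * |common| - alpha * |image only| - beta * |stored only|
    Set size is used in place of a salience function (an assumption).
    """
    image, stored = set(image), set(stored)
    common = len(image & stored)
    image_only = len(image - stored)    # components of a novel exemplar
    stored_only = len(stored - image)   # components missing or occluded
    return theta * common - alpha * image_only - beta * stored_only


# Hypothetical component labels for a stored "chair" representation.
chair = {"seat", "back", "leg_1", "leg_2", "leg_3", "leg_4"}
occluded_chair = chair - {"leg_4"}   # partial object
novel_chair = chair | {"armrest"}    # new exemplar with an extra component
lamp = {"base", "pole", "shade"}     # unrelated object

full = contrast_similarity(chair, chair)
partial = contrast_similarity(occluded_chair, chair)
novel = contrast_similarity(novel_chair, chair)
unrelated = contrast_similarity(lamp, chair)
```

Because the two kinds of distinctive components are weighted separately, a partial object and a novel exemplar can be penalized differently, which is the property the footnote singles out.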
If the image is symmetrical, we assume that the object projecting that image is also symmetrical. The order of symmetry is also preserved: Images that are symmetrical under both rotation and reflection, such as a square or circle, are interpreted as arising from objects (or surfaces) that are symmetrical under both rotation and reflection. Although skew symmetry is often readily perceived as arising from a tilted symmetrical object or surface, there are cases where skew symmetry is not readily detected (Attneave, 1983). Parallelism and cotermination constitute the remaining nonaccidental relations. All five of these two-dimensional nonaccidental properties and the associated three-dimensional inferences are described in Fig. 4 (modified from Lowe, 1984). Witkin and Tenenbaum (see also Lowe, 1984) argue that the leverage provided by these nonaccidental relations for inferring a three-dimensional structure from a two-dimensional image is so powerful that they pose a challenge to the effort in computer vision and perceptual psychology that assigned central importance to variation in local surface characteristics, such as luminance. The psychological literature provides considerable evidence supporting the assumption that these nonaccidental properties can serve as primary organizational constraints in human image interpretation.
[Figure: Principle of Non-Accidentalness: Critical information is unlikely to be a consequence of an accident of viewpoint. Three-space inference from image features (2-D relation -> 3-D inference):
1. Collinearity of points or lines -> Collinearity in 3-space
2. Curvilinearity of points of arcs -> Curvilinearity in 3-space
3. Symmetry (skew symmetry?) -> Symmetry in 3-space
4. Parallel curves (over small visual angles) -> Curves are parallel in 3-space
5. Vertices: two or more terminations at a common point (examples: "Fork," "Arrow") -> Curves terminate at a common point in 3-space]
Fig. 4. Five nonaccidental relations (adapted from Lowe, 1985).
PSYCHOLOGICAL EVIDENCE FOR THE RAPID USE OF NONACCIDENTAL RELATIONS
There can be little doubt that images are interpreted in a manner consistent with the nonaccidental principles. But are these relations used quickly enough so as to provide a perceptual basis for the components that allow primal access? Although all the principles have not received experimental verification, the available evidence does suggest that the answer to the preceding question is “yes.” There is strong evidence that the visual system quickly assumes and uses collinearity, curvature, symmetry, and cotermination. This evidence is of two sorts: (1) demonstrations, often compelling, showing that when a given two-dimensional relation is produced by an accidental alignment of object and image, the visual system accepts the relation as existing in the three-dimensional world; and (2) search tasks showing that when a target differs from distracters in a nonaccidental property, as when one is searching for a curved arc among straight segments, the detection of that target is facilitated compared to conditions where targets and background do not differ in such properties.
1. Collinearity versus Curvature
The demonstration of the collinearity or curvature relations is too obvious to be performed as an experiment. When looking at a straight segment, no observer would assume that it is an accidental image of a curve. That the contrast between straight and curved edges is readily available for perception was shown by Neisser (1963). He found that a search for a letter composed only of straight segments, such as a Z, could be performed faster when it was embedded in a field of curved distracters, such as C, G, O, and Q, than when it was among other letters composed of straight segments such as N, W, V, and M.
2. Symmetry and Parallelism
Many of the Ames demonstrations, such as the trapezoidal window and Ames room, derive from an assumption of symmetry that includes parallelism (Ittelson, 1952). Palmer (1980) showed that the subjective directionality of arrangements of equilateral triangles was based on the derivation of an axis of symmetry for the arrangement. King, Meyer, Tangney, and Biederman (1976) demonstrated that a perceptual bias toward symmetry accounted for a number of shape constancy effects. Garner (1974), Checkosky and Whitlock (1973), and Pomerantz (1978) provided ample evidence that not only can symmetrical shapes be quickly discriminated from asymmetrical stimuli, but the degree of symmetry was also a readily available perceptual distinction. Thus, stimuli that were invariant under both reflection and 90° increments in rotation could be rapidly discriminated from those that were only invariant under reflection (Checkosky & Whitlock, 1973).
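The distinction between degrees of symmetry can be stated operationally. The following sketch classifies a small binary pattern by whether it is unchanged under a 90° rotation, under some reflection, or both. It is an illustration of the distinction only, not a reconstruction of the stimuli or procedure in the cited studies; the grid encoding and the function names are assumptions.

```python
def mirror(grid):
    """Reflect a grid left to right."""
    return [list(row[::-1]) for row in grid]


def rotate90(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def symmetry_class(grid):
    """Report which of the two invariances a pattern has."""
    grid = [list(row) for row in grid]
    rotation = rotate90(grid) == grid
    # A reflection about a vertical, horizontal, or diagonal axis equals
    # the left-right mirror image followed by some multiple of 90 degrees.
    flipped = mirror(grid)
    reflection = False
    for _ in range(4):
        if flipped == grid:
            reflection = True
        flipped = rotate90(flipped)
    if rotation and reflection:
        return "rotation and reflection"
    if reflection:
        return "reflection only"
    if rotation:
        return "rotation only"
    return "none"


plus = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
tee = [[1, 1, 1], [0, 1, 0], [0, 1, 0]]
ell = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]

labels = [symmetry_class(p) for p in (plus, tee, ell)]
```

Here the plus sign falls in the higher-order class, the T shape is invariant under reflection only, and the L shape has neither invariance.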
3. Cotermination
The "peephole perception" demonstrations, such as the Ames chair (Ittelson, 1952) or the physical realization of the impossible triangle (Penrose & Penrose, 1958), are produced by accidental alignment of noncoterminous segments. The success of these demonstrations documents the immediate and compelling impact of this relation. The registration of cotermination is important for determining vertices that provide information which can serve to distinguish the components. In fact, one theorist (Binford, 1981) has suggested that the major function of eye movements is to determine coterminous edges. With polyhedra (volumes produced by planar surfaces), the Y, arrow, and L vertices allow inference as to the identity of the volume in the image. For example, the silhouette of a brick contains a series of six vertices, which alternate between L's and arrows, and an internal Y vertex, as illustrated in any of the straight-edged cross-sectioned volumes in Fig. 6. The Y vertex is produced by the cotermination of three segments, with none of the angles greater than 180°. (An arrow vertex contains an angle that exceeds 180°.) This vertex is not present in components that have curved cross sections, such as cylinders, and thus can provide a distinctive cue for the cross-sectional edge. Perkins (1983) has described a perceptual bias toward parallelism in the interpretation of this vertex.³ [Chakravarty (1979) has discussed the vertices formed by curved regions.] Whether the presence of this particular internal vertex can facilitate the identification of a brick versus a cylinder is not yet known, but a recent study by Biederman and Blickle (1985, described below) demonstrated that deletion of vertices adversely affected object recognition more than the deletion of the same amount of contour at midsegment.
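The angle criteria that separate the vertex types can be written down directly. The sketch below labels a vertex from the image directions of the segments that leave it, using the rule stated above: three segments with no angle over 180° form a Y, and three segments with one angle over 180° form an arrow. The function name, the degree-based input, and the handling of an exact 180° gap as a T-like junction are assumptions made for illustration.

```python
def classify_vertex(directions_deg, tol=1e-6):
    """Label a vertex from the directions (in degrees) of its segments.

    Two segments give an L. For three segments, the largest angular gap
    between neighboring segments decides between Y and arrow.
    """
    angles = sorted(d % 360.0 for d in directions_deg)
    if len(angles) == 2:
        return "L"
    if len(angles) != 3:
        return "other"
    # Gaps between neighboring segments going around the vertex;
    # the three gaps sum to 360 degrees.
    gaps = [(angles[(i + 1) % 3] - angles[i]) % 360.0 for i in range(3)]
    largest = max(gaps)
    if abs(largest - 180.0) <= tol:
        return "T"  # assumption: two collinear directions plus a stem
    return "arrow" if largest > 180.0 else "Y"


examples = {
    "corner": classify_vertex([0, 90]),
    "internal": classify_vertex([90, 210, 330]),
    "silhouette": classify_vertex([60, 90, 120]),
    "junction": classify_vertex([0, 180, 270]),
}
```

With these inputs the three evenly spread segments give a Y, the three segments bunched within 60° give an arrow, and the collinear pair with a stem gives the T-like case.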
The T vertex represents a special case in that it is not a locus of cotermination (of two or more segments), but only the termination of one segment on another. Such vertices are important for determining occlusion and thus segmentation (along with concavities) in that the edge forming the (normally) vertical segment of the T cannot be closer to the viewer than the segment forming the top of the T (Binford, 1981). By this account, the T vertex might have a somewhat different status than the Y, arrow, and L vertices in that the T’s primary role would be in segmentation rather than in establishing the identity of the volume.4
Vertices composed of three segments, such as the Y and arrow, and their curved counterparts, are important determinants as to whether a given component is volumetric or planar. Planar components are discussed below but, in general, such components lack three-pronged vertices.
The high speed and accuracy of determining a given nonaccidental relation, for example, whether some pattern is symmetrical, should be contrasted with performance in making absolute quantitative judgments of variations in a single, physical attribute, such as length of a segment or degree of tilt or curvature. For example, the judgment as to whether the length of a given segment is 10, 12, 14, 16, or 18 inches is notoriously slow and error prone (Miller, 1956; Garner, 1962; Beck et al., 1983; Virsu, 1971a,b; Fildes & Triggs, 1985). Even these modest performance levels are challenged when the judgments have to be executed over the brief 100-msec intervals (Egeth & Pachella, 1969) that are sufficient for accurate object identification. Perhaps even more telling against a view of object recognition that would postulate the making of absolute judgments of fine quantitative detail is that the speed and accuracy of such judgments decline dramatically when they have to be made for multiple attributes (Miller, 1956; Garner, 1962; Egeth & Pachella, 1969). In contrast, object recognition latencies for complex objects are reduced by the presence of additional (redundant) components (Biederman, Ju, & Clapper, 1985, described below).
3 When such vertices formed the central angle in a polyhedron, Perkins (1983) reported that the surfaces would almost always be interpreted as meeting at right angles as long as none of the three angles was less than 90°. Indeed, such vertices cannot be projections of acute angles (Kanade, 1981), but the human appears insensitive to the possibility that the vertices could have arisen from obtuse angles. If one of the angles in the central Y vertex was acute, then the polyhedra would be interpreted as irregular. Perkins found that subjects from rural areas of Botswana, where there was a lower incidence of exposure to carpentered (right-angled) environments, had an even stronger bias toward rectilinear interpretations than Westerners (Perkins & Deregowski, 1982).
VII. A Set of 36 Components Generated from Differences in Nonaccidental Properties among Generalized Cones
I have emphasized the particular set of nonaccidental properties shown in Fig. 4 because they may constitute a perceptual basis for the generation of the set of components. Any primitive that is hypothesized to be the basis of object recognition should be rapidly identifiable and invariant over viewpoint and noise. These characteristics would be attainable if differences among components were based on differences in nonaccidental properties. Although additional nonaccidental properties exist, there is empirical support for rapid perceptual access to the five described in Fig. 4. In addition, these five relations reflect intuitions about significant perceptual and cognitive differences among objects.
4 The arrangement of vertices, particularly for polyhedra, offers constraints on “possible” interpretations of lines as convex, concave, or occluding (e.g., Sugihara, 1984). In general, the constraints take the form that a segment cannot change its interpretation (e.g., from concave to convex) unless it passes through a vertex. “Impossible” objects can be constructed from violations of this constraint (Waltz, 1975) as well as from more general considerations (Sugihara, 1982, 1984). It is tempting to consider that the visual system captures these constraints in the way in which edges are grouped into objects, but the evidence would seem to argue against such an interpretation. The impossibility of most impossible objects is not immediately registered, but requires scrutiny and thought before the inconsistency is detected. What this means in the present context is that the visual system has a capacity for classifying vertices locally, but no perceptual routines for determining the global consistency of a set of vertices.
Fig. 5. Variations in generalized cones that can be detected through nonaccidental properties. Constant-sized cross sections have parallel sides; expanded or expanded and contracted cross sections have sides that are not parallel. Curved versus straight cross sections and axes are detectable through collinearity or curvature. The three values of cross-sectional symmetry (symmetrical under reflection and 90° rotation, reflection only, or asymmetrical) are detectable through the symmetry relation.
From variation over only two or three levels in the nonaccidental relations of four attributes of generalized cylinders, a set of 36 components can be generated. A subset is illustrated in Fig. 5. Some of the generated volumes and their organization are shown in Fig. 6. Three of the attributes describe characteristics of the cross section: its shape, symmetry, and constancy of size as it is swept along the axis. The fourth attribute describes the shape of the axis:
1. Cross section
   A. Edges
      S   Straight
      C   Curved
   B. Symmetry
      ++  Symmetrical: Invariant under rotation and reflection
      +   Symmetrical: Invariant under reflection
      -   Asymmetrical
   C. Constancy of size of cross section as it is swept along axis
      +   Constant
      -   Expanded
      --  Expanded and contracted
2. Axis
   D. Curvature
      +   Straight
      -   Curved
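The count of 36 follows from crossing the levels of the four attributes (2 x 3 x 3 x 2). A brief sketch of that enumeration is given below; the variable names and the tuple encoding are illustrative assumptions, not notation from the chapter.

```python
from itertools import product

# Levels of the four attributes, as listed above.
EDGE = ["straight", "curved"]
SYMMETRY = ["rotation and reflection", "reflection only", "asymmetrical"]
SIZE = ["constant", "expanded", "expanded and contracted"]
AXIS = ["straight", "curved"]

# Each component is one combination of (edge, symmetry, size, axis).
components = list(product(EDGE, SYMMETRY, SIZE, AXIS))

print(len(components))
print(components[0])
```

For example, the combination of a curved edge, rotation and reflection symmetry, constant size, and a straight axis corresponds to a cylinder, and the same combination with a straight edge corresponds to a brick with a square cross section.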
A. PERCEPTUAL BIASES AMONG THE COMPONENTS
The values of these four attributes are presented as contrastive differences in nonaccidental properties: straight versus curved, symmetrical versus asymmetrical, parallel versus nonparallel. Cross-sectional edges and curvature of the axis are distinguishable by collinearity or curvilinearity. The constant versus expanded size of the cross section would be detectable through parallelism; a constant cross section would produce a generalized cone with parallel sides (as with a cylinder or brick); an expanded cross section would produce edges that were not parallel (as with a cone or wedge), and a cross section that expanded and then contracted would produce an ellipsoid with nonparallel sides and an extremum of positive curvature (as with a lemon). As Hoffman and Richards (1985) have noted, such extrema are invariant with viewpoint.
The three levels of cross-sectional symmetry are equivalent to Garner's (1974) distinction of the number of different stimuli produced by 90° rotations and reflections of a stimulus. Thus, a square or circle would be invariant under 90° rotation and reflection; but a rectangle or ellipse would be invariant only under reflection, as 90° rotations would produce a second figure. Asymmetrical figures would produce eight different figures under 90° rotation and reflection.
Fig. 6. Proposed partial set of volumetric primitives (geons) derived from differences in nonaccidental properties.
1. Negative Values
The plus values are those favored by perceptual biases and memory errors. No bias is assumed for straight and curved edges of the cross section. For symmetry, clear biases have been documented. For example, if an image could have arisen from a symmetrical object, then it is interpreted as symmetrical (King et al., 1976). The same is apparently true of parallelism. If edges could be parallel, then they are typically interpreted as such, as with the trapezoidal room or window.
2. Curved Axes
Figure 7 shows three of the most negatively marked primitives with curved cross sections. Such volumes often resemble biological entities. An expansion and contraction of a rounded cross section with a straight axis produces an ellipsoid (lemon) (Fig. 7a), an expanded cross section with a curved axis produces a horn (Fig. 7b), and an expanded and contracted rounded cross section with a curved axis produces a banana slug or gourd (Fig. 7c).
In contrast to the natural forms generated when both cross section and axis are curved, the components swept by a straight-edged cross section traveling along a curved axis (e.g., the components on the first, third, and fifth rows of Fig. 8) appear somewhat less familiar and more difficult to apprehend than their curved counterparts. It is possible that this difficulty may merely be a consequence of unfamiliarity. Alternatively, the subjective difficulty might be produced by a conjunction-attention effect (CAE) of the kind discussed by Treisman (e.g., Treisman & Gelade, 1980). CAEs are described in the section on attentional effects. In the present case, given the presence in the image of curves and straight edges (for the rectilinear cross sections with curved axis), attention (or scrutiny) may be required to determine which kind of segment to assign to the axis and which to assign to the cross section. Curiously, the problem does not present itself when a curved cross section is run along a straight axis to produce a cylinder or cone. The issue as to the role of attention in determining components would appear to be empirically tractable using the paradigms created by Treisman and her colleagues (Treisman & Gelade, 1980; Treisman, 1982a,b; Treisman & Schmidt, 1983).
Fig. 7. Three curved components with curved axes or expanded and/or contracted cross sections. These tend to resemble biological forms. [Panel labels: (a) Lemon: curved edge, symmetrical, expanded and contracted cross section; (b) Horn: curved edge, symmetrical, expanded cross section, curved axis; (c) Gourd: curved edge, symmetrical, expanded and contracted cross section, curved axis.]
3. Asymmetrical Cross Sections
There is an infinity of possible cross sections that could be asymmetrical. How does RBC represent this variation? RBC assumes that the differences in the departures from symmetry are not readily available and thus do not affect primal access. For example, the difference in the shape of the cross section for the two straight-edged volumes in Fig. 9 might not be apparent sufficiently quickly to affect object recognition. This does not mean that an individual could not store the details of the volume produced by an asymmetrical cross section. But if such detail required additional time for its access, then the expectation is that it could not mediate primal access. As of this writing, I do not know of any case where primal access depends on discrimination among asymmetrical cross sections within a given component type, for example, among curved-edged cross sections of constant size, straight axes, and a specified aspect ratio. For example, the curved cross section for the component that can model an airplane wing (or car door) is asymmetrical. Different wing designs might have different-shaped cross sections. I assume that most people, including wing designers, will know that the object is an airplane, or even an airplane wing, before they know how to classify the wing on the basis of the asymmetry of its cross section.
A second way in which asymmetrical cross sections need not be individually represented is that they often produce volumes that resemble symmetrical, but truncated wedges. This latter form of representing asymmetrical cross sections would be analogous to the schema-plus-correction phenomenon noted by Bartlett (1932). The implication of a schema-plus-correction representation would be that a single primitive category for asymmetrical cross sections and wedges might be sufficient. For both kinds of volumes, their similarity may be a function of the detection of a lack of parallelism in the volume. One would have to exert scrutiny to determine whether a lack of parallelism was caused by a cross section with nonparallel sides or by a symmetrical cross section that varied in size. In this case, as with the components with curved axes described in the preceding section, a single primitive category for both wedges and asymmetrical straight-edged volumes could be postulated that would allow a reduction in the number of primitive components.
There is considerable evidence that asymmetrical patterns require more time for their identification than symmetrical patterns (Checkosky & Whitlock, 1973; Pomerantz, 1978). Whether these effects have consequences for the time required for object identification is not yet known.
[Fig. 8. Figure and caption not recovered. Legible column headings: Cross section (Edge: Straight S, Curved C; Symmetry; Size) and Axis.]
Fig. 9. Volumes with an asymmetrical, straight-edged cross section. Detection of differences between such volumes might require attention.
4. Conjunction-Attentional Effects
A single feature can often be detected without any effect of the number of distracting items in the visual field. For example, the time for detecting a blue shape (a square or a circle) among a field of green distracter shapes is unaffected by the number of green shapes. However, if the target is defined by a conjunction of features, for example, a blue square among distracters consisting of green squares and blue circles, so that both the color and the shape of each item must be determined to know if it is or is not the target, then target detection time increases linearly with the number of distracters (Treisman & Gelade, 1980). These results have led to a theory of visual attention that assumes that the human can monitor all potential display positions simultaneously and with unlimited capacity for a single feature (e.g., something blue or something curved). But when a target is defined by a conjunction of features, then a limited capacity attentional system that can only examine one display position at a time must be deployed (Treisman & Gelade, 1980).
The extent to which Treisman and Gelade’s (1980) demonstration of conjunction-attention effects may be applicable to the perception of volumes and objects has yet to be evaluated. In the extreme, in a given moment of attention, it may be the case that the values of the four attributes of the components are detected as independent features. In cases where the attributes, taken independently, can define different volumes, as with the shape of cross sections and axis, an act of attention might be required to determine the specific component generating those attributes: Am I looking at a component with a curved cross section and a straight axis or is it a straight cross section and a curved axis? At the other extreme, it may be that an object recognition system has evolved to allow automatic determination of the components. The more general issue is whether relational structures for the primitive components are defined automatically or whether a limited attentional capacity is required to build them from their individual edge attributes. It could be the case that some of the most positively marked volumes are detected automatically, but that the volumes with negatively marked attributes might require attention.
That some limited capacity is involved in the perception of objects (but not necessarily their components) is documented by an effect of the number of irrelevant objects on perceptual search (Biederman, 1981). Reaction times and errors for detecting an object, for example, a chair, increased linearly as a function of the number of nontarget objects in a 100-msec presentation of a clockface display (Biederman, 1981). Whether this effect arises from the necessity to use a limited capacity to construct a component from its attributes or whether the effect arises from the matching of an arrangement of components to a representation is not yet known.
B. ADDITIONAL SOURCES OF CONTOUR VARIATION
1. Metric Variation
For any given component type, there can be an infinite degree of metric variation in aspect ratio, degree of curvature (for curved components), and departure from parallelism (for nonparallel components). How should this quantitative variation be conceptualized? The discussion will concentrate on aspect ratio, probably the most important of the variations. But the issues will be generally applicable to the other metric variations as well. [Aspect ratio is a measure of the elongation of a component. It can be expressed as the width-to-height ratio of the smallest bounding rectangle that would just enclose the component. It is somewhat unclear as to how to handle components with a curved axis. The bounding rectangle could simply enclose the component, whatever its shape. Alternatively, two rectangles could be constructed.]
One possibility is to include specification of a range of aspect ratios in the structural description of the object. It seems plausible to assume that recognition can be indexed, in part, by aspect ratio in addition to a componential description. An object’s aspect ratio would thus play a role similar to that played by word length in the tachistoscopic identification of words, where long words are rarely proffered when a short word is flashed. Consider an elongated object, such as a baseball bat with an aspect ratio of 15:1. When the orientation of the object is orthogonal to the viewpoint so that the aspect ratio of its image is also 15:1, recognition might be faster than when presented at an orientation where the aspect ratio of its image differed greatly from that value, say 2:1. One need not have a particularly fine-tuned function for aspect ratio as large differences in aspect ratio between two components would, like parallelism, be preserved over a large proportion of arbitrary viewing angles.
Another way to incorporate variations in the aspect ratio of an object’s image is to represent only qualitative differences so that variations in aspect ratios exert an effect only when the relative sizes of the longest dimensions undergo reversal. Specifically, for each component and the complete object, three variations could be defined, depending on whether the axis was much smaller, approximately equal to, or much longer than the longest dimension of the cross section.
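The qualitative coding of aspect ratio can be illustrated with a rough sketch. It rests on several assumptions that are not in the text: orthographic projection, a component whose projected axis length is its true length times the sine of the angle between the axis and the line of sight, end caps ignored, and an arbitrary `margin` of 1.5 separating "approximately equal" from the two extreme levels. The function names are hypothetical.

```python
import math


def projected_aspect_ratio(axis_length, cross_section_diameter, slant_deg):
    """Image ratio of axis extent to cross-section extent.

    slant_deg is the angle between the component's axis and the line of sight
    (90 degrees means the axis is orthogonal to the viewpoint). Assumes
    orthographic projection and ignores the end caps of the component.
    """
    projected_axis = axis_length * math.sin(math.radians(slant_deg))
    return projected_axis / cross_section_diameter


def qualitative_level(ratio, margin=1.5):
    """Three-level coding of the axis relative to the cross section."""
    if ratio > margin:
        return "axis longer"
    if ratio < 1.0 / margin:
        return "axis shorter"
    return "approximately equal"


# A 15:1 bat seen side-on, obliquely (image ratio near 2:1), and nearly end-on.
bat_side = projected_aspect_ratio(15.0, 1.0, 90.0)
bat_oblique = projected_aspect_ratio(15.0, 1.0, 7.7)
bat_end_on = projected_aspect_ratio(15.0, 1.0, 2.0)

for ratio in (bat_side, bat_oblique, bat_end_on):
    print(round(ratio, 2), qualitative_level(ratio))
```

Under these assumptions the side-on and oblique views fall in the same qualitative level even though their image aspect ratios differ greatly, and only the nearly end-on view reverses the relation between axis and cross section.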
For example, for a component whose axis was longer than the diameter of the cross section (which would be true in most cases), only when the projection of the cross section became longer than the axis would there be an effect of the object’s orientation, as when the bat was viewed almost from on end so that the diameter of the handle was greater than the projection of its length. A close dependence of object recognition performance on the preservation of the aspect ratio of a component in the image would be inconsistent with the emphasis by RBC on dichotomous contrasts of nonaccidental relations. Fortunately, these issues on the role of aspect ratio are readily testable. Bartram’s (1976) experiments, described later in Section XI,A, suggest that sensitivity to variations in aspect ratio need not be given heavy weight: Recognition speed is unaffected by variation in aspect ratio across different views of the same object.
2. Planar Components
A special case of aspect ratio needs to be considered: When the axis for a constant cross section is much smaller than the greatest extent of the cross section, a component may lose its volumetric character and appear planar, as with the flipper of the penguin in Fig. 10 or the eye of the elephant in Fig. 11. Such shapes can be conceptualized in two ways. The first (and less favored) is to assume that these are just quantitative variations of the volumetric components, but with an axis length of zero. They would then have default values of a straight axis (+) and a constant cross section (+). Only the edge of the cross section and its symmetry could vary. Alternatively, it might be that a flat shape is not related perceptually to the foreshortened projection of the volume that could have produced it. Using the same variation in cross-sectional edge and symmetry as with the volumetric components, seven planar components could be defined. For ++ symmetry, there would be the square and circle (with straight and curved edges, respectively), and for + symmetry the rectangle, triangle, and ellipse. Asymmetrical (-) planar components would include trapezoids (straight edges) and drop shapes (curved edges). The addition of these seven planar components to the 36 volumetric components yields 43 components (a number close to the number of phonemes required to represent English words). The triangle is here assumed to define a separate component, although a triangular cross section was not assumed to define a separate volume under the intuition that a prism (produced by a triangular cross section) is not quickly distinguishable from wedges.
My preference for assuming that planar components are not perceptually related to their foreshortened volumes is based on the extraordinary difficulty of recognizing objects from views that are parallel to the axis of the major components, as shown in Fig. 26 (below). What might be critical here is the presence of a trihedral vertex, such as a fork or an arrow, or a curved counterpart to such vertices (Chakravarty, 1979). Such vertices provide strong evidence that the image is generated from a volumetric rather than a planar component.
3. Selection of Axis
Given that a volume is segmented from the object, how is an axis selected? Subjectively, it appears that an axis is selected that would maximize its length, the symmetry of the cross section, and the constancy of the size of the cross section. It may be that by having the axis correspond to the longest extent of the component, bilateral symmetry can be more readily detected as the sides would be closer. Typically, a single axis satisfies all three criteria, but sometimes these criteria are in opposition and two (or more) axes (and component types) are plausible (Brady, 1983). Under these conditions, axes will often be aligned to an external frame, such as the vertical (Humphreys, 1983).
4. Parsing at Joins without Concavities
RBC assumes that parsing is primarily performed at regions of concavity. Some objects, however, can be readily modeled with a pair of components but no concavity is apparent at the join of the components. For example, a rocket (or any cylinder with a tapered end) can be modeled by joining a cylinder and a cone. A cane furnishes another example. The join between the handle (a cylinder with a curved axis) and the long straight section does not have a concavity. Because the cross sections of the components in these cases are of identical shape and size, no concavity is produced. Such cases can be accommodated by formulating a secondary parsing rule: Parsing, if it is performed at all in the absence of concavities, occurs at regions where nonaccidental properties vary. In the case of the rocket, there would be a change from parallelism of the sides of the rocket’s tank to converging (nonparallel) edges for its nose cone. For the cane, it would be the change from straight to curved sides of the components. Almost always, of course, whenever volumes have different sized cross sections or differ in a nonaccidental property, concavities will be produced and it is these concavities that provide the most compelling support for segmentation. It is possible that when the secondary rule forms the only basis for parsing, recognition performance would suffer compared to objects whose components were segmentable at concavities.
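The secondary parsing rule can be sketched for the rocket example. The sketch assumes, purely for illustration, that a silhouette has been summarized as a list of widths sampled at equal steps along the axis; the function name `parse_points` and the tolerance `tol` are hypothetical, and the chapter does not commit to any such procedure.

```python
def parse_points(widths, tol=1e-6):
    """Return sample indices where the sides switch between parallel and nonparallel.

    widths: silhouette widths sampled at equal steps along the axis.
    An interval counts as 'parallel' when the width does not change across it.
    """
    labels = [
        "parallel" if abs(b - a) <= tol else "nonparallel"
        for a, b in zip(widths, widths[1:])
    ]
    # A parse point is the sample shared by two intervals with different labels.
    return [
        i + 1
        for i, (left, right) in enumerate(zip(labels, labels[1:]))
        if left != right
    ]


rocket = [4, 4, 4, 4, 3, 2, 1, 0]   # tank with parallel sides, then a tapering nose cone
cylinder = [4, 4, 4, 4, 4, 4]       # parallel sides throughout

print(parse_points(rocket))
print(parse_points(cylinder))
```

For the rocket profile the rule proposes a single parse point at the sample where the parallel sides give way to converging sides; the plain cylinder yields none.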
VIII. Relation of RBC to Principles of Perceptual Organization
Textbook presentations of perception typically include a section on gestalt organizational principles. This section is almost never linked to any other function of perception. RBC posits a specific role for these organizational phenomena in pattern recognition. Specifically, as suggested by the section on generating components through nonaccidental properties, the gestalt principles (or better, nonaccidental relations) serve to determine the individual components rather than the complete object. A complete object, such as a chair, can be highly complex and asymmetrical, but the components will be simple volumes. A consequence of this interpretation is that it is the components that will be stable under noise or perturbation. If the components can be recovered and object perception is based on the components, then the object will be recognizable.
This may be the reason why it is difficult to camouflage objects by moderate doses of random occluding noise, as when a car is viewed behind foliage. According to RBC, the components accessing the representation of an object can readily be recovered through routines of collinearity or curvature that restore contours (Lowe, 1984). These mechanisms for contour restoration will not bridge cusps. For visual noise to be effective, by these considerations, it must obliterate the concavity and interrupt the contours from one component at the precise point where they can be joined, through collinearity or constant curvature, with the contours of another component. The likelihood of this occurring by moderate random noise is, of course, extraordinarily low, and it is a major reason why, according to RBC, objects are rarely rendered unidentifiable by noise.
The consistency of RBC with this interpretation of perceptual organization should be noted. RBC holds that the (strong) loci of parsing are at cusps; the components are organized from the contours between cusps. In classical gestalt demonstrations, good figures are organized from the contours between cusps. Experiments subjecting these conjectures to test are described in a later section.
IX. A Limited Number of Components? The motivation behind the conjecture that there may be a limit to the number of primitive components derives from both empirical and computational considerations in addition to the limited number of components that can be discriminated from differences in nonaccidental properties among generalized cones. People are not sensitive to continuous metric variations, as evidenced by severe limitations in the human’s capacity for making rapid and accurate absolute judgments of quantitative shape variation^.^ The errors made in the memory for shapes also document an insensitivity to metric variations. Computationally, a limit is suggested by estimates of the number of objects we might know and the capacity for RBC to readily represent a far greater number with a limited number of primitives. A.
EMPIRICAL SUPPORT FOR
A
LIMIT
Although the visual system is capable of discriminatingextremely fine detail, I have been assuming that the number of volumetric primitives sufficient to model rapid human object recognition may be limited. It should be noted that the number of proposed primitives is greater than the three-cylinder, sphere, and cone-advocated by many how-to-draw books. Although these three may be sufficient for determining relative proportions of the parts of a figure and can furnish aid for perspective, they are not sufficient for the rapid identification of objects.6 Similarly, M a n and Nishihara’s (1978) pipe cleaner (viz., cylinder) ‘Absolute judgments are judgments made against a standard in memory (e.g., that shape A is 14 inches in length). Such judgments are to be distinguished from comparative judgments in which both stimuli are available for simultaneous comparison (e.g., that shape A, lying alongside shape B, is longer than B). Comparative judgments appear limited only by the resolving power of the sensory system. Absolute judgments are limited, in addition, by memory for physical variation. That the memory limitations are severe is evidenced by the finding that comparative judgments can be made quickly and accurately for differences so fine that tens of thousands of levels can be discriminated. But accurate absolute judgments rarely exceed 7 f 2 categories (Miller, 1956). 6Paul Cezanne is often incorrectly cited on this point. “Treat nature by the cylinder, the sphere, the cone, everything in proper perspective so that each side ofan object or plane is directed towards
24
Irving Biedennan
representations of animals (1978) would also appear to posit an insufficient number of primitives. On the page, in the context of other labeled pipe cleaner animals, it is certainly possible to arrive at an identification of a particular (labeled) animal (e.g., a giraffe). But the thesis proposed here would hold that the identifications of objects that were distinguished only by the aspect ratios of a single component type would require more time than if the representation of the object preserved its componential identity. In modeling only animals, it is likely that Marr and Nishihara capitalized on the possibility that appendages (e.g., legs and neck) can often be modeled by the cylindrical forms of a pipe cleaner. By contrast, it is unlikely that a pipe cleaner representation of a desk would have had any success. The lesson from Marr and Nishihara’s demonstration, even limited for animals, may well be that a single component, varying only in aspect ratio (and arrangement with other components), is insufficient for primal access. As noted earlier, one reason not to posit a representation system based on fine quantitative detail (e.g., many variations in degree of curvature) is that such absolutejudgments are notoriously slow and error prone unless limited to the 7 2 2 values argued by Miller (1956). Even this modest limit is challenged when the judgments have to be executed over a brief 100-msec interval (Egeth & Pachella, 1969) that is sufficient for accurate object identification. A further reduction in the capacity for absolute judgments of quantitative variations of a simple shape would derive from the necessity, for most objects, to make simultaneous absolute judgments for the several shapes that constitute the object’s parts (Miller, 1956; Egeth & Pachella, 1969). 
This limitation on our capacities for making absolute judgments of physical variation, when combined with the dependence of such variation on orientation and noise, makes quantitative shape judgments a most implausible basis for object recognition. RBC’s alternative is that the perceptual discriminations required to determine the primitive components can be made qualitatively, requiring the discrimination of only two or three viewpoint-independent levels of variation.’ Our memory for irregular shapes shows clear biases toward “regularization” (e.g., Woodworth, 1938). Amply documented in the classical shape memory literature was the tendency for errors in the reproduction and recognition of irregular shapes to be in a direction of regularization in which slight deviations from symmetrical or regular figures were omitted in attempts at reproduction. Alternatively, some irregularities were emphasized ( “accentuation”), typically
a central point” (italics mine; Cezanne, 1904/1941). Cezanne was referring to perspective, not the veridical representation of objects. 7Limitationon our capacities for absolute judgments also occurs in the auditory domain (Miller, 1956). It is possible that the limited number of phonemes derives more from this limitation for accessing memory for fine quantitative variation than it does from limits on the fineness of the commands to the speech musculature.
by the addition of a regular subpart. What is the significance of these memory biases? By the RBC hypothesis, these errors may have their origin in the mapping of the perceptual input onto a representational system based on regular primitives. The memory of a slight irregular form would be coded as the closest regularized neighbor of that form. If the irregularity was to be represented as well, an act that would presumably require additional time and capacity, then an additional code (sometimes a component) would be added, as with Bartlett’s (1932) idea of “schema with correction.”
B. COMPUTATIONAL CONSIDERATIONS: ARE 36 COMPONENTS SUFFICIENT?
Is there sufficient coding capacity in a set of 36 components to represent the basic level categorizations that we can make? Two estimates are needed to provide a response to this question: (1) the number of readily available perceptual categories, and (2) the number of possible objects that could be represented by 36 components. The number of possible objects that could be represented by 36 components will depend on the allowable relations among the components. Obviously, the value for estimate (2) would have to be greater than the value for estimate (1) if 36 components are to prove sufficient.
C. HOW MANY READILY DISTINGUISHABLE OBJECTS DO PEOPLE KNOW?
How might one arrive at a liberal estimate for this value? One estimate can be obtained from the lexicon. There are fewer than 1500 relatively common basic level object categories, such as chairs and elephants.8 If we assume that this estimate is too small by a factor of two, then we can assume potential classification into approximately 3000 basic level categories. As will be discussed, RBC holds that perception is based on the particular subordinate level object rather than the basic level category, so we need to estimate the mean number of instances per basic level category that would have readily distinguishable exemplars. Almost all natural categories, such as elephants or giraffes, have one or only a few instances with differing componential description. Dogs represent a rare exception for natural categories in that they have been bred to have considerable variation in their descriptions. Person-made categories vary in the number of allowable types, but this number often tends to be greater than for the natural categories. Cups, typewriters, and lamps have just a few (in the case of cups) to perhaps 15 or more (in the case of lamps) readily discernible exemplars. Let’s assume (liberally) that the mean number of types is 10. This would yield an estimate of 30,000 readily discriminable objects (3000 categories × 10 types/category).
A second source for the estimate is the rate of learning new objects. A total of 30,000 objects would require learning an average of 4.5 objects per day every day for 18 years, the modal age of the subjects in the experiments described below. Although the value of 4.5 objects learned per day seems reasonable for a child in that it approximates the maximum rates of word acquisition during the ages of 2-6 years (Carey, 1978; Miller, 1977), it certainly overestimates the rate at which adults develop new object categories. The impressive visual recognition competence of a child of 6, if it was based on 30,000 visual categories, would require the learning of 13.5 objects per day, or about 1 per waking hour. By the criterion of learning rate, 30,000 categories would appear to be a liberal estimate.
8 This estimate was obtained from three sources: (1) Several linguists and cognitive psychologists provided guesses of from 300 to 1000 concrete noun object categories. (2) The 6-year-old child can name most of the objects that he or she sees on television and has a vocabulary that is under 10,000 words. Perhaps 10%, at most, are concrete nouns. (3) Perhaps the most defensible estimate was obtained from a sample of Webster’s Seventh New Collegiate Dictionary. The author sampled 30 pages and counted the number of readily identifiable, unique concrete nouns that would not be subordinate to other nouns. Thus, “wood thrush” was not counted because it could not be readily discriminated from “sparrow.” “Penguin” and “ostrich” and any doubtful entries were counted as separate noun categories. The mean number of nouns per page was 1.4; with a 1200-page dictionary, this is equivalent to 1600 noun categories.
D. RELATIONS AMONG THE COMPONENTS AND THE REPRESENTATIONAL CAPACITY OF 36 COMPONENTS
This calculation is dependent upon two estimates: (1) the number of components needed to uniquely define each object, and (2) the number of readily discriminable relations among the components. We will start with estimate (2) and see if it will lead to a plausible value for estimate (1). A possible set of relations is presented in Table I. Like the components, the properties of the relations noted in Table I are nonaccidental in that they can be determined from almost any viewpoint, are preserved in the two-dimensional image, and require the discrimination of only two or three levels. The specification of these five is conservative in that (1) it is a nonexhaustive set in that other relations can be defined, and (2) the relations are only specified for a pair, rather than triples, of components. Let’s consider these in order of their appearance on the table.
Relative size. For any pair of components, C1 and C2, C1 could be much greater than, smaller than, or approximately equal to C2.
Verticality. C1 can be above or below C2, a relation, by the author’s estimate, that is defined for at least 80% of the objects. Thus, giraffes, chairs, and typewriters have a top-down specification of their components, but forks and knives do not.
TABLE I
GENERATIVE POWER OF 36 COMPONENTS

    36     First component, C1
  × 36     Second component, C2
  × 3      Size (C1 >> C2, C2 >> C1, C1 = C2)
  × 1.8    C1 top or bottom (represented for 80% of the objects)
  × 2      Nature of join [end to end (off-center) or end to side (centered)]
  × 2      Join at long or short surface of C1
  × 2      Join at long or short surface of C2
  = 55,987 possible two-component objects

With three components: 55,987 × 36 × 43.2 = 87 million possible three-component objects; equivalent to learning 13,242 new objects every day (approximately 827/waking hour or 13/minute) for 18 years
Centering. The connection between any pair of joined components can be end to end (and of equal-sized cross section at the join), as the upper and forearms of a person, or end to side, producing one or two concavities, respectively (Marr, 1977). Two-concavity joins are far more common in that it is rare that two end-to-end joined components will have equal-sized cross sections. A more general distinction might be whether the end of one component in an end-to-side join is centered or off-centered at the side of the other component. The end-to-end join might represent only the limiting, albeit special case of off-centered joins. In general, the arbitrary connection of any two volumes (or shapes) will produce two concavities. Hoffman and Richards (1985) discuss the production of concavities through the meeting of surfaces as a principle of transversality.
Relative size of surfaces at join. Other than a sphere and a cube, all primitives will have at least a long and a short surface. The join can be on either surface. The attaché case in Fig. 2a and the strongbox in Fig. 2b differ by the relative lengths of the surfaces of the brick that are connected to the arch (handle). The handle on the shortest surface produces the strongbox; on a longer surface, the attaché case. Similarly, among other differences, the cup and the pail in Fig. 2c and d, respectively, differ as to whether the handle is connected to the long surface of the cylinder (to produce a cup) or the short surface (to produce a pail). In considering only two values for the relative size of the surface at the join, we are conservatively estimating the relational possibilities. Some volumes such as the wedge have as many as five surfaces, all of which can differ in size.
Representational Calculations
The 1296 different pairs of the 36 volumes (i.e., 36²), when multiplied by the number of relational combinations, 43.2 (the product of the various values of the five relations), give us 55,987 possible two-component objects. If a third component is added to the two, then this value has to be multiplied by 1555 pairs of possible components (36 components × 43.2 ways in which the third component can be related to one of the two components) to yield 87 million possible three-component objects. If only 1% of the possible combinations of components were actually used (i.e., 99% redundancy), then the 36 components with the five relations could represent 870,000 possible objects. One would have to acquire 132 objects per day for 18 years (or about 8 per waking hour) to reach this value. This value constrains the estimate of the number of components per object that would be required for the unambiguous identification. If objects were distributed relatively homogeneously among combinations of relations and components, then only two or three components would be sufficient to unambiguously represent most objects! We do not yet know if there is a real limit to the number of components. A limit to the number of components would imply categorical effects such that quantitative variations in the contours of an object (e.g., degree of curvature) which did not alter a component’s identity would have less of an effect on the identification of the object than contour variations that did alter a component’s identity.
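The arithmetic behind these capacity estimates can be checked with a short script. What follows is a minimal sketch in Python, not part of the original analysis: the variable and function names are illustrative, and the 365-day year is an assumption made here. The numerical inputs are the ones given in Table I and in the text.

```python
from math import prod

# Values taken from Table I and the surrounding text.
N_COMPONENTS = 36
RELATION_LEVELS = {
    "relative size": 3,
    "verticality": 1.8,        # value given in Table I (defined for ~80% of objects)
    "nature of join": 2,
    "join surface of C1": 2,
    "join surface of C2": 2,
}


def objects_per_day(n_objects, years=18, days_per_year=365):
    """Average number of new objects per day needed to reach n_objects.

    The 365-day year is an assumption of this sketch.
    """
    return n_objects / (years * days_per_year)


# Estimate (2): objects representable by 36 components and five relations.
relation_combos = prod(RELATION_LEVELS.values())                   # about 43.2
two_component = N_COMPONENTS ** 2 * relation_combos                # about 55,987
three_component = two_component * N_COMPONENTS * relation_combos   # about 87 million

# Estimate (1): the liberal count of known objects (3000 categories x 10 types).
known_objects = 3000 * 10

# Capacity if only 1% of three-component combinations were actually used.
usable_objects = 0.01 * three_component

print(f"relational combinations:  {relation_combos:.1f}")
print(f"two-component objects:    {two_component:,.0f}")
print(f"three-component objects:  {three_component:,.0f}")
print(f"known objects per day:    {objects_per_day(known_objects):.1f}")
print(f"1% capacity per day:      {objects_per_day(usable_objects):.0f}")
print(f"capacity exceeds need:    {usable_objects > known_objects}")
```

The exact three-component product is slightly above 87 million; the daily rates quoted in the text and in Table I follow from the rounded figures of 87 million and 870,000. On either figure, estimate (2) exceeds estimate (1) by more than an order of magnitude even at 99% redundancy.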
X. Experimental Support for a Componential Representation
According to the RBC hypothesis, the preferred input for accessing object recognition is that of the volumetric components. In most cases, only a few appropriately arranged volumes would be all that is required to uniquely specify an object. Rapid object recognition should then be possible. Neither the full complement of an object’s components nor its texture, color, or the full bounding contour (or envelope or outline) need be present for rapid identification. The problem of recognizing tens of thousands of possible objects becomes, in each case, just a simple task of identifying the arrangement of a few from a limited set of components.
Overview of Experiments
Several object-naming reaction time experiments have provided support for the general assumptions of the RBC hypothesis, although none has provided tests for the specific set of components proposed by RBC. In all experiments, subjects
named briefly presented pictures of common objects. That RBC may provide a sufficient account of object recognition was supported by experiments indicating that objects drawn with only two or three of their components could be accurately identified from a single 100-msec exposure. When shown with a complete set of components, these simple line drawings were identified almost as rapidly as full-colored, detailed, textured slides of the same objects. That RBC may provide a necessary account of object recognition was supported by a demonstration that degradation (contour deletion), if applied at the regions that are critical according to RBC, rendered an object unidentifiable. All the original experimental results reported here have received at least one (and often several) replication.
A. PERCEIVING INCOMPLETE OBJECTS
Biederman et al. (1985) studied the perception of briefly presented partial objects lacking some of their components. A prediction of RBC was that only two or three components would be sufficient for rapid identification of most objects. If there was enough time to determine the components and their relations, then object identification should be possible. Complete objects would be maximally similar to their representation and should enjoy an identification speed advantage over their partial versions.
1. Stimuli
The experimental objects were line drawings of 36 common objects, 9 of which are illustrated in Fig. 10. The depiction of the objects and their partition into components were done subjectively, according to generally easy agreement among at least three judges. The artists were unaware of the set of components described in this article. For the most part, the components corresponded to the parts of the object. Seventeen component types were sufficient to represent the 180 components comprising the complete versions of the 36 objects. The objects were shown either with their full complement of components or partially, but never with fewer than two components. The first two components that were selected were the largest and most diagnostic components from the complete object. Additional components were added in decreasing order of size and/or diagnosticity, subject to the constraint that the additional component be connected to the existing components, as illustrated in Figs. 11 and 12. For example, the airplane, which required nine components to look complete, would have the fuselage and two wings when shown with three of the nine components. The objects were displayed in black line on a white background and averaged 4.5° in greatest extent. The purpose of this experiment was to determine whether the first few components that would be available from an unoccluded view of a complete object
Fig. 10. Nine of the experimental objects.
would be sufficient for rapid identification of the object. In normal viewing, the largest and most diagnostic components are available for perception. We ordered the components by size and diagnosticity because our interest, as just noted, was in primal access in recognizing a complete object. Assuming that the largest and most diagnostic components would control this access, we studied the contribution of the nth largest and most diagnostic component, when added to the n - 1 already existing components, because this would more closely mimic the contribution of that component when looking at the complete object. (Another kind of experiment might explore the contribution of an “average” component by balancing the order of addition of the components. Such an experiment would be relevant to the recognition of an object that was occluded in such a way that only the displayed components would be available for viewing.)
Fig. 11. Illustration of the partial and complete versions of two three-component objects (the wine glass and flashlight) and a nine-component object (the penguin).
Fig. 12. Illustration of partial and complete versions of a nine-component object (airplane).
2. Complexity
The objects shown in Fig. 10 illustrate the second major variable in the experiment. Objects differ in complexity; by RBC’s definition, in the number of components that they require to look complete. For example, the lamp, flashlight, watering can, scissors, and elephant require two, three, four, six, and nine components, respectively. As noted previously, it would seem plausible that partial objects would require more time for their identification than complete objects, so that a complete airplane of nine components, for example, might be more rapidly recognized than only a partial version of that airplane with only three of its components. The prediction from RBC was that complex objects, furnishing more diagnostic combinations of components, would be more rapidly identified than simple objects. This prediction is contrary to those models that assume that objects are recognized through a serial contour tracing process (e.g., Hochberg, 1978; Ullman, 1983).
3. General Procedure
Trials were self-paced. The depression of a key on the subject’s terminal initiated a sequence of exposures from three projectors. First, the corners of a 500-msec fixation rectangle (6° wide) which corresponded to the corners of the object slide were shown. The fixation slide was immediately followed by a 100-msec exposure of a slide of an object that had varying numbers of its components present. The presentation of the object was immediately followed by a 500-msec pattern mask consisting of a random-appearing arrangement of lines. The subject’s task was to name the object as fast as possible into a microphone which triggered a voice key. The experimenter recorded errors. Prior to the experiment, the subjects read a list of the object names to be used in the experiment. [Subsequent experiments revealed that this procedure for name familiarization produced no effect. When subjects were not familiarized with the names of the experimental objects, results were virtually identical to those obtained when such familiarization was provided. This finding indicates that the results of these experiments were not a function of inference over a small set of objects.] Even with the name familiarization, all responses that indicated that the object was identified were considered correct. Thus, “pistol,” “revolver,” “gun,” and “handgun” were all acceptable as correct responses for the same object. Reaction times (RTs) were recorded by a microcomputer which also controlled the projectors and provided speed and accuracy feedback on the subject’s terminal after each trial. Objects were selected that required two, three, six, or nine components to look complete. There were 9 objects for each of these complexity levels, yielding a total set of 36 objects. The various combinations of the partial versions of these objects brought the total number of experimental trials (slides) to 99. Each of 48 subjects viewed all the experimental slides, with balancing accomplished by varying the order of the slides.
4. Results
Figure 13 shows the mean error rates as a function of the number of components actually displayed on a given trial for the conditions in which no familiarization was provided. Each function is the mean for the nine objects at a given complexity level. Although each subject saw all 99 slides, only the data for the first time that a subject viewed a particular object will be discussed here. For a given level of complexity, increasing numbers of components resulted in better performance, but error rates were modest. When only three or four components for the complex objects (those with six or nine components to look complete) were present, subjects were almost 90% accurate (10% error rate). In general, the complete objects were named without error, so it is necessary to look at the RTs to see if differences emerge for the complexity variable.
Fig. 13. Mean percentage of error as a function of the number of components in the displayed object (abscissa) and the number of components required for the object to appear complete (parameter). Each point is the mean for nine objects on the first occasion when a subject saw that particular object.
Mean correct RTs, shown in Fig. 14, provide the same general outcome as the errors, except that there was a slight tendency for the more complex objects, when complete, to have shorter RTs than the simple objects. This advantage for the complex objects was actually underestimated in that the complex objects had longer names (three and four syllables) and were less familiar than the simple objects. Oldfield (1959) showed that object-naming RTs were longer for names that have more syllables or are infrequent. This effect of slightly shorter RTs for naming complex objects has been replicated, and it seems safe to conclude, conservatively, that complex objects do not require more time for their identification than simple objects. This result is contrary to serial contour tracing models of shape perception (e.g., Hochberg, 1978; Ullman, 1983). Such models would predict that complex objects would require more time to be seen as complete compared to simple objects, which have less contour to trace. The slight RT advantage enjoyed by the complex objects is an effect that would be expected if their additional components were affording a redundancy gain from more possible diagnostic matches to their representations in memory.
Fig. 14. Mean correct reaction time as a function of the number of components in the displayed object (abscissa) and the number of components required for the object to appear complete (parameter). Each point is the mean for nine objects on the first occasion when a subject saw that particular object.
B. LINE DRAWINGS VERSUS COLORED PHOTOGRAPHS
The components that are postulated to be the critical units for recognition can be depicted by a line drawing. Color and texture would be secondary routes for recognition. From this perspective, Biederman and Ju (1985) reasoned that naming RTs for objects shown as line drawings should closely approximate naming RTs for those objects when shown as colored photographic slides with complete detail, color, and texture. In the Biederman and Ju experiments, subjects identified brief presentations (50-100 msec) of slides of common objects. Each object was shown in two versions: professionally photographed in full color or as a simplified line drawing showing only the object’s major components (such as those in Fig. 10). Color and lightness were diagnostic for some of the objects (e.g., banana, fork, fish, camera), but not others (e.g., chair, pen, mitten, bicycle pump). In three experiments subjects named the object; in a fourth experiment a yes-no verification task was performed against a target name. Overall, performance levels with the two types of stimuli were equivalent: mean latencies in identifying images presented by color photography were 11 msec shorter than for the drawings, but with a 3.9% higher error rate. An occasional advantage for the color slides was likely due to a more faithful rendition of the object’s components rather than any use of color for recognition: The advantage for the colored slides was independent of whether its color was diagnostic of its identity. Moreover, there was no color diagnosticity advantage (much less an increased advantage) of the color slides on the verification task, where the color of the to-be-verified object could be anticipated. If color mediated recognition, then targets such as banana, when
shown as a color slide, should have enjoyed an increased advantage over their line-drawn versions compared to targets such as chair. This failure to find a color diagnosticity effect, when combined with the finding that simple line drawings can be identified so rapidly as to approach the naming speed of fully detailed, textured, colored photographic slides, supports the premise that the earliest access to a mental representation of an object can be modeled as a matching of a line drawing representation of a few simple components. Such componential descriptions are thus sufficient for primal access. Surface characteristics can be instrumental in defining edges and are powerful determinants of visual search, but may play only a secondary role in speeded recognition.
C. THE PERCEPTION OF DEGRADED OBJECTS
Evidence that a componential description may be necessary for object recognition (under conditions where contextual inference is not possible) derives from experiments on the perception of objects which have been degraded by deletion of their contour (Biederman & Blickle, 1985). RBC holds that parsing of an object into components is performed at regions of concavity. The nonaccidental relations of collinearity and curvilinearity allow filling in: They extend broken contours that are collinear or smoothly curvilinear. In concert, the two assumptions of (1) parsing at concavities and (2) filling in through collinearity or smooth curvature lead to a prediction as to what should be a particularly disruptive form of degradation: If contours were deleted at regions of concavity in such a manner that their endpoints, when extended through collinearity or curvilinearity, bridge the concavity, then the components would be lost and recognition should be impossible. The cup in the right column of the top row of Fig. 15 provides an example. The curve of the handle of the cup is drawn so that it is continuous with the curve of the cylinder forming the back rim of the cup. This form of degradation in which the components cannot be recovered from the input through the nonaccidental properties is referred to as nonrecoverable degradation and is illustrated for the objects in the right column of Fig. 15. An equivalent amount of deleted contour in a midsection of a curve or line should prove to be less disruptive as the components could then be restored through collinearity or curvature. In this case, the components should be recoverable. Examples of recoverable forms of degradation are shown in the middle column of Fig. 15.
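The filling-in assumption can be made concrete with a toy geometric test. The sketch below is only an illustration and is not a mechanism specified by RBC: the function names, the tolerance values, and the restriction to straight segments (smooth curvature is not handled) are all assumptions introduced here. It asks whether a second contour fragment lies on the straight-line extension of a first, which is the condition under which a gap between them would be bridged.

```python
import math


def direction(seg):
    """Unit direction vector of a segment given as ((x0, y0), (x1, y1))."""
    (x0, y0), (x1, y1) = seg
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("segment has zero length")
    return ((x1 - x0) / length, (y1 - y0) / length)


def bridges_by_collinearity(seg_a, seg_b, angle_tol_deg=5.0, offset_tol=0.05):
    """Return True if seg_b lies approximately on the line through seg_a.

    Both tolerances are hypothetical values chosen for illustration.
    """
    ax, ay = direction(seg_a)
    bx, by = direction(seg_b)
    # Angle between the two orientations, ignoring which way each points.
    cos_angle = min(1.0, abs(ax * bx + ay * by))
    angle_deg = math.degrees(math.acos(cos_angle))
    # Perpendicular distance from the start of seg_b to the line through seg_a.
    (x0, y0), _ = seg_a
    (px, py), _ = seg_b
    offset = abs((px - x0) * ay - (py - y0) * ax)
    return angle_deg <= angle_tol_deg and offset <= offset_tol


# A gap in the middle of one straight edge: the fragments line up, so the
# original edge would be restored.
edge_left = ((0.0, 0.0), (1.0, 0.0))
edge_right = ((2.0, 0.0), (3.0, 0.0))
print(bridges_by_collinearity(edge_left, edge_right))    # True

# Fragments meeting at a corner: no bridge is suggested.
edge_turned = ((2.0, 0.0), (2.0, 1.0))
print(bridges_by_collinearity(edge_left, edge_turned))   # False
```

On this reading, midsegment deletion leaves fragments of the same edge that pass such a test, so the original contour is restored. The nonrecoverable stimuli are constructed so that fragments belonging to different components pass it across a concavity, and the bridge that results suggests a component that is not in the object.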
In addition to the procedure for deleting and bridging concavities, two other applications of nonaccidental properties were employed to prevent determination of the components: (1) Vertices were altered by deleting one or two of their segments so that forks or Y’s were made into L’s or line segments, often producing
Fig. 15. Example of five stimulus objects in the experiment on the perception of degraded objects. The left column shows the original intact versions. The middle column shows the recoverable versions. The contours have been deleted in regions where they can be replaced through collinearity or smooth curvature. The right column shows the nonrecoverable versions. The contours have been deleted at regions of concavity so that collinearity or smooth curvature of the segments bridges the concavity. In addition, vertices have been altered (e.g., from Y’s to L’s) and misleading symmetry and parallelism introduced.
a simple planar surface, as illustrated in the stool in Fig. 15; and (2) misleading symmetry and parallelism were introduced, as in the spout of the watering can and the parallel edges of the surfaces among the rungs of the stool (Fig. 15). Even with these techniques, it was difficult to remove all the components, and some remained in nominally nonrecoverable versions, as with the handle of the scissors. Subjects viewed 35 objects in both recoverable and nonrecoverable versions. Prior to the experiment, all subjects were shown several examples of the various forms of degradation for several objects that were not used in the experiment. In addition, familiarization with the experimental objects was manipulated between subjects. Prior to the start of the experimental trials, different groups of six subjects (1) viewed a 3-second slide of the intact version of the objects, for example, the objects in the left column of Fig. 15, which they named, (2) were provided with the names of the objects on their terminal, or (3) were given no
familiarization. As in the prior experiment, the subject’s task was to name the objects. A glance at the second and third columns in Fig. 15 is sufficient to reveal that one doesn’t need an experiment to show that the nonrecoverable objects would be more difficult to identify than the recoverable versions. But we wanted to determine if the nonrecoverable versions would be identifiable at extremely long exposure durations (5 sec) and whether the prior exposure to the intact version of the object would overcome the effects of the contour deletion. The effects of contour deletion in the recoverable condition were also of considerable interest when compared to the comparable conditions from the partial object experiments.
1. Results
The error data are shown in Fig. 16. Identifiability of the nonrecoverable stimuli was virtually impossible: The median error rate for those slides was 100%. Subjects rarely guessed wrong objects in this condition. Almost always they merely said that they “don’t know.” In those few cases where a nonrecoverable object was identified, it was for those instances where some of the components were not removed, as with the circular rings of the handles of the scissors. Even at 5 sec, error rates for the nonrecoverable stimuli, especially in the name and no familiarization conditions, were extraordinarily high. (Data for the 5 sec exposure duration are not shown in Fig. 16.) Objects in the recoverable condition were named at high accuracy at the longer exposure durations. As in the previous experiments, there was no effect of familiarizing the subjects with the names of the objects compared to the condition in which the subjects were provided with no information about the objects. There was some benefit, however, of providing intact versions of the pictures of the objects. Even with this familiarity, performance in the nonrecoverable condition was extraordinarily poor, with error rates exceeding 60% when subjects had a full 5 sec for deciphering the stimulus. As noted previously, even this value underestimated the difficulty of identifying objects in the nonrecoverable condition in that identification was possible only when the contour deletion allowed some of the components to remain recoverable. The emphasis on the poor performance in the nonrecoverable condition should not obscure the extensive interference that was evident at the brief exposure durations in the recoverable condition. The previous experiments had established that intact objects, without picture familiarization, could be identified at near-perfect accuracy at 100 msec. At this exposure duration, error rates for the recoverable stimuli in the present experiment, whose contours could be restored through collinearity and curvature, were approximately 65%. The high error rates at 100-msec exposure duration suggest that these filling in processes require both time (on the
Fig. 16. Mean percentage of errors in object naming as a function of exposure duration, nature of contour deletion (recoverable vs nonrecoverable components), and prefamiliarization (none, name, or picture). No differences were apparent between the none and name pretraining conditions, so they have been combined into one function.
order of 200 msec) and an image (not merely a memory representation) to be successfully executed. The dependence of componential recovery on the availability of contour and time was explored parametrically by Biederman and Blickle (1985). To produce the nonrecoverable versions of the objects, it was necessary to delete or modify the vertices. The recoverable versions of the objects tended to have their contours deleted in midsegment. It is possible that some of the interference in the nonrecoverable condition was a consequence of the removal of vertices rather than the production of inappropriate components. The experiment also compared these two loci (vertex or midsegment) as sites of contour deletion. Contour
deletion was performed either at the vertices or at midsegments for 18 objects, but without the accidental bridging of components through collinearity or curvature that was characteristic of the nonrecoverable condition. The percentage of contour removed was also varied with values of 25, 45, and 65% removal, and the objects were shown for 100, 200, or 750 msec. Other aspects of the procedure were identical to the previous experiments, with only name familiarization provided. Figure 17 shows an example for a single object. The mean percentages of errors are shown in Fig. 18. At the briefest exposure duration and the most contour deletion (100-msec exposure duration and 65% contour deletion), removal of the vertices resulted in considerably higher error rates than the midsegment removal, 54 and 31% errors, respectively. With less contour deletion or longer exposures, the locus of the contour deletion had only a slight effect on naming accuracy. Both types of loci showed a consistent improvement with longer exposure durations, with error rates below 10% at the 750-msec duration. By contrast, the error rates in the nonrecoverable condition in the prior experiment exceeded 75%, even after 5 sec. We conclude that the filling in of contours, whether at midsegment or vertex, is a process that can be completed within 1 sec. But the suggestion of a misleading component through collinearity or curvature that bridges a concavity produces an image that cannot index the original object, no matter how much time there is to view the image.
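The structure of this design can be laid out explicitly. The following is a small sketch in Python, offered only as a reading aid: the names are illustrative, and the only data included are the two error percentages reported in the text for the hardest condition.

```python
from itertools import product

# Factors and levels as described in the text.
LOCI = ("vertex", "midsegment")
PERCENT_DELETED = (25, 45, 65)
EXPOSURE_MSEC = (100, 200, 750)

# Every combination of locus, deletion percentage, and exposure duration.
conditions = list(product(LOCI, PERCENT_DELETED, EXPOSURE_MSEC))
print(len(conditions), "conditions")

# Error percentages reported for 65% deletion at 100 msec.
reported_errors = {
    ("vertex", 65, 100): 54,
    ("midsegment", 65, 100): 31,
}
vertex_cost = (reported_errors[("vertex", 65, 100)]
               - reported_errors[("midsegment", 65, 100)])
print("vertex minus midsegment errors:", vertex_cost, "percentage points")
```

Crossing the three factors gives 18 conditions, and the 23-point difference in the hardest cell is the main accuracy effect of deletion locus described above.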
At Midsegment
At Vertex I
I -
Fig. 17. Illustration for a single object of 25, 45, and 65%contour removal centered at either midsegment or vertex.
Fig. 18. Mean percentage of object-naming errors as a function of locus of contour removal (midsegment or vertex), percentage of removal, and exposure duration. (Abscissa: percent contour deletion; legend: contour deletion at vertex or at midsegment.)
Fig. 19. Mean correct object-naming reaction time (milliseconds) as a function of locus of contour removal (midsegment or vertex), percentage of removal, and exposure duration. (Abscissa: percent contour deletion, 25, 45, or 65%; legend: exposure duration and contour deletion at vertex or at midsegment.)
Although accuracy was less affected by the locus of the contour deletion at the longer exposure durations and the lower deletion proportions, there was a consistent advantage on naming latencies of the midsegment removal, as shown in Fig. 19. (The lack of an effect at the 100-msec exposure duration with 65% deletion is likely a consequence of the high error rates for the vertex deletion stimuli.) This result shows that if contours are deleted at a vertex, they can be restored as long as there is no accidental filling in, but the restoration will require more time than when the deletion is at midsegment. Overall, both the error and RT data document a striking dependence of object identification on what RBC assumes to be a prior and necessary stage of componential determination.
2. Perceiving Degraded versus Partial Objects
Consider Fig. 20, which shows for some sample objects one version in which whole components are deleted so that only three (of six or nine) of the components are present and another version in which the same amount of contour is removed, but in midsegment distributed over all of the object’s components.
Fig. 20. Sample stimuli with equivalent proportion of contours removed either at midsegments or as whole components. (Columns: complete, component deletion, midsegment deletion.)
With objects with whole components deleted, it is unlikely that the missing components are added imaginally prior to recognition. Logically, one would have to know what object was being recognized to know what parts to add. Instead, indexing (addressing) a representation most likely proceeds in the absence of the parts. The two methods for removing contour may thus be affecting different stages. Deleting contour in midsegment affects processes prior to and including those involved in the determination of the components (Fig. 3). The removal of whole components (the partial object procedure) is assumed to affect the matching stage, reducing the number of common components between the image and the representation and increasing the number of distinctive components in the representation. Contour filling in is typically regarded as a fast, low-level process. We (Biederman, Beiring, Ju, & Blickle, 1985) studied the naming speed and accuracy of six- and nine-component objects undergoing these two types of contour deletion. At brief exposure durations (e.g., 65 msec), performance with partial objects was better than with objects with the same amount of contour removed in midsegment (Figs. 21 and 22). At longer exposure durations (200 msec), the RTs reversed, with the midsegment deletion now faster than the partial objects. Our interpretation of this result is that although a diagnostic subset of a few components (a partial object) can provide a sufficient input for recognition, the activation of that representation (or its elicitation of a name) is not optimal
Fig. 21. Mean percentage of errors of object naming as a function of the nature of contour removal (deletion of midsegments or components) and exposure duration. (Abscissa: exposure duration in msec.)
Fig. 22. Mean correct reaction time (milliseconds) in object naming as a function of the nature of contour removal (deletion at midsegments or components) and exposure duration. (Abscissa: exposure duration in msec.)
compared to a complete object. Thus, in the partial object experiment described previously, recognition RTs were shortened with the addition of components to an already recognizable object. If all of an object’s components were degraded (but recoverable), recognition would be delayed until contour restoration was completed. Once the filling in was completed and the complete complement of an object’s components was available, a better match to the object’s representation would be possible (or the elicitation of its name) than with a partial object that had only a few of its components. We are currently attempting to formally model this result. More generally, the finding that partial complex objects, with only three of their six or nine components present, can be recognized more readily than objects whose contours can be restored through filling in documents the efficiency of a few components for accessing a representation.
3. Contour Deletion by Occlusion
The degraded recoverable objects in the right columns of Fig. 15 have the appearance of flat drawings of objects with interrupted contours. Biederman and
Blickle (1985) designed a demonstration of the dependence of object recognition on componential identification by aligning an occluding surface so that it appeared to produce the deletions. If the components were responsible for an identifiable volumetric representation of the object, we would expect that with the recoverable stimuli, the object would complete itself under the occluding surface and assume a three-dimensional character. This effect should not occur in the nonrecoverable condition. This expectation was met as shown in Figs. 23 and 24. These stimuli also provide a demonstration of the time (and effort?) requirements for contour restoration through collinearity or curvature. We have not yet obtained objective data on this effect, which may be complicated by masking effects from the presence of the occluding surface, but we invite the reader to share our subjective impressions. When looking at a nonrecoverable version of an object in Fig. 23, no object becomes apparent. In the recoverable version in Fig. 24, an object does pop into a three-dimensional appearance, but most observers report a delay (our own estimate is ~500 msec) from the moment the stimulus is first fixated to when it appears as an identifiable three-dimensional entity. This demonstration of the effects of an occluding surface to produce contour interruption also provides a control for the possibility that the difficulty in the nonrecoverable condition was a consequence of inappropriate figure-ground groupings, as with the stool in Fig. 15. With the stool, the ground that was
Fig. 23. Nonrecoverable version of an object where the contour deletion is produced by an occluding surface.
Fig. 24. Recoverable version of an object where the contour deletion is produced by an occluding surface. The object is the same as that shown in Fig. 23. The reader may note that the three-dimensional percept in this figure does not occur instantaneously.
apparent through the rungs of the stool became figure in the nonrecoverable condition. (In general, however, only a few of the objects had holes in them where this could have been a factor.) This would not necessarily invalidate the RBC hypothesis, but merely would complicate the interpretation of the effects of the nonrecoverable noise in that some of the effect would derive from inappropriate grouping of contours into components and some of the effect would derive from inappropriate figure-ground grouping. That the objects in the nonrecoverable condition remain unidentifiable when the contour interruption is attributable to an occluding surface suggests that figure-ground grouping cannot be the primary cause of the interference from the nonrecoverable deletions.
D. SUMMARY AND IMPLICATIONS OF THE EXPERIMENTAL RESULTS
The sufficiency of a component representation for primal access to the mental representation of an object was supported by two results: (1) that partial objects with two or three components could be readily identified under brief exposures, and (2) comparable identification performance between the line drawings and color photography. The experiments with degraded stimuli established that the components are necessary for object perception. These results suggest an underlying principle by which objects are identified.
XI. Componential Recovery Principle
The results and phenomena associated with the effects of degradation and partial objects can be understood as the workings of a single principle of componential recovery: If the components in their specified arrangement can be readily identified, object identification will be fast and accurate. In addition to those aspects of object perception for which experimental research was described previously, the principle of componential recovery might encompass at least four additional phenomena in object perception: (1) Objects can be more readily recognized from some orientations than others (orientation variability); (2) objects can be recognized from orientations not previously experienced (object transfer); (3) articulated (or deformable) objects, with variable componential arrangements, can be recognized even when the specific configuration might not have been experienced previously (deformable object invariance); and (4) novel instances of a category can be rapidly classified (perceptual basis of basic level categories).
A. ORIENTATION VARIABILITY
Objects can be more readily identified from some orientations compared to other orientations (Palmer, Rosch, & Chase, 1981). According to the RBC hypothesis, difficult views will be those in which the components extracted from the image are not the components (and their relations) in the representation of the object. Often such mismatches will arise from an “accident” of viewpoint where an image property is not correlated with the property in the three-dimensional world. For example, when the viewpoint in the image is parallel to the major components of the object, the resultant foreshortening converts one or some of the components into surface components, such as disks and rectangles in Fig. 25, which are not included in the componential description of the object. In addition, as illustrated in Fig. 25, the surfaces may occlude otherwise diagnostic components. Consequently, the components extracted from the image will not readily match the mental representation of the object, and identification will be much more difficult compared to an orientation, such as that shown in Fig. 26, which does convey the components. A second condition under which viewpoint affects identifiability of a specific object arises when the orientation is simply unfamiliar, as when a sofa is viewed from below, or when the top-bottom relations among the components are perturbed, as when a normally upright object is inverted. Palmer et al. (1981) conducted an extensive study of the perceptibility of various objects when presented at a number of different orientations. Generally, a three-quarters front view was most effective for recognition. Their subjects showed a clear preference for such views. Palmer et al. termed this effective and
Fig. 25. A viewpoint parallel to the axes of the major components of a common object.
preferred orientation of the object its canonical orientation. The canonical orientation would be, from the perspective of RBC, a special case of the orientation that would maximize the match of the components in the image to the representation of the object. An apparent exception to the three-quarters frontal view preference was the finding of Palmer et al. (1981) that frontal (facial) views enjoyed some favor in viewing animals. But there is evidence that routines for processing faces have evolved to differentially respond to cuteness (Hildebrandt, 1982; Hildebrandt & Fitzgerald, 1983), age (e.g., Mark & Todd, 1985), and emotion and threats (e.g., Coss, 1979; Trivers, 1985). Faces may thus constitute a special stimulus case in that specific mechanisms have evolved to respond to biologically relevant quantitative variations, and caution may be in order before results with face stimuli are considered as characteristic of the perception of objects in general.
Fig. 26. The same object as in Fig. 25, but with a viewpoint not parallel to the major components.
B. TRANSFER BETWEEN DIFFERENT VIEWPOINTS
When an object is seen at one viewpoint or orientation, it can often be recognized as the same object when subsequently seen at some other viewpoint, even though there can be extensive differences in the retinal projections of the two views. The componential recovery principle would hold that transfer between two viewpoints would be a function of the componential similarity between the views. This could be experimentally tested through priming studies, with the degree of priming predicted to be a function of the similarity (viz., common minus distinctive components) of the two views. If two different views of an object contained the same components, RBC would predict that aside from effects attributable to variations in aspect ratio, there should be as much priming as when the object was presented at an identical view. An alternative possibility to componential recovery is that a presented object would be mentally rotated (Shepard & Metzler, 1971) to correspond to the original representation. But mental rotation rates appear to be too slow and effortful to account for the ease and speed with which transfer occurs between different orientations. There may be a restriction on whether a similarity function for priming effects will be observed. Although unfamiliar objects (or nonsense objects) should reveal a componential similarity effect, the recognition of a familiar object, whatever its orientation, may be too rapid to allow an appreciable experimental priming effect. Such objects may have a representation for each orientation that provided a different componential description. Bartram’s (1974) results support this expectation that priming effects might not be found across different views of familiar objects.
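The "common minus distinctive components" similarity invoked above can be made concrete with a small sketch. This is only an illustration under stated assumptions: each view is reduced to a set of component labels, the labels themselves (`cylinder`, `handle_arc`, `brick`, `disk`) are hypothetical, the weights `alpha` and `beta` are free parameters not given in the text, and the relations among components are ignored.

```python
def componential_similarity(view_a, view_b, alpha=1.0, beta=1.0):
    """Common minus distinctive components between two views.

    Each view is an iterable of component labels (hypothetical names).
    alpha and beta weight the components unique to each view; both are
    assumptions, since the text does not specify any weighting.
    """
    a, b = set(view_a), set(view_b)
    common = len(a & b)
    distinctive = alpha * len(a - b) + beta * len(b - a)
    return common - distinctive


# Two views that expose the same components, and one foreshortened view
# in which a volume has collapsed into a surface component (a disk).
front = {"cylinder", "handle_arc", "brick"}
side = {"cylinder", "handle_arc", "brick"}
end_on = {"disk", "brick"}

print(componential_similarity(front, side))    # 3.0
print(componential_similarity(front, end_on))  # -2.0
```

On this sketch, two views with identical components receive the maximum score, which corresponds to the prediction of as much priming as for an identical view, whereas the foreshortened view scores lower.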
Bartram performed a series of studies in which subjects named 20 pictures of objects over eight blocks of trials. [In another experiment, Bartram (1976) reported essentially the same results from a same-different name-matching task in which pairs of pictures were presented.] In the identical condition, the pictures were identical across the trial blocks. In the different view condition, the same objects were depicted from one block to the next, but in different orientations. In the different exemplar condition, different exemplars, for example, different instances of a chair, were presented, all of which required the same response. Bartram found that the naming RTs for the identical and different view conditions were equivalent, and both were shorter than control conditions, described below, for concept and response priming effects. Bartram theorized that observers automatically compute and access all possible three-dimensional viewpoints when viewing a given object. Alternatively, it is possible that there was high componential similarity across the different views, and the experiment was
insufficiently sensitive to detect slight differences from one viewpoint to another. However, in four experiments with colored slides, we (Biederman & Lloyd, 1985) failed to obtain any effect of variation in viewing angle and have thus replicated Bartram’s basic effect (or lack of an effect). At this point, our inclination is to agree with Bartram’s interpretation, with somewhat different language, but restrict its scope to familiar objects. It should be noted that both Bartram’s and our results are inconsistent with a model that assigned heavy weight to the aspect ratio of the image of the object or postulated an underlying mental rotation function.
C. DIFFERENT EXEMPLARS WITHIN AN OBJECT CLASS
Just as we might be able to gauge the transfer between two different views of the same object based on a componentially based similarity metric, we might be able to predict transfer between different exemplars of a common object, such as two different instances of a lamp or chair. Bartram (1974) also included a different exemplar condition in which different objects with the same name (e.g., different cars) were depicted from block to block. Under the assumption that different exemplars would be less likely to have common components, RBC would predict that this condition would be slower than the identical and different view conditions, but faster than a different object control condition with a new set of objects that required different names for every trial block. This was confirmed by Bartram. For both different views of the same object as well as different exemplars (subordinates) within a basic level category, RBC predicts that transfer would be based on the overlap in the components between the two views. The strong prediction would be that the same similarity function that predicted transfer between different orientations of the same object would also predict the transfer between different exemplars with the same name.
D. THE PERCEPTUAL BASIS OF BASIC LEVEL CATEGORIES
Consideration of the similarity relations among different exemplars with the same name raises the issue as to whether objects are most readily identified at a basic as opposed to a subordinate or superordinate level of description. The componential representations described here are representations of specific subordinate objects, though their identification was always measured with a basic level name. Much of the research suggesting that objects are recognized at a basic level has used stimuli, often natural, in which the subordinate level had the same componential description as the basic level objects.
Only small componential differences or color or texture distinguished the subordinate level objects.
Thus, distinguishing Asian elephants from African elephants or Buicks from Oldsmobiles requires fine discriminations for their verification. It is not at all surprising that with these cases basic level identification would be most rapid. On the other hand, many human-made categories, such as lamps, or some natural categories, such as dogs (which have been bred by humans), have members that have componential descriptions that differ considerably from one exemplar to another, as with a pole lamp versus a ginger jar table lamp, for example. The same is true of objects that are different from a prototype, such as penguins or sports cars. With such instances, which unconfound the similarity between basic level and subordinate level objects, perceptual access should be at the subordinate (or instance) level, a result supported by a recent report by Jolicoeur, Gluck, and Kosslyn (1984). It takes but a modest extension of the componential recovery principle to problems of the similarity of objects. Simply put, similar objects will be those that have a high degree of overlap in their components and in the relations among these components. A similarity measure reflecting common and distinctive components (Tversky, 1977) may be adequate for describing the similarity among a pair of objects or between a given instance and its stored or expected representation, whatever their basic or subordinate level designation.
E. THE PERCEPTION OF NONRIGID OBJECTS
Many objects and creatures, such as people and telephones, have articulated joints that allow extension, rotation, and even separation of their components. There are two ways in which such objects can be accommodated by RBC. One possibility is that independent structural descriptions are necessary for each sizable alteration in the arrangement of an object’s components. For example, it may be necessary to establish a different structural description for Fig. 27a than for Fig. 27d.
If this were the case, then a priming paradigm might not reveal any priming between the two stimuli. Another possibility is that the relations among the components can include a range of possible values (Marr & Nishihara, 1978). In the limit, with a relation that allowed complete freedom for movement, the relation might simply be joined. Even that might be relaxed in the case of objects with separable parts, as with the handset and base of a telephone. In that case, it might be either that the relation is nearby, or else different structural descriptions are necessary for attached and separable configurations. Empirical research needs to be done to determine if less restrictive relations, such as join or nearby, have measurable perceptual consequences. It may be the case that the less restrictive the relation, the more difficult the identifiability of the object. Just as there appear to be canonical views of rigid objects (Palmer et al., 1981), there may be a canonical “configuration” for a nonrigid object. Thus, Fig. 27d might be identified as a woman more slowly than Fig. 27a.
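The second possibility, that a relation between components admits a range of values, can be sketched as a data structure. Everything here is a hypothetical illustration rather than a specification from the text: the `Relation` class, the use of a joint angle in degrees as the varying quantity, and the part names are all assumptions chosen only to show how a single structural description could cover several configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A relation between two components with an allowed angle range.

    All field names and the angle parameterization are hypothetical.
    """
    part_a: str
    part_b: str
    kind: str                 # e.g. "joined" or "nearby"
    min_angle: float = 0.0    # degrees
    max_angle: float = 360.0  # degrees

    def admits(self, kind, angle):
        """True if an observed relation falls inside the allowed range."""
        return kind == self.kind and self.min_angle <= angle <= self.max_angle


def matches(description, observed):
    """Check an observed configuration against a structural description.

    observed maps (part_a, part_b) to a (kind, angle) pair.
    """
    return all(
        (rel.part_a, rel.part_b) in observed
        and rel.admits(*observed[(rel.part_a, rel.part_b)])
        for rel in description
    )


# One description with a wide range versus one with a narrow range.
flexible = [Relation("torso", "upper_arm", "joined", 0.0, 180.0)]
rigid = [Relation("torso", "upper_arm", "joined", 0.0, 20.0)]

arm_down = {("torso", "upper_arm"): ("joined", 10.0)}
arm_raised = {("torso", "upper_arm"): ("joined", 150.0)}

print(matches(flexible, arm_down), matches(flexible, arm_raised))
print(matches(rigid, arm_down), matches(rigid, arm_raised))
```

With the narrow range, the raised-arm configuration would need its own structural description, which is the first possibility discussed above; with the wide range, one description covers both configurations.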
Fig. 27. Four configurations (a, b, c, d) of a nonrigid object.
XII. Conclusion
To return to the analogy with speech perception made in Section II, the characterization of object perception that RBC provides bears close resemblance to many modern views of speech perception. In both cases, one has a modest set of primitives: in speech, the 55 or so phonemes that are sufficient to represent almost all words of all the languages on earth; in object perception, perhaps, a limited number of simple components. The ease by which we are able to code tens of thousands of words or objects may derive less from a capacity for making exceedingly fine physical discriminations than from allowing a free combination of a modest number of categorized primitives.
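The arithmetic behind "free combination of a modest number of categorized primitives" can be illustrated with a rough count. This is a sketch under loud assumptions: it counts ordered sequences with repetition allowed, applies no phonotactic or structural constraints, and the `n_relations` parameter is hypothetical. The value 55 is the phoneme figure given above, and 36 is the component count named in the title of Section VII of the chapter.

```python
def count_arrangements(n_primitives, length, n_relations=1):
    """Upper-bound count of ordered arrangements of primitives.

    n_primitives ** length sequences, times n_relations choices for each
    of the (length - 1) links between adjacent primitives. The relation
    count is a hypothetical parameter, not a value taken from the text.
    """
    return n_primitives ** length * n_relations ** (length - 1)


print(count_arrangements(55, 3))                 # 166375
print(count_arrangements(36, 3))                 # 46656
print(count_arrangements(36, 2, n_relations=4))  # 5184
```

Even short sequences drawn from a few dozen primitives reach the tens of thousands, which is the order of magnitude the paragraph refers to.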
ACKNOWLEDGMENTS
This research was supported by the Air Force Office of Scientific Research (Grant F4962083C0086). I would like to express my deep appreciation to Tom Blickle and Ginny Ju for their invaluable contributions to all phases of the empirical research described in this article. Thanks are also due to Mary Lloyd, John Clapper, Elizabeth Beiring, and Robert Bennett for their assistance in the conduct of the experimental research. Aspects of the manuscript profited through discussions with James R. Pomerantz, John Artim, and Brian Fisher.
REFERENCES
Attneave, F. (1983). Prägnanz and soap bubble systems: A theoretical exploration. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision. New York: Academic Press.
Ballard, D., & Brown, C. M. (1982). Computer vision. Englewood Cliffs, NJ: Prentice-Hall.
Barrow, H. G., & Tenenbaum, J. M. (1981). Interpreting line drawings as three-dimensional surfaces. Artificial Intelligence, 17, 75-116.
Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge University Press.
Bartram, D. (1974). The role of visual and semantic codes in object naming. Cognitive Psychology, 6, 325-356.
Bartram, D. (1976). Levels of coding in picture-picture comparison tasks. Memory and Cognition, 4, 593-602.
Beck, J., Prazdny, K., & Rosenfeld, A. (1983). A theory of textural segmentation. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision. New York: Academic Press.
Biederman, I. (1981). On the semantics of a glance at a scene. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization. Hillsdale, NJ: Erlbaum.
Biederman, I., Beiring, E., Ju, G., & Blickle, T. (1985). A comparison of the perception of partial vs degraded objects. Unpublished manuscript, State University of New York at Buffalo.
Biederman, I., & Blickle, T. (1985). The perception of objects with deleted contours. Unpublished manuscript, State University of New York at Buffalo.
Biederman, I., & Ju, G. (1985). The perceptual recognition of objects depicted by line drawings and color photography. Unpublished manuscript, State University of New York at Buffalo.
Biederman, I., Ju, G., & Clapper, J. (1985). The perception of partial objects. Unpublished manuscript, State University of New York at Buffalo.
Biederman, I., & Lloyd, M. (1985). Experimental studies of transfer across different object views and exemplars. Unpublished manuscript, State University of New York at Buffalo.
Binford, T. O. (1971). Visual perception by computer. IEEE Systems Science and Cybernetics Conference, Miami, December.
Binford, T. O. (1981). Inferring surfaces from images. Artificial Intelligence, 17, 205-244.
Brady, M. (1983). Criteria for the representations of shape. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision. New York: Academic Press.
Brady, M., & Asada, H. (1984). Smoothed local symmetries and their implementation. International Journal of Robotics Research, 3, 3.
Brooks, R. A. (1981). Symbolic reasoning among 3-D models and 2-D images. Artificial Intelligence, 17, 205-244.
Carey, S. (1978). The child as word learner. In M. Halle, J. Bresnan, & G. A. Miller (Eds.), Linguistic theory and psychological reality. Cambridge, MA: MIT Press.
Cezanne, P. (1904/1941). Letter to Emile Bernard. In J. Rewald (Ed.), Paul Cezanne’s letters (M. Kay, Trans.). London: B. Cassirer.
Chakravarty, I. (1979). A generalized line and junction labeling scheme with applications to scene analysis. IEEE Transactions, PAMI, April, 202-205.
Checkosky, S. D., & Whitlock, D. (1973). Effects of pattern goodness on recognition time in a memory search task. Journal of Experimental Psychology, 100, 341-348.
Connell, J. H. (1985). Learning shape descriptions: Generating and generalizing models of visual objects. Unpublished master’s thesis, Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA.
Coss, R. G. (1979). Delayed plasticity of an instinct: Recognition and avoidance of 2 facing eyes by the jewel fish. Developmental Psychobiology, 12, 335-345.
Egeth, H., & Pachella, R. (1969). Multidimensional stimulus identification. Perception and Psychophysics, 5, 341-346.
Fildes, B. N., & Triggs, T. J. (1985). The effect of changes in curve geometry on magnitude estimates of road-like perspective curvature. Perception and Psychophysics, 37, 218-224.
Garner, W. R. (1974). The processing of information and structure. New York: Wiley.
Garner, W. R. (1962). Uncertainty and structure as psychological concepts. New York: Wiley.
Guzman, A. (1971). Analysis of curved line drawings using context and global information. Machine intelligence (Vol. 6). Edinburgh: Edinburgh Univ. Press.
Hildebrandt, K. A. (1982). The role of physical appearance in infant and child development. In H. E. Fitzgerald, E. Lester, & M. Youngman (Eds.), Theory and research in behavioral pediatrics (Vol. 1). New York: Plenum.
Hildebrandt, K. A., & Fitzgerald, H. E. (1983). The infant’s physical attractiveness: Its effect on bonding and attachment. Infant Mental Health Journal, 4, 3-12.
Hochberg, J. E. (1978). Perception (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hoffman, D. D., & Richards, W. (1985). Parts of recognition. Cognition, 18, 65-96.
Humphreys, G. W. (1983). Reference frames and shape perception. Cognitive Psychology, 15, 151-196.
Ittleson, W. H. (1952). The Ames demonstrations in perception. New York: Hafner.
Jolicoeur, P., Gluck, M. A., & Kosslyn, S. M. (1984). Picture and names: Making the connection. Cognitive Psychology, 16, 243-275.
Ju, G., Biederman, I., & Clapper, J. (1985, April). Recognition-by-components: A theory of image interpretation. Paper presented at the meetings of the Eastern Psychological Association, Boston, MA.
Julesz, B. (1981). Textons, the elements of texture perception, and their interaction. Nature (London), 290, 91-97.
Kanade, T. (1981). Recovery of the three-dimensional shape of an object from a single view. Artificial Intelligence, 17, 409-460.
King, M., Meyer, G. E., Tangney, J., & Biederman, I. (1976). Shape constancy and a perceptual bias towards symmetry. Perception and Psychophysics, 19, 129-136.
Kroll, J. F., & Potter, M. C. (1984). Recognizing words, pictures, and concepts: A comparison of lexical, object, and reality decisions. Journal of Verbal Learning and Verbal Behavior, 23, 39-66.
Lowe, D. (1984). Perceptual organization and visual recognition. Unpublished doctoral dissertation, Department of Computer Science, Stanford University, Stanford, CA.
Mark, L. S., & Todd, J. T. (1985). Describing perception information about human growth in terms of geometric invariants. Perception and Psychophysics, 37, 249-256.
Marr, D. (1977). Analysis of occluding contour. Proceedings of the Royal Society of London B, 197, 441-475.
Marr, D. (1982). Vision. San Francisco: Freeman.
Marr, D., & Nishihara, H. K. (1978). Representation and recognition of three-dimensional shapes. Proceedings of the Royal Society of London B, 200, 269-294.
Marslen-Wilson, W. (1980). Optimal efficiency in human speech processing. Unpublished manuscript, Max Planck Institut für Psycholinguistik, Nijmegen, The Netherlands.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception. Part I: An account of basic findings. Psychological Review, 42, 375-407.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Miller, G. A. (1977). Spontaneous apprentices: Children and language. New York: Seabury.
Neisser, U. (1963). Decision time without reaction time: Experiments in visual scanning. American Journal of Psychology, 76, 376-385.
Neisser, U. (1967). Cognitive psychology. New York: Appleton.
Oldfield, R. C. (1966). Things, words, and the brain. Quarterly Journal of Experimental Psychology, 18, 340-353.
Oldfield, R. C., & Wingfield, A. (1965). Response latencies in naming objects. Quarterly Journal of Experimental Psychology, 17, 273-281.
Palmer, S. E. (1980). What makes triangles point: Local and global effects in configurations of ambiguous triangles. Cognitive Psychology, 12, 285-305.
Palmer, S., Rosch, E., & Chase, P. (1981). Canonical perspective and the perception of objects. In J. Long & A. Baddeley (Eds.), Attention and performance (Vol. 9). Hillsdale, NJ: Erlbaum.
Penrose, L. S., & Penrose, R. (1958). Impossible objects: A special type of illusion. British Journal of Psychology, 49, 31-33.
Perkins, D. N. (1983). Why the human perceiver is a bad machine. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision. New York: Academic Press.
Perkins, D. N., & Deregowski, J. (1982). A cross-cultural comparison of the use of a Gestalt perceptual strategy. Perception, 11, 279-286.
Pomerantz, J. R. (1978). Pattern and speed of encoding. Memory and Cognition, 5, 235-241.
Pomerantz, J. R., Sager, L. C., & Stoever, R. J. (1977). Perception of wholes and their component parts: Some configural superiority effects. Journal of Experimental Psychology: Human Perception and Performance, 3, 422-435.
Rock, I. (1984). Perception. New York: Freeman.
Rosch, E., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382-439.
Rosenthal, S. (1984). The PF474. Byte, 9, 247-256.
Ryan, T., & Schwartz, C. (1956). Speed of perception as a function of mode of representation. American Journal of Psychology, 69, 60-69.
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701-703.
Sugihara, K. (1984). An algebraic approach to shape-from-image problems. Artificial Intelligence, 23, 59-95.
Treisman, A. (1982). Perceptual grouping and attention in the visual search for features and for objects. Journal of Experimental Psychology: Human Perception and Performance, 8, 194-214.
Treisman, A. M., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107-141.
Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136.
Trivers, R. (1985). Social evolution. Menlo Park: Benjamin/Cummings.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-352.
Tversky, B., & Hemenway, K. (1984). Objects, parts, and categories. Journal of Experimental Psychology: General, 113, 169-193.
Ullman, S. (1983). Visual routines. Artificial Intelligence Laboratory Memo No. 723, MIT, Cambridge, MA.
Virsu, V. (1971a). Tendencies to eye movements and misperception of curvature, direction, and length. Perception and Psychophysics, 9, 65-72.
Virsu, V. (1971b). Underestimation of curvature and task dependence in visual perception of form. Perception and Psychophysics, 9, 339-342.
Waltz, D. (1975). Understanding line drawings of scenes with shadows. In P. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill.
Winston, P. A. (1975). Learning structural descriptions from examples. In P. H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill.
Witkin, A. P., & Tenenbaum, J. M. (1983). On the role of structure in vision. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision. New York: Academic Press.
Woodworth, R. S. (1938). Experimental psychology. New York: Holt.
ASSOCIATIVE STRUCTURES IN INSTRUMENTAL LEARNING
Ruth M. Colwill and Robert A. Rescorla
DEPARTMENT OF PSYCHOLOGY
UNIVERSITY OF PENNSYLVANIA
PHILADELPHIA, PENNSYLVANIA 19104
I. Introduction
In instrumental learning, the likelihood of behavior changes as a result of its consequences. This learning process has been a major focus of experimental psychology. Many naturally occurring instances of learning seem to fit this paradigm, and substantial energy has gone into its analysis in the laboratory. The intention of this article is to consider the nature of the associative mechanisms involved in a particular sort of instrumental learning, that in which an animal’s action produces a positive outcome.
It is common to acknowledge three major elements in any instrumental learning situation: a response that changes in probability, a reinforcer that is contingent upon that response, and a stimulus in the presence of which that contingency takes place. In the typical case, repeated exposure to the instrumental contingency results in an increased likelihood of the response occurring in the presence of the stimulus. For example, one commonly studied instance involves rat subjects in operant chambers. In that case, making food contingent upon lever pressing produces enhanced lever pressing in the chamber.
Theories attempting to explain such changes in instrumental behavior have typically appealed to simple associative mechanisms, but they have differed in the selection of elements between which associations are assumed to form. Three different associative structures have dominated theoretical discussions.
1. The possibility that appealed to many early psychologists is that an association is formed between the response and the stimulus in the presence of which the response is reinforced (Guthrie, 1952; Hull, 1943). The assumed growth of an association between some antecedent stimulus (S) and the response (R) seemed to account most naturally for the observation that the response becomes more likely during the stimulus. In this S-R theory, the role of the contingent event is literally to reinforce this S-R association.
The reinforcer does not itself become
part of the associative structure; it simply serves as a kind of catalyst facilitating the formation of an association between two other events, the response and the antecedent stimulus. For many early writers there was an obvious parallel to evolutionary theory: The reinforcer was seen as the analog to natural selection, sampling successful S-R contiguities from the array that occurred whenever the animal behaved during the stimulus. One particularly appealing feature of such a mechanism was its ability to generate behavior that appeared to be purposive or goal directed without actually involving any encoding of the goal itself. This view of instrumental learning so dominated thinking during the 1940s and 1950s that discussion turned from examination of the nature of the underlying associative structure to exploration of the properties that an event needed in order to be a reinforcer (e.g., Premack, 1965; Sheffield, 1966).
2. Many authors, however, have felt that this simple S-R alternative fails to capture the richness of an animal’s knowledge after instrumental training. Various kinds of evidence (some of which we review here) indicate that the animal has more knowledge of the reinforcer than is allowed by this S-R position. Several authors have suggested that this evidence could be accommodated by acknowledging a second association, that between the antecedent stimulus and the reinforcer. Many have argued that instrumental learning situations contain within them the conditions necessary for Pavlovian conditioning: When a response is reinforced during a stimulus, that stimulus is also explicitly paired with the reinforcer. According to two-process accounts, this Pavlovian S-reinforcer association occurs in parallel with the instrumental S-R association and provides the means for encoding information about the reinforcer.
Some theorists (e.g., Rescorla & Solomon, 1967; Spence, 1956) give this Pavlovian association motivational properties, whereas others (e.g., Trapold & Overmier, 1972) see it primarily as playing a mediational role in which feedback from the Pavlovian response provides an additional source of stimulus support for the instrumental response. But an important consequence of both versions of two-process theories is that the instrumental reinforcer plays two roles: a catalyst for the S-R association and an associate for the S. Although the reinforcer is not represented as part of the fundamental instrumental association, it is encoded as part of a parallel Pavlovian association that forms in the course of instrumental learning.
3. Recently, it has become increasingly popular to view instrumental learning in a way that has somewhat more intuitive appeal: as an association between the response and the reinforcer (Bolles, 1972; Mackintosh & Dickinson, 1979; Tolman, 1933). According to this view, which represents a return to the earlier ideas of Konorski and Miller (1937), the organism learns the very relationship that the experimenter most carefully arranges, that the response produces the reinforcer. The animal directly encodes the goal as associated with the response. An especially attractive feature of this interpretation is that it may allow application of much of the theoretical power that has been developed for the explanation
of Pavlovian conditioning. A response-reinforcer view of instrumental learning parallels the widely held stimulus-reinforcer account of Pavlovian conditioning. It might then be possible to go some distance toward an understanding of the associations underlying instrumental learning by applying the rules uncovered for Pavlovian conditioning.
In the discussion that follows, we present some recent evidence from our laboratory relevant to evaluating these possibilities. The structure of that discussion is as follows: First, we consider evidence suggesting that the organism forms response-reinforcer associations; that is, the reinforcer plays a role beyond that of a catalyst by entering into associations with antecedent responses. We describe in detail two sorts of data recently collected in our laboratory and briefly review several other types of historically important evidence. Second, we consider the problem of separating the response-reinforcer view from a two-process alternative. Many of the data described in Section II clearly demonstrate that the organism learns about the reinforcer; but they are less clear in deciding whether the reinforcer is encoded in terms of a response-reinforcer or stimulus-reinforcer association. Section III discusses these alternatives and describes some data favoring the response-reinforcer view. Third, we consider the role that the stimulus might play in an account of instrumental learning that rests primarily on a response-reinforcer association.
Throughout the discussion we emphasize the techniques and logic that allow analysis of the associative structure of instrumental learning as much as the answers that these techniques yield in the particular cases that we have studied. These techniques are primarily ones that were originally developed for the study of associative structures in Pavlovian conditioning but turn out to have considerable power and generality for studying various instances of associative learning.
II. Evidence for Response-Reinforcer Associations
In this section we describe in detail two procedures for identifying response-reinforcer associations. Both are derived from parallel procedures that have been quite successful in analyzing the structure of Pavlovian conditioning, and both yield clear evidence for response-reinforcer associations. We then briefly review the results of several other techniques that have been used to identify response-reinforcer associations. Finally, we discuss the generality of the finding that instrumental training results in response-reinforcer learning.
A. POSTCONDITIONING CHANGES OF THE REINFORCER
Perhaps the most straightforward way of detecting encoding of the reinforcer is to manipulate separately the value of the reinforcer after learning has taken place. We can then inspect the animal's likelihood of continuing the instrumental
performance in the absence of any further reinforcer deliveries. To the degree that changing the value of a particular reinforcer produces a specific change in the probability of responses that it has previously reinforced, we have evidence for a response-reinforcer association.
This kind of logic has proved to be extremely successful in analyzing the associative structure of Pavlovian conditioning. Following the pairing of two stimuli, S2 and S1, one can identify an S2-S1 association by changing the value of S1 and inspecting the response to S2. Under many circumstances changes in the value of S1 modify the response to S2, suggesting that S1 was encoded as an associate of S2 (e.g., Rashotte, Griffin, & Sisk, 1977; Rescorla, 1979, 1980). Under other circumstances, the response to S2 is relatively impervious to changes in the value of S1, suggesting some other associative structure (e.g., Amiro & Bitterman, 1980; Cheatle & Rudy, 1978; Holland & Rescorla, 1975; Nairne & Rescorla, 1981; Rizley & Rescorla, 1972). Although we do not yet have an adequate characterization of the determinants of these different outcomes, it is clear that the postconditioning change technique can be a valuable analytic tool.
Attempts to apply this tool to the case of instrumental learning have also led to a variety of results. Some authors have found evidence that a good deal of instrumental behavior persists after a change in the value of the reinforcer or reinforcer-correlated stimuli (e.g., Adams, 1980, 1982; Garcia, Kovner, & Green, 1970; Holman, 1975; Morgan, 1974; Morrison & Collyer, 1974; Tolman, 1933; Wilson, Sherman, & Holman, 1981). Others have found results that encourage the inference of a response-reinforcer association (e.g., Adams, 1982; Adams & Dickinson, 1981b; Chen & Amsel, 1980; Dickinson, Nicholas, & Adams, 1983; Khavari & Eisman, 1971; Krieckhaus & Wolf, 1968; Miller, 1935; St. Claire-Smith & MacLaren, 1983; Tolman & Gleitman, 1949).
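The logic of the postconditioning change technique can be made concrete with a toy contrast between the two accounts. The sketch below is only an illustration under stated assumptions: the function names, the multiplicative rule for the response-reinforcer account, the numerical values, and the particular assignment of responses to reinforcers are all hypothetical and are not taken from any of the studies cited.

```python
# Toy illustration only. Both functions are hypothetical stand-ins for the
# S-R and response-reinforcer accounts; neither is a model from the chapter.

def rate_under_sr(habit_strength: float, outcome_value: float) -> float:
    """S-R account: the reinforcer acted only as a catalyst during training,
    so its current value does not enter into performance."""
    return habit_strength

def rate_under_ro(association_strength: float, outcome_value: float) -> float:
    """Response-reinforcer account (assumed multiplicative rule): responding
    scales with the current value of the outcome the response is linked to."""
    return association_strength * outcome_value

# Hypothetical assignment of responses to reinforcers.
earned_by = {"lever": "sucrose", "chain": "pellets"}
# Assumed values after devaluing sucrose only (1.0 = intact, 0.0 = fully devalued).
value_after_devaluation = {"sucrose": 0.0, "pellets": 1.0}
strength = 1.0  # assumed equal training on both responses

sr_prediction = {
    r: rate_under_sr(strength, value_after_devaluation[o])
    for r, o in earned_by.items()
}
ro_prediction = {
    r: rate_under_ro(strength, value_after_devaluation[o])
    for r, o in earned_by.items()
}

for response in earned_by:
    print(response, "S-R:", sr_prediction[response], "R-O:", ro_prediction[response])
```

Under these assumptions only the response-reinforcer account predicts a selective drop in the response whose reinforcer was devalued, which is the pattern the technique is designed to detect.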
A particularly compelling example of sensitivity of the instrumental response to changes in the value of the reinforcer was recently reported by Colwill and Rescorla (1985a). They used a within-subjects design in which rats were trained on two different instrumental responses (lever pressing and chain pulling), each associated with a different reinforcer (sucrose liquid or Noyes pellets). Then each animal received pairings of one reinforcer with a lithium chloride (LiCl) toxin in an attempt to decrease its value artificially. The other reinforcer was presented but not poisoned. After this differential treatment of the reinforcers, the animals were once again given access to the instrumental response manipulanda and tested in the absence of the reinforcers. The question of interest was whether the animals would prefer to make the response whose reinforcer had not been devalued by pairing with toxin, thereby displaying knowledge of the specific response-reinforcer contingency. Because this experiment will serve as a prototype in subsequent analyses, we describe the procedure in somewhat more detail. After magazine training and one
session of continuous reinforcement on each response, animals received variable interval (VI) training on each manipulandum. They received, with each manipulandum, one 16-min session on a VI 30-sec schedule and then one 20-min session on a VI 60-sec schedule. Then the manipulanda were removed from the chambers and the animals were given five 2-day cycles of flavor-aversion training. On odd-numbered days the animals received 30 deliveries of one reinforcer, given at a rate of 1/min. On each of these days, the session terminated in a 0.5% body weight intraperitoneal injection of 0.6 M LiCl. On even-numbered days the other reinforcer was delivered in the same manner, but no toxin was administered. Conditioning of this sort is extremely successful; on the last conditioning cycle, the animals consumed a mean of 0.1 and 30 of the poisoned and nonpoisoned reinforcers, respectively. Finally, each animal was given a 20-min extinction test during which it had simultaneous access to both instrumental response manipulanda.
Figure 1 shows the results of that test, separated according to reinforcer identity and poisoning treatment. It is clear that for both reinforcers, instrumental responding was profoundly affected by poisoning of that reinforcer. Animals showed substantially lower response rates on the manipulandum whose reinforcer had been poisoned. The specificity of that depression implies that the
[Figure 1: left panel, Sucrose Reinforcer; right panel, Pellet Reinforcer; x-axis, blocks of 4 minutes; open symbols, not poisoned; solid symbols, poisoned.]
Fig. 1. Sensitivity of the instrumental response to reinforcer devaluation. Mean responses per minute during the extinction test, shown separately for responses that had been reinforced by sucrose (left panel) or by Noyes pellets (right panel). An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols). From Colwill and Rescorla (1985a). © 1985 by the American Psychological Association.
animal encoded the reinforcer identity as part of its knowledge about the instrumental learning situation.
A similar result can be obtained if the reinforcer is devalued by motivational means. In a companion experiment, Colwill and Rescorla (1985a) found that selectively satiating the animal on the reinforcer earned by one response led to a selective depression in the rate of making that response. These results thus suggest that instrumental performance is appropriate to the current value of the reinforcer when either motivational or associative procedures are used to devalue that reinforcer.
It is worth noting two methodological features of these demonstrations of the impact of postconditioning changes in the value of the reinforcer. First, notice that all of the instrumental training and the reinforcer devaluation manipulations took place in the same chamber. This means that any general effects of the reinforcer devaluation manipulations on the chamber or on responding per se cannot account for the differential performance. Second, these experiments attempted to maximize the similarity between the conditions under which the reinforcer was earned and those under which it was devalued. The devaluation procedure involved delivery of the reinforcer at approximately the same rate and in the same chamber as it had been earned during instrumental training. This matching may be crucial for encouraging the animal to identify the reinforcer undergoing a change in value as being the same as the response-contingent reinforcer. Other results (e.g., Adams, 1982) suggest that the animal is extremely sensitive to relatively minor differences in the mode of delivery of the reinforcer. Failure to match the details of the manner in which the reinforcer is delivered may have contributed to earlier failures to find devaluation effects like those reported here (see Colwill & Rescorla, 1985a, for a fuller discussion).
B. CONTINGENCY EFFECTS
A second line of evidence indicating the development of response-reinforcer associations comes from the study of reinforcer contingencies. In recent years it has become clear that Pavlovian conditioning can depend heavily on the contingency between the conditioned stimulus (CS) and the unconditioned stimulus (US), as distinguished from their simple contiguity. One result that has encouraged that view is the adverse effect of presenting USs in the interval between CSs. If one holds constant the number of USs that occur during a CS, varying the frequency of USs at other times can produce dramatic variations in conditioning of that CS (e.g., Durlach, 1983; Rescorla, 1968). That result suggests that the animal is sensitive not simply to the frequency of CS-US pairings, but rather to the contingency between the two events.
Hammond (1980) and Dickinson and Charnock (1985) have recently demonstrated parallel results for instrumental training. A lever press that results in food
will be acquired less well if food otherwise occurs at a high rate in the absence of lever pressing. Some recent experiments in our laboratory have attempted to use that observation to analyze the nature of instrumental learning. The notion was that if instrumental responding depends on learning a response-reinforcer association, then presenting that same reinforcer in the absence of the response should have a more devastating effect than would presenting a different, but equivalently valued reinforcer.
In order to provide a well-controlled and sensitive test of this proposition, we trained rats to make two different instrumental responses (lever press and chain pull), each leading to a particular reinforcer (liquid sucrose or Noyes pellet). Then we added response-independent presentations of one of the two reinforcers and inspected the consequences for each of the behaviors. If the animal encodes which reinforcer follows each response, then the adverse consequences of a free reinforcer should be more severe for the response that otherwise earned that particular reinforcer.
The rats were trained on what Hammond (1980) has called a “constant probability” schedule. In this procedure, the session is divided into 1-sec intervals, and the reinforcer is delivered with some probability at the end of each interval. After some initial training, the animals were exposed to 14 sessions each with the lever and with the chain. The probability of a reinforcer was set at the value p for each second that contained a response. The values of p were .5, .25, .10, and .10 for the first four training days; p was set at .05 for the remaining 10 days. Then all animals received two sessions during which both the lever and the chain were available, and the probability of each reinforcer was set separately and independently at .05 for each second containing a response. Throughout this training, the probability of a reinforcer was set at zero for intervals in which no responding occurred.
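The constant probability schedule just described is simple enough to sketch as a simulation. The following is a minimal sketch, not the authors' control program: the function names, the session length, and the Bernoulli model of the animal (a fixed chance `p_response` of responding in each second) are assumptions made for illustration. The difference between the two conditional probabilities, computed by `delta_p`, is one conventional way to summarize a contingency and is not a statistic reported in the text.

```python
import random

def run_session(n_seconds, p_response, p_given_response,
                p_given_no_response=0.0, seed=0):
    """Simulate one session divided into 1-sec intervals.

    p_response is a hypothetical model of the animal: the chance that a
    given second contains at least one response. At the end of each second
    a reinforcer is delivered with probability p_given_response if the
    second contained a response, and p_given_no_response otherwise.
    """
    rng = random.Random(seed)
    responses = earned = free = 0
    for _ in range(n_seconds):
        responded = rng.random() < p_response
        p = p_given_response if responded else p_given_no_response
        delivered = rng.random() < p
        if responded:
            responses += 1
            earned += delivered
        else:
            free += delivered
    return {"responses": responses, "earned": earned, "free": free}

def delta_p(p_given_response, p_given_no_response):
    """Contingency summarized as P(reinforcer | response) minus
    P(reinforcer | no response)."""
    return p_given_response - p_given_no_response

# Final training values from the text: p = .05 after a response, 0 otherwise.
# The session length and response probability are illustrative assumptions.
training = run_session(n_seconds=1200, p_response=0.3, p_given_response=0.05)
print(training)
print(delta_p(0.05, 0.0))
```

With the probability for response-free seconds left at zero, every delivered reinforcer follows a response. Raising `p_given_no_response` toward `p_given_response` shrinks the contingency to zero without changing the probability of reinforcement given a response.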
For the next 15 sessions both manipulanda remained present, but deliveries of one reinforcer were added in the absence of responding. The probability of that reinforcer was set at .05 both for intervals that contained the appropriate response and for intervals lacking that response. The other reinforcer continued to be delivered with a probability of .05 only in intervals containing the other response. Figure 2 displays the results of those manipulations. To the left are shown the relatively high rates of responding prior to the introduction of response-independent reinforcers. The middle portion of the figure shows the consequences of introducing free deliveries of one reinforcer. Unearned reinforcer deliveries produced an immediate drop in the rate of both responses. That loss may partly be due to the increased time spent consuming the unearned reinforcers. But more interestingly, free reinforcers produced a more substantial loss in the response which otherwise earned that particular reinforcer. The right-hand portion of Fig. 2 shows the results of a subsequent extinction test carried out in the absence of all reinforcers. During that test, the two
[Figure 2: three panels, Final Training, Free Reinforcers, and Extinction; x-axis, 3-day blocks.]
Fig. 2. Sensitivity of the instrumental response to noncontingent reinforcer deliveries. The center panel shows the mean rate of two responses when they produced a reinforcer either the same as or different from that delivered noncontingently. To the left are shown the rates of responding prior to the introduction of noncontingent reinforcers. To the right are shown the rates of responding in the absence of all earned and noncontingent reinforcers.
responses continued to occur at quite different rates. These results suggest that response-independent deliveries of a reinforcer have an enduring selective depressive effect on responses that earn that reinforcer. That observation is consistent with the notion that a response-reinforcer association underlies instrumental learning. Current theoretical treatments of such contingency effects in Pavlovian conditioning typically appeal to background conditioning (e.g., Rescorla & Wagner, 1972). According to such treatments, those USs occurring at times other than the CS result in conditioning of background stimuli. Conditioned background stimuli are then present during the times that the CS is paired with the US. It is well documented that CS-US pairings which take place in the presence of another already conditioned stimulus produce less learning (Kamin, 1968, 1969). Consequently, the background cues block conditioning of the original CS. One result that has encouraged this interpretation is that the adverse consequences of those reinforcers can be attenuated if they are preceded by another discrete signal (e.g.,
Durlach, 1983). Under those circumstances, one would expect the signal to reduce conditioning to the background and thus attenuate the ability of the background to interfere with conditioning of the original CS. One can give the effect of interresponse reinforcers on instrumental performance a similar interpretation and evaluate it by a similar manipulation; that is, one can ask whether signaling the free reinforcers delivered during instrumental training will attenuate their adverse effect. The results shown in Fig. 3 come from a procedure designed to answer that question. After the results shown in Fig. 2 were collected, the animals were returned to a simple training procedure with both manipulanda for 10 daily sessions. Then each animal was given Pavlovian conditioning with a 4-sec light-noise compound which terminated in sucrose. On each of 2 days, the animals received 40 light-noise presentations, delivered at a mean rate of 3/min. Then all animals were given the opportunity to earn reinforcers on both the lever and the chain, with free sucrose deliveries intermixed at the .05/sec rate. For half the animals those free sucrose deliveries were each preceded by the light-noise compound; for the other half, the sucrose deliveries were unsignaled. The results are shown in Fig. 3. When the free reinforcers were unsignaled, the results were as before: lower likelihood of the response for which the reinforcer was otherwise freely delivered. However, when the response-independent reinforcers were signaled, that effect was markedly attenuated. These results support the view that the effects of free reinforcers involve background conditioning. They suggest that instrumental and Pavlovian associations may be similarly affected by associations of the background with the reinforcer.
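The background-conditioning account can be illustrated with the Rescorla and Wagner (1972) learning rule, in which each stimulus present on a trial changes in strength by alpha * beta * (lambda - total strength of the stimuli present). The sketch below applies that rule to the Pavlovian case; treating the instrumental response like the CS is the analogy proposed in the text, not something the code establishes. The division of time into discrete moments, the parameter values, and the ordering of events within a cycle are assumptions chosen for illustration.

```python
def simulate(condition, n_cycles=200, n_empty=4, beta=1.0, lam=1.0):
    """Rescorla-Wagner simulation of extra USs given between CS trials.

    condition: "none" (no extra USs), "unsignaled" (extra US with the
    background alone), or "signaled" (extra US preceded by a discrete
    signal). All parameter values are illustrative assumptions.
    """
    alpha = {"context": 0.1, "cs": 0.3, "signal": 0.3}  # assumed saliences
    strength = {"context": 0.0, "cs": 0.0, "signal": 0.0}

    def update(present, us):
        # Shared error term: asymptote minus summed strength of cues present.
        error = (lam if us else 0.0) - sum(strength[c] for c in present)
        for cue in present:
            strength[cue] += alpha[cue] * beta * error

    for _ in range(n_cycles):
        update(["context", "cs"], us=True)        # CS paired with the US
        for _ in range(n_empty):
            update(["context"], us=False)         # background alone, no US
        if condition == "unsignaled":
            update(["context"], us=True)          # free US, background alone
        elif condition == "signaled":
            update(["context", "signal"], us=True)  # free US after a signal
    return strength

cs_strength = {c: simulate(c)["cs"] for c in ("none", "unsignaled", "signaled")}
for condition, value in cs_strength.items():
    print(f"{condition:>10}: V(CS) = {value:.3f}")
```

Under these assumptions, unsignaled extra USs leave the background with appreciable strength, which subtracts from what the CS can acquire. When a signal precedes the extra USs, the signal absorbs that strength and conditioning to the CS recovers. The size of the effect depends entirely on the assumed parameters.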
[Figure 3: two panels, Unsignaled and Signaled.]
Fig. 3. The effect of signaled and unsignaled noncontingent reinforcers upon rate of instrumental responding. The noncontingent reinforcers were either the same as or different from those earned by the response.
It may be noted in passing that the present results help to rule out several relatively uninteresting interpretations of the effects of interresponse reinforcers. One simple possibility is that freely delivering reinforcers results in the adventitious training of a competing behavior. Suppression of the original behavior would then be mediated by the training of some competing behavior that goes unobserved by the experimenter. Although such a competing response might contribute to the general depressive effect shown in Fig. 2, it is difficult to see how it could produce a selective effect on the response that earned a particular reinforcer.
A second possibility is that freely delivered reinforcers attenuate a particular response because they produce selective satiation of the reinforcer earned by that response. As noted earlier, Colwill and Rescorla (1985a) found that selective satiation effects can govern differential instrumental performance. However, the sensitivity of the depressive effect to signaling of the extra reinforcers is difficult to account for in terms of reduced motivation. It seems unlikely that signaled reinforcers should differ so markedly from unsignaled reinforcers in producing satiation. Consequently, these results suggest that delivering reinforcers independently of an instrumental response has a selective depressive effect on the association between that response and the reinforcer. That implies a knowledge of the particular reinforcer.
These two types of evidence strongly suggest that in instrumental learning the organism does encode the reinforcer. The reinforcer does not simply play the role of a catalyst producing learning about other events, but rather participates in the learning underlying instrumental performance.
C. OTHER EVIDENCE ON REINFORCER ENCODING
In addition to these two types of evidence, there are several other results that have been interpreted as support for the proposition that instrumental training involves response-reinforcer learning. Although they have been reviewed elsewhere, we think that it is worth noting these supporting results here.
1. Concurrent Measurement
A procedure of historical importance to the development of two-process theories was the concurrent measurement of Pavlovian responses during instrumental training (cf. Rescorla & Solomon, 1967). Although such measurements did not identify clear and consistent Pavlovian mediators for instrumental behavior, they did yield some results that support the existence of response-reinforcer associations. For instance, Williams (1965) measured the salivary response in dogs engaging in food-rewarded panel pressing. In some instances, he found clear evidence that instrumental training allowed the panel-press response to become
an elicitor of a conditioned salivary response. This suggests that the response became a signal for the reinforcer. Unfortunately, this sort of procedure can only provide evidence that the organism anticipates the reinforcer once it has made the instrumental response; it does not demonstrate that this response-reinforcer association is involved in the production of the instrumental response. Although the reinforcer may be expected once the response has occurred, that expectation may be an epiphenomenon, playing no role in the generation of the instrumental behavior.
2. Response Form
One kind of evidence that suggests detailed encoding of the reinforcer in Pavlovian conditioning is the dependence of the form of the conditioned response on the nature of the reinforcer. Typically, the Pavlovian conditioned response is different when different unconditioned stimuli are used, suggesting that the identity of the unconditioned stimulus is preserved in the associative structure underlying Pavlovian conditioning. Particularly systematic data on this point have been reported by Jenkins and Moore (1973) for the form of the autoshaped keypeck response in pigeons.
In instrumental training the response form is much more heavily determined by the demands of the experimental contingencies themselves. But sometimes the details of the response form may reflect the identity of the reinforcer used. When the same nominal instrumental response is rewarded with different outcomes, the topography of the response can vary with the outcome. For instance, Spetch, Wilkie, and Skelton (1981) found differences in instrumental peck duration, force, and observer-judged form when birds pecked a key for food and water. Similarly, Hull and his colleagues (Cook & Hull, 1979; Hull, 1977; Hull, Bartlett, & Hill, 1981) identified qualitative and quantitative differences in lever pressing in rats reinforced with food and water. The implications of these results are somewhat clouded by alternative interpretations in terms of different motivational states, differential exposure to the reinforcers per se, or the availability of external stimuli that might signal the food; but these data are consistent with the view that the reinforcer becomes associated with the response so as to influence its precise form.
3. Mutual Interference with Pavlovian CS-US Associations
It is well demonstrated that when two Pavlovian CSs jointly signal the same US, they interfere with each other’s ability to develop an association with the US; that is, they overshadow each other. Several authors have argued that one may use this interference to evaluate the degree to which two stimuli share associations with the same US. This assessment procedure has in fact been used
to good effect in analyzing the structure of Pavlovian associations (e.g., Blanchard & Honig, 1976; Holland, 1977). Recently, there have been several attempts to apply this logic to demonstrate the shared encoding of the reinforcer in instrumental learning and Pavlovian conditioning. To the degree that the instrumental response becomes associated with the US in the same manner as does a Pavlovian CS, we should observe interference between the two associations.
Several studies have provided evidence that a Pavlovian CS can overshadow the instrumental response (e.g., Dickinson, Peters, & Shechter, 1984; Mackintosh & Dickinson, 1979; Pearce & Hall, 1978; St. Claire-Smith, 1979a,b; Tarpy, Lea, & Midgley, 1983). Williams (1982) has recently reviewed this literature and concluded that although there are some interpretative difficulties, the data support the idea that Pavlovian conditioning and instrumental training share encoding of the reinforcer. Unfortunately, as Rescorla and Holland (1982) noted, these results are not uniquely anticipated by a response-reinforcer theory, but are consistent with any theory that says that predicted reinforcers are less effective. Even if an instrumental reinforcer serves only as a catalyst promoting other associations with the response, its effectiveness in doing so may be attenuated if it is otherwise well signaled by a Pavlovian CS.
Moreover, that observation may weaken the implications of the results shown in Fig. 2. As already noted, the effect of unearned reinforcers in reducing instrumental learning can be interpreted in terms of background conditioning interfering with the instrumental response learning. Within that view the results shown in Fig. 2 are a special case of a Pavlovian CS (the background) interfering with instrumental learning. Then they fall prey to a similar alternative interpretation.
There is also evidence for interference in the other direction in which an instrumental response overshadows conditioning of a Pavlovian CS. For instance, Garrud, Goodall, and Mackintosh (1981) reported that a CS paired with food acquires less conditioning if it is accompanied by performance of an instrumental response. Moreover, Shettleworth (1981) demonstrated that the amount of overshadowing of a Pavlovian CS by a response was directly related to the susceptibility of that response to the instrumental contingencies. Data such as these support the conclusion based on work with the concurrent measurement technique: After instrumental training the occurrence of the response results in the organism’s expecting the reinforcer. However, these data, too, provide no evidence that this expectation plays an important role in the generation of the instrumental response. Despite these alternative interpretations, the data on mutual interference between instrumental response learning and Pavlovian conditioning agree with the conclusions based on other procedures and so support the view that the reinforcer is encoded in an association with the response.
Associative Structures in Instrumental Learning
4. Reward Shifts
As Mackintosh (1974) points out, some of the earliest evidence for encoding of the reinforcer came from experiments that changed the reinforcer during the course of acquisition. The idea motivating these studies is that an animal should display a change in performance when the identity of the reinforcer is altered only if the nature of the reinforcer has been encoded. One very simple performance change that was documented in these early experiments was an orienting reaction on the first trial after a reward shift (e.g., Cowles & Nissen, 1937; Elliott, 1928; Lorge & Sells, 1936; Nissen & Elder, 1935; Tinklepaugh, 1928). But it has been more common to index sensitivity to change in reward in terms of the profound and rapid adjustment in instrumental performance when the reward is changed (e.g., Crespi, 1942; Elliott, 1928; Zeaman, 1949). The important observation is that performance after the shift in reward depends on the relation of the rewards used before and after the shift; the same postshift reinforcer produces quite different performance, depending on the reinforcer used prior to the shift. That dependence implies that an encoding of the original reinforcer is available for comparison with the new reinforcer. However, because the shifts in reward typically involve changes only in magnitude, they have only provided evidence for a crude encoding of the reinforcer. These experiments can be viewed as the precursors to experiments (like that reported in Fig. 1) that change the value of the reinforcer after learning is complete. One important difference is that they acquaint the animal with the new value of the reinforcer at the same time that the response itself is being measured for its sensitivity to that shift. 
For this reason, most of these reward-shift experiments could be interpreted in a manner like that described in the previous section: The effectiveness of the instrumental reinforcer in producing learning among other events may be modulated by the degree to which it is poorly signaled by external stimuli. Although many of these results have alternative interpretations, as a whole they are consistent with data reported in Sections II,A and II,B. In sum, such results provide convincing support for the view that instrumental learning involves some encoding of the reinforcer.
D. GENERALITY OF REINFORCER ENCODING
Thus far, we have reviewed evidence from a variety of sources indicating that an important component of instrumental conditioning involves learning about the identity of the reinforcer. Especially compelling are those data showing that under well-controlled conditions the instrumental response can be highly sensitive to changes in the value of its reinforcer. However, there has been some uncertainty about the precise conditions under which that outcome can be obtained. As noted previously, some experiments have found that postconditioning changes in the value of the reinforcer have no effect on instrumental performance. Those results suggest that under some circumstances instrumental learning may not involve encoding of the reinforcer. Consequently, we have repeated the procedures used for generating the results in Fig. 1 under a fairly broad range of parameter values. We report here variations using parameter values previously suggested to minimize sensitivity to changes in reinforcer value, and by implication the degree of response-reinforcer learning. In the experiments to be described we reduced the density of reinforcement, increased the delay of reinforcement, increased the amount of instrumental training, and brought the response under the control of an external stimulus. As we detail below, none of these manipulations eliminated encoding of the reinforcer.
1. Extensiveness of Training
The variable that has most frequently been suggested to change the nature of instrumental learning is the amount of training. Many different authors have argued that although instrumental behavior is initially goal directed, it eventually develops a kind of automatic quality that makes it relatively independent of the value of the goal. This notion of automaticity has deep historical antecedents (e.g., Allport, 1937; James, 1890; Murphy, 1947; Tolman, 1933, 1948) as well as widespread modern instantiations (e.g., Adams & Dickinson, 1981a; Hasher & Zacks, 1979; Irwin, 1971; Shiffrin & Schneider, 1977). In the language used here, these views suggest that although instrumental learning is initially response-reinforcer in character, with practice it becomes stimulus-response in nature. Empirical support for this proposition is remarkably scant.
In one study, Adams (1982) found that poisoning of a reinforcer had less effect on performance of an instrumental response in animals previously given extensive response-reinforcer training than in animals previously given only moderate training. However, further analysis of this result revealed that the source of this difference was not the amount of instrumental training, but rather the degree of familiarity with the reinforcer itself. In keeping with this suggestion is the fact that in some studies Adams found it substantially more difficult to make animals reject the food after it had been extensively earned. This raises the possibility that extensive training does not affect the sensitivity of instrumental learning to changes in the value of the reinforcer, but rather affects the sensitivity of the reinforcer to the manipulation that changes its value. Extended training may not modify the character of instrumental learning; instead, it may modify the ability of our measurement techniques to assess that learning adequately. For instance, it is not implausible to think that extensive exposure to the food reinforcer would produce latent inhibition, thus making it
difficult to devalue that food by a conditioning operation. Nor would it be surprising if extensive experience earning food allowed the animal to discriminate more readily those food presentations that are poisoned from those that are earned by the instrumental behavior. Consequently, in a recent series of experiments, we attempted to examine extensiveness of training under conditions that equate exposure to the reinforcer per se (Colwill & Rescorla, 1985b). In these experiments, the animals were trained to make four different instrumental responses: lever press, chain pull, nose poke, and handle pull. The chain and lever were, as in previous experiments, located on either side of the food magazine. The nose poke response consisted of the depression of a panel located behind a hole cut in the wall above the food magazine. The handle pull involved reaching between the grid bars and lifting a rod fitted with a circular handle located immediately to the right of the food magazine. Two of the responses (R1 and R2) earned one reinforcer (S1) and the other two (R3 and R4) earned another reinforcer (S2). Response and reinforcer identifications were counterbalanced across animals. The design is schematized in Table I. Each response was reinforced for one 16-min session on a VI 30-sec schedule. Then all responding was reinforced on a VI 60-sec schedule. One member of each pair (R1 and R3) was trained moderately (one 20-min session) and the other member (R2 and R4) was trained extensively (13 20-min sessions). At the end of this training, one reinforcer was paired with LiCl and the other was presented, but never poisoned. That differential conditioning continued for six 2-day cycles. Finally, the animals were given two extinction tests. In one test, they chose between two extensively trained responses (R2 vs R4); in the other test, they chose between two moderately trained responses (R1 vs R3).
Notice that one important feature of this design is that the very same reinforcer is used to train one response extensively and one moderately; as a result, we can be assured of equivalently devalued reinforcers for responses trained in varying amounts. Figure 4 shows the results of these choice tests for the responses trained moderately (left panel) and extensively (right panel). It is clear that for both amounts of training responding remained sensitive to the poisoning operation. There is little to suggest that under these conditions extensive training changed the character of the instrumental learning. Moreover, those results have been replicated in a recent study in which the amount of extensive training was increased to 60 sessions. Consequently, considerable information about the reinforcer identity seems to be preserved even after quite extensive amounts of training.
TABLE I
DESIGN OF EXTENSIVE TRAINING EXPERIMENT (a)
Training        Devaluation     Test
R1 → S1         S1+, S2-        Moderate: R1 vs R3
R2 → S1         S2+, S1-        Extensive: R2 vs R4
R3 → S2
R4 → S2
(a) R1, R2, R3, and R4 are instrumental responses, lever pressing, chain pulling, nose poking, and handle pulling, counterbalanced across animals. S1 and S2 are sucrose and pellets. + and - represent the presentation or not of LiCl.
[Figure 4. Panels: Moderate, Extended. Horizontal axis: blocks of 2 minutes. Legend: poisoned, not poisoned.]
Fig. 4. Sensitivity of extensively and moderately trained instrumental responding to reinforcer devaluation. Instrumental responding is shown from a choice extinction test administered between either two extensively (left panel) or two moderately (right panel) trained responses. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols). From Colwill and Rescorla (1985b). © 1985 by the American Psychological Association.
2. Reinforcement Density
It has been suggested by a number of authors that performance is less dependent on the maintained integrity of the reinforcer when the relation between the response and reinforcer is degraded during instrumental training. One popular procedure for degrading that relation is to arrange for only some of the responses to be followed by reinforcers. There is a good deal of evidence that such partial
reinforcement procedures result in learning that is highly resistant to extinction (see reviews by Jenkins & Stanley, 1950; Lewis, 1960; Robbins, 1971). This suggests the possibility that less dense reinforcement schedules might remove the sensitivity of instrumental performance to changes in the reinforcer value. In most of the studies reported above, the reinforcement density was already relatively low, since the animals were trained on a VI 1-min schedule. However, in order to investigate this possibility further, we recently repeated the extended training study, but with the reinforcement density reduced fourfold. The rats were trained to make all four responses: lever press, chain pull, nose poke, and handle pull. As before, two were reinforced with sucrose pellets (one moderately and one extensively) and two with Noyes food pellets (again one moderately and one extensively), but now all responding was reinforced on a VI 4-min schedule. Then one reinforcer was paired with LiCl and the other was not. After six 2-day cycles of aversion training, all subjects were given two choice tests: one with the moderately trained responses and one with the extensively trained responses. The results, shown in Fig. 5, were very similar to those obtained with VI 1-min reinforcement schedules, shown in Fig. 4. Even under circumstances of quite low reinforcement density, responding remained sensitive to changes in the reinforcer value.
[Figure 5. Panels: Extended, Moderate. Horizontal axis: blocks of two minutes.]
Fig. 5. Sensitivity of extensively (left) and moderately (right) trained instrumental responding to reinforcer devaluation. All training took place under a reduced reinforcer density. The treatment identifications are the same as in Fig. 4.
It is important to note, however, that other results suggest that the reinforcement schedule may be important. Chen and Amsel (1980) found partially reinforced runway behavior to be less sensitive to subsequent poisoning of the reinforcer than was continuously reinforced behavior. Capaldi and Myers (1978) found a similar difference in sensitivity to satiation. However, both studies also found differences in the success with which they could change the value of the reinforcer; partially reinforced animals continued to consume the reinforcer longer in the face of both poisoning and satiation. That raises the possibility that the reinforcer was not equally changed in the various groups. Potentially more interesting are the findings of Dickinson et al. (1983) that behaviors reinforced on a ratio schedule are more sensitive to changes in reinforcer value than are behaviors reinforced on an interval schedule. It seems quite possible that different ways of arranging the response-reinforcer relation will vary the degree to which the details of the reinforcer are encoded. However, the results of Fig. 5 clearly indicate that simply reducing the density of reinforcement does not eliminate the animal’s encoding of the reinforcer on a VI schedule.
3. Delay of the Reinforcer
Another procedure that has been thought to make a response less sensitive to changes in the value of its reinforcer is to arrange for the response to be temporally or spatially distant from its reinforcer. Several authors (e.g., Morgan, 1974, 1979; Rescorla, 1977) have argued that with a temporal separation between the occurrence of a response and delivery of the reinforcer, the animal learns less about the particulars of that reinforcer. One reason commonly given for that view is that a long delay permits other responses and stimuli to intervene between the response and the reinforcer. Those other events may then become valuable by virtue of their own proximity to the reinforcer and may in turn serve as the functional reinforcers for the response being measured; that is, instrumental responses that are distant from the primary reinforcer may actually be maintained by conditioned reinforcers. Since there is some evidence from the Pavlovian literature to suggest that stimuli associated with conditioned reinforcers may fail to incorporate information about the primary reinforcers, it is plausible to think that instrumental responses trained by conditioned reinforcers may have a similar failing (e.g., Holland & Rescorla, 1975). One implication of such an analysis is that behaviors more distant from the goal should be less sensitive to changes in that goal. There is some evidence to support that implication when the goal is changed by motivational techniques (e.g., Morgan, 1979; Rescorla, 1977). On the other hand, there is other evidence suggesting just the opposite (e.g., Fantino, 1965; Nevin, Mandell, & Yarensky, 1981). To investigate this possibility further, we recently conducted an experiment devaluing the reinforcer after instrumental training carried out with various delays of reinforcement. This experiment was closely modeled on that studying the impact of extensive training, except that delay of reinforcement replaced amount of training in the design (see Table I). Rats were trained to engage in four instrumental responses (lever pressing, chain pulling, handle pulling, and nose poking), two leading to Noyes pellets and two to sucrose pellets. For each reinforcer, one response produced its immediate delivery and the other response produced its delivery after a 5-sec period. In the delayed reinforcement condition, a diffuse light filled the 5-sec interval for one response and a tone filled the interval for the other response. Then half the animals received the poisoning sequence with sucrose and half with Noyes pellets. Finally, all animals were tested with two choice procedures given in counterbalanced order. During one test, the animals were given a choice between two responses that had a history of immediate reinforcement, and during the other test, they were given a choice between two responses that had a history of delayed reinforcement. In each test, the reinforcer for one of the responses had been poisoned, but the reinforcer for the other response was still valuable. Figure 6 shows the results of those two choice tests, presented in the manner of earlier figures. Not surprisingly, responses that had been reinforced with a delay were in general made less frequently than those reinforced immediately. However, whatever the response-reinforcer interval, poisoning of a reinforcer led to a decrement in the likelihood of the response that had previously earned it. Those results suggest that neither delaying the reinforcer nor interposing an explicit event between the response and the reinforcer results in performance that is impervious to changes in the value of the reinforcer. Even under these conditions, there is an encoding of the reinforcer in the learning underlying instrumental performance.
[Figure 6. Panels: Immediate, Delayed. Horizontal axis: blocks of 2 minutes. Legend: poisoned, not poisoned.]
Fig. 6. Sensitivity to reinforcer devaluation of instrumental responding trained either with an immediate (left panel) or delayed (right panel) reinforcer. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols).
4. Stimulus Control
Another procedure that might instill an independence between a response and its outcome is the use of a stimulus to signal when the response is reinforced. Under those circumstances, of course, the animal comes to make the response primarily in the presence of a particular stimulus. It seems plausible that the arrangement of such a regular close relation between a response and its antecedent stimulus would especially encourage the formation of an association between the two. Consequently, training that establishes control over responding by a particular stimulus may increase the automaticity of the behavior and result in a loss of its sensitivity to changes in the value of its reinforcer. The possibility that establishing stimulus control over behavior might make that behavior resistant to a change in the reinforcer value has some empirical support. For example, Wilson et al. (1981) reinforced lever pressing in the presence but not in the absence of a clicker-light compound (SD). Then, in half the subjects, they paired the reinforcer with a toxin. In a subsequent test, they found that the level of responding in the presence of the SD was the same regardless of the current value of the reinforcer. Unfortunately, interpretation of that result is complicated because in that experiment instrumental responding established in the absence of an SD was also insensitive to reinforcer devaluation. However, a recent experiment in our laboratory suggests that such a loss of sensitivity need not occur. We conducted a variation on our basic devaluation experiment in which we established stimulus control over the instrumental responses. The animals were trained to chain pull and lever press, with each response followed by one particular reinforcer, either 8% liquid sucrose or Noyes pellets. However, only those responses that occurred in the presence of an SD (either a tone or a diffuse light) led to reinforcement.
Each animal was trained with two unique combinations of stimulus, response, and reinforcer, but across animals all combinations were equally represented. During training, only one of the two designated combinations was presented on a particular day. Each 1-hr training session contained 30 trials per day for 16 days. Each trial had a maximum length of 10 sec; the first response during the trial terminated the stimulus and delivered the appropriate reinforcer. Responses in the intertrial interval were without consequence. Then each animal received five 2-day cycles of aversion training, with one reinforcer paired with LiCl and the other not. Finally, each animal was given the opportunity to make each response in the presence of each SD. On each of 4 days, it received two test sessions; one response was tested in
the first session and the other response was tested in the second session. During a test session, each SD was presented five times for 10 sec, but no reinforcers were delivered. Figure 7 shows the outcome of those extinction test sessions. Plotted are the mean latencies to make the response upon onset of the stimulus with which it had been trained. The data are separated according to the poisoning treatment of the reinforcer that previously followed that stimulus-response pair. Animals almost never responded when the other stimulus was turned on during this test. The symbols at the left represent latencies at the end of training prior to poisoning of the reinforcer. The lines to the right show the latencies during the test days. It is clear that the latencies were sensitive to the treatment of the reinforcers. Latencies for the responses whose reinforcer had been poisoned were reliably longer than those for responses whose reinforcer had not been poisoned. Those data suggest that simply bringing a response under stimulus control does not make it impervious to changes in the value of its reinforcer.
[Figure 7. Horizontal axis: sessions. Legend: poisoned, not poisoned.]
Fig. 7. Sensitivity to reinforcer devaluation of an instrumental response brought under the control of a discriminative stimulus. Shown are the mean latencies of responding during extinction testing. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols).
There are two other features of this experiment that are worth noting. First, the responses were differentially affected not only in the presence of the SD, but also in their absence. The likelihood of an intertrial response was substantially lower if the reinforcer for that response had been paired with poison. During the first test day, the mean number of intertrial responses was 19 and 42 for the poisoned and unpoisoned responses, respectively. Second, the use of an SD provides additional information on the immediacy with which the organism shows these effects in testing. Earlier results from free-operant paradigms have made it clear that response rates differ very early in testing, within the first 4 min. But in the present discrete-trial experiment, there was a difference in response latency on the very first test trial. That suggests that the animal anticipates the particular reinforcer it will receive from the very outset of testing.
5. Conclusion
The results of all of these manipulations suggest that encoding of the reinforcer is a widespread and robust feature of instrumental learning. Despite variations in amount of training, density of reinforcement, delay of reinforcement, and degree of stimulus control, the animals continued to show changes in performance of the instrumental response whose reinforcer was changed in value. But it is very difficult to decide the more subtle issue of whether these manipulations changed the degree to which the reinforcer is encoded. The first three manipulations changed the general level of responding after instrumental learning. The resultant repositioning of the control responses on performance scales makes comparisons of effect magnitudes extremely hazardous. However, it is relevant to note that whether those manipulations generally enhanced responding (as with extended training) or generally depressed it (as with delayed reinforcement), encoding of the reinforcer persisted. It is even more difficult to compare the latency measure of behavior under stimulus control with the rate measure obtained in the absence of that control. But again, there is no evidence to suggest that either situation destroys reinforcer encoding. Finally, we should comment on why our results consistently show such strong evidence of the effects of devaluing the reinforcer when earlier experiments produced a range of outcomes. A unique feature of our experiments is the use of multiple reinforcers to train multiple responses. Most other experiments have trained only a single response with a single reinforcer. The use of a multiple response/reinforcer procedure has a number of consequences that might be expected to enhance our ability to detect the effects of devaluing the reinforcer. First, it allows one to use a test procedure in which the same animal is given a choice between two responses whose reinforcers have received different devaluation treatments. 
Although the results reported by Colwill and Rescorla (1985b) and in Section III suggest that such a testing procedure is not essential, it almost surely increases the chances of detecting differences in response strength. Second, a multiple response/reinforcer procedure permits one to conduct the
devaluation treatment in the same environment used for instrumental training. Maximizing the similarity between the circumstances of instrumental learning and those of change in the value of the reinforcer may be important in inducing the organism to identify the reinforcer changed as the same as that it earlier earned. But if poisoning of a reinforcer is conducted in the original training environment, the value of the environment may be adversely affected. With a multiple response/reinforcer procedure, one can hope to equate any general impact of poisoning on the environment by arranging for all animals to have the poisoning preceded by some reinforcer and then looking for a specific impact on the response trained by that reinforcer. However, when only one response-reinforcer relation has been trained, it is necessary to compare animals that receive poisonings preceded by the reinforcer with other animals that receive poisonings in an unpaired relation to the reinforcer. As a result, the circumstances under which the poison is presented in the environment differ across groups. It would not be surprising if the animals receiving unpaired presentations of poison showed not only less conditioning of the reinforcer, but also greater conditioning of the environmental stimuli. If conditioning of an environment depresses instrumental performance in its presence, this would make detection of an effect of reinforcer devaluation in the paired group more difficult. Indeed, in some reported failures to detect instrumental response effects of devaluing a reinforcer (e.g., Adams, 1982), the difficulty appears attributable to the substantial depression of instrumental performance observed in the unpaired group. Third, exposing the animal to multiple reinforcers following multiple responses may induce a devaluation effect where it otherwise might not have existed. This might happen in at least two ways.
First, it is possible that when the animal experiences only one reinforcer, it fails altogether to encode that reinforcer and uses it only as a catalyst. This seems unlikely since we have been able to detect devaluation effects in animals that have only experienced one response/reinforcer relation. Second, it is more likely that exposure to multiple reinforcers encourages the animal to focus its encoding on the features that differentiate them. The devaluation procedures that are commonly employed change the value of the reinforcer by conditioning only some of its features, typically those involving flavor; they leave relatively untouched various other features. Consequently, any procedure that encourages the animal to encode the reinforcer used for instrumental training in terms of its flavor would be expected to allow one to detect the consequences of that devaluation procedure. On the other hand, to the degree that the animal learns an association between the response and other features of the reinforcer, such as its general drive-reducing properties, a devaluation treatment that affects the flavor might be relatively less effective. Using multiple reinforcers that share some common properties but that differ in flavor may encourage flavor-based encoding. Consequently, although instrumental training may intimately involve response-reinforcer associations in
either the single or multiple-reinforcer procedure, the use of multiple reinforcers may influence the exact encoding of the reinforcer and hence our ability to detect those associations with any given devaluation treatment. Whatever the merit of these observations, it is clear that with the procedures used here, response-reinforcer associations are a routine outcome of instrumental training.
III. Separation of R-Reinforcer from S-Reinforcer Learning
The results of the previous section indicate quite clearly that the reinforcer does not simply serve as a catalyst producing instrumental learning among other elements. Rather, that reinforcer is itself encoded. It is natural to assume that this encoding is done by means of an association with the response that produces the reinforcer. Indeed, the previous discussion has interpreted evidence for encoding of the reinforcer in just those terms. However, the studies previously cited are in fact equally compatible with a two-process notion in which the reinforcer becomes associated not with the response, but with those stimuli that are antecedent to the response. Most versions of a two-process account assume that the response develops an association with some prior stimulus, an association promoted by the reinforcer. They further assume that the reinforcer also becomes associated with that prior stimulus through a Pavlovian process. By virtue of this encoding, it is thought that the reinforcer may then act either as a motivator or as a mediator of the instrumental association. Whatever their detailed function, the important point is that the format of this encoding is in terms of an association of the reinforcer with an antecedent stimulus, not with the instrumental response itself. This interpretation is most obviously applicable to the case of discrete-trial instrumental learning (an instance of which was used to produce the results shown in Fig. 7). When a response is reinforced only in the presence of some SD, it has seemed only natural to acknowledge that there is an embedded Pavlovian relation between the SD and the reinforcer (e.g., Rescorla & Solomon, 1967). Subsequent changes in the value of the reinforcer would then be expected to produce changes in the animal’s reaction to that SD.
Responding might be depressed in the presence of that SD not because the animal knows what reinforcer will follow the response, but because it knows what reinforcer will follow that stimulus. The anticipatory aversive reaction could well be expected to reduce specifically responding in the presence of the SD by removing either some of the original motivation (Rescorla & Solomon, 1967) or some of the original stimulus support (Trapold & Overmier, 1972) for the response. Two-process theory can readily be expanded to explain many instances in which nondiscriminated instrumental responding is adversely affected by changes in the reinforcer value. Even in the absence of an explicit stimulus,
reinforcement of a response must always occur in the presence of some contextual stimuli. In the most elementary experiments in which a response is followed by a reinforcer in some apparatus, one would expect the apparatus cues to become associated with the reinforcer. If that reinforcer should become aversive, responding in that apparatus would be depressed even in the absence of a response-reinforcer association. Two-process accounts may even be extended to certain cases in which multiple responses and reinforcers all occur in the same physical context. For instance, in many of the experiments previously described, the animals were trained on two response-reinforcer relations, both in the same chamber. However, only one response-reinforcer combination was available during any one training session. As a result, the anticipation of a particular reinforcer may become a component of the stimulus complex in which each response has been reinforced. When that reinforcer is subsequently poisoned, the context may then fail to provide either the necessary motivation or stimulus support for that response. The stimulus conditions appropriate to performance of the other response whose reinforcer has not been devalued will still be intact, leading to differential performance of the two responses. However, this account is less successful in dealing with experiments in which multiple responses and reinforcers are mixed within the same training experience. For instance, Adams and Dickinson (1981b) trained rats to lever press for one reinforcer, but administered another reinforcer in a noncontingent fashion during the same session. As a result, there was no opportunity for the response-contingent reinforcer or the anticipation of its occurrence to gain a special advantage as a stimulus in the presence of which the response was reinforced.
Nevertheless, Adams and Dickinson (1981b) found that responding was more profoundly affected if the reinforcer for that response had been poisoned than if the noncontingent reinforcer had been devalued. A recent experiment in our laboratory (Colwill & Rescorla, 1985a, Experiment 2) provides supporting evidence under conditions where the two reinforcers were more equivalently treated. This study was a simple variation on the experiment for which the results are described in Fig. 1. In this particular variation, both responses (lever pressing and chain pulling) were trained in the same session. In each session, the animals had both manipulanda continuously available; responding on either produced reinforcers on equivalent VI schedules, but each response earned only one kind of reinforcer (either sucrose or pellet). Like the Adams and Dickinson study, this design provides no opportunity for the reinforcer earned by the response to bear a special antecedent stimulus relation to that response. Yet when our standard poisoning operation was conducted, it differentially affected responding. As shown in Fig. 8, even after this intermixed training, the response whose reinforcer was poisoned was selectively depressed. The results of such experiments suggest that the encoding of the reinforcer
[Figure 8. Panel titles: Sucrose Reinforcer, Pellet Reinforcer; x-axis: blocks of 4 minutes.]
Fig. 8. Sensitivity to reinforcer devaluation of instrumental responses trained concurrently in the same session. Treatment designations are as in Fig. 1. From Colwill and Rescorla (1985a). © 1985 by the American Psychological Association.
does not take place in terms of an association between the reinforcer and the global properties of the context. However, it remains possible to defend a two-process theory that sees more local stimulus events as responsible for encoding of the reinforcer. Notice, for instance, that the two responses we measure occur on manipulanda that are located in different parts of the chamber. This means that there are separate local environmental stimuli that are correlated with the occurrence of the two reinforcers. As a result, it is quite possible that each reinforcer is uniquely associated (by Pavlovian processes) with these different locations. When the value of one of the reinforcers is reduced, the location correlated with delivery of that reinforcer would become less attractive and thereby differentially interfere with performance of the instrumental responses. The tactile, visual, and olfactory properties of the manipulandum, rather than the execution of the instrumental response, may be the associates of the reinforcer. It seems unlikely that purely empirical assessments can decisively choose among these alternatives. As the stimuli that are envisioned to control anticipation of the reinforcer become more and more localized in the manipulandum itself, the distinction between the stimulus and response aspects of behavior becomes less and less tenable. It is only the adherence to a broader theoretical framework that would lead one to advocate one description over another. Nevertheless, experiments can contribute to the plausibility of one alternative or the other. For instance, in one recent experiment we attempted to make the response-reinforcer alternative more attractive by intentionally minimizing the differences in correlated stimulus features of responding. Our strategy, following an earlier suggestion by Bolles, Holtz, Dunn, and Hill (1980), was to reinforce two different behaviors, both addressed to the same manipulandum.
For this purpose, we inserted into each chamber a 10-cm vertical pole that could be displaced either to the left or to the right. We otherwise conducted an experiment
like that just described. In each of six 20-min sessions displacement in one direction produced a Noyes pellet on a VI 1-min schedule, whereas displacement in the other direction produced a liquid sucrose reinforcer on a similar schedule. Observation of the animals indicated similar patterns of orientation toward, approach to, and grasping of the manipulandum regardless of which direction the pole was displaced. During training, the animals freely intermixed the responses. After training, the animals received the standard poisoning sequence with one reinforcer, but not with the other, and were then tested with the pole. As before, the question of interest is whether there would be a selective depression of the response that was trained with the now devalued reinforcer. Figure 9 shows the results, separated according to the subsequent treatment of the reinforcers. Those results look remarkably like those of previous figures: There is a selective depression of the response whose reinforcer had been poisoned. Overall, performance levels are somewhat lower than in previous figures, but this partly reflects the lower level of responding (6/min) prior to poisoning. However, it may also be partly attributable to the fact that the two instrumental behaviors share many more components, such as the approach to and grasping of the manipulandum. Nevertheless, the fact of principal interest is the selective impact of reinforcer devaluation on subsequent instrumental responding. Although one could construct stimulus-based two-process accounts for this finding, it is more naturally accommodated by the notion that there is a response-reinforcer association. As noted, an empirically based choice between encoding of the reinforcer in terms of an association with a response and in terms of an association with a
[Figure 9. Legend: poisoned, not poisoned; x-axis: blocks of 4 minutes.]
Fig. 9. Sensitivity to reinforcer devaluation of two responses directed toward the same manipulandum. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols).
response-related stimulus can probably never be made convincingly. Nevertheless, the present evidence certainly encourages the currently popular response-reinforcer view.
IV. The Role of the Stimulus in Instrumental Behavior
If one concludes that instrumental learning fundamentally involves an association between the response and the reinforcer, then it is natural to ask about the role that leaves for the stimulus in controlling behavior. We will consider here three possibilities, two of which have already been mentioned.
A. AN ASSOCIATION WITH THE RESPONSE
The preceding discussion demonstrates that the response is associated with its consequent reinforcer; but that does not rule out the possibility that it is also associated with its antecedent stimulus. The organism might well learn multiple associations of the response: one with a reinforcer and one with a stimulus. Moreover, a stimulus-response association might function in various ways to promote responding. One possibility is evocative. As suggested by classical S-R theories, the presentation of the stimulus might simply elicit the response, without regard to the anticipated consequences of that response. The previously reviewed evidence suggests that an evocative function of the stimulus does not provide a complete account of performance in instrumental learning, but it might be an alternative source for some instances of instrumental behavior. We describe one piece of evidence for this later. Another possible function is selective. The stimulus might activate a representation of the response that in turn is evaluated in terms of its association with the reinforcer. In this role, an S-R association would work in conjunction with a response-reinforcer association, selecting the subset of alternative responses that the organism would inspect for associations with the reinforcer. We discuss that possibility in the second section below.
1. Residual Responding
In discussing the preceding experiments, we have emphasized the selective depression that reinforcer devaluation produces. But we have made less of an equally important aspect of those results: The incompleteness of the effect of reinforcer devaluation on performance of the instrumental response. Under a wide variety of parameter values, poisoning of a reinforcer left a considerable amount of residual performance of the response that had produced that reinforcer. That residual occurs in the choice tests that we have previously reported,
but it is even more obvious if one conducts a single response test after the same treatments. For instance, in a recent experiment, Colwill and Rescorla (1985b) compared extensive and moderate training in a procedure similar to that used to produce the data shown in Fig. 4. However, we tested the animals with only one manipulandum available at a time. The results of that single-response test are shown in Fig. 10. That figure agrees with Fig. 4 in showing that both extensively and moderately trained responses were sensitive to poisoning of the reinforcer. But it also reveals a substantial level of performance of the poisoned response. When no preferred alternative behavior is present to provide competition, the response whose reinforcer has been poisoned continues to occur with surprising frequency. An obvious interpretation of that outcome is that some portion of instrumental behavior is independent of the current value of the reinforcer, in the manner that a stimulus-response theory would anticipate. Even when the reinforcer has been devalued, the response may have an association with an antecedent stimulus that has evocative power. However, if one were to interpret this residual responding in terms of an S-R association, there are a number of alternatives that need to be considered and
[Figure 10. Panel titles: Extended, Moderate; legend: poisoned, not poisoned; x-axis: blocks of 2 minutes.]
Fig. 10. Sensitivity to reinforcer devaluation of extensively (left) and moderately (right) trained instrumental responses as assessed during a single response extinction test. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols). From Colwill and Rescorla (1985b). © 1985 by the American Psychological Association.
rejected. First, it is possible that this residual behavior simply represents the level of responding that one would expect of an animal confined to the chamber without any other interesting ways to pass the time; that is, the residual may represent not the continued execution of behavior that was once reinforced, but an inherent untrained level of performance that has nothing to do with the instrumental training contingencies. The performance levels seem somewhat higher than would be anticipated by this interpretation, but it is important to provide more formal evidence on this point. For that reason, Colwill and Rescorla (1985b) recently compared responding following reinforcer devaluation in animals that either had been reinforced for making the instrumental response (contingent group) or had been exposed to the reinforcer independently of their making the response (noncontingent group). In order to extend the generality of the assessment, each animal was given both extended and moderate exposure to its training conditions. Thus, the animals in the contingent group received extensive response-reinforcer training with one manipulandum (either a chain or lever) and only moderate response-reinforcer training with the other. All responding was reinforced on the same VI 1-min schedule with the same Noyes pellet reinforcer. Animals in the noncontingent group received daily presentations of the Noyes pellet reinforcer delivered on a variable time (VT) 1-min schedule. One manipulandum was available during many of these sessions and the other manipulandum was available during only a few of these sessions. In neither case was responding ever reinforced in the noncontingent group. Then all animals received poisoning of the Noyes pellets before being given separate extinction tests with each of the responses. Figure 11 shows the results of those tests, separated according to the extent of training. Two things are clear from that figure. 
First, animals in the contingent group continued to respond at a substantially higher rate than did animals that had never received response-contingent presentations of the reinforcer. That outcome implies that the residual responding represents more than simply the untrained response tendencies of the animal. Second, this data pattern emerged regardless of whether training had been extensive or moderate. Indeed, amount of training had no obvious effect, supporting the conclusions of the discussion in Section II. These results clearly indicate that the residual behavior observed after reinforcer devaluation is not attributable to unconditioned levels of responding, but is rather a product of the instrumental training procedure. This study also eliminates another potential source of the residual responding observed in our previous experiments. Those experiments have typically employed multiple reinforcers with multiple responses and then reduced the value of one reinforcer while leaving intact the value of another. We have noted previously the methodological value of such a procedure. But it does allow the possibility that some of the residual responding may be the result of generalization from the response whose
[Figure 11. Panel titles: Lever (Test 1), Chain (Test 2); legend: contingent, noncontingent, extended, moderate; x-axis: blocks of 2 minutes.]
Fig. 11. Mean responses per minute during the extinction test for subjects trained extensively (solid symbols) or moderately (open symbols) with either response-contingent (solid lines) or noncontingent (dashed lines) reinforcers. From Colwill and Rescorla (1985b). © 1985 by the American Psychological Association.
reinforcer is still valuable. In the present experiment, however, there was no opportunity for such generalization to occur; a single reinforcer was used for both responses, and its value was successfully removed. A second possible interpretation for residual responding is that we have incompletely changed the value of the reinforcer by our poisoning operation. Residual responding may represent behavior directed toward a reinforcer that has retained some positive value. This possibility must be given serious attention because once responding has been trained, a reinforcer that is only mildly attractive might be sufficient to support performance. Certainly, there are many instances in which reward schedules inadequate to train performance can nevertheless maintain behavior once established (Ferster & Skinner, 1957). Our studies have typically contained two procedures for assessing the completeness of the change in the value of the reinforcer. One procedure tests for consumption of the reinforcer after the extinction test for instrumental responding has been administered. In almost all cases, we have found total rejection of the poisoned reinforcer. For example, the animals whose instrumental responding is shown in Fig. 10 were subsequently given five response-independent deliveries of the poisoned reinforcer, presented at a rate similar to the one at which they had been earned. In that test, no animal consumed any of the poisoned pellets. Such complete rejection is, of course, quite important. An animal
willing to consume even a single pellet might also be willing to work for that pellet. The second procedure assesses the reinforcing power of the poisoned reinforcer. In that assessment, we examine the ability of the reinforcer to maintain previously established performance. Specifically, we compare responding that produces the poisoned reinforcer with responding that produces no outcome. That gives one an extremely sensitive estimate of any reinforcing properties that the devalued reinforcer has beyond those of the null event. A typical result is shown in Fig. 12 for the animals whose instrumental data are shown in Fig. 11. Animals in the contingent condition were given two 10-min tests, first with the lever and then with the chain. For half the animals, responding on the lever produced the poisoned reinforcer on a VI 30-sec schedule; the other half of the animals received no response-contingent events. These contingencies were reversed in the second test with the chain present. It is clear that the poisoned reinforcer did not differ from no consequence in maintaining behavior. These results provide no support whatsoever for the view that residual responding is attributable to the incompleteness of our poisoning operation. This is not to say that all findings of residual responding that have been reported in the literature are free of this alternative interpretation. Indeed, the fact that some experiments have found reduced, but nonzero consumption of the reinforcer after poisoning makes this a viable alternative for several studies. However, it does
[Figure 12. x-axis: blocks of 2 minutes.]
Fig. 12. Assessment of the reinforcing value of a substance paired with poison. Shown are the mean rates for responses followed either by no outcome or by the poisoned reinforcer, collapsed across the two test sessions.
suggest that incompleteness of poisoning is not the basis for the present observations of residual responding. There remains, however, a third possibility that is closely related to the second: that we have completely changed the value of the reinforcer delivered during the poisoning manipulation, but not that earned during instrumental training. As we noted earlier, it would not be impossible for the animal to differentiate the two occasions of reinforcement delivery, effectively treating them as involving two somewhat different reinforcers having different values. Concerns of just this kind prompted our adopting several of the procedures used in earlier experiments. Thus, our poisoning procedure involved delivering the reinforcer in the same chamber with the same average interreinforcer interval as that employed in the instrumental training phase. That procedure represents a departure from many earlier studies that involved gross differences in the manner of delivering the reinforcer. However, the present studies have left unmatched one very important aspect of reinforcer delivery: whether or not it was response contingent. It would not be surprising if the animal treated rewards that it earned as different from those delivered independently of its behavior. For instance, the occurrence of earned pellets can be anticipated with accuracy, whereas unearned pellets are relatively unannounced. Consequently, during a typical poisoning procedure certain anticipatory features of responding to the reinforcer may be absent. For this reason, we recently conducted a study in which we attempted to incorporate this aspect of reinforcer delivery into the poisoning phase of the experiment (Colwill & Rescorla, 1985b, Experiment 4). In this experiment we trained rats to make four responses: lever pressing, chain pulling, nose poking, and handle pulling. Two of the responses were reinforced with sucrose pellets and the other two with Noyes pellets. 
After each response had been reinforced on a VI 60-sec schedule for 13 sessions, we established an aversion to one of the reinforcers; for some animals (unearned group), that reinforcer was freely delivered on a VT schedule as in our previous experiments; for other animals (earned group), that reinforcer was earned on a VI schedule by one of the responses that it had previously reinforced. The other reinforcer was presented following one of the responses that it had previously reinforced for the earned group and freely for the unearned group, but in neither case was it poisoned. Then both groups of animals were given the choice between two responses (those not available during the aversion phase for the earned group), one whose reinforcer had been poisoned and one whose reinforcer had not. Figure 13 shows the results of the aversion phase of the experiment for the animals that earned reinforcers during poisoning. It is clear that instrumental responding rapidly reflected the treatment of the reinforcer. Responding rapidly declined during those sessions that would end in poisoning; responses during sessions with the other manipulandum continued to occur frequently. It is of
[Figure 13. x-axis: sessions.]
Fig. 13. Effect on instrumental responding of pairing a food with poison at the end of an instrumental training session. An aversion was conditioned to one reinforcer (solid symbol), but not to the other (open symbol). From Colwill and Rescorla (1985b). © 1985 by the American Psychological Association.
interest to note that this decline in the poisoned response occurred before there was any detectable reduction in consumption of the reinforcer. For instance, on the fourth cycle of poisoning, the response that earned the poisoned reinforcer was less frequent than that which earned the nonpoisoned reinforcer. Yet, in that session, the rats continued to consume all deliveries of both reinforcers. This suggests that the instrumental behavior may be the more sensitive index of the effect of the toxin. Figure 14 shows the results of the final choice test, separated according to whether the animal had received poisoning of an earned or unearned reinforcer. Those data show the same pattern as previous results, regardless of the mode of poisoning: Responses whose reinforcer had been poisoned occurred with a lower frequency. There is no indication at all that allowing the animal to earn the reinforcer at the time of poisoning either increased or decreased the impact of that operation on the likelihood of other responses trained with that reinforcer. These results indicate that even when one goes to some lengths to ensure comparability between the reinforcer when delivered contingent upon a response and when poisoned, residual responding nevertheless remains. That makes the
position that the animal differentiates between the reinforcer it earns and that which is devalued seem unattractive as an account for residual responding. However, it is probably never possible to rule out the possibility that continued responding is attributable to some residual value in the reinforcer as coded by the animal. It is instructive in this regard to consider a comparable result that has regularly been observed in purely Pavlovian paradigms. As already noted, it is common to use procedures that devalue S1 after Pavlovian pairings of S2 and S1 as a technique to assess the amount of S2-S1 learning. In many experiments, such a devaluation of S1 produces a marked reduction in the response to S2. However, that reduction is rarely complete, and under some circumstances it is negligible. In the Pavlovian context, that result is not typically taken to mean that there is some source of responding other than the S2-S1 association; rather, it is interpreted in terms of S2 being associated with some feature of S1 that survives the devaluation procedure. A common alternative is that S2 becomes associated with the response properties of S1 during the S2-S1 pairings, whereas it is the stimulus properties of S1 that undergo change during its devaluation; that is, one
[Figure 14. Panel titles: Unearned, Earned; legend: poisoned; x-axis: blocks of 2 minutes.]
Fig. 14. Sensitivity of instrumental responding to reinforcer devaluation conducted either on earned (left panel) or unearned (right panel) reinforcers. An aversion had been conditioned to one reinforcer (solid symbols), but not to the other (open symbols). From Colwill and Rescorla (1985b). © 1985 by the American Psychological Association.
set of S1 properties becomes associated with S2 and another set becomes associated with the devaluing event. Under those conditions the S1 actually presented could be completely devalued on the basis of one set of properties, while the aspects of S1 associated with S2 retain some value. Rescorla (1982) has argued that a similar interpretation could be made of residual instrumental responding. For instance, as noted previously, the response might become associated in part with features of the reinforcer other than its flavor. Since the poisoning procedure is known to change principally the value of those flavor features, a reinforcer paired with a toxin might be rejected and might fail to reinforce behavior upon which it is contingent because its flavor is so aversive. Yet, its previous use as a reinforcer might have resulted in an association between the response and some nonflavor feature of the reinforcer that retains its value. It is not clear whether such an alternative can ever be ruled out. But until it is we must be cautious in interpreting residual responding in terms of an evocative function of an S-R association.
2. Discriminative Performance
A second kind of support for the view that the stimulus becomes associated with the response comes from the simple fact that instrumental responding can be brought under the control of an SD. The ability of organisms to learn to make one instrumental response in the presence of one stimulus and another response in the presence of another stimulus seems to imply other than response-reinforcer associations. One obvious possibility is that the animal additionally forms a stimulus-response association. Such an association need not function to evoke the response, but might instead activate some representation of the response. An activated representation that in turn has an association with a currently valued reinforcer might then be executed. Indeed, in the absence of such an S-R assumption, a response-reinforcer interpretation of instrumental learning has some difficulty with performance generally. As Mackintosh and Dickinson (1979) note, it is not clear how such a theory makes the step from the animal having a response-reinforcer association to its producing the response. Mowrer (1960) attempted to solve a related problem by imagining that the animal was continually scanning its response alternatives, evaluating the outcome of every possible response. But it seems more plausible to assume that the animal uses the simple expedient of an S-R association to reduce the set of alternative behaviors evaluated. This is by no means the only way a response-reinforcer view can generate discriminative performance (see below), but it is a plausible one.
3. Conclusion
These arguments suggest that the organism may form an S-R association in addition to its response-reinforcer association. However, the arguments from
residual responding and discriminative control are relatively indirect, based on a lack of viable alternatives rather than on direct support for the presence of this association. Stimulus-response associations have yet to be demonstrated in the simple direct way that stimulus-stimulus or response-reinforcer associations have been revealed.
B. ASSOCIATION WITH THE REINFORCER
A second potential role for the stimulus in instrumental behavior is that suggested by classical two-process theory: It might develop a direct Pavlovian association with the reinforcer. We have already remarked on the fact that arranging an instrumental response-reinforcer contingency in the presence of some stimulus provides the occasion for a Pavlovian stimulus-reinforcer contingency. This observation has greatly encouraged a variety of two-process theories and generated several types of experiments intended to assess the importance of that putative Pavlovian learning for instrumental learning and performance (see Rescorla & Solomon, 1967; Trapold & Overmier, 1972). Some of the earliest and most direct attempts to provide evidence for the formation of a stimulus-reinforcer association in instrumental learning were carried out by Konorski and his colleagues (Ellison & Konorski, 1964; Konorski & Miller, 1930). They simply inspected the ability of an instrumental SD to elicit Pavlovian CRs during training. In one study, the SD signaled that leg flexion would be reinforced with food (Konorski & Miller, 1930). In another study, Pavlovian and instrumental contingencies were explicitly separated by arranging that panel pressing in the presence of the SD produced a Pavlovian CS that terminated with food (Ellison & Konorski, 1964). In neither case was the presentation of the SD sufficient to elicit a Pavlovian CR (salivation). Conditioned salivation was observed only after performance of the instrumental response or in the presence of the Pavlovian CS. These findings have been confirmed by several other investigators (e.g., Deaux & Patten, 1964; Williams, 1965). Attempts to transfer explicitly established Pavlovian CS-US associations into the instrumental situation have produced somewhat more encouraging evidence for the stimulus-reinforcer possibility.
For instance, Bower and Grusec (1964) initially trained a Pavlovian discrimination in which one CS was followed by water and another was nonreinforced. They then used those CSs as SDs which signaled whether or not a lever press would produce water. The discrimination was mastered more rapidly when the previous CS+ was used as a positive SD and the previous CS- was used as a signal that the response would not be reinforced, compared with the converse arrangement. One interpretation is that the Pavlovian conditioning had given the animal a head start on the learning that would have occurred in the normal course of instrumental discrimination training. Other studies have found congruent results (e.g., Mellgren & Ost, 1969; Trapold, Lawton, Dick, & Goss, 1968).
However, a historically related inference from the two-process account has received more mixed support. If the instrumental SD has an important Pavlovian component, then one might be able to use a simple Pavlovian CS in its place and still evoke the instrumental behavior. Although some early (e.g., Estes, 1948) and a few modern (e.g., Edgar, Hall, & Pearce, 1981; Lovibond, 1983) experiments found positive results, other modern evidence has not supported this implication (e.g., Karpicke, Christoph, Peterson, & Hearst, 1977; LoLordo, McMillan, & Riley, 1974; Schwartz, 1976).
Perhaps the strongest support for the formation of a stimulus-reinforcer association in instrumental training has emerged from a series of studies by Trapold, Peterson, and their colleagues. Those studies compare various sorts of instrumental discrimination training under circumstances that arrange either consistent or inconsistent stimulus-reinforcer relations. For instance, Trapold (1970) reinforced rats for pressing one lever in the presence of a tone stimulus and a different lever in the presence of a clicker stimulus. For some animals, different reinforcers (food and sucrose) were used, depending on which SD was present. For other animals, the same reinforcer (food or sucrose) occurred regardless of the identity of the SD. The former animals learned the instrumental discrimination more rapidly, suggesting that consistency of the SD-reinforcer relation promotes instrumental performance. One popular explanation for this outcome is that a stimulus-reinforcer association develops, endowing the stimulus with additional features that facilitate discrimination learning.
More recent studies have confirmed this sort of finding in a broad range of more complex discrimination learning situations (e.g., Brodigan & Peterson, 1976; Carlson & Wielkiewicz, 1972, 1976; DeLong & Wasserman, 1981; Edwards, Jagielo, Zentall, & Hogan, 1982; Flaherty & Davenport, 1968; Overmier, Bull, & Trapold, 1971; Peterson & Trapold, 1980; Peterson, Wheeler, & Trapold, 1980). The experiments of Peterson et al. (1980) are especially compelling evidence for this view.
Despite this evidence, and despite the arguments that are commonly made that the Pavlovian stimulus-reinforcer relation is embedded in the instrumental contingency, there is some reason to be skeptical about the development of a stimulus-reinforcer association. That reason derives from modern findings in Pavlovian conditioning. A major conclusion of modern studies of Pavlovian conditioning is that an event becomes associated with the reinforcer only when it provides information about the reinforcer; simply arranging for the event to occur contiguously with that reinforcer is not sufficient. For instance, if an AX compound is followed by a reinforcer but A is nonreinforced when presented separately, animals typically show little association between A and the reinforcer. Yet, the inference that a standard instrumental learning situation has an embedded Pavlovian relation can be thought of as violating just that conclusion. Consider that in instrumental learning the stimulus (A) is not reinforced when it occurs alone, but
only when the response (X) also occurs. Under those circumstances one might anticipate that the animal would fail to form a simple association between the stimulus and the reinforcer; the stimulus per se is not informative about the occurrence of the reinforcer. Indeed, some Pavlovian theorists have gone so far as to suggest that with this paradigm the stimulus should develop an inhibitory association with the reinforcer (e.g., Konorski, 1967). Thus, on the basis of modern Pavlovian findings, one must be puzzled by the assertion that the SD becomes associated with the reinforcer. Just this point has prompted several recent empirical investigations by Mackintosh and his collaborators. As previously noted, several authors have advocated the use of the blocking and overshadowing paradigms to determine the degree to which two stimuli share the same associations. Holman and Mackintosh (1981) applied this logic to determine the degree to which a Pavlovian CS and an instrumental SD share associations with the reinforcer. Of most interest in the present context, they found little evidence that an SD previously used to signal when a response would produce a reinforcer could block Pavlovian conditioning of another stimulus by that same reinforcer. At the same time that the SD demonstrably controlled instrumental performance, it showed no evidence of having a Pavlovian association with the reinforcer. Thus, although there is some evidence that an SD can become associated in a Pavlovian manner with the reinforcer, there is also good reason to believe that the SD is not simply a Pavlovian CS.
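The informativeness argument in this section (an AX compound is reinforced, A alone is not) can be illustrated with a small simulation. The sketch below uses the error-correction rule of Rescorla and Wagner (1972), which the text at this point does not itself invoke; applying it here, treating the response X as if it were simply another cue, and the particular learning rate and trial counts are all assumptions made for illustration only.

```python
"""Rescorla-Wagner sketch of an AX+ / A- arrangement (illustrative assumptions only)."""


def rescorla_wagner(trials, rate=0.2, lam=1.0):
    """Return associative strengths after a sequence of trials.

    trials: iterable of (cues, reinforced) pairs, e.g. (("A", "X"), True).
    rate:   combined learning-rate parameter, assumed equal for all cues.
    lam:    asymptote supported by the reinforcer (0 on nonreinforced trials).
    """
    strengths = {}
    for cues, reinforced in trials:
        # Prediction is the summed strength of every cue present on the trial.
        total = sum(strengths.get(cue, 0.0) for cue in cues)
        error = (lam if reinforced else 0.0) - total
        # Every cue present shares the same prediction error.
        for cue in cues:
            strengths[cue] = strengths.get(cue, 0.0) + rate * error
    return strengths


# Hypothetical schedule mirroring the instrumental case: the stimulus with the
# response (AX) is reinforced; the stimulus alone (A) is not.
embedded = rescorla_wagner([(("A", "X"), True), (("A",), False)] * 200)

# Hypothetical comparison: A is simply paired with the reinforcer on every trial.
paired = rescorla_wagner([(("A",), True)] * 200)

print("A in AX+/A- :", round(embedded["A"], 3))
print("X in AX+/A- :", round(embedded["X"], 3))
print("A in A+     :", round(paired["A"], 3))
```

Under these assumptions the strength of A drifts toward zero while X absorbs the available associative strength, whereas the simply paired A approaches the asymptote. In this sketch A does not become negative, so it does not by itself capture the stronger suggestion that the stimulus should become inhibitory.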
C. THE STIMULUS AS AN OCCASION SETTER
The previous discussion suggests that a simple associative analysis fails to capture the relationship that the SD bears to the reinforcer. That point was also made many years ago by Skinner (1938), who argued that the SD participated in a more complex three-term relation involving both the response and the reinforcer. He argued that the SD “set the occasion” upon which the response would be reinforced (see also Catania, 1971; Jenkins, 1977). This notion of occasion setting has never received much theoretical elaboration. But a number of laboratories have recently attempted to capture it within a purely Pavlovian procedure. Holland (1983, 1985), Jenkins (1985), and Rescorla (1985) have all studied Pavlovian paradigms in which one stimulus can be described as setting the occasion on which another will be followed by the reinforcer. All three laboratories have found evidence that such a relation can be modeled and analyzed in a Pavlovian procedure. Those results give a broader base to the notion of occasion setting and encourage the view that it may be a fundamental process involved in various learning situations, including instrumental stimulus control. The key to the detection of Pavlovian occasion setting and its separation from
simple associations has been a relatively underappreciated feature of conditioning: In a simple Pavlovian paradigm the nature of the learned response depends on the nature of the CS. There is now a reasonable number of studies indicating that when the same US is paired with an array of CSs, those CSs can come to produce quite different responses, all indicative of otherwise comparable associations (see Holland, 1983). This often makes it possible to attribute behavior to an association between the reinforcer and one of a constellation of stimuli that are concurrently present. That in turn permits one to identify one stimulus as producing the behavior and another stimulus as allowing the first to do so. These points can be illustrated more concretely by the “facilitation” procedure that Rescorla (1985) has recently reported for pigeon subjects. The CS dependence of the CR is readily displayed when pigeons have food paired with either a localized visual or a diffuse auditory CS. When the illumination of a localized response key by a particular light signals the availability of grain, the resulting keylight-food association is exhibited in the form of directed keypecking (so-called autoshaping). However, a diffuse auditory or visual signal bearing the same Pavlovian relation to grain instead produces increased general activity without apparent direction. That difference in response form is exploited in the facilitation procedure in order to determine whether a diffuse stimulus can set the occasion on which a localized stimulus will be reinforced. In a typical procedure, a 5-sec keylight is nonreinforced except on those occasions when it comes at the end of a 15-sec white noise (Rescorla, 1985). One can then assess the degree to which the white noise facilitates or sets the occasion for the keylight-food association by observing the nature of the response during compound trials. 
To the degree that the noise simply becomes a signal for food itself, there will be an increase in general activity, the response that noise-food associations produce. However, to the degree that the noise sets the occasion for the keylight-food association, there will be an increase in directed keypecking, the response that keylight-food associations produce. Figure 15 shows a typical result of applying such procedures. That figure comes from an experiment (Rescorla, 1985) in which pigeons received two concurrent facilitation paradigms. Two different keylights were each reinforced only when presented during a diffuse stimulus; for one keylight that stimulus was a white noise, and for the other keylight it was a flashing houselight. Neither keylight was reinforced when presented separately. The figure plots the amount of keypecking over the course of extended exposure to this discrimination regime. It is clear from the left-hand side of this figure that the discrimination was readily learned and took the form of differential keypecking in the presence of the facilitators. That suggests that the discrimination was based not simply on differential associations of the diffuse stimuli with the food, but on their having the ability to modulate responding to the keylights. A fair amount is known about the properties of such facilitation (Rescorla,
Fig. 15. Facilitation of Pavlovian autoshaped responding to a keylight. The left-hand panel shows responding to two keylights when presented in compound with a light (L) or noise (N), or as separate elements. Only the compounds were followed by reinforcement. The right-hand panel shows responding to the keylights when presented alone as elements (E) and in compound either with their original (O) or transfer (T) facilitator. From Rescorla (1985). [Figure not reproduced; abscissa of the left-hand panel: sessions; the right-hand test panel shows O, T, and E for the light and noise facilitators.]
1985). For instance, a diffuse stimulus treated in this way differs substantially from a diffuse stimulus that is simply paired with food in a Pavlovian fashion. A purely Pavlovian CS will not act to promote keypecking in the manner shown in Fig. 15. Nor does conducting Pavlovian acquisition or extinction experiments with a previously trained facilitator change its ability to modulate responding to the keylight. Moreover, a facilitator is substantially less capable than is a comparable Pavlovian CS in establishing second-order conditioning to a keylight that signals its occurrence. Those data support the conclusion that the facilitator is functionally different from a simple Pavlovian excitor. It is also of interest that under some circumstances a facilitator will transfer from one target to another. The right-hand side of Fig. 15 shows one example of such transfer. After discrimination training, the two diffuse stimuli were each tested both for their effect on the keylight with which they had received original (O) training and for their transfer (T) to the keylight trained with the other facilitator. Figure 15 shows that this transfer was substantial, although short of complete. Results reported by Rescorla (1985) suggest that this transfer can occur even when the response form elicited by the target stimuli is different.
However, it appears to extend only to target stimuli with a history of both reinforcement and nonreinforcement. Rescorla (1985) has suggested that facilitation may be thought of as the removal of an inhibitory process that was suppressing the action of the original target-reinforcer association. The reality of this sort of occasion setting function in a purely Pavlovian setting encourages the view that it characterizes the role of a stimulus in instrumental training as well. Moreover, the parallel between the two cases is strengthened by one recent study conducted with pigeons in our laboratory. The question of that study was whether there is sufficient commonality in the learning of facilitation and SD control that a facilitator could transfer to an instrumental response that had been brought under the control of another stimulus. Can we think of an instrumental response as just another target whose association with the reinforcer is made operative by the presence of a facilitator? To test this idea, 16 pigeons were trained to step on a treadle for food reward. They were reinforced on a random ratio schedule such that the probability of a treadle press leading to food was .2 in the presence of a 15-sec houselight, but 0 in its absence. This training proved more arduous than training discriminative keypecking, but by the end of 15 daily 15-min sessions, each containing 12 trials, the response rate during the light was 14.3/min, whereas that in its absence was 3.2/min. In addition to this instrumental training, the birds were trained in a facilitation procedure using auditory stimuli (a white noise and an 1800-Hz tone) as the facilitators and localized keylights (red and green) as the targets. All subjects received a simple facilitation treatment with one pair of stimuli: Reinforcement of the 5-sec keylight only in the presence of one 15-sec auditory stimulus. In addition, all subjects received a control treatment with the other auditory stimulus.
For half the subjects that control consisted of simply pairing the auditory stimulus with food in a Pavlovian fashion. For the other half, a “pseudo-facilitator” treatment was used in which a keylight was reinforced on all its presentations in both the presence and absence of the diffuse stimulus. This control equates the number of times the facilitator and the diffuse stimulus were paired with food in the presence of a visual target; however, since the keylight was also reinforced in the absence of the diffuse stimulus, it provides no information about the reinforcement contingencies. Rescorla (1985) has shown that this kind of pseudo-facilitator does not promote responding to target keylights. All animals received 10 such Pavlovian conditioning sessions, 7 prior to instrumental training and 3 following that training. During each Pavlovian session, each of the appropriate four trial types was presented 12 times, with a mean intertrial interval of 1 min. Figure 16 shows the results of presenting these various stimuli to the birds when they had the opportunity to treadle press. During that exposure, all stimuli were treated as instrumental SDs; that is, treadle pressing was reinforced on a
Fig. 16. Transfer of a Pavlovian facilitator to an instrumental response. Instrumental treadle-press responding is shown in the presence of only background stimuli, a facilitator, a Pavlovian CS+, and a stimulus given pseudo-facilitation training. All stimuli were trained as instrumental discriminative stimuli during this test. [Figure not reproduced; curves: FAC, CS+, Pseudo-FAC, Background; abscissa: blocks of 4 trials.]
ratio schedule in the presence of each, but not in their absence. Since the two subgroups differed only in the treatment of their control stimulus, the results of responding during the facilitator (FAC) and in the absence of any stimulus (background) have been combined. However, the results for the two control stimuli (the Pavlovian CS+ and the pseudo-facilitator) are shown separately. Relative to responding during the background alone, all three stimuli began to develop discriminative control in the course of testing. But the most interesting result is that the stimulus trained as a facilitator promoted performance from the outset and continued to do so throughout the test. The facilitator generated reliably greater responding than either the excitatory Pavlovian CS+ or the relatively neutral pseudo-facilitator. That suggests that a facilitator and an SD share enough features that the former can replace the latter in promoting instrumental performance. These results encourage the view that it is profitable to think of the stimulus in instrumental learning in part as having a kind of higher-order modulatory role. The fact that similar processes can be observed in Pavlovian conditioning where they are beginning to yield to analysis helps to take some of the mystery out of this notion of occasion setting and may lead to a deeper understanding of stimulus control in instrumental learning. However, the analysis of occasion setting
in Pavlovian situations is itself still in a preliminary stage, with considerable uncertainty remaining at both the empirical and theoretical levels (cf. Holland, 1985; Jenkins, 1985).
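The informational difference between the facilitation and pseudo-facilitation treatments used in the transfer experiment above can be made concrete with a short calculation. The sketch below summarizes each treatment by the difference between the probability of food on keylight trials with and without the diffuse stimulus. That difference score is an illustrative summary chosen here, not an analysis reported in the chapter, and the trial lists are hypothetical reconstructions of one session with 12 presentations of each trial type.

```python
"""Contingency sketch for facilitation vs. pseudo-facilitation (illustrative only)."""
from fractions import Fraction


def delta_p(keylight_trials):
    """P(food | keylight with diffuse stimulus) - P(food | keylight alone).

    keylight_trials: list of (diffuse_present, reinforced) booleans,
    one pair per keylight presentation.
    """
    def p_food(diffuse_present):
        outcomes = [r for d, r in keylight_trials if d == diffuse_present]
        return Fraction(sum(outcomes), len(outcomes))

    return p_food(True) - p_food(False)


N = 12  # presentations of each trial type per session, as stated in the text

# Facilitation: keylight reinforced only in the presence of the diffuse stimulus.
facilitation = [(True, True)] * N + [(False, False)] * N

# Pseudo-facilitation: keylight reinforced on every presentation,
# with or without the diffuse stimulus.
pseudo_facilitation = [(True, True)] * N + [(False, True)] * N

print("facilitation        :", delta_p(facilitation))
print("pseudo-facilitation :", delta_p(pseudo_facilitation))
```

In both treatments the diffuse stimulus is paired with food equally often in the presence of a keylight, but only in the facilitation treatment does its presence change the likelihood that the keylight will be reinforced.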
V. Conclusion
The data reviewed in this article provide various sorts of information on the associative structure underlying instrumental performance. First, they are informative about the role of the reinforcer. The results on changing the value of the reinforcer suggest that it is not simply a catalyst promoting associations among other events. Instead, the reinforcer is itself encoded. Moreover, over a wide variety of parameter values that encoding is at least partly in terms of an association with the response. Second, the results presented here bear on the role of antecedent stimuli in instrumental learning. There is some evidence that such stimuli become associated with both the response and the reinforcer. But other results suggest that the stimulus plays an occasion setting role in which its presence identifies the response-reinforcer relation. Some analysis of that modulatory role has been completed within purely Pavlovian paradigms, and there is evidence for its shared function with an instrumental SD. Finally, it is worth noting that the analysis carried out here has leaned heavily on the tools developed in over 20 years of experimentation on Pavlovian associations. Most of the experimental procedures and many of the theoretical notions have direct analogs in previous treatments of Pavlovian conditioning. This supports the view that our understanding of Pavlovian conditioning is now sufficiently advanced that it can be an important aid in the analysis of other forms of learning.
ACKNOWLEDGMENTS
The research reported here was generously funded by several grants from the National Science Foundation. The article was written while Ruth Colwill was at the Howard Hughes Medical Institute, New York, and while Robert Rescorla was a J. S. Guggenheim fellow at Cambridge University. We thank Anthony Dickinson for many helpful comments.
REFERENCES
Adams, C. D. (1980). Postconditioning devaluation of an instrumental reinforcer has no effect on extinction performance. Quarterly Journal of Experimental Psychology, 32, 447-458.
Adams, C. D. (1982). Variations in the sensitivity of instrumental responding to reinforcer devaluation. Quarterly Journal of Experimental Psychology, 34B, 77-98.
Adams, C. D., & Dickinson, A. (1981a). Actions and habits: Variations in associative representations during instrumental learning. In N. E. Spear & R. R. Miller (Eds.), Information processing in animals: Memory mechanisms (pp. 143-165). Hillsdale, NJ: Erlbaum.
Adams, C. D., & Dickinson, A. (1981b). Instrumental responding following reinforcer devaluation. Quarterly Journal of Experimental Psychology, 33B, 109-121.
Allport, G. W. (1937). Personality: A psychological interpretation. New York: Holt.
Amiro, T. W., & Bitterman, M. E. (1980). Second-order appetitive conditioning in goldfish. Journal of Experimental Psychology: Animal Behavior Processes, 6, 41-48.
Blanchard, R., & Honig, W. K. (1976). Surprise value of food determines its effectiveness as a reinforcer. Journal of Experimental Psychology: Animal Behavior Processes, 2, 67-74.
Bolles, R. C. (1972). Reinforcement, expectancy, and learning. Psychological Review, 79, 394-409.
Bolles, R. C., Holtz, R., Dunn, T., & Hill, W. (1980). Comparisons of stimulus learning and response learning in a punishment situation. Learning and Motivation, 11, 78-96.
Bower, G., & Grusec, T. (1964). Effect of prior Pavlovian discrimination training upon learning an operant discrimination. Journal of Experimental Analysis of Behavior, 7, 401-404.
Brodigan, D. L., & Peterson, G. B. (1976). Two-choice conditional discrimination performance in pigeons as a function of reward expectancy, prechoice delay and domesticity. Animal Learning and Behavior, 4, 121-124.
Capaldi, E. D., & Myers, D. E. (1978). Resistance to satiation of consummatory and instrumental performance. Learning and Motivation, 9, 179-201.
Carlson, J. G., & Wielkiewicz, R. M. (1972). Delay of reinforcement in instrumental discrimination learning of rats. Journal of Comparative and Physiological Psychology, 81, 365-370.
Carlson, J. G., & Wielkiewicz, R. M. (1976). Mediators of the effects of magnitude of reinforcement. Learning and Motivation, 7, 184-196.
Catania, A. C. (1971). Elicitation, reinforcement and stimulus control. In R. Glaser (Ed.), The nature of reinforcement (pp. 196-220). New York: Academic Press.
Cheatle, M. D., & Rudy, J. W. (1978). Analysis of second-order odor-aversion conditioning in neonatal rats: Implications for Kamin’s blocking effect. Journal of Experimental Psychology: Animal Behavior Processes, 4, 237-249.
Chen, J. S., & Amsel, A. (1980). Recall (versus recognition) of taste and immunization against aversive taste anticipations based on illness. Science, 209, 851-853.
Colwill, R. M., & Rescorla, R. A. (1985a). Post-conditioning devaluation of a reinforcer affects instrumental responding. Journal of Experimental Psychology: Animal Behavior Processes, 11, 120-132.
Colwill, R. M., & Rescorla, R. A. (1985b). Instrumental responding remains sensitive to reinforcer devaluation after extensive training. Journal of Experimental Psychology: Animal Behavior Processes, 11, 520-536.
Cook, C. R., & Hull, J. H. (1979). Instrumental response topographies of rats on partial reinforcement or reacquisition schedules. Journal of General Psychology, 101, 151-152.
Cowles, J. T., & Nissen, H. W. (1937). Reward-expectancy in delayed-responses of chimpanzees. Journal of Comparative Psychology, 24, 345-358.
Crespi, L. P. (1942). Quantitative variation of incentive and performance in the white rat. American Journal of Psychology, 55, 467-517.
Deaux, E. B., & Patten, R. L. (1964). Measurement of the anticipatory goal response in instrumental runway conditioning. Psychonomic Science, 1, 357-358.
DeLong, R. E., & Wasserman, E. A. (1981). Effects of differential reinforcement expectancies on successive matching-to-sample performance in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 7, 394-412.
Dickinson, A., & Charnock, D. J. (1985). Contingency effects with a constant probability of instrumental reinforcement. Quarterly Journal of Experimental Psychology, 37B, 397-416.
Dickinson, A., Nicholas, D. J., & Adams, C. D. (1983). The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Quarterly Journal of Experimental Psychology, 35B, 35-51.
Dickinson, A., Peters, R. C., & Shechter, S. (1984). Overshadowing of responding on ratio and interval schedules by an independent predictor of reinforcement. Behavioral Processes, 9, 421-429.
Durlach, P. J. (1983). Effect of signaling intertrial unconditioned stimuli in autoshaping. Journal of Experimental Psychology: Animal Behavior Processes, 9, 374-389.
Edgar, D., Hall, G., & Pearce, J. M. (1981). Enhancement of food-rewarded instrumental responding by an appetitive conditioned stimulus. Quarterly Journal of Experimental Psychology, 33B, 3-19.
Edwards, C. A., Jagielo, J. A., Zentall, T. R., & Hogan, D. E. (1982). Acquired equivalence and distinctiveness in matching to sample by pigeons: Mediation by reinforcer-specific expectancies. Journal of Experimental Psychology: Animal Behavior Processes, 8, 244-259.
Elliott, M. H. (1928). The effect of change of reward on the maze performance of rats. University of California Publications in Psychology, 4, 19-30.
Ellison, G. D., & Konorski, J. (1964). Separation of the salivary and motor responses in instrumental conditioning. Science, 146, 1071-1072.
Estes, W. K. (1948). Discriminative conditioning. II. Effects of a Pavlovian conditioned stimulus upon a subsequently established operant response. Journal of Experimental Psychology, 38, 173-177.
Fantino, E. (1965). Some data on the discriminative stimulus hypothesis of secondary reinforcement. Psychological Record, 15, 409-415.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton.
Flaherty, C. F., & Davenport, J. W. (1968). Noncontingent pretraining in instrumental discrimination between amounts of reinforcement. Journal of Comparative and Physiological Psychology, 66, 707-711.
Garcia, J., Kovner, R., & Green, K. F. (1970). Cue properties vs palatability of flavors in avoidance learning. Psychonomic Science, 20, 313-314.
Garrud, P., Goodall, G., & Mackintosh, N. J. (1981). Overshadowing of a stimulus-reinforcer association by an instrumental response. Quarterly Journal of Experimental Psychology, 33B, 123-135.
Guthrie, E. R. (1952). The psychology of learning. New York: Harper.
Hammond, L. J. (1980). The effect of contingency upon the appetitive conditioning of free-operant behavior. Journal of Experimental Analysis of Behavior, 34, 297-304.
Hasher, L., & Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108, 356-388.
Holland, P. C. (1977). Conditioned stimulus as a determinant of the form of the Pavlovian conditioned response. Journal of Experimental Psychology: Animal Behavior Processes, 3, 77-104.
Holland, P. C. (1983). “Occasion-setting” in Pavlovian feature positive discriminations. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior: Discrimination processes (Vol. 4, pp. 183-206). Cambridge, MA: Ballinger.
Holland, P. C. (1985). The nature of conditioned inhibition in serial and simultaneous feature negative discriminations. In R. R. Miller & N. E. Spear (Eds.), Information processing in animals: Conditioned inhibition (pp. 267-297). Hillsdale, NJ: Erlbaum.
Holland, P. C., & Rescorla, R. A. (1975). The effect of two ways of devaluing the unconditioned stimulus after first- and second-order appetitive conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 1, 355-363.
Holman, E. W. (1975). Some conditions for dissociation of consummatory and instrumental behavior in rats. Learning and Motivation, 6, 358-366.
Holman, J. G., & Mackintosh, N. J. (1981). The control of appetitive instrumental responding does not depend on classical conditioning to the discriminative stimulus. Quarterly Journal of Experimental Psychology, 33B, 21-31.
Hull, C. L. (1943). Principles of behavior. New York: Appleton.
Hull, J. H. (1977). Instrumental response topographies of rats. Animal Learning and Behavior, 5, 207-212.
Hull, J. H., Bartlett, T. J., & Hill, R. C. (1981). Operant response topographies of rats receiving food or water reinforcers on FR or FI reinforcement schedules. Animal Learning and Behavior, 9, 406-410.
Irwin, F. W. (1971). Intentional behavior and motivation: A cognitive theory. Philadelphia: Lippincott.
James, W. (1890). The principles of psychology. New York: Holt.
Jenkins, H. M. (1977). Sensitivity of different response systems to stimulus-reinforcer and response-reinforcer relations. In H. Davis & H. M. B. Hurwitz (Eds.), Operant-Pavlovian interactions (pp. 47-62). Hillsdale, NJ: Erlbaum.
Jenkins, H. M. (1985). Conditioned inhibition of key pecking in the pigeon. In R. R. Miller & N. E. Spear (Eds.), Information processing in animals: Conditioned inhibition (pp. 327-353). Hillsdale, NJ: Erlbaum.
Jenkins, H. M., & Moore, B. R. (1973). The form of the autoshaped response with food or water reinforcers. Journal of Experimental Analysis of Behavior, 20, 163-181.
Jenkins, W. O., & Stanley, J. C., Jr. (1950). Partial reinforcement: A review and critique. Psychological Bulletin, 47, 193-204.
Kamin, L. J. (1968). Attention-like processes in classical conditioning. In M. R. Jones (Ed.), Miami symposium on predictability, behavior and aversive stimulation (pp. 9-33). Coral Gables, FL: Univ. Miami Press.
Kamin, L. J. (1969). Predictability, surprise, attention and conditioning. In B. Campbell & R. Church (Eds.), Punishment and aversive behavior (pp. 279-296). New York: Appleton.
Karpicke, J., Christoph, G., Peterson, G., & Hearst, E. (1977). Signal location and positive versus negative conditioned suppression in the rat. Journal of Experimental Psychology: Animal Behavior Processes, 3, 105-118.
Khavari, K. A., & Eisman, E. H. (1971). Some parameters of latent learning and generalized drives. Journal of Comparative and Physiological Psychology, 77, 463-469.
Konorski, J. (1967). Integrative activity of the brain. Chicago: Univ. of Chicago Press.
Konorski, J., & Miller, S. (1930). Methode d’examen de l’analysateur moteur par les réactions salivomotrices. Comptes rendus des Seances de la Société Biologique, 104, 907-910.
Konorski, J., & Miller, S. (1937). On two types of conditioned reflex. Journal of General Psychology, 16, 264-272.
Krieckhaus, E. E., & Wolf, G. (1968). Acquisition of sodium by rats: Interaction of innate mechanisms and latent learning. Journal of Comparative and Physiological Psychology, 65, 197-201.
Lewis, D. J. (1960). Partial reinforcement: A selective review of the literature since 1950. Psychological Bulletin, 57, 1-28.
LoLordo, V. M., McMillan, J. C., & Riley, A. L. (1974). The effects upon food-reinforced pecking and treadle-pressing of auditory and visual signals for response-independent food. Learning and Motivation, 5, 24-41.
Lorge, I., & Sells, S. B. (1936). Representative factors in the rat under “changed-incentive technique.” Journal of Genetic Psychology, 49, 479-480.
Lovibond, P. F. (1983). Facilitation of instrumental behavior by a Pavlovian appetitive conditioned stimulus. Journal of Experimental Psychology: Animal Behavior Processes, 9, 225-247.
Mackintosh, N. J. (1974). The psychology of animal learning. London: Academic Press.
Mackintosh, N. J., & Dickinson, A. (1979). Instrumental (type II) conditioning. In A. Dickinson & R. A. Boakes (Eds.), Mechanisms of learning and motivation (pp. 143-167). Hillsdale, NJ: Erlbaum.
Mellgren, R. L., & Ost, J. W. P. (1969). Transfer of Pavlovian differential conditioning to an operant discrimination. Journal of Comparative and Physiological Psychology, 67, 390-394.
Miller, N. E. (1935). A reply to “Sign-gestalt or conditioned reflex?” Psychological Review, 42, 280-292.
Morgan, M. J. (1974). Resistance to satiation. Animal Behavior, 22, 449-466.
Morgan, M. J. (1979). Motivational processes. In A. Dickinson & R. A. Boakes (Eds.), Mechanisms of learning and motivation (pp. 171-201). Hillsdale, NJ: Erlbaum.
Morrison, G. R., & Collyer, R. (1974). Taste-mediated aversion to an exteroceptive stimulus following LiCl poisoning. Journal of Comparative and Physiological Psychology, 86, 51-55.
Mowrer, O. H. (1960). Learning theory and behavior. New York: Wiley.
Murphy, G. (1947). Personality: A biosocial approach to origins and structure. New York: Harper.
Nairne, J. S., & Rescorla, R. A. (1981). Second-order conditioning with diffuse auditory reinforcers in the pigeon. Learning and Motivation, 12, 65-91.
Nevin, J. A., Mandell, C., & Yarensky, P. (1981). Response rate and resistance to change in chained schedules. Journal of Experimental Psychology: Animal Behavior Processes, 7, 278-294.
Nissen, H. W., & Elder, J. H. (1935). The influence of amount of incentive on delayed response performances of chimpanzees. Journal of Genetic Psychology, 47, 49-72.
Overmier, J. B., Bull, J. A., III, & Trapold, M. A. (1971). Discriminative cue properties of different fears and their role in response selection in dogs. Journal of Comparative and Physiological Psychology, 76, 478-482.
Pearce, J. M., & Hall, G. (1978). Overshadowing the instrumental conditioning of a lever press response by a more valid predictor of reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 4, 356-367.
Peterson, G. B., & Trapold, M. A. (1980). Effects of altering outcome expectancies on pigeons’ delayed conditional discrimination performance. Learning and Motivation, 11, 267-288.
Peterson, G. B., Wheeler, R. L., & Trapold, M. A. (1980). Enhancement of pigeons’ conditional discrimination performance by expectancies of reinforcement and nonreinforcement. Animal Learning and Behavior, 8, 22-30.
Premack, D. (1965). Reinforcement theory. In D. Levine (Ed.), Nebraska symposium on motivation (pp. 123-180). Lincoln: Univ. of Nebraska Press.
Rashotte, M. E., Griffin, R. W., & Sisk, C. L. (1977). Second-order conditioning of the pigeon’s keypeck. Animal Learning and Behavior, 5, 25-38.
Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66, 1-5.
Rescorla, R. A. (1977). Pavlovian second-order conditioning: Some implications for instrumental behavior. In H. Davis & H. M. B. Hurwitz (Eds.), Operant-Pavlovian interactions (pp. 133-164). Hillsdale, NJ: Erlbaum.
Rescorla, R. A. (1979). Aspects of the reinforcer learned in second-order Pavlovian conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 5, 79-95.
Rescorla, R. A. (1980). Pavlovian second-order conditioning: Studies in associative learning. Hillsdale, NJ: Erlbaum.
Rescorla, R. A. (1982). Comments on a technique for assessing associative learning. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analysis of behavior: Acquisition (Vol. 3). Cambridge, MA: Ballinger.
Rescorla, R. A. (1985). Inhibition and facilitation. In R. R. Miller & N. E. Spear (Eds.), Information processing in animals: Conditioned inhibition (pp. 299-326). Hillsdale, NJ: Erlbaum.
Rescorla, R. A., & Holland, P. C. (1982). Behavioral studies of associative learning in animals. Annual Review of Psychology, 33, 265-308.
Associative Structures in Instrumental Learning
103
Rescorla, R. A., & Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review, 74, 15 1- 182. Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Us.), Classical conditioning 11: Current research and theory (pp. 64-99). New York: Appleton. Rizley. R. C., & Rescorla, R. A. (1972). Associations in second-order conditioning and sensory preconditioning. Journal of Comparative and Physiological Psychology, 81, 1- 11. Robbins, D. (1971). Partial reinforcement: A selective review of the alleyway literature since 1960. Psychological Bulletin, 76, 4 15-43 1. Schwartz, B. (1976). Positive and negative conditioned suppression in the pigeon: Effects of the locus and the modality of the CS. Learning and Motivation, 7 , 86-100. Sheffield, F. D. (1966). A drive-induction theory of reinforcement. In R. N. Haber (Ed.), Current research and theory in morivarion (pp. 98-1 I I ) . New York: Holt. Shettleworth, S. J. (1981). Reinforcement and the organization of behavior in golden hamsters: Differential overshadowing of a CS by different responses. Quarterly Journal of Experimental Psychology, 33B, 241-255. Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: 11. Perceptual learning, automatic attending and a general theory. Psychological Review, 84, 129- 190. Skinner, B. F. (1938). The behavior of organisms. New York: Appleton. Spence, K. W. (1956). Behavior theory and conditioning. New Haven: Yale Univ. Press. Spetch, M. L., Wilkie, D. M., & Skelton, R. W. (1981). Control of pigeons' keypecking topography by a schedule of alternating food and water reward. Animal Learning and Behavior, 9, 223229. St. Claire-Smith, R. (l979a). 
The overshadowing of instrumental conditioning by a stimulus that predicts reinforcement better than the response. Animal Learning and Behavior, 7 , 224-228. St. Claire-Smith, R. (1979b). The overshadowing and blocking of punishment. Quarterly Journal of Experimental Psychology. 31, 5 1-6 I . St. Claire-Smith, R., & MacLaren, D. (1983). Response preconditioning effects. Journal ofExperimental Psychology: Animal Behavior Processes, 9, 41 -48. Tarpy, R. M., Lea, S . E. G., & Midgley, M. (1983). The role of the response-reward correlation in stimulus-response overshadowing. Quarterly Journal of Experimental Psychology, 35B, 5365. Tinklepaugh, 0. L. (1928). An experimental study of representative factors in monkeys. Journal of Comparative Psychology, 8, 197-236. Tolman, E. C. (1933). Sign-Gestalt or conditioned reflex? Psychological Review, 40, 246-255. Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological Review, 55, 189-208. Tolman, E. C., & Gleitman, H. (1949). Studies in learning and motivation: I. Equal reinforcements in both end-boxes, followed by shock in one end-box. Journal of Experimental Psychology, 39, 8 10-8 19. Trapold, M. A. (1970). Are expectancies based upon different positive reinforcing events discriminably different? Learning and Motivation, 1, 129-140. Trapold, M. A,, Lawton, G. W., Dick, R. A., Goss, D. M. (1968). Transfer of training from differential classical to differential instrumental conditioning. Journal of Experimental Psychology, 76, 568-573. Trapold, M. A,, & Overmier, J. B. (1972). The second learning p m e s s in instrumental learning. In A. A. Black & W. F. Prokasy (Us.), Classical conditioning. 11. Current research and theory (pp. 427-452). New York: Appleton. Williams, B. A. (1982). Blocking the response-reinforcer association. In M. L. Commons, R. J.
104
Ruth M. Colwill and Robert A. Rescorla
Hemstein, & A. R. Wagner ( a s . ) , Quantitative analyses of behavior: Acquisition (Vol. 3, pp. 427-445). Cambridge, MA: Ballinger. Williams, D. R. (1965). Classical conditioning and incentive motivation. In W. F. Prokasy (Eds.), Classical conditioning: A symposium (pp. 340-357). New York: Appleton. Wilson, C. L., Sherman, J. E., & Holman, E. W. (1981). An aversion to the reinforcer differentially affects conditioned reinforcement and instrumental responding. Journal of Experimental Psychology: Animal Behavior Processes, 7 , 165- 114. Zeaman, D. (1949). Response latency as a function of the amount of reinforcement. Journal of Experimental Psychology, 39, 446-483.
THE STRUCTURE OF SUBJECTIVE TIME: HOW TIME FLIES

John Gibbon
NEW YORK STATE PSYCHIATRIC INSTITUTE AND COLUMBIA UNIVERSITY
NEW YORK, NEW YORK 10032
I. Introduction

How does time fly? When my grandmother said this upon looking up from reading aloud, did she mean that real time passed more rapidly than her perception of real time? Or perhaps she simply meant that she was not being attentive to real time. We have all had similar experiences in which our internal impression of time has surprised us by a misrepresentation that is too low. Tempus fugit particularly fast, apparently, when we are having fun. Our folklore contains a few expressions of this sort, for example, “Time waits for no man.” However, the more common anecdote has the subjective perception of time accumulating faster than real time, most especially when we are attending to it. Our folklore is rich in this misperception, for example, in the failure of the watched pot to boil, or the heaviness with which time hangs on one’s hands, or the petty pace at which it creeps.

These modes of internal representation of time might be schematized as in Fig. 1, which plots subjective time as a function of real time. A veridical time sense is represented by the positive diagonal, and the concave-down function represents the case in which subjective time accumulates more rapidly than real time early within an interval. A moment’s thought reveals that some curvature is required to give substance to this view, since a proportional but faster-than-real-time subjective time sense would not be discriminable from the veridical case. Whatever the internal clock is that allows us to appreciate one moment as following another, it is surely not read in seconds, and hence the unit of subjective time is arbitrary. If we experience time within an interval as moving faster than real time, what we mean is that subjective time accumulates rapidly now, but not always.

A concave-down representation is appealing intuitively. Even when the end of an interval is not marked by a desired event (e.g., the pot boiling), still, late portions of the interval seem to elapse more slowly.
I would even argue that this impression remains when the anticipated event is aversive (e.g., an expected punishment). This intuitive impression is not restricted to intervals in the seconds-to-minutes range either. At one extreme, if we regard units along the abscissa in Fig. 1 as decades, there is a sense in which the years between the fourth and fifth seem more compressed than those between the first and second.

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 20. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.

Fig. 1. Schematic diagram for two subjective misrepresentations of real time, plotted against real time. In the lower curve, subjective time accumulates slower than real time early in an interval. In the upper curve (labeled “A Watched Pot Never Boils”), subjective time accumulates faster than real time early in an interval.

The other extreme, time in the milliseconds-to-seconds range, has been a focus of psychological experimentation from the inception of the first psychological laboratory. In the late 1890s, Wundt and his students studied what they believed to be curvature in the subjective time sense by asking subjects to estimate short time intervals, either by reproduction or discrimination. The estimates usually passed through a point of equality with real time, the “indifference interval,” such that time was overestimated below and underestimated above this value. Figure 2 displays some average time estimates and a redrawn theoretical curve “derived from the data” from Meumann (1893).

Fig. 2. Estimated time as a function of real time (seconds), adapted from Meumann (1893). The author’s theoretical representation is an exponential growth function, $\hat{T} = T + .1021 - .048e^{T}$. Note the PSE, or indifference interval, at 0.75 sec.

The systematic errors of the sort indicated in Fig. 2 were interpreted as direct reflections of the subjective temporal representation of duration. There is, however, a logical problem in this kind of interpretation. Curvature in the subjective time sense cannot actually be revealed by this method. Our mnemonic encode and decode systems are designed for accuracy. Whatever the frequency of errors in our reports, systematic deviation from real time in the location of such errors must be the result of some process other than curvature in our time scale (cf. Gibbon, 1981a).¹ The problem is illustrated in Fig. 3. The encoding and decoding systems are shown as fundamentally different for short and long times. To accommodate a function like Meumann’s, the time estimate, $\hat{S}$, for a short time must be somewhat longer than S, while that for a long time, $\hat{L}$, must be somewhat shorter than L. One would have to imagine that subjective representations of short times moved up as they were encoded (solid arrows) or decoded (dashed arrows), but that representations of long values moved down. This encode/decode problem remains no matter how extreme the curvature of the subjective time scale (Gibbon, 1981a).

¹The method of obtaining such data is also flawed by modern standards in a number of respects. It seems likely on reanalysis that these techniques, which involve listening to one duration and then reproducing or recognizing a second, owe their curvature to the time-order error (Gibbon and Allan, in preparation).

Fig. 3. Subjective time representation required to produce overestimates at short and underestimates at long times, plotted against real time. The arrows represent encoding and decoding processes to and from memory.

In fact, later work showed that continued training with feedback on accuracy significantly reduced errors. The subjective time sense may be as curved as one might like to imagine in the mean and still result in accurate reproduction of a given learned duration with enough practice. Indeed, it would be hard to imagine evolution in her wisdom designing us in any other way, else the lioness would never learn to time her leap nor would her prey ever learn to escape.

What, then, accounts for the intuitive appeal and anecdotal documentation in our folklore of different speeds for subjective time early and late in an interval? I will argue in this article that it is the change in discriminability of adjacent time values as time increases. It is well known that temporal discrimination over a fairly broad range in both animals and humans roughly follows Weber’s law (Allan, 1979; Getty, 1975; Gibbon, 1977; Stubbs, 1968; Treisman, 1963). Long time intervals are harder to tell apart than short ones, and hence they “feel” closer together in subjective appreciation. The classical historical approach to this problem was, of course, the Fechnerian scale constructed on units of discriminability. A modern treatment in this tradition is contained in a recent paper by Heinemann (1984) in which he shows that many of the standard temporal discrimination findings with animals in psychophysical settings are accommodated by assuming a time sense which is logarithmic in the mean, with constant variance on the psychological scale.

I will briefly review one of the discrimination findings from the animal literature which has been widely cited in support of a logarithmic time sense. The analysis provides a solution to the encode/decode problem for inferring properties of the subjective time scale, but at the cost of introducing some ambiguities of its own.
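The step from Weber’s law to a logarithmic (Fechnerian) scale can be illustrated numerically. The sketch below is not part of the original analysis: the Weber fraction of 0.25, the 1-sec unit, and the function names are all assumptions chosen only for illustration. It counts how many just-noticeable steps separate two durations when each step is a fixed proportion of the current duration.

```python
import math

WEBER_FRACTION = 0.25  # hypothetical value, chosen only for illustration


def jnd(duration, k=WEBER_FRACTION):
    """Just-noticeable difference under Weber's law: a fixed fraction of duration."""
    return k * duration


def fechner_scale(duration, k=WEBER_FRACTION, unit=1.0):
    """Number of successive JND steps needed to climb from `unit` to `duration`.

    Each step multiplies the duration by (1 + k), so the count is logarithmic.
    """
    return math.log(duration / unit) / math.log(1.0 + k)


# Same 2-sec difference early and late, then the same 2:1 ratio early and late.
for short, long in [(2.0, 4.0), (14.0, 16.0), (8.0, 16.0)]:
    gap = fechner_scale(long) - fechner_scale(short)
    print(f"{short:g} vs {long:g} sec: JND at the shorter value = {jnd(short):.2f} sec, "
          f"separation = {gap:.2f} JND steps")
```

Under these assumptions, pairs with the same ratio (2 vs 4 sec, 8 vs 16 sec) are separated by the same number of steps, whereas the same 2-sec difference spans fewer steps late (14 vs 16 sec) than early (2 vs 4 sec), which is the sense in which long intervals “feel” closer together.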
II. The Temporal Middle
The encode/decode problem requires us to avoid asking a subject to reproduce or recognize a time value that we have just marked, so to speak, on his subjective scale. An important advance in our study of this problem was made by Church and his colleagues (Church, 1978; Church & Deluty, 1977; Roberts & Church, 1978; Meck & Church, 1983). Their solution was simply to ask subjects to report on time values on which there was no prior informative training. Performance to temporal stimuli intermediate between those which had been reinforced might be diagnostic of the form of the time sense, independently of motivational factors. In particular, interest centered on the point of subjective equality (PSE) at which subjects were indifferent or maximally “confused” in discriminating between two differentially reinforced time intervals.

A. BISECTION: THE GEOMETRIC MEAN

Their technique, the bisection procedure, has not changed much since its development nearly 10 years ago (Church & Deluty, 1977). Rats are trained to discriminate between a short stimulus duration, S, and a long comparison duration, L, by reporting on one of two levers after presentation of a sample duration.
Fig. 4. Long report probability, P[“L”], as a function of normalized signal duration, $T/\sqrt{SL}$, for 4 subjects studied at two ranges of signal durations. The 2-8 sec range (open circles) and the 2-16 sec range (filled circles) approximately superpose. Indifference, the point at which subjects are equally likely to report short or long, lies close to the geometric mean (1.0 on the normalized scale).
They are then queried about intermediate values for which they are never reinforced.

We have adapted this procedure for pigeons. A trial begins with the illumination of the center key of a three-key array with a “trial available color,” say, blue. The first peck turns the center key white for a duration of T sec, after which it is extinguished, and two side keys are illuminated, say, red and green on the left and right, respectively. If T is equal to S, a peck on the left, red key is followed by food. If T is equal to L, a peck on the right, green key is followed by food. All other values of T (N = 5) go unrewarded, but a report response on one of the side keys is required to terminate the trial and initiate the intertrial interval (generally set equal to 5L).

Data from four subjects are shown in Fig. 4. The graph plots report probability on the “long” key, P[“L”], as a function of the duration of the sample, T, normalized by the geometric mean, $\sqrt{SL}$. Two different pairs of short and long values were studied, 2 versus 8 sec and 2 versus 16 sec, for a minimum of 3 weeks each, and the resulting functions are indicated by open and filled circles in the figure. Normalization by the geometric mean is seen to produce rough superposition of the two functions, especially in the mean. This finding is a form of Weber’s law, as discriminability is constant at constant proportions of the ratio of L to S.²

²That is, Weber’s law requires that report probability remain constant at constant ratios of T/L, given a constant L/S ratio. Superposition in our metric is stronger than that requirement as it allows these two ratios to trade, since $T/\sqrt{SL} = [T/L][\sqrt{L/S}]$.
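The normalization and the identity in the footnote can be checked numerically. The following is an illustrative sketch only: the function names are hypothetical, and the probe durations (the two endpoints and the geometric mean of each range) are chosen for convenience rather than taken from the experiment.

```python
import math


def geometric_mean(short, long):
    """Geometric mean of the two trained durations, sqrt(S * L)."""
    return math.sqrt(short * long)


def normalized_duration(t, short, long):
    """Sample duration T expressed in units of the geometric mean."""
    return t / geometric_mean(short, long)


def footnote_identity(t, short, long):
    """Right-hand side of the footnote identity: (T / L) * sqrt(L / S)."""
    return (t / long) * math.sqrt(long / short)


# The two trained ranges reported in the text: 2 vs 8 sec and 2 vs 16 sec.
for short, long in [(2.0, 8.0), (2.0, 16.0)]:
    gm = geometric_mean(short, long)
    # Hypothetical probe durations: the endpoints and the geometric mean.
    for t in (short, gm, long):
        x = normalized_duration(t, short, long)
        assert math.isclose(x, footnote_identity(t, short, long))
        print(f"S={short:g}, L={long:g}, T={t:.3f} -> T/sqrt(SL) = {x:.3f}")
```

In both ranges the geometric mean maps to 1.0 on the normalized axis, which is why indifference near the geometric mean appears at the same abscissa value for the two functions.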
The point at which subjects are equally likely to report short or long, the PSE or indifference point, is indicated in the mean function by the dashed line. This value, which may be thought of as one kind of “subjective middle,” lies close to the geometric mean, $T/\sqrt{SL} \approx 1.0$. These findings replicate, and extend to pigeons and to larger and unequal L/S ratios, the effects first described by Church and Deluty (1977) with rats.

The geometric mean finding originally prompted Church and his colleagues to speculate that the subjective time scale might be logarithmic, so that the arithmetic average of a pair of subjective values would be represented by the geometric mean in real time. Later analysis revealed that averaging logarithms is not the only version of a theoretical account of the temporal middle which would locate the indifference point at the geometric mean. It also results from a signal detection account using likelihood ratios, as well as from a similarity metric using ratios of linear functions of real time. The reader interested in the quantitative details of this analysis should consult Gibbon (1981b). For our present purposes we wish simply to contrast this view of a midpoint between two subjective values with alternative conceptualizations.

The midpoint from the point of view of discrimination models may be thought of as the point at which errors in appreciating either the short or the long value are equal. Two candidates for the structure of such errors and their relation to either a nearly linear or a logarithmic time scale are shown in Fig. 5. The top panel shows a scalar timing system in which subjective time increases as a power function of real time, with an exponent near 1.0. In this system variance in the memory for a given time value increases as the square of the mean, the scalar property (Gibbon, 1977). Two distributions are shown on the ordinate corresponding to memory for a short and a long time value. In the lower panel, subjective time is shown growing as the logarithm of real time, with constant variability on the subjective time scale. Both of these processes conform, under certain response rule constructions, to Weber’s law. They thus are consonant with this general finding in the literature, in contrast to a Poisson timing system in which variance increases directly with the mean (cf. Gibbon, 1981b).

Fig. 5. Schematic diagram of variable memory representations for two timing processes, plotted against real time. The scalar timing process in the top panel has mean time increase nearly linearly with real time and variance increase with the square of the mean. The log timing process in the lower panel has mean time increase as the logarithm of real time, with variance on the subjective scale constant. Arrows indicate the points of equal likelihood.

Clearly, an averaging version of the subjective temporal middle places the middle close to the arithmetic mean for scalar timing, but at the geometric mean for log timing, as noted above. Alternative discrimination processes, however, produce different PSEs, or temporal middles, for these two variance structures. A classical analysis using signal detection ideas might ask where maximum confusions would lie if subjects were making a posteriori calculations of the likelihood that a given sample was drawn from one or the other memory distribution. Intuitively, confusion should be maximal when likelihoods are equal. This point is shown for these two systems by the arrows and solid lines where the density functions intersect. The real time values associated with this version of the middle are the geometric mean (still) for the log timing system, but the harmonic mean for the scalar timing system (Gibbon, 1981b). The harmonic mean lies to the left of the geometric mean, as is evident in the figure, and in principle should be readily distinguishable from the geometric mean. In practice, however, variability around the midpoints leaves the issue still in doubt, and in fact, differing versions of the bisection task (Siegel, in press) suggest that under some circumstances bisection tends to occur closer to the harmonic mean.
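The three candidate middles can be compared for the duration pairs used in the bisection work. The sketch below is illustrative only and makes several assumptions that are not in the text: the function names are hypothetical, the coefficient of variation of 0.2 is arbitrary, and `equal_z_point` is a deliberately simplified stand-in for the likelihood analysis of Gibbon (1981b). It finds the time that lies equally many standard deviations above S and below L when each standard deviation is proportional to its mean, and it ignores the normalizing constants of the two densities.

```python
import math


def arithmetic_mean(s, l):
    return (s + l) / 2.0


def geometric_mean(s, l):
    return math.sqrt(s * l)


def harmonic_mean(s, l):
    return 2.0 * s * l / (s + l)


def log_scale_average(s, l):
    """Average the two values on a logarithmic scale, then map back to real time."""
    return math.exp((math.log(s) + math.log(l)) / 2.0)


def equal_z_point(s, l, cv=0.2, tol=1e-9):
    """Bisection search for the time t with (t - s)/(cv*s) == (l - t)/(cv*l).

    Simplified scalar-variability criterion (sd = cv * mean); cv is hypothetical.
    """
    lo, hi = s, l
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        z_above_short = (mid - s) / (cv * s)
        z_below_long = (l - mid) / (cv * l)
        if z_above_short < z_below_long:
            lo = mid  # still relatively closer to the short memory
        else:
            hi = mid
    return (lo + hi) / 2.0


for s, l in [(2.0, 8.0), (2.0, 16.0)]:
    print(f"S={s:g}, L={l:g}: AM={arithmetic_mean(s, l):.3f}, "
          f"GM={geometric_mean(s, l):.3f}, HM={harmonic_mean(s, l):.3f}, "
          f"log-average={log_scale_average(s, l):.3f}, "
          f"equal-z={equal_z_point(s, l):.3f}")
```

Under these assumptions, averaging on a log scale reproduces the geometric mean, the equal-standardized-distance point coincides with the harmonic mean, and the ordering harmonic < geometric < arithmetic holds for both pairs.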
However, the preponderance of the data favor the geometric mean, and hence were one to adopt a likelihood ratio discrimination rule, a log timing account would seem preferable.

Perhaps unfortunately, mathematical models (or those who construct them) are rather too flexible when faced with a single potential discrepancy. Special cases are somewhat too readily accommodated. I have argued elsewhere (Gibbon, 1981b) that an alternative discrimination rule is equally feasible for this task. This is a similarity rule in which subjects compare intermediate sample values with their memory for either endpoint and report whichever is most “similar.” Similarity is based on a ratio of the percept to the remembered values. Such a ratio results in indifference at the geometric mean for the scalar timing system as well as for the log timing system. Thus, we are left from these data alone without a clear discrimination between these two candidates for the subjective time sense.

B. TIME LEFT: THE ARITHMETIC MEAN

Several alternative approaches to curvature in the time sense have been studied by Church and Gibbon. One approach that we have examined did not depend, as do the analyses above, on the form of the distribution of errors around a remembered time. We reasoned that the difference between curvature and linearity in the subjective scale would be revealed if we could induce subjects to compare the beginning of one interval with the end of another. For example, if subjective time were strictly proportional to real time, then an interval of, say, 30 sec should be perceived as equal to the second half of an interval of 60 sec. But if time is curved as in the bottom panel of Fig. 5, this comparison should reveal the last half of the 60-sec interval to be subjectively much shorter than the 30-sec interval.

The procedure we devised to effect this comparison is called the time-left procedure. Subjects were asked to choose between a standard delay which stays fixed and the remaining time in an elapsing comparison delay which starts at twice the standard. The rationale behind this choice procedure will be briefly recapitulated here, and some new data presented. The work serves to introduce adaptations of this procedure studied later, appropriate to other ends.
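The 30-sec versus 60-sec comparison behind the time-left rationale can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not the chapter’s own formulation: the log scale is given a hypothetical origin of 1 sec (because the logarithm of zero is undefined), the linear scale has unit slope, and all function names are invented for the example.

```python
import math

LATENT_PERIOD = 1.0  # hypothetical origin (sec) for the log scale; log(0) is undefined


def linear_scale(t):
    """Subjective time strictly proportional to real time (arbitrary unit slope)."""
    return t


def log_scale(t, t0=LATENT_PERIOD):
    """Subjective time growing as the logarithm of real time, zero at t0."""
    return math.log(t / t0)


def subjective_span(scale, start, end):
    """Subjective length of the real-time interval from start to end."""
    return scale(end) - scale(start)


for name, scale, origin in [("linear", linear_scale, 0.0),
                            ("log", log_scale, LATENT_PERIOD)]:
    first_30 = subjective_span(scale, origin, 30.0)
    last_half_of_60 = subjective_span(scale, 30.0, 60.0)
    print(f"{name:>6}: first 30 sec = {first_30:.3f}, "
          f"last half of 60 sec = {last_half_of_60:.3f}")
```

On the linear scale the two spans are equal, whereas on the log scale the last half of the 60-sec interval is much shorter than a fresh 30-sec interval, which is the contrast the time-left procedure is designed to expose.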
III. Experiment 1: Baseline Time Left

Trials begin with an initial choice period of variable duration, T, as illustrated in Fig. 6. During the choice period, two keys, colored, say, white and red on the left and right, are available, and birds may distribute pecks across them in accordance with their preference for one or the other of two mutually exclusive consequences. If after time T the first peck is to the red key, the white key is extinguished, the red key changes color to green, and food is primed for responding after a fixed, standard interval, S. Conversely, if at the choice point subjects respond on the comparison white key, the red key is extinguished, the white key remains illuminated, and food is primed for responding after a total fixed comparison duration, C, timed from the start of the trial. In the work described here, generally C = 2S. The point at which a choice response will be effective in obtaining one of these two consequences is varied randomly from trial to trial and covers the range from shortly after trial onset to just before C.

Fig. 6. Time-left procedure, plotted against time since trial began. During an initial choice period (T), two response keys are available. Pecks to these keys occasionally lead to mutually exclusive terminal link delays to reinforcement (bull’s-eyes). On the standard side of the choice, the delay is always S sec. On the comparison side, the delay is L = C - T, the remaining time in a total alternative delay interval of C sec.

At any arbitrary moment, while the choice keys are still available, subjects are faced with a standard delay which might occur “right now” for pecking the red key, or a delay consisting of the remaining time, L = C - T, on the time-left side of the choice. If subjects are to choose the shortest delay to food, they should respond on the red key in favor of the standard early in the trial, but switch over to the time-left alternative later in the trial, when L < S.

A typical psychometric choice function from a pigeon studied with S = C/2 = 30 sec is shown in Fig. 7. Preference increases smoothly from near 0 to near 1 as time elapses during the trial. The function shows an indifference point, $T_{1/2}$, which may be regarded as the time at which the subjective distance to food on both alternatives is equal. In this example, subjective equality occurs when the actual remaining time is longer than 30 sec. This might be the result of a bias in favor of the elapsing interval, or it might indicate some curvature in the subjective time sense, since then the subjective distance from 0 to 30 sec (the standard delay) would be greater than the subjective distance from 30 to 60 sec (the time-left delay).

Fig. 7. Psychometric preference function from the time-left procedure, plotted against time since trial began. Data points represent relative responding on the time-left key during the choice period, “L”/[“L” + “S”].

These ideas are shown schematically in Fig. 8 (adapted from Gibbon & Church, 1981). In the top panel a linear time sense is shown, with the subjective distances to 30 and 60 sec indicated by the vertical arrows near the ordinate. A negative intercept is shown, corresponding to a small latent period, $T_0$, before subjective time begins to grow with real time. Subjects are assumed to perform the equivalent of allowing the 30-sec distance (arrow) to ride up the ramp as time elapses until it just spans the distance between the current ramp value and subjective 60 sec. This is indicated as the first switchover point, $T_{1/2}$, on the abscissa. Now consider what is expected when both S and C are doubled in absolute value. The standard is then 60 sec long and subjects would be expected to switch at the second $T_{1/2}$ point in the figure, when the distance between 0 and 60 sec just matches the distance from the ramp to the subjective representation of 120 sec. Thus, for a linear scale, $T_{1/2}$ increases with increasing S and C values.

Next, consider what is expected if the subjective time scale is logarithmic. This is depicted in the bottom panel on an inverse semilog plot, so that real time is on a log scale and the subjective representation grows linearly on this scale. Again, arrows on the ordinate reflect the subjective distance to 30 and 60 sec. When S = 30 sec, subjects should prefer the standard until the arrow appropriate to subjective 30 sec just matches the distance between the ramp function and subjective 60 sec. This is indicated at $T_{1/2}$ on the abscissa. It is clearly a much earlier value than 30 sec into the trial, even with a rather large latent period ($T_0$ = 5 sec). Now, however, if both S and C are doubled in real time, on the logarithmic scale this amounts to adding the same increment to both delays. Hence, indifference, the point at which the delays to food are subjectively equal, is not altered at all! This is shown by the 60-sec arrow on the ramp meeting subjective 120 sec at the same $T_{1/2}$ value.

Fig. 8. Schematic diagram of a linear time sense (upper panel) and logarithmic time sense (lower panel), both with an arbitrary temporal intercept ($T_0$), plotted against time since trial began. The lengths of the arrows on the left represent the subjective time to reinforcement for two values of the standard. (After Gibbon & Church, 1981.)

The key observation, then, is not whether $T_{1/2}$ occurs somewhat earlier than the midpoint of the elapsing interval, but whether $T_{1/2}$ increases when the standard and comparison are increased, but maintained in the same ratio. It is readily demonstrated that for the linear timing system the remaining delay to food at the point of indifference, $C - T_{1/2}$, should be linearly related to the midpoint of the elapsing interval, C/2 = S (Gibbon & Church, 1981; Gibbon, Church, & Meck, 1984). A bias in favor of the elapsing interval is indicated by a slope different from 1.0 in this relationship.

Seven birds were studied at three or four different S, C pairs for eight sessions each, with each condition generally determined twice. Table I shows conditions and color assignments (partially counterbalanced) for all subjects in Experiment 1 and for subsets of these subjects studied in two later experiments, reported below. Psychometric functions like those in Fig. 7 were obtained from the last 4 days at each determination. $C - T_{1/2}$ values taken by interpolation from the preference functions are shown in Fig. 9. Each function is fit with a least-squares line and $r^2$ values indicated. The time remaining at indifference is linearly related to the size of the midpoint of the interval, confirming and extending our earlier reports (Gibbon & Church, 1981; Gibbon et al., 1984). The slopes of these functions are reliably different from 2.0 ($T_{1/2}$ vs S slope different from 0.0) and hence not consonant with a logarithmic subjective time scale. They are also generally greater than 1.0, consonant with a bias in favor of the elapsing interval.
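The contrasting predictions of the two scales can be sketched numerically. This is a minimal illustration under assumptions that are mine rather than the chapter’s: the linear scale is taken as $t - T_0$ and the log scale as $\ln(t/T_0)$ after the latent period, no response bias is included, indifference is defined as the elapsed time at which the subjective standard delay equals the subjective time left, and the function names are hypothetical. The latent period is set to the 5 sec mentioned in the text.

```python
import math

T0 = 5.0  # latent period (sec); the text's illustrative value, used here as an assumption


def linear_scale(t, t0=T0):
    """Subjective time grows linearly with real time after the latent period."""
    return max(t - t0, 0.0)


def log_scale(t, t0=T0):
    """Subjective time grows as the logarithm of real time after the latent period."""
    return math.log(t / t0) if t > t0 else 0.0


def indifference_time(scale, standard, comparison, tol=1e-9):
    """Elapsed time T at which scale(S) equals scale(C) - scale(T), by bisection."""
    target = scale(standard)
    lo, hi = 0.0, comparison
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        time_left = scale(comparison) - scale(mid)
        if time_left > target:
            lo = mid  # the time-left side still looks longer: indifference is later
        else:
            hi = mid
    return (lo + hi) / 2.0


for name, scale in [("linear", linear_scale), ("log", log_scale)]:
    for s in (15.0, 30.0, 60.0):
        c = 2.0 * s
        t_half = indifference_time(scale, s, c)
        print(f"{name:>6}: S={s:g}, C={c:g} -> T_1/2 = {t_half:.2f}, "
              f"C - T_1/2 = {c - t_half:.2f}")
```

With these assumptions the linear scale gives an indifference time that grows with S, so that the remaining time at indifference has slope 1.0 against S, while the log scale gives the same indifference time at every S, so that the remaining time has slope 2.0. Those are the two benchmark slopes against which the obtained functions are compared.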
IV. Time-Left Mixture: The Harmonic Mean

I have argued that the procedure for finding a midpoint between two remembered time values heavily constrains the obtained results. When the midpoint is assessed by its similarity to the two ends, the geometric mean results. When the midpoint is assessed by contrast with an interval half as long, it is linearly related to the arithmetic mean. Both results are compatible with at least one version of a scalar timing process, but a log timing process is compatible only with the middle at the geometric mean. Alternative constructions of a scalar timing process, however, predict the harmonic mean as the temporal middle. My third approach to the temporal middle problem develops theory and experiment on task demands for which the
TABLE I
KEY COLOR AND TIME VALUE ASSIGNMENTS FOR EACH SUBJECT IN EACH EXPERIMENT
Experiment 1

                            Bird
                            370    372    373    685    1380   2549   3106
Color assignments:
  Choice link (“C”/“S”)     R/G    W/R    W/G    R/G    R/G    W/R    W/G
  Terminal link (C/S)       R/LW   W/G    W/R    R/LW   R/LW   W/G    W/R
Time values (C/S), conditions 1-13:
60/30ᵃ 30/15 120/60ᵃ 120/60ᵃ 120/60ᵃ 120/60 60/30 30/15 120/60
60/30 30/15 120/60 120/60 120/60 120/60 60/30 30/15 120/60
60/30 30/15 120/60 120/60
15/7.5 30/15
15/7.5 30/15 30/15 15/7.5 60/30 30/15 15/7.5 60/30 60/30
120/60 120/60 60/30 30/15 120/60
30/15
15/7.5 60/30 30/15 15/7.5 60/30 60/30
15/7.5ᵃ 30/15 30/15 15/7.5 60/30 30/15ᵃ 120/60 60/30 30/15 120/60 120/60 60/30 120/60
15/7.5ᵃ 30/15 30/15 15/7.5 60/30 30/15 120/60 60/30ᵃ 30/15 120/60 60/30 120/60 60/30
Experiment 2 Color assignments: Choice link ( “ C ” / “ S ” ) Terminal link (CIS) Time values (CIS,, S2): Condition: 1 L
3 4
RIG R/LW
WIR WIG
WIG W/R
60/10, 50 120120, 100 60/10, 50
60/10, 50
AM 60/10, 50
120/20, 100 60/10, 50 3015, 25
30/5, 2 F
1 1 1
120/20, loo“ 60/10, 50“ 3015, 25“
HM 5 6 7 8
c
4
9 10
I1 Recovery (CIS): 12 13 14 15
16
30/10.71, 25 30/10.71, 25 30/10.71, 25 120/42.86, 100 l20/42.86, 100 120/42.86, 100 60121.43, 50 60121.43, 50 60121.43, 50 30/10.71, 25 30110.71, 25 30110.71, 25 120142.86, 100 120142.86, 100 120142.86. 100 60121.43, 50 60121.43, 50 60/21.43, 50 30/10.71, 25 30/10.71, 25 120/60 60130 30/15 120/60
120/60 60130 30115 120/60
.1
=
C/2
1 1 1 1 1 1 J.
s = c/2
120160
60/30 30/15 120/60
HM l20/42.86, 100 120142.86, 100 120/42.86, 100
= C/2
1 1 J. 1 = C12
(continued)
TABLE I (Continued)
Bird: 370 372 373 685 1380 2549 3106
RIG RIW
RIG RIW
WIR WIG
WIG WIR
60130 60115, 45 60115, 60 601 15, 120 601 15, 240“ 6 0 1 1 5 , 240“
60130 60115. 45 60115, 60 601 15, 120 601 15, 240 601 15, 240
Experiment 3 Phase I Color assignments: Choice link (“C”1”S”) Terminal link (CIS) Time values (CIS): (CISI, Sz):
S = c12 AM = C12 GM = C12 HM + C12 HM +-+ c12 HM ++ c12
60130 60130 60115, 45 60115, 45 60115, 60 60115, 6 0 O 601 15, 120 601 15, 120 601 15, 240 601 15, 240 601 15, 240 601 15, 240
Phase II Informative condition Color assignments: Choice link ( “C”I“SI,” “S2” ) Terminal link (CIS) Time values (CISI, Sz): HM
-
Uninformative condition Color assignments: Choice link (“C”1“S”) Terminal link (CISI, S 2 ) Time values ( C I S I ,S 2 ) : HM
C12
-
C12
a $T_{1/2}$ value not obtained in 8-day determination.
b $T_{1/2}$ not obtained because of near exclusive preference for $S_1$.
c $T_{1/2}$ not obtained because of near exclusive preference for L = C - T.
RIG, Y WIY, G WIG, Y WIR RIW WIR 60115b, 4 8 0 ~ 60115b. 4 8 0 ~ 60/15b,480 60115b, 480 RIG, Y RIW
RIW RIG, Y
RIW RIG, Y
WIR WIY, G
WIR WIG, Y
60115, 480
60115, 480
60115, 480
60115, 480
The Structure of Subjective Time
[Figure 9: one panel per bird; abscissa: Standard Duration, S.]
Fig. 9. Time left at indifference, $C - T_{1/2}$, as a function of the size of the standard, for 7 birds. The linear functions are least-squares regressions on the mean values (filled circles). Variance accounted for ($r^2$) is indicated in each panel.
harmonic mean is the appropriate index of the middle. The key feature of this analysis is the construction of an aggregate or mixture of intervals on one side of a choice, contrasted with a single alternative interval on the other. The question is a classic one in the operant-choice literature, where it takes the form of asking what value of a fixed interval schedule matches or is equivalent to a variable interval (or sometimes a mixed interval) schedule. Historically the answers have been consistent in one qualitative respect, namely, in rejecting the arithmetic mean as too large; that is, in a variety of choice settings, both rats and pigeons show a pronounced preference for a variable delay to reward, as against a fixed delay equal to the arithmetic mean of the variable set. The phenomenon has been reported mainly in the concurrent chains paradigm with pigeons (e.g., Autor, 1969; Herrnstein, 1964; Killeen, 1968, 1970), but its predecessors go back to the earlier work of Pubols (1958, 1962) and Logan (1960, 1965) with rats. Pubols (1962) suggested that a steep discount in the value of a delayed reward might make an ensemble of such delays disproportionately weighted by short intervals. The work was couched in a delay-of-reward gradient framework, with rats running faster for variably delayed reward in the goal box
John Gibbon
of a straight alley. It also occurred, however, in choice behavior of rats in T mazes (Pubols, 1962), where the phenomenon appears to be essentially the same as the more recent findings with pigeons in concurrent chain schedules. Attempts to explain the common variable preference have remained largely descriptive. An early report by Killeen (1968) suggested that the harmonic mean of a variable interval schedule may be the best predictor of its fixed interval equivalent (see also Shimp, 1969). Similarly, a recent report by Mazur (1984) suggests that averaging hyperbolic functions associated with two delays might be the rule for obtaining their fixed equivalent. This amounts to adding a (small) constant to each delay and then taking their harmonic mean.³ A variety of alternative suggestions also have been advanced, including the geometric mean (Fantino, 1969) and the harmonic mean of the cubed delays (Davison, 1969). Given this rich history in the psychological literature, it is striking that there have been virtually no reports of the alternative preference, for no variability. An important exception may be some recent findings by Caraco and his associates in an experimental analog of a foraging situation (Caraco, 1982; Caraco, Martindale, & Whittam, 1980).⁴

There follows a theoretical and experimental analysis of the preference for a mixture of two delays to food. The results strongly favor the harmonic mean as the temporal middle. The two-point mixture is a case of special interest because it admits of a simple theoretical solution showing that averaging of expectancies, or inverse delays, is not the only mechanism resulting in the harmonic mean. In the time-left setting, one can study this question by delivering one of two standards on the standard side of the choice, each on a random half of the entries into that terminal link. The procedure is shown in Fig. 10. It is a time-left procedure just as in Fig. 6, but now there are two standards, $S_1$ and $S_2$.
Memory for the mix of standard delays is contrasted at any moment, T, with memory for the elapsing delay, L = C - T. Imagine that both standards are remembered with some variance and mixed in equal proportion in memory, just as they are delivered to the subject. Consider a decision mechanism which takes a sample from this mixture in memory and compares it to a sample from the memory of the fixed, time-left alternative. Choice favors whichever is the shorter delay. The general case for Gaussian memory distributions of the individual delays is analyzed in the Appendix. It is shown there that if appreciation of the current time has small variance compared to the memory for time, the memory representation of the time left at indifference, $L_{1/2} = C - T_{1/2}$, is given by

$$\mu(L_{1/2}) = p\,\mu(S_1) + q\,\mu(S_2), \tag{1}$$

where the $\mu(T)$ are the mean memory representations of real times, T. The weights, p, q = 1 - p, are given by

$$p = \frac{1}{1 + \left[(\sigma_1^2 + \sigma_C^2)/(\sigma_2^2 + \sigma_C^2)\right]^{1/2}}, \tag{2}$$

and the $\sigma_J^2$, J = 1, 2, C, are the variances associated with the two standards and the time-left interval, respectively.

Consider first the hypothetical case in which the fixed, time-left alternative is known exactly so that the only variance in the decision mechanism comes from the two-point mixture. The indifference point, $\mu(L_{1/2})$, in this simple scheme is then just that value of the fixed alternative for which 50% of the samples from the mixture are shorter (or longer): the median. The median is a limiting case as variance in the mixture becomes large relative to variance in the memory for the fixed delay. This case might be approximated, for example, when the two standards were widely separated (Experiment 3). If the variance of the time-left alternative is set to zero, the weights become $p = \sigma_2/(\sigma_1 + \sigma_2)$. It is then readily shown (cf. Appendix) that the real-time counterpart of the median, $L_{1/2}$, becomes

$$L_{1/2} = \begin{cases} \tfrac{1}{2}(S_1 + S_2), & \sigma_J = \sigma,\ \mu(S_J) \propto S_J,\ J = 1, 2 & \text{(3a)} \\ (S_1 S_2)^{1/2}, & \sigma_J = \sigma,\ \mu(S_J) \propto \ln(S_J),\ J = 1, 2 & \text{(3b)} \\ 1/\left[\tfrac{1}{2}(1/S_1 + 1/S_2)\right], & \sigma_J = \gamma\,\mu(S_J),\ \mu(S_J) \propto S_J & \text{(3c)} \end{cases}$$

Fig. 10. Time-left procedure with two standards. The procedure is identical to that for one standard (Fig. 6), except that now either a short, $S_1$, or a long, $S_2$, delay is programmed for half of the entries into the standard side terminal link. In the example shown, a 60-sec comparison interval is contrasted with two standards which average 30 sec.

³In scalar expectancy theory (Gibbon, 1977), this is equivalent to averaging mean expectancies, where the small added constant is $T_0$.
⁴Caraco and his associates found that some passerine species show risk aversion, which may be interpreted as preference for no variability under some motivational conditions. Their paradigm attempts to model foraging conditions in the field and does not involve temporal variability directly. However, a temporal analog of their risky choice situation certainly deserves study.
The median of the mixture takes the value of the arithmetic mean if variance on the subjective scale is constant and the mean memory representation is proportional to its real-time counterpart [absolute timing, Eq. (3a)], the geometric mean if variance is constant but the mean memory representation is proportional to the logarithm of its real-time counterpart [log timing, Eq. (3b)], and the harmonic mean if variance is proportional to the square of the mean and the mean is proportional to real time [scalar timing, Eq. (3c)].⁵ It is important to recognize that these implications involve no free theoretical parameters. In particular, the harmonic mean remains the median of the two-point scalar mixture independent of the level of sensitivity to time. This is shown graphically in Fig. 11. Two mixtures of two-point Gaussian distributions are displayed with different sensitivity values [coefficient of variation, $\gamma$ in Eq. (3c)]. The median is indicated by the dashed vertical line (HM). Two experiments are reported below which analyze this question from somewhat different perspectives. Both utilize the "double standard" time-left procedure.
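The three cases can be checked numerically. The sketch below is illustrative only: it finds the median of an equal-weight two-point Gaussian mixture by bisection under each variance and mean assumption. The function names, the example standards (15 and 45 sec), and the constant standard deviation of 5 units are assumptions chosen for the demonstration.

```python
import math

def normal_cdf(z):
    """Unit normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mixture_median(means, sds, tol=1e-10):
    """Median of an equal-weight mixture of normals, located by bisection."""
    def cdf(x):
        return sum(normal_cdf((x - m) / s) for m, s in zip(means, sds)) / len(means)
    lo = min(means) - 10.0 * max(sds)
    hi = max(means) + 10.0 * max(sds)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_time_median(s1, s2, scheme, sd=5.0, gamma=0.15):
    """Real-time counterpart of the mixture median under one timing scheme."""
    if scheme == "absolute":  # constant variance, mean proportional to real time
        return mixture_median((s1, s2), (sd, sd))
    if scheme == "log":       # constant variance, mean proportional to log time
        return math.exp(mixture_median((math.log(s1), math.log(s2)), (sd, sd)))
    if scheme == "scalar":    # standard deviation proportional to the mean
        return mixture_median((s1, s2), (gamma * s1, gamma * s2))
    raise ValueError(f"unknown scheme: {scheme}")

s1, s2 = 15.0, 45.0  # hypothetical pair of standards, in seconds
print("absolute:", real_time_median(s1, s2, "absolute"))
print("log:     ", real_time_median(s1, s2, "log"))
for g in (0.15, 0.4):  # the two sensitivity values displayed in Fig. 11
    print(f"scalar, gamma={g}:", real_time_median(s1, s2, "scalar", gamma=g))
```

With these values the scalar median is the same for both coefficients of variation, which is the parameter-free property described above.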
V. Experiment 2: Arithmetic and Harmonic Mean Standards

Over a series of conditions lasting a minimum of 8 days, three subjects from Experiment 1 were studied at sets of $S_1$ and $S_2$ values chosen so that either their arithmetic or harmonic mean equalled C/2, where C/2 = 15, 30, and 60 sec, comparable to the baseline condition (see Table I). Performance under variable standards was contrasted with performance under single-standard baseline redeterminations conducted after the double standard conditions. The logic was that the pair resulting in performance most like that with the single standard represented the appropriate temporal middle for this situation.

A representative set of preference functions for one bird with the 30-sec standard and an arithmetic and harmonic mean pair is shown in Fig. 12. Data were taken from the last 4 days under each condition. When the pair of standards was chosen such that their arithmetic mean equalled 30 sec, the bird stayed with this side of the choice considerably longer into the elapsing interval than when the standard was the single value set at 30 sec. When the pair was chosen such that their harmonic mean equalled 30 sec, performance was virtually identical to that obtained with the single fixed standard. The data of Fig. 12 are particularly good exemplars of the rule that performance looks closest to the single standard with the harmonic mean pair. However, data from other pairs also conform to this rule, at least with respect to central tendency.

Fig. 11. Hypothetical memory distributions corresponding to a mixture of two remembered real-time intervals. The mixture with two modes is for a relatively sensitive subject ($\gamma$ = .15). The broader mixture with a long tail is for a relatively insensitive subject ($\gamma$ = .4). The median is indicated by the dashed vertical line over the harmonic mean (HM) of the intervals. [Abscissa: Remembered Time.]

⁵These conclusions hold quite well if proportionality in the mean is relaxed to linearity with a constant real-time intercept, $T_0$ (cf. Gibbon & Church, 1984).
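The standard pairs listed for Experiment 2 in Table I can be recovered from the target mean and the long standard. The following sketch uses hypothetical helper names and simply solves each mean for the short standard.

```python
def short_standard_for_arithmetic_mean(target, s2):
    """S1 such that (S1 + S2) / 2 equals the target."""
    return 2.0 * target - s2

def short_standard_for_harmonic_mean(target, s2):
    """S1 such that 1 / [(1/S1 + 1/S2) / 2] equals the target."""
    return 1.0 / (2.0 / target - 1.0 / s2)

# Target C/2 and long standard S2 for the three comparison values (C = 30, 60, 120 sec).
for target, s2 in ((15.0, 25.0), (30.0, 50.0), (60.0, 100.0)):
    am_s1 = short_standard_for_arithmetic_mean(target, s2)
    hm_s1 = short_standard_for_harmonic_mean(target, s2)
    print(f"C/2={target:4.0f}, S2={s2:5.0f}: AM pair S1={am_s1:5.2f}, HM pair S1={hm_s1:5.2f}")
```

For C/2 = 30 sec and $S_2$ = 50 sec this gives $S_1$ = 10 sec for the arithmetic pair and $S_1$ = 21.43 sec for the harmonic pair, the values shown in Fig. 12.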
[Figure 12: abscissa: Time Since Trial Began (sec).]
Fig. 12. Psychometric preference functions for one subject under the baseline time-left condition (S = 30 sec) and an arithmetic ($S_1$ = 10 sec, $S_2$ = 50 sec) and harmonic ($S_1$ = 21.43 sec, $S_2$ = 50 sec) mean pair condition.
The last 4 days under each condition were analyzed by extracting indifference point values from the preference functions. The time left at indifference, $C - T_{1/2}$, for each pair is plotted over the arithmetic, geometric, or harmonic mean abscissa value in the three rows of Fig. 13. Each subject is represented by a column. The linear functions in each panel are best-fit regression lines, and the regression function for the single standard from the recovery of baseline is repeated in each panel. The baseline regression from Experiment 1 is shown in the panels in the top row for reference (dashed line). Note first that the two functions for the single standard (top row) are quite close and that they lie above the function for the mixture in every case; that is, subjects prefer the mixture more than the single standard, no matter how the mean of the mixture is calculated. It is also clear, however, that performance is closest to that for the single standard when the double standard data are plotted against the harmonic mean of the mixture. This is reflected also in the variance accounted for by the regressions. On the other hand, while the harmonic mean does better than the other two, the slope differences, particularly between the geometric and the harmonic mean regressions, are often not large. The next experiment widens this difference, capitalizing on different properties of the two means.

[Figure 13: columns: P370, P372, P373; abscissa: Mean Standard Duration.]
Fig. 13. Time left at indifference for each subject under each condition as a function of mean of the standard side delays. The mean is calculated either as the arithmetic (top row), geometric (middle row), or harmonic mean (bottom row). The regression function from the baseline recovery with a single standard (filled circles) is repeated in each panel. It is quite comparable to the function from Experiment 1 (dashed line, open triangles, top panels only). The lower function in each panel is the regression of the double standard data against the mean, calculated in each of the three manners.
VI. Experiment 3: Harmonic Mean Asymptote

Our second approach to this problem included very large separations between the two standards in the mixture. Note that the arithmetic mean and the geometric mean are unbounded as the larger of the two values is increased. The harmonic mean, however, asymptotes to twice the smaller value as the larger increases without bound. We capitalized on this difference in Experiment 3 by studying a set of pairs in which we maintained $S_1$ = 15 sec and increased $S_2$ to large values. The comparison interval was kept at C = 60 sec. The remaining four subjects from Experiment 1 served.
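The divergence of the three means can be tabulated directly. This is a minimal sketch with hypothetical function names, using $S_1$ = 15 sec and the $S_2$ values studied in Experiment 3.

```python
import math

def arithmetic_mean(a, b):
    return (a + b) / 2.0

def geometric_mean(a, b):
    return math.sqrt(a * b)

def harmonic_mean(a, b):
    return 2.0 * a * b / (a + b)

S1 = 15.0  # short standard, in seconds
for S2 in (45.0, 60.0, 120.0, 240.0, 480.0):
    print(f"S2={S2:5.0f}  AM={arithmetic_mean(S1, S2):7.2f}"
          f"  GM={geometric_mean(S1, S2):6.2f}  HM={harmonic_mean(S1, S2):6.2f}")
```

The arithmetic and geometric means grow without bound as $S_2$ increases, while the harmonic mean stays below $2S_1$ = 30 sec; at $S_2$ = 240 sec the geometric mean equals 60 sec, the value of the comparison interval.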
A. PHASE I: UNCUED DOUBLE STANDARDS

Subjects were studied as in Experiment 2 with double standards, with $S_1$ = 15 sec and $S_2$ = 45, 60, 120, and 240 sec in successive conditions (Table I). Indifference points were taken from preference functions pooled over the last 4 days in each condition. The time left at indifference, $C - T_{1/2}$, is plotted in Fig. 14 against increasing $S_2$. The mean data from each subject under each condition are represented by open points. The squares on the right are from Phase II of the experiment. The functions labeled AM, GM, and HM represent the arithmetic, geometric, and harmonic mean of the two standards. The data lie between the function for the geometric and that for the harmonic mean. At $S_2$ = 240 sec, the geometric mean function meets C = 60 sec. A log timing account predicts indifference at the beginning of the trial and preference for time left thereafter. One subject did indeed absorb at this value, and no data point is presented for it here.⁶

An indication of the temporal control exerted by memory for the two mixed standards may be seen in Fig. 15. The data are group mean response rate in the standard terminal link, both when it was short ($S_1$ = 15 sec) and long ($S_2$ = 240 sec). There is a peak in response rate at the 15-sec time value (both functions) followed by a decline in responding to low values and then a gradual rise as the $S_2$ interval elapses. Subjects thus bear in mind, so to speak, both the short and the long interval, and do so with much sharper resolution close to the short target time. This, of course, is expected from a scalar timing account, and the sharply peaked gradient around $S_1$ = 15 sec is quite comparable to functions obtained with the "peak procedure" (Roberts, 1981) used in our laboratory in other contexts (Gibbon et al., 1984).

[Figure 14: subjects P685, P1380, P2549, P3106; abscissa: Long Standard, $S_2$ (60, 120, 240, 480 sec).]
Fig. 14. Time left at indifference, $C - T_{1/2}$, as a function of the size of the larger of the two standard delays, $S_2$. The solid line functions represent the arithmetic mean (AM), geometric mean (GM), and harmonic mean (HM). The latter asymptotes at $2S_1$ = 30 sec. Open circles represent the mean across subjects. The rightmost points at 480 sec (squares) are from the uninformative condition in Phase II.

⁶This subject, P685, absorbed on the time-left side of the choice from the beginning of the trial at $S_2$ = 240 sec, and thus did not have a $T_{1/2}$ value here. It did, however, show an indifference point for a larger $S_2$ in Phase II of the experiment.
B. PHASE II: CUED DOUBLE STANDARDS

In Phase I subjects were studied at successively larger values of $S_2$. This may have introduced some inertia in shifting preference toward time left. To control for hysteresis, in Phase II $S_2$ was increased to 480 sec, but in two different conditions subjects were either informed or uninformed about which of the two standards was programmed to occur for choices of the standard side on that trial.
[Figure 15: abscissa: Time Since Standard Began (sec).]
Fig. 15. Group mean response rate in the short ($S_1$ = 15 sec, open circles) and long ($S_2$ = 240 sec, closed circles) standards.
The procedure for the informative condition is diagramed in Fig. 16. The trial begins with one of two colors on the standard initial link choice key. In the presence of one color, "the short color," here yellow, pecking will produce the short standard (Y → $S_1$). On other trials the standard choice key is colored "$S_2$ color," here green, and pecking now will produce the long standard (G → $S_2$). The procedure is thus a mixture of two single-stimulus time-left procedures in which the standard is either short or long. As subjects learned the predictive nature of the conditional choice key colors, they came to choose the standard on trials when it would be short, but absorbed rapidly on the time-left key when the standard would be long.

After training with the predictive colors on the choice key, subjects were returned to an uninformative condition similar to Phase I, with a single color on the standard choice key. Now, however, entry into each standard delay was accompanied by a cue. The two colors previously used to predict $S_1$ and $S_2$ were now present during $S_1$ and $S_2$. The short or the long delay, as before, occurred on half of the entries into this terminal link. Color assignments were partially counterbalanced across subjects (see Table I).

Preference functions over the last 4 days for each condition of Phase II are shown with the mean in Fig. 17. Two functions are shown for the informative condition, one for choice in the presence of the color predicting $S_1$ (upward-pointing triangles labeled Y → S = 15), and another for choice in the presence of the $S_2$ predictor (downward-pointing triangles labeled G → S = 480). One subject (P685) discriminated the two consequences perfectly. It absorbed on time left when the long standard was programmed and on the standard when the short standard was programmed. The other subjects showed varying degrees of discrimination performance, but all of them preferred the short standard much more than the long standard in the informative condition.

The intermediate functions are those generated by the uninformative condition in which subjects cannot predict which of the two standards is programmed until they obtain it. The preference functions all rise smoothly to indifference values in the middle of the range. This is true for the subject that showed near perfect discrimination of the two when they were predictable, as well as for the other subjects. It should be noted that these intermediate functions cannot be conceived as averages of the two functions for the informative condition. In the extreme case of perfect discrimination (e.g., P685), averaging would produce a flat function at indifference.

Another index of the discriminative control exerted by the informative stimuli in the choice period is the performance in the terminal link standards following one or the other signal. Note that once the standard terminal link has begun, the key color is uninformative with respect to which interval is programmed. Thus, discriminative performance here reflects the conditional control of the prior choice stimuli. In Fig. 18 response rate in the two standards is shown as a function of time since the standard began for the informative condition. The functions obtained following the predictor of the short standard are all higher and sharply accelerated toward the $S_1$ = 15 sec termination with food. Some of the rate functions for the $S_2$ interval show a mode at the $S_1$ = 15 sec value, followed by a decline and then a gradual rise as the long interval nears completion. The short mode at $S_1$ presumably reflects some residual excitatory strength here, but the discrimination between the long predictor and the short predictor is quite clear in the absolute rates near 15 sec. The small mode here is to be contrasted with the larger peak at $S_1$ = 15 sec seen when subjects were uninformed as to which standard was in effect (Fig. 15).

The $C - T_{1/2}$ values from the uninformative condition are plotted in Fig. 14 (squares). The mean preference function here has an indifference value such that the time remaining happens to lie precisely at the harmonic mean. This means that some subjects preferred the mixture with $S_2$ = 480 sec more than they did that with $S_2$ = 240 sec. The differential cues during the terminal link in the uninformative condition at 480 sec may be implicated in this difference.

Fig. 16. Time-left procedure for the informative condition of Phase II. Trials begin with the standard side choice key lit with one of two colors, say either yellow (Y) or green (G). On trials in which it is yellow, entry into the terminal link will produce the short (S = 15 sec) standard delay, while on trials in which it is green, entry into the terminal link will produce the long (S = 480 sec) delay.

[Figure 17: abscissa: Time Since Trial Began (sec).]
Fig. 17. Psychometric preference functions from 4 subjects under the informative and uninformative conditions. The two functions for the informative condition represent preference in the presence of the choice color predicting the short (S = 15 sec) or the long (S = 480 sec) standard delay. Under the uninformative condition, subjects cannot predict which of these two delays will occur upon entry into the standard terminal link.
Fig. 18. Response rate in the short and long standards under the informative condition. After responding on the "$S_1$" (yellow) choice key, subjects receive the short standard (function labeled Y → $S_1$ = 15 in the panel for the mean). After responding on the "$S_2$" (green) choice key, subjects receive the long standard (labeled G → $S_2$ = 480).
The conclusion seems clear that the psychologically typical, central, or middle value of a mixed pair of delays lies close to the harmonic mean, even when that value represents a pair with one delay 16 times as long as the other. Somehow subjects mix the two values, either as suggested in Fig. 11 and switch over at the median, or they mix these values in some fashion which reflects the average of their inverses, as suggested originally by Killeen (1968) and recently modified by Mazur (1984).

The mechanism whereby the switchover point is determined is not forced by these data beyond the constraint imposed by the harmonic mean. However, continued sampling with multiple decisions as the trial elapses is unlikely on the basis of trial-by-trial analyses. If fresh decisions were made several times during the choice period as it elapses, then several switchover points might result within a trial with a long choice period. Particularly when the second standard is long, multiple comparisons within a trial should result in the preference function rising rapidly to near 0.5 and staying there until about 45 sec into the C = 60 sec interval. The slopes of these preference functions are too steep to be a direct reflection of the variance in the double standard mixture (cf. Appendix). Moreover, reversals within a trial were rare. Subjects generally showed one $T_{1/2}$ value per trial, with virtually no responding on the time-left side prior to the switchover point. The smooth ogival preference functions arise largely from between-trial variation in the location of $T_{1/2}$. Thus, it seems that if something like the complete mixture is retained in memory, sampling from it must be used to adjust the current switchover point. For example, if a sample from memory for the mixture were shorter than the time left from $T_{1/2}$ to food on the last trial, subjects might adjust the switchover point somewhat later into the current trial. Such a mechanism would titrate $T_{1/2}$ toward the median of the mixture.⁷
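A titration rule of this kind can be simulated. The sketch below is only one possible reading of the mechanism: the fixed adjustment step, the scalar memory distributions with coefficient of variation 0.4, the example standards, and the function name are all assumptions introduced for illustration, not details given in the text.

```python
import random

def simulate_titration(s1=15.0, s2=45.0, c=60.0, gamma=0.4,
                       step=0.1, trials=200_000, seed=0):
    """Return the average time left at the switchover point, C - T_1/2,
    over the second half of a simulated run.

    On each trial one sample is drawn from the remembered two-point mixture
    (scalar variance assumed). If the sample is shorter than the time left
    at the previous switchover point, the switchover moves later by a fixed
    step; otherwise it moves earlier by the same step.
    """
    rng = random.Random(seed)
    t_half = c / 2.0  # arbitrary starting switchover point
    total, count = 0.0, 0
    for trial in range(trials):
        mean = s1 if rng.random() < 0.5 else s2
        sample = rng.gauss(mean, gamma * mean)
        time_left = c - t_half
        if sample < time_left:
            t_half += step  # standard side seemed shorter: switch later
        else:
            t_half -= step  # standard side seemed longer: switch earlier
        if trial >= trials // 2:  # discard the first half as burn-in
            total += c - t_half
            count += 1
    return total / count

print("mean time left at switchover:", simulate_titration())
print("harmonic mean of 15 and 45 sec:", 2.0 * 15.0 * 45.0 / (15.0 + 45.0))
```

Because the rule moves the switchover in one direction whenever the sample falls below the time left and in the other direction otherwise, it settles where the two outcomes are equally likely, that is, near the median of the mixture.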
VII. Concluding Remarks

What I have attempted to develop in this article has two main conclusions. The first is that the central, typical, average, or representative value of an aggregate of two or more time intervals in memory is no single, simple function of the values in the collection. The temporal middle, which seems on its face an important, conceptually diagnostic value for understanding the processing of time, may take on different values. Depending on the task, the temporal middle may suggest an underlying logarithmic process, as in the bisection procedure, an underlying linear process, as in the time-left procedure with a single standard, or an underlying hyperbolic process, as in the time-left double standard procedure. The data we have adduced are consonant with scalar timing as the primary mechanism modulating the discriminability of time intervals and are not consonant with at least some versions of alternative variance structures in the timing mechanism.

The second main conclusion is that our original question on the curvature of subjective time is moot. If one regards the subjective time scale as based on discriminability, then curvature is an appropriate description. However, if one regards subjective time as a direct reflection of the internal measurement device that appreciates the passage of successive moments (and this is my preference), then the data and theory put forth here argue against serious curvature, at least not of the order of logarithms. On this view, the answer to the question, Is subjective time curved?, is, Not very, but the temporal middle is where you find it.

⁷This mechanism should also generate alternation in the direction of adjustment near the center of the $T_{1/2}$ distribution. A relatively complete sequential analysis is beyond the scope of this article; however, some preliminary analysis of one subject (P2549) on the baseline conditions has been done. Differences between $T_{1/2}$ values on successive trials were coded positive or negative. A runs test on these data revealed reliably more runs, that is, more alternations, than expected by chance. Sequential data from these and other tasks are currently being studied with R. M. Church.
Appendix: Double Standard Mixture

At any arbitrary time, T, in the trial, subjects are assumed to sample from their memory, established over long training, for the time remaining to food on both sides of the choice. On the standard side this is a sample ($x_S$) from a mixture of the memories for $S_1$ and $S_2$, which are assumed normal with means and variances

$$\mu(S_i) = \mu_i, \qquad \sigma(S_i) = \sigma_i, \qquad i = 1, 2 \tag{A1}$$

This mixture has distribution function

$$F_S(x) = \tfrac{1}{2}\left\{\Phi[z_1(x)] + \Phi[z_2(x)]\right\} \tag{A2}$$

where $\Phi$ is the unit normal distribution function with variate

$$z_i(x) = (x - \mu_i)/\sigma_i, \qquad i = 1, 2 \tag{A3}$$

An estimate of the time left to food on the comparison side, L = C - T, is obtained as the difference between two independent samples from the memory for the total comparison delay, C, and the current time, T. The time left, $x_L = x_C - x_T$, is then normally distributed also with the mean and variance

$$\mu_L = \mu_C - \mu_T, \qquad \sigma_L^2 = \sigma_C^2 + \sigma_T^2$$

A ratio comparison is made between the two delays, and a choice in favor of L occurs whenever

$$x_L / x_S < B$$
where B is a bias parameter reflecting preference for an elapsing or a fixed delay.⁸ Bias acts as a scale parameter here (i.e., the mean and standard deviation of $x_L/B$ are just these parameters rescaled by 1/B), so we may allow B = 1 without loss of generality. The probability of choosing the time-left alternative, L, at time T, is then

$$P(\text{“L”} \mid T) = P(x_L < x_S) = \int_{-\infty}^{+\infty} \Phi[z_L(x)]\, dF_S(x) \tag{A6}$$

where $z_L(x)$ is defined analogously to Eq. (A3), with i = L. The form of the mixture, $F_S(x)$, given by Eq. (A2), allows the right-hand side of Eq. (A6) to be expressed as the sum of two probabilities, one associated with each of the two standards,

$$P(\text{“L”} \mid T) = \tfrac{1}{2}\left[P(x_L < x_1) + P(x_L < x_2)\right] \tag{A4}$$

where the $x_i$ are normal with means and variances $\mu_i$, $\sigma_i$ [Eq. (A1)]. The differences, $x_L - x_i$, are then also normal, so that Eq. (A4) has solution

$$P(\text{“L”} \mid T) = \tfrac{1}{2}\left\{\Phi[\xi_1(T)] + \Phi[\xi_2(T)]\right\} \tag{A5}$$

where

$$\xi_i(T) = (\mu_i - \mu_L)/(\sigma_i^2 + \sigma_L^2)^{1/2}, \qquad i = 1, 2 \tag{A6}$$
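with $\mu_i$, $\sigma_i$ defined as in Eqs. (A3) and (A4).

A. TIME LEFT AT INDIFFERENCE: $L_{1/2} = C - T_{1/2}$

At indifference, $P(\text{“L”} \mid T_{1/2}) = \tfrac{1}{2}$. Hence

$$\Phi[\xi_1(T_{1/2})] = 1 - \Phi[\xi_2(T_{1/2})]$$

But by symmetry, $1 - \Phi(z) = \Phi(-z)$, so that

$$\xi_1 = -\xi_2, \tag{A7}$$

or

$$\frac{\mu_1 - \mu_{L_{1/2}}}{(\sigma_1^2 + \sigma_{L_{1/2}}^2)^{1/2}} = \frac{\mu_{L_{1/2}} - \mu_2}{(\sigma_2^2 + \sigma_{L_{1/2}}^2)^{1/2}} \tag{A8}$$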
The results, Eqs. (A5) and (A8), are quite general and hold for any mean and variance assignments provided the memory random variables are essentially positive normal. Several lines of evidence, however, suggest that appreciation of the current time, T, has negligible variance relative to variance in the memory for the standard and comparison intervals (cf. Gibbon & Church, 1984; Gibbon et al., 1984; Roberts, 1981). For this case, $\sigma_{L_{1/2}} = \sigma_C$, and Eq. (A8) yields a solution for the mean memory representation at indifference

$$\mu_{L_{1/2}} = p\mu_1 + q\mu_2 \tag{A9}$$

where q = 1 - p and

$$p = \frac{1}{1 + \left[(\sigma_1^2 + \sigma_C^2)/(\sigma_2^2 + \sigma_C^2)\right]^{1/2}} \tag{A10}$$

which are text Eqs. (1) and (2). If variance is constant ($\sigma_i = \sigma$), then $p = q = \tfrac{1}{2}$, and the time left at indifference is the arithmetic mean of the standards on the subjective scale. This translates to the arithmetic or geometric mean of the real-time standards as the mean memory representation is proportional to real time or the logarithm of real time, respectively [text Eqs. (3a) and (3b)]. If variance on the subjective scale is proportional to the square of the mean so that the coefficient of variation, $\gamma = \sigma/\mu$, is constant (the scalar property), then

$$p = \frac{1}{1 + \left[(\mu_1^2 + \mu_C^2)/(\mu_2^2 + \mu_C^2)\right]^{1/2}} \tag{A11}$$

In this form, it is clear that if the two standards are small relative to the comparison, $p \to \tfrac{1}{2}$, so that the time left at indifference approaches the arithmetic mean of the standards. This relation is preserved if remembered time is proportional to real time. Unfortunately, in practice choice becomes unbalanced when one alternative is much more favorable than the other. Subjects tend to absorb on the more favorable side. Hence this prediction of scalar timing is not readily evaluated.

⁸B may be expected to vary also (e.g., with E(b) = B). Introduction of variance here induces some skew in the decision variates. This case is treated in some detail in Gibbon et al. (1984).

B. MEDIAN OF THE DOUBLE STANDARD MIXTURE
A case of special interest arises when variance in the appreciation of the time left in the elapsing interval is negligible relative to variance in the double standard mixture. In this case, indifference is expected when the time left in the elapsing interval equals the median of the mixture. Setting $\sigma_c = 0$ in Eq. (A8) yields Eq. (A9) with $p = \sigma_2/(\sigma_1 + \sigma_2)$. Again, $p = 1/2$ for constant variance, with the arithmetic and geometric mean implied for absolute and log timing, respectively [text Eqs. (3a) and (3b)]. Now, however, a straightforward algebraic calculation shows that for a constant coefficient of variation $\gamma = \sigma/\mu$ (scalar timing),
$$\mu_{L_{1/2}} = \frac{2\mu_1\mu_2}{\mu_1 + \mu_2},$$
the harmonic mean [text Eq. (3c)]. Proportionality preserves this relation exactly (and linearity preserves it approximately) for the real-time counterparts of the mean memory representations.
ACKNOWLEDGMENTS
This research was supported by NSF Grant BNS 81 1-9748. I am indebted to Stephen Fairhurst for data collection and analysis and for execution of the experiments reported here. I am also indebted to him and to my colleagues Russell Church and Lorraine Allan for formative discussions of many of the issues studied here.
REFERENCES
Allan, L. G. (1979). The perception of time. Perception and Psychophysics, 26, 340-354.
Autor, S. M. (1969). The strength of conditioned reinforcers as a function of frequency and probability of reinforcement. In D. P. Hendry (Ed.), Conditioned reinforcement (pp. 127-162). Homewood, IL: Dorsey Press.
Caraco, T. (1982). Aspects of risk aversion in foraging white-crowned sparrows. Animal Behavior, 30, 719-727.
Caraco, T., Martindale, S., & Whittam, T. S. (1980). An empirical demonstration of risk-sensitive foraging preferences. Animal Behavior, 28, 820-830.
Church, R. M. (1978). The internal clock. In S. H. Hulse, H. Fowler, & W. K. Honig (Eds.), Cognitive processes in animal behavior. Hillsdale, NJ: Erlbaum.
Church, R. M., & Deluty, M. Z. (1977). Bisection of temporal intervals. Journal of Experimental Psychology: Animal Behavior Processes, 3, 216-228.
Davison, M. C. (1969). Preference for mixed-interval versus fixed-interval schedules. Journal of the Experimental Analysis of Behavior, 12, 247-252.
Fantino, E. (1969). Conditioned reinforcement, choice, and the psychological distance to reward. In D. P. Hendry (Ed.), Conditioned reinforcement. Homewood, IL: Dorsey Press.
Getty, D. J. (1975). Discrimination of short temporal intervals: A comparison of two models. Perception and Psychophysics, 18, 1-8.
Gibbon, J. (1977). Scalar expectancy theory and Weber’s law in animal timing. Psychological Review, 84, 279-325.
Gibbon, J. (1981a). Two kinds of ambiguity in the study of psychological time. In M. Commons & J. A. Nevin (Eds.), Quantitative analysis of behavior (Vol. 1). Cambridge, MA: Ballinger.
Gibbon, J. (1981b). On the form and location of the psychometric bisection function for time. Journal of Mathematical Psychology, 24, 58-87.
Gibbon, J., & Church, R. M. (1981). Time left: Linear vs. logarithmic subjective time. Journal of Experimental Psychology: Animal Behavior Processes, 7, 87-108.
Gibbon, J., & Church, R. M. (1984). Sources of variance in an information processing theory of timing. In H. L. Roitblat, T. G. Bever, & H. S. Terrace (Eds.), Animal cognition. Hillsdale, NJ: Erlbaum.
Gibbon, J., Church, R., & Meck, W. (1984). Scalar timing in memory. In J. Gibbon & L. G. Allan (Eds.), Timing and time perception. New York: New York Academy of Sciences.
Heinemann, E. G. (1984). A model for temporal generalization and discrimination. In J. Gibbon & L. G. Allan (Eds.), Timing and time perception. New York: New York Academy of Sciences.
Herrnstein, R. J. (1964). Aperiodicity as a factor in choice. Journal of the Experimental Analysis of Behavior, 7, 179-182.
Killeen, P. (1968). On the measurement of reinforcement frequency in the study of preference. Journal of the Experimental Analysis of Behavior, 11, 263-269.
Killeen, P. (1970). Preference for fixed-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 14, 117-124.
Logan, F. A. (1960). Incentive: How the conditions of reinforcement affect the performance of rats. New Haven: Yale Univ. Press.
Logan, F. A. (1965). Decision making by rats: Delay versus amount of reward. Journal of Comparative and Physiological Psychology, 59, 1-12.
Mazur, J. E. (1984). Tests of an equivalence rule for fixed and variable reinforcer delays. Journal of Experimental Psychology: Animal Behavior Processes, 10, 426-436.
Meck, W. H., & Church, R. M. (1983). A mode control model of counting and timing processes. Journal of Experimental Psychology: Animal Behavior Processes, 9, 320-334.
Meumann, E. (1893). Beiträge zur Psychologie des Zeitsinns. Philosophische Studien, 8, 431-509.
Pubols, B. H. (1958). Delay of reinforcement, response perseveration, and discrimination reversal. Journal of Experimental Psychology, 56, 32-40.
Pubols, B. H. (1962). Constant versus variable delay of reinforcement. Journal of Comparative and Physiological Psychology, 55, 52-56.
Roberts, S. (1981). Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes, 7, 242-268.
Roberts, S., & Church, R. M. (1978). Control of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes, 4, 318-337.
Shimp, C. P. (1969). Concurrent reinforcement of two interresponse times: The relative frequency of an interresponse time equals its relative harmonic length. Journal of the Experimental Analysis of Behavior, 21, 109-115.
Siegel, S. F. (1986). A test of the similarity rule model of temporal bisection. Learning and Motivation, 17, 59-75.
Stubbs, A. (1968). The discrimination of stimulus duration by pigeons. Journal of the Experimental Analysis of Behavior, 11, 223-256.
Treisman, M. (1963). Temporal discrimination and the indifference interval: Implications for a model of the “internal clock.” Psychological Monographs, 77 (13, whole no. 576).
THE COMPUTATION OF CONTINGENCY IN CLASSICAL CONDITIONING
Richard H. Granger, Jr. and Jeffrey C. Schlimmer
COMPUTER SCIENCE DEPARTMENT
UNIVERSITY OF CALIFORNIA
IRVINE, CALIFORNIA 92717
I. Introduction: Theory and Experiment in Classical Conditioning
Experimental and theoretical work on classical conditioning over the past 20 years includes mathematical formulations of the conditions under which conditioning will and will not occur in animals (Rescorla, 1967, 1968; Gibbon, Berryman, & Thompson, 1974); algorithms that give rise to this behavior (e.g., Rescorla & Wagner, 1972; Mackintosh, 1975; Pearce & Hall, 1980; Wagner, 1981); computer simulations of the behavior (e.g., Rescorla & Wagner, 1972; Sutton & Barto, 1981; Hampson & Kibler, 1983); and substrate-level implementations of the neural circuits that may underlie conditioning (Hawkins & Kandel, 1984; Chang & Gelperin, 1980; Alkon, 1980; Thompson et al., 1984; Gluck & Thompson, 1985). It is quite difficult, however, to evaluate in a principled way how all of these experimental results, algorithms, computer models, and proposed circuits are related to each other. For instance, how could we go about deciding whether the Rescorla-Wagner (1972) or Mackintosh (1975) algorithms do what the Rescorla (1968) constraint specifies that such algorithms are supposed to do? How might we decide whether a particular experimental result should imply a revision to that constraint?
This article presents a unified framework within which to view the computations, algorithms, and neurobiological implementations underlying classical conditioning. In particular, we present an extensive mathematical analysis of the constraints on classical conditioning, as originally identified by Rescorla (1968); that is, the precise contingency conditions under which mammals will and will not learn a particular association between two events in a classical conditioning situation.
In classical conditioning, an unconditional stimulus (US), that is, a cue that is inherently biologically salient to an animal (such as an electric shock), is repeatedly paired with a conditional stimulus (CS), a cue that initially has no special significance to the animal (e.g., a tone or a light); over repeated trials, the animal can learn that the CS is predictive of or associated with the US. This phenomenon of associative learning is subject to laws and constraints: An association will be learned to some extent in some conditions and to a lesser extent (or not at all) in others. Using Marr’s (1982) distinction among the computational (roughly, behavioral), algorithmic (abstract mechanism), and implementation (neurobiological) levels of analysis of psychobiological mechanisms, our computational analysis may be used to test the adequacy of a number of proposed algorithm-level and circuit-level mechanisms for classical conditioning. Our computational analysis is applied to a broad range of issues relating to contingency in classical conditioning, and a number of results are derived:
1. A new class of trial presentation conditions for classical conditioning is identified and distinguished from other presentation conditions. This new class of conditions, which we term “partial warning,” is simply the reciprocal of the well-known “partial reinforcement” condition: Where partial reinforcement intersperses spurious (unpaired) CS trials with CS-US pairings (with no spurious USs), the partial warning condition intersperses spurious USs with no spurious CSs; both of these partial conditions are differentiated from the “composite” class of presentation conditions in which combinations of both spurious CS and spurious US trials are added to CS-US pairings. The new condition has been mentioned only rarely in the literature, and we show how comparative analysis of these conditions may prove fruitful in evaluating proposed algorithms and circuits for contingency.
2. A number of new predictions are generated which may be tested experimentally; in particular, the computational analysis of contingency predicts that learning of a positive CS-US association should occur in even the extreme cases of the partial warning condition, as it does in extreme partial reinforcement conditions, but not in extreme composite conditions.
3. It is shown that the standard predictions of contingency-based associative learning in classical conditioning (from Rescorla, 1968) depend critically on strong assumptions about timing. In particular, under different assumptions about the duration of a trial (2 min vs 3 min, etc.), the contingency prediction of whether or not a particular CS-US association will be learned or the extent to which it will be learned is greatly altered.
4. Algorithms presently in the literature are analyzed for their adequacy to account for the range of effects predicted by the computational contingency constraint. A new algorithm is proposed that accounts for the appropriate computational constraints (including the new partial warning prediction) as well as accounting for blocking and providing a coherent account of some learned irrelevance and latency effects in conditioning.
5. Proposed neurobiological circuits for classical conditioning are similarly analyzed for their adequacy to account for these predictions. In particular, Hawkins and Kandel (1984) have offered an analysis of a neurobiological circuit in Aplysia as evidence that the operation of this circuit gives rise to associative learning; we address the question of whether the circuit’s operation simulates the same specific laws as do mammals in classical conditioning situations. If so, then a strong connection between molluscan and mammalian conditioning will have been shown; if not, then it will be possible to rigorously distinguish molluscan and mammalian classical conditioning.
This article raises a number of theoretical and experimental questions in light of our framework for couching the mechanisms of associative learning. The rest of the article is divided roughly into two parts: Sections II and III provide overview, introduction, and background to our approach and our results; Sections IV and V then give detailed and in-depth analyses of the questions we have raised. For many of these theoretical questions, no answers are provided per se, but, wherever possible, we have attempted to develop explicit experimental predictions from our theoretical work to ensure that our results are testable and falsifiable.
II. A Three-Level Analysis of Classical Conditioning
A. CHARACTERIZATION OF PARTIAL VS COMPOSITE PRESENTATION CONDITIONS
Mammals have been tested extensively for their sensitivity to various presentation conditions in classical conditioning (e.g., Rescorla, 1968, 1972; Mackintosh, 1975; Dickinson, 1980; Rescorla & Wagner, 1972; Gibbon et al., 1974). Rescorla (1968) identified the conditions that enable versus those that prevent learning of a particular association over trials: A positive CS-US association will be learned only if the probability of the US occurring, given that the CS has occurred, is greater (over trials) than the probability of the US occurring given that the CS has not occurred, or, formally, $p(\mathrm{US} \mid \mathrm{CS}) > p(\mathrm{US} \mid \overline{\mathrm{CS}})$. This new constraint condition on associative learning in classical conditioning, termed contingency by Rescorla (1968), displaced the then prevalent notion that simple contiguity (i.e., the number of paired presentations of CS and US) was the key factor that determined the level of learning of a CS-US association (Spence, 1936). Rescorla demonstrated that it was this measure of relative conditional probabilities, not number of pairings, that determined whether a particular association would be learned and the extent of the associative strength that would be perceived between the CS and US.
Analysis of this constraint of relative conditional probabilities shows that learning of positive CS-US associations is enabled in certain categories of presentation conditions and is prevented in other conditions. For instance, animals will readily learn a positive CS-US association in a “perfect pairings” condition (i.e., repeated CS-US pairing trials, with no misinformation presented). From the statement of relative conditional probabilities, it can be readily predicted that animals will also learn the association to some extent even in extreme partial reinforcement conditions¹ (perfect pairings with many spurious CSs mixed in), but that learning of the positive association will be severely degraded in composite misinformation conditions where both spurious CSs and spurious USs are mixed in with presentations of pairings.² This is because the above conditional probability inequality holds throughout the perfect pairings and partial reinforcement conditions, but does not necessarily hold in composite conditions. Hence, based on the contingency constraint of relative conditional probabilities, we can rigorously distinguish between characteristics of learning in partial reinforcement conditions versus in composite misinformation conditions.
In the partial condition, as more and more spurious (unpaired) CS trials are mixed in with paired CS-US trials, learning will degrade only very mildly. If the level of associative CS-US correlation is plotted against the percentage of presented spurious CS trials in partial reinforcement (top curve in Fig. 1), learning of the association degrades very gently until the percentage of spurious CSs is up around 90%, and only goes to zero when there are 100% spurious CS trials.
This means that there will be some learning of the positive CS-US association no matter how many spurious CSs are added in a partial reinforcement condition, up to but not including 100% spurious CSs, and furthermore, that the level of learning of the CS-US association will barely be degraded at all unless trials consist of more than 90% spurious CSs overall. In contrast to the almost imperceptible, gentle degradation in the partial reinforcement condition, in the composite misinformation condition learning will be severely degraded as more and more spurious CS and spurious US trials are mixed in with paired trials. The lower line in Fig. 1 plots the strength of CS-US learning against the percentage of spurious CS and spurious US trials in the composite condition; in this case, learning of the association severely degrades down to zero association with 50% spurious trials, and as the percentage of spurious trials increases over 50%, the inverse of the association is increasingly learned (i.e., the CS is learned to be a safety signal indicating that the US will not occur). The difference between the partial and composite cases can be clearly seen: As the percentage of spurious trials increases past about 30% or 40%, learning will be almost unimpaired in the partial condition, but will be severely degraded in the composite condition.
Fig. 1. Degradation of learned predictiveness in partial versus composite conditions. (Vertical axis: cue strength.)
¹This condition has, of course, been extensively tested and confirmed in the literature (e.g., Fitzgerald, 1963; Rescorla, 1968).
²There are a number of subcategories of the composite misinformation condition: For very few spurious CSs and USs, the animal will still learn the positive association; as these are increased, the animal will increasingly fail to learn the positive association and will increasingly tend to learn the inverse of the association (i.e., that the CS is a “safety signal” indicating that the US is not about to occur). Section IV presents a precise specification of learning in these conditions and the implications thereof.
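The inequality between the two conditional probabilities can be illustrated with a minimal Python sketch that estimates both probabilities from trial counts. The function names and all counts are hypothetical. The sketch also assumes that some trials contain neither a CS nor a US, which is needed for the probability of the US in the absence of the CS to be defined. It reports only the difference between the two probabilities and makes no attempt to reproduce the curves of Fig. 1.

```python
def conditional_probs(n_pair, n_cs_only, n_us_only, n_neither):
    """Return (p(US|CS), p(US|no CS)) estimated from trial counts."""
    n_cs = n_pair + n_cs_only          # trials on which the CS occurred
    n_no_cs = n_us_only + n_neither    # trials on which it did not
    if n_cs == 0 or n_no_cs == 0:
        raise ValueError("need at least one CS trial and one no-CS trial")
    return n_pair / n_cs, n_us_only / n_no_cs


def contingency(n_pair, n_cs_only, n_us_only, n_neither):
    """Difference p(US|CS) - p(US|no CS); positive when the constraint holds."""
    p_us_cs, p_us_no_cs = conditional_probs(
        n_pair, n_cs_only, n_us_only, n_neither
    )
    return p_us_cs - p_us_no_cs


# Hypothetical counts: (pairs, spurious CSs, spurious USs, empty trials).
partial = contingency(20, 30, 0, 50)           # partial reinforcement
extreme_partial = contingency(5, 95, 0, 100)   # 95% of CSs are spurious
composite_zero = contingency(20, 30, 20, 30)   # composite, no contingency
composite_neg = contingency(10, 40, 30, 20)    # composite, CS signals safety
```

With no spurious USs the second probability is zero, so the difference stays positive however many spurious CSs are added; once spurious USs are mixed in as well, it can reach zero or change sign.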
B. THE FOUR TRIAL-PRESENTATION CONDITIONS OF CLASSICAL CONDITIONING
We have distinguished the characteristics of learning in the perfect pairings (no spurious trials) condition, the partial reinforcement class of conditions (spurious CS trials, but no spurious USs), and the composite misinformation condition class (both spurious CS and spurious US trials): The composite condition exhibits very severe degradation of the learning of a CS-US association, while the partial condition yields only very gentle degradation of learning. When presented in this way, a fourth logical class of testing categories comes to light: What will happen in a situation in which spurious USs, but no spurious CSs, are mixed in with CS-US pairings? We term this fourth category the partial warning condition: In this condition, not all of the USs (e.g., shocks) are preceded by a CS (tone) warning. In this light, perfect pairings represent both perfect reinforcement (of the tone CS) and perfect warning (of the shock US); the two partial conditions correspond to misinformation along one and only one of these two dimensions, while the composite condition presents misinformation in both ways (reinforcement and warning). The partial warning condition seems not to have been extensively tested in the literature;³ in particular, it is unclear from the literature whether gentle or severe degradation of associative learning occurs in this class of testing conditions.
It is important to note that these four classes of trial-presentation conditions simply represent subdivisions of the continuum of all possible such conditions: they are not discrete, discontinuous categories, but rather are particular subareas of the overall “space” of possibilities. This contingency space will be introduced in Section II,C and then will be explored in some depth in Section IV. However, continued reference to these four categories of trial-presentation conditions will enable us to afford a clear discussion of the particular characteristics of learning in each condition.
A number of researchers have experimented with spurious USs, but apparently only in combination with spurious CSs (thereby forming a composite misinformation condition). For example, Rescorla (1968, 1972), who was the first to perform a systematic exploration of contingency effects in conditioning, proceeds by first testing partial reinforcement conditions, that is, CS-US pairs and spurious CSs, showing that gentle degradation of learning occurs with the addition of spurious CSs. He then adds spurious US trials to the spurious CSs, generating composite misinformation cases, and demonstrates that learning becomes severely decremented as the percentage of spurious (CS and US) trials is increased.
Careful reading of Rescorla (1966, 1968, 1972) shows clearly that he does not report testing the effects of spurious USs without spurious CSs (i.e., the partial warning condition). In the same vein, work on contingency following Rescorla’s (e.g., Gamzu & Williams, 1971; Hearst & Franklin, 1977; Mackintosh, 1983; Dickinson, 1980) has concentrated on the partial reinforcement and composite misinformation conditions; we have been unable to find any report of systematic testing of CS-US pairs plus spurious USs, without spurious CSs, in the animal or human learning literature. The mathematical analysis of contingency presented here predicts that only gentle degradation of learning should occur throughout the partial warning condition, just as it does in the partial reinforcement condition. To test this prediction, we are conducting an experiment replicating Rescorla (1968, Experiment 2) on the partial reinforcement (PR) condition (0.4-0), composite (C) condition (0.4-0.4), the null (N) condition (0-0) (a control in which no CSs or USs are presented to the animal), and adding a partial warning (PW) condition (1.0-0.4).
³Gibbon et al. (1974) identified this category of presentations, which they termed the “‘CS-implies-US implication.’ . . . CS implies US but USs occur with some probability in -CS [the absence of a CS] also.” They reported then that “This . . . implication represents another case of partial schedules that has not been investigated.” This seems still to be true more than a decade later; we are in the process of testing this condition in our laboratory.
C. SIGNIFICANCE OF THE NEW FINDINGS: A THREE-LEVEL ANALYSIS
1. The Three Levels: Computation, Algorithm, Implementation
This new partial warning condition is potentially just as integral a part of classical conditioning as is the well-known partial reinforcement condition; the four conditions in Table I (PP, PW, PR, and C) taken together constitute complete coverage of the possible testing conditions for classical conditioning. Using the new analysis presented here of the contingency constraint (i.e., the computational specification of the conditions under which classical conditioning will and will not occur), we have been able to define this presentation condition formally and clearly and to generate the prediction that learning should occur throughout this condition, just as it does in partial reinforcement.
Using Marr’s (1982) distinction among the computational, algorithmic, and implementation levels of analysis of psychobiological mechanisms, we propose a set of related analyses of classical conditioning at all three levels. Any complete theory of any complex phenomenon can usefully be divided into these three separate levels. Animals can be said to perform identifiable computations, that is, to transform inputs to outputs in a principled way. For instance, in order to learn which of many possible cues (tone, light, air puff) reliably predict the occurrence of some other salient stimulus (e.g., shock, food), a rat in a classical conditioning situation must “compute” the relative predictiveness of each of the possible cues with respect to the occurrence of the salient stimulus. At the computational level, we may speak simply of these computations that must somehow be performed, without reference to how those computations may be carried out. The algorithm level constitutes the level of mathematical function that performs the necessary computations. Finally, these algorithms may be instantiated in a substrate (e.g., neurons, wires, computer bits) at the implementation level.

TABLE I
CATEGORIES OF TRIAL PRESENTATIONS IN CLASSICAL CONDITIONING

                              No spurious (unpaired) USs    Spurious USs
No spurious (unpaired) CSs    Perfect pairings (PP)         Partial warning (PW)
Spurious CSs                  Partial reinforcement (PR)    Composite condition (C)

These three levels are not wholly independent. In particular, the algorithm level must conform to the constraints provided by each of the other two levels: It must compute all and only those things that have been identified as actually occurring in animal learning (at the computational level) by making use of only those tools provided in the substrate (at the implementation level). It is at the computational level that the target behavior is formally and precisely stated, so this level prescribes the characteristics of the object of study. This allows us to speak of the degree of “correctness” of algorithms that are proposed to calculate the behavior. Any algorithm, no matter how fast, elegant, or efficient, is a correct algorithm for, say, contingency in classical conditioning only to the extent that it gives rise to the precise target behavior; that is, it learns or fails to learn a specific CS-US association in precisely those conditions that the computational level specifies. Similarly, any neurobiological circuit actually instantiates classical conditioning only to the extent that its operation gives rise to those correct target behaviors specified in the computational-level analysis. Once such a computational analysis has been performed (in this case, by Rescorla, 1968), then using the computational level as “arbiter” of the adequacy of proposed algorithms and circuits enables us to narrow the search for valid mechanisms of associative learning. By the same token, however, a given computation is correct only to the extent that it is actually computable with the mechanisms provided by the substrate. The relevant neurobiology therefore establishes an equally crucial constraint in the sense that it is the substrate that (somehow) gives rise to the target behavior.
Just as algorithms for learning must conform to the computational constraints of the target behaviors to be explained, so must algorithms conform to the implementational constraints of the substrate. For instance, any proposed algorithm must be able to be run in a parallel, associative network of neurons, since that is the nature of the substrate. The problem is that it is often possible to experimentally identify a precise characterization of a target behavior long before the relevant neurobiology is identified; this is clearly the case, for instance, with classical conditioning. These three are distinct and in most respects independent (although they interact); it is often quite unclear just what a particular algorithm or implementation computes. For instance, “connectionist” models of learning (e.g., Anderson et al., 1977; Hampson & Kibler, 1983) consist of large numbers of distributed, parallel nodes and links that cooperatively and competitively perform individual calculations; analyzing what the overall system computes quite often turns out to be a mathematically difficult or intractable task.
The computational constraint of contingency in classical conditioning can be stated loosely as the fact that the positive CS-US association will be learned in the PP, PR, and PW conditions described above and will not be learned in particular C conditions. The mathematical formulation of this set of results enables us to recast the existing analyses of contingency into a larger framework. This analysis may then be used to determine which of many proposed mathematical algorithms and neurobiological circuits conform to the appropriate constraint. Section IV of this article describes all this in detail. The following is a brief introductory presentation of our computational, algorithmic, and implementation-level analyses.
2. The Computational Analysis of Contingency
Rescorla’s (1967, 1968) original characterization of the contingency computation was that rats are able to learn a positive CS-US association only if the probability of the shock outcome (the US) given the occurrence of the conditional stimulus feature (the CS; e.g., the tone) is greater than the probability of the outcome occurring without that feature having occurred, or, stated in terms of conditional probability, $p(\mathrm{US} \mid \mathrm{CS}) > p(\mathrm{US} \mid \overline{\mathrm{CS}})$.⁴ This constraint can be translated into a three-dimensional graph in which the three axes correspond to the joint probabilities of the CS and US occurring, the CS but not US occurring, and the US but not the CS occurring.⁵ In other words, the three axes correspond to the probability of CS-US pairs, the probability of spurious (unpaired) CSs, and the probability of spurious (unpaired) USs. Figure 2 shows two rotated views of these axes and plots the above Rescorla conditional probability boundary surface, which translates into a saddle-shaped surface (hyperbolic paraboloid) in this space. (The mathematical derivation of the equation plotted here is given in Appendix A.) Each point in this space corresponds to a specific set of classical conditioning trials, with the probabilities of the CS and US occurring together determined by the point’s location in the space.
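One way to make the boundary concrete: writing x, y, and z for the joint probabilities of a spurious CS, a spurious US, and a CS-US pair, and assuming the CS occurs on some but not all trials, the inequality between the conditional probabilities is algebraically equivalent to z > (x + z)(y + z), that is, to the joint probability of CS and US exceeding the product of their marginal probabilities. This rearrangement is offered only as an illustration and is not taken from Appendix A. The following minimal Python sketch uses it to place a point of the space on one side of the boundary or the other; the function names and example points are hypothetical.

```python
def contingency_margin(x, y, z):
    """Signed margin z - (x + z) * (y + z).

    x = P(CS, no US), y = P(no CS, US), z = P(CS, US).
    Positive: p(US|CS) > p(US|no CS); zero: on the boundary surface.
    """
    if min(x, y, z) < 0 or x + y + z > 1:
        raise ValueError("x, y, z must be joint probabilities summing to <= 1")
    return z - (x + z) * (y + z)


def classify(x, y, z, tol=1e-12):
    """Label a point of the contingency space by the sign of its margin."""
    margin = contingency_margin(x, y, z)
    if margin > tol:
        return "positive"   # CS signals the US
    if margin < -tol:
        return "negative"   # CS signals the absence of the US
    return "none"           # on the boundary: no contingency


# Hypothetical points (x, y, z) in the contingency space.
partial_reinforcement = classify(0.3, 0.0, 0.2)   # spurious CSs only
partial_warning = classify(0.0, 0.3, 0.2)         # spurious USs only
composite = classify(0.3, 0.3, 0.2)               # both kinds of spurious trial
on_boundary = classify(0.25, 0.25, 0.25)          # CS and US independent
```

The margin is symmetric in x and y, which is one way to see why the partial warning condition is expected to behave like the partial reinforcement condition with respect to the sign of the association.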
Points on (or in the immediate vicinity of) the saddle surface correspond to those presentation conditions in which the presented CS-US association will not be learned; “inside” (on the Z-axis side of) the surface, the positive association will be learned to some extent (i.e., the CS signals the US), while “outside” the surface (the side away from the 2 axis), the negative association will be learned (i.e., that the CS is a safety signal indicating that the US will not occur). All proposed mechanisms (mathematical or biological) purported to perform classical conditioning can be tested for their adequacy 41t is crucial to note that conditional probabilities [e.g., p(US1CS)-the probability of the US given Occurrence of the CS] are distinct from joinr probabilities [e.g., p(US,CS) -the probability of the US and CS together; these are related:p(USICS) = p(US,CS)/p(CS)], and both are in turn distinct from simple marginal probabilities (e.g., the percentage of USs or CSs over trials). These differences are gone into in detail in Section IV. SThe fourth logical possibility, the probability that neither CS nor US occurs, is uniquely determined at each point in this three-dimensional space and hence is not a separate independent axis.
146
Richard H. Granger, Jr. and Jeffrey C. Schlimmer
z = P(CS,US) (probability of CS-US pair)
Y = P(rn,US) (probability of spurious US)
Fig. 2. The computational constraint of contingency (330" and 240" rotations).
to account for contingency by measuring their (ideal) performance against the criteria represented by this curve. (Section IV goes more deeply into this computational analysis and its implications.) The computational contingency constraint itself is not inviolate; though it arose from systematic behavioral testing (Rescorla, 1968, 1969, 1972) and has been replicated and extended (Rescorla & Wagner, 1972; Mackintosh, 1975; Pearce & Hall, 1980), there are testable predictions from the formula that have not yet been tested, which, if in conflict with (future) experimental results, would require modification of the constraint. In other words, the computational level constraint is experimentally testable and falsifiable; for instance, the partial
Contingency in Classical Conditioning
Fig. 2. (continued) [Axes: $Z = p(CS, US)$, probability of CS-US pair; $X = p(CS, \overline{US})$, probability of spurious CS.]
warning condition will provide a test of a specific class of predictions of the theory (see Section IV) which have not yet been subjected to systematic testing. One issue raised by the contingency formula is that the calculation of conditional probabilities depends on an explicit assumption about the duration of time that is deemed to constitute a trial. Different choices of trial duration can change the values of the conditional probabilities for any single set of trials. This means that perceived trial duration will alter perceived conditional probabilities and so will determine in part which of several potential CS cues the animal will associate with the US and how strongly that association will be learned. This leads inexorably to the assumption either that (1) particular animals
have fixed “trial window” durations (possibly different fixed durations for different classes of CSs) or (2) that animals have a way of choosing a trial window duration based on some characteristic of the trials, such as the duration of the CS. It is interesting to note that in models and simulations of classical conditioning (e.g., Rescorla & Wagner, 1972) as well as in animal experiments (e.g., Rescorla, 1968), the trial window duration is assumed to be set equal to the CS duration. This is by no means the only plausible assumption and, in fact, other assumptions can drastically change the predicted behavior of subjects and the performance of simulations. In sum, the trial window duration must be added as an explicit assumption applied to the interpretation of these experiments and simulations. These issues will be discussed in more detail in Section V,E,2.
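The sensitivity to trial window duration described above can be illustrated with a small sketch. The event times, the two window lengths, and the helper name `conditional_probs` are hypothetical and chosen only for illustration; the code simply bins one fixed stream of CS and US onsets into trials of a given duration and recomputes the two conditional probabilities.

```python
def conditional_probs(cs_times, us_times, session_length, window):
    """Bin event onsets into trials of length `window` and estimate
    p(US|CS) and p(US|no CS) over the resulting trials."""
    n_trials = int(session_length // window)
    cs_trials = {int(t // window) for t in cs_times}   # trials containing a CS
    us_trials = {int(t // window) for t in us_times}   # trials containing a US
    n_cs = len(cs_trials)
    n_no_cs = n_trials - n_cs
    pairs = len(cs_trials & us_trials)                 # CS and US in same trial
    us_without_cs = len(us_trials - cs_trials)         # spurious USs
    p_us_given_cs = pairs / n_cs if n_cs else float("nan")
    p_us_given_no_cs = us_without_cs / n_no_cs if n_no_cs else float("nan")
    return p_us_given_cs, p_us_given_no_cs


# Hypothetical session: each US follows a CS onset by 8 s, plus one spurious US.
cs_onsets = [10, 50, 90]
us_onsets = [18, 35, 58, 98]

# 10-s window: every CS shares a trial with its US.
print(conditional_probs(cs_onsets, us_onsets, 120, 10))  # (1.0, 1/9)
# 5-s window: no CS shares a trial with a US, so the CS looks like a safety signal.
print(conditional_probs(cs_onsets, us_onsets, 120, 5))   # (0.0, 4/21)
```

With the same events, the 10-s window yields a strongly positive contingency while the 5-s window yields a negative one, which is why the window must be stated as an explicit assumption.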
3. The Contingency Algorithm
This computational constraint on when animals will and will not learn an association can be translated into an algorithm or abstract mechanism that gives rise to that computation. A number of researchers [including Rescorla & Wagner (1972), Dickinson (1980), Pearce & Hall (1980), Wagner (1981), and Mackintosh (1983)] have developed algorithm-level theories of learning to capture the major effects of contingency in classical conditioning; we briefly review aspects of these theories in Sections IV,B and V. We propose an algorithm based on Bayes’s rule of induction (Bayes, 1763; Pearl, 1982): The algorithm makes use of precisely the inputs that the animal receives in a classical conditioning situation, together with the animal’s expectations of what will occur and, in a natural trial-by-trial fashion, assigns incrementally changing associative strengths to various candidate CS-US pairings. [Indeed, Bayes’s centuries-old rule corresponds closely to Rescorla’s original (1968) characterization of the computational constraint for contingency in rats, that is, that the positive association should be learned only if the probability of the US given the occurrence of the CS is greater than the probability of the US occurring without the CS. It is compelling to note that the two were arrived at entirely independently, yet both were designed to account for inductive learning: one in mathematical philosophy and one in animal learning.] Section IV,B shows that this algorithm performs as it should; that is, it learns in the appropriate presentation conditions. Furthermore, the algorithm yields the Kamin (1968) blocking phenomenon in a very natural way as a side effect of its operation (see Section V,A). Finally, the algorithm requires no counterintuitive calculations on the part of the animal; rather, it is a simple calculation that neural circuits could plausibly perform.
4. The Neurobiological Implementation of Contingency
These new computational and algorithm-level characteristics in turn pose a set of necessary constraints that must be satisfied by candidate biological mechanisms that are proposed to underlie classical conditioning; hence, the characterization may aid in narrowing the neurobiological search for such candidate mechanisms (at the implementation level). The characterization could in principle have been derived (in a bottom-up fashion) from the known circuitry, but so far analysis of the operation of neurobiological circuits has not given rise to computational constraints for contingency; the laws of animal contingency learning nonetheless constitute a necessary condition for a complete test of the validity of any proposed circuit for classical conditioning. Using this three-level analysis thus gives us a tool for distinguishing among viable and nonviable candidates for biological mechanisms underlying classical conditioning and, furthermore, for potentially distinguishing among possible different variations of classical conditioning that may occur in different taxonomic categories (taxa) of animals (e.g., different orders, classes, phyla, species). For example, Hawkins and Kandel (1984) present evidence that the invertebrate Aplysia performs associative learning (i.e., its response to the CS is altered by the CS being paired with a US), which raises the tantalizing possibility that this molluscan associative learning may be equivalent to mammalian classical conditioning. If so, then the Aplysia circuit (and the intact preparation) should exhibit only gently degraded learning in the partial (PW and PR) conditions specified above, but severely degraded learning in composite misinformation (C) conditions. If, however, this set of constraints does not hold in Aplysia, then this would indicate that there exist important differences between molluscan and mammalian conditioning. This in turn would suggest that these invertebrates may be performing some related but distinct algorithm for associative learning.⁶ The results of this analysis could potentially have strong implications that might limit the usefulness of certain taxa (e.g., phyla, classes, orders) of animals as valid models of higher mammalian learning phenomena by rigorously distinguishing between characteristics of associative learning in different taxa of animals.
D. PARTIAL SUMMARY
The computational analysis of contingency gives rise to a mathematical distinction between true contingency and partial approximations of it, identification of a new experimental condition (partial warning) in classical conditioning and an experimentally testable prediction about its characteristics (i.e., that learning should occur throughout variations in probability of spurious USs), as well as a new method of analysis of neurobiological circuits proposed to underlie classical conditioning. The key point here is that in the absence of this kind of computational analysis there would be no principled way to tell whether any particular proposed algorithm, theory, or circuit for classical conditioning is correct. Using the analysis, we can now specify what constraints any such proposal must satisfy in order to be an adequate candidate mechanism for contingency in classical conditioning.
⁶The other alternative is, of course, that the Rescorla (1968) computational constraint is in error; since this constraint has been extensively tested (e.g., Rescorla, 1967, 1968, 1972; Rescorla & Wagner, 1972; Gibbon et al., 1974; Mackintosh, 1974), we assume for now that the constraint is correct. Complete experimental validation of the constraint will depend on the testing of the partial warning case.
III. Background: Historical Perspective on Contingency
A. EXPERIMENTAL RESULTS ON CONTINGENCY
An examination of the development of the contingency constraint indicates that its roots lie in the notion of contiguity: When two events follow each other closely, animals tend to form excitatory associations. Spence (1936) details a somewhat more fleshed-out approach, explaining that an association between two events (e.g., a CS and a US) is a function of the number of times they occur together versus the number of times they do not; the former strengthens while the latter weakens an excitatory association between the two events. In terms of joint (not conditional) probabilities, this constraint on learning is $p(CS, US) > p(CS, \overline{US}) + p(\overline{CS}, US)$. The next development in the contingency constraint was a sequence of experiments revealing that under partial reinforcement conditions animals form excitatory associations similar to those elicited by a contingency based on perfect pairings (Fitzgerald, 1963; Wagner et al., 1964; Thomas & Wagner, 1964; Brimer & Dockrill, 1966). A few years later, experiments explicitly aimed at exploring the space of possible contingencies led Rescorla to form the characterization that if $p(US \mid CS) > p(US \mid \overline{CS})$, then excitatory conditioning occurs, if $p(US \mid CS) < p(US \mid \overline{CS})$, then inhibitory conditioning occurs, and if $p(US \mid CS) = p(US \mid \overline{CS})$, then neither type of conditioning occurs (Rescorla, 1966, 1967, 1968, 1969). This newly formulated constraint of contingency supplanted the existing notion that simple contiguity (i.e., the number of CS-US pairings) was the measure of associative learning in classical conditioning. This had far-reaching implications for the proper control procedures in classical conditioning (Rescorla, 1967) and for the possible mechanisms that animals could be using to calculate the associative predictiveness of various cues in classical conditioning situations.
Rescorla’s seminal experiments studied a wide range of contingency conditions, denoted by a pair of numbers (N-M) corresponding to the values of $p(US \mid CS)$ and $p(US \mid \overline{CS})$. Presentation conditions tested by Rescorla (1966, 1967, 1968) were 0.0-0.0, 0.0-0.2, 0.0-0.4, 0.0-0.8, 0.1-0.0, 0.1-0.1, 0.2-0.0, 0.2-0.1, 0.2-0.2, 0.4-0.0, 0.4-0.1, 0.4-0.2, and 0.4-0.4. The partial reinforcement, composite, and inhibitory contingencies were also well explored by Rescorla and others (Hammond, 1967; Gamzu & Williams, 1971; Hearst & Franklin, 1977), confirming the contingency characterization. However, none of these experiments systematically tested partial warning contingencies, that is, those in which there are no spurious CSs, but there are spurious USs mixed with CS-US pairs. It is useful to observe some attributes of conditional probabilities in the presentation conditions of classical conditioning. For example, in all partial reinforcement conditions, $p(US \mid \overline{CS}) = 0$, since USs never occur without CSs in this condition, or, in other words, there are no spurious US presentations. It is the value of $p(US \mid CS)$ that may be varied. Hence, all partial reinforcement conditions are of the form N-0 (e.g., 0.8-0.0, 0.4-0.0, 0.2-0.0). All such points lie on the X-Z plane of the contingency space. Reciprocally, in all partial warning conditions, $p(US \mid CS) = 1$, since the US always occurs if the CS has; that is, there are no unpaired (spurious) CSs in this condition. This means that all partial warning values are of the form 1-N (e.g., 1.0-0.8, 1.0-0.4, 1.0-0.2). These points all lie on the Y-Z plane. Composite values (which lie in the space between the three planes) may be of the form N-M for any N and M (0 < N < 1, 0 < M < 1); those values for which associations will not be learned are those for which N = M (these lie on the saddle surface itself). Finally, the perfect pairings case is 1.0-0.0 [$p(US \mid CS) = 1$ and $p(US \mid \overline{CS}) = 0$]; these points lie on the Z axis. We list explicit $p(US \mid CS)$ to $p(US \mid \overline{CS})$ values in this section in order to illustrate clearly which categories of conditions have been tested and which have not.
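The N-M bookkeeping above can be summarized in a short sketch. The function name `classify_condition` and its string labels are hypothetical; the category rules follow the definitions in the text, and the predicted direction of learning is simply Rescorla's comparison of the two conditional probabilities. Treating N = 0 with M > 0 as a composite (completely unpaired) condition is an assumption of this sketch.

```python
def classify_condition(n, m):
    """Classify an N-M presentation condition, where
    n = p(US|CS) and m = p(US|no CS), both in [0, 1]."""
    if not (0.0 <= n <= 1.0 and 0.0 <= m <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")

    if n == 0.0 and m == 0.0:
        category = "null"                   # no USs are ever presented
    elif n == 1.0 and m == 0.0:
        category = "perfect pairings"
    elif m == 0.0:
        category = "partial reinforcement"  # form N-0: no spurious USs
    elif n == 1.0:
        category = "partial warning"        # form 1-M: no spurious CSs
    else:
        category = "composite"              # both kinds of spurious event

    # Rescorla's constraint: compare p(US|CS) with p(US|no CS).
    if n > m:
        prediction = "excitatory"
    elif n < m:
        prediction = "inhibitory"
    else:
        prediction = "none"
    return category, prediction


for cond in [(0.4, 0.0), (1.0, 0.4), (0.4, 0.4), (0.1, 0.4), (1.0, 0.0)]:
    print(cond, classify_condition(*cond))
```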
In human experimental settings, contingency has been studied mostly in response-outcome situations where $p(O \mid R)$ (the probability of the outcome given the response) and $p(O \mid \overline{R})$ lie within the partial reinforcement and composite contingency conditions. Allan and Jenkins (1980) found that when subjects were presented with response 1, response 2, and no-response alternatives, the subjects estimated the actual contingency accurately, provided there was a contingency [$p(O \mid R_1) \neq p(O \mid R_2)$]. In the absence of a contingency between response and outcome, subjects’ estimations were found to be related to the overall probability of the outcome. The contingencies they investigated included [$p(O \mid R_1)$ to $p(O \mid R_2)$] values equaling 0.1-0.3, 0.1-0.5, 0.1-0.9, 0.2-0.8, 0.5-0.8, 0.5-0.9, and 0.7-0.9. Wasserman, Chatlosh, and Neunaber (1983) studied the effects of discrete versus continuous responses and temporal regularity in contingency perception during free operant procedures. They investigated the nine combinations of $p(O \mid R)$ = 0.125, 0.500, 0.875 crossed with $p(O \mid \overline{R})$ = 0.125, 0.500, 0.875. They found that subjects’ ratings of the contingency were strongly correlated with the actual contingency presented. Shanks (1985) found that contingency judgments increased toward a positive
asymptote when the actual contingency was positive and toward a negative asymptote when the contingency was negative. He investigated $p(O \mid R)$ to $p(O \mid \overline{R})$ values of 0.25-0.25, 0.25-0.75, 0.75-0.25, and 0.75-0.75. These contingencies lie within the composite condition, supporting the validity of the contingency characterization within that condition, though they add no new data to the partial reinforcement or partial warning cases.
B. THEORETICAL RESULTS ON CONTINGENCY
Rescorla and Wagner (1972), Wagner and Rescorla (1972), Mackintosh (1975), and Pearce and Hall (1980) have described algorithms for the computation of contingency effects. Each of these models is based on parameters (such as the innate salience of each cue in the environment and the innate salience of the US) which are used to describe the change in associative strength between a CS and US as a result of repeated pairings. The Rescorla-Wagner (1972) model assumes that each US can support only a limited association strength for which co-occurring CSs compete, and the effectiveness of the US in advancing conditioning is inversely related to the degree to which it is predicted by the stimuli occurring on a given trial. An effective signal can greatly reduce the effectiveness of the US for other stimuli and thereby result in blocking their learning. In contrast, attentional models such as those of Mackintosh (1975) and Pearce and Hall (1980) assume that conditionability or salience of the CS varies with the degree to which the US is predicted. Blocking results from a variation in CS processing rather than a reduction in US processing. These features allow attentional models to account for more of the data on blocking and latent inhibition. Several researchers have designed mathematical models at the implementational level which address the contingency constraint. Sutton and Barto (1981) utilize a neuron-like element which computes an output based on a function of its weighted inputs. The process of adjusting the weights is designed to allow the model to replicate the general characteristics of the reported data on contingency, the effect of the interstimulus interval on conditioning, blocking, and higher-order conditioning. Their work includes a sizable discussion of the inherent mathematical and implementational constraints on the design of any model at this level. Other representative mathematical, implementational models are based on work on perceptrons (Rosenblatt, 1962), which were simple neuron-like elements. For example, Hampson and Kibler (1983) demonstrate how a small, layered network of these elements may compute any arbitrary Boolean function of its inputs. They present completeness and correctness results and explain how such a model may account for the main effects of contingency learning and blocking.
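The Rescorla-Wagner account of blocking summarized above can be made concrete with a minimal sketch of its standard update rule, in which every cue present on a trial changes by the learning rate times the difference between the strength the US supports and the summed strength of the cues present. The parameter values, the single shared learning rate, and the trial lists are assumptions made for illustration; the published model allows separate salience parameters per cue and per US.

```python
def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    Each trial is (cues_present, us_present). Returns the final
    associative strength of every cue seen.
    """
    strengths = {}
    for cues, us_present in trials:
        total = sum(strengths.get(c, 0.0) for c in cues)
        target = lam if us_present else 0.0
        error = target - total          # shrinks as the US becomes predicted
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + alpha * beta * error
    return strengths


# Blocking group: A is pretrained alone, then the compound AB is reinforced.
blocking = [({"A"}, True)] * 50 + [({"A", "B"}, True)] * 50
# Control group: only the compound AB is reinforced.
control = [({"A", "B"}, True)] * 50

v_block = rescorla_wagner(blocking)
v_ctrl = rescorla_wagner(control)
print(round(v_block["B"], 4), round(v_ctrl["B"], 4))
```

In the blocking group the pretrained cue A already predicts the US, so the error on compound trials is near zero and B acquires almost no strength; in the control group A and B share the available strength equally.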
Alkon (1980), Hawkins and Kandel (1984), and Chang and Gelperin (1980) have all investigated the neural substrates of associative learning in invertebrate preparations and made claims about the extent to which these circuits and preparations actually perform classical conditioning like mammals. In particular, Hawkins and Kandel (1984) speculate about the ways in which aspects of conditioning might emerge from lower-level processes. They do not distinguish between (1) explicit constraints of contingency-based classical conditioning versus (2) simple associative learning in which a response to a CS is altered by its pairings with a US. For example, they claim (p. 387) that “if unannounced [i.e., spurious] USs occur between pairing trials, the ability of the CS to predict the US is reduced and learning degenerates. . .” (Rescorla, 1968). But this does not distinguish whether the reported degradation of learning corresponds to the gentle degradation of partial conditions or the severe degradation of composite conditions. We will discuss this and some related problems in more detail in Section IV,C,3. With these results in mind we have attempted to seek a uniform way to evaluate how these human and animal behaviors, mathematical algorithms, neurobiological circuits, and computer models are related to each other. Our intent is to provide an account that is both rigorous and understandable of some major aspects of the computational, algorithmic, and implementation attributes of contingency in classical conditioning. The following sections provide a more detailed view of our progress so far.
IV. Detail: The Contingency Computation, Algorithm, and Implementation
A. THE CONTINGENCY COMPUTATION
1. The Theoretical Formulation
As already described, Rescorla’s (1968) computational constraint of contingency is that a specific presented positive CS-US association will be learned only if the probability of that US given that CS is greater than the probability of the US without the CS or, formally, $p(US \mid CS) > p(US \mid \overline{CS})$. Reciprocally, safety signals, that is, CSs denoting the absence of a US, are learned only if $p(US \mid CS) < p(US \mid \overline{CS})$. Church (1969) and Gibbon et al. (1974) diagram the “space” of contingency-based learning by first plotting those areas in a plane corresponding to an association being learned and the association not being learned, according to this formula. The two axes in Fig. 3 denote the likelihood that the US will occur given the CS (Y axis) versus the likelihood that the US will occur given no CS (X axis).
Fig. 3. Church-Gibbon contingency plane. [Vertical axis: $p(US \mid CS)$, from 0 to 1.]
Above the diagonal line through the plane, the association will be learned (e.g., a tone CS signals a shock US); below the line, the opposite of the association will be learned (e.g., the tone is a safety signal that the shock will not occur). In both cases, the relative conditional probability constraint holds. On the diagonal itself, the probability of the US given the CS is equal to the probability of the US given no CS, so presentation conditions along that line will prevent the animal from learning any positive or negative CS-US association. In this plane, we may also represent points corresponding to particular trial presentation conditions.⁷ For example, in Fig. 3, four points are presented corresponding to an example of a partial reinforcement condition (point 1), composite misinformation (point 2) and partial warning (point 3) conditions, and the null condition (no CSs or USs presented; point 4). (No perfect pairings condition is labeled.) Points 1, 2, and 4 correspond to those trial conditions used by Rescorla (1968); our experiment in progress includes a replication of those three points and the addition of point 3, which represents a partial warning condition. In this Church-Gibbon plane, the perfect pairings condition is the point at (1,0) [i.e., $p(US \mid CS) = 1$ and $p(US \mid \overline{CS}) = 0$; the upper left corner], the partial reinforcement condition corresponds to the left vertical axis, partial warning corresponds to the top (horizontal) axis, and the rest of the square corresponds to the class of composite conditions. In the same paper described above, Gibbon et al. (1974) expand their analysis to a three-dimensional space. Building on this work, we re-present and extend this analysis⁸ by mapping the contingency results into a Cartesian three-dimensional space (Fig. 4) in which the three axes correspond to joint (not conditional) probabilities: The Z axis is the probability of the CS and US both occurring [i.e., the probability of CS-US pairs: $p(CS, US)$], X is the probability of the CS and not US occurring [the probability of spurious (unpaired) CS trials mixed in: $p(CS, \overline{US})$], and Y is the probability of the US and not CS occurring [the probability of spurious (unpaired) USs: $p(\overline{CS}, US)$].⁹ These three joint probabilities must add to a total probability ≤ 1, so the overall space used to represent all possible sets of trial presentation conditions corresponds to the truncated cube bounded by the Z, X, and Y axes and the plane $X + Y + Z = 1$. Using this three-dimensional space, we can diagram the true contingency constraint, which appears as a “saddle” shape in the space (Fig. 4).
Fig. 4. The contingency constraint. [Axes: $Z = p(CS, US)$, probability of CS-US pair; $Y = p(\overline{CS}, US)$, probability of spurious US.]
2. Interpretation of the Contingency Space
This contingency space can be broken down into regions that correspond to the four presentation conditions identified earlier (Section II,B). The perfect pairings condition is that in which no spurious (unpaired) CSs or USs occur: This corresponds to the Z axis itself (Fig. 5a). The partial warning condition is the plane defined by the Z and Y axes (along the left side of the space), since this is the set of cases in which both perfect pairings (Z) and spurious USs (Y) are included, but no spurious CSs are included, so X must have a value of 0 (Fig. 5b). The partial reinforcement condition is the plane defined by the Z and X axes (the right side), since this condition includes pairs and spurious CSs, but no spurious USs (Fig. 5c). Finally, the composite misinformation condition is all of the space between these planes. [The bottom plane, defined by the X and Y axes with a Z value of 0 (Fig. 5d), would correspond to a completely unpaired condition, i.e., no pairings, only presentations of spurious USs and spurious CSs. This special case of the larger composite misinformation category is one in which the negative safety signal interpretation of the CS will be readily learned, though the positive CS-US association will not.] The actual area in which learning of a positive CS-US association is predicted to occur [by the Rescorla (1968) computational constraint of relative conditional probabilities] is behind the saddle surface, that is, within the area bounded by the surface and the Z axis. Within this area, the probability of the US given the CS is greater than the probability of the US without the CS [$p(US \mid CS) > p(US \mid \overline{CS})$]. In front of the saddle, the opposite of the CS-US association will be learned; since $p(US \mid CS) < p(US \mid \overline{CS})$, the CS is learned to be a safety signal, indicating that the US will not occur. The saddle surface itself corresponds to the points at which $p(US \mid CS) = p(US \mid \overline{CS})$: Directly on (and in the immediate vicinity of) the surface, the CS will be learned to be unassociated with the US [this corresponds to a truly random control procedure, as discussed by Rescorla (1967, 1972)].
⁷Note that all the points represented in the figure are above the noncontingent line, simply denoting that the four conditions illustrated here were positive-association conditions, that is, conditions in which the CS indicates that the US is coming, as opposed to safety signal conditions, which would appear below the line; such conditions have been tested, but they are not illustrated here.
⁸Our saddle graph of contingency was developed before we had seen the derivation by Gibbon et al. (1974); we are gratified that we have independently arrived at compatible sets of results.
⁹The three axes of this space represent joint probabilities; the Rescorla constraint plotted in the space represents a comparison between two conditional probabilities [probability of US given the CS greater than the probability of US given the absence of the CS, or $p(US \mid CS) > p(US \mid \overline{CS})$]. These two types of probabilities are distinct from each other and are related as follows: $p(B \mid A) = p(A, B)/p(A)$. Furthermore, both of these types of probabilities are distinct from marginal probabilities (e.g., the percentages of CSs and USs over trials). It is quite possible, for example, to change the percentage of CSs and USs in a set of trials without changing either the joint probability of CSs and USs [$p(CS, US)$] or the conditional probability of a US given a CS [$p(US \mid CS)$]. Similarly, two different values of the conditional probability of the US given the CS [$p(US \mid CS)$] could correspond to a single value of their joint probability [$p(US, CS)$]. In general, varying the number of CSs or USs will not necessarily change the conditional or joint probabilities.
Fig. 5. Regions of the contingency space: (a) perfect pairings contingency; (b) partial warning contingency; (c) partial reinforcement contingency; (d) completely unpaired contingency.
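The interpretation of the contingency space given above can be sketched in code. The function names and the tolerance are hypothetical. The sketch converts a point (X, Y, Z) of joint probabilities into the two conditional probabilities and reports on which side of the saddle it lies; it uses the fact that, writing W = 1 - X - Y - Z for the probability that neither event occurs, the sign of $p(US \mid CS) - p(US \mid \overline{CS})$ equals the sign of ZW - XY whenever both conditionals are defined.

```python
def conditionals(x, y, z):
    """Map joint probabilities X = p(CS, no US), Y = p(no CS, US),
    Z = p(CS, US) to (p(US|CS), p(US|no CS)); None where undefined."""
    w = 1.0 - x - y - z                  # p(no CS, no US)
    if min(x, y, z) < 0.0 or w < -1e-12:
        raise ValueError("point lies outside the contingency space")
    p_cs = x + z                         # marginal probability of the CS
    p_no_cs = 1.0 - p_cs
    p_us_cs = z / p_cs if p_cs > 0 else None
    p_us_no_cs = y / p_no_cs if p_no_cs > 0 else None
    return p_us_cs, p_us_no_cs


def side_of_saddle(x, y, z, tol=1e-9):
    """Report where (X, Y, Z) lies relative to the saddle surface."""
    w = 1.0 - x - y - z
    diff = z * w - x * y                 # same sign as p(US|CS) - p(US|no CS)
    if abs(diff) <= tol:
        return "on the surface (no association)"
    if diff > 0:
        return "inside (positive association)"
    return "outside (safety signal)"


print(side_of_saddle(0.1, 0.0, 0.3))   # partial reinforcement plane
print(side_of_saddle(0.3, 0.2, 0.2))   # a 0.4-0.4 condition
print(side_of_saddle(0.3, 0.3, 0.0))   # completely unpaired
```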
Recall that each point on the Church-Gibbon plane corresponds to a different potential testing condition: The four points on Fig. 3 correspond to a partial reinforcement condition (point 1), composite misinformation (point 2), partial warning (point 3), and the null condition (no CSs or USs presented; point 4). These points are typically denoted by the values of the two conditional probabilities to be compared: the probability of the US given the CS, and the probability of the US in the absence of the CS. For instance, point 1 corresponds to 0.4-0, that is, $p(US \mid CS) = 0.4$ and $p(US \mid \overline{CS}) = 0$. Similarly, point 2 corresponds to 0.4-0.4, point 3 corresponds to 1.0-0.4, and point 4 is 0-0. A notable aspect of the three-dimensional saddle graph is the way in which it corresponds to the Church-Gibbon contingency plane: All the above presentation conditions, which are points in the plane, correspond to line segments in the three-dimensional contingency space. This is because, by the laws of conditional probability,
$$p(US \mid CS) = \frac{p(CS, US)}{p(CS)} = \frac{p(CS, US)}{p(CS, US) + p(CS, \overline{US})}$$
[since the marginal probability $p(CS)$ in the denominator is simply equal to the sum of its joint probabilities with or without the US]. Now each of the three joint probabilities in the resulting equation corresponds directly to a value in the three-dimensional space, so we have
$$p(US \mid CS) = \frac{Z}{X + Z}$$
Similarly,
$$p(US \mid \overline{CS}) = \frac{Y}{(1 - X - Y - Z) + Y} = \frac{Y}{1 - X - Z}$$
Setting either of these two values, say, $Z/(X + Z)$, to a particular constant value such as 0.4 defines a plane segment in the contingency space. Similarly, setting $Y/(1 - X - Z) = 0$ defines another plane segment; the intersection of these two planes is a line segment. Individual Church-Gibbon squares also correspond to plane segments in this space, and the intersection of a particular Church-Gibbon square with the 0.4-0 line segment corresponds to a point. The set of all such points in the square makes up the 0.4-0 line segment in the space. Different Church-Gibbon squares in the space correspond to different settings of $p(\overline{CS}, \overline{US})$, the probability of nonpresentations of either the CS or US (see Sections IV,B and V,E). (Hence, this standard method of specifying trial conditions is underspecified; a single specification such as 0.4-0 refers to a large number of different trial conditions. This leads to some counterintuitive predictions that are discussed further in Section V,E.)
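The correspondence between an N-M specification and a line segment in the joint-probability space can be checked numerically. In the sketch below the function name `condition_line` and the choice to parameterize the segment by the marginal probability of the CS are assumptions made for illustration; the code generates several points for one N-M condition, each of which reproduces the same pair of conditional probabilities.

```python
def condition_line(n, m, steps=5):
    """Return points (X, Y, Z) for the N-M condition with
    n = p(US|CS) and m = p(US|no CS), sweeping p(CS) over (0, 1)."""
    points = []
    for k in range(1, steps + 1):
        p_cs = k / (steps + 1)           # marginal probability of the CS
        z = n * p_cs                     # p(CS, US)
        x = (1.0 - n) * p_cs             # p(CS, no US): spurious CSs
        y = m * (1.0 - p_cs)             # p(no CS, US): spurious USs
        points.append((x, y, z))
    return points


# Every point on the 0.4-0 segment gives Z/(X+Z) = 0.4 and Y/(1-X-Z) = 0.
for x, y, z in condition_line(0.4, 0.0):
    print(round(x, 3), round(y, 3), round(z, 3),
          round(z / (x + z), 3), round(y / (1.0 - x - z), 3))
```

Because X, Y, and Z are each linear in the marginal probability of the CS, the generated points are collinear, which is one way to see why a single specification such as 0.4-0 covers many distinct trial conditions.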
Figure 6 illustrates the line segments in the contingency space comprising the saddle surface that corresponds to the diagonal in the Church-Gibbon square. Each individual line segment represents a point on the Church-Gibbon diagonal. Figure 7 illustrates some of the presentation conditions tested by Rescorla (1968), with one partial warning condition (1.0-0.4) added. The conditions in which $p(US \mid CS) = p(US \mid \overline{CS})$ (0.4-0.4, 0.2-0.2, 0.1-0.1) are those that lie directly on the saddle surface. The 1.0-0.4 condition lies entirely within the partial warning plane. The 0.4-0, 0.2-0, and 0.1-0 conditions all lie entirely on the partial reinforcement surface. A 0.4-0.1 case would be in the space, on the inside of the saddle surface (since this condition enables learning of the positive association); a 0.1-0.4 case would lie outside the surface (in the negative-association area of the composite space). There are interesting consequences of these lines; Section V,E explores these issues.
Fig. 6. Presentation lines comprising the contingency saddle surface. [Axes: $Z = p(CS, US)$, probability of CS-US pair; $X = p(CS, \overline{US})$, probability of spurious CS; $Y = p(\overline{CS}, US)$, probability of spurious US.]
Fig. 7. Specific presentation condition lines in contingency space. [Axes: $Z = p(CS, US)$, probability of CS-US pair; $Y = p(\overline{CS}, US)$, probability of spurious US. The 1.0-0.4 line is labeled.]
B. THE CONTINGENCY ALGORITHM
1. Inputs and Outputs of Contingency Algorithms
An algorithm for contingency must account for an animal’s transformation of the inputs that are presented (e.g., trial sequences) into output categorizations of stimuli. The output categorizations can be thought of as differential assignments of associative strengths to different candidate CS-US pairs, where higher strengths would indicate that the animal “believes” that the CS leads to the US (i.e., has acquired the association), lower strengths that the association has not been learned (or, perhaps more accurately, that the CS has been learned to be uncorrelated with the US), and negative strengths that a negative association has been learned (the CS reliably signals the absence of the US). [Furthermore, we will show in Section IV,B,5 that “context” cues can be formally distinguished from other learned correlated and uncorrelated stimuli. Essentially, the result presented there shows that an animal should be able to tell whether a particular stimulus behaves like a context cue with respect to some particular US by not only determining the level of correlation of the cues but also noting that the context cue occurs extremely often, i.e., that $p(CS) = 1$ for context cues.] In summary, the logical categories of output relationships that the animal can learn to discern are positive predictions, negative predictions, and uncorrelated cues; and context cues can be distinguished from other cues (see Section IV,B,5). The inputs available to the animal are occurrences of features in the environment.¹⁰ For simplicity (see Table II), we can categorize the logically possible pairwise combinations of two arbitrary feature events F1 and F2 (which, for classical conditioning, correspond to the CS and US, respectively). Either (1) F1 occurs and then F2 occurs (which we will term a successful prediction), (2) F1 occurs and then F2 does not occur (error of commission; i.e., the environment has committed an erroneous prediction of the F1-F2 sequence by the occurrence of an F1 event without F2 following it), (3) F1 does not occur and then F2 does occur (error of omission; i.e., F1 is omitted from the F1-F2 sequence), or (4) neither F1 nor F2 occurs (which we refer to as a nonprediction, or nonpresentation).¹¹ Predictions and nonpresentations (nonpredictions) both have the effect of strengthening the predictive value, or association, between F1 and F2 (since they either appeared together or failed to appear together), while errors of commission and omission weaken the association. This implies that learning occurs in part in the absence of stimuli, since a nonpresentation is the absence of either CS or US. Some implications of this are discussed further in Section V,E.
¹⁰ Note that we make the simplifying assumption that event occurrences may be described in terms of discrete time and trials. This is a common assumption in the learning literature (Rescorla & Wagner, 1972; Mackintosh, 1975; Pearce & Hall, 1980); see Section V,E,2 for a discussion of some of the implications of this assumption.
¹¹ Nonpredictions (nonpresentations) are simply the absence of the two features; if all such nonpredictions were counted, there would be a huge, ongoing number. All algorithms must systematically undercount the “true” number of nonpredictions. The method proposed as part of our algorithm (Section IV,B,5) is to consider a nonprediction to have happened only when F2 has been predicted but did not occur. The issue of the role of nonpredictions in contingency algorithms remains a crucial one, since the conditional probabilities at the heart of the computational constraint cannot be said to have been calculated without nonpredictions being taken into account (see Section V,E).
TABLE II
POSSIBLE COMBINATIONS OF F1 AND F2

              F2 present                       F2 absent
F1 present    (++) Successful prediction (s)   (+-) Error of commission (c)
F1 absent     (-+) Error of omission (o)       (--) Nonprediction (n)
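The four categories of Table II can be tallied mechanically from a sequence of trials. The following is a minimal sketch in Python, assuming each trial is recorded as a pair of Booleans (F1 occurred, F2 occurred); the names `classify_trial` and `tally` are illustrative and are not taken from the authors' simulation.

```python
from collections import Counter


def classify_trial(f1_occurred: bool, f2_occurred: bool) -> str:
    """Map one trial onto the Table II categories s, c, o, n."""
    if f1_occurred and f2_occurred:
        return "s"  # successful prediction (++)
    if f1_occurred:
        return "c"  # error of commission (+-)
    if f2_occurred:
        return "o"  # error of omission (-+)
    return "n"      # nonprediction (--)


def tally(trials):
    """Count how often each category occurs in a list of (F1, F2) trials."""
    counts = Counter({"s": 0, "c": 0, "o": 0, "n": 0})
    for f1, f2 in trials:
        counts[classify_trial(f1, f2)] += 1
    return dict(counts)


# Hypothetical five-trial sequence, for illustration only.
trials = [(True, True), (True, False), (False, True), (False, False), (True, True)]
print(tally(trials))
```

Note that this sketch counts every trial in which neither feature occurs as a nonprediction, whereas the text indicates that any practical algorithm must undercount them.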
2. Rescorla’s Interpretation of Contingency
Rescorla (1968) offers an algorithmic interpretation of the contingency data by suggesting that two separate, opposing processes are at work: An excitatory association develops as a result of CS-US pairings, and an inhibitory association grows with each spurious US. In a partial reinforcement situation (CS-US pairs with spurious CSs), the excitatory association is formed due to the presence of CS-US pairs, but no inhibitory association is formed due to the lack of any spurious USs. In the composite condition (pairs, spurious CSs, and spurious USs), the occurrence of spurious USs results in an inhibitory association which can cancel the excitatory association. This account fails to attribute any effect to spurious CSs. The predicted net association resulting from a partial warning contingency (pairs plus spurious USs, with no spurious CSs) would therefore be a strongly inhibitory one: some excitatory association from CS-US pairs, but a potentially large inhibitory association arising from spurious USs. Put simply, since spurious CSs are hypothesized by this account to have little or no effect on the outcome of conditioning, no distinction is made between the composite and partial warning conditions. Learning is predicted to be severely degraded in both cases as spurious US trials are introduced. This contradicts the constraint that $p(\mathrm{US} \mid \mathrm{CS}) > p(\mathrm{US} \mid \overline{\mathrm{CS}})$, which predicts severe degradation in the composite condition, but only gentle degradation in partial warning (as in partial reinforcement).
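The constraint itself is simple to state computationally. The sketch below, in Python, computes both conditional probabilities from counts of the four trial types; the counts chosen for the three conditions are invented for illustration and are not data from Rescorla (1968), and the function name is hypothetical.

```python
def conditional_probabilities(pairs, spurious_cs, spurious_us, blanks):
    """Return (p(US|CS), p(US|no CS)) from counts of the four trial types.

    Assumes at least one trial with the CS and at least one without it,
    so that neither denominator is zero.
    """
    p_us_given_cs = pairs / (pairs + spurious_cs)
    p_us_given_no_cs = spurious_us / (spurious_us + blanks)
    return p_us_given_cs, p_us_given_no_cs


# Illustrative (made-up) schedules for the three misinformation conditions.
conditions = {
    "partial reinforcement": dict(pairs=20, spurious_cs=20, spurious_us=0, blanks=60),
    "partial warning": dict(pairs=20, spurious_cs=0, spurious_us=20, blanks=60),
    "composite": dict(pairs=20, spurious_cs=20, spurious_us=30, blanks=30),
}

for name, counts in conditions.items():
    p_cs, p_no_cs = conditional_probabilities(**counts)
    print(f"{name}: p(US|CS)={p_cs:.2f}, p(US|no CS)={p_no_cs:.2f}, "
          f"positive contingency={p_cs > p_no_cs}")
```

With these particular counts, only the composite schedule drives the two probabilities together; both partial schedules leave a positive contingency.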
3. The Rescorla-Wagner Algorithm and Partial Warning
Rescorla and Wagner (1972) propose an algorithm based on the idea that a single associative strength changes incrementally over trials. As the result of a particular trial, the associative strength of each of the components (A and X) of a stimulus compound (AX) is increased or decreased by an amount proportional to the difference between the asymptotic level of associative strength and the combined associative strength of A and X:

$$\Delta V_A = \alpha_A \beta (\lambda - V_{AX})$$

and

$$\Delta V_X = \alpha_X \beta (\lambda - V_{AX})$$

where $\alpha$ and $\beta$ correspond to salience measures of the CS (A or X) and US, respectively, and $\lambda$ is the highest (asymptotic) level of associative strength that the particular US is assumed to be capable of supporting (it is assumed that different USs will yield different $\lambda$ levels). Associative strengths of cues A, X, and of the AX compound are indicated by $V_A$, $V_X$, and $V_{AX}$. A crucial assumption underlying Rescorla and Wagner’s algorithm is that all potential CSs which could be conditioned to a single US are competing against each other for their share of the total available associative strength ($\lambda$). Rescorla and Wagner (1972) argue that this competition effect gives rise to a number of desirable features of the model, such as blocking, as found by Kamin (1968). The general line of reasoning in the analysis is that as one stimulus increases in predictive power over competing ones, the associative strength of the competitors is stolen by the associative strength of the predictive stimulus. The context (e.g., the conditioning chamber) is thought of here as yet another competing cue, and hence spurious USs could be thought of as strengthening the associative strength of the context, since treating the context as a candidate CS allows the view that spurious USs are occurring in the presence of the context CS.
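The incremental rule described above, in its standard Rescorla-Wagner form, can be sketched in a few lines of Python. This is an illustrative sketch rather than the simulation reported in Appendix B: the function name, the parameter values, and the use of a single US salience for both reinforced and nonreinforced trials are simplifying assumptions.

```python
def rescorla_wagner_trial(V, present, us_present, alpha, beta=0.2, lam=1.0):
    """Apply one Rescorla-Wagner update to the cues present on a trial.

    V          -- dict of associative strengths, modified in place
    present    -- list of cue names present on this trial
    us_present -- whether the US occurred on this trial
    alpha      -- dict of CS salience values
    beta       -- US salience (assumed equal on all trials in this sketch)
    lam        -- asymptote supported by the US; 0 is used when the US is absent
    """
    target = lam if us_present else 0.0
    v_total = sum(V[cue] for cue in present)  # combined strength, e.g. V_AX
    for cue in present:
        V[cue] += alpha[cue] * beta * (target - v_total)
    return V


# Blocking demonstration: pretrain A alone, then reinforce the AX compound.
alpha = {"A": 0.5, "X": 0.5}
V = {"A": 0.0, "X": 0.0}
for _ in range(200):
    rescorla_wagner_trial(V, ["A"], True, alpha)
for _ in range(50):
    rescorla_wagner_trial(V, ["A", "X"], True, alpha)
print(V)  # V["A"] ends near lam; V["X"] stays near zero
```

Because every present cue is updated against the same shared error term, a cue that already accounts for the US leaves almost nothing for a newly added cue to acquire, which is the competition effect the text describes. Treating the context as one more entry in `present` on every trial would allow spurious USs to be modeled in the same way.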
Rescorla and Wagner offer the argument that in a partial reinforcement condition, the US never occurs in the presence of the context without the CS also being present, so the context has no chance to “steal” associative strength from the CS. In contrast, in the composite misinformation condition, the US sometimes occurs in the presence of the context and sometimes in the presence of the CS plus context, and hence the context has opportunities to decrement the strength of the CS-US association. As before, problems arise when we try to apply this account to the partial warning condition. The US occurs often in the presence of the context with no CS, just as in the composite case, which should lead to the same strengthening of the context as in the composite case; at the same time, there are no more CS-US pairs in this condition than in either the partial reinforcement or composite conditions. This implies that the algorithm will not learn the CS-US association in the partial warning case; yet, in this case, conditioning is predicted by the contingency constraint, since $p(\mathrm{US} \mid \mathrm{CS}) > p(\mathrm{US} \mid \overline{\mathrm{CS}})$ in all partial warning conditions (1.0-0.6, 1.0-0.4, etc.). In fact, the only difference between this partial warning case and the composite case is the lack of spurious CSs in the former. Hence, an explanation of why learning occurs in one condition and not in the other can only rest on an account of how the existence of extra unpaired CSs can either strengthen the association between the context and the US or weaken the association between the CS and the US; and yet these unpaired CSs must not have this effect in the partial reinforcement condition! In other words, the authors’ account of the operation of this algorithm offers no way to provide a consistent explanation of why conditioning to the context should prevent learning in the composite condition, but not in the partial reinforcement or partial warning conditions.
It is still possible that this algorithm will predict learning correctly in the partial warning condition; this question may be quantitatively tested regardless of problems of interpretation of the qualitative account. We have performed simulations of the Rescorla-Wagner algorithm with a range of parameter settings (see Appendix B) which show that in the partial warning condition learning of the CS-US association is severely degraded with the addition of spurious USs. This indicates that, under the conditions we have tested (and have reported in Appendix B), the algorithm is predicting that the partial warning condition behaves like the composite condition rather than like the partial reinforcement condition. This is in contradiction to the Rescorla contingency constraint, which predicts the same gentle degradation in partial warning as in partial reinforcement. Since Rescorla and Wagner also offer a derivation showing that the algorithm should compute the precise Rescorla constraint, there appears to be an important discrepancy; further investigation of the relationship between this algorithm (representing the mechanism) and computation (which is its intended output) is called for.

4. Contingency vs Strengthening and Weakening
As in the case of possible discrepancies between Rescorla and Wagner (1972) and Rescorla (1968), it is often not obvious just what a particular algorithm will compute, so that it is often difficult to tell whether a particular algorithm conforms to the computational constraint of contingency. A number of algorithms proposed to simulate aspects of learning in general (though not classical conditioning in particular) do so by variants of a basic mechanism that strengthens an association upon successful pairings of the CS and US, and weakens the association on unsuccessful pairings, that is, when either the CS or US occurs unpaired (e.g., Spence, 1936; Anderson, 1983; Langley et al., 1983). We will show that this intuitively natural mechanism cannot be made to conform to the computational constraint of classical conditioning and cannot be an algorithm for this particular form of learning. Any account that depends on a linear strengthening/weakening algorithm (henceforth S/W algorithm) will correspond to an equation in which the incremental change in associative strength of a stimulus A ($\Delta V_A$) is an additive function of the three axes of the contingency space; that is, all S/W algorithms yield an equation of the form
Any such additive equation in this space will always and only give rise to a planar surface denoting the noncontingent boundary, that is, the boundary between learning positive and negative associations. Since this boundary will be a plane in the space for S/W algorithms, it can never be more than a planar approximation of the (nonplanar) saddle surface. For much of the space, the plane can be placed in such a way that it is a reasonable approximation of part of the saddle. This is only true, however, as long as either the partial reinforcement condition or the partial warning condition is ignored. These two conditions correspond to the areas of the saddle surface which curve away from the S/W plane. As long as these algorithms do not try to account for both partial reinforcement and partial warning, it is possible to present models of conditioning that generate planar approximations of either the left or the right portion of the true contingency saddle surface and correspondingly will approximate the predictions of either partial reinforcement or partial warning learning, but not both. Figure 8 shows two different placements of a strengthening/weakening plane that approximate the partial warning and partial reinforcement portions of the contingency surface, respectively. Once both the partial reinforcement and partial warning conditions together are taken into account, it will be seen that there can be no placement of the S/W plane that will serve as even the roughest approximation of the contingency saddle surface. The reason is simply that S/W algorithms do not differentiate between spurious CSs and spurious USs; all additions of misinformation to these algorithms are viewed as composite misinformation. It is by distinguishing among the types of misinformation (unpaired CSs, unpaired USs, composites) that the correct contingency computation can be achieved.

Fig. 8. S/W algorithms approximating (a) partial warning and (b) partial reinforcement. Axes: $x = P(\mathrm{CS}, \overline{\mathrm{US}})$ (probability of spurious CS); $y = P(\overline{\mathrm{CS}}, \mathrm{US})$ (probability of spurious US); $z = P(\mathrm{CS}, \mathrm{US})$ (probability of CS-US pair).

5. A New Algorithm for Contingency
Bayesian statistics (Bayes, 1763; Pearl, 1982; Skyrms, 1966) provide formulae for the calculation of two values in inductive logic: logical sufficiency (LS), which indicates the extent to which the presence of one event predicts or increases the expectation of another particular event; and, reciprocally, logical necessity (LN), which represents the extent to which the absence of an event decreases expectation or prediction of the second event. LS and LN are defined to be

$$LS = \frac{p(\mathrm{F1} \mid \mathrm{F2})}{p(\mathrm{F1} \mid \overline{\mathrm{F2}})}$$

and

$$LN = \frac{p(\overline{\mathrm{F1}} \mid \mathrm{F2})}{p(\overline{\mathrm{F1}} \mid \overline{\mathrm{F2}})}$$

If we consider F2 to be a US and F1 a CS, then note that when LS > 1, it is also true that $p(\mathrm{US} \mid \mathrm{CS}) > p(\mathrm{US} \mid \overline{\mathrm{CS}})$, and vice versa. Additionally, LS = 1 if and only if $p(\mathrm{US} \mid \mathrm{CS}) = p(\mathrm{US} \mid \overline{\mathrm{CS}})$, and LS < 1 iff $p(\mathrm{US} \mid \mathrm{CS}) < p(\mathrm{US} \mid \overline{\mathrm{CS}})$.¹² The values of LS and LN may be calculated by a pair of simple formulae composed of precisely the four possible input categories of pairwise feature occurrences given in Section IV,B,1:

$$LS = \frac{s(n + c)}{c(s + o)}$$

$$LN = \frac{o(n + c)}{n(s + o)}$$
where s is the count of successful predictions, c is errors of commission, o is errors of omission, and n denotes nonpredictions (nonpresentations). For each biologically salient cue (i.e., US) that the animal has learned about, the animal is assumed to be maintaining simple memories of these counts of successes, omissions, commissions, and nonpresentations¹³ from which LS and LN are derived as shown above. At any given time (e.g., on a particular trial), the animal calculates its level of expectation that the US might occur, based on these stored values.

¹² Since $p(\overline{\mathrm{CS}} \mid \mathrm{US}) = 1 - p(\mathrm{CS} \mid \mathrm{US})$ and $p(\overline{\mathrm{CS}} \mid \overline{\mathrm{US}}) = 1 - p(\mathrm{CS} \mid \overline{\mathrm{US}})$. Furthermore, it can be shown that LS > 1 if and only if LN < 1. However, it is not true in general that $|LN - 1| = |LS - 1|$ or that $1/LS = LN$.
¹³ With the proviso given in Section IV,B,1 that nonpresentations will be systematically undercounted.

The actual algorithm is as follows: Assume a number of candidate CSs (e.g., $\mathrm{CS}_1$, $\mathrm{CS}_2$, $\mathrm{CS}_3$) that have been experienced in conjunction with a particular US; then there are existing counts in memory for the associations of each of these $\mathrm{CS}_i$ with this particular US. At a given trial, assume some number of cues actually occur (e.g., $\mathrm{CS}_1$ and a new, as yet unobserved cue, $\mathrm{CS}_4$). Then the level of expectation of the US is calculated by multiplying the LS values of those cues that occurred (in this case, $\mathrm{CS}_1$ and $\mathrm{CS}_4$) with the LN values of those cues that did not occur (but had been seen before: $\mathrm{CS}_2$, $\mathrm{CS}_3$). This has the effect of combining the extent to which the cues that are present increase expectation of the US (LS) with the extent to which the cues that are absent decrease expectation of the US (LN). This illustrates the reason that we make use of separate values for LS and LN rather than only maintaining a single associative strength for each cue, as, for example, Rescorla and Wagner do: There is somewhat different information being learned about the effect of the absence of a cue than the information learned about the effect of its presence. It can also now be seen that this algorithm is not a competitive one in the sense that the Rescorla-Wagner algorithm is: As an individual cue gains associative strength with respect to the US in the Rescorla-Wagner algorithm, that cue is stealing associative strengths of other competing cues. In our algorithm, the LS and LN values of each cue progress independently of each other with respect to a US, and then all such values are used cooperatively to compute the level of expectation of the US at any given time.¹⁴ This use of LS and LN to compute levels of expectation of the US can be viewed in terms of the extent to which individual cues are being categorized by the animal as positive or negative predictive cues, as context cues, or as uncorrelated cues.
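The multiplicative combination just described can be sketched as follows in Python. The counts stored in `memory` are invented for illustration, the function names are hypothetical, and treating a cue that is not yet in memory as contributing a neutral factor of 1 is an assumption of this sketch (it matches the LS obtained when all four counts equal 1).

```python
import math


def logical_sufficiency(s, c, o, n):
    """LS = s(n + c) / (c(s + o)), from the four stored counts."""
    return s * (n + c) / (c * (s + o))


def logical_necessity(s, c, o, n):
    """LN = o(n + c) / (n(s + o)), from the four stored counts."""
    return o * (n + c) / (n * (s + o))


def expectation(memory, present):
    """Multiply LS of known cues that occurred by LN of known cues that did not."""
    level = 1.0
    for cue, counts in memory.items():
        if cue in present:
            level *= logical_sufficiency(**counts)
        else:
            level *= logical_necessity(**counts)
    return level  # cues absent from memory (e.g. CS4) contribute a factor of 1


# Hypothetical stored counts for three previously experienced cues.
memory = {
    "CS1": dict(s=8, c=2, o=2, n=8),  # LS = 4.0, LN = 0.25
    "CS2": dict(s=5, c=5, o=5, n=5),  # LS = 1.0, LN = 1.0
    "CS3": dict(s=2, c=8, o=8, n=2),  # LS = 0.25, LN = 4.0
}

# CS1 and the novel CS4 occur; CS2 and CS3 are absent.
print(expectation(memory, {"CS1", "CS4"}))
```

Each cue's counts are read independently of every other cue's, which is the sense in which the combination is cooperative rather than competitive.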
LS values range from 0 to ∞, with high LSs corresponding to a particular feature ($\mathrm{CS}_i$) strongly predicting a second feature (the US), since high LS implies a high ratio of successes to errors of commission, and very low LSs corresponding to the case where the CS implies that the US will not occur (low ratio of successes to commissions). Hence, for a high LS value, $\mathrm{CS}_i$ is a positive predictor of the US; for low LS, $\mathrm{CS}_i$ is a negatively predictive cue, that is, the presence of this CS predicts that the US will not occur. An LN value near 1 indicates that the absence of a cue may be ignored, while a low LN value (near zero) indicates that presence of the cue is necessary for prediction. When the value of LS is approximately 1, that is, neither very high nor very low, then the CS cue is uncorrelated. A context cue, that is, one that occurs with an extremely high frequency, may be identified by simply computing p(CS): When this is approximately equal to 1, the cue is appearing almost all the time (in every trial) and is a candidate context cue. Calculation of this probability is straightforward: p(CS) = (s + c)/(s + c + o + n).

¹⁴ Our “cooperative” algorithm also proposes new Boolean combinations (conjunctions and disjunctions) of features as independent cues; these composite cues then build up their own LS and LN values independently of their constituents. This is described in some detail in Granger, Schlimmer, and Young (1986) and Granger and Schlimmer (1985).

Again, what is being described is a way in which the use of LS and LN for calculating levels of expectation of a US can be viewed as an approximate categorization of cues by their predictiveness. This view can be summarized as follows: F1 will be classified as a positive cue if LS ≫ 1, LN < 1; negative cue if LS ≪ 1, LN > 1; uncorrelated if LS ≈ LN ≈ 1; and context if p(CS) ≈ 1. These categories roughly capture how the animal’s behavior will reflect its internal LS and LN values (and hence its level of expectation of the US). It is not the case that any given cue is necessarily categorized “all or none” as either, say, a positive cue versus a context cue. Any given cue is more usefully viewed as having attributes of a number of these categories, so that a particular cue may be viewed as, for instance, a weak positive predictor (say .4) and a somewhat stronger context cue (.6). The actual levels of expectation calculated from LS and LN values by the algorithm are the true internal measurements of what has been learned.
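The three quantities used in this categorization can be computed together from the four counts. The sketch below, in Python, applies the formulae to the counts listed in Table III for three of the cues; the function name is hypothetical, and no classification thresholds are imposed because the text treats the categories as graded rather than all-or-none.

```python
def cue_statistics(s, c, o, n):
    """Return (LS, LN, p(CS)) from success, commission, omission, and nonprediction counts."""
    ls = s * (n + c) / (c * (s + o))
    ln = o * (n + c) / (n * (s + o))
    p_cs = (s + c) / (s + c + o + n)
    return ls, ln, p_cs


# Counts (s, c, o, n) as listed for these cues in Table III.
table_iii = {
    "cage": (52, 11, 1, 1),
    "tone": (52, 7, 1, 5),
    "buzz": (19, 4, 34, 8),
}

for cue, counts in table_iii.items():
    ls, ln, p_cs = cue_statistics(*counts)
    print(f"{cue}: LS={ls:.2f}, LN={ln:.2f}, p(CS)={p_cs:.2f}")
```

Cage and tone have identical success counts, yet tone's smaller commission count and larger nonprediction count give it the higher LS and lower LN, while cage's p(CS) is the closest to 1 of the three.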
6. Gathering Evidence: Incremental Operation of the Algorithm
We have constructed a computer simulation of the algorithm to illustrate its operation [Granger et al. (1986) and Granger & Schlimmer (1985) contain further discussion of the algorithm and the computer simulation]. This section provides a brief overview of the operation of the program. All counts in memory are initially set to 1.¹⁵ These counts are updated only when a memory trace (corresponding to a feature complex) is triggered by matching cues in the environment, at which point the matched trace becomes the source of predictions of what will happen and what behaviors are associated with these predictions. This trace is matched against new events. When a prediction succeeds, the success scores of matched features in the environment are incremented. Cues failing to match receive incremented omission scores. If a prediction fails, each cue feature that matched the environment scores a commission; each cue feature that was absent from the environment, a nonprediction. Novel features present in the environment are added with an initial score of 1 commission, 1 successful prediction, 1 omission, and 1 nonprediction.
Assume a situation where tones, lights, noises, and shocks are occurring. The program’s task is to construct a memory record which will allow it to predict the occurrence of the shock accurately (presumably in order to avoid it). Specifically, given a positive contingency situation, that is, one in which the shock is reliably preceded by a conjunction of features (e.g., tone and light), a table representing a portion of memory about the shock will look similar to Table III. [Note that successes are indicated by (++), commissions by (+-), omissions by (-+), and nonpredictions by (--). The figures in Table III are taken from runs of an early version of our computer model.] To reiterate, the LS (logical sufficiency) value indicates the degree to which a cue is sufficient to cause expectation of a result feature, with values greater than 1 indicating a positive contribution to expectation. The LN (logical necessity) value indicates the degree to which absence of a cue precludes expectation of a result feature. An LN value near 1 indicates that absence of a cue may be ignored, while an LN value near zero indicates that a cue is necessary for expectation. [An interesting sidelight is that the conjunction of light and tone has been proposed by the program itself: See discussion in Granger et al. (1986) and Granger & Schlimmer (1985).] This chart illustrates important differences between contingency learning, on the one hand, and strengthening/weakening algorithms (based on number of pairings), on the other. Cage and tone receive the same number of pairings with shock, but tone is a much better predictor of shock. Moreover, tone was involved in a greater number of mistaken predictions (errors of commission) than was buzz, but tone is still recognized as the better predictor.

¹⁵ In fact, any Bayesian algorithm must start with some arbitrarily chosen initial probability values; the choice of values will not change the overall operation of the algorithm, though it may affect the initial learning of a novel stimulus.

TABLE III
POSITIVE CONTINGENCY

Cue                  (++)   (+-)   (-+)   (--)    LS      LN
Cage                  52     11      1      1    1.07    0.23
Tone                  52      7      1      5    1.68    0.05
Light                 52      8      1      4    1.47    0.06
Buzz                  19      4     34      8    1.08    0.96
Whir                  43     10     10      2    0.97    1.13
And (tone, light)     48      3      1      5    2.61    0.03

7. Summary: Performance of the Algorithms
We have discussed Rescorla and Wagner’s (1972) algorithm, the class of strengthening/weakening (S/W) algorithms, and our new proposed contingency algorithm based on the calculation and use of sufficiency and necessity (LS and LN) values. We have performed simulations of all three categories of algorithms and summarize here our findings on their performance. Appendix B contains a set of results of simulations using all three algorithms.
S/W algorithms will learn a positive CS-US association appropriately in the perfect pairings (PP) presentation condition, and performance will fall off severely (again appropriately) in the composite misinformation (C) condition. However, degradation of learning in the partial warning (PW) and partial reinforcement (PR) cases is indistinguishable from the composite case for S/W algorithms; this of course contradicts the contingency constraint, which predicts severe degradation in the composite condition, but very gentle degradation in both partial conditions (see Section II,A).
The Rescorla-Wagner algorithm learns appropriately in the PP and PR conditions and is severely degraded (appropriately) in the C condition. However, in our simulations of the Rescorla-Wagner algorithm on the PW case, learning is just as severely degraded as in the C condition, not gently as in the partial reinforcement condition. Further investigation and interpretation of these results are required.
Our algorithm is based directly on the contingency constraint, and so it will learn appropriately in all four presentation categories: It shows severe composite degradation and only gentle partial degradation. Like the other two algorithms, it requires no complex or counterintuitive calculations on the part of the animal; the correct constraint arises naturally from a set of simple operations. The algorithm also accounts naturally for blocking and provides an account of aspects of learned irrelevance, latency, and tracking of changes in the environment (see Section V). We are continuing to apply the algorithm to a range of conditioning phenomena to test its breadth and range of usefulness.

C. CIRCUITS FOR CONTINGENCY
1. The Evaluation of the Adequacy of Proposed Circuits
The neurobiology of learning and memory involves the search for biological mechanisms that underlie and, by their operation, give rise to overt learning behavior. Associative learning is an area in which a great deal of recent progress has uncovered a number of competing candidates for the biological mechanism underlying the class of phenomena comprising classical conditioning (e.g., Hawkins & Kandel, 1984; Alkon, 1980; Chang & Gelperin, 1980; Thompson et al., 1984) in addition to a number of mathematical and computational models of these proposed mechanisms (e.g., Sutton & Barto, 1981; Gluck & Thompson, 1985; Hampson & Kibler, 1983). Models of this kind have focused primarily on the temporal constraints on classical conditioning, for example, the interstimulus and intertrial intervals (ISI and ITI); and most have also attended to the constraint of conditional probability in contingency.
The problem to be addressed here is as follows: How can we determine which (if any) of these proposed mechanisms may be correct ones for classical conditioning? In other words, how can competing mechanisms be evaluated against each other and against the (behavioral level) classical conditioning data? In order to determine which might be valid candidate classical conditioning mechanisms, each proposed mechanism must be tested to see that its performance conforms (at least) to the known attributes of classical conditioning, such as range of interstimulus and intertrial intervals, blocking effects, and conditional probability (contingency) effects.

2. Categories of Mechanisms
Without conforming precisely to known computational constraints, any given candidate mechanism may turn out to be a mechanism for some form of associative conditioning, but not the particular set of algorithms that mammals use to perform associative learning in classical conditioning situations. What might it mean for a mechanism to conform to many, but not all, of the constraints of contingency in mammalian classical conditioning? Imagine a proposed biological mechanism that exhibits behaviors resembling mammalian classical conditioning (MCC), but is not identical to them, and so cannot be the complete mechanism that underlies such learning. We may distinguish among three categories of proposed biological mechanism for mammalian classical conditioning (see Table IV):

1. Insufficient (or incomplete) mechanisms are those that do not successfully give rise to the phenomena of mammalian classical conditioning, either because the mechanism is incorrect or, possibly, it is only one component of some larger, as yet undiscovered mechanism.
2. Taxon-specific mechanisms are those that accurately reflect the associative learning abilities of some particular taxonomic category (e.g., class, order, or phylum) of animal, but in which that animal’s classical conditioning behavior can be shown to be distinct from mammalian learning in some specific identifiable fashion. Such a mechanism is a correct classical conditioning mechanism, but is not a correct mammalian classical conditioning mechanism. For example, if it turns out that although mammals learn in the partial warning presentation condition, Aplysia does not do so, then it still may be the case that the proposed Hawkins and Kandel (1984) mechanism for Aplysia classical conditioning might indeed be the circuit that performs classical conditioning in Aplysia, but it would not then be the case that that same circuit mechanism is the one that underlies conditioning in mammals.
3. Mammalian classical conditioning (MCC) mechanisms are those biological mechanisms that underlie the performance of actual mammalian conditioning.

TABLE IV
EVALUATION OF PROPOSED BIOLOGICAL MECHANISMS

                                                Animal x does not do MCC             Animal x does MCC
Mechanism is insufficient for MCC in animal x   Incorrect or incomplete mechanism    Incorrect or incomplete mechanism
Mechanism is sufficient for MCC in animal x     Taxon-specific mechanism             MCC mechanism

Of these three, the first (incorrect or incomplete) simply represents the class of mechanisms that cannot be shown to perform the right behaviors and calls for further exploration. The second (taxon-specific) represents the possibility that different groups of animals perform associative learning differently; this is a sensible possibility in that the point of associative learning is to note and learn about regularities in the environment, and there may be many differing mechanisms that have evolved to instantiate different versions of this regularity-detecting ability. From a computational point of view, certain taxon-specific (e.g., phylum-specific or class-specific) mechanisms may be useful approximations of a true mammalian classical conditioning mechanism, but from a biological point of view, such a phylum-specific mechanism cannot indiscriminately be considered to be the same as mammalian classical conditioning: Differences that appear almost insignificant may well point to biological differences that are crucially important. It is more useful to identify both the similarities and differences among distinct animal phyla rather than simply using one as a convenient approximation of another as though the differences were not important. Finally, the third (mammalian classical conditioning) represents those mechanisms that may actually underlie classical conditioning in mammals; there may still be differences among mechanisms across species or even within a single individual.
3. Computational Analysis of Aplysia
Hawkins and Kandel (1984, p. 387) briefly discuss a trial-presentation condition that directly corresponds to the partial warning condition and suggest how learning may proceed in this condition. They begin by stating that “In classical conditioning, animals do not simply learn that the CS precedes the US (contiguity), but they also learn the contingency or correlation between the CS and US”; they go on to say that

. . . if unannounced [i.e., spurious] USs occur between pairing trials, the ability of the CS to predict the US is reduced and learning degenerates. In the limit, if the probability of unannounced USs is the same as the probability of announced (paired) USs so that there is zero contingency, animals do not learn to associate the CS and US despite the fact that they are paired together many times (Rescorla, 1968). Rescorla and Wagner (1972) proposed that this effect could be explained by an extension of the argument they advanced for blocking. . . . In [a] hypothetical example the addition of unpredicted USs would not only cause a decrease in the difference between the strengths of the CS+ and CS−, but would also cause a decrease in the absolute strength of the CS+. Results similar to those shown . . . have recently been obtained in Aplysia in an experiment . . . (Hawkins, Carew, & Kandel, 1983). (pp. 387-388)
These statements deserve careful examination in light of our computational analysis of contingency effects in classical conditioning. First, Hawkins and Kandel state that as spurious US trials are added (presumably to CS-US pairs), learning degenerates. As we have seen, learning is predicted to degenerate to some extent in all conditions that have spurious trials, but the key difference between true contingency and other possible classical conditioning mechanisms (such as strengthening/weakening algorithms, Section IV,B,4) is that in contingency-based conditioning, the composite condition severely degrades learning of the CS-US association, while the two partial conditions (PW and PR) only gently degrade this learning. It is therefore this distinction between partial and composite conditions that must be tested experimentally in order to determine what this circuit (and animal) is actually computing. Hawkins and Kandel go on to state that “in the limit,” learning should be degraded to zero with the addition of enough spurious US trials. Since they seem clearly to be describing a partial warning condition with no spurious CS trials, this limit will only be reached when $p(\mathrm{US} \mid \mathrm{CS}) = p(\mathrm{US} \mid \overline{\mathrm{CS}}) = 1$, which can only happen when there is a US presented in every trial. Assuming that this is not what Hawkins and Kandel meant, this again calls for the crucial distinction to be made between the partial warning and composite cases: In the latter, 50% spurious US trials will degrade the learning to zero, since this is the severe degradation case, but in partial warning, it takes 100% USs to degrade learning to zero.
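The limiting claim can be checked directly from the trial counts of Section IV,B,1. As a brief worked sketch, assume a partial warning schedule with s paired trials, o spurious USs, n trials with neither stimulus, and no spurious CSs (c = 0). Then

$$p(\mathrm{US} \mid \mathrm{CS}) = \frac{s}{s + c} = 1, \qquad p(\mathrm{US} \mid \overline{\mathrm{CS}}) = \frac{o}{o + n},$$

so the two probabilities are equal only when n = 0, that is, when every trial without a CS contains a US. In a composite schedule, where c > 0, zero contingency requires only

$$\frac{s}{s + c} = \frac{o}{o + n} \iff sn = oc,$$

which is met, for example, when s = c and o = n, so that both probabilities equal 0.5.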
Hawkins and Kandel cite Rescorla (1968) and Rescorla and Wagner (1972) for explanations of the degradation of learning, but again these cited papers explain the difference between partial reinforcement and composite conditions; furthermore, neither paper mentions the partial warning condition, that is, any condition in which spurious USs but no spurious CSs are added to pairing trials. Finally, the initial experiments referred to by Hawkins and Kandel demonstrate degradation of learning in Aplysia, but it is the distinction between the gentle degradation of the partial conditions and the severe degradation of the composite condition that should be experimentally tested.¹⁶ In the absence of testing for this distinction, it cannot be determined whether the Aplysia circuit is performing contingency-based classical conditioning (as mammals do) or some form of learning that is distinct from this contingency-based conditioning.
¹⁶ Gluck and Thompson (1985, in press) have constructed a computer simulation of Hawkins and Kandel’s (1984) Aplysia circuit mechanism. While it is often quite difficult to test a wide range of behaviors in a circuit preparation, Gluck and Thompson’s simulation of the circuit may be analyzed to see how it actually behaves under various circumstances. We are currently collaborating with Gluck and Thompson to test whether the model satisfies the behavioral constraints identified above.
V. Breadth of the Theory: Blocking, Latency, Tracking, Learned Irrelevance
A. BLOCKING
The failure of an animal subject to form an association with the novel component of a compound stimulus following successful classical conditioning to the familiar component is called blocking. Kamin (1968) originally demonstrated this effect by first training animals to associate a noise with a shock. Then animals were repeatedly presented a compound of light and noise followed by a shock. Upon testing, the animals demonstrated little or no conditioning between the light and the shock; the previous effective conditioning of the noise to the shock “blocked” subsequent conditioning to the light. All accounts of this effect concur that expectation on the part of the animal is crucial, for the light offers no new information about the onset of the shock. Rescorla and Wagner (1972) offer an account in which stimuli compete for a limited amount of associative strength. A single stimulus may acquire the complete amount; subsequent stimuli compounded with this previously conditioned cue must compete for associative strength with the completely effective cue and thus acquire no association. Mackintosh (1975) explains that the animal may instead be learning not to pay attention to the redundant stimulus. The animal then simply does not modify associative strengths for the new stimuli, since no unexpected US occurred. Our account is similar to Mackintosh’s in that there is no competition for a limited resource. As in each of the two other accounts, learning occurs only when expectation fails: Either the shock is not expected and it is received (an error of expectation omission) or the shock is expected and is not received (an error of expectation commission). When one stimulus comes to predict the US completely, no additional associational modifications are made until that stimulus is no longer accurate.
A rough differential prediction may be made between our account and Mackintosh’s: In Mackintosh’s account the lack of attention to the redundant cue is a residue of the blocking experiment; in our algorithm, when the contingencies of the experimental setup change, we would predict that animals would demonstrate little hesitancy to form an association with the previously redundant cue.
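The competition account attributed above to Rescorla and Wagner (1972) can be illustrated with a minimal sketch of the standard delta rule, in which every cue present on a trial is updated by a shared error term. The parameter values, trial counts, and function names below are illustrative assumptions, not values taken from this chapter or from the original experiments.

```python
# Minimal Rescorla-Wagner sketch of Kamin-style blocking.
# ALPHA, BETA, LAMBDA and the trial counts are illustrative assumptions.
ALPHA = 0.3   # stimulus salience (assumed equal for noise and light)
BETA = 0.5    # learning rate associated with the US (assumed)
LAMBDA = 1.0  # asymptote of associative strength supported by the shock


def rw_update(strengths, present_cues, us_present):
    """Apply one Rescorla-Wagner trial to `strengths` in place."""
    target = LAMBDA if us_present else 0.0
    # The error is shared: it depends on the summed strength of all cues present.
    error = target - sum(strengths[cue] for cue in present_cues)
    for cue in present_cues:
        strengths[cue] += ALPHA * BETA * error


def run_blocking(pretraining_trials, compound_trials=50):
    """Noise->shock pretraining, then (noise + light)->shock compound trials."""
    strengths = {"noise": 0.0, "light": 0.0}
    for _ in range(pretraining_trials):
        rw_update(strengths, ["noise"], us_present=True)
    for _ in range(compound_trials):
        rw_update(strengths, ["noise", "light"], us_present=True)
    return strengths


blocked = run_blocking(pretraining_trials=50)
control = run_blocking(pretraining_trials=0)
print(f"light after noise pretraining:    {blocked['light']:.3f}")
print(f"light without noise pretraining:  {control['light']:.3f}")
```

Under these assumed parameters the light acquires almost no strength when the noise has been pretrained, because the shared error is already near zero, whereas in the control run the two cues split the available strength roughly equally.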
B. LATENCY
Another characteristic of the classically conditioned animal is the delay between the onset of the CS and the animal’s response. A salient feature of this latency is that it tends to be proportional to the delay from the onset of the CS to the onset of the US in classical conditioning, but for the same animals in an instrumental conditioning task, the response latency tends to be quite short. A representative experiment performed by Wahlsten and Cole (1972) demonstrates just this difference. Subjects were divided into classically and instrumentally conditioned groups. For both groups a CS signaled an aversive US: In the classical group, the US was unavoidable; in the instrumental group, the US was terminated by the CR of the animal. Subjects in the classical group waited until just before the onset of the US before responding, whereas subjects in the instrumental group originally waited as long as the classical animals did, but then began to make the response immediately following the onset of the CS; that is, the animals make a response as early as is effective. This could be accounted for by assuming that the animal “experimented” with smaller response latencies. For the classical subjects, this would prove useless because the US is unavoidable. The instrumental subjects, however, would initially just lessen the impact of the US, but through continued shortening of their response latency would come to avoid it altogether. Further details of this theoretical viewpoint and a simulation may be found in Granger et al. (1986).
C. TRACKING CHANGES IN THE ENVIRONMENT
Subjects adapt to changes in their environment over time. For instance, the fox adapts to the seasonal coat of its prey, and a one-legged bird will learn to change its landing behavior. This ability to track changes over time is another computational constraint which may be used to test proposed learning algorithms. Rescorla and Wagner (1972) and Mackintosh (1975) utilize a formula which allows a reversal of the sign corresponding to the increment of an association’s strength (ΔV). This enables the algorithm to switch from strengthening a previously successful association to weakening it when it is no longer effective. Our algorithm is not based on a formula describing a change in associative strength, but on the calculation of associativity based on a history of a cue’s effectiveness. As that history reflects changes in the environment, the associative strength assigned to a concept changes as well. For instance, as the environment changes over time, some previously predictive cue might become nonpredictive, in which case predictions would start failing, and the ongoing count of successful predictions would slowly be overtaken by the growing counts of commissions and omissions. Reciprocally, if a previously unpredictive cue becomes predictive, it will get reintroduced as a potential cue, and its successful predictions will allow its LS value to grow. Similarly, tracking changes in Boolean feature combinations follows naturally from a thresholding effect associated with the formation of those combinations (Granger & Schlimmer, 1985).
D. LEARNED IRRELEVANCE
Evidence for the reluctance of animals to form different associations between a previously associated CS and US includes results from learned irrelevance. A set of experiments by Siegel and Domjan (1971) tested five conditions in which the subjects were preexposed to the CS, to the US, to an uncorrelated presentation of the CS and the US, or to a backward pairing of the US and CS, or were given no preexposure. These animals were then placed in a standard excitatory contingency situation. The rate at which subjects acquired the new association was ordered from fastest to slowest as follows: Animals with no preexposure learned most quickly, followed by preexposure to the CS or to the US, uncorrelated preexposure to the CS and US, and finally the backward pairing group, which was the slowest to form an association. Learned irrelevance refers to the difference between (1) the effect of preexposure to the CS or to the US, and (2) the effects of receiving preexposure to an uncorrelated presentation of the CS and US. In the latter condition, the CS is initially learned to be irrelevant to the US, while in the former condition no such relationship is present. Mackintosh’s (1975) model of selective attention would account for this in terms of a gradual reduction of the stimulus-specific learning parameters which represent attention. After an uncorrelated presentation of the CS and US, little attention would be paid to the CS and subsequent excitatory conditioning would be inhibited. The Rescorla and Wagner (1972) model might account for learned irrelevance if an association were formed between the context and the US in the uncorrelated condition. This context association might then block the further acquisition of association on the part of the CS during the excitatory conditioning. While conditioning to the context certainly does occur, this model would predict that no subsequent learning to the CS would be demonstrated.
Our model explains the difference between the preexposure to the CS or US group and the preexposure to the uncorrelated presentation group by specifying that the associative calculations on the part of the animal are based on the history of association between the CS and US. By retaining the counts of event types, the computation is not based solely on the present association as it is in the delta models of Rescorla-Wagner and Mackintosh, but rather on the resultant of the previous values of these measures. In other words, all three models (ours, Mackintosh’s, and Rescorla and Wagner’s) provide accounts of blocking and tracking changes over time in the environment. Our algorithm, however, sometimes resists tracking a change in accordance with learned irrelevance data, while the Rescorla-Wagner algorithm will sometimes tend to track changes in the environment “too well”; that is, their algorithm will change more readily than animals will (according to learned irrelevance data) in response to environmental changes.
E. TIME, BACKGROUND, AND PROBABILITY
1. Underspecification of Trial Conditions
In Section IV,A,2, it was shown that trial presentation conditions (e.g., 0.4-0, 1.0-0.4) correspond not to points, but to line segments in the contingency space (Figs. 6 and 7). This fact implies that this standard method¹⁷ of specifying a testing condition is underspecified: There are multiple different testing conditions that would all be describable as, say, 0.4-0.2. The contingency constraint means that excitatory conditioning should hold in all 0.4-0.2 conditions, but any attempt at replication of an experimental condition that is only described as 0.4-0.2 may be confounded by lack of information about which 0.4-0.2 condition is meant. Imagine two different testing conditions, A and B, that both lie along the 0.4-0.2 line segment in contingency space (Fig. 9); just what are the differences between these two points? What is it that is changing as we travel along the line from point A to point B? Point A contains fewer CS-US pairs (since its Z value is lower), fewer spurious CSs (since its X value is lower), and slightly more spurious USs (since its Y value is a bit higher) than point B. There are not enough extra spurious USs to make up for the smaller number of pairs and spurious CSs; what is substituted are more nonpresentations (see Section IV,B,1) at point A than at point B; that is, the set of trial conditions described by point A contains more events in which neither the CS nor the US occurs than that described by point B. Since a certain amount of time is allocated to the overall set of trials, these events are translated into “empty” time durations. For purposes of replication, then, a complete specification of a trial-presentation condition would require more information than just the two conditional probabilities of contingency. An alternative formulation would offer these two conditional probabilities as well as the number of CS-US pairs and the total number of trials or total amount of time allocated for presentations to the animal.
¹⁷ Used extensively by Rescorla (1967, 1968, 1972), Rescorla and Wagner (1972), Mackintosh (1983), etc.
Fig. 9. The 0.4-0.2 presentation condition. Axes: X = $p(CS, \overline{US})$ (probability of spurious CS); Y = $p(\overline{CS}, US)$ (probability of spurious US); Z = $p(CS, US)$ (probability of CS-US pair).
The trial-presentation condition corresponding to point A might be specified as [0.4-0.2; 25/100 (8 hr)], denoting that $p(US \mid CS) = 0.4$ and $p(US \mid \overline{CS}) = 0.2$, with 25 pairs presented over 100 total trials (for a total duration of 8 hr),¹⁸ thereby specifying the joint probability of pairings being presented, $p(CS, US) = 25/100 = 0.25$. Similarly, then, the condition corresponding to point B might be specified as [0.4-0.2; 35/100 (8 hr)], denoting that in this condition there were 35 pairs over 100 trials: $p(CS, US) = 0.35$. These additional numbers are required because for complete (replicable) specification of a trial-presentation condition we need to know each of the marginal or joint probabilities corresponding to the X, Y, and Z axis values [$p(CS, \overline{US})$, $p(\overline{CS}, US)$, and $p(CS, US)$]. By the laws of conditional probability, we know that $p(US \mid CS) = p(CS, US)/p(CS)$ and $p(US \mid \overline{CS}) = p(\overline{CS}, US)/p(\overline{CS})$. In the new proposed specification, we have $p(CS, US)$ (the Z axis value) directly derivable as the ratio of the number of pairings to the total number of trials (or to the total amount of time for trials divided by the amount of time per trial). For point A, the Z value is 0.25; for point B, it is 0.35. Then we can compute $p(CS) = p(CS, US)/p(US \mid CS)$ and $p(\overline{CS}) = 1 - p(CS)$, and thereby compute the Y axis value $p(\overline{CS}, US) = p(US \mid \overline{CS})\,p(\overline{CS})$. The Y value for point A is 0.075, and for point B it is 0.025. Finally, all that is left to compute is the value of the X axis by the equation $p(CS, \overline{US}) = p(CS) - p(CS, US)$ to completely constrain the point in the space (for point A, X = 0.375; for point B, X = 0.525). In summary, reporting the two conditional probabilities, the number of pairings, and the total number of trials (or total trial time) is sufficient to completely specify the training conditions.
¹⁸ This total time value is redundant with that of the total number of trials if the time allocated per trial is specified.
The theoretical formulation of contingency, in fact, requires that these nonpresentations or empty trials be taken into account; different theories have handled this in different ways. Rescorla and Wagner (1972) presume that all such empty trials are, in fact, exposures to the context; this is another way of viewing what it means for the context to compete with other CSs for associative strength in their theory. Most recent theories of conditioning (e.g., Mackintosh, 1975; Dickinson, 1980; Pearce & Hall, 1980) adopt variations of this idea. In contrast, our theory represents the context as an independent candidate CS like all the others; the difference is that we explicitly identify context cues, since, as we showed in Sections IV,B,1 and IV,B,5, we can mathematically distinguish between context cues and other types of predictive and uncorrelated cues. The implication is that the animal is capable of learning the extent to which particular cues are predictive cues (normal CS+, either positive or negative safety signals), are uncorrelated (CS-), or are context cues.
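The coordinate computation described above for points A and B can be sketched as a small function. This is only an illustration of the arithmetic in the text; the function name, argument names, and the dictionary it returns are our own assumptions.

```python
def contingency_point(p_us_given_cs, p_us_given_no_cs, pairs, total_trials):
    """Locate a trial-presentation condition in contingency space.

    Follows the steps in the text: Z from the pair count, then p(CS),
    then Y and X. Names and return format are illustrative assumptions.
    """
    if not 0.0 < p_us_given_cs <= 1.0:
        raise ValueError("p(US|CS) must lie in (0, 1]")
    z = pairs / total_trials               # Z = p(CS, US)
    p_cs = z / p_us_given_cs               # p(CS) = p(CS, US) / p(US|CS)
    if p_cs > 1.0:
        raise ValueError("specification implies p(CS) > 1")
    p_no_cs = 1.0 - p_cs                   # p(no CS) = 1 - p(CS)
    y = p_us_given_no_cs * p_no_cs         # Y = p(no CS, US)
    x = p_cs - z                           # X = p(CS, no US)
    empty = 1.0 - x - y - z                # proportion of nonpresentations
    return {"X": x, "Y": y, "Z": z, "empty": empty}


point_a = contingency_point(0.4, 0.2, pairs=25, total_trials=100)
point_b = contingency_point(0.4, 0.2, pairs=35, total_trials=100)
for name, point in (("A", point_a), ("B", point_b)):
    print(name, {axis: round(value, 3) for axis, value in point.items()})
```

For the two specifications in the text this reproduces X = 0.375, Y = 0.075, Z = 0.25 for point A and X = 0.525, Y = 0.025, Z = 0.35 for point B, and it also makes explicit that point A leaves a larger share of empty trials (0.3 versus 0.1).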
Rescorla and Wagner, therefore, deal with time implicitly by interpreting nonpresentations as exposures to the context or background cues. We attempt to deal with time explicitly by counting nonpresentations; we deal with context cues as initially being candidate CSs competing as possible predictive cues and over trials becoming learned to be a separate category of cues that are neither predictive nor uncorrelated. Gibbon (1977, 1984) presents a theory based in part on timing that also attempts to treat time as an independent entity.
2. The Trial Window Duration Assumption
Even given the above complete, replicable specification of a particular set of trials, there is a problem that confounds both the theoretical formulation and experimental testing of contingency: the assumption of the duration of a particular trial. The calculation of conditional probabilities (and therefore the prediction of when a particular CS-US association should or should not be learned, and the predicted strength of its learning) is dependent on the assumption that the experimenter (or theorizer) makes about the duration of a trial. This is not an idle issue: Different assumptions can lead to drastically different conditional probability calculations. Figure 10 illustrates this: Given a particular layout of cue presentations (in the figure, T indicates tone, L indicates light, and S indicates shock), the values of $p(US \mid CS)$ and $p(US \mid \overline{CS})$ are given under three different assumptions about the trial window duration: 2, 3, and 4 min. First, ignoring the tone CS and simply looking at the predicted associativity of the light and the shock, these three different assumptions render this set of trials as 0.5-0.2 when the trial size is assumed to be 2 min, 0.25-0.6 when it is assumed to be 3 min, and 0.75-0.33 when it is assumed to be 4 min. Under the first and last assumptions, the light CS is predicted to be strongly learned [since $p(US \mid CS) > p(US \mid \overline{CS})$], while under the 3-min assumption, the opposite prediction is made: The light CS should be strongly learned to be a safety signal, indicating that the shock will not occur. This is strongly counterintuitive and indeed rests on an example that was crafted explicitly to give rise to such a result, but nonetheless, by the strict rule of contingency, these are the correct predictions under these three different trial window assumptions. Furthermore, the predicted ordering of the two cues (tone and light) will be reversed in this example: The tone and light CSs in Fig. 10 will be about equally predictive of the shock under the 2-min assumption (0.5-0.25 for tone, 0.5-0.2 for light); but the tone will be more predictive of the shock than the light is under the 3-min assumption (0.5-0.43 for tone, 0.25-0.6 for light); and, finally, the tone will be much less predictive of the shock under the 4-min assumption (0.5-0.6 for tone, 0.75-0.33 for light). Were we to run an animal experiment using these trial data, our prediction of whether the tone or light, or both, would be learned to be associated with shock would depend directly on our assumption about the trial window duration.
Fig. 10. Three different trial-window duration assumptions yield different contingencies for the same set of trials. (The timeline of tone, light, and shock presentations is not reproduced; the counts printed in the figure are as follows.)
Window  Cue    Pairings  Spurious CSs  Spurious USs  Empty trials  p(US | cue)      p(US | no cue)
2 min   Tone   1         1             3             9             1/(1+1) = 0.5    3/(3+9) = 0.25
2 min   Light  2         2             2             8             2/(2+2) = 0.5    2/(2+8) = 0.2
3 min   Tone   1         1             3             4             1/(1+1) = 0.5    3/(3+4) = 0.43
3 min   Light  1         3             3             2             1/(1+3) = 0.25   3/(3+2) = 0.6
4 min   Tone   1         1             3             2             1/(1+1) = 0.5    3/(3+2) = 0.6
4 min   Light  3         1             1             2             3/(3+1) = 0.75   1/(1+2) = 0.33
It seems intuitively clear that animals do not make judgments about pairings on the basis of something so artificial and arbitrary as a “time window”; rather, if a tone CS is followed closely by a shock US (within, say, 5 sec), then a pairing is perceived by the animal, independent of whether a time-window boundary should ideally have fallen between the CS and US. This logically implies that the ideal contingency constraint, as it currently stands, is in need of revision or extension. Subjects must be choosing trial windows at least in part on the basis of the cues and events that are perceived; yet the very perception of the nature of those events seems to be dependent in part on the choice of trial windows. One possible extension to the theory can be based on this apparent paradox: The animal may first determine the salient cues in the environment and may acquire information about their durations, and then that information may be used in part to incrementally calculate the associativity or predictiveness of various cues (via some algorithm). Indeed, Rescorla (1968) and Rescorla and Wagner (1972) have made the assumption that trial window duration was equal to CS duration and have shown that this assumption leads to consistently successful experimental testing and successful predictive simulations of contingency.
However, it has not been made clear in this literature how the animal may come to choose the CS duration as the perceived trial window duration. Assuming that CS duration is somehow used as approximate trial window duration by animals, then it is possible that it is the rapid conditioning of nonspecific response systems (e.g., heart rate, galvanic skin response, respiration) that is used to select candidate cues and to identify their durations, and then that these cue durations are used as candidate trial window durations as part of the process of determining associativity of cues. Experimentation with this theoretical line of thinking may clarify the relationship between rapid acquisition of nonspecific responses and slower learning of complex skeletal responses in associative learning.
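The dependence of the predicted contingency on the assumed trial window can be checked directly from the event counts printed in Fig. 10. The sketch below is a hedged illustration: the function and variable names are our own, and the light's 2-min entry uses 8 empty trials, as implied by the 2/(2+8) computation in the figure.

```python
def contingency_from_counts(pairings, spurious_cs, spurious_us, empty):
    """Return (p(US|CS), p(US|no CS)) from the four trial-type counts.

    Assumes both denominators are nonzero, which holds for the Fig. 10 counts.
    """
    p_us_given_cs = pairings / (pairings + spurious_cs)
    p_us_given_no_cs = spurious_us / (spurious_us + empty)
    return p_us_given_cs, p_us_given_no_cs


# (pairings, spurious CSs, spurious USs, empty trials), keyed by window in minutes.
FIG10_COUNTS = {
    "tone": {2: (1, 1, 3, 9), 3: (1, 1, 3, 4), 4: (1, 1, 3, 2)},
    "light": {2: (2, 2, 2, 8), 3: (1, 3, 3, 2), 4: (3, 1, 1, 2)},
}


def contingency_table(counts=FIG10_COUNTS):
    """Map cue -> window -> difference p(US|CS) - p(US|no CS)."""
    table = {}
    for cue, by_window in counts.items():
        table[cue] = {}
        for window, trial_counts in by_window.items():
            p_cs, p_no_cs = contingency_from_counts(*trial_counts)
            table[cue][window] = p_cs - p_no_cs
    return table


for cue, by_window in contingency_table().items():
    for window, delta in sorted(by_window.items()):
        label = "excitatory" if delta > 0 else "inhibitory" if delta < 0 else "none"
        print(f"{cue:5s} {window}-min window: {delta:+.2f} ({label})")
```

The sign of the difference for the light flips from positive (2 min) to negative (3 min) and back to positive (4 min), and the tone becomes negative only under the 4-min window, which is the reversal the text describes.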
VI. Summary: Limitations and Contributions of the Theory
A. STATUS OF OUR PROGRESS
We have attempted to provide in this article an analysis of the effects of contingency in classical conditioning and the implications of that analysis for predicted experimental outcomes, proposed algorithms, and the evaluation of neurobiological circuits underlying conditioning. We are in the process of testing some of our theoretical predictions in our experimental laboratory (including the partial warning prediction and aspects of different trial window duration assumptions). We intend the empirical results of these experimental studies to provide support or falsification for specific aspects of the theory. We are continuing a research program of extending our results to a broader range of phenomena of learning and memory, though we feel that classical conditioning clearly represents a reasonable paradigm for testing the limits of the way in which animals learn observed associations in their natural environments. It is probably the case that, in this regard, instrumental conditioning represents a still more natural set of experimental procedures; our investigation has led us toward an integrative view of classical and instrumental conditioning (Granger et al., 1986) which we intend to pursue. Similarly, there are a number of well-known associative and nonassociative effects, especially extinction phenomena, sensitization, habituation, and their relation to conditioning.
B. INTERDEPENDENCE OF THE THREE LEVELS
The key question we have addressed here is as follows: How can we evaluate proposed theories, algorithms, circuits, or models of learning and memory in a principled way? The answer offered is that constraints on learning arise from both the computational level (where the precise defining features of the behavior are established) and the implementation level (where the biophysical mechanisms that underlie the behavior are identified). Since these two levels rarely meet each other, most theories are mediated through the algorithm level. Mechanistic descriptions of circuit operations are the bottom-up contribution, and derivations of behavioral-level constraints are the top-down contributions to a theory. A computational-level analysis of the target behavior establishes the range of conditions that define (and thereby constrain) the behavior under study (such as classical conditioning). A complete theory must also be constrained by the physical attributes of the substrate system in which it is embedded; the neurobiological basis of classical conditioning is crucial. In principle, if we had a perfect implementation level characterization of classical conditioning in, say, a circuit, then we would be able to determine the computational constraint (bottom-up) from the operation of that circuit. In the absence of such information (at least in the case of classical conditioning), the computational constraint was derived instead from animal experiments (Rescorla, 1968); this, of course, still constitutes a bottom-up derivation, as all such derivations must initially be. Once the computational constraint is in place, however, then the target behavior is defined, and all proposed theories, algorithms, or circuits must conform to the constraint. We cannot be sure that any given computational constraint is perfect or finished; for instance, if it turned out that a positive CS-US association was not learned in a partial warning presentation condition, then that would imply that the Rescorla (1968) constraint would require refinement. More complex counterintuitive predictions of the computation (such as the dependence on assumptions about trial window duration; Section V,E,2) also may give rise to experimentally testable questions about the validity, extent, and accuracy of the theory. Furthermore, the constraint only refers to effects of contingency in classical conditioning, yet the overall learning and memory capabilities of mammals certainly have more complex and far-reaching computational characterizations than just this constraint; the contingency constraints can be viewed, then, as one element of a large class of constraints. Our aim has been to attempt to analyze and clarify the contingency constraint, to apply it to generate useful predictions (such as learning in the partial warning condition), and to provide a uniform way of evaluating proposed algorithms, behavioral predictions, circuits, and models. We hope that theoretical and experimental investigators will continue to work together toward testing and refinement of the contingency constraint. We further hope that the analysis of contingency presented here will be used as a tool for researchers to test their own theories and experiments, and even as a measuring stick to keep us on track in our evaluation of what is and is not contingency in associative learning.
Appendix A: Derivation of Contingency Surface
If it is assumed that trials are discrete, independent, and randomized, then we may consider each of the four possible stimulus combinations:
$X = p(CS, \overline{US})$ (CS alone)
$Y = p(\overline{CS}, US)$ (US alone)
$Z = p(CS, US)$ (CS followed by US)
$1 - X - Y - Z = p(\overline{CS}, \overline{US})$ (Empty trial)
Consider a Cartesian coordinate system in a Euclidean three-dimensional space. All possible stimulus combination points (defined above) lie within a right tetrahedron within the unit cube, bounded by the X-Y, X-Z, and Y-Z planes and by a truncating slanted plane passing through the points (X, Y, Z) = (1,0,0), (0,1,0), and (0,0,1), since $X + Y + Z \le 1$. The contingency characterization states that conditioning does not occur when $p(US \mid CS) = p(US \mid \overline{CS})$, and that this equality therefore defines the boundary between learning of positive and negative associations. By the definition of conditional probabilities we have
$$p(US \mid CS) = \frac{p(CS, US)}{p(CS)}, \qquad p(US \mid \overline{CS}) = \frac{p(\overline{CS}, US)}{p(\overline{CS})}$$
The marginal probabilities are directly derived as
$$p(CS) = X + Z, \qquad p(\overline{CS}) = 1 - X - Z, \qquad p(US) = Y + Z, \qquad p(\overline{US}) = 1 - Y - Z$$
Substituting, we have
$$p(US \mid CS) = \frac{Z}{X + Z}, \qquad p(US \mid \overline{CS}) = \frac{Y}{1 - X - Z}$$
Substituting these expressions in the contingency boundary equation, we have
$$Y = \frac{Z(1 - X - Z)}{X + Z}$$
which describes a hyperbolic paraboloid. It is illustrated within the truncated unit cube in Fig. 2.
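As a check on the boundary surface, a short rearrangement (ours, not part of the original derivation, and valid when $0 < p(CS) < 1$) suggests that the boundary coincides with statistical independence of the CS and the US. Clearing the denominator in the boundary equation and collecting terms gives

```latex
\begin{aligned}
Y(X + Z) &= Z(1 - X - Z) \\
XY + YZ + XZ + Z^{2} &= Z \\
(X + Z)(Y + Z) &= Z \\
p(CS)\,p(US) &= p(CS, US)
\end{aligned}
```

The third line is also a convenient way to see the saddle shape: in the coordinates $u = X + Z$ and $v = Y + Z$ the surface is simply $Z = uv$.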
Appendix B: Comparative Analysis of Performance of Contingency Algorithms
RESCORLA AND WAGNER
The Rescorla and Wagner (1972; Wagner & Rescorla, 1972) model was simulated under a pair of conditions. In the first set of simulations, $p(CS, US) = 0.10$; that is, 1 of every 10 trials was a reinforced presentation of the CS. The parameters were chosen following those in the original presentation (Rescorla & Wagner, 1972, p. 82). Specifically, $\alpha_{CS} = \alpha_{context} = 1.0$, $\lambda_{reinforced} = 1$, $\lambda_{nonreinforced} = 0$, $\beta_{reinforced} = \beta_{nonreinforced} = 0.5$. The last parameter is larger than the one originally used and was chosen to allow asymptotic learning in 25 trials (= 0.10 × 250). The asymptotic associative strengths for the CS and context in the presence of various amounts of spurious cues are presented below.
p(CS,US) = 0.10, 250 Trials Total (figures are single samples from an arbitrary, uniform ordering)
%    Type     $p(US \mid CS)$   $p(US \mid \overline{CS})$   $V_{CS}$   $V_{context}$
0    -        1.00   0.00    0.98   0.00
25   CS       0.25   0.00    0.29   0.01
25   US       1.00   0.28    0.71   0.26
25   CS,US    0.44   0.16    0.34   0.20
50   CS       0.17   0.00    0.16   0.01
50   US       1.00   0.56    0.43   0.61
50   CS,US    0.29   0.38   -0.10   0.46
75   CS       0.12   0.00    0.14   0.02
75   US       1.00   0.83    0.18   0.74
75   CS,US    0.21   0.71   -0.53   0.67
A second set of simulations was performed, this time with $p(CS, US) = 0.20$ and the exact set of parameters used in Rescorla and Wagner (1972, p. 88): $\alpha_{CS} = 0.5$, $\alpha_{context} = 0.$…, $\lambda_{reinforced} = $ …, $\lambda_{nonreinforced} = $ …, $\beta_{reinforced} = 0.$…, $\beta_{nonreinforced} = 0.05$. The asymptotic associative strengths for the CS and context are presented below.
p(CS,US) = 0.20, 250 Trials Total (figures are a single sample from an arbitrary, uniform ordering)
%    Type     $p(US \mid CS)$   $p(US \mid \overline{CS})$   $V_{CS}$   $V_{context}$
0    -        1.00   0.00    0.84   0.09
20   CS       0.50   0.00    0.66   0.34
20   US       1.00   0.25    0.66   0.34
20   CS,US    0.67   0.14    0.59   0.21
40   CS       0.33   0.00    0.44   0.06
40   US       1.00   0.50    0.50   0.54
40   CS,US    0.50   0.33    0.37   0.35
60   CS       0.25   0.00    0.34   0.05
60   US       1.00   0.75    0.37   0.72
60   CS,US    0.40   0.60    0.14   0.47
80   CS       0.20   a       0.27   0.05
80   US       1.00   1.00    0.25   0.86
80   CS,US    0.20   0.80   -0.05   0.61
a Undefined.
GRANGER AND SCHLIMMER
The Granger and Schlimmer model has been similarly tested for the cases where $p(CS, US) = 0.10$ and $p(CS, US) = 0.20$. The LS and LN measures computed for each potential cue stimulus are interpreted first as odds, then are converted to a probability [$p = \text{odds}/(1 + \text{odds})$], and then are mapped onto the range [-1, 1] [$V = (p - 0.5)/0.5$] for the purposes of straightforward comparison with the other models presented. The results for varying degrees of spurious CSs, spurious USs, and spurious CSs and USs are presented below.
p(CS,US) = 0.10, 250 Trials Total (each data point represents an average over 10 orderings)
%    Type     $p(US \mid CS)$   $p(US \mid \overline{CS})$   $V_{CS}$   $V_{context}$
0    -        1.00   0.00    0.99   -0.28
25   CS       0.25   0.00    0.96   -0.28
25   US       1.00   0.28    0.90   -0.13
25   CS,US    0.44   0.16    0.47   -0.22
50   CS       0.17   0.00    0.89   -0.28
50   US       1.00   0.56    0.85    0.11
50   CS,US    0.29   0.38   -0.08   -0.13
75   CS       0.12   0.00    0.65   -0.28
75   US       1.00   0.83    0.64    0.54
75   CS,US    0.21   0.71   -0.47   -0.02
p(CS,US) = 0.20, 250 Trials Total (each data point represents an average over 5 orderings)
%    Type     $p(US \mid CS)$   $p(US \mid \overline{CS})$   $V_{CS}$   $V_{context}$
0    -        1.00   0.00    0.99   -0.23
20   CS       0.50   0.00    0.97   -0.23
20   US       1.00   0.25    0.94   -0.11
20   CS,US    0.67   0.14    0.64   -0.17
40   CS       0.33   0.00    0.94   -0.23
40   US       1.00   0.50    0.92    0.11
40   CS,US    0.50   0.33    0.19   -0.09
60   CS       0.25   0.00    0.86   -0.23
60   US       1.00   0.75    0.85    0.42
60   CS,US    0.40   0.60   -0.20    0.00
80   CS       0.20   a      -0.23   -0.23
80   US       1.00   1.00   -0.01    0.98
80   CS,US    0.20   0.80   -0.49    0.11
a Undefined.
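The rescaling used for the Granger and Schlimmer results (odds to probability, then probability to the range [-1, 1]) is simple enough to state as code. The sketch below covers only that mapping; it does not compute LS or LN themselves, and the function name and the handling of infinite odds are our own assumptions.

```python
import math


def odds_to_v(odds):
    """Map non-negative odds to [-1, 1] via p = odds/(1 + odds), V = (p - 0.5)/0.5."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    if math.isinf(odds):
        return 1.0  # limiting value; inf/(1 + inf) would otherwise be NaN
    p = odds / (1.0 + odds)
    return (p - 0.5) / 0.5


for odds in (0.0, 1.0, 3.0):
    print(odds, odds_to_v(odds))
```

Even odds (1.0) map to V = 0, odds of zero map to V = -1, and increasingly favorable odds approach V = 1, so values on this scale can be read alongside the associative strengths reported for the other models.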
STRENGTHENING AND WEAKENING
In contrast to those algorithms which compute conditional probability correctly, we simulated an algorithm from a class which computes

$$\Delta V = \alpha[\gamma Z + \delta X + uY + \rho(1 - X - Y - Z)]$$

Specifically, we chose $\alpha = 0.15$, $\gamma = 0.90$, $\delta = 0.10$, $u = 0.40$, and $\rho = 0.00$. The results for p(CS, US) = 0.10 are presented below.
p(CS,US) = 0.10, 250 Trials Total (figures are a single sample from an arbitrary, uniform ordering)

Condition    p(US|CS)   p(US|no CS)   V_CS    V_Context
0            1.00       0.00           1.00    0.00
25 CS        0.25       0.00           0.99    0.00
25 US        1.00       0.28          -0.35    0.97
25 CS,US     0.44       0.16           0.99    0.96
50 CS        0.17       0.00           0.97    0.00
50 US        1.00       0.56          -0.99    1.00
50 CS,US     0.29       0.38          -0.94    1.00
75 CS        0.12       0.00           0.63    0.00
75 US        1.00       0.83          -0.99    1.00
75 CS,US     0.21       0.71          -0.94    1.00
ACKNOWLEDGMENTS
This research was supported in part by the Office of Naval Research under Grants N00014-84-K0391 and N00014-85-K-0854, by the Army Research Institute under Contract MDA903-85-C-0324, and by the National Science Foundation under Grants IST-81-20685 and IST-85-12419. Our thanks to Donald H. Perkel for his help with our analysis of contingency and development of the saddle graph; to Mark A. Gluck and Nelson Donegan for their extremely helpful comments on earlier drafts of this article; to Michal T. Young for his extensive collaboration with us, especially in the development of the LS-LN contingency algorithm; to Lynn Nadel, Jeff Willner, and Lisa Kun for their helpful discussions about the Rescorla-Wagner and competing algorithms, and to them and Frank Schottler for help with our experimental setup; to David Benjamin for his help in designing and implementing the computer software for our animal experiments; to Norman W. Weinberger, Gary S. Lynch, and James L. McGaugh for many helpful discussions; to Stacey Murren Granger, Donna Stephens, and Charles L. Post, who are running our experiment-in-progress testing the partial warning condition; and last, but far from least, thanks to Stacey and Joyce, for their tolerance and support.
BASEBALL: AN EXAMPLE OF KNOWLEDGE-DIRECTED MACHINE LEARNING
Elliot Soloway
DEPARTMENT OF COMPUTER SCIENCE
YALE UNIVERSITY
NEW HAVEN, CONNECTICUT 06520
I. Introduction: Motivation and Goals

The problem of how one comes to know something, that is, how one learns something, has been debated for centuries. Plato’s formulation provides a concise statement of the problems involved in learning. The following is a synopsis of a dialogue between Meno and Socrates where the problem of coming to know is discussed (see Plato, 1949): “How could the slave boy learn the proof of the Pythagorean theorem? If the boy did not know it already then how could he recognize it as it is being taught to him; on the other hand, if he knew it already then certainly the boy was not learning.” Plato’s way out of the paradox was to put forth the Doctrine of Recollection: We are born knowing everything and learning is simply recollecting what is already there.

In this article, we will examine the problem of learning from an artificial intelligence (AI)/cognitive perspective and suggest that one need not necessarily be forced into the nativist position espoused by Plato in order to account for learning; that is, we will explore how some general processes can interact with “old knowledge” in order to generate “new knowledge.” In particular, we describe a system called BASEBALL that uses the three processes of interpretation, generalization, and evaluation, plus general knowledge about action-oriented, competitive games, in order to develop an understanding of the specific game of baseball. Input to BASEBALL will be the actions observable in a game of baseball; for example, player B1 throws a ball, player A1 hits the ball, player A1 runs. BASEBALL will draw on its knowledge base and the three aforementioned processes and output rules of baseball such as: player A1 wanted to hit the ball and he succeeded with his goal, while player B1 did not want A1 to hit the ball, and he failed with his goal. In other words, BASEBALL will not attempt to learn the rules of baseball as they are stated in an official rule book. Rather, BASEBALL will attempt to learn about the intentions of the players and about the competitive and cooperative relationships that exist between the players.

While Plato’s position is an extreme one, we will nonetheless argue here that at least some domain knowledge (old knowledge) must be brought to bear in order for a system to learn. Thus, we will argue that a domain-independent system would not be able to carry out the sort of learning that BASEBALL does. Clearly, we must be careful lest we provide BASEBALL with too much knowledge and thus reduce BASEBALL’s learning to mere recollection. While we will not be able to quantify how much old knowledge is too much, we will attempt to assess the contribution of the domain knowledge initially provided to BASEBALL.

The organization of this article is as follows: First, in Section II we depict the description of baseball that BASEBALL sees as input. In the next three sections, we describe each of the main processing levels in BASEBALL: Section III deals with the interpretation process, where BASEBALL attempts to hypothesize intentions and relationships for the observed actors; Section IV deals with the generalization process, where BASEBALL attempts to identify which features of its hypotheses are relevant to its goal and which features can be allowed to vary; Section V deals with the evaluation process, where BASEBALL attempts to assess the truth of its hypotheses. In Section VI we describe the results of running different versions of BASEBALL. Section VII presents some concluding remarks.
II. Representing the Game of Baseball

Input to BASEBALL is supplied by a program that simulates the continuous game of baseball by breaking it up into discrete time intervals, called snapshots. Each snapshot consists of a set of pattern descriptions depicting the state of the world at an instant in time. A pattern description is a 5-tuple which captures four essential perceptual dimensions (features) of such a miniworld (action, actor, location, and time of occurrence) plus any modifiers to those dimensions. For example, Fig. 1 illustrates three sample snapshots. In snapshot 102 we see player A1 THROWing a BALL. That same BALL is MOVING through the AIR in snapshot 103, and an opposing player, B1, is seen SWINGHITting the BALL in snapshot 104. Table I lists the set of actions observable by our system; they are the natural ones for describing an action-oriented game (e.g., RUN, THROW, CATCH). The system represents locations such as pitcher’s mound and first base only as X-Y coordinates; they have no a priori significance. The value of the feature time of occurrence is a number which encodes the time the event occurred in the game (i.e., the first event has time 1, the second event time 2, and so on).

Snapshot 102:
(THROW A1 PM BALL)
(AT A2 HP)
(AT A3 FB)
(AT A9 RF)
(HOLDOBJ B1 HP BAT)
(AT B2 DUGOUTB)
(AT B3 DUGOUTB)
(AT B9 DUGOUTB)

Snapshot 103:
(AIRMOVING BALL PM (FAST))
(AT A1 PM)
(HOLDOBJ B1 HP BAT)
(AT B9 DUGOUTB)

Snapshot 104:
(AT A1 PM)
(AT A2 HP)
(AT A3 FB)
(AT A9 RF)
(SWINGHIT B1 HP BALL)
(AT B2 DUGOUTB)
(AT B3 DUGOUTB)

Fig. 1. Example: unfiltered snapshots.

The game of baseball that BASEBALL observes is a simplified version of the real game. Table II lists the events the system actually observes, while Table III lists some of the events the system does not observe. The major reason some events were not included was the lack of knowledge needed to interpret them as events in a competitive action game. The deficiency in the knowledge base stems in part from our desire to simplify the problem and in part from our ignorance of how to specify in a general way the knowledge necessary to understand some events.

TABLE I
LIST OF OBSERVABLE ACTIONS
(time HOLDOBJECT player location object)
(time AIRMOVING object location modifiers)
Other actions, listed in the same form: THROW, SWINGHIT, CATCH, SWINGMISS, WALK, RUN, ON, AT, GROUNDMOVING

TABLE II
EPISODES OBSERVED BY BASEBALL
Infield Single
Infield Groundout
Outfield Single
Infield Flyout
Outfield Flyout
Outfield Double
Out at Second Base
Infield Single plus One Baserunner
Double-play
Fielder’s Choice - Safe at Firstbase, Out at Secondbase
Fielder’s Choice - Out at Firstbase, Safe at Secondbase
Pitcher Throws - Batter Swings and Misses
Pitcher Throws - Batter Does Not Swing

As we mentioned, choices we made in the design of BASEBALL need to be carefully scrutinized in order to better understand their contribution to the learning process. This need for analysis arises immediately: In representing the game of baseball, have we already given BASEBALL too much of a head start? For example, why didn't we represent the beerman hawking brew or the clouds moving? Why didn't we represent the actions of the players at the level of microactions (arm moving, leg moving, etc.)? Clearly, when one goes to a baseball game, one “sees” all these other features. In coming to understand the game of baseball as a game, one needs to realize that the former features are not relevant to the game; one needs to also integrate microactions into more macro ones. While we cannot put a hard number on it, we feel that we have included enough nonrelevant features in the initial snapshot descriptions: BASEBALL still has a significant amount of work left in order to sort out the relevant from the nonrelevant features in the pattern descriptions. Thus, it does not appear to us that we have unfairly biased the system by some prefiltering of the input data.
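The input representation described in this section can be pictured with a small data structure. The sketch below is illustrative only: the class name `PatternDescription`, the field names, and the choice to store objects such as BALL in the modifiers slot are assumptions made here, not the system's actual encoding.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PatternDescription:
    """One observed fact: who did what, where, and when."""
    time: int                        # snapshot number (time of occurrence)
    action: str                      # e.g. "THROW", "AT", "AIRMOVING"
    actor: str                       # a player such as "A1", or an object such as "BALL"
    location: str                    # opaque label standing in for X-Y coordinates
    modifiers: Tuple[str, ...] = ()  # objects or qualifiers, e.g. ("BALL",) or ("FAST",)


# A game is a mapping from snapshot number to the descriptions seen at that instant.
Game = Dict[int, List[PatternDescription]]

game: Game = {
    102: [PatternDescription(102, "THROW", "A1", "PM", ("BALL",)),
          PatternDescription(102, "HOLDOBJ", "B1", "HP", ("BAT",))],
    103: [PatternDescription(103, "AIRMOVING", "BALL", "PM", ("FAST",)),
          PatternDescription(103, "AT", "A1", "PM"),
          PatternDescription(103, "HOLDOBJ", "B1", "HP", ("BAT",))],
    104: [PatternDescription(104, "AT", "A1", "PM"),
          PatternDescription(104, "SWINGHIT", "B1", "HP", ("BALL",))],
}


def actions_of(game: Game, actor: str) -> List[Tuple[int, str]]:
    """Return (time, action) pairs for one actor, in time order."""
    return [(d.time, d.action)
            for t in sorted(game)
            for d in game[t]
            if d.actor == actor]


print(actions_of(game, "B1"))
```

Keeping locations as opaque labels mirrors the point made above that places such as the pitcher's mound have no a priori significance to the system.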
TABLE III
EPISODES NOT OBSERVED BY BASEBALL
Homerun
Triple
Foul Ball
A Hit with More Than One Baserunner
Sacrifice Flyout
Infield Ground Rule

III. Interpretation Process

BASEBALL attempts to see physical actions and come to an understanding of the nonpurposive and purposive relationships between the observed actions. For example, BASEBALL will see one player HITting a BALL with a bat, the BALL MOVing in the AIR, and then someone CATCHing it; BASEBALL must then understand (1) that the CATCH of the latter player was, in some sense, physically enabled by the HIT of the former player, and (2) that the former player intended to HIT the ball and, moreover, did not want the latter player to CATCH it. In other words, BASEBALL must interpret what it sees in terms of some given model. In this section, we will describe the three levels of interpretation that BASEBALL carries out on the raw input data. We will also highlight the key role that domain knowledge must play in this interpretation process.

A. ATTENTION FOCUSING

The objective of this first level of processing is to (1) reduce the amount of information that needs to be analyzed by “higher-level processes,” and (2) provide a crude structuring on the otherwise continuous stream of data. In particular, attention focusing (AF) attempts to reduce the input data by filtering out actions on the basis of the following heuristic: Change is interesting. Animals too seem to employ this heuristic, and thus they tend to habituate to those features of the environment which are unchanging. Similarly, BASEBALL filters out the actions that do not change from snapshot to snapshot, except those requiring a significant amount of skill and/or energy to perform (e.g., RUNning FAST). The results of this filtering algorithm are quite dramatic. The number of pattern descriptions per snapshot is decreased from 18 to an average of 2 or 3. Certainly, low-energy action sequences that do not change can be important. On a first pass, however, BASEBALL will miss such subtleties. The hope is that later processing will redirect the system’s attention to take note of nonchanging activity when necessary (see Section III,C,2). Next, AF attempts to segment the continuous sequence of snapshots into units which are meaningful in the task domain.
Such units are called episodes and are carved from the snapshot sequence on the basis of the following heuristic: A competitive episode is often indicated by a period of high-energy activity surrounded by periods of low-energy activity. In other words, the cycles of competitive activity in a game can often be distinguished from the ritualistic or preparatory activity by the difference in the degree of energy expended. An “infield single” or a “flyout” would be typical episodes in baseball; a “down” would be an episode in football. Attention focusing is also used in the evaluation of hypotheses generated by subsequent processing; that is, predictions generated from those hypotheses are fed back to AF where they wait to be triggered by incoming data. If any of the predictions are matched, then AF sends a message to hypothesis evaluation with that information (see Section V).
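The two attention-focusing heuristics above (change is interesting; an episode is a burst of high-energy activity bounded by low-energy activity) can be sketched as follows. This is a minimal illustration, not the system's implementation: the tuple encoding, the function names `focus` and `segment`, and in particular the membership of the `ENERGETIC` set are assumptions introduced here.

```python
# Assumption: which actions count as high-energy is chosen here for illustration only.
ENERGETIC = {"RUN", "THROW", "CATCH", "SWINGHIT", "SWINGMISS", "AIRMOVING", "GROUNDMOVING"}


def focus(snapshots):
    """Keep descriptions that changed since the previous snapshot, or that are energetic."""
    previous = set()
    filtered = []
    for snap in snapshots:
        kept = [d for d in snap if d not in previous or d[0] in ENERGETIC]
        filtered.append(kept)
        previous = set(snap)
    return filtered


def segment(snapshots):
    """Group consecutive snapshots containing energetic actions into episodes (by index)."""
    episodes, current = [], []
    for index, snap in enumerate(snapshots):
        if any(d[0] in ENERGETIC for d in snap):
            current.append(index)
        elif current:
            episodes.append(current)
            current = []
    if current:
        episodes.append(current)
    return episodes


# Each description is (action, actor, location, ...); five consecutive snapshots.
snaps = [
    [("AT", "A1", "PM"), ("AT", "B1", "HP")],
    [("THROW", "A1", "PM", "BALL"), ("AT", "B1", "HP")],
    [("AIRMOVING", "BALL", "PM", "FAST"), ("AT", "A1", "PM"), ("AT", "B1", "HP")],
    [("SWINGHIT", "B1", "HP", "BALL"), ("AT", "A1", "PM")],
    [("AT", "A1", "PM"), ("AT", "B1", "HP")],
]

print(focus(snaps))
print(segment(snaps))
```

In this toy run the unchanged (AT B1 HP) description is dropped from the second snapshot, and the throw, flight, and hit are grouped into a single episode bounded by idle snapshots.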
B. COMMON-SENSE PHYSICS
The output of AF is fed to a set of routines that attempts to provide a nonpurposive interpretation of the events; that is, this level of processing takes as input a stream of disconnected actions and produces as output actions that are linked together in terms of causal enablement chains. For example, the system must realize that the BALL THROWn by one player provides the enablement condition for another player to CATCH it. Notice that no statement about why the player might have thrown the ball is made at this level of processing. In effect, this level of processing simply uses a common-sense model of physical actions, much as a young child might, to tie together actions at the level of basic physical causality. BASEBALL uses act schemas to represent physical actions. In particular, each act schema specifies four types of information:

(1) The primitive action class to which the action belongs; for example, the primitive action underlying THROW is PROPEL-INANIMATE-OBJECT. In using primitive actions in the act schemas, as opposed to specific actions, we allow for the establishment of a wider range of causal enablements. For example, we did not think it fair to simply say that a SWINGHIT could occur after a THROW. Rather, we specify for THROW that any action that could count as doing something to a ball at some location could follow a THROW (Fig. 2).
(2) The primary enabling conditions (preconditions) for execution of that action.
(3) The primary consequences of executing that action.
(4) Additional descriptive information about the action, for example, the skill and energy required to perform the action and the range of alternative consequences (see Section III,C,1,b).

For example, Fig. 2 depicts portions of the act schemas for THROW and SWINGHIT. Under the indicator PRIMITIVE-ACTION-TYPE, THROW is listed as an ISA-INSTANCE-OF the primitive action PROPEL-INANIMATE-OBJECT.
The PRIMARY-CONSEQUENCE for THROW states that the object THROWn first must be MOVING and, sometime later, that the same object must LOCATE itself at some location. This specification does not state how the ball should be moving (flying in the air, rolling on the ground, etc.), nor does it state how the object will come to be at a location (by being caught, simply by rolling to a stop, etc.). We define a causal enablement chain to be a sequence of actions in which the consequences of one action (or possibly several actions) satisfy the preconditions of a subsequent action. For example, at the bottom of Fig. 2 we see the snapshot sequence of (pitcher THROWS BALL, batter HITS BALL). The objective is to discover that the pitcher's THROW (snapshot 102) satisfied the physical enabling conditions for the SWINGHIT (snapshot 104). As indicated there, the PRIMARY-ENABLING-CONDITIONS for SWINGHIT are accessed in the act schema and matched against the observed actions. In particular, we see that the primitive action MOVING-INANIMATE-OBJECT in the precondition clause of SWINGHIT is successfully matched against the action AIRMOVING BALL in the observations (snapshot 103), since AIRMOVING BALL ISA-INSTANCE-OF MOVING-INANIMATE-OBJECT. Next, the PROPEL-INANIMATE-OBJECT is successfully matched against the THROW by player A1 (snapshot 102), since THROW ISA-INSTANCE-OF PROPEL-INANIMATE-OBJECT. Similarly, the PRIMARY-CONSEQUENCES from the THROW act schema are successfully matched against the observations in snapshots 103 and 104. The results of this process establish a causal enablement chain between the action THROW in snapshot 102 and the action SWINGHIT in snapshot 104. Finally, to record the establishment of a causal enablement chain, the pattern descriptions of the respective actions are augmented by the addition of new feature descriptors, for example, ENABLED and ENABLED-BY.¹

(THROW A1 PM BALL (ENABLED (104 SWINGHIT)))
(SWINGHIT B1 HP BALL (ENABLED-BY (102 THROW)))

THROW
  PRIMITIVE-ACTION-TYPE
    (ISA-INSTANCE-OF PROPEL-INANIMATE-OBJECT)
  PRIMARY-CONSEQUENCES
    (AND (ISA-INSTANCE-OF MOVING-INANIMATE-OBJECT)
         (ISA-INSTANCE-OF LOCATE-INANIMATE-OBJECT))

SWINGHIT
  PRIMITIVE-ACTION-TYPE
    (OR (ISA-INSTANCE-OF PROPEL-INANIMATE-OBJECT)
        (COULD-COUNT-AS-A-INSTANCE-OF LOCATE-ANIMATE-OBJECT))
  PRIMARY-ENABLING-CONDITIONS
    ((ISA-INSTANCE-OF PROPEL-INANIMATE-OBJECT)
     (ISA-INSTANCE-OF MOVING-INANIMATE-OBJECT))

102 (THROW A1 PM BALL)
103 (AIRMOVING BALL PM (FAST))
104 (SWINGHIT B1 HP BALL)

Fig. 2. Using act schemas to infer causal enablement relationships.

¹Act schemas also contain secondary enabling conditions and secondary consequences (Soloway, 1978). These are used to help cope with the “frame problem.”
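The precondition matching described above can be sketched as a lookup over earlier observations. The sketch is deliberately partial and makes several assumptions: the `ISA` and `PRECONDITIONS` tables are hand-written stand-ins for the act schemas, `match_preconditions` is a hypothetical name, and only enabling conditions are checked (the forward check of THROW's primary consequences is omitted).

```python
# Assumed stand-ins for the act schemas: specific action -> primitive action class.
ISA = {
    "THROW": "PROPEL-INANIMATE-OBJECT",
    "SWINGHIT": "PROPEL-INANIMATE-OBJECT",
    "AIRMOVING": "MOVING-INANIMATE-OBJECT",
    "GROUNDMOVING": "MOVING-INANIMATE-OBJECT",
}

# Assumed primary enabling conditions, expressed as primitive action classes.
PRECONDITIONS = {
    "SWINGHIT": ["MOVING-INANIMATE-OBJECT", "PROPEL-INANIMATE-OBJECT"],
}


def match_preconditions(target, observations):
    """Match each precondition of `target` to the latest earlier observation of that class.

    Observations are (time, action, actor) tuples. Returns the matched
    observations in precondition order, or None if any precondition is unmet.
    """
    time, action = target[0], target[1]
    matches = []
    for primitive in PRECONDITIONS.get(action, []):
        earlier = [o for o in observations
                   if o[0] < time and ISA.get(o[1]) == primitive]
        if not earlier:
            return None
        matches.append(max(earlier, key=lambda o: o[0]))
    return matches


observations = [(102, "THROW", "A1"), (103, "AIRMOVING", "BALL"), (104, "SWINGHIT", "B1")]
swing = observations[2]

matched = match_preconditions(swing, observations)
print(matched)

# Record the link the way the text describes, with ENABLED / ENABLED-BY descriptors.
if matched:
    enabler = min(matched, key=lambda o: o[0])   # earliest action in the chain
    print("ENABLED-BY", enabler[:2], "ENABLED", swing[:2])
```

Matching on primitive classes rather than on specific actions is what lets the same precondition be satisfied by any action that propels or moves an inanimate object, not only by a THROW.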
C. DOMAIN HYPOTHESES

BASEBALL now needs to bring into play knowledge about the domain: knowledge about action-oriented competitive games. It will use that knowledge to provide an interpretation for the observed events and ascribe intentions to the actors in those events. In effect, BASEBALL needs to carry out plan recognition (Schank & Abelson, 1977; Schmidt, Sridharan, & Goodson, 1978). For example, after observing the actions of A1 THROWing the BALL and B1 HITting the ball, it will need to see that the pitcher, A1, did not want the opposing player, B1, to hit the ball. Note that without employing domain knowledge, BASEBALL could not hypothesize the goal of A1: Goals are not “seen,” but rather they are, as it were, laid on the observations. To illustrate the importance of this key point, consider the following Gedanken experiment: Assume a person watched a game of baseball, believing all the while that what was being acted out was a religious ceremony. Given that a priori model, one could readily develop cogent interpretations for the observed events. For example, the ball must be a holy object: It gets passed from individual to individual, after the individual holding a stick walks back to a bench. Or, after hitting the holy object with a stick, an individual visits various stations on the field, typically stopping at one, etc. Initially this example may seem ludicrous. However, it is fair to say that we all have had a similar experience, i.e., using an inappropriate model to interpret some event. Finally, with no overall model of what the actors are doing, one cannot provide any interpretation. Thus, domain knowledge must, and will, play a major role in BASEBALL’s learning activity.

1. A First-Order Characterization of Competitive Action Games

There is a diverse literature on games [see Avedon & Sutton-Smith (1971) for an extensive bibliography]. Anthropologists, sociologists, psychologists, and mathematicians have studied games from a wide range of perspectives and purposes, for example, to identify the invariant structures in all games, to understand the implications for human development of this “universal grammar” of games, to understand the impact of game playing on personality, to develop good techniques for teaching games, and to develop a mathematical characterization of games. Our goal, however, is somewhat different, namely, to describe in general terms the common-sense knowledge about games which would be sufficient to enable a system to learn about a particular game. Since, by and large, current work in the study of games does not address this issue, we have had to develop our own characterization of action-oriented games. In particular, that characterization is based on two key concepts:

Competition: The interactions between players on two teams, where players on team A try to prevent players on team B from achieving their goals, and vice versa.
Cooperation: The interactions between players on the same team, where those players try to help each other achieve some common goal (or goals).

“Winning a game” occurs when team A achieves some distinguished goal(s), while preventing team B from also achieving the distinguished goal(s), or vice versa.

a. Local Competitive Interactions. Drawing on the above characterization of competition and cooperation, we can further characterize the local interaction of two opponents in an action-oriented game as follows:
LOCAL-COMPETITIVE-INTERACTION(ACT-OF(PLAYERl), (ACT-OF(PLAYER 2)-t (FAIL(G0AL-OF(PLAYER1)) and or
SUCCEED(G0AL-OF(PLAYER2)))
(SUCCEED(G0AL-OF(PLAYER1))and FAIL(G0AL-OF(PLAYER2))) where PLAYER1 and PLAYER2 are on OPPOSING TEAMS For example, in the pitcher-batter interaction where the batter hits the ball and safely makes it to first base, we can hypothesize that the pitcher and batter were in the competitive relation of PHYSICAL-COMPETITION, with the pitcher FAILing and the batter SUCCEEDing with their respective goals.2 Note, it must be the case that for any competitive interaction both clauses of the above disjunction must be possible; that is, either player in the competitive interaction must in principle be able to win (succeed). Indeed, it would be a strange “game” if only one team could win! In the pitcher-batter example, we saw the batter SUCCEED while the pitcher FAILS. However, if the interaction were really a competitive ZSituations that appear at first blush to be anomalous can typically be explained. In football, for example, it would seem that both teams FAILed in the situation where a “pass is not completed” (a teammate of the quarterback fails to catch the football and the opposing team did not intercept the pass). However, a more careful analysis of this situation requires that several levels of goals be distinguished. The passing team FAILed by not completing the pass, but SUCCEEDed by not losing possession of the ball. The opposing team FAILed by not intercepting the pass, but SUCCEEDed when the pass was not completed. In the case where an opposing player actually knocked the ball away from the would-be receiver, a hypothesis could be made of a local competitive interaction in which the would-be receiver FAILed because he did not catch the ball and the opposing player SUCCEEDed because he prevented the former player from achieving his goal.
202
Elliot Soloway
one, then the pitcher must be able to SUCCEED and the batter FAIL. From actions in baseball, we in fact know that it is often the case that the pitcher THROWS a BALL and the batter does NOT HIT the BALL. Thus, for each type of competitive interaction, there are two possible outcomes: FAIL(PLAYER l)/SUCCEED(PLAYER2): The GOAL-OF(PLAYER 1) must have been to PREVENT PLAYER2’s action, and since PLAYER2 did execute that action in the particular interaction under investigation, PLAYER 1 FAILed with his goal. Similarly, PLAYER2 must have wanted to execute his action, and thus he SUCCEEDed in the particular interaction. Note that the ACT-OF(PLAYER1) can be said to have “enabled” the ACT-OF(PLAYER2), where the sense of enablement here will be explicated in the ensuing paragraphs. SUCCEED(PLAYERl)/FAIL(PLAYER2):PLAYER2 must not have wanted to execute the action he did in fact execute, thus he FAILed in the interaction under investigation. On the other hand, PLAYER1 must have wanted PLAYER2 to execute that action, and since PLAYERl’s action enabled PLAYER2’s undesired action, PLAYER1 SUCCEEDed in the interaction. In fact, as we shall see in Section V, BASEBALL makes explicit use of the symmetry of the SUCCEED/FAIL relationship in evaluating its hypotheses. BASEBALL is given knowledge about four specific types of competitive interactions: PHYSICAL-COMPETITION: This type of competitive interaction is at the heart of action-oriented, competitive games; one player pits his physical actions against an opponent. For example, the archetype interaction of this sort is the pitcher-batter relationship. (This type of interaction will be discussed in Section III,C, I ,b.) ORDER-OF-OCCURRENCE: In this type of competitive interaction, the key element of time is brought to bear. For example, in baseball, a player who hits a ball must reach first base before the ball is caught by an opponent, who is also standing at first base. 
STATE-OF-DISTINGUISHED-OBJECT: In this type of competitive interaction, one player’s change in actions is related to the conjunction of an opponent’s action and the state of the distinguished object. In baseball, for example, if a player catches a ball (one of the distinguished objects in baseball) that was hit by an opponent, before the ball touches the ground, then the batter is not permitted to get on base. In other words, the “state” of the situation can play an important role in understanding the goals and actions of the players in a game.

LOGICAL-COMPETITION: This type of competitive interaction is the weakest sort; it just says that two players appear to be in competition, but
BASEBALL does not know why. (This type of interaction will be discussed in Section III,C,2.)

The knowledge that allows BASEBALL to recognize each of the four sorts of competitive interactions is encoded as production rules. Moreover, for each type of situation there are two production rules: one that can recognize the FAIL/SUCCEED situation, and one that can recognize the SUCCEED/FAIL situation. We call these rules causal link schemas, since they attempt to explain the change in a player’s actions by positing some enablement link to an opponent’s (or confederate’s) actions. A detailed example of one type of causal link schema (abbreviated CLS) is given in Section III,C,1,b.

b. PHYSICAL-COMPETITION: An Example of a Causal Link Schema.

Assume that BASEBALL just observed player B1’s SWINGHIT in snapshot 103 of Fig. 3. Since this action is a change from what B1 had done previously, BASEBALL seeks to provide some sort of explanation for this change. We shall focus our attention here on how the CLS that can recognize the FAIL/SUCCEED situation of PHYSICAL-COMPETITION can provide a possible explanation for this change.³ In particular, since A1’s THROW (snapshot 101) occurred shortly before B1’s SWINGHIT (snapshot 103), the former is taken as a candidate action to be examined further. Thus, the variables in the predicates of the CLS in Fig. 3 can be bound to the following values: The pattern description for A1’s THROW can be bound to the variable ACT1, and player A1 can be bound to PLAYER1; similarly, the pattern description for B1’s SWINGHIT can be bound to ACT2 and player B1 bound to PLAYER2. Finally, assume that the act schemas for THROW and SWINGHIT have made an inference about the physical relationship between these two actions: The THROW by A1 ENABLED the SWINGHIT by B1.

Now BASEBALL evaluates the predicates in the CLS for PHYSICAL-COMPETITION: The first predicate on the left-hand side of the IF statement is (CHANGEACT ACT2).
In goal-directed behavior, a change in a participant’s actions often indicates a change in goals. Moreover, such changes might be due to some relationship with an opponent or with a teammate. At a minimum then, explanations for changes in a player’s actions must be sought. (Currently, the system does not attempt to explain why a player continued to perform some action.) Since B1’s previous action was simply to stand and HOLD a BAT at HOMEPLATE (snapshot 102), this predicate returns TRUE.

³ All the other CLSs are also called on to see if they can provide a possible explanation. They turn out not to be relevant in this situation.

The second test is (OPPOSING-TEAMS ACT1 ACT2). If B1’s change in actions is to be explained in terms of a competitive relationship, the candidate
CAUSAL-LINK SCHEMA:

Pattern descriptions before the CLS is applied:
(101 THROW A1 PM BALL (ENABLED (103 SWINGHIT)))
(HOLDOBJ B1 HP BAT)
(AIRMOVING BALL HP (FAST))
(103 SWINGHIT B1 HP BALL (ENABLED-BY (101 THROW)))

Pattern descriptions after the THEN side (AUGMENT-PATTERN-DESCRIPTION) is applied:
(101 THROW A1 PM BALL
  (ENABLED (103 SWINGHIT))
  (WANT PREVENT (103 SWINGHIT) FAIL)
  (PHYSICAL-COMPETITION WITH (103 SWINGHIT))
  DIFFICULT-ACT
  (CAN-AFFECT-PERFORMANCE (103 SWINGHIT)))
(103 SWINGHIT B1 HP BALL
  (ENABLED-BY (101 THROW))
  (WANT EXECUTE (103 SWINGHIT) SUCCEED)
  (PHYSICAL-COMPETITION WITH (101 THROW))
  DIFFICULT-ACT
  CHANGED-ACT)
Fig. 3. Application of a CLS to observed actions.
player must be on the opposing team: A1 is on the A team while B1 is on the B team; thus, this predicate is also satisfied.

The third predicate, (PHYSICAL-ENABLE ACT1 ACT2), seeks to establish the direct physical connection between the two actions in question. By accessing information in the pattern descriptions for these actions, this predicate discovers that the act schemas have inferred that a physical enabling relationship does exist between the THROW and the SWINGHIT, and thus can also return true.

Just because B1’s action was enabled by A1’s action does not mean that B1 intended or wanted to execute that action. Similarly, we still do not know whether A1 did not want to enable B1’s action. Judgment about these issues requires that a natural assumption about games be made: A player does not unintentionally perform an action which requires a high degree of skill and energy. HITting a BALL, which is moving FAST, with a stick-like object is an action requiring a significant degree of skill. It is unlikely that a player would execute such an action if he had not intended to do so. This observation stems from the very nature of competitive action games; highly skilled players perform physical actions which test the limits of their physical abilities. Moreover, actions that require a significant degree of skill and/or energy are often the important ones in a game. We assume that a player would not usually want to enable the execution of such an action by an opponent. For lack of a better term, we define “difficult act” to be an act that requires a high degree of skill and/or energy for performance. By accessing declarative information in the appropriate act schemas as to the amount of skill and energy required to perform a THROW and a SWINGHIT, the fourth predicate, (DIFFICULT-ACTS ACT1 ACT2), determines that both of the actions can be considered to be difficult acts.
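The four tests described so far can be pictured as a conjunction of simple predicates over two pattern descriptions. The following is only a minimal sketch in Python, not BASEBALL's production-rule encoding: the `Act` record, its field names, and the helper `first_four_predicates` are hypothetical stand-ins, and the `changed`, `difficult`, and `enabled` values are assumed to have been filled in already by the filtering step and the act schemas.

```python
from dataclasses import dataclass, field


@dataclass
class Act:
    """Hypothetical stand-in for one pattern description."""
    snapshot: int
    name: str
    player: str
    team: str
    changed: bool = False      # differs from the player's previous action
    difficult: bool = False    # act schema reports high skill and/or energy
    enabled: set = field(default_factory=set)  # snapshots this act physically enabled


def change_act(act2: Act) -> bool:
    return act2.changed


def opposing_teams(act1: Act, act2: Act) -> bool:
    return act1.team != act2.team


def physical_enable(act1: Act, act2: Act) -> bool:
    return act2.snapshot in act1.enabled


def difficult_acts(act1: Act, act2: Act) -> bool:
    return act1.difficult and act2.difficult


def first_four_predicates(act1: Act, act2: Act) -> bool:
    """Conjunction of the four tests, in the order the text presents them."""
    return (change_act(act2)
            and opposing_teams(act1, act2)
            and physical_enable(act1, act2)
            and difficult_acts(act1, act2))


# Bindings from the running example: ACT1 is A1's THROW, ACT2 is B1's SWINGHIT.
throw = Act(101, "THROW", "A1", "A", difficult=True, enabled={103})
swinghit = Act(103, "SWINGHIT", "B1", "B", changed=True, difficult=True)
print(first_four_predicates(throw, swinghit))  # True
```

Representing each test as a separate function mirrors the way the schema's left-hand side is described: any single failing predicate blocks the hypothesis.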
The fifth predicate, (CAN-AFFECT-PERFORMANCE ACT1 ACT2), attempts to establish that each player might have succeeded in the candidate competitive interaction. In particular, it attempts to determine if the complementary alternative to the observed outcome of the interaction could have taken place; that is, the predicate reasons hypothetically to determine if the pitcher could have THROWn the BALL toward the batter in such a way as to decrease the likelihood of player B1 being able to execute his SWINGHIT. In our example, this amounts to examining the relationship of THROWing the BALL and subsequently HITting the BALL. In particular, BASEBALL asks itself the question: What effects could occur if the pitcher A1 had applied an increased amount of skill and energy to the performance of his THROW? Drawing on information in the act schema for THROW, BASEBALL can reason that the BALL could travel FARTHER and/or the BALL could travel FASTER. Now, it must ask itself: What effects could the above conditions have on the performance of SWINGHIT? Drawing now on information in the act schema for SWINGHIT, BASEBALL can reason that it would have been more difficult,
hence less likely to HIT a BALL which was moving FASTER. Thus, it is possible that the pitcher could have THROWn the BALL, and the batter might not have HIT it. If there were nothing the pitcher could have done (short of not THROWing the BALL at all or THROWing it completely out of B1’s range) which would have had some negative effect on B1’s SWINGHIT, one would probably not call such an interaction “competitive.” Given the above line of reasoning, this fifth predicate can now return true.

Since all the predicates in the left-hand side of the CLS of Fig. 3 are true, the hypothesis on the right-hand side of the CLS can be triggered. The inference made in this case is that a competitive relationship, PHYSICAL-COMPETITION, seems to exist between A1 and B1. Moreover, based on the preceding argument that players in a competitive game do not usually intend to enable a difficult action of an opponent, A1’s goal was hypothesized to be that of preventing his opponent’s action of SWINGHIT, and thus he FAILed with his goal in this instance. Similarly, based on the assumption that players usually intend to execute difficult acts, B1’s goal was hypothesized to be that of wanting to execute the action SWINGHIT, and thus he SUCCEEDed with his goal in the observed instance. Thus, the action of the CLS is to output a production rule that has embedded in it an augmented pattern description: The above values for the features goal and competitive causal relationship have been added to the pattern descriptions of A1’s and B1’s actions; values for the features change act, difficult act, and can-affect-performance are also added. (Recall that the physical-enable relationship has already been added by the appropriate act schemas.)

c. Local Cooperative Interactions.

Another way to explain why a player executed a change in his action sequence is to appeal to a COOPERATIVE-INTERACTION which enabled the change.
In order to hypothesize a COOPERATIVE-INTERACTION, the system tries to establish a link to a previous action by the same player or by a teammate of the player. Moreover, just as in the case of competitive interactions, we can define a goal structure for cooperative interactions independent of the particular type of cooperative relationship.
COOPERATIVE-INTERACTION(ACT-OF(PLAYER1), ACT-OF(PLAYER2)) = [GOAL-OF(PLAYER1) = WANT ENABLE ACT-OF(PLAYER2)] and [GOAL-OF(PLAYER2) = WANT EXECUTE ACT-OF(PLAYER2)] where PLAYER1 and PLAYER2 are either the same player or players on the same team, and PLAYER1’s action “enabled” PLAYER2’s action.

For example, in Fig. 4 the GOAL-OF A5’s THROW was to enable A3’s CATCH, and A3’s goal was to CATCH the BALL. At the level of local cooperative interaction, the outcome (SUCCEED-FAIL) structure is trivial; we simply label both parties to the cooperative interaction as SUCCEEDing. Since there is no alternation of SUCCEED-FAIL outcomes, the system does not need two versions (two CLSs) of each cooperative relationship.⁴

Fig. 4. An infield single episode after interpretation.

2. Using Top-Down Knowledge to Redirect Attention
In the snapshots in Fig. 5a, we see the pitcher, A1, engaged in the apparently high-energy action of THROWing the BALL FAST. Notice, however, that no opposing player appears in the snapshots; the pitcher’s teammate A2 simply CATCHes the BALL. Now consider the unfiltered version of snapshots 203 and 204 in Fig. 5b. In snapshots 203 and 204 we see the batter B1 simply standing at the HOMEPLATE HOLDing a BAT. Since his actions did not change, they were eliminated by the initial crude filtering during AF. No ordinary competitive CLS was able to operate since there was no change in one team’s action correlated with a change in an opposing team’s action. Thus, BASEBALL will find itself at the end of an episode in which no competitive interaction had been recognized.⁵ If BASEBALL wants to believe that this sequence of snapshots is a competitive episode, then it must somehow discover a competitive interaction. To do this we equip the system with special CLSs which force the system to go back to the original data in the hope of finding an action that could be considered to be in competition with some action of the opposing team. These CLSs are only invoked when an episode has been processed in which no possible competitive interaction was found. In effect, BASEBALL wants to believe that competition is taking place, and thus it goes back to the original data for a “second look.” In particular, the special CLSs look in the original data for an action performed by an opposing player which is close in time and/or location to a difficult action performed by the other team. For example, in Fig. 5b we see that player B1 is standing in a position immediately adjacent to where the BALL THROWn by A1 is caught by A2 and, in addition, B1’s action occurs concurrently with team A’s actions. In this situation, the special CLSs hypothesize LOGICAL-COMPETITIVE-INTERACTIONS between the pitcher A1 and the batter B1, and the catcher A2 and the batter B1.
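The “second look” can be illustrated with a small search over the unfiltered data. This is a hedged sketch rather than the special CLSs themselves: the `Observation` record, the `is_close` test with its `max_gap` threshold, the snapshot number given to A2's CATCH, and the decision to treat the CATCH as a difficult act are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical record for one observed action in a snapshot."""
    time: int
    action: str
    player: str
    team: str
    location: str
    difficult: bool = False


def is_close(a, b, max_gap=1):
    """Close in time and/or location; max_gap is an illustrative threshold."""
    return abs(a.time - b.time) <= max_gap or a.location == b.location


def second_look(filtered, unfiltered):
    """Pair each difficult act with nearby opposing acts found in the raw data."""
    hypotheses = set()
    for act in filtered:
        if not act.difficult:
            continue
        for other in unfiltered:
            if other.team != act.team and is_close(act, other):
                hypotheses.add(("LOGICAL-COMPETITION", act.player, other.player))
    return hypotheses


# Filtered data: only team A's actions survived the initial filtering.
filtered = [
    Observation(202, "THROW", "A1", "A", "PM", difficult=True),
    Observation(205, "CATCH", "A2", "A", "HP", difficult=True),
]
# Unfiltered data: B1's unchanged HOLDOBJ actions are visible again.
unfiltered = filtered + [
    Observation(203, "HOLDOBJ", "B1", "B", "HP"),
    Observation(204, "HOLDOBJ", "B1", "B", "HP"),
]
for hypothesis in sorted(second_look(filtered, unfiltered)):
    print(hypothesis)
# ('LOGICAL-COMPETITION', 'A1', 'B1')
# ('LOGICAL-COMPETITION', 'A2', 'B1')
```

The search only runs over data that the filter had discarded, which is why it is cheap to reserve for episodes in which no ordinary competitive interaction was found.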
This type of competitive relationship is a catchall one; the system does not know exactly what type of relationship exists, but it wants to believe that some type of relationship does exist. Moreover, BASEBALL cannot identify which player SUCCEEDed and which player FAILed. Thus, both outcomes are hypothesized.

⁴ We label the cooperative relationship between A5 and A3 as one of PHYSICAL-COOPERATION. In contrast, we label as ORDER-OF-OCCURRENCE-COOPERATION the rather trivial relationship of B3’s RUN after his SWINGHIT (Fig. 4).

⁵ BASEBALL uses a grammar for competitive episodes, represented as an augmented transition network, in order to actually “parse” episodes (Soloway, 1978).

While this particular situation is
Fig. 5. Using knowledge to redirect attention. (a) Filtered snapshots of a (PITCHER THROWS-BATTER DOES NOT SWING) episode. (b) Unfiltered snapshot.
an extreme case, BASEBALL not infrequently hypothesizes more than one interpretation for observed events. During hypothesis evaluation, BASEBALL tries to bring evidence to bear in determining which, if any, of the hypotheses might be true (see Section V).

3. Using Acquired Knowledge
An important test of a learning system is its ability to use the knowledge it has acquired. Two questions arise in this regard:

How to use the acquired knowledge: A system must be able to decode the representation of the new knowledge; for example, if the new information is represented as a LISP procedure, then the LISP interpreter would provide the appropriate interpretation.

When to use the acquired knowledge: A system needs to have some heuristics that suggest contexts in which the new information can be appropriately applied.

The problem of how to use the acquired knowledge is resolved in BASEBALL by encoding both the new information and the old information in the same
representation, as production rules. Thus, from the standpoint of execution, old knowledge is indistinguishable from new knowledge; the same interpreter that applies the initially supplied general knowledge can also apply the acquired specific knowledge. Because the acquired knowledge in BASEBALL is only once removed from the general knowledge, there is little problem in knowing when to use the acquired knowledge: Since the hypothesized specific CLSs serve to suggest goals and relationships for players’ actions, just like the parents which produced them, this acquired knowledge can be placed at the same level as the parent knowledge; that is, whenever the initial general knowledge is applicable, the acquired specific knowledge is also applicable.⁶

4. Extending BASEBALL’s Understanding of Games
BASEBALL has only a fraction of the knowledge actually needed to understand what is going on. Below we identify two types of knowledge that would need to be incorporated into BASEBALL in order to raise its understanding capacity.

Knowledge of markers: In action-oriented games, there are always distinguished events that must be kept track of. For example, in baseball, there are balls, strikes, hits, and runs. BASEBALL would need rules to help it correlate the achievement/failure of goals with the various markers.

Composing local goals into higher-level goals: In Fig. 4 we depict BASEBALL’s analysis of an infield single episode. We can view the local competitive interactions as subgoals of the episode. For example, if A1 had SUCCEEDed with his goal of PREVENTing B1 from HITting the BALL, then the episode would have ended there. Since A1 FAILed at this juncture, the rules of baseball give team A another chance at PREVENTing B3 from SUCCEEDing with his episode goal, namely, have A3 CATCH the BALL before B3 arrives at FIRSTBASE. In other words, the success (failure) of a sequence of subgoals does not necessarily imply success (failure) of the final goal. It would be nice to have a more complete theory of competitive interactions in order to better integrate the local goals into some sort of hierarchical goal structure. [See Bruce & Newman (1978) and Lehnert (1981) for analogous work in describing the interactions of individuals in conversations and stories.]

⁶ The integration of new knowledge into the level of the parent knowledge can be viewed as a process of passing the capabilities from the parent to the spawned knowledge. Lenat (1976), who also adopts this technique for his system AM, reports that problems can arise if the spawned knowledge simply inherits the parent’s capabilities; as the new knowledge becomes more and more specialized (more and more removed from the general knowledge), the heuristics associated with general knowledge are not sufficiently constraining, so they provide little guidance in carefully choosing contexts in which the specialized knowledge would be appropriate.
IV. Generalization Process

The output of the interpretation process is a set of hypotheses that deal with specific situations; for example, A1 WANTS to HIT the BALL THROWn by B1, but B1 does NOT WANT A1 to do so. The generalization process must transform the specific hypotheses into more general ones; for example, a PLAYER ON ONE TEAM WANTS to HIT the BALL THROWN by a PLAYER ON THE OPPOSING TEAM. As evidenced by even this simple example, knowledge about games must be employed in order to carry out the generalization; for example, BASEBALL must know that there are two teams competing with each other in order to carry out the above generalization. In fact, the key point of this entire section is that domain knowledge is required by the generalization process in order to develop effective generalizations. We recognize that this claim is a strong one and contrasts with the stated goal of some work done on generalization (e.g., Vere, 1975). Thus, before launching into a description of how BASEBALL actually performs hypothesis generalization, we will first present a critical review of a “generic” generalization technique; we will argue that, contrary to the intended goal of such a technique, domain knowledge is imported into the generalization process, albeit only implicitly.

A. DATA-DIRECTED GENERALIZATION: A CRITICAL ASSESSMENT

The history of AI is filled with attempts at developing domain-independent problem-solving processes. For example, it was initially thought that a system such as the general problem solver (GPS) (Newell, Shaw, & Simon, 1959) would be appropriate for large classes of problems. However, the lesson learned from experimenting with that type of system is that weak general methods (domain-independent heuristics, such as means-ends analysis) need to be augmented by strong specific methods (i.e., domain knowledge). Theorem proving and machine translation have a similar history.
For example, more than the dictionaries and syntactic descriptions of languages is needed in order to effectively translate one language into another: A system needs to understand the source language statement. Of course, it would be nice if the computer system did not need to know about a subject domain: AI would then not need to be involved in the difficult business of knowledge codification and representation. However, the experience of researchers is that domain knowledge must be employed in order to create a truly effective problem solver. Given this history, it thus should come as no surprise that the development of domain-independent generalization strategies is an unattainable goal. In this section, then, we will first describe an ostensibly domain-independent generalization strategy that is representative of ones typically in the literature, and then show how domain knowledge nonetheless does at least tacitly creep into its processing.
The following rule underlies most of the ostensibly domain-independent generalization schemes:

R1: If a subset of features has been common to a number of instances in the past, then that subset will probably be common to instances of that class in the future and hence will aid in the recognition/discrimination of that class in the future.

Figure 6 illustrates how R1 is used to actually construct generalizations. In that figure two symbolic descriptions of objects are depicted; in Scene 1, there is a small, black circle and a square, while in Scene 2, there is a black square and a circle. Generalization A, produced using R1, reflects the commonalities exhibited in both scenes; the details that differed in each scene were eliminated. Thus, a variable, X, was substituted in the feature BLACK, since the BLACK object was different in each scene. Also, the feature SMALL(P1) from Scene 1 was not included in generalization A, since it had no counterpart in Scene 2. Note that these changes were required by the data. By “variabilizing” more constants and/or “dropping” more features, other generalizations are possible and are consistent with R1. Given the distinctly bottom-up flavor of this type of scheme, we term this approach data-directed generalization. In what follows, we identify three key problems that immediately arise with this sort of generalization technique and the means by which those problems are handled, typically using domain knowledge.

1. Coping with the combinatorial explosion of potential generalizations: As can be seen in Fig. 6, a system must have heuristics to control the generation of possible generalizations. Typically, a partial ordering is defined on the set of generalizations, using the instance-of relation (Hayes-Roth & McDermott, 1976; Plotkin, 1970; Reynolds, 1970; Vere, 1975). Then the least common generalization (LCG) is usually the only generalization that is generated. Thus, for example, the LCG in Fig. 6 would be generalization A.
The LCG will be matched against additional exemplars and further generalized. Note that the instance-of relation is domain independent. However, often one wants to put some other domain-specific ordering on the set of generalizations. Larson and Michalski’s system (1977) allows the user to specify domain-specific ordering relationships. For example, if the generalization system is dealing with descriptions of laboratory tests and surgical procedures for diseases, one might want to order the generalizations in terms of cost of execution of the analysis rather than simply instance-of.

2. Coping with multiple least common generalizations: Often there is no unique LCG. The system must then be given some heuristics for how to deal with this situation. If one generalization (or a small number of generalizations) is not chosen, then again there is the potential for a combinatorial explosion: When additional exemplars are observed, the LCG(s) need to be refined in order to account for the new exemplars. The generalization systems developed by Hayes-Roth and McDermott (1975), Larson and Michalski (1977), and Michalski (1977) again allow the user to specify domain-specific criteria to be used in selecting the best generalizations.

3. Requiring an adequate training set: A system can learn with or without a teacher. In the former scheme, a teacher presents exemplars to the system, where the exemplars are chosen by the teacher from a given class; the system works on developing a generalization for one class at a time. Note that the teacher must ensure that exemplars always exhibit the telltale features of the class; noisy data are typically not allowed.⁷ In the latter scheme, the system is presented with exemplars from all classes without the system being informed as to which class an exemplar belongs; the system itself must organize the exemplars. By and large, all data-directed generalization schemes require a teacher to be present. In effect, then, the teacher provides the domain knowledge: He serves to guarantee that the exemplars will exhibit the key features and that exemplars will belong to the same class. Moreover, some data-directed generalization systems are sensitive to the order in which the exemplars are presented.

While the above problems with data-directed generalization might be viewed as “technical,” there is a deeper, more fundamental question regarding this scheme; namely, can such an approach discover rules composed of the relevant features, as opposed to the merely correlated ones? While blue eyes may correlate with pitchers, this feature isn’t relevant to BASEBALL’s objective of understanding the goals of the pitchers. Thus, relevance is not simply correlation, but rather something determined with respect to a goal. Since data-directed generalization techniques do not take goals into account, they cannot in principle identify the relevant features amid the set of correlated features.

Fig. 6. Possible generalizations of two geometric scenes.
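The commonality rule behind data-directed generalization can be made concrete on the two geometric scenes of Fig. 6. The sketch below is an illustration under simplifying assumptions, not any of the cited systems: the function name `generalize`, the dictionary encoding of a scene (one object constant per feature), and the variable naming scheme are all hypothetical.

```python
from itertools import count


def generalize(scene1, scene2):
    """Keep shared features, variabilize constants that differ, drop the rest."""
    fresh = count()
    general = {}
    for feature, obj in scene1.items():
        if feature not in scene2:
            continue                               # no counterpart: drop the feature
        if scene2[feature] == obj:
            general[feature] = obj                 # same constant in both scenes
        else:
            general[feature] = f"?x{next(fresh)}"  # constants differ: use a variable
    return general


# Scene 1: a small, black circle (P1) and a square (P2).
scene1 = {"SMALL": "P1", "BLACK": "P1", "CIRCLE": "P1", "SQUARE": "P2"}
# Scene 2: a black square (P2) and a circle (P1).
scene2 = {"BLACK": "P2", "CIRCLE": "P1", "SQUARE": "P2"}

print(generalize(scene1, scene2))
# {'BLACK': '?x0', 'CIRCLE': 'P1', 'SQUARE': 'P2'}
```

Nothing in this procedure consults a goal: a feature survives only because it recurs, which is exactly why such a scheme cannot separate relevant features from merely correlated ones.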
In sum, then, data-directed generalization schemes need to import domain knowledge in order to cope with technical problems that arise, and they need to import domain knowledge in order to identify the relevant features of the domain and not merely the correlated ones. In the next section, we will describe how BASEBALL carries out generalization, and we will point out how domain knowledge is used in order to identify the relevant features.

B. KNOWLEDGE-DIRECTED GENERALIZATION

Since the goal of BASEBALL is to discover general classes composed of relevant features, domain knowledge is used to direct the discovery process; data-directed generalization plays a secondary role.

⁷ If an exemplar of a class is missing a key feature, then the generalization technique will “drop” that feature from the resultant generalization.

Again, we have tried to take
care not to include so much knowledge in this process as to trivialize the whole enterprise; we again invite the reader’s inspection.

Before describing how generalization is carried out in BASEBALL, we need to point out the different levels of classes that BASEBALL attempts to produce. From our analysis of action-oriented, competitive games, we see three levels of abstraction as being natural and meaningful in this type of situation: (1) the competitive interaction, (2) the episode, and (3) the episode’s final competitive goals.⁸ In what follows, then, we describe how, without an explicit teacher, BASEBALL partitions the data at these three levels of abstraction into meaningful classes and discovers the allowable variation within a class.

1. Forming Classes of Competitive Interactions
In order to establish a general class at the level of local competitive interaction, a subset of features of a pattern description⁹ must be distinguished. As we said earlier, a rule such as R1 serves to pick out such a subset on the basis of commonalities evident in two or more instances. The role that this rule serves in data-directed generalization schemes is analogous to the following rule employed by BASEBALL to highlight relevant features:

R2: Hypothesize as relevant that subset of features added to the pattern descriptions during the interpretation phase (e.g., goal, competitive relation, difficult act, physical enablement).

For example, in Fig. 7 we depict pattern descriptions from two competitive interactions. Those features that are asterisked in the pattern descriptions are those that satisfy R2, namely, the features added by the appropriate act schemas and by the CLSs which have hypothesized competitive relationships between the actions of opposing players or cooperative relationships between members of the same team. As described below, eventually BASEBALL will form a class for each competitive interaction depicted in Fig. 7 on the basis of features selected by R2. R2 directs BASEBALL to employ knowledge in a top-down fashion in order to hypothesize feature relevancy. In contrast to the data-directed generalization schemes, which require that a number of examples be observed before feature correlation can be established, R2 needs to see only one example of a competitive (or cooperative) interaction in order to hypothesize feature relevance. In effect, the crucial aspects of a class description are determined during the interpretation phase.

⁸ Clearly, some composition of episodes or some other portions of an episode (e.g., the beginning) may be meaningful in a game. However, as we indicated earlier, we do not understand games well enough in general to specify such additional units.

⁹ Recall that the output of the interpretation process is production rules. However, for simplicity’s sake, hereafter we shall ignore the production rule representation and speak in terms of generalizing the pattern description representation of the competitive and cooperative interactions into classes. We simply remind the reader that a production rule is associated with each class of interactions.
Fig. 7. Classes of competitive interactions.
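The relevance rule stated above, which hypothesizes as relevant exactly those features added during interpretation, amounts to selecting features by their provenance rather than by their recurrence across examples. The following is a minimal sketch under stated assumptions: the dictionary encoding, the source labels `observed`, `act-schema`, and `cls`, and the function name `relevant_features` are hypothetical, and the example values are taken from the augmented THROW description of Fig. 3.

```python
# Sources that count as "added during interpretation" (labels are hypothetical).
INTERPRETATION_SOURCES = {"act-schema", "cls"}


def relevant_features(pattern_description):
    """Select the features whose source is the interpretation phase."""
    return {
        name: value
        for name, (value, source) in pattern_description.items()
        if source in INTERPRETATION_SOURCES
    }


# Each feature maps to (value, source); only one example is needed.
throw = {
    "player": ("A1", "observed"),
    "location": ("PM", "observed"),
    "enabled": ("(103 SWINGHIT)", "act-schema"),
    "goal": ("(WANT PREVENT (103 SWINGHIT) FAIL)", "cls"),
    "competitive-relation": ("(PHYSICAL-COMPETITION WITH (103 SWINGHIT))", "cls"),
    "difficult-act": (True, "cls"),
}

print(sorted(relevant_features(throw)))
# ['competitive-relation', 'difficult-act', 'enabled', 'goal']
```

Because the selection depends only on where a feature came from, a single interpreted interaction is enough to produce a relevance hypothesis.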
While features are hypothesized to be relevant after observation of only one example, a class is not formed until BASEBALL sees another similar interaction. Two interactions are similar if they agree (match) on all the features hypothesized as relevant by R2. Consider the competitive interactions taken from two infield singles (Fig. 7a and b) and two infield groundouts (Fig. 7c and d). Two classes of competitive interactions can be generated, since the competitive interaction from the infield groundout matches a competitive interaction from the other infield groundout, but does not match either of the competitive interactions from the infield singles, and vice versa. In other words, class formation in BASEBALL is a two-step process: First, a subset of features is hypothesized as relevant using R2, and then classes are formed using a rule akin to R1, namely, R1*: Merge together two pattern descriptions which have a distinguished subset
of features in common. In the previous section, we saw that R, was used for two purposes: hypothesis of relevant features and specification of criteria for matching two events and merging them into a class. The variability within a class of interactions reflects the variability of the values for those features which were not initially hypothesized as relevant, for example, the “location” feature and the “player” feature. The data-directed generalization technique of variabilization is used to replace constants by variables in the generalized pattern descriptions. For example, in Fig. 8 the action takes place at FIRSTBASE in the two competitive interactions of Fig. 8a and b, while it takes place at SECONDBASE in those of Fig. 8c and d. In order to accommodate the differences in the location at which the action takes place, the location feature is variabilized; the data force the constants FIRSTBASE and SECONDBASE to be replaced by a variable. However, we do not allow that variable to match any location. Rather, BASEBALL builds a set which contains the observed constants and constrains the variable to match only elements of that set (Fig. 8e). As more examples of similar interactions are observed, the set can be extended.
Fig. 8. Example: variable substitution. [Figure not reproduced: panels (a) and (b) show competitive interactions at FIRSTBASE, panels (c) and (d) show competitive interactions at SECONDBASE, and panel (e) shows the generalized pattern description for the class, in which the location variable is constrained by (MEMBER $LOCATION (FIRSTBASE SECONDBASE)) and the player variables by (OPPOSING-TEAMS $PLAYER1 $PLAYER2).]
We call the above generalization strategy constrained data-directed generalization (CDDG) and contrast it with unconstrained data-directed generalization (UDDG). The latter strategy also replaces a constant by a variable, but does not limit that variable to match a specific set of values. The data-directed generalization schemes reported on earlier tend to employ UDDG. In substituting a variable for differing constants, CDDG constrains that variable to match one of the values already observed, while UDDG places no restrictions on the variable. However, UDDG's more aggressive strategy has a tendency to produce overgeneralizations (footnote 10). Since the level of generalization of a hypothesis can play a role in the verification (or elimination) of that hypothesis, overgeneralization can have troublesome consequences. In Section VI we devote an entire experiment to this issue; we report on the performance of both techniques with regard to their tendency to mistakenly accept incorrect hypotheses as truths and their tendency to form classes at the correct level of generalization.
BASEBALL is supplied with knowledge about the player feature. In Fig. 8, we see that four different players, (A3 B6) and (B3 A1), were involved in competitive interactions. Since the player feature was not hypothesized as relevant during the interpretation phase, the value of this feature is allowed to vary in response to the data. Thus, in the general class formed for this interaction (Fig. 8e), the constants (A3 B6) and (B3 A1) are replaced by two variables. However, BASEBALL knows that in a competitive interaction the two player variables must refer to members of opposing teams.
Thus, the generalization in this example required that the variables substituted for the constants in the player features be constrained to match only members of opposing teams. In BASEBALL, we have adopted a constraint similar to that of Sussman (1973) and Vere (1978) which serves to prevent the generation of multiple least generalizations. All features in pattern descriptions are required to participate in the generalization; no feature can be eliminated (dropped). In addition, once a generalized hypothesis is verified as being correct (see Section V), the features in the pattern description become frozen; that is, they are no longer subject to generalization. For example, if the location feature has remained constant, as it does in the (THROW A1 PM BALL) (SWINGHIT B1 HP BALL) interaction, then it in effect has become one of the discriminating features for interactions in that class. This is also true for features that have become variabilized (e.g., the location feature in Fig. 7).
Footnote 10. However, as the example in Fig. 8 illustrates, even CDDG can lead to an overgeneralization.
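The difference between CDDG and UDDG can be made concrete with a minimal sketch. The Python below is an illustration written for this discussion and is not BASEBALL's implementation; the names Feature, ANY, and generalize are hypothetical, and only a single feature slot (such as location) is modeled.

```python
from dataclasses import dataclass

# Hypothetical marker for a UDDG variable, which matches every value.
ANY = object()


@dataclass
class Feature:
    """One feature slot in a generalized pattern description."""
    allowed: object  # a set of observed constants (CDDG) or ANY (UDDG)

    def matches(self, value):
        return self.allowed is ANY or value in self.allowed


def generalize(feature, value, constrained=True):
    """Fold one newly observed constant into a feature slot.

    constrained=True  -> CDDG: extend the set of observed constants.
    constrained=False -> UDDG: replace the slot by an unrestricted variable.
    """
    if feature.matches(value):
        return feature  # nothing to accommodate
    if constrained:
        return Feature(feature.allowed | {value})
    return Feature(ANY)


# The location feature after observing FIRSTBASE and then SECONDBASE.
location = Feature({"FIRSTBASE"})
cddg = generalize(location, "SECONDBASE")
uddg = generalize(location, "SECONDBASE", constrained=False)
```

Under these assumptions the CDDG variable built from FIRSTBASE and SECONDBASE still rejects THIRDBASE, while the UDDG variable accepts it; this is the sense in which UDDG is the more aggressive strategy.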
2. Formation of Classes of Episodes
A class of episodes is composed of a set of similar episodes, where we say that episode (i) is similar to episode (j) if and only if every competitive and cooperative interaction in one episode has a similar corresponding interaction in the other. Since the interactions comprising an infield single episode and an infield groundout episode are all similar except for the final competitive interaction, these two episodes cannot be merged into the same class. The similarity requirement between episodes in a class also enables BASEBALL to bring context to bear on the variabilization process in order to prevent overgeneralization; that is, variabilization of constants in the hypothesized local interactions is only to take place at the direction of other interactions which are contained in episodes of the same class. For example, the infield single and outfield single episodes are not similar, since the outfield single does not contain the cooperative interaction between a fielder and the first baseman which is contained in the infield single. In this case variabilization of constants is not permitted. If we were to allow generalization to take place across the interactions in these dissimilar episodes, the location feature in the final competitive interactions would need to be variabilized to accommodate the differences in the two episodes. This would be most unfortunate, since the fact would be lost that the fielder must be at the location to which the batter is running in order for there to be an ORDER-OF-OCCURRENCE timing relationship. While this strategy may seem ad hoc, it actually reflects a deeper understanding of the problems with generalization. Whenever global contextual constraints are available, they should be used to control generalization. It is a strong requirement that all episodes in a class match on all the competitive and cooperative interactions.
Additional knowledge about flexibility within a class might permit this constraint to be weakened, and thus a partial match between episodes might suffice. For example, BASEBALL currently constructs separate classes for simple infield singles and for infield singles in which a player who is a teammate of the batter is already at FIRSTBASE. In the latter episode, at least an additional cooperative interaction is hypothesized; when the batter HITS the BALL, an ORDER-OF-OCCURRENCE-COOPERATION is hypothesized to exist between the batter at HOMEPLATE and the runner at FIRSTBASE. The requirement that all cooperative and competitive interactions participate in the match means that infield singles without this additional cooperative interaction will be considered to be different. While this interpretation is reasonable, one might nonetheless want to lump both types together. In any case, since BASEBALL is not provided with knowledge which tells it what can be ignored in a match, the strong requirement of complete matching is currently required.
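The strict matching requirement described above can be sketched as follows. This is a hedged illustration, not the system's code: the feature names relation and earlier_act, the episode names, and the functions signature, episode_key, and form_episode_classes are all hypothetical, and the correspondence between interactions is assumed to be one-to-one.

```python
from collections import defaultdict

# Features assumed to have been hypothesized as relevant; "location" is
# deliberately left out, so it is free to vary within a class.
RELEVANT = ("relation", "earlier_act")


def signature(interaction, relevant=RELEVANT):
    """Project an interaction onto its relevant features only."""
    return tuple(interaction[name] for name in relevant)


def episode_key(interactions, relevant=RELEVANT):
    """Order-independent summary of every interaction in an episode."""
    return tuple(sorted(signature(i, relevant) for i in interactions))


def form_episode_classes(episodes, relevant=RELEVANT):
    """Group episodes whose interactions all match; a lone episode forms no class."""
    groups = defaultdict(list)
    for name, interactions in episodes.items():
        groups[episode_key(interactions, relevant)].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


def pitch():
    return {"relation": "PHYSICAL-COMPETITION", "earlier_act": "THROW",
            "location": "HOMEPLATE"}


def race(earlier_act, location):
    return {"relation": "ORDER-OF-OCCURRENCE-COMPETITION",
            "earlier_act": earlier_act, "location": location}


EPISODES = {
    "infield-single-1": [pitch(), race("ON", "FIRSTBASE")],
    "infield-single-2": [pitch(), race("ON", "FIRSTBASE")],
    "infield-groundout-1": [pitch(), race("CATCH", "FIRSTBASE")],
}
CLASSES = form_episode_classes(EPISODES)
```

With this toy data the two infield singles form one class, while the groundout, which differs only in its final competitive interaction, remains unclassified until a second similar episode is seen. Weakening the requirement to a partial match would amount to comparing only part of each episode key, which is precisely the knowledge BASEBALL is said to lack.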
3. Formation of Classes of Final Competitive Goals of Episodes
The highest level of classification currently in BASEBALL is based on only one aspect of the episode, the goals of players in the final competitive interaction of an episode. This classification scheme allows us to merge together episode classes such as outfield flyout and infield groundout into one class, namely, outs. The motivation for such a level of generalization stems from our original assumption about the hierarchical structure of plans and the special status of the final goal in that hierarchy. In particular, the sequence of actions and subgoals leading up to the final goal can be considered as one method for achieving the final goal. However, other action subgoal sequences may also serve to achieve the same goal. Though the episodes outfield flyout and infield groundout have somewhat different action subgoal sequences, they can nonetheless be merged together, since they both achieve the same final goal, namely, preventing the batter from reaching FIRSTBASE. The technique for discovering classes at this level is the same as that employed at the lower level of episode classes. The “drop rule” is used to eliminate all features except the goals hypothesized for the players engaged in the final competitive interaction. R,* is then used to collapse together similar episodes; all episodes which match with respect to final competitive goals are merged together. Since the goal [?PLAYER1 WANTS PREVENTS $PLAYER2 ON FIRSTBASE] [?PLAYER2 (OPPOSING-TEAMS $PLAYER1 $PLAYER2)) WANTS EXECUTE ON FIRSTBASE] is common to both the infield groundout and the outfield flyout, these two episodes can be merged together. A discussion of the classes formed in this manner is presented in Section VI.
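A minimal sketch of this highest level of classification follows. It assumes, purely for illustration, that each episode class is a list of interactions whose last element is the final competitive interaction, and that a goal is a (role, want, act, outcome) tuple; the function names final_goals and merge_by_final_goals are hypothetical.

```python
from collections import defaultdict


def final_goals(interactions):
    """Drop rule: keep only the goals of the final competitive interaction."""
    return frozenset(interactions[-1]["goals"])


def merge_by_final_goals(episode_classes):
    """Merge episode classes whose final competitive goals match."""
    merged = defaultdict(list)
    for name, interactions in episode_classes.items():
        merged[final_goals(interactions)].append(name)
    return sorted(sorted(names) for names in merged.values())


# Goals are (role, want, act, outcome); the outcome is included in the match.
PITCH = [("pitcher", "PREVENT", "SWINGHIT", "FAIL"),
         ("batter", "EXECUTE", "SWINGHIT", "SUCCEED")]
OUT = [("fielder", "PREVENT", "ON FIRSTBASE", "SUCCEED"),
       ("batter", "EXECUTE", "ON FIRSTBASE", "FAIL")]
HIT = [("fielder", "PREVENT", "ON FIRSTBASE", "FAIL"),
       ("batter", "EXECUTE", "ON FIRSTBASE", "SUCCEED")]

EPISODE_CLASSES = {
    "infield-groundout": [{"goals": PITCH}, {"goals": OUT}],
    "outfield-flyout": [{"goals": PITCH}, {"goals": OUT}],
    "infield-single": [{"goals": PITCH}, {"goals": HIT}],
}
TOP_LEVEL = merge_by_final_goals(EPISODE_CLASSES)
```

Under these assumptions the groundout and the flyout collapse into a single class corresponding to outs, while the infield single stays apart because the outcomes of its final goals are reversed.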
V. Evaluation Process
As we clearly stated earlier, the interpretations generated by BASEBALL can be incorrect. Thus, BASEBALL attempts to evaluate its hypotheses in search of the correct interpretation. A strong indicator of the correctness of one’s understanding is the use of that understanding to correctly predict as yet unseen events and their corresponding interpretations. In contrast, sole reliance on the recurrence of an event and its interpretation provides little evaluative information (Minsky, 1963): If the first interpretation was wrong, the succeeding incorrect interpretation would only reinforce the mistake. In this section, we outline two
types of predictions that BASEBALL makes in order to evaluate its interpretations.
A. TYPE I PREDICTIONS: COMPLEMENTARY OUTCOME OF AN OBSERVED COMPETITIVE INTERACTION
Recall that in the application of the competitive causal-link schemata a test was made for possible variability in the goal outcomes of a competitive interaction. For example, while we hypothesized that the pitcher (A1) failed and the batter (B1) succeeded with their respective goals in the pitcher-batter PHYSICAL-COMPETITION competitive interaction, BASEBALL reasoned that it was hypothetically possible for the pitcher to have succeeded and the batter to have failed with their respective goals. We argued that this possible variability of outcomes is a necessary ingredient of a competitive interaction. This observation on the nature of competition is the basis for the first type of prediction: From an observed local competitive interaction where one player succeeds and the other player fails with their respective goals, predict that the same competitive interaction can take place with the outcomes reversed; the one who succeeded will fail, while the one who failed will succeed. For example, in Fig. 9 two SWING and HIT interactions have given rise to the generalized Type I prediction shown there. This prediction is, as are all predictions, passed to AF. A SWINGMISS interaction (Fig. 9) that AF subsequently observes triggers the prediction: The SWINGMISS is recognized to be an instance of (NOT SWINGHIT), the players involved are on opposing teams, and the players’ goals are complementary to their goals in the SWING and HIT interaction. Once a prediction is matched, a message to that effect is passed back to the hypothesis evaluation process, where the confidence value for that hypothesis is increased by some small value.
B. TYPE II PREDICTIONS: WHAT SHOULD NOT BE OBSERVED
The structure “if conditions X hold then Y is enabled” underlies all the competitive relationships hypothesized by the CLSs.
For example, assume that BASEBALL observes the following actions: [101 ON B1 FIRSTBASE] [103 CATCH A3 FIRSTBASE BALL]. The CLS for the competitive interaction called ORDER-OF-OCCURRENCE makes the following hypothesis: If [ON B1 FIRSTBASE] OCCURS-BEFORE [CATCH A3 FIRSTBASE BALL], then [ON B1 FIRSTBASE] is allowed to continue. The crucial feature is the introduction of the timing relation OCCURS-BEFORE. Since we take such hypotheses as rules, we require that the conditions X be both necessary and sufficient for Y. Thus, if X is a necessary condition for Y, then whenever ¬X is observed, ¬Y should be observed; if X is a sufficient condition for Y, then whenever X is observed, Y should be observed. Based on the need for a rule to be both sufficient and necessary, BASEBALL makes two predictions of events that should NOT be observed: (1) It should not be the case that if ¬X is observed, then Y is also observed, and (2) it should not be the case that if X is observed, ¬Y is observed. The former case disconfirms that X is a necessary condition for Y, while the latter case disconfirms that X is a sufficient condition for Y. As an example of the former prediction for necessity, BASEBALL predicts that if it first observes ¬([ON B1 FIRSTBASE] OCCURS-BEFORE [CATCH A3 FIRSTBASE BALL]), that is, A3 CATCHES the BALL before B1 executes ON FIRSTBASE, and then it observes [ON B1 FIRSTBASE], then OCCURS-BEFORE is not a necessary condition for B1 to remain ON FIRSTBASE. Thus, observing the predicted event would invalidate the hypothesis upon which the prediction was based. Predictions of this type are also passed to AF, where they are matched against the subsequent input. If either a necessity or a sufficiency prediction (a Type II prediction) is found to occur, the confidence value for the hypothesis on which the prediction is based is made to go negative. The motivation here is that one negative piece of evidence for a hypothesis serves to effectively eliminate it; however, positive evidence (Type I predictions) does not necessarily confirm a hypothesis, but only serves to increase the confidence in that hypothesis.
Fig. 9. Type I prediction. [Figure not reproduced: the figure shows two observed THROW and SWINGHIT instances, the generalized Type I prediction derived from them, and the competitive hypotheses of a subsequently observed swing-and-miss episode (THROW followed by SWINGMISS) that matches the prediction.]
C. FROM HYPOTHESIS TO TRUTH
BASEBALL is given a threshold for accepting hypotheses as truth. The dangers involved with the choice of a threshold are well known: If the threshold is set too low, then false hypotheses may be turned into truths; if the threshold is set too high, not all the truths will be discovered.
Once a hypothesis has been accepted as a truth, it is used to modify the confidence values of other hypotheses: Those whose goals are consistent with the new truth have their confidence value increased by a constant, while those that are inconsistent have their confidence value decreased by a constant. For example, when BASEBALL accepts the hypothesis that getting ON FIRSTBASE is the intended goal of the batter in an infield single episode, then it can decrease the confidence value of an inconsistent hypothesis from an outfield single episode that says getting ON FIRSTBASE is not what was intended by the batter. The use of acquired knowledge to aid in the evaluation of other hypotheses is risky business: It assumes that a hypothesis verified in one context is relevant to the verification (or elimination) of hypotheses put forward in other contexts. In order to soften the effect of this procedure, the constant by which the confidence values on the hypotheses are modified has been kept low. Finally, in a complete learning system, the ability to recover from an incorrect
hypothesis accepted as true would be provided. However, detecting when one’s view of a situation is out of whack and then deciding what to do about it is no mean feat! Currently, BASEBALL is not equipped with such backtracking capability. If it accepts a falsehood as truth and proceeds to eliminate other actual truths as being falsehoods, it cannot recover.
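The evaluation scheme of this section can be summarized in a short sketch. It is an illustration under stated assumptions rather than BASEBALL's code: the class and method names are hypothetical, the step sizes are invented (the text says only "some small value" and "a constant"), and the threshold of 6 is the value reported for Experiment 1. The absence of backtracking is modeled by ignoring further evidence once a hypothesis has been decided.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    confidence: float = 0.0
    status: str = "undecided"  # "undecided", "truth", or "eliminated"


class Evaluator:
    def __init__(self, threshold=6.0, step=1.0, consistency_step=0.5):
        self.threshold = threshold                # acceptance threshold
        self.step = step                          # assumed Type I increment
        self.consistency_step = consistency_step  # assumed small constant

    def _promote(self, hyp):
        if hyp.confidence >= self.threshold:
            hyp.status = "truth"

    def type1_matched(self, hyp):
        """A complementary outcome was observed: raise confidence slightly."""
        if hyp.status == "undecided":
            hyp.confidence += self.step
            self._promote(hyp)

    def type2_matched(self, hyp):
        """Something that should NOT be observed was observed: eliminate."""
        if hyp.status == "undecided":  # no backtracking on accepted truths
            hyp.confidence = -1.0
            hyp.status = "eliminated"

    def propagate(self, others, is_consistent):
        """Adjust undecided hypotheses against a newly accepted truth."""
        for hyp in others:
            if hyp.status != "undecided":
                continue
            if is_consistent(hyp):
                hyp.confidence += self.consistency_step
            else:
                hyp.confidence -= self.consistency_step
            self._promote(hyp)
```

The asymmetry is visible in the two update methods: a matched Type I prediction only nudges the confidence upward, whereas a single matched Type II prediction removes the hypothesis. Keeping consistency_step small mirrors the stated wish to soften the effect of using acquired truths on other hypotheses.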
D. THE INTERACTION BETWEEN GENERALIZATION AND EVALUATION
The interaction between generalization and evaluation is a subtle and troublesome one. Generalization tends to relax constraints on hypotheses, and thus a generalized hypothesis has more contexts in which its predictions can potentially be matched. One implication of this observation is that incorrect hypotheses which are overgeneralized have a tendency to become verified. Since BASEBALL currently has no error recovery procedures, our strategy has been to decrease the likelihood of an overgeneralization and thus decrease the chance of verifying an incorrect hypothesis. The conservative generalization technique (CDDG) described in Section IV,B is the key to this strategy. In Section VI we compare the performance of this technique with another more ambitious generalization technique with regard to this issue.
VI. Experiments
Since practical considerations prevent us from running BASEBALL on enough games of baseball to be statistically meaningful, we must be satisfied with a more qualitative method of evaluation. In particular, we will present several runs of BASEBALL in which we vary different parameters of the system in order to illustrate some of the points made earlier (e.g., the interplay between generalization and evaluation). In the runs described below, BASEBALL was fed a continuous string of 1680 snapshots containing 105 episodes; the distribution of those 105 episodes is given in Table IV. The ordering of those episodes was random and thus does not conform to the rules of baseball. A baseball fan would no doubt be upset by the sequence of events generated in this manner; for example, in one episode string, a fielder’s choice follows a double play. However, since BASEBALL does not possess knowledge which could link episodes together, this simplification of the game does not affect BASEBALL’s sense of propriety. The episodes do contain the variability necessary to test the generalization capabilities of the system; various values for the location, player, and timing features are present in the data. Finally, while the details of AF’s processing can be found in Soloway (1978), it is sufficient here to say that AF reduced the number of snapshots per episode and the number of pattern descriptions per snapshot significantly. In what follows, then, we report only on the processing carried out by levels beyond AF.

TABLE IV
EPISODES PRESENTED TO BASEBALL

No.  Name of episode                       Number of times it appears
 1   Infield Single                          3
 2   Infield Groundout                       8
 3   Outfield Single-I                       3
 4   Outfield Single-II                      3
 5   Outfield Single-III                     4
 6   Infield Flyout                          8
 7   Outfield Flyout                         9
 8   Outfield Double                         3
 9   Out at Secondbase                       3
10   Infield Single plus Baserunner-I        2
11   Infield Single plus Baserunner-II       2
12   Infield Single plus Baserunner-III      2
13   Double-play                             3
14   Fielder's Choice-I                      3
15   Fielder's Choice-II                     2
16   Fielder's Choice-III                    2
17   Fielder's Choice-IV                     4
18   Swing-and-miss                         18
19   Throw-and-noswing                      27
     Total                                 105

A. EXPERIMENT 1: A REPRESENTATIVE RUN
In this first run, we will try to illustrate some of the key strengths and weaknesses of BASEBALL. We will focus on the levels of processing starting with the applications of the causal link schemata. In particular, Table Va depicts the total number of CLS activations; the numbers were computed on the basis of one example from each of the 19 episode types (Table IV). In accordance with the rules of baseball, we evaluated whether a competitive hypothesis was correct and found that 75% of them were correct. On the average, 5.7 hypotheses of competitive and cooperative interactions were put forward in an episode (Table Vc). The numbers in Table V provide some evaluation of the following question: Did the domain knowledge given to BASEBALL guide it directly to an understanding of the specific events in baseball? In that table, a competitive interaction simply indicates that some competitive relationship was hypothesized to exist between two opposing players; based on one example from each of the 19 episode types, there were 59 such interactions hypothesized. However, sometimes more than one CLS was put forth as an explanation for a competitive interaction; that is, on the average, 1.2 competitive hypotheses were suggested for each competitive interaction. The maximum number of alternative hypotheses put forward for a competitive interaction was 3. Quite frankly, not all that many alternatives were generated; we would have felt better if the average number of alternative competitive hypotheses put forward was higher.

TABLE V
PERFORMANCE STATISTICS

(a) Overall statistics for the causal-link schemas
    Total # of competitive and cooperative hypotheses       109
    Total # of cooperative hypotheses (a)                     40
    Total # of competitive hypotheses                         69
    Total # of correct competitive hypotheses (b)             [illegible]
    Total # of incorrect competitive hypotheses (b)           17
    Percentage of correct competitive hypotheses              75%

(b) Distribution of competitive hypotheses
    Type of hypothesis                                   F/S           S/F
    Order-of-occurrence competition                      17            18
    Physical competition                                  9            13
    State-of-distinguished-object competition             5             3
    Logical competition                                  [illegible]    2

(c) Average number of CLS activations per episode
    Avg. # of cooperative hypotheses                          2.1
    Avg. # of competitive hypotheses                          3.6
    Avg. # of competitive/cooperative hypotheses              5.7

(d) Alternative interpretations
    Total # of distinct competitive interactions              59
    Total # of competitive hypotheses                         69
    Avg. # of competitive hypotheses per competitive
      interaction                                             1.2
    Maximum # of alternative competitive hypotheses per
      competitive interaction                                 3

(a) Ignores trivial physical cooperation hypotheses.
(b) Correctness and incorrectness were determined by human evaluation in accordance with the rules of baseball.

Before displaying the classes of events developed by BASEBALL in this run, let us review some aspects of the generalization and verification processes. First, both processes function in “real time”; as the data are observed, episodes are generalized and verified (or eliminated). Since no external teacher is employed in the former process, BASEBALL must form its own classes. In particular, an episode class is formed on the basis of two (or more) episodes having the same hypothesized competitive and cooperative interactions. Within an episode class, classes of individual interactions are formed. Such classes and the episodes they compose are evaluated on the basis of a confidence value. In this run, the confidence value on a hypothesis must reach 6 before it is accepted as truth. An episode is considered to be verified if the final competitive hypothesis and half of the other competitive hypotheses are verified. BASEBALL’s overall box score for this run is depicted in Tables VIa and b. While the correct number of episode classes represented by the 19 episode types was 12, BASEBALL actually formed 31 classes and accepted (verified) 10 of them as correct (Table VIa). Multiple and erroneous hypotheses caused a large number of classes to be formed initially. However, as the erroneous hypotheses were eliminated, the episode classes containing those hypotheses were also eliminated.
Ten episode classes were finally verified, nine of which were deemed correct inasmuch as they were based on a correct analysis of the observed activity (Fig. 10). The one incorrect episode class which should not have been verified was a class composed of flyouts; in these episodes, BASEBALL mistakenly hypothesized that a competitive timing relationship existed between the outfielder who was CATCHing the BALL and the batter who was reaching FIRSTBASE. Some classes one might expect to be formed were not; BASEBALL did not verify hypotheses which would have established a class of outfield singles or those which would have established a class in which the pitcher throws the BALL and the batter does not swing his bat. Also, a large number (14) of episode classes were left “undecided.” These classes were formed at the end of the run and really should have been merged with episode classes formed earlier. The problem was that a number of those classes were verified or eliminated before the features in the episodes were fully generalized. Once verified or eliminated, the features became frozen in their undergeneralized state and thus were no longer able to accommodate new episodes. In Table VIb we break down the verified episode classes along another interesting dimension: level of generality. In Section IV,B, we described the rather
conservative process by which features were generalized; variables were substituted for features which were constrained to match only those values that had already been observed. This generalization strategy resulted in a number of episode classes (5) being verified before they were sufficiently generalized. For example, three classes of groundouts were formed, two of which were verified (see Fig. 10), where each class covered only a portion of the possibly observable episodes of that type. On the other hand, infield singles and doubles were overgeneralized into the same class because of the order in which they were observed; that is, in the descriptions of the infield single and outfield double provided to BASEBALL, there is little to which the system attends which discriminates between them; in both types of episodes a fielder throws the ball to the baseman while the runner arrives at that base just before the opposing fielder at the base catches the ball.

TABLE VI
BOX SCORE FOR RUN 1

(a) Episodes Formed = 31
    Verified                                           10
    Eliminated                                         [illegible]
    Undecided                                          14

(b) Episodes Verified = 10
    Overgeneralized                                     2
    Correctly generalized                               4
    Undergeneralized                                    3
    Episodes which should not have been verified        1

Fig. 10. Levels of classes formed by BASEBALL. UG, undergeneralized; CG, correctly generalized; OG, overgeneralized. * indicates an episode class which should not have been verified, since the analysis was incorrect. [Figure not reproduced: the figure relates the types of episode presented to the episode classes formed from them, each class marked UG, CG, or OG, and to the three higher-level classes labeled hit, strike, and out.]

At a higher level of abstraction, BASEBALL formed three classes on the basis of only the final competitive goal of an episode (see Fig. 10). We have supplied the labels hit, strike, and out; they summarize the concepts acquired in the schemas. In the fielder’s choice episodes, there were two final competitive goals, one for the batter and one for the runner; either the batter failed or the runner failed with his goal of getting ON some base. Thus, this class of episodes participates in two classes, hit and out, at this higher level of abstraction. There were many (93) classes of individual competitive interactions formed. However, the majority of these classes were identical; for example, the competitive interaction between pitcher and the opposing batter appears in each of the episode classes. Thus, when duplicates were eliminated, the 36 classes of individual competitive interactions which were verified reduced to only 8 different classes. It is these classes in their production rule representation which are added to the set of causal link schemas and used to make specific inferences of goals and relationships in subsequently observed activity. Table VII lists the English equivalents of several of the learned rules.
2: THE EFFECTS B. EXPERIMENT OF ALTERNATIVE GENERALIZATION STRATEGIES
In this section, we shall analyze the performance of two generalization techniques: constrained data-directed generalization (CDDG) and unconstructed data-directed generalization UDDG. Both techniques accommodate observed differences in the data by inserting a variable into the generalized pattern description to replace differing constants. In the former technique, that variable is constrained to subsequently match only pattern descriptions whose value for that feature is a member of the set of already observed values; for example, a variable would be inserted into the generalized pattern description for the differing location features in (batter hits ball to centerfield) and (batter hits ball to rightfield) which would be constrained to match only instances of this pattern description in which the ball was hit to either centerfield or rightfield. The results presented in Experiment 1 were obtained using this technique. Alternatively, in UDDG a variable is substituted which is allowed to match any value for that feature. The choice of generalization strategy is important, since our task differs in two crucial respects from most generalization situations reported in the literature: ( 1) The data over which BASEBALL will generalize are not necessarily correct, and (2) the data have not been partitioned into the correct classes by an external source. It is the interaction of these two problems which causes the trouble; we shall see that each technique can cope with one of the problems, but not with the other. In particular, we shall analyze the behavior of these two techniques with
Knowledge-Directed Machine Learning
23 1
TABLE VII
LEARNED RULES EXPRESSED IN ENGLISH

1. If a player at HOMEPLATE HITS a BALL which was THROWn from the PITCHER'S MOUND by a member of the opposing team,
   Then Hypothesize that the batter wanted to HIT the BALL and thus SUCCEEDed with his goal; the pitcher wanted to prevent the batter from performing that action and thus FAILed with his goal.

2. If a player at HOMEPLATE SWINGS at a BALL and MISSES it, and an opposing player at the PITCHER'S MOUND threw that BALL,
   Then Hypothesize that the batter did not want to miss the BALL and thus he FAILed with his goal; the pitcher wanted the batter to miss, thus the pitcher SUCCEEDed with his goal.

3. If a player arrives at FIRSTBASE or SECONDBASE before an opposing player at the base CATCHes the BALL,
   Then Hypothesize that the former player wanted to arrive at the base, and thus he SUCCEEDed with his goal; the latter player wanted to prevent that outcome, and thus FAILed with his goal.

4. If a player at FIRSTBASE or SECONDBASE CATCHes the BALL before an opposing player reaches the base, and who thereupon WALKS to his DUGOUT,
   Then Hypothesize that the latter player did not want to go to his DUGOUT and thus he FAILed with his goal; the former player wanted this outcome, and thus he SUCCEEDed with his goal.

5. If a player CATCHes a BALL which was HIT by an opposing player before it hits the ground,
   Then Hypothesize that the former player wanted to perform this action, and thus SUCCEEDed with his goal; the latter player did not want this outcome and thus he FAILed with his goal.
respect to their tendency to produce the correct level of generalization for episode classes and their role in the acceptance of incorrect hypotheses. We shall conclude that neither strategy produces completely desirable results and that additional knowledge needs to be employed in order to cope with the complex generalization situations arising in a real-world task such as baseball. In the previous experiment, the confidence value of a hypothesis was computed on the basis of predictions and consistency or inconsistency with previously acquired knowledge. In this experiment, however, we wanted to study only the effects of the alternative generalization strategies on the evaluation of a hypothesis. Thus, in this experiment we did not allow BASEBALL to use previously acquired knowledge to increase or decrease the confidence value of a hypothesis. Rather, the confidence value was based solely on the results of
Elliot Soloway
predictions. Since the level of generality of a hypothesis was directly reflected in its predictions, we were able to draw a clearer picture of the contribution of the alternative generalization methods to the evaluation process.

In column 4 of Table VIIIa, we depict the results of running the system using CDDG with a threshold setting of 4. While nine episode classes were verified, three of them were considered to be undergeneralized. For example, the class of flyout episodes was limited to matching only those episodes in which the ball was hit to either LEFTFIELD, RIGHTFIELD, or SHORTSTOP; this is clearly only a subset of the possible locations to which the BALL could be hit. The reason for this undergeneralization is actually quite simple. BASEBALL had previously observed several infield and outfield flyout episodes in which the BALL was HIT to only the locations listed above. Thus, the variable in the generalized episode description was constrained to match only those locations that had been observed so far. The confidence values for this class then reached threshold, and the class was then accepted as correct. Since the features in the pattern description were frozen at that point, BASEBALL could not merge together any new flyout episodes in which the BALL was HIT to locations other than LEFTFIELD, RIGHTFIELD, or SHORTSTOP. A similar analysis holds for other undergeneralized classes.

We made a straightforward change to our generalization routines so that they would perform UDDG instead of CDDG. We then passed the same data through the system, and Table VIIIa also summarizes these results. For all threshold settings, UDDG formed fewer undergeneralized classes than did CDDG. Moreover, UDDG was not sensitive to the threshold setting at all once it reached the reasonable level of 4. While UDDG generalized faster than CDDG and thus tended to reach the correct level of class generalization sooner, there are still problems with UDDG.
In particular, UDDG’s blind substitution of an unconstrained variable may also lead to overgeneralization. While the number of verified episode classes which were based on a correct analysis and which were overgeneralized is virtually the same for both generalization strategies (Table VIIIa), Table VIIIb tells a different story: overgeneralizing an incorrect hypothesis (and the corresponding predictions based on it) tends to give that hypothesis more contexts in which it can be verified. Thus, in Table VIIIb we see that the UDDG strategy consistently accepted as truth more incorrect hypotheses than did the CDDG strategy. We can summarize the tradeoffs between the UDDG and CDDG approaches as follows: 1. In order to be able to recognize a new episode, a system which employs CDDG needs to have seen an example of it already, while a system which
TABLE VIII
BOX SCORE FOR RUN 2

                                                    θ = 2         θ = 4         θ = 6
                                                  UDDG  CDDG    UDDG  CDDG    UDDG  CDDG
(a) Overgeneralized                                 1     1       1     1       1     1
    Correctly Generalized                           4     2       6     4       5     2
    Undergeneralized                                5     8       0     3       0     2
(b) Episodes Which Should Not Have Been Verified    4     3       3     0       2     1

UDDG, unconstrained data-directed generalization; CDDG, constrained data-directed generalization.
employs UDDG needs only to have seen two examples before it creates a generalized template. 2. Since UDDG can form classes faster than CDDG, it is reasonable that the former is less sensitive than the latter to threshold setting with regard to the level of generality of the classes formed. 3. However, since UDDG is faster than CDDG, it tends to accept incorrect hypotheses and form inappropriate classes. Thus, there is a subtle interaction between generalization strategy and evaluation of hypotheses. Neither UDDG nor CDDG is always effective. It appears that domain knowledge might be needed in the evaluation phase of BASEBALL in order to cope with the problems created by the domain-independent generalization strategies.
VII. Concluding Remarks

As evidenced by all the machinery in BASEBALL, learning is a complex activity, and, as we mention below, BASEBALL is still missing a very major component. However, by attempting to build a system that starts from raw sensory input and progresses all the way to develop an understanding of the activity in terms of the relationships and intentions of the observed actors, we feel that we have had a chance to explore some interesting issues that would otherwise not have been so readily apparent:
By attempting to use knowledge that is acquired in subsequent interpretation activity, we had to confront two key issues: (1) There is a subtle interaction between generalization processes and evaluation processes: The speed with which generalization takes place must be balanced with the speed with which the system accepts hypotheses as truth; and (2) The system needs to have the knowledge that is acquired in a format that is usable by the rest of the system, and the system must be able to know when to use that new knowledge.

Mere recurrence isn’t enough to evaluate hypotheses; the system needs to follow out the implications of the hypotheses in order to provide a realistic evaluation. BASEBALL attempted to predict events based on its hypotheses; the results of these predictions were used to evaluate the hypotheses.

Sometimes a system must be able to take a “second look” at information that it initially threw away. In wanting to “see” a competitive interaction, BASEBALL was sometimes forced to look again among the actions that were initially filtered out during the early stage of AF.

These problems came to light due to the number of levels of processing in BASEBALL and their interactions.

What was the contribution of the domain knowledge initially given to BASEBALL? Was BASEBALL so biased that its learning was really reduced to recollection? The experiments described in Section VI provide some information on this issue. Quite frankly, the number of alternative hypotheses generated by BASEBALL was not all that large. Thus, BASEBALL appeared to converge rapidly on a reasonable understanding of the actors. On the other hand, BASEBALL did exhibit some variability when the threshold setting was varied and when the sequence of observed actions was not opportune; that is, BASEBALL developed incorrect interpretations of the sort that a human might, given the particular sequence of input snapshots.
Thus, we do not feel that the amount and type of knowledge initially given to BASEBALL trivialized its efforts.

BASEBALL’s major weakness was its inability to recover from an error. We saw in Section VI that when an incorrect hypothesis was accepted as truth, BASEBALL would then merrily continue using that incorrect information; that
is, the confidence in other hypotheses was adjusted to reflect its newfound (and incorrect) truth. Essentially, BASEBALL would need processes to enable it to notice that its current view of the situation was growing more and more inappropriate, and then it would need processes to identify where it went awry; that is, it would need to cope with the credit assignment problem (Minsky, 1963). While some researchers have explored the latter problem (e.g., Waterman, 1970), the former problem is still quite perplexing.

One can discern two general approaches to learning in the AI literature: (1) The system is given general knowledge about a domain, and then it somehow customizes that general knowledge to the current specific situation [e.g., BASEBALL (Collins, 1985)]; and (2) the system is given specific knowledge about situations and attempts to map that knowledge over to a new specific situation (e.g., Burstein, 1983). Thus, how can the slave boy learn the proof of the Pythagorean theorem without knowing it already? AI research of the sort described here and elsewhere in the literature has begun to provide some mechanisms that can carry out just such nontrivial learning.

ACKNOWLEDGMENTS

This work was supported by the U.S. Army Research Institute for the Behavioral and Social Sciences under Grant Nos. DAHC19-77-G-0012 and DAHC19-77-G-0013. This article is based on the author’s Ph.D. dissertation research carried out at the University of Massachusetts at Amherst.
REFERENCES

Avedon, E. M., & Sutton-Smith, B. (1971). The study of games. New York: Wiley.
Bruce, B., & Newman, D. (1978). Interacting plans. Technical Report 88, Center for the Study of Reading, Univ. of Illinois at Urbana-Champaign.
Burstein, M. (1983). Concept formation by incremental analogical reasoning and debugging. In Proceedings of the International Machine Learning Workshop (R. S. Michalski, Ed.), Univ. of Illinois at Urbana-Champaign, June, 1983.
Collins, G. (1986). Organization of memory for generating explanations. Ph.D. dissertation, Yale University, New Haven, Connecticut.
Hayes-Roth, F., & McDermott, J. (1976). Knowledge acquisition from structural descriptions. Technical Report, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA.
Larson, J., & Michalski, R. (1977). Program comprehension: Theory and implications. SIGART Newsletter, June, No. 63.
Lehnert, W. (1981). Plot units and narrative summarization. Cognitive Science, 5(4).
Lenat, D. (1976). AM: An artificial intelligence approach to discovery in mathematics. Technical Report 286, AI Laboratory, Stanford University, Stanford, CA.
Michalski, R. (1977). Towards computer-aided induction. Technical Report, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA.
Minsky, M. (1963). Steps towards artificial intelligence. In Computers and thought (J. Feldman & E. Feigenbaum, Eds.). New York: McGraw-Hill.
Newell, A., Shaw, J., & Simon, H. (1959). Report on a general problem-solving program. In Proceedings of the International Conference on Information Processing. Paris: UNESCO House.
Plato (1949). Meno (B. Jowett, Trans.). Liberal Arts Press.
Plotkin, G. (1970). A note on inductive generalization. In Machine intelligence (Vol. 5). New York: American Elsevier.
Reynolds, J. (1970). Transformational systems and the algebraic structure of atomic formulas. In Machine intelligence (Vol. 5). New York: American Elsevier.
Schank, R. C., & Abelson, R. (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Erlbaum.
Schmidt, C., Sridharan, N. S., & Goodson, J. L. (1978). The plan recognition problem: An intersection of psychology and artificial intelligence. Artificial Intelligence, 11, 477-478.
Soloway, E. (1978). “Learning = interpretation + generalization:” A case study in knowledge directed learning. Technical Report COINS 78-13, Department of Computer and Information Science, University of Massachusetts, Amherst.
Sussman, G. (1973). A computational model of skill acquisition. Technical Report TR-297, AI Laboratory, MIT, Cambridge, MA.
Vere, S. (1975). Induction of concepts in the predicate calculus. In Proceedings of IJCAI 4, International Joint Conference on AI, Tbilisi, USSR.
Vere, S. (1978). Inductive learning of relational productions. In Pattern-Directed Inference Systems (R. Hayes-Roth & D. Waterman, Eds.). New York: Academic Press.
Waterman, D. (1970). Generalization techniques for automating the learning of heuristics. Artificial Intelligence, 1, 121-170.
Francis S. Bellezza
DEPARTMENT OF PSYCHOLOGY
OHIO UNIVERSITY
ATHENS, OHIO 45701
Learning without thought is labor lost. The Confucian Analects, Book 2: 15
I. Introduction

What is the relation between learning and thinking? When learning, can a person verbally describe what he or she is thinking? Can such verbal descriptions provide valuable insights into the learning process? These are some of the questions addressed here. The approach to learning theory implied by these questions provides a link between the learning process and other mental processes and mental structures. This approach is in keeping with those recent developments in cognitive psychology that emphasize the importance of consciousness and mental events as integral to a scientific psychology (Hilgard, 1980; Miller, 1962). It is argued that mental events play a crucial role in learning and remembering, and a number of specific ways in which this occurs are discussed.

A few introductory comments are necessary before addressing the issues that are the focus of this article. Two points are emphasized. First, mental cues consist of mental context generated by the cognitive system and present in conscious memory during learning. This contextual information resides in conscious memory along with mental representations created by perception of the immediate environment. When a particular arrangement of mental context and perceived information is stored in permanent memory, learning occurs. This allows for the mental context to be later reconstructed by the cognitive system and to serve as a memory cue for the perceived information. The second point to be emphasized is that mental context present during learning can often be concurrently described by the learner. This communication between learner and investigator provides valuable data that can be used by the experimenter in a scientific manner.

The notions of mental cues, conscious memory, and verbal reports are interrelated. The following points outline the discussion that follows: (1) A person is aware of information that represents some of the structures and processes of the memory system. (2) The information one is aware of is stored in short-term conscious memory. (3) Verbal reports can be given describing the contents of conscious memory. (4) Information generated by the cognitive system and available in conscious memory may become linked to new information presented to the learner. (5) Later, some of the contents of conscious memory can function as recall cues for other information that previously occurred with it. (6) Verbal reports provided by the learner comprise scientific data. Recall performance may be influenced by the experimenter by re-presenting parts of the reports as explicit recall cues. (7) Mental events must have certain properties to be effective as recall cues. The properties of constructibility, associability, discriminability, and invertibility (Bellezza, 1981) are discussed. (8) Some new experimental evidence is presented in which mental events were verbally reported in a variety of learning contexts and also manipulated as recall cues. (9) Some limitations on the use of verbal reports are discussed as well as some unresolved issues involving learning and awareness.

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 20
Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Mental Cues and the Computer Metaphor
Up to the beginning of the twentieth century, the goal of learning theories was to explain the laws of the association of mental events (Boring, 1950; Warren, 1921). But from approximately 1910 to 1960, behaviorism was the preeminent theoretical orientation of American psychology. During this period mental mechanisms were not used to explain learning.

A. CHUNKS AS MENTAL CUES
By the late 1950s and early 1960s, cognitive mechanisms were proposed to explain learning in a variety of paradigms traditionally used with human subjects. An early example was Bousfield’s (Bousfield & Cohen, 1953) explanation of the recall of lists made up of words from common categories such as trees, vehicles, and tools. Bousfield and Cohen proposed that as each list word, such as hammer, was studied, the superordinate category to which the word belonged was activated in memory. Later, when recall took place, the category labels were first recalled and then were used to cue the words associated to them during learning. This resulted in the clustering in recall of words from the same category, even though these words were not presented together. To explain how items not preexperimentally related become associated, Miller (1956) proposed the notion of chunking by which items unrelated to one another could, nevertheless, become part of the same mental unit, called a chunk. In a
Mental Cues and Verbal Reports
similar vein, Tulving (1962) proposed that subjective units are formed in memory when words not naturally related to one another are presented for free-recall learning. As a result of these and other studies, many investigators in the 1960s emphasized the role of organization as a process in learning (Mandler, 1967), with new information organized into some sort of mental packages. According to the emerging cognitive perspective, these cognitive units or chunks seemed to function as mental cues in recall. It was assumed that units formed during list learning had to be recalled as implicit cues before the separate items comprising them could be recalled (Bower, 1972a). Recalling the chunk increased the probability that its constituent information would be recalled. Unlike a common category, a newly formed chunk may be a mental structure with no verbal label. The precedence of chunk recall over item recall is frequently inferred through the analysis of transition-error probabilities (Johnson, 1972) and trial-by-trial consistencies in recall (Tulving, 1962). However, newly formed chunks can often be described or labeled by the learner. Later, these labels may be presented by the experimenter as effective recall cues for the items contained in each chunk (Bellezza & Hartwell, 1981).

The early work on chunking was influenced by the notion that human memory is similar to the information-processing system of a digital computer. The simple serial connectionism of traditional associationist psychology appeared to be inadequate to explain some of the chunking phenomena found in recall experiments. A component was needed in the memory system in which the creation of chunks took place. This feature of the memory system will be referred to as conscious memory.

B. CONSCIOUS MEMORY
What James (1950) referred to as primary memory has in recent times been given a variety of different names. For convenience, the term conscious memory will be used here as a label for that part of memory of which we are immediately aware. What James called secondary memory will be referred to as permanent memory. A complete formulation of conscious memory will not be presented; instead, only those general properties that most investigators agree upon will be discussed. The similarities between the operation of conscious memory and the operation of the executive processor of a digital computer are striking (Gilmartin, Simon, & Newell, 1976; Newell & Simon, 1972). Permanent memory is similar to the permanent memory store of the computer from which information can be fetched by the executive processor, operated upon or transformed, and then stored again. Like the computer, the dual-storage memory system has input devices (the senses), buffer stores (iconic and echoic store), and output devices (language and other behavior). Simon (1974) has characterized what is here termed permanent memory and conscious memory in the following way: Permanent memory is associatively organized, has unlimited capacity, has a relatively slow storage time, and has a slightly faster accessing time. Conscious memory has a relatively short storage and access time and very limited capacity. Furthermore, conscious memory is a serial processor similar in function to a computer’s executive processor (Minsky, 1975). The role of conscious memory in learning is an important one, as is explained below.

C. SYMBOL ASSOCIATION

Symbol association is the basis for most of the verbal-learning theory and experiments that have been formulated. In these traditional experiments, materials such as nonsense syllables, words, sentences, passages, and pictures have been used. Symbol association involves the formation of new relations among percepts and concepts that, as symbols, are already part of the cognitive system (Newell & Simon, 1972; Simon, 1976). Types of learning that do not involve symbol association are some skill learning (Anderson, 1982), the conditioning of drives and emotions (Dollard & Miller, 1950), and rote learning (Bellezza, 1982). Perceptual learning is the process by which symbols are learned (Gibson, 1969), but symbol association, as the term is used here, does not include perceptual learning. The notion of symbol association and its limits has not been extensively elaborated in psychology. However, it seems that for cognition to occur, operations and transformations of symbols must occur (Newell & Simon, 1972; Simon, 1976). For example, the symbols corresponding to the mental representations that result from perception can be associated during the learning process and later can be retrieved from memory to form visual images. From this point of view, then, learning involves the formation of links to interconnect a set of symbols in conscious memory. Symbol association may be defined as the association or linking of information capable of being represented in conscious memory.
Information can be represented in conscious memory by a symbol only if that information exists as a unit in permanent memory. The units may be of various sizes and be embedded in one another. Historically, these units have been called chunks, but more complex memory organizations can be formed called schemas (Rumelhart, 1980). The information in conscious memory is of two general types. First, external stimulation is interpreted by the memory system, and symbolic representations of the stimulation are activated in conscious memory. This process of perception may consist of a search for a match between the pattern of information in a sensory storage buffer and the patterns previously stored in permanent memory (Sowa, 1984, Chap. 2). Second, symbols may be placed in conscious memory with little or no external stimulation, as would occur when someone is deep in thought and is receiving very little external stimulation.
How does recall take place in the system? For information to be recalled, cues associated with the information must first be activated in conscious memory (Shiffrin & Atkinson, 1969). New information in permanent memory that previously has become associated with these cues can then be accessed. These cues may originate in a variety of ways. The recall cues may have been perceived in the environment and represent background context present during learning (Smith, Glenberg, & Bjork, 1978). They also may originate as spoken or written communication from another person (Tulving & Pearlstone, 1966). Finally, they may be cues generated by the cognitive system itself, as in the use of the method of loci (Bellezza, 1981). There is evidence that little or no deliberate retrieval from permanent memory takes place without the rememberer attending to the task, that is, without awareness of the mental cues necessary for retrieval (Read & Bruce, 1982).
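In keeping with the computer metaphor used in this section, the cue-dependent account of recall can be illustrated with a small sketch. It is only a toy model under stated assumptions: the class name `MemorySystem`, the capacity of four symbols, and the rule that every active cue is linked to a newly presented item are hypothetical choices made here, not claims drawn from the studies cited above.

```python
from collections import defaultdict, deque


class MemorySystem:
    """Toy dual-store model: limited conscious memory, associative permanent memory."""

    def __init__(self, capacity=4):
        # The capacity value is an illustrative assumption.
        self.conscious = deque(maxlen=capacity)  # oldest symbol is displaced first
        self.permanent = defaultdict(set)        # cue -> items linked to it

    def attend(self, symbol):
        """Activate a symbol (a perceived or self-generated cue) in conscious memory."""
        if symbol in self.conscious:
            self.conscious.remove(symbol)
        self.conscious.append(symbol)

    def learn(self, item):
        """Link a presented item to every cue currently active in conscious memory."""
        for cue in self.conscious:
            self.permanent[cue].add(item)

    def recall(self):
        """Return only the items reachable from cues that are active right now."""
        found = set()
        for cue in self.conscious:
            found |= self.permanent.get(cue, set())
        return found


memory = MemorySystem()
memory.attend("TOOLS")          # category label active during study
memory.learn("hammer")
memory.learn("saw")

memory.conscious.clear()        # the mental context is no longer active
print(memory.recall())          # set()

memory.attend("TOOLS")          # the cue is reinstated
print(sorted(memory.recall()))  # ['hammer', 'saw']
```

In this sketch the stored items remain in permanent memory throughout; what changes between the two recall attempts is only whether the associated cue is present in conscious memory.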
D. REHEARSAL AND LEARNING

The notion that information is chunked in conscious memory has been suggested by a number of investigators (Anderson, 1980; Mandler, 1975; Miller, 1956; Simon, 1976). In order to be chunked, the relevant information must first be assembled there. Because conscious memory is of limited capacity, there is the possibility that information entering conscious memory early in the process may be overwritten by new information or may decay before the chunking process has been completed. To offset this, the subject may use the strategy of rehearsing the information to prevent it from decaying. Rehearsal consists of the process of recycling the information in conscious memory so it can be maintained there (Atkinson & Shiffrin, 1968; Shiffrin & Atkinson, 1969). It has been shown that there are two types of rehearsal. Maintenance rehearsal refers to a process by which information is maintained in conscious memory with little attempt to transfer the information to permanent memory. Elaborative rehearsal refers to rehearsal with the goal of transferring new information to permanent memory (Craik & Lockhart, 1972). This transfer process occurs when creating a chunk, which is the symbol in permanent memory representing the assembly of information in conscious memory. The symbol can then be rehearsed rather than the assembly itself, conserving the limited capacity of conscious memory (Simon, 1974).

Conscious memory capacity for presented information is greater during maintenance rehearsal than under elaborative rehearsal (Bellezza & Walker, 1974; Geiselman & Bellezza, 1977; Geiselman, Woodward, & Beatty, 1982). This results because elaborative rehearsal involves not only the storage of symbols for presented information, but also information retrieved from permanent memory. The “old” information is associated with the “new” information, and the composite is stored in permanent memory using the organizational structure of the old information (Bellezza & Walker, 1974). This process is diagrammed in Fig. 1. The old information provides a framework or a set of cues to which the new information can be linked in permanent memory. This framework may be a natural language mediator, a visual image, a category label, a memory schema, a mnemonic device, or information idiosyncratic (Tulving, 1962) to the subject. These organizational structures are discussed in more detail below.

E. THE VERBALIZATION OF CONSCIOUS MEMORY
The term rehearsal does not necessarily refer to overt vocalization, although rehearsal is often experienced as if one is speaking to oneself (Landauer, 1962). Because the cognitive system in general and conscious memory in particular utilize symbols representing units of information in permanent memory, it is not surprising that many of these symbols have verbal labels. Our natural language systems have verbal equivalents in permanent memory for representations of physical objects, classes of physical objects, and abstract concepts. As a consequence, it is possible for people to verbalize what they are thinking about, that is, what is in conscious memory. When structuralism was an important theoretical position in American psychology (Titchener, 1909) and the scientific study of consciousness was assumed to be the goal of psychology, experimental subjects were trained to describe the contents of consciousness. Later, during the behaviorist era, consciousness and verbal reports of consciousness were not considered a source of scientific data. However, in recent years subjects’ verbal reports of concurrent thinking have been collected and used productively in the study of problem solving (Newell & Simon, 1972).

Fig. 1. Representation of new information and schematic information in conscious memory. (Diagram labels: conscious memory, permanent memory, memory schema, new information, episodic connections.)
III. Learning Paradigms Using Verbal Reports
Verbal reports cannot tell us everything that we would like to know about cognitive structures and processes. In fact, verbal reports may be very useful, but only in a limited number of situations. Nisbett and Wilson (1977) have cautioned that subjects’ verbal reports may have little relation to the cognitive processes actually being used. However, Ericsson and Simon (1980) discuss experimental paradigms in which verbal reports given concurrent with cognitive activity accurately reflect the contents of conscious memory and provide useful information about the goals and cognitive procedures being utilized by the subject. It is proposed here that concurrent verbal reports can be profitably used to study the process of learning. In order to bolster this claim, I will review several learning paradigms which collected verbal reports and correlated these with recall.

A. OVERT VERBAL REHEARSAL
Rundus and Atkinson (1970) utilized a method of verbal report whereby the learner spoke aloud the presented items that he was rehearsing. They found that the more times an item was verbalized, the more likely that it would be transferred to permanent memory. Rundus and Atkinson’s results provided support for a system in which conscious memory and rehearsal play a role in learning and recall. In a later experiment, Rundus (1971) presented lists made up of words from a number of categories and recorded overt rehearsal. He found that the experimental subjects tended to retrieve from permanent memory and rehearse previously presented words from the same category whenever a new word from that category was presented. Words from the same category were rehearsed together, even if these words were not adjacent on the presentation list. This result supports Bousfield and Cohen’s (1953) hypothesis that each presented word’s category label is activated when a list of category items is presented for learning.

In a series of experiments designed to study the effects of persuasive messages on attitude change, Greenwald (1968) used a procedure different from the Rundus procedure. He had subjects write their own thoughts during the presentation of persuasive messages or immediately afterward. He found that the subjects often rehearsed their current attitude about the topic during presentation of the message, and that these attitudes could be contrary to those expressed in the
message. He proposed that this rehearsal of internally generated information rather than presented information explains why the subjects’ attitudes did not necessarily change, although they sometimes could remember the presented message.

B. NATURAL LANGUAGE MEDIATION

Before the mid-1960s, many verbal-learning experiments followed in the footsteps of Ebbinghaus (1964) and used nonsense syllables unfamiliar to the learner. It was assumed that learning could be investigated in a pure form because nonsense syllables had no prior meaning to the learner. This assumption may have been wrong. It is true that a nonsense syllable does not have a unitary symbolic representation in permanent memory. Therefore, a nonsense syllable cannot be processed in memory as a unit when first presented. In contrast to this, a pronounceable nonsense syllable may produce an integrated verbal response, though it does not have a unitary symbolic representation in memory. If pronounceable nonsense syllables are used as responses, successful rote learning of the nonsense syllables as motor responses may take place. The more common nonpronounceable nonsense syllables, however, are likely learned as a sequence of three separate letters, and this sequence has to be chunked before successful learning takes place (Bower, 1972b; Underwood & Schulz, 1960). In order to learn quickly, subjects may try to substitute for each nonsense syllable a word that is similar in spelling to it. This result was found by Mattocks in an experiment reported by Underwood and Schulz (1960). A similar substitution process was discussed by Miller, Galanter, and Pribram (1960). This process of natural language mediation (the term coined by Montague, Adams, & Kiess, 1966) soon came to be studied in a variety of experiments (Prytulak, 1971; Montague, 1972). The advantage of natural language mediators for learning is obvious: A natural language mediator is a word that corresponds to an established symbol in permanent memory.
This symbol can then be used in conscious memory to represent a nonsense syllable instead of three separate symbols. Early experiments investigating the role of natural language mediation often were paired-associate learning experiments. The goal of these experiments was to determine if natural language mediators could aid the learning of nonsense syllable responses. For example, Montague et al. (1966) reported that subjects instructed to form natural language mediators learned pairs of nonsense syllables better than did control subjects who were not given mediation instructions. Furthermore, subjects seemed to be able to recall the correct response only if they could also recall the natural language mediator. If the mediator that was formed for a pair was recalled, then the probability of a correct response was .73. If the mediator could not be recalled, then the probability of a correct response was .02. Finally, if
Mental Cues and Verbal Reports
learning was rote, that is, with no mediator reported during learning, then the probability of a correct response was .06. The natural language mediator must be present during both learning and remembering. Schulz and Lovelace (1964) manipulated the time given for learning and recalling pairs and found that mediator formation did not facilitate recall if insufficient time was provided for the mediator to be recalled before the response. All the above results support the notion that a natural language mediator represents a mental event present during learning that acts as a cue during recall.
Natural Language Mediators as Recall Cues. Research on natural language mediation has also been performed in which verbal units such as words rather than nonsense syllables are presented. For example, Bellezza and Poplawsky (1974) presented pairs of nouns to college students instructed to give a one-word mediator that was somehow connected to the nouns presented. The subjects were instructed to simply study other pairs on the list. In both a paired-associate and a free-recall task, those pairs were better recalled for which subjects were instructed to form mediators. This recall difference shows that the mediators were not automatically elicited from memory by the presented material. Their creation was dependent on the learning strategy used by the subject for each particular item. That recall of the language mediator is necessary for recall of the response item has been found in diverse verbal-mediation experiments (Bellezza, 1984a; Sweeney & Bellezza, 1982). An additional procedure used by Bellezza and Poplawsky was to present each subject with his own self-generated mediators as recall cues. Each mediator was found to be a more effective recall cue for each of the words in the pair than were either one of the presented words themselves.
As in the research involving nonsense syllables, the language mediators reflected the information that was added to conscious memory by the cognitive system. This added information from permanent memory enabled the word pair to be stored as a unit in permanent memory and to be later retrieved. In a later study, Bellezza, Poplawsky, and Aronovsky (1977) tried to deal with the criticism that natural language mediation is an epiphenomenon that somehow accompanies learning but plays no important role (Adams & McIntyre, 1967; Underwood, 1972). They formulated a word-association model in which the one-word mediator was assumed to be a high associate to one of the words of the pair, but did not reflect any learning process. This model could explain the results of Bellezza and Poplawsky (1974) as successfully as their mediation model. The association model was tested against the mediation model by instructing one group of subjects to form one-word mediators; another group of subjects gave a verbal association to either one of the words in each pair. In a test of recall, the subjects in the mediation condition performed significantly better than subjects in
the association condition. Hence, a simple model of word association cannot explain the verbal mediation process and its effect on learning.
C. VISUAL-IMAGERY MEDIATION
It is not possible here to review the extensive evidence for visual images existing as mental symbols independent of their verbal labels. However, two examples may help to make the distinction clear. First, it is possible to form the visual image of a face for which the name has been forgotten. In most cases, describing a face in words is difficult, and the image seems to exist independent of any verbal description. A second example comes from Hatano, Miyake, and Binks (1977). They found that abacus masters could do complex calculations using only a visual image of an abacus rather than the abacus itself. In this case, as with the preceding example, it is difficult to argue that verbal representations were the only ones used. It has been shown that instructions to form visual images help subjects learn pairs of words, and that those pairs for which an image can be formed are remembered better than pairs for which no image can be formed (Bower, 1972c; Paivio, 1969). However, there have not been many experiments that have studied visual imagery in learning and, in addition, have asked subjects to give verbal reports of the visual images formed. This is because verbal mediation and visual-imagery mediation may be confounded when verbal reports are requested. To deal with this problem, Paivio and Foth (1970) had subjects either write sentences describing their verbal mediators or draw pictures describing their visual images. This procedure ensured that the specific mediation instructions given were being followed. They found that verbal mediation was optimal for the learning of abstract noun pairs and visual-imagery mediation was optimal for the learning of concrete noun pairs. Abstract nouns are those nouns typically rated low in imagery and concrete nouns are those typically rated high. Bellezza et al. (1977) also presented subjects pairs of abstract and concrete nouns, and the subjects had to give a one-word mediator for each pair.
For the abstract pairs, a correct response was never recalled unless the mediator word was also recalled. However, for the concrete pairs there was a significantly greater proportion of responses correctly recalled with no mediator recalled. Bellezza et al. speculated that in some instances visual-image mediators were recalled for pairs even though the verbal mediators could not be. The results of the Paivio and Foth (1970) and the Bellezza et al. (1977) experiments argue for the operation of mental cues that can be verbally described but are independent of the language system. In summary, the creation of visual images seems to be a powerful strategy for learning, but only if this strategy can be implemented by the learner. When visual images are recreated, they become effective mental cues for the recall of the previously presented information.
D. MNEMONIC DEVICES
Perhaps the oldest and most persuasive example of the role of mental cues in learning is the method of loci. In the method of loci the learner first memorizes a series of visual images, usually representing a sequence of places (loci) with which he or she is already familiar (Yates, 1966). When a list of items is to be memorized, a visual image of each item is combined with the image of the locus in the same sequential position. In this manner representations of the information to be remembered are stored in permanent memory. When recall is to take place, the mnemonist mentally reviews the images of the loci. Combined with each locus is the image representing the information to be recalled. This procedure is remarkably effective (Bower & Reitman, 1972; Morris & Reid, 1970), but depends upon the learner being able to form visual images for the material to be remembered. Similar methods using verbal mediation, however, are available (Bobrow & Bower, 1969). As in the case of learning experiments using visual imagery, there have not been many studies reported in which experimental subjects using a mnemonic device gave verbal reports describing the contents of conscious memory. Rather, what typically happens is that the investigator assumes that the subjects did what they were told after the mnemonic instructions had been given. In contrast to this typical procedure, Reddy and Bellezza (1983) had subjects make up a story from a list of presented words and vocalize the story as they proceeded through the list. This is an example of the story mnemonic (Bower & Clark, 1969) and seems to result in learning through a complex combination of story rules, verbal mediators, and visual mediators (Bellezza, 1981). In a free-association condition Reddy and Bellezza gave subjects no specific learning instructions, but had them simply vocalize about what they were thinking as they studied the list words.
During free recall, subjects had to again verbalize as they tried to recall the list words. As might be expected, subjects in the story condition reconstructed their story in order to recall the list words. Subjects in the free-association condition also tried to reconstruct what they were thinking about when they studied the words. In both conditions the mental events experienced during learning acted as cues for recalling the list words. Recall of mental context was a necessary condition for recall. It was found that if the mental context present at the time of learning could not be recalled, then the corresponding list word could not be recalled. Subjects in the story condition recalled more than those in the free-association condition, only because they could reconstruct more of the mental context present during learning. These results support the hypothesis that the mental events that occur during learning are important because they can later function as internal cues for the target information. These cues must be regenerated by the cognitive system during remembering because they are not available in the external environment. Also, the cuing context must be identical to that
generated by the subject during learning (Tulving & Thomson, 1973). In the Reddy and Bellezza study, subjects given someone else’s verbalization recalled more poorly than those who simply free recalled with no external cues provided. These latter subjects were more likely to generate successfully their previous mental context. Chase and Ericsson (1981) had subjects practice recalling random sequences of digits. Although they did not teach any mnemonic procedure, Chase and Ericsson studied the mnemonic procedures subjects developed on their own. Their most successful subject, SF, who was a very good long-distance runner and knowledgeable about track events, gradually started to think of subsets of successive digits as running times and remembered them in this manner. When he had to recall, he first thought of the sequence of track events he used for encoding, and from these he recalled the digits. The track events functioned as mental cues for the digits.
E. MEMORY SCHEMAS
The notion of a memory schema has had a profound effect on contemporary theories of learning (Norman, 1982; Rumelhart, 1980; Anderson, 1980; Anderson, 1984). In brief, a memory schema is an organized set of knowledge stored in permanent memory that becomes activated when the person processes information similar to that stored in the schema. In a way, a memory schema is like a natural category, but has much more structure (Mandler, 1984). One form of a schema is a cognitive map (Neisser, 1976). A person may have in memory a set of visual images that represents some geographic area with which she is very familiar, such as the inside of her house or the street layout of her neighborhood. This schema can be used as a map to navigate through her house or neighborhood. It enables a person to know how to get to the bathroom from the basement of the house or how to get to the grocery store when at the gas station. However, a cognitive map has other uses. It can be used to store new information.
This occurs, for example, when using the method of loci. Our immediate concern is with how schemas function as sets of mental cues and how knowledge about their functioning can be obtained through verbal reports. A type of schema commonly used in learning experiments is the script (Schank & Abelson, 1977). A script, such as the restaurant script, is built up in permanent memory through experience and provides a plan for what to do when eating in a restaurant. It contains information about seating, ordering, table manners, paying, and so on. However, a script also allows a person to comprehend and remember language descriptions of other people eating in restaurants. When we read or hear “Joan was hungry, so she walked into McDonald’s,” the restaurant script is activated; that is, parts of the restaurant script become active in conscious memory, and visual images may be formed from
these parts. This process can be validated by asking people to make and report inferences from the restaurant script. These verbal reports consist of what may be happening concurrently with the events being presented (Joan opened a door) or what might happen next (Joan will look at the menu on the wall) (Graesser, 1981, Chap. 6). The answers given to probe questions are indications of whether the description is being comprehended, that is, whether the script has been activated. Schemas as Sets of Cues. Not all the information in a printed or spoken text is already stored as part of some script in memory. The name of the particular person involved (Joan), the restaurant (McDonald’s), and other information such as what Joan ate, how much she paid, who she sat with, and so on is information particular to the event being described. However, the activated script plays a role in storing this information in episodic memory. Workers in artificial intelligence have proposed that the generic script has “slots” in which specific information can be stored (Minsky, 1975; Schank & Abelson, 1977). If no specific information is provided in the text, the information that should be in the slots is inferred. For example, no specific information may be given in a text that a restaurant provides a napkin for the patron. However, this is assumed to occur in the event being described because it is a common event and is part of the script in memory. The notion of filling slots in a schema with information will not be used here. Rather, it will be assumed that the activated script provides a set of organized internal cues to which the new information can be associated. Association rather than slot filling is assumed to be the process by which new information becomes linked to an activated schematic structure. Hence, the memory script or schema, like a mnemonic device, provides a set of organized internal cues and thereby supports learning (Bellezza, 1983a; Bellezza & Bower, 1982). 
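The association account just described can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model proposed in the studies cited: the class name `Script`, its methods, and the cue labels are hypothetical. The sketch treats an activated script as an ordered set of internal cues, associates text-specific information with those cues, and infers unstated information from what has been associated with each cue in the past.

```python
from collections import Counter, defaultdict


class Script:
    """Hypothetical sketch: a script as an ordered set of internal cues."""

    def __init__(self, cues):
        self.cues = list(cues)
        # For each cue, how often each item was associated with it in the past.
        self.past = defaultdict(Counter)

    def experience(self, episode):
        """Record the cue-item associations from one past episode."""
        for cue, item in episode.items():
            self.past[cue][item] += 1

    def comprehend(self, text_facts):
        """Associate stated facts with cues; infer the rest from past associations."""
        memory = {}
        for cue in self.cues:
            if cue in text_facts:
                memory[cue] = (text_facts[cue], "stated")
            elif self.past[cue]:
                most_frequent = self.past[cue].most_common(1)[0][0]
                memory[cue] = (most_frequent, "inferred")
        return memory


restaurant = Script(["patron", "restaurant", "table setting"])
restaurant.experience({"table setting": "napkin"})
restaurant.experience({"table setting": "napkin"})

# The text names the patron and the restaurant but says nothing about napkins.
memory = restaurant.comprehend({"patron": "Joan", "restaurant": "McDonald's"})
```

In this sketch the napkin is recovered for the "table setting" cue although the text never mentions it, which corresponds to the inference described above. A cue with no past associations and no stated fact is simply left empty.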
Like any set of mental cues, the schema must be activated both during learning and during recall (Thorndyke & Hayes-Roth, 1979). The use of schemas for making inferences can also be explained using the association mechanism. If specific information is expected in a schema-based text but is not provided, then it can be inferred from what has been associated to the schema in the past. Bellezza and Bower (1982; Experiment 2) compared the effectiveness of the restaurant script to a set of pegword cues such as those used in a mnemonic device. The pegwords were high-imagery concrete nouns not semantically related to one another in any systematic manner. There were two learning conditions. In one condition subjects learned a list of concrete script nouns appropriate to a restaurant script. Some subjects associated the script nouns to the text of the restaurant script during learning, and other subjects associated the script nouns to the set of pegwords. In the test of recall that followed, the subjects using the script as a set of cues recalled significantly better than subjects using the pegwords as a set of cues. This occurred because the script nouns fit the script well, but could not as easily be related to the pegwords. In a second condition of the
experiment, randomly sampled concrete nouns were used as the list words. The recall results were just the opposite of what occurred when script nouns were used as list words. Subjects using the restaurant script recalled the random nouns more poorly than did subjects using the pegword cues. These recall results are shown in the bottom part of Fig. 2. The random nouns typically did not fit into the script framework at all, whereas they could be related to the pegwords in a moderately successful manner. As part of this experiment, the subjects had been instructed to rate during learning the quality of each visual image they experienced. It turned out that the pattern of imagery ratings matched the pattern of recall results. This result indicated that the recall of each item was closely related to the subjects' ability to create a visual image for that item. The mean imagery ratings are shown in the upper part of Fig. 2. When the script nouns were presented as list words, they could easily be fit into the script images. But when the random nouns were presented, they did not fit into the script images. On the other hand, the type of list words made little difference when using pegword cues, because some sort of relation between each cue and each list word could frequently be found using visual-imagery mediation. Bellezza and Bower suggested that recall for both the script and the pegword cues was based on similar processes, but that the script had a narrower bandwidth. The bandwidth of a set of cues determines what
Fig. 2. Recall performance and imagery ratings for script nouns and random nouns when memorized using script cues or mnemonic pegword cues. The horizontal axis shows type of cue. The lower two lines in the figure represent recall performance and the upper two lines represent the imagery ratings. (Reprinted with permission from Bellezza & Bower, 1982; © 1982 by North-Holland Publishing Company.)
words can be associated with the cues. The greater the bandwidth, the greater the possible interpretations allowed for the information associated with each cue in the set. For the pegword cues, which were simply single, high-imagery nouns, any word could conceivably be related to each mnemonic cue using visual imagery. On the other hand, for a word to fit into a restaurant script and be remembered, it has to make sense in the context of eating in a restaurant. Bellezza (1983a) replicated these results and demonstrated that scripts could sometimes be used to remember words that were rated as not fitting into the activated script. But this occurred only when a script-based mental cue occurred in conjunction with a list word in a meaningful relationship. The results of Bellezza and Bower (1982) and Bellezza (1983a) support the notion that memory schemas support learning by providing a set of organized mental cues to which new information can be associated. Furthermore, the appropriate subparts of the schema used must be present in conscious memory when learning and recall take place; that is, the subject must be aware of this mediating information and thus be able to report on it using verbal descriptions or some other reporting procedure. A question may be raised as to how a person can discriminate between information in memory that originated in the external environment and information generated by the cognitive system. It seems that people are often, but not always, successful in distinguishing between these two types of information. This discrimination process has been labeled reality monitoring (Johnson & Raye, 1981) and is necessary for successful cuing to occur. In a typical recall test, the subject is conscious of both context information and the information to be recalled, but must discriminate between the two and recall only that information previously presented by the experimenter.
F. PREVIOUS USE OF VERBAL REPORTS IN STUDIES OF LEARNING
It is proposed here that verbal reports can play an important role in the study of human learning. Yet, the question may be raised as to why there have not been many learning experiments collecting verbal reports about language mediators, visual images, and memory schemas. There seem to be a number of reasons for this paucity of studies. (1) Investigators have relied on insights gained from their own mental experiences when they themselves learn symbolic material. Experiments can be performed that indirectly confirm these insights. For example, Lea (1975) collected reaction times from subjects learning a list of words using the method of loci. One of his results was that subjects take less time to generate mental images of familiar loci than to generate mental representations of the list words newly associated with these loci. This result agrees with what users of the method of loci sense in their own performance. (2) Special instructions are sometimes used to get people to use procedures that seem to create mental events
similar to those experienced by the investigator. Therefore, providing instructions to engage in certain mental activities is assumed to replace validation through concurrent verbal reports of these activities. This assumption is sometimes warranted. For example, visual-imagery instructions (Paivio, 1971) or instructions regarding how to use a mnemonic device (Bower & Clark, 1969) can have a major impact on learning new information. (3) Verbal reports can be intrusive. In some special circumstances, such as in the study of visual-imagery mediation in learning, care must be taken to separate imagery mediation from verbal mediation. Having subjects give verbal reports can bias them toward a verbal mediation strategy (Paivio, 1971). (4) Normative data have often been collected for materials that vary in how well they elicit mental events. Early work on nonsense syllables referred to this property as “meaningfulness” or “associative value” (Underwood & Schulz, 1960). In a similar manner, imagery and concreteness ratings have been collected for nouns (Paivio, Yuille, & Madigan, 1968). Also, normative data have been collected for categories (Battig & Montague, 1969) and scripts (Bower, Black, & Turner, 1979). Experimenters using these materials assume that the mental events experienced by the subjects can be controlled in part by the kinds of materials presented, and that subjects react in a similar manner to the same materials. Therefore, verbal reports are not needed to describe mental events. (5) Early functionalists and behaviorists were skeptical of the value of verbal reports collected by the structuralists, and there continues to be good reason to be skeptical of verbal reports as explanations of cognitive processes (Nisbett & Wilson, 1977). However, verbal reports can represent the current contents of conscious memory, and this often is useful information (Ericsson & Simon, 1980).
In summary, the reasons why verbal reports have not been widely used in the study of learning have to do primarily with the manner in which the study of learning has developed historically. Taking into account certain limitations discussed below, none of the above reasons is compelling enough to justify not using subjects’ verbal reports in the study of learning.
G. WHY VERBAL REPORTS SHOULD BE USED
Why are verbal reports useful in the study of cognitive processes such as learning? There are a number of reasons: (1) Subjects do not always follow instructions. In experiments using mnemonic devices, it is not unusual for a large proportion of the subjects to not follow the procedures in which they have been instructed or trained (Bellezza, 1981). (2) Normative data are limited in their usefulness. What is meaningful, of high imagery, or schematic for one person may not be so for another. Differences in prior knowledge do exist among people, even with materials as simple as categories of common objects. Also,
people seem to vary in how they respond to materials from one occasion to the next (Bellezza, 1984b). Verbal reports recorded during learning allow the investigator to deal with some of these problems. (3) It has recently become clear how verbal reports can be useful and when they can be used. Ericsson and Simon (1980) distinguish among three levels of verbalization. Level 1 verbalization is a direct articulation of the information in conscious memory that is already in a language code. An example of this would be the overt verbalization of implicit speech. Level 2 verbalization involves the recoding of nonverbal information without additional processing. This would occur when describing a visual image. Level 3 verbalization involves articulation preceded by decisions, inferences, or generative acts that involve not a description of information in conscious memory, but a transformation of it using other cognitive processes. Ericsson and Simon propose that Level 1 and Level 2, but not Level 3 verbalizations are legitimate ways to study the contents of conscious memory. Level 1 verbalizations do not change the course and structure of the cognitive processes, or their speed. Level 2 verbalizations may slow down performance and the verbalizations may be incomplete, but the course and structure of the cognitive processes will remain largely unchanged. In most learning experiments, the rate of presentation of the new information can be adjusted so that Level 1 or Level 2 verbalization can occur. For the reasons outlined above, the investigator can gain a greater degree of experimental control by using verbal reports. However, what can be accomplished in the study of learning that has not already been achieved? (1) The most important use of verbal reports follows from the theory of mental cues that is presented here. Verbal reports allow the investigator to determine to some extent the nature of the mental structures that act as mediators in learning. 
(2) Once these cognitive structures have been identified, they can be manipulated by the experimenter for each individual subject. For example, verbal reports can be presented as recall cues after being analyzed into parts or otherwise modified. (3) Verbal reports force investigators to develop theories that account for both the verbal reports and other overt behavior; hence, verbal-report data create more stringent criteria for learning theories (Simon, 1979).
IV. Properties of Mental Cues Important in Learning
Psychologists working in the field of paired-associate learning have proposed properties of stimuli that are crucial to successful learning (Battig, 1968; McGuire, 1961). Accordingly, mental cues must also have certain properties for successful learning to occur. Bellezza (1981) proposed four key properties of mental cues used in mnemonic devices, and these same four properties are
important for all the types of mental cues discussed here. These are the properties of constructibility, associability, discriminability, and invertibility.
A. CONSTRUCTIBILITY
The property of constructibility (Norman & Bobrow, 1979) refers to the reliability with which information can be constructed by the cognitive system, both at the time of learning and at the time of recall. If a mental symbol is activated in conscious memory at the time of learning and becomes associated with new information also there, then that symbol must be activated as a cue if the new information is to be recalled. Sometimes environmental stimuli will elicit mental cues (S. Smith et al., 1978), but often the mental context created from the cognitive system is what is associated with new information, and at recall these mental cues must be strategically regenerated to act as recall cues (Greenwald, 1981). Perhaps the most easily understood example of constructibility occurs in the method of loci. If a locus is forgotten during recall, then its corresponding list item will be forgotten. Similarly, if the loci are recalled in an order different from the presentation order, then the list items will be recalled in an order different from the original. Buschke and Hinrichs (1968) demonstrated the importance of constructibility. They found that after presenting numbers in the range from 1 to 20 for recall, performance was better when subjects recalled the numbers in ascending order compared to recalling them in the order they were presented. To recall in ascending order, the subject used the strategy of “marking” in memory each number as it was presented. At recall, the numbers from 1 to 20 were mentally reviewed to see which were marked. However, to recall the numbers in their order of presentation, this strategy could not be used. Another strategy had to be used that stored both the numbers and their order (Buschke, 1968). To recall the numbers in their ascending order, the “number loci” could be used, but these loci could not be used when recalling numbers in their order of presentation.
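The marking strategy described above can be sketched as a short program. This is an illustrative sketch only, not a procedure taken from Buschke and Hinrichs; the function names and the example study list are hypothetical. It assumes that the cue series (the numbers 1 to 20) is already available in a fixed order, and that studying a number does nothing more than mark the corresponding cue.

```python
def study_by_marking(presented, lo=1, hi=20):
    """Mark each presented number on a fixed, pre-learned cue series lo..hi."""
    marked = {n: False for n in range(lo, hi + 1)}
    for n in presented:
        marked[n] = True
    return marked


def recall_ascending(marked):
    """Review the cue series in its stereotyped order and report marked numbers."""
    return [n for n in sorted(marked) if marked[n]]


presented = [14, 3, 19, 7, 11]           # hypothetical study list
marks = study_by_marking(presented)
ascending = recall_ascending(marks)      # the order comes from the cue series

# Two different presentation orders leave exactly the same marks, so the
# marks alone cannot support recall in order of presentation.
same_marks = study_by_marking([3, 14]) == study_by_marking([14, 3])
```

In the sketch, ascending recall needs only the marks, because the order is supplied by the cue series itself. Presentation order is not stored anywhere, which is why a different strategy would be needed to recall the numbers in the order presented.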
The point is that a set of mental cues must usually be generated in a stereotyped order, and if this does not correspond to the required recall order, then these cues cannot be used during learning. Constructibility is also important in schema-based learning. Memory schemas facilitate remembering by providing an organized set of mental cues to which new information can be associated. If the schema is not well formed in memory, then the schema components activated during learning may not be identical to the components activated during recall. Because schema components function as mental cues, the cues used during learning may not be available during recall, thereby reducing the amount of new information recalled. One reason why people with expert knowledge are able to remember new information related to
their area of expertise (Smith, Adams, & Schorr, 1978) is that at different times they are able to reliably generate many cues with identical organization.
B. ASSOCIABILITY
Not only do mental cues have to be reliably generated in the learning and testing situation, but they must also be readily linked to new information. All familiar words have symbolic representations in memory that can be activated. However, words that have associated visual images are more easily associated with new information than words low in imagery (Paivio, 1969). Similarly, mnemonic devices and memory schemas whose components are high in visual imagery will result in better learning than mnemonic devices and schemas containing low-imagery components (Delprato & Baker, 1974). In addition to visual imagery, other factors can enhance associability. For example, if the mental context and new information represent verbal symbols that have often been experienced together, such as dog-cat, then they will be easy to link. Similarly, typical actions are easier to remember in script-based texts than are atypical activities (Graesser, Woll, Kowalski, & Smith, 1980). This notion of associability is similar to the redundancy of Miller and Selfridge (1950) and the congruity of Craik and Tulving (1975). Because a memory schema often represents objects and situations of a specialized nature, only a narrow range of information can be associated to it, and the schema is said to have a narrow bandwidth (Bellezza & Bower, 1982). Hence, the notion of the bandwidth of a set of mental cues refers to how easily schematic cues are associated with a wide range of information.
C. DISCRIMINABILITY
Mental cues, like physical cues, must be discriminable to support learning; they must not be confused with one another. The anonymous author of Ad Herennium, an ancient Roman textbook on rhetoric, suggested that the locations to be used in the method of loci not be too much alike and should be at least 30 feet apart (Yates, 1966, pp. 7-8). This advice has received empirical support from contemporary paired-associate experiments. It has been shown that word stimuli similar in meaning, that is, similar in their mental representations, result in poorer learning than do word stimuli dissimilar in meaning. Day and Bellezza (1983) found that paired-associate learning was poorer when the stimulus words were made similar by being chosen from the same natural category (such as fruits) than when they represented dissimilar physical objects. This was true even though all the stimuli were meaningful and high in visual imagery. Comparable results have been found by Underwood, Ekstrand, and Keppel (1965). In another
Francis S. Bellezza
study, Bellezza (1983b) has reported that different word lists presented on visually distinct background patterns are better recalled than different lists presented on the same pattern. It seems that the patterns acted as mental cues for the lists, and their discriminability in memory was an important factor in their effectiveness. So far, only semantic similarity among mental cues has been discussed as influencing discriminability. But episodic similarity is also possible. The same mental cue may become associated with new information on a variety of different occasions; that is, the same symbol may occur in a number of different codes in episodic memory (Tulving, 1972). For example, if a person is instructed to memorize a series of five lists of words using the same set of loci, he or she may have difficulty remembering if asked to recall the fourth word from the third list. This is because the subject is forced to associate the same mental cue (perhaps the visual image of the learner’s front lawn) with the fourth word in every list. When asked for the fourth word from the third list, the fourth word from one of the other lists might be recalled and an error made. It is surprising to investigators that people are able to perform so well in this type of recall task. However, it appears that the learner must rely on temporal-contextual information in memory in addition to the semantic characteristics of the mental cue (Anderson & Bower, 1972, 1974; Bellezza, 1982; Shiffrin, 1976). But use of this temporal-contextual information is not well understood. This whole problem of episodic similarity of mental cues has been studied extensively using traditional interference paradigms, though without the theoretical perspective used here (Postman, 1971).

D. INVERTIBILITY

Invertibility means that a mental cue and its corresponding information are bidirectionally associated, and this property is important for mental cues.
During learning, new information coded into conscious memory may activate old information in permanent memory. Yet this activated information may have to serve later as a cue for the information that preceded it in time. For example, when reading a passage about eating in a restaurant, the passage may mention that the patrons were met inside the door by someone wearing a red dress. Information regarding a red dress becomes activated in conscious memory before the inference is made that this person is the hostess. However, at recall the order of mental events may be different. When a person tries to remember, he or she may first think about a restaurant. Next, specific cues may be generated from the restaurant script regarding the roles, props, and actions in a restaurant. The subject may think about the fact that when eating in some restaurants one must deal with a headwaiter or hostess. The symbol for hostess in conscious memory
Mental Cues and Verbal Reports
may therefore act as a cue for the information in the passage regarding the fact that the hostess was wearing a red dress. This reversal in the order of events during recall means that the associations between mental cues and new information in memory should be bidirectional. If symbol A precedes symbol B in conscious memory, symbol B should later be able to elicit symbol A. This invertibility seems to occur when the symbols associated are visual images, but the strength of the association may be asymmetric when only verbal responses are involved (Paivio, 1971, Chap. 8; Ekstrand, 1966). Without the property of invertibility, mental cues are ineffective because they cannot elicit information from episodic memory preceding them in conscious memory. One possible distinction between symbol association and rote learning is that in symbol association bidirectionality is likely to be preserved, whereas in rote learning it is not. In rote learning the symbols in conscious memory are not directly linked, but generate a sequence of motor responses. These motor responses become associated in the sequence in which they are verbalized.
V. Mental Cues Formed under Different Task Sets

An understanding of mental cuing is important for the study of learning, and verbal reports are a valuable tool for doing this. In this section, two studies are discussed that demonstrate the necessity of mental cues in learning and the importance of the four properties proposed for them. Use is made of verbal reports to provide a description of the mental cues.

A. STUDY 1: CONSTRUCTIBILITY
In this experiment, randomly selected concrete nouns were studied by subjects who were instructed to report whatever came to mind in response to each presentation. Rather than being a simple free-association experiment, four different tasks were specified which varied from word to word. For a quarter of the words the subject was requested to give a word or phrase that sounded like the word presented. This was the sound task. For another quarter of the words, a dictionary definition had to be generated. For the remaining words, a personal experience related to the word had to be described, or the subject had to describe where in his or her house the object named by the word would be most appropriately placed. This last task was the house task. For each word and task combination, each subject wrote down the required response as a description of the mental events he or she experienced. Of course, the complete contents of conscious memory cannot be represented by such written descriptions or by any other kind
of verbal report. However, it was assumed that the written descriptions were representative of the cognitive content of conscious memory when the task was completed. It was hypothesized that later recall of the words could occur only if the mental context generated during learning was also generated as a mental cue (see Reddy & Bellezza, 1983; Bellezza, 1984a). Because of the procedure used here, the terms mental context, mental cues, verbal reports, and written descriptions are used interchangeably. The hypothesis tested in Study 1 was that constructibility is an important attribute of mental cues. If one's house forms a cognitive map (Neisser, 1976; Norman, 1982), then at the time of recall it should be possible to recall words processed with the house task. This is because the mental cues derived from the house map are easily reconstructible. If one's personal experiences are well organized in memory, the same should be true for the experience task (Bellezza, 1984a). But constructibility should be less operative for the mental cues generated in the sound and definition tasks because sounds of randomly selected concrete nouns or the definitions of these nouns are not organized in any systematic manner in memory. For these two tasks, schemas are not available to mediate the organization of words in episodic memory. Other theoretical approaches to learning make different predictions regarding the effectiveness of these four tasks. The definition task does require a great deal of semantic processing. If depth or level of processing is an important determinant of free recall, as opposed to mental cuing, then the definition task should result in high levels of recall (Craik & Lockhart, 1972; Craik & Tulving, 1975). The definition task may also result in the maximum discriminability of episodic memory codes because the creation of dictionary definitions requires the identification of the unique properties of each of the defined words. 
If recall performance is good in the definition task, that would support the idea that the distinctiveness of the memory code is an important determinant of recall (Jacoby & Craik, 1979).

1. Method
Four different list forms were made up, each consisting of 48 concrete nouns randomly sampled from Toglia and Battig (1978). Each list was based on the same sample of nouns presented in the same order. Each noun had a value between 5.0 and 7.0 on the dimensions of concreteness, imagery, and meaningfulness and had a value between 5.5 and 7.0 on the dimension of familiarity. Next to each word was printed one of the four tasks: similar-sounding word, write a definition, a personal experience, or where in your house? Each task occurred only once in every successive set of four words, but appeared in a different random order for each set. The list forms were created so that each word was paired with each task once across the four forms. The noun task items were printed in booklets, and the subjects spent 30 sec writing down a response to each
noun task combination. The subjects were paced through the booklets by the experimenter and were instructed to write down a maximum of about 10 words in each of their responses. After they wrote down their verbal response and within the 30-sec period allowed, they also had to rate how difficult it was to generate a response for the noun task combination. A rating of 7 indicated that the task for the noun was very difficult, and a rating of 1 indicated that the task was very easy. Subjects were presented a practice list of four words to practice each task once. When the 48 noun task items in the main list were completed, the booklets were collected and a blank booklet was handed out. The subjects were instructed to write down both the list words and their own previous responses to them. They were to write down the list words and responses in two parallel columns. It was emphasized that if a list word, but not the response given to it, could be recalled, the list word should nevertheless be written down. Similarly, if a response, but not the list word that elicited it, could be remembered, the response should be written down. The subjects were given as much time as needed for this free-recall task. A total of 54 subjects were tested in two sessions. Approximately the same number of subjects were tested on each of the four list forms.
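The counterbalancing described in this Method section can be illustrated with a short sketch. It is not the procedure actually used to print the booklets; it only shows one way to satisfy the two stated constraints (each task once per successive set of four words, and each word paired with each task once across the four forms). The function name, the fixed seed, and the placeholder noun labels are assumptions made for illustration.

```python
import random

TASKS = ["sound", "definition", "experience", "house"]


def make_list_forms(words, tasks=TASKS, seed=0):
    """Build one word-to-task assignment per list form.

    Hypothetical helper: within each successive block of len(tasks) words
    every task appears once in a random order, and rotating that base
    assignment pairs each word with each task once across the forms.
    """
    n_tasks = len(tasks)
    if len(words) % n_tasks != 0:
        raise ValueError("number of words must be a multiple of the number of tasks")
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible

    # Base assignment: an independent random permutation of task indices per block.
    base = []
    for _ in range(len(words) // n_tasks):
        block = list(range(n_tasks))
        rng.shuffle(block)
        base.extend(block)

    # Form k shifts every task index by k, so a word never repeats a task.
    return [
        {word: tasks[(index + shift) % n_tasks] for word, index in zip(words, base)}
        for shift in range(n_tasks)
    ]


# Placeholder labels stand in for the 48 nouns, which the text does not list.
words = [f"noun{i:02d}" for i in range(48)]
forms = make_list_forms(words)
print(forms[0]["noun00"], forms[1]["noun00"])
```

Rotating a single base assignment is only one of several designs that would meet the stated constraints; it keeps the word order identical across forms, as the Method requires.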
2. Results

A one-way analysis of variance was performed on the dependent variables, with processing task as the one within-subjects factor. The results are shown in Table I.

TABLE I
DIFFICULTY, ELABORATION, AND FREE-RECALL MEASURES FOR THE FOUR TASKS USED IN STUDY 1

Measure              Sound   Definition   Experience   House
Difficulty            2.03      2.48         2.14       2.16
Elaboration           1.14      6.58         6.95       4.65
Mental-cue recall      .32       .39          .41        .63
List-word recall       .34       .41          .44        .65

a. Difficulty Ratings. The means for the difficulty ratings were significantly different across the four processing tasks, F(3, 159) = 8.61, MSe = .234, p < .001. Using Tukey’s HSD tests (Kirk, 1982), it was found that the definition task was rated as the most difficult, with the other three tasks not significantly different from one another.

b. Elaboration. The mean number of written words per response also differed between tasks, F(3, 159) = 270.39, MSe = 1.38, p < .001. Tukey’s HSD tests showed that the definition and experience tasks resulted in a larger number of words per response than did the house task. The task in which subjects had to generate a similar-sounding word resulted in the fewest words per response.

c. Recall. There was a significant effect of task on the number of list words recalled, F(3, 159) = 43.38, MSe = .022, p < .001. Tukey’s HSD tests showed that more words were recalled in the house task than in the definition or experience tasks, which were not significantly different from each other. Recall was poorest following the task where subjects generated a similar-sounding word. The analysis of variance for recall of the subjects’ own responses gave results very similar to the statistical results resulting from analysis of the list words. As can be seen from Table I, the recall means of the list words and subject responses were almost identical in each task. When a list word was recalled, its subject-generated response was also likely to be recalled, and vice versa. Similar results have been reported by Bellezza (1984a, Experiment 2).
3. Discussion

The level of free recall in the house task was about 53% higher than recall in the definition and experience tasks and about 91% greater than in the task that required similar-sounding words be generated. The most important characteristic of the house task was that the subjects utilized the cognitive map of their house during learning. When the cognitive map was again activated at recall, the components generated from it had become associated with the recently presented list words and could act as recall cues. Thus, the mental cues utilized in the house task were more constructible than those cues generated by any of the other three tasks. Other investigators have suggested that personal experiences are effective mediators for free recall (Bower & Gilligan, 1979), but personal experiences do not seem to be organized in memory as well as other schemas (Bellezza, 1984a), such as the schema for one’s own dwelling. The results of Study 1 support the notion that the recall of the list items was mediated by the mental cues described in written reports collected during learning. Also, these cues showed a high degree of associability. Whenever a mental cue was recalled, the list word associated with that cue was also recalled. One could argue that the list word was first recalled, and the verbal report was recreated using the list word. However, the significant differences in the levels of recall found among the various tasks do not support this counterargument. Theories of learning that emphasize levels of processing propose that semantic processing will result in better recall than nonsemantic processing. To some extent, this occurred in Study 1 because the task in which subjects generated words similar in sound to the list words resulted in the poorest recall. A levels-of-processing approach, however, cannot account for all the results. Some investigators have suggested that items with the most distinctive memory codes are the
most retrievable from memory (Jacoby, Craik, & Begg, 1979). Of the words in the four tasks, the words in the definition task should have had the most distinctive episodic memory codes. When giving a dictionary definition, a person must access in permanent memory those properties of the defined word that most clearly distinguish it from the other items on the list. If discriminability is of paramount importance, then the definition task should result in the greatest amount of processing. It did indeed result in the greatest amount of semantic processing, for the definition task was rated as significantly more difficult than the other three tasks. However, the definition task did not result in as much recall as did the house task. The definitions generated by the subjects were not organized in memory and therefore could not be reconstructed during recall to form an effective set of mental cues. Another hypothesis related to the levels-of-processing approach is that those items that are most broadly processed (Craik & Tulving, 1975) or elaborated (Anderson & Reder, 1979) will be best recalled. One measure of elaboration is the mean number of words per response generated by the subjects in the four tasks. The definition and experience tasks resulted in a significantly greater number of words per response than did the house task, but the house task resulted in a greater level of free recall. The obvious result of Study 1 is that the house task resulted in the best free recall because the house task involved the most constructible mental cues.
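The percentage comparisons that open this Discussion can be recovered from the list-word recall means in Table I. The sketch below is a worked check, not part of the original analysis. It assumes that the "about 53%" figure compares the house task with the unweighted mean of the definition and experience tasks; the text does not state the baseline explicitly. The helper name is hypothetical.

```python
# List-word recall proportions from Table I (Study 1).
list_word_recall = {
    "sound": 0.34,
    "definition": 0.41,
    "experience": 0.44,
    "house": 0.65,
}


def percent_higher(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return 100.0 * (value - baseline) / baseline


# Assumed baseline: unweighted mean of the definition and experience tasks.
semantic_baseline = (list_word_recall["definition"] + list_word_recall["experience"]) / 2

house_vs_semantic = percent_higher(list_word_recall["house"], semantic_baseline)
house_vs_sound = percent_higher(list_word_recall["house"], list_word_recall["sound"])

print(f"house vs. definition/experience: {house_vs_semantic:.1f}% higher")
print(f"house vs. sound: {house_vs_sound:.1f}% higher")
```

Under that assumption the two values round to 53% and 91%, in agreement with the figures given in the text.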
B. STUDY 2: ASSOCIABILITY, INVERTIBILITY, AND DISCRIMINABILITY
In Study 1, written reports of the contents of conscious memory were collected for list words during both learning and recall. It was found that a list word was almost always recalled when the mental events accompanying its encoding were recalled, and vice versa. This was taken as evidence that the mental events functioning as mental cues had a high degree of associability. In Study 2 a more direct assessment of the associability of mental cues was made. The presentation procedures and materials were the same as those used in Study 1, but there was a 3-day retention interval preceding free recall. Following free recall, each subject was given two types of recall cues. One set of cues consisted of half the nouns previously presented. For these the subject had to provide the same verbal report for each noun as he or she provided 3 days previously. The second set of cues was made up of the written reports the subject gave for the other half of the nouns. To these he or she had to give the list word for which the written report had been made. The written reports presented to each subject were always his or her own. Using this procedure, a direct measure of associability could be made by determining how well the verbal descriptions generated in each task elicited the original list words. Also, a measure of the invertibility of the mental cues
could be determined by computing how well the verbal reports elicited the original list words versus how well the list words regenerated the verbal descriptions given 3 days earlier.
1. Method

The procedure and materials were the same as those used in Study 1, but only up to the point where the booklets containing the written responses to the words were collected from the subjects. At this point, the subjects were dismissed and reminded to return 3 days later for the second part of the experiment. No mention was made of what the second session would entail. During the 3-day interval a cued-recall test booklet was made up for each subject. Half the words in each task were listed as recall cues. Also, the written descriptions made in response to the other half of the list words were included as recall cues. If the list word was contained in the written description, then a blank was substituted for it in the description. When the subjects returned for testing, they were given a blank booklet and the free-recall instructions used in Study 1. After 10 min these booklets were collected and the cued-recall tests handed out. The cues were arranged in the booklet so that it was clear which cue was a written description requiring a list word as a response and which cue was a list word requiring the description given 3 days earlier. The subject was not told what task had been originally associated with each cue, although when the written descriptions were cues, this information could often be inferred. A total of 40 subjects were tested in two sets of sessions.

2. Results

a. Free Recall. An analysis of variance on the proportion of list words free recalled showed a significant effect of task, F(3, 117) = 13.02, MSe = .011, p < .001. A similar result was obtained when analyzing the proportion of written descriptions recalled, F(3, 117) = 19.76, MSe = .010, p < .001. These proportions are shown in Table II. Tukey HSD tests showed that more list words and written descriptions of mental context were recalled in the house task than in any of the other three tasks. The latter were not significantly different from each other.
These free-recall results are similar to those obtained in Study 1, except that the levels of recall are lower. As in Study 1, recall of a list word was almost always accompanied by recall of its associated written description, and vice versa.

b. Cued Recall. The proportion of items recalled in the cued-recall test for each task is also shown in Table II. A 4 × 2 analysis of variance was performed on these proportions, with the first factor being task and the second factor being cue type. Both factors were within-subjects factors. Task was significant, F(3, 117) = 26.59, MSe = .047, p < .001. Tukey HSD tests showed that overall cued recall in the definition task (.79) was superior to any of the other three tasks: experience (.62), house (.56), or similar-sounding word (.50). The only other significant difference was between the experience and the similar-sounding-word task. The main effect of cue type was not significant, F < 1.

TABLE II
FREE-RECALL AND CUED-RECALL MEASURES FOR THE FOUR TASKS USED IN STUDY 2

Measure                  Sound   Definition   Experience   House
Free recall
  Mental-cue recall       .10       .14          .12        .26
  List-word recall        .13       .16          .13        .26
Cued recall
  List-word recall        .46       .83          .67        .53
  Mental-cue recall       .53       .74          .58        .59
  Correct-task recall     .64       .77          .65        .67

3. Discussion
The pattern of cued-recall results is markedly different from that of free recall. Why should this be? The answer is that the most important property of a set of mental cues in free recall is constructibility. If the mental cues cannot be constructed at recall, then the other properties of the mental cues do not have the opportunity to influence performance. However, in the tests of cued recall, written descriptions of the mental cues were used to activate their mental representations in the subject’s conscious memory. The property of constructibility was no longer necessary because the cues were presented to the subject by the experimenter.

The best cued-recall performance occurred in the definition task. This was the case regardless of whether the written description was presented as a cue for the list word or the list word was presented as a cue for the written description. Hence, the associability of the mental cues in the definition task was greater than the associability of the cues in any of the other three tasks. Furthermore, the mental cues in all four tasks showed invertibility. Each written response described the mental context elicited by a combination of list word and task during the learning phase. Later, list words were again presented to cue the written descriptions, and written descriptions were presented to cue the original list words. If one type of presented cue had been more effective than the other, this would provide evidence for directionality in the association. However, there was no significant difference in the effectiveness of these types of cues, so bidirectionality of association was found.

It should be noted that in Study 2 the degree of invertibility varied somewhat from task to task. Although there was no main effect for cue type, there was a marginally significant task by cue type interaction. It would not be surprising if future research found invertibility to be better for some types of learning cues than for others.

The definition task resulted in the best cued recall because of the strong association between the list words and the definitions generated by the subjects. However, associability is not the only property of mental cues that is relevant here. The subjects correctly recalled the definitions from the words and the words from the definitions, but the subjects may not have been aware that these were the same words and definitions used 3 days earlier; that is, it is possible for the subject to generate the correct response from semantic memory without recognizing the word as having occurred at some earlier time (Tulving & Thomson, 1973, Experiment 3). Because of this possibility, the question can be raised as to whether these mental cues are discriminable in episodic memory.

As mentioned earlier, mental cues must be discriminable on a number of dimensions in order to be effective. These dimensions are often semantic ones dealing with the meaning of the words. However, the relevant dimension can be the temporal-contextual dimension that is important to the functioning of episodic memory. Because the same set of words may function as a mental cue for many episodes in a person’s life, he or she must distinguish among many instances of the same mental cue by using temporal-contextual attributes. How discriminable from episodes involving the same words were the mental events in Study 2 after a retention interval of 3 days?
When the written descriptions were presented as cues, it was often clear from the description what the original task was. However, when the list words were presented, the subjects were not told what the task had been for each word. They had to look at each word and remember the response they had given to it 3 days previously. They simply could not write down a definition for every word, because four different tasks were involved. The last row of Table II indicates how often subjects gave a response representing the correct task category regardless of whether their written description was correct. As can be seen, these proportions are much larger than the chance value of .25 and are generally larger than the proportion of correct descriptions recalled. It appears that subjects could sometimes recall the task associated with the word but not make the correct response. The tasks could be more easily recalled than the particular written response first given to the word. Both the presented word and the recalled task functioned as cues for recall of the written description. It can be concluded that the episodic codes containing the words were successfully discriminated from other personal episodes of the subjects in which the list word was a part.
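As a consistency check on the analyses reported for both studies, the degrees of freedom of the task effect can be derived from the design. The sketch below assumes a standard repeated-measures analysis of variance in which the error term is the task × subject interaction and every tested subject contributed data; the function name is hypothetical and does not come from the original report.

```python
def within_subject_df(n_levels, n_subjects):
    """Degrees of freedom for a single within-subjects factor.

    Assumes the error term is the factor-by-subject interaction, as in a
    standard repeated-measures analysis of variance.
    """
    df_effect = n_levels - 1
    df_error = df_effect * (n_subjects - 1)
    return df_effect, df_error


# Four processing tasks in both studies; 54 subjects in Study 1, 40 in Study 2.
study1_df = within_subject_df(n_levels=4, n_subjects=54)
study2_df = within_subject_df(n_levels=4, n_subjects=40)

print("Study 1 task effect: F(%d, %d)" % study1_df)
print("Study 2 task effect: F(%d, %d)" % study2_df)
```

These values agree with the F(3, 159) and F(3, 117) reported in the Results sections of Study 1 and Study 2, respectively.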
VI. Limitations of Verbal Reports about Learning
It is proposed that the analysis of verbal reports is useful for a better understanding of human learning. The reasons for this are as follows: (1) Much of human learning consists of the creation of new chunks in episodic memory by the interconnecting of cognitive symbols in conscious memory. These symbols represent a mixture of newly perceived information and information already organized in permanent memory. Some of these symbols can be later regenerated by the cognitive system as recall cues. (2) The symbols in conscious memory can, to some extent, be verbally or linguistically described. (3) The verbal data provided permit the investigator to better understand by experimental manipulation the information contributed by the subject’s cognitive system during learning. Controversy regarding the nature of verbal reports has existed from the time of the Würzburg School’s discovery of imageless thought (Humphrey, 1963) to the present day (Ericsson & Simon, 1980; Nisbett & Ross, 1980; Nisbett & Wilson, 1977). The use of verbal reports in present-day psychology should not be thought of as a reversion to the method of introspection favored by structural psychologists such as Titchener. The structuralists believed that the study and understanding of the contents of consciousness was the primary goal of psychology and not simply one aspect of psychology’s methodology. They argued that conscious experience must be analyzed into irreducible elements such as sensations, images, and feelings. Also, each element had attributes such as quality, intensity, and extensity (MacLeod, 1964). Contemporary cognitive psychologists tend to regard verbal reports as another source of data, useful for the study of mental processes and their relation to language and behavior (Hilgard, 1980).

A. PROCEDURAL AND DECLARATIVE KNOWLEDGE

The distinction between what is called declarative versus what is called procedural knowledge has been made by a variety of investigators (Anderson, 1976; Ryle, 1949; Winograd, 1975). Declarative knowledge is knowledge in symbolic form stored in permanent memory. Anderson (1983, Chap. 2) proposes three types of information that can make up declarative knowledge and thus become part of the content of conscious memory. These are abstract propositions, spatial images, and temporal strings. When any one of these three types of knowledge is accessed or activated, the learner immediately becomes aware of it and its related information in the same cognitive unit. In addition to declarative knowledge, there is another kind of knowledge called procedural knowledge. This knowledge may not be stored in a form that makes us aware of it when we use it. Our knowledge of grammar is such a type of knowledge. We create both original and grammatically correct sentences often without being able to explain how we do it. Other examples of procedural knowledge are those knowledge sets necessary
for reading, writing, tying a bow, or driving an automobile. Simple memory processes may also represent procedural knowledge. In the act of retrieving information from permanent memory, we are aware only of the result of this act of retrieval. Although some of the symbols in conscious memory may be acting as retrieval cues, we are not aware of the process by which information is being retrieved from permanent memory. There seems to be agreement among investigators that verbal reports reflect the symbolic contents of conscious memory, but tell us little about the memory retrieval processes themselves (Ericsson & Simon, 1980; Nisbett & Wilson, 1977; Read & Bruce, 1982). Procedures that initially are accompanied by a large number of mental events may gradually be performed with little or no preceding thought. An example of this is learning to play a piano or learning to touch-type. When a procedure can be performed without having to think about it, we say that it has become automatic (Shiffrin & Schneider, 1977). Anderson (1982) suggests that all skill acquisition consists of two stages: The first is a declarative stage in which knowledge about the skill is in propositional form that must be interpreted before the skill can be performed. In this stage the information that the person utilizes can be expressed verbally. In fact, this information is rehearsed and used in a piecemeal fashion to perform the skill. The second stage is procedural. Through a series of processes that take place with practice, the skills become faster and more efficient. Furthermore, the skills become automatic; that is, the skills are performed without the declarative knowledge controlling the skill first being activated in conscious memory and then being interpreted. Even after a skill becomes automatic, the declarative knowledge used to first perform it may sometimes be available in permanent memory. Often, however, this knowledge is forgotten. 
For example, try to explain to someone how to tie a bow. There also seem to be learning situations in which little or no declarative knowledge is available when the skill is first being learned. This is true of many motor skills. Developing a skill such as throwing darts, dribbling a basketball, or learning to ride a bicycle seems to involve certain component skills that are not initially accompanied by required mental events; that is, learning does not consist of new organizations of mental symbols in the cognitive system, but rather involves the development of perceptual-motor skills not paralleled by new symbol organization. Learning occurs in the procedural-knowledge system that is independent of the declarative-knowledge system. Each improvement based on procedural knowledge is not necessarily preceded or accompanied by some mental event signaling that improvement, even though the learner may notice his or her improvement after it has occurred.
B. VOCALIZATION AS A SKILL
The process of overt rehearsal, which involves repeatedly verbalizing information after it has been presented, seems to be a fairly complex skill (Flavell,
Friedrichs, & Hoyt, 1970). Rehearsal does not spontaneously occur in a verbal learning task when the subjects are young children or mentally retarded. Yet training in rehearsal improves memory performance for both the young children (Keeney, Cannizzo, & Flavell, 1967) and the retarded (Brown, Campione, Bray, & Wilcox, 1973). Furthermore, unrestricted reporting of the contents of conscious memory may be more difficult than the overt rehearsal of only externally presented information. For this reason, researchers often instruct the subject to report only material previously presented and still in conscious memory. However, vocalization with respect to learning and problem solving involves the reporting of as much information from conscious memory as possible, including those elaborations contributed by the subject. It should be kept in mind that this type of vocalization may itself be a complex skill developed over a long period of time.

C. CONDITIONING AND AWARENESS

A controversy related to the meaning of verbal reports concerns whether different types of human conditioning occur without awareness. Attention has been focused on three learning paradigms: verbal mediation in transfer experiments (Bugelski & Scharlock, 1952), classical conditioning (Brewer, 1974), and operant conditioning (Greenspoon, 1962). The problem of conditioning and awareness is a complex one, and reviews are provided by Brewer (1974) and Spielberger (1965). The issue of awareness is equivalent to the issue of whether the learner can accurately report the psychological processes responsible for learning. In most cases, awareness means that the learner must be able to report the process or relation that is critical for conditioning to occur. Prytulak (1971) argues that verbal transfer must be accompanied by awareness, and Brewer (1974) provides an extensive review of the conditioning literature and concludes that awareness is necessary for conditioning to occur in humans.
But the question may not be resolved (Dulany, 1974). As might be expected, many of the inconsistencies in experimental results have arisen from inadequacies in assessing awareness. As Brewer points out, many experimenters have not reported the procedure or test by which awareness was assessed.
D. VERBAL REPORTS AND AFFECT
In addition to cognitive symbols, affective symbols representing current feelings are also active during cognition. Bower has proposed that cognitive symbols and affective symbols become interlinked in memory in a manner similar to the interconnection of cognitive symbols themselves (Bower, 1981; Bower & Cohen, 1982). However, the association of affective and cognitive components may complicate the process of thinking, learning, and verbalization. Because of the affect associated with some information in permanent memory, people may not be able to verbally report this information. Consequently, the kind of learning
Francis S. Bellezza
necessary for healthy cognitive functioning will not occur. According to Dollard and Miller (1950), the term repression refers to the automatic tendency to stop thinking about certain anxiety-producing information and avoid the necessity of activating it into consciousness. It is implicitly assumed that these negative feelings increase as we become more aware of the associated symbols and diminish as we become less aware of them. Whether the process of repression actually occurs in this manner is still uncertain (Holmes, 1974). However, it seems reasonable to assume that any person will have difficulty verbalizing information that has been associated with very negative feelings.

ACKNOWLEDGMENTS

Portions of this research were presented at the meetings of the Psychonomic Society in San Diego, California, during November 1983 and in San Antonio, Texas, during November 1984. This research is supported in part by a grant from the Field-Wiltsie Foundation. Thanks go to Steven J. Lynn and Hal Arkes for their helpful comments on an earlier version of this article and to Kathy Sandy and Daniel Scully for their assistance in collecting and scoring the data. Thanks also go to Ohio Computer and Learning Services for making computer time and their facilities available.
REFERENCES

Adams, J. A., & McIntyre, J. S. (1967). Natural language mediation and all-or-none learning. Canadian Journal of Psychology, 21, 436-449.
Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1980). Concepts, propositions, and schemata: What are the cognitive units? In H. E. Howe, Jr. & J. H. Flowers (Eds.), Nebraska Symposium on Motivation (Vol. 28, pp. 121-162). Lincoln, NE: Univ. of Nebraska Press.
Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard Univ. Press.
Anderson, J. R., & Bower, G. H. (1972). Recognition and retrieval process in free recall. Psychological Review, 79, 97-123.
Anderson, J. R., & Bower, G. H. (1974). A propositional theory of recognition memory. Memory and Cognition, 2, 406-412.
Anderson, J. R., & Reder, L. M. (1979). An elaborative processing explanation of depth of processing. In L. S. Cermak & F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 385-403). Hillsdale, NJ: Erlbaum.
Anderson, R. C. (1984). Some reflections on the acquisition of knowledge. Educational Psychologist, 13, 5-10.
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89-195). New York: Academic Press.
Battig, W. F. (1968). Paired-associate learning. In T. R. Dixon & D. L. Horton (Eds.), Verbal behavior and general behavior theory (pp. 149-171). Englewood Cliffs, NJ: Prentice-Hall.
Battig, W. F., & Montague, W. E. (1969). Category norms of verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology Monographs, 80(3, Pt. 2).
Bellezza, F. S. (1981). Mnemonic devices: Classification, characteristics, and criteria. Review of Educational Research, 51, 247-275.
Bellezza, F. S. (1982). Updating memory using mnemonic devices. Cognitive Psychology, 14, 301-327.
Bellezza, F. S. (1983a). Recalling script-based text: The role of selective processing and schematic cues. Bulletin of the Psychonomic Society, 21, 267-270.
Bellezza, F. S. (1983b). The spatial-arrangement mnemonic. Journal of Educational Psychology, 75, 830-837.
Bellezza, F. S. (1984a). The self as a mnemonic device: The role of internal cues. Journal of Personality and Social Psychology, 47, 506-516.
Bellezza, F. S. (1984b). Reliability of retrieval from semantic memory: Common categories. Bulletin of the Psychonomic Society, 22, 324-326.
Bellezza, F. S., & Bower, G. H. (1982). Remembering script-based text. Poetics, 11, 1-23.
Bellezza, F. S., & Hartwell, T. C. (1981). Cuing subjective units. The Journal of Psychology, 107, 209-218.
Bellezza, F. S., & Poplawsky, A. J. (1974). The function of one-word mediators in the recall of word pairs. Memory and Cognition, 2, 447-452.
Bellezza, F. S., Poplawsky, A. J., & Aronovsky, L. A. (1977). The functional role of one-word mediators. Bulletin of the Psychonomic Society, 10, 460-462.
Bellezza, F. S., & Walker, R. J. (1974). Storage-coding trade-off in short-term store. Journal of Experimental Psychology, 102, 629-633.
Bobrow, S. A., & Bower, G. H. (1969). Comprehension and recall of sentences. Journal of Experimental Psychology, 80, 455-461.
Boring, E. G. (1950). A history of experimental psychology. New York: Appleton.
Bousfield, W. A., & Cohen, B. H. (1953). The effects of reinforcement on the occurrence of clustering in the recall of randomly arranged associates. Journal of Psychology, 36, 67-81.
Bower, G. H. (1972a). A selective review of organizational factors in memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 93-137). New York: Academic Press.
Bower, G. H. (1972b). Perceptual groups as coding units in immediate memory. Psychonomic Science, 27, 217-219.
Bower, G. H. (1972c). Mental imagery and associative learning. In L. Gregg (Ed.), Cognition in learning and memory (pp. 51-87). New York: Wiley.
Bower, G. H. (1981). Mood and memory. American Psychologist, 36, 129-148.
Bower, G. H., Black, J. B., & Turner, T. J. (1979). Scripts in memory for text. Cognitive Psychology, 11, 177-220.
Bower, G. H., & Clark, M. C. (1969). Narrative stories as mediators for serial learning. Psychonomic Science, 14, 181-182.
Bower, G. H., & Cohen, P. R. (1982). Emotional influences in memory and thinking: Data and theory. In M. S. Clark & S. T. Fiske (Eds.), Affect and cognition: The Seventeenth Annual Carnegie Symposium on Cognition (pp. 291-331). Hillsdale, NJ: Erlbaum.
Bower, G. H., & Gilligan, S. G. (1979). Remembering information related to one’s self. Journal of Research in Personality, 13, 420-432.
Bower, G. H., & Reitman, J. S. (1972). Mnemonic elaboration in multilist learning. Journal of Verbal Learning and Verbal Behavior, 11, 478-485.
Brewer, W. F. (1974). There is no convincing evidence for operant or classical conditioning in adult humans. In W. B. Weimer & D. S. Palermo (Eds.), Cognition and the symbolic processes (Vol. 1, pp. 1-42). Hillsdale, NJ: Erlbaum.
Brown, A. L., Campione, J. C., Bray, N. W., & Wilcox, B. L. (1973). Keeping track of changing variables: Effects of rehearsal training and rehearsal prevention in normal and retarded adolescents. Journal of Experimental Psychology, 101, 123-131.
Bugelski, B. R., & Scharlock, D. P. (1952). An experimental demonstration of unconscious mediated association. Journal of Experimental Psychology, 44, 334-338.
Buschke, H. (1968). Perceiving and encoding two kinds of item-information. Perception and Psychophysics, 3, 331-336.
Buschke, H., & Hinrichs, J. V. (1968). Controlled rehearsal and recall order in serial list retention. Journal of Experimental Psychology, 78, 502-509.
Chase, W. G., & Ericsson, K. A. (1981). Skilled memory. In J. R. Anderson (Ed.), Cognitive skills and their application (pp. 141-189). Hillsdale, NJ: Erlbaum.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268-294.
Day, J. C., & Bellezza, F. S. (1983). The relation between visual-imagery mediators and recall. Memory and Cognition, 11, 251-257.
Delprato, D. J., & Baker, E. J. (1974). Concreteness of pegwords in two mnemonic systems. Journal of Experimental Psychology, 102, 521-522.
Dollard, J., & Miller, N. E. (1950). Personality and psychotherapy. New York: McGraw-Hill.
Dulany, D. E. (1974). On the support of cognitive theory in opposition to behavior theory: A methodological problem. In W. B. Weimer & D. S. Palermo (Eds.), Cognition and the symbolic processes (Vol. 1, pp. 43-56). Hillsdale, NJ: Erlbaum.
Ebbinghaus, H. (1964). Memory: A contribution to experimental psychology (H. A. Ruger & C. E. Bussenius, Trans.). New York: Dover. (Original work published 1885)
Ekstrand, B. (1966). Backward (R-S) associations. Psychological Bulletin, 65, 50-64.
Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215-251.
Flavell, J. H., Friedrichs, A. G., & Hoyt, J. D. (1970). Developmental changes in memorization processes. Cognitive Psychology, 1, 324-340.
Geiselman, R. E., & Bellezza, F. S. (1977). Eye movements and overt rehearsal in word recall. Journal of Experimental Psychology: Human Learning and Memory, 3, 305-315.
Geiselman, R. E., Woodward, J. A., & Beatty, J. (1982). Individual differences in verbal memory performance: A test of alternative information-processing models. Journal of Experimental Psychology: General, 111, 109-134.
Gibson, E. J. (1969). Principles of perceptual learning and development. New York: Appleton.
Gilmartin, K. J., Simon, H. A., & Newell, A. (1976). A program modeling short-term memory under strategy control. In C. M. Cofer (Ed.), The structure of human memory (pp. 15-30). San Francisco: Freeman.
Graesser, A. C. (1981). Prose comprehension beyond the word. New York: Springer-Verlag.
Graesser, A. C., Woll, S. B., Kowalski, D. J., & Smith, D. A. (1980). Memory for typical and atypical actions in scripted activities. Journal of Experimental Psychology: Human Learning and Memory, 6, 503-515.
Greenspoon, J. (1962). Verbal conditioning and clinical psychology. In A. J. Bachrach (Ed.), Experimental foundations of clinical psychology (pp. 510-553). New York: Basic Books.
Greenwald, A. G. (1968). Cognitive learning, cognitive response to persuasion, and attitude change. In A. G. Greenwald, T. C. Brock, & T. M. Ostrom (Eds.), Psychological foundations of attitudes (pp. 147-170). New York: Academic Press.
Greenwald, A. G. (1981). Self and memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 15, pp. 201-236). New York: Academic Press.
Hatano, G., Miyake, Y., & Binks, M. G. (1977). Performance of expert abacus operators. Cognition, 5, 57-71.
Hilgard, E. R. (1980). Consciousness in contemporary psychology. Annual Review of Psychology, 31, 1-26.
Holmes, D. S. (1974). Investigations of repression: Differential recall of material experimentally or naturally associated with ego threat. Psychological Bulletin, 81, 632-653.
Humphrey, G. (1963). Thinking: An introduction to its experimental psychology. New York: Wiley. (Original work published in 1952)
Jacoby, L. L., & Craik, F. I. M. (1979). Effects of elaboration of processing at encoding and retrieval: Trace distinctiveness and recovery of initial context. In L. S. Cermak & F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 1-21). Hillsdale, NJ: Erlbaum.
Jacoby, L. L., Craik, F. I. M., & Begg, I. (1979). Effects of decision difficulty on recognition and recall. Journal of Verbal Learning and Verbal Behavior, 18, 585-600.
James, W. (1950). The principles of psychology. New York: Dover. (Original work published in 1890)
Johnson, M. K., & Raye, C. L. (1981). Reality monitoring. Psychological Review, 88, 67-85.
Johnson, N. J. (1972). Organization and the concept of a memory code. In A. W. Melton & E. Martin (Eds.), Coding processes in human memory (pp. 125-159). Washington, DC: Winston.
Kenny, T. J., Cannizzo, S. R., & Flavell, J. H. (1967). Spontaneous and induced verbal rehearsal in a recall task. Child Development, 38, 953-966.
Kirk, R. E. (1982). Experimental design: Procedures for the behavioral sciences (2nd ed.). Belmont, CA: Wadsworth.
Landauer, T. K. (1962). Rate of implicit speech. Perceptual and Motor Skills, 15, 646.
Lea, G. (1975). Chronometric analysis of the method of loci. Journal of Experimental Psychology: Human Perception and Performance, 1, 95-104.
MacLeod, R. B. (1964). Phenomenology: A challenge to experimental psychology. In T. W. Wann (Ed.), Behaviorism and phenomenology (pp. 47-78). Chicago: Univ. of Chicago Press.
Mandler, G. (1967). Organization and memory. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 1, pp. 327-372). New York: Academic Press.
Mandler, G. (1975). Memory storage and retrieval: Some limits on the reach of attention and consciousness. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention and performance (Vol. 5, pp. 499-516). New York: Academic Press.
Mandler, J. M. (1984). Stories, scripts, and scenes: Aspects of schema theory. Hillsdale, NJ: Erlbaum.
McGuire, W. J. (1961). A multiprocess model for paired-associate learning. Journal of Experimental Psychology, 62, 335-347.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits in our capacity for processing information. Psychological Review, 63, 81-97.
Miller, G. A. (1962). Psychology: The science of mental life. New York: Harper & Row.
Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the structure of behavior. New York: Holt.
Miller, G. A., & Selfridge, J. A. (1950). Verbal context and the recall of meaningful material. American Journal of Psychology, 63, 176-187.
Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston (Ed.), The psychology of computer vision (pp. 211-277). New York: McGraw-Hill.
Montague, W. E. (1972). Elaborative strategies in verbal learning and memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 6, pp. 225-302). New York: Academic Press.
Montague, W. E., Adams, J. A., & Kiess, H. O. (1966). Forgetting and natural language mediation. Journal of Experimental Psychology, 72, 829-833.
Morris, P. E., & Reid, R. L. (1970). The repeated use of mnemonic imagery. Psychonomic Science, 20, 337-338.
Neisser, U. (1976). Cognition and reality. San Francisco: Freeman.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Nisbett, R. E., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice-Hall.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231-259.
Norman, D. A. (1982). Learning and memory. San Francisco: Freeman.
Norman, D. A., & Bobrow, D. G. (1979). Descriptions: An intermediate stage in memory retrieval. Cognitive Psychology, 11, 107-113.
Paivio, A. (1969). Mental imagery in associative learning and memory. Psychological Review, 76, 241-263.
Paivio, A. (1971). Imagery and verbal processes. New York: Holt.
Paivio, A., & Foth, D. (1970). Imaginal and verbal mediators and noun concreteness in paired-associate learning: The elusive interaction. Journal of Verbal Learning and Verbal Behavior, 9, 384-390.
Paivio, A., Yuille, J. C., & Madigan, S. A. (1968). Concreteness, imagery, and meaningfulness values for 925 nouns. Journal of Experimental Psychology Monograph, 76(1, Pt. 2).
Postman, L. (1971). Transfer, interference, and forgetting. In J. W. Kling & L. A. Riggs (Eds.), Woodworth and Schlosberg’s experimental psychology (pp. 1019-1132). New York: Holt.
Prytulak, L. S. (1971). Natural language mediation. Cognitive Psychology, 2, 1-56.
Read, J. D., & Bruce, D. (1982). Longitudinal tracking of difficult memory retrievals. Cognitive Psychology, 14, 280-300.
Reddy, B. G., & Bellezza, F. S. (1983). Encoding specificity in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 167-174.
Rumelhart, D. E. (1980). Schemata: The building blocks of cognition. In R. Spiro, B. Bruce, & W. Brewer (Eds.), Theoretical issues in reading comprehension (pp. 33-58). Hillsdale, NJ: Erlbaum.
Rundus, D. (1971). Analysis of rehearsal processes in free recall. Journal of Experimental Psychology, 89, 63-77.
Rundus, D., & Atkinson, R. C. (1970). Rehearsal processes in free recall: A procedure for direct observation. Journal of Verbal Learning and Verbal Behavior, 9, 99-105.
Ryle, G. (1949). The concept of mind. New York: Harper & Row.
Schank, R., & Abelson, R. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Erlbaum.
Schulz, R. W., & Lovelace, E. A. (1964). Mediation in verbal paired-associate learning: The role of temporal factors. Psychonomic Science, 1, 95-96.
Shiffrin, R. M. (1976). Capacity limitations in information processing, attention, and memory. In W. K. Estes (Ed.), Handbook of learning and cognitive processes (Vol. 4, pp. 177-236). Hillsdale, NJ: Erlbaum.
Shiffrin, R. M., & Atkinson, R. C. (1969). Storage and retrieval processes in long-term memory. Psychological Review, 76, 179-193.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127-190.
Simon, H. A. (1974). How big is a chunk? Science, 183, 482-488.
Simon, H. A. (1976). The information storage system called “human memory.” In M. R. Rosenzweig & E. L. Bennett (Eds.), Neural mechanisms of learning and memory (pp. 79-96). Cambridge, MA: MIT.
Simon, H. A. (1979). Information processing models of acquisition. Annual Review of Psychology, 30, 363-396.
Smith, E. E., Adams, N., & Schorr, D. (1978). Fact retrieval and the paradox of interference. Cognitive Psychology, 10, 438-464.
Smith, S. M., Glenberg, A. M., & Bjork, R. A. (1978). Environmental context and human memory. Memory and Cognition, 6, 342-353.
Sowa, J. F. (1984). Conceptual structures: Information processing in mind and machine. New York: Addison-Wesley.
Spielberger, C. D. (1965). Theoretical and epistemological issues in verbal conditioning. In S. Rosenberg (Ed.), Directions in psycholinguistics (pp. 149-200). New York: Macmillan.
Sweeney, C. A., & Bellezza, F. S. (1982). Use of the keyword mnemonic in learning English vocabulary words. Human Learning, 1, 155-163.
Thorndyke, P. W., & Hayes-Roth, B. (1979). The use of schemata in the acquisition and transfer of knowledge. Cognitive Psychology, 11, 82-106.
Titchener, E. B. (1909). Lectures on the experimental psychology of the thought processes. New York: Macmillan.
Toglia, M. P., & Battig, W. F. (1978). Handbook of semantic word norms. Hillsdale, NJ: Erlbaum.
Tulving, E. (1962). Subjective organization in free recall of “unrelated” words. Psychological Review, 69, 344-354.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381-403). New York: Academic Press.
Tulving, E., & Pearlstone, Z. (1966). Availability versus accessibility of information in memory for words. Journal of Verbal Learning and Verbal Behavior, 5, 381-391.
Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352-373.
Underwood, B. J. (1972). Are we overloading memory? In A. W. Melton & E. Martin (Eds.), Coding processes in human memory (pp. 1-23). Washington, DC: Winston.
Underwood, B. J., Ekstrand, B. R., & Keppel, G. (1965). An analysis of intralist similarity in verbal learning with experiments on conceptual similarity. Journal of Verbal Learning and Verbal Behavior, 4, 447-462.
Underwood, B. J., & Schulz, R. W. (1960). Meaningfulness and verbal learning. Philadelphia: Lippincott.
Warren, H. C. (1921). A history of association psychology. New York: Scribner’s.
Winograd, T. (1975). Frame representation and the declarative/procedural controversy. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science (pp. 185-210). New York: Academic Press.
Yates, F. A. (1966). The art of memory. London: Routledge & Kegan Paul.
Murray Glanzer and Suzanne Donnenwerth Nolan DEPARTMENT OF PSYCHOLOGY NEW YORK UNIVERSITY NEW YORK, NEW YORK 10003
I. Introduction: Restrictions

This article describes a series of studies concerned with the role of memory mechanisms and, particularly, short-term storage in comprehension of text. The work described had the following restrictions.

A. ANALYSIS OF ONGOING COMPREHENSION
There are many studies that are concerned with the effects of such variables as text organization on subjects’ eventual recall or comprehension of the text. That work leaves open the question of just when the variable has had its effect: during comprehension, during later recall, or at some point in the time separating the two. Our focus was on effects that were measured during the course of comprehension. We routinely measured the subjects’ later recall or comprehension of text, but for the variables we used, those later measures were not of primary interest. They also turned out not to be informative in our experimental arrangements.

B. USE OF “NORMAL” TEXT
The texts used were drawn from the types of material ordinarily read by our subjects, college students. They included both narration and exposition. The texts were restricted in specific ways for the purposes of some of the experiments; for example, sentence length was held constant. In all cases, however, the texts used could appear in the usual reading of our subjects without being noticed as remarkable. They were neither simpleminded nor artificial.
C. FOCUS ON “NORMAL” READING

Our concern was with obtaining a picture of the ongoing process of comprehension during the reading of ordinary text. This is a rapid, automatic process for our readers. The work aimed at this rapid, automatic aspect. We intervened with secondary tasks and slowed the subject with a button-pressing task, but only to define the comprehension process as it occurred rapidly and automatically. We tried, particularly in the later experiments, to keep the obvious interventions to a minimum while still obtaining a reasonable amount of information from each experimental trial. The attempt to focus on normal reading is also seen in our use of normal text.

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 20. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
D. MAINTENANCE OF RELATION TO SIMPLE MEMORY TASKS
We assumed that the processes involved in the memory of a simple list of words were also those involved in the comprehension of complex text. This assumption arose in part from the history of the work that started with simple memory tasks. The assumption seems, however, to be a necessary one for any psychologist concerned with either memory or comprehension. The assumption here leads to theorizing that draws on concepts which have been developed and tested in simple memory tasks. It also leads to the use of techniques that have been developed and tested for the study of simple memory performance, such as distractor tasks.
E. POSTULATION OF ENTITIES ONLY AS REQUIRED

A major emphasis in our work was to keep the theoretical structure as simple as possible. This emphasis is closely connected to the attempt to maintain the relation to simple memory tasks. It is seen first in our attempt to keep the postulated contents of short-term storage to a minimum. At one point we thought we could hold the number of items involved in the ongoing processing down to one or two recent sentences held verbatim in short-term storage. The fuller probing of the performance involved in reading revealed the role of additional factors. Thematic or topic information, which had initially been difficult to substantiate in ongoing processing, turned out to be a major factor. Moreover, in order to describe the processing fully, we drew on the concept of cuing of information in long-term storage in addition to the concept of short-term storage. However, both of these concepts were supported by data from simple memory experiments. The view of the comprehension process we arrived at was, not surprisingly, more complicated than the one we started with. However, it remains much simpler than most current views of the process.

The result of the emphasis on simple theoretical structure is that our view contrasts with views that attempt to explain comprehension on the basis of complex cognitive structures. These views are sometimes labeled constructivist. It also contrasts with views that attempt to explain comprehension on the basis of
Memory Mechanisms in Text Comprehension
complex and deliberate processes enacted by the reader, which we would label voluntaristic and others would label effortful (Hasher & Zacks, 1979). We recognize that the reader’s knowledge plays an important role at all levels of comprehension, from the interpretation of words to the expectations concerning sequences of events in a story. This knowledge is assumed here, however, to be mediated by the same memory mechanisms that produce the recall of simple word lists. The demonstration that such knowledge plays a role defines a problem. Mechanisms have to be specified and tested.

Another characteristic that differentiates our approach from others is that we have not assumed that readers translate the text into propositions. The form in which internal representations are held remains unclear. Adoption of a particular form at this time seems premature (Anderson, 1978).

The experimental work described in the following will demonstrate what can be done on the basis of the listed restrictions, particularly that of theoretical simplicity. It will show how the work leads to the demonstration of regularities in representative reading tasks. Moreover, the work will show that the approach is flexible enough to lead, when the data call for it, to the formulation of an extended theory and, in turn, to further testable propositions.
II. Background: Preceding Work
A. REGULARITIES IN MEMORY FOR SIMPLE LISTS

Our starting point was earlier work on free recall of word lists. That work had demonstrated a number of strong regularities for long-term storage and regularities for short-term storage (Glanzer, 1972). This division in regularities was the basis for asserting that a separate short-term storage existed. The regularities for long-term storage included the effects of rate of presentation, length of list, number and spacing of repetitions, meaningfulness, intelligence, and aging. Those for short-term storage included the effects of filled delay, presentation modality (auditory vs visual), and grouping. Data were also presented showing that these regularities generalize across a range of memory tasks.

Once the regularities of short-term storage had been established, it was possible to begin the examination of its function in other types of performance, in particular, in text processing. The variables that had a specific effect on short-term storage, such as grouping, suggested that it played a particularly important role in language and text processing.

The concept of short-term storage has been widely used in the analysis of text comprehension. It appears in proposals by a number of investigators (Chafe, 1973; Kieras, 1981; Kintsch & van Dijk, 1978). This does not mean that there is agreement on the use of the concept in this area. Some investigators have questioned the utility of the concept of short-term storage. Others have offered
alternative labels: foregrounding (Lesgold, Roth, & Curtis, 1979), primary memory and working memory (Baddeley & Hitch, 1974). In some cases the alternative labeling has been associated with assertions about the characteristics of the memory, for example, that it varies in the amount of information it can hold. One popular relabeling that seems to avoid controversy is that of “activation.” What we will talk about as short-term storage could in almost all cases be equally well named “activated portion of memory.” The term has the advantage of not implying that there is a structure involved, that is, a definite physical unit. It has the disadvantage for us of losing sight of the fact that the contents of short-term storage include verbatim representations of recent portions of preceding text. We will hold to the term short-term storage and not go into issues that digress from our main concern: its role in comprehension.

It was decided to look at the role of short-term storage in, specifically, the comprehension of text. The stimulus for the work came in part from the following assertion by Huey (1908):

It is of the greatest service to the reader or the listener that at each moment a considerable amount of what is being read should hang suspended in the primary memory of the inner speech. It is doubtless true that without something of this there should be no comprehension of speech at all. When a considerable amount is thus suspended, the attention may wander backward and forward to get a fuller meaning where this is needed, with no fear of losing the minor parts, which are taken care of physiologically and may be taken into the focus of consciousness at will. (p. 148)
The picture of comprehension that we derived will be less voluntaristic than Huey’s, but will support his assertion concerning the usefulness of short-term storage. Several background investigations helped determine characteristics of short-term storage relevant to comprehension of text. The first was concerned with the amount of material that can be held in short-term storage. Can it carry a sufficient amount of information to play a role in intersentence processing?
B. THE NUMBER AND KIND OF UNITS IN SHORT-TERM STORAGE

A series of investigations (Glanzer & Razel, 1974) was carried out to determine the number and kind of units held in short-term storage. As a preliminary step, a survey was carried out on a range of studies on memory for lists of unrelated words. Analysis of the results of that survey indicated that short-term storage held an average of two units. The surveyed studies used lists of unrelated words, and so we inferred that the capacity of short-term storage was two words. The next question was whether larger units consisting of sequences of words were also held in short-term storage. Moreover, if larger units were held, would
two of them be held? These questions are critical. If short-term storage was limited to a couple of words, we would be restricted to the consideration of short-term storage in intrasentence processing. Our interest, however, included intersentence processing, the processing referred to in the statement by Huey (1908). The experiments reported by Glanzer and Razel (1974) show, first, that familiar word sequences such as proverbs function much the same way as unrelated single words in free recall. Subjects show serial position curves for lists of proverbs that are very similar to those for single words (see Fig. 1). In particular, the end peak for proverbs is very much like the end peak for words. That end peak represents primarily output from short-term storage. It shows a large short-term storage effect for sentences. Indeed, the amount estimated as held in short-term storage for proverbs was comparable to that for words: approximately two units. The sentences used in that study were, as noted, familiar sentences, proverbs. A relevant question was whether the same number of units would be held if the sentences were not familiar. To answer that question, a study was carried out in which half the sentences were familiar sentences, proverbs, and half were new
Fig. 1. Serial position curves for words and proverbs in Glanzer and Razel (1974), Experiment 3. Abscissa: serial position.
sentences, matched with the proverbs in vocabulary and structure. Both sets of sentences were recalled under both delay and no-delay conditions. The results are shown in Fig. 2. Under both the delay and no-delay recall conditions, both types of sentences show the serial position curves found with unrelated words. The new, unfamiliar sentences do show a marked and expected difference from the proverbs in the long-term storage component, as indicated by the separation of the early portions of the serial position curve. Both the proverbs and the new sentences, however, have marked end peaks in the no-delay condition, indicating sizable amounts held in short-term storage for both. The estimated amount held in short-term storage is two sentences for proverbs and 1.5 sentences for new sentences. These numbers, moreover, probably underestimate the amount in short-term storage. One reason for the underestimation is that there is output interference in recall. When the subject recalls one item, the probability of recall of other items from short-term storage is reduced. A clear demonstration of this effect in a probe recall task is seen in a study by Tulving and Arbuckle (1963). Other
Fig. 2. Serial position curves for proverbs and new sentences in Glanzer and Razel (1974), Experiment 6. Both delay and no-delay conditions are shown. Abscissa: serial position.
demonstrations are found in studies by Dalezman (1976) and Tulving and Arbuckle (1966). The number of sentences actually held in short-term storage may therefore be three or more. Thus, we have a basis for examining the role of short-term storage in intersentence processing.
III. Text Comprehension Studies
A. DIRECT EVIDENCE OF SHORT-TERM STORAGE CARRYING INFORMATION DURING THE COMPREHENSION OF TEXT: RECALL OF RECENT SENTENCES
Now that short-term storage is shown to have the capacity to permit it a role in intersentence processing, the next question is whether it holds sentences in the same way during normal text comprehension. It is possible that the capacity measured in the studies above was peculiar to the recall of lists of unrelated units. That short-term storage did hold text sentences in the same way was indicated, however, by the studies of Jarvella (1971, 1979) and Sachs (1967). Moreover, it has been assumed to be present and an important functioning component in theories of comprehension (Just & Carpenter, 1980; Kintsch & van Dijk, 1978; Miller & Kintsch, 1980), with indirect evidence of short-term storage playing an important role in some of the work based on those theories. In order to uphold the relation to simple memory tasks, we thought it important to show fully that the regularities found for short-term storage in the recall of unrelated word lists are also found in the recall of sentences from organized text. The regularity we focused on was the serial position effect, the end peak seen in Fig. 1. We were also concerned with whether the capacity for short-term storage estimated in the preceding work would be the same in the processing of text. A series of experiments was carried out (Glanzer, Dorfman, & Kaplan, 1981, Experiments 1a-d) to examine these issues. Subjects listened to tape-recorded text which was interrupted at various points. When interrupted, the subjects were given a probe cue to recall a sentence one (last sentence heard) to four positions back. The recall was to be verbatim. The main results of the study are shown in Fig. 3. The regularities that hold for short-term storage in recall of simple word lists also hold for the recall of text sentences. The end peak is present. The number of sentences held is approximately two.
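As an illustration only, the probe procedure just described (interrupt the text, then cue recall of the sentence one to four positions back) can be sketched in Python. The function names `probed_sentence` and `score_verbatim`, and the all-or-none scoring rule, are assumptions introduced here; the published scoring criterion is not given in this chapter.

```python
def probed_sentence(heard, positions_back):
    """Return the sentence that a probe cue asks the subject to recall.

    heard: sentences presented before the interruption, in order.
    positions_back: 1 is the last sentence heard, 4 is four sentences back.
    """
    if not 1 <= positions_back <= 4:
        raise ValueError("the study probed one to four positions back")
    if positions_back > len(heard):
        raise ValueError("not enough sentences heard before the interruption")
    return heard[-positions_back]


def score_verbatim(response, target):
    """Score 1 for a word-for-word match, else 0.

    Ignoring case and extra whitespace is an assumption made for this
    sketch, not the criterion used in the original experiments.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    return int(normalize(response) == normalize(target))


heard = ["s1", "s2", "s3", "s4", "s5"]
last_heard = probed_sentence(heard, 1)   # the most recent sentence
four_back = probed_sentence(heard, 4)    # four positions back
```

Averaging such scores separately for each probe position would give one point per position on a serial position curve of the kind shown in Fig. 3.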
The ease with which the subjects gave the verbatim recall for the last few sentences supports the idea that short-term storage holds the sentence in verbatim form. These experiments and the experiments that precede them, in particular those of Sachs (1967), give evidence that short-term storage holds two sentences in verbatim form from the preceding text. Jarvella’s (1971, 1979) work raises the issue of whether the unit stored is clauses rather than sentences. Most of the work in the
Fig. 3. Serial position curves for probe recall of text sentences in Glanzer et al. (1981), Experiments 1a-d. Abscissa: sentence probe position.
literature favors a phrasing in terms of simple sentences or clauses (Chang, 1980; Clark & Sengul, 1979; Jarvella & Herman, 1972). However, the issue is not important for the kinds of experimental procedures we used and will not be considered further here.
B. EFFECTS OF STANDARD DISTRACTOR TASKS (COUNTING, ARITHMETIC) ON TEXT PROCESSING: PARALLELS TO SIMPLE MEMORY PERFORMANCE
Although there is evidence that short-term storage may be functioning during text processing, there is no evidence so far that it plays an important role in the processing. The short-term storage effects could be a by-product of processes operating during comprehension rather than a necessary part. To demonstrate that it is a necessary part requires several steps. The first step is to show that a standard operation used to eliminate the contents of short-term storage in simple memory tasks has a damaging effect on the comprehension of text, slowing the process or lowering the amount comprehended and remembered. To show this, two series of experiments were carried out (Glanzer et al., 1981) in which subjects read paragraphs. In the control condition, the sentences followed each other without interruption. In the experimental condition, in one series the sentences alternated with simple addition problems, while in the other series the sentences alternated with a simple counting task. The time the subjects took to read the sentences was measured, and their comprehension was also tested after they had completed the text.
The effect of both the arithmetic tasks and the counting task was to slow the reading of the text sentences by approximately 400 msec. Distractor tasks had little effect here and subsequently on the later comprehension measures. In most cases, comprehension measured after the reading in the interrupted condition was either equivalent to that after the control condition, or lower. When it was lower, the difference was not statistically significant. It is not surprising, however, that there were only weak effects on this later measure. The reading task was self-paced, and the subjects had full opportunity in their slower reading to recover from the distractor task in order to maintain comprehension. Since for our self-paced reading experiments the final comprehension scores here and subsequently were not informative, they will not be discussed further. However, they were collected for all the experiments described. In the experiments (3a and 3b) of this study, using counting as a between-sentence distraction, that distractor task was also used as a concurrent task. Subjects counted aloud as they read the text sentences silently. Such concurrent counting had a very strong effect in slowing reading. Those results indicate the role of short-term storage in intrasentence processing as well. Our focus here, however, is on the effect of the between-sentence distractor task as evidence that short-term storage plays a role in intersentence processing.
C. EFFECTS OF READING DISTRACTOR TASKS ON TEXT PROCESSING: CONTENT VERSUS SET, TEXT VERSUS UNRELATED SENTENCES
In the preceding experiments, two standard types of distractor tasks were used: counting and addition. The fact that they slow reading and that they also lower the end peak in recall of both lists of unrelated items and text sentences may indicate that there is a general influence on comprehension of a reduced short-term storage. It is possible, however, that the slowing observed with the between-sentence distractor tasks was due to the switch from one task to another, that is, from counting to reading or from arithmetic to reading. It will be seen later that such general task-related effects, which we label “set effects,” do exist. However, we were specifically concerned here with establishing the role of short-term storage in carrying information needed for text comprehension. To determine whether we were dealing with a loss of information or a loss of set, an experiment was set up in which the distractor task was also a reading and comprehension task (Glanzer, Fischer, & Dorfman, 1984, Experiment 1). In the control condition, the text was read normally, one sentence after the other, and then a series of factual statements was read. In the experimental condition, the text sentences alternated with the unrelated factual statements. The subjects’ reading time for both sentence types was recorded. A sample paragraph in the experimental condition is given in Table I.
TABLE I
TEXT OF A PARAGRAPH WITH INTERLEAVED FACTUAL STATEMENTS FROM GLANZER, FISCHER, AND DORFMAN’S (1984) EXPERIMENT 1

  Jupiter is unlike the Earth in almost every way.
+ Weizmann, the first president of Israel, was a well-known chemist before he took public office.
  We used to think it had a hard core, covered with a layer of ice.
+ Some Roman emperors used lotteries to give away property and slaves to guests at feasts.
  Now we can see it with a telescope.
+ Pigeons and doves, unlike most birds, keep their bills in water and drink with a pumping action.
  It seems now that it is made entirely of gas.
+ In the reign of Peter the Great, a factory was established for the manufacture of asbestos articles.
  This is mostly hydrogen, some of which is combined to form poisonous compounds.
+ Presidents John Adams and Thomas Jefferson died on the same day, July 4, 1826.
  It is clear that no life can exist there.
+ Amboy Street in Brooklyn was the site of the first birth control clinic in America.
  Not only is the atmosphere poisonous, but Jupiter is too far from the sun.
+ George Washington was the sole survivor of 10 children at the time of his death.
  The planet is very cold.
+ The first field hospital treating wounded soldiers on the battlefield was introduced by Queen Isabella of Spain.
+ Like Venezuela, the economic backbone of the Caribbean islands of Trinidad and Tobago is petroleum.
+ Until 1830, the deaf, dumb, and blind were not included in the U.S. Census.

What was our earlier idea of how Jupiter was constructed?
Of what do we now think Jupiter is composed?
What is one ingredient in the poisonous compounds found on Jupiter?
Are there other gases besides hydrogen on Jupiter?
What about the atmosphere prevents life from existing on Jupiter?
Why is Jupiter cold?

What was the career of the first president of Israel before he took office?
What was given away by lottery in early Rome?
Which birds drink with a pumping action?
In what state was the first birth control clinic in the U.S. located?
In which country was the first field hospital located?
What is the backbone of Venezuela’s economy?

Each sentence was presented separately and in succession. In the continuous condition all the paragraph sentences were presented in immediate succession. In both conditions the last statement was followed by two blocks of comprehension questions. The plus was presented before each factual statement to make sure that the subject could easily distinguish the two sets of material. Below the paragraph are the comprehension questions.
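The two presentation orders described for this experiment can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names `interleave` and `continuous` are hypothetical, and appending leftover factual statements at the end is inferred from the layout of Table I, where the factual statements outnumber the paragraph sentences.

```python
def interleave(text_sentences, factual_statements, marker="+"):
    """Interrupted condition: alternate paragraph sentences with factual
    statements, each factual statement flagged with a leading marker.

    Any factual statements left over once the paragraph ends are appended
    in order (an assumption based on Table I).
    """
    sequence = []
    for index, sentence in enumerate(text_sentences):
        sequence.append(sentence)
        if index < len(factual_statements):
            sequence.append(f"{marker} {factual_statements[index]}")
    for statement in factual_statements[len(text_sentences):]:
        sequence.append(f"{marker} {statement}")
    return sequence


def continuous(text_sentences, factual_statements, marker="+"):
    """Control condition: the whole paragraph first, then the factual
    statements as a separate series."""
    marked = [f"{marker} {statement}" for statement in factual_statements]
    return list(text_sentences) + marked


paragraph = ["Jupiter is unlike the Earth in almost every way.",
             "The planet is very cold."]
facts = ["Some Roman emperors used lotteries.",
         "Pigeons drink with a pumping action.",
         "Amboy Street is in Brooklyn."]
interrupted_order = interleave(paragraph, facts)
control_order = continuous(paragraph, facts)
```

Both orders contain exactly the same items, so any difference in reading time between conditions reflects the ordering rather than the material.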
TABLE II
MEAN READING TIMES (MSEC) FOR THE CONDITIONS IN GLANZER, FISCHER, AND DORFMAN’S (1984) EXPERIMENT 1

                         Reading condition
                         Continuous    Interrupted
Text                     3896          4210
Unrelated sentences      6690          6695

Table II shows the reading times for the continuous and the interrupted condition and for the text sentences and the unrelated factual statements. The text sentences were slowed by more than 300 msec in the experimental condition. It can be concluded therefore that the slowing occurs when the distractor task involves the same kinds of processes as the paragraph comprehension task, that is, even when there is no change in processing set. The slowing cannot be ascribed simply to a change in set. Also very important is the absence of any interruption effect on the unrelated factual statements. In the interrupted condition the unrelated factual statements were interrupted by the paragraph sentences just as the paragraph sentences were interrupted by the unrelated factual statements. The data show that the interruption effects occur, however, only for the continuous text. Reading times increase for the related text sentences after an interruption, but do not increase for the unrelated factual statements. The data support the following conclusions. Short-term storage holds information concerning preceding text. The loss of this information results in a slowing of reading. The slowing may result from the subjects taking additional time to recover missing information before or while continuing the following text. It may also result from the subjects continuing their reading without recovering the missing information, but reading slowly because they are handicapped by its absence. The fact that comprehension tests given after the text was read show negligible effects of interruption supports the first alternative. If the subjects had been reading that subsequent text at a lesser level of comprehension, some sign of that lesser level should have appeared in the later comprehension tests.
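The size of the interruption effect for each sentence type follows directly from the means in Table II. The short sketch below works through that arithmetic; the dictionary layout and the name `interruption_effect` are illustrative choices, not part of the original report.

```python
# Mean reading times (msec) taken from Table II.
reading_times = {
    "text": {"continuous": 3896, "interrupted": 4210},
    "unrelated": {"continuous": 6690, "interrupted": 6695},
}


def interruption_effect(times):
    """Slowing (msec) in the interrupted condition relative to continuous."""
    return times["interrupted"] - times["continuous"]


effects = {kind: interruption_effect(times)
           for kind, times in reading_times.items()}
print(effects)  # {'text': 314, 'unrelated': 5}
```

The 314-msec slowing for text sentences against 5 msec for unrelated statements is the contrast the conclusions above rest on.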
D. ROLE OF THEMATIC OR TOPIC INFORMATION IN SHORT-TERM STORAGE: FIRST ATTEMPT
The experiment on probe recall showed that subjects could recall verbatim one to two of the most recent sentences read. That finding and its congruence with
the findings from free recall of unrelated items led us to think that the main information carried in short-term storage was those sentences. However, theorists have assumed the presence of thematic or topic statements, higher-order statements (Kieras, 1981; Kintsch & van Dijk, 1978), during the processing of text, and have assumed that those statements were in short-term storage. These assumptions seem reasonable, since it would be expected that if subjects were interrupted in their reading and were asked what they were reading, they could produce a topic statement. But to draw this conclusion, several things have to be established. It is necessary to determine the timing characteristics of such topic statements in order to interpret that presumed performance. It is necessary to determine, for example, whether subjects would produce a topic statement as easily as a report on the last sentence read. The time to make the response is of major importance in determining whether thematic information is indeed carried in short-term storage. If subjects take longer to give thematic information than to report the last sentence read, then its retrieval may be from long-term storage. (In Section III,I we will describe the use of time to respond to thematic information as a technique to analyze the presence of that information in short-term storage.) It is also necessary to establish whether topic statements play a critical role in text processing. One way to determine whether topic statements have this role is by measuring the effect on text processing when they are eliminated from short-term storage. We made several attempts using this technique to measure the presence and the role of thematic or topical information in short-term storage. As will be seen, the initial attempts did not succeed. The first attempt was with a procedure that added a step to that used in the preceding experiments. As before, a distractor task was used to clear short-term storage.
Then the subjects were given a topic word that would presumably place the topic or theme again in short-term storage. The re-placing of the theme into short-term storage should, if the topic were important for ongoing processing, counter the effect of the distractor task. Text was read in either a continuous or interrupted condition. The distractor task (in the interrupted condition) was addition. Crossed with these two conditions was a theme versus no-theme condition. In the theme condition each paragraph sentence was preceded by a word designed to remind the subjects of the topic. In the paragraph shown in Table I the reminder word was “Jupiter.” In the no-theme condition the word preceding the next sentence would be a neutral item such as “text.” Our expectation was that if part of the critical information that subjects lost from short-term storage with the distractor task was thematic information, then furnishing them with a topic word would lessen the effect of the distractor. The results were negative. Although there was a strong effect of the distractor task, there was no effect of the topic reminder word.
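The crossing of the two factors (continuous versus interrupted, theme versus no-theme) can be sketched as a small trial builder. This is only an illustration under stated assumptions: `build_trial` is a hypothetical name, and placing the addition task only between sentences (never before the first one) is assumed here rather than reported in the text.

```python
from itertools import product


def build_trial(sentences, interrupted, theme, topic_word, neutral_word="text"):
    """Return the ordered events for one paragraph in one design cell.

    Every sentence is preceded by a cue word: the topic word in the theme
    condition, a neutral word otherwise. In the interrupted condition an
    addition task precedes the cue for every sentence after the first
    (assumption: no distractor before the opening sentence).
    """
    events = []
    for index, sentence in enumerate(sentences):
        if interrupted and index > 0:
            events.append(("distractor", "addition"))
        events.append(("cue", topic_word if theme else neutral_word))
        events.append(("sentence", sentence))
    return events


# The four cells of the 2 x 2 design as (interrupted, theme) pairs.
design_cells = list(product([False, True], repeat=2))

example = build_trial(["s1", "s2"], interrupted=True, theme=True,
                      topic_word="Jupiter")
```

If the topic word restored critical information, reading times in the interrupted theme cell should fall toward those of the continuous cells; the negative result reported above corresponds to no such reduction.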
There were several possible reasons for the negative results. One possible reason was that thematic information was not being carried in short-term storage or was not important for ongoing processing. Another was that the thematic information was selectively maintained by the subjects despite the distractor task, although our detailed analysis of the data gave no sign of such special maintenance. A third possible reason was that the words we had selected did not correspond to the topic or thematic information subjects normally carry forward. Perhaps that information was more detailed than the information that was furnished, or it may have been carried in a different, more highly processed form. To try the large number of possible thematic statements did not seem feasible, particularly since those statements might be carried in a highly processed, abstract form. We decided instead to concentrate on the information we were fairly sure the subjects were carrying in short-term storage and to analyze its role further. In the course of this analysis, we thought we might identify several types of information being carried, including topic information.
E. SPECIFIC CONTENTS OF SHORT-TERM STORAGE IN READING
We were fairly sure that the subjects were carrying the last one or two sentences in short-term storage, but we were somewhat less sure that they were critical for ongoing processing. We were also unsure whether those last two sentences were the only information being carried. The question considered next was whether all that the subjects needed for the continued processing of text were those last one or two sentences. To answer that question we tried to determine whether subjects would recover completely from an interruption if they were given the last sentence or two that preceded the interruption (a distractor task). We would clear short-term storage with a distractor task, reinsert the last one or two sentences, and then measure the subjects’ reading times. The logic is similar to that of the preceding experiment. If it turned out that resupplying the subjects with the last two sentences was sufficient for them to continue with normal reading, then the ongoing processing could be viewed as primarily sentence-to-sentence linkage. If the last two sentences did not suffice, then another type of necessary information would be indicated. In the following experiments, the massive use of distractors between all sentences of a text that characterized the preceding experiments was replaced by a more focused technique. A distractor or interruption occurred at one place in a text, and the effect of that distractor with respect to reading time for successive sentences was measured. Thus, it was possible to view not only the impact of the interruption, but also the course of recovery from the interruption. In the first experiment (Glanzer et al., 1984, Experiment 3) with this technique, three
conditions were used: (1) continuous, in which the paragraph sentences were read in immediate succession; (2) interrupted, in which the paragraph was interrupted by the reading of another unrelated text; and (3) interrupted with repetition, in which the paragraph was interrupted as in condition 2, but instead of continuing after the interruption with the next sentence of the paragraph, the last two sentences read in that paragraph (before the interruption) were repeated. The interruption was effected by alternating blocks from different texts so that both texts furnished information concerning the effects of interruption. A sample sequence is presented in Table III. The results of the experiment are shown in Fig. 4. The control condition is the continuous condition and affords the baseline for the other reading conditions. As might be expected, the first four sentences, which in the experimental conditions precede the interruption, are read at the same speed in all three conditions. In the simple interruption condition, the sentence that immediately follows the interruption, sentence 5, is slowed by an average of 355 msec. The sentence that follows it, sentence 6, has still not returned to the control speed (although this elevation is not statistically significant). The third sentence after the interruption has fully returned to normal reading speed. These data support the assertion that two recent sentences have to be carried in short-term storage. In the interruption with repetition condition the contents of short-term storage were presumably removed and the last two sentences (before the interruption) placed again in short-term storage by presenting them again. The data show that the repetition eliminates almost all of the effect of the interruption. The slight elevation of 84 msec on sentence 5 above the control condition is not statistically significant. We could conclude that the reinsertion of the last two sentences fills in all the information lost during the interruption.
TABLE III
A SEQUENCE OF INTERRUPTED PARAGRAPHS IN GLANZER, FISCHER, AND DORFMAN’S (1984) EXPERIMENT 3

1 There have been many scientists over the years who have wanted to produce cheap synthetic diamonds.
1 One chemist tried putting charcoal made from sugar between two blocks of very hot iron.
1 He then plunged the blocks into cold water causing a sudden contraction of the iron.
1 The contraction of the iron was supposed to exert great pressure on the sugar charcoal.

2 During the Middle Ages, the British military was controlled by people who spoke the French language.
2 It was further influenced by France because many of its campaigns were fought in that region.
2 These two factors resulted in the introduction of many French military terms into the English vocabulary.
2 Many of the words acquired long ago from the French are still commonly used today.

1 According to his theory, the sudden pressure on the charcoal would turn it into a diamond.
1 After many unsuccessful experiments, he finally had something that he thought was the real gem stone.
1 However, later tests showed that he had failed once again and merely produced a carbide.
1 Carbides are carbon compounds that are quite different from the carbon which is a diamond.

2 These two factors resulted in the introduction of many French military terms into the English vocabulary.
2 Many of the words acquired long ago from the French are commonly used today.
2 Some of the words which we acquired from the French include “army,” “navy,” and “sergeant.”
2 Some of the terms were borrowed from the French to identify a new object or idea.
2 In other cases, objects or ideas were identified by both an English and French word.
2 For example, the word “battle” comes from the French and the word “fight” from the English.

There has never been much interest in producing man-made diamonds.
One scientist tried to make a diamond by compressing sugar.
This scientist thought that he had created a real gem.
He actually produced a carbon compound called a carbide.
The English army had a great influence on the French.
The English language contains military terms acquired from the French.
Words were borrowed in order to communicate with the French.
“Battle” and “fight” were both borrowed from the French.

The first paragraph is in the interrupted without repetition condition. The second paragraph is in the interrupted with repetition condition. Each sentence was presented separately and in succession. In the continuous condition all of the first paragraph was presented and then all of the second. All pairs of paragraphs were followed by eight comprehension questions. The numeral 1 or 2 was presented before each paragraph sentence to make sure that the subject could distinguish between the two paragraphs. Below the paragraph are the true-false comprehension questions.

Fig. 4. Mean reading times across eight sentences of text for the three experimental conditions of Glanzer et al. (1984), Experiment 3. Abscissa positions 3R and 4R refer to repetitions of sentences 3 and 4. Breaks in the curves symbolize interruptions by another text.

However, the 84-msec elevation was bothersome, particularly since such elevations appear in the subsequent experiments. This deviation will be considered again. Also of interest are the reading times for the repeated sentences in the repetition condition. Both repeated sentences are read faster the second time than the first. However, there is a considerable drop in the reading time for the second repeated sentence as compared to the first. Even the reading of a repeated sentence is aided by the presence of a preceding sentence in short-term storage. The faster reading of a repeated sentence is not unexpected, but it does require an explanation in terms of any theory of comprehension that is adopted. The explanation we will use is closely related to the reason we now think that there is an advantage for the subjects in having a verbatim representation in short-term storage. The speed in reading the repeated sentence is interpreted as arising from its special relation to the trace it left in the text representation during its first reading. The repeated verbatim sentence is a strong cue for eliciting that part of the representation. The subject can therefore fit the repeated sentence into the representation without a search in long-term storage for its place in that representation. The place is automatically accessed. The additional speeding that occurs for the second repeated sentence may indicate that two verbatim sentences, the first in short-term storage from the previously read sentence, the second entering with the currently read sentence, are better cues for eliciting relevant parts of the underlying representation than one cue. The other possibility is that for the second repeated sentence there are three useful cuing units held in short-term storage: the first sentence (stored), the parts of the underlying text representation recovered by its presentation, and the second verbatim sentence (just entered). The idea of the role of both incoming and stored information as retrieval cues and their additivity will be considered again. They are additions to our initial formulation.
Except for the slight 84-msec elevation previously mentioned, the data support the argument that the presence of the preceding one or two sentences in short-term storage is sufficient for normal reading. This was the argument that was made in the paper cited (Glanzer et al., 1984). The next question concerned the specific information in those repeated sentences that permitted the subjects to recover from an interruption that cleared short-term storage of its contents. Several possible factors may be involved. One was already discussed in considering the faster reading of repeated sentences, namely, their cuing function. The preceding text sentences elicit the relevant parts of the underlying text representation. The second factor is higher-order thematic information, which may be embedded in or cued by the preceding sentences. A third factor, and the one considered next, concerns the way in which the new sentence is related to the preceding text. This factor, brought to the fore by the work of linguists (Grimes, 1975; Halliday & Hasan, 1976), will be considered here under the label of sentence-to-sentence linkages. It will be argued later that the linkage factor is closely related to the cuing factor. The information needed for sentence-to-sentence linkage is clearly returned with the repetition of sentences. The next question addressed was the role of that factor in the subjects’ use of short-term storage. Some of the data of the experiment previously described could be analyzed to give some light on the issue of linkage information. Thematic information will be considered again later.
F. INITIAL EVIDENCE FOR THE ROLE OF LINKAGE INFORMATION IN SHORT-TERM STORAGE
Using a suggestion from Grimes (1975) concerning factors involved in making text coherent, we classified sentences according to their role in linking with preceding text. Two classes were defined: dependent and independent. Dependent sentences contain words or phrases that can only be interpreted fully if the preceding text has been read. These included certain classes of anaphora such as pronouns and classes of connectives (e.g., causal connectives). An example of a dependent sentence is, “Eventually he learned to make what he needed instead of having to search for them.” The words “eventually,” “he” and “them” make the sentence dependent on the preceding text. An example of an independent sentence is, “Scientists who study gorillas in their native African habitat know what to do if one charges.” The topic sentence of a paragraph, particularly when it is the first sentence of the paragraph, is almost always independent. Other sentences in a paragraph may, however, also be independent. The preceding sentence on gorillas came from the middle of a paragraph. The idea of dependence and independence is closely related to the linguistic analysis of text coherence. For the present, we will consider the idea as defined here and as it relates to the subject’s processing of text. The first question of interest was the extent to which this linkage factor accounted for the effect of the
TABLE IV
MEAN COMBINED READING TIMES (MSEC) FOR THE CONDITIONS IN GLANZER, FISCHER, AND DORFMAN’S (1984) EXPERIMENTS 3 AND 4
Reading condition
Sentence type
Control, continuous
Interrupt, no repetition
Dependent Independent
4429
4876 4807
4877
distractor task. To do this we classified the text sentences of the preceding experiment and another experiment involving such interruptions as dependent or independent. We then examined the reading times for each class in both the continuous and interrupted condition. The means are given in Table IV. It appeared that the interruption effect occurred only with dependent sentences. The data seemed to indicate again that the effect of the interruption was wholly due to the disruption of sentence-to-sentence linkage. This implied that the key and only needed component carried in short-term storage was linkage information. Linkage information could, of course, be nicely covered by the storage of verbatim sentences in short-term storage. As will be seen later, the data from this post hoc comparison were somewhat misleading. Moreover, there were signs mentioned earlier that the repetition of previously read sentences was not sufficient for the full recovery of reading speed. There was the slight elevation of 84 msec of the interrupt with repetition condition over the control condition in sentence 5 (see Fig. 4). This suggested that something was needed by the reader in addition to the last two sentences. The elevation was not statistically significant, but similar elevations appeared in this condition in the next two experiments. Therefore, the possibility remained that something else had to be recovered by the subject or be replaced in short-term storage. The possibility we considered next was the thematic or topic information that we had not been able to demonstrate earlier.
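The dependent/independent classification used in this section can be illustrated with a deliberately crude sketch. The marker list below is hypothetical and far narrower than any real coding scheme: it flags a sentence as dependent when it contains one of a few pronouns or connectives of the kind named in the text. The original classification was presumably made by human judgment, so this is not the authors' method.

```python
import re

# Hypothetical marker list, assumed for illustration only. Possessives
# such as "their" are left out because they are often resolved inside
# the sentence itself, as in the gorilla example.
DEPENDENCY_MARKERS = {
    "he", "she", "they", "him", "them", "it",
    "this", "these", "those",
    "eventually", "therefore", "thus", "consequently", "however",
}


def classify(sentence):
    """Label a sentence 'dependent' if it contains any marker word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if any(word in DEPENDENCY_MARKERS for word in words):
        return "dependent"
    return "independent"


dependent_example = ("Eventually he learned to make what he needed "
                     "instead of having to search for them.")
independent_example = ("Scientists who study gorillas in their native "
                       "African habitat know what to do if one charges.")
labels = [classify(dependent_example), classify(independent_example)]
```

A word-list rule of this kind will misclassify sentences whose pronouns have antecedents within the same sentence, which is one reason to treat it only as a sketch of the idea.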
G. ROLE OF THEMATIC OR TOPIC INFORMATION: SECOND ATTEMPT

In the next experiment (Glanzer et al., 1984, Experiment 4), we pitted topic information against the information contained in the immediately preceding sentence. The materials and basic structure of the preceding experiment were used. There was a control, continuous condition, an interruption without repetition condition, and now two interruption with repetition conditions. In one, the sentence that immediately preceded the interruption was repeated. In the other, a topic sentence, the first sentence of the paragraph, was repeated.

Fig. 5. Mean reading times across nine sentences of text for the four experimental conditions of Glanzer et al. (1984), Experiment 4. Abscissa position R refers to the repetition of either sentence 1 or 5. Breaks in the curve symbolize interruptions by another text. Conditions plotted: control; interrupt, no repeat; interrupt, repeat 5; interrupt, repeat 1.

The results of the experiment are shown in Fig. 5. The results for the control condition, the interruption without repetition condition, and the interruption with repetition of sentence 5 (the last paragraph sentence read before the interruption) condition replicate the findings of the preceding experiment. Interruption with repetition of the topic sentence, however, disrupts the reading. The increase over the control condition is over 500 msec. The results seemed to indicate that topical information was unimportant and that the sentence-to-sentence linkage information was critical. The story is not complete,
however, since the information in the preceding sentence and topical information were in opposition here. The effects of the two were not being measured separately. However, the indications were again strong that a major factor was linkage information. These indications were further strengthened by the results from a follow-up experiment. That experiment showed that the specific location of the repeated sentence in the preceding text (early vs late) was not the important variable. A sentence that linked appropriately with the next text sentence would, when repeated, serve equally well to help subjects recover from an interruption. At this point, short-term storage seemed to hold nothing but the last one or two sentences read, and their function was solely to permit the subject to link successive sentences to form a coherent text. The weight of evidence was against any strong, positive effect of thematic information in immediate processing.

H. DIRECT EXAMINATION OF THE BUILDING OF COHESION, SENTENCE-TO-SENTENCE LINKAGE
The post hoc analysis (Section III,F) indicated that a critical type of information carried in short-term storage was related to the linkage or dependence of the new sentence to preceding text. Linguists have considered what was labeled dependence under the heading of cohesion devices. We will follow the usage and system introduced by Halliday and Hasan (1976), who have outlined four major classes of grammatical cohesion devices: reference (pronominals, demonstratives, definite articles, and comparatives), substitution, ellipsis, and connectives (additive, adversative, causal, and temporal). The devices used in the studies to follow were primarily in the class of reference: demonstratives, definite articles, and pronominals. A second major class used was conjunction, that is, connectives such as “however” and “thus.” Also used were substitutions, for example, “phenomenon” substituting for a more specific noun such as “hibernation” in a succeeding sentence. The following dependent sentence uses both reference (the pronominals “them” and “its”) and conjunction (the adversative connective “but”): “But few of them realize what a remarkable achievement its construction was.” The sentence cannot be understood fully without some preceding text. It would be made independent by rewriting it as follows: “Few people crossing the Brooklyn Bridge realize what a remarkable achievement the bridge’s construction was.”

In order to examine the role of grammatical cohesion devices, or linkages, in short-term storage and text comprehension, a series of four studies (Fischer & Glanzer, 1986) was carried out. The studies varied the dependence of specific sentences in the text and examined the reading times for those sentences when they had been preceded by an interruption. The interruption was designed to remove the contents of short-term storage as in the earlier experiments reported here. A sample of the material used is given in Table V. The distractor task used
TABLE V
SAMPLE TEXT USED BY FISCHER AND GLANZER (1986), EXPERIMENT 1^a

Computers

The word “computer” may be used to refer to any device that calculates or computes.
However, its use is most often restricted to a particular device that has several distinguishing features. (Dependent)
The word “computer” is usually used to refer to a particular device with several distinguishing features. (Independent)
Computers are always electronic and because of this they can operate at very high speeds. (Independent)
They are always electronic and because of this they can operate at very high speeds. (Dependent)
A second important feature is that they have the ability to retain facts and figures. (Dependent)
An important feature of computers is that they are able to retain facts and figures. (Independent)
A computer’s ability to retain facts and figures is referred to as memory or internal storage. (Independent)
The feature is quite often referred to as the memory or internal storage of the computer. (Dependent)
Information stored in the machine’s memory can be recalled quickly and easily at some future time.
Another distinguishing feature is that a computer holds in its memory a set of instructions.
A set of instructions that is held in a computer’s memory is called a program.

^a The sentences preceded by a (1) appeared in one version of the text, the sentences preceded by a (2) in the other, counterbalanced version. The material in parentheses, numerals and classification, did not, of course, appear in the text presented to subjects.
in the first experiment of this series was the reading of a set of unrelated factual statements. We expected, with these materials, to repeat the findings shown in Table IV: a disruptive effect of the distractor task on the dependent sentences, but not on the independent sentences. What we found was a differential effect of the distractor task on the dependent sentences, but a considerable effect on the independent sentences as well. The results are given in Table VI, and we will rely on those results for our further discussion, since they are based on a wider sampling of sentences and a careful matching and counterbalancing of the sentence conditions. That statement does not apply to the post hoc analysis that gave the data in Table IV.

TABLE VI
MEAN READING TIMES (MSEC) FOR THE CONDITIONS IN FISCHER AND GLANZER’S (1986) EXPERIMENT 1

                     Reading condition
Sentence type     Continuous     Distractor
Dependent            4730           5718
Independent          4822           5410

A main and expected finding seen in Table VI was the strong differential disruptive effect of the distractor on the dependent sentences. This fully supports the effect found in Table IV. Table VI also indicates that in the continuous condition dependent sentences are read faster than independent sentences. This effect appears in all five experiments we have run in which this factor was analyzed, but it is relatively small in size and not statistically significant in any one of the experiments. Moreover, the basis for the effect is not clear at present. It may be due to the efficiency with which text can be organized when it consists of dependent sentences. Dependent sentences indicate very clearly how the successive sentences are linked. However, the effect may also be due to word frequency effects. Dependent sentences have higher-frequency words (e.g., pronouns instead of nouns) than independent sentences. These points do not, of course, take away from the disruptive effect on dependent sentences in the interruption condition.

Another important characteristic of the data in Table VI is the large increase of reading time in the interruption condition, which cannot be ascribed to the loss of linkage information. In the case of independent sentences, the distractor task increases the reading time in resuming the text by nearly 600 msec. This finding indicates that another class of information besides linkage information is lost with the interruption. One class of information that may be missing, information removed by the distractor task, is topic information. Further data on topic information will be presented later (Section III,I). The function of the topic information during ongoing processing will also be considered later. It will be proposed that topic information serves to cue relevant parts of the text representation for retrieval during processing.

Two further experiments of the same type as the last were carried out, one using digit recall, the other using addition problems as the distractor task. The same general pattern of results was obtained as that shown in Table VI. Dependent sentences were slowed by interruption more than independent sentences. The overall effect of the interruption was much greater with these other distractor tasks, however.
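The contrast drawn from Table VI can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function name `interruption_cost` are labels introduced here, not part of the original analysis. The cell means are those of Table VI.

```python
# Cell means (msec) from Table VI (Fischer & Glanzer, 1986, Experiment 1).
# The data layout and function name are illustrative assumptions.
TABLE_VI = {
    "dependent": {"continuous": 4730, "distractor": 5718},
    "independent": {"continuous": 4822, "distractor": 5410},
}


def interruption_cost(sentence_type):
    """Increase in mean reading time (msec) produced by the distractor task."""
    cell = TABLE_VI[sentence_type]
    return cell["distractor"] - cell["continuous"]


dependent_cost = interruption_cost("dependent")      # 5718 - 4730
independent_cost = interruption_cost("independent")  # 5410 - 4822

# Extra cost carried by dependent sentences over independent sentences.
differential_cost = dependent_cost - independent_cost

print(dependent_cost, independent_cost, differential_cost)  # 988 588 400
```

The 588 msec cost for independent sentences is the increase of nearly 600 msec noted above. The further 400 msec for dependent sentences is the differential effect of the distractor on those sentences.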
With digit recall the independent sentences increase in reading time by 895 msec, and with addition problems by 1704 msec. The increase with a reading distractor task is only 588 msec (Table VI). These large increases in reading time with digit recall and addition suggest still another factor in determining reading time: a set for reading. Initially, we assumed, on the basis of ideas from interference theorizing, that using reading as a distractor task would produce larger effects than tasks that did not involve reading. The performance called for in the distractor task does have an effect, but not the one we expected. The less like reading the distractor task is, the more disruptive it is. We labeled the additional factor “reading set.” This set may be the same as the very general structures often assumed to control the reading of a text and include the goals and interests of the reader in reading the text (Kintsch & van Dijk, 1978; Meyer, 1984). They may also include the more specific knowledge structures that apply to a particular type of text, for example, a narrative story schema for a simple story (Mandler & Johnson, 1977) or the organizational schemata for expository prose outlined by Meyer (1975, 1984). The factor we are calling reading set may include such controlling schemata. The increases in reading time caused by the loss of reading set reflect, in part, the time needed to reconstruct or reactivate the controlling schemata that the subjects use in processing texts. A considerable amount of further work is needed, however, to specify how this factor appears or reappears during ongoing processing.

In a final experiment in the series, the effect of both arithmetic and reading distractor tasks was examined using within-subjects comparisons. In addition, various classes of distractor reading material were compared: cases in which the interpolated sentences were unrelated to the text, cases in which the interpolated sentences offered possible but incorrect referents for the anaphors in subsequent sentences, and cases in which the interpolated sentences were thematically related to the interrupted text. Again, on the basis of interference theorizing, we expected that thematically related distractor sentences would be particularly disruptive.
The data showed, however, that any effect the thematic information had in this case was facilitative.

On the basis of the various classes of distractor tasks, we were able to define four main factors as determinants of reading time. They were the following: independence, presence of needed linkage material in short-term storage, presence of topic or theme in short-term storage, and set. These four factors were incorporated in a multiple-regression analysis that is summarized in Table VII. The regression weights can be translated into the time it takes the subjects to recover from the lack of the listed factor. For example, if needed linkage information is not in short-term storage, as would occur with a dependent sentence after a distractor task, it costs the subject 448 msec to recover. The regression analysis was then extended to include all four experiments of the series, adding some constants to cover changes in the base reading rates for the four experiments. When the analysis was extended in this fashion, we obtained a similar set of weights for the four main factors in Table VII, but with higher significance levels. (For example, the weight for dependence was now 156 and its significance level p < .076.) The multiple correlation squared was .986, F(8,13) = 112.12, p < .001. The plot summarizing the relation of the predicted scores to obtained scores for the various conditions in all four experiments is shown in Fig. 6.

TABLE VII
RESULTS OF MULTIPLE REGRESSION ANALYSIS FOR FISCHER AND GLANZER’S (1986) EXPERIMENT IV^a

Factor                                       Regression weights    t (df = 5)      p
Dependence (b1)                                     142               1.17       <.296
Needed linkage information in STS (b2)              448               3.49       <.018
Theme in STS (b3)                                   364               4.56       <.007
Reading set (b4)                                   1461              16.30       <.001

^a Regression constant (a) = 5159. Multiple correlation squared = .990, F(4, 5) = 124.22, p < .001.

Fig. 6. Plot of predicted against obtained reading times (msec) for the conditions in Experiments I, II, III, and IV of Fischer and Glanzer (1986).

The results of the set of experiments led us to an expanded picture of the factors involved in the process of text comprehension. In this picture the subject carried forward in short-term storage the last one or two sentences read in
verbatim form. This form served two functions. First, it facilitated the linkage process involved in text comprehension. The relation of the new sentence to preceding text was facilitated by having both available simultaneously. The linkage has to be made eventually, however, with the underlying representation, which is in long-term storage. The second function of the recent sentences in short-term storage is as a cue to retrieve the appropriate section of the underlying representation and to permit linkages to be made with it. The result of that process is the modification and extension of the representation.

In addition to the verbatim sentence information, there is some evidence, still not very strong at this point, that the subjects are carrying forward thematic or general topic information. This topic information may function as an additional cue for the retrieval process outlined above. The functions of both sorts of information, the detailed and the general, may also be viewed as addresses for text information in long-term storage.
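The regression summarized in Table VII can be read as an additive model of reading time. The following sketch assumes, for illustration only, that each factor is coded 1 when it is lacking and 0 otherwise, so that each weight is the recovery cost described in the text. That coding, the reading of the dependence weight as a cost for independent sentences, the function name `predicted_reading_time`, and its argument names are assumptions and are not taken from Fischer and Glanzer (1986). The constant and weights are those of Table VII.

```python
# Regression constant and weights (msec) from Table VII.
# ASSUMPTION: each predictor is coded 1 when the factor is lacking, else 0.
CONSTANT = 5159
WEIGHTS = {
    "independent": 142,       # b1: assumed to apply to independent sentences
    "linkage_missing": 448,   # b2: needed linkage information not in STS
    "theme_missing": 364,     # b3: theme not in STS
    "set_missing": 1461,      # b4: reading set lost
}


def predicted_reading_time(independent=False, linkage_missing=False,
                           theme_missing=False, set_missing=False):
    """Predicted reading time (msec) under the assumed 0/1 coding."""
    lacking = {
        "independent": independent,
        "linkage_missing": linkage_missing,
        "theme_missing": theme_missing,
        "set_missing": set_missing,
    }
    return CONSTANT + sum(WEIGHTS[name] for name, flag in lacking.items() if flag)


baseline = predicted_reading_time()
no_linkage = predicted_reading_time(linkage_missing=True)
print(no_linkage - baseline)  # 448, the recovery cost cited in the text

# Hypothetical case: linkage, theme, and reading set are all lacking.
print(predicted_reading_time(linkage_missing=True, theme_missing=True,
                             set_missing=True))  # 7432
```

Under this reading, the costs simply add. Which experimental conditions lack which factors is left as an argument to the function rather than asserted here.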
I. EXAMINATION OF THEMATIC OR TOPIC INFORMATION: THIRD ATTEMPT
Our next step was to attempt to look at theme or topic information again and in a more direct fashion. The positive evidence for the role of theme or topic information in the preceding section was either indirect (the effect of distractor task on the independent sentences) or weak (the effect of a distractor sentence on the same topic as the main text). Two experiments were therefore set up to measure further the effects of topic as opposed to detail information during the course of reading (Nolan & Glanzer, 1985). For these experiments, we constructed texts that discussed several main ideas. Each text was four paragraphs long. Each paragraph expressed a different main idea or topic and was considered as a separate thematic unit. However, the beginning sentence of each paragraph was not explicitly cued; that is, there was no indenting to mark off paragraphs. The way in which we explicitly stated the theme for each paragraph was to begin each one with a statement that summarized its contents. These initial sentences are referred to as the topic sentences of the paragraph. Each of the paragraphs thus had a deductive paragraph structure (Gilliand, 1975). They began with a generalization, and the other sentences in the paragraph expressed the details that supported this general statement. The construction of topic sentences was guided by van Dijk’s (1979) suggestion that the theme or topic of a sequence of propositions is a statement of both the central referent of that sequence and the major predications about it. A different procedure than the one using distractor tasks, as in the preceding experiments, was developed. Now the subjects’ response time in answering
questions about the text they had been reading was measured. In both experiments subjects read the text sentence by sentence, as in the preceding work. They were interrupted by probe questions during the paragraph that were answered true or false. There were two classes of probes: topic and detail. The topic probe paraphrased the topic sentence of the paragraph, while the detail probe paraphrased the content of one of the detail sentences in the paragraph. The response times for the answers to these two types of probes were used to tell us about the presence of topic and detail information in short-term storage and the length of time each was maintained in storage. In the first experiment, the probe questions were given either immediately after the sentence containing the relevant information or after three other intervening sentences. A sample text and probes are given in Table VIII. The time to respond to the questions was assumed to depend on the contents of short-term storage. Questions that can be answered by referring to information in short-term storage will be responded to more quickly than those that require the retrieval of information from long-term memory. It is clear from our previous work that the information necessary to answer a question about the sentence just read is present in short-term storage. That work also clearly indicates that a detail sentence is carried in short-term storage for only a short time, namely, for one or two intervening sentences. Thus, there is a clear expectation about the response times for detail probes. They should show a distance effect. Probes that refer to information from three sentences back should be responded to more slowly than probes referring to the last sentence read. The expectation for the topic probes is less clear. 
There is a considerable amount of literature that claims that topic information is privileged and presumably is carried longer during the course of reading (Kintsch & van Dijk, 1978; Miller & Kintsch, 1980). If this is the case, the topic probes should not show the distance effect detail probes should show. The measure used was the subjects’ speed of responding to the probes. The results given in Table IX indicate that there is a strong distance effect on the detail probes, as expected. The mean response time increased by 664 msec as the distance between sentence and probe increased. The topic probes, however, produce a different pattern. Although there is an increase of 52 msec in response time with an increase of distance, the increase is slight and not statistically significant. The large difference in the distance effects for the two types of probes supports the idea that topic information is privileged. Topic information is carried longer in short-term storage than detail information. It becomes important to consider the way that this occurs. This problem will be considered later. We also looked at the subjects’ reading time when they resumed reading the text after responding to the probe. Responding to a probe affects the contents of short-term storage. If the subjects have answered a question about information that would normally be present in short-term storage, they should be easily able
TABLE VIII
SAMPLE TEXT OF TWO PARAGRAPHS WITH TOPIC AND DETAIL PROBES USED IN NOLAN AND GLANZER’S (1985) EXPERIMENTS 1 AND 2^a

A career as a cowboy meant living in poverty and facing many dangers. T-0 Most cowboys shunned marriage because they were always on the move and their pay was too low to support a family. D-0 Their bunkhouse was little more than a slum, filthy and smelly. On a given day, a cowboy could find himself in the middle of a prairie fire, quicksand, or a stampede. T-3 He could be thrown or kicked by a horse, charged by a cow, or half-frozen on a winter search for strayed livestock. D-3 Exposure to the extremes of weather frequently brought on pneumonia, a leading cause of cowboy deaths.

Topic probe: In their careers, cowboys faced many dangers and lived in poverty.
Detail probe: Their low pay and constant moving around kept cowboys from marrying.

The “grand drive” of the cattle from range to marketplace was particularly arduous. T-0 Each drive generated its special measure of toil and trouble. D-0 Steers would drown in sinkholes at river crossings. Indians constantly tried to beg or steal cows. T-3 There was rarely enough water for the cattle. D-3 Settlers drove the trespassing herds from their lands with guns. The weary cowhands would work 18 hours a day and were always tired. The only comforts the cowboy had on the drive were his bedroll and a campfire at night.

Topic probe: The grand drives of cattle to the marketplace were particularly difficult.
Detail probe: Each grand drive had its own special measure of work and trouble.

^a T-0, T-3, D-0, D-3 indicate the location of the topic (T) and detail (D) probe questions at each of the distances (0, 3) in Experiment 1. The probe questions placed at each of those locations are listed at the end of each paragraph. In Experiment 2 all probe questions were three or more sentences from the related sentence.
to read the next sentence in the text. Thus, when they have responded to a topic probe at a distance of zero or three intervening sentences or a detail probe at a distance of zero intervening sentences, subjects should not be at a disadvantage in reading the next sentence in the text. Information closely related to what they normally carry in short-term storage is elicited. In contrast, when subjects respond to a detail probe at a distance of three intervening sentences, short-term storage contains a representation of this preceding detail sentence. This representation would not normally be present in short-term storage, and its reentry into short-term storage has probably led to the loss of information relevant to the interpretation of the next sentence. Although mean reading time was indeed longer in this condition than in the other three, the effect was not statistically significant.

TABLE IX
MEAN RESPONSE TIMES (MSEC) TO TOPIC AND DETAIL PROBES FOR NOLAN AND GLANZER’S (1985) EXPERIMENT 1

                  Distance
Probe type       0        3
Topic          4468     4520
Detail         4669     5333

The second experiment further explored distance effects for both detail and topic information. The purpose of this experiment was to determine whether topic information was carried across a paragraph boundary. Our expectation was that it was not. Kintsch and van Dijk (1978), however, have suggested that text coherence is maintained on two levels: the microstructure level, which represents the details of the text, and the macrostructure level, which represents the gist or main topics of the text. At the macrostructure level text coherence is maintained by the linking together of macropropositions which represent the main ideas or major topics of the text. This is the mechanism that allows the ideas expressed in the text to be linked to the topic of the discourse as a whole or to the topic of a fragment of the discourse. Although the way in which these macropropositions are stored has not been fully specified, it seemed likely that the storage capacity is limited, and the reader can maintain only a small number of macropropositions at any one time. Because of this capacity limit, there must be points during processing at which there is a replacement of this higher-level information: topics, themes, or macropropositions. It seemed likely therefore that this replacement would occur shortly after a major change in topic. Such a change should occur at a paragraph boundary. While a particular paragraph is being read, a representation of its topic should be carried in short-term storage. This was indicated by the findings of the first experiment.
When a change in topic is encountered at the beginning of the next paragraph, we expected signs of a loss of information concerning the previous paragraph. Our second experiment was therefore designed to investigate whether topic information dropped from short-term storage once the topic sentence of a new
paragraph had been encountered. The materials and procedure were the same as those in the first experiment except that distances were extended so that all the probes occurred at a distance of three or more sentences from the related sentence in the text. However, the probes were placed so that in half the cases they referred to information in the current paragraph and in the other half to information in the prior paragraph. The probes that referred to information from the prior paragraph were placed after the first detail sentence of the next paragraph. This is the sentence that followed the topic sentence of the new paragraph. We did not expect any paragraph effect on the detail probes, since in both conditions the number of intervening sentences was great enough to have the relevant verbatim information removed from short-term storage. We did expect to see an effect of paragraph on the topic probes. Topic probes referring to the current paragraph should be verified more quickly than topic probes referring to a prior paragraph. The only statistically significant effect obtained, however, was a difference in the speed of responding to topic versus detail probes. Questions about the topic of a previous paragraph were answered as quickly as those about the current paragraph. The mean response times are shown in Table X. Thus, we conclude that topic information is carried forward from the preceding paragraph even after the topic of the second paragraph has been introduced.

The way in which this is done required further analysis. One possibility is that a linkage of topics occurs that parallels the sentence-to-sentence linkage discussed earlier. There is evidence in the literature of such topic linkage, usually discussed in terms of the need of subjects to integrate the topic of a new paragraph with the topic of a prior paragraph.
Lorch, Lorch, and Matthews (1985) found that reading times for the initial sentence of a new paragraph were influenced by the ease with which the new and preceding topics could be integrated. This may be looked at as a parallel to the effects we found for dependent and independent sentences in continuous reading. Moreover, where there is an easily established relation between topics, an interaction may be set up so that each is a strong cue for the other. This cuing effect, in addition to the cuing effect of the probe question itself, may be sufficient to retrieve the related part of the underlying representation quickly.

TABLE X
MEAN RESPONSE TIMES (MSEC) TO TOPIC AND DETAIL PROBES FOR NOLAN AND GLANZER’S (1985) EXPERIMENT 2

                   Paragraph
Probe type     Current    Previous
Topic            4940       4985
Detail           5328       5513

Examination of the times it takes subjects to resume reading after answering the probe shows effects of both probe type and paragraph. The mean response times are shown in Table XI. Both effects are statistically significant, but their interaction is not. These effects will now be discussed.

TABLE XI
MEAN RESPONSE TIME (MSEC) TO READ SENTENCES FOLLOWING VARIOUS PROBES FOR NOLAN AND GLANZER’S (1985) EXPERIMENT 2

                   Paragraph
Probe type     Current    Previous
Topic            5501       7097
Detail           5715       7503
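The effects described for the probe experiments can be recomputed from the cell means in Tables IX, X, and XI. The sketch below is a plain arithmetic check. The dictionary keys and the helper `effect` are illustrative names introduced here, and the assignment of the Table X means to cells follows the statement in the text that topic probes were answered about equally fast for current and previous paragraphs.

```python
# Cell means (msec) keyed by (probe type, condition).
# Key names and the helper function are illustrative assumptions.
TABLE_IX = {("topic", 0): 4468, ("topic", 3): 4520,
            ("detail", 0): 4669, ("detail", 3): 5333}
TABLE_X = {("topic", "current"): 4940, ("topic", "previous"): 4985,
           ("detail", "current"): 5328, ("detail", "previous"): 5513}
TABLE_XI = {("topic", "current"): 5501, ("topic", "previous"): 7097,
            ("detail", "current"): 5715, ("detail", "previous"): 7503}


def effect(table, probe, high, low):
    """Difference (msec) between two conditions for one probe type."""
    return table[(probe, high)] - table[(probe, low)]


# Experiment 1: distance effect on probe response time.
print(effect(TABLE_IX, "topic", 3, 0))    # 52
print(effect(TABLE_IX, "detail", 3, 0))   # 664

# Experiment 2: paragraph effect on probe response time.
print(effect(TABLE_X, "topic", "previous", "current"))   # 45
print(effect(TABLE_X, "detail", "previous", "current"))  # 185

# Experiment 2: time to resume reading after a current paragraph probe,
# detail probe minus topic probe.
print(TABLE_XI[("detail", "current")] - TABLE_XI[("topic", "current")])  # 214
```

The small topic differences (52 and 45 msec) against the larger detail differences are the pattern the text takes as evidence that topic information is carried longer than detail information.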
IV. Theoretical Analysis of Thematic Information Carryover

There are a number of proposals in the literature concerning the role of thematic information in the final memory representation of the text. This information is assigned a prominent role in the organization of the representation and in recall. This prominent role suggests that thematic information is processed differently from detail information and in such a manner that it is relatively more accessible than detail information. Also, thematic information must be processed in a manner that allows it to serve an organizing function in the representation. Before discussing a mechanism for the processing of thematic information, it is useful to review proposals concerning the way in which thematic information is represented in memory.

Many models of text comprehension assert that the long-term memory representation of the text takes the form of a hierarchical tree. One class of these models comprises the story grammar models of Mandler and Johnson (1977), Rumelhart (1975), and Thorndyke (1977). Although differing in details, these theories cluster the conceptually related statements in a text according to their membership in a story constituent. A representation of a story using a structural grammar consists of a hierarchical tree of nodes and relations. Important and general story propositions occur high in the organizational hierarchy, while less important details occur low in the hierarchy. In a similar vein, Meyer’s model
(1975, 1984) of text comprehension describes a hierarchical content structure for expository texts. In this content structure some ideas are superordinate to others and dominate their subordinate ideas. The top levels of the hierarchy represent the overall organization of the text. The next level represents the paragraph level of the text, reflecting the logical relations between the paragraphs. The lower levels represent the details of the text, describing or giving more information about ideas represented at higher levels in the text. This structure makes the paragraph the basic organizational unit in an expository text. The assignment of a prominent structural role for thematic information in text representation derives, in part, from its prominent role in recall. High-level information is better recalled than low-level information both in narrative texts (Rumelhart, 1977; Thorndyke, 1977) and in expository texts (Kintsch, Kozminsky, Streby, McKoon, & Keenan, 1975; Meyer, 1975; Meyer, Haring, Brandt, & Walker, 1980). There is also evidence to suggest that the details in a hierarchical representation are retrieved by first retrieving higher-level ideas. Yekovich and Thorndyke (1981) found that the conditional probability of recalling a text proposition given recall of its superordinate proposition was much higher than when its superordinate was not recalled. This result as well as the finding of a levels effect is consistent with a retrieval process that operates in a top-down fashion. There are a number of ways in which thematic information could get its privileged role during the processing of information. One way is by a deliberate strategy on the part of subjects. We will suggest a simpler, automatic mechanism drawn from memory theories. The mechanism is modeled on the role of superordinates in recall of categorized lists and makes use of the concept of cuing mentioned earlier. 
We view the relations of thematic or topic sentences to other sentences in the paragraph as paralleling the relation of a superordinate to subordinates in a categorized list. A critical factor is the close meaning relation in both cases. In the case of sentences, such a meaning relation, when present, is easy to see in the words of the sentences. For example, the topic sentences in Table VIII contain words that reappear or are referred to or are associates of words in each of the sentences that follow them. The thematic statement is ordinarily introduced early in the paragraph. Kieras (1980) has suggested that this is done to signal to the reader what the theme is for the paragraph. We will focus not on the signal character of that position, but on the strategic advantage given the topic sentence when it is in that initial position. That advantage is one that is of benefit to the reader. When the thematic sentence is placed early in a paragraph, its meaning relationship with each of the succeeding sentences results in its repeated elicitation. If it drops out of short-term storage, it is likely to reappear, cued by subsequent detail sentences. The repeated elicitation and interaction with succeeding sentences gives it the greatest
strength of registration. In subsequent recall, it is the most likely item to be recalled and the most likely to act as a retrieval cue for the other sentences.
Since the strength of registration of the theme is related to the number of times the theme has been elicited during processing, a theme that subsumes many details in subsequent statements should be more strongly registered than a theme that subsumes few details. Black and Bower (1979) have reported data that indicate this is true. In their study, subjects read stories that varied in the number of subordinate detail statements they contained. Black and Bower found that thematic, superordinate statements were better recalled for the episodes that contained more subordinate detail statements.
Another higher-order kind of linkage is also possible for topic statements. The topic for the prior paragraph is likely to be in short-term storage at the time a new paragraph begins, and that topic may be semantically related to the topic of the new paragraph. This relation and the coincidence of the two successive topic statements in short-term storage is the basis for repeated cuing of each topic statement by the other during the reading of the paragraph. In particular, the topic statement for the next paragraph may automatically elicit the prior topic just as a detail sentence elicits its topic sentence. So on the basis of its likelihood to be in short-term storage and its likelihood to relate to the next topic sentence, the theme of the immediately preceding paragraph is likely to be carried over into the next paragraph.
These statements account for the pattern of results found in the probe response times for Experiments 1 and 2 above. They can also be extended to explain the recovery times for the reading of sentences after the probes. That case is somewhat more complicated, since we have different units in short-term storage than those present during continuous reading. We have the information retrieved to answer the question. If the subjects have just answered a current paragraph topic probe, they have appropriate information in short-term storage to help them interpret the next sentence in the text: both the topic and the last detail sentence read. If they have answered a current paragraph detail probe, they have available some information they normally would have (current topic) and some information that is somewhat relevant (detail information from several sentences back in the paragraph). Thus, in this case it would take more time to read the next sentence in the text because the subject must recover the last sentence read or try to interpret the sentence based on this information from earlier in the text. The reading times supported this explanation. Subjects were 214 msec faster in reading the sentence after a current paragraph topic probe question than after a current paragraph detail probe question.
The situation with a previous topic probe is more complicated. With a previous paragraph topic probe the subjects have thematic information from a prior paragraph. This information is likely to elicit the topic information for the current paragraph, but may also elicit irrelevant detail information from the preceding paragraph. In the previous detail condition, the subjects are carrying irrelevant detail information from the preceding paragraph. This information will not help in the interpretation of the next text sentence. Thus, the reading time should be longer after a previous paragraph detail probe question than after a previous paragraph topic probe. The reading times are indeed 406 msec slower after a previous paragraph detail probe than after a previous topic probe. This explanation is complicated and can only be considered preliminary at this time. The data indicate, however, that a full analysis of the ongoing processing of thematic information is both possible and informative.
Although the discussion above and the experimental work described are based on an opposition between two levels, topic and detail, that opposition is an oversimplification. There can be any number of levels between the most general statement and the most specific statement in a text. The way that they would be processed and affect the processing of other statements will depend on their placement in the text. There is no strong contrast between macrostatements and microstatements. There is no need to postulate separate storage units for different classes of statements.
V. General Theoretical Statement
A. BASIC ASSERTIONS
The basic elements of a theory of the process of comprehension are the following assertions.
1. Text comprehension is based on a repeated cycle of events involving short-term and long-term storage. Each new sentence is entered into short-term storage and a representation of that sentence entered in long-term storage. The form of the information entered in long-term storage varies, depending on what other statements are in short-term storage during its entry there. The information entered in long-term storage is appended to the information about the text already there. The accumulated information forms the subject’s cognitive representation of the text.
2. The ordinary contents of short-term storage are, at a minimum, the last two sentences or clauses processed. These are held initially in verbatim form. These sentences or clauses are replaced by other statements as the processing continues.
3. In addition, short-term storage may hold information characterized by meaning overlap with several statements from an earlier section of text. These sentences are called topic or thematic statements. However, the meaning overlap may be at a simple word level involving the relation between subordinate terms in one sentence and superordinate terms in the other. These statements may be carried forward in their original verbatim form or may be representations reentered from long-term storage. The mechanism for their prolonged maintenance in short-term storage is indicated below.
4. A representation of the processed form of a statement may reenter short-term storage after its earlier representation has been replaced. This reentry will occur by cuing from the contents of short-term storage. Cuing is based on overlap of the content of statements. This includes meaning overlap.
5. Statements in long-term storage vary in their strength. This means that they vary in their accessibility when cued with either other text statements or other outside cues (e.g., questions, accidental or deliberate prompts).
6. Statements in short-term storage, during their stay, strengthen their underlying representation in long-term storage. They repeatedly cue the retrieval of those underlying representations. Each retrieval strengthens the relation between cue and representation.
7. Statements in short-term storage at the same time will cause their underlying representations to interact. The interaction will be seen as a linkage or change of interpretation of that underlying information. The interaction will also result in those statements becoming cues for each other.
8. The interaction of statements will be guided by cohesion devices present in the text. In some cases guidance by cohesion devices is efficient only when the verbatim forms of the statements are both in short-term storage (e.g., in cases of pronominal reference). Those are cases in which the cohesion devices are present in the surface form of a statement, but may not be in its processed form. Therefore the carrying of the verbatim form of statements noted in assertion 2 is of critical importance for the utility of cohesion devices. There are other types of cohesion devices (e.g., connectives relating thematic statements) that are effective if some representation of the earlier related statement, not necessarily verbatim, is present in short-term storage.
9. Cuing of statements in long-term storage and their resulting reentry in short-term storage is probabilistic. Cuing effects of different statements in short-term storage, if they are likely to retrieve the same long-term storage statement, combine to increase the probability of eliciting that information.
Most of the mechanisms and concepts outlined above have appeared in the theoretical literature on simple memory tasks. A detailed example of the use of such mechanisms may be found in the Raaijmakers and Shiffrin (1980) theory. The overall processing in reading described below is similar to that described by other theorists concerned primarily with reading (Carpenter & Just, 1977; Frase, 1975). The present description, however, differs markedly from theirs in its use of memory mechanisms and concepts. An approach similar in spirit to the one outlined here appears in an excellent paper by Alba and Hasher (1983).
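Assertions 4, 6, and 9 can be made concrete with a small sketch. It is illustrative only and is not the authors' model. The word-overlap measure standing in for meaning overlap, the function `retrieval_probability`, the linear strength increment, and the independence assumption used to combine cues are all assumptions introduced here.

```python
import re

# Function words ignored when comparing sentences (an arbitrary, hypothetical list).
STOP = {"the", "a", "an", "of", "in", "to", "and", "is", "was", "were", "it"}

def content_words(sentence):
    """Lowercased content-word set: a crude stand-in for 'meaning overlap'."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP}

def overlap(cue, target):
    """Jaccard overlap between the content words of two sentences."""
    a, b = content_words(cue), content_words(target)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def retrieval_probability(cue, target, strength):
    """Hypothetical mapping: overlap scaled by trace strength, capped at 1."""
    return min(1.0, overlap(cue, target) * strength)

def combined_probability(cues, target, strength):
    """Assertion 9: cues aimed at the same trace combine.
    Assuming independent cues, P = 1 - product of (1 - p_i)."""
    miss = 1.0
    for cue in cues:
        miss *= 1.0 - retrieval_probability(cue, target, strength)
    return 1.0 - miss

def strengthen(strength, n_retrievals, increment=0.1):
    """Assertion 6: each retrieval adds to strength (linear growth is an assumption)."""
    return strength + increment * n_retrievals

# Two sentences held in short-term storage cue an earlier topic statement.
topic = "Ants invade the kitchen in summer"
short_term = ["The ants ate the jelly", "The kitchen was warm"]
for cue in short_term:
    print(cue, "->", retrieval_probability(cue, topic, strength=1.0))
print("combined ->", combined_probability(short_term, topic, strength=1.0))
```

Under these assumptions the combined probability is never lower than that of the best single cue, which is the qualitative point of assertion 9. Any other monotone combination rule would serve the illustration equally well.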
B. DESCRIPTION OF COMPREHENSION IN TERMS OF THE THEORY
The subjects’ comprehension of a text is their development of an underlying representation of that text that is changed with each new sentence read. These changes in the account presented here are mediated by short-term storage. They occur through the maintenance of information in short-term storage and, more importantly, by the repeated retrieval of information from long-term storage and its reentry in short-term storage.
The subject in reading or hearing a text holds the last one or two sentences verbatim in short-term storage. Those sentences have several functions. Since they have been processed, that is, related to the underlying representation, the information in them has been made part of the overall representation. On the basis of the relation set up by the processing, they are strong cues for eliciting the corresponding part of the underlying representation. That elicitation or addressing function is one of their functions. Since it is likely that the part of the representation to which a recent sentence is related is also relevant to a new succeeding sentence, the presence of the preceding sentence aids the ongoing process. It is available to retrieve its underlying representation and bring it into contact with the information in the succeeding sentence.
A second function of the verbatim storage is to aid further in the linkage process. Most cohesion devices are set up so that the surface form of successive sentences indicates the identities and relations between elements of those sentences. For example, in a sequence such as the following, there is no doubt what “he” refers to: “John went to the movies with Fred and Bill. Harry stayed at home with his cousins, Earl and Tim. He felt he ought to act as host to the visitors.” The assignment is simple because of the rules that govern cohesion devices and their interpretation. These rules concern the surface form of sentences.
They are structured to fit the limits of short-term storage. For example, they govern the relations between immediately succeeding sentences. The use of these cohesion devices in an underlying representation lacking surface forms would be considerably more complicated. To see this, it is only necessary to carry out a propositional analysis of the first two sentences in the example above and diagram the network that results, as would be done by Kintsch and van Dijk (1978). To relate the third sentence properly to that network in the absence of the surface information in short-term storage is difficult. A further example of the role of verbatim representation is the following. The surface form of a sentence carries information that distinguishes sentence topic and comment. At a propositional level, the following sentences are identical: “The girl is petting the dog,” and “The dog is being petted by the girl.” But they are not equivalent in their communicative impact. Subjects perceive the active sentence as being about the girl and the passive sentence as being about the
dog (Hornby, 1972). There is evidence that people do indeed make use of such retained surface information in processing the next sentence. Sanford and Garrod (1981) report a study by Purkiss (1978), who found that a sentence containing a reference to the topic portion of a previously read sentence was read more quickly than one containing a reference to the comment portion. A third function of the verbatim representation is to permit correction of misinterpretation as indicated by subsequent text. This may occur when a sentence contains an ambiguous word.
The other part of the representation that is retrieved during processing is thematic information. Its retrieval occurs repeatedly because of its meaning relation with subsequent sentences. This repeated elicitation may serve several functions in both ongoing processing and later recall. One function is that of auxiliary cue, adding to the cuing effect of the recent sentences. Another function is to serve as an alternative cue if those recent sentences do not elicit appropriate segments of the underlying representation.
The two functions above arise during ongoing processing. In later recall these topic or theme units again have several functions. Since they have been elicited several times and therefore held in short-term storage for longer times, they are likely to be retrieved during recall. Since they have been elicited in the presence of several other sentences, they are likely to elicit those sentences during recall attempts. They are likely to be remembered and are likely to cause other units to be remembered. They thus have the capability of playing a central organizing role during recall. Finally, since the topic sentences are elicited repeatedly, they are likely to be elicited in conjunction with each other. They therefore furnish the basis for a higher-order linkage of large units of text.
C. FURTHER IMPLICATIONS FOR COMPREHENSION OF TEXT
The theory outlined above can be used to analyze and predict many effects in text comprehension. We will consider some of these briefly here. A number have been noted earlier.
1. The Effect of Placement of Topic Sentences in a Paragraph
The picture of repeated retrieval developed here implies that for optimal comprehension and recall a topic sentence should be placed at the very beginning of the paragraph. Moreover, the succeeding detail sentences should show a maximum of meaning overlap with it, at least initially. If these arrangements are not made, the repeated retrievals that strengthen its representation will not occur. On that basis, it is less likely to appear on a final test and therefore less likely to serve as a retrieval cue for other sentences. Moreover, without the repeated
retrieval, it will not have the opportunity to occur with and interact with many of the subsequent sentences. It will therefore be further impaired as a retrieval cue for other material. The experimental implications of this analysis are clear for the relation between topic sentence placement and text recall and comprehension (see assertions 3, 4, 6, and 7 in Section V,A). Data relevant to these points are found in Perfetti and Goldman (1974, 1975).
2. The Effect of Lack of Explicit Topic Sentences
It follows from the discussion of (1) above that if the topic sentence is not presented at all, that is, left implicit, recall will be impaired. This statement is supported by the findings of Lorch and Lorch (1985).
3. Superior Recall of Topic or Thematic Information
This effect follows from the discussion above. It has been generally supported in experiments (Johnson, 1970; Kintsch & Keenan, 1973; Kintsch et al., 1975; Walker & Meyer, 1980).
4. Biasing Effect of Early Topical Information on Interpretation
A topic statement that appears early in a text will affect the interpretation of the text. This effect will be most marked when the text contains two different themes. A demonstration of such an effect is found in a study by Kozminsky (1977) in which the topic was biased by the title presented.
5. Biasing Effect of Position on Topic Identification
Closely related to the implications above is the implication that a topic statement is more likely to be selected as such if it is placed early in the passage (Kieras, 1978).
6. Spacing of Related Information
Related sentences are most likely to be integrated if they are placed close together in the text (Walker & Meyer, 1980). (See assertions 2 and 7 in Section V,A.)
7. Phrasing of Related Information
Related sentences are most likely to be integrated if they are worded in a similar fashion (Hayes-Roth & Thorndyke, 1979). (See assertions 3 and 7 in Section V,A.)
The theory, with its assumptions concerning repeated retrievals that permit
interaction between related sentences, can be extended to sentences in sequences that do not form “natural” text. We will present a detailed analysis of an important case involving such sentences. There are three reasons for carrying out this exercise. One reason is that it demonstrates the application of the theory to a simple case. A second reason is that it shows clearly that a simple memory-based theory can work while avoiding the postulation of complex cognitive mechanisms. A third reason is that it can be extended to cover generalizations that a more complex cognitive theory cannot cover.
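The placement implication of Section V,C,1 can be illustrated with a toy count of elicitations. This is a sketch under stated assumptions, not the authors' procedure: a sentence can be elicited only by sentences that come after it, sharing at least one content word stands in for meaning overlap, and the paragraph and the function `elicitation_counts` are invented for the example.

```python
import re

# Function words ignored when comparing sentences (hypothetical list).
STOP = {"the", "a", "an", "in", "and", "of", "to"}

def words(sentence):
    """Lowercased content words of a sentence."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP}

def elicitation_counts(paragraph):
    """For each sentence, count the later sentences that share a content word with it.
    Each such later sentence is treated as one opportunity to re-elicit the earlier one."""
    counts = []
    for i, sentence in enumerate(paragraph):
        later = paragraph[i + 1:]
        counts.append(sum(1 for other in later if words(sentence) & words(other)))
    return counts

topic = "In an ant colony, workers, soldiers, and the queen divide the labor"
details = [
    "Workers carry food",
    "The queen lays eggs",
    "Soldiers guard the entrance",
]

print(elicitation_counts([topic] + details))  # topic sentence first
print(elicitation_counts(details + [topic]))  # topic sentence last
```

With the topic sentence first, every detail sentence re-elicits it. With the topic sentence last, nothing follows it, so it receives no elicitations during reading, and each detail is elicited only once, by the late topic. On the theory's account, the first arrangement gives the topic sentence the stronger registration.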
VI. Extension of Theory to Unnatural Text: Abstraction Paradigm
Bransford and Franks (1971) have reported a striking set of results on what they call abstraction. They had subjects listen to a series of sentences with overlapping arguments such as “The ants ate the sweet jelly” and “The ants were in the kitchen.” The sentences were separated by a brief distractor task and a question. The subjects when later tested were more likely to say that they had heard a complex sentence such as “The ants in the kitchen ate the sweet jelly” which they had not heard than simpler sentences which they had heard. There were two basic effects: One was the old/new nondiscrimination effect; subjects had difficulty in distinguishing old from new sentences. The other was a size effect; subjects were surer that they had heard a sentence out of the related family of sentences, the more complex it was. The general findings have been replicated many times (Cofer, 1973; Flagg, Potts, & Reynolds, 1975; Katz, 1973; Peterson & McIntyre, 1973; Singer, 1973). The nondiscrimination effect, however, can be weakened if visual rather than auditory presentation is used (Flagg & Reynolds, 1977; Katz & Gruenewald, 1974). There are a variety of theories that can be used to explain the general results, including theories involving factors such as abstraction or construction by the subject. We will argue, however, that the theory outlined in the preceding section can explain the results using the standard memory concepts discussed there.
A. EXPLANATION OF FINDINGS BY THE PRESENTED THEORY
The explanation presented here draws primarily on assertions 2, 6, and 7 in the theoretical statements in Section V,A. As subjects hear successive sentences, they retrieve earlier related sentences. The overlap in meaning (actually overlap in wording) of a newly presented sentence with an earlier presented sentence is the cue that effects the retrieval.
The result of this process is that two overlapping sentences will reside for some time in short-term storage, permitting the interaction noted earlier and the formation of traces of complex sentences. The subjects did indeed “hear” the complex sentences even though the experimenters did not present them.
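The account in this section can be sketched as a toy simulation. Everything in it is an assumption made for illustration: sentences are represented as sets of idea labels, the table `IDEA_ARGS` listing the argument words of each idea is hypothetical, retrieval is deterministic whenever argument words are shared (assertion 9 says it is in fact probabilistic), and co-residence in short-term storage is modeled simply as storing the union of the two sentences. The sketch addresses only the formation of unpresented complex traces, not the size effect.

```python
# Hypothetical idea units and the argument words each one mentions.
IDEA_ARGS = {
    "ate": {"ants", "jelly"},
    "sweet": {"jelly"},
    "in_kitchen": {"ants", "kitchen"},
    "on_table": {"jelly", "table"},
}

def arguments(trace):
    """All argument words mentioned by the ideas in a trace."""
    return set().union(*(IDEA_ARGS[idea] for idea in trace))

def study(sentences):
    """Present sentences in order and return the set of traces formed.
    A new sentence retrieves every stored trace that shares an argument word;
    each retrieved trace then combines with the new sentence into a joint trace."""
    traces = set()
    for sentence in sentences:
        sentence = frozenset(sentence)
        retrieved = [t for t in traces if arguments(t) & arguments(sentence)]
        traces.add(sentence)
        for t in retrieved:
            traces.add(t | sentence)
    return traces

# "The ants ate the sweet jelly" followed by "The ants were in the kitchen".
presented = [{"ate", "sweet"}, {"in_kitchen"}]
traces = study(presented)

# "The ants in the kitchen ate the sweet jelly" was never presented.
complex_sentence = frozenset({"in_kitchen", "ate", "sweet"})
print(complex_sentence in traces)
print(complex_sentence in {frozenset(s) for s in presented})
```

In this sketch the complex sentence has a stored trace even though it was never presented, which is the sense in which subjects can be said to have “heard” it.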
B. OTHER DEDUCTIONS FROM THE THEORY
1. Order Effects
Several consequences follow from this interpretation. One that has not been tested is that the size effect referred to above should depend on the order in which the study sentences are presented. If the study sentences are presented with the more complex ones first, then the size effect should be increased. If they are presented with the simple sentences first, then the size effect should be reduced. If the simple sentences are presented first, then their retrieval during the processing of subsequent sentences does not contribute to the formation of new, complex traces.
2. Meaningless Sentences
Other consequences follow for which there are data. One is that fully meaningful sentences are not needed to produce the effects. This statement is contrary to a constructivist theory that emphasizes the role of meaning in the performance. Katz and Gruenewald (1974) used the Bransford and Franks paradigm but made the sentences meaningless. They did this by a Lewis Carroll translation, substituting nonwords for content word stems and prepositions. Other function words and bound morphemes were left intact. They obtained both the old/new nondiscrimination effect and the size effect with these sentences. This finding can be derived from the theoretical statements given earlier, again drawing on assertions 2, 6, and 7 (Section V,A). Memory for so-called nonsense material is an old phenomenon for memory-based theories. A further aspect of the Katz and Gruenewald data can also be covered by the presented theory. Their data show a weaker size effect than that found with meaningful sentences. The weaker size effect is explained in the following way. The initial registration of meaningless sentences would be expected to be weaker. Their availability for retrieval by later, related sentences would therefore be reduced, weakening the size effect.
3. Nonsentence Sequences
The theory can also be used to cover any orderly sequence of items that forms an organized unit for the subject. Data on this extension of the theory were obtained in a study by Reitman and Bower (1973).
Those investigators used the Bransford and Franks paradigm but had, instead of sentences, sequences of letters or numbers. In one condition well-known sequences such as w x y z were presented visually. In another condition, new arbitrary sequences such as 3 z y D were shown. The data for the well-known sequences show a size effect but subjects could discriminate new from old test items. However, this discrimination is not unusual for the paradigm when visual presentation is used (as noted earlier). The size data for the arbitrary sequences were not orderly. That result is not unexpected, since these sequences are not well defined for the subjects and subsets of them are not likely to be effective retrieval cues.
VII. Summary
We started the work described here with a number of restrictions: focus on the ongoing process of comprehension, use of normal text, focus on normal reading, maintenance of relation to simple memory tasks and the theories that go with them, and postulation of entities only as required. The last restriction gave us, initially, an oversimplified picture of processes in comprehension. The data, however, led us to consider a more complex picture. That picture can still be described in terms of mechanisms that have been well established in simpler memory paradigms.
There are several advantages in starting with the restrictions we adopted. One is that it keeps the work on comprehension in communication with the work on memory. Such communication is likely to be fruitful. Second is that it holds back the postulation of complex mechanisms when simple mechanisms may suffice. For example, we do not assume “inference” is at work in the resolution of anaphora, particularly if inference is seen as an effortful or voluntary process. If we found that the cuing process outlined above would not work, we would then try to define an additional mechanism to complement it. It is certainly true that the process of comprehension is vastly complex. It may not be true, however, that the appropriate way to handle that complexity is by starting with a complex theory. In this assertion our approach contrasts with constructivist, voluntaristic approaches to comprehension.
A third advantage of a simple theory is that it leads more easily to prediction of new results or incorporation of known results and to the development of applications of both sets of results. For example, the preliminary assumptions outlined above lead readily to testable predictions or explanations concerning placement of a topic sentence. They lead to predictions concerning cases that are less obvious, for example, concerning the effect of an omitted topic sentence as a recall cue after the text has been read.
Applications to the writing of text readily follow from such considerations. They also lead to the incorporation of cases that
are considered different from ordinary reading, such as the paradigm for abstraction of complex ideas.
ACKNOWLEDGMENTS
We wish to thank John K. Adams, David Dorfman, and A h a Hsu for their helpful comments. This article was written with support from National Science Foundation Grant Number BNS 84-15904.
REFERENCES
Alba, J. W., & Hasher, L. (1983). Is memory schematic? Psychological Bulletin, 93, 203-231.
Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249-277.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8). New York: Academic Press.
Black, J. B., & Bower, G. H. (1979). Episodes as chunks in narrative memory. Journal of Verbal Learning and Verbal Behavior, 18, 309-318.
Bransford, J. D., & Franks, J. J. (1971). The abstraction of linguistic ideas. Cognitive Psychology, 2, 331-350.
Carpenter, P. A., & Just, M. A. (1977). Integrative processes in comprehension. In D. LaBerge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension. Hillsdale, NJ: Erlbaum.
Chafe, W. (1973). Language and memory. Language, 49, 261-281.
Chang, F. R. (1980). Active memory processes in visual sentence comprehension: Clause effects and pronominal reference. Memory and Cognition, 8, 58-64.
Clark, H. H., & Sengul, C. J. (1979). In search of referents for nouns and pronouns. Memory and Cognition, 7, 35-41.
Cofer, C. N. (1973). Constructive processes in memory. American Scientist, 61, 537-543.
Dalezman, J. J. (1976). Effects of output order on immediate, delayed, and final recall performance. Journal of Experimental Psychology: Human Learning and Memory, 2, 597-608.
Fischer, B., & Glanzer, M. (1986). Short-term storage and the processing of cohesion during reading. Quarterly Journal of Experimental Psychology, 38A, in press.
Flagg, P. W., Potts, G. R., & Reynolds, A. G. (1975). Instructions and response strategies in recognition memory for sentences. Journal of Experimental Psychology: Human Learning and Memory, 1, 592-598.
Flagg, P. W., & Reynolds, A. G. (1977). Modality of presentation and blocking in sentence recognition memory. Memory and Cognition, 5, 111-115.
Frase, L. T. (1975). Prose processing. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 9). New York: Academic Press.
Gilliand, T. (1975). Readability. London: Houghton and Stoughton.
Glanzer, M. (1972). Storage mechanisms in recall. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 5). New York: Academic Press.
Glanzer, M., Dorfman, D., & Kaplan, B. (1981). Short-term storage in the processing of text. Journal of Verbal Learning and Verbal Behavior, 20, 656-670.
Glanzer, M., Fischer, B., & Dorfman, D. (1984). Short-term storage in reading. Journal of Verbal Learning and Verbal Behavior, 23, 467-486.
Glanzer, M., & Razel, M. (1974). The size of the unit in short-term storage. Journal of Verbal Learning and Verbal Behavior, 13, 114-131.
Grimes, J. E. (1975). The thread of discourse. The Hague: Mouton.
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longman.
Hasher, L., & Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108, 356-388.
Hayes-Roth, B., & Thorndyke, P. W. (1979). Integration of knowledge from text. Journal of Verbal Learning and Verbal Behavior, 18, 91-108.
Hornby, P. A. (1972). The psychological subject and predicate. Cognitive Psychology, 3, 632-642.
Huey, E. B. (1908). The psychology and pedagogy of reading. New York: Macmillan.
Jarvella, R. J. (1971). Syntactic processing of connected speech. Journal of Verbal Learning and Verbal Behavior, 10, 409-416.
Jarvella, R. J. (1979). Immediate memory and discourse processing. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 13). New York: Academic Press.
Jarvella, R. J., & Herman, S. J. (1972). Clause structuring of sentences and speech processing. Perception and Psychophysics, 11, 381-384.
Johnson, R. E. (1970). Recall of prose as a function of the structural importance of the linguistic units. Journal of Verbal Learning and Verbal Behavior, 9, 12-20.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87, 329-354.
Katz, S. (1973). Role of instructions in abstraction of linguistic ideas. Journal of Experimental Psychology, 98, 79-84.
Katz, S., & Gruenewald, P. (1974). The abstraction of linguistic ideas in “meaningless” sentences. Memory and Cognition, 2, 737-741.
Kieras, D. E. (1978). Good and bad structure in simple paragraphs: Effects on apparent theme, reading time, and recall. Journal of Verbal Learning and Verbal Behavior, 17, 13-28.
Kieras, D. E. (1980). Initial mention as a signal to thematic content in technical passages. Memory and Cognition, 8, 345-353.
Kieras, D. E. (1981). Component processes in the comprehension of simple prose. Journal of Verbal Learning and Verbal Behavior, 20, 1-23.
Kintsch, W., & Keenan, J. (1973). Reading rate and retention as a function of the number of propositions in the base structure of sentences. Cognitive Psychology, 5, 257-274.
Kintsch, W., Kozminsky, E., Streby, W. J., McKoon, G., & Keenan, J. M. (1975). Comprehension and recall of text as a function of content variables. Journal of Verbal Learning and Verbal Behavior, 14, 196-214.
Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363-394.
Kozminsky, E. (1977). Altering comprehension: The effect of biasing titles on text comprehension. Memory and Cognition, 5, 482-490.
Lesgold, A. M., Roth, S. F., & Curtis, M. E. (1979). Foregrounding effects in discourse comprehension. Journal of Verbal Learning and Verbal Behavior, 18, 291-308.
Lorch, R. F. Jr., & Lorch, E. P. (1985). Topic structure representation and text recall. Journal of Educational Psychology, 77, 137-148.
Lorch, R. F. Jr., Lorch, E. P., & Matthews, P. D. (1985). On-line processing of the topic structure of a text. Journal of Memory and Language, 24, 350-362.
Mandler, J., & Johnson, N. (1977). Remembrance of things parsed: Story structure and recall. Cognitive Psychology, 9, 111-151.
Meyer, B. (1975). The organization of prose and its effect on memory. Amsterdam: North-Holland.
Meyer, B. (1984). Text dimensions and cognitive processing. In H. Mandl, N. Stein, & T. Trabasso (Eds.), Learning and comprehension of text. Hillsdale, NJ: Erlbaum.
Meyer, B., Haring, M., Brandt, D., & Walker, C. (1980). Comprehension of stories and expository text. Poetics, 9, 203-211.
Miller, J. R., & Kintsch, W. (1980). Readability and recall of short prose passages: A theoretical analysis. Journal of Experimental Psychology: Human Learning and Memory, 6, 335-354.
Nolan, S., & Glanzer, M. (1985). Storage of topic and detail information in reading. Unpublished paper.
Perfetti, C. A., & Goldman, S. R. (1974). Thematization and sentence retrieval. Journal of Verbal Learning and Verbal Behavior, 13, 70-79.
Perfetti, C. A., & Goldman, S. R. (1975). Discourse functions of thematization and topicalization. Journal of Psycholinguistic Research, 4, 257-271.
Peterson, R. G., & McIntyre, C. W. (1973). The influence of semantic “relatedness” on linguistic information and retention. American Journal of Psychology, 86, 697-706.
Purkiss, E. (1978). The effect of foregrounding on pronominal reference. Unpublished undergraduate thesis. Glasgow.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 14). New York: Academic Press.
Reitman, J. S., & Bower, G. H. (1973). Storage and later recognition of exemplars of concepts. Cognitive Psychology, 4, 194-206.
Rumelhart, D. E. (1975). Notes on a schema for stories. In D. G. Bobrow & A. Collins (Eds.), Representing and understanding: Studies in cognitive science. New York: Academic Press.
Rumelhart, D. (1977). Understanding and summarizing brief stories. In D. LaBerge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension. Hillsdale, NJ: Erlbaum.
Sachs, J. S. (1967). Recognition memory for syntactic and semantic aspects of connected discourse. Perception and Psychophysics, 2, 437-442.
Sanford, A. J., & Garrod, S. C. (1981). Understanding written language. New York: Wiley.
Singer, M. (1973). A replication of Bransford and Franks’ “The abstraction of linguistic ideas.” Bulletin of the Psychonomic Society, 1, 416-418.
Thorndyke, P. W. (1977). Cognitive structures in comprehension and memory of narrative discourse. Cognitive Psychology, 9, 77-110.
Tulving, E., & Arbuckle, T. Y. (1963). Sources of intratrial interference in immediate recall of paired associates. Journal of Verbal Learning and Verbal Behavior, 1, 321-334.
Tulving, E., & Arbuckle, T. Y. (1966). Input and output interference in short-term associative memory. Journal of Experimental Psychology, 72, 145-150.
van Dijk, T. A. (1979). Relevance assignment in discourse comprehension. Discourse Processes, 2, 113-126.
Walker, C. H., & Meyer, B. J. F. (1980). Integrating different types of information in text. Journal of Verbal Learning and Verbal Behavior, 19, 263-275.
Yekovich, F. R., & Thorndyke, P. W. (1981). An evaluation of alternative functional models of narrative schemata. Journal of Verbal Learning and Verbal Behavior, 20, 454-469.
Subject Index
A
Abstract noun pairs, mental cues and, 246 Abstraction, text comprehension and, 312-315 Accentuation, visual pattern recognition and, 24 Accuracy, subjective time and, 106, 107 Act schemas, knowledge-directed machine learning and, 198, 199, 205, 206, 215 Activation, text comprehension and, 278 Algorithms contingency, classical conditioning and, 161-172, 186-189 blocking, 176 circuits, 173, 175 historical perspective, 152, 153 learned irrelevance, 179 theory, 137, 138, 184, 185 three-level analysis, 143-145, 148-150 time, 183 tracking, 177 knowledge-directed machine learning and, 197 Ames demonstrations, visual pattern recognition and, 10, 11 Arithmetic mean, subjective time and, 111, 112 double standard mixture, 133 harmonic mean, 115, 119, 122, 125 standards, 122, 124 Artificial intelligence, knowledge-directed machine learning and, 193
Aspect ratio, visual pattern recognition and, 20, 48, 49 Associability, mental cues and, 238, 254, 255, 261, 263 Associative learning, contingency and circuits, 172, 174 historical perspective, 153 implementation, 149 presentation conditions, 139, 142 theory, 138, 139, 185 three-level analysis, 144 time, 183 Associative structures, instrumental learning and, 55-57 contingency effects, 60-64 postconditioning changes, 57-60 reinforcer encoding, 64 evidence on, 64-67 generality of, 67-78 reinforcer learning, 78-82 stimulus, role of, 82, 90, 91 discriminative performance, 90 occasion setting, 93-98 reinforcer, 91-93 residual response, 82-90 Asymmetry, see Symmetry Attention contingency and, 176 knowledge-directed machine learning and, 208 visual pattern recognition and, 15, 19
Attention focusing, knowledge-directed machine learning and, 197, 234 common-sense physics, 198 experiments, 224, 225 top-down knowledge, 208 Automatic procedures, mental cues and, 266 Automatic process, text comprehension and, 275, 276 Automaticity, reinforcer encoding and, 68, 74 Aversion training, associative structures and, 71, 74, 78, 87
B
Background conditioning, associative structures and, 62, 63 occasion setting, 97 reinforcer encoding, 66 Backtracking, knowledge-directed machine learning and, 224 Bandwidth, mental cues and, 255 Baseball, knowledge-directed machine learning and, see Knowledge-directed machine learning Basic level categorization, visual pattern recognition and, 25, 46, 49, 50 Bias subjective time and, 115, 132 text comprehension and, 311 Bidirectionality, mental cues and, 264 Bisection, subjective time and, 108 Block, visual pattern recognition and, 1, 6 Blocking associative structures and, 93 contingency and, 152, 164, 172, 173, 176-178 Bottom-up approach contingency and, 149 visual pattern recognition and, 7
C
Categories, see also Basic level categorization; Subordinate categorization contingency and, 169 mental cues and discriminability, 255 memory schema, 248
recall tasks, 264 rehearsal, 243 verbal reports, 252 visual pattern recognition and, 3, 4, 25, 26, 28 componential recovery, 50, 51 object recognition, 5 Category labels, mental cues and computer metaphor, 238, 242 rehearsal, 243 Causal enablement chain, knowledge-directed machine learning and, 198, 199 Causal link schemas, knowledge-directed machine learning and, 203, 204, 210 evaluation, 221 experiments, 225-227, 230 generalization, 215 Centering, visual pattern recognition and, 27 Chunks, mental cues and, 238, 239 natural language mediation, 244 rehearsal, 241 symbol association, 240 verbal reports, 265 Classical conditioning, contingency in, see Contingency, classical conditioning and Classification knowledge-directed machine learning and, 220 visual pattern recognition and, 1, 7 Coherence, text comprehension and, 302 Cohesion, text comprehension and, 308, 309 Collinearity, visual pattern recognition and, 14, 22, 35, 36, 39, 44 Color, visual pattern recognition and, 4, 6, 18, 28, 29, 33-35, 49 Competition contingency and, 164, 169, 176 knowledge-directed machine learning and, 194, 234 attention focusing, 197 evaluation, 221, 222 experiments, 225-227, 229, 230 games, 200-208, 210 generalization, 215, 216, 218-220 Complexity, visual pattern recognition and, 31-33 Componential recovery, visual pattern recognition and, 38, 43, 46-50 Components, recognition by, see Visual pattern recognition, components and
Composite condition, contingency and, 151, 152, 164, 175
Composite misinformation conditions, contingency and, 140, 149, 154, 156, 159 Computation, contingency, classical conditioning and, 143-150 algorithms, 148, 161-172, 186-189 circuits, 172-176 historical perspective, 152, 153 implementation, 148, 149 learned irrelevance, 178 space, 156-161 theoretical formulation, 153-156, 185 tracking, 177 Computer metaphor, mental cues and, 238 chunks, 238, 239 conscious memory, 239, 240, 242, 243 rehearsal, 241, 242 symbol association, 240, 241 Concavity, visual pattern recognition and centering, 27 degradation, 35, 36, 39 generalization, 21, 22 nonaccidentalness, 11 perceptual organization, 22 recognition by components, 5-7 speech perception analogy, 3 Concrete nouns, mental cues and, 246, 249, 250, 257, 258
Conditional stimulus, see Conditioned stimulus Conditioned reinforcers, associative structures and, 72 Conditioned response, associative structures and, 91, 94 Conditioned stimulus (CS) associative structures and contingency effects, 60, 62, 63 occasion setting, 94, 95 reinforcer, 91-93 reinforcer encoding, 66 contingency and, 185, 186 algorithms, 148, 162-165, 168-170, 186-189 circuits, 174, 175 computational analysis, 145, 147, 148, 153-156, 159 historical perspective, 150, 151 implementation, 149 latency, 177 learned irrelevance, 178 presentation conditions, 139-142 theory, 137, 138, 185 three-level analysis, 144 time, 179-183
Conditioning associative structures and occasion setting, 94 postconditioning changes, 59 reinforcer encoding, 65, 69, 77 contingency and, see Contingency, classical conditioning and Cones, visual pattern recognition and, see Generalized cones, visual pattern recognition and Confidence value, knowledge-directed machine learning and, 221, 223, 227, 231 Conjunction-attention effect, visual pattern recognition and, 15, 18, 19 Connectionist models, contingency and, 144 Conscious memory, mental cues and, see Memory, mental cues and Consistency associative structures and, 92 knowledge-directed machine learning and, 223, 231
Constrained data-directed generalization, 218, 224, 230, 232, 233
Constraint contingency and algorithm, 164, 165, 172 circuits, 172, 173 computational analysis, 146, 153, 154, 156 historical perspective, 150 implementation, 149 theory, 137, 138, 185 three-level analysis, 143 time, 179 tracking, 177 knowledge-directed machine learning and, 218, 219, 228
Constructibility, mental cues and, 238, 254, 255, 257-261, 263
Constructivist views, text comprehension and, 277, 313, 314
Context contingency and, 162, 169, 170, 181 knowledge-directed machine learning and, 219
mental cues and, 237, 247 constructibility, 258 discriminability, 256 memory schema, 251 recall, 262, 264 Contiguity, contingency and, 139, 150, 174 Contingency associative structures and, 55 occasion setting, 96 postconditioning changes, 58, 60 reinforcer encoding, 65 residual response, 84, 86-88 response-reinforcer association, 60-64 stimulus-reinforcer, 91 classical conditioning and, 185, 186 blocking, 176, 177 computation, see Computation, contingency, classical conditioning and historical perspective, 150-152 latency, 177 learned irrelevance, 178, 179 presentation conditions, 139-143 theory, 137-139, 183-185 three-level analysis, 143-145, 149, 150 time, 179-183 tracking, 177, 178 Contour, visual pattern recognition and, 19-22, 28-30, 33, 35-38, 40-42, 44, 45 Cooperation, knowledge-directed machine learning and, 194, 206-208 experiments, 225-227 generalization, 215, 219 Cotermination, visual pattern recognition and, 9, 10 Count nouns, visual pattern recognition and, 4 Cues contingency and algorithm, 162, 169, 170, 186, 188 blocking, 176, 177 computational analysis, 147 historical perspective, 150, 152 theory, 137 three-level analysis, 143 time, 181-183 tracking, 177, 178 mental, verbal reports and, see Mental cues subjective time and, 126-130 text comprehension and, 314 abstraction, 312 linkage information, 296
thematic information, 299, 303-305 theory, 308-311 Curvature, visual pattern recognition and, 5 components, 5, 24, 28 degradation, 35-37, 39, 44 generalized cones, 13-15, 17 nonaccidentalness, 10, 12 perceptual organization, 22 Cylinder, visual pattern recognition and, 6, 16, 22, 23, 27
D
Decay, mental cues and, 241 Declarative knowledge, mental cues and, 265, 266 Decode systems, subjective time and, 106-108 Degradation contingency and, 140-142 algorithms, 163, 165, 172 circuits, 175 historical perspective, 153 implementation, 149 visual pattern recognition and, 5, 6, 29 componential recovery, 46 perception, 35-45 Delay subjective time and, 119-121, 130-132 text comprehension and, 277, 280 Dependent sentence, text comprehension and, 291, 295-297 Diffuse stimuli, associative structures and, 94, 95 Discriminability, mental cues and, 238, 254-256 memory schema, 251 recall tasks, 264 Discrimination associative structures and, 90-92, 94-97 knowledge-directed machine learning and experiments, 228 generalization, 212 subjective time and, 106, 108 cued double standard, 128, 131 geometric mean, 108-111 text comprehension and, 314 visual pattern recognition and, 51 componential recovery, 50 generalized cones, 16
limited components, 23, 26 speech perception, 3 Disruptive effects, text comprehension and, 295 Distraction text comprehension and, 276, 283-286 abstraction, 312 linkage information, 292, 294-297 thematic information, 299 visual pattern recognition and, 18, 19 Domain knowledge, machine learning and, 200-210, 235 experiments, 225, 233 generalization, 211, 214 Duration contingency and, 148, 183, 184 subjective time and, 116-118 baseline time left, 112-115 geometric mean, 108-110 harmonic mean, 120-122 standards, 122-127, 129-131
E
Elaboration, mental cues and, 259, 261, 267 Elaborative rehearsal, mental cues and, 241 Encode systems, subjective time and, 106-108 Episodes, knowledge-directed machine learning and, 197 experiments, 224, 227, 228 generalization, 219 Episodic memory, mental cues and, 249, 256-258, 261, 264, 265 Evaluation, knowledge-directed machine learning and, 193, 194, 220-224, 234 attention focusing, 197 experiments, 232 Excitatory conditioning, contingency and, 150, 163, 178, 179 Expectations, contingency and, 148, 168-170, 176 Extinction associative structures and contingency, 61 occasion setting, 95 postconditional change, 59 reinforcer encoding, 69, 71, 75 residual responses, 84 contingency and, 184 Extinguishing, subjective time and, 113
F
Faces, visual pattern recall and, 47, 48 Facilitation associative structures and, 94, 96, 97 text comprehension and, 297, 299 Familiarity, visual pattern recognition and, 1, 2, 32, 36, 37, 39 components, 5 degradation, 37 Feedback associative structures and, 56 subjective time and, 107 visual pattern recognition and, 32 Flavor-aversion training, associative structures and, 59, 77, 90 Free recall mental cues and, 239, 245, 247, 258-262 text comprehension and lists, 277 short-term storage, 279
G
Games, knowledge-directed machine learning and, 200, 210 Gaussian memory distribution, subjective time and, 120, 122 Generalization associative structures and, 84, 85 knowledge-directed machine learning and, 193, 194, 211, 234 data-directed, 211-214 evaluation, 221, 224 experiments, 227-233 text comprehension and, 299, 312 Generalized cones, visual pattern recognition, nonaccidental properties and, 12-14, 23 metric variation, 20, 23 parsing at joins, 21, 22 perceptual biases, 14-19 planar components, 20, 21 selection of axis, 21 Geometric mean, subjective time and, 108-111, 115, 120, 122 double standard mixture, 133 standards, 124, 125 Gestalt principles, visual pattern recognition and, 22, 23
Goal-directed behavior, associative structures and, 56, 68 Goals, knowledge-directed machine learning and, 193, 203, 214, 230 Grouping, text comprehension and, 277
H
Habituation, contingency and, 184, 185 Harmonic mean, subjective time and, 115, 119-122 asymptote, 125-130 double standard mixture, 133 standards, 122-125
I
Identification, visual pattern recognition, 5-8, 23, 28-30, 33 Identification latency, visual pattern recognition and, 7, 12, 24, 28, 37, 41, 44-46 Implementation, contingency and historical perspective, 153 three-level analysis, 143-145, 148, 149 Independent sentence, text comprehension and, 291, 295, 296, 299 Indifference, subjective time and, 124, 125, 128, 132, 133 Inference, text comprehension and, 314 Inhibition associative structures and, 68, 93, 96 contingency and, 152 Inhibitory conditioning, contingency and, 150, 151, 163 Instrumental learning, associative structures in, see Associative structures, instrumental learning and Interpretation, knowledge-directed machine learning and, 193 domain hypothesis, 200 evaluation, 220, 221 Interruption, text comprehension and, 287-289 Intersentence processing, text comprehension and lists, 278 short-term storage, 279, 281 Intervals contingency and, 152
knowledge-directed machine learning and, 194 subjective time and, 105, 106, 108, 130 baseline time left, 115 harmonic mean, 119, 120 standards, 123, 125, 129, 132 Invertibility, mental cues and, 238, 254, 256, 257, 261, 263, 264 Irrelevance, contingency and, 138, 172, 178, 179
K
Knowledge-directed machine learning, baseball and, 194-196 evaluation, 220-224 experiments, 224-233 generalization, 211 data-directed, 211-214 knowledge-directed, 214-220 interpretation, 196, 197 attention focusing, 197 common-sense physics, 198-200 domain hypothesis, 200-210 motivation, 193, 194
L
Language mediation, see Natural language mediation Latency associative structures and, 75, 76 contingency and, 177 algorithms, 172 historical perspective, 152 theory, 138 Learned irrelevance, contingency and, 178, 179 Least common generalization, knowledge-directed machine learning and, 212 Length, visual pattern recognition and, 5 Lexical access, speech perception and, 2, 3 Linear time, subjective time and, 114, 115 Linguistics, text comprehension and, 291 Linkage information, text comprehension and, 291-299, 302, 303, 308, 309 Log timing, subjective time and, 115
Logical necessity, contingency and, 168-171, 187 Logical sufficiency, contingency and, 168-171, 178, 187 Long comparison duration, subjective time and, see Duration Long-term storage, text comprehension and, 280 lists, 277 reading, 288 thematic information, 286, 300, 304 theory, 307-309
M
Machine learning, knowledge-directed, see Knowledge-directed machine learning Maintenance, text comprehension and, 287, 308, 309, 314 Maintenance rehearsal, mental cues and, 241 Manipulation, associative structures and contingency, 61, 63 postconditioning changes, 57-60 reinforcer encoding, 68, 76 reinforcer learning, 79-81 residual responding, 83, 87 Marking, mental cues and, 254 Mass nouns, visual pattern recognition and, 4 Matching, associative structures and, 60 Memory contingency and, 168, 170-172, 185 mental cues and associability, 255 chunks, 238, 239 conscious memory, 239, 240 constructibility, 254, 257, 258, 260, 261 discriminability, 256 episodic memory, see Episodic memory invertibility, 256, 257 mnemonic devices, 247 natural language mediation, 244, 245 recall tasks, 261, 263 rehearsal, 241-243 schemas, 248-251 short-term storage, see Short-term storage symbol association, 240, 241 verbal reports, 252, 253, 265-268 verbalization, 242, 243 subjective time and cued double standard, 130
double standard mixture, 131-133 geometric mean, 111 harmonic mean, 120-122, 125 text comprehension and, see Text comprehension visual pattern recognition and, 4 degradation, 38 experiments, 33 generalized cones, 15 limited components, 23-25 recognition by components, 5, 7 Mental cues, verbal reports and, 237, 238, 251-253, 265-268 associability, 255 chunks, 238, 239 conscious memory, 239, 240, 242, 243 constructibility, 254, 255, 257-261 discriminability, 255, 256 invertibility, 256, 257 memory schemas, 248-251 mnemonic devices, 247, 248 natural language mediation, 244-246 recall tasks, 261-264 rehearsal, 241-244 symbol association, 240, 241 Mentally retarded, rehearsal and, 267 Metric variation, visual pattern recognition and, 20, 23, 49 Misinformation, contingency and, 140, 142, 168 Mnemonic devices mental cues and, 247, 248, 253 associability, 255 computer metaphor, 242 memory schema, 249, 251 verbal reports, 252 subjective time and, 106 Motivation associative structures and, 60, 64, 72, 78, 79 knowledge-directed machine learning and, 193, 194, 233, 235 subjective time and, 108
N
Natural language mediation, mental cues and, 242, 244-246, 251
Necessary condition, knowledge-directed machine learning and, 223 Neurobiology, contingency and, 137-139 historical perspective, 153 theory, 184 three-level analysis, 144, 145, 148-150 Nonaccidental properties, visual pattern recognition and, 7 degradation, 35 generalized cones, 12-14 contour variation, 19-22 perceptual biases, 14-19 limited components, 23, 26 perceptual basis, 8-12 Nonsense syllables, mental cues and computer metaphor, 240 natural language mediation, 244, 245
O
Object transfer, visual pattern recognition and, 46, 48, 49 Occasion setting, associative structures and, 93-98 Occlusion, visual pattern recognition and, 1 componential recovery, 46 degradation, 43, 44 experiments, 31 nonaccidentalness, 11 object recognition, 5 surface characteristics, 4 Orientation, visual pattern recognition and, 5, 20 variability, 46-48 Overgeneralization, knowledge-directed machine learning and, 218, 219, 224, 228, 229, 232 Overshadowing, associative structures and, 93
P
Paired-associate learning, mental cues and, 244, 245, 253, 255 Pairing condition, contingency and, 140, 141, 143, 144 algorithms, 162-164, 172 circuits, 175 computation, 154-156
historical perspective, 152 time, 179 Parallelism, visual pattern recognition and, 9-11 degradation, 36 generalized cones, 14, 15, 17, 20, 22 orientation variability, 46 Parsing, visual pattern recognition and, 21-23, 35 Partial reinforcement, contingency and, 140-143 algorithms, 163-167, 172 circuits, 175 computation, 154, 156, 159, 160 historical perspective, 150-152 implementation, 149 theory, 184 three-level analysis, 143, 144 Partial warning, contingency and, 141-143 algorithms, 163-167, 172 circuits, 174, 175 computational analysis, 146, 147, 149, 154, 156, 159, 160 historical perspective, 151, 152 implementation, 149 theory, 184, 185 three-level analysis, 143, 144 Pattern recognition, visual, see Visual pattern recognition Pavlovian conditioning, associative structures and, 56, 57, 98 contingency, 60, 62, 63 occasion setting, 93, 94, 96-98 postconditioning changes, 57, 58 reinforcer encoding, 64-66, 72 reinforcer learning, 78, 80 residual response, 89 stimulus-reinforcer association, 91-93 Pegword cues, verbal reports and, 249-251 Perception mental cues and, 240 visual pattern recognition and, 51 attention, 19 componential recovery, 46, 48-50 degraded objects, 35-45 incomplete objects, 29-33 limited components, 24 nonaccidentalness, 8-12 organization, 22, 23 speech analogy, 2, 3
Perceptual bias, visual pattern recognition and, 10, 11, 14 Permanent memory, mental cues and, see Memory, mental cues and Phonemes, speech perception and, 2, 3, 7, 21, 51 Planar components, visual pattern recognition and, 20, 21, 36 Postconditioning changes, associative structures and, 57-60 Primal access, visual pattern recognition and, 4, 6, 10, 16 Primary consequences, knowledge-directed machine learning and, 198 Primary enabling conditions, knowledge-directed machine learning and, 198, 199 Primary memory, text comprehension and, 278 Primary reinforcers, associative structures and, 72 Priming subjective time and, 113 visual pattern recognition and, 48, 50 Primitive action class, knowledge-directed machine learning and, 198 Primitive elements, visual pattern recognition and, 51 components, 6, 7 limited, 23, 24, 27 generalized cones, 14, 15 speech perception analogy, 2, 3 surface characteristics, 4 Probes, text comprehension and, 300, 301, 303, 306, 307 Procedural knowledge, mental cues and, 265 Pseudofacilitation, associative structures and, 96, 97
R
Reaction time, visual pattern recognition and, 19, 28, 32-34, 40-43, 48 Reality monitoring, mental cues and, 251 Recall, see also Free recall mental cues and, 238, 243 chunks, 238, 239 constructibility, 254, 258-261 discriminability, 256 invertibility, 256, 257 memory schema, 249-251
mnemonic devices, 247, 248 natural language mediation, 244, 245 symbol association, 241 tasks, 261-264 verbal reports, 253, 265 visual imagery, 246 text comprehension and, 275, 277, 314 linkage information, 296 short-term storage, 280, 281 thematic information, 304-306 theory, 310, 311 Recognition, knowledge-directed machine learning and, 213 Recognition by components, see Visual pattern recognition Recollection, knowledge-directed machine learning and, 193, 194 Redundancy, visual pattern recognition and, 33 Regularization, visual pattern recognition and, 6, 24 Rehearsal, mental cues and, 241-244, 266, 267 Reinforcement partial, see Partial reinforcement subjective time and, 108, 109 Reinforcement density, associative structures and, 70-72, 76 Reinforcer, associative structures and, 55-57, 98, see also Reinforcer devaluation manipulations; Reinforcer encoding; Response-reinforcer association delay, 72-74, 76 occasion setting, 94, 96 stimulus, 78-82, 91-93 Reinforcer devaluation manipulations, associative structures and, 60, 72, 74, 76, 77, 79, 81-83, 89, 90 Reinforcer encoding, 64, 67, 68, 76-78 concurrent measurement, 64, 65 delay, 72-74 mutual interference, 65, 66 reinforcement density, 70-72 response form, 65 reward shifts, 67 stimulus, 74-76, 79 training, 68-70 Repetition, text comprehension and, 288, 292, 293 Repression, verbalization and, 268 Residual responding, associative structures and, 82-91
Response, associative structures and, 55, 56, 90, 91, see also Response-reinforcer association discrimination, 90 reinforcer learning, 78-82 residual responding, 82-90 stimulus, 93 Response-reinforcer association, 57, 78-82, 98 contingency, 60-64 discriminative performance, 90, 91 postconditioning changes, 57-60 reinforcer encoding, 64 evidence on, 64-67 generality of, 67-78 Retention, mental cues and, 261, 264 Retina, visual pattern recognition and, 1, 5, 48 Retrieval mental cues and constructibility, 261 natural language mediation, 245 symbol association, 241 verbal reports, 266 text comprehension and abstraction, 312, 314 short-term storage, 286, 290, 296, 299 thematic information, 300, 304-306 theory, 308-311 Reward shifts, reinforcer encoding and, 67
S
Satiation, associative structures and contingency effects, 64 postconditioning changes, 60 reinforcer encoding, 72 Scalar timing system, subjective time and double standard, 131, 133 geometric mean, 110, 111 harmonic mean, 115, 122, 125 Schema, see also Causal link schema mental cues and associability, 255 computer metaphor, 240, 242 constructibility, 254, 258, 260 memory, 248-251 text comprehension and, 297 Script, mental cues and association, 255 invertibility, 256 memory schema, 248-251
Segmentation, visual pattern recognition and components, 5, 6 nonaccidentalness, 11, 12, 22 Semantics mental cues and, 257, 258, 260, 261, 264 text comprehension and, 306 Sensitization, contingency and, 184 Shape, visual pattern recognition and generalized cones, 13, 18 limited components, 23, 24, 27 Short stimulus duration, subjective time and, see Duration Short-term memory, mental cues and, 238 Short-term storage, text comprehension and, 275, 276, 278-281 abstraction, 313 distractor tasks, 283, 285 linkage information, 291, 292, 294, 297-299 lists, 277, 278 reading, 287-291 sentence recall, 281, 282 thematic information, 285-287, 294, 300-303, 305, 306 theory, 307-310 Similarity mental cues and, 256, 260 subjective time and geometric mean, 111 harmonic mean, 115 Simulation, contingency and, 148 Size, visual pattern recognition and experiments, 30 generalized cones, 13, 14, 17, 21 limited components, 26, 27 Snapshots, knowledge-directed machine learning and, 194, 234 attention focusing, 197 common-sense physics, 198, 199 competitive action games, 203 episodes, 224, 225 top-down knowledge, 208 Speech perception, visual pattern recognition and, 2, 3, 51 Sphere, visual pattern recognition and, 6, 27 Stimulus, associative structures and, 55-57, 72, 77-82, 90, 91 discriminative performance, 90 occasion setting, 93-98 reinforcer, 91-93 residual responding, 82-90
Stimulus control, associative structures and, 74-7, 97 Stimulus-response theory, associative structures and, 55, 56, 68, 83, 90, 91 Subjective equality, subjective time and, 108 baseline time left, 113 geometric mean, 111 Subjective time, structure of, 105-108, 130, 131 arithmetic mean, 111, 112 baseline time left, 112-115 geometric mean, 108-111 harmonic mean, 115, 119-122, 125-130 standards, 122-125 cued double standard, 126-130 double standard mixture, 131-133 time value experiments, 116-118 Subordinate categorization, visual pattern recognition and, 3-5 Sufficient condition, knowledge-directed machine learning and, 223 Surface characteristics, visual pattern recognition and, 4, 6, 9, 46 Symbol association, mental cues and, 240, 241, 257 Symmetry, visual pattern recognition and, 3, 5, 6 degradation, 36 generalized cones, 13, 14, 16-18, 21 limited components, 24 nonaccidentalness, 8-11 perceptual organization, 22 Systematic testing, contingency and, 146, 147
T
Text comprehension, memory and, 275-277, 314, 315 abstraction, 312-314 distractor tasks, 283-286 lists, 277, 278 short-term storage, 278-282 linkage information, 291, 292, 294-299 reading, 287-291 thematic information, 285-287, 294, 299-304 thematic information, 292-294, 304-307 theory, 307-312 Texture, visual pattern recognition and, 4, 6, 28, 29, 33, 34, 38, 49
Thematic information, text comprehension and short-term storage, 285-287, 291-294, 299-304 theoretical analysis, 304-308, 310, 311 Three-dimensional processing, visual pattern recognition and, 8-10 Time knowledge-directed machine learning and, 202, 210 subjective, structure of, see Subjective time, structure of Timing, contingency and, 138 Top-down knowledge, machine learning and, 208, 209 Topic information, text comprehension and, 314 short-term storage, 285-287, 292-294, 296, 299-304 theoretical analysis, 305-307, 310, 311 Transfer associative structures and, 95 mental cues and, 241, 267 Transversality, visual pattern recognition and, 5, 27 Trapezoid, visual pattern recognition and, 10, 21 Trivialization, knowledge-directed machine learning and, 215, 234 Two-dimensional image, visual pattern recognition and, 8-10, 26
U
Unconditioned stimulus (US) associative structures and, 60, 63 occasion setting, 94 reinforcer encoding, 65, 66 contingency and, 194, 185, 186 algorithms, 148, 162-165, 168-170, 186-189 blocking, 176 circuits, 174, 175 computational analysis, 145, 147, 153-156, 159 historical perspective, 150, 151 implementation, 149 latency, 177 learned irrelevance, 178 presentation conditions, 139-142 theory, 137, 138, 185
three-level analysis, 144 time, 179-183 Unconstrained data-directed generalization, 218, 230, 232, 233 Undergeneralization, knowledge-directed machine learning and, 227, 229, 232 Unfamiliarity, visual pattern recognition and, 1, 2, 15, 48
V
Validation, mental cues and, 252 Variabilization, knowledge-directed machine learning and, 217-219, 221, 224, 228, 230, 232, 234 Variable duration, subjective time and, see Duration Variable interval associative structures and, 59, 69, 71, 72, 81, 87 subjective time and, 119, 120 Variable time schedule, associative structures and, 84, 87 Variance, subjective time and, 121, 122, 130-133 Verbal mediation, mental cues and, 245-247, 252, 267 Verbal reports, mental cues and, see Mental cues Verbalization, mental cues and, 253 affect, 267, 268 conscious memory, 242 mnemonic devices, 248 Verification, knowledge-directed machine learning and, 227 Visual images, mental cues and, 242, 246, 247, 250-252, 255
Visual pattern recognition, components and, 1, 2, 51 basic phenomena, 4, 5 categorization, 3, 4 componential recovery, 46-50 experimental support, 28-45 generalized cones, 12-14 metric variation, 19, 20 parsing at joins, 21, 22 perceptual biases, 14-19 planar components, 20, 21 selection of axis, 21 limited components, 23-28 nonaccidentalness, 8-12 overview, 5-7 perceptual organization, 22, 23 speech perception analogy, 2, 3 Vocalization, mental cues and, 242, 266 Volume, visual pattern recognition and, 4, 12 componential representation experiments, 28 degradation, 44 generalized cones, 14, 17, 19-22 limited components, 23, 27, 28 Voluntaristic processes, text comprehension and, 277
W
Warning, partial, contingency and, see Partial warning Weber's law, subjective time and, 108, 111 Wedge, visual pattern recognition and, 6, 27 Word acquisition, visual pattern recognition and, 26 Word association, mental cues and, 245, 246 Working memory, text comprehension and, 278