INFERRED FUNCTIONS OF PERFORMANCE AND LEARNING
Siegfried Engelmann Donald Steely
Siegfried Engelmann University of Oregon
Donald Steely Oregon Center for Applied Science
2004
LAWRENCE ERLBAUM ASSOCIATES, PUBLISHERS Mahwah, New Jersey London
Copyright © 2004 by Lawrence Erlbaum Associates, Inc. All rights reserved. No part of this book may be reproduced in any form, by photostat, microform, retrieval system, or any other means, without the prior written permission of the publisher. Lawrence Erlbaum Associates, Inc., Publishers 10 Industrial Avenue Mahwah, New Jersey 07430
Cover design by Kathryn Houghtaling Lacey
Library of Congress Cataloging-in-Publication Data Engelmann, Siegfried. Inferred functions of performance and learning / Siegfried Engelmann and Donald Steely. p. cm. Includes bibliographical references and indexes. ISBN 0-8058-4540-2 (cloth : alk. paper) 1. Artificial intelligence. 2. Cognitive science. I. Steely, Donald G. II. Title. Q335 .E54 2003 006.3—dc21
2002032697 CIP
Books published by Lawrence Erlbaum Associates are printed on acid-free paper, and their bindings are chosen for strength and durability. Printed in the United States of America 10 9 8 7 6 5 4 3 2 1
Contents
Preface
PART I: PERFORMANCE OF NONLEARNING SYSTEMS
1 A Framework for the Fundamentals of Performance
2 Basics of Hardwired Systems
3 Agent Functions
4 Interaction of Agent and Infrasystem
PART II: BASIC LEARNING
5 Perspectives on Basic Learning
6 Basic Antecedent Learning
7 Basic Response-Strategy Learning
8 Learning Patterns and Generalizations
9 Transformation of Data
PART III: EXTENDED LEARNING
10 Individuals and Features
11 Secondary and Unfamiliar Learning
12 Experimental Designs
13 Volition and Thought
PART IV: HUMAN LEARNING AND INSTRUCTION
14 Human Learning
15 Language
16 Human Cognitive Development
17 The Logic of Instruction
18 Issues
References
Author Index
Subject Index
About the Authors
Preface
The objective of this book is to identify what the intelligent system that produces responses must do to perform as it does. The analysis starts with the performance variables that must be in place for the organism that does not learn, and then overlays the functions required for learning. At one end of the performance-learning spectrum are the simplest performance machine and the simplest organism that is incapable of learning. At the other end is the human with its amazing learning-performance capabilities. The analysis applies to all organisms and machines within this spectrum.
The overriding rule for the analysis is that the task requirements are the same for any organism or machine that performs the task. Therefore, any organism or machine that does not meet all the requirements could not possibly perform the task. Bipedal walking presents a set of basic requirements for any organism that performs it or any machine that performs it in the same manner the organisms do.
The book presents a series of meta-blueprints, which do not specify nuts and bolts or circuits, but rather articulate the steps, content or specific information, and logical operations required for the system to perform the specified tasks. In other words, by designing specific machinery based on the various meta-blueprints, it would be possible to build machines that perform in the same way that organisms perform and learn in the same way they learn.
The analysis is presented in four parts.
Part I (chaps. 1–4) considers the performance system that does not learn. Part I also considers both the information and motivation functions needed for organisms that perform operations that are not learned. The product of Part I is a meta-blueprint that presents the various functions that are logically required if the organism is to perform the observed behaviors.
Part II (chaps. 5–9) presents a meta-blueprint for basic learning: antecedent learning and response-strategy learning. The analysis frames the learning capabilities as an extension of the basic performance system. The analysis further identifies the kind of data and data transformations the system needs to perform generalizations of what is learned.
Part III (chaps. 10–13) presents a meta-blueprint for more complicated learning, such as the learning of highly unfamiliar content, secondary learning, and learning sets of related discriminations. This part specifies the functions for the ways learned material is classified by the system. The classification requirements derive from the need of the system to perform multiple discriminations involving a particular topic or specific set of examples.
Part IV (chaps. 14–18) deals with human learning and how it is related to that of other organisms. Part IV addresses issues of human volition, extensive classification of information, and processes such as voluntary control over thought and language use. The analysis addresses human development and language learning. This part also considers implications of the analysis of learning for teaching, particularly teaching formal content, and it considers selected theoretical issues (e.g., the legitimacy of inferring inner processes from behavioral data and the implications of the analysis for the popular views of learning, the unconscious, and cognitive development). Finally, Part IV assesses the implications of the analysis for constructing artificial intelligence entities designed to meet functional learning and performance requirements identified for living organisms.
This work should be of interest to various practitioners engaged in analyzing and creating behavior—ethologists, instructional designers, learning psychologists, physiologist-neurobiologists, and particularly designers of intelligent machines.
ACKNOWLEDGMENTS We are grateful to the people who gave us feedback and ideas that helped us progress through the various iterations of this book to its final form. Julian Guillaumot and John Lloyd read all the chapters, played devil’s advocate, and raised questions we sometimes had not considered. Dean Inman gave feedback and influenced us to focus some of our arguments. Tim Slocum and David Polson read parts of the book and pointed out its
strengths and weaknesses. Their comments resulted in a rather thorough overhaul of the first chapters of the book. Lynda Rucker proofed first editions of the manuscript and Tina Wells was invaluable in helping us through the final edits and indexing. We are also grateful to Fran Goode, Lou Bradley, and the people we work with for listening to our many discourses on the intelligence of cockroaches, bees, and even single-celled organisms. —Siegfried Engelmann —Donald Steely
PART I: PERFORMANCE OF NONLEARNING SYSTEMS
Chapter 1
A Framework for the Fundamentals of Performance
The behavior of organisms is purposeful. What they do on any given occasion may be conceived of as a task performed to bring about a functional effect. To perform the tasks as they do, the operating systems of organisms must carry out certain basic functions. A function is that which is essential or common to all possible systems that could perform a given task. One operating system may do it one way and a second another way. Yet if there are 100 different system designs that perform the same task, all possess a common set of functions. These functions are identified by analyzing the tasks and specifying the details that are essential for performance, regardless of the specific way in which these functions are carried out.
If a particular task calls for responding to a visual stimulus, all systems that perform that task must be able to meet the functional requirement of being able to receive visual information. The specific design of the receptor system may vary from a simple eye to a compound eye or even to a zoom lens. So long as the system has some basis for receiving visual signals, it meets one of the functional requirements imposed by the task.
If an organism or machine responds in a predictable way to a particular visual signal, the system must have some way to associate or connect the perceived visual stimulus to the motor-response system. Essentially, the system needs some sort of rule that alerts the organism to the fact that the presence of the particular stimulus requires a particular response.
If an organism or machine has a repertoire of different responses for various visual signals, it must have some form of screening function that is able to perceive and differentiate the various stimulus inputs and connect them to different response outputs. This requirement is fundamental if the organism (a) produces different responses to a given stimulus under different conditions, or (b) produces the same response in the presence of two or more different visual stimuli. Some type of decision making is implied to determine which response to produce for any particular stimulus because a particular response is no longer a simple function of the stimulus.
The analysis of functions does not need to consider neurology, anatomy, physiology, biochemistry, genetics, or psychology in identifying these basic functions. Rather, it considers only the logically implied, universal features that would be identified in each of the 100 different systems designed to perform the same task.
In working on this analysis, we repeatedly identified functions that seemed far too sophisticated for the organisms that produced the behavior. For example, the turning point in formulating the present version of the work occurred when we had begun a serious logical assault on what functions are needed for a single-cell organism to identify and approach a food source. The task we addressed was that of approaching a source of olfaction by using only olfactory sensory input. A casual analysis suggests that all the performer had to do was keep going in the direction that made the perception of the olfactory stimulus stronger and stronger until reaching the source. However, a more careful analysis of the task revealed that achieving this outcome requires an intricate interaction between information and logic. We consider only one aspect of this interaction for the time being, which is that the system must be able to perform some form of temporal analysis. Specifically, the organism must do a comparison of the levels of olfactory receptions at two separate points in time and then determine that at one of the times the stimulus is stronger than it was at the other time.
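The temporal comparison just described can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model of any real organism: the class name `GradientFollower`, the `"keep"` and `"turn"` labels, and the choice to treat an unchanged reading as a reason to turn are all hypothetical. The point it shows is that a comparison across two times forces the system to hold the earlier reading somewhere.

```python
from typing import Optional


class GradientFollower:
    """Hypothetical agent that approaches an odor source using two samples in time."""

    def __init__(self) -> None:
        # Stored Time 1 reading; None until a first sample has been taken.
        self.previous_level: Optional[float] = None

    def decide(self, current_level: float) -> str:
        """Compare the Time 2 reading with the stored Time 1 reading."""
        previous = self.previous_level
        # The current reading becomes Time 1 for the next comparison.
        self.previous_level = current_level
        if previous is None:
            return "keep"  # nothing stored yet, so no comparison is possible
        # Stronger reception means the last behavior worked: keep doing it.
        return "keep" if current_level > previous else "turn"
```

Feeding the follower the readings 0.2, 0.5, and 0.4 in that order yields "keep", "keep", and "turn". Removing `previous_level` makes the last decision impossible to compute.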
Without this information, the system would not know whether it should proceed in the direction it had been going or move in another direction. Furthermore, the comparison of the data at Time 1 and Time 2 cannot be performed until the second sample is obtained at Time 2. Therefore, the information from Time 1 must have been stored so that it is available to the organism as some form of representation. Once the organism concludes that reception at Time 2 is greater than that at Time 1, the system must produce behavior that is consistent with the conclusion. For all occasions in which the Time 2 reception is greater than the Time 1 reception, the behavior was successful, so the directive that results is functionally equivalent to the verbal directive, “Keep doing behavior X.”
THEORY OF INFERRED FUNCTIONS
In some ways, a theory of inferred functions proceeds in a different direction than current theories of learning or intelligence. The main difference is that inferred functions address more details of the content and logic of
the system that performs. Questions of the stimulus become questions of the features of the stimulus and the steps that would be required for the organism to be able to recognize the stimulus on various occasions, although the precise pattern of stimulation is never repeated. Questions about the response become questions about the kind of information the system needs on a given occasion to plan a response, direct it, and adjust it on the basis of feedback. Although the analysis of inferred functions leads to a greater articulation of content and logic, the analysis is an extension of behavioral analysis. Behavioral analysis reveals the functional relationships between stimulus events and responses—how behavior is affected by changes in discriminative stimuli and the consequences of behavior. The present analysis of inferred functions takes the next step in identifying the internal functions implied by the stimulus–response relationship. One contraintuitive aspect of the analysis is that it focuses on the things that the organism encounters—the shapes, colors, changes, and specific features that distinguish each from other things in the surroundings. The analysis of this aspect of performance is contraintuitive because it involves a sometimes technical consideration of the features or stimulus elements that various things and events possess. The analysis leads to issues about how an organism is able to identify a novel stimulus instance as a positive or negative example of a discriminative stimulus when the organism has never encountered that example before. There are certain stimulus elements in the natural environment to which an organism may innately respond. However, unless the organism has sufficient knowledge of those things to identify them and has knowledge of those features of a response strategy that change the current setting in specific ways, the organism cannot perform. 
Just as an organism’s system has been shaped to exploit the surroundings by using sophisticated receptors and performing sophisticated chemical analyses, it must conduct an equally sophisticated analysis of the specific features—properties, behaviors, tendencies—of the things that are important to the organism’s survival. The bottom line is that all the basic discriminations and strategies the organism has or learns are perfectly logical because things in the surroundings follow rules that are governed by logic. If an object is here, it is not there, and the system that performs must record where it is. If the object is transformed in universal ways when movement occurs (e.g., becoming apparently larger as it is approached), the system that performs must have provisions for accommodating the effect of the transformations and recognizing what remains the same about the object. In the same way, if the object differs from something else in three specific features, the learner must learn the nature of these features.
Furthermore, the learner must have all the logic required to draw conclusions about which features of the object are correlated with reinforcement. The logic is needed to determine what the presence of these features predicts. Without such logic, there can be no performance or learning because there would be no way for the organism to interact with the things in the surroundings. Whatever the organism learns is data based. There are no general, nondistinct learning tendencies. There is only the learning based on facts about the things that are relevant to what is learned.
A PERFORMANCE FRAMEWORK
The required components of a performance system may be identified by starting with a simple machine and noting the changes in functions that result when the performance task becomes more demanding and more like that of an organism. Machines can be deceptive because their design permits single parts to perform multiple functions. For the machines that we describe, we separate the functions.
Binary Performance
The simplest machine is designed to produce one response to a stimulus. This machine is an analogue to a spinal reflex. For our machine, a particular visible object (a ball) causes the machine to produce a single response: moving an arm up in a specified arc at a specified rate. This machine requires a minimum of five functions. Some of these functions are obvious and some are not:
1. Reception
2. Screening
3. Planning
4. Directive
5. Response
Reception Function. The reception function is simply the capacity of the machine to receive the type of sensation that contains the positive examples (as well as the negative examples). In this case, the sensation is visual, and the positive example is the ball. Screening Function. The screening function occurs next. It is essential because sensory receptions contain both positive and negative examples (balls and not balls). The system must have some qualitative criterion for
identifying or screening the positive examples and only the positive examples. The criterion rule may take many different forms, but all have a single content function—to identify objects that share specific features. For the screening function, one particular pattern of features is identified as positive; the others (by default) are negatives. Without this screening function, the system would not know when to plan the response. The screening must function as an independent variable and the response planning as a dependent variable. Planning Function. The planning function is activated when the presence of a positive example has been confirmed. Note that the planning function is not directly linked to either the reception or response. It is the link between the screening function and directive function. Receptions of negative examples result in a default plan to do nothing. Receptions of positive examples result in establishing a specific, planned response. The planning function does not produce or even direct the response. It simply identifies what the response directive will be. When the plan is activated, it specifies exactly what the machine will do next. Directive Function. The directive function directs the motor-response system to activate the response. The response has been planned. The directive function tells the system, in effect, “Do that response.” The directive function is not the response, but the process that orders the execution of the response that has been planned. Response Function. Unlike the other functions, the response function is observed as an action. It is not information in the form of a plan or directive, but a physical change created by the response directive. The physical change corresponds to the change described in the plan and issued by the directive. In the case of the single-response machine, the response function produces the only response the machine is capable of producing. Logic and Information. 
Three functions (screening, directive, and response) are characterized by transforming one logical mode into another. The screening function transforms the raw sensory stimulation into facts about the sensory record (the presence or absence of the ball). The directive function transforms information about the response into the directive or imperative to produce the response. The response function transforms this directive into actual behavior. These various functions are independent of each other, and the content of one cannot be derived logically from the content of any of the others. If they are connected in a performance system, therefore, the connections have to be achieved by invention, not by logic.
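The five functions of the binary machine can be laid out as separate steps in a short sketch. This is an illustration only, with the functions deliberately kept apart as the text recommends. The class `BinaryMachine`, the string labels, and the `directive_enabled` flag are hypothetical names; the flag is an assumption that lets the directive be switched off while every other function keeps working.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BinaryMachine:
    # Hypothetical on-off switch for the directive function.
    directive_enabled: bool = True

    def receive(self, scene: List[str]) -> List[str]:
        """Reception: take in the sensory record, positives and negatives alike."""
        return list(scene)

    def screen(self, record: List[str]) -> bool:
        """Screening: turn raw reception into a fact (ball present or absent)."""
        return "ball" in record

    def plan(self, ball_present: bool) -> Optional[str]:
        """Planning: link the fact about the record to information about a response."""
        return "raise_arm" if ball_present else None  # default plan: do nothing

    def direct(self, plan: Optional[str]) -> Optional[str]:
        """Directive: turn the plan into an imperative, if the switch is on."""
        return plan if (plan is not None and self.directive_enabled) else None

    def respond(self, directive: Optional[str]) -> str:
        """Response: the only observable step, a physical change."""
        return "arm moves up" if directive == "raise_arm" else "no movement"

    def trial(self, scene: List[str]) -> str:
        """Run one trial through all five functions in order."""
        record = self.receive(scene)
        fact = self.screen(record)
        plan = self.plan(fact)
        directive = self.direct(plan)
        return self.respond(directive)
```

With the switch on, `BinaryMachine().trial(["ball", "cup"])` gives "arm moves up". With `directive_enabled=False`, the same scene is received, screened, and planned for, yet the result is "no movement".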
The planning function is different from the others in that it links information to information—information about the record (“yes, the ball is present”) to information about the response directive (“yes, it’s appropriate to do response X based on plan X”). Planning and Directive Functions. It may appear that it would be possible to eliminate the planning step and go directly from the information about the record to the directive. This truncation would simply place two functions on the directive—specifying the response that will be produced and actually directing that response to occur. In practice, this amalgamation of the functions for binary performance systems certainly occurs. From the standpoint of functions, however, the directive function is greatly different from the planning function. The plan is information about what to do. The directive is an imperative to do it. The necessity of the directive function is not readily identified because the result appears to be automatic. The presentation of the ball simply activates the circuit that produces the response. The role of the directive is demonstrated by wiring the system so that the directive function is simply an on–off switch for the response. If the reception is identified as a positive example, the directive to respond would simply turn on the switch, thereby causing the specified response to be performed. The function is now observed by disabling the switch but maintaining the operating status for the other functions. The machine receives the sensory input, screens it, identifies the ball, and plans the response to be produced. However, the response does not occur although the system has the capacity to produce the response. The directive to respond has been disabled so there is no basis for activating the response. The role of the directive function can be demonstrated in another way. Follow this plan or do not follow it: Touch your head. The plan was the same regardless of whether you followed it. 
Your system understood what behavior the directions specified. The behavior occurred, however, only if you directed one of your arms to touch your head. “Touch your head,” therefore, was the planning function that described the behavior. What you actually did was the directive function. You either issued the directive “Touch head” or you did not.
Fixed-Response Performance
The planning function becomes more obvious and complex if we add a second behavior: moving the leg in a particular arc. If either “move the leg” or “move the arm” is to be produced in response to the ball on a particular occasion, the planning function of the machine necessarily changes. The identification that the ball is present still activates the planning function,
but the planning must provide some form of procedure for making decisions about whether to move the arm or leg. A large number of decision formats are possible, but the choices fall into four groups: preset sequence, probability, correlational, or combination. For each of these formats, the screening of the positive example would activate the selection process. Preset Sequence-Based Decisions. The preset sequence would provide a pattern of responses that would be produced on successive trials—simple alternation, double alternation (arm, arm, leg, leg), asymmetrical series (arm, arm, leg, arm, arm), and so on. For the machine to perform any preset series, the machine would need the addition of what amounts to a memory function. It could be as simple as a toggle switch or counter, but it serves a memory function if it carries information about the past into the current setting. Probability-Based Decisions. Some form of random-selection mechanism would be part of the planning. The system would randomly select on the basis of a schedule that provided for a given probability of arm or leg being chosen on a particular trial. The probability of arm or leg could be weighted so that a given response (e.g., the leg) would be selected on two thirds of the trials. If the system had any type of random selection mechanism, it would be possible only to estimate the probability of which of the two responses would occur on the next trial; however, it would be possible to predict the machine’s behavior over the next 100 trials with great accuracy. Correlational-Based Decisions. For correlational decisions, the planning function would be designed to take into account a variable independent of the ball’s presence. For example, if the variable is present on a given trial, the arm response is planned. If it is absent, the leg response is planned. 
The correlated feature could be any property that did not occur on all trials, ranging from the duration of the interval between two appearances of the ball to the time of day that the trial occurs, or the amount of light coming through the windows. For any correlation, the reception and screening requirements of the system are doubled. The system must now screen for the presence of the ball and the status of the correlated variable. Combination-Based Decisions. Combinations would constitute planning that had, for example, both a correlational and preset pattern. With every fifth trial, the system would analyze the status of an independent variable. The outcome of this analysis would determine the pattern for the next five trials.
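The four decision formats can be compared side by side in a brief sketch. It is a hedged illustration rather than a specification: the `Planner` signature, the function names, and the particular patterns used in `combination` are all hypothetical. Each planner is called once a positive example has been screened and returns either "arm" or "leg".

```python
import random
from typing import Callable, List

# (trial index, status of the correlated variable) -> "arm" or "leg"
Planner = Callable[[int, bool], str]


def preset_sequence(pattern: List[str]) -> Planner:
    """Cycle through a fixed pattern; the trial counter serves as the memory."""
    return lambda trial, _var: pattern[trial % len(pattern)]


def probability(p_leg: float, rng: random.Random) -> Planner:
    """Select "leg" with probability p_leg, otherwise "arm"."""
    return lambda _trial, _var: "leg" if rng.random() < p_leg else "arm"


def correlational() -> Planner:
    """Plan "arm" when the correlated variable is present, "leg" when absent."""
    return lambda _trial, var: "arm" if var else "leg"


def combination(block: int = 5) -> Planner:
    """Every `block` trials, sample the variable to choose that block's pattern."""
    state = {"pattern": ["arm"]}

    def plan(trial: int, var: bool) -> str:
        if trial % block == 0:  # analyze the independent variable
            state["pattern"] = ["arm", "leg"] if var else ["leg", "leg", "arm"]
        return state["pattern"][trial % len(state["pattern"])]

    return plan
```

A double alternation is `preset_sequence(["arm", "arm", "leg", "leg"])`. A planner built with `probability(2 / 3, random.Random(0))` cannot be predicted on a single trial, but over a long run its share of "leg" responses settles near two thirds.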
Multiple Classification Criteria. For each of these four possible formats, an independent variable of some sort is added to the screening function to create more than one possible response plan for the current setting. This second stimulus condition may be a visible feature in the current setting or some form of information. The location of the ball would be a visible feature. The record of the last four responses would be information. In either case, two different inputs must be considered for the formulation of the plan.
Let’s say that the response the machine selects is correlated with some feature of the ball. The ball appears randomly in one of two locations. If the ball is in one location, the arm is to move and strike it. If the ball appears in the other location, the leg is to move and strike it. This task involves two independent features of the positive examples: the presence and location of the ball. Therefore, the planning function cannot be performed without information about both the presence of the ball (which means that some response will be planned) and the position of the ball (which determines the response that will be planned). A second analysis is necessary following the screening function. The result is three different behavioral possibilities, each the function of a different combination of ball and its specific features:
1. The ball is not present either here or over there. (No behavior)
2. The ball is present and is here. (Plan leg response)
3. The ball is present and is over there. (Plan arm response)
Although the performance is variable, the system is in effect a binary reflex. The performance would be something like a pupillary reflex that had three settings: inactive, constricted, and dilated. If illumination is present and too bright, the system produces the constriction response. If illumination is present and too dim, the system produces the dilation response.
Because the no-response, constriction, and dilation responses are a strict function of specific stimulus conditions, all are reflexive.
Performance Based on Continuous Variation
We move away from the strict fixed-response reflex design when we introduce outcomes based on continuous variation. We simplify the design so that it involves only an arm and a ball. The ball could be anywhere inside or outside a three-dimensional zone. The arm could be in any location and orientation. The goal is for the machine to identify the presence of the ball when it appears in the zone and to move the arm in a direct route from its current position so that the hand makes contact with the ball. This task implies great changes in the design of the planning function. There is no longer a single plan or a plan based on fixed-response choices. Rather, the plan produced is a function of three variables: (a) location of the ball, (b) location of the hand, and (c) orientation of the arm. Because all of these variables are continuous, any one of thousands of different arrangements is possible on any particular occasion.
The Planning Function. The planning function must account for how the machine performs on each occasion. The system may be designed in one of two ways to achieve this planning. It may be designed so there are thousands of preset plans, one for each possible combination of arm position, hand position, and ball location. The system would identify the particular combination of coordinates for the three variables on a given trial and specify the combination. The other possible design is for the system to engage in a process that is the same on all occasions, but that results in situation-specific behavior. The presence of the ball would cause the system to identify and record the position-location of the arm and hand and the location of the ball. This information would be used as the basis for connecting the locations with an efficient route from hand to ball.
Obviously the process-based solution is far more efficient than the locus-to-plan system. The process-based solution, however, involves functions that were not required by the previous designs. The outcomes for all the previous designs were completely a function of the antecedent stimulus (the ball or the ball and a correlated stimulus). The process solution requires information of more than one variable across a potentially continuous range of variation. The choice is no longer between here and over there, but between which of thousands of possible combinations is present. The information obtained is not a function of the antecedent stimulus (screened before the plan), but of the behavior that the system performs after the screening when it analyzes the current setting.
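A process-based planner of this kind can be sketched in a few lines. This is a simplified illustration, not the design the text requires: it assumes both locations are already expressed as Cartesian coordinates in one shared frame, it ignores the orientation of the arm, and the names `Point` and `plan_route` are hypothetical. The same procedure runs on every trial, yet the plan it returns is specific to the current arrangement, so no table of preset plans is needed.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def plan_route(hand: Point, ball: Point) -> Tuple[Point, float]:
    """Return (unit direction, distance) for a straight route from hand to ball."""
    # Displacement from hand to ball, computed in the shared coordinate frame.
    dx, dy, dz = (b - h for h, b in zip(hand, ball))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        return (0.0, 0.0, 0.0), 0.0  # hand is already at the ball
    return (dx / distance, dy / distance, dz / distance), distance
```

For a hand at the origin and a ball at (3, 4, 0), the planned route has a distance of 5 and a direction of roughly (0.6, 0.8, 0.0).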
Model of Space. For the response to be planned, all three variables would have to be expressed in the same three-dimensional model. Unless this requirement is met, the location of the hand could not be correlated with the location of the ball or position of the arm. Planning an efficient route connecting hand and ball, therefore, would not be possible. The information about the locations and arm position would lead to a projection of a route from hand to ball. The route would provide specific information about both the direction and distance of the response. The projection of the route would serve as part of the plan for the response: the sensory part. To complete the plan, the response component would have to be added. Given that the hand–arm is in a particular position and orientation, the route would determine which hand–arm movements were to be specified. The response directive would have to specify these motor activities, which means they would have to be planned before they were directed.
Multisensory Map. If we go a step further, we require the system to construct the spatial map from different sensory modes. For this machine, the information about the ball’s location is still determined by visual information. However, the information about the location and orientation of the hand–arm is now determined by proprioceptive information. For this design, the system would have to possess an abstract map of space that is capable of identifying any location by either proprioceptive or visual coordinates. The task presented on each trial would require the specification of the coordinates for the ball from visual information and the coordinates of the hand–arm from proprioceptive information. Both data points would be entered on a single abstract map. A projected route from the hand to the ball would be formulated. The plan for the route would specify the proprioceptive changes that are implied by the route as well as the corresponding visual changes that are to occur. These would describe the parameters of the response to be planned.
Feedback Requirements. Feedback becomes necessary with uncertainty that the directive will achieve the intended outcome. We introduce uncertainty by having the ball move in and out of the response zone at different speeds and in nonuniform patterns. The ball will be stationary about half the time in the zone and in motion the other half. The task facing the system on a particular trial is now complicated by the possibility that the response initially specified will not be successful. Without additions to the system, the response will fail on possibly half the trials. The additions must provide a feedback function. In its simplest form, the feedback function would permit the system to replan or adjust a response that is in the process of being executed.
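A feedback function in this simplest form might be sketched as follows. The sketch is a toy under stated assumptions: it works in two dimensions, the hand advances a fixed step per sample, and the names `reach` and `ball_positions` are hypothetical. On every sample it checks the original projection against the current sensory data and re-aims only when the two disagree.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def reach(hand: Vec, ball_positions: List[Vec], step: float = 1.0) -> Tuple[Vec, int]:
    """Move the hand toward the ball, re-aiming whenever the ball has moved.

    Returns the final hand position and the number of replans.
    """
    target = ball_positions[0]  # the original projection
    replans = 0
    for observed in ball_positions:
        if observed != target:  # projection and current data disagree
            target = observed   # adjust the route; the step size is unchanged
            replans += 1
        dx, dy = target[0] - hand[0], target[1] - hand[1]
        dist = math.hypot(dx, dy)
        if dist <= step:
            return target, replans  # contact made
        hand = (hand[0] + step * dx / dist, hand[1] + step * dy / dist)
    return hand, replans
```

With a stationary ball the count of replans stays at zero. If the ball jumps partway through the reach, the hand keeps moving and bends toward the new location without stopping.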
This adjustment would occur whenever there is a discrepancy between the original projection of the response and the current sensory data. If there is no discrepancy, the planned response is continued. If there is a discrepancy, the system computes how the current response would be most efficiently transformed to create a new route based on current data. For the feedback mechanism to work, the system requires some method of comparing any new route with the original route and correlating the changes needed in the route with isomorphic changes that would have to occur in the response.

The fact that the arm is in motion means that the adjustment will take place within the motion context. The arm will not stop and then assume the new route. This fact has implications for the plan that is presented to achieve the adjustment. The movement will continue; however, some features of the movement will change. This means that some features of the original directive will remain (e.g., pattern of the response), but other features will change (e.g., direction). A major implication of this requirement is that the system must be changed so that it analyzes the response as a sum of various features. Only in this manner could it adjust specific features while not adjusting others.

For the system to accommodate ongoing adjustments, it would have to be designed so that it continuously compared the route along which the current response would take the hand, and the endpoint of that route, with the earlier-projected route and its endpoint. It would have to compare routes, calculate the specific discrepancies, and translate discrepancies in projected outcomes into responses that eliminate the discrepancy. This would require the system to have knowledge of which feature of the response would achieve the desired change and to adjust that feature without adjusting those features of the response that are common to both the original and revised routes.

Decisions. The final change in the complexity of the system would occur by requiring decisions not based strictly on information about location, distance, and so on. Let’s say that there was an arm and a leg. Either could make contact with the ball anywhere within the response zone. If one of the limbs is a lot closer to the ball than the other, the system would select it for the response. If the ball was within a common zone, however, the choice of limb would not be specified on the basis of proximity or any sensory-based data. The system would need some nonsensory criteria independent of location, position, or distance. The same options indicated for programming the decisions for binary-performance machines are available for this decision: preset sequence, probability, correlational, and combination. For example, a probability decision could be designed so that the machine had mood swings based on internal criteria. These mood swings would be determined by a probability mechanism.
The system could have a magnet that rotated independently of any external sensory data. When the magnet is oriented toward one pole, the leg would be selected. Rotation toward the other pole would result in selecting the arm. For balls that are in the response zone, the system would first identify that it is in the zone and then use the data-independent criterion for selecting a limb. For balls outside the response zone, the routes for both arm and leg would be compared. The one that is shorter would be the one that is planned. An important feature of the system that makes these decisions is that the criteria for selecting a limb are based on features (information) in the current setting that are not used by any of the other criteria for planning responses. Even if decisions are based on correlated features observed in the ball, arm, or hand, these features are independent of the features used for
the standard selection criteria. For instance, the decision-making format could be based on leg orientation when both hand and leg are in the common zone. One set of orientations leads to a leg response, the other to an arm response. Note that the leg orientation is not used to plan a leg response, but to select either an arm or leg response. The selection is based on an arbitrarily designated feature of the setting.

Inferences About Performance Systems

The design of our most complex machine presents a greatly simplified version of the task facing an organism that performs even a simple task, such as moving to intensify an olfactory reception. The organism has many more sources of sensory reception (possibly four or more sensory modalities), produces responses that require much more intricate plans and directives (such as locomotion), transforms data from all sensors and displays them on a single, abstract, spatial map, and derives feedback from data far more complicated than that implied by the machine examples. The organism is also complicated because it does not have a switch that turns it on. It must be motivated to respond. Furthermore, this motivation must be caused by specific stimulus conditions; that is, it must be a strict function of the antecedent stimuli. Despite the added complexity, the design of both the informational and motivational components is implied by the requirements of the tasks the organism performs.

Basic Functions. The same five functions are implied for basic pursuits that involve any behavior the machine or organism produces: reception, screening, planning, directing, and responding. The obvious difference is the degree of complexity. In the progression of machines we examined, the greatest changes occurred in the planning. From the standpoint of an observer, the behavioral results of several different designs appeared the same. The ball was presented and the machine made contact with it.
As the progression of machines illustrated, however, there were great differences in internal, unobserved processing steps that are implied by the different rules that the machine follows to meet the requirements imposed by the task. If the task requires responses to continuous variation of stimulus and response, the design of either machine or organism must accommodate this requirement. Single or casual observations of performance do not usually reveal the complexity of the task. Performance must be carefully observed on various occasions to distill the nature of the task and its variation across different settings.

As the complexity of the machine or organism increases, the necessity of the various functions becomes more obvious. When uncertainty and continuous variation are characteristics that must be addressed by the response, the need for the planning function is quite obvious. Each situation presents a unique problem, which can be solved only by referring to specific details of the current setting and making calculations based on those features. The response cannot occur without the screening (identifying that the ball is in the response zone), planning (identifying the limb’s position and the ball’s position and then using this information to specify a response for following a specific route from the current position to the ball), and directing (issuing the imperative for the plan that has been constructed to be transformed into action that is isomorphic with the plan).

When uncertainty and continuous variation characterize the task, feedback is imperative. However, feedback is not possible unless a process is in place. If the machine simply observes where the ball is now, feedback is not possible because no response is implied. If the system has a projection about where the current response will take a limb, feedback is now possible. The original projection of the response may be compared with the current projection of the response. If they do not coterminate, a specific adjustment is implied by the difference between current plan and projection for new plan.

Universal Processes. The system must employ universal processes rather than simple reflexive connections. The universal process involves two aspects: general and situation-specific. The general aspect provides for the rules, steps, or procedures to be followed on all occasions. The situation-specific aspect of the process is the application of the general rule to the current setting. With the one- or even two-response machine, processes are not needed because uncertainty and continuous variation are not factors. One response may be wired to one stimulus condition and another to a second condition.
As the response requirements become more demanding, processes become essential. When the position of the limb and position of the ball are variable (not fixed locations), a process is implied (rather than thousands of individual reflexive links). The projected response is unique to the current setting. Therefore, it must be expressed in spatial terms that are isomorphic with features of the response. When the system is required to combine sensory information from multiple sensory inputs (e.g., both proprioceptive and visual), something that functions as a universal three-dimensional spatial map is necessary. If there is uncertainty about the future position of the ball, the system must be designed to have an ongoing monitoring function based on a comparison of the projection on which the response was based, with the projection based on the current sensory conditions.
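The monitoring function described above can be illustrated with a small sketch. It is only one possible reading of the text, under simplifying assumptions that are not in the original: space is two-dimensional, a response is a record of three independent features (`pattern`, `direction`, `distance`), and the names `projected_endpoint` and `adjust_response` are hypothetical. The point it tries to show is that the comparison is between two projections, and that only the features implied by the discrepancy are changed.

```python
import math

def projected_endpoint(hand, direction, distance):
    """Project where the current response would take the hand (2-D assumption)."""
    return (hand[0] + distance * math.cos(direction),
            hand[1] + distance * math.sin(direction))

def adjust_response(hand, response, ball, tolerance=1e-6):
    """Compare the original projection with current sensory data about the ball.

    `response` is a dict of independent features (a hypothetical encoding).
    If the projections coterminate, the planned response continues unchanged.
    Otherwise only direction and distance are revised; the pattern is kept.
    """
    endpoint = projected_endpoint(hand, response["direction"], response["distance"])
    if math.dist(endpoint, ball) <= tolerance:
        return response  # no discrepancy: continue the planned response
    revised = dict(response)  # copy, so features common to both routes are retained
    revised["direction"] = math.atan2(ball[1] - hand[1], ball[0] - hand[0])
    revised["distance"] = math.dist(hand, ball)
    return revised

# Usage: the ball moves from (1, 0) to (0, 2) while a reach is under way.
plan = {"pattern": "reach", "direction": 0.0, "distance": 1.0}
unchanged = adjust_response((0.0, 0.0), plan, (1.0, 0.0))
revised = adjust_response((0.0, 0.0), plan, (0.0, 2.0))
```

In this sketch the first call returns the original plan because the two projections coterminate, and the second call changes direction and distance while leaving the pattern of the response as it was.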
Multiple Features and Content. Systems that produce behavior more complicated than a simple binary response must be designed to screen for multiple features of the stimulus (and perhaps the setting) and specify multiple features of the response. The stimulus is no longer a monolithic physical stimulus as described in traditional static terms. Rather, it is something composed of many observable features, some of which permit it to be classified as a positive example (e.g., a ball). In addition, it has the feature of currently occupying particular coordinates on the visual spatial map. It also has features of being a particular distance and vector from both the hand and foot and has motion features. Each of these features is independent of the others and necessary for the machine to perform as observed. The fact that the system must secure information about various qualitatively unique features implies that the processes involve content. Although the machine may be designed so that it classifies visual input in a digital or binary manner, the process is referenced to something that is qualitatively unique. In effect, the machine or organism must abstract multiple features of the stimulus and multiple features of the response. The rules that the machine follows to analyze these are strictly content rules referenced to specific, observable features of the ball and limbs.

Coordination. All the information about a particular instance of the ball that is relevant to the response must be presented to the part of the system that directs the response. The uncertainty of stimulus and continuous variation of response imply that the analysis of the ball and limbs must be continuous. Therefore, the system must provide for close coordination between the part of the system that plans and the part that issues directives. The part that plans must have ongoing information about the current location of the limbs.
The part that plans must also have information about the correlation between the current location of ball and limbs and the parameters of the response that would achieve the projected route. In other words, all current sensory and response information relevant to the successful pursuit must come together before the plan can be created. The information must be expressed so that it appears on the same spatial map.

Reflexive and Consequent Functions. The two classes of functions (those that refer to all applications and those that refer only to the current setting) may not be obvious from the design of an efficient machine because functions from the two groups may be combined. The separation of universal and situation-specific functions has particular importance in understanding the organism that performs the complex tasks. The reason is that those aspects of the process that are universal may be processed by reflexes that are a strict function of the antecedent stimuli. The presentation of something completely causes a response from the system. For instance, every visual stimulus is analyzed. The system, however, responds to only one set of features for a differential response. Those features reflexively activate the classification or screening process. The process requires no more information about the example than the identification of a particular stimulus feature being present. So this process is entirely reflexive. The response is a function of the antecedent stimulus conditions.

In the same way, the part of the planning that is the same on all occasions of a particular stimulus may be presented reflexively. The screening function would activate the general or universal process of creating a plan. This would consist of the rules that the system is to follow. It would not involve any specific information about where the ball or limb is located in the present setting, only rules about what information is needed to complete the plan for the current conditions. The process of obtaining the needed information requires specific action: scanning the zone to identify the location of the ball and limbs. The result is a specific set of coordinates and a vector that are unique to the current setting. The specific plan constructed for the current setting is a function of both the antecedent stimulus and consequences of an investigation.

Therefore, the format for the formulation of the plan involves two related processes. One is universal and reflexive. Every time a positive example is identified, the planning procedure that the system is to follow is reflexively activated. The other is specific and nonreflexive. The application of the process to the current setting requires an information-seeking function that specifies situation-specific values.

The nature of the analysis that describes the machine or organism is implied by the five functions needed to transform a stimulus into a response. Therefore, the proper analysis addresses all five as they relate to performance.
Reducing the analysis to only two, stimulus and response, provides only an analysis of phenomena, not of the essential variables of what must be screened, what must be planned, and what must be directed.

Table 1.1 summarizes the characteristics of the functions that are reflexive and those that require situation-specific planning detail. Both the general and situation-specific functions are activated by the antecedent stimulus. Because the general functions are the same in all situations, they are not influenced by behavioral consequences, only antecedent consequences. The situation-specific detail, however, is a function of behavioral consequences and may not be obtained unless unique behavior is produced to identify specific details of the current setting. The two sets of functions interact. The general functions present global formats, something like a fill-in-the-blanks set of directions. The situation-specific functions perform the behavior that provides information for each blank.

TABLE 1.1
General and Situation-Specific Functions

General Functions (Reflexive)                       Situation-Specific Functions (Nonreflexive)
Activated by antecedent conditions                  Activated by antecedent conditions
The same for all applications                       Different for any two applications
Detail not a function of behavioral consequences    Detail is a function of behavioral consequences

The assumption is that the universal functions may be designed to present any sort of information or knowledge as a function of specific antecedent stimuli. If the organism’s pursuit is based on a particular odor, the universal function may be designed to specify both the features of that odor and the general features of the behavior that are called for: intensifying the reception of the odor. In the same way, the universal functions could be designed to reflexively present the blueprint for making a nest, for a mating ritual, or anything else that meets the dual requirements of (a) being activated by the presence of specific stimulus conditions, and (b) being exactly the same on all occasions of the specific stimulus condition. Without this provision of general information, it would be impossible for any organism to exhibit situation-specific behavior that is referenced to general rules and achieves the same objective in various settings.

A Performance Example. We can illustrate the parallels in the five functions for machines and organisms through an apparently simple task. We put sugar behind a barrier 8 feet from a caged cockroach and release the cockroach. The cockroach has no visual basis for determining where the sugar is and must therefore use olfaction. The cockroach proceeds in a direction that would miss the sugar, stops, then goes in a beeline direction to the sugar, where it stops and eats. The behavior is not learned, so it is based solely on information and functions that are prewired into the cockroach. The behavior is many times more complicated than that required by our machine.

The cockroach responded to the presence of an odor. The odor, therefore, caused the process that led to responses in the same way that the presence of the ball caused the machine to respond. The cockroach’s system received the sensory information and screened it. This screening cannot lead directly to behavior of approaching the source of the odor any more than the presence of the moving ball leads directly to behavior of the machine. The cockroach’s behavior must be planned.
The planning, in turn, requires situation-specific information. The only information that could be used to plan an approach is the relative intensity of the reception. Therefore, the system analyzed the level of the reception at different times and correlated the findings with specific response strategies.
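A rough simulation may help make this strategy concrete. It is a sketch under stated assumptions, not a model of any real cockroach: the odor model in `odor_level`, the eight sampled directions, the step length, and the arrival threshold are all hypothetical values chosen for illustration. The simulated system never reads the location of the source directly. It only samples odor levels, faces the direction with the relatively highest level, and keeps its direction for as long as the level rises from one time to the next.

```python
import math

def odor_level(position, source):
    """Hypothetical odor model (an assumption): level falls off with distance."""
    return 1.0 / (1.0 + math.dist(position, source))

def step_from(position, heading, length):
    """Point reached by moving `length` from `position` along `heading` (radians)."""
    return (position[0] + length * math.cos(heading),
            position[1] + length * math.sin(heading))

def choose_heading(position, source, n_directions=8, probe=0.1):
    """Sample the level in several directions and face the relatively highest."""
    headings = [2 * math.pi * k / n_directions for k in range(n_directions)]
    return max(headings,
               key=lambda h: odor_level(step_from(position, h, probe), source))

def approach(start, source, step=0.5, arrive_level=0.66, max_steps=200):
    """Continue the current direction while the level rises; otherwise re-orient."""
    position = start
    level = odor_level(position, source)
    heading = choose_heading(position, source)
    levels = [level]  # levels received at successive times
    for _ in range(max_steps):
        if level >= arrive_level:  # reception is strong enough: stop
            break
        candidate = step_from(position, heading, step)
        new_level = odor_level(candidate, source)
        if new_level > level:  # level increased: the route is succeeding
            position, level = candidate, new_level
            levels.append(level)
        else:  # level did not increase: stop and select a new direction
            heading = choose_heading(position, source)
    return position, levels

# Usage: sugar placed at a hypothetical location about 8 units away.
final_position, received = approach((0.0, 0.0), (6.0, 5.0))
```

Even this toy version needs more than the presence of the odor. It compares levels across directions to select a heading and compares levels across time to judge whether the route should continue.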
The plan that applies a general strategy to the current setting requires the specification of multiple features of a specific mode of locomotion that would be projected to make contact with the food. At a minimum, the plan would have to specify the direction and pattern of locomotion. (Without both categories of information, the approach would fail.) Once the route and nature of locomotion are planned, they must be directed and adjusted. For the performance system to adjust the route, the system must receive continuous sensory information to confirm either that the route is succeeding in intensifying the reception or that it is failing. Any specific adjustment in specific dimensions or features of the projected route implies a correlated adjustment in features of the original response pattern. The cockroach will not continue to move in the same direction if that direction would bring it far to the left of the target. Just as the machine required a spatial map that could accommodate data from different sensory sources, the cockroach requires such a map. Directions that are mapped on the basis of olfactory receptions are to be correlated with proprioceptive data (and perhaps data from other sensors) that describe the same locations or directions. This union of these disparate data is possible only if all types of data are abstracted and projected on a common three-dimensional spatial map. Just as the machine’s performance is not possible unless more than one feature of the ball is analyzed, analyses based on multiple features of the sugar odor are logically required for the cockroach to perform the observed behaviors. The presence of the odor could account for the activation of locomotive responses; however, it could not account for the specific pattern that the cockroach used to approach the sugar. Identifying an approach route is not based on information that the odor is present, but information about changes in level of the odor. 
Furthermore, the analysis involves attending to both the absolute and relative levels of the odor. Facing the source requires comparative information about the various directions. The one that is relatively highest is the direction the cockroach faces. To determine the relatively highest level, the system must compare the absolute levels for different directions. (Note that the one it faces does not have to be a particular absolute level, like Level 6 on a 10-point scale. It is simply the direction correlated with the relatively highest level.) In the same way, the route is judged to be appropriate only by comparing absolute levels at different times. If the absolute level increases, the assumption of the system is that the pattern and direction of the locomotion produced during the period of change should continue.

So the analysis involves a minimum of three features of the odor: presence, relative level, and absolute level. The analysis also involves a minimum of two features of the response: the pattern of response that creates locomotion and the direction of the locomotive response.

Motivation. Unlike our machine, the organism faces a design problem of connecting the general behavior described by the reflexively transmitted format and the behavior that results from a unique plan. The production of a specific behavior cannot occur until it is planned and directed. Before the planning occurs, the most specific input the planning component could receive would be general information about what it will do and some form of motivation to carry out the steps needed to perform the planning. So a gap exists between what the system is able to do reflexively and what must be done if a plan is to be made for the current setting.

There is only one possible way to solve the problem: design the situation-specific unit so that it responds differentially to the consequences of its behavior. It would respond differentially to positive and negative consequences. In effect, the formula would be for the reflexive unit to present the general rule that the situation-specific unit was to follow in the current setting. If the situation-specific unit performed the behavior required to identify the specific location of the target (unique directions and information about the other specific variables that are relevant to the production of the response), the system would be designed to issue some form of positive consequence to the situation-specific unit. If this unit did not comply with the general directive to identify specific information, it would receive a negative consequence, something from which the unit would want to escape. The result would be that the situation-specific unit would comply with the general requirements by securing the needed information. Through the control of changes in reinforcing or punishing consequences, the reflexive unit would be able to control the planning and directive behaviors of the situation-specific unit.
If a particular type of change occurred in the current setting (e.g., the plan being completed), the reflexive unit would reflexively issue positive consequences. Note that the same conclusions about the two types of units (reflexive and situation-specific) may be inferred from behavior. If the organism does some things, even general things, that are always the same in the presence of specific stimulus conditions, the system must have preknowledge of what to do. Regardless of whether the knowledge is learned or hardwired, the knowledge must be present at this moment. Whatever the organism does that is the same on all occasions implies knowledge that is presented reflexively to the part of the system that produces responses. The Infrasystem and Agent. This scheme raises many questions about how the units interact and particularly how the situation-specific unit is designed so that it responds to the consequences of its behaviors. These issues
are addressed in chapters 2 to 4. The analysis refers to the reflexive unit of the performance system as the infrasystem and the consequence-governed functions as the agent. The infrasystem performs the universal functions by reflexively presenting the general rules and controlling the motivational variables (consequences of the agent’s behavior). The agent does what the infrasystem cannot do: it formulates situation-specific plans and directives based on the rules and motivational variables (differential consequences). The designation, agent, derives from the fact that the infrasystem cannot produce situation-specific responses. The infrasystem therefore needs an agent that performs the operations the infrasystem is incapable of performing. Through reflexes, however, the infrasystem influences the agent to create situation-specific plans and directives.

The discussions in chapters 2 to 4 address technical details of this two-unit performance system because these are the details that are relevant to the key features of the performance system. The system runs on specific content, facts, and relationships. It does not blindly associate a global stimulus to something loosely described as a response. Rather, it performs all the functions needed to secure and mobilize the information required for the response to be possible: attending to multiple features of the stimulus and multiple features of the response, making plans that are related to projections, directing responses that comprise specific components, and performing monitoring functions that logically require inputs of current data and projected data.

The overriding theme of chapters 2 to 4 is that, for the system to perform, it must process content about both the setting and responses it plans and produces.
Therefore, it must have knowledge of both its response capabilities (repertoire) and how to produce responses called for by various stimulus conditions, which are revealed by receptions from different sensory modalities.

A theme related to content is that every detail of the scheme articulated in chapters 2 to 4 is logically necessary. Omitting any of the steps or details that are logically implied by the details needed to achieve the observed performance would make it impossible for the system to perform.

The sophistication of the functions the system must perform may be quite contraintuitive. How could a simple organism have sophisticated knowledge and perform logical operations that involve subtle detail and refer to variation of multiple features of response and setting? The most relevant answer is that it could not perform the behavior without performing the various inferred functions. A related answer is that everything about the organism, from the structure of its cells to the electrochemical processes required for the simplest behavior, is elegant and incredibly sophisticated. So it is with performance. The cockroach is not drawn by an invisible and mystical magnet to the sugar. The route is mapped on the basis of content, comparisons,
calculations, and correlations. The system must address many issues to perform as it does. Chapters 2 to 4 address the central issues.
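The division of labor between the two units can be pictured with a minimal sketch. It is offered only as an illustration of the scheme outlined in this chapter, and every name in it (`GeneralFormat`, `REFLEXES`, `infrasystem_issue`, `agent_plan`, `infrasystem_consequence`) is hypothetical, as is the encoding of consequences as +1 and -1. The infrasystem part is a strict function of the antecedent stimulus. The agent part must investigate the current setting to fill in each blank of the general format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneralFormat:
    """Fill-in-the-blanks directive: identical on every occasion of the stimulus."""
    pursuit: str
    blanks: tuple  # names of the situation-specific values the agent must supply

# Hypothetical reflexive links from a screened stimulus feature to a general format.
REFLEXES = {
    "sugar_odor": GeneralFormat("intensify reception", ("direction", "pattern")),
}

def infrasystem_issue(stimulus_feature):
    """Reflexive: returns the general format, or None if the stimulus is screened out."""
    return REFLEXES.get(stimulus_feature)

def agent_plan(general_format, investigate):
    """Nonreflexive: obtain each situation-specific value by investigating the setting."""
    return {name: investigate(name) for name in general_format.blanks}

def infrasystem_consequence(general_format, plan):
    """Reflexively issue +1 if every blank is filled, -1 otherwise (assumed encoding)."""
    complete = all(plan.get(name) is not None for name in general_format.blanks)
    return 1 if complete else -1

# Usage: one setting in which the investigation succeeds, one in which it does not.
fmt = infrasystem_issue("sugar_odor")
full_plan = agent_plan(fmt, {"direction": 40.0, "pattern": "walk"}.get)
partial_plan = agent_plan(fmt, {"direction": 40.0}.get)
positive = infrasystem_consequence(fmt, full_plan)
negative = infrasystem_consequence(fmt, partial_plan)
```

The sketch leaves out what chapters 2 to 4 are said to address, namely how the agent comes to respond differentially to the consequences it receives. Here the consequence is simply computed and returned.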
SUMMARY

The details of the task to which a performance system responds imply the functions that the system must perform. If the task cannot be performed unless the system obtains and organizes information, the system, whether organism or machine, performs operations necessary to secure and organize the information.

Five major functions are required for the most basic tasks: reception, screening, planning, directing, and responding. Each function is a variable that is required for any stimulus to be transformed into a response. A thorough analysis of performance, therefore, would articulate each function. To eliminate the intermediate functions and proceed directly from stimulus to response is to create a relationship that basically eliminates the content. However, the content is the sine qua non for performance on the task. Unless the system is designed to identify details of what, where, and when of both stimulus and response, the observed behavior could not possibly occur.

Planning is the function that changes the most as the tasks increase in complexity. The plan is based on multiple features of the stimulus and multiple features of the response. The operations that are implied by these features include comparisons and correlations of changes in stimulus conditions to changes in specific features of the responses.

The system required to perform tasks characterized by continuous variation in stimuli and uncertainty of response has two parts. One part, the infrasystem, processes everything that is the same for all applications of a particular pursuit (such as locating food). The other part, the agent, deals exclusively with the situation-specific details required for the organism to perform in each unique setting. The operations performed by the infrasystem are reflexive in nature. Specific stimulus conditions reflexively cause particular outcomes. Part of what the infrasystem issues reflexively to the agent is general information about the pursuit to be planned and executed.
The agent functions are not reflexive, but are sensitive to consequences of behaviors. The general directives from the infrasystem indicate what the agent is to do. When the agent performs the steps required for the planning and response production, it receives positive consequences. If it does not comply with the general directives, it receives negative consequences. This scheme requires the system to have preestablished stimuli that are primary reinforcers. The presence of a primary reinforcer motivates the agent to perform the behaviors that have been designated by the infrasystem.

All the behavior discussed in this chapter, as well as chapters 2 to 4, is assumed to be completely hardwired with all the necessary content and consequences preestablished and not the product of learning. The goal of the analysis is to account for the functions that occur the first time the spider constructs a web, the bee repairs a wall of a cell, the cockroach runs from sudden light, or the salamander strikes at a prey.
Chapter 2

Basics of Hardwired Systems
This chapter establishes two basic functions of the infrasystem—the basis for motivation and the basis for the preestablished information the infrasystem issues to the agent. The basis for motivation is an internal reinforcement system reflexively linked to the presence of specific stimulus features. The information is in the form of preestablished blueprints that indicate what changes the organism is to produce in the current setting in response to specific stimulus conditions. The information is general, so it applies to a large range of individual applications. Like the motivational component, the information is issued reflexively by the infrasystem. Both the motivation and information functions are necessary if the agent is to perform specific operations in a variety of settings. LEARNED VERSUS UNLEARNED PERFORMANCE There may actually be few species that are totally incapable of learning. The spider, bee, and organisms that are far less sophisticated behaviorally are capable of learning. However, much of what they do is not learned. The discussions that follow refer to the hardwired organism as if it is totally hardwired. However, the line between learning and not learning is indistinct in several aspects. The most critical is that for the organism to perform behavior in a specific concrete setting, the organism must learn or gain knowledge about the features of this setting that are relevant to the response patterns that will be produced. If the target of a hardwired pursuit is to the left, the organism goes to the left. To achieve this end, the hardwired system must be designed so that information acquired about the current setting at 24
Time 1 is applied at later times. If it takes 18 seconds for the organism to reach the stimulus, information about the stimulus must have occurred before the first response and must have persisted for 18 seconds. The truly hardwired system could not learn anything that carries over to another setting. It would simply respond to the demands of each setting by following the hardwired performance format.

The question is, does this temporary acquisition and retention of information in the current setting constitute some form of learning? This analysis considers learning to be a change in behavior that would persist from one setting to the next, so a temporary gain of information is not learning. Evidence that the organism uses information in the current setting to perform does not imply in any way that the organism learns. However, the fact that the organism does not learn does not imply that the organism is incapable of using information in the current setting.

Operant Responses in Hardwired Systems

The kinds of behaviors that the cockroach performs when it approaches sugar are operant responses. Traditionally, the only context for using the designation operant is that of learning. However, the performance of the hardwired organism challenges this limitation. The fact that a response is not learned does not imply that it is not operant. Operant responses are demonstrated to be functions of reinforcing consequences. Clearly the performance of the cockroach is a function of reinforcing consequences. If the sugar is moved from Location 1 to Location 2, the cockroach goes to Location 2. Therefore, the sugar, by definition, is a reinforcer, and the cockroach’s approach behavior is a function of the reinforcing consequences. Furthermore, discussions of reinforcing consequences assume that there are primary reinforcers. True primary reinforcers are not learned.
For some organisms, the behavior that secures primary reinforcers may be learned; however, the primary reinforcer is a hardwired component of any organism’s system regardless of whether or not the organism learns. The organism does not learn to be hungry or learn pain. These are primary reinforcers that are part of the organism’s basic performance machinery. The operant behavior of systems that do not learn has the same general character of learned operant responses. The discriminative stimulus is presented. It serves as information that a reinforcer is present in the current setting. The organism produces nonreflexive responses that are keyed either to escape from the reinforcer (if it is negative) or approach it (if it is positive). Stated differently, if we were presented with an organism that was unlike any we had ever observed, we would not be able to determine from observations of its behavior whether the operant responses it produced were learned or were simply part of the organism’s hardwired repertoire.
Let’s say that we observe that the organism consistently escapes from sudden bright light. If the organism is in a large, dark room and we suddenly illuminate the area where the organism is, it runs quickly to a dark area. There is no doubt that the bright light serves as a negative reinforcer—an aversive stimulus from which the organism escapes. The specific behavior is different from one application to the next, so it is not a simple reflex. It is an operant response that takes into account the details of the current setting and therefore the consequences of different behaviors. We could not possibly determine whether this response strategy was learned or inherited. The first time we tested the organism, it performed in the same general way that it did on all subsequent trials. Only if we know the history of the organism would we know the basis for the performance. If the organism had no opportunity or basis in experience for the learning, we would conclude that the behavior was inherited. If we observed that the light initially did not induce escape behavior, but that it did after training, the conclusion would be that the behavior was learned. The training would have the basic form of presenting the light as a predictor of aversive consequences. When the light occurred, the organism would be able to avoid the consequences by running to darkness. The fact that the behavioral functional relationships are the same whether the behavior is learned or inherited implies that the essential details of the underlying system must be the same. Both systems face the same task; therefore, both need the same provisions for performing. In summary, a response is either reflexive or operant—either a strict function of antecedent stimuli or not. A learned operant response remains an operant response after it has been learned. In other words, it remains a function of reinforcing consequences. 
In exactly the same way, a nonlearned (hardwired) response that is shown to be a function of reinforcing consequences is not reflexive. The only difference between operant behavior that is based on a hardwired system and learned operant behavior is that the hardwired operant behavior is always driven by primary reinforcers—it is not learned or driven by secondary reinforcers. All the reinforcing provisions must be preestablished as part of the organism’s hardwired performance potential.
Basic Requirements for Hardwired System Operant Responses

All the significant behaviors of organisms such as ants and bees are operant. When ants transport leaves, their behavior is operant. When they tend to ant cows or to the queen, their behavior is operant. When they battle intruders, their behavior is operant.
The behavior of the battling ant may be shown to be a function of specific stimuli, either the odor or some other specific feature of the intruder. Therefore, the behavioral set or mode of behavior (the battle mode) is caused by the stimulus. As the series of simple machines illustrated, the specific responses that are produced, however, are not reflexive responses. They are a function of behavioral consequences. For the bird that migrates north as a function of duration of illumination, the unconditioned stimulus is the duration of illumination, and the discriminative stimulus is the orientation—north. This information is an aspect of the bird’s performance that is the same across all birds of a given species and instances of a given bird. Therefore, this information is not only hardwired, but is a reflex issued under specific stimulus conditions. This inference is based on the fact that specific information is required for the migration. The individual bird must know which direction is north and what behavior to perform. This information must be presented to the bird prior to any responses on a given occasion because the responses are a function of consequences, and therefore must be planned. The only possible mechanism is for the information to be presented reflexively in the presence of particular stimulus conditions. All hardwired behavior must follow this same format. Before any responses are produced on a given occasion, there must be a basis for the responses. That basis is general information about some goal to be achieved and the behavior required. The basis is reflexively caused by the presence of specific stimuli. For example, it is possible to cause a shark to swim to a targeted location by transmitting a low level of electric current from the location. If the shark is in a tank, the shark will produce the same behavior of approaching the source of the current thousands of times. 
The shark is being governed by what amounts to a rule of behavior: “approach the source of current.” This behavior is not learned. It is partly a function of the discriminative stimulus, the presence of the current. The presence of the current reflexively causes the rule to be presented to the agent. The specific twists, turns, and passes that the shark makes on the present occasion are behaviors that are consistent with the goal of approaching the source. Therefore, before the shark produces any response on a given occasion, three steps must have occurred:

1. The shark received information that the current was present.
2. The behavioral rule was reflexively issued to the agent (“Move to intensify the reception of the current”).
3. A specific plan was created for how to move: the specific direction and pattern.

The information about the stimulus to be pursued and the nature of the pursuit served as a goal or criterion for planning specific behaviors. After
the behaviors were planned, the shark produced responses that led to the reinforcing consequence of approaching the source of the current. The specific form of the information is not implied by the shark’s performance. The essential details of the information, however, are. The current is recognized as something to be responded to. The response strategy is an approach. For the shark, this performance is relatively more important than doing other things at the present moment. As indicated earlier, the information cannot create the responses. The performance system must be designed so that it is motivated to produce the responses. Therefore, the stimulus that causes the planning and behavior (the presence of the current) must also be the source of motivation. Operant responses are maintained by reinforcing consequences, and the system must be designed so it provides reinforcement for particular behaviors. If the organism pursues the discriminative stimulus, the system must somehow empower the discriminative stimulus with reinforcing properties. Like the information, this reinforcing property must be present before the organism produces the responses, and it must persist until the response is terminated. For the simplest arrangement, the organism will either approach or escape from the discriminative stimulus. The discriminative stimulus for approach functions as a positive reinforcer; the discriminative stimulus for escape serves as a negative reinforcer.
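The three steps described for the shark (reception, reflexive issue of the rule, and situation-specific planning) can be sketched as a small simulation. This is only an illustrative sketch, not a model proposed in the text: the grid world, the `intensity` falloff, the detection threshold, and all function names are hypothetical assumptions introduced here.

```python
# Illustrative sketch only: grid, intensity formula, and threshold are assumptions.

def intensity(position, source):
    """Hypothetical reception level: stronger when closer to the source."""
    distance = abs(source[0] - position[0]) + abs(source[1] - position[1])
    return 1.0 / (1.0 + distance)

def receive(position, source, threshold=0.01):
    """Step 1: the system registers whether the current is present."""
    return intensity(position, source) >= threshold

def issue_rule(current_present):
    """Step 2: the infrasystem reflexively issues the rule to the agent."""
    return "intensify_reception" if current_present else None

def plan_step(rule, position, source):
    """Step 3: the agent plans a specific move that is consistent with the rule."""
    if rule is None:
        return position  # no rule issued, so nothing is planned
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    candidates = [(position[0] + dx, position[1] + dy) for dx, dy in moves]
    # Pick the candidate location with the strongest reception.
    return max(candidates, key=lambda p: intensity(p, source))

def approach(start, source, max_steps=100):
    """Repeat the three steps until the source is reached or no rule is issued."""
    position = start
    path = [position]
    for _ in range(max_steps):
        if position == source:
            break
        rule = issue_rule(receive(position, source))
        next_position = plan_step(rule, position, source)
        if next_position == position:
            break
        position = next_position
        path.append(position)
    return path

path = approach((0, 0), (3, 2))
print(path[-1], len(path) - 1)
```

The rule stays the same on every occasion, while the planned path depends on where the agent starts relative to the source, which is the distinction the text draws between the reflexively issued information and the operant responses.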
SENSATION

The first function of the performance system is reception. If light or sound influences operant responses, the light or sound must have been received in some form by the part of the system that produces operant responses. What the performance system receives, however, is not actual light patterns or sound patterns, but sensations that are isomorphic with features of light patterns and sound patterns. The sound cannot travel through the system, nor can the light or odor. The only possible way in which these stimuli could affect the operant responses, therefore, is for the physical features of the incoming stimuli to be represented by the system as sensation.

The system creates representations that are in the form of sensation because it has no other choice. The part of the system that produces operant responses needs information about the current setting. If the relevant details are at a distance, the only option available to the system is to use some stimuli that are not at a distance and convert them into sensations that provide information about what is at a distance. The light from an object has properties that provide information about the object. That light reaches
the organism. The organism’s system represents specific features of the reception as sensations. The sensations created are isomorphic to the features of the sensory input. Changes in features of the physical reception are presented in real time as corresponding changes in the sensation. Different systems are designed to produce different classes of sensation: those that represent sound, light, odor, magnetic variation, heat, and so on. Each class of sensation is configured such that it is discriminable from the others, so the system knows whether it is receiving visual information rather than auditory information. However, all primary information that the system receives is in the form of sensation.

Distortions of Sensation

A major requirement of the sensations that the system represents is that the sensations must provide information that is relevant to the behavior the organism produces. If the system is to respond to a feature that is in the ultraviolet range of electromagnetic waves, the system must be designed to receive ultraviolet input. To meet the basic requirement of creating an analogue to selected features of the surroundings, the system must create parallels in sensation for details of the stimulus that is received. If the physical features of the reception change, the sensations that represent the reception would change. Without this requirement, the information the system receives would be an unreliable predictor of the physical world.

Disproportionate Representation of Features. The system is not designed to record the full visual spectrum. Rather, it is designed to provide the part that is relevant to what the organism will do. The choice of what is represented constitutes a kind of distortion from the actual range of physical variation that reaches the organism. Through its selection, the system limits potential information. If vision is monochromatic, discriminations based on color are limited.
In addition to distortion created through the selection of the range of variation that is represented, the system may be designed to enhance the reception of specific segments of the range through disproportionate representation. For instance, the system may provide for the reception of the auditory range of 700 Hz through 1700 Hz to be received in a way that provides it with far more variation in sensation than the rest of the range. A reception in the range of 800 Hz (which is in the range of enhanced stimuli) would be represented with relatively more detail than something in the range of 300 Hz.
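Disproportionate representation can be illustrated with a piecewise-linear mapping from frequency to sensation. The sketch below is a hedged illustration, not a mechanism stated in the text: only the 700 Hz to 1700 Hz band comes from the passage, while the two gain values and the function names are hypothetical.

```python
# Illustrative sketch only: the gain values are assumptions, not measured data.

ENHANCED_BAND = (700.0, 1700.0)  # band named in the text
BASE_GAIN = 0.01                 # hypothetical sensation units per Hz outside the band
ENHANCED_GAIN = 0.05             # hypothetical sensation units per Hz inside the band

def sensation(freq_hz):
    """Map a frequency to a sensation value, with extra resolution inside the band."""
    low, high = ENHANCED_BAND
    below = min(freq_hz, low)
    inside = min(max(freq_hz - low, 0.0), high - low)
    above = max(freq_hz - high, 0.0)
    return BASE_GAIN * below + ENHANCED_GAIN * inside + BASE_GAIN * above

def discriminable_difference(f1_hz, f2_hz):
    """Difference in sensation produced by two receptions."""
    return abs(sensation(f2_hz) - sensation(f1_hz))

# The same 50 Hz change yields more variation in sensation near 800 Hz than near 300 Hz.
print(discriminable_difference(800.0, 850.0))
print(discriminable_difference(300.0, 350.0))
```

Under these assumed gains, an equal physical change produces a larger change in sensation inside the enhanced band, so receptions there are represented with relatively more detail.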
Secondary Sensations

To provide the agent of the system with motivation to respond, the system extends the use of sensation from simply representing physical features of the sensory input to creating the primary reinforcers that drive the agent. The system creates secondary sensations that have no corresponding physical basis in the reception and attaches these sensations to specific receptions. In this way, the agent that plans and directs responses receives both the primary sensations, which represent the physical features of the reception, and the secondary sensations, which provide the motivation for the agent to plan situation-specific behaviors.

An example of such a secondary sensation would be pain. Pain is nothing more than a secondary sensation that is added to the reception of specific physical features. The pain is not a physical feature of the reception, but one that is added by the system. Pain has a reinforcing function. The body part that is painful demands attention and response planning. The goal is to escape from the pain. Unlike disproportionate representation, the secondary negative-reinforcing sensation is a pure creation of the system and is not implied in any way by analysis of the physical features of the stimulus to which the pain is affixed.

The system may be designed so that if the loudness of the reception exceeds a particular range, the secondary sensation of pain is added to the reception. Similarly, the brightness of visual receptions, the pressure of tactual presentations, and the level of a particular gustatory or olfactory stimulus all may be designed as variables that trigger the secondary sensation of pain. The reinforcing sensation is not a substitution or modification of the physical features of the reception. The reception of a particular sound is still in the class of auditory sensation that has specific, potentially discriminable features that make the reception recognizable as a horn blast, for instance.
Because the loudness feature of the reception exceeds the range of auditory stimuli that are classified as neutral by the system, the sensation of pain is added so that the reception now has both sensations that result from the physical properties of the reception and those that are imposed by the system to promote some form of operant behavior. The critical difference between the primary and secondary sensations is that secondary sensations do not have the goal of representing features of the stimulus, but of influencing the agent. The unmodified primary sensations provide information. Secondary sensations enhance receptions so that they are able to serve as primary reinforcers. In the case of the pain created by the horn blast, the secondary sensation does not provide information about the loudness of blast. The reception of the physical features provides this information. Rather, the pain is designed to influence the agent.
If the horn blast did not have an aversive character, there would be no possible basis for the organism to produce escape responses. The system may be designed so the degree of pain created is correlated with the degree of loudness—as the loudness decreases, the pain decreases. The organism may therefore escape from the pain by producing any of a variety of possible operant responses (running away, hiding behind a barrier, etc.). With the addition of secondary sensation, the horn blast now has two functions. It is both a discriminative stimulus and negative reinforcer. The two are correlated in a way that implies behavior. By escaping from the loudness of the horn blast (primary sensation), the organism escapes from the pain (secondary sensation). For some orientations to learning, references to sensation and pain are considered subjective, anthropomorphic, or unscientific. From a behavioral standpoint, references to pain (or aversive sensation) are no less objective than references to vision or olfaction. Pain sensation is inferred in the same way that vision or olfaction is inferred. If the behavior of the organism implies that it is responding to changes in the visual field, the organism is receiving visual information in the form of sensation. If the organism escapes from specific stimuli, these stimuli have negative-reinforcement properties. The only bases for these negative-reinforcement properties are secondary sensations. If an organism, whether completely hardwired or capable of learning, escapes from specific stimuli, the organism experiences what amounts to pain. This is not to say that all have the same patterns of pain or that the same body parts issue pain. However, if the organism escapes from certain stimulus conditions, those conditions are functionally aversive, which means that the system must have some provisions for making those conditions painful. 
The addition of pain is possible only if the primary reception is enhanced with secondary sensation. Methods of Enhancing Secondary Sensation. Two types of pain are possible, but both have the same ultimate function of influencing the agent to produce responses that reduce the pain. Some pain has a physical locus. An injured leg produces pain that is correlated with a particular body part. All implications for behavior have to do with that body part. Not walking or walking in a manner that reduces the pain and other associated behaviors are referenced to a particular physical locus. Other pain does not have a precise physical locus. An example would be a chicken running from the shadow of a particular character (a nonlearned behavior). The pain is associated with a particular visual reception. However, there is no physical locus for the pain in the same way there is for the injured leg. The escape behavior of the chicken occurs not because the eyes hurt, but because the agent that produces responses is in pain.
The difference between the type of sensation that has a physical locus and the systemic sensation is subtle. One has to do with the level of pain, and the other has to do with content.

Secondary Sensation for Level of Reception. Every sensory representation of any stimulus has many different features. Each could be the basis for enhancement with secondary sensation. The system could be designed so that the addition of secondary sensation corresponded to the level of the reception. Because the level of reception has nothing to do with other discriminable features of the reception, any reception that meets a preestablished level criterion would be enhanced. The system would respond the same to a horn blast, a dynamite blast, or a shearing wind blast that reached a certain decibel level. If the auditory reception meets the loudness criterion for pain, it is enhanced with secondary sensation.

Attaching secondary sensation to a level of reception may be an all-or-none design or a continuous-variation design. For the all-or-none design, a particular threshold level of reception would activate the secondary sensation, and the sensation would maintain a fixed value regardless of how far above the threshold the reception goes. For the continuous-variation design, the enhancement would be activated by a threshold level and would vary as the energy in the reception varies above this level. Either format could be keyed to specific behavioral strategies. However, the continuous-variation design has an advantage for escaping from any loud sound. It permits the organism to correlate behavioral progress with progressive changes in the level of reinforcing sensation. As the organism produces responses that decrease the loudness of the horn blast, the level of pain decreases. Therefore, the response strategy that leads to the decrease is reinforced. If the organism does nothing, the level of pain remains high. Note that this strategy is available to the organism that does not learn.
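The contrast between the all-or-none and continuous-variation designs can be sketched as two small functions. This is an illustrative sketch under stated assumptions: the 120 dB threshold, the gain, the linear drop in loudness with distance, and every name below are hypothetical and do not come from the text.

```python
# Illustrative sketch only: threshold, gain, and attenuation values are assumptions.

def all_or_none(level_db, threshold_db=120.0, fixed_pain=1.0):
    """Fixed secondary sensation once the reception reaches the threshold."""
    return fixed_pain if level_db >= threshold_db else 0.0

def continuous(level_db, threshold_db=120.0, gain=0.1):
    """Secondary sensation that varies with the level above the threshold."""
    return max(0.0, (level_db - threshold_db) * gain)

def loudness_at(distance_m, source_db=140.0, drop_per_m=2.0):
    """Hypothetical loudness of a horn blast at a given distance."""
    return source_db - drop_per_m * distance_m

def feedback_for_step(design, distance_m, step_m=1.0):
    """Change in pain after moving step_m farther away (negative means relief)."""
    before = design(loudness_at(distance_m))
    after = design(loudness_at(distance_m + step_m))
    return after - before

# One step away from the horn at 2 m: only the continuous design signals progress.
print(feedback_for_step(continuous, 2.0))
print(feedback_for_step(all_or_none, 2.0))
```

With these assumed values, a step away from the horn lowers the pain under the continuous design but leaves it unchanged under the all-or-none design until the threshold is crossed, which is why only the continuous design gives differential consequences for each response.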
If movement in a particular direction results in less pain, the organism continues to move in that direction. Secondary Sensation for Specific Content. The system may be designed to add secondary sensations to specific features of the reception that are independent of level. For instance, a particular form, movement, pattern, and so on would be enhanced with reinforcing sensation. The level of the reception is not relevant to this format. If the hardwired organism were to respond to the pattern features of the horn blast, for instance, it would respond in the same way whether the horn blast was received at 140 or 14 decibels. In the same way, if the organism is to respond to a particular pheromone, regardless of its level of reception, the system requires secondary sensations that make the unique features of this pheromone salient or compelling for the agent that plans responses. Instead of this pheromone being
on equal footing with other sensations that are of the same level of reception, the enhanced pheromone stands out as something that commands attention, even if the reception is faint. Whether the basis for sensational enhancement is a specific stimulus pattern, movement, or combination of features, such as green and movement, the added sensation is arbitrary from the standpoint of a physical analysis. However, it is not arbitrary with respect to the needs of the organism.

If a system needs a discriminative stimulus to identify a candidate for mating, the system may be designed to enhance something that is present in potential mating settings, but not present in any other settings, possibly an olfactory molecule that emanates from a potential mate. The presence of the molecule becomes both the reinforcer and reference for approach behaviors. This particular enhanced sensation would necessarily be designed so that the changes in the enhanced sensation parallel changes in the level of reception. Behaviors that intensify the reception of secondary sensation would bring the organism closer to the mate. If the sensation were not designed so that changes in enhanced sensation parallel changes in level, the organism would not be provided with differential consequences for behavior. If the level of enhanced sensation were fixed, the agent would only know that the pheromone is present somewhere in the area, but would have no basis for producing behaviors that approach the mate. No behavior would result in a change in the level of the secondary sensation; therefore, none would take precedence over others.

The features that are enhanced must be specific enough to rule out any competing stimulus that could be present under normal conditions. An example of a complicated set of form-movement criteria is the shadow that promotes chickens’ escape behavior. The criteria are based on the delta-shaped shadow of a chicken hawk.
However, a stationary delta-shaped shadow does not excite chickens, nor does a delta-shaped shadow that moves, but not point first. Only a delta-shaped shadow that moves point first elicits the escape behavior. If the three features are not all present in the example (delta shaped, movement, in point-first direction), the secondary sensations are not produced. When all features are present, the secondary sensation is issued to the agent, and the seemingly ordinary shadow evokes an enormously energetic escape response. The three-feature criterion for secondary sensation tends to guarantee that, under normal circumstances, the only shadow that the chickens respond to is that of a chicken hawk.

Secondary Sensations for Approach and Escape. The escape response is different from the approach response because any response that escapes from the stimulus leads to a relatively more positive state. (If the level of the aversive stimulus is reduced to zero, the escape is successful.) However, for the approach response, the positive state is created not by reducing the level of the reception, but by increasing it. What is not apparent is that both the escape and approach responses require the same pattern of enhancement and changes in sensation. For both, the behaviors are designed to escape from negative reinforcement or a negative sensation.

It is not immediately apparent that the approach response involves negative reinforcement. However, the main design problem involved in creating sensations that lead to an approach is to make sensations for responses that approach the target more reinforcing than sensations for other possible responses. If the organism is at X distance from the object of the approach and the sensations at Point X are perfectly positive, there is no possible reason for the organism to continue to move toward the stimulus. Only if sensations at Point X are relatively more negative (less positive) than a point closer to the target could the system reinforce the agent for approaching the stimulus. Therefore, when the organism approaches the stimulus, it is actually escaping from a relatively negative sensation. The secondary sensations that are present while the organism is not approaching the target have the property of being negative reinforcers. The only way the organism is able to escape from this negative sensation is to approach the target.

The negative sensation is not absolute, but relative. The current setting may be relatively positive; however, it will lead to approach responses only if the current reception is less positive than it would be to produce approach responses. An example of a relatively positive setting that promotes behavior is a moth that approaches a light but is restrained from reaching the light by a screen. The moth chooses to fly against the screen.
Every time the moth is removed and put at a distance from the screen, the moth flies against the screen, indicating that being restrained at the screen is relatively more positive than being farther from the screen. The moth, however, would not stay at a screen distance if it were not restrained. Therefore, when the moth is restrained, the current state is negative compared with that of closer approach to the light. The sensational changes that lead to approach have the properties of an urge. The urge has negative properties that are relieved only by an approach or intensification of the discriminative stimulus. These negative properties overlay a positive baseline. The format for an approach cannot be a perfect parallel to the pain format. The level of pain diminishes as the physical energy of the stimulus diminishes. If this relationship occurred with a positive approach, the secondary sensation would diminish as the organism approached the target. This scheme wouldn’t work because the enhanced sensations would be low when the organism was near the target. In summary, the arrangement for either escape or approach behaviors is the same with respect to how secondary sensations change. The presence of
the discriminative stimulus creates a relatively negative sensation. The sensation changes and becomes more positive as a function of changes in physical details of the reception. The only difference between approach and escape is whether reducing the level of reception creates a more positive change in sensation or a more negative change. For prewired escape behaviors, the sensations become more positive when specific details of the reception diminish. For approach behaviors, the sensations become more positive when the details of the reception increase. The choice of physical feature to which changes in positive or negative sensations are keyed must be limited to those features that change as a function of distance. Apparent size, intensity of reception, and relative position change as a function of distance and are therefore possible keys that the system uses to vary the level of enhancement of the sensitized stimulus. Note that none of these features is used to identify particular inputs as positive examples. The enhanced sensations must be keyed to multiple features of the stimulus in the same way that information must be keyed to multiple features. One set of features identifies the presence of a target for approaching or avoiding. This activates the secondary sensation. Another set of features has to do with changes in the target. These are the basis for issuing changes in the secondary sensation. If the system fortifies olfactory molecule Y with a secondary sensation, the system must have a criterion for identifying molecule Y. This identification criterion is needed to determine which incoming stimuli are to be enhanced by a particular secondary sensation. The system also requires another criterion for determining how the enhanced stimulus changes as specific features of the current setting change. 
Creating change in sensation would have to be based on changes in features that are independent of the features that identify the object, such as changes in level of reception. As the level decreases, the secondary sensation changes. The same general format could be used to issue changes in a visual stimulus (the larger the image on the retina, the more positive, or negative, the sensation) or auditory stimuli (the greater the amplitude of the reception, the more positive, or negative, the sensation). An absolute requirement for secondary sensations is that the changes in sensation must be keyed to changes in some physical aspect of the stimulus. All manipulations involving secondary sensation are performed reflexively by the system, as a strict function of current stimulus conditions. The variation in secondary sensation is reflexively linked to changes in specific physical features of the reception.

Content Maps

Secondary sensation is not sufficient to account for unlearned behaviors of organisms. The hardwired organism needs the same kind of information the organism that learned specific content would need. The organism
needs information about how the setting is supposed to change and what to do to affect the change. For instance, the organism has to know how to construct a web, build a nest, dig a hole, or create six-sided cells in a hive. These pursuits involve specific content. Therefore, if the hardwired system is to perform as observed, specific content must be provided to the agent that plans and produces responses. Without this information in some form, the agent would have no possible basis for producing responses or determining whether the responses are appropriate. The presentation of the content is limited in the same way that the presentation of secondary sensations is. The content must be reflexive. It must be a function of current sensory conditions. It must be in the form of sensation. The difference is how that sensation is configured. Unlike secondary sensations that provide reinforcement, the content map “sensation” actually presents the agent with qualitatively unique sensory features (information) about various features that are not present in the current sensory reception. These features imply the specific changes that must be achieved by performance in the current setting. These features would be realized only if the current setting is changed in specific ways. Possibly, the location of the agent would change; specific features may be enlarged, intensified, or altered, or there would be a web where there is now space between two branches. Some form of the qualitative features of the change must be presented to the agent before any planning occurs. The content information is logically prior to the plan because this information is the basis for how the current setting should change. The content information about the change also serves as a basis for providing feedback. If the organism performs in a way that is consistent with the features described by the information package, the infrasystem provides the agent with positive feedback. 
If the behaviors being performed are not consistent with the features of changes that were to occur, the agent receives negative feedback. The goal of the behavior is that the setting changes in the way specified by the information package. This package is the only possible basis for the spider’s ability to construct the web. The agent needs criteria for producing responses. The responses are designed to create a pattern. Unless the agent has some form of specific information about the pattern, the responses required for building the web could not be planned. In one form or another, the information about the pattern must be in place prior to any agent responses or there would be no basis for any behaviors or consequences. Although the analysis refers to the information that fills the content requirement as a content map, the precise form of the content map for a particular stimulus class is not implied by the analysis except that (a) it is an abstraction that has the capacity to serve as a criterion for evaluating incoming sensory receptions, and (b) it is in the form of sensation. The map may
SENSATION
be simple or complicated. An example of a simple map is implied by the behavior of approaching a particular stimulus. Even if we allow that the stimulus is enhanced with secondary sensations that urge the organism to do something, no particular responses are implied unless the organism has some knowledge of what to do or what kind of outcome is to be achieved by the response. In either case, some form of content information is needed before response could be planned. Secondary sensations without a content map are not sufficient to account for observed behavior. If the content information were not present, the organism would not exhibit the behavior of approaching a discriminative stimulus without exhibiting random behaviors. The organism would have no information about what sort of behavior—if any—led to more positive sensations. The organism would experience a positive change in sensation only if locomotive responses in a particular direction were directed toward particular stimulus features. If the organism did not have this information before the responses are produced, the organism would produce random responses before fixing on an approach route. Obversely, if the organism moves toward the discriminative stimulus without hesitation or false starts, these conservative conclusions follow: (a) The organism has content information about the kind of responses that are to be produced in the presence of the discriminative stimulus, (b) this information is presented before any responses are produced, and (c) the content information is applied to the details of the current setting, and the responses are therefore driven by reinforcing consequences. 
If the system is using visual sensation to approach Stimulus X, the map would have to contain an informational equivalent to the rule “Produce locomotive responses that increase the apparent size of X.” This rule would permit the organism to (a) orient itself toward X, (b) produce locomotive responses that increase the size of the visual reception of X, and (c) receive positive changes in sensation as the organism’s responses result in an increase in the apparent size of X. Content Inferences. Inferences about the nature and extent of the map are fairly rigorously inferred from the behavior of various organisms of the same species. Those features of behavior that are the same across all successful applications of all individuals describe the scope of the map. For example, we look at various webs built by different spiders and by the same spider built on different occasions. The features that are the same describe the content map that is reflexively issued to all spiders of the species before the web is constructed. Only those features that are the same could be directly controlled by the map because the map is a hardwired reflex. When specific stimulus conditions are present, the map is presented to the agent. The map could
not vary from one trial to the next because it would have to be keyed to the presence of specific stimuli. The fact that the stimulus is present is independent of the variables taken into account in planning or producing a particular response in the current setting. Therefore, it would be impossible for the map to provide information sufficient for the construction of a web in any specific situation. The map provides the general rules, not the specific detail. Although it would be possible for the system to be designed to issue many variations of the map for many variations of patterns that occur in different settings, this practice would be inefficient and ineffective. The current setting is unique. As the response is being produced, the current setting changes in unique ways. Even if the system were designed to issue 1,000 different maps reflexively, each keyed to a specific pattern, they could not account for the variations observed or account for the variation of repairs that are performed when the web is damaged. Just as the samenesses that occur across settings imply what is presented reflexively as a content map, the variations or differences that occur in various settings imply the extent and nature of the agent functions associated with the content map. The system issues a copy of the same map for all occurrences of particular stimulus conditions. That map serves as a template or skeleton for all instances of the pursuit. This arrangement would be required even for simple pursuits such as, “Use locomotive responses to increase the apparent size of X.” Because the map would have to be applied to each setting, it would be customized on the basis of setting-specific detail (in the same way that the simple machine customized plans to the setting-specific information about features of member and ball). 
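The inference described above (samenesses imply the map, differences imply agent functions) can be illustrated with a minimal sketch. The feature labels below are hypothetical and chosen only for illustration; they are not observations of any actual species, and the set operations are merely one convenient way to express the logic.

```python
# Hypothetical feature labels for three observed webs; illustrative only.
observed_webs = [
    {"radial threads", "spiral thread", "central hub", "anchored to two branches"},
    {"radial threads", "spiral thread", "central hub", "anchored to a fence post"},
    {"radial threads", "spiral thread", "central hub", "repaired after damage"},
]


def infer_content_map(instances):
    """Features present in every instance: the inferred scope of the map."""
    return set.intersection(*instances)


def infer_agent_variation(instances):
    """Features that differ across instances: attributed to agent functions."""
    return set.union(*instances) - infer_content_map(instances)


content_map = infer_content_map(observed_webs)
agent_variation = infer_agent_variation(observed_webs)
print(sorted(content_map))
print(sorted(agent_variation))
```

In this sketch, the three shared features describe what could be issued reflexively, and the three setting-specific features are what would have to be supplied by customizing the map to each setting.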
In summary, if one map is to serve as a template for all instances of the pursuit, the map’s specifications are logically limited to those details of the operation that are the same in all applications. The qualitative features of a map are inferred from behavior. If an operation, such as building a nest, involves Steps A, B, and C, and if A and B may occur in order A-B or B-A, the inferred content map is something like, “Create C after A and B are complete.” Because the map does not specify the order for A and B (except as they relate to C), a decision must be made by the part of the system that produces operant responses because some particular pattern will be produced in the current setting. This choice does not mean that the agent is somehow free and not constrained by the content map. It means rather that some nonreflexive, consequence-referenced function is needed to account for the choice (this topic is developed in chap. 13). Possibly the closest analogy to the content map is an annotated blueprint that has all the specifications for the various details of the product
and lists all the construction steps mandated for the process that leads to the product. For example, the organism is constructing a cell in a beehive. The content map specifies all relevant details of a cell—the number of sides, the angles, the thickness of the walls, the depth, and all the other specifications that are the same across this type of cell in various hives. The content map also functions as something of a step-by-step guide for constructing the cell. In performing the construction function, the content map may be conceived of as an overlay of the final product on the actual cell that is under construction. Every detail of the cell under construction either corresponds in every detail to the content map or it doesn’t. If the agent produces a change in the current cell that makes it more like the model of the completed cell, the agent receives more positive secondary sensations. If there are discrepancies between the changes the agent makes and the blueprint, the specific deviations are enhanced with sensations and become the discriminative stimuli for the next behaviors. For instance, if the cell is not yet completed, there is a discrepancy between the map and features of the cell under construction. The system would enhance the specific areas of discrepancy on the content map and would issue relatively negative sensation. The enhancement would show exactly where the finished part ends and where the next steps would have to occur. This boundary would be enhanced with sensation and would thereby become salient to the agent. The secondary sensation would serve as negative reinforcement for the agent to perform in a way that eliminates the discrepancy between current cell and the model provided by the content map. If the cell under construction has the wrong angle between two of the walls, the deviation between the specified and realized angle would be highlighted on the overlay. 
The enhancement of this part ensures that it is salient and serves as the discriminative stimulus for the next behaviors. In all cases, the behavior would be planned to eliminate the discrepancy between map and overlay. All details of how the content map is projected, how discrepancies are noted, and how the secondary sensations are adjusted must be achieved reflexively. The system would reflexively project the blueprint, reflexively highlight anything in the cell under construction that did not match the specifications of the content map, and reflexively change the level of the secondary sensation as the features of the cell change. These are operations that would provide specific information for the agent and be the same in all instances. At the same time, they are strictly based on antecedent conditions. If a particular visual impression is presented, it is superimposed on the map reflexively and compared reflexively. The result is a reflexively issued change in secondary sensations.
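The overlay comparison can be sketched as follows. This is an illustration under stated assumptions, not a claim about how any organism represents a cell: the blueprint values, the tolerances, and the function names `highlighted`, `mismatch`, and `sensation_change` are all hypothetical.

```python
# Hypothetical blueprint and tolerances, chosen only for illustration.
BLUEPRINT = {"sides": 6, "wall_angle_deg": 120.0, "depth_mm": 10.0}
TOLERANCE = {"sides": 0, "wall_angle_deg": 2.0, "depth_mm": 0.5}


def highlighted(current):
    """Features of the cell under construction that deviate from the map.

    These are the areas that would be enhanced and become the
    discriminative stimuli for the next behaviors.
    """
    return sorted(
        name
        for name, target in BLUEPRINT.items()
        if abs(target - current.get(name, 0)) > TOLERANCE[name]
    )


def mismatch(current):
    """Total relative discrepancy between the current cell and the map."""
    return sum(
        abs(target - current.get(name, 0)) / target
        for name, target in BLUEPRINT.items()
    )


def sensation_change(before, after):
    """Positive if the change made the cell more like the map, negative if less."""
    return mismatch(before) - mismatch(after)


shallow = {"sides": 6, "wall_angle_deg": 120.0, "depth_mm": 4.0}
deeper = {"sides": 6, "wall_angle_deg": 120.0, "depth_mm": 7.0}
print(highlighted(shallow))               # the depth is still discrepant
print(sensation_change(shallow, deeper))  # positive: the cell moved toward the map
```

Every function here depends only on the current reception and the fixed blueprint, which is the sense in which the comparison and the resulting change in sensation could be carried out reflexively.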
Termination Criteria. Just as the content map for the bee would logically require some sort of termination standard that is continuously compared with the current setting and updated as the setting changes, the content map for a simple behavior, such as approaching X, would need a variation of the same comparison process. The process could assume a variety of possible forms. For instance, if the map called for an increase in the size of visual stimulus X, the system could display the current status on the retina as a thick outline around the image of X. As the organism approached X, the outline would become thinner. The shape of the outline would change as the shape of current image changes. The outer edge of the outline would indicate the size to be achieved. The inner edge would be based on the current image. When the outline thickness is reduced to nothing, the map is terminated and the approach is terminated. This example is not intended to suggest that any organism uses this particular mode of representation (dynamic outline), merely that such a mode would meet the requirements of being a reflexively activated pattern that would permit ongoing comparisons and modification of the content map in a reflexive manner. It would also provide the termination criterion for the response. The termination criterion is as necessary as an activation criterion. Like the activation criterion, the termination criterion must be reflexively referenced to particular features of the current setting. If there is no discrepancy between the blueprinted product of a particular pursuit and a representation based on the current setting, the sensation associated with the discriminative stimulus and the content map are terminated. The object or physical outcome that had been the focus of activity is no longer enhanced with sensations. It is therefore not salient to the agent. No behavioral consequences are associated with it. 
No differential reinforcement is issued to the agent for responding to it, and no content information is linked to it. The termination criterion for one operation may be the activation criterion for the next operation. If Behaviors A and B are to be produced in sequence, the system must be designed so that the termination criterion for A is the activation criterion for discriminative stimulus B. The presence of the specific stimuli that signal pursuit of B also terminates all of the secondary sensations of A—the content map and the enhanced discriminative stimulus. These are replaced with the sensations and map for B. By using this format, the system is able to produce chains of sequences that lead to a terminal goal. For example, in nest building, the bird may have to perform the operations of locating a site, finding building material, and constructing the nest. All these components may be chained together by using the format of having the termination criterion for one of the segments be the activation criterion for another.
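The termination and chaining format can be sketched in a few lines. As with the dynamic outline itself, this is not intended to suggest that any organism uses this representation. The segment names follow the nest-building example, but the numeric thresholds, the `setting` dictionary, and the functions `outline_thickness` and `run_chain` are assumptions made for illustration.

```python
def outline_thickness(target_size, current_size):
    """Gap between the size to be achieved and the current image of X."""
    return max(target_size - current_size, 0.0)


def run_chain(segments, setting, max_responses=100):
    """Run segments in order; each ends when its termination criterion holds.

    Meeting the termination criterion of one segment is what activates the
    next one. max_responses is only a safety guard for this sketch.
    """
    log = []
    for name, terminated, respond in segments:
        while not terminated(setting) and len(log) < max_responses:
            respond(setting)
            log.append(name)
    return log


# Hypothetical setting: apparent size of the site, and simple counters.
setting = {"site_size": 0.0, "material": 0, "nest": 0}


def approach_site(s):
    s["site_size"] = min(s["site_size"] + 0.25, 1.0)


def gather(s):
    s["material"] += 1


def build(s):
    s["nest"] += 1


segments = [
    ("locate site", lambda s: outline_thickness(1.0, s["site_size"]) == 0.0, approach_site),
    ("find material", lambda s: s["material"] >= 3, gather),
    ("construct nest", lambda s: s["nest"] >= 3, build),
]

log = run_chain(segments, setting)
print(log)
```

The approach segment ends when the outline thickness is reduced to nothing, and that same event is what permits the next segment to begin.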
SUMMARY
The completely hardwired organism needs the same type of information and motivation to respond that would be required if a particular response strategy had been learned. The reason is that the information and motivational requirements are the same. They are strictly implied by the nature of the task and how the learner performs the task. If the organism had learned how to build webs of a particular pattern, rather than inherited the information, the learner would have content information that applied to various settings—information about the specific details of the pattern that must occur in every application. The task facing the learner would be to change the current setting in such a way that an example of that pattern would be present where it does not currently exist. The content information must be in place before the fact. The only possible way the learner would know whether its efforts in creating the pattern were appropriate or inappropriate would be to compare the specific details of the current production with the abstract pattern. If the two are the same in essential detail, the learner is reinforced for its behavior and the learner continues. If there is a discrepancy, the learner receives negative feedback. The behaviors that led to the discrepancy are rejected, and the learner must issue compensatory behaviors that redo specific parts of the current project. There is no magic that could permit the completely hardwired organism to perform this task with less information or information of a different type. The responses called for are uncertain and characterized by continuous variation. They must be planned, and there must be a specific criterion for determining whether they are reinforced or punished. If it is not absurd for the learner to have a content map that guides its productions for a particular pursuit, it is no more absurd for a hardwired organism to have this information. 
In the same way, there must be reinforcement and punishment to account for both learned and unlearned behavior. If the response is operant, it is a function of reinforcing consequences. From a design standpoint, this means that both the organism that learns and the one that does not must be sensitive to different behavioral outcomes. Both produce operant responses. The major design problem facing the hardwired system is how to create reflexive processes for both the agent’s motivation and information. Each process must be reflexive because whatever information is represented to the agent cannot be modified through learning. Therefore, it must be carefully keyed to specific stimulus conditions, and it must be both specific enough to provide relevant information about what changes are to occur in the current setting and general enough to apply to all settings that have particular stimulus properties.
The agent must also be motivated to plan and direct the response in the current setting. All details of the deployment of reinforcement must be accounted for reflexively. The presence of specific stimulus conditions reflexively causes a transmission of something relatively more reinforcing or less reinforcing. The system solves this design problem by enlisting primary reinforcers. The model for these is the manner in which the system processes sensation. Just as the system transforms physical features into a sensational representation of those features, the system adds a secondary sensation to those features to which the agent must attend. This secondary sensation is a creation of the system, but in a sense so are the representations of incoming stimuli. Secondary sensations are not enhancements designed to provide additional information about the features of the reception, but to provide consequences for the agent’s behavior. When predetermined physical stimuli are received, they are reflexively enhanced with secondary sensation. Although the reception may be faint from the standpoint of physical features, the enhancement makes it more salient to the agent than any other stimuli in the current setting. Secondary sensations are attached to whatever behavioral process the agent is to perform. If the agent is to plan an approach to a particular set of visual features, the agent is reinforced if it plans an approach that specifies the direction and mode of locomotion (at a minimum). The agent is then reinforced for directing and executing the plans. In other words, the agent has great interest in performing these operations and little interest in planning or executing other responses in the current setting. Secondary sensations are not sufficient. The agent cannot perform any planning or directing behavior unless it receives content information. 
The most radical departure from transmitting unbiased sensory reproductions of the physical properties received by the organism is the presentation of content maps. Like other secondary sensations, content maps are sensory creations of the system. Instead of simply enhancing features already present in the sensory reception, the system creates specific features not in the sensory present. The system reflexively attaches the information about how the current setting is to change to those stimulus features in the current setting that are relevant to the change to occur. The infrasystem uses a combination of the map and changes in the secondary or reinforcing sensations to influence the agent and provide feedback. The change in sensation is a consequence of the agent’s planning and directing behavior. Positive change reinforces the behavioral patterns that the agent directs. Negative change punishes these efforts. All the steps in these operations are reflexive. When a particular stimulus appears, the infrasystem reflexively enhances it and reflexively issues a content map. When the details of the content map change, the system reflexively issues changes in the secondary sensations. When the current setting has been modified in the way called for by the content map, the secondary sensations and content map are reflexively terminated. This termination may be reflexively linked to another step or pursuit. The infrasystem has other functions not considered in this chapter; however, they are implied. The reflexive format that the system uses to compare details of the present setting with those specified by the content map is applied to other types of comparisons the system is required to make as a response is being produced. These details are addressed in chapter 4.
Chapter 3
Agent Functions
This chapter elaborates on the two basic functions of the agent: to plan responses in the current setting and direct them. Agent and infrasystem functions are coordinated. To perform in the current setting, the agent needs information about what kind of change is to occur and how to produce responses to create the changes. Information about what kind of change is called for comes from the infrasystem through a content map. How to produce responses that create the changes is largely an agent function. The agent has a response repertoire and knowledge of the effects created by the various responses. So if the content map urges the agent to produce locomotive responses, the agent knows how to plan and direct those responses in the current setting.
OVERVIEW OF AGENT FUNCTIONS
The analysis in chapter 2 identified the two major infrasystem functions of the hardwired organism that produced operant responses: (a) modifying specific sensory receptions by enhancing them with reinforcing sensations, and (b) issuing content maps. The infrasystem reflexively adds secondary sensations to particular stimulus patterns and reflexively presents the content map that describes the general outcomes or goals and the general response strategy the agent is to apply in changing the current setting in a way implied by the goal. Because these two functions are reflexive, the product of these functions is the same each time specific sensory conditions are present. The role of the agent is to apply the infrasystem information to the specific setting. Operationally described, the agent consists of those functions
that cannot be accounted for by the infrasystem (those that cannot be keyed solely to an antecedent stimulus). The primary purpose of these agent functions is to produce operant responses—that is, responses that are at least partly a function of consequences. Operant responses, particularly those that involve uncertainty and continuous variation, require both planning and directing on the part of the agent. The plan formed by the agent must be isomorphic in some ways with the directive to respond. If the plan indicates one pattern and the directive does not specify this pattern, the plan will not work. Furthermore, if the responses are influenced by consequences, then the behavior that led to the planning of those responses is also influenced by consequences. Therefore, the agent is designed so that it is sensitive to consequences. These consequences are in the form of sensation, which is reflexively issued by the infrasystem as specific details of the setting change. To plan a response, the agent must have knowledge of the specific changes that are to occur in the current setting and knowledge of its capacity for producing responses that will change the current setting. Knowledge of the specific changes is in the form of content maps issued by the infrasystem. The agent applies the content to the current setting. If the content map provides a general directive, such as “move to intensify stimulus X,” the map is applied to the current setting by taking into account that which is not specified by the map. The map does not name the type of locomotion, direction, or other details that would have to be in place before the organism could produce a response in the current setting. The agent secures this information to create a plan. The agent must also have knowledge of the responses that are available to apply to the current setting. The response produced will involve specific voluntary behaviors. These must be planned before they are produced. 
Part of planning would involve some form of abstract code that specified responses in a way that prevented each from getting confused with the others. The agent would logically need knowledge of the code for the different responses. If the response that will intensify X involves turning to the left, the agent must specify this response before it is produced. Therefore, the agent needs something of a menu of responses that are available and knowledge of how to transform patterns of responses by adjusting the weight, effort, and other dimensions required for different applications. In summary, the agent decodes the content map so that the agent knows what to do to receive some form of positive consequence. The agent secures information about the present setting that is needed to transform the content map into a situation-specific plan. The agent codes the details of the response to be produced into a plan that is consistent with the general provisions of the content map. The agent directs the response. The infrasystem produces the response.
Inferring Agent Functions From Behavior
The basic properties of the agent’s functions are inferred from those behavioral components that are not the same across a range of instances that involve a particular content map. Those details of the responses that are the same on all occasions imply behavior referenced to specific provisions of the content map. If, under specific stimulus conditions, the organism always performs some behavior that brings it into contact with X, this aspect of behavior is inferred to be a function of the content map. This aspect is always the same, and it occurs strictly as a function of antecedent stimulus conditions. The fact that it is not a specific behavior does not matter. The system is assumed to be capable of reflexively issuing general directives that do not contain enough detail to permit them to determine the precise responses that occur in any setting. However, the general directives are specific enough to control the common features observed in the various applications of the content map. The variation in specificity of these general directives can be observed in the construction of an anthill or the nest of a particular species of birds. Both the anthill and nest are the product of behaviors. Some of the behaviors are the same on all occasions, and some are different. If certain steps are common to all applications, these steps imply information issued by a content map. For instance, if the ants always construct one type of chamber before constructing another, this sequence is part of the content map. If all details of the behavior patterns observed were the same on all occasions of specific antecedent conditions, all of the behavior could be attributed to the content map. 
The extent to which there are differences in behavior from one application to the next determines the scope and extent of the behavioral components that are not reflexive (not completely controlled by the content map) but that are the product of agent behavior. Observations reveal that the samenesses are in terms of objectives or goals and the general strategies for achieving the goal. These samenesses describe the content map. The differences from one occasion to the next have to do with the interaction of the content map with the details of the current setting. The map is customized. The variation in customization across the range of applications describes the agent functions.
INTERACTION OF CONTENT MAP AND AGENT
As noted in chapter 1, the influence of the infrasystem on the agent is indirect. None of the provisions of the content map can actually cause specific behavior because none of the provisions describes the specific behavior that would have to occur in the current setting. Because the infrasystem
cannot specify the situation-specific responses through a content map, it does not cause the behavior. Rather it causes sensations that influence the agent. The infrasystem issues sensations that are more positive if the setting changes in ways specified by the content map and more negative if the targeted changes do not occur. The changes in sensation reinforce the agent for complying with the map requirements and punish the agent for not complying.
Cycle of Specificity
The interaction between infrasystem and agent is a cycle that progressively increases in specificity. The cycle assumes not only that the agent is able to decode the general information provided by the infrasystem content map, but that the infrasystem is able to decode the agent’s directive to respond. Figure 3.1 shows this relationship of the cycle of specificity. The content map is the most general form (i.e., “Move to intensify X”). For any transformation of this general directive into a response directive, the agent must be able to decode it and take the steps necessary to issue a specific-response directive consistent with the content map. For instance, after receiving and decoding the content map, the agent identifies where X is in the current setting and constructs a plan to move in that direction (D) at a particular rate (++) and with a particular walking pattern (P). This plan is far more specific than “Move to intensify X.” However, it is not specific enough to activate the various muscles and sequences of movement that are required for this response to occur. Rather, the directive is issued for the infrasystem to take these descriptions (moving in direction D, walking pattern P, rate ++) and create the actual responses. The actual responses are far more specific than the directive. The agent directive did not have to specify every step (right front forward, left rear forward, left front forward, right rear forward) and the coordination of leg muscles throughout the time that the response pattern is being produced. This is an infrasystem function. The cycle of specificity goes from general to specific, with the infrasystem both issuing the most general directive and creating the most specific application.
FIG. 3.1. Cycle of specificity.
A critical content issue is how specific the content map and directive are. The content map would be designed so the information is no more general than is necessary for the range of variability to which the map is to apply. The directives to the infrasystem would be designed so the information is no more specific than necessary to specify the response that the agent projects. This arrangement gives the agent the most specific input about what to do. At the same time, it permits the agent to specify response units that serve as relatively large lumps. If the content map is to apply to situations in which a wide variety of locomotive responses are possible, the content map would necessarily be general enough that it did not specify one type of locomotion. Just as specifying a type of locomotion would be too specific, failing to specify that locomotive responses were necessary would make the content map too general. The map would not alert the agent to the fact that responses like lying down, head turning, or making vocalizations would not lead to intensifying X. The directive to respond is most efficiently designed if it is no more specific than is necessary for the current setting. If the current setting requires a specific type of walking or walking that may change every few steps, the specification would necessarily not be a global “Walk.” If the directive was simply “Walk,” it would fail. So it must be specific enough to accommodate the projected consequences of the behavior. However, if “Walk” would be sufficient for the pattern needed in the current setting, specifications more precise than “Walk” are not needed. 
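The three levels of the cycle can be laid out as a small sketch. It is only a schematic of the division of labor described above. The dictionary keys, the `bearing_to_X` field, and the functions `agent_plan` and `infrasystem_execute` are hypothetical names introduced for illustration; the gait order is the one given in the text.

```python
# Most general level: issued reflexively by the infrasystem.
CONTENT_MAP = {"response_class": "move", "purpose": "intensify", "referent": "X"}


def agent_plan(content_map, setting):
    """Agent level: fill in what the map leaves unspecified.

    The map names no direction, rate, or pattern; the agent secures these
    from the current setting (hypothetical fields) to form a directive.
    """
    assert content_map["response_class"] == "move"
    return {
        "direction": setting["bearing_to_X"],  # D
        "pattern": "walk",                     # P
        "rate": "++",
    }


def infrasystem_execute(directive, n_steps=4):
    """Most specific level: expand the directive into individual steps."""
    gait = ["right front", "left rear", "left front", "right rear"]
    return [
        f"{gait[i % len(gait)]} forward, heading {directive['direction']}"
        for i in range(n_steps)
    ]


directive = agent_plan(CONTENT_MAP, {"bearing_to_X": 30})
steps = infrasystem_execute(directive)
print(directive)
print(steps)
```

The directive carries only three descriptions, and the individual leg movements appear only at the last stage, which mirrors the point that the agent need not specify every step.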
Reflexive Components of Responses
The specifications that the agent uses to direct the response correspond to the various reflexes that the infrasystem uses to transform directives into behavior. If the agent directs rate ++, the infrasystem must be able to respond to this directive as it would to any other antecedent stimulus that triggers a reflex. Rate ++ describes a dimension of a repetitive pattern. The infrasystem must be able to incorporate this feature into the response. For some pursuits, part of the behavior may be largely reflexive at least momentarily. The striking behavior of the salamander and the stinging behavior of the bee have moments that are clearly as reflexive as a sneeze or an orgasm. The antecedent stimulus conditions that cause these responses are necessarily precise and detailed. Before the bee can reflexively sting, how did the bee get into position to sting? The answer is through operant behaviors. Similarly, operant behaviors permitted the salamander to be in a position to strike. Although behavior may have reflexive components or moments that are apparently governed by reflexes, most behavior that has been identified as reflexive actually consists of significant components that are the product of agent planning. The scratching reflex provides a good example. It is stimulated by particular sensations. (If one scratches a dog’s back, the dog will often produce scratching responses while the leg thumps against the floor.) When the dog scratches itself, however, the reflex is not the total behavior that is produced, but rather a component embedded in operant responses that account for the (a) posture of the animal, (b) specific location scratched, (c) rate and amplitude of the scratching responses, and (d) duration of the scratching. There are certainly aspects of the scratching behavior that are the same on all occasions and therefore are reflexive. However, given the details of the operant behavior that are needed to deploy the scratching, an inference is that the scratching is a response in the animal’s repertoire that is greatly enhanced with positive sensations under specific stimulus conditions. The actual scratching pattern that is produced, once the posture is specified and the leg is directed to the locus of sensation to be scratched, is probably the result of a directive issued as “Scratch.” The infrasystem produces the pattern as the agent plan directs the components that guide the pattern to the targeted location. Another example is the specific posture a shark assumes when it is aggressive: pectoral fins flexed and back humped. If this posture is unique to the behavior of making attack passes at a potential prey (or enemy), the content map for attack must provide a specification for the posture. The posture must be part of the agent’s response repertoire because, although it occurs during attacks, its deployment is variable in other maneuvers. 
The variability means that the content map describes the strategy and general pattern of movements (the quick turns and other behaviors that are the same on all applications) and reinforces the agent for assuming the posture when making “passes” at a potential prey. The posture is part of a behavioral package that the agent issues. “Pass ++” could specify both the specific character of the pass and the aggressive posture. Certainly the posture has reflexive components; however, the variability in both degree and correlation with specific maneuvers implies agent functions. The shark’s behavior is not a reflex, but the product of a plan that specifies reflexive components.
PLANNING RESPONSES

The analysis of variability (the aspects of response production that cannot be accounted for by the infrasystem’s content map) implies three major roles for the agent:
1. formulate plans for specific operant responses,
2. direct behaviors consistent with a plan, and
3. adjust the plan on the basis of incoming sensory reception.

Formulating Plans

The planning of the response is possibly the most elaborate part of the process; certainly it is the sine qua non of performance. Furthermore, this planning starts with a content map that must refer to three classes of variables that govern any plan: (a) response class that is to be produced, (b) purpose of the response (the specific variable that is to change as a function of the response), and (c) discriminative stimulus that serves as the reference point for the behaviors.

Response-Planning Templates. For the map for “Move to approach X,” the response class is “move,” the purpose is approach, and the referent is X. Both the purpose and referent (discriminative stimulus) are completely specified by the map. The unspecified variable is the specific response that the agent will plan in order to move. The content map could be designed to prompt the planning by presenting a response-planning template, which would consist of a series of menus for the response options that are consistent with the general response class to be performed. If the content map called for the response “Move,” the various modes of locomotion in the agent’s repertoire would be presented to the agent as options for planning the response. Decisions are required if one of the options is to be selected. Those decisions are agent decisions. If the options are walk, run, swim, and climb, these options would be presented to the agent by the response-planning template. Just as the agent is assumed to be capable of decoding and producing response features named by the content map, the agent would have to be capable of decoding and encoding any response option presented by the menu. Table 3.1 presents a simplified diagram for some of the key decisions that would have to be made by the agent.
The form of the menu indicates the nature of the variables, but the listings are only examples of what a particular menu would provide. The table shows three classes of variables: response variables (modes of locomotion and features of each mode [Columns 1 and 2]), purpose (Column 3), and referent (Column 4). Columns 1 and 2 list different choices. Columns 3 and 4 show that the purpose and referent for “Move to approach X” are the same for all combinations of response variable choices. For different content maps that involve moving, the same response variables would be presented. However, the purpose and referent would change—for instance, “Move to escape from stimulus K,” “Move to circle
TABLE 3.1
Initial Response Planning

Response Variables                                    Purpose        Referent
Locomotive Variables    Motor Response Variables
Move                                                  To approach    X
  Run                   Rate, direction, posture      To approach    X
  Walk                  Rate, direction, posture      To approach    X
  Swim                  Rate, direction, posture      To approach    X
  Climb                 Rate, direction, posture      To approach    X
B,” or any other map that would take the form “Move to ____.” Also, the variables in Column 2 would be presented in any content map that required a motor response other than a locomotive response (“Grab to secure X,” “Bite to eat K,” and so forth). All these behaviors would vary in rate, direction, and other motor response features. The selection of values for the motor response variables (Column 2) for any particular mode of locomotion (Column 1) would be based on the current situation, which provides information about the relative position of X and other details that influence the response decision. The direction specified for the locomotive response would be referenced to where X is. The rate would be referenced to the behavior of X and to the route that approaches X. If X is moving away, a faster rate may be necessary. If the terrain is hazardous or complicated, a slower rate is implied. For each motor-response variable, there would be a menu of possible settings. The rate menu would present the range of possible rates that are under the agent’s control. The agent would select one. The direction could be referenced to the deviation of the visual reception of X from the organism’s current midline (e.g., 14 degrees to the right of midline). The response is not possible until these variables have been selected. If the rate is not planned, the response is not possible. The same is true of posture (uphill, downhill, close to the ground, and so on). When all the motor-response variables have been selected, the content map has been transformed into a situation-specific response plan. Table 3.2 shows a plan that is the result of the decision to walk. The plan specifies the selected mode of locomotion (walking), the specific rate (represented as ++), a specific relative direction (+14°), and specific posture (P1).
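The transformation of a blank template into a situation-specific plan can be pictured with a small sketch. It is only an illustration under stated assumptions: the class name ResponseTemplate, its fields, and the two menus are hypothetical and are not part of the authors' formulation. The sketch shows that the purpose and referent arrive fixed from the content map, whereas the mode of locomotion and the motor values stay open until the agent selects them.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical menus: the text names these options but prescribes no data format.
LOCOMOTION_MENU = ("run", "walk", "swim", "climb")
MOTOR_VARIABLES = ("rate", "direction", "posture")


@dataclass
class ResponseTemplate:
    purpose: str                      # fixed by the content map, e.g. "approach"
    referent: str                     # fixed by the content map, e.g. "X"
    mode: Optional[str] = None        # agent selects from LOCOMOTION_MENU
    settings: dict = field(default_factory=dict)  # agent-selected motor values

    def select_mode(self, mode: str) -> None:
        if mode not in LOCOMOTION_MENU:
            raise ValueError(f"{mode!r} is not in the locomotion menu")
        self.mode = mode

    def missing(self) -> list:
        """List the variables that still have to be planned."""
        gaps = [] if self.mode else ["mode"]
        return gaps + [v for v in MOTOR_VARIABLES if v not in self.settings]

    def is_plan(self) -> bool:
        # The response is not possible until every variable has been selected.
        return not self.missing()


# "Move to approach X", completed with the values the text gives for walking.
template = ResponseTemplate(purpose="approach", referent="X")
blank_gaps = template.missing()   # everything is open on the blank template
template.select_mode("walk")
template.settings.update(rate="++", direction=14, posture="P1")
```

Once missing() returns an empty list, the template functions as a plan; until then, no response is possible.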
The purpose and referent of the response are “to approach X,” which means that the response is to be evaluated according to whether it brings the organism closer to X. Both the mode of locomotion and values for the motor variables are influenced by details of the setting. The general principle is that, if there are features of the setting that logically influence some feature of the response, the organism that produces
TABLE 3.2
Response Planning Based on Walking

Response Variables                                            Purpose        Referent
Locomotive Variables    Motor Response Variables
Move                                                          To approach    X
  Run                   Rate, direction, posture              To approach    X
  Walk                  Rate++, direction+14°, posture P1     To approach    X
  Swim                  Rate, direction, posture              To approach    X
  Climb                 Rate, direction, posture              To approach    X
TABLE 3.3
Response Planning With Detailed Content Map

Response Variables                                    Purpose        Referent
Locomotive Variables    Motor Response Variables
Walk                                                  To approach    X
  Walk                  Rate, direction, posture      To approach    X
successful responses takes these setting features into account and plans the response variables accordingly.

In response to the content map “Move to approach X,” the response-planning template shown in Table 3.2 required choices about the mode of locomotion as well as choices about the features of the response. If the infrasystem issued a different content map, one that required the organism to “Walk to approach X,” rather than “Move to approach X,” the response-planning template would not show the locomotive options. It would provide only those options that apply given that walking has been specified. Table 3.3 presents a template like the original except it does not provide options for locomotion. However, it requires the same decisions that the content map “Move to approach X” presented once the agent selected the option of “Walk.” Whether the option is preselected by the content map or selected by the agent, the subsequent series of decisions is the same. The agent must still make decisions about the rate, direction, and posture.

Just as it is possible for one content map to specify “Move” and another to specify “Walk,” a different content map could specify a particular type of walking. In this case, the presence of X would result in a content map that requires still fewer decisions by the agent. Table 3.4 shows a template for the content map “Walk with posture P1 to approach X.” The only variables that the agent is to set are those for rate and direction.
TABLE 3.4
Response Planning With Greater Detail in Content Map

Response Variables                                    Purpose        Referent
Locomotive Variables    Motor Response Variables
Walk with posture P1                                  To approach    X
  Walk                  Rate, direction               To approach    X
The shark that produces a unique posture when attacking would be influenced by a template similar to the one in Table 3.4. The content map would specify the posture required for the attack mode. The agent would specify rate and direction. The four planning templates described in Tables 3.1 through 3.4 for approaching X could generate exactly the same behavior on particular occasions. For example, walking in direction -20° at rate ++ with posture P5 could be the result of a detailed content map, such as “Walk at rate ++ and posture P5 to approach A,” or a more general content map, such as “Walk to approach X.” This behavior could also be the result of a content map even more general than “Move to approach X,” such as “Move to approach X or Y.” With this content map, the agent must decide which entity to approach before specifying the details of the approach. In each example, the more general the content map, the greater the number of agent decisions. However, the response that is planned would have to be specified in the same amount of detail regardless of the process that evoked it. The difference is simply in the extent to which choices are possible within a particular decision-making template. Certainly it would be possible to design the interaction of template and agent so that the agent had little more to do than fill in the blanks by attending to the surroundings and providing the site-specific information. It would also be possible to design the interaction so there were few scaffolds or prompts to guide the agent. Instead of being presented with an extensive menu in the presence of each discriminative antecedent stimulus for each different kind of approach (sex, hunger, construction, attacking, etc.), the agent would be directed only broadly (e.g., to “Approach X”). For this interaction, more of the information about how to do that would reside with the agent’s repertoire. The advantage of this system is economy. 
If there are 12 different content maps that call for basically the same locomotion choices, the capacity to call for the menu could be made an agent function, in which case the menu would reside with the agent as knowledge of its repertoire for locomotion. Therefore, it would not have to be repeated for each of the 12 applications. Rather, the infrasystem would urge only the planning of a locomotive response.
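The trade-off between the specificity of the content map and the number of agent decisions can be made concrete with a rough sketch. The variable names, the dictionary representation of a content map, and the helper functions below are assumptions introduced for illustration only. The point of the sketch is that four maps of differing generality leave different numbers of decisions to the agent, yet all four can end in the same fully specified response.

```python
# Every response must end up specifying all of these (assumed variable names).
REQUIRED = ("referent", "mode", "rate", "direction", "posture")


def agent_decisions(content_map: dict) -> list:
    """Variables the content map leaves open, which the agent must decide."""
    return [v for v in REQUIRED if v not in content_map]


def complete(content_map: dict, agent_choices: dict) -> dict:
    """Merge map-specified and agent-specified values into one full plan."""
    plan = {**agent_choices, **content_map}  # values fixed by the map take precedence
    unset = [v for v in REQUIRED if v not in plan]
    if unset:
        raise ValueError(f"response not fully planned: {unset}")
    return plan


# Hypothetical encodings of four content maps, from most general to most specific.
maps = {
    "Move to approach X or Y": {},
    "Move to approach X": {"referent": "X"},
    "Walk to approach X": {"referent": "X", "mode": "walk"},
    "Walk with posture P5 to approach X": {"referent": "X", "mode": "walk", "posture": "P5"},
}
# What the agent would choose if every decision were left to it.
choices = {"referent": "X", "mode": "walk", "rate": "++", "direction": -20, "posture": "P5"}

decision_counts = {name: len(agent_decisions(m)) for name, m in maps.items()}
plans = [complete(m, choices) for m in maps.values()]
```

Under these assumptions the decision counts fall from five to two as the map becomes more detailed, while the completed plans are identical.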
In other words, there is a certain amount of information needed for the response to be planned and executed. Some of that information may be prompted by menus. If less prompting is provided by menus, the agent must be designed so that it is more sophisticated and has the menu-selecting function as part of its performance repertoire. Whether a specific content map provides relatively specific or general directives, however, the agent completes the plan so that a fully specified response is possible in the current setting.

Some form of planning template is logically required for the infrasystem to be able to respond to the agent’s planning efforts. The template provides the infrasystem with a mechanical checklist for altering the reinforcing sensations transmitted to the agent. When the blank template is presented, the infrasystem presents urgings for the agent to perform the planning. As the agent provides information about each item in the template, a positive sensation is issued for that item. The process continues until all the details of the planned response are specified. Without some form of template or criterion for whether the response was completely planned, the infrasystem would have no basis for altering the reinforcing sensations transmitted to the agent, and the agent would have no basis for determining whether all the variables of the response had been specified. The agent runs on sensation. The infrasystem must be able to control that sensation if the agent is to apply wired-in knowledge to specific situations.

Planning Templates and Decisions

There are three basic requirements for the design of content maps and agent’s planning templates for the hardwired system:

1. The agent is able to decode the requirements of the content map. (If the agent cannot decode it, it cannot produce the planning and directing responses that it calls for.)
2. The agent uses some form of planning template that is issued in connection with the content map for planning those response details not specified by the content map. (If the response details are not planned, the response cannot be produced.)
3. The system is able to produce responses that are planned whether they are required by the content map or specified by the agent. (If the responses encoded by the agent cannot be decoded by the infrasystem, the directive to respond cannot be transformed into a response with the specified features.)

The overall implication of these requirements is that the agent, whether that of a simple organism or a sophisticated one, makes decisions.
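The checklist role of the template can be sketched as follows. The numeric sensation values (+1 for a planned item, -1 for a blank one) and the function names are assumptions chosen only to make the mechanism visible; the text specifies that sensations change as items are filled in, not how they are scaled.

```python
def sensations(template: dict) -> dict:
    """Per-item sensation: +1 once an item is planned, -1 (urging) while blank."""
    return {item: (1 if value is not None else -1) for item, value in template.items()}


def fully_planned(template: dict) -> bool:
    """Criterion the infrasystem needs in order to stop urging further planning."""
    return all(value is not None for value in template.values())


template = {"rate": None, "direction": None, "posture": None}  # blank template
history = [sum(sensations(template).values())]  # net sensation before any planning

# The agent supplies one item at a time; the net sensation rises with each item.
for item, value in [("direction", 14), ("rate", "++"), ("posture", "P1")]:
    template[item] = value
    history.append(sum(sensations(template).values()))
```

In this sketch the net sensation moves from wholly negative to wholly positive as the items are supplied, and fully_planned supplies the completion criterion.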
Decision making is the basic reason for the agent to exist. Decisions must be made before a response is issued. The agent of a simpler organism makes choices, but each choice is a decision about a detail of the response strategy that will be produced. The agent of more sophisticated organisms makes selections that are not necessarily presented as entries on a unique menu that is presented only in connection with a specific pursuit. These are decisions that are more elaborate. For the simpler organism, the decision requires one step—fill in the menu. For the more sophisticated organism, the decision involves two steps—select a menu from the repertoire and then fill in the menu. In either case, the agent creates a plan consistent with the broad requirements of the content map. The content map urges a particular outcome; the agent assesses the features of the current setting and specifies the pattern and features of the pattern that achieve the outcome. Logically, the fact that a gap exists between what could possibly be programmed reflexively and the observed behavior implies the specific decisions that are needed. Without these and without a nonmagical basis for their formulation, the system could not possibly perform. So the basic question about organisms like spiders making decisions is not whether they make decisions, but how they make them and how the system is designed to provide the decision maker with guidance about what to attend to and what to plan. Options From the Agent’s Perspective. The presentation of the content map and response-planning template to the agent occurs within the context of sensation. For the agent, the decision making would not be an exercise in logic, but what amounts to insight. The planning of the details would not be an intellectual activity, but something that had to be done. The rule for the agent’s planning performance is that if the agent knows something on the menu, that insight is reinforcing. 
If the agent does not know, the lack of knowledge is negative (frustrating) and demands further mobilization of attention. The menu would be presented to the agent in the form of “Find out ____” (i.e., find out where X is or figure out how to get there). There would be a sensation of great urgency connected with the various menu items. The discovery of each answer would be reinforcing. So, in effect, the organism would experience something that is functionally equivalent to the following scenario: “What’s that? Was that X? Look again. Where is it? Aha, there it is. It is X. Oh boy. I’ll go over there and get it. How will I get over there? I’ll walk. How fast will I walk? I’ll walk very slowly because it’s slippery. Here I go.” Certainly, the colorful monologue would not be words, but the relative sensations and changes in sensation associated with the planning, and the directive of the response (“Here I go”) would be the same as those implied by the monologue. Also the implied awareness would be the same. The
agent would definitely know where X was, feel the urgency to approach X, and have knowledge of the ambient details of the setting that are associated with and influence details of the approach toward X. The agent would be aware of the plan (walking over there) and the properties of the approach strategy (walk very slowly). Also as every detail of the plan is completed, the agent would experience more positive sensations. Although the agent would have to account for the various details of the response before it could be executed, the process would not necessarily be linear. The agent may be able to specify all the details of the plan simultaneously. Relationship of Response-Planning Template to Content Map. The information presented reflexively to the agent when the specific stimulus conditions occur must provide the agent both some kind of information about what has to change in the current setting (Approach X) and some kind of model that serves as a basis for feedback. Chapter 2 presented several models. For instance, the bee that is building cells within a beehive needs a kind of blueprint of the cell and a series of construction steps. For a simple example, such as approaching X, the map could present a representation of a straight line route with no impediments to X. The route could be something like a visual tunnel from agent to X. The ideal route would be projected as something of an overlay on the current sensory data. The sensory details outside the tunnel would be visible. The ones inside the tunnel, however, would be enhanced so they would serve as the primary basis for prompting decisions about the response. If there were obstacles between the agent and X, these would be highlighted as deviations from ideal. Each enhanced detail within the tunnel would require some form of response planning. Instead of just being able to walk, the agent would have to modify the walking to accommodate the obstacles. 
The straight-line tunnel (or functional equivalent) would provide for automatic replanning of the approach route. Let us say that an impediment required the agent to plan the first part of the route so it went 15 feet to the right of a straight-line route. The new view of the target implies a new straight-line route. It also provides information about the obstacles that are in this route. Even smaller deviations from the ideal require response adjustments. If the route has an irregular surface, the detail would be highlighted and require adjustments in the response. To correlate the relationship of the deviation with a response that compensates for the deviation, the agent would need knowledge that is independent of the content map. If the map highlighted an impediment in the path, the enhancement would urge the agent to adjust the response. However, the agent would have to possess knowledge about how to do that. The agent would have to know which features of the response need adjustment to accommodate the deviant feature.
Because the response involves continuous variation, the model of the route would have to be dynamic so that, once the organism is in motion, the information about the relationship of organism to goal is current. For instance, after the organism starts uphill, the model would not show the horizontal component as a discrepancy. The implication would be that the pattern of locomotion selected will continue to be appropriate so long as the incline is about the same. Without this provision, the horizontal component would be a discrepancy that required further adjustment every moment, although there is no discrepancy between the current effort and the details of the setting.

Default Values in the Planning Template. Because many features of the response would be the same on all occasions, the system could be designed so that default values were possible. This process would require the agent to plan only those response features implied by deviations from the ideal, not the entire response. For any sensory details that do not deviate from the ideal, the default value would be issued automatically or reflexively. The agent would know the value, but would not have to set it. For instance, if the agent observes that the current setting permits a straight-line route to X over level ground, the agent would simply have to issue the directive, “Walk over there,” and not specify the settings for rate, posture, or other details of the pattern. The entire default package is set and the response is completely planned. If the default response is walking at rate R with posture P, these values would be automatically issued in response to the agent’s decision to “Walk over there.” If X moves and only the direction toward X is at odds with the ideal, only the direction would have to be specified. As the number of deviations from ideal increases, the percentage of the response that could be governed by default values decreases.
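A minimal sketch of the default provision follows. The particular setting details (surface, path, bearing), the mapping from each detail to a response feature, and the default values are all hypothetical; they stand in for whatever the content map and the organism's repertoire would actually supply. The sketch shows only the logic described above: details that match the ideal receive default values, and the agent plans only the features tied to deviations.

```python
# Ideal setting projected by the content map (assumed details and values).
IDEAL = {"surface": "level", "path": "clear", "bearing": 0}
# Default package for walking (assumed): rate R, posture P, straight ahead.
DEFAULTS = {"rate": "R", "posture": "P", "direction": 0}
# Which response feature each setting detail bears on (assumed mapping).
AFFECTS = {"surface": "posture", "path": "rate", "bearing": "direction"}


def plan_response(observed: dict, agent_choice):
    """Return (plan, features the agent had to plan) for the observed setting."""
    plan, planned_by_agent = {}, []
    for detail, ideal_value in IDEAL.items():
        feature = AFFECTS[detail]
        if observed[detail] == ideal_value:
            plan[feature] = DEFAULTS[feature]          # issued reflexively
        else:
            plan[feature] = agent_choice(feature, observed[detail])
            planned_by_agent.append(feature)           # deviation: agent decides
    return plan, planned_by_agent


def aim_at(feature, observed_value):
    # Stand-in for an agent decision: simply adopt the observed bearing.
    return observed_value


# Straight-line route over level ground: "Walk over there" needs no settings.
plan_a, by_agent_a = plan_response({"surface": "level", "path": "clear", "bearing": 0}, aim_at)
# X has moved 14 degrees to the right: only direction must be specified.
plan_b, by_agent_b = plan_response({"surface": "level", "path": "clear", "bearing": 14}, aim_at)
```

As more details of the observed setting depart from the ideal, the list of agent-planned features grows and the share of the plan covered by defaults shrinks.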
This default format is efficient because all values that are specified for the no-deviation conditions may be engineered to be reflexively activated from a single directive. The system simply compares the ideal that is specified by the content map with the sensory present and issues a response default value for all response features associated with details of the setting that do not deviate from the ideal. For a response like walking, the direction for the response is always at odds with the ideal and must be specified for the initial plan and as the response is being produced.

DIRECTIVES TO RESPOND

The plan determines the responses that will be directed. The directive to respond is simply a transformation of the plan into a command that corresponds to the details of the plan. If the plan is to run in direction S, at rate ++, with posture D, the directive or command to respond indicates, “Create running in direction S at rate ++ with posture D.”
The agent necessarily issues this directive. The directive cannot derive in an unmediated manner from the plan because the plan is simply a description of what will happen. The directive to respond is an imperative. Imperatives logically cannot derive from plans or statements of what will occur. Therefore, an agent function is implied to mediate between plan and directive. The agent receives secondary sensations that urge it to produce responses that comply with the plan. The agent first creates the plan and then issues a response directive based on the plan. As noted previously, the basic assumption of conservation or efficiency is that the agent issues the most general directives that are consistent with the sensory conditions. Even if behavioral observations disclose that the agent is capable of guiding or moving legs individually, the agent would not engage in this level of specificity when directing something like walking. The system has to be designed so the various default provisions permit the agent to direct the largest unit of behavior that, once directed, can be processed automatically by the infrasystem. After a plan is formulated, the agent would be able to produce the behavior simply by issuing the directive that indicated “Do the plan.” All values of the plan are issued as being set until otherwise directed. The pattern of the response will continue until terminated or adjusted. The agent will not have to attend to the details of the response unless the agent receives information that the plan is failing. These provisions result in efficiency because the agent necessarily attends to multiple features of the setting and the response when planning. 
If we assume that the agent has only so much attention, the more attention the agent devotes to the components of the responses, either during the initial planning or while the response is being performed, the less attention the agent will have for other details of the setting that may be relevant to the response. The willing or ordering of the response is a parallel to the willing or ordering of the plan. The only difference is that one directive results in a covert outcome (the plan) and the other in an overt response. For the plan, the agent’s decision may either be a choice (“Choose option R”) or a specification (“Create option R”). Neither form of willing implies unbridled free will. It represents a logically inescapable function. At some point after the planning, the response directive must be issued in a way that results in a particular response. The agent is responsible for directing these operant responses.

ADJUSTING RESPONSES

The response is adjusted if it does not comply with the content map and therefore does not achieve its projected outcome. For an adjustment to be possible, the agent must receive information about whether the current
sensory data indicates that the response, as planned, is (a) being executed as planned, and (b) succeeding. For the system to make this determination, some version of the plan must be transported through time. The plan occurred at Time T. The determination of whether the response is proceeding as specified occurs at Time T+N. The determination can occur only if some manifestation of the plan is available at Time T+N. However, more than just the manifestation is needed. To evaluate the success of the response by comparing the current sensory input with some form of the plan, the projection must also have specified the sensory input that will occur if the response is proceeding according to specifications. If the response complies with the plan, visual images will change in particular ways, and specific proprioceptive sensations will occur. The projection of what will occur must indicate or imply key features of this future transformation. Ongoing comparisons would be needed to determine whether the response pattern is consistent with the projection. If there is no discrepancy between projected sensation and current sensation, the response pattern continues. Any deviations imply adjustments in response. A record of the specific response settings the plan called for must also be retained over time. Without information about the setting for the various components of the response, the system would not have information about how to make changes if a discrepancy occurs between sensory and projected sensory outcomes. If the rate had been set at ++ and the system concludes that this rate is inadequate and must be slightly faster, the original rate must be increased. If the record of the original rate (++) is not available at this time, the system would have no basis for increasing the rate. The agent might specify rate ++ again unless the information was available that the current rate was in fact ++. 
There are different possible solutions to this problem, one of which would be a kind of default provision that permitted the agent to adjust the difference in rate (relative values) without being concerned with the absolute values. The infrasystem would have the information that the current rate is ++. The adjustment of the response requires only increasing whatever this rate is to one that is a little faster. Therefore, the agent could be designed so that it would not need knowledge of the absolute values once they are set. In this way, the agent would not be burdened with information that could be processed by the infrasystem and would be able to direct only the component of the ongoing response that was to change—in effect issuing a directive that functions as “Go + faster.” In any case, however, conservative adjustments would not be possible without information about the response that was directed (i.e., a record of the plan). The context for the response has some uncertainty, which means that the plan may have to be adjusted because the initial settings for response
components were incorrect or because the setting changed in unanticipated ways as the response is produced. Incorrect settings require adjustments in the values assigned to the different response components so they achieve what the original plan assumed they would achieve. An example of incorrect settings would occur if rate ++ was planned and directed, but the rate was not actually achieved because the surface was sticky, the route was more vertical than the plan anticipated, or some other form of information was lacking. In any case, the anticipation of moving at rate ++ would be maintained, but more effort would be added to produce the outcome. Unanticipated changes in the setting require adjustments—not because the response didn’t occur as planned, but because the original plan is no longer appropriate for the current setting. Some details of the plan no longer apply to the current setting and must be replaced. If the direction is no longer appropriate, direction must change. For either kind of problem, there must be an adjustment of the directives that the agent issues. All adjustments follow a variation of the format used to direct the response initially, except that a conservative adjustment implies adjustments only in those features of the response that are no longer appropriate. The agent is now moving or producing whatever response had been planned. The magnitude and direction of the deviation between projected and realized sensation implies the magnitude and direction of the adjustment. As with the initial planning, adjustments are assumed to be conservative, which means that the plan for the adjustment addresses only those aspects of the response that deviate from projection. If too many critical details of the response are at odds with the current sensational patterns, the current plan would be rejected and a replacement plan formulated. 
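The logic of a conservative adjustment can be sketched numerically. Everything quantitative here is an assumption made for illustration: features are represented as numbers, and the tolerance and the limit on the number of deviating features are arbitrary. The sketch computes a relative correction only for the features whose realized value departs from the projection, and signals that a replacement plan is needed when too many features deviate.

```python
def adjust(projected: dict, realized: dict, tolerance: float = 0.1, max_deviations: int = 2):
    """Return relative corrections for deviating features, or None to replan.

    The sign and size of each correction follow the sign and size of the
    deviation between projected and realized values.
    """
    corrections = {
        feature: projected[feature] - realized[feature]
        for feature in projected
        if abs(projected[feature] - realized[feature]) > tolerance
    }
    if len(corrections) > max_deviations:
        return None  # too much of the plan is at odds with the setting: replace it
    return corrections


projected = {"rate": 2.0, "direction": 14.0, "posture": 1.0}

# Sticky surface: the planned rate was not achieved, so only rate is corrected.
sticky = adjust(projected, {"rate": 1.5, "direction": 14.0, "posture": 1.0})

# Nearly every feature is off: the current plan is rejected.
collapsed = adjust(projected, {"rate": 0.5, "direction": -30.0, "posture": 3.0})
```

Because the corrections are relative, the agent in this sketch never needs the absolute settings; it issues only the equivalent of “Go + faster,” and the record of the plan supplies the baseline.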
These ongoing adjustments of plans assume not only that the plans are carried forward in time and compared to projections, but also that the content map persists in time. The assumption is that as long as the discriminative stimulus is present, the content map is present. If this were not the case, the agent would have no criteria for making adjustments in the response.

For example, the organism is attempting to approach X with a straight-line route that traverses a steep, sandy slope. The organism tends to slide a little with each step. The minimum adjustment is for the agent to continue directing the same straight-line route by adjusting the component responses to compensate for the sliding. The agent directs an orientation that faces more uphill and responses that move more in the uphill direction. Ideally, the adjustment results in the original straight-line route to X. If this adjustment works, no further adjustments are needed, and the other components of the plan (the rate, the general posture) remain in place. If the organism reaches a place where it slides some distance down the hill, the original straight-line approach is not possible. A new plan must be
formulated. However, that plan would still be in compliance with the provisions of the governing content map—Approach X. Some adjustments that are necessary may be programmed as reflexes that are part of the agent’s response repertoire. A deviation of five degrees from the proper center of balance implies an adjustment. Because the adjustment would always be a compensation in the opposite direction of the deviation, the system could be designed to make these adjustments automatically as part of the agent’s skill repertoire. In contrast, an impediment in the middle of a straight-line route implies a possible choice. Should the organism go to the left or right? An agent decision is needed. The efficiently designed agent directs the largest units that will achieve the adjustment. Just as the initial planning is most efficiently achieved by the agent directing the largest units that will permit the response to conform to the current setting, the efficient agent directs the largest possible units that will achieve the desired adjustment. Each time the response is adjusted, a new projection is issued. Because adjustments may have to occur frequently, the feedback and adjustment process would be a continuous cycle of planning, projecting, and adjusting. Once the agent formulates a plan, it is used by the infrasystem as a standard for enhancing sensations and continuously updating the outcomes being achieved to those called for by the plan (and the content map).
STEPS IN PLANNING, DIRECTING, AND ADJUSTING RESPONSES
Figures 3.2 through 3.6 illustrate the steps in planning and adjusting responses. These figures indicate the general processes that are implied for initial planning of responses and adjusting responses on the basis of sensory input.
Planning
Figure 3.2 shows a functional representation of a possible response-planning template that accompanies the content map for a simple operation, such as “Move to intensify X.” The content map would be presented to the agent as an overlay (something that functions like a visual tunnel on the current path from organism to X). The tunnel would highlight the deviations between the ideal and the current setting. The results of this sensory comparison imply response values. Segments A to F show six classes of response features that must be planned before the response can be executed. Note that the number of response features is arbitrary and does not necessarily reflect the number that
FIG. 3.2. Response-planning template.
would be employed by any particular map. The number of adjustable features would have to be determined by behavioral observations. The response-planning template is currently blank and has no values assigned for Features A to F and therefore could not apply to any specific setting. When values are assigned to all the response features, the planning template is completely transformed into a situation-specific plan and serves as the basis for the response directive. For the map that requires moving to intensify X, the various segments would be direction, rate, posture, and other agent-directed features of the response that must be planned before a specific response pattern could be executed by the infrasystem. The arrows show that the pattern is projected into the future. Once the response pattern is planned and directed, it persists as something the agent functionally understands and as something the infrasystem understands. The plan persists until the termination criterion is met or the original plan is modified. The heavy black outlines on the diagram indicate areas of agent attention needed for initial planning. Attention is required for all the sensory details of the setting that have implications for the variables—Features A to F. The default range is shown by a dashed line. The dashed line is about halfway up. This is the assumed level for each variable. For rate, it would be about half of what the agent is able to direct. For direction, halfway would indicate straight ahead. Note that the scale for each unit is different. Also the default settings in this figure have been set arbitrarily at midlevel. To use the plan, the agent attends to the setting. Features that deviate from the default settings are highlighted. Any response-relevant features that are not highlighted, and therefore do not deviate from the ideal, imply default settings for the planning template. The agent does not have to plan
them, but simply recognize that specific sensory features do not deviate from the ideal. If all relevant sensory details of the setting are in the ideal range, the default settings would be issued for all the concomitant response values. The result is shown in Fig. 3.3. The default value plan is completed. To execute it, the agent simply transforms it into a response directive of the form, “Do the planned response.” The agent does not have to reissue a directive for each component. The plan not only provides the basis for the response directive, it also provides a projection of the response settings that will occur over time. The projection implies that certain transformations will occur in the sensory data the system receives.
FIG. 3.3. Default value settings.
FIG. 3.4. Agent-planned settings.
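The way the blank template becomes a situation-specific plan can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the 0-to-1 scale, the midlevel defaults, and the function name `complete_plan` are hypothetical and are not taken from the figures. The one idea drawn from the text is that features that are not highlighted receive their default settings, and only highlighted features need an agent-planned value.

```python
# Hypothetical template: three of the features A to F, each with an
# assumed midlevel default on an illustrative 0-to-1 scale.
DEFAULT_TEMPLATE = {"direction": 0.5, "rate": 0.5, "posture": 0.5}


def complete_plan(agent_planned, template=DEFAULT_TEMPLATE):
    """Turn the blank template into a situation-specific plan.

    agent_planned maps each highlighted feature to the value the agent
    plans for it. Features that are not highlighted keep their defaults.
    """
    unknown = set(agent_planned) - set(template)
    if unknown:
        raise KeyError(f"features not in template: {sorted(unknown)}")
    plan = dict(template)       # start from the default settings
    plan.update(agent_planned)  # overwrite only the highlighted features
    return plan


# No deviations from the ideal: the default-value plan is issued as is.
default_plan = complete_plan({})
# Only rate is highlighted, so only rate needs an agent-planned value.
planned = complete_plan({"rate": 0.8})
```

In this sketch the agent's planning effort scales with the number of highlighted features, not with the size of the template, which is the economy the template is meant to provide.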
If specific features of the setting deviate from the ideal, the deviation is highlighted and the agent must plan a setting for this feature of the response. The result is a projection in which some or all of the features will have nondefault values. Figure 3.4 shows such a plan. The height of Components A to F indicates the response setting that is estimated for the current application. Feature E is set by default. All the other features have agent-planned values. Like the plan that has only default settings, this plan is open-ended, which means that the response will continue to be produced as planned until sensory data indicate that it needs to be either modified or terminated.
Directing and Adjusting Responses
The response is directed as planned. As it occurs, sensory conditions change. If no unanticipated sensory outcomes occur, the realized conditions correspond to the projected conditions. The implication for the response setting is that no adjustments are needed; therefore, no attention is needed for these features of the response. The motor response pattern is now automatic.
Figure 3.5 shows a response that is generating the projected sensory outcomes, which means that the response is proceeding as planned. The alternating divisions show the outcome that occurred for each component of the response (shaded segments) and the projected outcome for the response component (white segments). The white segments correspond in height to the shaded segments, indicating that there is no deviation between the projected and realized outcomes. The black segments at the beginning of the response sequence indicate the initial need for agent attention. The absence of black segments after the initial response specification indicates that the process continues relatively automatically and requires no significant attention or planning by the agent.
FIG. 3.5. Response as planned.
For this automatic process to occur, the system must retain a record of the settings for the various features of the response and must continue to issue the same directive automatically. In Fig. 3.5, the continuous projection is shown by arrows at the top of each component. The same values that were set initially are still in effect and are projected into the future. The diagram does not suggest the rate of the cycles that compare projections with current data; however, observations of behavior suggest that the cycles may be continuous for some organisms. If the organism responds quickly to changes in the setting, the system must be prepared to respond immediately to unanticipated sensory data.
When there is a discrepancy between current sensory data and projection, the system enhances those details of the current response that deviate from projection so that the agent attends to and changes only those aspects of the response that are at odds with the projection. The infrasystem has to perform comparisons of the current sensory data with the projected data; however, this process would not necessarily involve the agent. Only when a discrepancy occurs would the agent be alerted (through enhanced sensation on the part of the plan that is at odds with projection). The content map would show the deviation. The corresponding part of the response plan would be highlighted or enhanced. The agent would thereby receive information about which features of the response plan are to be modified.
In Fig. 3.6, the response sequence presented in Fig. 3.5 is extended. At Times M and N, the response is adjusted because aspects of the response deviated from projected values. The magnitude of each deviation is shown with a striped segment. These striped segments imply both the magnitude and direction of the change that is needed.
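The comparison that the infrasystem performs can be illustrated with a small Python sketch. It is a simplification under assumptions that are not in the text: outcomes are represented as numbers on an arbitrary scale, each response feature has exactly one projected outcome, and the name `find_deviations` and the `tolerance` parameter are hypothetical. The sketch returns a signed value for each deviating feature, which plays the role of a striped segment: its size gives the magnitude and its sign gives the direction of the change needed.

```python
def find_deviations(projected, realized, tolerance=0.0):
    """Compare realized outcomes with projected outcomes, feature by feature.

    Returns a dict holding only the features that deviate, mapped to the
    signed difference (realized minus projected). An empty dict means the
    response is proceeding as planned and the agent is not alerted.
    """
    deviations = {}
    for feature, expected in projected.items():
        difference = realized[feature] - expected
        if abs(difference) > tolerance:
            deviations[feature] = difference
    return deviations


projected = {"A": 0.5, "B": 0.75, "E": 0.5}

# Realized outcomes match the projection: nothing is highlighted.
as_planned = find_deviations(projected, {"A": 0.5, "B": 0.75, "E": 0.5})

# One feature falls short of its projection: only that feature is highlighted.
with_discrepancy = find_deviations(projected, {"A": 0.5, "B": 0.75, "E": 0.25})
```

Because the function returns only the deviating features, whatever consumes its result attends to those features alone, which mirrors the selective enhancement described above.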
Following each striped segment is a black segment that shows the adjusted response. The segment is black, indicating that agent attention is required to achieve the adjustment. The newly adjusted plan (following each deviation and correction) serves as the basis for a response directive that has the new settings for the various components that are replanned as well as the settings that were not modified.
FIG. 3.6. Response adjustments.
At Time M, the projected sensory outcomes for Feature E deviate from projection. The adjustment is made only in Feature E. Attention is required to (a) plan the adjustment that compensates for the discrepancy, and (b) make a projection based on the new value of E. If an increase in E is to correct the response, specific improvements would be anticipated to occur following the adjustment. A response-adjustment directive is issued (black segment at M). Note that the agent is required to attend only to Feature E because only E deviated from projection and was highlighted by the infrasystem. This type of situation would occur if E is posture and the posture must be adjusted slightly. The other aspects of the response directive continue as initially specified.
The second striped segment occurs at Time N. Primary deviations occur for A and B. These are corrected. The adjustment, however, creates a secondary deviation in E. It must be corrected. Note that the striped pattern for E is different from that of A and B because E was not involved in a sensory discrepancy that required adjustments to A and B, but was a deviation created by the adjustments to A and B. For instance, if A and B are direction and rate, their changes resulted in a necessary change in E (posture), indicating that the change in the rate could not be accommodated without a postural adjustment.
The adjustment at Time N shows that attention is provided to all features that are to be adjusted: A, B, and E. However, some kinds of adjustments for secondary deviations may be achieved through default transformations. In this case, the secondary deviation would not require agent attention, but simply an infrasystem operation. For instance, if B is rate and the rate is increased in a way that cannot be achieved without a change in posture, the postural change may be linked to the rate through a default transformation. (Fast running requires a generically different posture than slower running; therefore, specification of fast running would automatically adjust both the rate and posture.) The system may utilize extensive default transformations; however, they would be limited to those adjustments that implicate secondary features of the response. Also the fact that default transformations occur for some adjustments does not imply that the secondary feature is always linked to the primary feature. If it were, the pair would constitute a single feature and would never require independent settings.
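The distinction between agent-attended adjustments and default transformations can be sketched as follows. Everything concrete here is an assumption made for illustration: the rate threshold of 0.7, the posture labels, and the names `fast_running_posture` and `adjust` are hypothetical. The sketch shows a primary adjustment to rate that sometimes, but not always, triggers a linked postural change without further agent attention.

```python
def fast_running_posture(plan):
    """Hypothetical default transformation: a high rate forces a sprint posture."""
    return {"posture": "sprint"} if plan["rate"] > 0.7 else {}


def adjust(plan, primary, default_transforms):
    """Apply primary adjustments, then any linked secondary adjustments.

    Returns (new_plan, attended, automatic). attended lists the features
    the agent had to replan. automatic holds secondary features that a
    default transformation changed without agent attention.
    """
    new_plan = dict(plan)
    new_plan.update(primary)  # agent-planned changes
    automatic = {}
    for feature in primary:
        transform = default_transforms.get(feature)
        if transform is None:
            continue
        for secondary, value in transform(new_plan).items():
            # A feature the agent set explicitly is never overridden.
            if secondary not in primary and new_plan[secondary] != value:
                new_plan[secondary] = value
                automatic[secondary] = value
    return new_plan, sorted(primary), automatic


plan = {"direction": 0.5, "rate": 0.5, "posture": "upright"}
transforms = {"rate": fast_running_posture}

# A large rate increase pulls posture along through the default transformation.
faster, attended, automatic = adjust(plan, {"rate": 0.9}, transforms)
# A small rate increase leaves posture as an independent, unchanged feature.
slightly_faster, _, untouched = adjust(plan, {"rate": 0.6}, transforms)
```

The second call reflects the point that the secondary feature is not always linked to the primary one: posture remains a separate feature that can be set independently.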
With or without the default transformation provisions, the amount of attention required for various changes is not a linear function based on number of deviations or extent of deviation. If the level of confidence originally set for the response is low, any adjustment in the response would probably require considerable attention. The mouse that is sneaking past the sleeping cat may change direction only slightly, but this change is the product of considerable attention.
Ambient Sensory Conditions
If there are no enhanced stimuli except those associated with the content map, the projection assumes that no enhanced stimuli other than those associated with the content map will occur. If there is no indication that a predator is present, the presence of one is not projected. So it is with other ambient stimuli.
If a discriminative stimulus for another pursuit occurs as the original response is being produced, however, the new stimulus is enhanced and treated like any other deviation from the projection. It commands attention and adjustment in the response, possibly even termination of the original response. For instance, the system may be designed so that any change in the direction of the wind results in the enhancement of the wind stimulus. If the wind is blowing from the north when the response is planned, the wind feature is recognized by the agent and is a tacit detail of the projection. The wind is not a factor that influences the response-production efforts in the current setting. If the wind changes, the wind sensation deviates from the projection of sensory conditions and is enhanced. The agent attends to it and must make a decision about whether it influences the current response strategy.
The overall strategy that the system follows is the same for response-related and ambient stimuli. The projections made assume a response that proceeds in the way it was planned at Time T, and ambient stimuli that continue as recorded at Time T. If enhanced stimuli that are not associated with the response occur during the time the response is being produced, they command attention. If the system is designed so that ambient movement patterns not assumed by the plan are enhanced, the unexpected presence of movement would create a discrepancy between the projected and realized sensory features. The movement would be enhanced. The enhanced stimuli would command attention and would imply replanning or possibly discontinuing the present response. If the infrasystem is not designed to enhance a particular movement pattern, its occurrence creates no discrepancy. It is not enhanced, and it does not influence the planning of the agent.
SUMMARY
The purpose of the agent is to change the current sensory conditions through operant or planned responses.
The way that the agent achieves change is through planning, directing, and adjusting responses. To perform these functions, the agent needs information about (a) current sensory conditions, (b) enhanced discriminative stimulus, (c) content map and the related plan, and (d) response options consistent with the current plan and the content map. The agent functions rely on different categories of sensation. These provide the agent with the reinforcement and information about the purpose and referent. The purpose and referent are not teleological and mystical properties, but are needs implied by the task. The agent must know which of the thousands of details of the current sensory reception is to serve as a referent for the behavior to be produced. The agent must also know what
sort of behavior is to be produced and what specific changes are to occur in the current setting. To provide the agent with information about the referent and purpose, the system reflexively produces a content map for the particular discriminative stimulus in the current setting. The map either consists of or is supplemented with a response-planning template, which ensures that the agent will attend to the response-related details of the current setting and plan responses that are consistent with both these details. The planning assumes the agent’s ability to decode the requirements of this content map and identify responses from its response repertoire that will transform the current setting in specific ways. The response-planning options may reside with the content map or may be part of the agent’s response repertoire. The design advantage of having them as part of the agent’s repertoire is that, if various content maps require the same class of responses (such as locomotive responses), the content map could issue a more general directive, such as “Move to intensify X.” The agent would then access the menu for the different types of movement and select one that is appropriate for the current task. Once the response is planned, it is directed. As it is being produced, adjustments may be needed. Planning and adjusting responses cannot occur without ongoing sensory input. For the agent to direct or produce responses, the agent needs a response repertoire and a code that keys the components of the response that is planned to the details of the current setting. If the agent is to move in a particular direction, it must be able to specify the component of the response that achieves the particular direction. In the same way, the rate and other components of the current setting imply features of the response. The agent specifies the response features. Once directed, the response is produced by the infrasystem and continues as planned. 
The infrasystem monitors both the response and changes in the ambient sensory receptions to determine whether they are consistent with projections. For those realized details that correspond to the projected details, the agent does nothing. For those details that deviate from the projected details, the agent makes adjustments. All adjustments are conservative, which means that the involvement of the agent is limited to those details of the response that are implicated by deviations of current sensory conditions from projected conditions. Some adjustments may involve only a single feature of the response, such as direction. Others may require both primary and secondary adjustments. The primary adjustment is based on sensory input that deviates from projected changes. Secondary adjustments are needed if a feature cannot be adjusted without affecting another feature of the response. This chapter illustrates the response planning with simple examples, but the same response-planning format applies to any hardwired behavior. The
content map and response-planning template control what the agent does to receive relatively positive reinforcement. The map indicates the general classes of responses to be produced or general outcomes to occur. The agent has a response repertoire and the ability to specify or select specific behaviors or response components. If a nest is to be affixed to branches that have a particular geometric arrangement, the agent is able to key the various nest-producing responses to the spatial details of the pattern, possibly centering the nest at the juncture of two branches of particular thickness. The centering requires the construction of a circular design that is symmetrical around this center. The agent that constructs the nest has knowledge of the various component responses that will lead to the placement of the materials that result in the symmetrical design and its anchors to the branches. Whether the components of the various responses are specified by the content map or are left to agent choice, the response-planning template for the response is presented to the agent for completion. The template has provisions for the agent to produce responses that are compared with the ideal pattern or outcome. By using the content-map format, the system is able to vary the amount of flexibility that will occur for different pursuits and control their character, outcome, and intensity. If it is important for the agent to plan a particular kind of ritualistic walking (one possibly associated with a sexual ritual), the posture is specified by a content map. This specification would be coupled with some form of reinforcement. If the agent assumes the specified posture, it receives positive sensations. If not, the agent receives negative sensations.
Chapter 4
Interaction of Agent and Infrasystem
Chapters 1 through 3 developed the various components required for the production of operant responses. Chapter 4 addresses the relationships between infrasystem and agent. The chapter presents variations of two meta-blueprints. These show the relationships among the various functions. One shows the temporal cycle of steps required for the organism to produce operant responses. It shows the functions of reception, screening, planning, directing, and response production in a temporal sequence. The second meta-blueprint presents a functional anatomy of the system. It indicates the ongoing relationships between the infrasystem and agent. Its emphasis is on the organization and interaction of the various agent and infrasystem functions.
THE CYCLE OF PLANNING, PRODUCING, AND ADJUSTING RESPONSES
Figure 4.1 shows the various steps required for the cycle of planning, directing, and response adjustment. Three cycles are shown, delineated by response segments 1, 2, and 3. These are segments of an ongoing response, such as moving to approach X or performing whatever operations are referenced to a governing content map. These cycles would continue until the termination criterion for the operation has been met or until they are terminated by an unanticipated enhanced stimulus that commands attention and a different response.
FIG. 4.1. Cycles of planning, production, and adjustment.
The sequence of arrows shows the ordering of the events. Vertical arrows show events that are approximately simultaneous (or that must occur before the next horizontally oriented function occurs); horizontal arrows show temporal events. The time arrow below the diagram shows that the cycles are continuous. The response segments 1, 2, and 3 occur sequentially. They are accompanied by the steps shown by the various cells and arrows.
Cycle 1
Cycle 1 starts with Sensory Data 1. The system receives sensory data indicating that a hardwired discriminative stimulus is present. The system produces the content map that is called for by the discriminative stimulus and presents it to the agent. For instance, a pheromone associated with mating is identified. The content map for the behavior provides the strategy and includes any response-planning templates that prompt the selection of response features that are needed to plan a situation-specific strategy. The content map and associated enhanced sensations urge the agent to produce a plan for carrying out the first behaviors specified by the map.
The arrangement of arrows between Sensory Data 1 and the content map suggests there is an interaction between the two elements. The sensory data activate the content map. Then the details of the sensory data must be taken into account before a plan can be formulated. The vertical arrow indicates that Plan 1 results from the interaction. That plan is consistent with the content map and is referenced to the current setting. The plan specifies the details of the response that are directed: direction, rate, and so forth.
The plan in turn generates both Projected Sensory Data 1 and Response Directive 1. The projected sensory data include both information about
how the response will feel if it is produced according to specifications, and what kind of transformations will occur in the surroundings. Response Directive 1 is an imperative that orders the system to produce the response represented by the plan. Cycle 1 ends with the initiation of Response 1. Cycle 1 is an arbitrary time period during which the system produces a response that begins to transform details of the current setting. The response is ongoing, but at the end of Response 1 the system has information about whether the projected changes are occurring or whether an adjustment is needed. Two products of Cycle 1 must be transported into Cycle 2. These products are shown with shaded arrows—Plan 1 and Projected Sensory Data 1. Both are necessary if some form of adjustment in the response is to be achieved. The projected sensory data serves as a criterion for whether Response 1 is successful. If the anticipated sensory events occur, the response segment was successful. In the pheromone example, the response was successful if the intensity of the pheromone increased as planned. Plan 1 is necessary after the production of Response 1 if conservative adjustments to the plan are to be made on the basis of later sensory data. Discrepancies between the projected and realized sensory data imply the magnitude and direction of adjustments needed in the plan. In the pheromone example, a decrease in intensity would imply an adjustment in direction. These adjustments are not possible unless some version of Plan 1 is transported into Cycle 2. This is the earliest time that a comparison could be made, and the original plan could be conservatively adjusted. Cycle 2 Cycle 2 begins in the context of ongoing response and the concomitant changes in the setting. The content map continues to serve as the criterion for the responses produced and for how the setting is to change. The result of Response 1 is Sensory Data 2. 
These data are assumed to be a product of the response directive. The response directive was based on a plan, so Sensory Data 2 has implications for the plan. Sensory Data 2 is compared with Projected Sensory Data 1. If there are no discrepancies, Plan 1 is not modified. It is simply reinstated as Plan 2. If there are discrepancies, Plan 1 is adjusted conservatively to address those areas of discrepancy. Each discrepancy implies a modification of specific details of the plan. The modified plan is Plan 2. Projected Sensory Data 2 and Response Directive 2 are generated from Plan 2. Even if Plan 2 is the same as Plan 1, the projected sensory data are not the same because the response is already in progress and different changes are expected to occur. If Projected Sensory Data 1 indicated that the segment of the route immediately in front of the organism will be transformed so that it is no longer in front of the organism, the same projection
would not be relevant for Projected Sensory Data 2. A new segment of the route is now relevant: the segment immediately in front of the organism. In the same way, the response directive is different from the Cycle 1 directive. The initial directive occurred in the context of a stationary organism. Therefore, the directive initiated the response. Response Directive 2 occurs in the context of an ongoing response, so it either indicates “Keep following Plan 1” or indicates only those components of the response that are to change. The others continue. Therefore, Response Directive 2 is always different from Response Directive 1.
Response Directive 2 results in Response 2. The same steps that immediately follow Response 1 follow Response 2. Plan 2 and Projected Sensory Data 2 are transported forward in time.
Cycle 3 and Beyond
Following Response 2, Sensory Data 3 are compared with Projected Sensory Data 2. The comparison results in continuation of the plan without a change or a conservative modification of Plan 2. The result is Plan 3. Cycle 3 is identical to Cycle 2. The content map provides an ongoing template for the responses that are planned. The planning occurs on the basis of projections and current sensory data. The cycles continue until the termination criterion is met or until some unanticipated stimulus interrupts or overrides the current content map.
In summary, to maintain operant responses that have been initiated, the hardwired system would have to execute five steps:
1. Compare the current sensory data with the data projected and identify specific discrepancies.
2. If there are discrepancies, correlate the discrepancies with the plan and the projected data to identify the specific compensations that would have to occur in the plan.
3. Conservatively specify a response that compensates for the identified problems.
4. Use current sensory conditions and the current plan to generate both the projected sensory data and the response directive.
5. Produce the response as planned and projected (either by continuing the plan formulated earlier or by creating specific modification of the earlier plan).
The Content of Projections
This sequence assumes that the system is designed so that it is able to make conservative adjustments. These adjustments imply the detail of projected sensory data. If the rate or direction of the response is adjustable as an independent response component, some form of projection of the rate or direction that will be achieved by the current plan must be available at a later time. Unless the projection had some indication about the rate that would be experienced as the response occurred, rate could not be adjusted because there would be no basis for comparing the rate currently experienced with the rate projected. The same limitation applies to direction or any of the features of the response that may be adjusted conservatively. The projection, therefore, must refer to multiple features of the response and must provide for some sensory indication as to whether the projected outcome is occurring.
If the system projects a relative direction to be achieved (the target at midline of the organism), conservative adjustments in the response are possible. Discrepancies between projected and realized sensory data suggest both the direction and extent of specific compensations that must occur in the plan. If the discrepancy indicates that the organism is moving three degrees to the right of the target, the compensation is an equal and opposite adjustment: three degrees to the left. Unless specific discrepancies (the deviation of target from midline) implicate specific features of the planned response, the system would have to replan all features of the response if any discrepancy occurred.
Feedback Rate
Figure 4.1 does not specify the rate at which the comparisons of sensory and projected data occur. The rate at which they occur in a particular organism may be inferred from the rate at which the system is able to respond to unanticipated changes in sensory conditions. If the organism is able to respond to something unexpectedly attacking it within 1/20th of a second, the rate of processing is fast. However, this rate does not imply that the response is necessarily adjusted at this rate on all occasions, only the rate at which adjustments could occur under emergency conditions.
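The maintenance cycle and the equal-and-opposite compensation described above can be drawn together in one short Python sketch. It rests on several assumptions that go beyond the text: plan, projection, and sensory data are all keyed by the same feature names, a positive heading value means "to the right," and the function name `run_cycle` is hypothetical. Generating the next projection (part of Step 4) and producing the response (Step 5) are left out to keep the sketch small.

```python
def run_cycle(plan, projected, sensed):
    """One maintenance cycle over dicts keyed by response feature.

    Returns (new_plan, directive). The directive names only the
    components that change, or signals continuation if none do.
    """
    # Step 1: compare current sensory data with the projection.
    discrepancies = {
        feature: sensed[feature] - projected[feature]
        for feature in projected
        if sensed[feature] != projected[feature]
    }
    # Steps 2 and 3: compensate only the implicated features, with an
    # equal and opposite adjustment. Other features are left untouched.
    new_plan = dict(plan)
    for feature, discrepancy in discrepancies.items():
        new_plan[feature] = plan[feature] - discrepancy
    # Step 4 (in part): build the response directive from the changes.
    directive = {f: new_plan[f] for f in discrepancies} or "continue as planned"
    return new_plan, directive


plan = {"heading_deg": 0.0, "rate": 0.5}
projected = {"heading_deg": 0.0, "rate": 0.5}

# The organism is moving three degrees to the right of the target.
adjusted_plan, adjust_directive = run_cycle(
    plan, projected, {"heading_deg": 3.0, "rate": 0.5}
)
# Sensory data match the projection: the plan is simply reinstated.
same_plan, continue_directive = run_cycle(
    plan, projected, {"heading_deg": 0.0, "rate": 0.5}
)
```

In the first call only the heading is replanned, to three degrees left of its previous value, and rate does not appear in the directive. That is the sense in which the adjustment is conservative.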
Termination Criteria
The cycles continue until the content map is terminated. Termination may result from two conditions: (a) the system identifies the presence of stimuli that drive competing and more compelling content maps, or (b) the system identifies that the changes called for by the content map have been achieved.
Competing Content Maps. If the current sensory data included a competing discriminative stimulus that had greater enhanced sensation than the current discriminative stimulus, a competing content map would be presented and the current response would be terminated. For instance, if Sensory Data 2 disclose the presence of a predator, the original content map is canceled. Figure 4.2 shows this contingency. For this type of termination, no comparison between the previously projected and current sensory data is required. The termination involves only a comparison of the content map and the current sensory data (Sensory Data 3). When the content map is terminated, the enhanced stimuli and urgings issued with the content map are also canceled. In the case of a predator appearing, a new content map now engages the agent, and new plans are urged by the infrasystem. These are keyed to the predator.
FIG. 4.2. Discrepancy resulting in content map termination.
Goal Attainment. If the organism meets the requirements of the content map, the map, plan, and all enhanced sensations are terminated. If the organism has approached the potential mate, the approach content map is terminated and possibly replaced by a content map for some type of ritualistic display. If the organism has completed the nest, the nest-building content map is terminated and no longer commands attention or urges behavior.
Plan Cancellation. A termination of the plan may occur without a cancellation of the content map. For example, the discrepancy between Sensory Data 2 and Projected Sensory Data 1 could be so great that no details of the current plan could continue; however, a new plan is possible for satisfying the extant content map. For instance, the organism suddenly encounters a barrier while trying to approach X. The current plan could not be revised in a way that would surmount the barrier. The current plan has
FIG. 4.3. Discrepancy resulting in plan termination.
to be terminated because no details of the plan would persist. The existing content map would still govern the agent’s planning, but the cycle would have to start over with a plan that is consistent with the current sensory data (immediately following Response Segment 1). Figure 4.3 shows the termination of a plan that occurs after Response 1 has been produced. The discrepancy of Sensory Data 2 and Projected Sensory Data 1 is so great that Plan 1 is terminated. The termination implies that an entirely new plan would have to be formulated. The dashed arrow shows that a new Cycle 1 plan must be formulated. The same content map is still governing the planning operation. Attention Required for Accommodating Feedback An assumption is that the system is able to attend to a limited amount of sensory data and attention may be distributed among an indefinite number of concurrent operations. A further assumption is that the system is designed to conserve attention so there is always a reserve. The extent to which the organism is completely engaged (100% of the available attention) in one activity is the extent to which it is potentially vulnerable. However, some activities are designed so they engage the agent fairly thoroughly. In any case, if one of the operations that is currently occurring requires relatively more attention, there is relatively less attention for other operations. If the system is designed to formulate minimum-necessary adjustments based on discrepancies between response projections and current sensory data, the system is able to conserve greatly on the attention that most discrepancies require. If there are no discrepancies between projections and current sensory data, the response is fairly automatic and requires the agent
to devote relatively little attention to the response. In this case, the agent has a reserve of attention not dedicated to the response underway.
AGENT AND THE INFRASYSTEM INTERACTION
Figures 4.4 and 4.5 provide an anatomy of the functions required for the system to perform the operations indicated in Fig. 4.1. These figures provide a meta-blueprint of the functions of the hardwired performance scheme that would be needed for any pursuit. In other words, by providing specific physical mechanisms to account for the various functions shown in these figures, one would have an example of a machine that has the essential features of all hardwired performance systems. Both Figs. 4.4 and 4.5 show the separation of functions between the agent and infrasystem. The diagrams are not intended to show anything about the physical arrangement of the various functions, only what the functions are and their temporal ordering (implied by the sequence of arrows). The agent consists of the set of functions that plan and direct operant responses. The infrasystem consists of the universal functions that must be performed if the agent is to plan and produce situation-specific responses. The infrasystem provides the agent with the information and motivation that underpin the planning and response production. As noted earlier, the division between agent and infrasystem is the division between those processing steps that are always the same for a particular content map and those details of the observed responses that are unique to a particular setting. The infrasystem performs the same set of stimulus-driven operations each time a given discriminative stimulus is presented. The agent is responsible for what is unique to each application: the setting-specific features of the response. Figure 4.4 shows the infrasystem and agent functions for the initial processing of a reception that results in any preprogrammed pursuit or ritual.
The reception contains a specific stimulus that triggers a specific content map. The process shown in Fig. 4.4 results in a plan and response directive based on the plan. The infrasystem functions are indicated with black arrows, the agent functions with shaded arrows. The Infrasystem Functions The cycle starts with a reception, goes through the infrasystem, and then goes to the agent. The process consists of the basic functions of reception, screening, planning, and directing. The cycle stops at this point because the production of the response provides the basis for the next cycle, which is presented in Fig. 4.5.
FIG. 4.4. Initial infrasystem and agent functions.
The cycle starts with a reception that presents a predetermined set of features that triggers the hardwired pursuit—the bright light that activates the cockroach, the shadow that activates the chicken, the pheromone that activates the peacock to display itself. These are the response-associated sensory receptions. They occur within the context of ambient sensory receptions. Both types are screened by the receiver. The functions of the receiver are to (a) convert physical stimuli into patterns of sensation that roughly parallel physical features of the reception, and (b) screen stimuli for the presence of specific features that signal specific hardwired pursuits (response-associated receptions). Once the receiver determines that a response-associated stimulus is present, the input is transmitted to the modifier. The system must adjust the sensory input so that the important details of the reception are salient to the agent, the agent is motivated, and the agent has general information about the goals of the hardwired pursuit. These are the modifier functions of the infrasystem. The modifier adds or creates sensation. One type of created sensation enhances specific details of the setting that are relevant to the response—the target of the response and possible other features of the current setting. The
second type of sensation is in the form of knowledge—a content map that conveys what the agent is to do in the presence of the sensitized stimuli. All receptions that enter the modifier are nonenhanced sensations. However, not all receptions are enhanced by the modifier. Those that are not enhanced include the ambient sensory details that may play a role in the later planning of the response as the setting changes. These details are not enhanced at present because they are not central to the immediate planning task. The arrows from the modifier show that all three products of the modifier—content map, enhanced sensations, and nonenhanced sensations— go to the agent functions, where they are used as information and motivation to formulate a plan. In summary, the specific functions of the modifier are to create sensation that has no parallel in the physical features of the reception. All creations of the modifier are in the form of sensation. The Agent Functions The agent receives the three transmissions from the modifier—enhanced discriminative stimulus, content map, and nonenhanced sensory transmission. Figure 4.4 shows that the central function of the agent is to formulate a response plan, which is influenced by the three transmissions from the infrasystem. The content map presents a template or blueprint of the way in which the current setting is to be changed. The enhanced sensations attached to some stimuli show the agent what to attend to in planning the strategy presented by the content map. The enhanced sensations urge the agent to respond by creating a plan. To make the plan specific to the current setting, the agent must take into account the ambient as well as enhanced stimuli. All three transmissions from the infrasystem (content map, enhanced sensation, and nonenhanced sensation) are essential for creation of the plan. These transmissions do not go directly into the plan. 
They are mediated by agent decisions about how to comply with the content map in the current setting. Agent Resources. This mediation is shown as the agent resources—the agent’s knowledge of its response repertoire and spatial and temporal transformations. These resources are different from other inputs that go into the plan: They are accessed by the agent. The content map provided by the infrasystem describes the response strategies and transformations that are to occur in general terms. The agent decodes the description from the content map and encodes the specific response from the repertoire. For instance, the content map may specify a large class of responses (such as locomotive responses) and prompt some of the response options with enhanced sensation; however, the agent must plan the specific response
strategy to be produced in the current setting. Without knowledge of the possible responses and response variations that it is capable of producing, the agent could not plan or direct specific responses. The plan must also be based on the assumption that there will be spatial and temporal transformations of the current sensory conditions. To create the plan that will accomplish specific changes, the agent must have knowledge of the changes that would be created by different responses. If the enhanced stimulus is moving away from the organism, the agent must know that a change in rate is the way to achieve the desired transformation (the distance between stimulus and organism diminishing). The agent must also have knowledge of the relationship between the spatial or sensory over there and the corresponding responses that will create the transformation of moving over there. This knowledge implies an agent map that correlates spatial sensory information with response information. Input from various sensory modalities would be presented on the same abstract map (discussed in chap. 1). Relationship of Plan to Response Directive and Projections. The PLAN is a description of the response that will occur. The PLAN indicates response components (rate, direction, posture, and other possible components). As Fig. 4.4 shows, the PLAN is also the basis for the Response Directive and Projected Sensations. An arrow on Fig. 4.4 goes from the PLAN to the Response Directive. The Response Directive is a derivation of the PLAN that orders the specified response, with the various response components indicated in the PLAN. Another arrow goes from the PLAN to a copy of the Plan, and the associated Projected Sensation (which are in the infrasystem). The Plan and the Projected Sensation are needed by the infrasystem for evaluating the response produced and for making possible adjustments. 
The Plan indicates what response details are to occur and how specific details of the current setting will change. These changes imply the Projected Sensation that the infrasystem will receive if the response is proceeding as planned. The only way in which the system will be able to judge whether the response is proceeding as planned is to have a record of the Plan and the related sensory changes that are to occur as a function of the directive based on the plan.
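The relationship among the plan, the response directive, and the projection can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the field names (`rate`, `direction`, `posture`, `target_bearing`, `target_distance`), the helper functions, and the assumption that a straight approach reduces distance by rate times elapsed time are illustrative choices, not details given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Agent's description of the response (component names are illustrative)."""
    rate: float        # locomotion speed, arbitrary units
    direction: float   # heading in degrees relative to the target; 0 = straight at it
    posture: str


@dataclass(frozen=True)
class ResponseDirective:
    """Order to produce the response components named in the plan."""
    rate: float
    direction: float
    posture: str


@dataclass(frozen=True)
class ProjectedSensation:
    """Sensory changes expected if the response proceeds as planned."""
    target_bearing: float    # where the target should appear (degrees from midline)
    target_distance: float   # expected distance after the response segment


def derive_directive(plan: Plan) -> ResponseDirective:
    # The directive orders exactly the components the plan specifies.
    return ResponseDirective(plan.rate, plan.direction, plan.posture)


def project(plan: Plan, distance_now: float, dt: float) -> ProjectedSensation:
    # Assumed projection rule: a straight approach at `rate` for `dt` time units
    # shrinks the distance by rate * dt and keeps the target on the midline.
    return ProjectedSensation(
        target_bearing=0.0,
        target_distance=max(distance_now - plan.rate * dt, 0.0),
    )


plan = Plan(rate=2.0, direction=0.0, posture="upright")
directive = derive_directive(plan)
# The infrasystem keeps a copy of the plan together with its projection
# so that later sensory input can be compared against it.
infrasystem_record = (plan, project(plan, distance_now=10.0, dt=1.0))
```

The point of the sketch is only that both the directive and the projection are derived from the same plan, so a copy of the plan held by the infrasystem is sufficient for later evaluation.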
AGENT AND INFRASYSTEM INTERACTION FOR ONGOING RESPONSES
Figure 4.5 shows the cycle that occurs as the response continues. The second cycle begins where the first left off. As Fig. 4.5 shows, the Response Directive has been issued. The directive creates a response. The response leads to changes in the response-associated reception. The response is experienced by the system as sensory input created by the response.
FIG. 4.5. Complete infrasystem and agent functions.
Calculator Function
The ongoing cycle has a function not needed for the initial cycle: the calculator function. The reason is that the calculator function determines whether the response is consistent with the requirements of the content map and with the plan based on the content map. For the initial cycle, there was no ongoing response. Therefore, there was no need to calculate possible discrepancies between responses and the projected results of the responses. The purpose of the calculator function is informational. It does not modify anything. It simply provides the calculations needed for specific details to be modified. The response-associated receptions and the ambient sensory reception go to the calculator before going to the modifier. The calculator performs two functions: (a) It identifies any discrepancies between the projected sensations and the realized sensations, and (b) it identifies progress in achieving the goal of the plan. These operations are related. Both involve comparisons of plan or content map with current sensory input. If the projection indicated three specific transformations that were to occur as a function of the response, they either occurred or they did not. To make this determination, the calculator compares projection with current input.
Discrepancies. Any deviation must be noted with enough detail that it implies a specific remedy. Unless the results of the calculator imply which components of the planned response are responsible for specific discrepancies between projected and realized outcomes, the calculations do not result in information that is useful to the agent. If the plan was to make the wall of the cell straight, the projection would be a straight section. If the last responses resulted in a part that is bowed, the calculator would determine the location and direction of the bow so that the information could be transmitted to the agent. In the same way, if the projection were that the approach route would show X at midline and X is currently 3 degrees to the left of the organism’s midline, the calculator would determine both the direction and some information about the relative degree of the deviation.
Progress. The calculator function also updates the progress of the response if there are no discrepancies. For instance, if the organism has completed the anchor of the nest and the bottom part, the calculator computes what the agent is to do next. All of the operations performed by the calculator function involve comparisons and are strictly mechanical. Given the projection and current sensory conditions, the calculator notes those details that correspond to the projected sensations and identifies those details that differ from the projected sensation.
Modifier Function
All of the calculator’s operations result in information, and all of the modifier’s functions result in the addition of secondary sensation based on this information.
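The mechanical comparison performed by the calculator can be sketched as a feature-by-feature difference between projected and realized sensations. This is only an illustration under stated assumptions: representing sensations as dictionaries of numeric features, the feature names `bearing_deg` and `distance`, the sign convention (negative bearing means left of midline), and the `tolerance` parameter are all hypothetical and do not come from the text.

```python
def calculate(projected: dict, realized: dict, tolerance: float = 1e-6) -> dict:
    """Compare projected with realized sensations, feature by feature.

    Returns a mapping from feature name to signed deviation
    (realized minus projected) for every feature that deviates by more
    than `tolerance`. An empty result means the response is proceeding
    as projected.
    """
    discrepancies = {}
    for feature, expected in projected.items():
        deviation = realized[feature] - expected
        if abs(deviation) > tolerance:
            discrepancies[feature] = deviation
    return discrepancies


# Projection: target X on the midline (bearing 0) at distance 8.
projected = {"bearing_deg": 0.0, "distance": 8.0}
# Realized: X is 3 degrees to the left (negative, by the convention assumed here).
realized = {"bearing_deg": -3.0, "distance": 8.0}

report = calculate(projected, realized)
```

In this sketch the sign of each entry gives the direction of the deviation and its magnitude gives the relative degree, which is the kind of detail a modifier would need in order to enhance the corresponding component for the agent.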
If the calculator determined that the target of the approach is no closer than it was, the modifier would enhance the rate component of the plan so that the agent would receive information that this component needs adjustment. If the segment of the cell wall is bowed, the area that is bowed is sensitized so the agent will attend to and correct it. If the anchor and bottom part of the nest have been completed, the surfaces of the nest that serve as foundations for further building are highlighted or enhanced with secondary sensation. The transmissions from the modifier, in turn, provide the agent with information about which aspects of the response need adjustment to remain
in compliance with the content map. The transmissions consist of updated content map, enhanced sensations, and nonenhanced sensations. The updated content map is not keyed to the initial planning, but to the current status of the pursuit. In the same way, the updated enhanced and nonenhanced sensations provide information about the next steps the agent takes in executing the plan. The agent plans the adjustments that are needed and modifies or confirms the response directive. The updated plan and projections based on the plan are transmitted to the infrasystem for comparisons that will occur as the revised plan is executed. The response directive is issued (or is continued as originally set) and new sensory data are generated. The cycle shown in Fig. 4.5 continues in this manner until the termination criteria are met or competing response-associated receptions override the current content map. Limitations of the Representations Figures 4.4 and 4.5 show a temporal ordering of the operational steps required for any pursuit requiring operant responses. The diagrams do not capture two aspects of the process—its continuous nature and some assumed features of the agent. The performance tasks that face the organism require continuous variation of responses. Therefore, the operations shown in Figs. 4.4 and 4.5 occur at a high rate and as ongoing processes. The modifier is receiving information continuously, as is the agent. The changes in the current sensory conditions require continuous monitoring of the responses being produced. The agent is not shown as an entity in Figs. 4.4 and 4.5, although it has the logical properties of an entity. The agent’s decision-making functions are not shown. They are implied, however, by the agent resources—the response repertoire and agent’s knowledge of universal transformation. 
Although the figures could have been expanded to show the steps that the agent takes in formulating the plan (reception, classification, comparison of plan with current sensory conditions, etc.), the main agent functions are shown by the figures. The agent—from a functional standpoint—consists of the steps that create the plan and issue the response directive. The plan is the central product of the agent because it implies what is accessed from the repertoire, response directive, and projected sensational changes that are anticipated. TRANSFORMATIONS Because the responses and changes in the setting are continuous, the infrasystem and agent require knowledge of transformations. During any time interval, some stimulus features remain constant, while others change.
For the system to work, it must be capable of interpreting these transformations, that is, identifying both the stable features of stimuli and the manner in which other features are changing. All the ongoing adjustments that are required imply knowledge of specific transformations. Discussions of transformations, however, are cumbersome because our language does not have words for basic types of transformations or transformed relationships. Consider the complexity of the stimulus input that an organism receives as it approaches an object. The object changes in apparent size. The strength of vibrations or other sounds that issue from the object increases in a nonlinear way. As the organism approaches an object, the visual image undergoes various shape distortions (such as barrel distortion and parallax distortion). Furthermore, there are transformations independent of other transformations. As the organism goes up a hill, the proprioceptive features change. The transformations for objects that move are different from those of the nonmoving elements. Furthermore, the extent and rate of transformation are not the same for all stationary objects. Those that are closer to the organism change more rapidly and extensively than those that are more distant. The system must have procedures both for stabilizing some things that are apparently changing and for keying responses to stable or enduring features of things that undergo sensory transformations.
Infrasystem Sensory Transformations
The system must have hardwired programs that accommodate transformations of sensory elements. To refer to them as elements tends to beg the question because they are elements only after they have been identified as specific, enduring features.
Discriminative Stimuli. A basic requirement of the system is that it must be able to recognize some objects across a range of transformations in shape, size, and relative orientation.
To achieve this goal, the system must be designed to key on a feature or set of features that remain constant across the transformations. Only by identifying that which is constant would the system be able to know that the object retains its identity across changes in shape, size, and position. These constant or enduring features are the only basis logically available for the organism to identify or discriminate between objects. If the organism keyed on features that were not stable across transformations, the system could not reliably identify the discriminative stimulus. Single-Feature Identification. It may be possible for a system to identify an object by a single feature. This solution is probably used extensively in hard-wired systems because it is efficient. If a particular feature, such as a
pheromone or vibratory pattern, is the unique possession of one class of objects (or of one individual), the system could reliably identify the object (or individual) by attending exclusively to that specific pheromone or vibratory pattern. If it is logically possible to identify the object on the basis of a single feature, the system could be designed to enhance reception of that single feature and use it as the sole basis for object identification at a distance. If the screening criterion is a single feature, and if some members outside the intended class possess this feature, misidentification results. For instance, bees apparently recognize other bees or nonthreatening organisms by a pheromone. The greater wax-moth (Galleria mellonella) and the lesser wax-moth (Achroia grisella) apparently possess the friendly pheromone. Therefore, these moths are not attacked as they enter the hive and kill bees.
Multiple Feature Identification. More than one feature is needed to classify some objects or events. For instance, chickens’ reaction to the shadow of a chicken hawk is based on shape, movement, and direction. If chickens’ receivers screened objects on the basis of a single feature, such as shape, chickens would run and hide from any delta-shaped shadow. Auditory criteria, like visual ones, may also require sound features as well as duration, temporal-pattern, or intensity features. Regardless of the sensory modalities involved, the criterion that the system uses to screen for a discriminative stimulus must be based on a sufficient number of features to describe all positive instances of the targeted object and no negative instances. Screening stimuli to identify those receptions that have a particular feature or set of features presents a logical problem to the system. The goal is to use specific features that identify all members of a class and no members that are outside the class.
For example, if all members of a particular class have the set of features A, C, and D, and no member outside the class has this set of features, the set of features could be used as a screening criterion for identifying or sorting every member of the class and no other member. If each of the features A, C, and D is possessed singly by members outside the class (but no outsider possesses more than one of the features), any combination involving two features (AC, AD, CD) would reliably admit all members of the class and no member outside the class. If the screening criterion is based on a single feature, this would result in overgeneralization (members that are not positive being identified by the system as positives). The problem of identifying an individual (such as the mother) is basically the same as that for screening a class of objects. Any combination of features possessed by mother on all occasions and possessed by no other object in the class would serve as an effective screening criterion. A possible problem occurs with visual stimuli. If the visual feature were not visible with all orientations of the object, identification of the object
would not be possible on all occasions. The solution is to use more than one visual screening criterion or a combination of visual and nonvisual screening criteria so that the object would be recognized in any orientation. This problem is largely limited to vision; however, a parallel problem exists for other sensory inputs: the problem of mixing. Auditory, olfactory, tactual, and other inputs may be mixed. A sound may occur in the context of other sounds, an olfactory reception in the presence of other olfactory receptions, and so forth. If the system is designed to key on a particular auditory or olfactory input, the system must be able to identify its features across a range of contexts that present stimuli from the same class. The solution may be the use of a single-feature screening criterion that identifies what is unique about the targeted reception.
Features That Serve as Response Criteria. As the illustrations in chapter 1 showed, the identification of the object and changes in the object require reference to more than one feature of the object. For instance, a lizard strikes at a cockroach only when the cockroach is within range. The criteria for action are that the object has features of an edible object and that it is within range. The object-identification features are enduring unique features of the object, those that distinguish it from other objects on all occasions. In contrast, the action criterion is based on features of the object that are variable. The object either has the feature of being within range or it doesn’t. The range feature varies while the object-identification features endure. The lizard’s system, therefore, must refer to multiple features of the stimulus to strike: the object-identification features, the range feature, and the specific-location features. After the system identifies the object as something edible and in range, it cannot strike it unless it identifies the specific location of the object.
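The distinction between enduring identification features and variable action features can be made concrete with a short Python sketch of the lizard example. The names and values are hypothetical: the feature labels A, C, and D (borrowed from the screening example), the constant `STRIKE_RANGE`, and the `Reception` structure are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

EDIBLE_SCREEN = frozenset({"A", "C", "D"})  # hypothetical identification features
STRIKE_RANGE = 5.0                          # hypothetical range, arbitrary units


@dataclass(frozen=True)
class Reception:
    features: FrozenSet[str]                 # enduring features detected
    distance: float                          # variable feature: distance to object
    location: Optional[Tuple[float, float]]  # variable feature: None if not localized


def is_target(r: Reception) -> bool:
    # Screening criterion: every identifying feature must be present.
    return EDIBLE_SCREEN <= r.features


def should_strike(r: Reception) -> bool:
    # Action criterion: object identified, within range, and localized.
    return is_target(r) and r.distance <= STRIKE_RANGE and r.location is not None


near_roach = Reception(frozenset({"A", "C", "D", "Q"}), 3.0, (1.0, 2.5))
far_roach = Reception(frozenset({"A", "C", "D"}), 12.0, (4.0, 11.0))
leaf = Reception(frozenset({"A", "Q"}), 2.0, (0.5, 1.9))
```

Here `far_roach` passes the screening criterion but fails the action criterion, while `leaf` is within range but is never identified as a target, so the two kinds of features do separate work.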
Universal Features and Transformations. The screening criteria are based on what is unique about all instances of a particular class. The action criteria are based on universal features of objects. All objects share universal features. All become apparently larger when they are closer. All result in some sensory movement when the object moves relative to the organism. All occupy a specific position when they occur in a setting. Millions of different shapes could be a particular absolute size on the retina, and all of these change in apparent size as a function of distance from the observer. Millions could become more intense or appear to move a particular direction or reach a certain absolute level of intensity. These universal features are the only basis for action. Projected changes are therefore referenced by the system (and agent) to these features. In summary, unique, enduring features are necessary for determining whether entities are discriminative stimuli. Universal features of the entity
are necessary for planning and evaluating action. The unique, enduring features are irrelevant for determining the specific action required; the universal features are irrelevant to identification of an object. Therefore, the organism that is capable of both discriminating and responding in particular ways to particular entities must have a system that uses multiple feature information about all positive examples. It must use the unique features to classify and the universal-transformation features to perform actions. Knowledge Requirements for Transformation Both the agent that directs responses and the infrasystem that evaluates those responses must have some form of knowledge of the various transformations associated with responses. The agent produces a response designed to achieve a particular transformation of the features. The infrasystem evaluates the response by determining the extent to which the transformations occur. For example, an organism is planning to eat an out-of-reach object. The system may be designed to determine that it is out of reach because of its size. To plan a strike, the agent needs knowledge not only that the object is not the right size on the visual field, but also what action is needed to appropriately transform the object’s size so that it is within reach. The projection that accompanies the approach plan must indicate how the image size is to change if a successful response occurs. To evaluate the success of the approach response, the infrasystem compares the universal transformation features projected on the basis of the plan with those that are realized. If both infrasystem and agent do not have knowledge of the universal transformations, they could not communicate. The agent plans and directs transformations. 
The infrasystem uses information about the nature of the transformation that is occurring to provide the agent with feedback about how the plans and response directives may have to be transformed to achieve intended outcomes. If a particular object is in a particular spatial relationship with the organism, that location could be identified through vision, olfaction, vibration, magnetic sensing, temperature analysis, or other sensory means. All the sensory systems employed by the organism would agree that the object is in a particular spatial relationship with the organism. As noted in chapter 1, for this to be possible, the system would have to utilize a spatial map that is capable of accommodating information from all types of sensors. Also the agent would have to decode information presented on this map so that the location information from any sensory modality may be transformed into corresponding response information. Whether a particular location derives
from hearing or olfaction, the agent would draw the same responseplanning conclusion about the route that approaches this location. To adjust responses, the agent must use information about discrepancies in projected and realized transformations. If the objective of the response is to achieve an image of a certain size on the retina, any effective format would involve a comparison of two superimposed pictures (one that is present and one that is provided by the infrasystem as the standard or goal of the response). Effective responses would be those that eliminate the size difference in the two pictures. Any difference in size would imply the response to be produced. It would also serve as the basis for the infrasystem to modify the enhanced sensations and as an updated content map.
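The comparison of a realized image with a goal image can be sketched as a simple loop in which the remaining size difference implies the next response. This is a rough illustration, not a model proposed in the text: the inverse-proportional relation between apparent size and distance, the fixed `step` per cycle, and the function names are all assumptions.

```python
def apparent_size(true_size: float, distance: float) -> float:
    # Assumed model: apparent size is inversely proportional to distance.
    return true_size / distance


def approach(true_size: float, distance: float, goal_size: float,
             step: float = 0.5, max_cycles: int = 1000) -> float:
    """Advance until the realized image is at least as large as the goal image.

    Returns the final distance to the object.
    """
    for _ in range(max_cycles):
        difference = goal_size - apparent_size(true_size, distance)
        if difference <= 0:
            # The two superimposed "pictures" now match; no response is implied.
            break
        # A positive difference implies a response that reduces the distance.
        distance = max(distance - step, step)
    return distance


final = approach(true_size=2.0, distance=10.0, goal_size=1.0)
```

The loop stops as soon as the size difference is eliminated, which mirrors the idea that any remaining difference implies the response to be produced and that no difference implies none.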
SUMMARY The nonlearned performance of the organism reveals details of the functions that have to be performed. The organism has preknowledge of how to do certain things. The spider knows how to attack a prey. The spider does not attack leaves or other things that move. Therefore, the presence of a prey must somehow trigger knowledge of what to do. The same knowledge is presented on all occasions of prey, and some aspects of the spider’s behavior is the same on all occasions. Yet other aspects vary from one situation to the next. The general knowledge could be programmed as a reflex— some sort of general information that is automatically issued every time certain sensory conditions are met. However, only the aspects of the behavior that are the same on all occasions could be reflexive. The specific details that differ from one application to the next could not be reflexive. These facts imply the various functions that the system must perform and, ultimately, the details of the division of labor between those functions that are reflexive and those that are functionally influenced by the consequences of behavior. This division of functions implies a two-part performance system—the reflexive infrasystem and the consequence-driven agent. The infrasystem receives sensory data and reflexively operates on it. The infrasystem receives, screens, calculates, and modifies details of incoming stimuli. The receiver function converts incoming stimuli into sensation and screens for the presence of specific features that trigger a hardwired pursuit (response-related sensory receptions). The calculator function produces facts that are relevant to the pursuit. It accepts information provided by the agent’s plan and projections, computes the difference between these projections and what is currently occurring, and transmits those facts to the modifier. The modifier function adds sensations that have no counterpart in the physical features that enter the receiver. 
These are the enhanced sensations and the content map to be used to plan and produce responses in the current setting. The modifier changes sensory features in response to the facts that the calculator presents to the modifier. The operations of the modifier are reflexive. Given a particular fact from the calculator, the modifier reflexively adds so much of a particular sensation to the deviation or representation of features that the calculator identifies as a target for the agent’s attention. The modifier issues three types of transmissions to the agent: the parts of the initial reception not enhanced with sensations, the sensory inputs enhanced with sensation, and the content map. All this information is transmitted reflexively to the agent.
The agent is sensitive to the enhanced sensations and changes in sensation. These provide the agent with information about which behaviors are leading to reinforcement and where to focus attention to change features that are not achieving projected outcomes. The agent’s central activity is to construct a plan. The plan logically requires information about the discriminative stimuli that are involved, the nature of the response that is to be planned, the criteria for determining whether the response is being conducted properly, and the details of the current setting that are relevant to the response. The agent also requires information about the responses that will be used to create changes in the current setting. Much of this information comes from the infrasystem. Some comes from the agent’s decisions. All involve the agent resources.
Agent resources provide the agent with the tools needed to make plans that will produce the appropriate changes in the current setting. The agent has knowledge of how to direct specific responses. The agent also has knowledge of how the projected response will change details of the current setting. These areas of the agent’s knowledge interact.
The agent must be able to map both information about the current setting and information about the response on the same spatial map. Transformations are the currency for both the sensory details of the setting and the response. The current setting is not the one that the content map describes. The discrepancy between what exists and what is described by the content map implies a transformation that is needed to eliminate the discrepancy. To achieve this transformation, the agent must specify a situation-specific response plan. The plan implies the response directive. The plan also implies the specific changes that are projected if the plan succeeds. Ongoing information is necessary because the response plan is uncertain. Some form of projection is logically necessary for the system to be able to judge whether the response will proceed as planned, and whether the situation has changed in a way that requires a modification of the plan.
Both agent and infrasystem must be able to respond to multiple features of the setting and of the response being produced. The discriminative stimulus is identified by the set of enduring, unique features that are shared by all instances of the discriminative stimulus and by no other stimuli. The changes in the discriminative stimulus are based on a different set of features: the universal transformation features shared by virtually any object. To plan responses, the agent specifies multiple features of the response. Only if multiple features are used in the plan would the agent be able to adjust the response without terminating it and then starting over.
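As a rough illustration of the calculator function summarized above, the following Python sketch compares a projected image size with the size currently received and reports the discrepancy that a response would need to eliminate. The names `Projection`, `discrepancy`, and `implied_response`, the tolerance value, and the use of a single number for image size are all assumptions made for illustration; the text does not specify any particular representation.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    """Hypothetical stand-in for the agent's projected outcome."""
    target_size: float  # image size the response is supposed to achieve


def discrepancy(projection: Projection, observed_size: float) -> float:
    """Difference between the projected size and the size now observed."""
    return projection.target_size - observed_size


def implied_response(diff: float, tolerance: float = 0.05) -> str:
    """Map the size difference to the response it implies (assumed labels)."""
    if abs(diff) <= tolerance:
        return "hold"      # no difference left to eliminate
    if diff > 0:
        return "approach"  # image too small, so move closer
    return "retreat"       # image too large, so back away


goal = Projection(target_size=1.0)
for observed in (0.4, 0.8, 1.0, 1.3):
    diff = discrepancy(goal, observed)
    print(observed, round(diff, 2), implied_response(diff))
```

The point of the sketch is only that the sign and magnitude of the difference are enough to imply a direction of response, which is the role the summary assigns to discrepancies between projections and current reception.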
PART II: BASIC LEARNING
Chapter 5
Perspectives on Basic Learning
Chapters 5 through 9 deal with basic learning. Basic learning is learning governed by primary reinforcers. Primary reinforcers are not unique to learning, however. They are hardwired components that occur in some form in any completely hardwired system that produces operant responses.
LEARNING AS ADAPTATION
The purpose of all learning is to create flexibility so that different members have different content maps, each consistent with the experiences the individual received. This flexibility is particularly important for organisms that have to adapt to changing circumstances. In contrast, for all hardwired systems, one size has to fit all: all individuals and all applications that have particular stimulus features. The presence of specific sensory conditions reflexively results in the presentation of a content map that has specific behavioral demands.
LEARNING AND HARDWIRED PROCESSES
From a functional standpoint, primary reinforcers drive the completely hardwired system and drive the system that is capable of learning. The only difference between the two is the nature of the content map that is presented with the occurrence of a primary reinforcer. For the organism that is capable of learning but hasn’t learned what to do, the map is initially either
incomplete or absent. Through some specific encounters with sensory conditions, the map is completed or created. The end result is a content map that has the same features as any completely hardwired content map.
1. Both the hardwired and learned content maps apply to a range of situations.
2. Both maps are evoked by the presence of specific stimulus conditions.
3. Both maps are logically limited to providing general information about the unique features that are common to all positive examples (those that call for a common behavioral strategy) and common to no negative examples.
4. Both maps require planning if they are to be applied to any specific setting.
5. Both maps are terminated when specific stimulus conditions exist.
6. Both maps generate the same problems if they are too general or too specific.
Because both maps function in the same way, it would be impossible to determine whether a particular behavioral strategy was learned or hardwired without information about the organism’s history. The difference between the hardwired content map and the learned content map is not one of form or function. The difference is that a learned content map requires acquisition steps and steps for modifying it if it is not working as well as anticipated. The learned content map is based on experience, which is sufficient and necessary to account for the resulting details of the map.
The problems generated by the hardwired and learned content maps are functionally the same. If the content map is too broad, it admits examples that are negatives. In contrast, if the map is too specific, the map classifies some examples that are actually positives as negatives. We see examples of both in hardwired systems in birds. The male robin provides an example of a hardwired content map that is too general. It battles its reflection in a window for hours, day after day.
Its content map for territorial aggression treats all male robins and two-dimensional reflections as positive examples of a competitive male robin. The content map would have to be modified with the addition of specific content to rule out reflections as enemies. The booby provides an example of a content map that is too specific. It responds to some interlopers as threats, but it does nothing as the skua steals its eggs. Its content map for nest protection is not general enough to include the skua as a positive example of an interloper. The correction would require identifying a set of features shared by all potential egg stealers and chick killers and keying the activation of the content map to these features. (An alternative solution would be to have an aggregated list of specific interlopers; however, if the identification is based on common features shared by all positives, the solution would involve making the map more general so that it would include all threats.)
The hardwired and learned content maps create the same problems if they are too general or too specific. The learned content map also has a possible problem that rarely occurs in hardwired content maps: The map may be keyed to stimulus conditions that are irrelevant to securing the primary reinforcer. For instance, the pigeon bobs its head and then pecks at a button from time to time. Every now and then food is dispensed following a peck. The presentation of the food, however, is not geared to the head bobbing; rather it is entirely a function of a tone and the button peck. A tone signals that pecking the button will lead to food. So the pigeon has a spurious content map that is based completely on a feature that does not predict the presentation of food. From the bird’s standpoint, however, the presentation of food is correlated with the head bobbing because the food always follows the head bobbing and the button pecking. Hardwired content maps of this type are practically impossible because the map (however it is formulated) has to be adaptive, which means that it will be efficient and keyed to true predictors of primary reinforcing possibilities.
THE PROCESS OF LEARNING
In a real sense, the goal of the learning system is to create learned content maps that function as hardwired content maps. The format for creating learned content maps requires both sophisticated additions and extensions to the hardwired system. However, learned content maps are functionally the same as any hardwired map. They permit the organism to predict future events and primary reinforcers on the basis of current sensory conditions. The special reactions that organisms have to primary reinforcers were not designed so that learning could occur.
They were designed so that hardwired performance could occur. All performance that results in operant responses (whether learned or hardwired) requires the hardwired underpinnings and functions specified in chapters 1 to 4. The general design problem facing a performance system that adds the functions required for learning is how to identify specific stimulus features that predict specific future outcomes. Solving this problem requires both logic and memory. Requirements of Learning For the learning system to create content maps, the system has to consider what occurs in different settings. Gathering and processing this information implies additions and extensions to the hardwired system.
Logic. Logic is required so that the system is able to (a) rule out spurious predictors that occur before the primary reinforcer but that do not predict, and (b) identify the unique qualitative features of the stimulus setting that do predict. Memory. Memory is required to perform the logical functions because the primary reinforcer occurs at Time 2 and the possible predictors occurred at Time 1 and may not be present at Time 2. If the system does not have some sort of record of Time 1, it would be impossible for the system to identify anything that could be a possible temporally prior predictor of the primary reinforcer. Therefore, the overall process implied is for the system to transport the various possible predictors that occur at Time 1 to Time 2. The system then applies logic to rule out those candidates that do not predict and identify the feature that does predict. Because this process is logically impossible on a single trial, the logic of ruling out possibilities must be relegated to later trials. Furthermore, it must be designed so that it is relatively efficient at ruling out spurious predictors. This overall scheme raises a plethora of issues about specific procedures and processes—from how the predictor is represented to the possible formats and logic the system uses for ruling out possibilities. Learned Content. To learn a content map, the organism must learn a specific relationship. The map “Move to approach X” refers to both a behavior and discriminative stimulus (X). The assumption that underpins this content map is that X predicts the later presentation of the primary reinforcer. The presence of X activates the system to perform approach responses. Therefore, in its simplest form, what must be learned before a content map is formatted is that X predicts the primary reinforcer. If this relationship is not learned, no behavior would be implied. 
If the system learned only that X predicted “something that is not specified,” the system would not be able to respond because it would not know whether the responses were successful or what sort of changes the response was to achieve in the current setting. If the system knew only that the primary reinforcer was imminent but did not know what specific stimulus to key on to track the primary reinforcer, no response would be possible. The learner would be limited to experimental behaviors and hope that they would reveal the primary reinforcer. Only if the learner learns some form of the relationship that X predicts the primary reinforcer could a content map be formulated. Once it is given that X predicts the primary reinforcer and that contact with the primary reinforcer issues the reinforcement, the system would have sufficient information to formulate the content map: “To secure the primary reinforcer, move to approach X.” The relationship that X predicts the primary reinforcer carries implications for the behavior of moving, for the fact that the behavior is referenced to X, and for the objective of the behavior (attaining the reinforcing consequences). In summary, the process of inferring functions needed for basic learning is based on two assumptions: (a) the prevailing context for all learning is temporal in nature and addresses specific changes in events over time, and (b) the basic relationship for all operant learning is of this form: S1 predicts primary reinforcing consequence S2 and does not necessarily include reference to specific “responses.”
Agent and Infrasystem. Because learning occurs within the context of performance, the system that learns has the same basic components and relationships as the hardwired system. The part of the system that plans and is influenced by the consequences of behavior is the agent. The part of the system that performs those operations that may be reflexively programmed on all occasions is the infrasystem. Specific functions that are unique to learning are simply added to the design of the system that is required for performing, which means that there are new roles for the infrasystem, new roles for the agent, and new interactions between infrasystem and agent. However, the agent still plans and directs the behavior. The infrasystem performs the universal functions needed to support the agent functions.
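The memory requirement described in this section (carrying the possible predictors from Time 1 forward to Time 2) can be pictured as a bounded record of recent sensory input. The sketch below is only illustrative: the class name `RecentEvents`, the fixed window length, and the string feature labels are assumptions, not details given in the text.

```python
from collections import deque


class RecentEvents:
    """Hypothetical rolling record of recently sensed features."""

    def __init__(self, window: int = 3):
        # Only the last `window` time steps are retained.
        self._buffer = deque(maxlen=window)

    def record(self, features: set[str]) -> None:
        """Store the features sensed at the current time step."""
        self._buffer.append(frozenset(features))

    def candidates(self) -> set[str]:
        """Features available as possible predictors when S2 occurs."""
        found: set[str] = set()
        for features in self._buffer:
            found |= features
        return found


memory = RecentEvents(window=2)
memory.record({"tone"})             # falls out of the window below
memory.record({"light", "left"})    # Time 1
memory.record({"glass"})            # just before the reinforcer
print(sorted(memory.candidates()))  # ['glass', 'left', 'light']
```

Without such a record, nothing sensed at Time 1 would be available at Time 2, so no temporally prior predictor could be identified. The window length here simply stands in for whatever limit the system places on how far back candidates are kept.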
TEMPORAL BASIS OF LEARNING
If any learning is to connect a future outcome or reinforcing event with the current state, that learning must be based on some cues that are present now. Furthermore, the learning must be expressed as a relationship between the current cues and the specific reinforcing event that follows. The most basic learning is not a vague association, but rather an abstraction of the form S1 → S2, in which S1 is the discriminative stimulus, S2 is the primary reinforcer, and S1 and S2 are temporally sequenced. Obviously some outcomes occur only if certain responses are produced. Therefore, it would seem that the response should be part of the scheme expressed as S1 → R → S2. In fact, this relationship provides a practical articulation of what occurs in any learning setting. The problem of referring to the response as part of the basic relationship, S1 → S2, is revealed by considering the requirements of hardwired relationships between stimulus and operant responses. The specific responses produced on a particular occasion are a function of a plan for the current setting. This plan derives from a content map. The content map may indeed specify some general forms of
behavior. That content map, however, must be based on a basic relationship of something being a reliable predictor of something else. The primacy of the relationship and the secondary role played by responses are logically implied by the range of possible behaviors that may be created to achieve primary reinforcing consequences. For instance, we could teach the relationship that a light predicts the presentation of food in a way that shows the dependence of the specific responses on the understanding of the basic relationship that S1 → S2. The basic relationship is that a light occurs and 5 seconds later food appears behind a glass panel. The learner learns that the light is a predictor, although no instrumental response is produced to gain access to the food. From a purely behavioral perspective, the learner did the same thing on various trials: it observed the place where the food was presented (and possibly walked to the glass). These are the specific approaches that were reinforced. After learning is established, we could show that the knowledge of S1 → S2 is capable of generating a variety of response strategies. After learning has occurred, we remove the glass. The learner now produces responses not reinforced during training. If the learner had not learned the basic relationship, the learner’s responses when the glass was removed would be inexplicable.¹ We could further show that the various responses the learner produces are dependent on knowledge of the basic relationship S1 → S2. We could introduce a tunnel through which the learner would have to crawl to reach the food. We could place a moat between the learner and where the food would appear so that the learner had to swim to reach the food. We could install a fenced route that would take the learner around the food and approach it from the opposite direction of the original approach. All these untrained behaviors are clearly referenced to the relationship S1 → S2.
We can also demonstrate the dependence of specific responses on the general relationship S1 → S2 through classical conditioning experiments, such as the one referred to in chapter 2. After conditioning in which a buzzer is followed by an aversive air puff to the eye, the dog reliably blinks when the buzzer sounds regardless of whether the air puff follows. According to the analysis of inferred functions, the dog learned a relationship that accounts for the observed behavior and any other behavior that is consistent with the basic relationship of what predicts what.
¹ It may be argued that the situation is different when the glass is removed and that the absence of the glass prompts some sort of previously learned, generalized response that accounts for the difference in behavior. Although the cause of the different responses (previous learning) is not challenged, the logic is. A particular behavior was reinforced. It did not lead to an unqualified behavior. It led to what amounts to the best response available given that the light predicts the food and the food is a primary reinforcer.
If the relationship is learned, it should be possible to show that the dog produces responses that were not trained during the experiment, but that are generated by the knowledge that the buzzer predicts the aversive air puff. Indeed, by loosening the straps that hold the dog’s snout in place, we observe that the dog will move its head on presentation of the buzzer. If we permit greater response latitude by removing all the restraints from the dog, the dog will not remain close to the source of the air puff. None of these behaviors is rigorously explained by the response induced by the training. Nor are they explained convincingly as latent responses because there is an indefinitely large set of other responses that the dog could produce, none of which were observed during training. It is not possible to account for the dog’s behavior without assuming that the dog has learned these fundamental relationships: (a) the room predicts something aversive, and (b) the buzzer in the room predicts an air puff. Knowledge of these predictions accounts for all of the learner’s behaviors—the “conditioned eye-blink response,” the operant response of turning the head, and the avoidance and escape behavior associated with the setting. Note that these responses are of three different response classifications—escape, avoidance, and conditioned response. Explanations that assume less knowledge about what predicts what are incapable of explaining the observed outcomes. If an explanation suggests that the dog learned a response that somehow generalized to the other responses, the response has the magical properties of containing information and knowledge of relationships. How could the response of blinking possibly generalize to the response of jumping from a table and running from a room? If the explanation assumes that the learning is a product of stimulus–response chains, the explanation cannot rigorously account for the conditioned response, let alone the various operant behaviors. 
Any behavior may be described as chains of component behaviors. The relevant question, however, is to predict something about the chain before the fact. A related problem is that the response-chain explanation is guilty of reductionism. If the same response components are involved in two chains that have different purposes, the chains are basically irrelevant to why a chain is deployed in a particular setting. If walking is used to avoid something and, on other occasions, to approach something else, the details that account for walking do not suggest why the walking occurred. It occurred only because the organism has some form of knowledge that S1 → S2. If a behavioral approach is used to describe the dog’s various behaviors, it is possible to identify the buzzer as an SD (discriminative stimulus) and a conditioned negative reinforcer that is now generalized; however, the explanation is not edifying with respect to what is generalized. The only thing that could be generalized to account for the head turning, the conditioned eye-blink, and the avoidance-escape behaviors is knowledge of the relationship between buzzer and air puff. Once this knowledge is identified, everything else follows without additional machinations. Certainly the process of how it is possible for the dog to attain and apply this knowledge must be explicated, but the fact is that knowledge frames whatever occurs in any specific setting.
Responses
The stipulation that basic learning takes the form S1 → S2 does not imply that responses are never represented as predictors of primary reinforcers. If the reinforcer is not revealed or created unless a particular behavior is produced, the learner that attains the reinforcer consistently would have had to learn the relationship between execution of a behavioral strategy and the reinforcer. In this case, the responses that lead to reinforcement are coded as R. S1 is the setting event that makes the response possible, and S2 is the goal of the response (the positive consequences). If we include the knowledge of the response in the equation, we have something of the form: S1 → R → S2, with the relationship that R will lead to S2, cued by the discriminative stimulus S1. (When S1 is present, the relationship R → S2 is possible.) For example, the bird learns that if somebody presents food, the bird can obtain that food by standing on one foot. For the bird’s representation, the presence of food predicts that the act of standing on one foot leads to securing the food.
Content Maps
The learning required to formulate a content map is of the form S1 → S2. The content map, however, includes reference to the behavior that exploits this relationship. The content map is of the form S1 → R → S2, with R referring to the behavioral strategy that is planned or preferred. The analysis tends to refer to the content map that is learned without referring to the basic relationship that underpins the content map.
The practical reason for referring to S1 → R → S2 is that no behavioral functional relationships are observed that do not have a response. Unless the learner formulates a content map that includes reference to some response component, the behavior is never observed.
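One way to picture the distinction drawn here is to store the learned relationship S1 → S2 separately from the response strategy R that exploits it. The sketch below is hypothetical: `Relationship`, `ContentMap`, `plan`, and the setting-to-strategy table are illustrative names and values loosely based on the light-and-food example, not a mechanism the text specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Learned prediction: s1 is followed by s2."""
    s1: str
    s2: str


@dataclass(frozen=True)
class ContentMap:
    """S1 -> R -> S2: a relationship plus the planned response."""
    relationship: Relationship
    response: str


# Assumed strategies for settings like those in the light/food example.
STRATEGIES = {"open floor": "walk", "tunnel": "crawl", "moat": "swim"}


def plan(relationship: Relationship, setting: str) -> ContentMap:
    """Derive a setting-specific content map from one learned relationship."""
    return ContentMap(relationship, STRATEGIES[setting])


light_food = Relationship(s1="light", s2="food")
for setting in STRATEGIES:
    cmap = plan(light_food, setting)
    print(f"{cmap.relationship.s1} -> {cmap.response} -> {cmap.relationship.s2}")
```

Under these assumptions, a single stored relationship yields several different content maps, while the response component is what makes any of them observable as behavior.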
LEARNING BASIC COMPONENTS OF RELATIONSHIPS Five acquisition steps are implied for learning a content map from specific sensory inputs. The steps outlined here are greatly amplified in the following chapters.
1. The presentation of a primary reinforcer reflexively causes the system to represent those stimulus changes and features that occurred both immediately before and during the event. This reinforcing event may be either positive or negative. The process of recording what occurred immediately before the event is not a learned process, but must be a hardwired function of the infrasystem for learning to occur. The representation of temporally prior events also requires a memory and some form of ongoing process that records either a full range of ambient sensory input or specified categories of stimuli.
2. The representations of events or features that occurred prior to the presentation of S2 are enhanced with secondary sensations so they may serve as possible discriminative stimuli. As a general rule, the stronger the primary reinforcer, the more intensely and extensively features are represented and enhanced.
3. The system reflexively constructs a trial content map of the form S1 → SR → S2. If Features A to D were recorded as events prior to S2, the infrasystem classifies all of them, including the response strategy (SR), as possible predictors of S2.
4. The trial content map is applied to subsequent encounters with the various Features A to D or with the response strategy. The learner plans responses based on the assumption that some combination of Features A to D predicts S2. If the learner later encountered Feature C, the learner would produce the response strategy that anticipates S2. If the system encountered the combination of Features A and B, the learner would produce the same response strategy.
5. The trial content map is modified each time S2 occurs. The system compares the projection based on the trial content map with the record of what actually occurred. The trial content map (the relationship of S1 → SR → S2) is either confirmed or contradicted. If the trial relationship does not predict, it is disconfirmed and weakened. If it predicts, it is confirmed and strengthened, which means that it would be retained by the system as a trial content map.
An illustration of this process would be an organism learning to identify the features that are unique to flowers with ample nectar. The primary reinforcer is a flower with ample nectar. When a flower with ample nectar is encountered:
1. The organism’s infrasystem reflexively represents the features of events that occurred immediately before reaching the flower and the distinguishing features of the flower (arbitrarily designated as Features A–D, which could represent height, redness, near the stream, and having large leaves).
2. The system enhances the representation of each of these features observed in the flower so that each feature commands the agent’s attention on the subsequent trial.
3. The system formulates a trial map that says, in effect, “Fly to flowers with any single feature or combination of Features A, B, C, or D” (the features of the flower with ample nectar).
4. If the organism observes a flower with Features A and B, the organism complies with the map and flies to this flower.
5. The system responds differentially to outcomes that are positive and those that are negative. If the outcome is positive, the system strengthens Features A and B as predictors. If the outcome is negative, the system weakens Features A and B as possible predictors, but does not weaken the features of the original example that are not present in the current flower (Features C and D).
The five-step process is repeated until a feature or set of features is confirmed as a reliable predictor of nectar.
Default Content Maps for Basic Learning
Not all systems that learn particular content require the same amount of learning. The reason is that different systems may be designed to create learning tasks that vary in difficulty. The degree of difficulty of what is learned is the degree of deviation from a completely hardwired content map. A hardwired map provides for all components: discriminative stimulus, behavioral strategy, and reinforcing consequences. If this hardwired content map is completely replaced, the learner must learn both the discriminative stimulus and response strategy that reliably lead to the primary reinforcer. In addition, the learner must learn the rule that relates the discriminative stimulus, response strategy, and reinforcing consequences. This learning would present maximum difficulty because it requires the maximum amount of learning that could be required within the framework of basic learning.
Less learning is required if the content map has some, but not all, of the provisions of the hardwired map. Prompting what is learned through partly completed content maps has potential benefits and drawbacks for a system. The great benefit is that what is to be learned is tightly framed, and therefore a relatively high rate of acquisition would be predicted. Fewer possibilities must be tested. Therefore, learning the relationship is faster. Another advantage is that the system needs less elaborate memories and logic provisions if the scope of what is to be learned is circumscribed by the default content map. The disadvantage is that the system is more inflexible than one that places fewer constraints on the nature of the learning.
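The five-step process illustrated earlier with the nectar example can be sketched as a small update loop. This is a minimal sketch under stated assumptions: the class `TrialContentMap`, the numeric strengths, the step size, and the rule that a feature counts as a live candidate while its strength is above zero are all hypothetical choices, since the text describes strengthening and weakening only qualitatively.

```python
class TrialContentMap:
    """Hypothetical trial map: feature -> strength as predictor of S2."""

    def __init__(self, features, initial=1.0, step=0.5):
        # Steps 1-3: every feature recorded before S2 starts as a candidate.
        self.strength = {f: initial for f in features}
        self.step = step

    def should_approach(self, observed):
        # Step 4: respond if any observed feature is still a live candidate.
        return any(self.strength.get(f, 0.0) > 0.0 for f in observed)

    def update(self, observed, reinforced):
        # Step 5: adjust only the candidates present on this trial.
        change = self.step if reinforced else -self.step
        for f in observed:
            if f in self.strength:
                self.strength[f] = max(0.0, self.strength[f] + change)


trial_map = TrialContentMap(["A", "B", "C", "D"])
# (features of the flower visited, whether nectar was found)
trials = [({"A", "B"}, False), ({"C"}, True), ({"A", "B"}, False)]
for observed, reinforced in trials:
    if trial_map.should_approach(observed):
        trial_map.update(observed, reinforced)
print(trial_map.strength)
```

In this run, Features A and B are weakened on the two unreinforced trials, Feature C is strengthened, and Feature D is left untouched because it was never present, which mirrors the rule that absent features of the original example are not weakened.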
The five-step illustration could involve a high degree of prompting by the default content map or relatively little. In all cases, however, the same five steps would ultimately occur. Let’s say that members of a particular species have no extant content map before learning occurs. Nectar is a primary reinforcer for this learner, but the learner possesses no particular strategies for obtaining nectar. The relationship the learner must learn is to fly to red flowers—the ones with ample nectar. The learner randomly flies around and lands on a red flower and encounters nectar, the primary reinforcer. The system reflexively represents and enhances all those features that are present in this setting and all features and responses that occurred immediately before the nectar appeared. This list of features would be extensive, including direction, time of day, temperature, wind direction, and all surrounding objects and their features. Next the organism would have to rule out possibilities. The memory requirements for such a task are quite imposing, and the logical formats are extensive. Some details of the relationship that the learner must learn have to do with flying. Flying is something that occurred before the nectar was encountered. So a possibility is that flying to anything would lead to nectar. This possibility would have to be disconfirmed by flying to something that did not generate nectar. However, if the system were designed so that disconfirmation categorically resulted in a feature being ruled out, the system that disconfirmed flying as a behavior that predicts nectar would not fly to seek nectar on subsequent trials. The system that must learn multiple features of the relationship (flying and approaching red) requires more elaborate logic for confirming and disconfirming possible features. 
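The difficulty with categorical disconfirmation noted above can be made concrete with a small comparison. The sketch is hypothetical throughout: `categorical_update` implements the rule the text warns against, while `conjunctive_update` is just one assumed example of a "more elaborate" format (ruling out a failed combination rather than its parts), not a format the text proposes.

```python
def categorical_update(candidates, present, reinforced):
    """Rule out every candidate present on a trial that is not reinforced."""
    return candidates if reinforced else candidates - present


def conjunctive_update(ruled_out, present, reinforced):
    """Record only the failed combination, leaving its parts available."""
    return ruled_out if reinforced else ruled_out | {frozenset(present)}


candidates = {"fly", "red", "morning"}
failed_trial = {"fly", "blue"}  # flew to a blue flower, found no nectar

# Categorical logic discards "fly" itself, so the learner would stop flying.
print(sorted(categorical_update(candidates, failed_trial, reinforced=False)))

# Conjunctive logic rules out only "fly to blue"; flying to red stays open.
ruled_out = conjunctive_update(set(), failed_trial, reinforced=False)
print(frozenset({"fly", "red"}) in ruled_out)
```

The first result no longer contains "fly", which is the unwanted outcome described in the passage. The second format keeps flying available at the cost of storing combinations, which is one way to read the remark that learning multiple features requires more elaborate logic.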
At the opposite extreme of the unprompted content map is a structured default content map of the form, “Fly to flowers that have features to secure nectar.” This map provides for the response strategy, specifies the primary reinforcer, and shows the relationship between the features of a flower, the behavior, and the primary reinforcer. It lacks only a description of the specific features of flowers. The learner knows that it will fly, that it will fly to flowers (or facsimiles), and that the pursuit will result in nectar. Because the only detail to be learned is the specific features of the flower, a less elaborate categorization, memory system, and logical analysis is needed to achieve the learning. Furthermore, because the learning is so tightly circumscribed, the system could even be designed to present a menu of possibilities in the same way it does when the agent plans an approach in the current setting. Types of Default Content Maps There are three types of default content maps. Each type requires learning different content:
1. In the presence of ____, do behavior B to secure reinforcement R.
2. In the presence of X, do behavior ____ to secure reinforcement R.
3. In the presence of ____, do behavior ____ to secure reinforcement R.

Note that the primary reinforcer (R) is given for all types of maps; therefore, the agent knows the goal of the response strategy. Type 1 provides the behavior and reinforcement, but would require the learner to identify the discriminative stimulus. (In the presence of ____, fly to source B to secure nectar.) Type 2 provides the discriminative stimulus and reinforcer, but not a behavioral strategy. (In the presence of a rodent, do ____ to catch it.) Although Type 3 seems to provide no prompting, it does provide some in the form of a general rule that the learner should do something to find out how to attain or avoid the reinforcing consequences.

These three types are direct default maps. They prescribe the learning that is possible. There are also indirect default maps. Their purpose is not to lead to the completion of a specific behavioral strategy, but to increase the possibility that the learner will encounter possible predictors of the primary reinforcer at a higher rate. One type of indirect content map is designed to prompt the agent to secure more information about an unknown object or event. For instance, “In the presence of something that is small, that moves, and that is not known, investigate to secure information.” Securing information is the primary reinforcer. The information may be positive or punishing. The agent, however, is strongly urged to investigate (perform behaviors that will reveal more information about the event or object). The investigation provides information about the event or object. The investigation answers the question, “What does this particular thing predict?” The most basic learning derived from this indirect map is whether a particular class of objects is reinforcing or punishing.
If the object that is approached is a skunk, the features of the skunk will be learned so that future contact with these features will lead to avoidance behavior. Not only would the learning affect how the learner responds to skunks in the future, it would affect the default content map so that a skunk is no longer classified as something unknown. It is known and classified as something punishing. Although the hardwired default content map that makes the organism curious about certain classes of events may lead to punishing consequences, the organism remains curious on future occasions. It may be less likely to approach an object of curiosity, but it will observe the object. The object that cannot be classified as a known object remains an object that commands the attention of the agent.
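The three direct default content maps can be pictured as one record with optional slots. The following is only a sketch: the class name, field names, and example values are hypothetical, and the indirect (investigative) map is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefaultContentMap:
    """Direct default content map; None marks a blank the learner must fill."""
    stimulus: Optional[str]
    behavior: Optional[str]
    reinforcer: str  # the primary reinforcer is given in every type

    def blanks(self):
        """Return the names of the parts that learning must supply."""
        return [name for name in ("stimulus", "behavior")
                if getattr(self, name) is None]

    def map_type(self):
        """Classify the map as Type 1, 2, or 3; 0 means nothing is left to learn."""
        missing = set(self.blanks())
        if missing == {"stimulus"}:
            return 1
        if missing == {"behavior"}:
            return 2
        if missing == {"stimulus", "behavior"}:
            return 3
        return 0


# Illustrative instances loosely based on the examples in the text.
type1 = DefaultContentMap(stimulus=None, behavior="fly to source", reinforcer="nectar")
type2 = DefaultContentMap(stimulus="rodent", behavior=None, reinforcer="catching the rodent")
type3 = DefaultContentMap(stimulus=None, behavior=None, reinforcer="food")
```

In this picture, learning amounts to replacing each None with specific content, and a map with no blanks corresponds to a fully specified (completed or hardwired) content map.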
TYPES OF BASIC LEARNING

In one sense, basic learning is a simple extension of what the hardwired organism does when it creates a plan to obtain food in the current setting. The difference is that, with learning, the successful plan is generalized and retained so that variations of the same strategy may be applied to other settings. The missing details in the default content map prescribe what the organism must learn. The three types of default maps result in three types of basic learning.

Discrimination-Only Learning

The content map for the simplest type of learning would be Type 1: “In the presence of ____, do behavior B to secure reinforcement R.” This content map would present the urging and even specify the behavior form. The learner would only be required to learn the specific conditions that signal the occasion for the behavior. The learning involves simply filling in the blank in the default content map. With this default content map, a bee could learn various rules for various settings. If yellow flowers were particularly rich in nectar, the bee could learn the modified content map, “Fly to yellow to obtain nectar.” If large flowers, tall flowers, or flowers of any other discriminable feature were especially productive, the bee would be capable of learning the rule for the most productive foraging strategy. Note that bees have experimentally demonstrated the capacity to learn to fly to symmetrical flowers (Giurfa, Zhang, Jenett, Menzel, & Srinivasan, 2001). The learning was measured by the extent to which they flew to and hovered near untrained examples that were symmetrical versus nonsymmetrical. Although the learning is sophisticated, the system could be designed so that all the bee had to learn was the common feature of examples that were sources of nectar.

Imprinting

An even less complicated version of discrimination-only learning is the content map for imprinting specific behaviors, such as the following behavior of geese.
The newly hatched chick has a map of the form, “Stay close to ____ to obtain positive sensations.” The imprinting is a special type of one-trial learning that tends to be irrevocable. The content map is completed with the particular discriminative stimulus that is present following a particular event (hatching).
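Discrimination-only learning, as described for the bee, can be sketched as filling the single blank of a Type 1 map with the most productive feature. The visit data, the feature labels, and the choice of mean nectar yield as the selection criterion are all assumptions made for illustration; the text does not specify how the system selects the feature.

```python
from collections import defaultdict


def most_productive_feature(visits):
    """Return the flower feature with the highest mean nectar yield.

    `visits` is a list of (features, nectar) pairs, where `features` is a set
    of feature labels and `nectar` is a number. Both are hypothetical inputs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for features, nectar in visits:
        for feature in features:
            totals[feature] += nectar
            counts[feature] += 1
    return max(totals, key=lambda f: totals[f] / counts[f])


# Hypothetical foraging record: yellow flowers are rich in nectar.
visits = [
    ({"yellow", "large"}, 5.0),
    ({"yellow", "small"}, 4.0),
    ({"red", "large"}, 1.0),
    ({"red", "small"}, 0.0),
]

learned = most_productive_feature(visits)
# Fill the blank in the default map "Fly to ____ to obtain nectar."
completed_map = f"Fly to {learned} to obtain nectar."
```

Imprinting would correspond to the degenerate case in which the blank is filled from a single observation and never revised.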
Response-Only Learning

The Type 2 default content map is designed for response-only learning. The content map before learning specifies the discriminative stimulus and reinforcement, but not the response. “In the presence of X, do behavior ____ to secure reinforcement R.” Examples of response-only learning occur with babies. The human infant sees an object on a mobile and tries to grab it; however, the grabbing is not consistent. The object is the discriminative stimulus. The reinforcement is the consequence of grabbing it. The observed behavior implies what the learner must learn. If various uncoordinated efforts of tracking are observed for various objects (bottle, mobile, etc.), the conclusion is that the learner must learn a general skill of coordinating the visual map with the motor map.

Response Strategy Versus Motor Learning. The response-only condition has two distinct behavioral subtypes. The first was described earlier: the behavioral pattern clearly implies that the learner must learn the motor response or something about how the response system is correlated with sensory input. Another variation of response-only learning involves the learning of a response strategy that may be executed by mobilizing behavior already in the agent’s response repertoire. The learning of a response strategy is the learning of what to do, not how to do it. For instance, a cat may be permitted to escape from a room by pressing a lever. The cat may press against the bar with the side of its body, with its head, or with a paw. Following the learning, the cat’s behavior is different from what it was before the learning. The cat now performs the bar-pressing behavior before leaving the room. None of the motor-response behaviors that are observed, however, are new behaviors. The cat has been observed to rub against and push against various objects before learning the current response strategy.
Therefore, the new learning did not result in substantial learning of new motor responses (although the cat would have to learn what kind of resistance the response had to overcome). Rather, the central focus of the learning was a response strategy about what to do to open the door. Once learned, the strategy could be executed by the agent using extant responses. The distinction between motor responses and response strategies implies that different learning settings may vary greatly in the extent and magnitude of response learning. The task could require the learning of a response strategy only, the learning of a new motor response only, or the learning of a combination of new motor response and response strategy. The more elaborate the response specifications are, the greater the number of components that must be learned and therefore the more difficult the skill is to learn. Clearly, learning what to do requires less learning if it involves responses that are already in the learner’s repertoire. Learning that involves both new motor responses and a new response strategy prevents the learner from achieving the goal until both the motor and strategy components have been learned. If the cat were required to balance something on its nose to escape from the room, the cat would have to learn how to balance things on its nose (a motor response that has never been observed) and learn the rule about which stimulus conditions signal that it is time to execute the response. The cat would not be able to escape from the room until it had learned both how and when to balance the object. An implication of more elaborate response learning (or any learning that requires a combination of components) is that it is far less likely that the learner would learn both criteria unless there had been some form of systematic teaching that provided prompting or staging of the two components.

Components of Responses. The analysis of motor responses and response strategies should not be extended to the level of component behaviors. For instance, it may be possible to analyze a person’s behavior and conclude that the person has all the components needed to juggle three balls and should be able to do it simply by learning the strategy. This conclusion is flawed by a poor analysis. The timing imposed by the behavior of the balls places unique motor requirements on the new task. The task requires new response learning even if all components have been observed. Although careful analyses of response components lead to potentially accurate conclusions about the degree and type of new learning that has to occur, a more pragmatic solution is available. If you have never seen that response produced in the same way on some occasion or observed variations that imply the response, the learner must engage in some degree of new response learning.
The conclusion that the learner would be able to perform on a variation never observed before is determined analytically by referring to the range of examples that have been observed. For instance, a person is going to juggle egg-sized rocks—a feat that this person has never done before. However, the person has been observed juggling marbles, balls, oranges, grapefruits, and juggling bags. This range of variation implies that the person has generalized juggling skills and therefore should be able to juggle the rocks, although that behavior has never been observed. The implication is articulated by arranging the objects that had been observed on a continuum of variation—possibly with marbles at one end, bags at the other, and balls and fruit in the middle. It is not a perfect continuum because it involves objects that vary in more than one dimension (hardness, texture, size, shape, density). However, if the egg-sized rocks would fit
within this continuum (and not outside the range of variation), the learner should be able to juggle them. The rocks could be placed at several places on the continuum, but possibly the best place would be between marble and ball. It might be argued that the rocks are not perfectly round and therefore are not a perfect fit between marble and ball. The observation is correct; however, the juggling of bags had been observed. Bags are not perfectly round. Therefore, the performance with different shapes implies that the learner should be able to process the egg-sized rocks.

If the learner were required to juggle something outside the range of variation, the amount of new learning is implied by the difference between the features of the objects that have been juggled and the new object. For instance, if the learner were to juggle large plates, some new learning is clearly implied because plates do not fit within the continuum of variation described by the observed examples. Plates are flat and larger than grapefruits. These two features suggest that although some of the juggling-skill components (timing, balance) would transfer (because they are common to all instances), the unique features of the plates imply a unique way of both catching and tossing the plate (so that the plate remains vertical in the plane the learner is facing). The net prediction is that this learner should be able to learn to juggle plates in far fewer trials than a person of matched skills who has never juggled anything. The prediction of savings is based solely on the learner’s known performance and the features shared by the set of examples that have been learned and the new example. The difference between the total skill requirements imposed by the task of juggling plates and the requirements learned by the learner describes what the learner must learn. The learner who has mastered some of the requirements has less to learn than the naive learner.

Response-Related Learning.
The distinction between learning a response strategy and learning a motor response is suggested by the functional anatomy of the system as described in chapter 3. The motor responses are resources that are in the agent’s response repertoire. In contrast, the response strategies are plans based on content maps. In creating a plan consistent with the content map, the agent specifies responses that are in its repertoire. Learning is greatly simplified if the learner has a repertoire of responses that may be deployed to serve any content map. As a result, the simplest form of basic response learning would be an extension to the hardwired system such that the response repertoire, discriminative stimulus, and primary reinforcer are hardwired, and the system learns only the response strategy. A naive learner with this hardwired underpinning may be presented with an egg and recognizes it as food, but has no hardwired response strategy other than approaching the food and trying to eat it. Therefore, the task is
to formulate a strategy for cracking the egg to make the food accessible. Given that this pursuit involves behaviors already in the learner’s repertoire, the task requires only the learning of a strategy.

Response Learning Qualification. In a strict sense, it is not possible to learn a new response strategy without learning something about a subclass of discriminative stimuli (antecedent events). For instance, if the learner learns a strategy for cracking an eggshell to obtain food, the learner learns both the strategy and the new discriminative stimulus: the egg. The strategy that results in a cracked shell does not apply to all examples of food, merely those specific instances that have a particular shape, size, surface feature, and odor, that is, the egg. In a strict sense, therefore, any response strategy that is learned always involves the learning of those stimulus details (antecedents) that predict mobilization of a specific response strategy.

Reinforcement. An unstated part of both response learning and discrimination learning is the role of reinforcement. The reason for learning the relationship of discriminative stimulus to response strategy is to secure positive reinforcement or escape from negative reinforcement. The learning may be influenced not only by the presence of a primary reinforcer, but possibly by its strength and frequency of its occurrence. Its strength influences the persistence of attempts to secure it. Its frequency affects the rule or relationship that the successful learner learns about securing it.

Learning Both Discriminations and Responses

The ultimate type of basic learning that is governed by a primary reinforcer would not specify either the discriminative stimulus or response strategy. This learning would involve a default content map of the form: “In the presence of ____, do behavior ____ to secure reinforcement R.” It would require the learning of both a behavior and the discriminative stimulus.
A characteristic of this type of learning is that the behavior to be produced is arbitrary and therefore cannot be supported by a default content map. We could design 100 new rules to secure food, and each rule could require a different behavior. If the learner has a default content map associated with food (approach it), only those new rules that required approach behaviors would be consistent with the default content map. Those new rules that require moving away from the food or performing an action that does not approach the food are not supported by the default content map. To learn any of these rules, the learner must actually replace or override the default content map.
Let’s say the learner would have to do something like press a lever to secure food, but do this behavior only when a light over the food dispenser is illuminated. The lever is farther from the food than the light is. Hence, the learner is required not to approach the food dispenser, but to move in another direction. The completed map would be: “In the presence of the light, approach and press lever to secure food at the dispenser.” The learner must learn both that the light (not the food) signals the response strategy and that the lever press (not approaching the food) is the response that will make the food accessible. The only type of default content map that could support this type of learning would be one that could be overridden, unlike other hardwired maps that are unyielding with respect to content. The original relationship the learner formulates may be overridden by later information the system receives. In fact, during learning, several trial content maps may have to be rejected before the system learns the one that predicts well. In the same way, the default content map that supports learning a new discriminative stimulus and a new response strategy may be designed like a trial content map. For instance, for the initial and most direct learning, it would prompt the learner to approach food. For later learning, the map could be modified if the learner received evidence that contradicted the default content map. The learner could learn arbitrary relationships that did not approach food, but that led to food.

LEARNING AND TRAINING

Learning that is consistent with the default content map is necessarily easier to learn than behaviors that are inconsistent with the map. The reason is that there are fewer variables. Therefore, no staging is necessary.
If the learner is to learn only a new type of response for approaching or performing a behavior that clearly reduces the physical distance between learner and primary reinforcer, the learned content is simply a subclass of the responses specified by the default content map. The content map is “approach food,” and the learned behavior is “Use type B locomotive pattern to approach food.” If the learner is required to learn a relationship that is not a subtype of the original relationship (approach food), some type of training or staging may be required to provide the learner with information it needs to learn the relationship. The information the learner receives is a primary variable in whether the learning will be successful. The learner needs a reason for performing. The basic reason is that some behavior will lead to a primary reinforcer. If the relationship between behavior and primary reinforcer is not clearly established, however, the probability of learning may be greatly reduced.
The trainer who wants to establish the dolphin jumping in response to a hand wave does not start with the hand wave and wait for the dolphin to learn the relationship that the hand wave is to be followed by a jump, which is followed by approaching the trainer and receiving food. The probability of the dolphin learning that relationship within any reasonable time frame is extremely low. However, if the trainer implemented a program of instruction, the probability of teaching the relationship would be very high. Initially, the trainer would show the food to the dolphin so that a default content map would be activated. In other words, the dolphin would be motivated by the food to approach the food. The trainer could hold the fish high above the water. If the dolphin produced a leap of a particular form, the trainer would throw the fish to the dolphin. This phase of the training conveys the general relationship that, in the presence of food and trainer, leaping a particular way leads to food. From the standpoint of the default content map, the dolphin learns that it approaches food in the presence of the trainer by leaping a particular way. The next phases of the training would simply make the steps for approaching the food more remote, both in time and space. For instance, the trainer may stand far from the edge of the pool and wave the fish. The appropriate leap would lead to the trainer throwing the fish into the pool. After this relationship is established, the trainer could produce the same hand signal, but without holding the fish. The trainer would also change the schedule of reinforcement so the dolphin may be required to do a series of jumps, and possibly other behaviors that the trainer signals, before receiving the fish reward. The details of this training may vary in a number of ways and still be successful. 
The basic requirement is that the communication of information is staged so the dolphin learns how the behavior is related to the relevant default content map. Once a direct relationship is established, more indirect procedures may be introduced to extend the distance between the default content map and the map that is learned for securing food from the trainer. If the problem of training is seen as one of communicating information about a relationship, the training will be far different than if it is viewed as some sort of response pairing game. Many traditional experiments draw perfectly erroneous conclusions about what and how organisms learn because the experiments do not consider the task of inducing learning as being a process of communicating. For instance, the timing of the presentation of the reinforcer after the response affects performance. The reason is that the goal is to teach the learner that the response leads to the reinforcer. This message is most articulately communicated when the reinforcer follows the completion of the response. This timing produces relatively fewer intervening events between response completion and reinforcement.
As the delay increases, intervening events occur. Each or any combination of these could be postulated by the learner as a cause of the reinforcer being presented. For instance, intervening events A to D happen between the production of the response and presentation of the reinforcer. Given that the system will simply enhance these features and use them as a basis for a trial content map, the system must now analyze A to D and the actual basis for the reinforcement. A training program effectively eliminates the competing features A to D by timing the reinforcement so there is probably only one variable that occurred before the presentation of the reinforcement. For instance, the dolphin trying to learn the relationship performs the leap, returns to the surface following the leap, vocalizes, swims in a figure-8 pattern, and then receives the food reinforcement. The dolphin has no foreknowledge of which behavior (if any) caused the fish to appear. The only assumption the system has for determining a possible relationship is that some things occurred before the primary reinforcer occurred or some things were present at the time the primary reinforcer occurred. Possibly the figure-8 swimming caused the fish to appear; possibly the vocalization caused it. For the learner to learn the relationship between fish and behavior, the learner may have to rule out a significant number of possibilities. In contrast, if the fish immediately follows the behavior, there are far fewer possibilities to rule out. Therefore, the learning is faster and less likely to support superstitious correlations. Superstitious behaviors can become part of the response because they are difficult to rule out. For instance, the dolphin might conclude from a long-latency condition that it had to leap, come up, and vocalize for the fish to appear. 
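The effect of delay on the number of possibilities the learner must rule out can be illustrated with a small sketch. The event names, the function names, and the assumption that any non-empty combination of intervening events may be postulated as a cause are illustrative only.

```python
def candidate_causes(events, reinforcer_index, response="leap"):
    """List every event from the response up to (not including) the reinforcer.

    Each listed event is something the learner could postulate as a cause.
    """
    start = events.index(response)
    return events[start:reinforcer_index]


def hypothesis_count(candidates):
    """Count the non-empty combinations of candidate events: 2**n - 1."""
    return 2 ** len(candidates) - 1


# Hypothetical event sequences; "fish" marks delivery of the reinforcer.
delayed = ["leap", "surface", "vocalize", "figure_8", "fish"]
immediate = ["leap", "fish"]

delayed_candidates = candidate_causes(delayed, delayed.index("fish"))
immediate_candidates = candidate_causes(immediate, immediate.index("fish"))
```

With the delayed sequence there are four candidate events and 15 possible combinations to consider; with immediate reinforcement there is one candidate and one hypothesis.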
Although vocalization is not a necessary part of the equation, it is not contradicted because the fish will be presented later regardless of whether the learner vocalizes. There are fewer opportunities for superstitious learning if the reinforcement is presented immediately following the response.

Limits of New Learning Potential

A final implication of the relationship between a default content map and learning requirements is that some organisms may be capable of extensive learning, but only within the limits established by their hardwired default content maps. Although a bee may be capable of learning sophisticated discriminations and response strategies involved in locating sources of nectar, it may be quite impossible to teach bees not to follow their queen when she leaves the hive or to build a five-sided cell rather than a six-sided cell. If observations disclose that it is not possible to train the organism to perform behaviors that clearly contradict hardwired behavioral patterns, the conclusion is that the content maps that support these behaviors are not designed to accommodate learning.
The organism may be facile in learning in some areas and incapable in learning in others. This arrangement may occur because different content maps are independent of each other. If the learner is to learn the features of the flowers currently abundant in nectar, the learner must have a default content map that promotes learning the correlation of specific features of the flowers and nectar.

In summary, there are three general categories of default content maps, two of which promote learning:

1. Hardwired maps that are not modifiable in any way through learning;
2. Default maps that admit only to modifications that are consistent with the general provisions of the map (three types of direct and one type of indirect). These default content maps support only what is sometimes referred to as natural learning. The organism follows the urgings of its default content maps and learns specific relationships that are framed by the natural tendencies of the organism. A rat running a T-maze would be an example of this type of learning.
3. Default maps that may be overridden by learning. These function as trial content maps. They support content that is consistent with the map provisions and modifications that are not necessarily consistent with these provisions. These maps accommodate the learning of relationships that have no necessary counterpart in nature, and therefore may be completely unanticipated by any specific content map. The dolphin that leaps out of the water to obtain food is not approaching the food. The food follows at a place that may be far removed from the leap. The leap led to the food only in an abstract way as an instrumental behavior produced in response to the trainer’s hand signal.
LOGIC AND LEARNING

All basic learning, whether performed by a human or a bee, involves content. Furthermore, all content is identified only through the application of logical operations. Therefore, the organism that learns represents content and performs the logical operations. The learner learns something specific and qualitative: a relationship that expresses what amounts to a fact about the content at Time 1 and the content at Time 2. Just as the application of content to a specific setting requires logic, the learning of content requires logic. The basic process of learning involves comparisons that identify what is the same about two situations and what is different. The sameness is expressed in terms of specific features, not global or unspecified similarity. If
the learner responds only to objects that share specific features, the system must have represented these features and their relationship to a later event. Consider an organism that responds to a spherical object in the same way it responds to an egg-shaped object. It does not respond in the same manner to any form that has corners, such as box forms or any object that has negative inflection points (dents) in the surface. If a spherical object and an egg-shaped object are responded to in the same way, the only possible basis would be features shared by both objects, but not shared by any negative examples. A careful articulation of the common features of the objects would absolutely include the feature or set of features that the learner uses to treat the objects in the same way. The learner’s rule might not include all the features on the list, but any effective rule would include features that are on the list. There are no other nonmagical options. The only form of contact the learner has with the objects is through sensory receptions. The only way to identify a positive example is on the basis of features that are always observed in the object. If more than one object is treated in the same way, the only basis is something they have in common—the specific features they share but that are not shared by the negative examples. Therefore, the system that identifies the positives as being the same must be able to identify the features of each, determine which are shared by both, and construct a content map that refers to these common features. To perform the various operations needed to determine the possible samenesses, the system must have hardwired formats for performing logical operations. These operations consist of a predetermined set of analytical steps that are performed on all objects with highlighted features to determine their common features and whether they predict the same reinforcing consequence.
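The sameness analysis described above can be sketched with set operations. The feature labels for the sphere, egg, box, and dented object are hypothetical, and the two functions are only one possible reading of "features shared by the positives that exclude the negatives."

```python
def common_features(positives):
    """Features present in every positive example (the articulated list)."""
    return set.intersection(*positives)


def excludes_all(rule, negatives):
    """True if every negative example lacks at least one feature in the rule."""
    return all(not rule <= negative for negative in negatives)


# Hypothetical feature descriptions of the objects mentioned in the text.
sphere = {"solid", "smooth_surface", "no_corners", "no_dents", "uniform_radius"}
egg = {"solid", "smooth_surface", "no_corners", "no_dents", "elongated"}
box = {"solid", "smooth_surface", "corners", "no_dents"}
dented_object = {"solid", "smooth_surface", "no_corners", "dents"}

positives = [sphere, egg]
negatives = [box, dented_object]

shared = common_features(positives)

# A rule need not use every shared feature, but it must draw on the shared list.
full_rule = shared
smaller_rule = {"no_corners", "no_dents"}
weak_rule = {"solid"}
```

Here both the full list of shared features and the smaller rule separate the positives from the negatives, while a rule built only on "solid" does not, because the negatives are solid as well.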
SUMMARY

Basic learning is directly related to primary reinforcers. For basic learning, the system must identify the relationship between the predictor and primary reinforcer, S1 → S2. This relationship is independent of any responses (except that S1 may be a representation of a response that leads to S2). The primacy of this relationship is demonstrated by the ability of the organism to produce various possible responses to achieve a primary reinforcer. For any observable demonstration of operant learning, a three-part relationship is required: S1 → R → S2. A system that learns requires the basic hardwired provisions of the nonlearning system. It must have the same functional anatomy articulated in chapter 4, which entails agent and infrasystem functions. It must perform the
same kind of planning, projecting, and adjusting of responses based on sensory input. It must be capable of replanning in a way that is sensitive to the unique details of the current setting. It must have the same motivational provisions in the form of sensitized stimuli that command attention and an infrasystem that produces a sequence of hardwired operations to process incoming sensory receptions. The only necessary difference is that the system that accommodates learning has specific additional provisions for modifying the content maps and possibly modifying the response repertoire on the basis of encounters with specific sensory conditions. The content that is learned may be viewed as a simple extension of what the nonlearning system must do to formulate and execute a successful plan in the current setting. The only difference between the content maps that the system uses to achieve basic learning and the content maps that only accommodate unlearned performance is that the content maps for learning are incomplete. There are different degrees of incompleteness that describe different possible types of default content maps: maps in which only S1 is missing, maps in which only S2 is missing, and maps in which both S1 and S2 are missing. The more permanent parts the map has, the less that may be learned and the simpler the learning. Learning that requires only new stimulus features for a hardwired pursuit is much simpler than the learning that requires both information about the stimulus features that predict the primary reinforcer and the instrumental response that leads to the primary reinforcer. The system may be designed so that learning is possible only in specific performance areas, with the learning format being only a slight modification of a completely hardwired content map. As the learning capabilities of a species increase, the content map may have fewer hardwired provisions and rely more on learning.
If the organism is required to learn all the content involved in a relationship of the form S1 → R → S2, the successful learner would have a system that permits substantial replacement of hardwired functions with procedures for learning the features of S1, R, and S2 and representing how they are related.
Chapter 6
Basic Antecedent Learning
Table 6.1 provides a summary of basic learning types discussed in chapter 5: imprinting, antecedent learning, response-strategy learning, and antecedent-plus-response-strategy learning. Motor-response learning is not included in this classification because it involves learning details on the infrasystem level that simply increase the organism’s response repertoire. Therefore, all the categories presented in Column 1 refer to content map issues. This chapter discusses the basic properties of imprinting and antecedent learning. Chapter 7 considers basic properties of response-strategy learning.

Column 2 of the table indicates the status of each content map before learning. The part of a content map in Column 2 that is not specified is the part that requires learning. For example, if the discriminative stimulus is specified but the response strategy isn’t, the content map provides the organism with information about what to respond to (the discriminative stimulus), but does not provide a format for responding, only a general urging to respond to achieve reinforcement. When learning is achieved, the result is a completed content map that provides for a specific discriminative stimulus and a specific behavioral strategy that is keyed to the presence of the discriminative stimulus. The assumption for all the types of learning listed is that the system has a response repertoire (either learned or hardwired) that is adequate for the specified learning.

TABLE 6.1 Types of Basic Learning

Categories | Content Map Before Learning | Purpose of Learning
Imprinting (quasilearning) | Behavior complete. Discriminative stimulus not specified. | Assigning content map to specific individuals
Antecedent learning | Response strategy specified. Discriminative stimulus not specified. | Learning a specific discriminative stimulus for hardwired response strategy
Response-strategy learning 1 | Discriminative stimulus specified. Response strategy not specified. | Learning a unique strategy for a given hardwired discriminative stimulus
Response-strategy learning 2 | Discriminative stimulus not completely specified. Response strategy not specified. | Learning a unique strategy that is facilitated by hardwired stimulus provisions
Antecedent + response-strategy learning | Discriminative stimulus and response strategy not specified. | Learning a rule that relates a new discriminative stimulus, a behavior, and a consequence

Table 6.1 presents the different categories of learning in an all-or-none manner, as if learning could involve only the discriminative stimulus or only the response strategy. However, any learning influences both the discriminative stimulus and response strategy. If the content map calls for the organism to learn only the antecedent stimulus, a necessary result is that a modified response strategy is learned. For instance, the organism learns that feature M is a discriminative stimulus. The organism would necessarily respond differently toward M after learning had occurred. Stated differently, if the organism does not respond in a new way after learning the function of a discriminative stimulus, we would have no basis for judging what, if anything, the learner actually learned. Only if the learner produces a new pattern of responses that are correlated with the discriminative stimulus would we be able to infer that the learner learned something about producing differential responses. Likewise, if the organism learns a generically new response strategy, a modified discriminative stimulus is implied. The organism does not produce the response randomly, but only in the presence of particular sensory cues. Therefore, the learner must have learned something about the antecedent conditions that signal the opportunity to produce the new response.

Despite the necessary interaction of discriminative stimulus learning and response-strategy learning, we treat each separately. The reason is that the essential learning has to do with either the antecedent events or response strategy produced. If the organism knows that the behavior is collecting leaves, but must learn where the leaves are, the learning is primarily discrimination learning, not response learning. If the learner is attempting to open a lock that requires a unique set of behaviors, the discriminative stimulus and general goal of the response are given. What the learner must learn is the response strategy that opens the lock.

IMPRINTING (QUASILEARNING)
In Table 6.1, imprinting was classified as quasilearning because it is generically different from other formats of learning. The process of imprinting leads to the completion of a content map through a single experience or exposure. Imprinting results in an individualized hardwired program. No aspect of the content is really learned. It is simply individualized. There are two main types of imprinting: sensory imprinting and response imprinting. Sensory imprinting fixes an elaborate content map to an individual. The following behaviors of ducks or geese (chicks following the mom and siblings), the navigation behavior of birds (birds learning the map properties of migration routes or homing), the egg-laying behavior of salmon (returning to the estuary where they were hatched), and many other imprinted sensory behaviors involve a hardwired content map that is simply referenced to a particular individual, group, or place. Response imprinting fixes an elaborate new-response program to an individual. A newborn mountain goat learns an incredible amount about balance and locomotion in a matter of minutes. The only possible design would be some version of incorporating information about the individual (body weight, center of gravity, balance effects when moving body parts, etc.) into a sophisticated motor-response system.

Sensory Imprinting
A completely hardwired system that relied on no experience could, theoretically, be designed so that the content map refers to one individual. If the hardwired program were keyed to one individual, however, all organisms would respond to that individual. Bees would not respond differentially to one queen versus another. All bees in existence would respond to one queen. All goslings would respond to one goose.
A hardwired system could also be designed so the content map refers to a class of individuals. In this case, all organisms would respond to any member of the class. All bees would respond to any member of the queen class in a manner different from their responses to other classes of bees. All goslings would respond to any goose as if it were their mother. A completely hardwired system cannot be designed so that different individuals respond differentially to different members of the class. The solution to these problems is some form of quasilearning, which provides all organisms with the same basic program, but references it to a specific stimulus (a unique individual, place, object, or operation).
This sensory imprinting of behavior may be achieved through two formats: (1) The imprinting may be keyed to a single sensory feature that is unique for the individual, or (2) the format may call for identifying a sum of sensory features that is unique for the individual. An example of Format 1 would be keying on a particular type of olfaction that varies across individuals. Like fingerprints, no two organisms provide exactly the same olfactory cues. If the format for imprinting is the olfactory feature, the imprinted organism identifies the discriminative stimulus only by the olfactory feature. It may be possible for the learner to later learn about other distinguishing features of the discriminative stimulus, but initially the only basis for identifying the discriminative stimulus is a unique olfactory feature. For instance, a particular queen bee is discriminated by her unique olfactory features. The particular queen is not confused with any other queen. Similarly, if imprinting were keyed to a specific visual feature, the only basis for recognition would be that unique visual pattern. If all members of a species have stripes, but no two have the same stripe pattern, the visual identification of an individual to be imprinted could occur on the basis of the unique stripe pattern. For Format 2, the system imprints various features. Olfactory and other sensory features may be part of the imprinting package. Also within a particular sensory domain, more than one feature may be imprinted. For instance, the learner may be imprinted with various visual features that provide a three-dimensional picture of the discriminative stimulus. For this format, the sum of the component features identifies an individual. If the target of imprinting is positioned in a manner that does not make one of the visual features visible, another visual feature or possibly an auditory or olfactory cue is present. 
The hardwired program that is imprinted may be quite simple, although it exercises great control over the imprinted individual. For instance, the imprinting of the following behavior of chicks is actually a program of affiliation. Before imprinting, the content map could be something like, “Keep scent at a high level to receive reinforcement R.” Reinforcement R is an enhanced positive feeling or a lack of anxiety. A specific mom scent completes the content map. Once completed, the map functions the same way as a completely hardwired content map. If the scent is weak, the agent experiences strong negative feelings and is compelled to strengthen the scent through a response plan. As the plan is being executed, the infrasystem provides the agent with feedback in the form of sensations and changes in sensation. The agent uses this information to plan and produce behaviors that ideally result in more positive feelings (which occur when the chick is closer to mom). The plan is terminated when this positive outcome is achieved. The program is relatively simple, yet it has great influence over how the chick responds in a great variety of situations.
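The completed affiliation map described above behaves like a simple feedback loop: a weak scent triggers a response plan, feedback guides the adjustments, and the plan ends once the scent is strong enough. A minimal Python sketch of that loop follows. The scent function, the threshold, and the step size are illustrative assumptions, not values given in the text.

```python
def scent_strength(distance):
    """Hypothetical scent model: strength falls off with distance from mom."""
    return 1.0 / (1.0 + distance)


def follow_scent(distance, threshold=0.8, step=1.0, max_steps=100):
    """Approach the scent source until the scent reaches the threshold.

    Returns the final distance and the number of adjustment steps taken.
    The threshold stands in for the level at which negative feelings stop.
    """
    steps = 0
    while scent_strength(distance) < threshold and steps < max_steps:
        # Feedback says the scent is still weak, so the plan moves closer.
        distance = max(0.0, distance - step)
        steps += 1
    return distance, steps


final_distance, steps_taken = follow_scent(5.0)
print(final_distance, steps_taken)
```

The same loop would serve any imprinted target: only the sensory value that completes the map changes, which is the sense in which the content is individualized rather than learned.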
Note that for visual and auditory imprinting, the system must create something of a map that expresses the unique features that are imprinted. For instance, if the stripe pattern of mom is represented by a scheme that provides for identification in a variety of settings, the pattern must be expressed as something of a three-dimensional model. Without this provision, the imprinted organism would not be able to recognize mom unless she was encountered in a particular orientation. If the imprinted visual pattern were represented simply as a left-facing view of mom’s profile, the organism could not recognize a right-facing view of mom unless the pattern were expressed in a way that is relative to mom’s orientation or body parts. Also for auditory patterns, such as mom’s call, the discrimination must be based on features that are independent of loudness, direction, or background sounds. If the pattern were simply represented as being a sound having a particular rate, sound quality, or loudness, it would be too general to distinguish mom from other calls that have the same set of features. The discriminative basis must be some feature or combination of features unique to the individual, possibly a particular pattern of overtones.

Motor Imprinting
The newborn mountain goat learns an incredible amount about balance and locomotion very quickly. The newborn’s initial attempts to stand and walk provide information about the level of effort needed for the individual to achieve basic movements, the organism’s strength and relation of strength to specific movements, and the geometry of the individual. Apparently, once this information is received, an individualized program is imprinted. The program provides the organism with extensive knowledge about coordination of body parts and the relationship of visual transformations of objects with the concomitant transformation in motor responses.
The goat is now able to navigate down slopes that are incredibly precipitous and produce feats for which no learning opportunities have been provided. Goats are not the only organisms that have motor-response programs that are imprinted rather than learned. Generally, those organisms that must be mobile immediately after birth have such a program so that what would normally require extensive experience is achieved quickly through a hardwired individualized package referenced to the relevant details of the individual. Motor imprinting, like feature-based sensory imprinting, is inferred from behavior. The goat’s initial attempts to stand and walk provide no evidence that the agent has a thorough knowledge of its response repertoire. The goat wobbles and moves uncertainly. A few moments later, it is able to produce behaviors that could not have been learned because they have no experiential base. Therefore, the system must have used the initial information about its unique characteristics to extrapolate and install a complex program that is now individualized.
ANTECEDENT-ONLY LEARNING
In Table 6.1, antecedent learning was classified as true learning (along with response-strategy learning and combinations of the two types). These are generically different from imprinting in that the learning is not a function of a single experience. It is a function of repeated exposures to an antecedent (discriminative stimulus) and a consequence (reinforcing variables). The learner learns a new content map that is limited to and completely consistent with the facts of the experiences. This is not to say that the learning is always correct, merely that it is always a function of specific sensory experiences and is perfectly consistent with these experiences. If the learner learns that a particular odor leads to the primary reinforcer, there was a basis for this learning. At some time during the learning process, the learner was presented with examples in which the odor occurred in a way that made it a possible predictor of the primary reinforcer. Learning, whether it is response-strategy learning or the learning of antecedent events, is always based on specific features observed in events and on the common relationship between the predictor and what it predicts. Learning always involves content of some sort, and the learning process is based on hardwired logical operations. For the learning of antecedent stimuli only, the organism learns to apply a hardwired response strategy to a new antecedent stimulus. For some pursuits, a default content map may be designed so that if no learning occurred, the learner would not be perfectly disabled. This fail-safe feature would ensure, for instance, that the bee that failed to learn about a new antecedent stimulus would not be preempted from being productive.

Default Design
The default design for antecedent learning calls for a content map that provides both the behavior and a general class of discriminative stimuli.
Through learning, this general class of discriminative stimulus would be replaced with a more specific stimulus. For example, the bee would be provided with a default map of the form, “Fly to flowers to secure nectar.” No subclass of flower is specified. Therefore, the bee would fly to flowers of various features. This default content map has two functions: (a) to ensure that the organism is at least minimally productive, and (b) to provide the general discriminative stimulus (flowers) that will generate the opportunity for learning a particular subtype (short red flowers). In the bee example, the default content map ensures that (a) the bee will be productive even if it does not learn a relationship (such as that flowers of a particular petal pattern are productive), and (b) the bee will be in a position to receive information needed for new learning to occur. If the default content map were too general, such as, “Fly to ___ for ___,” it would not be specific enough to ensure that the learner would be in a sensory context of flowers. The bee could fly to trees or fly in circles. If the map is too specific, it restricts the learning too much. For instance, “Fly to ‘large’ flowers for nectar” not only places a needless restriction on the pursuit, but it limits the possible learning about smaller flowers that have abundant nectar.

Correlation of Features
For antecedent learning, the system must formulate a correlation between two sets of properties or features. For the learning of the rule that red flowers have abundant nectar, the correlation involves a color feature and a nectar feature. The color feature is observed at a distance, the nectar feature isn’t. If the color feature covaries with the nectar feature, the color features can be used from a distance to predict the nectar feature. Presence of the color feature at Time 1 implies ample nectar at Time 2; absence of the color feature at Time 1 implies no ample nectar. No simpler form of relationship is possible. To say that the bee learns to associate the color with nectar is either to beg the question or express the relationship in a superficial way. The system must identify exactly what it is that predicts something else. For this prediction to occur, the system must be able to correlate two features: a specific color and nectar. No lesser learning is logically possible. If the reference to the specific color is removed, the bee has no basis for approaching specific flowers.
If the relationship does not involve nectar, there could have been no possible basis for the learning of red. The nature of this relationship implies that learning cannot occur on a single trial. At least two flowers must be compared to confirm the relationship of nectar and color. If both flowers share only the color feature (and do not share any other features, such as size, shape, etc.) and if both flowers have ample nectar, the color feature is confirmed as a predictor. If the two flowers differ only in the color feature and one flower does not have ample nectar, the color feature of the ample-nectar example is confirmed as the predictor. Investigating just one flower that has ample nectar logically generates a great number of possibilities, each based on a feature or combination of features of the flower—color, location, leaf size, stem characteristics, height, blossom size, and so forth.
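To make concrete why a single positive flower settles nothing, the short Python sketch below counts the candidate predictors that one example leaves open. It assumes, for illustration only, that the flower is described by the six features named above and that any non-empty combination of them could be the predictor.

```python
from itertools import combinations

# Features named in the text; treating them as the complete list is an assumption.
features = [
    "color",
    "location",
    "leaf size",
    "stem characteristics",
    "height",
    "blossom size",
]

# After one nectar-rich flower, every non-empty combination of its
# features remains a possible predictor of nectar.
hypotheses = [
    subset
    for size in range(1, len(features) + 1)
    for subset in combinations(features, size)
]

print(len(hypotheses))  # 63, that is 2**6 - 1
```

A second flower is what allows the system to start discarding these candidates.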
Process of Feature Abstraction
The system may have different procedures for identifying the key feature or set of features that distinguish productive flowers from the others. Any procedure requires the abstraction of specific features and cannot be based on general similarity of the flowers that have nectar. General similarity does not specify which of the multiple features of the object are involved in the decision that the flowers are similar. Learning must be based on knowledge of specific features and specific content relationships. The process the learning system would follow is to (a) identify some feature or set of features observed in a positive example, (b) determine which other examples (positive or negative) share the feature or set of features, (c) represent those features that are common to all positive examples and no negative examples, and (d) discard features that are common only to negative examples.

Sum of Features. An ineffective strategy would be for the system to correlate all the features of the productive flower with the nectar. For this option, the organism would take a kind of picture of the flower to use as a criterion for visually screening other flowers that look the same. Another flower would be considered the same only if it had the size feature, color feature, and other features of the original flower. The strategy is abortive because it is too safe. If the property of ample nectar is related to any single feature or set of features of the original flower, those flowers that are nearly identical to the original will certainly have ample nectar. The problem is that only a small number of flowers would be nearly identical to the original. Possibly only two in the entire field are identical. For instance, if there were only three degrees of variation within a feature (high, medium, low) and only five different features, the chances of two flowers being identical are about 1 in 243.
Physically, they may be far apart and require time to locate. After the organism exploits the members of this small population, the organism must revert to the default strategy. The most serious problem with identifying flowers by all features is that there may be hundreds of others within the field that have ample nectar. However, the format of looking for similar examples has preempted the organism from finding the specific feature or subset of features that truly predicts nectar. Representing Abstract Features. A strategy based on the abstraction of specific features is more productive. This strategy would involve some form of hardwired, systematic elimination process. The system would represent a nectar-rich flower (Flower 1) in a way that preserves all of its component features. The system would then compare the original flower with another
convenient flower (Flower 2) that has one or more of these features (but not necessarily all). This flower either has ample nectar or it doesn’t. If it does, all features that are common to both examples are retained as possible predictors of nectar. Those features not shared by both examples are not included in the criterion. Let’s say that color and stem characteristics are common to two flowers with abundant nectar. Both are red with long stems. Therefore, those features are retained for the classification criterion in a trial content map. The other features are eliminated as possible predictors of nectar. The next flower approached (Flower 3) would have the color, stem characteristic, or both. Let’s say the second flower (Flower 2) is a negative example and does not have ample nectar. All the features that are common to the original flower and Flower 2 are weakened as possible predictors. If three features are common to both the positive and negative flowers (leaf shape, blossom size, and height), these features could not be the reason Flower 1 is positive (ample nectar) and are therefore rejected as possible predictors. If two remaining features are unique to Flower 1 (color and stem characteristic), these are retained. Note that whether the comparison flower is positive or negative (ample nectar or not), the system can still make inferences about predictive features. If both are positive, the common features are retained. If the second flower is negative, the features that are in the positive flower but not the negative flower are retained. Following the comparison of the two flowers, the system has knowledge of a smaller set of features that potentially predict ample nectar. The subsequent pursuits would be based on these features. The process is repeated until only one feature remains or until there is confirmation that both features must be present to predict nectar. 
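The retention rule just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' specification: features are treated as simple name-value pairs, elimination is all-or-none (the text speaks of features being "weakened"), and the function name and flower values are hypothetical.

```python
def update_candidates(candidates, reference, other, other_is_positive):
    """Narrow the candidate predictors after comparing one flower with the reference.

    candidates: feature names still considered possible predictors
    reference: feature values of the known nectar-rich flower (Flower 1)
    other: feature values of the comparison flower
    other_is_positive: True if the comparison flower has ample nectar
    """
    shared = {name for name in candidates if reference[name] == other[name]}
    if other_is_positive:
        # Both flowers have nectar, so only shared features can be predictors.
        return shared
    # The comparison flower lacks nectar, so shared features cannot explain
    # why the reference is positive; keep the features unique to the reference.
    return set(candidates) - shared


flower_1 = {"color": "red", "stem": "long", "leaf shape": "oval",
            "blossom size": "large", "height": "tall"}
positive_flower_2 = {"color": "red", "stem": "long", "leaf shape": "narrow",
                     "blossom size": "small", "height": "short"}
negative_flower_2 = {"color": "yellow", "stem": "short", "leaf shape": "oval",
                     "blossom size": "large", "height": "tall"}

after_positive = update_candidates(set(flower_1), flower_1, positive_flower_2, True)
after_negative = update_candidates(set(flower_1), flower_1, negative_flower_2, False)
print(sorted(after_positive))  # ['color', 'stem']
print(sorted(after_negative))  # ['color', 'stem']
```

Both scenarios from the text end with the same two candidates, color and stem, which is the point of the passage: a comparison is informative whether the second flower is positive or negative.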
If a test flower has one feature in common with the other productive flowers but is not productive, the system would revert to the last combination of features that predicted productive flowers. If all earlier flowers had the same color and stem features and a later negative example had only the same color, the system would conclude that both color and stem feature were needed for examples to be positive. This general strategy is designed to determine the most specific rule that applies to the current population. If the most specific criterion involves one feature, that criterion is the most productive for the organism to learn. If a single-feature criterion is too specific and does not predict, the procedure permits the system to identify the criterion based on a combination of features.

Feature-Elimination Logic. The system that learns about individual features must exercise some form of feature-elimination strategy. The specific logic that the system uses could be determined by a controlled set of examples. The correspondence of performance to the implications of the ideal strategy would suggest the operating logic and representational capacity of the system. If the system were able to learn single features, however, some version of the process outlined here would be necessary. The argument for the ideal strategy is simply that it permits the organism to sample flowers that are close to the original, and it provides for the possible elimination of many features with a single comparison. Furthermore, for a population that had three levels of variation for each of five different features, identifying a single-feature correlation would require no more than four comparisons if none of the examples had exactly the same set of features.

Table 6.2 lists five flowers, each of which is different from any of the others. The first flower encountered is A. Flowers B through E are arranged according to the number of features they share with A (not according to the order in which they are encountered). Flower B has four features that are shared with A. Flower E has only one feature in common with A. For this example, that is the single feature that predicts ample nectar. The goal of the learner’s pursuit is to identify this single feature. The goal is reached by flying to flowers that have one or more of the features of Flower A and performing a comparison. If that flower has nectar, it is shown in the Positives column. If the nectar is not present, the outcome is shown in the Negatives column. The worst case is the sequence that requires the largest number of flower comparisons to determine the single feature that predicts nectar. That order is achieved if the flowers are encountered in the order they are listed: A-B-C-D-E. This sequence eliminates one feature per comparison and requires four comparisons (BA, CA, DA, and EA). The first flower encountered after A is compared with A. It either has ample nectar (Positives column) or not (Negatives column).
TABLE 6.2 Ideal Features Identification Logic

Flower | Positives (number of features that are the same as features of A) | Negatives (number of features not the same as features of A) | Comparison (number of possible features eliminated by comparison with A)
A | 5 | |
B | 4 | 1 | B vs. A rules out 1 feature
C | 3 | 2 | C vs. A rules out 2 features
D | 2 | 3 | D vs. A rules out 3 features
E | 1 | 4 | E vs. A rules out 4 features

If Flower B is the first to be compared, only one feature would be ruled out by the comparison. Flower B shares four features with A. So if B is positive (Column 2), there are four possible features that could still account for the examples being positive, and only one feature is ruled out because it is not present in both nectar-rich flowers. If B is negative (Column 3), the single feature that is not shared with A is the only possible basis for B being negative. This feature is therefore identified as a negative feature (nonpredictive). The other four features are shared with A and therefore may be predictors of nectar. Regardless of whether the outcome is positive or negative, only one feature is ruled out. The procedure is repeated with a comparison of C and A. Flower C shares three features with the original flower. Whether C is positive or negative, one more feature is eliminated, leaving only three possible features that could account for the positive examples. The procedure is repeated for comparisons DA and EA. The comparison of the features shared by A and D reveals that two features are shared by positive examples. E is either positive or negative. In either case, the single feature that is common to all positive examples and no negative examples has been identified by the comparison of E and A. For the worst-case sequence, this single feature is identified through four comparisons.

The various better-case scenarios are also indicated by the last column of the table, which shows the number of features eliminated by a comparison of the flower with A. If the second flower tested is D, not B, the table shows that the comparison of DA would eliminate three features, whether D were positive or negative. Example D would have two features that are shared with A and three that are not. If the next comparison is with E, one more feature is ruled out, and the system has identified the single predictor that signals abundant nectar. Only two comparisons were required: DA and EA. For the best-case pattern, the first comparison would be EA. That single comparison would discredit four of the possible features and identify the single feature that marks flowers with abundant nectar.
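The worst-case, better-case, and best-case sequences can be checked with a small Python sketch. It makes several assumptions for illustration: the five features are given placeholder names f1 through f5, f1 is the predictive feature, the flowers share nested subsets of A's features as in Table 6.2, and every comparison flower is treated as positive.

```python
ALL_FEATURES = {"f1", "f2", "f3", "f4", "f5"}  # placeholder names for A's features

# Features each flower shares with Flower A, following Table 6.2.
SHARED_WITH_A = {
    "B": {"f1", "f2", "f3", "f4"},
    "C": {"f1", "f2", "f3"},
    "D": {"f1", "f2"},
    "E": {"f1"},
}


def comparisons_needed(order):
    """Count comparisons with A until exactly one candidate feature remains."""
    candidates = set(ALL_FEATURES)
    for count, flower in enumerate(order, start=1):
        # Each flower is assumed positive, so only shared features survive.
        candidates &= SHARED_WITH_A[flower]
        if len(candidates) == 1:
            return count
    return None  # this sequence did not isolate a single feature


worst_case = comparisons_needed(["B", "C", "D", "E"])
better_case = comparisons_needed(["D", "E"])
best_case = comparisons_needed(["E"])
print(worst_case, better_case, best_case)  # 4 2 1
```

The counts match the text: four comparisons for the order B, C, D, E, two for D then E, and one when E is compared first.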
As noted previously, the extent to which organisms actually apply an ideal sorting strategy is an empirical question. The actual learning tendencies identified through a controlled presentation of both positive and negative examples would provide information about both the overall strategy and the extent to which information is recorded by the system following a single positive example. It may be that the organism tends to follow the pattern shown previously but requires substantially more than four trials to learn the rule. In this case, the system is not initially precise enough to identify the five features on a single trial or the memory is not initially strong enough to retain the information. Another possibility is that the system is not designed to eliminate possibilities too quickly. As discussed in later applications, a too-hasty dismissal of a possible feature retards learning in some situations. In summary, some version of a feature-identification strategy is required. The prior plan provides for the most efficient exploitations of probabilities.
After finding a productive flower or various productive flowers, the bee simply goes to a convenient flower that has one or more features in common with the first and systematically eliminates features that do not predict the nectar.

Learning Model for Tightly Specified Pursuits
The simplest model of how the system would go about identifying the predictive features of discriminative stimuli shares many aspects with the model needed by a completely hardwired system to provide feedback for a response-production plan. Chapter 3 presented a model of the functions required for planning and adjusting specific features of a response. The system has a checklist for the information that must be provided before the response is initiated. The system gathers the information and draws conclusions about how to formulate the plan and modify it on the basis of feedback. The same sort of checklist used for planning responses could be adapted to the default content map. The checklist option is applicable only for those types of learning in which the same set of variables applies to all applications of the trial content map. It is possible for the system to make a checklist of features for the pursuit of gathering nectar from flowers because all instances of this pursuit will involve flowers, and all flowers may be described by reference to the same set of general features: color, location, size, leaf color, and so forth. Because the range of variables is limited, the system could be designed to present a checklist of possibilities. The system would reflexively present the checklist when the bee encounters a flower with ample nectar. The system would record the values for the preset features, make calculations, and provide the agent with a trial content map (modified default map) based on the new information. This format would permit learning of antecedent or predictive stimuli within any particular hardwired context.
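One way to picture the checklist option is as a fixed record that the system fills in whenever a nectar-rich flower is encountered. The Python sketch below is an illustration under assumptions: the class name, function names, and example values are hypothetical, and the checklist holds only the four features named in the text.

```python
from dataclasses import dataclass, field

# Preset features named in the text; a real checklist would be longer.
CHECKLIST = ("color", "location", "size", "leaf_color")


@dataclass
class ContentMap:
    behavior: str
    reinforcer: str
    predictor_features: dict = field(default_factory=dict)


def record_checklist(observation):
    """Record a value for every preset feature; unobserved features stay None."""
    return {name: observation.get(name) for name in CHECKLIST}


def build_trial_map(default_map, observation):
    """Return a trial content map: the default map plus recorded feature values."""
    return ContentMap(
        behavior=default_map.behavior,
        reinforcer=default_map.reinforcer,
        predictor_features=record_checklist(observation),
    )


default_map = ContentMap(behavior="fly to flowers", reinforcer="nectar")
trial_map = build_trial_map(
    default_map, {"color": "red", "size": "short", "scent": "sweet"}
)
print(trial_map.predictor_features)
```

Note that the "scent" value is ignored because it is not on the checklist. That mirrors the restriction in the text: the system can learn only within the set of variables that the hardwired checklist already provides.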
Therefore, it would be quite possible for the organisms of a particular species to have a remarkable capacity to learn within a specific area. Table 6.3 shows a possible feature-classification format for identifying flowers with ample nectar. The specific features listed are arbitrary and not assumed to be exhaustive. Additional categories of features could be added, such as location, scent, and overall plant features. The main divisions shown in the table are for the blossom, leaves, and stem. Each division presents the same set of variables—color, size, shape, markings/texture, and arrangement. Each row shows a feature that is classified according to a continuum of choices. The color of the individual flower would be rated as a particular point or combination of points on the continuum. Likewise, the size would be noted on a continuum that implied the relationship of the observed size to other possible things. For the remaining items—shape, markings, and arrangement—the values would occur as continua that show
6. BASIC ANTECEDENT LEARNING
TABLE 6.3
Model of Hardwired Classification of Features

Category    Feature              Value
Blossom     Color
            Size
            Shape
            Markings/texture
            Arrangement
Leaves      Color
            Size
            Shape
            Markings/texture
            Arrangement
Stem        Color
            Size
            Shape
            Markings/texture
            Arrangement
different models. For instance, the shape of the leaves could be shown with a continuum of different shapes, the markings with various pattern choices, and the arrangement by variation in pattern variables. This system would not learn particular features. For instance, the system would not have to discover that things differed in color or size. The system has preestablished criteria for identifying the features (procedures for grading the individual features observed in a particular flower). The system would simply enhance the categories so the agent would attend to each variable and record its value as a degree or range. Regardless of how heavily the learning is prompted by the system, the system makes a representation of individual features, represents each in a way that it can be used to identify that feature in various contexts, uses this information as a basis for determining which feature predicts, and enhances or emphasizes the presence of the feature that is identified as a predictor so that it becomes a discriminative stimulus. Once the features have been identified, the system formulates response plans based on the trial predictors. The system projects the outcome that is consistent with the predictor and uses later data to determine whether the predictor actually predicted. These and other details of the comparison-and-enhancement operation that are required for antecedent learning may be represented as an extension of the completely hardwired performance system. Figure 6.1 shows this extension. The process involves three stages. Stage 1 occurs when the learner has encountered a flower with ample nectar. The
FIG. 6.1. Three-stage extension of hardwired system.
system installs a trial content map. This content map is basically a representation of the various features of the flower with ample nectar. The features are enhanced so that the bee will use them as predictors, either singly or in combination. Stage 2 occurs after the learner formulates a plan based on the trial content map and reaches a second flower that has one or more features of the original flower. At this time, the system receives information needed to evaluate the trial content map. The flower either has ample nectar or it doesn’t. If it doesn’t, the trial content map must be modified on the basis of the new information. Stage 3 (New Stage 1) is a modified content map, which is the product of the Stage 2 calculations. It may still be a trial content map, subject to further modification after the next application. However, it is the content map that will be installed for the next pursuit of finding a flower with ample nectar. Stage 1 of the next pursuit will have this map. The learning cycle occurs within the context of the performance cycle. The performance cycle is used by the system to implement specific plans that the agent formulates and follows to meet the requirements of the extant content map. For instance, the trial content map is presented to the agent. The map directs a general strategy, such as “Fly to red.” The agent plans a flight to a specific red flower. If there is a wind, the agent must adjust its flight plan. This entire pursuit is Stage 1. For Stage 2, the system evaluates whether the content map predicted correctly. Does the current red flower have nectar? If so, the trial content map is strengthened. If not, the map is modified based on the information provided by the red flower. The result of Stage 2 is a content map that is modified in some way (New Stage 1). That map becomes the trial content map for the new Stage 1 pursuit. Stage 1. The start of the learning cycle is the initiation of the plan. 
The pursuit is continued until termination. The aspects of Stage 1 that are new for learning are the trial content map and the record of sensory features. The projected outcome for the Stage 1 pursuit is that the primary reinforcer (ample nectar) will be attained. The termination of Stage 1 occurs when the status of nectar is evaluated. At this time, the system has information about whether the projection is accurate.

The diagram shows that the trial content map has been installed as an infrasystem function, not an agent function. The infrasystem presents the trial content map to the agent as it would present a completely hardwired content map. It is issued reflexively in the presence of the features called for by the trial content map. It fortifies the agent’s reception of the features with enhanced sensation. The agent does what it does in a hardwired pursuit. It responds to the urgings to pursue the specified features. To do this, the agent applies the map to the specific concrete setting, formulates a plan consistent with the map, and issues a corresponding response directive. The main requirement of the trial content map is that it represents the features of the flower in a way that allows comparisons of these features with the features on which the plan is based. The diagram shows that the trial content map is retained by the system until Stage 2. This provision is necessary because the calculations performed during Stage 2 lead to modification of the trial content map. The map cannot be modified if it is not retained by the system through Stage 2.

Record of Sensory Features. Figure 6.1 shows that the record of sensory features is created at the time the plan is formulated. The record provides information about the various features of the flower, not simply the features that may be involved in the plan. The plan may be based on a flower that shares one or two features with the original flower. However, unless the representation refers to the other features of the original flower, the system will not be able to draw conclusions if the flower being pursued proves to be negative (no nectar). In contrast, if the system has a complete representation of the original flower, important inferences are possible. The record of sensory features is retained by the system until Stage 2. At this time, the targeted flower has been approached and its nectar feature has been identified. The record of sensory features becomes a necessary element in the process that determines the modifications of the trial content map.

Stage 2. Stage 2 occurs after the presence of nectar has been either confirmed or disconfirmed. As the diagram indicates, the analysis for Stage 2 is based on (a) the record of sensory features of the original flower, (b) the trial content map that governed the Stage 1 pursuit, and (c) the current sensory receptions. All these elements are necessary for the system to identify the implications of the current flower for how the trial content map will be modified.
The analysis for Stage 2 involves three calculations. Calculation 1. Calculation 1 assembles the information the system needs to perform comparisons of the projected outcome with the observed outcome. This information consists minimally of the record of sensory features of the original flower and the sensory features of the current flower. The sensory features of the current flower are determined by reference to the current sensory input. Calculation 1 identifies these features according to the same set of abstract rules that were used to formulate the record of features for the original flower. With both flowers expressed as a sum of their features, the system has the base information needed to perform comparisons.
Calculation 2. Calculation 2 compares the projected outcome with the realized outcome. The projection was that the flower would have ample nectar. That projection is confirmed or disconfirmed by comparing the projected nectar with the realized nectar. If the nectar is abundant, the projection is confirmed. If the nectar is not abundant, the projection is disconfirmed. The operations involved in achieving Calculation 2 are the same as those identified for the hardwired system that compares sensory outcomes of a plan with projected outcomes. As noted earlier, both outcomes involved in the comparison (the present setting and the projection) are represented in a common code. Without this abstraction, no comparison would be possible. Calculation 3. Calculation 3 correlates the results of Calculations 1 and 2 with the features specified by the trial content map. The trial content map has been retained by the system. The map provides a record of the features that the system assumed would lead to ample nectar. Calculations 1 and 2 provide information about the current flower, its features, and whether it has nectar. To perform Calculation 3, the system needs two different logic formats—one that applies if the current flower is positive and another if it is negative. The system assumes a causal relationship between the map and the realized outcome. The learner approached the flower only because the trial content map indicated that approaching a flower with certain features would lead to a positive outcome. Table 6.4 shows the types of modifications of the original content map that are implied by whether the current flower is positive or negative. As Table 6.4 shows, the system will either strengthen or weaken features. There are three categories of features—features common to both the original positive example and current flower, features unique only to the original flower, and features unique only to the current flower. 
TABLE 6.4
Modification of Trial Content Map Based on Calculation 3

                                                      Current Flower
                                                      Positive        Negative
Features common to the original and current flower    Strengthened    Weakened
Features unique to original flower                    Weakened        Strengthened
Features unique to current flower                     Weakened        Weakened

The second column shows the modifications that occur if the current flower is positive. The features that are common to this flower and the original are strengthened. The features that are unique to either the original or current flower are weakened. The logic is that only those features that are common to both flowers could be the basis for a rule that predicts outcomes
with other flowers. Therefore, these features are strengthened. Those features that are unique to either the original or current flower could not possibly be single-feature predictors. They are therefore weakened. The third column shows the modifications that occur if the current flower is negative (not ample nectar). The features that are common to the original flower and the current flower are weakened (because they could not possibly be single-feature predictors). The features that are unique to the original flower are strengthened (because they have not been discredited as being predictors). The features unique to the current flower are weakened. The system retains a record of features in a weakened form. The reason is that, without this provision, the system would not be able to learn predictions based on a combination of features rather than just single features. Let’s say that the trial content map is based on the rule that red predicts nectar. The bee flies to a red flower. The sensory receptions disclose that there is no nectar. If the system were designed to eliminate red as a consideration for all future content maps, the system would not be able to accommodate the possibility that nectar is predicted by more than a single feature. For instance, if color and size predicted nectar, elimination of red as a possible predictive feature would preempt the system from discovering the combination that predicts. If the features of the original flower are retained, even in weakened form, the system has the potential to learn discriminations based on a combination of features. Note that the analysis applies to any learned content. It does not assert that bees are able to learn to categorize on the basis of a combination of features. Rather it says that if they were capable of formulating such trial content maps, their system would need some provision for weakening, not eliminating, features that apparently have been disconfirmed. 
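The strengthening and weakening rules of Table 6.4 can be written out as a small Python sketch. It is offered only as an illustration under stated assumptions: features are represented as strings, strengths as numbers, and the function name `update_trial_map`, the step size of 0.5, and the floor of 0.1 are hypothetical. The floor stands in for the provision that disconfirmed features are weakened rather than eliminated.

```python
def update_trial_map(strengths, original, current, positive, step=0.5, floor=0.1):
    """Return a new feature -> strength map after one Stage 2 analysis.

    original, current: sets of features of the original and current flower.
    positive: True if the current flower has ample nectar.
    step and floor are illustrative assumptions, not values from the text.
    """
    common = original & current
    only_original = original - current
    only_current = current - original

    if positive:
        strengthen, weaken = common, only_original | only_current
    else:
        strengthen, weaken = only_original, common | only_current

    updated = dict(strengths)
    for feature in strengthen:
        updated[feature] = updated.get(feature, 0.0) + step
    for feature in weaken:
        # Weakened features are kept at or above the floor, never deleted,
        # so combinations of features remain learnable later.
        updated[feature] = max(floor, updated.get(feature, 0.0) - step)
    return updated


original = {"color:red", "size:large", "shape:bell"}
current = {"color:red", "size:small", "shape:bell"}
start = {feature: 1.0 for feature in original}

after_negative = update_trial_map(start, original, current, positive=False)
after_positive = update_trial_map(start, original, current, positive=True)
```

In the negative case, red and bell shape drop to 0.5 and large size rises to 1.5; in the positive case the pattern reverses. In both cases every feature is still present in the map.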
The only complete source of information about possible feature combinations is the set of positive examples.

Stage 3. The result of the analyses performed by Calculations 1 through 3 is an updated trial content map. The modified content map is retained by the system and becomes the new basis for the agent to formulate a plan. The map indicates the various features that the system currently assumes to be predictive (and has enhanced with secondary sensations). The new content map would tend to specify fewer features than the previous trial map and would enhance particular features more strongly. This process would be repeated until the outcomes tended to be those predicted by the trial content map. At this time, the trial content map would function as a learned map and would tend not to be modified by information about new flowers (because all would confirm that the features specified by the content map led to nectar).
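The point at which a trial content map begins to function as a learned map can be illustrated with a simple monitor of recent outcomes. The Python sketch below is only one possible reading of "until the outcomes tended to be those predicted." The class name `TrialMapMonitor`, the window of recent pursuits, and the required confirmation rate are assumptions introduced for the example.

```python
from collections import deque


class TrialMapMonitor:
    """Track whether recent projections were confirmed (illustrative only)."""

    def __init__(self, window=5, required_rate=0.8):
        # window and required_rate are hypothetical parameters.
        self.outcomes = deque(maxlen=window)
        self.required_rate = required_rate

    def record(self, projected_nectar, realized_nectar):
        """Store True if the projection matched the realized outcome."""
        self.outcomes.append(projected_nectar == realized_nectar)

    def is_learned(self):
        """The map counts as learned once a full window is mostly confirmed."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return sum(self.outcomes) / len(self.outcomes) >= self.required_rate


monitor = TrialMapMonitor(window=4, required_rate=0.75)
monitor.record(True, False)      # projection disconfirmed
for _ in range(2):
    monitor.record(True, True)   # projections confirmed
learned_early = monitor.is_learned()
monitor.record(True, True)
learned_later = monitor.is_learned()
```

With these settings the map is not yet treated as learned after three pursuits, and is treated as learned once three of the last four projections have been confirmed.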
Requirements for Feature Identification and Content Maps

The various steps in Stage 2 may seem abstract and difficult for a simple learning system to perform. However, the proof is in the behavior. If the system learns that a specific feature or set of features is a discriminative stimulus for the approach response, the task cannot be accomplished unless the system identifies the features that are discriminated and performs the calculations required to use information about each pursuit or set of pursuits to identify those features that are true predictors. If the system is to learn to identify specific features of antecedent events that predict, the system must engage in some form of the three-phase analysis indicated for Stage 2. The details of the sensory reception must be classified in a way that makes them comparable to details that have been represented by the system in Stage 1.

The three stages shown in Fig. 6.1 are necessary for all antecedent learning. Stage 1 shows that the learner already has a trial content map for securing a positive outcome. This trial map could not have been formulated until the learner had information about an occasion of the primary reinforcer. The primary reinforcer may have occurred when the learner was engaged in another pursuit. Suddenly the primary reinforcer occurs. At this time, the system must formulate a trial content map based on the current setting and whatever may have occurred before the presentation of the primary reinforcer. The system enhances changes that occurred before the occurrence of the primary reinforcer with secondary sensations so the agent will attend to them when planning a pursuit that involves the primary reinforcer. The changes are recorded in the trial content map. The features may not be under the learner’s control (created by actions of the learner), which means that the learner may have to wait for one or more features to occur and then determine whether they predict the primary reinforcer.
When one of these features does occur, the trial content map is presented to the agent, and the Stage 1 plan is formulated. The agent assumes that the antecedent event predicts the primary reinforcer and formulates a plan of action—escaping a negative primary reinforcer or approaching a positive one. The system records what happens following the presentation of the trial predictor. This record indicates whether the primary reinforcer occurs. If the primary reinforcer does not occur, the predictor is weakened and the system must wait for (a) the occurrence of possible predictors, or (b) the next occurrence of the primary reinforcer to gather more information about possible predictors. Any antecedent event that occurred on both occasions but not at other times is most probably the predictor. The system would therefore be designed to greatly enhance the record of that event, but retain the others in a less-enhanced form (on the possibility that the pattern may involve different predictors on different occasions or in the
context of different enduring features). If the predictor is a single feature that occurs on all occasions of the primary reinforcer (and only on these occasions), the ideal system that maintains a record should be able to identify the antecedent stimulus in two trials. The extent to which the system is ideal is an empirical matter. Relative Features. The processes the system uses to classify and represent features must be fairly sophisticated if the system is to refer to relative features such as apparent size. If the system defines a large flower as a medium-size image of the flower, the classification would completely fail because any flower looks medium size at some distance. A large flower will look identical in size to a small flower that is at a different distance. Therefore, unless each flower is referenced to a standard unit or method of comparison, identification of the object’s size would be impossible. Possibly, the reference is to blossoms that are larger or smaller than most in the population. Possibly, the system has a precise way of measuring size. For either possibility, the system would have to abstract sensory receptions and apply the criterion for determining whether the current flower falls in the appropriate size range. If the system is capable of determining absolute size, it would have to abstract that part of the visual reception that constitutes the flower, calculate the distance from the eye, and project the flower on a scale that indicates its absolute size. A transformation of some sort would be required for the flower to fit the scale. Any simpler model for determining absolute size would fail. Also some sort of transformation is needed for any feature that may be observed in various transformed appearances. For instance, apparent color (that received by the eye) varies greatly as a function of level of illumination. 
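Returning to the size example, the transformation from apparent size and distance to absolute size can be illustrated with the standard geometric relation: an object of size s at distance d subtends a visual angle of 2·atan(s / 2d). The Python sketch below is not a claim about how any organism computes size. The function names and the numerical values are hypothetical, and it simply assumes the system has both the visual angle and the distance available.

```python
import math


def angular_size_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def absolute_size(angle_deg, distance):
    """Invert the relation: recover size from visual angle and distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


def in_size_range(angle_deg, distance, low, high):
    """Apply a size criterion to the recovered absolute size."""
    return low <= absolute_size(angle_deg, distance) <= high


# A small flower nearby and a large flower farther away (arbitrary units).
small_near = angular_size_deg(size=0.1, distance=1.0)
large_far = angular_size_deg(size=0.2, distance=2.0)
```

The two flowers produce the same visual angle, so apparent size alone cannot separate them; only when distance is taken into account do the recovered sizes differ.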
Without a transformation-independent representation of red, red in shadow lighting would not be recognizable as the same red in sunlight. If the wavelength is abstracted and represented as being independent of the level of illumination, the color may be identified in different illumination contexts.1

1 The extent to which the learning-performance system corrects transformed input could be determined by presenting positive examples of a color and negative examples that are a darker shade of the same color. Next, a much brighter illumination context is presented. A new negative example that is close in apparent color to the positive example in the lower light context is presented. Finally, all four items are presented in both low- and high-illumination settings. If the learner discriminates in both settings, the learner is able to identify the effects of illumination. If the learner confuses positives and negatives as a function of the illumination, the learner’s receptions are based on absolute values.

Default Functions. There are different types of content maps for different types of learning. The differences result largely from two factors:
1. Whether the learned map is to become a relatively permanent part of the learner’s repertoire or whether the learned map is to be used only so long as specific conditions exist.
2. Whether the features of the examples are the same for every occasion and may therefore be prompted through checklists and default content maps.

How permanent the learning is depends on the permanence of the content assumed by the system. The productive rule for finding nectar this week is not necessarily the same as the rule that holds for next week. Therefore, the system would be poorly designed if it made the rule for this week part of the learner’s permanent repertoire. Rather, the system would be more efficiently designed to learn new rules as the conditions change. (This week, pursue large red flowers; next week, pursue small yellow flowers on south slopes.) The degree of permanence is revealed largely through the extinction behavior of the organism. The kind of temporary content map that would be used for pursuits like seeking nectar would necessarily be designed to undergo extinction readily as the information about available nectar changed. The kind of content map used to process rules like “Stay away from porcupines” would be designed so it did not extinguish readily and instead became part of the learner’s permanent repertoire.

The most specific default content map possible may be analytically determined by identifying what is the same about all pursuits of a given type. Only those aspects that are the same could be incorporated in the default content map. For the flower pursuit, there are details about the objects of pursuit and the objective of the pursuit that are the same for all applications:

1. For all pursuits, the learner flies to a visual image that has features of flowers. (Experimentally, pursuits that involve nonflower facsimiles may be designed, but they have flower features, including sugar.)
2. For all pursuits, the learner observes features of the flower from a distance.
3. For all pursuits, the goal is to locate nectar.

These shared samenesses describe the maximum possible specificity of a default content map. If blossoms are present but the default content map (not a trial content map) governs the pursuit, the most specific default map possible would be something like, “Fly to flower to obtain nectar.” This map refers to the general behavior and the feature of the object that is targeted at a distance. The learner must identify an object that is classified as a flower and track it as the flying occurs.
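The idea that a default content map can contain only what is the same about all pursuits of a given type amounts to taking what the pursuits have in common. The Python sketch below illustrates that reading. The function name `default_content_map` and the string labels for the features of each pursuit are hypothetical.

```python
def default_content_map(pursuits):
    """Return the features shared by every pursuit of one type.

    pursuits: an iterable of feature sets, one per pursuit (illustrative
    representation; the text does not specify any encoding).
    """
    pursuits = [set(features) for features in pursuits]
    if not pursuits:
        return set()
    shared = pursuits[0]
    for features in pursuits[1:]:
        shared = shared & features
    return shared


flower_pursuits = [
    {"target:flower", "observed from a distance", "goal:nectar", "color:red"},
    {"target:flower", "observed from a distance", "goal:nectar", "color:yellow"},
    {"target:flower", "observed from a distance", "goal:nectar", "size:large"},
]
default_map = default_content_map(flower_pursuits)
```

The features that vary from one pursuit to the next (color, size) drop out, leaving only the three samenesses listed above. These are what a map such as "Fly to flower to obtain nectar" can specify.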
For a pursuit with fewer common samenesses, the default content map would necessarily be less specific. If the source of nectar were variously some flowers, some tall trees, and some holes in the ground, the default content map would be something like, “Fly to a flower, tree, or hole to obtain nectar.” If the only feature sameness of various pursuits was that the nectar may be present in the current setting, the default content map could not provide specifications more precise than “Fly and search for nectar.” For systems that provide for only general default content maps, although more specific ones may be available, the learner is provided the opportunity to learn a rule about which features of the setting predict nectar. However, the learner must process a larger number of features than are required by a more specific default content map. Deployment of Default Content Maps. For nonpermanent learning (e.g., seeking nectar), the learning system would be designed to revert to the default content map without prejudice. The failure of the extant content map would be determined in the same way that the failure of a plan is determined—through a discrepancy between the predicted and realized outcomes. A confirmed discrepancy signals that the system is to revert to the default content map. If a system can deploy default content maps that specify details of the pursuit, extant content maps can be functionally erased when the predictions they make are not met. The system simply replaces the content map that does not predict with the default content map. The learner does not necessarily have a memory of the rejected content map because memory is not needed for pursuits that may be highly prompted by default content maps. If a system cannot rely on a default content map for specific content, the system cannot be designed to revert to a default content map without prejudice. The earlier content map cannot be erased and replaced by a default content map. 
There are three reasons the earlier content map cannot be erased:

1. There is no possible default content map that could help the learner greatly if the original rule were no longer predictive. For example, the learner had learned something that could not be supported in much detail by any default content map: to lie on its back to receive food. If that behavior ceased to be reinforced, there is no particular default content map that could be deployed by the system to increase the possibility of learning what it must do now to obtain food. It is not logically possible to design a content-specific content map that would prompt the new learning because the new learning is perfectly arbitrary.
2. The system does not know whether the learned rule is to be superseded by another one. If the responses are no longer reinforced, a conservative system would retain the learning that had earlier led to reinforcement in case the conditions change and the response is once more reinforced.
3. If the conditions changed so that following the learned rule now led to aversive consequences, the system would have to be designed so that additional learning occurred. Because no content-specific default map could be available, it would not be possible to design the system in a way that would erase the learned material. Instead, the learner would have to relearn the role of the former discriminative stimulus. No longer does the presence of certain features call for lying on the back; however, the details of a replacement strategy are not implied by this condition.

Complete relearning for skills intended by the system to be permanent would not be as efficient or thorough as original learning because the original learning would tend to be retained by the system.

Requirements for Learning Temporally Prior Events

The pursuit of nectar involves enduring features. They may be observed either before the nectar is measured or after it is measured. If the features that predict are not enduring, an additional provision must be added to the learning system. Predictive features must be identified when they occur. All features, whether they are temporally prior or enduring, have a temporal-predictive function in learning. The bee that has learned to fly to red flowers has learned that the visual feature observed at Time 1 predicts nectar at Time 2. The redness of the flower, however, is not temporal. It endures, which means that before learning occurs, there is no requirement to record it at a particular time. Temporally prior features that predict are no longer present when the primary reinforcer occurs.
If the learner is to learn that temporally prior features predict, the system must be designed to identify not only what these features are, but also something about when they occur in relation to the primary reinforcer. For instance, a buzzer occurs 3 sec before the presentation of an electric shock. To learn the relationship of the buzzer and the shock, the system would have to create a record of the sequence of events that occurred immediately before the presentation of the shock. If the system is designed to represent all features that occurred before the shock, the system has a record of various possible predictors. Subsequent presentations of the shock reveal which antecedent features are common to all occurrences of the shock and which are not. If the system does not represent these events, it is absolutely preempted from learning any relationship between antecedent and later events.
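The buzzer example can be illustrated with a minimal record of antecedent events. The Python sketch below assumes that each occurrence of the primary reinforcer leaves a record mapping each preceding event to the number of seconds by which it preceded the reinforcer. The function name `common_antecedents`, the event labels, and the times other than the 3-sec buzzer are hypothetical.

```python
def common_antecedents(occasions):
    """Return events that preceded every occurrence of the reinforcer.

    occasions: list of dicts, each mapping event -> seconds before the
    primary reinforcer (an illustrative encoding, not from the text).
    The result maps each shared event to its lead times across occasions.
    """
    if not occasions:
        return {}
    shared = set(occasions[0])
    for record in occasions[1:]:
        shared &= set(record)
    return {event: [record[event] for record in occasions] for event in sorted(shared)}


# Two presentations of the shock, each with a record of what came before.
occasions = [
    {"buzzer": 3.0, "door slam": 5.0},
    {"buzzer": 3.0, "light flicker": 1.0},
]
candidates = common_antecedents(occasions)
```

Only the buzzer appears in both records, so it is the only event that remains a candidate predictor. Had the system kept no record of the events preceding the first shock, there would be nothing to compare on the second.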
Some temporally prior predictors are logically easier to learn than others. For instance, if a buzzer sounded for 1 sec and was immediately followed by the shock, the probability of it being recorded as an event that preceded the primary reinforcer is greater than the probability of the system identifying a buzzer that occurred 8 sec before the primary reinforcer. There are two reasons for this difference in ease of learning. One is that the particular system may not have the capacity to maintain a running record of things that occurred during 8 sec. Possibly its limit is 6 sec or 4 sec. The second reason is that even if the system does maintain a record of the changes that occurred within the 8 sec of the reinforcer, the system would probably be designed to assume that changes that were more temporally proximal (less than 8 sec) are more likely predictors and are more likely to be enhanced with secondary sensation. This relationship is based on natural events. Those changes that occur immediately before a presentation provide the most obvious information basis for planning behaviors that avoid the primary reinforcer. Stated differently, the greater the intervening period between buzzer and shock, the greater the number of events that could occur between buzzer and shock. If three unique events of equal salience occurred after the presentation of the buzzer, the likelihood of the buzzer being enhanced with secondary sensation is greatly reduced.2 All the provisions illustrated in Fig. 6.1 apply to learning temporally ordered events. However, the record of features that the system creates through Calculation 1 would have to indicate the temporal ordering of any possible predictor and the primary reinforcer. Content maps for temporal events require detail not needed for processing enduring features. Maps for enduring features require only information about what and where. Maps for temporal features require the specifications for what, where, and also when. 
The specification of when must be fairly precise for some content maps. For instance, if the buzzer occurred 8 sec before the primary reinforcer, the system would represent the avoidance options differently than it would if the buzzer occurred 1 sec before the primary reinforcer. The difference in options implies an understanding of the time available for escape.

2 Note that these system features would generate the data observed in conditioning experiments and those that involve escape and avoidance behavior (Skinner, 1938).

Memory of Temporal Events. The system could not learn to respond to temporal predictors unless the system performed four operations:

1. Enhance any unpredicted sensory changes that occur before the primary reinforcer occurs.
2. Preserve the events and preceding changes in memory.
3. Permit the memories of features to decay if they are not followed by a primary reinforcer on subsequent occasions.
4. Fix and enhance records of any feature that occurs in more than one presentation of the primary reinforcer.

Attention to Sudden Changes. If any unique sudden or unanticipated changes in stimulus states are enhanced with secondary sensation, the system ensures that the agent will attend to these changes. The logic is that if a predictor occurs before the presentation of the primary stimulus, it will either be an enduring feature that may be identified before or after the presentation of the primary reinforcer, or it will be a temporally prior predictor. All possible temporally prior predictors are marked by some sort of change in the patterns of sensations the system receives. Therefore, if the system enhances all unanticipated changes, the system will identify ones that predict the primary reinforcer. Note, however, that most of the changes recorded will not be predictors. All changes must be enhanced, however, if the system is to ensure identification of the features that do predict the primary reinforcer. Enhancement ensures that the agent will attend to the next occurrence of the possible predictor. Without enhancement, the agent is less likely to attend to it, particularly if the predictor occurs within the context of competing stimuli. The enhancement increases its salience and therefore increases the probability of the agent attending to it.

In summary, if the predictive event is not recorded, it cannot be learned because it will be part of the past that had not been transported (by a record) into the future. Therefore, it is an axiom that if the buzzer sound is learned to be a discriminative stimulus, it was recorded, represented in memory, and retained in memory until the trial on which the learned performance occurred.

Memory and Decay.
One problem facing the system is how to keep those features that may be relevant to a pursuit long enough for them to be available on the next trial. The problem is that the next trial may be more than 24 hours after the first. A possible design is a memory that assumes that the primary reinforcer will occur again and is predicted by unique features. When the primary reinforcer occurs, it fixes a set of features (not necessarily all, but at least some) in what amounts to a long-term infrasystem memory. The features would include both the enduring and unique temporally prior features. The length of time the items are in this memory is a function of both the species’ capacity and the strength of the primary reinforcer. Also with stronger reinforcers, more features would be fixed in memory.
The system tests the possible predictors in one of two ways. If the primary reinforcer occurs again, any features that occurred on the first and present occasions are confirmed as predictors. The other way is to observe what happens on various occasions of the possible predictors that have been recorded. If they occur and are not followed by the primary reinforcer, they decay. If the primary reinforcer follows their next occurrence, they are strengthened for the next trial content map.3 Ongoing Record. As noted earlier, the capacity to learn is a strict function of the duration prior to the presentation of primary reinforcer that the system is able to reconstruct, and the length of time the system is able to retain features that are fixed in memory. These requirements imply a mechanism that sweeps through time and maintains a running record of what is happening now and what has occurred within the latency limits of the system. From time to time, something unexpected occurs. The system reflexively enhances the record of all events that occurred before it and the features of the current setting. As the organism proceeds through time, the possible predictors occur (including those that are enduring features of the setting). The system tests each and weakens any that do not predict the previously unanticipated outcome. The system also enhances those features that predict. The predictive ones are retained in trial content maps. This ongoing record and all its provisions are absolutely necessary if the learner is to learn. Learning assumes prior uncertainty. Before the organism has learned something, it does not know the relationship between a primary reinforcer and predictors. Because learning is uncertain, the system does not know when a particular predictor will occur, and it does not know what that predictor is. 
The only manner in which it will learn about events that are temporally related is to maintain a running record, sort or identify possible predictors, and test them on later occasions. If any of the memory or testing functions specified were eliminated, the system would fail and would not be able to learn. There are other possible memory designs, but they are not as efficient or practical as limiting the record to changes that occurred before the primary reinforcer. For example, if everything that occurred immediately before the primary reinforcer is enhanced and recorded, the system would have to process an incredible number of records. Most of them would be inert information. For instance, a tree is to the left, the wind gusts, the organism is moving, and hundreds of other things are currently occurring. Most of these do not have to be maintained in the record because they are discredited as temporal predictors. The wind gusts after the primary reinforcer occurred, and it is not followed by a second occurrence of the primary reinforcer. Therefore, it is discredited. It is a feature that the agent was aware of, but not a feature that could uniquely predict the primary reinforcer. Nor could most of the others, like the walking. Only those unique changes that occurred before the primary reinforcer are possible predictors.
Footnote 3. A possible problem occurs with the testing of possible predictors that anticipate a highly aversive primary reinforcer. The system would greatly enhance all the possible predictors and the agent would tend to respond to them when they next occur. If one of them is the predictor of the primary reinforcer, the system is preempted from discovering which one it is because the agent avoids all of them.
Frequency of Primary Reinforcer. One possible problem that could greatly retard the learning of some relationships is that the primary reinforcer may occur infrequently. Those details fixed in permanent memory may decay. Therefore, it may require several occasions of the primary reinforcer to produce a memory of the events. However, regardless of the system’s efficiency, it logically must perform all the functions indicated in Fig. 6.1 and would need to perform the various memory functions indicated previously.
Negative Reinforcers. As noted, an inevitable problem with the learning of temporally prior predictors occurs if the primary reinforcer is negative. The problem is that the execution of any response based on avoiding a feature leads to a self-fulfilling prophecy. If all the temporally prior changes in features are identified and used as information to avoid the primary reinforcer, all are judged to be successful. However, most are not true predictors of the aversive stimulus. Therefore, the system learns unproductive strategies. Let’s say there were two unpredicted changes that occurred before the presentation of a highly negative primary reinforcer and that one of these prior features is the true predictor. Both of these features are highly enhanced by the system.
On subsequent trials, the system would not receive good information about which change is the actual predictor unless it waits for the primary reinforcer and then identifies the predictor that occurred on both occasions. The risk may not be worth the aversion. Therefore, the agent may produce avoidance responses to both potential predictors. This strategy effectively avoids the negative outcome, although one avoidance is based on a false predictor. If both predictors occur infrequently, the system may never learn that one of these is not a true predictor. If the false predictor occurred at a high rate, the system would be able to discover the true predictor simply because the probability is high that the learner would fail to respond to the true predictor and encounter a primary reinforcer despite plans to avoid it. If only one of the two predictors occurred prior to the occurrence of the primary reinforcer, the system would have information about the true predictor.
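The testing of possible predictors, and the way avoidance blocks that testing, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not part of the analysis: the class name PredictorMemory, the numeric strengths, the gain and decay values, and the two candidate features (a buzzer taken to be the true predictor and a light taken to be the false one). The sketch suggests only that candidates separate when the outcome is observed and stay indistinguishable when every candidate triggers avoidance.

```python
from dataclasses import dataclass, field


@dataclass
class PredictorMemory:
    """Hypothetical store of candidate predictors and their strengths."""
    gain: float = 1.0   # assumed increment when a candidate is supported
    decay: float = 0.5  # assumed multiplier when a candidate is discredited
    strengths: dict = field(default_factory=dict)

    def fix(self, changes):
        """The reinforcer occurred: fix every recent unanticipated change."""
        for feature in changes:
            self.strengths.setdefault(feature, self.gain)

    def test(self, feature, reinforcer_followed):
        """Score one later occurrence of a candidate whose outcome was observed."""
        if reinforcer_followed:
            self.strengths[feature] += self.gain
        else:
            self.strengths[feature] *= self.decay

    def credit_avoidance(self, feature):
        """The agent avoided and nothing aversive followed, so the plan looks successful."""
        self.strengths[feature] += self.gain


def run_trials(avoid_all, trials=4):
    memory = PredictorMemory()
    memory.fix(["buzzer", "light"])  # both changes preceded the first reinforcer
    for _ in range(trials):
        for feature in ("buzzer", "light"):
            if avoid_all:
                memory.credit_avoidance(feature)
            else:
                # Assumption: only the buzzer is a true predictor.
                memory.test(feature, reinforcer_followed=(feature == "buzzer"))
    return memory.strengths


print(run_trials(avoid_all=False))  # buzzer strengthened, light decayed
print(run_trials(avoid_all=True))   # both strengthened equally
```

When the agent avoids both candidates, the two strengths remain equal, which is one way to picture the self-fulfilling prophecy described for negative reinforcers.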
SUMMARY
Basic learning involves the completion of content maps. These maps are grounded in primary reinforcers, sensory states, and outcomes that are either positive or negative. The completed content map identifies three components: discriminative stimulus, response strategy, and reinforcing consequences. The assumed relationship of the components is that antecedent features predict the possible occurrence of the primary reinforcer. The response exploits this relationship. The content maps that are created or completed through learning are limited by the principle that the map must be designed to apply to all applications that have particular sensory features. The assumption of learning is that some form of temporally prior event predicts the occurrence of the primary reinforcer.
Different degrees of scaffolding or prompting limit the learned content required to complete a map for a particular type of learning application. The greater the amount of scaffolding, the greater the constraints on the content learned. For imprinting, a complete and often extensive hardwired program is fitted to an individual. The imprinted map may be keyed to an individual event or place (imprinting of a map for following an individual) or it may provide the learner with a tailor-made response repertoire and information about how to direct sophisticated motor functions. For antecedent learning, all basic learning is keyed to the presence of a primary reinforcer.
There are two differences between imprinting and even simple learning that involves only the identification of antecedent events:
1. Antecedent learning is based on the observed correlation of features: a predictive feature and the features of the primary reinforcer.
2. True learning is achieved through a series of logical steps that confirm the predictive feature and disconfirm nonpredictive features.
The agent has no preknowledge of the relationship.
Therefore, the content map that results from the first occurrence of the primary reinforcer takes the form of a hypothesis that is modified on the basis of evidence. These are the trial content maps, which result in plans that either lead to the predicted outcome or that fail.
To formulate hypotheses, the system must be able to identify possibilities and use logical operations to rule out those not supported by evidence. Default content maps may be employed to simplify and focus learning. The default map provides general directives that increase the probability of the learning that applies to the current conditions. If the pursuit is quite specific (such as finding nectar), the default content map may be quite specific. As the range of variation in what the learner may learn increases, the specificity of the default content map necessarily decreases. For learning that is not confined to a particular type of object or particular behavior, the default content map may be no more specific than a strong urging to “Find what predicts the primary reinforcer.”
Learning requires an extension of the processing functions required for hardwired performance. The learning function is an addition to, not a substitution for, the performance requirements of the system. The basic steps of planning a strategy, executing it, and comparing projected and realized outcomes of the plan are required for the system that learns. This system simply adds provisions to the performance system.
Learning requires a cycle of six steps for applying a trial content map, performing logical operations to determine how it should be modified, and modifying it for the next possible encounter with the primary reinforcer:
1. Create a trial content map that identifies features assumed to predict the primary reinforcer.
2. Formulate a plan that is consistent with the map and execute it.
3. Secure sensory information about the outcome.
4. Compare the projected and realized outcomes.
5. Use the information about the outcome, features of the earlier occurrence of the primary reinforcer, and features of the current occurrence to draw a conclusion about the features that predict.
6. Modify the trial content map (either altering the content or strengthening the extant provisions).
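The six steps above can be sketched as a single function that is applied once per encounter. This is a sketch under stated assumptions, not a specification of the system: the function name run_cycle, the representation of a trial content map as a dictionary of feature strengths, the unit adjustments, and the two flower colors (with red assumed to be the feature that predicts) are all hypothetical.

```python
def run_cycle(trial_map, options, reinforcer_present):
    """Apply Steps 2-6 once and return the chosen feature and the modified map.

    trial_map: feature -> strength (the product of Step 1).
    options: features available in the current setting.
    reinforcer_present: callable standing in for the sensory outcome.
    """
    # Step 2: plan consistently with the map (pursue the strongest feature) and execute.
    target = max(options, key=lambda feature: trial_map.get(feature, 0.0))
    # Step 3: secure sensory information about the outcome.
    realized = reinforcer_present(target)
    # Step 4: compare projected and realized outcomes (the plan projects success).
    projected = True
    outcome_predicted = (realized == projected)
    # Step 5: conclude whether the feature the plan relied on predicts.
    change = 1.0 if outcome_predicted else -1.0
    # Step 6: modify the trial content map for the next encounter.
    updated = dict(trial_map)
    updated[target] = updated.get(target, 0.0) + change
    return target, updated


# Step 1: a trial map with no preference between two candidate features.
trial_map = {"white": 0.5, "red": 0.5}
choices = []
for _ in range(3):
    target, trial_map = run_cycle(trial_map, ["white", "red"], lambda f: f == "red")
    choices.append(target)
print(choices, trial_map)
```

In this run the first plan fails, the feature it relied on is weakened, and the later plans shift to the feature that is confirmed.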
The basic learning system must be designed to perform the logical operations necessary to execute Step 5. The system must have provisions for (a) identifying possible antecedent predictors, and (b) eliminating possibilities that do not predict. The features identified may be enduring or nonenduring. The system must be able to identify them without prejudice because the system has no way of knowing which are true predictors. The basic process for eliminating those that do not predict involves classifying outcomes as either positive or negative and correlating each outcome with the trial content map. If the outcome was predicted (and therefore positive), the feature on which the
prediction was based is strengthened. If the outcome was not predicted (and therefore negative), the feature on which the prediction was based is weakened.
For both enduring features (such as color or location) and nonenduring features (a prior event that does not occur when the primary reinforcer is present), the discriminative stimulus has a predictive function. An enduring feature that uniquely marks the occasion or presence of the primary reinforcer may function as a temporal predictor if it is observed before the primary reinforcer is encountered. Predictors like the buzzer, which are not enduring, have the same temporal function. The way in which they are learned, however, requires that they are identified as having the feature of being temporally prior. The learning of nonenduring predictors requires provisions that are not needed for the learning of enduring features:
1. The system is designed so that the agent attends to things that change unpredictably.
2. A memory function retains unpredicted changes that precede a presentation of the primary reinforcer.
3. When the primary reinforcer occurs, an enhancing function reflexively fixes all features in the memory record.
The products and operations required for antecedent learning derive from the content. The learner is going to predict. What it predicts is not amorphous, but specific. The learner predicts something. That something must be specified. The prediction is based on something that occurred earlier. That something must be specified. The predictor and that which is predicted are related in a specific way. That way must be specified. To secure all the relevant information needed to formulate the prediction, the system must collect data and perform the minimum set of the logical operations that process those data so they lead to a conclusion consistent with facts.
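The three provisions for nonenduring predictors can be pictured as a running record with a limited window. The following is a minimal sketch, assuming a hypothetical class RunningRecord, arbitrary time steps, an assumed latency limit of 5 steps, and invented events; the analysis itself does not prescribe any of these details.

```python
from collections import deque


class RunningRecord:
    """Hypothetical running record limited to a fixed latency window."""

    def __init__(self, latency):
        self.latency = latency  # assumed window length, in arbitrary time steps
        self.recent = deque()   # (time, change) pairs still inside the window
        self.fixed = set()      # changes fixed by a primary reinforcer

    def _forget(self, now):
        # Drop records older than the latency limit.
        while self.recent and now - self.recent[0][0] > self.latency:
            self.recent.popleft()

    def observe(self, time, change, anticipated=False):
        self._forget(time)
        if not anticipated:  # Provision 1: only unpredicted changes are enhanced.
            self.recent.append((time, change))  # Provision 2: retain them.

    def reinforcer(self, time):
        self._forget(time)
        # Provision 3: fix everything still in the record.
        self.fixed.update(change for _, change in self.recent)
        return set(self.fixed)


record = RunningRecord(latency=5)
record.observe(0, "wind gust")                    # too early to survive the window
record.observe(7, "buzzer")                       # unpredicted and recent
record.observe(8, "footfall", anticipated=True)   # anticipated, so not enhanced
candidates = record.reinforcer(10)
print(candidates)
```

Only the change that was both unanticipated and inside the window is fixed as a possible predictor; later trials would still have to confirm or discredit it.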
Chapter 7
Basic Response-Strategy Learning
Chapter 6 presented the basic learning paradigm for antecedent-only learning. This chapter develops basic learning of response strategies. The learning of new motor responses is not dealt with in this chapter, but is presented in chapter 11. The response strategies discussed in this chapter are categorized into two groups: (1) strategies that require the learning of only a response strategy, and (2) strategies that require both the learning of the response strategy and the antecedent stimulus. The learning of Type 1 strategies could be supported by a hardwired content map. The learning of Type 2 strategies could not because the learner must learn both the predictor of the opportunity to respond and the response. No default content map could be designed to anticipate the various combinations that the learner may confront in different pursuits.
The difference between the types is illustrated by two experiments. For the Type 1 learning, food is presented in a chicken-wire box. The learner is required to figure out the response strategy of pressing a button on the box to open it. (The box with food in it is the predictor or discriminative stimulus; the button pressing is the response strategy.) All the learner would have to learn is the specific strategy of pressing the button. For the Type 2 learning, food is presented in the box. To open the box, however, the learner must go to the other side of the room and press a lever. For the Type 1 situation, an effective default content map would simply prompt the learner to approach the food. The learner would get as close as possible, encounter the box, and, through experimentation, press the button. For Type 2, no possible hardwired default content map could set the
stage for the learner to not approach the food, but to consider an extraneous feature of the setting as the true predictor. The default content map is limited to those features that are the same for all applications that secure food. All natural instances involve approaching the food and eating it. The default content map, therefore, could present nothing more specific than a general urging to the learner to approach and secure the food. The Type 2 situation requires a behavior that does not involve either approaching food or demanding—crying or begging so that food would be presented. In fact, the default content map that calls for approaching the food could seriously retard the learning of the experimental rule because learning the needed response requires moving away from the food to discover the role of the lever. When dealing with organisms that learn, the Type 1 content map may be learned. The learner who has learned to approach food possesses a content map that does not support the learning of a Type 2 strategy. Without some form of prompting or scaffolding that simplifies the task, the probability of learning a Type 2 strategy in a relatively timely manner is far less than the probability of learning a Type 1 strategy. Effective prompting for the Type 2 strategy would change the task from a Type 2 to a Type 1 so that the learner’s default content maps and previous learning of natural relationships would facilitate the learning. Once the simplified task or component strategy is learned, the learner would not have great difficulty learning the unprompted task because now the prior learning would facilitate learning the ultimate experimental rule.
CREATING CONTENT MAPS FOR RESPONSE STRATEGIES
Response-strategy learning is functionally different from antecedent learning because the learner must express the response strategy. As with any effective content map, the response strategy must be expressed so that it implies the class of responses the learner is to produce and must also be framed broadly enough to apply to a full range of applications. At the same time, the response strategy must be specified in enough detail to ensure that the learner references the response to the appropriate features of the setting (the discriminative stimulus and primary reinforcer).
The only source of information for constructing the content map is what occurs on a trial (or a set of trials) that leads to reinforcement. The system assumes that the outcome of the trial was a function of the plan and the response directive. Therefore, the system must somehow transform the plan and response directive into something more general than a guide for a situation-specific application. The system must, in effect, do the opposite of what it does when it responds to a content map issued by the infrasystem. Instead of going from the general to the situation-specific application, the system must proceed from the facts of the successful application to a map that applies to that setting and all others that have certain features.
Without creating a general content map, learning is impossible. Neither the plan nor the directive that the learner formulates for the first successful trial will ever be appropriate for another situation because the conditions will never be exactly the same. Without some form of map that refers to the successful setting and others that have specified features, it would be impossible for the learner to respond to the full range of applications.
Strategies as the Sum of Features
The most basic tasks that involve antecedent learning focus on a single feature, and the learner has the option of creating single-feature trial content maps (flying to a flower that has the feature of redness versus whiteness, for instance). Response-strategy learning does not afford this single-feature option because, unless a minimum combination of response features is specified in the trial content map that follows the first successful attempt, there is no second attempt. The content map must describe multiple features or characteristics of the response strategy. For example, if the learner successfully breaks an eggshell by dropping the egg on a rock, the learner could not simply identify a single feature, such as a rock or dropping the egg, as the response component in the trial content map. The representation would have to describe a strategy: dropping the egg on a rock. Furthermore, the details of dropping would have to be articulated precisely enough for the learner to replicate the various features of the original successful response (e.g., about how high it is held before release).
Details of Response Repertoire
The response strategy refers to what the agent will do and, therefore, implicates the learner’s response repertoire. Whatever strategy is specified must be expressed in a way that describes classes of responses or behaviors that are in the learner’s response repertoire. The strategy will indicate some form of systematic pattern that occurs over time. In effect, the process is, once more, the opposite of that used to go from content map to response directive. For proceeding from content map to directive, the hardwired system issues possible menus for selecting responses and identifying the components of the response. Proceeding from directive to content map requires the specification of more general response categories based on a specific application.
Table 7.1 shows a reversed version of Table 3.2. Table 3.2 showed the menu of response options and the plan based on the response options. The content map presented the rule “Move to approach X.” The agent planned a specific response that was consistent with the provisions of the content map. The agent selected a walking response that had a rate of ++, a direction of +14 degrees, and a posture of P1.
Table 7.1 is based on a situation in which the learner discovered the response strategy that approaching X had led to the primary reinforcer. Let’s say the learner encountered the primary reinforcer only one time by issuing any of the directives specified in the table. The system would tend to generalize beyond the specification of the initial directive. If the learner issued the running directive, the response could be generalized to, “Run with posture P5 to approach X,” “Run to approach X,” or, ultimately, “Move to approach X.” Some degree of generalization is logically required, however, if the learner is to apply the plan to any other setting. To achieve the generalization, the system must be designed so that it groups locomotive options according to common features. For example, the system must be able to identify that walking at a specified rate and direction is an example of walking and must be able to further classify walking as one of the locomotive options available to the agent.
The process of generalizing to create the content map could be achieved largely by the infrasystem because it would involve a variation of the same process for every learning application. The system would generalize the specific plan or directive. The only constraint is how far it is to be generalized. If the system is to perform such generalizations, the system must be configured so that the more specific features are related to their more general counterparts.
For instance, walking in direction +14° must be classified as an example of walking, and walking must be classified as an example of locomotion. The generalization process is discussed more thoroughly in chapter 8.
TABLE 7.1
Content Map From Situation-Specific Response
Response Variables Used (first two columns)
Locomotive Variables | Motor Response Variables              | Purpose     | Referent
Run                  | rate ++++, direction -3°, posture P5  | Approach    | X
Walk                 | rate ++, direction +14°, posture P1   | Approach    | X
Swim                 | rate +, direction -18°, posture P9    | Approach    | X
Climb                | rate +, direction +49°, posture P3    | Approach    | X
MOVE                 |                                       | TO APPROACH | X
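The generalization sequence described for the running directive can be sketched as a walk from the specific directive toward its more general counterparts. The sketch below is illustrative only: the dictionary CATEGORY, the function generalize, and the field names of the directive are hypothetical, and the values are taken from the Run row of Table 7.1.

```python
# Hypothetical grouping of locomotive options under a common category.
CATEGORY = {"run": "move", "walk": "move", "swim": "move", "climb": "move"}


def generalize(directive):
    """Return candidate rules for one directive, most specific first."""
    mode = directive["mode"]
    goal = f"to {directive['purpose']} {directive['referent']}"
    return [
        f"{mode.capitalize()} with posture {directive['posture']} {goal}",
        f"{mode.capitalize()} {goal}",
        f"{CATEGORY[mode].capitalize()} {goal}",
    ]


# The situation-specific directive from the Run row of Table 7.1.
run_directive = {
    "mode": "run", "rate": "++++", "direction_deg": -3, "posture": "P5",
    "purpose": "approach", "referent": "X",
}
for rule in generalize(run_directive):
    print(rule)
```

Any of the four rows would end at the same most general rule, which is the point of relating each specific option to its category; how far the system should generalize is left open, as in the text.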
TYPES OF RESPONSE-STRATEGY LEARNING
As chapter 6 indicated, in a strict analysis, learning involves both learning something about the discriminative antecedent stimuli and learning the appropriate response strategy. The bee modifies its behavior when it learns to fly to red because the discrimination rule it has learned constrains the flying so that the bee flies only to red. Likewise, the experimental subject that learns to open the box to get food learns that the box signals that only a particular response strategy is to be employed.
Type 1 Response-Strategy Learning
The simplest type of basic response-strategy learning involves a default content map that increases the probability of the learning. The map is referenced to a primary reinforcer or desired outcome. As noted earlier, the map must be modified so that it accommodates (a) a possible subclass of the discriminative stimulus, and (b) a specification of the response strategy for that stimulus.
The egg-breaking strategy illustrates the changes that occur in a possible default content map. The learner encounters an egg for the first time. The olfactory signal from the egg is a discriminative stimulus for food and calls for an approach response. (This relationship is either learned or hardwired.) Applying the default content map, the learner approaches the egg, but cannot penetrate the shell. The learner persists in rolling the egg and biting it, but these behaviors are not effective. After attempting various other strategies, the learner picks up the egg and drops it on a rock. The contents are exposed, and the learner eats them. This strategy is learned and applied to other examples of the egg.
The learner did not learn that the egg is a discriminative stimulus for approach. It did not learn that the discriminative stimulus was associated with primary reinforcement (eating). It did not learn that it was to persist in manipulating the egg in different ways.
These aspects of the behavior were already provided by the default content map. In essential ways, the process of Type 1 response-strategy learning is not greatly different from that of a completely hardwired learner that solves the problem of attaining food. In both cases, the content map specifies the discriminative stimulus, but is general with respect to the behavior to be planned. The behavior has to be planned to meet the requirements of the current setting. The difference for a learned strategy is that the adaptation that occurred in the current setting is extended to other situations. The successful plan becomes a permanent model for later plans.
Type 2 Response-Strategy Learning
We can convert the Type 1 learning into Type 2 by experimentally causing the egg to break following a behavior that is not implied by the default content map. For example, experimental conditions could be designed so the egg breaks following any arbitrary behavior, such as barking, rolling over, or pulling on a rope. In all these designs, the discriminative stimulus would be the egg and the primary reinforcement is eating the contents. The likelihood of the learner learning the designated response strategy is low, however, unless the response strategy required a behavior that the learner often produced when approaching food (or simply produced at a high rate). As noted, the likelihood of learning is increased by an instructional program. This program would establish the arbitrary behavior in a context that could be transformed or modified into the ultimate context.
Shaping the Response or Setting. There are two basic types of instructional programs: those that shape the configuration details of the response and those that shape the details of the setting. For both types, a series of transformations requires the learner to refine the response strategy. The response-shaping procedure modifies the response through a shifting criterion for issuing reinforcement, which leads to progressively closer approximations of the ultimate response. The setting-shaping procedure systematically transforms the setting from a Type 1 arrangement to the desired Type 2 arrangement. The response-shaping process is generated from the theoretical orientation that responses are the basic elements (Skinner, 1938). The setting-shaping orientation derives from the current analysis, which is based on the assumption that the learner is perfectly capable of producing the response and has produced it on various occasions. The purpose of the training is simply to provide the learner with information about what to do and to provide the learner with a reason (in terms of reinforcement) for doing it.
We can teach a strategy like neck stretching using either shaping approach. For the response shaping, a dispenser would issue a food pellet if the bird first stretched its neck at all. After this neck-stretching occurred at a relatively high rate in the experimental setting, the criterion for issuing food would change. Now, for instance, food would appear in the dispenser only if the bird stretched its neck vertically at least 3 inches. Later the criterion would again change so that the reinforcement was contingent on a 4-inch vertical stretch. The result would be that, in the setting in which reinforcement is to occur, the bird produces 4-inch neck-stretching responses. The criteria for reinforcing responses are arbitrary. There may be three shifts in the criteria or five. After the first criterion, however, the changes in
criteria are progressive in the direction of the ultimate criterion, the 4-inch stretch.
Setting shaping is different from response shaping because the criterion never changes during the training. All that changes are details of the setting. If the criterion is vertical neck stretching of 4 inches, that’s the only criterion that is ever reinforced. A possible training sequence could start by placing the food pellet at a level that would require the bird to stretch its neck 4 inches vertically to touch the pellet. The pellet would appear a moment after a buzzer sounded. The pellet would drop to the floor as soon as the bird stretched up high enough to touch it. After this response is established on a few trials, we would change the setting so that the bird had to stretch two or three times before the food dropped. Next we would present trials that presented the buzzer but delayed the food until the bird stretched its neck 4 inches. Next, we would introduce an intermittent schedule of reinforcement so that the bird would have to produce two or more consecutive responses to the buzzer before the food would appear.
Both procedures result in the same outcome; however, the setting shaping provides clearer communication with the learner because, from the onset, the behavior is clearly an approach behavior that is consistent with the default content map. In other words, we do not have to wait for the learner to produce the operant response as Skinner (1938) suggests. We can engineer reinforceable behavior simply through food deprivation procedures that ensure that the food-approach default map will be activated. From the onset, the experimental procedures precisely convey the response requirements for securing the food.
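The contrast between the two procedures in the neck-stretching example can be laid out as data. This is a rough sketch with assumed names (RESPONSE_SHAPING, SETTING_SHAPING, reinforced, phases_rewarding_short_stretch) and one assumed value: the 0.1-inch threshold standing in for stretching "at all." The 3-inch and 4-inch criteria and the sequence of settings follow the example in the text.

```python
FINAL_CRITERION = 4.0  # inches, the ultimate criterion in the example

# Response shaping: the criterion shifts from phase to phase.
RESPONSE_SHAPING = [
    {"criterion": 0.1, "setting": "dispenser"},  # "at all"; 0.1 is an assumed threshold
    {"criterion": 3.0, "setting": "dispenser"},
    {"criterion": 4.0, "setting": "dispenser"},
]

# Setting shaping: the criterion is constant and only the setting changes.
SETTING_SHAPING = [
    {"criterion": 4.0, "setting": "pellet placed 4 inches up, shown after buzzer"},
    {"criterion": 4.0, "setting": "two or three stretches before the pellet drops"},
    {"criterion": 4.0, "setting": "buzzer only, food delayed until the stretch"},
    {"criterion": 4.0, "setting": "intermittent schedule of reinforcement"},
]


def reinforced(stretch_inches, phase):
    """True if a stretch of this height meets the phase's criterion."""
    return stretch_inches >= phase["criterion"]


def phases_rewarding_short_stretch(program):
    """Indexes of phases in which a stretch short of 4 inches still earns food."""
    return [i for i, phase in enumerate(program)
            if phase["criterion"] < FINAL_CRITERION]


print(phases_rewarding_short_stretch(RESPONSE_SHAPING))  # early phases accept less
print(phases_rewarding_short_stretch(SETTING_SHAPING))   # no phase accepts less
print(reinforced(3.0, RESPONSE_SHAPING[1]), reinforced(3.0, SETTING_SHAPING[0]))
```

Under these assumptions, a 3-inch stretch is reinforced at some point in the response-shaping program but never in the setting-shaping program, which is the sense in which the criterion "never changes."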
Note that the teaching of the response at a high rate requires changes in the schedule of reinforcement because, in this case, the schedule of reinforcement (how many responses are produced before reinforcement occurs) is a feature of the setting and is a criterion for issuing reinforcement. The only way we can clearly show the learner that it must produce many head stretches before we issue reinforcement is to show the learner through a schedule of reinforcement that only by producing the reinforceable responses frequently is reinforcement earned. The task of obtaining food by going to the other side of the room and pressing a lever is quite difficult to teach as a response-shaping operation. The information the learner needs is far more easily conveyed through shaping the setting. When we apply this strategy to the lever-pressing teaching, the learner always presses the lever to obtain food. This requirement never changes. What changes is simply the physical arrangement of the setting. For example, at first the food dispenser would be positioned next to the lever. The probability of the learner producing the lever-pressing response is increased by the juxtaposition of dispenser and lever. Once the
learner learned the response, the dispenser would be repositioned. Repositioning would not have to be a progressive sequence of position changes (first a little bit, then more, etc.) because the learner has already learned the rule that pressing the lever results in food in the dispenser. The lever remains in the same room. Therefore, the same behavior is strongly prompted. The only difference is in the specific route that the learner plans to approach the lever and the route planned to obtain the food following the lever press. After the dispenser had been placed in possibly one intermediate position (6 feet to the side of the lever), the dispenser could be placed on the other side of the room. The prediction would be categorically that the learner would generalize because the generalized behavior is implied by the observed behavior. On all trials, the learner performed the same sequence: approach the lever, press the lever, then go to the dispenser. There was sufficient range in variation of the response to guarantee that it would generalize to a new location of the lever.
Note that the difference between response shaping and setting shaping has nothing to do with whether the learner receives reinforcement while learning the response strategy. It has to do with the relationship of the reinforcement to the possible content maps implied by the learner’s behavior. With the response-shaping process, the learner could be learning a series of different response strategies, each consistent with the changing criteria. For instance, if the learner is reinforced for producing approximations of the lever-press response, the food would appear in the dispenser first when the learner first approached the lever, then when the learner touched the lever, and finally when the learner pressed the lever. Response shaping is quite ineffective for teaching Type 2 response strategies.
Each time an approximation is reinforced, the content map that led to reinforcement is not the content map we want the learner to learn. Therefore, a series of possible mislearnings results from the practice. For the teaching of response strategies (vs. new motor responses), setting shaping is more efficient because it communicates the response requirements more clearly than response shaping does.

The response-shaping procedure, however, is the only systematic procedure for inducing new motor responses that have never been observed. The setting-shaping procedure is inappropriate for this purpose because it assumes that the learner is capable of producing whatever motor responses are required to execute an effective strategy. Therefore, the learner would not receive reinforcement until the response was finally learned, which means that a learner incapable of producing the motor responses would have to produce many unreinforced attempts. However, shaping the response through a shifting criterion permits us to start where the learner is (indicated by the response approximations that have been observed) and demonstrate, through systematic variations in reinforcement patterns and criteria, how the response is to be modified.

Some behavioral experiments that required learning two response strategies (Boren & Sidman, 1953; Modaresi, 1990; Sidman, 1953; Sidman, Herrnstein, & Conrad, 1957) could have achieved the learning outcomes much more systematically through shaping the setting. For instance, escape experiments typically taught animals to escape from a confined area by pressing against a lever. The discriminative stimulus (a buzzer) was shortly followed by the activation of an electrical grid on the floor. The animal would scurry about and, at some point, press the lever. The door would open and the animal would escape. After some trials, the animal would not become frantic at the presentation of the discriminative stimulus. It would simply walk to the lever, press it, and exit the room before the floor grid was activated. Note that this design does not call for any sort of training. The learner is left to its own devices to figure out the rule for escaping.

Shaping the setting to teach the strategy could be achieved by presenting a buzzer followed by food. The food dispenser would be next to the lever. The only way the learner could gain access to the food would be to press the lever. Following the learner’s successful performances with this arrangement, the food dispenser would be located outside the open door. The learner now would have to press the lever and then exit the room to obtain the food. Next the food dispenser would be outside a closed door. Pressing the lever would open the door and permit access to the food. Next, the electrical grid would be activated with a slightly aversive shock when the buzzer sounded. The shock would slowly increase in intensity. There would be no food in the dispenser outside the door, but the lever would still open the door. Finally, the terminal experimental condition would be presented: The buzzer would be followed in a few seconds by a full-fledged shock.
The learner would avoid the shock by exiting the room before its onset. Again there would be no food in the dispenser outside the door, but the lever would open the door. The learning curve for any setting-shaping design would look quite different from that of the experimental design that provided no setting shaping. In the former, the learner would probably experience no more than one full-fledged shock in learning the escape rule.
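The setting-shaping sequence for the escape task can be written down as data, which makes its central property easy to check: the required response stays fixed while only the arrangement of the setting changes. The following is a minimal sketch, not a procedure taken from the cited experiments. The class name `Stage`, the field names, and the string labels are all hypothetical, and the door state for the first stage is marked as unspecified because the text does not give it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One arrangement of the setting. Field names and labels are illustrative."""
    description: str
    food: str   # where food is delivered, or "none"
    door: str   # "unspecified", "open", or "closed"
    shock: str  # "none", "mild_rising", or "full_delayed"
    required_response: str = "press_lever"  # never changes across stages


# The five arrangements described in the text, in order.
STAGES = [
    Stage("buzzer, then food beside the lever",
          food="beside_lever", door="unspecified", shock="none"),
    Stage("food outside an open door",
          food="outside_door", door="open", shock="none"),
    Stage("food outside a closed door; lever opens it",
          food="outside_door", door="closed", shock="none"),
    Stage("mild shock at the buzzer, slowly rising; no food",
          food="none", door="closed", shock="mild_rising"),
    Stage("terminal condition: full shock a few seconds after the buzzer",
          food="none", door="closed", shock="full_delayed"),
]


def changed_fields(earlier: Stage, later: Stage) -> list:
    """Names of the setting features that differ between two stages."""
    return [name for name in ("food", "door", "shock")
            if getattr(earlier, name) != getattr(later, name)]


if __name__ == "__main__":
    for earlier, later in zip(STAGES, STAGES[1:]):
        print(f"{later.description}: changed {changed_fields(earlier, later)}")
```

Each transition alters one or two features of the setting and leaves the lever press untouched, which is the sense in which the setting, rather than the response, is shaped.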
THE PROCESS OF RESPONSE-STRATEGY LEARNING

The basic problem facing the system is to determine which response features are to be incorporated in a strategy that serves as a trial content map. The system must make judgments about the extent to which the specific response details that occurred on the first successful trial are to be required
for all future trials. One successful attempt does not guarantee that the learner has learned the response strategy. The learner has simply produced one positive example of a response that led to the desired outcome. In the egg-dropping example, the system would have to identify the behavior as something that predicted the successful outcome, represent the behavior in a trial content map, and apply the map to future encounters with the egg. The behavior is not pure behavior, however, because it involves other details of the setting. These have to become part of the content as well. The system would have to use some format for identifying and eliminating nonpredictive features of discriminative stimuli and the behavior and retain features that predict.

Generating the Response Strategy

There are three possible ways that the system could be designed to generate a trial response strategy: a model that utilizes information about the various events that occurred, a model that simply generalizes the plan or response directive, and a combination model that incorporates events and generalization of the plan. The models are theoretical, and we make some assumptions about which types are used by which organisms. However, it is probable that most species use some combination of the models.

Events Model. The system could be designed to identify the response events simply as things that happened without regard to exactly what the learner planned and directed to achieve them. This format would follow that of the antecedent learning. After the primary reinforcer has been achieved (the contents of the egg are exposed), the system would reflexively represent the temporal sequence of events that occurred. The egg was on the ground, the learner picked it up, and the learner dropped it on a rock. The various events are summaries of three behavior patterns or subgoals. When the system records observed events, they are expressed as generalizations.
The trial content map is therefore constructed at the presentation of the primary reinforcer (breaking the egg) just as it is with antecedent learning. The relevant details are identified and expressed as a rule of behavior. The problem is selecting the appropriate features to include in the map. Plan/Directive Model. The system could be designed to use some combination of the plan and directives that it issued in creating the response. This format does not have a parallel in antecedent learning because the plan refers to internal operational steps, not the observed features of the response. The content map for various settings is not constructed at the first
presentation of the first primary reinforcer. Rather, it is formulated after the plan has been presented twice and modified on the basis of the second application. The advantage of the plan-based format is that it includes operational details that would be omitted if the representation of the event were limited only to observed outcomes. The problem is that much of that detail refers to a particular concrete setting; it is not generalized. Combination Model. The system could be designed to identify the features of both the plan or directives (what the learner did) and the observed sequence of events (various things that were observed). The dual representation would describe the series of events as something like, “What I did . . . What happened when I did it?” The first part would be something of a summary of the planned behavior. The second part would be a description of what was observed. The potential value of this approach is that it is able to provide a more thorough representation of the process in terms of what it attempted to do and what occurred. Problems With Specification. The logical problems associated with any of these formats are substantial. If the system transforms the record of what it observed into a more general summary, the features may be specified too generally, and some of the detail that is necessary to re-create the outcome may be omitted. If the specification of features is too specific, it may fail as planned or fail to be replicated. For instance, the original plan referred to the features of a specific rock. Does the plan refer specifically to that rock, to a rock that shares some of the features with the original, or to no rock? The original rock may not be available on the next occasion, or it may be impractical to return to the original setting so that the same behavioral sequence (or a close approximation) may be executed. 
However, reference to just a few of the features of the original rock (perhaps color and elongated shape) may result in the learner dropping the next egg on a dirt mound of the same color and shape.

Problems With Negative Examples. Another problem has to do with the influence of negative examples that occur before a successful trial content map is formulated. The general design for trial content maps that identify antecedent events is that if the primary reinforcer is not realized, the features specified by the trial content map are systematically weakened so they will not recur in subsequent trial content maps. Some form of this format would have to apply to the various negative examples of the response strategy, or the learner would be observed producing the same negative examples and never apparently learning from them. The learner would bite at the egg endlessly, apparently never learning that this behavior predictably leads to a negative outcome, not a positive one.

If the outcome were negative, the system would weaken the features of the response. If premature weakening occurs, however, the learner may not learn that some of the response features are to be retained. For instance, the trial content map “Drop the egg” is not specific enough. If the strategy is prematurely abandoned, however, the learner will not learn the strategy “Drop the egg on a rock” because the learner would no longer drop the egg. The logical problem is that many of the same features may occur in both successful and unsuccessful attempts. The system has no preknowledge of which features of the response that leads to a negative outcome are shared by a successful strategy. Therefore, the same provisions that would work for ruling out single-feature predictors of antecedent events are not as effective for ruling out unpredictive response-strategy features because the response strategy has many features. Merely because the strategy fails does not imply that all the features are implicated.

To protect against premature abandonment of a strategy, the system would have to be designed so it derived some learning from negative outcomes, but did not routinely tend to abandon a strategy after only one failure. The simplest mechanism would be for the system to retain the same general strategy, but produce variations. The strategy would be weakened after a range of variations produced negative outcomes. With a system of this kind in place, the learner might be observed rolling the egg repeatedly: rolling it in a kind of circular route, a straight route, downhill, and so on. Likewise, the learner might be observed dropping the egg several times and holding it different ways on different trials.
After trying different ways and receiving information that none of them worked, the system would conclude that the food is simply not available.

Problems With Positive Examples. The response to positive examples is easier for the system to address. Unfortunately, positive examples may occur only after negative examples have been explored and the response strategy modified more than once. The presentation of the primary reinforcer would reflexively fix in memory some details of what had occurred immediately prior to it. The details identified as prior events (the plan or events) would be reflexively presented to the agent when another egg is encountered. The agent would be urged to follow the plan or content map (the same process used for hardwired content maps).

For both the event- and plan-based models, the rock would most probably be included in the strategy. A learned response strategy, however, requires more detail than the rock. For the strategy to be successful, it must refer to the rock (or some generalizable feature of the rock), the egg, and the action that relates them.

Coding Requirements. The content map must describe the response strategy in the same language that the system uses to code items in the learner’s response repertoire and responses identified in other content maps. If the learner has another content map that refers to dropping something, the same behavior would have the same “label” in the current content map. If the strategy is not expressed in the standard language or code of other response components, or if the strategy does not relate to the learner’s response repertoire, the agent would have no way to access it at a later time.
PLAN-BASED PROCESS

If the system uses the current plan as a basis for creating details of a corresponding content map, the plan would have to be modified so that it would apply to settings other than the original one. The simplest design would be for the system to generalize the plan conservatively, which means changing no more detail than is needed to apply the plan to another setting. When a plan is successful, it is too specific because it refers to a single setting and must be generalized if it is to apply to other settings. The simplest process would involve first fixing the plan in memory, retaining it until the discriminative stimulus (egg) again appeared, adapting the plan to the current setting, and using information about both successful settings to formulate a more generalized content map that could apply to all settings. The process of generalizing the plan would be achieved in basically the same way that a nonlearning agent applies a plan to the current setting. The agent would try to follow the original plan. Details of the plan are modified to achieve the outcome projected for the original plan.

Revising Parts of the Original Plan

The system would have to be programmed to make a number of decisions about the extent to which the plan is to be changed to match the setting and the extent to which the setting must be changed to correspond to the setting for the original plan. For instance, the learner encounters a second egg. The plan for the original encounter with the egg is reflexively presented to the agent. This plan calls for dropping the egg on a particular rock. That rock may not be present in the current setting. In fact, the setting may be quite different from the original.
In the same way that the agent of a hardwired system must routinely replan responses that do not meet projected outcomes, the agent that learns treats the plan as an approximation for transforming the current setting. If the agent of a hardwired system encounters a barrier not observed earlier, the agent retains the objective of achieving the primary reinforcer, but replans the specific behaviors that lead to a positive outcome. If the agent that learns cannot find the particular rock indicated in the plan, the agent retains the objective, but attempts to find some form of substitute for the rock.

Adjusting the previous plan to accommodate the current setting is different from adjusting a plan based on a hardwired content map. The hardwired content map would not specify something like a particular rock. Rather, the hardwired content map would specify a general rock. For the agent that is generalizing a plan, substituting another object for the rock originally specified is not possible unless reference is made to specific features shared by the original rock and the object encountered in the current setting. The criterion for generalizing the rock must be based on some sort of feature sameness.

Representing Objects as the Sum of Their Features

A possible process for generalizing based on samenesses would be for the system to categorize the original rock as the sum of its features (or the sum of the features the system knows). Any features of the original rock that occur in the second setting would be enhanced. Those objects that have more of the shared features become more likely candidates for inclusion in the current plan for the egg-dropping behavior. If no objects in the current setting share a large set of features with the rock, the agent would either settle for those that had only some features or try to take the egg to another setting.
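The selection of a substitute by shared features can be sketched in a few lines. This is only an illustration under stated assumptions: objects are modeled as plain sets of feature labels, every feature counts equally, and the function name `best_substitute` and all feature and object names are hypothetical. Nothing here is a claim about how an organism actually codes features.

```python
# The original rock, represented as the sum of the features the system knows.
# Feature labels are illustrative.
ORIGINAL_ROCK = frozenset(
    {"sandy_colored", "raised_area", "flat_topped", "hard", "oval_shaped"}
)


def best_substitute(original, candidates):
    """Pick the object in the current setting that shares the most features.

    `candidates` maps an object name to its set of features. Returns
    (name, shared_features), or None when nothing shares any feature,
    in which case the agent would have to take the egg elsewhere.
    """
    shared = {name: original & frozenset(feats)
              for name, feats in candidates.items()}
    name = max(shared, key=lambda n: len(shared[n]), default=None)
    if name is None or not shared[name]:
        return None
    return name, shared[name]


if __name__ == "__main__":
    setting = {
        "dirt_mound": {"sandy_colored", "raised_area", "oval_shaped"},
        "brown_rock": {"brown", "raised_area", "hard"},
        "puddle": {"wet"},
    }
    print(best_substitute(ORIGINAL_ROCK, setting))
```

With equal weighting, the dirt mound outscores the brown rock even though it lacks hardness. A count of shared features says nothing about which feature is critical, so a mistaken substitute of this kind is exactly what an unweighted match would be expected to produce.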
Anything in the setting that is identified as a substitute for the original rock would be assumed to have the egg-breaking potential of the rock. This process of categorizing an object as the sum of its features may seem to be far too ambitious a process for an animal that does not engage in greatly sophisticated learning. However, chapter 10 shows that a classification by an individual logically requires a sum of features, and that each feature or combination of features may be used to classify the individual. If the original rock was recorded as the sum of its features (sandy colored, raised area, flat topped, hard, oval shaped, and X units large), different combinations of these features would be enhanced in the current setting, and they would be the basis for the pursuit. Possibly a sandy-colored mound is the best substitute in the current setting. The analysis of inferred functions does not indicate the particular feature categories that the system has or is capable of learning. The analysis simply
asserts that if the plan is to be generalized to the current setting, the only basis is to identify some of the features of the original rock and use them as a basis for creating a match or an approximation in the current setting. If the system does not represent a sufficient number of features of the rock or does not represent the critical features, the learner may not be able to construct a response to achieve the desired outcome. For example, the system may not represent the hardness feature of the rock. On the second trial, the learner may repeatedly drop the egg on a dirt mound and never succeed in breaking the egg. The original plan and its application to the current setting would be terminated, and the learner would have to start over, trying to discover a new way to break the eggshell. The Process of Identifying Samenesses Across Applications The original plan is shaped into a trial content map following the second successful outcome. The most efficient process would be for the system to identify what is the same about the original plan and the second application. The process necessarily requires the system to identify sameness of features. If the learner solves the problem in the second setting by finding a small brown round rock, the system could be designed to identify samenesses in features by superimposing the feature representation of this rock over the representation of the original rock. The features that are the same are retained. Those that are not the same are weakened or eliminated. Both have raised surfaces. Both are hard. Therefore, these and a few other features shared by both examples would be retained. The rocks are not the same color, same shape, same size, or in the same location. So these and other features of the rocks that may have been originally represented are eliminated or weakened for the next iteration of the content map. This map is applicable to a variety of settings. The process would be repeated for subsequent maps. 
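The superimposing step described above can be sketched as a weight update over features. The sketch assumes that each feature in the map carries a numeric strength, that shared features gain a fixed amount, and that unshared features decay by a fixed factor. The function name `superimpose`, the `gain` and `decay` parameters, and the feature labels are all hypothetical; the text specifies only that shared features are retained and the rest are weakened or eliminated.

```python
def superimpose(map_weights, new_features, gain=1.0, decay=0.5):
    """Overlay a second successful application on the current map.

    Features present in both are strengthened; features of the map that are
    absent from the new application are weakened. Features that appear only
    in the new application are not added, because the map is meant to hold
    what is the same across applications.
    """
    updated = {}
    for feature, weight in map_weights.items():
        if feature in new_features:
            updated[feature] = weight + gain   # same across applications
        else:
            updated[feature] = weight * decay  # differs: weakened
    return updated


if __name__ == "__main__":
    original_rock = {f: 1.0 for f in
                     ("sandy_colored", "raised_area", "hard",
                      "oval_shaped", "large")}
    second_rock = {"brown", "raised_area", "hard", "round", "small"}
    trial_map = superimpose(original_rock, second_rock)
    print(sorted(f for f, w in trial_map.items() if w >= 1.0))
```

Weakening rather than deleting is a design choice made for the sketch: it leaves a weakened feature available to be strengthened again if a later application shares it.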
Those details of the setting that differ are systematically weakened. Those that are the same are strengthened. The process involves some form of overlaying the trial content map on the plan for the current setting.

Cumulative Trace Memory

Knowing what to do implies the full range of knowing what not to do. Yet it is logically impossible to identify what to do from information about what not to do. Basically, the system wants to learn something of the form, “If S1, then do R to secure S2.” The information from negative examples simply shows what not to do: “If S1, don’t do RN1, RN2, RN3. . . .” The list of what not to do could be very long and would still not indicate what to do. Even for pursuits designed to avoid or escape from something, the learner needs to know what to do, not simply what not to do.

As noted earlier, however, the system must be designed to glean some information from negative examples. Without this provision, the system would tend to repeat the same mistakes. So the system must not discard trial content maps completely if the result is negative. A cumulative-trace memory process could be designed so that, following each negative outcome, the plan or events that preceded it are recorded faintly (something analogous to an underexposed image on a film) and classified as negative. These would tend to be imperceptible to the agent. Subsequent negative examples are overlaid on the record of the original (as a transparency would be overlaid on another with a similar pattern). Those features that are the same become more dense and perceptible than the features not repeated. After repeating this process several times, the system would have a basis for some things not to do. Because the process would be slower than that for recording features of positive outcomes, the learner would have opportunities to learn that some components of earlier negative trials may be incorporated in a strategy that leads to positive outcomes. It would be important for the features of positive outcomes to be fixed more quickly than the negative outcomes because the learner must learn what to do, not what not to do.

If the learner learns any strategy that is at great variance with the demands of a default map, many negative examples may have to be presented before the system has sufficiently strong information to abandon the general directions of the default map. For this learning to occur, the default map would have to be constructed in a way that it could be superseded if it did not lead to a positive outcome.
The greater the potential of the default map to be overridden, the greater the range of relationships the learner could potentially learn with sufficient exposure. After receiving sufficient information that the default map leads to negative outcomes, the learner could explore other possibilities or engage in behavior only remotely related to the default map. These exposures would increase the possibility of the learner learning Type 2 relationships. In summary, the system could have a combination of two processes for using information about negative examples to facilitate learning. The first would be a short-term retention of the last negative plan so that the agent would not tend to repeat the same unsuccessful plan on juxtaposed trials. The second general process would involve trace records of various negative plans and identification of features they share. This information would slowly become available to the agent and influence future planning.
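The two processes in this summary can be sketched together. The sketch assumes that each negative plan is a set of feature labels, that every negative outcome adds a small fixed amount of "density" to each feature, and that a feature becomes perceptible to the agent once its density reaches a threshold. The class name `TraceMemory`, the `trace` and `threshold` values, and the feature labels are hypothetical.

```python
from collections import Counter


class TraceMemory:
    """Faint, cumulative record of negative plans (illustrative only)."""

    def __init__(self, trace=0.25, threshold=1.0):
        self.trace = trace            # how faintly one negative outcome records
        self.threshold = threshold    # density at which a feature is perceptible
        self.density = Counter()      # accumulated density per feature
        self.last_negative = None     # short-term record of the last failed plan

    def record_negative(self, plan_features):
        """Overlay one more negative example on the existing record."""
        self.last_negative = frozenset(plan_features)
        for feature in self.last_negative:
            self.density[feature] += self.trace

    def perceptible(self):
        """Features repeated often enough to influence future planning."""
        return {f for f, d in self.density.items() if d >= self.threshold}

    def is_immediate_repeat(self, plan_features):
        """True if the plan is identical to the last unsuccessful plan."""
        return frozenset(plan_features) == self.last_negative


if __name__ == "__main__":
    memory = TraceMemory()
    for grip in ("left_side", "right_side", "top", "bottom"):
        memory.record_negative({"bite", grip})
    print(memory.perceptible())
```

Because a single failure adds only a faint trace, a feature that fails once is not ruled out, which leaves room for it to appear later in a successful strategy. Only the feature repeated across many failures becomes dense enough to register.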
Influence of Reinforcers

Learning tendencies, especially of Type 2 strategies, are influenced by the strength of the primary reinforcer. If the primary reinforcer is relatively strong, Type 2 learning is less likely because the default content map is stronger. For instance, if a rat must learn to press a bar three times to terminate an electric shock, the learning will clearly occur much faster if the shock is mild. The default map calls for an immediate escape from aversion. The mild shock is nonaversive enough to permit the learner to produce a wider range of possible behaviors. Likewise, if a male dog were required to sit for 15 seconds to open a cage that contains a female dog in heat, the learning would be difficult because the default content map to approach the dog is very strong.

Plan-Based Summary

The plan-based content map is achieved by retaining the plan for the first successful encounter, issuing it when the discriminative stimulus is encountered a second time, formulating the best match for the plan in the current setting, and modifying the plan into a content map on the basis of features shared by the original and second plans. The product is a trial content map applied to the next occurrence of the discriminative stimulus.
EVENT-BASED PROCESS

The other major design option for creating content maps for response strategies is the event-based process. The main difference in design is that the event-based process does not deal exclusively with the plan, but takes account of the various outcomes the plan created. The event-based process is modeled after the basic antecedent-event process. In the same way that the highly prompted antecedent-learning process identifies the features of the current flower and then systematically eliminates those that do not predict, the event-based process attempts to identify the features of the egg-dropping event that occurred prior to the presentation of the primary reinforcer.

The greatest difference between simple antecedent learning and response-strategy learning is that there are many more variables associated with the event. Consequently, it would be more impractical for the system to provide some kind of checklist or menu of options for identifying features of the event. Therefore, the system tries to record the events with enough detail that it would be possible to later rule out those features that are not relevant to all instances of the positive outcome. This enterprise is uncertain because it is
unlikely that all the features of the event are recorded, particularly if the event occurred in the context of trying various strategies. The system would even have trouble determining where, in the stream of ongoing behavior, the successful trial began. The more remote steps would be less likely to be retained by the system. Those that occurred immediately before the presentation of the reinforcer are most likely to be retained, such as details of dropping the object and the details of the rock. The two general classes of information the system must record are the details of the setting and the details of the response. Representing the various features of the response event is perhaps best achieved by retaining some form of the plan. If the system does not retain the plan, but represents some details of it, the system would have to decide which features to retain or record. The process of formulating a response-strategy content map is different from that used for an antecedent predictor. The trial content map for the antecedent predictor assumes that the presence of any feature or any combination of features (such as those of a flower) may predict the primary reinforcer. The assumption of the response strategy is that all the features of the response strategy are necessary. Unless the trial content map provides sufficient information for the agent to plan a variation of the response components that lead to a successful outcome, the map will fail. For instance, if the agent assumes that simply picking up the egg will lead to the egg breaking, the strategy would fail. The response strategy must be represented so that the agent is able to create an event like the original. The event is a sum of various features. Therefore, the representation that the system retains must possess enough features to create the event, and the agent must have knowledge that the sum of the features must be replicated to achieve the primary reinforcer. 
Future revisions of the content map are based on adaptations needed to fit the map to the next setting. The process is the same as that used to determine the essential features of possible predictors. The system compares the record that served as a basis for the first trial content map and the record generated by the second application. Those features that are the same are strengthened. Those that are different are weakened. The resulting content map is the sum of the features that are the same across both applications. Determining Relevant Features One of the problems with the event-based process is that some important detail of the original event may not be recorded. Unlike the plan-based process that retains all relevant features of the response (but not necessarily of other events), the event-based process may omit important detail. The result is that the system may have to relearn the sum of the details that are
necessary to achieve a positive outcome. Let’s say that the egg breaks only if it is dropped onto a rock from at least 2 feet. The system may have represented only that the egg was dropped. This specification would be too general. On subsequent encounters, the learner may receive information that reveals the overgeneralized parts of the original content map. After possibly many trials, the learner would receive information consistent with the correct rule.

However, alternative interpretations are often possible. For instance, the learner may try dropping the egg 11 times using different techniques before the egg breaks. There are several conclusions that are consistent with the facts. One is that a large number of drops may be required to break the egg. This conclusion may be consistent with the original encounter with the egg. The learner may have dropped it many times before dropping it on a rock and may have dropped it on a rock several times before the final trial. Without further information about other events, the learner has no clear basis for determining which details are essential. It is possible that the color of the rock, location of the rock, and contour of the ground are relevant to the successful outcome. Not all this detail would be recorded by an event-based system.

Shaping the Event-Based Content Map

The shaping of the event-based content map would tend to proceed from the original description to one that is either more specific or more general. In contrast, the shaping of the plan-based content map would always go from a specific description (the original plan) to one that is more general. Both approaches could result in some overgeneralization. For instance, on the second occasion, both may result in the agent trying to break the egg on surfaces other than rocks. The reasons would be different. For the plan-based application, the agent was not able to find the original rock and identified a substitute.
For the event-based application, the content map did not specify that the egg had to be dropped on a rock.

The greatest amount of hardwired support the system could provide in developing an event-based content map would be a process that permitted the system to represent the original event in great detail. If the original rock is represented in great detail, it could serve as a basis for later identifying various possible features that are essential for the egg-dropping operation.

Disadvantages of Event-Based Processes

The event-based process has two disadvantages. First, the process tends to result in slower learning. The probability of the event-based process omitting relevant detail is greater than that of the plan-based process. The plan-based process requires less learning in the presence of a clear predictor (such as an egg). The plan-based approach simply adapts the plan to the next application. The event-based approach must experiment with different combinations of response features and possible predictors.

The second disadvantage is that the event-based process requires more elaborate memory and classification provisions. The event must be represented as the sum of the details. The system, therefore, requires classification by event and by each component feature. This arrangement is generalizable to the full range of content to be learned, but requires an investment in memory and classification machinery.

Event-Based Summary

The event-based system attempts to generalize features on the basis of a single example. The efficient event-based system needs two separate analyses: that of the behavior planned and that of the sensory details of the context in which the behavior was performed. The latter analysis is the same as that used to identify possible antecedent stimuli features of the setting. Although the system may have different ways to obtain the needed feature information, the simplest may be to retain in memory (a) the last part of the plan that immediately preceded the presentation of the primary reinforcer, and (b) features of the current setting that may be predictive.

COMBINATION PROCESS

An efficient system could use two separate analyses: one for the behavior and another for the possible features of the setting that are predictive of the positive outcome. This system would perform the event-based analysis and would retain the plan used for the successful trial. Enduring features such as the rock and the contents of the egg may be identified as having a predictive function. These would be represented as the sum of their features. The plan or response directive would be retained and would provide a precise record of the behavior planned in the original encounter with the egg.
Even with these two separate analyses, the system would probably not construct a functional content map after only one trial. There may simply be too many variables and therefore too many possible interpretations about the features that are essential to all positive examples. Superstitious Behavior Constructing a trial content map that is too specific creates a problem of superstitious behavior. If a map is too specific, the learner will not receive information that certain behaviors are not necessary to the successful comple-
tion of the operation. The reason is that the response was successful. Because plan-based content maps are based on specific details, they are particularly susceptible to superstitious behavior. For example, if the learner nods its head three times before dropping the egg, this behavior is not contradicted by a successful outcome. If the behavior is coded as part of the plan, the detail becomes something of a self-fulfilling prophecy. On every successful trial, the learner nods its head three times before dropping the egg, and on every successful trial the egg breaks. Therefore, the system concludes that head nodding is an essential response component. The event-based process is less susceptible to superstitious behavior because it tends to err in the direction of creating maps that are too general rather than too specific. If details are in the record but are not incorporated in the current general content map, the agent receives information about their necessity. For instance, if the record shows how high the egg was held on the first encounter and the agent does not hold it that high on the second encounter, the agent receives information that the specification is of height. Regardless of whether a plan-based process, an event-based process, or a combination process is employed, the system will generate error because going from information about one setting to a map that deals with all settings is a guess. If the guess is too conservative, the map will contain detail that is irrelevant to future applications. If it is too general, it will leave out details that are relevant to the successful operation.
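The contrast between the two processes can be sketched in a few lines. This is only an illustration under assumptions that are not in the text: the outcome is taken to depend on drop height alone, the record is a plain dictionary, and the names `egg_breaks`, `plan_based_trials`, and `event_based_probe` are hypothetical.

```python
# Hypothetical sketch: a plan repeated exactly never tests an irrelevant detail.

def egg_breaks(behavior):
    # Assumed ground truth for this sketch: only the drop height matters.
    return behavior["height_cm"] >= 100

first_success = {"height_cm": 120, "head_nods": 3}

def plan_based_trials(plan, n_trials):
    """Repeat the recorded plan unchanged and return the details never contradicted."""
    outcomes = [egg_breaks(plan) for _ in range(n_trials)]
    # Every trial succeeds, so every recorded detail stays in the map.
    return set(plan) if all(outcomes) else set()

def event_based_probe(record, changed_detail, new_value):
    """Change one recorded detail and report whether the trial still succeeds."""
    variation = dict(record)
    variation[changed_detail] = new_value
    return egg_breaks(variation)

retained = plan_based_trials(first_success, n_trials=10)
nods_needed = not event_based_probe(first_success, "head_nods", 0)
height_needed = not event_based_probe(first_success, "height_cm", 30)
print(retained, nods_needed, height_needed)
```

In this sketch the repeated plan keeps the head nodding because nothing ever contradicts it, whereas a trial that varies a detail yields information about whether that detail is necessary.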
INFERRING PROCESSES FROM BEHAVIOR Organisms designed to conserve weight would tend to use systems that are simpler. We would expect flying birds to use plan-based systems and mammals to use event-based systems. It is possible to infer the general tendencies of the learning system from behavior. For instance, we could compare the responses of a pigeon and a dog. Some learning tasks favor the plan-based system. Those are tasks that require a simple adaptation of the plan. Tasks that favor the event-based system would be those that require learning keyed to more than one feature of the event and relationships not associated with specific behaviors. The first experiment favors the plan-based learner. The learner is to sit (or hunker down) to obtain food. After the learner has been subjected to a period of food deprivation, we wait for the learner to produce the response and immediately follow it with a small food reward. We place the reward in front of the learner so that it must move out of the sitting position to eat it. We follow the same procedure until the learner produces the sitting response at a high rate.
The prediction would be that the dog would learn far more slowly than the pigeon, implying that the dog has an event-based system that does not automatically result in the fixing of the plan following a reinforced response. Possibly the dog considers the person who issued the food as the predictor, and the dog concludes that the person should be approached. If the pigeon learns much faster than the dog, the pigeon’s system performed a simpler operation, such as fixing the plan that the system assumes led to the primary reinforcer. We could confirm the extent of the discrepancy between the two learners by increasing the complexity of the response to be produced. After the high-response rate has been achieved for the sitting response, we wait until the learner sits and produces a unique second behavior before issuing the food. This behavior does not have to be the same for both subjects. It is simply a behavior that the learner produces while sitting. The learner may lift one foot off the floor, turn its head to the right, or make a vocalization. The combination of the two behaviors is reinforced until the learner produces the sitting-plus-designated-second-behavior at a high rate. The dog, again, would be predicted to learn the two-behavior rule more slowly than the pigeon. Following the presentation of reinforcement for the first combination response, the dog’s system may simply conclude that the reinforcement occurred later than usual, which only means that it has to be more patient while sitting. As it sits, it may turn its head and be reinforced again, but it would probably require many trials for the learner to identify that head turning while sitting is the predictor of the reinforcer. For the task that favors the event-based system, each learner is placed in a room that has one side carpeted. The learner must learn that, shortly after the buzzer sounds, the part of the floor that does not have a carpet issues a strong electric shock. 
The floor continues to be activated as long as the buzzer sounds, which occurs randomly from 1 to 3 minutes. However, the learner is fed on the shock side (uncarpeted side) of the room. The learner receives food only if it remains on the uncarpeted side for 2 minutes. During initial training, the buzzer sounds only when the learner is on the uncarpeted side of the room. The presentation of the shock is governed by a temporal feature (the buzzer) that predicts the shock. An enduring feature (the carpet) is an essential feature of the rule the learner must learn. (The learner must learn that the carpeted side of the room is safe.) The presentation of the food is governed by a temporal rule that does not involve specific responses, but rather any behavior performed so long as the learner remains on the shock side of the room.
Both learners must learn that (a) the buzzer predicts a shock on the uncarpeted side of the room, (b) termination of the buzzer predicts the termination of the shock, and (c) food is contingent on remaining on the uncarpeted side of the room for a certain time.
For learning (a), that the buzzer predicts a shock on the uncarpeted side of the room, neither system has a clear advantage. Both systems have to learn an avoidance response to the buzzer. Both have a high probability of going to the carpeted part of the room when the shock is initially presented. The event-based system would have a possible advantage in relating the prior event (the buzzer) to the shock. The plan-based system might have the advantage in relating the successful response to the shock, but this relationship would not help the learner avoid the shock by responding to the buzzer. So the dog would tend to learn more quickly that the buzzer predicts the shock, but the pigeon would learn more quickly that going to the carpet escapes the shock.
For learning (b), that the termination of the buzzer predicts the termination of the shock, the advantage would go to the event-based system, although the learning is greatly influenced by the rate and type of responses produced. The more frequently the learner tests the uncarpeted floor, the more quickly it will learn that it is safe. Given an equal number of tests of the floor at different times, however, the advantage would go to the event-based system because it would have an easier time learning the relationship of two conditions independent of behavior (when the buzzer is not sounding, the electric grid is not activated).
For learning (c), that the food is contingent on remaining on the uncarpeted side of the room for a certain time, the advantage would go to the event-based system. No behavior is immediately reinforced, so the relationship between the food reinforcer and time spent on the uncarpeted side of the room should be easier for the event-based system to discover.
The plan-based system would have a greater tendency to produce superstitious responses while on the uncarpeted side of the room because the plan for whatever the learner does immediately before the presentation of the reinforcer tends to be fixed and repeated on subsequent trials. Possibly, however, if the plan-based system repeats exactly what happened on the previous trial, that system would learn to stay in place long enough to be reinforced.
The final variation would favor the event-based system. Once the learning had been established so that the learner enters the uncarpeted side of the room shortly after the termination of the buzzer, the room would be reversed so that the formerly carpeted side was now uncarpeted, but all the feature relationships of the original design remained. (The buzzer predicts the shock. The uncarpeted side is shocked. Food is dispensed on the uncarpeted side only after the learner remains on the side for 2 minutes.) The prediction would be that the event-based system would transfer all the functions more rapidly than the plan-based system. The plan-based system would probably adapt to the onset of the buzzer quickly because, once the learner produced a successful response, the avoidance part of the equation would be established. The establishment of the new behavior for procuring food would not be as easy for the plan-based system because no particular response is required for attaining food. If the plan-based system has learned superstitious behaviors about attaining food, the behaviors could interfere with the transfer because the learner might link these responses to the side of the room now carpeted, which means that learning would be retarded.
The particular systems possessed by the dog and pigeon are probably not pure systems of one type or the other. It may be that, for some pursuits, learning is improbable for either or both animals because the system has hardwired content maps that are not easily overridden by learning.
Specificity and Generalization
A recurring theme of all processes (plan-based, event-based, and combination) is that the learning-performance system is drawn in two directions: toward specificity and toward generalization. The specificity is necessary because the details that the system identifies are based on the concrete examples that the learner encounters. The capacity of the system to learn a strategy is a function of the records that it has of specific concrete settings and events. If there is no record about particular details, no learning associated with these details is possible. Even if there is detail, however, operations must be performed on it. The simplest is that of identifying the component features of the event. This process is deceptively sophisticated, however, because it requires categorizing each feature of the event as something that is independent of the event and that may occur in other events. Despite the necessity for recording detail, the system must also generalize to formulate a content map.
Although the map is based on details, it must be made general enough to apply to settings that differ in many dimensions and details from the events experienced.
SUMMARY
The systemic requirements for response strategies are more demanding than those for antecedent-stimulus strategies. The reason is that a successful response strategy must provide for all the details required for executing the strategy on a particular occasion. The learned content map must provide for this specificity, either directly or through implication. In contrast, the successful representation of a discriminative stimulus does not have to include all features of the stimulus that is discriminated.
There are two basic types of response-strategy learning. Type 1 involves learning that could be facilitated by a default content map. Type 2 involves learning that could not be facilitated by a default content map.
There are three basic processes of formulating the content map so that it includes sufficient information to permit its application in various settings. One is a plan-based process. Another is an event-based process. The plan-based process retains the plan for the behavior produced immediately before the presentation of the primary reinforcer. The event-based process enhances all the changes that occurred before the presentation of the primary reinforcer. The plan-based approach has the advantage of simplifying the learning if a behavior preceded the primary reinforcer. The event-based approach has the advantage of learning relationships that may be independent of behaviors the learner produced. The third type of process is one that combines the plan- and event-based processes.
Both the plan- and event-based processes lead to error because the system does not know which of the details that occurred on the first application of the strategy are needed for all other successful applications. The plan-based process creates a record that is too specific and must be generalized if it is to apply to a range of examples. The event-based plan may initially be too general. The reason is that when the system attempts to record details of the event, critical details of the strategy may be omitted either because they were not observed or because they were not deemed by the agent to be critical to the next application. The system is later able to identify the additional needed detail by attempting to follow the too-general map and providing adjustments.
The record of the successful attempts is compared to determine which features are essential to both and which are unique to one and therefore may be eliminated.
Systems that need a less elaborate learning procedure would tend to be designed to use the too-specific, plan-based approach. Those systems that need more flexible strategies would tend to adopt the more general event-based approach.
The description of the two processes assumes that the recording potential of the system is good. In fact, however, its recording capacity may be limited. The system would still perform the functions, but it would not do so at the rate suggested by our descriptions. Even if the machinery were precise and functioned as described, learning would be greatly retarded if many negative examples were registered before the positive example occurred. The cumulative record of the negative examples may be nearly as strong as the representation of what happened on the positive example. The result may be that the learner does not retain a clear memory of what had occurred on the successful example.
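The comparison of successful records mentioned in this summary can be pictured as a set operation. The sketch below is illustrative only: it assumes each record is a set of feature labels, and the labels and the function name `compare_records` are hypothetical.

```python
def compare_records(first_success, second_success):
    """Split recorded features into those shared by both records and those unique to one."""
    essential = first_success & second_success   # present on both successful trials
    eliminable = first_success ^ second_success  # present on only one of them
    return essential, eliminable

first = {"hold egg high", "release over rock", "nod head"}
second = {"hold egg high", "release over rock", "stand on one foot"}
essential, eliminable = compare_records(first, second)
print(sorted(essential))
print(sorted(eliminable))
```

Features that appear in only one record are candidates for elimination; two records cannot show that every shared feature is truly necessary, only that it has not yet been contradicted.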
The kind of process a particular learner employs may be inferred from learning tendencies. If the learner tends to imitate previous behaviors with fidelity, the system probably uses a plan-based procedure. If the learner tends not to produce responses of the same configuration on various trials, but tends to transfer what is learned to settings that share specific features with the original setting, the system has event-based properties. The event-based process has the greatest potential for learning relationships. For basic learning, however, both the plan- and event-based systems are shaped by the results achieved on various trials.
Chapter 8
Learning Patterns and Generalizations
Chapter 7 indicated that learning, particularly the learning of new response strategies, probably will not occur following a single trial. Rather, the learning system selects aspects of the initial strategy and generalizes it so that, on subsequent trials, the same content map that occurred on the first trial will be repeated. This chapter articulates both the use of information about positive and negative outcomes to shape the trial content maps and the process of feature generalization.
This chapter addresses phenomena that the literature treats as generalization and discrimination. The current analysis recognizes the phenomena, but provides an analytical foundation for how and why generalization and discrimination occur. They are part of a single process. The learner attempts to identify those things that are the same about a successful strategy or a stimulus that predicts the primary reinforcer. In identifying what is the same, the system makes guesses. The guesses are too specific, are too general, or fail to identify the true predictor. The learner receives subsequent information about positive and negative examples that serves as a basis for the system to refine the content map for predicting or planning an effective strategy. There is no generalization without discrimination and no discrimination without a representation of specific content.
RESPONSES TO REINFORCEMENT
Just as the system identifies features of single examples, it identifies patterns of outcomes. The discussions in chapters 6 and 7 were simplified because all the examples assumed that the relationship between antecedent events and the occurrence of the primary reinforcer was ideal: the antecedent events always predicted the primary reinforcer, and the primary reinforcer never occurred unless it was preceded by the antecedent events. In reality, the correspondence is not always perfect. Rather, there may be a probability that the antecedent events will predict the primary reinforcer or a probability that the primary reinforcer will not occur without being preceded by the discriminative stimulus or response strategy.
The extensive literature on schedules of reinforcement shows that experimental subjects learn different patterns of responses from different schedules of reinforcement (Skinner, 1969). The response tendencies of learners imply the nature of what the learner has learned. For instance, if the reinforcer occurs on every fourth occurrence of the antecedent stimulus, a particular performance profile will occur that shows the learner has learned the pattern. The behavior is scalloped, with the tendency for responding on the fourth trial much greater than it is for the trial immediately following the reinforcer or for the next trial. This pattern is quite different from the one-to-one pattern in which the response probability is high for every example.
At the far extreme from a one-to-one schedule is an intermittent or variable schedule that issues reinforcement in an unpredictable pattern, such as reinforcement after anywhere from 1 to 14 trials. The learner tends to respond to every occurrence of the antecedent event. However, the behavior is resistant to extinction, requiring a large number of unreinforced trials to achieve extinction. In contrast, the schedule that reinforces every fourth response extinguishes more quickly, and the one-to-one pattern extinguishes even more quickly.
Learning Requirements
These performance phenomena are explicable only in terms of the content that the learner had learned.
For each condition to be learned, the content map must logically refer not only to the discriminative stimulus and response, but also to the pattern of occurrences of the primary reinforcer. Furthermore, for the scalloped behavior to occur, the learner would have to know two things: the pattern itself and the relationship between the pattern and the current example. If the learner did not know that the pattern approximated one reinforced trial followed by three unreinforced trials and that the current trial occupied a particular position in a series of temporally ordered events, the learner could not possibly perform as observed.
Knowledge of both the rate and temporal pattern is necessary for the observed behavior to occur, but it is not sufficient. If the learner knew only the rule about the reinforced and unreinforced trials, the learner could not respond to the current instance because more information is logically required. Because the learner responds differentially to positive and negative examples, the learner must know the different responses that are called for by the positive examples and the negatives.
Pattern Knowledge and Extinction. The knowledge of the pattern is represented like any other content, as unique, qualitative features that are the same for all positive instances. In this case, the sameness in feature is a temporal pattern. Knowledge of the relationship of the current example to the pattern is possible only through some form of functional counting of examples (or approximating the counting). The counting starts with the reinforced instance, which means that some record of this event is carried forward in time so that the next occurrence of the predictor is recognized as the second example in the temporal sequence and is therefore predicted to be a nonreinforced example.
The documented patterns of extinction are predictable only if the learner understands this pattern of reinforced instances. Extinction occurs when previously reinforced positive examples are no longer reinforced. The extinction process provides the system with a contradiction between what had been predicted and what occurred. Because the process provides information that S1 no longer predicts S2, the implication for learning and performance is that the previously learned pattern should not be followed.
The one-to-one schedule extinguishes quickly because the learner quickly accumulates information that the predicted pattern is being contradicted. According to the prediction for this learned pattern, every instance of the response should lead to reinforcement. Because the pattern is one-to-one, the first extinction trial (unreinforced response) provides the system with information that the learned pattern is contradicted.
Eight unreinforced responses for the one-to-one schedule provide the system with eight contradictions about the learned pattern of reinforcement. For the one-to-four schedule, however, eight responses provide information about only two contradictions. For the learner to receive information about eight contradictions, the learner would have to experience 32 trials. This is not to say that, if the one-to-one schedule requires eight contradictions for extinction, the one-to-four schedule would also require eight contradictions. It asserts only that information about eight contradictions requires a much larger number of trials if the schedule reinforces only one trial in every four. The total number of trials is therefore deceptive. The presentation of eight extinction trials also does not imply the number of pattern cycles needed for extinction. If five unreinforced trials have occurred, the system may not have sufficient information that the schedule has changed.
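The arithmetic behind these figures can be checked with a short sketch. It assumes, as a simplification not stated in the text, that extinction begins at the start of a cycle and that each completed cycle without a reinforcer counts as one contradiction; the function names are hypothetical.

```python
def contradictions(extinction_trials, cycle_length):
    """Number of disconfirmed predictions after a run of unreinforced trials.

    cycle_length is the number of trials per expected reinforcer:
    1 for the one-to-one schedule, 4 for the schedule that reinforces every fourth trial.
    """
    return extinction_trials // cycle_length

def trials_needed(target_contradictions, cycle_length):
    """Unreinforced trials required to accumulate the target number of contradictions."""
    return target_contradictions * cycle_length

print(contradictions(8, 1))  # 8
print(contradictions(8, 4))  # 2
print(trials_needed(8, 4))   # 32
```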
An unpredictable schedule would logically require a large number of trials to provide the system with information that the learned pattern of reinforcement was being contradicted. If, during training, the schedule had presented as many as 14 unreinforced trials before a reinforced trial occurred, the learner would have to experience more than 14 unreinforced trials to receive information about one contradiction. The logic of the relationship between extinction trials and the rate of contradictions that are generated is consistent with the data. Learners who have been reinforced for intermittent, variable schedules of reinforcement have been resistant to over 1,000 extinction trials (Skinner, 1969). Modification of the Content Map A theme of earlier discussions on the type of information that is logically required for content maps is that the learner must represent multiple features of the stimuli and responses involved in a particular pursuit. To approach an object, the system needs information about more than one feature of the object pursued, more than one feature of the responses being directed, and more than one feature of the response strategy. The data on extinction patterns show that the learner must also have information about multiple features of the reinforcers—not only what they are, but the schedule of reinforcement. Following application of the first successful response strategy, the resulting content map provides a description of the strategy and presents a tacit or assumed pattern. The assumed pattern is one-to-one correspondence between predictor and reinforcer. The next occurrence of the assumed predictor and execution of the response strategy is assumed to lead to the reinforcer. If the primary reinforcer occurs on only some predicted occasions, the system tries to discover a rule that permits predictions on a larger percentage of the trials, hopefully all of them. 
The prediction that the reinforcement will occur on every trial is ultimately abandoned, and the system tries to identify what pattern does describe the reinforcement schedule. The ability of the system to succeed is a function of the system’s memory capacity and its ability to represent correlated features (the individual events that are experienced and whether each is reinforced). The result is a content map that has provisions for predicting not only the positively reinforced trials, but also the unreinforced trials (or at least the tendency for the trial to be unreinforced). Even if the pattern provided reinforcement on all trials, the content map would have added information about the pattern to the information about the features of the predictor, strategy, and primary reinforcer.
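One way to picture a content map that carries schedule information alongside the predictor, strategy, and reinforcer is the sketch below. It is an illustration only, not the authors' formalism: the class `ContentMap`, its fields, and the use of a simple reinforcement rate as the "pattern" are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ContentMap:
    predictor: str
    strategy: str
    reinforcer: str
    # Tacit pattern after the first success: every trial is expected to be reinforced.
    expected_rate: float = 1.0
    outcomes: list = field(default_factory=list)

    def record(self, reinforced: bool) -> None:
        """Store whether the current trial was reinforced."""
        self.outcomes.append(reinforced)

    def observed_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def revise_pattern(self) -> None:
        """Abandon the assumed pattern in favor of the observed one."""
        self.expected_rate = self.observed_rate()

cm = ContentMap("light", "press lever", "food")
for reinforced in [True, False, False, False, True, False, False, False]:
    cm.record(reinforced)
assumed = cm.expected_rate
cm.revise_pattern()
print(assumed, cm.expected_rate)
```

A rate is the coarsest possible description of a schedule. A system capable of representing the temporal order of trials would need a richer field than `expected_rate`.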
Making the Predictor More Specific. The problem of intermittent reinforcement is initially that the reinforcer does not occur when it is predicted. The prediction must be modified so that it predicts fewer reinforced examples than it currently predicts. This reframing of the predictor is achieved by adding a qualification or feature to the original predictor. This added feature does not alter the original formulation of response strategy and primary reinforcer. The predictor still predicts and the primary reinforcer still occurs as a function of the response strategy. If the original relationship was that a light predicted food, the predictor is still a light. If pressing a lever led to food, the predictor is still pressing a lever. A second, independent feature—a pattern—is simply added to the predictor. It is added to the predictor because it is an expectation. Will this specific instance of the predictor lead to the primary reinforcer? With this question answered, the original problem is solved to some degree. The primary reinforcer tends to occur only on the predicted trials. As chapters 6 and 7 indicated, the system must perform logical operations to draw conclusions about whether specific antecedent stimuli predict the primary reinforcer. For the learner to base predictions on a schedule of reinforcement, the system would require a minimum number of trials to formulate the pattern. If the reinforcement pattern were one in four, the learner would have to experience at least five trials to obtain one example that suggested a pattern. Information about the positive example and the three subsequent trials (positive, negative, negative, negative) is not sufficient and provides no possible indication of what the next example would be. With the presentation of the second positive example (positive, negative, negative, negative, positive), however, the system could at least make a guess about the nature of the next example. 
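The minimum-trials argument can be made concrete with a small sketch. It assumes the system's only guess is the gap between the two most recent reinforced trials; the function name `guess_period` and the list-of-booleans history are hypothetical.

```python
def guess_period(history):
    """Return the gap between the two most recent reinforced trials, or None."""
    positives = [i for i, reinforced in enumerate(history) if reinforced]
    if len(positives) < 2:
        return None  # a single positive example gives no basis for a guess
    return positives[-1] - positives[-2]

four_trials = [True, False, False, False]
five_trials = [True, False, False, False, True]
print(guess_period(four_trials))  # None
print(guess_period(five_trials))  # 4
```

The guess produced after the fifth trial is only a candidate: it is consistent with a repeating one-in-four pattern, but also with other schedules.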
If, at the time of the second positive example, the system guessed that there was a pattern, the system would not be able to confirm that guess until the next occurrence of the primary reinforcer. Unfortunately, there are various patterns that are consistent with the first five examples. It could be that the schedule is based on reinforcement that occurs once in a while, or it could be that it would repeat this sequence precisely. The learner may not identify the pattern precisely; however, it now has information that suggests that there may be a pattern. The system needs procedures for determining the pattern that leads to the highest percentage of reinforced trials. The only way the determination is made is by changing the content map and observing the resulting change in the pattern of reinforcement. The learner may be experiencing an intermittent schedule for more than one possible reason. It may be that the learner has not correctly identified the predictor or response strategy. For instance, the learner has learned that red predicts the primary reinforcer when in fact red plus another feature predicts the rein-
forcer. The learner will receive reinforcement on only some of the trials; however, by changing the rule for the predictor, the learner would receive reinforcement on all trials. Given that the learner has no preknowledge of whether the predictor the system has identified is the best predictor, the intermittent pattern may be interpreted by the learner as an indicator that the rule for the predictor should be made more precise. The learner will never discover whether the rule should be modified unless the learner has some means of representing the frequency of reinforcement. If no other predictor achieves a better frequency, the intermittent pattern is assumed to be the best that is possible. Because the system would tend to assume that the lack of reinforcement on every trial implies that the identified predictor is not adequate, training procedures do not usually start with something like a one-in-four schedule, but with a one-to-one schedule that is later modified. If the intervention started with a one-in-four schedule, it would logically require many times the number of trials implied by a one-in-one schedule. On a one-in-four schedule, if the learner tried a particular approach two consecutive times before abandoning it, the chances are only .5 that even one of those trials will lead to reinforcement. If the learner assumes that reinforcement should occur on the next one or two trials (if the predictor is correctly identified), the learner would likely abandon all of the predictors. If the learner does represent the intermittent pattern, the initial attempts may indicate only the more general features of the pattern. For instance, the initial representation may indicate that (a) if reinforcement occurs on the present trial, the next trial will not be reinforced; and (b) if reinforcement does not occur on the present trial, it may or may not occur on the next trial. With further exposure to the pattern, the prediction rule could undergo further refinement. 
For example, the next iteration of the rule could be, “If the current trial is positive, more than one negative trial follows.” If the system were capable of greater precision, the system would ultimately formulate the rule that the reinforcer occurs following three unreinforced trials.
Knowledge Versus Responses. The learning of schedules of reinforcement provides compelling evidence for the assertion that basic learning takes the form of S1 → S2. If the experimental design is such that the learner does not respond to the discriminative stimulus on some trials and responds on others, and if there is a correspondence between the schedule of reinforcement and the pattern of responses, the learner has learned when not to respond as well as when to respond. The same discriminative stimulus that leads to a response on one trial leads to a nonresponse on another trial. The discriminative stimulus, therefore, cannot be the cause of the response or the nonresponse. The only possible basis for the phenomena is something independent of the discriminative stimulus. That something is knowledge of the pattern and knowledge of where the current trial is in that pattern.
Adding Schedule Information to Content Maps. The system would have to perform two operations when testing various examples: one for the presence of the discriminative stimulus, and one for the position of the example in the pattern the system represents. To ensure that the learner receives enough information to identify a possible pattern, the system needs guards against premature abandonment of possible response strategies. Chapter 7 pointed out that if the disconfirmation of a particular response strategy results in the disconfirmation of all the features of the strategy, the content map that would have worked with an adjustment in only one feature would be preempted. In the same way, the learner who has learned a reinforceable response is preempted from learning about a pattern of examples if the trial content map is rejected prematurely.
Positive Reinforcers. Given that the system has no preknowledge of the content or pattern, there is a possibility that, by changing some features of the strategy, the system will formulate a strategy to predict all examples or at least a higher percentage of examples. However, the safest procedure is for the system to persist in testing the same feature set that was effective earlier, although it may not predict on all, or even most, instances. If this sometimes predictor is retained, the system may still be able to learn some form of feature modification that will lead to a higher percentage of the instances being predicted correctly. The system must test different possibilities. The results of the tests provide the system with information about the predictor tested. If the manipulation results in a lower percentage of reinforced responses, the manipulation has not identified a predictor better than the original.
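The comparison of a modified predictor with the original can be sketched as a comparison of reinforcement percentages. The sketch is illustrative: the outcome lists are made-up data, and the function names are hypothetical.

```python
def reinforcement_rate(outcomes):
    """Share of trials on which the reinforcer occurred."""
    return sum(outcomes) / len(outcomes)

def keep_better_predictor(baseline_outcomes, candidate_outcomes):
    """Return 'candidate' only if it is reinforced on a higher share of trials."""
    baseline = reinforcement_rate(baseline_outcomes)
    candidate = reinforcement_rate(candidate_outcomes)
    return "candidate" if candidate > baseline else "original"

original = [True, False, True, False]   # 50% of trials reinforced
narrower = [True, False, False, False]  # 25% of trials reinforced
refined = [True, True, True, False]     # 75% of trials reinforced
print(keep_better_predictor(original, narrower))  # original
print(keep_better_predictor(original, refined))   # candidate
```

The function cannot be called without `baseline_outcomes`, which mirrors the requirement that the system hold a record of the current percentage before any comparison is possible.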
Note that for the system to determine whether the percentage of reinforced responses increases or decreases, the system must have baseline data on the current percentage of reinforced responses. Without baseline data, no comparison of percentages is possible. Therefore, if the system is capable of making adjustments that correct possible problems of predictors that are actually wrong, too specific, or too general, the system must have some method of computing percentages. If the various tests do not result in improvement of the percentage, the system concludes that the current predictor is correct.
In natural situations, the learner must be prepared to formulate many response strategies that do not lead to reinforcement on every trial. Yet they are judged by the system to be the best available. For instance, the cat may have an elaborate ritual for approaching a bird that is feeding on the ground. The strategy may involve a careful approach from behind, with frequent and long pauses. The approach, although well designed, does not work on every trial, nor does the strategy of making loud and persistent meows always ensure that the door will open and the cat will be able to go outside. These strategies, however, tend to be retained because they work better than others the system is capable of testing.
Negative Reinforcers and Punishers. Learning designed to avoid an aversive stimulus has the same basic information requirements about what predicts a successful outcome. The efficient system follows these rules: (a) Once a strategy for an aversive stimulus predicts on a single trial, confirm the strategy and apply it to subsequent trials, (b) do not abandon the strategy without trying it on more than one occasion, and (c) if it does not predict on every trial, try two alternatives: identifying another possible predictor or making the predictive rule more general. The system would probably not be designed to make the rule for predicting the aversive outcome more specific because the system would be designed to play it safe. If it can avoid all the true positives (punishing examples) by creating a rule that treats some negatives (nonpunishing examples) as positives, it will exercise this option. The stronger the aversive stimulus, the more the system would tend to use the strategy of making the rule more general as its first option. If this more general rule works, the system would tend not to experiment in an attempt to determine whether another strategy works or whether the punishing consequences always follow the predictor that has been identified.
Information Based on Patterns of Reinforcement. The system must use information about patterns of reinforcement to determine which features predict the primary reinforcer. The information that the system receives presents four possible patterns that relate occurrences of the predictor and occurrences of the primary reinforcer:
1. The predictor is always followed by the primary reinforcer, and the primary reinforcer is always preceded by the predictor.
2. The predictor is always followed by the primary reinforcer, but the primary reinforcer is not always preceded by the predictor.
3. The predictor is not always followed by the primary reinforcer, but the primary reinforcer is always preceded by the predictor.
4. The predictor is not always followed by the primary reinforcer, and the primary reinforcer is not always preceded by the predictor.
Figure 8.1 shows circles that represent the four possibilities. For Pattern 1, the circles coterminate. For Pattern 2, all predictions are within the occurrences of the primary reinforcer, which means that all predictions are confirmed. However, the reinforcer sometimes occurs when it is not predicted. The predictions must somehow be made more general to include all occurrences of the primary reinforcer. Pattern 3 is the opposite of Pattern 2. All examples of the primary reinforcer are predicted; however, predictions are also made about examples that do not lead to the primary reinforcer. For Pattern 4, there is misalignment between predicted and realized; the primary reinforcer occurs when it is not predicted, and predictions of the primary reinforcer are confirmed only sometimes. The prediction must be reformulated.
FIG. 8.1. Discrepancies between projected and realized reinforcement.
For two of these patterns, no adjustments of the content map are implied. For Pattern 1, the trial content map is completely confirmed and requires no adjustment. As predicted, every occurrence of the predictor leads to the reinforcer, and the reinforcer never occurs sans the predictor. For Pattern 4, the content map is rejected rather than modified. The predictor does not reliably predict the reinforcer, and the reinforcer occurs without being predicted.
Patterns 2 and 3 imply a qualification of the initial assumption that the predictor will have a one-to-one relationship with the primary reinforcer.
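The four patterns, and the adjustment each one implies, can be sketched as a small classification routine. This is only an illustrative reading of the text, not the authors' formalism: the names `Trial`, `ACTIONS`, and `classify_pattern` are hypothetical, and each trial is assumed to be recorded as a pair of yes/no observations (predictor present, primary reinforcer present).

```python
from typing import Iterable, Tuple

# Hypothetical trial record: (predictor_present, reinforcer_present).
Trial = Tuple[bool, bool]

# Adjustment implied by each pattern, paraphrasing the surrounding text.
ACTIONS = {
    1: "confirm the content map",
    2: "make the predictor more general",
    3: "make the predictor more specific",
    4: "reject the content map",
}

def classify_pattern(trials: Iterable[Trial]) -> int:
    """Return the pattern number (1-4) for a record of trials."""
    trials = list(trials)
    # False as soon as the predictor occurs without the reinforcer.
    always_followed = all(r for p, r in trials if p)
    # False as soon as the reinforcer occurs without the predictor.
    always_preceded = all(p for p, r in trials if r)
    if always_followed and always_preceded:
        return 1
    if always_followed:
        return 2
    if always_preceded:
        return 3
    return 4

# One unconfirmed prediction, no unpredicted reinforcer.
record = [(True, True), (True, False), (False, False), (True, True)]
pattern = classify_pattern(record)
print(pattern, ACTIONS[pattern])
```

The sketch treats "always" literally, so a single exception moves the record out of Pattern 1. A system working from percentages, as the chapter describes, would presumably tolerate some noise before reclassifying.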
The content map is not completely rejected. Rather, the map is modified by making the predictor relatively more general (applying to a wider range of examples as in Pattern 2) or more specific (applying to fewer examples as in Pattern 3).
For Pattern 2, the predictor does not predict every occurrence of the primary reinforcer. For example, the content map is based on the rule that the presence of a particular bird always marks where food of a particular type is available. However, the learner encounters that type of food at places not predicted by the presence of the birds. Because the predictor is always confirmed, the map may be the best one available. If it were to be improved, however, it would be made more general. This is done by designing the rule so that some of the examples currently not considered predictors of the reinforcer would be identified as predictors.
For Pattern 3, the predictor predicts every occasion of the primary reinforcer, but also predicts a range of examples not followed by the primary reinforcer. The remedy is to add qualifications to the original content map that restrict the range of positive examples. For example, instead of every occurrence of the red light predicting food, a provision is added that the red light plus the trial that follows three unreinforced trials predicts food.
Types of Adjustments Required by Patterns. To make the rule for the predictor more general or inclusive (Pattern 2 in Fig. 8.1), the system must identify an “or” relationship. The original predictions were always confirmed, but the primary reinforcer occurred on additional occasions. So the scope of the predicted reinforcement must be broadened to accommodate instances not predicted by the original projection. To broaden the scope of the predictors, the system may formulate relationships that are independent of the original predictor or dependent on the original predictor features.
If the original predictor were assumed to be red, an example of an independent feature would be something not related to red or to color. For instance, the learner discovers that size sometimes predicts the primary reinforcer (e.g., nonred examples of a particular size). The adjusted predictor would then refer to color or size. If either red or a particular size is observed, the primary reinforcer is predicted.
Another type of adjustment involves a dependent feature of the original predictor. This adjustment applies to features that have continuous variation. By broadening the scope of the predicted examples, the system reformulates the original criterion. For instance, the feature of red might be broadened into red, orange, or yellow. This broader range incorporates a continuous variation of examples that range from red through yellow.
Pattern 3 implies that the rule for the predictor should be made more specific. To make the rule for the predictor more specific, the system must identify an “and” relationship. Instead of a single criterion, a double criterion is required. Like the options for Type 2, the additional feature may be independent of the original predictor or may be a dependent feature. The addition of an independent feature could include anything from a pattern of reinforcement to an additional feature category, such as size or shape. For instance, the learner assumes that red flowers have nectar, but discovers that a flower must also be of a certain size. The reformulated content map would refer to both red and the size.
An adjustment of a dependent feature would be associated with continuous variation, like color, size, or intensity. The original predictor describes a range of variation. When a dependent feature is added, the range is narrowed. For instance, the learner assumes that red flowers predict nectar, but discovers that only some red flowers have nectar: those that are rust-red. In effect, the original predictor is made more specific by narrowing the range of positive variation from red to rust-red. Some examples that would have been predicted by the original rule are now negatives (flowers that are red, but not within the range of rust-red).
In summary, broadening a criterion for predicting may be achieved by creating a larger number of features expressed as an “or” relationship. The “or” relationship may involve another feature that is independent of the original or a continuous dependent feature that is broadened. Narrowing a criterion to make it more specific is expressed as an “and” relationship. Again, the relationship may involve an independent feature or adding restrictions to the original range of continuous variation.
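The "or" and "and" relationships summarized above can be illustrated with predictors written as yes/no tests. This is a minimal sketch under assumed names: `Example`, `Rule`, `has`, `either`, `both`, and the small `flowers` list are all hypothetical and are not drawn from the text.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]          # hypothetical feature record
Rule = Callable[[Example], bool]  # a predictor is a yes/no test

def has(feature: str, *values: str) -> Rule:
    """Rule that checks whether a feature takes one of the given values."""
    return lambda ex: ex.get(feature) in values

def either(a: Rule, b: Rule) -> Rule:
    """Broaden: an 'or' relationship accepts examples that pass either rule."""
    return lambda ex: a(ex) or b(ex)

def both(a: Rule, b: Rule) -> Rule:
    """Narrow: an 'and' relationship accepts only examples that pass both."""
    return lambda ex: a(ex) and b(ex)

flowers: List[Example] = [
    {"color": "red", "size": "large"},
    {"color": "red", "size": "small"},
    {"color": "orange", "size": "large"},
    {"color": "blue", "size": "large"},
]

def positives(rule: Rule) -> int:
    """Number of examples the rule treats as positive."""
    return sum(rule(f) for f in flowers)

red = has("color", "red")
large = has("size", "large")

# Independent feature: color or size (broader), color and size (narrower).
print(positives(red), positives(either(red, large)), positives(both(red, large)))
# Dependent feature: widen the range of the color criterion itself.
print(positives(has("color", "red", "orange", "yellow")))
```

In this toy population the "or" rule accepts more examples than red alone and the "and" rule accepts fewer, which is the grouping effect the summary describes.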
THE MODIFICATION PROCESS
The process that corrects the predictors for pattern Types 2 and 3 logically requires separate evaluations of the predictor and primary reinforcer. Type 2 modifications tend to be theoretically easier for the system to process because the presence of the primary reinforcer occurs in two contexts. Either these two contexts share features that could predict (dependent features) or they do not share features (independent features). For either case, the scope of the predictor may be enlarged to include the occurrences of the primary reinforcer that are currently not predicted. Any predictor that improves the percentage of predicted outcomes is retained by the system. For instance, the learner discovers that birds sometimes indicate places where there are carcasses. The learner also discovers that some carcasses not marked by birds issue a distinctive odor. The learner now has a rule that indicates birds or odor predicts carcasses. The percentage of instances now predicted may not be 100%, but it is closer to this goal than it was before. Furthermore, the system may later identify a third predictor and a fourth.
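The retention rule for Type 2 modifications (keep any added predictor that raises the percentage of predicted occurrences) can be sketched as follows. The data and the names `occasions`, `coverage`, and `broaden` are hypothetical, and the sketch assumes, as Pattern 2 does, that every prediction made is confirmed, so only unpredicted occurrences of the reinforcer are counted.

```python
from typing import List, Set

# Hypothetical observations: the cues present each time a carcass was found.
occasions: List[Set[str]] = [
    {"birds"},
    {"birds", "odor"},
    {"odor"},
    {"odor"},
    {"tracks"},
]

def coverage(predictors: Set[str], occasions: List[Set[str]]) -> float:
    """Fraction of reinforcer occurrences predicted by at least one cue."""
    hit = sum(1 for cues in occasions if cues & predictors)
    return hit / len(occasions)

def broaden(predictors: Set[str], candidates: List[str],
            occasions: List[Set[str]]) -> Set[str]:
    """Add each candidate cue only if it raises the coverage baseline."""
    current = set(predictors)
    for cue in candidates:
        baseline = coverage(current, occasions)
        if coverage(current | {cue}, occasions) > baseline:
            current.add(cue)  # retained: a higher share is now predicted
    return current

rule = broaden({"birds"}, ["odor", "wind"], occasions)
print(sorted(rule), coverage(rule, occasions))
```

Here "odor" is retained because it raises coverage above the birds-only baseline, while "wind" is dropped because it changes nothing. One occurrence (marked only by "tracks") stays unpredicted, so the rule falls short of 100% but is closer than before.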
For a Type 2 modification, the system is designed to add features until all the positive instances are predicted. If this goal is not met, the learner is limited to the formulations that predict the highest percentage.
Making Type 3 modifications requires a different analysis because there are no unpredicted instances of the primary reinforcer. Instead, there are predictions that are not confirmed. For a Type 3 modification, the system is designed to subtract features until only the positive instances are predicted. An analysis of the common features of the primary reinforcer (both predicted and unpredicted) does not provide articulate information about possible features of the predictor. Rather the analysis must involve a comparison of predictor features observed on instances in which the primary reinforcer occurs and in which the reinforcer does not occur. The analysis would identify any set of predictor features common to all occurrences of the primary reinforcer and to no examples in which the primary reinforcer did not occur. Rules that identify such features would reduce the discrepancy between the percentage of predicted outcomes and realized outcomes.
Continuous Analyses
The tendency of the system would be to improve the percentages of verified outcomes by using a variation of the two analyses during all phases of learning. The analysis of features present on the occasions of the primary reinforcer provides clues about possible predictors that may occur on all instances of the primary reinforcer. If the predictor does not predict all instances of the primary reinforcer, the analysis of the primary reinforcer would identify common features of groups of instances not predicted by the current predictor. If the predictor predicts all instances of the primary reinforcer, but unintentionally creates unfulfilled predictions, the comparison of unpredicted and predicted outcomes provides the basis for revealing specific features shared only by the positive examples.
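The comparison described for Type 3 cases (features common to all reinforced instances and absent from all unreinforced ones) amounts to a set computation. The following is a rough sketch with hypothetical names and data; it assumes each predicted instance is recorded as a set of observed feature labels.

```python
from typing import List, Set

def distinguishing_features(positives: List[Set[str]],
                            negatives: List[Set[str]]) -> Set[str]:
    """Features present on every reinforced instance and on no unreinforced one."""
    if not positives:
        return set()
    common = set.intersection(*positives)
    seen_on_negatives = set().union(*negatives)
    return common - seen_on_negatives

# Hypothetical records of predicted instances (every flower here is red).
reinforced = [{"red", "large", "open"}, {"red", "large", "closed"}]
unreinforced = [{"red", "small", "open"}, {"red", "small", "closed"}]

print(distinguishing_features(reinforced, unreinforced))
```

With these records the only feature that survives is "large", so the narrowed rule would be red and large. An empty result would correspond to the case in which no feature separates confirmed from unconfirmed predictions.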
This analysis of common features may reveal a temporal (schedule-of-reinforcement) feature shared only by positive examples (e.g., all positives may occur only after one or more negatives).
Analysis of Extraneous Features
The modification processes would also lead to corrections of inert features and chance occurrences. For inert occurrences, a subclass may be identified as the predictor when in fact a larger class serves as a more productive predictor. For instance, red flowers have ample nectar. Some of the red flowers have yellow leaves. No flowers, other than red ones, have yellow leaves. The learner does not correctly identify redness as a predictor, but rather yellow leaves. This creates a Type 2 pattern, but the learner may have no basis for identifying red as the better predictor. For the learner, all the flowers predicted by the yellow-leaves rule are positive; therefore, yellow leaves appears to predict. However, if the system performs an ongoing analysis of the features of the setting each time the primary reinforcer (nectar) occurs, the system will identify red as a common feature. As such, red becomes a possible predictor. When it is tested, the analysis reveals that red predicts. When tested extensively, the examples show that red is a better predictor than yellow leaves.
A probable occurrence does not permit the learner to predict every outcome or pattern because there is no discernible pattern or basis for predicting every example. For instance, the only flowers that have nectar are red, but not all red flowers (perhaps only one in three) have nectar. This is a Type 3 problem. The system would be able to learn that this is an example of a temporal pattern by testing various red flowers and comparing features of outcomes that predicted correctly with those that predicted incorrectly. The common feature of examples predicted correctly is that they are red. No distinguishing features are observed in the examples predicted to be positive but that proved to be negative. No systematic variation in red or any other feature is observed in these examples. Therefore, no improvement in the percentage of positive outcomes is assumed to be possible. The content map is retained, and the learner knows that ample nectar is predicted by red flowers, but not by every red flower.
In summary, the system must be designed to perform logical operations that permit the confirmation, disconfirmation, or modification of a hypothesized predictor and allow for corrections of inert features and chance occurrences. Various patterns of modification are possible, but all adjustments are referenced to occasions of the primary reinforcer.
If unfulfilled predictions of the primary reinforcer occur, a more specific predictor is implied. If less than 100% of the occurrences of the primary reinforcer are predicted, a more general predictor is implied. The adjustments occur either through formulating “and” relationships (a combination of more than one feature that makes the predictor more specific) or by “or” relationships (more than one feature option, which makes the predictor more general). The features may be independent of the original predictor or may be a range or degree of the original predictor.
ANALYSIS OF FEATURES
Throughout the formulations of performance and learning functions, we have referred to features. In most contexts, the meaning of the term was evident. This section describes features more thoroughly so that the requirements of the system in creating, sorting, classifying, and generalizing features may be more articulate.
Anything that is recognized by a learning-performance system is a feature. The only basis for relating a particular setting to any other setting is through the presence or absence of features. The basic notion of predictors assumes that a particular antecedent event, response, or primary reinforcer is identifiable on more than one occasion. If it is, there must be something that is the same about it on those occasions. Without this sameness, there would be no way to identify or represent it. However, any two things that are identified as being the same are not the same in all details. If all details were the same, the two examples would have to occur at the same time and in the same place, and would therefore be one unique instance. Because features refer to only particular aspects of the sensory flux the organism receives, features are perforce abstractions of some qualitatively unique detail that remains stable across various transformations. If it is recognized by being large, red, having a long tail, moving fast, or by any other potentially discriminable part, attribute, pattern, behavior, arrangement, or tendency, all the references are features. For the system to function, therefore, it must meet the fundamental requirement of being able to abstract, retain, and represent qualitative samenesses of any type of content implied by the behavior. If the behavior reveals knowledge of redness, the system abstracts redness. If the behavior reveals knowledge of a pattern of temporal events, such as the knowledge of a pattern of reinforced trials, the system must be able to represent and manipulate that which is the same about all positive examples—the pattern. In the simplest terms, the system assumes that the incoming reception has a number of discriminable details, and that some combination of these may be represented as features for sorting examples. 
Let’s say that the presentation is a buzzer sound and that the representation by the system has 100 discriminable features. Further, 10 of them, singly or in concert, identify the buzzer in a way that it will be identified and discriminated from other competing stimuli. The system that learns represents some combination of the 10 possible features. We do not know which ones, but we know that if the learner performed successfully, the system did not base this rule on any of the following features of the original presentation of the buzzer: It was a noise, it was in a particular place, it occurred after another specific event, it occurred when the learner was looking in a particular direction, and so forth. Although these are features of the example that could be used for other applications, they are irrelevant for the present application. The representation of the buzzer must be based strictly on sound features that distinguish it from other auditory patterns the system receives. These may be represented specifically (in a way that discriminates between this buzzer and virtually any other) or more generally (in a way that would not discriminate between hundreds of other buzzers, but that would discriminate between the present buzzer sound and any competing sounds that occur on later trials).
Host and Residential Features
There are two general classes of features: features that describe any instance or occasion of the host and those that reside in individual hosts. A host is an entity of some sort that the agent experiences: tree, place, particular type of animal. The host may also be a temporal event, identifiable over time, like the act of running or flying. Any host is characterized as consisting of more than one feature or attribute. For example, all trees have the features of leaves, a tree shape, and having a trunk or branches. The residential features of individual hosts would include all the ways in which the host can vary either from one individual to another or over time: parts, behavior, patterns, color, and so forth. In other words, the host is an arbitrarily designated constant. The residential features are the variables that compose or accompany the host.
Some residential features are enduring and always accompany the host in some form. Some may or may not be present on a particular occasion. Running may be treated as a host. It will always involve a particular pattern of movement (an enduring feature). The organism, however, will not always be running with something in its mouth (a variable feature). The observed tree will always have various shape features; however, the leaves, surface features, and illumination features vary.
The host and residential features have a kind of reciprocal relationship. Residential features cannot exist without a host, and a host cannot exist without various residential features. A tree cannot be represented without an outline of some sort or discriminable parts, particular colors, texture of various parts, and hundreds of other residential features. Likewise, the residential features of the tree cannot exist without a host.
The part of the tree identified as bark cannot exist naturally without some sort of host covered with bark, nor can the color of the leaves, behavior of the tree in the wind, and any of the other residential features that describe the particular tree at a particular time. If we analyze any feature of the host, however, we discover that the feature is a host composed of features. For instance, the host bark has residential features. It may be rough, soft, fibrous, furrowed, smooth, brown, gray, mottled, irregular, and so forth. If we look at a part of the bark, such as a plate or fiber, we discover that it too has residential features. The analysis of features of features may not result in an infinite regression; however, the assumption is that any feature has features. The reason is practical. If the system identifies something as a host or as a feature, it is distinguished from other hosts or features. The only possible basis for its being distinguished is a difference in features. If the system identifies a particular shade of gray, whether that gray is glossy or flat (for this instance) it is a feature of gray. Gray is the host; the features have to do with the variations of the host that are observed.
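The claim that any feature can in turn be treated as a host of further features suggests a recursive structure. The sketch below is one possible representation, offered as an assumption rather than as the authors' model; the class `Feature` and its methods `add` and `depth` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Feature:
    """A feature that can itself be treated as a host of further features."""
    name: str
    residents: Dict[str, "Feature"] = field(default_factory=dict)

    def add(self, *path: str) -> "Feature":
        """Create (or reuse) nested residential features along a path."""
        node = self
        for name in path:
            node = node.residents.setdefault(name, Feature(name))
        return node

    def depth(self) -> int:
        """Number of host levels at or below this feature."""
        return 1 + max((f.depth() for f in self.residents.values()), default=0)

tree = Feature("tree")
tree.add("bark", "plate", "furrowed")
tree.add("bark", "gray", "glossy")
tree.add("leaves", "green")

bark = tree.residents["bark"]   # a residential feature of the tree
print(sorted(bark.residents))   # bark is also a host with its own features
print(tree.depth())
```

Which node counts as the host depends on where the analysis starts: the same `bark` object is a residential feature when reached from `tree` and a host when examined on its own, which mirrors the point that the host is an arbitrarily designated constant.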
There are two important implications of this analysis. The first is that the system is able to represent hosts of various degrees of inclusiveness. The second is that the relative inclusiveness of the host is a function of the objective of the system’s analysis. If the organism is to seek particular insects that reside only in a particular bark pattern, the system must represent the host as some components of the tree that have bark—branches and trunks. The residential feature is the pattern of the bark. If the organism is to swing through trees, the system must represent the host as trees, and the residential features have to do with geometry and the relative position of branches, not with the bark. If the goal is to dam up a stream by felling a tree, the host is a tree and the residential features have to do with proximity to the river and the size and shape of the trunk. This ability of the system to represent different hosts is implied by behavior. If the learner varies specific features of walking, the learner is able to represent the variations in the features that are observed. For instance, if the organism is to continue walking while moving its head to the right, the host is the act of walking and a residential feature is the variable head movement. If the organism is able to turn its head to the right while standing, the host is standing and the residential variable is head movement. If the situation requires variation in rate of head movement while shaking the head, the host becomes head movement. The residential variables are the types of head movements produced to look to the sides, shake off a horsefly, or issue a warning to a rival. In chapter 3, we illustrated host and variable residential features as possible menus. When a particular host is identified (such as locomotion), particular variables are implied by the various types of locomotion available to the agent. These would be presented as options. 
When a particular mode of locomotion is selected (such as walking), a new set of variables becomes menu items. If walking of a particular type is selected, new menu variables are presented. For responses, the system must account for all the variables required to produce the response in the current setting. For most settings, a package of default values could be directed as a package. These would relieve the agent of having to direct all the features of the host activity (such as toe and ankle variation while performing ordinary walking). In specific settings, however, the organism may be required to use a particular type of foot movement. The toe–foot movement becomes a residential feature of this pattern.
Abstracting Residential Features
The residential features, like the host features, are composed of features. These component features are implied by the various transformations or continuous variation that the feature undergoes. If the feature of a host is a particular tone, the tone may become louder, change pitch, change apparent location, and so forth. Each of these transformations describes a different feature of the original feature. As the discussion of predictors indicated, the system may adopt a relatively broader or narrower focus for identifying a color that predicts. For example, rust-red represents a particular feature that is part of the more general color red. If the color spectrum is represented as a continuum of variation, rust-red is a part of the segment identified as red. Rust-red has the feature shared by all examples of red. It also has the additional feature of the rusty hue, which is not shared by other reds. If we further examine rust-red, we can identify that the class may be divided into subclasses on the basis of specific variations in rust-red. In other words, rust-red has features just as red has features.
Red is different from a host composed of many parts because red does not have parts, as a tree does. It is a single dimension. The particular color identified is an arbitrarily specified range of a continuous color variation. Because it is an arbitrary range, its divisions or subtypes are also arbitrary. Probably the most important difference between a residential feature like color and a host is that there is no color in the abstract; there are only colored hosts: lights, sky, objects. Furthermore, there are no visible hosts without color. (Even if it is transparent, it possesses the colors observed in the space it occupies.) Therefore, awareness of color is possible only if the system abstracts the color from the various hosts in which it resides. If an organism has learned red as a discriminative stimulus, the organism has abstracted redness from various red things and represented the qualitative color feature common to all of them. For example, the organism is presented with continuous variation in the color of a light. The light randomly changes from one color to another.
The light also randomly varies in intensity. When the light is red, it signals the presence of a primary reinforcer. Any system that responded consistently to red would have to create a complete abstraction of red because the only difference between the light when it is purple, red, or orange has to do with the color, not position, size, level of illumination, or any variable other than redness. The learner’s system identified the light as the host and the color red as the residential variable that signaled the primary reinforcer.
Features as Sorting Criteria
Abstract features are the only basis for sorting members of any population into subgroups. The members that have particular features are the positive examples. Those that do not are the negative examples. The more elaborate the set of features used to sort the members, the smaller the group of positives. If a particular feature used for sorting is possessed by all members of the population, the feature cannot function as a screening criterion for any particular group. If the criterion referred to features possessed by no members of the group, the criterion could not identify the group or any members of the group because it does not apply to the group. If the criterion referred to a feature possessed by only some members of the group, it would divide the group into positive and negative subgroups.
The process of learning involves exploiting the relationship between features and the implications features have for grouping. The system correlates specific features of those examples that are identified as positive (leading to a primary reinforcer) and those that are negative. If a feature predicts, it describes a group of instances, all members of which possess this feature. Therefore, the feature may be used to identify members of the group. If the feature selected identifies only some of the positive examples, the feature is broadened and made less specific. If the feature consistently identifies positive examples, but also identifies examples that are not positive, the feature must be narrowed, that is, made more specific.
Functionally, the same process is required for the system to analyze either the host or residential features. In both cases, the feature represented by the system is made more specific by the addition of features or made more general by the subtraction of features or options for choices of features. The subtraction of features is more evident for changing the criterion from something like small and red to simply red. Making rust-red more general (red), although the same process, does not as obviously involve the subtraction of features. A representation that describes a more general population of examples specifies features present only in all examples of that population. A representation that describes a more specific population describes features that are present only in all examples of that more specific population.
If all examples within the more specific population are also in the more general population, the more specific representation must contain all the features of the more general representation plus additional features. Therefore, the additions to a set of features that describe a broad population result in describing a subgroup of the larger population. For example, adding the feature of small to a group of red objects defines a subgroup of small red objects. Subtracting features from a set that describe a broad population results in describing a larger, inclusive population. Whether the subtraction of features involves eliminating an independent feature, such as size in the representation of small and red, or involves an increase in the range of variation of a feature, such as using red instead of rust-red, the subtraction of feature requirements results in a larger group of positive examples.1

1 The specification for any individual is logically more detailed than the specification for any group that shares some characteristics (and is not simply an aggregate of members that are identified as individuals). Conversely, the larger the group identified, the less the specific detail used to sort these members from others.
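The relationship between adding features and narrowing the group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: examples are modeled as sets of feature labels, and the function name `positives` and the four-item population are hypothetical, not taken from the text.

```python
from typing import FrozenSet, List

Example = FrozenSet[str]


def positives(criterion: FrozenSet[str], population: List[Example]) -> List[Example]:
    """Return the examples that possess every feature the criterion requires."""
    return [ex for ex in population if criterion <= ex]


# Hypothetical population: each example is a set of feature labels.
population = [
    frozenset({"red", "small"}),
    frozenset({"red", "large"}),
    frozenset({"blue", "small"}),
    frozenset({"blue", "large"}),
]

red = frozenset({"red"})
small_red = red | {"small"}  # adding a feature makes the criterion more specific

red_group = positives(red, population)              # two examples
small_red_group = positives(small_red, population)  # one example, a subgroup of red_group
```

Under this sketch, subtracting "small" from the criterion restores the larger group, and a criterion with no required features admits the whole population.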
Definition of Feature

The preceding discussion provides the basis for a more precise description of a feature. There are several ways to define a feature. According to the discussion, a feature is a qualitatively unique aspect that is observed in more than one presentation of a stimulus and implies a particular pattern for grouping all examples as either positive or negative. A feature is either present or absent in a given stimulus context.

An operational definition is implied by the changes in grouping pattern that result from changes in the range of detail that describes the feature. The grouping pattern associated with the feature changes as the feature becomes either more general or more specific. The minimum difference between the groupings in which the feature is present and absent describes the feature: what is removed or added to transform a negative example of the feature into a minimally different positive example is the feature. The conversion of rust-red into another shade of red that is minimally different from rust-red functionally describes rust-red. The conversion of any value of red into not red describes the range of red.

This operational view of features is dynamic, continuous, and flexible. It is certainly the manner in which the system that learns performs. The system receives the core sensory data and identifies what amounts to a feature: aspects that, individually or in concert, describe the quality. The feature that the system identifies is not something that exists in reality. It is an abstraction that indicates the key to predicting future outcomes, that is, for grouping examples as either positive or negative. The feature is based on a concrete sensory reception, but the reception is transformed so that other examples share the same feature.2 All content-related operations performed by the system refer to something recognized or created on more than one occasion. Therefore, all these operations specify features.
Functionally, if it is possible for a system to recognize or create something on more than one occasion, it must be represented by the system as an abstract feature. It is represented by the system in a way that is perfectly consistent with the reception the organism has received, but it includes fewer details and therefore applies to occasions beyond those in which it had been first observed.

2 An absolutely unique experience may be recognized because it is an individual composed of various features. For instance, a great green flash that seems to have no particular location may be a unique individual composed of the features great, green, flash, and the feature of being without location. Also, within the range of continuous variation of some features (e.g., color), there may be unique instances that have never been experienced before. For instance, the green flash may be a completely unique color never experienced by the learner before. The question is how green is identified as green. A summary answer is that it has a color property shared by all examples of things that are green and by no other examples. A more detailed answer is provided in chapter 9 under the discussion of interpolation of continuous-variation features.

Completely Generalized Features

As indicated before, when the representation used by a grouping rule becomes more general, a larger percentage of the population of examples becomes positive. If the representation calls for rust-red, there may be only 5 positive examples in a population of 100. If the representation were broadened to include the full range of red, perhaps 15 examples would now be positive. The process of broadening the criterion could continue until the criterion was so broad that all members of the population were positive. At this point, the representation has no potential function for sorting examples in this population. In other words, the color feature is completely generalized. Color variation still exists. It is simply not used as a criterion for sorting different members of the group.

If the population of flowers included blue, yellow, red, and white in a variety of hues, a completely generalized representation of the color record would be a feature or set of features that apply to all colors. In other words, color would be ignored and not considered a variable. If the system generalizes the feature conservatively, the feature is a basis for screening members of the population.

The difference between the feature not being recorded by the system and the feature being completely generalized is important. If the feature is recorded by the system as being present but not a variable, the feature has the potential to (a) become a discriminative stimulus at a later time, and (b) serve as a criterion for identifying members of the group. (Not all members of the larger population have all the features that describe a particular group; therefore, the feature may be used as a reliable criterion for identifying all members of the group.)
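The broadening described above can be made concrete with a small numerical sketch. The counts of 5 and 15 per 100 follow the illustration in the text; everything else is an assumption made for the example, including the color labels, the remaining composition of the population, and the function name `percent_positive`.

```python
# Hypothetical population of 100 flowers, labeled by recorded color.
population = (
    ["rust-red"] * 5      # the only positives for a rust-red criterion
    + ["scarlet"] * 10    # other reds (label is illustrative)
    + ["blue"] * 30
    + ["yellow"] * 30
    + ["white"] * 25
)


def percent_positive(accepted, population):
    """Percentage of the population whose color meets the criterion.

    accepted=None models a completely generalized color feature:
    color is still recorded, but it is not used for screening.
    """
    if accepted is None:
        return 100.0
    hits = sum(1 for color in population if color in accepted)
    return 100.0 * hits / len(population)


rust_red_only = percent_positive({"rust-red"}, population)          # 5.0
full_red = percent_positive({"rust-red", "scarlet"}, population)    # 15.0
fully_generalized = percent_positive(None, population)              # 100.0
```

Note that the color labels remain in `population` even when the criterion is `None`. That mirrors the distinction drawn above: a recorded but completely generalized feature can later be reinstated as a variable, whereas an unrecorded feature cannot.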
The agent may be aware of the color it is approaching; however, the reason it is approaching has nothing to do with the color. If the feature is not recorded by the system, the absent feature could not become a variable at a future time because it is not in the record. For the learner to discover at a later time that color (any color) is a predictor, the learner would have to consult the record of at least one past occasion in which color could have been a predictor. In effect, the system that is able to later learn that color is a variable goes from a completely generalized representation of color to one that is less generalized.

From General to Specific Representations

When the system formulates rules based on sensory receptions, the system goes from specific to more general. The system sees the rust-red example and classifies it as a color that includes at least some range of variation. It
may be a narrow range of rust-reds or a wider range that includes all reds and some oranges. However, when producing responses, the system must go in the other direction, from general to specific features. The content map provides information that applies to all settings that have particular sensory features. The system must particularize the rule so it fits the current setting. For example, a content map provides a general description such as “Approach X.” The system would have to be designed so that it recognized the presence or absence of X. When X is present, the agent would be faced with the task of transforming the general directive into a specific concrete application. Although 100 variations of “approach” are possible in the current setting, the agent produces only one. Selecting this one is achieved by adding features to the original “approach” representation presented by the content map. For example, the agent may select walking. This feature is added to the original broader criterion, “Approach X by walking toward X.” This specification has reduced the number of possibilities greatly, from 100 variations of “approach” to possibly 8 variations of “walking.” However, the criterion is still too general, which means that the agent has to add features until one specific concrete example is described: “Approach X by walking toward X in direction D at rate R and with posture P.” Note that the feature of approaching X is still present as is the feature of walking.

In summary, if a general directive is to be applied to a particular setting, all the variables that are influenced by details of the current setting must be specified. The system must go from a broad representation that applies to all settings to one that applies to a single setting.

From Specific to General Representations

Processes that go from specific representations to an identification of broader features create different problems of focus.
If the task facing the learner is to identify walking, not to create walking, the learner must refer to features. The problem facing the system is that the example of walking has hundreds of features. If all are recorded, how many should be subtracted to create a representation that is useful to the system? The various possible groupings of positive and negative examples are a strict function of the features included in the representation.

Let’s say that a naive learner who has never observed walking before observes it on one occasion and attempts to represent it. The learner may represent only the feature of moving, in which case the learner’s criterion would classify all examples of walking as well as all other modes of locomotion as walking. Another naive learner may identify the feature of the feet moving so they are sometimes in contact with the ground and sometimes not. These features would generate a screening criterion that classifies all examples of walking as positive examples and also all examples of skipping, running, and jumping. Some examples of locomotion would not be included: crawling, rolling, and swimming. Furthermore, not all of the positive examples in this classification would involve locomotion. Jumping up and down and running in place would be positive examples because they have the essential feature of moving the legs so the feet are sometimes in contact with the ground.

At the other extreme is the naive learner who identifies features of specific foot positions, specific stride lengths, rate of movement, and pattern of arm–leg movement. This criterion would be so specific that it would exclude all negative examples of walking, but it would also exclude many positive examples of walking (walking with a long stride or with minimum arm movement, etc.).
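The three naive learners described above can be compared in a short sketch. The feature labels, the example names, and the helper `classify` are hypothetical simplifications chosen for illustration; a real example of walking would carry hundreds of features, as the text notes.

```python
# Hypothetical feature sets for a few observed behaviors.
EXAMPLES = {
    "walking, normal stride": {"moving", "travels", "feet on and off ground",
                               "normal stride", "arms swing"},
    "walking, long stride": {"moving", "travels", "feet on and off ground",
                             "long stride", "arms swing"},
    "running": {"moving", "travels", "feet on and off ground"},
    "running in place": {"moving", "feet on and off ground"},
    "crawling": {"moving", "travels"},
    "swimming": {"moving", "travels"},
}

# Three candidate representations of "walking", from broad to narrow.
CRITERIA = {
    "too general": {"moving"},
    "intermediate": {"moving", "feet on and off ground"},
    "too specific": {"moving", "travels", "feet on and off ground",
                     "normal stride", "arms swing"},
}


def classify(criterion):
    """Names of the examples that have every feature in the criterion."""
    return sorted(name for name, feats in EXAMPLES.items() if criterion <= feats)


for label, criterion in CRITERIA.items():
    print(label, "->", classify(criterion))
```

In this sketch the broadest criterion accepts every behavior, the intermediate one accepts running in place but rejects crawling and swimming, and the narrowest one rejects walking with a long stride even though it is a positive example.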
THE PROCESS OF FOCUSING FEATURES

Features are central to three primary functions of the system that learns:

1. Going from the general representations provided by content maps to specific applications that have the same set of essential features.
2. Going from specific sensory settings to more general representations assumed to incorporate the essential features that the setting would share with other settings.
3. Adjusting the content map and application according to sensory feedback.

Figure 8.2 shows a functional model of a multirepresentational capacity for performing feature-abstraction and feature-application operations. The figure presents three cross-sections that occur in sequence over a short period of time (possibly 3 sec). Each cross-section shows a variation in the detail recorded by the system (the black sensory core) and the extent to which the core record is generalized (the gray area surrounding the sensory core). The farther a gray segment is from the sensory core, the greater the core detail is generalized. The core provides the record. The generalization (gray area) is the interpretation of the record. The figure assumes that a primary reinforcer is influencing what is generalized.

This framework operates over time and produces ongoing transformations of sensory input. Although the diagram shows three discrete cross-sections, the process is actually continuous, not segmented. The segmentation is designed only to show that the core is not constant over time, but that the generalizations tend to remain constant during the pursuit.
FIG. 8.2. Sensory core and generalized features over time.
This representation would indicate that a particular trial content map is in effect.

The Core

Each excursion of the core (individual black bars extending from the center of the diagram) represents a unique sensory detail that is recorded. One bar may represent the apparent color of the object, another the apparent location, and another the sensation from a leg. The number and type of features are strictly illustrative. In actual practice, the system may record many times more features than the diagram suggests.

The greater the distance of each bar from the center, the greater the salience of the sensation. The record of a bright object would be a bar that extended farther from the center of the core than the bar for an object that is poorly illuminated. A record for a big object would be a bar that extended farther from the center than the bar for the same object when it appears smaller. Any sensation enhanced by the system would also be relatively more salient and therefore would extend farther from the core than neutral, unenhanced sensations.

The pattern shown in the core record changes over time as the sensory details of the setting change. Some of the recorded details endure with minor transformations, some change greatly, and some disappear as others
appear. At any point in time, however, the system is receiving a large number of sensations from both external and internal sources. Most of these details are irrelevant to the content map and primary reinforcer. Each of the eight radiating segments (A–H) represents a feature category of the core. Each feature category is independent of the others, which means that it cannot be derived from any combination of the others. The number of feature categories shown is arbitrary, but greatly inadequate for a system capable of responding to proprioception, tactual sensations, vision, hearing, and olfaction. If there were only six feature categories per modality, there would be at least 40 static categories and possibly double that number if the dynamics over time were considered. For simplicity, assume that the diagram relates only to visual features. Those excursions within a given radial segment (e.g., Segment A) are assumed to share features. If Segment A is color, all the irregularity of the core in A has to do with color, not simply the color of a targeted flower, but also color of ambient things—foliage, sticks, grass, leaves, sky. The colors change over time. If Segment B has to do with changes in proximity or apparent size, all excursions have to do with specific features currently present. Each segment represents a dimension in which something may be transformed—in shape, position, size, color, and so forth. The excursions from the core are therefore organized in a way that they do not actually exist in the reception. It may be that for the system to perform the various abstractions and generalizations, the system needs multiple copies of the same sensory input so it can organize it in different ways according to the variety of categories that apply to each cluster of related sensory input. 
Functionally, however, the inputs are sorted and reorganized when the system performs any operation related to a particular dimension or quality, such as color, size, relative position, and so forth. For instance, the learner observes that something relatively small and red moves in Pattern P. The details of the reception are distributed to four independent categories: color, size, movement, and pattern. The analyses for these variables require varying amounts of time. The analysis of direction may require less time than the analysis of the pattern. In any case, each segment requires a unique analysis. The analysis depends on the core record.

Independence of Core Categories

The components small, red, movement, pattern, and some entity are from independent categories, yet they reside in a single host. The independence of the categories is most easily demonstrated through substitution of the values within each category. For instance, the system that is able to respond to each of the independent categories would be able to represent the following minimally different variations: (a) Something relatively small and
red that moves in Pattern P, (b) something relatively large and red that moves in Pattern P, (c) something that is relatively small and blue that moves in Pattern P, and (d) something that is relatively small and red that moves in Pattern Q.

Feature Generalization of Core Representations

The shaded area in Fig. 8.2 shows the degree of generalization of the various core representations. Whereas excursions for the core represent salience, the distance of the shaded area from the center represents the degree to which the core record is generalized. Every detail recorded in the core (every excursion) is generalized. Only if it is generalized would it be recognizable over time or on another occasion.

The farther the shaded boundary is from the center, the more the core detail is generalized. As a generalization of the core detail approaches the outer edge of the cylinder, details of the core record are subtracted. Therefore, greater distance from the core increases the percentage of positive examples in a population. If the generalization reaches the edge of the cylinder, 100% of the examples are positive, which means that the generalization does not provide any basis for grouping things on the basis of the feature. If a generalization is between the outer edge of the cylinder and the core excursion, it is a variable, which means that it is capable of providing a basis for grouping members as positives (those that have the feature) and negatives (those that don’t).

If the boundary is close to the outer edge, it is generalized broadly, which means that it is categorized on the basis of relatively few features. If it is relatively close to the corresponding core excursion, it retains a relatively large number of features of the core record. The closeness to the core does not affect the importance of a generalization, merely the amount of core detail used as a basis for grouping examples.
Closer proximity of the generalization to the outer edge implies an easier discrimination. Fewer details of the feature are relevant and a broader range of variation is acceptable. As the generalization approaches the core, more details of the sensory record are involved, so the discrimination would require identifying the presence of many features. The addition or subtraction of features is illustrated by the various ways it is possible for the system to respond to a rust-red light. The light has a certain amount of salience and detail on a particular occasion. Some of that detail has to do with the color. If the color of the light (rather than position, intensity, timing, and other features) is a predictor, the color must be abstracted from the context in which it occurs. If the experimental rule is that all lights that are not blue are positive examples, the rust-red example is positive because it has the feature of being a light that is not blue. This generalization would be quite close to the outer edge because it involves a single, broad feature
(not blue). If the rule is that only red lights are positives, the example is positive because it has the features of being a light that is red. This generalization is closer to the core because it requires a specification of much more detail than the discrimination not blue requires. If the rule is that only rust-red lights are positives, the example is positive because it possesses the features of being a light that is rust-red. This generalization would be quite close to the core record. If only a specific shade of rust-red light were positive, a very large number of features (greater detail) of the example would be required to discriminate positive from negative examples. The generalization would be very close to the core.

At the other extreme, we could make all features of the rust-red example irrelevant to classification by not referencing the positive examples to that feature. For instance, a dog now signals the primary reinforcer. The reception and record of the rust-red light still occur, but the light is now irrelevant to the pursuit and is therefore completely generalized, which simply means that none of the core reception details is considered for the current pursuit. For a completely generalized detail, 100% of the members in the rust-red light population meet the screening requirement for color because there is no screening requirement for color.

As Fig. 8.2 shows, some of the categories (C, D, E, G, H) are completely generalized for the current pursuit. The only relevant categories are A, B, and F. The core record for Segments A, B, and F changes over time, but the degree to which they are generalized remains stable. Variation in the core reception does not affect the particular generalizations for these categories. If color is relevant and present, the same color features are abstracted from all three records that are shown. The generalization is the same because the function is the same. The specific color is an ongoing variable for the current pursuit.
It is attended to and tracked. The degree of generalization of one feature category does not imply the degree of generalization for the other categories. In the figure, the relative distance of various generalizations from the core is not the same, which means there is a range of possible generalization for the details of the core. The generalization is influenced by logical factors—implications that suggest the nature of the detail that is predictive.
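The separation between a changing core record and stable generalizations can be sketched as follows. The sketch assumes that Segment A is color and Segment B is proximity, as the discussion of the figure suggests; treating Segment F as movement pattern, the specific threshold, and all recorded values are hypothetical choices made only for illustration.

```python
# One generalization per feature category. None marks a category that is
# completely generalized: it is recorded but not used for screening.
GENERALIZATION = {
    "A": lambda color: color in {"rust-red", "scarlet", "crimson"},  # "red"
    "B": lambda distance: distance < 50,                             # "near"
    "C": None,
    "D": None,
    "E": None,
    "F": lambda pattern: pattern == "P",
    "G": None,
    "H": None,
}


def meets_screening(core_record):
    """True if the record satisfies every category that is still a variable."""
    return all(
        test(core_record[category])
        for category, test in GENERALIZATION.items()
        if test is not None
    )


# Three cross-sections of the core record; the details change over time.
cross_sections = [
    {"A": "rust-red", "B": 40, "C": 1, "D": 7, "E": 0, "F": "P", "G": 3, "H": 9},
    {"A": "scarlet", "B": 25, "C": 4, "D": 2, "E": 5, "F": "P", "G": 8, "H": 1},
    {"A": "crimson", "B": 10, "C": 6, "D": 0, "E": 2, "F": "P", "G": 5, "H": 4},
]

results = [meets_screening(record) for record in cross_sections]  # all True
```

The values in Categories C, D, E, G, and H vary freely without affecting the result, while a change that takes Category A outside the red range would fail the screening.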
SUMMARY

The most basic fact about the system that learns or performs is that it generalizes as part of what it learns. This is a logical imperative. If it does not abstract some qualitative detail from one setting that is shared by another, identifying the same predictor or applying the same response strategy on two occasions is impossible. The two settings are not identical, but they
share some features. The features are abstractions, not a specific sensation or something observed in nature. Therefore, the most basic function of the system is to abstract and represent features. The features that are represented may be relatively general or more specific. However, all are representations of qualitatively unique content. The system that learns must perform multiple representations. These are necessary because the system formulates conclusions about related features. Changes in specific sensory features imply changes in some response features. The system learns not only features of individuals, but patterns that characterize a sequence of examples. More sophisticated learners are even able to identify patterns of reinforcement. This pursuit logically requires abstracting the pattern of temporal ordering by correlating the presence of a recurring pattern with the presence or absence of reinforcement. The analysis of features required by the system that learns is extensive. The system must deal with enduring features, transitory features, and correlations involving individual features. The system necessarily identifies patterns from those involved in walking to the temporal rules for intermittent reinforcement of trials. The system must be able to change the scope of a generalization on the basis of feedback. The feedback indicates whether the predicted outcome occurred. The implications are whether and how the predictor should be modified. If the criterion that the system assumes to be predictive predicts all the occurrences of the primary reinforcer, but also predicts the primary reinforcer when it does not occur, the criterion must be made more specific. This adjustment is achieved through the addition of features (or the specification of a smaller range for a feature characterized by continuous variation of a single dimension). 
If the predictive criterion always predicts the primary reinforcer, but the primary reinforcer also occurs when it is not predicted, the scope of the criterion must be made more general. This adjustment involves the subtraction of features (or the specification of a greater range of variation). If the generalization of the detail the system records is conservative, it involves many of the recorded features. If the generalization of the detail is extensive, it involves only a few details of the record. If the detail is completely generalized, it is recorded, but the feature is not relevant to the current pursuit. The system represents the sensory details and abstract features of the record. The scope of the generalization of the core record determines the percentage of examples in the population that are positive. If the feature is conservatively generalized, it screens the population so that only a few members of the population are positive. If the feature is more broadly generalized, the percentage of positive examples increases. If the feature is
completely generalized, it is not used as a screening criterion because all members of the population have the feature. The scope of the generalization may therefore be adjusted to group the population in ways that permit the learner to predict accurately. Features are subtracted to make the abstraction more general (so that more members of the population are included). Features are added if the predictor is to be made more specific so that it refers only to a subset of the current positives.
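The feedback rule summarized above (add features after a false prediction, subtract features after a missed reinforcer) can be sketched as a single update step. This is a minimal sketch under stated assumptions, not the mechanism the text attributes to any organism: the function `adjust`, the use of a memory of earlier positives to choose which feature to add, and the feature labels are all hypothetical.

```python
def adjust(criterion, example, predicted, reinforced, positive_memory):
    """Return a revised criterion (a frozenset of required features).

    predicted:  the criterion classified the example as positive
    reinforced: the primary reinforcer actually occurred
    positive_memory: feature sets of earlier reinforced examples (assumed)
    """
    if predicted and not reinforced:
        # Criterion is too general: add a feature that earlier positives
        # share and that this false positive lacks.
        shared = frozenset.intersection(*positive_memory) if positive_memory else frozenset()
        candidates = sorted(shared - example)
        return criterion | {candidates[0]} if candidates else criterion
    if reinforced and not predicted:
        # Criterion is too specific: subtract the features this positive lacks.
        return criterion & example
    return criterion  # prediction matched the outcome; no change


memory = [frozenset({"red", "small", "round"})]

# False positive: a large red object was predicted positive but not reinforced.
narrowed = adjust(frozenset({"red"}), frozenset({"red", "large", "round"}),
                  predicted=True, reinforced=False, positive_memory=memory)

# Miss: a small red square was reinforced but the criterion rejected it.
broadened = adjust(frozenset({"red", "small", "round"}),
                   frozenset({"red", "small", "square"}),
                   predicted=False, reinforced=True, positive_memory=memory)
```

In this run, `narrowed` gains the feature "small" and `broadened` loses the feature "round", which corresponds to the two directions of adjustment described in the summary.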
Chapter 9

Transformation of Data
Chapter 8 presented a model for generalizing incoming receptions. The process involves selecting only some of the details of the reception and representing them as features. If the representation needs to be more general, features are subtracted from the representation. If the representation must be made more specific, features are added.

The capacity to construct multiple representations that vary in specificity requires the application of logical operations. The extent to which a particular system applies these logical operations is inferred from behavior. Different patterns of behavior imply the four fundamental operations: interpolation, extrapolation, stipulation, and transformations that create gestalts. The particular operation involved is determined by comparing the degree of generalization with the set of examples the learner received during training. The pattern of the learner’s responses to examples that are different from those presented during the training implies the extent and type of logical operations the system performed to classify the nontaught examples. If the learner was not shown a particular example during training and classifies it as a negative, for instance, the only basis for this conclusion is a logical operation. The learner either interpolated, extrapolated, stipulated, or unified data points into a gestalt.

INTERPOLATION

Interpolation is an operation based on the difference between examples. If Examples A and B are positive and B is different from A along a particular dimension, X, then any example that is less different from A than B along
the same dimension is also positive. From the presentation of three different shades of green in a controlled experimental setting, we are able to infer the extent to which the learner performs the logical operation of interpolation. The procedure would be to teach a particular response to three examples: yellow-green, green, and blue-green. The learner is reinforced for responding to these and only these values. To achieve the latter restriction, the design would provide negative examples: yellow, blue, and two more examples that are farther from the three positive examples on a continuum of color variation (orange and purple). The examples are presented by a single light illuminated for each trial. If the learner responds to a positive example by pressing a bar, the learner receives a food reward. If the learner responds to a negative example, the learner receives a mild shock. A condition of the training is that all examples we present fall within the discrimination capacity of the subjects.

When the learner consistently responds to all positive examples and responds to only positive examples, we test the learner with examples that are discriminably different from the training examples. All of the test examples that are positive would be interpolated values: a green that is between the yellow-green and green, and another between green and blue-green. Negative examples would also be logically interpolated values: one between the blue and purple and another between yellow and orange. The test consists of these examples randomly interspersed with the taught examples. Table 9.1 shows the arrangement of the taught examples (positive green and negative not green) and test examples on a continuum of color variation. If the learner responds to any test example (pressing the bar), the response indicates that the learner has classified the test example as a positive. No response to a test example indicates that the learner has classified the example as a negative.
The pattern of responses to the test examples indicates how the learner has classified the training examples and the extent to which the learner interpolated.

TABLE 9.1 Test Examples and Taught Examples for Interpolation

Classification Designs for Interpolation

The system could be designed to identify the training examples correctly by employing three different hardwired formats:
1. The system could classify the positive examples according to their shared features and the negatives according to a single shared feature.
2. The system could classify the positive examples as three discrete, unrelated values and the negatives according to a single shared feature.
3. The system could classify the positives as three discrete, unrelated values and the negatives as four discrete, unrelated values.

For Design 1, the shared-feature classification, each instance would be classified according to the set of features that are common to all positive examples (the range between blue-green and yellow-green). All negatives would be classified on the basis of a single feature: They are not within the range of positives. The predicted performance for this shared-feature design would be that the learner would identify all four test examples correctly.

For Design 2, the aggregated positive-example design, the system would make no attempt to relate the features of one positive example to those of the others and therefore would make no determination of whether they share common features. The positive examples would simply be aggregated in the same way they would be if the learner had learned that a stone, hat, and green light are to be treated as positive examples and that there were also examples that were not positive. The negatives would be identified according to a specific shared feature: the feature of not being one of the positives. The predicted performance for this aggregated positive design is that the learner would identify all four test examples as negatives. All test examples lack the feature characteristics of any examples that the system recognizes as positive; therefore, all test examples are judged to be negatives.

For Design 3, all positive examples and all negative examples are categorized on the basis of their individual features. No shared features are considered.
The predicted performance for this design is that the learner would respond to the test examples randomly. If the system knew only how to classify the three discrete positives and the four discrete negatives that occurred during training, the system would have no basis for classifying any test example as either positive or negative. Therefore, the system would have no basis for identifying it. It would be arbitrarily classified as positive or negative—perhaps treating some as positives, perhaps treating all as negatives. For the learner, each test example would remain unrelated to the others until the system receives information about whether it is positive or negative. The most ineffective system would be Design 3. In fact, Design 3 is a paradox because the content map must classify each positive as having shared
features if it is to recognize it on more than one occasion. Recognizing any of the positives on more than one occasion logically implies responding to fewer features than the original example presents. To produce the positive response (going from the representation of the content map to the specific application in the present setting), the system goes from general to specific. In other words, this system would have to possess the capacity to go from specifics to shared features and from shared features to specifics. Furthermore, the system must recognize that all positives share the overriding feature of leading to reinforcement. So given that they have this common effect, the most reasonable system of logic would search for the most general sameness in feature that predicts the reinforcement. That would be the set of features shared by all positives. The fully aggregated design is also inefficient because it classifies each negative as a discrete, unrelated value. Again this position is paradoxical because, ultimately, each negative lacks the features that make it a positive. Therefore, all are related because all lead to punishment if they are responded to as positives. Design 2, which uses shared-feature logic for negatives only, is paradoxical because the behavior shows that the learner knows that all positives lead to positive reinforcement. So it produces the same response strategy without relating that response strategy to a possible common basis in the positive examples. The search for common features of the examples does not imply that there are always shared features. For instance, if the design of the experiment involved three arbitrary positives (stone, feather, and glass) rather than shades of green, the search for finding a single common feature would not be productive. Rather, the system would have to learn that each of the discrete positives, and only these positives, lead to reinforcement. 
Although this possibility is present before the learning occurs, the system that assumes there is some basis for sameness will have an advantage because it will discover the largest grouping for the positives that is implied by the experimental conditions. By using adjustments in the scope of the generalization (detailed in chap. 8), the system would ultimately identify (a) whether there is a single common feature or set of predictive features that are unique to all positives, and (b) the nature and extent of this feature or set of features. This process would be referenced to the presentations of the primary reinforcers (positive and negative). If negative outcomes are consistently avoided during training, the system has learned a scope of generalization that is appropriate for the training examples. The feature common to all the positives is that they are green (which is arbitrarily defined by the set of positive and negative examples).
Design 1 is the most efficient and the only design that allows for the process of interpolation. Designs 2 and 3 (which are not based on sameness of feature for identifying positives) are unlikely designs because they would become overwhelmed with information as the range of positive variation increases. If the range of positive variation is very great, the range of negative variation becomes relatively smaller, so the system would realize diminishing savings by grouping the negatives as not positive. Concurrently, the system would have to perform enormous calculations to keep track of the various positives.
Even if the system uses a Design 1 process, it may not have the capacity to identify all test examples correctly. If the system responds correctly to all test examples except those negatives and positives that are only minimally different from each other, the system is identifying shared features of positives and negatives, but is poor at representing those shared features. For example, if the system learns to correctly classify the middle green as positive and orange and purple as negatives, the system attempts to identify shared properties of the positives, but it is not successful at identifying the range of qualitative variation that characterizes the positive examples. It has learned middle green because this color shares fewer features with the known negatives (orange and yellow, blue and purple) than the other positive examples share. The system might tend to confuse blue-green with blue and yellow-green with yellow. It is less likely to confuse yellow or blue with middle green because middle green shares fewer features with either yellow or blue.
Consider two other possible experimental conditions. Condition 1 calls for every color except blue to be a positive predictor of food. Condition 2 calls for only blue to be positive and all other colors to be negative.
The learning system designed to aggregate positives but globalize negatives would have a lot of trouble learning to respond to Condition 1, although it is logically no more difficult to learn than Condition 2. The system that groups both positives and negatives according to sameness in features would use parallel logic for Conditions 1 and 2. For Condition 1, the content map would be based, in effect, on a rule such as, “If an example is blue, it is negative; if it is not blue, it is positive.” The map for Condition 2 would be based on a parallel rule: “If an example is blue, it is positive; if it is not blue, it is negative.”
Similarity and Shared Features. If the learner is capable of discriminating the various color values and correctly identifies both positive and negative values, the only possible basis for this performance is some form of shared-feature analysis. This conclusion is quite different from the traditional interpretation that the learner simply learns that the positives are
similar. If similarity is used as an explanation, why doesn’t the learner treat yellow-green as a negative? It is as similar to yellow as it is to green. So raw similarity cannot be the basis for one example being positive and the other negative. If the learner consistently discriminates yellow as a negative and the yellow-green as a positive, the learner must identify specific features that are unique to the positives.
Interpolation of Positive Values
The characteristics of the logic for interpolating positive examples may be expressed in different ways. Basically, however, the system must meet two requirements. First, the system must have the capacity to determine the degree of transformation between the test example and the known data points that are closest in value to the test example. Second, the system must have knowledge of the possible various values that lie between the two data points. If the operation does not meet both these requirements, it could not work. As with other logical operations, these requirements do not suggest how the system is designed to achieve the analysis, merely that it is logically necessary and the system could not perform the generalized behavior without performing the logical operations.
For instance, the learner receives information that yellow-green is positive and that middle green is positive. The system then functionally reasons as follows:
1. To transform one of these positives into the other would require transformation X.
2. In performing transformation X, the operation would pass through values 1, 2, 3 . . . N.
3. If transformation X results in both examples being positive, any transformation from yellow-green to 1, 2, 3 . . . or N results in a positive example.
In other words, any transformation that is a component of a transformation required to convert one positive into another positive results in a positive. The system must represent the positives as a range, not as discrete points.
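The three reasoning steps above can be sketched in code. The following Python sketch is only an illustration under stated assumptions: it places the colors on a hypothetical one-dimensional hue scale (the numeric positions in `HUE` are invented for the example), and the function names are not from the text. It treats two taught positives as end points and counts any value that the transformation between them passes through as positive.

```python
# Hypothetical hue positions on a single continuum (assumed for illustration only).
HUE = {
    "blue": 2,
    "blue-green": 4,
    "middle green": 6,
    "yellow-green": 8,
    "yellow": 10,
}


def transformation_steps(start, end):
    """Ordered values passed through when converting `start` into `end` (step 2)."""
    step = 1 if end >= start else -1
    return list(range(start, end + step, step))


def is_interpolated_positive(value, positive_a, positive_b):
    """Step 3: any component of a positive-to-positive transformation is positive."""
    return value in transformation_steps(positive_a, positive_b)


taught = (HUE["yellow-green"], HUE["middle green"])
print(is_interpolated_positive(7, *taught))              # a value between the taught positives
print(is_interpolated_positive(HUE["yellow"], *taught))  # a value outside the taught range
```

On a numeric scale like this one, a simple range check would give the same answers; the explicit list of steps is used here only to mirror the wording of the three steps.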
To group both taught examples as occupying the same range, the system has to identify what they have in common, which is a unique quality common to a range of feature variation. This procedure requires subtracting some of the features of yellow-green and green until the residue is the unique shared features. These serve as the basis for the range of positive variation for the two values. Viewed differently, the system converts one
value into another and classifies the feature difference between the two as an irrelevant difference. If the system is to apply this knowledge to any concrete examples that lie in value between the known data points, the system must be able to represent the transformation between the positives as an ordered series of intermediate variations—component transformations—that are implied by the difference between the two values. The system could not identify the interpolated test examples as positives unless it was able to place them somewhere within the range of positive variation. Basically, the system must know that one test example lies between the yellow-green and middle green. This knowledge would not be possible unless the system recognized the range of variation that would occur between the yellow-green and middle green. Identification of Negative Values Although a variation of this interpolation process may be used by the system to classify the negative test examples, more than one design option is possible. To perform the interpolation for the negatives, the system would identify the specific features of those negatives closest in transformation to the positives. The system would then record the data points for the other two taught negatives and construct the transformation from yellow to orange and from blue to purple. The system could then identify the test examples that fall in the transformation ranges—the yellow-orange and blue-violet examples. Because these examples fall within the known range of two negatives, they must be negative. (The transformation from one known negative to the other passes through the test example. Therefore, the test example is negative.) A somewhat simpler basis would be possible for classifying the negatives. The system could group the four negatives originally taught as being the same in that they lacked the range of greenness that characterizes the positive examples. 
This classification would provide a criterion for identifying the test negative examples and any other negative example. The test examples do not have the property shared by the positives. Therefore, they are negative. This logic permits the range of variation used to describe the positive examples to serve as the criterion for identifying the negative examples. This option would be sufficient if the only operation the learner performed was interpolation. However, for some learning, the system must perform not only the operation of interpolation, but also operations that combine both interpolation and extrapolation.
EXTRAPOLATION
Interpolation involves the transformation that occurs between known data points. Extrapolation involves those transformations that go beyond the known data points. Let’s say we performed training that presented the same three positive examples of green, but with no negative examples. Table 9.2 shows the three taught positive examples and eight test examples on a continuum of color variation.
In the original experiment, there was only one relevant variable: the color of the light. By eliminating the nongreen colors (negative examples) from the set of training examples, we create two possible predictors. One would be the greenness of the light, the other would be illumination. A property of all the training examples before the testing occurs is that if the response is produced when a light comes on, reinforcement follows. The light is now a shared feature of all positives; the illumination could be identified as the predictor, and it would predict accurately on all training examples. For the interpolation training, the illumination was not a property unique to all positives.
Because there is more than one predictor of positive reinforcement in the extrapolation design, a wide variation of behavior is theoretically possible because a wide variation is consistent with the information the learner has received. The learner may respond to any test example as a positive, to only the green examples as positives, or to all green and some nontaught examples as positives. All these outcomes are consistent with the information the design conveys to the learner. The only examples ever reinforced are green; therefore, it is possible that green is a predictor. The only examples ever reinforced are occurrences of the light turning on; therefore, it is possible that illumination is a predictor. The only examples reinforced varied some in their color; therefore, a range that includes some of the test examples could be a predictor.
TABLE 9.2 Example Set Requiring Unknown Extrapolation
We could eliminate illumination as a possible predictor by changing the design so that the light was on all the time. It would be either white light or green. Because illumination does not occur as a variable, only the color green is a variable. Its presence predicts the food. The presence of white light predicts the unavailability of the food. In a sense, this change in the
experiment has created a negative example—the white light. Therefore, this experiment is like the original except that it has one, not four, negatives and no negative variation. (To control for the possible shared feature that positives are presented for a shorter period of time than negatives, the design could control the length of time each example is presented. If a positive were present for 3 minutes, however, the learner would be reinforced for only one bar press.) This design does not provide information about how the learner is to classify colored examples that are not white or green. Should the range of positives be generalized conservatively or extensively? Any statement that could be made about all the positive examples but not the negative describes a possible feature that could be used by the system as a basis for successfully identifying all examples within the experimental context. The general principle is that any feature shared by all positive examples and no negative examples may be used by the system to describe the range of positive variation. In other words, the system would be able to perform successfully on all training examples by formulating one of a large number of possible relationships between the positives and the negative. These are the primary possible rules arranged from more general to more specific: (a) If it has the feature of being not white, it is positive; (b) if it shares more color features with the green examples than the white example, it is positive; and (c) if it is strictly within the range of positive examples, it is positive. Because each of these ranges is consistent with the information the learner received during training and each leads to correct identification of all positives and the negative, each is a possible feature range that could be identified by a system that uses data and logical operations. 
If the system identified not-white as the criterion for positives, any colored light that is presented following training would be responded to as positive. If the system identified the criterion of being closer to green than to white, a number of patterns are possible, each grouping some untaught colors as positive and some as negative. However, all would have the blues and yellows identified as positives. For the most restricted option, any example that fell outside the range of green variation shown for the positives would be negative. Blue and yellow would be negative. Because each of these possibilities will result in the same function (that of correctly identifying all the positives and the negative of the training set), different learners would be expected to select different arrangements. Even plan-based systems could generate different patterns in different learners. However, the most probable pattern for the plan-based system would be a conservative generalization of greenness, whereas the feature-based learner would tend to generalize more.
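The three candidate rules (a), (b), and (c) can be compared with a small Python sketch. Everything numeric here is an assumption made for illustration: the hue positions in `HUE`, the use of `None` to stand for the white light, and especially the `margin` parameter, which is only a stand-in for "shares more color features with the green examples than the white example" and is set so that blue and yellow fall inside it. The sketch checks that all three rules handle the training set correctly while disagreeing on untaught colors.

```python
# Hypothetical hue positions (assumed); None stands for the white light, which has no hue.
HUE = {"blue": 2, "blue-green": 4, "middle green": 6,
       "yellow-green": 8, "yellow": 10, "orange": 12}
WHITE = None

GREEN_LOW, GREEN_HIGH = HUE["blue-green"], HUE["yellow-green"]


def rule_not_white(hue):
    """(a) Most general: anything that is not white is positive."""
    return hue is not None


def rule_closer_to_green(hue, margin=2):
    """(b) Intermediate: colored and within an assumed margin of the taught greens."""
    return hue is not None and GREEN_LOW - margin <= hue <= GREEN_HIGH + margin


def rule_within_green_range(hue):
    """(c) Most specific: strictly within the taught range of greens."""
    return hue is not None and GREEN_LOW <= hue <= GREEN_HIGH


RULES = {"not white": rule_not_white,
         "closer to green": rule_closer_to_green,
         "within green range": rule_within_green_range}

# Training set: three green positives and the single white negative.
TRAINING = [(HUE["blue-green"], True), (HUE["middle green"], True),
            (HUE["yellow-green"], True), (WHITE, False)]


def consistent(rule):
    """True if the rule classifies every training example correctly."""
    return all(rule(hue) == label for hue, label in TRAINING)


for name, rule in RULES.items():
    verdicts = {color: rule(HUE[color]) for color in ("blue", "yellow", "orange")}
    print(name, consistent(rule), verdicts)
```

A different choice of `margin` would give a different intermediate pattern, which matches the point that rule (b) admits several patterns rather than one.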
Extrapolating the Difference
The facts about the difference between positives and negatives imply the logical function that the system employs. In its simplest form, the system identifies the difference and uses an extrapolation logic based on this difference: If the difference from a known positive to a known negative is D, any difference greater than D creates a negative through extrapolation. We can illustrate the logic by teaching two negatives from the previous example set. If we teach yellow and blue as negatives, the remaining negative test examples are greater in difference from positives. Table 9.3 shows this arrangement. The learner is taught three positives and two negatives. All the test examples are negatives, and all have a greater difference from the positives than yellow and blue. The boundary between positives and negatives is clearly established by yellow and blue. According to the logic of extrapolating the difference, the learner who learns the taught examples would correctly classify all test examples as negative by extrapolating the difference. The transformation of blue-green to blue is magnitude D. The transformation between blue-green and blue-violet is D plus additional features. If transformation D creates a negative example, transformation D plus additional features creates a negative through extrapolation.
The process of extrapolating the difference guarantees that the learner would not need a full range of negatives to identify the features that characterize all negatives. The learner would need information only about those negatives that are minimally different from positives. If the learner learned to discriminate between negatives that are minimally different from positives, the learner would have sufficient information to classify the full range of negatives correctly.
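The logic of extrapolating the difference can be illustrated with a short Python sketch. It assumes a hypothetical one-dimensional hue scale (the positions in `HUE` are invented), measures a "transformation" as the numeric distance from the nearest taught positive, and uses names that do not come from the text. Under those assumptions, D is the smallest distance known to produce a negative, and any example at that distance or farther is classified as negative.

```python
# Hypothetical hue positions on one continuum (assumed for illustration).
HUE = {"purple": 0, "blue-violet": 1, "blue": 2, "blue-green": 4,
       "middle green": 6, "yellow-green": 8, "yellow": 10,
       "yellow-orange": 11, "orange": 12}

TAUGHT_POSITIVES = [HUE["blue-green"], HUE["middle green"], HUE["yellow-green"]]
TAUGHT_NEGATIVES = [HUE["blue"], HUE["yellow"]]


def difference_from_positives(hue):
    """Size of the transformation from the nearest taught positive to `hue`."""
    return min(abs(hue - positive) for positive in TAUGHT_POSITIVES)


# D: the smallest difference known to turn a positive into a negative.
D = min(difference_from_positives(negative) for negative in TAUGHT_NEGATIVES)


def is_negative_by_extrapolation(hue):
    """A difference of D or more from every taught positive implies a negative."""
    return difference_from_positives(hue) >= D


for color in ("blue-violet", "purple", "yellow-orange", "orange"):
    print(color, is_negative_by_extrapolation(HUE[color]))
```

Only the two minimally different negatives (blue and yellow) are needed to fix D; the more distant test colors are classified without ever being taught.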
TABLE 9.3 Example Set That Specifies Extrapolation Range
Difficulty of Examples
The principle that any feature shared by all positive examples and no negative examples may be used by the system to describe the range of positive variation implies why some discriminations are logically more difficult than others. The greater the number of options that could account for all
positives being positive and all negatives being negative, the greater the probability that a learner will identify one of them after only several exposures. The fewer the number of available options, the lower the probability that a learner will identify one of them with the same number of exposures. The learning task that presents 10 options is logically easier to learn than the task that presents 2 options; the learner will more probably learn 1 of the 10 options than 1 of the 2 options. The question is not necessarily whether the learner is capable of learning the discrimination, but whether the learner fixes on the correct feature difference between positives and negatives. When the negative example is white light, a great number of different ranges of color variation would permit the learner to learn some set of features that permits correct responses to all positive green examples and the negative white examples. So the probability is higher that the learner will identify one of them. When the number of features that are shared by both positives and negatives increases, the range of options for discriminating positives from negatives decreases. The learner would be expected to require more trials to identify the features that distinguish yellow from yellow-green than to distinguish orange from yellow-green. The system cannot classify test examples without constructing some assumed range of acceptable positive variation. The system does not have the option of designing a classification procedure that says in effect, “I know the greens are included, but I don’t know about the others. Therefore, I will just not deal with them.” If the system is able to describe the greens, a classificatory range is implied. All those that the system does not know about are outside of the range of greens and are negative. An unbounded classification is not functionally possible because the learner will respond to all examples. 
If they are not treated as positive, the system has—either deliberately or tacitly—identified them as negatives. Therefore, the only option available to the system is to define some range that encompasses the green examples and does not include white light. Minimally Different Examples. The requirements of the system that performs extrapolation of differences imply the principle of the magnitude of difference. If the difference between positives and negatives is D, any discrimination that requires a transformation less than D is more difficult to learn and any discrimination that has a difference more than D is relatively easier to learn. This principle does not attempt to assign an absolute level of difficulty to the discrimination. Rather, the principle asserts that whatever difficulty the system has in learning the discrimination, it would probably have more difficulty learning a discrimination that involved a transformation of lesser
magnitude and have less difficulty learning a transformation of greater magnitude. The principle of the magnitude of difference expresses a probability. On a given occasion, the learner who has not learned an effective content map will identify a particular extrapolation pattern of positives and negatives. The pattern will either be contradicted or confirmed by later information. If there is a relatively large number of possible patterns that would work, there is a given probability that the learner will select one of them and different learners would learn different patterns. As the number of possibly effective patterns decreases, the variation in patterns that are learned decreases. If the difference between positives and negatives involves only a single feature or a small difference, all learners who perform correctly to all positives and all negatives have learned the same thing. This principle has important implications for instruction (delineated in chap. 17).1 Double Extrapolation The process of deciding the extent to which the positives are extrapolated beyond the taught examples automatically implies the extent to which the negatives are extrapolated. The process requires extrapolation from the positives and extrapolation from the negatives. There is a certain amount of distance or transformation magnitude between the extreme positives (yellow-green for instance) and the closest known negative. The positive range plus the negative range equals the entire distance. The information provided by the taught negatives and positives does not provide information about how much of the area between known positives and negatives is positive and how much is negative. The learning system must therefore determine the pattern of extrapolation. Figure 9.1A shows a conservative extrapolation for positives. Figure 9.1B shows a more extensive extrapolation based on the same set of positives and negatives. 
For both A and B, the three positives are shown on the left and the negative white light on the right. The top bar shows the range of extrapolation from yellow-green to white. The bottom bar shows the extrapolation from blue-green to white. The extrapolation of the positives for A is conservative. The basic rule that generates this pattern is that if a new example is not close to being within the taught range of green, it is not judged positive. Because the extrapolation of positives is conservative, the extrapolation from white light is relatively extensive. The extrapolation of positives shown in Fig. 9.1B is the opposite of that shown in 9.1A. The pattern for B is based on the rule that if it is not white it is positive. Therefore, the extrapolation of the positives is extensive and the extrapolation of the negatives conservative. The diagram shows the generalizations from the positives as symmetrical: the extrapolation from yellow-green is the same distance as the extrapolation from blue-green. Symmetry is not a necessary feature of the extrapolation pattern. Any arbitrary points on the top bar and the bottom bar mark a possible extrapolation pattern.
FIG. 9.1. Conservative and extensive extrapolation of positive examples.
1 An intriguing phenomenon is associated with the principle of the magnitude of difference. It is logically possible to make the difference smaller no matter how small the original difference D is. The reason this is possible is that one can add features to the rule for positive variation. This addition implies a new negative, which has all the features of the minimally different negative except for the added feature. For instance, the green example is positive only if it is green and has been preceded by a negative example. The green example is positive only if it is green and has been preceded by an orange negative.
Effect of Consequences on Extrapolation. The pattern of double extrapolation would tend to change as the reinforcing consequences change. Changes in reinforcing consequences do not affect the information that the agent receives about color variation, but rather information about the relative importance or response cost of representing the variation conservatively or extensively. Consider two conditions, one of which has a high response cost for the learner responding to the negative example. The learner receives a highly aversive shock if it responds to a negative example during training. The other condition has a highly sought reinforcer. The prediction for a population of learners would be a tendency for more conservative positive extrapolation with the highly negative consequence. Conversely, extensive extrapolation of positives would occur in response to the highly sought reinforcer.
To establish a baseline condition, the learner would be reinforced for pressing the lever when the light is green. The reinforcer would be mildly reinforcing. The consequence for pressing the bar when the white light is on would be a mildly aversive shock. This condition would provide the baseline for a group of learners. It would show the extent and variability of the extrapolation of the positives and negatives. For the high-response-cost condition, the initial training has a different negative condition, but the positive reinforcer remains the same. The learners would receive a more severe shock for responding to the white light. When tested on nontaught examples, the learners in this group would tend to generalize the positive features more conservatively. This is a result of the negative condition. The only way to be assured of avoiding the negative condition is to extrapolate the negative feature extensively, possibly to the point that the learner would tend to treat any example outside the immediate range of the taught examples as negative. An increase in the strength of the positive reinforcer would be predicted to have the opposite effect—that of increasing the extrapolation of the positives possibly to the point that the learner might sometimes respond to the white light during training (as well as the green examples). For example, the learner would receive a small amount of water after being deprived of water for a long period before each experimental session. Responding to the negative would result in a mild shock. On the test examples, the learners would be predicted to respond to the full range of nongreen lights with far greater frequency than they would under baseline conditions. The prediction of extensive positive extrapolation is based on the idea that there is little cost for extending the range of positives. However, there would be a serious possible loss of reinforcing opportunities if the range were too conservatively extrapolated. 
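The direction of these predictions can be made concrete with a small Python sketch. The model in it is entirely an assumption of the sketch, not a claim from the text: the learner's belief that an example is positive is taken to fall linearly from 1 at the edge of the taught positives to 0 at the known negative, and the learner is taken to respond whenever the expected gain (belief times reward) exceeds the expected loss (one minus belief, times response cost). The function name and the numeric reward and cost values are hypothetical.

```python
def extrapolation_extent(reward, cost):
    """Fraction of the gap between positives and the known negative treated as positive.

    Assumed model: belief that an example is positive falls linearly from 1 at the
    edge of the taught positives to 0 at the known negative, and the learner responds
    whenever belief * reward exceeds (1 - belief) * cost.
    """
    threshold_belief = cost / (reward + cost)  # respond only above this belief
    return 1.0 - threshold_belief


baseline = extrapolation_extent(reward=1, cost=1)     # mild reinforcer, mild shock
high_cost = extrapolation_extent(reward=1, cost=4)    # more severe shock
high_reward = extrapolation_extent(reward=4, cost=1)  # highly sought reinforcer

print(high_cost, baseline, high_reward)
```

Under these assumptions the more severe shock narrows the positive extrapolation and the stronger reinforcer widens it, which is the ordering the passage predicts; the particular fractions carry no empirical weight.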
Extrapolation of Negative Examples and Interpretations
In chapter 8, we indicated that the adding of features to the positives increased the precision of the discrimination required and therefore increased the difficulty of the discrimination. That discussion referred to the number of features to which the learner would have to attend. If an independent feature is added to redness (e.g., redness and smallness), the learner’s representation would have to take into account more than redness. A related issue is the nature of the set of positive and negative examples required to ensure that all learners learn this combination of features. To achieve this goal, the training set of examples for small, red positives would have to include negative examples of at least two types: those that are small and not red and those that are red and not small. As noted, if the negatives are only minimally different from the positives, only one pattern of extrapolation is possible. If the negatives are greatly different from the positives, various patterns are possible.
Table 9.4 shows the extrapolations possible as a function of the negatives for four experimental treatments. For all, the positives are red and small. Each treatment has a different pattern of negatives. The last column indicates the possible interpretations or rules that are consistent with the information conveyed by the combination of positives and negatives. These would be the only possible interpretations for learners who learned the discrimination and correctly responded to all positives and negatives in the training set. The table does not suggest the relative frequency of each interpretation of the data, merely that each is possible because it is consistent with the information the learner received.
If the negatives are not red and not small, the learner could learn any of three possible rules or interpretations: Every positive must be red and small, every positive must be red, or every positive must be small. The negatives permit extrapolation of red and small because neither dimension is ruled out by the negatives. The only interpretation the negatives rule out is that a positive example may be both not red and not small. Three possible positive combinations remain depending on how the learner extrapolates information about negatives and positives. Rows 2 and 3 indicate that if the negatives are either not red and small or red and not small, only two interpretations are possible. One is the correct interpretation. The other is a pattern of extrapolation that has not been ruled out by the negatives. The bottom row has two types of negatives during training: those that are not red and small and those that are red and not small. The first set of negatives rules out the possibility that the positives may be not red. The second set rules out the possibility that the positives may be not small.
Together they limit the number of possible interpretations to one: Every positive must be red and small.
TABLE 9.4 Possible Interpretations as a Function of Negative Examples
Features of All Positives    Features of All Negatives    Possible Interpretations
Red + small                  Not red + not small          Positives = red + small
                                                          Positives = red
                                                          Positives = small
Red + small                  Not red + small              Positives = red + small
                                                          Positives = red
Red + small                  Red + not small              Positives = red + small
                                                          Positives = small
Red + small                  Not red + small              Positives = red + small
                             Red + not small
The test of whether an interpretation is possible is logical, not empirical. If the interpretation is possible, the learner who used that interpretation would be able to respond correctly to all positive and negative examples within the training set. If two or more possible interpretations permit a learner to respond to all training examples correctly, all of them would be expected to be learned within a population of learners.
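Because the test is logical, it can be run mechanically. The Python sketch below is an illustration only: it encodes each example as a hypothetical pair `(is_red, is_small)`, lists the three candidate interpretations named in the discussion of Table 9.4, and keeps those that respond correctly to the positives and to every negative in a given training set. The function and variable names are assumptions of the sketch.

```python
# Each candidate interpretation maps (is_red, is_small) to "treated as positive?".
INTERPRETATIONS = {
    "red + small": lambda is_red, is_small: is_red and is_small,
    "red": lambda is_red, is_small: is_red,
    "small": lambda is_red, is_small: is_small,
}

POSITIVE = (True, True)  # every taught positive is red and small


def possible_interpretations(negatives):
    """Names of interpretations that respond correctly to the whole training set."""
    return [
        name
        for name, rule in INTERPRETATIONS.items()
        if rule(*POSITIVE) and not any(rule(*negative) for negative in negatives)
    ]


NOT_RED_NOT_SMALL = (False, False)
NOT_RED_SMALL = (False, True)
RED_NOT_SMALL = (True, False)

# One call per treatment (row) of the table.
print(possible_interpretations([NOT_RED_NOT_SMALL]))
print(possible_interpretations([NOT_RED_SMALL]))
print(possible_interpretations([RED_NOT_SMALL]))
print(possible_interpretations([NOT_RED_SMALL, RED_NOT_SMALL]))
```

The four calls leave three, two, two, and one surviving interpretations respectively, so only the last training set forces the rule that every positive must be red and small.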
Modifying the Range of Extrapolation
If we teach a negative that is minimally different from the range of positive variation, we show more precisely where the positives terminate and where the negatives begin. If we want the learner not to generalize beyond the greens shown in the original setup, we add a negative that is closer to blue-green than blue is. If we want to show that the range of positives extends somewhat beyond blue-green, we add a positive example that is more blue than the blue-green example. The range of positives is extended, but the difference between the positive and negative is minimal. Obviously, there is a point at which the difference between positive and negative becomes so precise that it is impractical because the learner cannot represent the difference.
Contradicting Possible Patterns of Extrapolation. The original set of greens and the white negative permit a great variety of extrapolation patterns. Let’s say that there are 50 of them. If we later introduce a red negative, the new design contradicts some of the patterns that were consistent with the original, possibly 10. If we introduce a turquoise negative instead of a red one, we might contradict 40 of the possible patterns. If we introduce both a turquoise negative and a greenish-yellow negative, we might contradict 49 of the possible patterns. Two implications derive from the fact that various extrapolation patterns are contradicted when we add negatives that reduce the difference between positives and negatives:
1. If the range of difference between positives and negatives is clearly implied by the initial set of positive and negative examples, relearning will not be required for any learner who masters the initial training set.
2. If the range of difference between positives and negatives is not clearly implied by the initial set of values, testing with examples that are in the difference range will yield information about what was taught initially as well as information about learning tendencies.
Learning requires application of logical operations. If a taught set of examples does not imply the status of a particular example, different learners draw different conclusions and treat the example variously as positive or negative. The behavioral trend is quite different if the only test examples presented are either within the demonstrated range of positives or the demonstrated range of negatives. Adding or Subtracting Features to Change Extrapolation. Extrapolation between the range of positives and negatives may be manipulated by changing the range of positives and negatives along a continuous dimension or by adding or subtracting independent features. The original condition logically dictates a certain number of possible patterns of extrapolation. If features are added to all positives or negatives, the range of difference increases and therefore supports a larger number of possible extrapolation patterns. The increased range of difference makes the discrimination easier simply because it increases the probability that the learner will identify at least one of the possible patterns. For instance, we could change the original color experiment so that, instead of creating all the examples from the same light source, we present one light source for the positives and another for the negative. The added position feature may be used as a reliable indicator of whether an example is positive or negative. The system now has the option of attending to the difference in color or position of the light. With this addition, colorblind organisms would be able to learn the discrimination more easily than they would from the original design. Salience of Added Features. Any difference added may be relatively more or less salient. The less salient it is, the less likely it is that the learner will attend to it. We could change the original design so that the examples were created by projecting circles of either white or green. 
The positives and negatives would not occur at the same time, but would appear in almost the same location, the negative circles slightly to the left of the positives. A position difference between positives and negatives is still present. It is just not as obvious. As we increase the difference in location of the circles (until negatives are clearly not in the same location as the positives), we make the presence of this feature more salient and therefore more probable as a predictor. It does not matter whether the additional difference feature is added to the positives or the negative. The effect is the same. Added differences that are more salient are more likely to be incorporated into the pattern the learner adopts.
Awareness and Interpretations. The general principle of awareness is a variation of Premack’s (1965) principle. If there is more than one predictor of reinforcement, some learners learn to respond to either predictor. The extrapolation pattern the learner selects is limited by what the learner is aware of, but is not necessarily congruent with all that the learner knows. The learner may have a precise awareness of the range of positive variation and awareness that the negative light is not in the same position as the positive light. However, this information does not suggest whether the learner assumes that all these details are necessary for future examples. The learner may assume that any light that is not white is positive or that any color in the negative position would be negative (which means that even green would be negative if it were in the negative position). The learner may assume any other configuration of extrapolation that is consistent with the current pattern of reinforcement. As noted, the difference between positives and negatives may be conceived of as the transformation steps needed to convert a positive example into a negative. The more steps involved in the transformation, the more probable it is that the learner will identify at least one of the steps or features that are unique to the positives. To change a positive example into a negative in the original setup, we have to change its color only. A greater difference would be created by changing the negative from a white circle to a white square or to a white square that is in a different location than the light for the positives. Learning is more likely because there are more predictive differences between positives and negatives. One of the most easily established relationships there is about learning is that the more similar the stimuli, the more readily they are confused. Children do not tend to confuse the letter d with an elephant when learning to read. 
They do tend to confuse d with b because they are the same form presented in a different orientation. So if a child attends simply to the form and does not attend to the orientation, the child will confuse the letters. If the child learns the discrimination of b and d, the child will not tend to confuse b and g because a greater number of transformation steps are required: first converting the b to a d, then changing the straight ascender into a curved descender.

Extrapolating Ordered Examples of Positives. A variation of the logic used to identify negatives by extrapolating the difference may be used to identify positives. In the simplest application, a continuous dimension such as height would be established with three rectangles, all of the same width, but each a different height: 5 inches tall, 7 inches tall, and 15 inches tall. The negative examples are 5 inches tall and 7 inches tall; the positive example is 15 inches tall. According to the principle of extrapolating the difference, the learner would be predicted to classify test examples that were less than 5 inches tall as negatives. By the same logic, the system would classify examples that are taller than 15 inches as positives. In this case, the system could safely extrapolate the difference to draw conclusions about the positive examples.
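The rectangle example can be written as a small classification rule. This is a minimal sketch, assuming that the 15-inch rectangle is the only positive and that heights between the tallest negative and the shortest positive are left undecided; the function name `classify_height` and the label "uncharted" are illustrative choices.

```python
def classify_height(height, negatives=(5, 7), positives=(15,)):
    """Classify a rectangle by extrapolating past the taught heights."""
    if height <= max(negatives):
        # At or below the taught negatives: negative by extrapolation.
        return "negative"
    if height >= min(positives):
        # At or above the taught positive: positive by extrapolation.
        return "positive"
    # Between 7 and 15 inches the taught examples imply nothing definite.
    return "uncharted"


for h in (3, 6, 10, 20):
    print(h, classify_height(h))
```

The middle branch is the range that different learners would be expected to treat differently, because the taught set does not imply its status.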
The Logic of Generalizing Features The efficient system is designed to identify the features that permit the largest generalization consistent with the various data points identified as positive or negative. The system attempts to interpolate as much as the data points permit and extrapolate as much as the data points permit. The reason for this design has to do with probabilities in learning a basic relationship. Let’s say that the learner has learned that responding to a yellow-green light leads to reinforcement, but has not learned about any other example. Next we present a blue-green example. The learner does not know whether it is positive or negative. If the learner’s system is capable of representing the difference between examples, two conditions are possible: (a) the learner treats the example as a positive and responds to it, or (b) the learner treats it as a negative and does not respond to it. If the learner’s system is conservative, it classifies the new example as a negative. (If the learner is conservative and does not respond to the example, the learner classifies the example as a negative.) The learner’s assumption is neither confirmed nor disconfirmed. The learner has not received the mild shock or positive reinforcer. Therefore, the system does not obtain all the information it needs to learn various relationships. The less conservative system would tend to treat the new example as a positive. This assumption may be wrong, in which case the learner would receive a shock. Whether the learner receives a shock or reinforcer, however, the system would know whether the example is positive or negative. The learner that responds only to the yellow-green example as a positive is receiving positive reinforcement every time it responds. However, the system’s poor design has effectively tricked the agent into assuming that it has learned the relationship when, in fact, the learner may be receiving only half the available reinforcement. 
In summary, the system designed to generalize the maximum amount permitted by the data points will tend to learn relationships faster simply because it will receive information about a larger range of possibilities. Just as the most efficient system for the bee would be to record all the features of the productive flower and treat any one of them as a possible predictor,
the effective system for a learner would record the various features of the positive and treat any of them as a possible predictor. The system is optimistic in the sense that when it encounters an example that is the same as the original in some ways and different in others, it assumes that the quality of sameness suggests that the new example is positive.
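The contrast between the conservative and the optimistic system can be sketched as a toy simulation. Everything here is a hypothetical illustration: the three lights, their feature sets, and the function `run_trials` are assumptions chosen so that two positives and one negative are available, and responding is the only way to find out an example's status.

```python
# Hypothetical examples and features (not from the text).
FEATURES = {
    "yellow_green": {"light", "greenish"},
    "blue_green": {"light", "greenish"},
    "white": {"light"},
}
TRUTH = {"yellow_green": True, "blue_green": True, "white": False}


def run_trials(optimistic, start="yellow_green"):
    """Present each example once; return what was learned and reinforcers earned."""
    known = {start: True}
    reinforcers = 0
    for name, feats in FEATURES.items():
        # Does this example share any feature with a known positive?
        shares = any(feats & FEATURES[k] for k, v in known.items() if v)
        # Known examples are treated as known; unknown ones are tried
        # only by the optimistic system, and only if a feature is shared.
        responds = known.get(name, shares if optimistic else False)
        if responds:
            known[name] = TRUTH[name]      # shock or reinforcer reveals status
            reinforcers += int(TRUTH[name])
    return known, reinforcers


print(run_trials(optimistic=True))
print(run_trials(optimistic=False))
```

In this sketch the optimistic system ends up knowing the status of all three examples and collects both available reinforcers, at the cost of one shock. The conservative system is reinforced on every response it makes, yet it collects only half the available reinforcement and learns nothing about the other two examples.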
STIPULATION

The tendency of the efficient system to respond to various features as a basis for generalization creates the conditions that lead to stipulation. Stipulation is a negative generalization: a tendency for the system to extrapolate the positive condition conservatively. Stipulation is caused by repeatedly presenting a narrow range of positive variation. For instance, we present the design that has a white negative and three green positives to two groups of subjects. For the nonstipulated group of subjects, we establish consistent responses to the positive condition and then present no more trials before testing. For the stipulated group, we establish consistent responses and then present 100 more reinforced trials before testing. To determine each group’s tendency to stipulate, we measure the degree of extrapolation and interpolation to different shades of green and nongreen lights. The prediction is that the degree of extrapolation and interpolation will be greater for the nonstipulated group.

The information about the features of the positive and negative examples is the same for both groups. Both groups know what is being reinforced. Therefore, both groups have the same opportunity to interpolate or extrapolate. The difference between the stipulated and nonstipulated presentation has to do with the apparent importance of the features of the positive examples. The stipulated presentation increases the probability of the learner concluding that, because these positive examples are the only ones reinforced and this pattern has occurred many times, any examples other than these three positives are not positive. The logic of stipulation is that (a) the presentation suggests that all features of the various individuals are relevant to their classification as positives, and (b) if an individual lacks any of the three sets of features, it tends not to be positive.
In other words, stipulation increases the probability that the learner will treat the individuals as an aggregated set of positives, not as a range of variation. In effect, the system concludes that if the various features of the three individuals are not the critical basis for classifying the examples, why are only three discriminably different positive individuals presented on more than 100 trials?
Stipulation and Instruction

Because the repeated presentation of examples within a particular range decreases the learner’s tendency to extrapolate or interpolate, the degree of stipulation should be considered in any experiment or instructional intervention that draws conclusions about generalization or transfer. The influence of stipulation is usually not considered, however. For instance, the investigation may present a particular pattern several hundred times and then test the learner’s response to a variation in the pattern. The conclusions drawn are not qualified by the fact that the procedure induced substantial stipulation.

Stipulation is observed in many instructional settings. For instance, an aide teaches a developmentally delayed child to do things he has never done before. She gives directions such as “Hand me the eraser,” and he performs reliably. The aide always works with the child in a particular cubbyhole in the larger classroom. To achieve this performance, the aide has presented hundreds of trials of the various tasks and basically has been the only one the child routinely responds to in the context of instruction (one task presented after another). After she has worked with him for the school year, we can most probably demonstrate the effects of stipulation by changing any of the features of the intervention. We simply move the aide and the child to another location, one that does not have distractions but that is greatly different from the original. The prediction would be that if the learner has performed only in the context of the cubbyhole, serious stipulation would occur. The child’s performance will therefore be far worse in the new location than in the classroom cubbyhole. If we have someone other than the aide present the tasks that were presented successfully, the child will either tend to perform more poorly or not perform at all.
Even if we change the aide’s routine by having a second child present and directing some tasks to this child, the original child’s performance will show the effects of stipulation. For the child, all the details of the original setting have become discriminative stimuli, necessary features, for the behavior. The aide must be the presenter of the task, the place must be the cubbyhole, and each task must be presented in a particular way. Because all these details have been stipulated by being in all examples, the removal of any of them results in the child’s system not being able to generalize the task. Stipulation is often seen in pets, particularly dogs. A dog is taught certain behaviors inside, never outside. Following the stipulated presentation, the tendency of the dog to perform those behaviors outside is greatly reduced. Poorly designed instructional programs for children are replete with stipulated details that prevent children from generalizing appropriately.
For instance, the program teaches children to sound out and identify words. All the words the child is taught have three sounds in the consonant–vowel–consonant order. The words in this set are practiced many times. Although sounding out a two-sound word is mechanically simpler than processing a three-sound word, the lower performing child who has learned the stipulation of the three-sound word will often not be able to sound out and identify two-sound words when tested. The child will tend to try to add a third sound. In the same way, the child will probably not be able to sound out and identify a three-sound word that begins with a vowel (“and,” for instance). The child may try to add a consonant sound before the sound for a. The stipulated presentation functionally adds features that are not necessary to sounding out and identifying words.

In summary, stipulation, like other logical operations, is based on data. Stipulation exaggerates the individual features of positives and suggests that all features are necessary. If particular examples are presented repeatedly and exclusively, the examples tend to be identified as individuals rather than members of a larger set based on sameness of features. If the operation is stipulated by being applied only to a narrow range of examples, the stipulation reduces the probability that the learner will apply the operation to examples outside the range. This logic is the same as that used for response strategy. The strategy must be specific enough to govern the various applications. It must be the sum of various features. If a particular set of response features always accompanies positive examples, the set of features judged to be necessary becomes extended over that implied by a nonstipulated presentation.
GESTALT PHENOMENA Gestalts are the results of both interpolation and extrapolation. Gestalts are produced when the system creates wholes from information about parts. The system that interpolates between positive examples and extrapolates past the demonstrated range of positive examples creates a gestalt model that accommodates the data points experienced and those that are logically consistent with these data points. When the learner receives information that spatial data points 1 and 2 are both to be treated the same way—as part of the same route, for instance—the system formulates a gestalt that connects the points and incorporates them in a larger unit. The route is the gestalt—the form that organizes the points into a qualitative whole. If the system receives information from a new data point that implies a change in the route to accommodate the new point, the form is changed so that the new whole is consistent both with the new data points and with the other
known points. This gestalt is consistent with relatively few known data points, but generates an indefinitely large number of other data points: all those that would fall on the projected route. Gestalts are the grist of all learning and performance and are necessary creations of the most primitive performance system. Some organization of discrete data points is needed for the system to perform even the simplest behavior that is subject to adjustment on the basis of subsequent data.

Gestalts That Conserve Features

The most basic problem facing the system is to identify and create features subject to continuous variation. Details of the setting change over time. The present is something of a node that moves through time. Without a form that creates continuity of features over time, there is no way for the system to direct or interpret continuous variation of details, which means that there could be no continuity of behavior. Even if a response requires only a few seconds, the system must maintain continuity during this period. If the system simply recorded the physical features of an object being pursued at Time 1, the representation would be relatively useless at Time 2 because movement occurred. Therefore, the details of the earlier representation are not the same. Not only would the object being pursued change, but there would also be no necessary guarantee that the system would be engaged in a particular pursuit without a form that interpreted the discrete temporal events as part of a pattern. The most fundamental requirement is that both the pursuit and object of the pursuit must be represented as enduring forms. This representation is only possible if the system identifies those essential or unique features of the object and pursuit that persist over time and uses them as a basis for creating a form that is conserved throughout continuous variation.
The representation that the system requires, therefore, functions as an abstract set of criteria about what will occur over time. In summary, if things change over time, the common features must be abstracted so they persist over time and are recognized as a host undergoing changes in its residential features. This kind of abstraction requires not only creating a gestalt that is consistent with the various data points, but also organizing them into an enduring host. The Gestalt Basis of Performance Content maps, plans, and projections are gestalts. They are enduring forms that identify a qualitative set of outcomes or changes that are to occur. They have the potential to generate data points that may be compared to ideal
data points of the gestalt. For example, the agent plans a straight-line route to visual target X. This design does not specify the route as something like 19 steps, but as a general map for achieving the goal of reaching target X. The learner could produce hundreds of variations of approaching X, each generating specific data points that are different from all other approaches and yet each consistent with the plan. The content map or response plan is an abstraction that addresses only specific dimensions (in this case, X, direction, movement). The system determines whether particular outcomes have occurred as planned. This requires projecting details that are then compared with the realized outcomes. Comparing two unlike sets of data—those projected and those embedded in current sensory receptions—requires both to be reduced to a common form, a gestalt that accommodates all the relevant data while ignoring all that are not relevant. The act of going from something more general to something more specific requires a transformation of some features and a conservation of other features. This type of organization is possible only if the various data points are coalesced into a gestalt or whole. Feedback requires comparisons of projected outcomes with realized outcomes, and therefore requires gestalts. For instance, if there are seven holes that must be avoided to create an approximation of a straight-line route, the specifications of the plan become more elaborate than the plan for a straight-line route; however, it still accommodates many different sets of data points. The Gestalt Basis for Learning In the prior performance examples, the system provides the content map (gestalt) and the agent produces the data points. For learning, the reverse orientation occurs. From concrete specific data points, the system creates a form that is consistent with the data points experienced and that accommodates an indefinitely large set of additional data points. 
For instance, encounters with specific nectar-rich flowers A and B generate the abstraction that nectar-rich flowers are red. This abstraction provides a guide that is (a) consistent with known data points, and (b) capable of including additional data points consistent with the form. The learner will not go to the same flower it just went to, but to others that have the feature. Interpolation and Extrapolation in Gestalts Whether the learning is a simple visual feature (red flowers) or a more complicated set of features (pressing the lever on the wall that is farthest from the food), gestalts are created through a combination of interpolated and
extrapolated data. The infant zebra recognizes the stripe pattern of its mother. If this knowledge were simply presented as known data points, it would consist of specific pictures that the infant has observed. If what the infant sees did not precisely match a picture, there would be no particular guidance. The system would not have a means of transforming the known data points into forms consistent with these points. When the mother zebra is observed at a distance, the pattern is greatly different from what is observed up close. Not only is the entire pattern smaller, it does not have the same perspective as the up-close variation. Perhaps the most obvious gestalt phenomenon would be the ability of the learner who sees only the rear end of its mother from a distance to identify it as mother. This act clearly involves creating a whole from partial data. The phenomenon provides evidence that the learner possesses what has to be an abstract map that accommodates various data points. Identifying the mother requires extrapolation from the data points given. If the mother were viewed so that only head and hindquarters were visible, the creation of the whole would require interpolation. If the baby has only encountered mom from the side or rear, interpolation and extrapolation would be required to identify the pattern of a three-quarters rear view. The extent to which the baby zebra builds and deploys such a model can be determined experimentally. If the baby does extrapolate or interpolate following controlled encounters with the mom (seeing her only in certain relative positions and being blindfolded at other times), the system performs the degree of abstraction needed to account for the observed behavior. In summary, gestalt phenomena are not limited to higher order cognitive manipulations; they are a basic necessity for planning responses and for the most elementary learning. 
Gestalts are not limited to learned phenomena, but are necessary functions for the most basic operations, such as approaching X. Gestalts are necessary for any plan because, by nature, a plan provides a functional representation of something not present but that must be extrapolated from present data. In one sense, the future always involves some form of extrapolation. To correct a deviation from the plan involves some form of interpolation. The plan provides the information about the desired outcome. The analysis of sensory data provides information about the current outcome. The difference between these data points implies a behavior that goes from the current to the planned. This assumes that there is continuous variation between the two.

Transformations as Gestalts

What we have referred to in earlier chapters as transformations assumes that what is transformed is some form of gestalt. If the object had only been observed in left profile, but is now in right profile, it may be identified, but
only if the system represents the original form and manipulates that representation so that it overlays the current sensory data. With this transformation of the representation, the system is able to compare the sensory data with the model to see whether a match exists. Temporal patterns are represented as forms as much as spatial patterns are. The cat positioned next to the computer observing the moving fish on the screen saver gives ample evidence of executing plans based on extrapolation of patterns. The cat clearly calculates where the fish will be when they go past the edge of the screen and exit the computer. They don’t exit the computer and the cat paws at the air, but the cat’s timing is exquisitely precise and shows that the cat represents the temporal pattern of the fish and extrapolates the pattern over time. If the timing of trying to paw the fish changes as a function of the speed of the fish, the cat uses the feature of rate as well as direction to determine when the fish will exit the computer.
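The cat's behavior is consistent with a simple linear extrapolation of position over time. The following is a minimal sketch of that calculation, assuming the fish moves at constant speed along one axis; the function `predicted_exit_time` and its parameters are hypothetical and make no claim about how the cat's system actually computes the result.

```python
def predicted_exit_time(t0, x0, t1, x1, edge_x):
    """Time at which a fish moving at constant speed would reach edge_x.

    (t0, x0) and (t1, x1) are two observed time/position pairs.
    Returns None if the fish is not moving toward the edge.
    """
    speed = (x1 - x0) / (t1 - t0)          # rate feature
    if speed == 0:
        return None
    t_exit = t1 + (edge_x - x1) / speed    # extrapolate past the last observation
    return t_exit if t_exit >= t1 else None


print(predicted_exit_time(0, 0, 1, 10, 40))  # slower fish
print(predicted_exit_time(0, 0, 1, 20, 40))  # faster fish exits sooner
```

If the predicted time changes with the observed speed, as it does here, the prediction uses rate as well as direction, which is the test the passage proposes for the cat.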
SUMMARY Any effective learning performance system must be designed so that it performs the operations of interpolation and extrapolation. The system receives information about single examples. From these disparate data points, the system must formulate rules about examples not encountered. Which of them will be positives and which will be negatives? The only possible way the system has of answering this question is to identify possible features that the future examples may share with the known examples. The system needs logical formats to formulate the basis for an example being either positive or negative. These formats focus on features—qualitative, discriminable attributes of the examples. These features must be abstracted from known examples, with the assumption that there are other examples that will have the various abstracted features. The problem facing the system is that each example has hundreds of features. To determine which of the features are relevant, the system must consider common features of more than one positive example, difference features between known positives and known negatives, and common features of negatives. Differences that would be created by transforming one known positive into another known positive describe the basis for the operation of interpolation. The system interpolates between positives by using the logic that any transformation that is a component of a transformation required to convert one positive into another positive results in a positive. If Difference D transforms positive Example A into positive Example B, any component of this transformation also creates a positive example.
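The interpolation format can be sketched for examples described by discrete features. This is a simplification under stated assumptions: each example is a dictionary with the same keys, Difference D is the set of features whose values differ between positive Example A and positive Example B, and a "component" of D is any subset of those feature changes. The feature names and the function `interpolated_positives` are hypothetical.

```python
from itertools import combinations


def interpolated_positives(a, b):
    """All examples produced by applying part of the a-to-b difference to a."""
    diff = [k for k in a if a[k] != b[k]]        # Difference D
    results = []
    for r in range(len(diff) + 1):
        for subset in combinations(diff, r):     # each component of D
            example = dict(a)
            for k in subset:
                example[k] = b[k]
            results.append(example)
    return results


A = {"color": "yellow_green", "size": "small", "position": "left"}
B = {"color": "blue_green", "size": "large", "position": "left"}
for example in interpolated_positives(A, B):
    print(example)
```

Features shared by A and B (here, position) are never changed, so every interpolated example keeps them. Interpolation along a continuous dimension, such as the greens between yellow-green and blue-green, would need intermediate values rather than subsets and is not covered by this sketch.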
Shared samenesses of all positives or samenesses of all negatives provide the basis for extrapolation. If all known positives have Feature X, all other examples that have Feature X could be positive through extrapolation. The system extrapolates the range of negatives by using a different format. In effect, it holds that if the transformation from a known positive to a known negative is D, any difference greater than D creates a negative. The difference between a known positive and known negative is an uncharted range that may be considered positive or negative. The greater the difference between positives and negatives, the greater the uncharted range. This range is necessarily interpreted differently by different learners. If we repeatedly present and reinforce the same set of positives, we change the pattern of both interpolation and extrapolation through stipulation. The repeated presentation implies that only those positive examples used in training are legitimate positives. Therefore, the emphasis migrates from shared characteristics to the sum of the specific individual features repeatedly encountered. Stipulation is a kind of negative or antigeneralization. The logical format for stipulation holds that if either the same positive or negative examples are presented repeatedly and exclusively, all features are assumed to be relevant, and each example must be represented more conservatively—as individuals. The operations of interpolation and extrapolation are used by the system to create what functions as gestalts—forms that imply a range of possible data points. Any representation of data that admits to transformation is functionally a gestalt because it involves creating a whole that requires both interpolation and extrapolation, and that is consistent with the given data points. Going from general to specific requires a gestalt because the data points provided by the specific manifestation must be generated by the gestalt. 
Going from specific to general requires a gestalt because the general is a form that involves only some dimensions of the data points given. The learner goes from general to specific when making plans that are consistent with the content map. The learner goes from specific to general when learning. Therefore, the learner creates gestalts and applies them as guides for creating concrete applications. Even when the learner produces a simple response that involves continuous variation, like scratching behind its ear, it generates a series of specific data points designed to correspond to those that had been planned. The plan serves not only as a model for the production of the response, but also as a basis for comparing possible deviations from the response with the intended gestalt. The system needs formats for interpolating and extrapolating data and formats for formulating and applying gestalts. These logical operations are needed for the most basic performance system. The system that is efficient
is also designed to extrapolate and interpolate as extensively as the data points permit. The reason is so that the learner will test the possibilities, thereby receiving more information about what predicts and what doesn’t than the system that interpolates and extrapolates conservatively.
PART III: EXTENDED LEARNING
Chapter 10: Individuals and Features
Chapters 5 through 9 addressed basic learning—primarily the learning of single discriminations or response strategies. For basic learning, the learned behavior is directly influenced by a primary reinforcer. Chapters 10 through 13 address learning configurations that are more complicated than those required for basic learning. Chapter 10 considers the systemic requirements placed on the system when the learning involves the interaction of various discriminations in a single context. The discriminations interact because they involve the same individuals or hosts. A given individual may be classified with specific other individuals in the population to express one discrimination and grouped with different individuals in the population to express a different discrimination. The multiple features of the individual are used to classify the individual in more than one group. One set of features is the basis for one classification; another set of features is the basis for another. The fact that multiple learning is possible has great implications for the manner in which information is organized and classified within the system.
FEATURES AND CLASSIFICATION

The learner encounters specific concrete objects and events. Each hosts an indefinitely large number of features. The learner will not come into contact with features, only the concrete objects and events in which the features reside. Yet if the system is to learn any relationship that involves these objects and events, the system must identify features. Unless the system
views the things that are encountered in two ways (as individual concrete things in the sensory present and as possessors of specific features), the system will not be able to learn multiple relationships.

Individuals as the Sum of Features

If the learner is to discriminate any two individuals, the discrimination must be based on a difference in features. One individual possesses one or more features not possessed by the other. If the only basis for distinguishing one individual from another is features, and if no two individuals in the group have the same set of features, the only way the learner could discriminate an individual from any other is to view the individual as the sum of features. Each of these features may be shared with some others in the group, but no other individual in the group has the same sum of features.

If the learner is to discriminate between Individual 1 and Individual 2 in the current setting, the learner must identify a basis in features for the discrimination: some feature not shared by both individuals, but that is unique to one of them. If the learner is to discriminate between Individual 1 and Individual 3, the learner must identify a different feature that is unique to each individual. If we continue to add other individuals that must be discriminated, we soon conclude that the learner must learn to identify Individual 1 as the sum of its features and that various features come into play as the learner discriminates Individual 1 from others in the group.

Populations as Conveyors of Features

We can illustrate the problem facing the learning system by creating a simplified population of individuals. Let’s say that within the population there are four feature categories: color, shape, height, and width. Within each category, there are two variations. The color variations are red and blue, the shape variations are bottle shape and box shape, the height variations are tall and short, and the width variations are wide and narrow.
All individuals in the population have four features. The individual is one of the colors, one of the shapes, one of the heights, and one of the widths. No two individuals in the population have the same set of features. Also all the differences are assumed to be salient. There is an obvious difference in height between the tall and short individuals and in width between the wide and narrow individuals. Table 10.1 shows a summary of the features using the letters A to H to refer to the eight feature variations. There are 16 individuals in the population. Each column in Table 10.1 shows the four features of each individual object. Object 1 is a red bottle shape that is tall and narrow. Object 2 is a blue bottle shape that is tall and narrow. Object 3 is a red box that is as tall as Objects 1 and 2 and as wide. Object 16 is a blue box that is tall and wider than Objects 1, 2, or 3. Table 10.1 shows the 16 individuals.

TABLE 10.1
Population of Four-Feature Individuals

Feature   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
Color     A   B   A   B   A   B   A   B   A   B   A   B   A   B   A   B
Shape     C   C   D   D   C   C   D   D   C   C   D   D   C   C   D   D
Height    E   E   E   E   F   F   F   F   F   F   F   F   E   E   E   E
Width     G   G   G   G   G   G   G   G   H   H   H   H   H   H   H   H

KEY: A = red, B = blue, C = bottle, D = box, E = tall, F = short, G = narrow, H = wide.

The population generates a number of possible classification criteria for groups of individuals, including the grouping of single individuals on the basis of the sum of their features. To discriminate Individual 1 from all other individuals in the set, the learner would have to learn that Individual 1 has the combination of features A–C–E–G. At the other extreme, if the learner is to discriminate individuals on the basis of redness, the learner would have to identify all eight individuals in the population that have Feature A. The same feature used to group all eight individuals is also used in the process of identifying Individual 1 as a unique individual not to be confused with Individual 2, which is the same as Individual 1 except that it is not red. If the learner identified all features of Individual 1 except for redness, 1 and 2 would be confused.

Within the population, there is a conservation of features and members. The more features used as the classification criterion, the fewer the members in the group. For a single feature, eight members are in the group. For a four-feature classification, only one member is in the group. If the classification criterion involves two features, such as red bottles, four members are in the group: the tall red bottles and the short red bottles, each in a narrow and a wide version.

Table 10.2 indicates that the population generates 80 different feature classifications. The first column indicates the number of features involved in the classifications. It is possible to classify on the basis of one, two, three, or four features. The second column indicates the number of possible classifications generated by the population. If we classify on the basis of a four-feature criterion, 16 different classifications are possible for the population. These correspond to the 16 individuals of the population.
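The counts summarized in Table 10.2 can be checked by enumeration. The following Python sketch is offered only as an illustration: the names CATEGORIES, population, and classification_counts are assumptions introduced here, and the order in which the individuals are generated does not follow the numbering of Table 10.1.

```python
from itertools import combinations, product

# Feature categories and their two variations, lettered as in the key of Table 10.1.
CATEGORIES = {
    "color": ("A", "B"),   # red, blue
    "shape": ("C", "D"),   # bottle, box
    "height": ("E", "F"),  # tall, short
    "width": ("G", "H"),   # narrow, wide
}

# Each individual takes one variation from every category: 2**4 = 16 individuals.
population = [frozenset(combo) for combo in product(*CATEGORIES.values())]


def classification_counts():
    """Map criterion size k -> (number of k-feature criteria, members per criterion)."""
    counts = {}
    names = list(CATEGORIES)
    for k in range(1, len(names) + 1):
        # A k-feature criterion picks k categories and one variation from each.
        criteria = [
            frozenset(values)
            for chosen in combinations(names, k)
            for values in product(*(CATEGORIES[c] for c in chosen))
        ]
        # Count the individuals that possess every feature of each criterion.
        group_sizes = {sum(crit <= ind for ind in population) for crit in criteria}
        (size,) = group_sizes  # all criteria of one size select equally many members
        counts[k] = (len(criteria), size)
    return counts


counts = classification_counts()
total = sum(number for number, _ in counts.values())
print(counts, total)
```

Under these assumptions, the sketch reproduces the conservation of features and members: each feature added to the criterion halves the number of members in the group.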
If we classify on the basis of three features, 32 different combinations are possible. The third column shows the number of individuals there are for each specific set of features. If we select a classification that involves three features, there are two members in each class. For example, individuals that are red, bottle shaped, and tall share three features, A–C–E. Two individuals in the population meet this criterion: numbers 1 and 13. These bottles are not identical because one is narrow and one is wide.

TABLE 10.2
Summary of Population Classifications for 16 Members

Number of Features for    Number of Possible    Number of Members for
Classifying Members       Classifications       Each Classification
4                         16                    1
3                         32                    2
2                         24                    4
1                          8                    8
Total classifications     80

Multiple-Feature Learning

The assumption is that there are learners that would be able to learn some, if not all, of the feature combinations generated by the population. By analyzing what is required for the learning of more than one discrimination involving the same objects, we are able to infer important features of how the learning-performance system classifies information about individuals and features they possess.

To show learning of any combination, we need to designate a behavior that signals knowledge of a specific class. For the sake of simplicity, we use different locations, 1 through 5. Each location calls for a different grouping of individuals. For instance, the learner is required to put the tall, red, narrow bottle (Individual 1) in Location 1. The primary task used during training is to place any or all members in a location that is highlighted. A location is highlighted by flashing a light above it. When a location is highlighted, the learner is to respond by placing any appropriate individuals in the location. If the location calls for four individuals, the response is not correct unless all four individuals that have the features for the location are moved to the location.

Requiring the learner to respond to the highlighted location is preferable to requiring a response to highlighted objects or set of objects. We are not as interested in identifying the extent to which the learner is able to learn a location for various objects as we are in identifying the extent to which the learner is able to scan the population, identify features, and select objects that have the features appropriate for a particular location.

We teach the learner five different classifications:

1. The tall, red, narrow bottle goes in Location 1.
   (one individual with four features)
2. The tall, blue, narrow bottle goes in Location 2.
   (one individual with four features)
3. The short, blue boxes go in Location 3.
   (two individuals with three features in common)
4. The narrow bottles go in Location 4.
   (four individuals with two features in common)
5. The red objects go in Location 5.
   (eight individuals with one feature in common)

Individual 1 is involved in three of these classifications. It is grouped on the basis of four features (Location 1), on the basis of two features (the narrow bottles for Location 4), and on the basis of being red (Location 5). In the population of 16, Member 1 is discriminated from Member 2 only on the basis of color, from Member 3 only on the basis of shape, from Member 5 only on the basis of height, and from Member 13 only on the basis of width. Unless Individual 1 is represented as something that has all four features, it will not be discriminated from those members that differ from it in only one feature. If two members that differ in only one feature are not confused, the only possible explanation is that the unique features of each were recorded and used as a basis for identifying the difference in individuals.

Dual Classification Requirements. For the learner to perform on this task, the system must classify the objects in two ways: as the sum of their features and by each discriminable feature. The task requires the learner to manipulate individual objects. Each object is the sum of its features. However, if the learner is to respond to the various groupings that are possible, the system must be able to search for single features or combinations of features. If the learner is able to learn the five-task classification required by the experiment, the learner’s classification system must classify the various individuals that are manipulated by the sum of their features and by each of the various features.

When the learner puts Individual 1 in the various locations that are highlighted, the learner is providing evidence that it recognizes Individual 1 as the sum of its features and that it may be grouped according to single features (such as red) or any combination of features. In other words, the learner would be able to learn any sample of classifications from the 80 that are possible for the population. This is not to say that the learner would be able to learn all of them.
The reason for not being able to learn all has nothing to do with the learner’s classification system, however, but with the memory power of the system.

Because different features of the individual are relevant to different interactions, the first classification that the system must make when encountering any individual that may be related to a primary reinforcer is the most specific identification that is possible or practical. If no reinforcing consequences are associated with the object, the system may represent some features of the object, but probably not to the extent that they would be represented if strong primary reinforcers were involved. This requirement applies to the most basic survival skills. If the hungry learner does not classify the animal as a pet or is in battle and does not recognize a combatant as a sibling, the learner will make unfortunate conclusions about which interactions are appropriate.

The classification of individuals as the sum of their features is necessary because the potential to learn is a strict function of the features that the system represents. The greater the number of features that the system identifies, the greater potential the system has for learning relationships or classifications that include the individual. The extent to which the system is not aware of features is the extent to which the learner is preempted from learning discriminations that involve those features. Therefore, the efficient system represents enough features of the individuals to permit the learning of various possible discriminations. The efficient learning system records many features of all concrete things demonstrated to be possible variables for learning and performance, not simply the features of things involved in the current pursuit.

Enduring Features for Identification

The general rule is that the object is identified only by its enduring features. Interactions that involve manipulations of an object are based on its nonenduring features. At any given time, the individual or object will have transitory features in addition to the enduring features. It will be in a particular position, place, and possibly engaged in a particular action. Nonenduring features are not relevant to the identification of a particular individual or object because what will be observed on another occasion is limited to those features that endure from one occasion to the next. Nonenduring features do not help identify the individual or object, although they may be important for any pursuit or interaction with the individual.

However, a nonenduring feature (such as a particular pattern of movement when the individual moves) may be treated as an enduring feature if it is consistently observed on various encounters with the individual or with all individuals of the same kind. Like any other feature, it may then be used as one of the features summed to create a representation of an individual. For instance, a bat may be recognized by its flight pattern when it is flying, but not when it is not flying. Therefore, the flight pattern is one of the bat’s features that is not enduring, but that permits discrimination of the bat.
Most nonenduring features do not identify individuals or groups. When the learner interacts with the set of 16 individuals, Individual 1 may be in any position relative to the other members in the group. If the learner had classified the position of Individual 1 on a particular trial as an enduring feature, the learner would be faced with the dilemma on the next trial—not being able to identify Individual 1 because no individual in the set has all of the features that Individual 1 had on the previous trial. This discrepancy describes the process by which the learner learns which features are enduring and which are ephemeral. At the same time, the nonenduring features are important for how the learner is to interact with the individual in the current setting. The learner is to put Individual 1 in Location 1. That involves a specific action that is contingent on the current location of Individual 1. The response that the learner
produces if Individual 1 is to the left of Location 1 is different from the action the learner performs if Individual 1 is on the right side of Location 1.

Features and Populations

The specificity of the content map required to identify individuals is a strict function of the features within the population. We could change the population so that Individual 1 was not different from all others in the population, but was a member of a specific group of five members. This is achieved simply by adding four more individuals that have the same features as Individual 1 (tall, narrow, red, and bottle shaped). Also we could change the properties of individuals by adding feature variables. For instance, we could add a series of members to the set that were heavier than the current corresponding members of the population. With this addition, we are able to create individuals that would have to be identified as the sum of five features, not four.

For the single-feature classification involving color (red objects go in Location 5), the system must use the color as the sole basis for assembling the individuals that go into the location. It might be argued that it would also be possible for the learner not to use a criterion of color for this location, but simply to remember the specific individuals assigned to this location. According to this argument, the learner does not refer to the color, but to the sum of the features for those individuals known to be in this location: Individuals 1, 3, 5, 7, 9, 11, 13, and 15. When the location is highlighted, the learner searches for these eight individuals.

Superficially, this seems to be a possibility. However, it isn’t. Regardless of how complicated the learner’s classification procedure is, in the end, color is the only basis for the classification in Location 5. The individuals are discriminated only by the features used in the classifications (and not by tiny marks and other identification features unique to various individuals).
If the learner does not identify the individuals that have the feature of redness, the learner could not identify the correct eight members that go in Location 5. The learner would confuse Individuals 5 and 6, for instance, because the only difference between these members is color. If Individual 5 is reliably classified in Location 5, the system must record the feature of redness for Individual 5. Although the system may classify each individual as the sum of the features, the only way it will be able to put the right individuals in Location 5 is to sort on the basis of redness. In summary, it is possible for the system to recognize each member as an individual. However, the system must also recognize that the individuals that go in a particular location share the features demanded by the location. To test whether the learner memorizes individuals according to unique features beyond the four controlled for by the design, we could add four
new red members to the set. If the learner places only the original eight members (1, 3, 5, 7, 9, 11, 13, 15) in Location 5, the learner is classifying on the basis of extraneous features (memorizing each individual). If the learner places only eight members in this set, some of which are new and some of which are original, the learner is classifying on the basis of color, but is following a more elaborate rule that only eight red items go in Location 5. If the learner places all the red members (both original and new ones) in Location 5, the learner is classifying strictly on the basis of a single color feature and the rule that all red members go in Location 5.

Representations by the Infrasystem

The various classification rules for the five locations imply how the system must organize the representation of the features:

1. The particular concrete objects encountered are identified as the sum of the features that may be variables. A feature would be a variable if it were the sole basis for distinguishing one individual from another. The system does not know exactly which features are variables, so the efficient system errs in the direction of representing those features that are possible variables as well as those that are demonstrated variables.
2. For each feature identified, the system has a category independent of the classification of individuals. If red is a feature, there is a category for red things including things that have nothing to do with the current pursuit.
3. The features are used singly or in combination to create the classification criteria for the various locations. When formulating a criterion for a particular location, the system refers to the single-feature representations. If more than one feature is required for a location, the single-feature criteria for all required features are specified. For Location 3, for instance, the system would have to scan for the separate features of short, blue, and box shape.
A two-step process would occur with the highlighting of a particular location. First, the content map would present the classification criterion for the location. The criterion would be of the form “location = features.” For Location 5, the criterion would be “Location 5 = feature red.” Second, the learner would scan the individuals for the feature or combination of features specified by the classification criterion. This process involves comparing each individual with the criterion to determine whether the individual possesses the feature or combination required for the location. The scanning for the multiple-feature process assumes that the system identifies each feature and does not amalgamate
more than one feature into a lump. This assumption is necessary because any approximation of a correct criterion is modified only through the addition or subtraction of single features. By applying the same two-step process to each discrimination, the system is able to learn various groupings that involve the same individual.
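The two-step process can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of how any actual learner is implemented: the names POPULATION, CONTENT_MAP, members_for, and locations_for are hypothetical, the feature letters follow the key of Table 10.1, and the criteria are the five classifications taught in the experiment.

```python
# Individuals numbered as in Table 10.1; each is represented as the sum of its features.
POPULATION = {
    1: "ACEG", 2: "BCEG", 3: "ADEG", 4: "BDEG",
    5: "ACFG", 6: "BCFG", 7: "ADFG", 8: "BDFG",
    9: "ACFH", 10: "BCFH", 11: "ADFH", 12: "BDFH",
    13: "ACEH", 14: "BCEH", 15: "ADEH", 16: "BDEH",
}

# Assumed content map: each location is paired with a set of single features.
CONTENT_MAP = {
    1: {"A", "C", "E", "G"},  # the tall, red, narrow bottle
    2: {"B", "C", "E", "G"},  # the tall, blue, narrow bottle
    3: {"B", "D", "F"},       # the short, blue boxes
    4: {"C", "G"},            # the narrow bottles
    5: {"A"},                 # the red objects
}


def members_for(location):
    """Step 1: look up the criterion. Step 2: scan every individual against it."""
    criterion = CONTENT_MAP[location]
    return [
        number
        for number, features in POPULATION.items()
        if criterion <= set(features)  # individual has every required feature
    ]


def locations_for(individual):
    """List the locations whose criteria the given individual satisfies."""
    return [loc for loc in CONTENT_MAP if individual in members_for(loc)]


for location in CONTENT_MAP:
    print(location, members_for(location))
print("Individual 1 belongs in locations", locations_for(1))
```

Because each criterion is stored as a set of separate features rather than as a lump, a criterion could be adjusted by adding or removing a single feature, which is the property the scanning process relies on.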
LEARNING CLASSIFICATIONS

Chapter 6 presented the scheme possible for learning single-feature discriminations and others that may be supported by a default content map. When the learning involves multiple features of the individuals, the system could not possess significant default content maps because the learning involves different possible combinations of features. Therefore, the system must be designed to use a more conservative process for drawing conclusions about which features are relevant to a particular discrimination.

Single-Feature Classifications

Table 10.3 (presented as Table 6.2 in chap. 6) indicates that if there were five features involved in a discrimination, the learner could identify which feature signals the primary reinforcer through a few comparisons of flowers. For the ideal system, the worst-case outcome would require only four comparisons. The reasons that such a small number of comparisons is required are that (a) the pursuit involves only one discriminative feature, and (b) the system has preknowledge of the possible features involved in the pursuit.

TABLE 10.3
Inferences Based on Single-Feature Comparisons

Flower   Positives   Negatives   Comparison
A        5           -           -
B        4           1           B vs. A rules out 1 feature
C        3           2           C vs. A rules out 2 features
D        2           3           D vs. A rules out 3 features
E        1           4           E vs. A rules out 4 features

Positives = number of features that are the same as features of A. Negatives = number of features not the same as features of A. Comparison = number of possible features eliminated by comparison with A.

For learning not prompted by default content maps and checklists of the possible features, the learner does not know before the fact how many features are involved or what those features are. Therefore, the minimum steps that the system would have to take to learn a single-feature discrimination are:

1. Identify the first positive example as the sum of its features.
2. Represent each feature singly.
3. Identify the next example as the sum of its features.
4. Represent each feature singly.
5. Compare the features of the first positive example with the features of the current example.
6. Draw inferences about the essential features of the positives.

For any discrimination in which all the positives share only a single feature, the negative examples would be useful in drawing possible inferences about the predictive features. For a single-feature pursuit, negative examples provide as much information as positives. If the first positive has five features, ABCDE, any features that are common to the positive and a negative identify features that are not essential to positives. For instance, if Features A, B, and C are shared by a positive and a negative, none of these features could account for why the positive example is positive. The only features that could be responsible for the positive status are the two features of the positive example not shared by the negative: D and E.

In the same way, any features shared by a positive and another positive confirm these features as possible essential features. If Features A, B, and C are common to the original positive and a comparison positive, all these features are retained as a possible basis for the example being positive.

In summary, if a single feature is involved, features shared by a positive example and a negative example are categorically ruled out as being essential features for the positive classification. Features shared by positive examples are retained as possibilities, not certainties. If there is a single feature that determines the positive examples, the system that confirms that combination A–B–C is shared by positives and not negatives has not identified the single feature, but has limited the choices to one of these three features.
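The inference rules for a single-feature pursuit can be sketched in Python. The sketch assumes that exactly one feature determines the positives; the function name narrow_candidates and the encoding of examples as strings of feature letters are assumptions made for illustration.

```python
def narrow_candidates(first_positive, comparisons):
    """Return the features of the first positive that could still be essential.

    comparisons is a sequence of (features, is_positive) pairs. The rules hold
    only under the assumption that a single feature determines the positives.
    """
    candidates = set(first_positive)
    for features, is_positive in comparisons:
        if is_positive:
            # Features shared with another positive are retained as possibilities.
            candidates &= set(features)
        else:
            # Features shared with a negative are categorically ruled out.
            candidates -= set(features)
    return candidates


# A negative that shares A, B, and C with the positive ABCDE leaves D and E.
print(sorted(narrow_candidates("ABCDE", [("ABC", False)])))
# A comparison positive that shares A, B, and C leaves A, B, and C.
print(sorted(narrow_candidates("ABCDE", [("ABC", True)])))
```

Each further comparison can only shrink the candidate set, so the system approaches the single essential feature as examples accumulate.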
If the system follows the strategy of testing any of these singly or in combination, the system will discover which feature is relevant through the comparison process.

Multiple-Feature Classifications

The problem facing the system that does not follow a default content map geared to single-feature pursuits is that the system does not know whether the positives are to be classified on the basis of a single feature or multiple features. For the population of 16 presented in this chapter, the learner
must start by learning one of the five discriminations without knowing either which one it is or the number of features involved in the pursuit. This uncertainty creates a difference in the logic that underpins the pursuit.

We can illustrate the problem with a two-feature discrimination criterion. If the first comparison example is negative, it is not possible for the system to eliminate the features shared by the positive and the negative. The reason is that the system does not know whether one feature or two features are required for the positives. Let’s say the positive has five features, ABCDE, and the negative shares features ABC. If the discrimination involves either one feature or two, the negative example could be negative for three possible reasons:

1. The positives are based on a single feature not present in the negative.
2. The positives are based on two features, and one of them is present in the negative.
3. The positives are based on two features, and neither of them is present in the negative.

These possibilities mean that the system is not close to making a determination of what the discrimination is. With no more information than the system currently has, the discrimination could involve any of these features or combinations: D, E, DE, AD, AE, BD, BE, CD, CE. For a classification of positives based on three features, the possibilities increase. The discrimination (based on the same positive with features ABCDE and negative with features ABC) could be any of these three-feature combinations: ADE, BDE, CDE, ABD, ABE, ACD, ACE, BCD, BCE.

Clearly, the system that learns multiple-feature discriminations needs a strategy that relies far more heavily on positive examples than on negative examples; however, negatives still play a potentially important role. Negatives are useful if there are single-feature differences between positive and negative examples.
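The possibilities left open by a single positive (ABCDE) and a single negative (ABC) can be enumerated mechanically. The Python sketch below assumes that a criterion is a set of features that every positive must possess in full, so a criterion remains possible if the positive has all of its features and the negative lacks at least one of them. The function name consistent_criteria is an assumption made for illustration.

```python
from itertools import combinations


def consistent_criteria(positive, negative, sizes):
    """List the feature combinations of the given sizes that fit both examples."""
    pos, neg = set(positive), set(negative)
    result = []
    for k in sizes:
        for combo in combinations(sorted(pos), k):
            # The negative must fail the criterion by lacking at least one feature.
            if not set(combo) <= neg:
                result.append("".join(combo))
    return result


one_or_two = consistent_criteria("ABCDE", "ABC", sizes=(1, 2))
three = consistent_criteria("ABCDE", "ABC", sizes=(3,))
print(one_or_two)
print(three)
```

Under this assumption, the enumeration yields nine possibilities for one or two features and nine three-feature combinations (every three-feature subset of ABCDE except ABC itself, so ACD and BCD are among them), which shows how little a single negative settles when the number of features in the criterion is unknown.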
If the positive and negative examples differ by only one feature, that feature is essential for the positive example. If the positive and negative differ by more than one feature, however, there are only possibilities, and the system cannot identify which feature or combination describes the criterion for the positives. So the ultimate, and possibly only, efficient strategy that the system could possess would be to respond conservatively to negatives. The efficient system would incorporate three strategies:

1. If the essential feature cannot be clearly identified by the available data, the system would weaken those features shared by the positive and negative examples, but it would not disconfirm them until further data are obtained. The weakening would simply alert the system to the fact that, on one or more occasions, each of the weakened features had been observed in the negative.
2. The system would eliminate all features unique to the negative examples. If Feature T occurred in the negative and not in the positive, this feature could not possibly play a role in the positive identification because it is not in the positive example.
3. The system would rule out all features shared by only some, but not all, positives. If two positives shared five features and differed in three features, the features not shared are disconfirmed because they are not common to all positives. By retaining only those features that are shared, the system would be able to identify the essential features of the positives through subsequent encounters.

Comparison Strategies

What the learner learns is a strict function of the range in variation of features that the learner encounters. If the learner encounters only a narrow range of the actual feature variation, the learner will learn a too-specific relationship. If the learner encounters the full range of feature variation, the learner will formulate the set of features that is appropriate for the population. Let’s say the learner encounters the following three examples that do not provide an indication of the range of feature variation that occurs in the larger population.

Features of Positives 1, 2, and 3 (number of the three positives in which each feature appears)

Feature      A   B   C   D   E   F   K   L   P
Positives    3   3   3   3   2   3   2   1   1
According to the rules for eliminating features, the features E, K, P, and L are disconfirmed because they do not occur in all positives. The resulting set of features common to all positives is A–B–C–D–F. This formulation is consistent with the learner’s experiences, but not with the full population of examples. Another learner encounters the following three examples from the same population.

Features of Positives 1, 2, and 3 (number of the three positives in which each feature appears)

Feature      A   B   C   D   E   F   G   H   K   L   M   P
Positives    3   3   1   3   2   2   1   1   1   1   1   1
The rule that the system derives from this information is that Features C, E, F, G, H, K, L, M, and P are disconfirmed and the positives share Features A, B, D. The examples illustrate that it is quite possible for two learners exposed to examples from the same population to derive different rules.

In some cases, the system could extend the analysis if it compared a series of negatives with a positive, but this process is dangerous because it leads to false conclusions unless there is sufficient information about the range of positive variation. Here is a possible series of examples.

Positive   A   C   E   G
Negative   A   D   E   H
Negative   A   C   H   F
Negative   B   C   E   G
The first example is positive and the rest are negative. The analysis, therefore, would eliminate Features D and H because they are unique to the second example, which is negative. The system would also eliminate Feature F in the third example and B in the fourth for the same reason. The analysis would conclude that the positive example must have Features ACEG because these are the only ones present in the positive and not exclusive to the negatives. This conclusion is hasty because the set of examples does not show any negatives that differ from positives by only a single feature.

Far more information is presented if there are more positives. If the first three examples are positive and only the last is negative, Feature B could be ruled out because it is unique to the negative. The common feature of the positives is A. Positive ACEG and negative BCEG differ only in the A. Therefore, the only possible basis for BCEG being negative is the absence of Feature A. Even if the negative were not presented, the system would be able to infer that A is the basis for positive classification. It is the only feature common to all three positives.

Positive   A   C   E   G
Positive   A   D   E   H
Positive   A   C   H   F
Negative   B   C   E   G
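The analysis of the series with three positives and one negative can be sketched in Python. Several things here are assumptions rather than part of the original account: the function name analyze, the encoding of examples as strings of feature letters, and the reading of the second and third examples as ADEH and ACHF (the text states only the first example, ACEG, and the last, BCEG, explicitly). The sketch applies the three conservative strategies and also marks a feature as essential when a positive and a negative differ in only that feature.

```python
def analyze(examples):
    """Apply the conservative comparison strategies to (features, is_positive) pairs."""
    positives = [set(f) for f, is_pos in examples if is_pos]
    negatives = [set(f) for f, is_pos in examples if not is_pos]
    if not positives:
        raise ValueError("at least one positive example is required")

    # Strategy 3: retain only the features shared by all positives.
    common = set.intersection(*positives)
    seen_in_positives = set.union(*positives)
    seen_in_negatives = set().union(*negatives)

    # Strategy 2: features that occur only in negatives play no role in positives.
    unique_to_negatives = seen_in_negatives - seen_in_positives

    # A negative that lacks exactly one feature of a positive shows it is essential.
    essential = set()
    for pos in positives:
        for neg in negatives:
            missing = pos - neg
            if len(missing) == 1:
                essential |= missing

    # Strategy 1: weaken, but do not disconfirm, retained features seen in a negative.
    weakened = (common & seen_in_negatives) - essential

    return {
        "common": common,
        "unique_to_negatives": unique_to_negatives,
        "essential": essential,
        "weakened": weakened,
    }


# Assumed reading of the series: three positives followed by one negative.
series = [("ACEG", True), ("ADEH", True), ("ACHF", True), ("BCEG", False)]
result = analyze(series)
print(result)
```

With this reading, the only feature common to the positives is A, Feature B is unique to the negative, and the single-feature difference between ACEG and BCEG marks A as essential, in agreement with the discussion above.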
If the first three examples are negatives and the last example is the only positive, the analysis of negatives fails to generate much useful information.

Negative   A   C   E   G
Negative   A   D   E   H
Negative   A   C   H   F
Positive   B   C   E   G
The system could rule out A, D, F, and H as possible variables on the basis that they are unique to the negative examples. The result is that none of the features of the positive (BCEG) have been ruled out and only one feature has been confirmed: B. (It is the only difference between the first and last example; therefore, B is essential to the positive classification.)

An implication of the preceding multiple-feature analysis is that if specific learning is required, a specific set of training examples is needed to present the information the learner needs to draw the appropriate conclusions about the content being taught. The set of training examples must be a microcosm of the population the learner will encounter. This requirement has nothing to do with the frequency of the occurrences of the various examples in the larger population. It is based on the fact that the learner needs information to deal with the full range of examples to be encountered regardless of their frequency.

Differences Between Temporally Ordered and Simultaneous Examples

There are two general ways of presenting training examples: presentations in which the examples are presented in an order not determined by the learner, and presentations in which the order of examples is determined by the learner. For the first type, a temporally ordered sequence of examples is implied. The design presents one example, then the next, and so forth. The learner responds to each. For the second design, the various examples must be present at the same time. The learner selects the one it will respond to first and next. Therefore, the broad design options describe temporally ordered and simultaneously presented examples. If the learner is presented with various examples and selects some to put in Location 5, the examples are simultaneous. If the learner is presented with one example that the learner places either in Location 5 or in another place, the examples are temporally ordered.
Dynamic Presentations. The most articulate possible presentation is a dynamic presentation that involves single-feature differences. Dynamic presentations that temporally sequence single-feature changes are artificial products of instruction created to maximize the clarity of the features that control the positive classification. For the dynamic presentation, there would be continuous change from one example to the next. Each change would involve a single feature. If the change results in a negative example becoming positive, the change absolutely describes a feature that is essential to positive classification. The presentation is the most articulate possible because it is designed to exploit what the system does: attend to changes. The dynamic presentation of single-feature changes retains all the features of the preceding example but one. Because only one feature changes, its function is made obvious to the learner’s system.

The difference between naturally occurring, temporally ordered events and those in the dynamic sequence of temporally ordered examples is that, in the natural setting, the learner probably would not encounter examples that changed in ways that provided specific information about the role of the various features of the examples. The examples or events would not necessarily occur in the same place, would not immediately follow the preceding example, and would not necessarily differ from the preceding example in only one feature.

The examples presented in the earlier chapters for the various antecedent-learning applications involved temporal presentations. For example, the trials on which the learner encounters eggs may be days apart. Therefore, comparisons required a permanent representation in memory. Also, for comparisons to be performed, precise information about features had to be retained by the system at the time the features occurred.

The 16-member population may be set up so that all members are present for each trial. This is a simultaneous presentation. The only differences from trial to trial are the positions of the various individuals and the particular location illuminated. There are three advantages that the simultaneous presentation has over naturally occurring, temporally ordered examples of multiple-feature discriminations:

1. The simultaneous presentation permits direct comparisons of examples.
2. Although the comparisons that a learner performs will occur one at a time in sequence, both the examples that are compared and the order of the comparison are determined by the agent, not by requirements of the learning task.
3. The simultaneous presentation permits the physical grouping of examples to show how they are the same.
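The logic of the dynamic presentation described earlier in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hidden rule (tall red bottle), the starting example, the order of changes, and the helper names `classify` and `essential_from_changes` are all hypothetical and do not come from the text. Each step changes exactly one feature; whenever the classification flips, the changed feature is recorded as essential.

```python
# Hypothetical sketch: a dynamic sequence changes one feature per step.
# A flip in classification identifies the changed feature as essential.

RULE = {"color": "red", "shape": "bottle", "height": "tall"}  # assumed hidden rule


def classify(example, rule=RULE):
    """Return True if the example has every feature value the rule requires."""
    return all(example[dim] == value for dim, value in rule.items())


def essential_from_changes(start, changes, rule=RULE):
    """Apply single-feature changes in order and collect the dimensions
    whose change flipped the classification."""
    current = dict(start)
    essential = []
    for dim, new_value in changes:
        before = classify(current, rule)
        current[dim] = new_value  # exactly one feature changes per step
        after = classify(current, rule)
        if before != after and dim not in essential:
            essential.append(dim)
    return essential


start = {"color": "blue", "shape": "box", "height": "short", "width": "narrow"}
changes = [
    ("color", "red"),     # still negative
    ("shape", "bottle"),  # still negative
    ("height", "tall"),   # negative becomes positive
    ("width", "wide"),    # stays positive, so width is not shown to matter
    ("color", "blue"),    # positive becomes negative
    ("color", "red"),     # positive again
    ("shape", "box"),     # positive becomes negative
]
found = essential_from_changes(start, changes)
print(found)
```

In this sketch the width change never alters the classification, so width is never flagged, which mirrors the point that a single-feature change makes the function of that feature obvious.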
With the simultaneous presentation of examples, direct comparisons are possible. If two individuals in the population of 16 are next to each other, the color, shape, height, and width of one may be compared with those of the other. The simultaneous presentation permits more accurate comparisons because they are direct and do not involve a representation of one of the individuals. The learner will necessarily make comparisons one at a time; however, neither the individuals involved in a given comparison nor the order in which the agent compares various individuals is restricted as it would be by the presentation of temporally ordered examples.

The learner must discover what is the same about the various positive examples for a particular location. If all the examples are present at the same time, we are able to create a model that shows the features of the examples required for a given location. For example, we could place three examples of red in Location 5. If we choose the examples carefully, the set of demonstration examples will provide the learner with sufficient information about the specific feature required for this location. There is no close parallel for this type of prompting potential with naturally occurring, temporally ordered examples.

Patterns of Temporally Ordered Positives and Negatives. Part of the problem of transmitting adequate information to the learner through temporally ordered examples is that the sequence of positive and negative examples must be designed so that the presentation is consistent with only one possible interpretation. The means by which this is achieved is discussed broadly in chapter 11. The point we make here is simply that both features of the examples and the order of the examples make a difference in the communication potential of temporally ordered sequences. A sequence may fail because it does not show the full range of positives, the full range of negatives, or the minimum differences between positives and negatives.

Table 10.4 shows that the set of examples appropriate for one discrimination may not be well designed for others. The same set of eight examples is used to teach three different discriminations: A (red), AC (red bottle shape), and ACE (tall red bottle shape). The sequence is adequate for providing information about Discrimination A, but not adequate for the other two.

TABLE 10.4
Eight Temporally Ordered Examples

Example  Color  Shape  Height  Width  A    AC   ACE
1        A      C      E       G      +    +    +
2        B      C      E       G      -    -    -
3        A      D      F       G      +    -    -
4        B      D      F       G      -    -    -
5        A      C      F       H      +    +    -
6        B      C      F       H      -    -    -
7        A      D      E       H      +    -    -
8        B      C      E       H      -    -    -

Columns A, AC, and ACE give the classification of each example for showing single feature A, multiple features AC, and multiple features ACE.
KEY: A = red, B = blue, C = bottle, D = box, E = tall, F = short, G = narrow, H = wide.

The order of examples is not particularly good for teaching Discrimination A because it alternates between positives and negatives. The learner could learn the correct response by attending only to the order of the examples: every other one is positive. The sequence, however, presents an adequate number of positives and permits immediate comparisons of positives and negatives that differ by only a single feature. Therefore, the sequence provides information about the feature that is shared by all positives for Discrimination A.

For the discrimination of red bottles (Features AC), the sequence generates only two positives. They suggest the range of positive variation. Example 1 presents AC in the context of Features EG. Example 5 presents AC in the context of Features FH. The only shared features of both positives are AC; therefore, the sequence provides information that identifies the features of the positives. The sequence is weak, however, because it presents only two positives, and the second is separated from the first by three negatives. Not until the fifth example does the sequence provide a positive that shows AC in the context of FH.

The sequence is even weaker for demonstrating the necessary features for ACE. Only one example is positive, and the sequence does not adequately show the basis for that example being positive. For the sequence to show the combination of Features ACE, more positives and different negatives would be needed.

Table 10.5 provides a sequence that would be well designed to show the range of positive variation for ACE and the minimum-difference negatives. (Again the order of examples is poor because it alternates between positive and negative, but the information provided by the set of examples is adequate.) Note that the sequence would not be well designed for the single-feature discrimination (Feature A). The sequence would show the range of positive variation, but would provide only two negatives (Examples 2 and 6). Because the set has only two different positive examples (ACEG, ACEH), each is repeated.
However, the juxtapositions show single-feature differences that confirm the necessity of features. For instance, the only difference between Example 1 (+) and Example 2 (-) is color. Therefore, the color of the positive is an essential feature.

TABLE 10.5
Eight Examples for Three-Feature Discrimination ACE

Example  Color  Shape  Height  Width  ACE
1        A      C      E       G      +
2        B      C      E       G      -
3        A      C      E       H      +
4        A      C      F       H      -
5        A      C      E       G      +
6        B      C      F       G      -
7        A      C      E       H      +
8        B      D      E       H      -

Column ACE gives the classification of each example for showing multiple features ACE.
KEY: A = red, B = blue, C = bottle, D = box, E = tall, F = short, G = narrow, H = wide.
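The assessment of the two sequences above can be checked mechanically. The following Python sketch is offered only as an illustration: the four-letter encoding of each example follows the key of Tables 10.4 and 10.5, but the function names and the idea of scanning adjacent pairs are assumptions introduced here. It lists the positives each discrimination receives in Table 10.4, the letters shared by those positives, and the adjacent pairs in a sequence that differ in exactly one feature and in classification.

```python
# Letters follow the key of the tables:
# A red, B blue, C bottle, D box, E tall, F short, G narrow, H wide.
TABLE_10_4 = ["ACEG", "BCEG", "ADFG", "BDFG", "ACFH", "BCFH", "ADEH", "BCEH"]
TABLE_10_5 = ["ACEG", "BCEG", "ACEH", "ACFH", "ACEG", "BCFG", "ACEH", "BDEH"]
DIMENSIONS = ("color", "shape", "height", "width")


def is_positive(example, rule):
    """An example is positive if it contains every letter of the rule."""
    return set(rule) <= set(example)


def positives(sequence, rule):
    """1-based positions of the positive examples in the sequence."""
    return [n for n, ex in enumerate(sequence, start=1) if is_positive(ex, rule)]


def shared_by_positives(sequence, rule):
    """Letters common to every positive example in the sequence."""
    sets = [set(ex) for ex in sequence if is_positive(ex, rule)]
    return "".join(sorted(set.intersection(*sets))) if sets else ""


def single_feature_flips(sequence, rule):
    """Adjacent pairs that differ in one dimension and in classification."""
    flips = []
    for n in range(len(sequence) - 1):
        a, b = sequence[n], sequence[n + 1]
        diff = [DIMENSIONS[i] for i in range(len(DIMENSIONS)) if a[i] != b[i]]
        if len(diff) == 1 and is_positive(a, rule) != is_positive(b, rule):
            flips.append((n + 1, n + 2, diff[0]))
    return flips


for rule in ("A", "AC", "ACE"):
    print(rule, positives(TABLE_10_4, rule), shared_by_positives(TABLE_10_4, rule))
print(single_feature_flips(TABLE_10_5, "ACE"))
```

With a single positive, as for ACE in Table 10.4, every letter of that example is trivially "shared," which is one way to see why the sequence does not show the basis for the example being positive.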
STRATEGIES FOR LEARNING SINGLE- AND MULTIPLE-FEATURE DISCRIMINATIONS

Trial content maps for multiple-feature discriminations generate surprising numbers. Let’s say that the learner fails to identify all four features of Individual 1 in the original population (shown again in Table 10.6). If the learner identifies only one, two, or three features of the example, the learner will overgeneralize the rule for positives. The learner will correctly identify the positive Individual 1, but will incorrectly identify some negatives as positives. For instance, if the learner identifies a two-feature criterion by recognizing that positive Individual 1 is a red bottle shape, the learner will always identify Individual 1 as a positive example. The learner, however, will also identify three other examples as positives, although they are negatives (Examples 5, 9, and 13). If the learner identifies three of the four features (tall, red bottle), the learner will identify Individual 1 as a positive and one other negative as a positive (Example 13, the tall, wide, red bottle).

In other words, if the learner attends to only two of the four features, the learner will correctly classify 13 of the 16 individuals in the population as positives or negatives. The learner correctly identifies the true positive and incorrectly identifies three of the negative examples as positives. If the learner attends to three of the four features, the learner will correctly identify 15 of 16 individuals. This same pattern holds for any number of features. If the learner attends to only one feature, the learner will identify more than half of the population correctly regardless of the number of features that uniquely describe the positive examples. Furthermore, the learner will correctly identify all true positives as positives. As the learner identifies a larger number of the features, the number of misidentified negatives decreases.
TABLE 10.6
Four-Feature Discrimination

Example  Color  Shape  Height  Width  Classification
1        A      C      E       G      +
2        B      C      E       G      -
3        A      D      E       G      -
4        B      D      E       G      -
5        A      C      F       G      -
6        B      C      F       G      -
7        A      D      F       G      -
8        B      D      F       G      -
9        A      C      F       H      -
10       B      C      F       H      -
11       A      D      F       H      -
12       B      D      F       H      -
13       A      C      E       H      -
14       B      C      E       H      -
15       A      D      E       H      -
16       B      D      E       H      -

KEY: A = red, B = blue, C = bottle, D = box, E = tall, F = short, G = narrow, H = wide.

Implications for Learning

The relationship of features learned to features required for a perfectly accurate rule suggests strategies the system uses to process information about the features of positives and negatives. As pointed out earlier, the negatives are only useful if they differ from positives by a single feature. Also, the learner needs to come in contact with a broad range of positives so that the features essential to all positives may be identified. The learning of one or more features not only permits the learner to respond correctly to all positives, but also generates negative examples that differ from the positives in only a single feature.

For the population of 16, the learner places every positive in the correct station. All mistakes that the learner makes are created by putting negatives in the station as well. For instance, if the learner identifies red bottle shape as the rule for Individual 1 (rather than red bottle shape that is tall and narrow), the learner places all four red bottle shapes in Location 1. The boxed examples in Fig. 10.1 show the learner’s selections. Two of the negatives (5 and 13) differ from the positive by one feature, and the other (9) differs by two features. The red bottle that is short and narrow (5) differs from the positive by a single feature: tallness. Therefore, the tallness of the positive is an essential feature.

Let’s say the learner receives immediate feedback on whether the individual placed in a location is correct. If the learner first places Individual 1 in the location, then the short red narrow bottle (5), the learner receives feedback that the short bottle is negative. The learner is in a position to compare the single-feature difference between the examples and identify tallness as an essential feature. The learner now has a rule based on three features: tall, red, and bottle. With this addition of tallness to the content map rule, the only false positive the learner will identify is a red bottle that is tall and wide (13). This negative differs from the true positive by only one feature: width. If the learner places individuals in Location 1 and receives feedback that the wide example is negative, the feedback implies that the feature of narrowness is essential to the positive classification. These juxtapositions occur naturally if the learner receives differential feedback for each example.

In summary, by correctly identifying one feature of a four-feature positive, the learner responds correctly to over half the members of the population. By identifying an additional feature, the learner responds correctly to all but three examples. Two of them differ from the positive by only a single feature. If the third feature is identified, only one false positive remains, and it differs from the positive by a single feature, so the necessity of that feature is demonstrated.
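The counts in the summary above, and the way feedback on false positives fills in the missing features, can be reproduced with a short Python sketch. It is a minimal illustration, not a model of the system: the names `count_correct` and `refine` are hypothetical, the population is generated in an arbitrary order that does not follow the example numbering of the table, and the refinement step simply assumes that the learner compares each false positive with the true positive.

```python
from itertools import product

# The four two-valued dimensions of the 16-member population.
DIMENSIONS = {
    "color": ("red", "blue"),
    "shape": ("bottle", "box"),
    "height": ("tall", "short"),
    "width": ("narrow", "wide"),
}
POPULATION = [dict(zip(DIMENSIONS, values)) for values in product(*DIMENSIONS.values())]
TARGET = {"color": "red", "shape": "bottle", "height": "tall", "width": "narrow"}


def matches(individual, rule):
    """True if the individual has every feature value named in the rule."""
    return all(individual[dim] == value for dim, value in rule.items())


def count_correct(rule):
    """Individuals that the partial rule classifies the same way as TARGET."""
    return sum(matches(ind, rule) == matches(ind, TARGET) for ind in POPULATION)


def refine(rule):
    """Add one feature at a time, using false positives that differ from the
    true positive in exactly one feature."""
    rule = dict(rule)
    history = [count_correct(rule)]
    order = list(DIMENSIONS)
    while True:
        false_positives = [
            ind for ind in POPULATION if matches(ind, rule) and not matches(ind, TARGET)
        ]
        diffs = [[d for d in order if ind[d] != TARGET[d]] for ind in false_positives]
        single = [diff[0] for diff in diffs if len(diff) == 1]
        if not single:
            return rule, history
        dim = min(single, key=order.index)  # arbitrary but fixed choice
        rule[dim] = TARGET[dim]
        history.append(count_correct(rule))


partial_rules = [
    {"color": "red"},
    {"color": "red", "shape": "bottle"},
    {"color": "red", "shape": "bottle", "height": "tall"},
    TARGET,
]
print([count_correct(rule) for rule in partial_rules])
print(refine({"color": "red", "shape": "bottle"}))
```

Starting from the two-feature rule (red bottle), the sketch adds tallness and then narrowness, the same order followed in the discussion above; a different tie-breaking choice would add them in the other order and reach the same final rule.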
FIG. 10.1. Examples selected for red bottle rule.

Infrasystem Strategy

The facts about how populations and features interact suggest the efficient learning format that would be a hardwired attribute of any effective system. The format is the same one discussed in chapter 8 for learning possible schedules of intermittent reinforcement. The learning of features of the population of 16 and the learning of schedules of intermittent reinforcement are the same in that both require learning multiple features of the positive examples. The basic strategy involves three steps:

1. Identify a feature that predicts reinforcement even if it occurs on only some of the trials.
2. Add possible features to the criterion: shared features of positives and single-feature differences between positives and negatives.
3. Correlate the addition of each feature with the change in percentage of reinforced responses.

If the system identifies a feature that leads to reinforcement, the system retains the feature even if it leads to reinforcement in only about one of four trials. The system identifies shared features of positives and adds a second feature to the rule for the content map. The second feature either increases the percentage of reinforced trials or decreases it. A decrease means that the feature is inappropriate, whereas an increase means it is appropriate and is to be retained.

Illusion of General-to-Specific Learning. The successive approximations created by adding features to achieve a multiple-feature discrimination account for the illusion that learning proceeds from general to specific. The traditional notion is that the learner starts with a vague amorphous conception of the positive examples and, with further exposures, creates a more focused, articulate conception. This notion of becoming more specific is certainly correct in function. However, rather than formulating an amorphous whole, the learner is adding specific features to create the final conception.

The Infrasystem’s Classification Scheme. The properties of features have implications for the infrasystem. The infrasystem must be able to abstract and identify features that are independent of various examples. The system would have records of individual events, which are the sum of their features, and of each feature. The agent would be able to access the infrasystem’s classification of individuals because the responses that the agent produces target individuals. The single-feature representations that the infrasystem uses would be strictly in the domain of the infrasystem. The agent is able to identify single features of individuals. However, the agent would not be able to abstract, plan, or basically be aware of single features apart from individuals.
This does not mean that the agent is unaware of the presence or absence of features. The restriction is only that the features must be attached to an individual. The agent would certainly be able to identify an individual that is red or tall. The agent, however, would not be able to represent red as something that occurs independently of individuals. In contrast, the infrasystem represents single features in a way that makes them completely independent of any individuals, which means that the representation is completely abstract. The infrasystem also has a categorization for individuals, which the agent may access. The infrasystem’s exclusive domain, however, is the classification of single, abstract features. The limitation of the agent’s role to representing individual, concrete objects is not a logical necessity, but a practical consideration. Nothing the
agent does implies the need for access to the single-feature representations. The agent operates in the context of the concrete present. As noted earlier, the agent never encounters red in the abstract, only red in individuals. All the features that the agent deals with are in hosts. So the necessary capacity of the agent is to represent and attend to the features of hosts. For the infrasystem, the categorization of single, abstract features is essential. If the system is to project single-feature changes, the infrasystem has to be capable of engineering these changes. To do this, the system would have to apply the abstract feature to a particular form or host. The inability of the agent to represent single, abstract features holds for humans as well as simpler organisms. This inability is demonstrated by doing the following tasks. First, think of two bottles that are the same height, but one is wide and one is narrow. You probably had no trouble with this task, which means that your infrasystem applies abstract knowledge of bottle shape to create concrete, specific individuals. Next, think of a generalized bottle shape that would accommodate both bottles you imagined and any other examples of the “same shape.” Do not think of a series of bottles. Think of the bottle-shape feature that is independent of any specific example. You cannot do this task because it is impossible for you to represent any feature without representing it as a feature of a particular object. Whatever you imagined has specific point-and-space properties. It was actually a specific form even if you blurred the edges and created a range of shape. You may have represented the bottle shape by eliminating the top and bottom of a bottle and representing only the sides. This is certainly an abstraction that presents the quiddity of the bottle shape; however, it is a representation of an individual, not abstract bottle shape. 
If your representation had a particular spatial orientation, possibly with the neck at the top, it does not meet the needs of the system, which could identify the bottle when tilted, on its side, or upside down. Your representation was not of the general bottle shape, but of bottle shape, plus position, size, and possible other features. The need of the infrasystem is to abstract the feature so that it is void of shape, position, or any other feature. You would have no intuitive sense of this feature category because it has no counterpart in anything you have experienced. However, it is necessary for the classifications the infrasystem performs. For the next task, form the mental image of a bridge. This is something you can do easily. The bridge that you represented, however, was a specific bridge, which means that it had many details that were not specified by the direction. It was a particular type of bridge, possibly over a river or stream, possibly over an expanse, but it was a concrete singular individual that meets the criterion of being a bridge. Now make your bridge longer. This direction calls for a transformation that preserves some of the features of the original, but modifies others.
Again this transformation is not a problem for the system because something specific changes—the length of the bridge. You may have discarded the original bridge and replaced it with a longer one. In any case, you have created a concrete singular object that has a pair of features—bridgeness and relative length. Length, like the bottle shape, may be harnessed to any host in your repertoire—horse, hose, or hearse. Next, add a concrete rail on either side of the bridge, with concrete posts in the shape of milk bottles. This direction is something that your system can follow easily. To comply, your system creates another transformation that involves multiple-feature representations—the shape of the various posts and the organization of them into a form that functions as a rail. You now have a representation of a singular, concrete object that is consistent with three main features—bridge, long, concrete rail. Now color your bridge blue. This, too, is a single-feature transformation that is not difficult for the system because it involves changing a single feature of an object. The system retains the object and changes the color. All the shape features of the bridge remain unchanged, and the surface becomes blue. If your infrasystem could not represent blue apart from any concrete application, there is no way that your agent could affix the blueness to the bridge. The system had no foreknowledge that it would be required to do this, and it probably had never experienced (either in reality or in imagination) the particular bridge that you just created. Note that you probably did not make everything in the scene blue, only the bridge. Any learner able to learn various discriminations in the population of 16 individuals has extensive capacity to formulate and manipulate single-feature abstractions and do exactly what you did—follow classification rules that call for identifying individuals that have specific features. 
The human is able to derive information about the abstraction capability of the infrasystem through inference and logic, not through intuition. It may seem that the agent is able to think of blue, longer, or parts in the abstract, but this is an illusion. The directions “make it blue” or “make it longer” mean something to the agent only if they are attached to hosts. When they are, the agent is able to perform remarkable transformations that involve any of the thousands of features in the agent’s repertoire.
SUMMARY

The only way the system is able to recognize individuals or groups of individuals is by reference to common features that are unique to the individual or groups. If the learner is able to learn more than one discrimination that involves a particular individual, the learner must be capable of identifying the individual as the possessor of more than one feature. The learner’s system must be capable of identifying the individual as the sum of its specific features and identifying the features singly. The sum of features the learner initially constructs may prove to be inadequate, but some form of sum-of-features is necessary.

The amount and type of learning that is possible is a function of the features of individuals that the learner identifies. If the learner does not attend to any of the enduring features (including the actions that occur with every presentation of the individual), no learning that involves these features is possible. The learner who identifies an individual as the sum of six features has the potential to learn more relationships that involve that individual than the learner who sees the individual as the sum of five or four features.

Given that the learner has no preknowledge of the combination of features that will be needed for learning relationships, the efficient system represents individuals in the maximum practical detail. The general rule that the efficient system would follow if an individual is associated with a primary reinforcer is that the system represents not only those features relevant to the primary reinforcer, but other features as well.

The memory or retrieval system must have a dual classification process: one classification for the individual and one for each individual feature of the individual. The features are combined or used singly to formulate criteria for identifying possible groups that include the individual. Within this dual system, any individual would be expressed as a set of features. Any classification criterion would be expressed as a specific combination of single features. For any feature, there would be a directory of the individuals and classes of individuals that have the feature.

The overall difficulty of learning is logically a function of the number of features that are required for the classification rules.
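One way to picture the dual classification process described above is as a pair of lookup tables, one keyed by individual and one keyed by feature. The Python sketch below is a hypothetical data structure written for illustration; the class name `DualIndex`, its method names, and the sample individuals are assumptions, and nothing here is meant as a claim about how the infrasystem actually stores information.

```python
from collections import defaultdict


class DualIndex:
    """Hypothetical store: one record per individual, one directory per feature."""

    def __init__(self):
        self.individuals = {}               # name -> frozenset of features
        self.directory = defaultdict(set)   # feature -> names of individuals with it

    def add(self, name, features):
        """Record an individual as a set of features and update each directory."""
        self.individuals[name] = frozenset(features)
        for feature in features:
            self.directory[feature].add(name)

    def members(self, criterion):
        """Individuals that have every feature named in the criterion."""
        sets = [self.directory.get(feature, set()) for feature in criterion]
        return set.intersection(*sets) if sets else set(self.individuals)


index = DualIndex()
index.add("individual_1", {"red", "bottle", "tall", "narrow"})
index.add("individual_2", {"blue", "bottle", "tall", "narrow"})
index.add("individual_5", {"red", "bottle", "short", "narrow"})
index.add("individual_13", {"red", "bottle", "tall", "wide"})

print(index.members({"red", "bottle"}))
print(index.members({"red", "bottle", "tall", "narrow"}))
```

A classification criterion is then just a combination of single features, and the group it picks out is the intersection of the corresponding directories.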
The system that performs single-feature learning only could have a design that required fewer trials to learn than the system that is able to learn various combinations of features within a population. In the context of learning more than one discrimination, learning a single-feature discrimination would be relatively easier than learning multiple-feature discriminations because there would be significant opportunities to encounter positives that share only a single feature and to encounter negatives that differ from known positives only in a single feature.

The strategies that the efficient system employs are conservative. Features unique to negatives are eliminated as essential features of positives. Single-feature differences identify essential features of positives; features shared by a range of positives are identified as essential features.

If the discriminations involve enduring features, examples may be presented singly in temporal order or may be presented simultaneously as a group. The simultaneous presentation is better than the temporal presentation at conveying information about the role of specific features because it permits the learner more opportunities to examine individuals, more opportunities for comparison, and more opportunities for chance juxtapositions of positives and negatives that are minimally different.

As the number of features required for the discrimination increases, the information potential of the random temporal series diminishes primarily because of the relatively lower percentage of positive examples. If individuals are randomly selected from a population, the presentation would be grossly inadequate for learning four-feature rules. If there are 16 individuals in the population, the random sequence would have to contain around 64 examples to have four or more positive examples of the targeted four-feature discrimination.

The dynamic presentation of examples provides the most articulate information about the role of individual features or combinations of features. The sequence converts one example into the next by changing a single feature at a time. This type of presentation is highly contrived and designed to maximize information about the role of features. The contrived series that provides adequate information about one discrimination may be quite poor for others. Series for discriminations that involve combinations of three or more features are not good for showing the function of single features, for instance, because the sequence would have a relatively high percentage of positives and the positives would have the same three or more features. Therefore, the series would tend to provide reduced information for the teaching of a single-feature discrimination.

The simultaneous presentation of examples permits variations of the same task. For instance, the learner places Individual 1 in Location 1. The individual is removed and relocated in the group. When the next trial involving Location 1 occurs, the learner must go through the process of scanning the individuals in the group to locate Individual 1.
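The figure of around 64 randomly selected examples mentioned above can be read as an expected-value estimate. The sketch below makes that reading explicit under assumptions that the text does not state: each draw is independent, every one of the 16 individuals is equally likely, and exactly one individual is positive for the four-feature rule. The function names are hypothetical.

```python
from math import comb


def expected_positives(trials, population_size, positives_in_population=1):
    """Average number of positives in a random series of the given length."""
    return trials * positives_in_population / population_size


def prob_at_least(k, trials, p):
    """Binomial probability of k or more positives in independent draws."""
    below = sum(comb(trials, i) * p**i * (1 - p) ** (trials - i) for i in range(k))
    return 1.0 - below


print(expected_positives(64, 16))
print(prob_at_least(4, 64, 1 / 16))
```

Under these assumptions, 64 draws yield four positives on average, but any particular random series of that length may contain fewer, which reinforces the point that random temporal series are a poor vehicle for four-feature rules.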
The learning of all features essential to the classification rule for positives occurs naturally if the learner initially identifies at least one feature that is common to all positive examples. If the individual has four features and the learner identifies two, the learner will correctly identify 13/16 of the individuals in the four-feature, 16-member population. Because the identification of one or more features ensures that the learner responds correctly to every positive example, the learner receives information that the features identified in the rule are necessary. They occur in all positive examples and in no negatives.

The mistakes that the learner makes when it identifies even one feature of a four-feature rule provide for single-feature differences between positives and negatives, and therefore provide information about the features lacking in the classification rule. For instance, the learner identifies bottle shape as the necessary feature, but all positives are tall, red, narrow bottles. The learner has an opportunity to compare features of all examples the learner identifies as positives. Those that differ in only a single feature provide information about essential features of the learner’s rule. If the false positive differs from the true positive by being wide instead of narrow, narrowness is identified as a feature that is essential to the positives. It is the only possible feature basis for the example being negative.

At the center of all learning are features. They are most easily understood operationally. If the learner discriminates between any positive and negative, the difference describes a feature or set of features. To clarify what they are and how they are combined, we reduce the difference between positives and negatives. Mapping the differences that the learner is able to discriminate identifies the qualitative features the system is capable of identifying, comparing, combining, and using to draw conclusions.
Chapter 11
Secondary and Unfamiliar Learning
Extensions of basic learning go in two directions: (a) secondary learning, in which the link between the primary reinforcer and the behavior becomes remote; and (b) unfamiliar learning, in which the system must learn new responses or highly unfamiliar material. Both directions present a variation of the same problem. The learner is currently at skill level A. To perform the terminal activities implied by the new learning, the learner must have skill level N. If the learner is simply presented with the terminal tasks (those that require skill level N) until finally succeeding, the learner is highly unlikely to perform successfully within a reasonable amount of time. On each trial, the learner receives feedback that its attempt leads to failure. However, the feedback does not provide the learner with information about what to do, only what not to do.

As the discussion in chapter 8 indicated, if negative examples differ by only one feature from positives, that feature may be reliably identified through the presentation of negative examples. However, if there are many differences between known positives and negatives, negative examples become relatively useless. The problem in learning a many-skill operation is not only that the negative examples are greatly different from positives, and therefore relatively useless in conveying information about what to do, but also that the learner has never produced a positive (reinforceable) outcome, and therefore must operate solely from information provided by negative examples. If all outcomes are negative, the only procedure available to the learner is one that attempts to eliminate negative possibilities until the positive outcome or a close approximation is produced.

Even if the learner produced occasional positive outcomes, however, the learner would probably not have a clear representation of the plan or directive that led to the positive outcome. The reason is that the system has so many discarded plans and trial content maps that the system may not be able to identify the map clearly. Another possibility is that the learner may not be able to execute plans that are perfectly consistent with the trial content map. If the learner is attempting a complex motor behavior that has never been achieved, such as turning a somersault in the air and landing feet first, there is a clear discrepancy between the directive that the learner issues and the response produced. The learner may direct itself to leap high in the air, lean back, tuck, spin around, and land feet first. It probably will not happen. Even if the learner produces the response acceptably after some practice, it probably will not happen again on the next trial because the learner cannot represent it clearly.
CHANGING REINFORCING CONDITIONS
The theme of both secondary and new motor response learning is that a great skill disparity exists between what the learner knows and the requirements of the terminal task. The learning, therefore, is predictably slow. In fact, it may not happen at all unless the reinforcing conditions change. The reinforcing conditions may be altered in two primary ways:
1. Change the criterion for performance so the learner is able to receive reinforcement on a higher percentage of trials;
2. Change the task so the learner is not required to perform on the terminal task, but on an intermediate task that requires the learning of fewer skills.
The Intervening Program
As the discussion in chapter 7 indicated, changes in reinforcing conditions involve either shaping the response or context in which the response is produced. Because of the great discrepancy between the learner’s current skill level and that required by the tasks to be learned, uniformly successful programs that lead to progressive approximations of the terminal task may be extensive. The program strategy is simply to create some form of scaffolding: interpolated steps between the learner’s current performance and the performance required by the terminal task. Introducing a program theoretically complicates any interpretation of learning unless the learner’s behavior is recognized to be a product of both the learner’s ability and the program. The effect of the program would be known only if we had information about the tendencies of a population of learners who attempt the
learning without a program. For instance, if we knew that the average learner required about 5,000 trials when there were no intermediate steps provided between the learner’s original performance level and the terminal activity, we would have a baseline of information that could serve to evaluate the savings in time or trials achieved with a specific program. The issue of the relationship between learning and interventions that teach is developed more extensively in chapter 12. The Magnitude of Program Steps. Empirical data are the only basis for determining the relative savings achieved by different program sequences; however, general results may be predicted on the basis of analysis. The analysis starts with the assumption that if the learner performs correctly on all trials, the learner has nothing to learn and has completely mastered the task. At the other extreme, if the learner performs correctly on no trials, the learner has absolutely no knowledge of how to produce the responses required by the task. Halfway between having complete mastery and complete lack of understanding would be correct responses on about 50% of the trials. This performance indicates that the learner knows at least some components of what is required for mastery, but certainly does not know all of them. From an analytical standpoint, 50% correct indicates that the material is initially too difficult for efficient learning. Consider the learner that receives information that about half of the trials fail. The learner will do something on the next trial, but the learner knows that the history of this task is that attempts are unsuccessful about half the time. Possibly the learner doesn’t even know that a higher percentage of reinforceable responses is possible, in which case the learner would continue to use the abortive strategy that leads to 50% reinforcement. 
If the step is reduced in size so that the learner receives reinforcement on about 70% of the trials, not only is the step smaller, but the learner is far more likely to receive information that a better percentage is possible. The data indicate that the learner’s attempts are correct more often than not, which means there may be only one or two features of the learner’s strategy that need to be changed to achieve 100% correct performance. Stated differently, the percentage indicates that the learner is able to learn primarily from positive examples. The learner will be able to use information about the common features of positives and the features that are unique to negative outcomes to shape an effective content map. The 70% correct criterion is certainly not rigorous and would have to be validated by experimental results. Possibly different types of tasks may require different percentages. However, some percentage that is more than 50% and less than 100% would represent the ideal step for learning various material.
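The step-size argument above can be restated as a small check on a record of trial outcomes. The following Python sketch is illustrative only: the function name `classify_step`, its labels, and the default `target` of 0.70 are assumptions introduced here, and the 70% figure is the text's own nonrigorous working value rather than an empirically validated threshold.

```python
def classify_step(outcomes, target=0.70):
    """Judge the size of a program step from a list of trial outcomes.

    outcomes: list of booleans, True for a reinforced (correct) trial.
    target:   hypothetical working criterion (the text's 70% figure).
    Returns (proportion_correct, label).
    """
    if not outcomes:
        raise ValueError("at least one trial is needed")
    rate = sum(outcomes) / len(outcomes)
    if rate == 1.0:
        # Correct on all trials: nothing left to learn on this step.
        return rate, "mastered"
    if rate <= 0.5:
        # Half or fewer correct: analytically too difficult for efficient learning.
        return rate, "step too large"
    if rate < target:
        return rate, "below target"
    # Correct more often than not, so the learner can learn mainly from positives.
    return rate, "workable step"
```

A trainer could apply such a check after each block of trials and shrink the step whenever the label is "step too large". Different tasks may call for a different `target`, as the text notes.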
Content of the Steps. Analyzing the content for a possible intervention may be conceived of as the reverse of what the learner does when learning something. When learning, the learner may add knowledge of a feature. The analysis simply identifies the more obvious features of the operation needed to perform the reinforceable task and subtracts them, one at a time, until arriving at what seems to be a simple task or set of tasks that has some of the features of the targeted operation.
Secondary Reinforcers
Secondary learning involves a predictor of a predictor of a primary reinforcer: S1′ → S1 → S2. S1 is the discriminative stimulus that immediately predicts reinforcement. S1 is also a positive reinforcer. It is not the primary reinforcer, but it has the functional properties of a reinforcer. S1′ is a discriminative stimulus that predicts S1. In a sense, secondary reinforcers are guaranteed. For any predictor to serve as a discriminative stimulus, it must be fortified with secondary sensation. Unless this is done, the organism has no particular reason to respond to it in any way. Given the secondary sensation, the predictor now has a positive valence if the primary reinforcer is positive or a negative valence if the primary reinforcer is negative. The organism is designed to perform behaviors that maximize the reception of discriminative stimuli that lead to positive reinforcers. Therefore, the learning that the discriminative stimulus is a secondary reinforcer has already occurred once the discriminative stimulus has been established as a predictor. Not only will the learner attend to it, but the learner will have knowledge of what it predicts. If what it predicts is a positive primary reinforcer, it may now serve as something pursued by the system in the same way primary reinforcers are pursued. The logic is that to find or approach the predictor is to come closer to the primary reinforcer.
In many cases, a different response strategy is required for each discriminative stimulus, and the entire performance sequence is represented as S1′ → R1′ → S1 → R1 → S2. An example of this chain would require learning that pressing a lever leads to a token, which may then be placed in a slot to receive food. There are two responses in this chain: pressing the lever and exchanging the token for food. There are two stimuli: the slot, which immediately leads to reinforcement; and the lever, which provides the token that is placed in the slot.
The basic relationship involves knowledge of both S1¢ and S1 and the behaviors for each. The learner who has already learned the relationship between S1 and S2 is in a natural position to learn the complete chain. The learner that does not know the relationship between S1 and S2 has much to learn, and without a program to guide the learning, the learner would most probably have
to learn the relationship in a particular order: first learning S1 → R1 → S2, then learning S1′ → R1′ → S1.
Quasisecondary Reinforcers
Tasks of the form S1 → R1 → R2 → S2 require a chaining of two responses to receive the primary reinforcer. These involve quasisecondary reinforcers. The learner does not have to learn two independent predictors, only one. For example, first the learner is taught that when it sits on a pillow for 5 sec it is permitted to take food from the master’s hand. This part of the chain is S1 → R2 → S2. For an expanded task, the pillow is placed on a low shelf. For the learner to sit on the pillow, it must retrieve the pillow from the shelf and then sit on it for 5 sec. The discriminative stimulus is still the pillow. The only difference is that a more elaborate behavioral strategy must be performed to obtain food and praise. The fact that more than one action is involved is trivial from the standpoint of learning, but very significant from the standpoint of training. Because there are more variables, there is a greater probability for miscommunication. The training issue is how to provide the learner with unambiguous evidence that the discriminative stimulus indicates that the component responses must be put together in a certain order.
Natural Order of Learning Secondary Reinforcers
For secondary learning, S1 has the dual role of being both discriminative stimulus and a positive reinforcer. This dual role suggests what is analytically a natural order for learning: first learn that S1 predicts the primary reinforcer, then learn that S1′ predicts S1. This order first establishes the reinforcement role of S1. Therefore, the natural order program would first establish the relationship between S1 and the primary reinforcer and then introduce the extension.
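The natural order described above amounts to teaching the links of a chain from the primary reinforcer backward. The following Python sketch represents the chain S1′ → R1′ → S1 → R1 → S2 as a list of links and derives that order. The `Link` type and the function names are hypothetical, introduced only for illustration.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """One stimulus-response-outcome unit of a chain (hypothetical type)."""
    stimulus: str
    response: str
    outcome: str


def is_connected(chain: List[Link]) -> bool:
    """True if each link's outcome is the stimulus of the next link."""
    return all(a.outcome == b.stimulus for a, b in zip(chain, chain[1:]))


def natural_order(chain: List[Link]) -> List[Link]:
    """Return the links in the order the text calls natural:
    the link that ends in the primary reinforcer first, then each
    earlier link, so every new link leads to an established reinforcer."""
    if not is_connected(chain):
        raise ValueError("chain is broken: an outcome does not match the next stimulus")
    return list(reversed(chain))


# The full performance sequence, written in performance order.
chain = [Link("S1'", "R1'", "S1"), Link("S1", "R1", "S2")]
teaching_sequence = natural_order(chain)
```

Here `teaching_sequence[0]` is the link S1 → R1 → S2, which establishes S1 as a reinforcer before S1′ is introduced as a predictor of S1.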
If this sequence is not followed and S1 is completely neutral (nonreinforcing and nonaversive), the task would have no reinforcer, so there would be no reason for the learner to learn the relationship. It may be that the developmental patterns of some animals are designed to create a natural progression from strong primary reinforcers to various secondary reinforcers. For example, the organism may be designed so it has no particular program for imprinting, but instead achieves the strong affiliation with its mother through stages of learning that establish various features of the mother as predictors of positive reinforcement. These predictors thereby become positive reinforcers that support additional learning.
If food is the primary reinforcer, the infant’s system could be designed so it learns (with the aid of some reflexes) that there is a warm source of milk. This source has distinctive olfactory patterns. If the infant’s development is such that it is initially blind, it must learn basic relationships only from tactual, auditory, and olfactory cues. The infant learns that contact with the warm thing predicts the possibility of nursing. The infant also learns that the level of olfaction changes as a function of distance from the warm thing. A low level of reception or no reception (which occurs when the mother is far from the learner) becomes a negative reinforcer because it predicts absence of nursing. So the increase in the level of olfaction predicts contact with the mother, which predicts the opportunity to nurse. This behavior is of the form S1 → R1 → R2 → S2. S1 is the scent. S2 is the milk. The response R2 is that of making contact. R1 is the approach response. The scent is used when the learner is close or farther away. When farther away, however, the learner must produce approach behaviors. The reinforcer now becomes the intensification of the scent.
When the learner later receives vision, it is in a position to learn visual secondary reinforcers. A certain visual shape is correlated with the scent. When the scent diminishes, the size of the shape diminishes. So the reception of a small image of the mother is a negative reinforcer. The learner therefore learns behaviors that increase the size of the image and the scent by approaching the source. When the learner is at a distance, the behavior takes the form S1′ → R → S1 → S2. S1′ is the visual image with no scent. The response is designed to increase the size of the visual image by approaching the source. The response leads to the reinforcer S1, which predicts the contact and the milk, S2. This chain may be elaborated if hearing is also involved.
The scenario conveys the general format of learning that may be naturally scheduled for a puppy to gain affiliation with its mother. One of the more important points about the possibility of such a schedule is that the system would be required to learn multiple features of various sensory inputs. If olfaction, audition, and vision are to be correlated, they must be independently represented. Changes in the universal characteristics of each must be recorded by the system (changes in the size of the image, changes in the loudness of the sound pattern, changes in the intensity of the scent). Changes must be correlated so that the change in visual size becomes a predictor of parallel changes in intensity of scent. For even the simplest behaviors that the puppy learns, extensive applications of secondary learning occur. For the affiliation example, the setting is initially simplified by removing vision. This guarantees the learning of the olfaction and the relationships associated with contact. Then with the addition of vision, the natural program permits an extension of the basic relationships that have been learned. If the learner is able to identify features— whether they are visual, auditory, or tactual—it is possible for the efficiently
designed system to create various extensions or chains that are grounded in the immediate predictors of primary reinforcers. At each stage in the chain that is learned, a feature that predicts a positive reinforcer becomes a positively charged discriminative stimulus that serves as a reinforcer, which anchors further learned extensions.
TRAINING
Although there may be a natural order for the learning of secondary reinforcers and various response-strategy elaborations that lead to the primary reinforcer, the implications for training do not automatically follow. Training is a contrived intervention. A natural order is irrelevant to the training except to understand that any natural order is most probably designed cleverly so that the learner progresses from a context that has a high probability of inducing basic learning to one that is extended. For training, the relevant components are simply (a) the task or operation that is to be mastered, and (b) the current skill status of the learner. The training could be presented in stages that parallel the natural order, first establishing that S1 leads to primary reinforcement and then introducing S1′ as a means of securing S1. This paradigm is what behaviorists refer to as backward chaining. However, because the terminal task does not provide any prima facie implications for a training sequence, the training could also be designed so it first taught the relationship between S1′ and the reinforcer and then added S1 to the chain. The only requirement would be that, at each stage of learning, the probability of learning the correct discriminations would be made very high. This criterion is met by ensuring that there are not a large number of possible options about what leads to the primary reinforcer. The fewer the options, the greater the possibility that the learner will learn the correct one.
Training and Unfamiliar Learning
One format for training involves simply placing the learner in the setting in which it has the opportunity to learn the final relationship. For the lever-pressing task, the learner would have to learn to press the lever, take the token to the food dispenser, and insert it. The food would then appear. This design creates an example of highly unfamiliar learning.
The reason it is highly unfamiliar may be expressed as the number of possibilities that would lead to the food—given that the learner does not have any prompting. The room has an enormous number of features. Each could be relevant to securing the food. If there were only 10 features or possibilities
within the room, and if the task required chaining 2 operations out of a set of 8 possible operations, there would be 5,040 possible combinations. Only one of them would work. The probability of the naive learner (one who has not been a subject in other experiments) learning the relationship is far lower than it would be if there were only four features of the room that could be relevant possibilities and four possible operations. The problem is even more complicated because the natural order would prompt the hungry learner to remain close and attend to the food dispenser.
Probability of Learning
If the number of possible relationships generated by requirements of a secondary-reinforcement task is high, some learners by chance would not be expected to learn the relationship. These learners would have the same ability, skill level, memory capacity, learning potential, and other attributes as some learners who succeed. The reason for their failure and the success of the others is simply chance. For all, the learning will require experimenting with possibilities. Some learners will tend to succeed because they produce more responses than the others. Some will succeed because they have better memories. However, some will fail simply because their sample of attempts did not relate the token to the food dispenser. They may have discovered that the bar delivers tokens and there is food in the dispenser, but they simply did not hit on the lucky possibility that the tokens fit in the slot.
An intervention that would work for all learners would reduce the number of possibilities and thereby tend to remove chance from the equation. For instance, we reduce the possibilities by first teaching that tokens lead to food. We place tokens on the floor near the slot. We may also prompt the learning by presenting a model who picks up tokens, puts them in the slot, and receives food.
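The figure of 5,040 cited earlier for the room with 10 features and 8 operations is not derived in the text. One counting rule that reproduces it, offered here only as an assumption, is to pair an ordered choice of 2 distinct operations with an ordered choice of 2 distinct features: (8 × 7) × (10 × 9) = 5,040. The Python sketch below applies that rule and adds a simple chance model. The function names and the chance model (distinct attempts drawn uniformly at random, exactly one of which works) are hypothetical.

```python
from math import perm


def ordered_combinations(n_features, n_operations, chain_length=2):
    """Count chains under the assumed rule: an ordered choice of
    chain_length distinct features times an ordered choice of
    chain_length distinct operations."""
    return perm(n_features, chain_length) * perm(n_operations, chain_length)


def chance_of_success(n_attempts, n_possibilities):
    """Probability that n_attempts distinct random attempts include the
    single working possibility (assumed uniform sampling without repeats)."""
    return min(n_attempts, n_possibilities) / n_possibilities


large_room = ordered_combinations(10, 8)   # 5040
small_room = ordered_combinations(4, 4)    # 144
```

Under these assumptions, two learners of identical ability who each try the same number of distinct possibilities differ only in which possibilities they happen to sample, and the chance of hitting the working one is 35 times higher in the reduced setting than in the larger one (5,040 / 144 = 35).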
The convention of starting with the slot behavior creates an artificial parallel to a natural order. If the learner understands the predictive value of the tokens, the learner will know what to do with tokens once they are obtained.
Alternative Designs
Alternatives that do not use the natural order would be predicted to achieve about the same rate of learning as backward chaining. An approach predicted to work well would be to first train the learner to press the lever to obtain food, not tokens. Once this relationship is established, we change the design so that pressing the lever leads to different combinations of food and tokens: sometimes tokens and food, sometimes only tokens. At this point, we have established that tokens are enduring features present when the primary reinforcer is present. Next we would use a model to
show that food could be obtained by placing the tokens in the slot on the food dispenser. As the learner practiced this strategy, the design would change so that the lever dispensed only tokens. The tokens have been shown both to accompany the primary reinforcer and predict food at the food dispenser. Because the experimental design presents only a few variables at a time and then adds others, the communication is as articulate as the backward chaining and would be predicted to work at least as well. The design quickly establishes S1 (the tokens) as both reinforcers and predictors of the primary reinforcer.
HIGHLY UNFAMILIAR LEARNING
As noted, secondary learning could be presented so that it qualified as highly unfamiliar learning or so that it reduced possibilities by staging the instruction, simplifying the tasks and thereby increasing the probability of the learner producing reinforceable responses. Other types of highly unfamiliar learning, however, do not provide substantial program options. The best possible communication for these types does not result in immediate learning even if it is analytically the smallest practical step. There are three categories of this type of highly unfamiliar learning:
1. The learning of new motor responses,
2. The learning of internal directives for accessing knowledge, and
3. The learning of generically new discriminative stimuli.
Examples of learning a new motor response (1) include learning to ride a two-wheel bicycle, learning to drive a golf ball with some degree of accuracy, learning to stand up, and learning to say the statement, “My name is Henry.”
Learning internal directives (2) involves the agent learning to communicate with the response repertoire. This type of learning is most readily illustrated with stroke victims, who may be capable of relearning lost functions through practice. The victim does not suffer hardwired motor disability, but rather cognitive problems associated with the ability to access the classification system. The victim initially tries to say, “hello,” but says, “Merry Christmas” or some other utterance from the agent’s response repertoire. The pattern of agent-initiated directives to order specific items from the classification system has been transformed so that the learned directives are not connected to the anticipated items in the repertoire, but to other items. The learner must learn new rules for coding the content of directives. Another variation of learning an internal response class would involve autistic
children and their attempts to direct thought or draw conclusions from observations. Learning generically new discriminations (3) is required by pursuits that involve discriminative stimuli that the learner has never attended to—for instance, the naive learner trying to identify and interpret tracks of animals in grass, identify man-made fragments found in the rubble of an archaeological dig, or identify speech utterances presented as tactual patterns that are isomorphic to the auditory patterns. A primary difference between learning a highly unfamiliar discrimination (3) and learning a new motor response (1) is the range of tests that would categorically disclose whether the learning had occurred. For motor responses, only one type of test would indicate learning—performance on a task that calls for the response. The learner’s verbal report of being able to ride a bike does not provide indisputable evidence of the skill. Only an observation of the learner riding a bike provides such evidence. In contrast, there are many different tests of whether the learner learned the discrimination of being able to identify animal tracks. All would involve some form of positive or negative example that the learner would evaluate, but the particular types of responses required of the learner could vary greatly. The learner could be asked “yes–no” questions, to touch the part of the display that shows “rabbit tracks,” to answer “what” questions about various parts of the display (“What does this show right here?”), to complete chains of tasks (“Find the place that shows the rabbit tracks . . . Point in the direction the rabbit was moving . . . What evidence is there that it moved in that direction?” . . . etc.), or to go in the field, locate the tracks of a particular animal, and track the animal. The only purpose of these tests is to disclose the extent to which the learner has mastered content. 
Many test formats are therefore possible because any responses within the learner’s repertoire could logically be used to disclose whether the learner had the knowledge.
Knowledge and Unfamiliar Learning
All of these unfamiliar learning pursuits involve knowledge of specific content. If the agent does not have knowledge of how to produce a response, the agent is not able to plan it so that it works. The juggler who has knowledge of the operation is able to direct the voluntary response to juggle and is able to identify when one of the balls is not on the planned course, how it deviates, and what compensatory response is needed to correct this deviation. In the same way, the accomplished pianist is able to play relatively flawlessly even if the piano slowly moves across the floor as he plays, although the pianist must assume a posture never experienced before while playing the piano.
An important distinction for human learning is that this knowledge of responses must be functional and not simply verbal behavior. For instance, the golf swing involves a host of simultaneous and carefully sequenced components (right foot positioned at angle X and distance Y from ball, left arm straight, full back swing, eye on ball, weight shift to left leg when driver is in position P, etc.). The golf swing is classified as a motor response and not a discrimination because there is only one response pattern that serves as evidence that learning has occurred. The learner may be able to recite all the things he tells himself as he swings, and may even be able to point out flaws observed in the swings of others and self; however, the ability to produce the swing is the only relevant evidence about the learner’s relevant knowledge of planning and directing the response. The inability to produce a response consistently implies that the learner does not have complete knowledge of the relationship between the directive that the agent issues and the response that the infrasystem produces. Without knowledge, the learner is not able to plan or execute the desired response consistently.
Learning New Motor Responses
As indicated, the basic problem with new response learning is that the agent must plan and specify the response. However, if the response has never been done successfully, the agent does not know how to direct it. For some tasks, the problem facing the system is double-edged. The agent does not know how to specify the directive, and the infrasystem is not capable of producing the response. The solution requires learning that adds the response to the repertoire and codes the response in a way that the agent is able to specify it. Even if the agent is able to specify a directive, the physical capabilities of the learner may have to be modified to comply with the directive. For instance, the agent is probably able to provide an articulate directive to the infrasystem to do the splits. This directive is a simple extrapolation of familiar directives for maintaining relatively straight legs and changing the angle so that the legs become horizontal in opposite directions. The response, however, is preempted either by good sense or the body’s inability to perform the operation. Another example is the walking position of older people. They may walk in a bent position and may have a perfect understanding of how to walk with a straight back. However, the infrasystem may not be able to produce the response without creating great pain.
Program Variables. The more systematic intervention programs for inducing motor-response learning simplify and stage the learning of various component responses. The gross program variables are:
1. Whether the agent knows the objective of the response,
2. Whether the agent knows the effect of the various response components on features of the response outcome,
3. Whether natural reinforcers provide direct feedback on the components of performance, and
4. Which components of the response may be isolated or simplified.
A relatively complete discussion of the logic and practices of teaching motor responses appears in Theory of Instruction by Engelmann and Carnine (1982). Our purpose here is not to duplicate that discussion, but merely to rephrase some of it as it applies to learning, not teaching.
Knowledge of the Objective. The agent may not know the objective. If the learner does not, the sequence is logically more involved. The lack of understanding or misunderstanding of the goal may be unintentionally induced by the shaping program that we use. For instance, we want the animal to jump over a high bar. If we work on jumping over a low bar, which may be a necessary first step, the learner will receive positive reinforcement for performing correctly. However, this does not show the learner the ultimate goal of the learning. The learner is being reinforced for one response, and the learner may reasonably assume that the goal is to jump over the low bar. The learner has no information that this goal will change or how it will change. Although practice jumping over the low bar will probably equip the learner with the tools needed to jump over higher bars, we can hasten the knowledge of what we will expect the learner to do by using multiple-shaping criteria. We provide three levels of reinforcement all based on the learner’s performance. The first level presents the least attractive positive reinforcer. The third level presents a positive reinforcer the learner finds particularly attractive. During a session, the trainer presents the different reinforcers as a function of the characteristics of each response produced.
A Level 1 reinforcer is issued for the performance of successful responses closest to the bar (the learner not clearing the bar by more than possibly 2 inches). The criterion is designed to include about 50% of the successful trials. The Level 2 reinforcer is based on 30% of the successful trials that are farther above the bar. The Level 3 reinforcer applies to the best 20% of the successful trials. Those are the responses that clear the bar by the greatest distance. This differential reinforcement permits the learner to extrapolate and conclude that the trainer responds more positively to higher jumps. Although the learner receives practice and positive reinforcement for all positive outcomes, the learner learns that jumping higher over the bar leads to better reinforcement.
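The three-level criterion can be expressed as a calibration step followed by a classification step. The Python sketch below is a minimal illustration under stated assumptions: the function names `calibrate` and `level` are hypothetical, clearances are measured as height above the bar on successful trials, and the 50/30/20 split is taken from the description above.

```python
def calibrate(clearances):
    """Derive two clearance cutoffs from a sample of successful jumps.

    clearances: heights above the bar (any consistent unit).
    Returns (cut1, cut2): jumps at or below cut1 fall in the lowest
    50% of the sample, jumps at or below cut2 in the lowest 80%.
    """
    n = len(clearances)
    if n < 2:
        raise ValueError("need at least two successful trials to calibrate")
    ranked = sorted(clearances)
    cut1 = ranked[n // 2 - 1]          # top of the lowest 50%
    cut2 = ranked[(8 * n) // 10 - 1]   # top of the lowest 80%
    return cut1, cut2


def level(clearance, cuts):
    """Reinforcement level (1, 2, or 3) for one successful jump."""
    cut1, cut2 = cuts
    if clearance <= cut1:
        return 1   # closest to the bar: least attractive reinforcer
    if clearance <= cut2:
        return 2   # the next 30% of the calibration sample
    return 3       # the best 20%: most attractive reinforcer


# Hypothetical calibration sample of ten successful jumps.
cuts = calibrate([10, 1, 6, 2, 7, 3, 8, 4, 9, 5])
```

Because the cutoffs stay fixed between calibrations, the share of jumps earning only Level 1 falls as the learner improves, which gives the trainer a signal for recalibrating against a higher bar.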
When the learner improves such that only about 30% of the trials receive Level 1 reinforcement, the task changes. The bar is moved up to a level that would permit about 50% of the current successful responses to receive Level 1 reinforcement. The criteria for issuing Level 2 and Level 3 reinforcement are again calculated on the basis of responses the learner produces. The best 20% receive Level 3 reinforcers, and the next-best 30% receive Level 2 reinforcers. This process retains the best of what may be achieved through shaping while demonstrating to the learner which responses are relatively better. Knowledge of How Components Influence Responses. For the learning of motor responses, we cannot control the order of positive and negative examples as we can for a temporally ordered sequence of discrimination examples. We do not have control of either the juxtaposition of examples or the differences in response topography from one trial to the next. The result is that many trials may be required before the learner receives sufficient information to identify the function of some components and learn the response directives needed to control specific components. What we may be able to do is create prosthetic devices that hold some of the response variables constant while permitting variation in others. We may also be able to remove individual components from the total response and provide practice on the components in isolation. These techniques apply more to the type of response that involves a chaining of discontinuous parts. If the response involves a continuous movement (such as the golf swing), great distortion may result if the components are removed from the context of the terminal response. In other words, practicing components in isolation may result in some immediate savings, but may not have a great overall effect. 
This is because the difference between practicing the component in isolation and practicing it in the context of the response is so great that little is gained from practicing the component in isolation. We could introduce the skill of pedaling in a context other than that of riding a two-wheel bicycle. We could do it with a tricycle or an exercise bike. When pedaling is later introduced in the two-wheel context, however, some response components that worked in the isolated setting will have to be modified. If the learner pushes very hard on a stationary bike’s pedals, for instance, it is possible to lean to the side of the leg that is pushing (leaning left if the left leg is pushing). Transferring this behavior to the two-wheel context would result in a nasty fall. In the same way, a device like training wheels holds the balance constant while the learner works on the steering and pedaling components of the operation. The same problem that exists with the exercise bike occurs with training wheels. The learner may practice leaning the wrong way when going around corners and pedaling the wrong way. These components of the responses the learner practices would tend to interfere with later unassisted riding. If we identify balance as the key component of the two-wheeled operation, we could first introduce various simplified tasks in which the learner is required only to coast: first only in a straight line, then curving in a single direction, then following an undulating S-shaped course. Following this training, the learner would have no trouble integrating whatever knowledge the learner had acquired about pedaling because the learner would know the basic balance variables as they relate to leaning.
The Role of Natural Reinforcers and Direct Feedback. As a rule, the organism functions in an environment that provides immediate feedback. When the learner tries to ride a two-wheel bike, for instance, the physical environment provides immediate feedback on the various response components. When the learner violates the zone of acceptable center of balance, the learner falls. This outcome is a punishing consequence linked to the specific behavior that preceded it. The system concludes that the planned behavior failed.
Not only does the physical environment respond to some behaviors by providing immediate feedback, but it may also clearly demonstrate the relationship between minimum differences in behaviors and corresponding differences in outcomes. These differences are most clearly demonstrated with behaviors that involve continuous variation. If the learner leans just a little too far, the learner is able to compensate through a counterweight shift. If the learner leans a little farther, a greater weight shift is required. If the learner leans beyond the range of possible weight shifts, the learner falls. These correlations provide the learner with a clear map of the sensations of balance and parallel behaviors.
Because of the articulate nature of the feedback provided by the physical environment, many skills for which such feedback is provided are reliably learned by learners who have difficulty learning skills not directly monitored by the physical environment. With respect to providing feedback, the physical surrounding is functionally an active variable that issues feedback capable of shaping responses. If the goal of a behavior is achieved, the response that led to it is judged adequate by the system. If the goal is not achieved, the behavior is judged not adequate and is to be changed. This logic characterizes the most elementary behaviors and covers a great range of additional behaviors. For some learning, the feedback is not directly correlated to the behavior and is established through secondary indicators rather than through sensation and outcome. For example, if a golf swing were as closely correlated to physical environment feedback as standing up was, the learner would receive immediate feedback from the swing about whether the response was acceptable. This feedback would come in the form of patterns
of sensation. Doing it the wrong way would lead to physical pain, not simply to an unwanted outcome judged by referring to related events. Furthermore, the range of variation of the response would be correlated with sensations that clearly identify the particular response components that lead to a successful outcome. If the learner is blindfolded and stands on a floor that slowly tilts different ways and causes the learner to adjust his balance, not only is his balance adjusted properly on all occasions, but the learner is able to tell us whether the floor is changing and whether it is close to level. In contrast, if the learner hits a golf ball and is not permitted to observe the flight of the ball, the learner is not completely reliable about predicting the flight of the ball. Behaviors for tasks that involve more indirect feedback are more difficult to learn because indirect knowledge is needed to interpret outcomes. The system must draw inferences that are not necessarily based on how something feels, but perhaps on how it looks or sounds. Specifically, the learner must have a good understanding of the goal (e.g., making a basket) and precise knowledge of the discriminative variables that are to be correlated with the response-component variables (how far from the basket, where the release point will be, what the projected arc will be, and the apparent weight of the ball). Potential to Isolate Components. The potential to isolate components of the response derives from the structure of the task. As the information about the response components becomes increasingly remote from the sensations involved in the production of the response, the system must draw an increasing number of inferences. The learning of the relationship becomes more an agent function as the relationship becomes more remote. The reason is that nothing hurts during the jump shot when the learner’s shooting arm elbow is too high and the ball veers to the left. 
The learner is going to have to figure out what is wrong. Without help, the learner may not identify the simplest response component or set of components. For learning skills that depend on indirect information, the infrasystem must be taught what hurts. The sight of the ball going to the left of the basket must become a painful outcome. This is achieved only through agent input (see chap. 14). Even the sight of the ball going the wrong direction does not implicate any particular components of a quite complex response. If the learner maintained all the components of the response in exactly the same pattern, the correction would be achieved simply by aiming a little to the right. If changing the response in this way results in the ball sometimes being on target and sometimes going to the right, the implication would be that some of the components of the response are inconsistent and must be stabilized. The learner needs some kind of routine for the more likely details (hand position, elbow orientation, and so on). If these details are fixed or more tightly controlled, the response would probably be corrected. Once the routine is corrected and sufficiently practiced, the infrasystem would be able to relieve the agent of thinking about and planning the components. The agent would be able to direct more global units; the infrasystem would have information about the components. In effect, the learner would tend to direct something like, “Fake left, regular jump shot 18 feet.” If this is the level at which the learner is able to operate, the learner should not engage in a more detailed analysis of the shot, but should simply permit the infrasystem to handle those details. What typically happens is that when the learner is under stress (in an important game), the learner has the tendency to think more about components. The result is probable failure simply because the infrasystem is not used to executing the shot in this context.
Levels of Task Difficulty
Establishing the transfer from agent to infrasystem requires practice. Furthermore, the practice must have a high percentage of correct responses. As long as the agent must remind itself of the various response components, the response will tend to be guided one component at a time, not fluid. From the agent’s standpoint, the directive would have to include references to various components: “Jump, not too high, shoot, regular timing, head up, back straight, elbow down, point and follow through.” Different types of practice tend to produce different outcomes in how well these remote skills are learned. The samenesses and differences from one trial to the next, as well as the timing of the juxtaposed examples, make the difference between the practice that promotes serious stipulations and the practice that is effective.
There are four arbitrarily designated levels of task difficulty based on the kind of processing that is required to perform them successfully:
Level 1: the easiest level, presents juxtaposed examples of the same task (the same response).
Level 2: presents a modest interruption between the examples of the same task (a small time delay or a task that is greatly different from the targeted task).
Level 3: presents a longer time interval between examples of the task and presents interruption tasks that include tasks that are minimally different from the targeted task.
Level 4: presents a variable time interval between examples of the task and a full range of randomly sequenced interruption examples.
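One way to make the four levels concrete is to generate a practice sequence for each. The sketch below is illustrative only. The function name `build_sequence`, the split of interruption tasks into `dissimilar` and `similar` pools, the default of three target trials, and the `gap_range` of interruption counts at Level 4 are all assumptions introduced here. Time delays are not modeled; only the interruption tasks are.

```python
import random


def build_sequence(level, target, dissimilar, similar, trials=3,
                   gap_range=(2, 5), rng=None):
    """Return a list of task names for one practice sequence.

    level      -- 1 to 4, as defined in the list above
    target     -- the task being taught
    dissimilar -- tasks greatly different from the target (assumed pool)
    similar    -- tasks minimally different from the target (assumed pool)
    trials     -- number of target trials in the sequence (assumed value)
    gap_range  -- min/max interruption tasks per gap at Level 4 (assumed)
    """
    if level not in (1, 2, 3, 4):
        raise ValueError(f"level must be 1-4, got {level!r}")
    if rng is None:
        rng = random.Random()
    if level == 1:
        # Juxtaposed examples of the same task, no interruption.
        return [target] * trials
    sequence = []
    for _ in range(trials):
        sequence.append(target)
        if level == 2:
            # Modest interruption: one greatly different task.
            sequence.append(rng.choice(dissimilar))
        elif level == 3:
            # More invasive interruption: a minimally different task.
            sequence.append(rng.choice(similar))
        else:
            # Level 4: variable gap drawn from the full range of tasks.
            gap = rng.randint(*gap_range)
            pool = dissimilar + similar
            sequence.extend(rng.choice(pool) for _ in range(gap))
    return sequence


far = ["dribble", "pass"]
near = ["jump shot", "set shot"]
level4 = build_sequence(4, "free throw", far, near, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a sequence reproducible for record keeping, while omitting it gives a different unpredictable mix each session.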
During a practice session, the learner would start on the highest level of difficulty achieved during the previous session and would go to a simpler level only if the targeted task were not being performed properly. This format is different from that of traditional practice, which tends to start with the easier levels as warmups. These warmups are actually spurious prompts. If the goal is for the learner to perform in a Level 4 context, that is the context that should be practiced so that the infrasystem becomes automatic in this setting. For the initial training, the learner would work on Level 1 until reaching a specified performance criterion. Then the learner would work on Level 2 until reaching a specified performance criterion. Once the criterion for this level is met, the learner would work on some sequences that are of Level 3 difficulty and some that are of Level 4 difficulty. The learner may work on various tasks during a session (lay-up, set shot, etc.), but all would follow the same pattern of progressing to Level 4, which integrates what is being taught with other skills in an unpredictable pattern of juxtapositions. This hierarchy of difficulty is based on logical considerations and may be demonstrated to apply to any aspect of highly unfamiliar learning. Level 1 provides the greatest prompting because it requires the least memory, presents examples that are as close to being the same as possible, presents examples at a relatively high rate, and therefore permits the learner to focus on the details of the discrimination or response. Level 2 is more difficult because it requires a more precise representation of the positive example. The interruption serves as a distracter that prevents the agent from relying on engrams of the previous trial to prompt performance. This shapes the infrasystem to create a permanent and accurate representation of the discrimination or response components. 
The infrasystem cannot rely as much on comparative information that refers to the preceding trial. Level 3 creates a more invasive interruption and therefore requires an even more precise representation. For Level 3, the interruptions are tasks that share a lot of features with the targeted task. Level 4 presents the most difficult type of juxtaposition because it requires the infrasystem to have precise knowledge of the various behaviors, including those that are minimally different from the most recently taught skill. Also contributing to the difficulty is the relatively low rate at which the most recently taught skill occurs. Level 4 most closely approximates the context in which the targeted performance is applied.
For example, the learner is learning to shoot free throws. Instead of working on nothing but free throws, the early sessions start with Level 1 (juxtaposed examples of shooting free throws with no interruptions from trial to trial) and then proceed to the other levels if the learner meets the criterion, which might be making two consecutive free throws. For Level 2 practice, the learner may be required to dribble the ball about 5 feet from the free throw line, come back, and then shoot another free throw. Level 3 would involve something like shooting a jump shot from longer range and then returning to the free throw line and shooting one free throw. Level 4 could be a random set of tasks that would integrate various skills the learner is being taught. For instance, after shooting a free throw, the learner would be required to shoot two to five other shots and do other things, such as dribble to the end of the court. The tasks would not occur in a fixed sequence. Part of the sequence might consist of a jump shot from the three-point line, a free throw, a lay-up, a jump shot from the right corner, a jump shot from the top of the key, dribble to the other end, a free throw, a jump shot from the right corner, dribble to the other end, left-handed lay-up, and free throw. The next part of the sequence would have the same kind of unpredictable mix of tasks.
A version of this progression could be used to shape any highly unfamiliar skill. The rate of progression from one level to the next is a strict function of the learner’s performance. If the learner performs on Level 4 and continues to meet a specified criterion of performance on the various tasks being performed, the learner practices on Level 4. If the performance falls below the established criterion, the learner works on an easier level.
Teaching Communication Between Agent and Repertoire
The same strategies that apply to teaching new motor responses apply to the reteaching required for the agent to regain access to the response repertoire. The least complicated examples are those that involve stroke victims. The familiar commands that access specific files in the learner’s repertoire no longer work and must be replaced or transformed so they do work.
The learning trends observed with people who go through a program capable of achieving this reteaching are basically the same as the learning trends observed when inducing highly unfamiliar discrimination strategies.
We have worked with a number of victims of serious head injuries, ranging from those who have serious involvement of all parts of the brain (e.g., the result of serious anoxia) to those who had aneurysms or trauma that affected only some of their cognitive capacities, often not extensive motor-response functions. Although there is variation from one victim to the next, there are predictable levels of difficulty for different types of tasks. Tasks that refer to details of the temporally concrete present are easier for the victim to relearn than rule-based, fact, and pattern relationships. For instance, the task, “What color is this?”, is generally a lot easier than the task, “What color is your house?” or other things not in the concrete present. Often the learner is not able to pick up on obvious patterns or relationships. For instance, some would consistently miss tasks like these:
· What is the name of the room your bed is in?
· In what room do people take a bath?
· If today is March 23, what will the date be tomorrow?
· What day comes after Wednesday?
Fact relationships are often difficult for them:
· What is your name?
· Where do you live?
· How many children do you have?
Sameness Analysis. The assumption of the training is that the infrasystem has items classified according to various features. The problem is for the learner to produce responses that access the classification system. The assumption is that if the training teaches the learner how to access some items that share a set of features, the system will be able to access other examples that have the same set of features. With practice in a range of item groups that differ greatly from each other, the system will be able to reformulate or transform the entire general access function so that an approximation of normal functioning would be restored.
This prediction tended to be confirmed, often dramatically, through the training. Although there are notable exceptions, the best results were obtained with younger people who had specific deficiencies and who had suffered the trauma within the previous 2 years. The least impressive results we achieved were with a woman of 65 who had suffered a massive stroke over 10 years earlier. Even if she had the potential to learn immediately after her stroke, she had learned adaptive practices and habits that required many trials to extinguish. (We did not take credit for successes obtained with those people who had suffered a trauma less than 6 or 8 months before we started to work with them. The reason is that some form of recovery often comes about naturally, particularly with victims who have only specific cognitive-functioning problems.)
Our procedure was to initially ask questions and identify some things that the victim could not do. To create the program we would:
1. Identify a set of examples that share the same set or sets of features as the failed tasks identified in the testing.
2. Initially present training on one or more failed items through a relatively fast-paced presentation based on the four levels of difficulty for juxtaposed tasks.
3. Introduce new items as the current ones were mastered, and present a cumulative review set that included all earlier taught items.
4. After the learner mastered a set of three or four examples of a category, test the learner on various examples that had not been taught.
Presentation Format. The four levels of difficulty for a particular item would be operationalized as follows. Level 1 would be the same item, repeated after only a short pause. For example, “Listen: Your name is Nicki. What is your name?” (Response) “Yes, Nicki.” (Pause) “What is your name?” (Response) “Yes, Nicki.” This format would be repeated until the learner could perform correctly on about three consecutive trials. Level 2 tasks were presented immediately following Level 1 completion. For example, “Yes, your name is Nicki.” (Pause for 4 or 5 sec) “What’s your name?” Next, Level 3 tasks were presented by interpolating familiar tasks between trials of the targeted task. “Yes, your name is Nicki, isn’t it? Touch your head. Good. Hey, what’s your name? You’re doing pretty well, aren’t you? Do you like doing this? Yes, it’s a drag, isn’t it. Say, what’s your name? . . .” After the learner correctly performed on two or three consecutive examples of “What’s your name?” (each of which was separated by three or four interpolated tasks), the item would be included in the cumulative review set. Items from this set would be presented during every session in random order, and any mistakes would be firmed through the four-level paradigm.
Relearning Trends. The typical overall trend was for the learner to require a great number of trials to master the first few examples. The learner’s performance would then tend to accelerate greatly, and the learner would perform spontaneously on examples that had not been taught. One reason these trends are not often observed in traditional training is the number of trials required to achieve the initial relearning. Commonly, the learners we worked with would require more than 1,000 trials to meet mastery on the first item.
Furthermore, the regressions in performance observed when the next few items were introduced would seem to suggest that the training was a complete failure. After “What’s your name?” reached criterion, other items would be presented that addressed parallel facts, such as, “Where do you live? What is your husband’s name? How many children do you have?” Often the learner required as many trials on these items as on the original. When a new item was introduced, the learner would typically regress on all of the earlier taught items. This trend implies that there are two stages of learning required to reestablish access to the classification system. The first is to relearn the features of the first item. The next is to relearn those features that are the same across parallel items. If the learner learned what was the same about the various early items (facts about self), the learner would be able to access the category based on this common feature (facts about self). This type of
sameness knowledge was lacking in the same way that knowledge of individual items, such as “What is your name?”, was lacking. The learner basically learns the first item from scratch as if no classification system exists. The learner then learns the second item from scratch. Because both are learned as individuals, they are not classified according to sameness of features. This classification is required when both items are in the same set. The system is now required to retrieve either A or B. If they are in disparate classes, the learner has difficulty retrieving the one requested. The fact that these items have the same features actually makes the task more difficult because the system has not clearly classified the items as individuals (the sum of their features) and classified them by each of the features in the sum. If this had occurred, the second item would be relatively easier than the first. The integration of items that are the same with respect to some features forces the system to reestablish the classification of the items by sameness in feature and by individual.
Some learners who had lost a substantial segment of their repertoire made so many mistakes when items were integrated that we created what we called throw-away sets, which consisted of the first few examples of a particular type. The learner would tend to learn these items when they were in a cumulative set, but the introduction of each new example led to extensive reteaching of the other examples (often requiring hundreds of repetitions). When the learner approximated the criterion for mastery on Level 4, we would discard the throw-away set and never use the items again. Typically the learner would learn the next set in far fewer trials and would ultimately gain access to various untaught examples in the set. The trend was for the learners to (a) require fewer trials to relearn subsequent items in a set, and (b) generalize to items that had not been taught.
The relatively low number of trials required to induce the item indicated that the system was able to establish sameness in the features that were relevant to the training. Once this sameness was established, the system tended to regain access to other items classified on the basis of sameness of feature. The general rule was that, once new items could be taught in a relatively few trials (12 or fewer), the learner would generalize to untaught examples. Figure 11.1 presents the performance data on a well-educated man of 64 who had retained extensive parts of his abstract repertoire (Glang, 1987). For instance, he could describe in detail the economic problems of underdeveloped countries. However, there were many basic things he could not do, such as identify common objects, answer simple math-fact questions, and identify how two things are the same. He also had problems identifying members of basic classes like vehicles. The instruction focused on four areas: similarities, object identification, classification, and memory tasks involving numbers. The criterion for introducing new items was mastery. An item or task was considered mastered
FIG. 11.1. Performance on coordinate tasks.
only if the subject answered correctly on the first trial during a session and responded at 95% or above on the item when it was integrated with other taught items. Figure 11.1 shows the number of trials to criterion for new tasks from particular areas. During the first 5 weeks of instruction, the learner required about 400 trials to achieve mastery on items involving similarities, object identification, and classification. He required only about 100 trials to achieve mastery on the number-memory items. The rate of learning new items had increased greatly by Week 10. The learner learned new similarities in less than 50 trials and classification items in less than 20 trials. By Week 15, his rate of learning new objects required few trials, particularly for items that occurred at a relatively high rate in his surroundings. He was functioning at a level that would not readily suggest that he had any serious deficiencies. His WAIS–R verbal IQ had risen from 80 to 95. As Fig. 11.1 shows, the trend was not highly labored. Inducing mastery of the first item required only about 400 trials. After relearning a set of 20 object names, the learner regained access to the classification system and exhibited spontaneous knowledge of many items not taught. For instance, working on math facts led to spontaneous recovery of topics that were not
closely related (such as geometry and logical operations). The trends for some of the other people we worked with were much slower, requiring possibly 100 different items before the system had regained access to a significant segment of untaught examples. These performance facts suggest that what the trauma causes is some form of transformation. What the training does is provide the basis for a compensatory transformation. If the system learns some examples of how to compensate for the transformation, the learner may be able to apply the same compensation pattern to other untaught material. Teaching Highly Unfamiliar Discriminations The trends observed in learning highly unfamiliar discriminations have a strong parallel to those of relearning functions lost through traumatic head injury. Engelmann and Rosov (1975) conducted a series of studies that involved highly unfamiliar content. They used a tactual vocoder (a device that presents tactual patterns that are isomorphic with sound patterns) to teach speech to both hearing and deaf subjects. The vocoder consisted of 24 vibrators that spanned the range of 80 to 10,000 Hz. The vibrators were configured in two banks. One was for the low-frequency sounds such as mmm, rrr, and components of vowel sounds. The other was for higher frequency sounds, such as sss and t. The patterns were presented in real time and would tend to present an analogue of various speech features. If a word was said more loudly, the pattern vibrates more vigorously. If the sound was presented faster, the vibratory pattern would occur faster. If the pitch of the utterance changed, the tactual pattern would shift. The basic training procedure was for the trainer to stand behind the learner so that the learner had no visual cues. The trainer would say words into a microphone. The learner would verbally repeat the word into a microphone. 
Hearing subjects were artificially deafened so they had to rely only on the tactual vibration to identify words the trainer said. The deafening was achieved by ear plugs and a headphone that delivered about 120 dB of white noise. The sequence of word introduction for the deaf students was relatively safe and progressive. The sequence started with highly discriminable words—monkey, elephant, and tree—and progressed to words that shared many features (e.g., man and ran). One study with hearing subjects showed that the sequence that resulted in the fastest learning juxtaposed highly similar words from the beginning. However, this same sequence also resulted in the poorest overall performance when the criterion for mastery was high. If subjects were able to identify the subtle differences between the highly similar words (such as man and ran), they would have no trouble with words that were greatly different. However, if the performance required too
many consecutive correct responses, the subjects tended not to learn the minimally different words.
At the outset, the prediction was that the material would be much more familiar to hearing subjects and they would therefore learn it in far fewer trials than the deaf subjects. The hearing subjects had a repertoire of sounds correlated to speaking. The relationship between changes in sound and speech was far less well understood by the deaf subjects. The results of the training verified the prediction, with hearing subjects able to learn various words, sentences, and cues often in less than one fifth the number of trials required by the deaf subjects. The data clearly show that the task is not the same for hearing and deaf subjects. The deaf subjects had much to learn about identifying individual words and classifying them on the basis of sameness of features.
The training items were presented through a variation of the four-level hierarchy of difficulty based on juxtaposition of items: first repetitions of the same item, then the item presented with one or two interference examples (words that were known, but that were highly dissimilar from the new word), then integration with recently taught items, then the item became part of the cumulative set. New words were added to the set when the learner reached a specified criterion of performance on the recently taught items.
Learning Trends. Instruction occurred daily for about 1 hour. During that time, the learner would work on various activities with some breaks. Typically, the learner would produce hundreds of responses during a training session. The trends for learning words and utterances have some of the same features as the trends for relearning following head injury.
1. For some learners, the first few examples became so confused that they were thrown away.
2. Early in the sequence, a great number of trials were required to induce mastery of new items.
3. The multiple-classification difficulties that learners experienced created a serious plateau after 20 to 30 words had been taught to mastery.
4. Following this plateau, the rate of learning was greatly accelerated.
Figure 11.2 shows the performance of a deaf subject. The subject was tested every week. The test was not preceded by any instruction or review. The initial learning was slow. Observers of the early trials during which the learner was unable to discriminate between elephant and man expressed great doubt that the learner would be able to learn much through tactual vibration.
FIG. 11.2. First trial word identification (from Engelmann & Carnine, 1982). Copyright © 1991 by Engelmann-Becker Corporation.
After Week 10, the subject had mastered a set of only 12 words. The plateau is evident after about 30 words had been introduced. As the data show, the percentage of first-time correct words dropped below 50% for a number of weeks. Starting with Week 30 and continuing to the end of the study, the subject learned new words at a remarkable rate. By Week 45, the subject had a cumulative set of more than 130 words. During the last week, the subject learned over 12 new words, which is more than 15 times faster than the
less stable mastery achieved during the first 10 weeks (if the refirming that had to occur later is taken into account). The learner’s performance continued to accelerate following the record shown in Fig. 11.2. Although the vocabulary now contained many words that differed from others by only a sound, the learner could usually learn new words from a presentation that related the new word to familiar words. “Listen: This word rhymes with fell. What does it rhyme with? This word starts with the sound sss. What sound? Say the word.” A few months after the last data point on the figure, the learner had mastered over 600 words.
For one deaf subject we worked with, the task involved even more highly unfamiliar content than it did for the others. This student had been placed in a residential facility for the mentally retarded, although she was not mentally retarded. When we began work with her, she was 14, very responsive, and good natured, but very naive. She had been taught virtually nothing by adults. What she had learned, she had picked up on her own. We started a program with her that we thought was basic. One bank of vibrators was positioned on each thigh. The trainer would say either the sound mmm or sss. One sound activated vibrators on the left thigh and the other vibrators on the right thigh. The task that we presented was for the learner to point to the bank of vibrators activated by the sound. Her performance was random even after modeling. We simplified the task by disabling one of the banks. All she had to do now was point to the left vibrator when it started to vibrate. After more than 100 trials, her performance was random, with frequent pointing to the bank on her right leg. Finally, we removed the right bank completely, but for possibly 50 trials, she would randomly point to her right leg, although it had no vibrator. When she finally caught on, she achieved something of the same learning curve shown in Fig. 11.2; however, it was much faster.
In fact, it was probably the fastest learning of any of the deaf subjects we taught. (This is a speculation because not all of them went through the same word list or the same criteria for introducing new words.) She learned more than 150 words in about 20 weeks.

Possibly one of the more interesting features of learning highly unfamiliar content is that the learner’s initial performance does not predict how fast the learner will learn the material. Sometimes a person with only modest loss of functions through head injury requires an unusually large number of trials to achieve recovery. In contrast, the later performance of the deaf girl could not have been predicted from her initial performance. In all cases of highly unfamiliar learning, the number of trials is great, generally far beyond that normally addressed through initial or rehabilitation training.

The number of trials or amount of time is not that startling, however, if we look at the performance of the human infant. Although the learning contexts for the infant and the adult are quite different, the rate at
which the infant learns language is not impressive if one counts trials or exposures. The 18-month-old has a repertoire of only a handful of spoken words and possibly fewer than 100 that are understood. The learning task of the infant is more difficult in some ways because the infant must learn both word meaning and sound features of particular words. In other ways, however, the tasks facing the older deaf child are more demanding than those facing the infant because the learner has already learned some things about speech and language, and many of these must be relearned to accommodate what the learner is now required to learn. The judgment that the training requires too many trials, therefore, tends to apply as much to the human infant as to the deaf youngster or stroke victim.

Teaching Secondary Reinforcers and Unfamiliar Content

The teaching of many complex skills implies programs that establish secondary reinforcers and teach highly unfamiliar content. Such a set of complex components is illustrated by a young retarded adult. The instructional problem was presented by a psychologist who had been working with the young man in a residential setting. The problem was described as “He can’t schedule his time.” A week earlier, the young man and psychologist had agreed that the man would do his laundry after lunch at 1:00 p.m. The psychologist rehearsed him, making sure he knew the current time and when he would do his laundry, but when 1:00 p.m. came around, he completely forgot about it, every day. The psychologist introduced enticing reinforcers, but there was no change in the young man’s performance.

There was a somewhat obvious solution: Give the learner an alarm watch, set at 1, and make sure he knows what to do when the alarm goes off. The psychologist, however, was interested in providing a program that would teach the skills. As it is with all programs, the starting point of the program had to be identified.
The ideal starting point is a variation of the targeted task that permits the learner to perform at about 70% correct (or that would yield this percentage after modest training). So the initial assessment was based on the idea that the central skill components of the terminal task could be incorporated in a simpler task:

1. The time could be reduced, thereby reducing the memory requirements.
2. The interference or number of potential distracters that occur between the current time and targeted time could be reduced.
3. The time referent could be made simpler by linking it to an event that occurs at a particular time rather than to clock time.
4. The behavior to be performed could be prompted by a unique routine that created additional predictors of what to do.

We evaluated the learner and identified how many of the central skill components he knew. We first determined the extent to which he could estimate time. We had him look at his watch for 15 sec. For the next trial, we told him that we wanted him to guess when 15 sec had passed without looking at his watch. He performed quite well. We changed the criterion to 30 sec, and he again performed remarkably well. Even at 1 minute, his estimations were within 5 sec.

Next, we tested his responses to distraction. We told him we were going to try to fool him, but that he had to remember when a minute had passed and signal by saying “stop” (the same signal he had used for the other examples). For example, we told him that the time was starting now, then asked him his favorite TV show. He said it was “NYPD Blue.” We responded by saying, “Oh, that show stinks.” For the next 5 minutes, he explained, with considerable fervor, why it was the best show imaginable. After 5 minutes, we stopped him and asked, “Weren’t you supposed to do something?” He looked puzzled and shrugged. We offered a couple of other prompts. After the third one, he smiled, shook his head, and said, “Oh yeah, I should say stop. I forgot.”

His behavior provided us with enough information about where to start and how to proceed. We had to start with an example of “At a particular time do something.” The task would have far fewer variables and distracters than the task “Do the laundry at 1 o’clock.” The primary problem we had to address was the learner’s inability to remember something after being distracted. We knew that he had the capacity to estimate relatively short periods of time, but that even within these periods a highly engaging distracter would completely override the resolution of what to do.
So part of the program worked on presenting distracters of the type we presented during the assessment. When the learner performed acceptably on those tasks that involved relatively shorter periods, we would either increase the magnitude of the distracters or increase the time interval. If the learner dropped to about 60% correct, we made the task easier. When the learner performed over 80% correct, we made the criterion more demanding. At the same time, we introduced a separate set of tasks that were referenced to lunch, not to 1:00. These addressed the laundry issue. The content map that the instruction was designed to induce was, “When you leave the dining area after lunch, you do your laundry.” Initially, we reminded him of the task he was to perform about 1 minute before he completed lunch. After the learner performed successfully on six occasions, we provided him with a card hung around his neck. On
the card was written “Laundry after lunch.” The task was the same every day. Before lunch, we would remind him of the rule. “When you walk through the door after lunch, touch your card.” The purpose of this response was to make the learner’s thinking overt. (If he remembered to touch the card, he probably remembered what he should do.) Without such a device, we would have to wait until he failed to do the laundry. Then we would be working from a negative example, not a positive example. If he failed to respond to the card, we were still able to provide a correction that involved touching the card. The learner would be reinforced for touching his card by receiving a ticket that he could later exchange for a treat. In other words, the card was established as a secondary reinforcer. If the learner did not touch the card, he would return to the table and sit for about 1 minute. (He would estimate the time.) Then he would leave the dining area and touch his card as he walked through the doorway. If he touched his card and forgot to do his laundry on time, he would be required to read the card aloud after leaving the dining area. After a couple of sessions, we would again drop this requirement. As he became more proficient, we generalized the function of the card. Variations of cards then were used for a variety of independent activities. The content map we induced was expanded to the form, “When you complete event X, you do activity Y.” We introduced scheduled tasks after breakfast, lunch, and dinner. Each would be written on the card. The learner would use the same generalized behavior on all occasions of leaving the dining area—touching his card. We would sometimes ask, “What are you supposed to do now?” From the beginning, he would usually be able to respond correctly without reading his card. The final step was to replace the original card with a tiny one that he wore under his shirt. This version of the card had nothing written on it. 
The behavior that was required, however, remained the same. The learner would touch the card when leaving the dining area. The card would serve as a prompt for what he was to do next. The reinforcement schedule was changed so that he no longer received a ticket for each successful card touch, but the learner was still able to receive some treats and praise for remembering to follow his schedule, which he did quite consistently. Other programs are certainly possible. All successful programs would have to start with a simplified version of the terminal task (or a component of the terminal task). They would reference the starting point to the learner’s current performance data and create intermediate tasks or stages so that there were various reinforceable events that could be presented at a relatively high rate. (For example, practicing waiting 1 minute, walking through the doorway, and touching the card could be repeated as many times as it took for the learner to become proficient.) Finally, any success-
ful program would systematically remove the prompted elements from the chain. The program described shows the relationships among highly unfamiliar learning, secondary reinforcers, and generalized reinforcers. The learning was unfamiliar because the learner was initially a long way from being able to perform the terminal task. Part of the program involved creating a chain that could be generalized to a variety of behaviors. Within this chain were various secondary reinforcers. The card-touching led to a ticket, which led to a treat and praise. The card also served as a generalized reinforcer within the context of leaving the dining area. The card also served as something of a default content map. When the learner touched the card, he knew he had to do something next. The miniature card did not indicate what that something was, but each instance of touching the card prompted the same communication between agent and infrasystem: “There is something you have to do now. Recall what it is.” The final program provisions permitted the learner to practice components to mastery. The practice was referenced to performance of about 70% correct. No judgments were made about how long it would take for the learner to exceed 70% or when the performance requirements would be made more demanding. The only judgment was that if the learner dropped to around 60% correct, the task requirements were too difficult and needed to be modified so the learner achieved a higher percentage of correct responses. Some skills were taught through a program that shaped the context of the response. For others, we shaped the response by changing the criterion for mastery. We could not simplify some, however, and had to achieve learning through the extensive practice needed to modify the learner’s response repertoire. In general, the staging of different tasks addressed the secondary reinforcement issues. 
The practice formats and changing of performance criteria to induce mastery addressed the highly unfamiliar aspects of the content to be learned.
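The criterion-adjustment rule used in this program (make the task easier at about 60% correct, more demanding above 80%) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer difficulty levels, the function name `adjust_difficulty`, and the sample percentages are hypothetical and do not come from the program records.

```python
def adjust_difficulty(level, percent_correct,
                      easier_at_or_below=60.0, harder_above=80.0):
    """Return the next difficulty level for a practice task.

    Levels are hypothetical integers (0 = easiest). The two thresholds
    follow the rule described in the text; everything else is assumed.
    """
    if percent_correct <= easier_at_or_below:
        # Task requirements are too difficult: step back, but not below 0.
        return max(0, level - 1)
    if percent_correct > harder_above:
        # Learner is performing well: make the criterion more demanding.
        return level + 1
    # Between the thresholds the task stays as it is.
    return level


# Hypothetical percent-correct scores for six successive practice blocks.
block_scores = [70, 85, 90, 55, 75, 82]

level = 2
history = []
for score in block_scores:
    level = adjust_difficulty(level, score)
    history.append(level)

print(history)  # [2, 3, 4, 3, 3, 4]
```

The sketch makes no prediction about how long the learner stays at any level, which matches the program's refusal to judge in advance how long mastery would take.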
SUMMARY

An axiom of performance for voluntary behavior is that the agent must have a content map about when and how to produce the behavior. Unless both when and how are known, the agent is not able to plan the response in sufficient detail. If the learner does not know how to produce the response, either the infrasystem will not be able to translate the learner’s directive into the intended response or the response has not been fixed in the agent’s repertoire, and therefore the infrasystem will not be able to carry out the directive. The agent could direct the response of lifting up the back end of a bus, and the infrasystem would try to achieve this response, but the body would probably not comply. The naive agent could direct the juggling of three balls, and the infrasystem would not be able to respond because the directive would not be specific enough for the infrasystem to perform the interaction of responses necessary to juggle.

To induce any type of new learning, some form of practice is implied. Its purpose is to modify the communication among the agent, its repertoire, and the infrasystem. For learning tasks that involve only a single step to predict the primary reinforcer, the programs are relatively simple. Tasks that involve secondary reinforcers are more complicated because they involve more than one predictor and more than one response. If two basic tasks are linked together so the second predicts the primary reinforcer, the second discriminative stimulus must have a dual role of being both a predictor and a reinforcer. It is a predictor that has been positively charged with secondary sensation so that it serves a reinforcing function.

Secondary reinforcement relationships that logically cannot be supported by a default content map imply some form of program simply because the probability of learning the relationships without a program is low. Furthermore, the learning often relies on chance. In its simplest form, secondary learning does not involve the learning of motor responses the learner is not able to produce. Rather, it requires simply learning the predictors of the responses. These may be in the form of response strategies, but they involve only learned response components. Programs that teach secondary reinforcement relationships establish the relationship of one predictor to an established reinforcer, then add a loop that involves another step.

Another type of content learning involves highly unfamiliar discriminations.
The content is highly unfamiliar because initially there is a low probability that the learner will be able to produce an acceptable response. Learning new motor skills is an example of highly unfamiliar learning. There are various degrees of unfamiliarity. A person learning golf who is proficient at baseball, croquet, and swinging an ax would have less to learn than a person who had no such experiences. For all, however, a successful outcome would not be achieved with only a few practice trials. The programs for this type of learning do not have the obvious parts or components that the secondary reinforcer tasks have. The programs are derived analytically by attempting to find a starting point that will yield about 70% correct responses. This endeavor attempts to identify a component of target discrimination that is either a simpler task or a simpler context for the terminal task. A practical requirement for any simplified starting point is that it should permit some form of relatively rapid temporal juxtaposition of examples. For instance, if the discrimination involves tracking, the prac-
tice formats would not begin with fieldwork, but with visual representations and diagrams that could be presented quickly and show minimum differences of the different discrimination so that the amount of relevant practice the learner would receive would be many times the amount generated during the same time in the field. Highly unfamiliar learning requires the agent to attend to variables that the learner has never attended to. The initial training may go slowly. Even the simplest examples we are able to create may lead to a percentage of correct responses far below 70%. A dramatic example of highly unfamiliar learning is the task of learning to use a sensory modality for an entirely new purpose. The performance of adult, congenital cataract patients when their vision is restored provides evidence of the complexity of learning to isolate and identify objects (Senden, 1932). For some highly unfamiliar learning, the agent must relearn directives to access items already classified in the infrasystem. To gain access, the learner must relearn the features of individual examples and also relearn how these examples are the same and, therefore, how they are classified with other examples that share specific features. Relearning the first few individuals from a class is highly unfamiliar to the agent and may require hundreds of trials. Learning about the features shared by the several examples from a particular class induces confusion in the earlier learned examples. After two examples from the class have been taught to some level of mastery, the teaching of the third example predictably results in a great deterioration of the learner’s performance on the earlier taught examples. When the learner is able to access the items consistently, the learner has reestablished the directive for the class, which means that the learner now tends to perform on various examples that had not been presented during the training. 
Learning new responses involves programs that have the same basic features as those that involve highly unfamiliar content. If the response is to become part of a voluntary response system, the necessary response details must be known to the agent. The agent must know what it is to direct it. Both secondary learning and learning of highly unfamiliar skills require learning a great number of skills or discriminations. An effective instructional program lessens the gap between what the learner is required to do and what the learner is able to do. The program’s small steps provide a more articulate communication with the learner because there are either fewer variables that could account for the outcome or the response demands are much less than those of the terminal task. Each step has a higher probability of inducing mastery. Learning of both highly unfamiliar skills and secondary learning is therefore greatly influenced by the programs that the learner has received (either intentionally or unintentionally).
Chapter 12
Experimental Designs
It is not possible to assess learning systematically without observing or controlling the basis of inducing learning. If the quiddity of learning is the specific content, the basis for inducing learning must meet the requirement of providing sufficient information about the content. We cannot assess what learning occurs without considering the process of how learning occurs. What is learned is not merely a tendency, but a content-map rule that is precise and strictly based on the information the experimental intervention provides. The rule accounts for whatever tendencies are observed.

Experiments may be viewed as vehicles for teaching specific content to a learner. Without teaching specific content (what to respond to, how to respond, when to respond, why to respond), no experimental intervention is possible. The experiment presents the learner with information that is to influence the responses the learner produces. Without presenting examples and securing responses that are supposedly indicators that the learner has learned something, there are no experimental results. Regardless of how minimal the learning is, if it changes the learner’s behavior, it teaches the learner something. It presents some form of communication to the learner.

A serious problem often occurs when the outcomes of an intervention are interpreted. If the design does not take the communication variables into account (what the examples logically imply and what learning options there are for performing correctly), the experiment may provide great distortions in any conclusions about the learner’s capacity to learn and even what the learner actually learned from the intervention.
It is not possible to learn without identifying a feature of sameness shared by more than one example. That sameness feature is the basis, and the only basis, for generalization. Therefore, what the learner learns predicts the pattern of generalization that will be observed. There is no general or amorphous learning, only specific strategies that involve specific content, even if the specific strategy is to guess or produce no response.

To assess learning from the experimental facts, we must consider the possible interpretations that are consistent with the experimental outcomes and design. We must consider the possibilities that are generated by the design and interpret the behavior within this framework. We must consider the information function of the set of training examples, patterns of reinforcement, and other variables that affect the communication.

The basic assumption of all training, including experimental interventions, is that the intervention generates possible interpretations or content-map rules. What the learner learns falls within the range of possible interpretations. If the design communicates or is consistent with only a single interpretation, the prediction is that all subjects would exhibit the same pattern of generalization on an extended set of examples that share the positive and negative features used in the training set, but that have never been presented to the learner before. In the same way, if the training procedures communicate more than one possible interpretation, there is more than one pattern of generalization exhibited by the experimental subjects; however, all observed patterns will be based on the interpretations generated by the design.

If what is learned is a function of the possible interpretations created by the experimental variables, the principal concern becomes that of controlling these variables so they clearly communicate only one interpretation.
(Alternatively, the design could control for possibly three interpretations and document the percentage of subjects that learned each possible interpretation.) In either case, the emphasis on providing information that generates interpretations moves the focus to specific content—what is and is not learned. This is not to say that the measurement of rate and other aspects of learning are unimportant, only that it is not particularly important to the nature of what is learned. Rate is important when learning highly unfamiliar content (chap. 11) and when learning those tasks that imply a minimum rate of performance, such as speaking. The techniques of controlling the communication with the learner and the effectiveness of the design are based on analytical techniques and empirical evidence. There is no best method that may be identified before receiving empirical data on various approaches. Rather, there are some designs that are analytically better than others and that produce better results.
FEATURES AND INDIVIDUALS

The preceding chapters presented several recurrent themes that are particularly germane to experimental design issues. One was the need for multiple classifications of individuals so they may be effectively treated on four levels of specificity:

1. Completely generalized nonvariables recognized as being present in the current setting (background details)
2. Very general ambient variables relevant because of universal features that are relevant to responses the learner produces (barriers or possible facilitators of the response)
3. Specific variables relevant to the pursuit (referenced to a single- or multiple-feature criterion for pursuing)
4. Individuals recognized as unique entities according to the sum of their features.

Another recurrent theme is that the information the learner receives about a discrimination or response strategy may admit to several interpretations, all of which are consistent with the evidence the learner has received. In chapter 9, we posed the problem of continuous-variation features that imply the need for interpolation and extrapolation of known data. If the reinforcing patterns create a gap or large difference in the color between known positives and negatives, the design is consistent with various possible interpretations about how much of the gap is positive and how much is negative. The learner may assume that the range of positives is closely circumscribed by the known positive examples, or the learner may assume that the range extends considerably beyond the known positives. Neither interpretation is contradicted by any data the learner has received about known positives and negatives.
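The gap problem described above can be illustrated with a small sketch. It assumes a single continuous feature scaled from 0 to 1 (for instance, a position on a color continuum); the example values, the two candidate rules, and all names are hypothetical and chosen only to show that both rules agree with every training example while disagreeing inside the gap.

```python
# Hypothetical training data on one continuous feature (0.0 to 1.0).
known_positives = [0.40, 0.45, 0.50]
known_negatives = [0.10, 0.90]


def narrow_rule(value):
    """Positives are closely circumscribed by the known positives."""
    return 0.40 <= value <= 0.50


def broad_rule(value):
    """Positives extend well into the gap toward the known negatives."""
    return 0.20 <= value <= 0.80


def consistent_with_training(rule):
    """True if the rule classifies every training example correctly."""
    accepts_positives = all(rule(v) for v in known_positives)
    rejects_negatives = not any(rule(v) for v in known_negatives)
    return accepts_positives and rejects_negatives


# A new example that falls in the gap between positives and negatives.
probe = 0.65

both_consistent = (consistent_with_training(narrow_rule)
                   and consistent_with_training(broad_rule))
print(both_consistent)                         # True
print(narrow_rule(probe), broad_rule(probe))   # False True
```

Only a test example drawn from the gap reveals which interpretation a learner adopted, which is why the training data alone cannot settle the question.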
THE LOGIC OF TRAINING

The set of training examples used in an experiment or intervention is ideally designed to present a microcosm of the full set. The basic assumption of the training examples is that if the learner responds correctly to the set of training examples, the learner has learned the features that are relevant to classifying all relevant examples that will ever be encountered within the larger population. Once the larger population of tasks is identified, the set of training examples and training tasks must be carefully specified.
Frequency of Positive Examples

The training set of examples must mirror the total population of examples in features that will be encountered, but not in frequency of occurrences. If the experiment used a random selection to identify 16 training examples, the set would tend to show all the types of examples, but it would tend to represent the most common types far more frequently than the less common types. The population of 16 members (referred to in chap. 10) shows the natural frequency tendencies. There is only one example that has any unique combination of four features, and there are eight examples of any single feature. If the goal were to teach single-feature classifications, random order of examples would provide an adequate mix of positives and negatives. For four-feature combinations, however, the set is not well designed because positives would occur at the rate of about once every 16 trials. For effective instruction, the training set of examples would have to be distorted so that a targeted four-feature combination was represented more than once in the population. (Alternatively, tasks could be designed that sampled the targeted individual more frequently than random selection would do.)
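The frequency argument can be checked by enumeration. The sketch below assumes that the 16-member population is formed by four two-valued features, which is consistent with the counts given here (one member per four-feature combination, eight members per single feature) but is not spelled out in this section. The feature names, the choice of target, and the number of extra copies are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumption: four binary features generate the 16-member population.
FEATURE_NAMES = ("A", "B", "C", "D")  # hypothetical labels
population = list(product((False, True), repeat=len(FEATURE_NAMES)))

# Hypothetical target: the one member that has all four features.
target = (True, True, True, True)


def positive_rate(examples, is_positive):
    """Fraction of the examples classified as positive."""
    hits = sum(1 for example in examples if is_positive(example))
    return Fraction(hits, len(examples))


# Single-feature classification: half of the members are positive.
single_feature_rate = positive_rate(population, lambda e: e[0])

# Four-feature classification: one positive in 16 members.
four_feature_rate = positive_rate(population, lambda e: e == target)

# A deliberately distorted training set with extra copies of the target.
EXTRA_COPIES = 5  # hypothetical choice
training_set = population + [target] * EXTRA_COPIES
distorted_rate = positive_rate(training_set, lambda e: e == target)

print(len(population))       # 16
print(single_feature_rate)   # 1/2
print(four_feature_rate)     # 1/16
print(distorted_rate)        # 2/7
```

Adding copies of the target changes only how often the positive appears, not which feature combinations the set contains, so the distorted set still mirrors the population in features.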
Unwanted Interpretations

Just as the training set must be contrived if it is to provide an adequate representation of positives, it must be further contrived to ensure that it preempts, rules out, or contradicts any interpretation except the intended one. An unwanted interpretation allows the learner to identify all positive and negative training examples correctly, and only when new examples are presented would the unwanted interpretation be apparent.

There are two ways to eliminate the possible unintended interpretations. One is to eliminate (or greatly reduce) the gap between positives and negatives. The other is to provide early information about the range of the positive examples.

The process of reducing the gap involves presenting extreme examples that mark the boundaries of the positives. If the difference between positives and negatives is small, the learner is provided with precise information about the positive and negative conditions. The range of training examples does not have to include every variation, but should provide enough information to promote interpolation from known data points to other examples that have not been presented. With example sets that show the appropriate range, the learner would interpolate to any other positive
example within the implied range, but would not extrapolate beyond the extreme positives. To provide early information about the range of positive variation, stipulation of positives is effective. This procedure does not attempt to draw the line between positives and negatives, merely to imply the range of positive variation by repeatedly presenting examples that fall within a particular range of variation. Whatever common features these examples have will be assumed by the learner to describe the positives. The greater the number of exposures to the stipulated positive examples, the less the learner will tend to generalize beyond the range of examples shown.
PRINCIPLES FOR CONTROLLING COMMUNICATION VARIABLES

Teaching through the presentation of examples provides the learner with classification information circuitously: not by somehow telling the learner the rule, but by arranging reinforcing contingencies so that a more positive reinforcing consequence follows positive examples. The learner operates from information that positive reinforcement implies a plan or content map that worked well and should be repeated. A consequence that is not positive implies a plan or content map that failed.

The basic issue is that if the antecedent and consequent conditions are not controlled, it becomes difficult to interpret the experimental outcomes. The learner’s performance under uncontrolled presentations will never be as good as it would be under carefully controlled conditions. If the training is not designed to control the communication details conveyed through positive and negative reinforcers, there is no basis for suggesting what the learner would or could learn under conditions that do control these variables. In contrast, if the training does control the variables, the results show what the learner is capable of learning under near ideal conditions, and therefore imply how learners will perform under conditions that do not exercise complete control over the controllable variables. How the learner performs in situations other than ideal is therefore viewed as a joint function of the learner’s potential and of the communication variables, not strictly as a function of the learner. This information is valuable if the goal is to deduce and test generalizable canons of learning and performance.

The communication provided by an experimental design could be so poorly controlled that it is highly unlikely that the learner would have any chance of learning. In contrast, a carefully controlled design makes learning far more likely. Control is required for three main components of the design:

1. The task and the responses that are required,
2. The reinforcing contingencies that are provided, and
3. The pattern of juxtaposition of examples that are presented.

Four basic principles govern the type and degree of control that can be exercised over the design of the task, reinforcing contingencies, and pattern of examples presented.

1. Discriminations Taught Separately

If multiple discriminations are to be learned, the task is easier if it requires learning just one discrimination at a time. This principle provides a simple roadmap for controlling the relative difficulty of the task that is presented and the sequence of the examples: one discrimination at a time. Learning a discrimination that involves more than one feature is logically more difficult than learning a discrimination that involves a single feature. Therefore, faster overall learning would be predicted if the two-feature discrimination were staged so that one of the component discriminations was learned before the combination was introduced.

Minimum-Difference Negatives. One reason for the relative clarity of the single-feature discrimination has to do with the negative examples. The rule is that the number of features required for the discrimination describes the types of minimum-difference examples needed to ensure that the learner attends to each feature. If the discrimination requires two features, two types of minimum-difference negatives are implied, one for each feature. For example, all positives have features M + N. One set of minimum-difference negatives would lack M only. Another set would lack N only. Possibly (but not necessarily) the negative examples would include a third set that lacked both M and N. (These would be classified as negative through extrapolation even if they were not included in the training set.)
Without the two minimum-difference negatives (M only and N only), however, there would be no basis for concluding that the learner actually discriminated on the basis of both features, M and N. If the only negatives that were minimally different from the positives had feature M, it would be possible for the learner to learn one of two possible content maps: (a) the presence of N predicts positive examples, or (b) the presence of M + N predicts positive examples. If both types of minimum-difference negatives are presented, possible interpretation (a) is eliminated. The minimum-difference negatives that have feature N contradict the possible interpretation that N alone predicts positives.

2. Desirable Ratio of Correct/Incorrect Responses

If the ratio of correct responses to incorrect responses is relatively great, the task is easier to learn.

This principle expresses an empirical way to measure the effectiveness of the communication during learning. If the learner is responding correctly more frequently than incorrectly, the learner is receiving more productive information than if a lower percentage of responses were correct. The principle about ratios does not imply that the experimental design should attempt to achieve 100% correct. If the communication is well designed, failure to respond correctly to positives or negatives provides the learner with important information about which features are necessary to the positive classification.

A desirable ratio for a targeted task is not always achievable without arranging some form of program or progression of subtasks that introduces a one-feature task and then builds on the first. Later examples in this chapter present task sequences that would be more likely to achieve desirable ratios of correct to incorrect responses than less carefully designed sequences.

A poor ratio of correct/incorrect responses for a targeted task does not suggest exactly what needs to change, merely that something should change: possibly the task, possibly the sequence of examples, or possibly the conventions for providing reinforcement. More precise information about error tendencies provides more exact information about both what needs to change and how the successful changes will be identified. Adjustments made on the basis of low correct-response ratios, and a subsequent increase in that ratio, do not mean that the revised task is now designed so that the correct response is a reliable indicator that the learner has learned the intended content.
Rather, the increased ratio merely indicates that it is now more probable that the learner will produce responses reinforced according to the rules of the design. If the learning involves discriminations that are highly unfamiliar, the ratio on the terminal task may remain low for a substantial number of trials (hundreds). 3. Mastery Achieved With Fewer Tasks and Trials The sequence of tasks that reaches the terminal performance goal in relatively fewer trials or less time is judged to be better than a sequence requiring more trials or time. Principles 1 and 2 do not imply that the process of learning should be atomized and broken down to a kind of linear progression of task variations that starts with the most salient single feature and adds on a feature at a time until finally leading to the ultimate goal tasks. If
it is possible to follow Principles 1 and 2 to arrive at the terminal performance through a progression of only four task types, that sequence is superior to one that requires nine task types. The less elaborate sequence is desirable because it induces fewer variables and therefore less possibility of unintended stipulation, unintended interpretation, and unnecessary learning of isolated components. 4. Greatly Reduced Performance Variation Across Learners If the presentation of examples is consistent with a single interpretation rather than with various possible interpretations, there will be virtually no variation across all learners that complete the goal tasks. All subjects have the same pattern of generalization because the set of examples admits to this pattern and this pattern only. Providing communication that admits to only a single interpretation is possibly the most critical concern in designing learning. Although it is possible to waffle some on the other principles—presenting a less than desirable ratio of correct–incorrect responses or a less than minimum sequence of tasks—violations of the single-interpretation principle lead to false conclusions about learning and the importance of the procedures used to communicate content to the learner. Unwanted interpretations can be identified before the fact through an analysis of the examples and the task. If it is possible to identify a rule other than the intended one that would permit the learner to classify all examples correctly, there is a probability that some learners will learn this interpretation. However, this before-the-fact analysis may not reveal all possible alternative interpretations. 
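A before-the-fact analysis of this kind can be sketched as a search for rules other than the intended one that classify every training example correctly. The sketch below is an illustration under stated assumptions, not the authors' procedure: it considers only rules that are conjunctions of feature values shared by all positives, it encodes each example as a (height, color, width, shape) tuple, and the name `consistent_rules` is hypothetical. As the text notes, an analysis like this may not reveal every alternative interpretation.

```python
from itertools import combinations

FEATURES = ("height", "color", "width", "shape")


def consistent_rules(positives, negatives):
    """List every conjunctive rule that accepts all positives and rejects all negatives."""
    # Only feature values common to all positives can appear in a rule.
    shared = {
        i: positives[0][i]
        for i in range(len(FEATURES))
        if all(p[i] == positives[0][i] for p in positives)
    }

    def matches(obj, subset):
        return all(obj[i] == shared[i] for i in subset)

    rules = []
    for size in range(len(shared) + 1):
        for subset in combinations(sorted(shared), size):
            if not any(matches(n, subset) for n in negatives):
                rules.append(tuple(FEATURES[i] for i in subset))
    return rules


# Hypothetical training set: one positive type, three one-feature negatives.
positives = [("tall", "red", "narrow", "bottle")]
negatives = [
    ("tall", "blue", "narrow", "bottle"),   # differs in color only
    ("short", "red", "narrow", "bottle"),   # differs in height only
    ("tall", "red", "wide", "bottle"),      # differs in width only
]
for rule in consistent_rules(positives, negatives):
    print(rule)
```

In this hypothetical set, two rules survive: the intended four-feature rule and a three-feature rule that ignores shape. Adding a negative that differs from the positives in shape only leaves the four-feature rule as the single surviving interpretation.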
Unwanted interpretations can be identified empirically after the fact through a record of the learner’s performance on test examples not presented during training, but that have the same set of features as the training examples and therefore should be responded to with the same content-map rule that applies to the training examples. Any particular pattern of learning that is observed in several learners and that deviates in the same specific way from the intended rule implies that the program sequence caused an unintended interpretation. Careful analysis of the sequence that induced the unwanted interpretation will reveal the weakness.

If the learner overgeneralizes by responding to some new negative examples as if they were positives, the problem with the training set is that it did not adequately communicate the minimum differences between positives and negatives. It needs specific minimum-difference examples or a task form that effectively teaches the feature difference. If the learner does not correctly identify all positive test examples (undergeneralizes), the full range of positives was not adequately represented in the training set. If the learner responds incorrectly to some positives and some negatives, the communication was poorly designed. The performance calls for a careful examination of the tasks, examples, and particularly the patterns for presenting juxtaposed examples. For instance, if the set of training examples is designed so they alternate between positive and negative, an interpretation is that if the first example is positive, the next is negative. On the test, when the examples occur in a random order, the learner will miss items that do not follow the alternating rule.

Mastery Implications. The identification of unintended interpretations should involve a population of learners that achieve mastery on the training set. The reason is that each learner will adopt only one rule. If there are two rules that are consistent with the information provided during training and that are equal in probability, half the learners in the population would learn the unintended rule. The population would be extremely bimodal. If the learners are not brought to mastery on the training set, however, we would observe a less clear-cut relationship. Clear inferences about how the training should change are not apparent. When all subjects achieve mastery on the training set, all share a content knowledge basis for the common performance. Any variation in how much effort was required to induce mastery, and in the learners’ performance on the larger set, is a function of individual differences.
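The after-the-fact diagnosis described above can be sketched as a small classifier of test errors. This is a minimal illustration only; the function name `diagnose` and the encoding of each test response as a pair of booleans are assumptions, not part of the original text.

```python
def diagnose(responses):
    """Classify a learner's pattern of errors on test examples.

    `responses` is a list of (is_positive, treated_as_positive) pairs,
    one pair per test example (a hypothetical encoding).
    """
    false_pos = sum(1 for actual, said in responses if said and not actual)
    false_neg = sum(1 for actual, said in responses if actual and not said)
    if false_pos and false_neg:
        return "mixed"             # re-examine tasks, examples, juxtapositions
    if false_pos:
        return "overgeneralizes"   # training lacked minimum-difference negatives
    if false_neg:
        return "undergeneralizes"  # training lacked the full range of positives
    return "no errors"


# A learner who accepts one negative but misses no positives.
print(diagnose([(True, True), (False, True), (False, False)]))
```

The returned label only names the error pattern; which training examples to add or change still depends on an analysis of the sequence, as the text describes.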
ILLUSTRATION OF COMMUNICATION TECHNIQUES

We can illustrate the role of different design techniques by teaching the five discriminations involving the population of 16 individuals presented in chapter 10. For all the illustrations that follow, a station is illuminated. The learner is to place objects in the illuminated station. After the trial, the objects are returned to the population of objects. Positives are not returned to their original location, but to new locations. This convention ensures that the learner does not key on a spurious feature of the positives (location). The total sequence involves inducing mastery on the following five discriminations:

1. The tall, red, narrow bottle goes in Station 1. (four features)
2. The tall, blue, narrow bottle goes in Station 2. (four features)
3. The short, blue boxes go in Station 3. (three features)
4. The narrow bottles go in Station 4. (two features)
5. The red objects go in Station 5. (one feature)
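The five discriminations can be written down as explicit rules over a small population. The sketch below is an illustration under a stated assumption: the population of 16 is taken to be every combination of four two-valued features (height, color, width, shape), which matches the features named in this chapter but is inferred rather than quoted from chapter 10. The names `POPULATION`, `STATION_RULES`, and `positives_for` are hypothetical.

```python
from itertools import product

FEATURE_INDEX = {"height": 0, "color": 1, "width": 2, "shape": 3}

# Assumption: the 16 types are all combinations of four two-valued features.
POPULATION = list(product(
    ("tall", "short"), ("red", "blue"), ("narrow", "wide"), ("bottle", "box")
))

# Each station rule lists only the feature values it requires.
STATION_RULES = {
    1: {"height": "tall", "color": "red", "width": "narrow", "shape": "bottle"},
    2: {"height": "tall", "color": "blue", "width": "narrow", "shape": "bottle"},
    3: {"height": "short", "color": "blue", "shape": "box"},
    4: {"width": "narrow", "shape": "bottle"},
    5: {"color": "red"},
}


def positives_for(station):
    """Return the population types that satisfy a station's rule."""
    rule = STATION_RULES[station]
    return [
        obj for obj in POPULATION
        if all(obj[FEATURE_INDEX[name]] == value for name, value in rule.items())
    ]


for station in STATION_RULES:
    print(f"Station {station}: {len(positives_for(station))} positive type(s)")
```

Under this encoding, the fewer features a rule requires, the more types it admits, and the same type can be a positive for more than one station (the tall, red, narrow bottle satisfies the rules for Stations 1, 4, and 5).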
Minimum Scaffolding

The most poorly designed format for inducing learning of the five discriminations would be to simply present the training examples for the five discriminations and differentially reinforce correct responses. Five stations would be set up. The entire population of 16 members would be presented to the learner. One of the stations would be illuminated. If the learner responded by placing an appropriate individual in that location, the learner would receive positive reinforcement. If the learner placed an incorrect member in the location, the member would be returned to the group and no reinforcement would be provided. The procedure would be repeated with stations being randomly illuminated.

This format violates the single-discrimination requirement. Therefore, it would not bring the learners to mastery in relatively few trials, would not lead to a desirable ratio of correct responses, and would not induce a small range of performance differences among learners who completed the training. There would be great individual differences in rate and success, with possibly most learners never learning some of the discriminations. It could be years before the learner figured out all the rules or months before the learner figured out even one of them. If we used this experimental design to draw a conclusion about the learners’ capacities to learn the five discriminations, our conclusion would point to an inability of the learner rather than to the communication deficits of the design.

Prompts That Increase Motivation

An indirect way to improve the design would be to provide strong reinforcers for correct performance. For example, we induce water deprivation for 20 hours before the experimental sessions. There would be water at each station, but the only way the learner would be able to access the water would be to put an appropriate member in the location illuminated.
Clearly, this control would hasten the learning process because the deprivation increases the probability that the learner will try to figure out ways of securing the water and therefore produce behaviors related to the stations. A possible misrule could be conveyed by conventions for issuing positive reinforcement only. If the learner is reinforced for placing correct objects in a station, but receives no penalty for placing incorrect ones in the location, the presentation is consistent with the interpretation that the goal is to cram as many objects into a station as possible. This rule would lead to a high rate of reinforcement, but clearly the learner could achieve what appears to be mastery (placing all the correct items in a location) without ever learning the discrimination for the station. Therefore, some sort of conventions are needed to clarify what kind of response is reinforceable. Possibly,
the light for the station goes off as soon as an incorrect object is placed in the station. That object is then returned to the set of 16.

Response Shaping

Another indirect technique that could be added to the deprivation setup is response shaping, which requires a variable criterion for reinforcing responses. Here is a possible sequence: (a) Provide water if the learner picks up one of the members, (b) provide water if the learner places the member closer to the station, (c) provide water if the learner places the member in a station, and (d) provide water if the learner places the correct member in the station.

The process of shaping is based on the idea that the response is somehow the central learning. The current analysis considers responses for a task of discriminating simple manifestations of knowledge. If the learner knows which objects go in which stations, we would achieve the same pattern of correct responses through a variety of response formats: touching the targeted objects, knocking them over, holding them up, putting them in a bag, and so forth. The pattern is the same, and the only basis for the pattern is knowledge of the content.

Response shaping is not an effective technique unless the learner is incapable of producing the motor response required by the task. If the goal is to teach a discrimination, the learner must be shown which features of the event are being reinforced. If we reinforce the learner for picking things up, the learner who learns this rule must learn a different rule when we change the criterion for reinforcement. The sequence of shaping will put each rule that is prior to the terminal rule on extinction when we require the learner to put the member closer to the station. A far more efficient practice is to teach the terminal discrimination early in the sequence and then shape the context for applying the rule.

Clear Communications

Three variables can be manipulated to shape the context to achieve clear communications:

1. The composition of the set (the number and type of positives and negatives for the various stations),
2. The number of discriminations that are required (the number of stations that are activated), and
3. The sequence for introducing and integrating discriminations.
These three variables generate a large number of possible effective programs for teaching the five discriminations. All involve some form of context shaping. To illustrate this process, we could start with the simplest possible context for inducing the Station 1 discrimination. This context presents a single station, not five stations. The positive examples are tall, red, narrow bottles. We can adjust the difficulty of this discrimination by manipulating the composition of the negatives. If we initially introduce all the types of negatives in the population of 16, we make it relatively difficult because positives must be discriminated on the basis of all four features. If we remove those negatives that differ from the positives by only one feature (color, height, width, shape), we make the discrimination easier. By removing all those examples that differ from the positives by two features, we make it still easier. By presenting no negatives, we make it as relatively easy as possible. Therefore, this is where we start. If the learner places one of the objects in Location 1, it will be an object that would go in Location 1 regardless of the composition of the negatives in the set. When we reinforce the response, we are not reinforcing any serious possible misinterpretation. When Station 1 is illuminated, the tall, red, narrow bottles go in it. We increase the number of positives for Station 1 from one to four examples so that the learner receives reinforcement at a high rate. We also give the learner a reason for responding through deprivation. Water is awarded for each positive response. This setup presents exactly the same criterion for Station 1 that is used when all five stations are activated and the population consists of 16 rather than 1 type. Note that the initial setup is the logical equivalent of the setup in which the green examples are positives and no light or the white light is the only negative. All changes in the set are keyed to performance. 
As soon as the learner demonstrates mastery on one of the stages, the next is introduced. Here are some possible stages or progressive changes for shaping the context for Station 1. 1. As soon as the learner responds consistently to putting the objects in Location 1, we introduce some other members of the set—members that differ in more than one feature from the tall, red, narrow bottles. For instance, we add two tall, red, wide boxes and two short, blue, narrow boxes.
We retain the same criterion for reinforcing the learner—placing tall, red, narrow bottles in Location 1. At this stage, the learner has already practiced the reinforceable behavior and has operated on the four positives. Collectively, the negatives require the learner to attend to any difference or combination of differences that would reliably distinguish the positives— shape or width for discriminating the red boxes; color, shape, height for discriminating the blue boxes. (This stage would be parallel to introducing a red negative in the green-light discrimination experiment. It requires a more precise representation of the positives, but there are great options in the specificity of the rule.) 2. As soon as the learner responds consistently to this stage, we change the context again, this time by introducing two short, red, wide bottles. Now the learner must minimally attend to the height or width of the bottle shape.
3. As soon as the learner responds consistently to this context, we change it by adding two more negatives—tall, blue, narrow bottles. Now the learner must attend to some combination involving color (the only difference between the positives and new negatives) and some combination of height–width–shape.
4. We introduce remaining minimum-difference negatives—bottles that are red, tall, and wide and those that are red, short, and narrow.
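The four stages above can be summarized by listing, for each newly added negative, the features on which it differs from the Station 1 positives. The sketch is illustrative only: the tuple encoding (height, color, width, shape) and the names `STAGES` and `differing_features` are assumptions, while the negatives themselves are the ones named in the stages.

```python
FEATURES = ("height", "color", "width", "shape")
POSITIVE = ("tall", "red", "narrow", "bottle")  # Station 1 positives

# Negative types added at each stage, as named in the stages above.
STAGES = {
    1: [("tall", "red", "wide", "box"), ("short", "blue", "narrow", "box")],
    2: [("short", "red", "wide", "bottle")],
    3: [("tall", "blue", "narrow", "bottle")],
    4: [("tall", "red", "wide", "bottle"), ("short", "red", "narrow", "bottle")],
}


def differing_features(example, reference=POSITIVE):
    """Name the features on which an example differs from the reference."""
    return [name for name, a, b in zip(FEATURES, example, reference) if a != b]


for stage, added in STAGES.items():
    for negative in added:
        print(stage, negative, differing_features(negative))
```

Read from top to bottom, the listing shows the progression from negatives that differ on two or three features to negatives that differ on a single feature.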
This final set of members ensures that the learner’s rule has been shaped so that it refers to all four of the discriminated features of the positives. Red distinguishes positives from the minimally different bottles that are blue. The width distinguishes positives from the minimally different wide bottles, and so forth. Note that the sequence does not include the short, blue bottle. This type would be negative through extrapolation. If the short, red bottles are negative, the short, blue bottles are even more negative. The context shaping ensures that any learner who performed at mastery had some version of the same minimum content map that would generate the performance. No learner could perform without such a map. Shaping Discriminations One Feature at a Time. The logic of this staging is that it permits the learner to draw strong inferences from possible errors. The staging creates a minimum difference between the least specific content map the learner could have at a particular stage and the least specific content map possible for the next stage. In other words, each later stage presents negatives that differ in only one feature from one of the possible rules that would be successful for the previous stage. For example, when the negatives are tall, red boxes (Stage 1), the learner could have any of a large set of possible rules that would lead to accurate classification of the positives. With the introduction of the short, wide, red bottles (Stage 2), the learner’s current rule either leads to reinforcement or it does not. If it does not, the learner’s rule is different by one feature from the minimum rule now required. For instance, if the learner’s rule refers to bottle shapes, the learner receives information that this rule does not work because some of the bottle shapes (short, red, wide bottles) are shown to be negatives. A comparison of the positives and these negatives discloses that positives are taller and narrower. 
Therefore, a minimum rule that would permit correct classification of all examples could be either that tall bottle-shape or narrow bottle-shape examples are positive. By adopting either of these minimum rules, the learner’s knowledge would be shaped minimally by one feature. Population Changes. The context-shaping program manipulates the population of negatives and positives, changing both the types and numbers. For instance, there were four positives for Station 1. The purpose of this change was to increase the probability of the learner performing at a relatively high rate of success. If there is only one member in the set that has the features, the learner is not required to attend to the features as much as when four are in the set. All four are the same in various ways. Comparisons for common features of the positive examples are therefore promoted simply by changing the composition of the population.
Alternative Approaches. There are alternative approaches that require fewer steps and also that eliminate much of the context shaping. For example, we first introduce four positives for Station 1 and four negatives—two tall, narrow, blue bottles, two short, narrow, red bottles.
Each negative differs from the positives by one feature. Mastery of this set, therefore, ensures mastery for all members of the terminal population except the tall, wide, red bottle. (This mastery may occur through extrapolation, but it is not guaranteed.) All the other negatives would be correctly identified as negatives. The problem with this approach is that it may not initially establish a high enough percentage of correct responses. If the learner’s performance does not drop below about 60%, however, the approach would be preferable to the more elaborate one we have described simply because it will achieve the terminal performance in fewer trials. Procedures for Teaching Related Discriminations The programs described earlier address a single discrimination. If the goal is to teach all five of the discriminations from chapter 10 in a relatively efficient way, however, we would not necessarily teach the discrimination for a particular station to its terminal stage before introducing other discriminations. If we completed an elaborate program for Location 1 before introducing the next discrimination, we would greatly stipulate that examples are classified on the basis of the sum of their features and that an individual is assigned to one and only one location. The learner would tend to resist the idea that assignment to a station could be based on a single feature or pair of features. The learner would also tend to resist the idea that the same individuals that are positives for Location 1 could also be positives for other locations. The Sequence of Discriminations. The solution for avoiding stipulation is to contradict the possible unintended interpretations as quickly as possible. 
This would require the following: (a) a faster introduction of the positives for Station 1, (b) a different arrangement of positives and negatives, and (c) a sequence for the introduction of stations that contradicts the stipulation that a positive for one location cannot occur in other stations.
A poor sequence would introduce the tall, red, narrow bottle for Station 1 and then follow with the tall, blue, narrow bottle for Station 2. This sequence reinforces through extrapolation the interpretation that the remaining stations are based on four-feature classifications and specific objects are assigned to only one location.

A good sequence would be one that showed the sameness features of the locations. The only thing that is the same about the locations is that each is activated when illuminated and each issues reinforcement based on a classification rule. The classification rule refers to some combination of features. The logical principle for showing sameness is to juxtapose examples that differ greatly but that lead to the same effect. In this case, the station that differs most from Station 1 in its classification rule is Station 5. Location 1 calls for individuals that are tall, red, narrow bottles; Location 5 calls for a single-feature classification: red. So by juxtaposing Stations 1 and 5, we would rule out the interpretation that classification is based on a four-feature combination. This juxtaposition would also show that the same individuals that are classified in Station 1 are classified for a different reason in Station 5. In short, the sequence that starts with Station 1 and then introduces Station 5 contradicts both possible unwanted interpretations.

Station 1 Introduction. Because we are concerned with introducing a second station as quickly as possible, we do not attempt to shape the knowledge of all four features of the Station 1 examples before introducing the next station. We do not start without negatives, but with negatives that are somewhat demanding yet readily distinguished from the positives: for instance, four positives for Station 1 and three negative bottles, each of which differs from the four positives by two features (tall, blue, wide bottles; short, blue, narrow bottles; short, red, wide bottles).
Note that none of these negatives is box shaped. To make positive responses more probable, we could introduce a prompt in the form of a model. We place two more positive examples in Location 1 to prompt the features of the examples that go in this location. With the introduction of this prompt, the learner does not have to remember the features of the positives, merely be able to select the other members that have the same set of features. Because there are four examples in the set, the learner should be able to receive positive reinforcement at a high rate. After the learner reaches mastery within this context, we change the composition of the set so that it consists of four positives and four tall, blue, narrow bottles. These negatives require the learner to discriminate the examples on the basis of a single feature—color. The purpose of these negatives is to demonstrate that positives and negatives may differ only in color.
This is the discrimination the learner will apply for Station 5, where items are grouped solely on the basis of redness. After the learner has successfully performed several times within this context, we remove the models from Station 1 and add three negatives to the set that differ from the positives by two features. Now the learner must remember those details that are called for by Station 1. Station 5 Introduction. As soon as the learner achieves mastery, we introduce the next station—Station 5, red things. Establishing the Station 5 discrimination would follow the same steps as those for Station 1. We would deactivate all the stations except 5, introduce a good set of positives and negatives, initially present models, and then remove the models. The models would be three red things that differ greatly from each other—tall, red, narrow bottle; short, red, wide bottle; and tall, red, narrow box. Together these examples imply what is the same about the examples that go in this station. This design does not correct for one possible stipulation—the interpretation that the models show the specific features of the various objects to be classified. It would be perfectly reasonable for a learner to conclude that if the models for Station 1 showed all the features of the various positives, then the models for Station 5 would also show the various individuals that are positive. Instead of identifying a single feature that is common to the examples, the learner would recognize the tall, red, narrow bottle as an individual and conclude that the other individuals are to be memorized and placed in Station 5. The models show that the tall, red, narrow box is a positive, but the models do not show that the tall, red, wide box is also positive. If the learner assumes that the models show the individuals that go in a station, the learner would not tend to place the tall, red, wide boxes in Station 5. We could contradict this possible stipulation in two ways. 
One would be to create a set that has a larger number of positives that are not modeled. The other way would be to change the models from trial to trial. For instance, we could introduce a population that had one member of each type modeled, but also two members of each type not modeled. The set would also have at least one blue counterpart for each type of positive. After the learner places all the positives in the station, we could replace the three original models with two different ones—a short, red, narrow box and a tall, red, wide bottle. An alternative approach would be to counteract the stipulation simply by presenting two models that differed from any of the positives in the set. For instance, the model could be a tall, red, wide bottle and a short, red, narrow box. The positives in the population outside the station would not include these types. The learner would not be able to match individuals and would have to formulate a rule about possible shared features.
As soon as the learner performs at mastery, the models are removed. After the learner demonstrates mastery on Station 5, Station 1 is reactivated, and the learner is taught the discrimination that when the light for a station is illuminated, the station is active. The population for this exercise consists of 12 individuals, 2 of which are positives for Station 1 and 7 of which are positives for Station 5. All red objects for Station 5 differ in at least two features from the positives for Station 1. Some of the blue negatives differ from the positives in only one feature—color. With this population, the learner receives practice in grouping some of the same individuals in Stations 1 and 5. The size of the population might be changed from trial to trial to ensure that the learner is discouraged from memorizing the features of all individuals and encouraged to recognize the basis for their classification in Station 5. Station 3 Introduction. The third station introduced would be Station 3 (short, blue boxes, both narrow and wide). The other stations would be deactivated. (At other times, Stations 5 and 1 would be activated so that the learner received continuous review with these examples.) The discrimination for Station 3 would be modeled by two examples—a short, blue, wide box and a short, blue, narrow box. Initially, the population of examples outside the station would consist of blue boxes only—four short, blue boxes and four tall, blue boxes. Following mastery, the models would be removed; the population would be increased to a 16-member set that included four short, blue boxes and two tall, blue boxes, as well as other examples that differed by single features from those for Station 3. 
As soon as the learner performed on this setup, an 18-member population would be installed (the 16 original members and 2 additional tall, narrow, red bottles for Station 1), the three stations would be activated and illuminated in a random order, and the practice format would continue until the learner achieved a high rate of correct responses for all three stations. Following the review, the learner would have functional knowledge that a station is based on four features, one feature, or three features. Station 2 Introduction. The next station introduces the discrimination of tall, blue, narrow bottles. This discrimination would be easy for several reasons. First, the learner has already discriminated these objects from minimally different objects in Station 1. Second, the station is relatively easy to discriminate because Station 2 is next to Station 1. Station 1 functions as an anchor that is familiar to the learner. Something of a pattern emerges—the classifications that differ from those of Station 1 are closest to Station 1. The Station 5 classification differs most from that of Station 1, and Station 5 is farthest from Station 1.
Station 4 Introduction. Station 4 would be the last station introduced and possibly is the most difficult to master within the sequence used here. The positives for Station 4 are narrow bottles. The problem is that all the members from Stations 1 and 2 go in Station 4, as well as some members from Station 5. Perhaps the best way to introduce Station 4 would be to deactivate the other stations and present a set of positives consisting of four short, narrow bottles (red and blue) and negatives consisting of four short, wide bottles and two tall, wide bottles. Two positives, a red and a blue short, narrow bottle, would be the model in Station 4.
The illustration does not show the physical arrangement of the population outside the station, merely the composition. The learner could learn a possible misrule from this introduction, which is that only the short, narrow bottles go in Station 4. However, it should not be a misrule that is difficult to contradict because the rule for the tall, narrow bottles is that any bottles that would be classified in Station 1 or Station 2 go in Station 4. Once the learner is firm on the discrimination of the narrow bottles within the original context, the model is dropped, but the population remains the same. Next, the set is changed so there are eight examples (two tall, narrow bottles; two tall, wide bottles; two short, narrow bottles; two short, wide bottles). The models shown would be one tall, narrow, red bottle and one short, narrow, blue bottle.
This set would give the learner practice in placing the member for Station 2 (tall, narrow, blue bottle) into a new station along with the other narrow members. The learner would certainly recognize the model prompt
provided by the tall, red bottle. The learner knows that this member goes in two other stations. The sequence addresses the most difficult part, making it clear why it also goes in Station 4. By delaying its introduction into the set, the sequence forces the learner to attend to narrowness of the other positive examples. We might also introduce a reversed set of models—a tall, blue bottle and a short, red bottle. Now the learner must discriminate the bottles that go in Station 4 from their wide counterparts.
Following the learner’s unprompted performance, we introduce an intermediate integration step in which we activate only some of the stations. For instance, we activate Stations 3 (short, blue boxes), 4 (narrow bottles), and 5 (red individuals). The set of examples would consist of the original 16 members, so there would be two positives for Station 3 (short, blue boxes), four positives for Station 4 (two short and two tall, narrow bottles), and eight positives for Station 5 (red objects). No models would be provided. This intermediate integration would be designed to give the learner practice in placing the positives for Stations 1 and 2 appropriately in other stations.

Terminal Task. After the learner performs reliably on Stations 3, 4, and 5, the terminal task would be introduced. It would consist of the original 16 members with all five stations activated. The stations would be illuminated in random order, and a correct response would be that of placing all the appropriate members in the illuminated station. A partially correct response would be placing only some, not all, of the members that belong in the station. An incorrect response would be placing any negatives in the station or placing nothing in the station.

We could introduce several types of tests that demonstrate the extent of learning. One test could introduce a larger or smaller number of objects for a station than the number encountered during training. For instance, we could alter the population so there were two examples for Station 1, three for Station 2, and six for Station 3. The prediction would be that such changes in the population would not affect the learner’s performance because the learner has practiced with various population configurations.
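The station rules and the three response categories for the terminal task can be sketched in code. This is a minimal illustration under stated assumptions: the `Item` class, the `STATION_RULES` table, and the function names are hypothetical conveniences, and the feature values assigned to each station are read off the descriptions given for the five stations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    height: str  # "tall" or "short"
    width: str   # "narrow" or "wide"
    color: str   # "red" or "blue"
    shape: str   # "bottle" or "box"

# Feature values each station requires, taken from the descriptions in the text.
STATION_RULES = {
    1: {"height": "tall", "width": "narrow", "color": "red", "shape": "bottle"},
    2: {"height": "tall", "width": "narrow", "color": "blue", "shape": "bottle"},
    3: {"height": "short", "color": "blue", "shape": "box"},
    4: {"width": "narrow", "shape": "bottle"},
    5: {"color": "red"},
}

def is_positive(item, station):
    """True if the item has every feature value the station requires."""
    return all(getattr(item, f) == v for f, v in STATION_RULES[station].items())

def score_response(station, population, placed):
    """Score a placement; `placed` holds indices into `population`."""
    positives = {i for i, item in enumerate(population) if is_positive(item, station)}
    placed = set(placed)
    if not placed or placed - positives:
        return "incorrect"          # nothing placed, or at least one negative placed
    if placed == positives:
        return "correct"            # every positive and no negatives
    return "partially correct"      # some, but not all, of the positives
```

Under these rules a tall, narrow, red bottle is a positive for Stations 1, 4, and 5, which is the kind of overlap the integration steps are designed to practice.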
We could also introduce members that would be the same as current counterparts in the set with respect to all but one feature. For example, we could introduce a jar shape, yellow or purple members, or a tall, medium-width bottle.

Alternative Designs. We do not know whether the final sequence we described would be the best possible (and we seriously doubt that it is) because there are many different ways to address the various questions. There are many potential starting points and certainly more than one possible order of progression from each possible starting point. For instance, the first step could address the rules or discriminations for the different stations. For this starting point, we could activate three stations: 1, 3, and 5. Each station would have one model. The set would consist of six members, one pair that is identical to each of the models. We would prompt the learner to respond to illuminated stations by placing the reinforcer in that station. The learner is reinforced if one of the correct objects is placed in the station. This stage of the design stipulates a significant misrule, which is that the game is to match what goes in the station. To contradict that stipulation, the next task would involve only Station 4 or 5.

Alternatively, we could start with three modeled examples for Station 5 and configure the example set so that these three types were represented less often than the other red members. The learner would have to identify the sameness in feature of those red examples not modeled. Next, we would introduce Station 4 (narrow bottles) to demonstrate that height and width are variables. The last stations would be presented in the order 3, 2, 1. For each, there would be a model and a training set followed by integration of the station with more than one of the stations already taught.

Predictions.
The only form of prediction based on analytical versus experimental evidence is that designs that control particular variables in an obviously more articulate way produce better experimental results than designs that do not include such control. The important point about the experimental sequences described in this section is not that they are the best, but that they would outperform sequences that do not address the various issues of sequencing, stipulation, and practice. To determine which sequence performs better, they would be compared empirically according to the following criteria:

1. Is mastery achieved at each step?
2. Are any learners operating from content-map rules that are based on possible unintended interpretations, but that are consistent with the details of the presentation?
3. Is the pattern of mastery characterized by a tendency for at least some of the later discriminations to require fewer trials than earlier discriminations?
4. Does the learner achieve a high percentage of correct responses when there are no prompts and more than one station is discriminated?

In addition to these empirical measures are the following analytical criteria:

1. Is each discrimination needed taught as a single discrimination, rather than being combined with other discriminations?
2. Are the criteria for percent of correct responses to be achieved at each stage analytically sound?
3. Does the design call for a cumulative integration of each newly taught discrimination with all earlier taught discriminations?
4. Does the design arrive at the terminal five-station task in a way that identifies and addresses each possible stipulation?
5. Does the design provide for adequate and timely reinforcement?

If two analytically compared programs differ greatly on the criteria, the prediction is that the more carefully designed program will yield the better results. If there is no clear basis for determining which of the programs is better on one or more of the criteria, no prediction is possible, and an experiment would have to be conducted to answer the question. This is particularly true if issues that interact are addressed by greatly different approaches.

Performance of learners who go through the more carefully designed sequence will differ from those going through the less carefully designed sequence in six significant ways:

1. A higher percentage of the population will reach mastery.
2. The overall rate of learning for each quartile of matched populations will be higher.
3. The ratio of percent mastery to instructional time will be greater.
4. Following the same total number of practice hours after achieving mastery, the retention of discriminations will be better (requiring fewer practice trials with the final population to again achieve mastery).
5. The smallest difference in the comparison groups occurs in the fourth quartile of the population (based on a norm that is independent of the training) and the largest difference in the first quartile.
6. The standard deviation will be smaller.
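Three of the predicted differences (percent reaching mastery, the ratio of percent mastery to instructional time, and the standard deviation) lend themselves to a simple computation. The sketch below is a hedged illustration: the function names, the use of `None` for a learner who did not reach mastery, and the choice to take the standard deviation over trials to mastery are all assumptions made here, since the text does not specify the measure.

```python
from statistics import pstdev

def summarize_group(trials_to_mastery, instructional_hours):
    """Summarize one group of learners.

    trials_to_mastery: one entry per learner; None means mastery was not reached.
    instructional_hours: total instructional time for the group's program.
    """
    reached = [t for t in trials_to_mastery if t is not None]
    percent_mastery = 100.0 * len(reached) / len(trials_to_mastery)
    return {
        "percent_mastery": percent_mastery,
        "mastery_per_hour": percent_mastery / instructional_hours,
        # Assumed measure: spread of trials to mastery among those who reached it.
        "sd_trials": pstdev(reached) if reached else None,
    }

def compare(careful, other):
    """Check three of the predicted differences between two group summaries."""
    both_sd = careful["sd_trials"] is not None and other["sd_trials"] is not None
    return {
        "higher_percent_mastery": careful["percent_mastery"] > other["percent_mastery"],
        "higher_mastery_per_hour": careful["mastery_per_hour"] > other["mastery_per_hour"],
        "smaller_sd": both_sd and careful["sd_trials"] < other["sd_trials"],
    }
```

The quartile and retention predictions would need additional data (an independent norm and post-mastery practice records) and are not covered by this sketch.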
Need for Multiple-Subject Designs. It is possible to learn a great deal from single-subject designs. They take the myth out of averages because they frequently show that either the learning occurs or it does not. There tend not to be “sort of” learners unless they have learned a guessing strategy or another type of spurious rule. The single-subject design may be used to validate or study the communication variables. However, a fair number of subjects would be required to show the relationship between what is analytically identifiable in the design and the experimental evidence. The reason is simply that when the communication presents more than one possible interpretation, a given learner will learn only one of those that are possible. To determine any trend that relates the possible interpretations to behaviors that validate the learning of various interpretations, quite a few subjects would be required, particularly if some of the interpretations are relatively obscure and would not be expected to be represented as frequently as others. A prediction, however, is that with a sufficient number of subjects, all unintended interpretations that are analytically identified will be confirmed by behavioral evidence. Further, if practice is adequate, no other interpretations will tend to occur.
SUMMARY

In the broadest sense, behavior is a function of the discriminative stimulus and reinforcing contingencies. The learner cannot learn without information about these variables, which means that inducing specific learning requires communication that conveys information about the specific discriminative stimulus and reinforcing contingencies that govern the content and strategies to be learned. The details of the communication are the variables that influence learning. The manner in which the details are presented logically affects both what the learner learns and the relative amount of exposure needed to induce the learning.

The illustrations of strategies for teaching the five discriminations that deal with the same population of objects are based on the assumption that learning is a function of the communication. The goal is to achieve a clear communication at each stage. The overall strategy is to shape the context in which the examples occur, progressing from relatively easier contexts to the terminal context. Throughout the process, all positive and negative examples are classified according to a criterion that does not change. The population is systematically changed to accommodate specific discriminations.

The illustrations showed some effects created by changes in the number of positives for a given station, changes in the number and type of negatives, and changes in the sequencing for introducing component discriminations. If there is only one example of a set of features, the
teaching would tend to take longer than it would with more than one example. The multiple examples allow for massed practice. The learner works on juxtaposed examples of the discrimination without manipulating the same object. The difficulty of the discrimination changes as the difference between the positive examples and the least different negative examples changes. By controlling negatives, it is possible to create something of a continuum that starts with a relatively easy context and progresses to the terminal context. With no negatives, performance on the positives of a four-feature discrimination is easiest, but it also generates the largest number of content maps that are consistent with the presentation. With negatives that require classification on the basis of four features and that differ from positives in only one feature, the discrimination is relatively the most difficult, but it provides unambiguous information about the essential features of the positives. Another consideration is the total number of negatives. If we have a population of 25 individuals, the task of locating two positive examples becomes more difficult simply because more systematic scanning techniques are required. For training, not all types of negatives are needed, simply those that differ from the positives in specified ways. If we include several different types of negatives that differ from the positives in two features, we do not need any negatives that differ by three or four features from the positives. (These would be automatically categorized as negatives by the learner through extrapolation of the difference between positives and known negatives.) Changes in the sequence of the discriminations presented can significantly alter the clarity of the communication. A controlled communication would teach one thing at a time and integrate it with whatever has been previously taught.
In addition, the sequence would move the learner to the minimum-difference discriminations as quickly as possible. The sequence of what is taught first and next is shaped by considerations of possible stipulation, extrapolation based on features of earlier examples, and possible misrules created by the order of the discriminations taught. To induce a number of related discriminations clearly and relatively quickly, there are many possible starting points and orders of progression. The only way to determine which is superior is through empirical data, not through analysis. The ultimate goal of designing experiments that are analytically flawless is to rule out communication as a cause of not learning and thereby distill information about the learner’s capabilities. With the variables controlled, the relationship of communication clarity and the learner’s performance becomes more obvious. In contrast, the literature is littered with experiments that do not address the communication details and therefore draw faulty conclusions about
both the learner’s capacity and what is involved in learning. For example, the learner is taught a discrimination through a set of examples and is then tested on examples not presented during training. If the learner tends to generalize, the result is interpreted as a tendency of learning without considering the role of content and the interpretations of content implied by the details of the experimental intervention. To create learning experiments is to design teaching. The performance of the learner is greatly influenced by the communication details of that teaching. Therefore, the relationship of learning to the presentation that induces the learning will be revealed only if the communication variables are acknowledged as variables and are controlled to create sequences that admit only one possible content map (or that intentionally induce more than one content map in a group of learners).
Chapter 13
Volition and Thought
The inferred functions involved in learning and performance lead to the conclusion that organisms that produce nonreflexive or voluntary responses are conscious. The reason is simply that it is impossible to perform these behaviors without some form of volition or conscious directive to produce the response. The degree of consciousness may be operationally defined by the various roles that logically must be performed by the agent and by the information that the agent would need to perform them.

As indicated in chapter 4, the agent of even the nonlearning system must be the locus of planning and directing responses. Because the responses are based on current sensory data and will result in specific transformations of the current setting, the agent must be simultaneously aware of the following:

1. The enhanced sensations associated with specific stimulus conditions;
2. The manner in which the setting is to be changed (the outcome to be achieved);
3. The details of the current setting that are relevant to the responses to be produced (the current position, the obstacles, etc.); and
4. The responses available to change the setting and achieve the desired outcome.

From this information, the agent formulates a plan. The plan indicates the responses and what their effect should be. Without this plan, there could be no feedback because the system would have no knowledge of what response was to be produced or what effect was to be achieved.
The agent must also have the capacity to direct the plan—to present the plan to the motor-response system and direct the system to do it. As the directive is executed, adjustments are made on the basis of discrepancies between projected and current sensory data. The adjustments take the same form as the initial plan. The changes in the response are planned on the basis of current sensory data, the projections are revised, and the agent directs the modified plan. Much of what the agent attends to is in the form of continuous variation. The responses are characterized by continuous variation, as are the transformations of the setting that occur while the response is produced. As the setting changes, the reinforcing or punishing sensations that the agent receives change. Therefore, the reinforcing or punishing sensations are viewed by the system as a function of the directives the agent issues. Outcomes that are more negative signal directives that fail; outcomes that are more positive signal directives that are successful and should be continued. The conclusion that the agent is conscious is inescapable. Unless the part of the system that directs responses (the agent) has all the information needed to produce and adjust the responses, the response-direction process will fail or be preempted by lack of coordination. Producing organized, goal-oriented responses without a plan is impossible. Planning a response without sensory information is impossible. Executing a plan without a directive is impossible. Creating a directive or plan without reference to responses and how to access and activate these responses is impossible. Therefore, in the presence of sensitized stimuli, the most primitive agent is conscious of all the variables that cannot be directly managed by the infrasystem, but that are required for producing nonreflexive or voluntary responses. 
The agent may have no later memory of the situation, but the extent to which the agent is conscious is defined by the sensory inputs, plans, and directives that are needed for the agent to perform the task. By definition, the agent is conscious of incoming sensory receptions, the enhanced sensations that motivate the process of planning, the process of directing, and the changes that occur in the setting. Because the process of learning does not deal with certainty, but is based on logic built around a strategy of guess-and-check, the agent has a greater number of functions. To learn, the agent must be aware of and attend to various features of the setting because the agent has no preknowledge of which will be relevant to what is being learned. The greater the capacity of the system to learn, the greater the scope of what the agent must attend to. Furthermore, unless the agent attends to features of the setting, it is less likely that the infrasystem will secure the information it needs about these features. At this more sophisticated end of the continuum, the agent has two different thought modes. One is a reflexive product of the infrasystem and the enhanced sensations it creates. The other is not keyed to enhanced
stimuli, but is a product of the agent’s directives. In the same way the agent directs motor responses, the agent directs thought.
PLANNING AS THINKING

Planning is a specific type of thinking. When the agent considers information and formulates a plan to respond in the current setting, the agent takes covert steps to arrive at a decision. These covert steps are thinking. It is behavior, but it is unobserved. Because planning is necessary for any nonreflexive responses, it is the simplest possible form of thinking. Even if the format and content of thought are largely determined by the infrasystem and the content map that is reflexively presented to the learner in the presence of specific sensory conditions, the agent must produce some form of covert, information-referenced behavior before a specific response is directed in the current setting. For the organism that has a logically minimal agent (based on the fact that limited behaviors are observed), the nature of the plan that the agent establishes may require the agent only to fill in some blanks that address the variables of the response. The agent’s decisions may be limited to which direction the organism is to move and some minimal choices in the details of the movement. Agents that produce more complicated behavioral interactions have more discretion about behaviors and strategies, and therefore the role of plan-based thinking becomes less formatted and more complicated.

Conflicting Content Maps

A more involved type of planning, and therefore of thinking, is required for the agent to make decisions based on two or more active content maps. The traditional approach–avoidance gradient is an example (Sidman, 1953, 1955; Sidman et al., 1957). A duck is on a lake. There is irresistible food near a person who is standing on the shore. We have behavioral evidence that the duck would approach the food if no person were around. We also have evidence that the duck would avoid the person if no food were near the person. Therefore, we know that the duck is being influenced by both the person and the food.
The duck wants to approach the food, but avoid the human. The conventional explanation involves some form of progressive gradient. Accordingly, when the duck is far out in the lake, the approach gradient is very strong and the avoidance gradient is weak. As the duck approaches the shore, the approach gradient weakens and the avoidance gradient becomes stronger. When the duck is quite close to the shore, the
avoidance gradient becomes stronger than the approach gradient, and the duck moves farther from shore. The postulated approach gradient is a reasonable explanation in that it expresses the notion that more than one content map is influencing the duck’s behavior, with the maps driving the duck in opposite directions. The question that is important for the analysis of thinking is whether the behavior is explained in simple mechanical terms (distance from person, movement) or whether it results from the learner being simultaneously aware of both content maps—one associated with approaching the food, the other with retreating, and each generating urgings based on data. The behavioral evidence leaves little doubt that the agent is aware of both maps. Analytically, the conclusion derives from the following considerations: 1. The agent discriminates the food from Distance X regardless of whether a person is present. 2. The agent discriminates the person from Distance X regardless of whether the food is present. 3. A content map is reflexively presented to the agent as each stimulus is discriminated. 4. Therefore, as the duck approaches the shore, the agent is simultaneously under the influence of two content maps. The influence of the conflicting content maps is most evident when the duck is as close to the shore as it ventures. Its movements are hesitant and repetitive. It may retreat a little and then return to the boundary of its approach. This behavior at the boundary line or limit is particularly revealing about the nature of the thinking because the behavior cannot be produced unless the agent rejects the urgings of one content map and decides to follow the urgings of the other. The conflicting-content-map explanation deviates from the traditional explanation with respect to the notion of a gradient. The avoidance behavior is based on a content rule that provides for something of a halo of danger around the potentially aversive stimulus (the person). 
The avoidance rule for the agent is functionally something like, “Don’t get too close.” The duck may feel perfectly secure so long as it is clearly outside the halo of danger, whether it is 300 or 30 yards from the shore. The content maps are not in conflict because the duck is not too close and the avoidance map has not been activated. At possibly 20 yards, there is conflict because the duck has reached the boundary of the danger halo. The proximity of the duck to the boundary of the halo of danger manifests itself in generically different behaviors at different distances. When the duck is 60 yards from the shore, the duck’s reaction to the person’s sudden movements (arm waving, shouting, and movement toward the water)
results in a calm swimming away from the shore. So 60 yards is not an absolute distance for retreating, but one that is relative to what the person does. When the duck is 20 yards from the shore, the same behavior by the person results in urgent retreat: rejecting the approach and possibly flying to a quite distant part of the lake. At 20 yards, even a slight, nonthreatening movement of the person results in a response that grossly overcompensates for the apparently small change in stimulus conditions.

The relevance to thinking is that the presence of the conflicting content maps means that the agent is aware of a larger set of multiple features: not simply features that are relevant to the approach, but also features that are relevant to the avoidance. When the agent is at the edge of a danger halo, it must actually disregard one map to produce responses. The conflicting enhanced sensations and urgings are still present; however, the agent must make the decision to disregard one set of urgings and responses, formulate a plan, and reject it without actually executing it. The key difference between this situation and uncomplicated planning in response to urgings from the infrasystem is that each conflicting urging requires the agent to think about doing something. Because the agent cannot possibly perform a response that satisfies the requirements of both content maps, the thinking is different in two ways: (a) The agent thinks about some things without actually producing responses that respond to them, and (b) the agent must perform a higher order type of thinking that makes a choice about which content map to follow.

Decision Making for Extensive Pursuits

This higher order decision making required to resolve conflicting urgings also occurs when the learner is under the influence of a long-range content map.
Long-range content maps are persistent, but necessarily weak so that urgings presented by content maps that have a more immediate goal may be satisfied during the period that the long-range map persists. For instance, a bird is to migrate from Canada to Central America. The bird does not fly nonstop, but rests and eats along the way. Functionally, therefore, at times the content maps for resting and eating are relatively stronger than those for migrating. However, the map for migrating persists so that the journey continues until the bird reaches the goal destination. At some points during a cycle of resting and flying, the maps for resting and flying are of near equal strength. This situation requires decisions based on more than one map and therefore decisions about which of two possible plans to follow. In effect, the agent would have to make a decision based on information about both conditions. In function, the process would be something like this verbal description of what the agent observes and feels. “I could land over there. Looks safe. But there is still a lot of daylight. I need to keep flying. But there may not be good resting places ahead, just open sea. Swoop down and take a closer look at this place.” If the process were functionally mechanical, based on some form of gradient, the bird would not be able to take into account the variables that it obviously addresses. We see only behavioral manifestations of the resulting plan, the bird circling twice, once close to the surface, then resuming its course. Yet this behavior clearly implies a decision involving conflicting behaviors.

A wide range of organisms produce behavioral evidence to suggest that they have some form of decision making that is influenced by the presence of more than one content map. When a spider comes to the boundary of a new surface and tests the surface with its legs, it is being influenced by more than one content map. One map says, “Don’t venture on a different and unknown surface”; the other map says, “Keep following the route that had been planned.” The agent must make a decision about which map to follow. As an analytical rule, if it is possible for two content maps to be present at the same time and one or both are variable in strength, there will be times when they are of nearly equal strength. At that time, a different order of thinking is involved (deciding which map to pursue), and this thinking must be an agent function.
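The analytical rule about maps of nearly equal strength can be illustrated with a toy calculation. This is only a sketch under loud assumptions: representing map strength as a single number, the `tolerance` parameter, and the sample values are all hypothetical and are not drawn from the text.

```python
def needs_agent_decision(strength_a, strength_b, tolerance=1):
    """True when two content maps are close enough in strength that neither dominates."""
    return abs(strength_a - strength_b) <= tolerance

def decision_points(migrate_strength, rest_strengths, tolerance=1):
    """Return the time steps at which the agent must choose between the two maps."""
    return [
        t for t, rest in enumerate(rest_strengths)
        if needs_agent_decision(migrate_strength, rest, tolerance)
    ]

# Hypothetical values: a persistent, weak migration map and a resting map
# whose strength grows as the bird tires.
rest = [0, 2, 4, 6, 8]
print(decision_points(5, rest))  # [2, 3]
```

At the remaining time steps one map clearly dominates, so no higher order choice between maps is required in this simplified picture.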
Voluntary and Reflexive Thought

In the context of conflicting content maps, the thoughts are presented reflexively to the agent. The content maps are accompanied by secondary sensations that compel the agent’s attention to specific details of the setting and issue urgings to comply with a broad response strategy. Although the process presents this broad class of thoughts reflexively, the decision making involves volitional thoughts. The decision to act requires the functional equivalent of this argument: (a) I am aware of urgings to pursue X and Y, (b) I will do Y, and (c) therefore I will make a plan to do Y in this setting. In effect the agent plans to plan. Step C indicates that the agent directs itself to think about one thing (e.g., planning an approach strategy) rather than another (e.g., retreating).
Thoughts That Lead to No Response

If the system is capable of learning, a type of intermediate thinking is implied. This intermediate type of thinking lies between strictly volitional thought and thought presented reflexively to the agent. This type of thinking is based on the following circumstances:
1. The system identifies false features when learning (features present during specific experiences, but that proved to be unreliable predictors).
2. Each false feature is fortified with enhanced sensation and triggers a content map that has been subsequently rejected by the system.
3. The content map is reflexively presented to the agent in the presence of the false feature.

This type of false feature presenting a conflict becomes more probable for learning that is associated with strong primary reinforcers. Although the system has rejected the feature as a predictor, the infrasystem persists in presenting the content map in the presence of the feature. This presentation is a necessary function of the way the system is designed. Because the system has rejected the content map, the agent disregards the urging, but necessarily thinks about it. The thought is reflexive. The counterthought by the agent (to reject it) is voluntary. If the primary reinforcer is relatively weak, the thoughts and urgings are relatively weak. With stronger primary reinforcers, the thoughts and urgings are more prominent.

The false predictor has its roots in the functional requirements of the learning process. Learning involves uncertainty. During learning, various features of a primary reinforcer are enhanced with secondary sensations so the learner will attend to them on future occurrences. Only some of them prove to be relevant. The others are disclosed to be false predictors or false discriminative stimuli. As learning occurs, the various features presented to the agent are either strengthened or weakened. At some point, the agent recognizes that, although particular features are present and accompanied by an urging for some form of action, they are not to be responded to.
For instance, the system has concluded that red is not a primary candidate for being one of the essential features, but it was one of the original possibilities and is still accompanied by some urgings, although weakened. When the learner encounters an example that is red, the agent is aware of the urging to respond to red, but disregards this feature and bases the plan on one of the currently strong features.
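The strengthening and weakening of candidate features can be illustrated with a minimal sketch. It is an analogy only, not a mechanism the analysis specifies: the class name `FeatureUrgings`, the additive gain, the multiplicative decay, and all numeric values are assumptions introduced for illustration. The multiplicative decay is chosen so that a rejected feature such as red keeps a weakened, nonzero urging that the agent is aware of but disregards.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureUrgings:
    """Hypothetical store of urging strength per candidate feature."""
    strength: dict = field(default_factory=dict)

    def enhance(self, features, initial=1.0):
        # During learning, every candidate feature receives secondary sensation.
        for name in features:
            self.strength.setdefault(name, initial)

    def trial(self, present, reinforced, gain=0.5, decay=0.5):
        # Features present on a reinforced trial are strengthened;
        # features present on a nonreinforced trial are weakened, never erased.
        for name in present:
            if reinforced:
                self.strength[name] += gain
            else:
                self.strength[name] *= decay

    def plan_basis(self, present):
        # The agent bases the plan on the strongest feature and is aware of,
        # but disregards, the weaker urgings.
        felt = {name: self.strength[name] for name in present}
        chosen = max(felt, key=felt.get)
        disregarded = {n: s for n, s in felt.items() if n != chosen}
        return chosen, disregarded


urgings = FeatureUrgings()
urgings.enhance(["red", "round"])
urgings.trial({"red", "round"}, reinforced=True)   # both were original possibilities
urgings.trial({"red"}, reinforced=False)           # red alone fails to predict
urgings.trial({"round"}, reinforced=True)
urgings.trial({"red"}, reinforced=False)

chosen, disregarded = urgings.plan_basis({"red", "round"})
print(chosen, disregarded)
```

In this sketch the plan is based on the hypothetical feature "round", while the urging attached to red remains present in weakened form.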
REFLEXIVE THOUGHT

The learner knows various individuals as the sum of their features. If the learner encounters a situation in which some of the features that characterize a particular individual are present, but the features do not belong to the individual, the learner thinks about the individual that has the features. The stronger the reinforcing properties of the individual and the greater
number of common features, the greater the tendency to think of that individual in the presence of features possessed by that individual, and the greater the network of thought. An example would be recognizing that a stranger looks a lot like someone you know. You see the stranger and clearly recognize that it is not the person you know and that actions associated with the person you know would not be appropriate. At the same time, you think about the person you know. The presence of the features has created an involuntary thought.

In summary, there are two primary situations in which the system’s design can create thoughts that do not result in plans:

1. During learning, when various features are weakly enhanced and individuals that possess those features create involuntary thoughts about the individual or event that possesses the features; and
2. When a cluster of features possessed by one known individual or event is presented in the context of another individual and causes involuntary thoughts about the known individual or event that has these features.

For both situations, the strength of the thought depends on the reinforcing properties of the individual or event identified. The more reinforcing the individual (either positive or negative), the stronger the positive or negative sensations that accompany the thought. If the reinforcing valence is weak, the thinking is not highly charged and therefore tends not to persist or command attention. If the reinforcing valence is strong, the thinking is highly charged and commands protracted attention.

False Features Caused by Traumatic Events

False features are most obvious following traumatic encounters. For instance, the learner is an animal that goes to the vet for the first time. It is already in pain because of its injured paw.
The experience at the vet is very aversive: strange people inflicting pain on the injured paw and an array of odors, some unique, some foul, some belonging to enemies, and others emanating from animals that smell of fear. Because the primary reinforcer is highly aversive, the learner’s system represents and enhances everything unique about the event. The system treats all features as if they are predictors of the aversion, from getting in the car with an injured paw to those strange sounds that occurred at the vet. Because the system assumes that all predict the aversive setting and experience, all are to be avoided.

Not everything that the system identified as a predictor actually predicts. The system quickly rules out some of them. The formula is that they occur and are not followed by the aversive event. The system has no basis, however, for discrediting the features that occur at a low frequency and had not been experienced by the animal before the trip to the vet. Because the situation is aversive, all features unique to it are strongly enhanced with negative sensation, which signals to the learner that each is to be avoided.

Let’s say that one of the strange smells came from an ultraviolet light. Later the animal’s owner brings home an ultraviolet lamp to treat a skin condition. Now the entire house has that feared odor, so the animal is faced with a dilemma. It knows that the house is a reinforcing place (home with family, food, fun). At the same time, the smell evokes a memory of a bad place along with the strong urgings to avoid the feature. The animal does not act on the urgings, but the system has required the agent to think about the smell, bad place, and possibility of avoiding the bad place. These are involuntary thoughts. This format of thinking is different from one in which the learner thinks about planning one response rather than another. It is thinking about a sensitized stimulus and possibly remembering an event without seriously planning to produce a strategy that is consistent with any urgings.¹

False Features and Relearning

Strong false features occur when extensive relearning must occur in connection with a known reinforcing individual or event. The extent of the false features depends on the magnitude of the relearning required. If something happens that affects all of the features of a reinforcing individual or event, the false-feature effect is maximized. The system knows the individual or event both as an individual and the sum of its features. So if the status of the individual changes drastically, all of the features of the individual are implicated. Therefore, all become potential false features that promote involuntary, reflexive thoughts and urgings.
If the reinforcing or punishing properties of the individual are strong, the scope of false features is extensive, and the sensations attached to these features are intense.

¹ The learner may initially produce behaviors that are influenced by the smell, perhaps walking more hesitantly or approaching the source to identify it. However, none of these is the behavior that is urged. The urge calls for an uncompromised escape. Later, however, the smell would tend not to influence behavior in the home, but would still be recognized as a feature of the initial event.

Human experience provides dramatic examples of false features and relearning. An example is grief over the loss of a loved one. If a loved one dies, all unique features of the loved one (the experiences with the person, those details that are unique to the person, the events associated with the death) are enhanced with strong negative sensations that cannot predict the person or lead to plans to interact with the person, but that result in involuntary thoughts about that person. Because all the features of the person have become enhanced with strong sensation, the learner is sensitized to an incredibly large number of features. Under normal conditions, the system would have to encounter a fairly extensive cluster of features to produce involuntary thoughts about the loved one. With the enhanced sensations, however, virtually each individual feature of the person becomes the basis for a strong false predictor of the person.

Because the learner has had thousands of experiences involving the loved one, it is virtually impossible for the learner to do or see anything without encountering a feature that was possessed by the loved one. If the loved one liked to build model airplanes, any reference to a model airplane, an airplane, or even things that fly would tend to evoke memories of the loved one. The specific phrases the loved one used frequently, the job the person had, and virtually all possible details of the person become false predictors. Something as innocent or apparently as far removed as a tablespoon could result in the memory of the person holding a spoon. The presentation of each feature leads to the thought of the person, the strong urging to interact with the person, and a mocking reminder that such interactions are not possible.

Extinction of False Features

These false features signal that new learning is required. That learning, of course, is the realization by the infrasystem that the person is dead. The process is slow because so many of the known features of the person are not consistent with a dead person. In a sense, the false-feature associations are illogical. They are a product of the system’s classification system.
The agent knows the person is dead, but the system has classified the individual as the sum of a great number of features. It has also classified the person according to each of the features possessed by the person. In a sense, the announcement that the person is dead affects the whole, but not the parts, because each part is shared with others. The loved one was not the only one who was afraid of spiders or who built model airplanes. Because the loved one may be retrieved by starting with any of these features, the whole is recognized as being dead, but the features must, in effect, be extinguished individually. For the infrasystem to realize that the person is dead, it has to reclassify all of the features. Each must run a course of being shown not to be a predictor of the person. Like other relearnings, a large number of trials is required to weaken the features.
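The point that the whole is recognized as dead while each feature must be extinguished individually can be sketched as follows. The route table, the decay rate of 0.8, and the threshold of 0.1 are hypothetical values chosen only for illustration; the analysis does not specify them.

```python
# Hypothetical routes from a feature to the loved one, each with its own strength.
# Knowing that the person is dead changes none of these values directly.
routes = {"model airplanes": 1.0, "afraid of spiders": 1.0, "tablespoon": 1.0}


def encounter(feature, routes, decay=0.8):
    """One trial: the feature occurs and the person does not follow.

    Only the route for the encountered feature is weakened.
    """
    routes[feature] *= decay


def evokes_thought(feature, routes, threshold=0.1):
    """A route still evokes involuntary thought until it falls below threshold."""
    return routes[feature] >= threshold


def trials_to_extinguish(strength=1.0, decay=0.8, threshold=0.1):
    """Count the trials one route needs before it no longer evokes the thought."""
    trials = 0
    while strength >= threshold:
        strength *= decay
        trials += 1
    return trials


for _ in range(3):
    encounter("tablespoon", routes)

print(routes)                                      # only the tablespoon route has weakened
print(evokes_thought("model airplanes", routes))   # untouched routes still evoke the thought
print(trials_to_extinguish())
```

Under these assumed values a single route needs 11 trials, and trials on one route leave every other route at full strength, which is why the process as a whole is slow.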
The illogical aspect is the relationship between part and whole; however, for learning, this relationship is necessary. During learning, the part predicts the whole. (The features of the individual predict the individual.) Each feature of the individual is therefore a predictor. When the individual is enhanced with secondary sensation, each feature is enhanced. So if there are 1,000 clusters of features and the individual is classified by each of these features, each route from feature to individual would have to be extinguished before the presentation of features would not lead to thoughts of the individual.

Once the involuntary thought about the person occurs, it is not limited to the part or feature that caused reference to the whole. Any other part of the whole is now implicated. The associations take this form:

part 1 → whole A → parts 2 through N

Part 1 of whole A is presented. It evokes thought or recognition of the whole, A. In evoking A, it evokes various parts of A (2 through N). The feature has predicted the whole, so now the thoughts may migrate to other features of the whole: experiences related to the observed feature as well as other experiences.

For instance, the presentation of a spoon evoked thoughts of the loved one. These thoughts were not in the abstract, but initially in relationship to the spoon (how the loved one proved that it really held more than a true tablespoon). This thought is connected to others related to the whole, for instance, examples of how argumentative she could sometimes be, and back to the phone call in which the learner’s brother announced, “Mom passed away last night.”

In any case, the thoughts are stimulus driven: reflexive products of a system that learns through experiences with individuals and that governs attention to features by affixing positive and negative secondary sensations to all features of the individual as well as individuals.
Although the learner may be trying hard not to think about the person, the thoughts are imposed by contact with any enduring feature possessed by the loved one or that predicts the loved one.

The same sort of irradiation of effect would occur if the new positive or negative valence of the individual were not as strong. It would simply be less extensive. For instance, if the loved person had taken a nasty fall and suffered some bruises, references to accidents, hospitals, safety, dangerous events, situations in which people fall, and similar related topics would reflexively present memories of the person as well as related negative feelings. So would discussions of or references to other family members and possibly to some features not closely related to the topic, such as observing somebody walking down the street. However, the scope of the features involved is not as great because not as much relearning is involved. The learner does not have to learn that all features of the loved one have been rescinded, merely some of those that relate to her health.
DIRECTED THOUGHT

Directed or voluntary thought is not the product of stimuli in the current situation that have been reflexively enhanced with sensations. Voluntary thought is generated by the agent. Like other processes directed by the agent, voluntary thought requires a plan, directive, feedback, and some form of criterion of performance. The thinking may take the form of a plan formulated and directed by the agent, or the thinking may be independent of a plan for action (simply an attempt to figure something out).

As the grief examples indicate, the human is far from being free from thoughts that are reflexively caused by specific stimulus conditions. In addition to the reflexive thoughts, however, the human makes plans not governed by sensitized stimuli, but motivated by internal conditions.

Agent-Imposed Goals and Criteria

The most basic form of directed thought is an agent-imposed goal or content map. The fact that the agent devises the map does not imply that this action is unmotivated or based on caprice. It is influenced by various sensations and facts about the current setting. These details, however, do not cause the thinking. Rather, they have been selected by the agent as indicators of what needs to be planned.

There are two main types of content maps that are not strict products of the details of the current setting: (a) the content map that calls for an action at a later time, and (b) the content map that is clearly independent of features of the current setting.

Content Maps for Actions at a Later Time. When the learner plans an action that is not to take place now, a new kind of content map is implied. Like all other content maps, this one must meet the basic content map requirements:

1. It must involve specific content.
2. It must imply a specific class of actions or application of a specific response strategy.
3. It must have a criterion that is sufficiently precise for the learner to construct a plan that is consistent with the map and allow for feedback on the action that issues from the plan.
4. It must be connected to some form of discriminative stimulus that causes the presentation of the content map to the learner and urges planning and actions.

A content map for doing something at a later time is a map that is functionally equivalent to the idea, “Later, I will search the field for food.” The
map specifies the content (searching for food), the general response strategy that is involved (food gathering or hunting strategies), the criterion for the pursuit or the plan (obtaining food at a particular place), and the discriminative stimulus (a later time).

The discriminative stimulus is different from those of plans for immediate execution. The discriminative stimulus does not reside in any feature that actually predicts the behavior, but in something the agent decides will predict the behavior. The map may be designed to be keyed to any specific features that will occur at that time. For instance, “When I’m hungrier . . .”, “After I rest . . .”, or “When it is cooler. . . .” What is presented at a later time is the content map “Search for food.” This process is greatly different from one in which something in the immediate present serves as a predictor for a response that is to be produced now. When the designated criterion for determining a later time occurs, the map to act and the accompanying urgings are presented to the agent.

For the system to construct this type of content map, the agent must be able to exercise some control over the infrasystem. The agent constructs the content map and also directs the infrasystem to fortify the map with secondary sensations.

It may be argued that the learner may use hunger as a criterion for determining when to forage. However, hunger is not an all-or-none feature, but rather gradations along a continuum. Even if the learner did specify a later time as so much hunger on the scale, the hunger is being treated as a fact: not just any hunger, but a specific degree of hunger. Therefore, when the degree of hunger occurs, the agent receives some form of urging to act. Another important consideration is that the agent is not assured of satisfying hunger at the later time. The learner must search for food, which may be an arduous and unsuccessful undertaking.
The agent is not committing to eat something, but to the act of trying to find something to eat. The relationship of urging and the discriminative stimulus is more obvious if the activity is not associated with a continuum of variation. If the learner resolves, “Later, I will cross the river,” the requirements are the same as they are for “Later, I will search for food.” Presumably, crossing the river at the specified time is arbitrary in some ways. For whatever reasons the agent has for crossing the river, the act is planned for “later.” After time has passed, the agent receives an urging to execute the self-imposed content map. If the urgings do not occur, the agent has absolutely no basis for action. The reason is that the content alone has no power to influence the agent. Without the assist from the infrasystem, the agent will be aware of the map, but will have no basis to respond. A specific plan to cross the river would be no more motivating, attractive, or attention compelling than doing anything else.
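The four requirements for a content map that is executed later, and the dependence of action on the urging, can be summarized in a minimal sketch. The class `DeferredContentMap`, its field names, the `state` dictionary, the numeric urging, and the example strategy and criterion strings are hypothetical devices for illustration, not structures the analysis names.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class DeferredContentMap:
    content: str                       # 1. specific content
    strategy: str                      # 2. class of actions or response strategy
    criterion: str                     # 3. criterion that permits feedback
    stimulus: Callable[[dict], bool]   # 4. agent-chosen discriminative stimulus
    urging: float = 0.0                # secondary sensation supplied by the infrasystem


def fortify(cmap: DeferredContentMap, amount: float = 1.0) -> None:
    # The agent directs the infrasystem to attach secondary sensation to the map.
    cmap.urging = amount


def present(cmap: DeferredContentMap, state: dict) -> Optional[Tuple[str, float]]:
    # The map is presented only when the designated later-time criterion occurs.
    if cmap.stimulus(state):
        return cmap.content, cmap.urging
    return None


def will_act(presented: Optional[Tuple[str, float]]) -> bool:
    # Content alone has no power to influence the agent; an urging is required.
    return presented is not None and presented[1] > 0


cross = DeferredContentMap(
    content="cross the river",
    strategy="swimming or wading",             # illustrative value
    criterion="reach the far bank",            # illustrative value
    stimulus=lambda state: state["rested"],    # "After I rest ..."
)

unfortified = present(cross, {"rested": True})   # map presented, but no urging
fortify(cross)
too_early = present(cross, {"rested": False})    # criterion for "later" not yet met
due = present(cross, {"rested": True})           # map and urging presented together
print(will_act(unfortified), will_act(too_early), will_act(due))
```

Only the last case leads to action in this sketch: the agent is aware of the unfortified map, but without the urging it has no basis to respond.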
The infrasystem, not the agent, controls the secondary sensations. Therefore, there must be communication that influences the infrasystem to treat the agent-generated content map the same way it responds to content maps based on predictors of primary reinforcers. That communication must originate from the agent. The system complies and enhances the discriminative stimulus with secondary sensations. Later the map is then presented to the agent with urgings to respond. Routines as Self-Imposed Content Maps. Although the resolution to do something at a later time is not evident from overt behavior, self-imposed content maps may be inferred from behavior patterns not associated with any immediate discriminative stimulus. For instance, let’s say that there is food at three different locations all the same distance from where an animal dwells. The animal develops a routine for visiting the stations. At a particular time it goes to Station 2 every day. At another time, it visits Station 3. It rarely visits Station 1. This routine could not be the product of a content map reflexively imposed by the presence of either internal conditions that relate to hunger or to external stimuli. The food is available at any time, so the food cannot elicit the content map. Some state of hunger may account for eating behavior, but not without a content map that indicates what the organism would do when some combination of hunger and other events have occurred. The routine implies that the agent had to create a content-map rule that applies not merely to the current setting, but to all days. At a certain time, the agent receives the urging not simply to eat, but to recognize the hunger and related events as the discriminative stimuli for executing the routine. The arbitrary nature of the routine is evinced if different learners develop different routines for the same three stations. Each learner has reasons for the routine that it follows. 
Obviously, however, none is based on the same set of reasons. Therefore, each must have been the product of self-imposed criteria. The relationship of the urging to the remote-time content map is demonstrated in human experience, particularly with respect to nonreinforcing activities. For the human as well as other animals, the urge to comply with the map has to be stronger than the punishing consequences of the activity to be performed. (If the urge were not stronger, the map would not be followed.) For example, a person resolves to exercise every day before breakfast. When the time “before breakfast” arrives, the system provides the agent with urgings to follow the map. The agent may find reasons not to follow the urgings—lack of sleep, bad back, too early in the morning (maybe later). The person will comply with the map only if the urge to perform is stronger than the urge not to.
The urging imposed on the agent has been identified variously as the superego, will, or conscience. Regardless of its name, it has dual functions that are necessary for self-imposed performance: (a) it specifies information about what the learner is to do, and (b) it conveys negative reinforcement that persists in urging behaviors to comply with the established resolution.

Agent Influence on Infrasystem

Installing the content map implies specific influences of the agent on the infrasystem. Both the content and secondary sensations associated with the discriminative stimulus must be installed. The natural processes that would provide both the content of the map and enhanced sensations do not apply because following the map is physically more punishing than not following it. If the map is installed on the basis that it predicts an immediate positive outcome, the map is easily contradicted by the events that occur when the routine is followed. The only possible source for changing the valence of the activity from negative to more positive is the agent. Therefore, an agent function is implied.

The fact that the content map may be accessed at times other than those in which the discriminative stimulus is present implies that the content map must be a resource of the agent in the same way that knowledge of the response repertoire is a resource of the agent. Awareness of the map, therefore, occurs in two possible ways: (a) it may be accessed voluntarily by the agent, or (b) it may be presented reflexively by the infrasystem in the presence of the discriminative stimulus.

Figure 13.1 shows the performance system that includes agent-created content maps. These content maps are installed as part of the agent’s resources, along with other learned content retrievable on a volitional basis. The agent has access to the maps in the same way it has access to responses. Components of each content map are also installed for two other infrasystem functions.
The discriminative stimulus for the map is installed in the modifier. It is shown in parentheses because it is not currently activated. However, it is potentially available as a criterion for screening incoming sensory receptions. The presence of the discriminative stimulus will activate the enhanced sensations that accompany the content map. The map is also installed in the modifier. Like the discriminative stimulus, it is in parentheses because it is not active. With these components installed, the infrasystem is now able to screen for the presence of the discriminative stimulus and present the appropriate content map and enhanced sensations. The screening criterion for the discriminative stimulus activates both the secondary sensations and the map. These are presented to the agent in the same way they are for simpler types of learning.
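The two routes of access to an installed map, voluntary retrieval from the agent’s resources and reflexive presentation by the modifier, can be sketched as follows. The class and method names are hypothetical, the sensation value of 0.8 is arbitrary, and storing the discriminative stimulus as a plain string matched against a set of sensory receptions is a simplifying assumption.

```python
class Infrasystem:
    """Hypothetical sketch of where an agent-created content map is installed."""

    def __init__(self):
        self.agent_resources = {}   # map name -> content, retrievable at will
        self.modifier = {}          # stimulus -> (map name, sensation), inactive until matched

    def install(self, name, content, stimulus, sensation):
        # The content becomes an agent resource; the stimulus, map, and
        # sensation are installed in the modifier as a screening criterion.
        self.agent_resources[name] = content
        self.modifier[stimulus] = (name, sensation)

    def recall(self, name):
        # Voluntary access by the agent: no stimulus is needed.
        return self.agent_resources.get(name)

    def screen(self, receptions):
        # Reflexive presentation: a matching stimulus activates both the
        # map and its enhanced sensation.
        return [
            (self.agent_resources[name], sensation)
            for stimulus, (name, sensation) in self.modifier.items()
            if stimulus in receptions
        ]


system = Infrasystem()
system.install("exercise", "exercise every day", "before breakfast", 0.8)

print(system.recall("exercise"))                        # voluntary access at any time
print(system.screen({"midday"}))                        # stimulus absent: nothing presented
print(system.screen({"before breakfast", "kitchen"}))   # stimulus present: map plus sensation
```

The entries in `modifier` play the role of the components shown in parentheses: they are installed but do nothing until the screening criterion is met.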
FIG. 13.1. Agent-generated content maps.
Agent Resources. For the simpler learning system (one that does not construct content maps except in the presence of events associated with a primary reinforcer), volitional access of the content map is not necessarily an agent function. The only type of thinking that is necessary occurs in the presence of stimulus events associated with the pursuits of primary reinforcers. For this model, the agent would have knowledge of the content map only when the discriminative stimulus is present.

The requirements for any sort of planning not necessarily performed in the presence of the discriminative stimulus are more elaborate. The thinking involved in both constructing and retrieving the map must be agent functions. The infrasystem has a role, but the agent has some form of volitional control over accessing the map as well as the features of things relevant to the map.

Access to Features. If the learner had a number of such maps and could access them at different times, the learner would necessarily have the ability to think about relevant features that occur in all of the various maps, including features shared by various maps. Because a broad range of possible features may be involved in the formulation of this library of maps, the learner would have to be able to retrieve information about features. For instance, in planning a route, the learner would have to indicate at least some of the markers or criteria that define the route to be followed. If this plan were formulated at another time and not under the influence of a natural content map, the agent would need access to the features that distinguish this route.

Because the agent is incapable of dealing with features in the abstract and is able to represent features only as they relate to individuals, the agent would have to be designed so it was able to construct or imagine at least some events and their relevant features. For instance, unless the agent represents what it will do initially and where it will go first, a planned route is not possible. In the same way, if the representation does not include some sort of goal or criterion, the planned route is not possible.²

Dual Knowledge Classification System. The ultimate implication is that the agent has its own classification system for various purposes, and the infrasystem has a much larger classification system that incorporates everything the agent is able to access, but that also includes additional content. The disparity between infrasystem classification and that of the agent results from the difference in needs. The agent needs to call on images or representations of things it will encounter, not of abstract processes or the logic of feature analysis. In contrast, the infrasystem must be designed so that it is able to perform all those operations necessary to control the presentation of enhanced sensations and content maps. These operations imply an intricate analysis of features and changes in features.

Agent-Directed Thoughts

The format required for the agent to order a topic or thought is logically the same as that used to plan and direct a response. Just as the agent is able to create a directive to scratch its nose, it is able to issue directives that involve other operations within its repertoire.
Both the directive to produce an overt behavior and the directive for thinking about something require a plan that is in the form of, “I will do X.” The difference is the action. If the plan involves thinking about a temporally remote X that is not prompted by current sensory conditions, the agent accesses its knowledge of both the events and behaviors it may produce. For example, it directs the classification system to produce the images or information about the place it visits. If the agent resolves to visit Station 2 first, it not only directs thoughts about the place, but formulates the complete content map about what it plans to do. This process involves the same operations the agent uses when it plans specific motor responses.

² The agent does not have to represent the entire route or all details of the route because some of the details could be prompted by other details along the route. A familiar experience for humans is not being able to recall how to get to a place, but knowing how to start and where to end up. The person recognizes details along the way that prompt what to do next. “Oh, I remember. We turn right after that big tree.”

The agent’s plan (and directive to respond) functionally has enough specificity for the infrasystem to create an example consistent with the information provided by the plan and directive. So it is with thoughts. If the agent directs a thought about Station 2, the directive must be specific enough for the infrasystem to present the learner with some information about the specific features of that site, information specific enough to prevent the site from being confused with others in the system’s classification system.

Another parallel between thinking about remote applications and thoughts influenced by a discriminative stimulus in the current setting is that the agent is able to direct the information to be more specific. If the learner is walking and the agent receives information that the type of walking initially specified is not working in the current setting, the agent may respecify the details of the walking so that it complies more with what the agent had intended. The same format would perforce apply to the directives involving thoughts. If the agent is concerned with a particular set of features of Station 2, it could direct itself not only to think about Station 2, but to think about the targeted features of Station 2. Note that these features are strictly within the context of Station 2 and are not presented in the abstract. In effect, the agent attempts to recreate an image of the place and superimpose on that image some behavior that will occur. Although the features that the agent creates in thought are of a concrete singular event, the agent may direct the infrasystem to present images of more specific features of the event.
Interaction of Agent and Infrasystem When the agent directs the infrasystem to produce a particular thought, the infrasystem is able to present the thought in varying degrees of specificity; however, the system has the event classified as an event that contains only enough detail to prevent it from being confused with other classified events. This serves as a default setting for producing the thought. If the agent directs the thought of Station 2, the system would tend to produce a representation with only enough detail to prevent it from being confused with the other stations. This is a default setting—the broadest and simplest representation the infrasystem has. If the agent demands more detail, the infrasystem has to search for the specific features that the agent requires. This operation may be achieved through different search formats. However, the only presentation that
makes sense to the agent is one of host plus feature. So the system must search the feature for the host or the host for the feature. Unless it finds the intersect of host plus feature, the search fails.

The amount of detail that the system is able to produce is a function of how much detail the agent (and consequently the infrasystem) has attended to. The more elaborate the record of the event or object, the more the infrasystem is able to respond to requests for more detail. In the end, however, the interaction between agent and infrasystem is the same as it is with other resources. The agent may direct the system to push a 400-pound rock uphill. The infrasystem tries, but does not succeed. In the same way, the agent may direct the system to provide information about a specific feature of Station 2, but the system is not able to comply, either because it does not have the information or because it is not able to locate the combination of host and feature.

For instance, the agent directs thoughts about more details of Station 2. The infrasystem may be able to accommodate the directive. Suppose the agent wants to be in the sun early in the morning, so the agent is trying to determine whether there are sunny areas at Station 2 early in the day. The directive to the infrasystem is equivalent to, “Identify early-morning sun feature of Station 2.” The system may be able to retrieve a memory of the learner being in Station 2 early in the day, standing not far from the food in a sunny area. However, the infrasystem may not be able to satisfy the learner’s directives to produce a thought about the sun features of Station 2.

Search Strategy Formats

As noted, the infrasystem must have more than one search format because the agent may ask about a particular event (e.g., what happened in Station 2 yesterday) or feature (sunny areas at Station 2 early in the morning).
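The requirement that a search succeed only at the intersect of host and feature, approached from either direction, can be sketched in a few lines. The record below is invented for illustration (the features listed for each station are assumptions, not details given in the analysis), and the function names are hypothetical.

```python
# Hypothetical record of experienced places: host -> features attended to.
RECORD = {
    "Station 1": {"food", "shade"},
    "Station 2": {"food", "morning sun", "tall grass"},
    "Station 3": {"food", "afternoon sun"},
}


def hosts_with(feature, record=RECORD):
    """Start from the feature and search for hosts that possess it."""
    return sorted(host for host, feats in record.items() if feature in feats)


def features_of(host, record=RECORD):
    """Start from the host and search for the features recorded for it."""
    return sorted(record.get(host, set()))


def host_plus_feature(host, feature, record=RECORD):
    """Return the concrete host-plus-feature pair, or None when the search fails."""
    if feature in record.get(host, set()):
        return (host, feature)
    return None


print(hosts_with("morning sun"))
print(features_of("Station 3"))
print(host_plus_feature("Station 2", "morning sun"))
print(host_plus_feature("Station 1", "morning sun"))   # no intersect: the search fails
```

A failed search returns nothing to present, which corresponds to the case in which the system either lacks the information or cannot locate the combination of host and feature.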
For a sophisticated system, there would be three basic search formats—search for host, search for feature, and search for host and feature. Host Search. The product of a host search is an individual, event, or group of individuals or events. The criterion for identifying hosts is that they have a particular feature. Examples of host searches would be, “Who do I know who has a lot of money?”, “What is the largest bank in this neighborhood?”, or “Where did I see that painting before?” The result of the search is a host that has a particular feature. For the painting, the system identifies the event that had many features, one of which was that the painting was present. A host search involving the three food stations would be something like, “Which station has the most sunlight early in the morning?” The product of
the search would be a specific place that has many features, one of which is that it has the most sun in the morning. Feature Search. The product of the feature search is a feature. The criterion identifying a feature is that it belongs to a particular host. “How much money did I spend yesterday?” is an example of a feature search. The host is the event yesterday. The feature is the amount of money. Other examples would be, “What kind of dog did she have?”, “What color was their living room?”, or “Which tooth did Jennie lose first?” For the food station example, a feature search would be something like, “What is sunlight like at Station 2?” The system would provide information about the sunlight feature. Host and Feature Search. A more elaborate version of the feature search would have a criterion that specifies both the host and feature. The search is for a feature of the feature or a related feature. A search involving the food station example would be something like, “What is the sunlight like at Station 2 in the morning?” The host (Station 2) is specified. The feature of Station 2 to be identified is not simply sunlight at Station 2 or morning at Station 2, but the feature of Station 2 that is at the intersect of sunlight and morning. Search Strategy Limits The extent to which a dog or goat uses these strategies is not given by the analysis. For remote planning to be possible, the agents would have to represent both the features that serve as variables for the plan and the various hosts in which these features reside. Because there is more than one way to approach the intersect of host and variable, there is more than one possible strategy. The agent has access to the infrasystem’s classification system only to the extent that the infrasystem is able to present the outcome of the search to the agent as a specific concrete example of something that has been experienced or a transformation of events that have been experienced. Unfulfilled Searches. 
The system may not be organized in a way that permits immediate access to a specific combination of feature and host. However, by using more than one strategy (host search, then feature search), the agent may be able to access experiences relevant to the search. The failure of a search to identify the specified host or feature leads to a revised search plan in the same way that failure of a plan to approach X leads to replanning. For example, the content map that generates the plans is, “Find the answer about sunlight in the morning at Station 2.” The unanswered question is enhanced with secondary sensations so the agent is
nagged to perform various searches that may reveal the relationship. Any features of the current setting that are related to the unfulfilled search will reflexively activate the content map. For instance, the awareness of sunlight or the urges to eat could prompt the unfulfilled search. In the end, the system may not be able to fulfill the criteria for the search. The learner will have to obtain firsthand information. Problem Solving, Trial and Error, and Insight The classification paradigm needed by the infrasystem suggests what occurs when the learner exhibits insight or solves a difficult problem through covert steps. The problem facing the system is different from the search for information already specified by the agent because the solution involves a response strategy based on specific features of the problem. The agent must be the main source of determining the solution. Like other examples of learning, the learner does not have preknowledge of the solution. The learner must plan and test the plan. Those plans that work may appear to be insightful, but in a basic way they cannot be any more insightful than those attempts based on trial and error. All are trials; some lead to errors and some don’t. Unfulfilled Expectations. For the learner to exhibit insight, the organism must be designed so that unfulfilled expectations are enhanced with sensation. The agent is designed to achieve consistency and predictability. Those experiences in which expectations are not met are represented and enhanced with strong secondary sensation. They drive the learner to make sense out of something that is not predictable. The problem must be important enough for the agent to spend the time and perform the searches necessary to plan various strategies. 
In other words, the sensations associated with unfulfilled expectations must be strong and extensive enough to produce something like the grief syndrome—all features of experiences that are associated with unfulfilled expectations are enhanced. Although the grief syndrome is the result of the sensitization of false features, the sensitization of features for problem solving is functional because a solution may occur only if the agent attends to the relevant variable. Given that the learner does not know which variables are relevant and exactly how they are relevant, the system needs to sensitize a wide range of features in the problem-solving setting and related features in the agent’s classification system. The unsolved problem is retained by the system and affixed to a content map that urges the agent to “Solve it.” Any example of unfulfilled expectations is a problem, and the search for a resolution is an act of problem solving.
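The search formats described earlier in this chapter, and the replanning that follows an unfulfilled search, can be sketched in code. This is a minimal illustration only, not a claim about how the infrasystem is implemented. The class `Classification`, the function `directed_search`, the idea of a "qualifier" (such as morning) attached to a feature, and the 0 to 10 sunlight scores are all hypothetical names and values introduced for the example.

```python
from collections import defaultdict


class Classification:
    """Hypothetical dual record: features listed by host, hosts listed by feature."""

    def __init__(self):
        # host -> feature -> qualifier -> value
        self.by_host = defaultdict(lambda: defaultdict(dict))
        # feature -> set of hosts that have that feature
        self.by_feature = defaultdict(set)

    def record(self, host, feature, qualifier, value):
        self.by_host[host][feature][qualifier] = value
        self.by_feature[feature].add(host)

    def host_search(self, feature, qualifier):
        """Host search: which host has the most of this feature?"""
        hosts = [h for h in sorted(self.by_feature.get(feature, ()))
                 if qualifier in self.by_host[h][feature]]
        if not hosts:
            return None
        return max(hosts, key=lambda h: self.by_host[h][feature][qualifier])

    def feature_search(self, host, feature):
        """Feature search: what is this feature like for this host?"""
        if host not in self.by_host or feature not in self.by_host[host]:
            return None
        return dict(self.by_host[host][feature])

    def host_and_feature_search(self, host, feature, qualifier):
        """Host and feature search: the intersect of host, feature, and qualifier."""
        found = self.feature_search(host, feature)
        return None if found is None else found.get(qualifier)


def directed_search(c, host, feature, qualifier):
    """Try the intersect first; if that fails, fall back to a broader search."""
    exact = c.host_and_feature_search(host, feature, qualifier)
    if exact is not None:
        return "intersect", exact
    broader = c.feature_search(host, feature)
    if broader:
        return "related", broader
    # Nothing on record: the learner must obtain firsthand information.
    return "unfulfilled", None


c = Classification()
c.record("Station 1", "sunlight", "morning", 2)    # invented scores
c.record("Station 2", "sunlight", "morning", 8)
c.record("Station 2", "sunlight", "afternoon", 3)
c.record("Station 3", "sunlight", "afternoon", 6)

print(c.host_search("sunlight", "morning"))
print(directed_search(c, "Station 2", "sunlight", "morning"))
print(directed_search(c, "Station 3", "sunlight", "morning"))
print(directed_search(c, "Station 3", "shade", "morning"))
```

The third call shows a search that misses the intersect but still returns a related record, and the fourth shows a search that cannot be fulfilled from memory at all.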
The general format for insight is that the system must identify a sameness and extrapolate from features of known relationships to the features of the current problem. The solution through extrapolation requires the system to proceed from known facts and relationships to the possibility that they apply to the current setting. The search that the agent imposes on the infrasystem is to consider either hosts or features of hosts in the current setting and identify the various facts (including response strategies) that may be relevant to the current problem. For example, food is in a cage. The learner is able to reach between the bars, but the food is positioned so that it is beyond the learner’s reach. In the room are several objects—chairs, a table, a bowl, and two long poles. Because the food is a strong primary reinforcer, the learner first tries to reach it through different direct-approach strategies. After many trials, the agent recognizes that direct approaches are not meeting the expectation of obtaining food. The solution to the problem is to use a pole. There are two possible extrapolation routes for arriving at the solution. One is through extrapolation based on known uses of poles; the other is to extrapolate from experiences in which the organism was not able to reach something. The first approach involves a feature search. The learner considers the poles (hosts) and identifies whether one of the response strategies for the pole (features) could imply a reach-extension solution. The second approach involves a host search. The learner starts with the feature of things that are out of the learner’s reach and identifies the various hosts—events in which the learner solved the problem. Feature Search. Both the feature and host searches are logically capable of arriving at the same solution given the classification of events as the sum of their features and by each feature. The routes are different, however. 
For the feature search, the agent examines the hosts in the current setting. The search identifies possible features that may be relevant to solving the current problem. For instance, “What features does a chair have?” or “What can you do with a chair?” The learner may test out the various features by moving the chair or tipping it over. The examination of the poles may involve swinging the pole, tossing, and hitting the bars with it. The system may reveal the solution to the problem through extrapolation. The learner hit the bar while it was at a distance from the bar. The host that permits hitting may be able to serve as the host for touching or manipulating something from a distance. Host Search. For this search, the agent starts with the feature of the problem, “Reach is not long enough.” So the agent may search the classification system for examples of using something to extend the learner’s
reach and whether any are relevant to the current problem. The infrasystem may identify relevant hosts or events, such as the time the learner used a stick to poke at a snake that was beyond the learner’s reach. Something in the current setting that would permit the learner to touch the food might permit the learner to move the food. The learner searches for a host that shares features with the stick. The pole is identified as having the same features as a stick. Therefore, through extrapolation on the basis of sameness of features, the system goes from the stick to the pole and from poking to possibly moving. The pole is tested. The host search may give the impression of being more insightful than the feature search. The learner looks around the room, picks up the pole, and starts poking at the food. In contrast, the feature search may appear to involve more trial and error. The learner seems to be aimlessly rummaging through things in the room. There would certainly be individual differences in how learners approach the problem because there are various search strategies that could lead to the same solution. The more the steps are covert—thinking—the more insightful the solution seems. However, the learner rummaging through things may be thinking as much as the one producing no overt behaviors. The learners are simply approaching the problem in different ways. In any case, there are many possible search strategies. A given learner may perform more than one or even a combination of searches. In the middle of a feature search, the learner may identify a host. Using more than one search strategy logically increases the probability of arriving at a solution, even if the solution occurs by accident. The accident is probably only an accident with respect to a detail, not a blind, accidental behavior. 
If the learner tosses the pole at the food and then reaches in to retrieve the pole and accidentally comes in contact with the food, the discovery of the solution was an accident, but it occurred because the learner was trying to do something to move the food. This requires thinking, logic, and classification of events and features of events. SUMMARY Thought is a necessary function for planning. There are different levels of planning and therefore different levels of thought. The simplest forms are plans governed by a content map. The thought required when the content map is given is guided by both the sensitized discriminative stimuli and urgings to respond. In the simplest form, the presence of thought is reflexive. The presence of the discriminative stimulus compels attention and thought. The content of the thought is partly determined by the sensitized features and content map, but the particular thoughts that lead to the plan are based on details of the present setting.
The functions associated with thought become more complicated when the learner is under the influence of more than one content map. If the maps are nearly equal in strength, the agent is forced to formulate a plan about a plan. This takes the form of a decision to follow the urgings to execute a plan for B rather than a plan for A. Once the decision has been made, the agent formulates and executes the plan for the selected content map. The presence of competing maps provides possibly the simplest example of the learner thinking about an action (the plan for A) without actually attempting to follow the plan. Thought without action occurs during learning. The learner thinks about one of the features that have been sensitized, but subsequently weakened or discredited. For these operations to be possible, the system must be designed to classify individuals both as the sum of features and classify each feature (with the individual listed under each feature). If the individual or event involves strong positive or negative stimuli, the system sensitizes all unique features of the individual. Some of these are false or irrelevant features based on an illogical conclusion. For some situations, they are easily discredited. For others, they persist as false features. For instance, the system concludes in effect, “If the individual is to be avoided, any feature of the individual is to be avoided.” The conclusion is safe because the agent that follows it will definitely avoid contact with the individual. However, the conclusion is not productive because it will also lead to avoiding other individuals that have one or more of the false features. Once the learner learns that features are false, the agent is still presented with the urgings to do something the learner knows is unnecessary. These false features often result in the agent thinking about the feature and being aware of the urging to avoid the feature without producing the behavior that is urged. 
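The way false features arise can be illustrated with a small sketch. It assumes, for illustration only, that an individual is a set of feature labels and that "unique" means not shared by any other individual already on record. The names `unique_features` and `urges`, and the example individuals, are hypothetical.

```python
def unique_features(target, population):
    """Features of the target that no other known individual shares."""
    others = set()
    for name, features in population.items():
        if name != target:
            others |= features
    return population[target] - others


def urges(sensitized, encountered):
    """Sensitized features that are present in a newly encountered individual."""
    return sorted(sensitized & encountered)


known = {
    "dog that bit": {"four legs", "brown coat", "red collar", "growls"},
    "house cat": {"four legs", "gray coat"},
}

# Strong negative stimulus: every unique feature of the individual is sensitized.
sensitized = unique_features("dog that bit", known)

# A different individual shares one of those features, so the urge is presented.
friendly_dog = {"four legs", "brown coat", "wags tail"}
print(urges(sensitized, friendly_dog))

# Discrediting works feature by feature, not wholesale.
sensitized.discard("brown coat")
print(urges(sensitized, friendly_dog))
```

The conclusion is safe in the sense the text describes (the original individual is always covered), but the first `urges` call shows its cost: a harmless individual that shares a coat color also triggers the urging.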
Extensive false-feature sensitization occurs when individuals that are primary reinforcers must be reclassified by the system. Because the learner knows an individual both as an individual and as the sum of many features, the system sensitizes all features of the individual following the death of the individual, for instance. The system now reflexively causes a wide range of thought not associated with action. Encounters with virtually any feature or combination that is shared with the individual reflexively cause thoughts of the individual. Each must be extinguished individually. False features are a necessary price that the system must pay if it is to perform the other productive applications that involve treating individuals as the sum of and as examples of features. All these examples address reflexive thoughts largely governed by the infrasystem. The agent does not direct the thought. The thought occurs to the agent in the presence of particular stimulus conditions. The agent is
compelled to respond to the urgings the system presents. For sophisticated learners, another type of thinking is available—directed thought. For this type, the agent directs the infrasystem to produce thoughts in the same way that it directs the infrasystem to produce motor responses. For volitional or directed thought to be possible, the agent must have a repertoire of content. Basically, the agent must be designed to share the part of the infrasystem’s classification system that represents individuals and events. The efficient infrasystem provides the most general representation of the subject or host that is directed by the agent. If the agent directs the thought of individual X, the thought is presented as some features unique to the person. It does not present the sum of the features that the system has on record, merely those that are sufficient for distinguishing the individual from others. This format of presenting the most general representation first is parallel to the formats that characterize other agent directives. The system provides the agent with default settings for motor responses. The learner does not have to direct every detail of walking, but is provided with a default content map that requires only a minimum specification from the agent. The attention system classifies the most global units that characterize candidates for attention. When a horn of 110 decibels and a pitch of 260 cps sounds, the system first responds to it as something in the class of loud, unanticipated noises. The other, more specific features are added to identify it as a horn, possibly as a truck horn. Much more detail is available, but it is not relevant to the needs of the current setting. For all these applications, the system has a default setting that functions as something like a wide-angle view. The same wide-angle strategy is used for representations of individuals. For example, the wide-angle view presents the quiddity of Dorothy. 
A more detailed articulation is needed to comply with the search, “What’s that funny thing she does with her napkin?” Planning of remote temporal events requires access to the classification system shared by agent and infrasystem. There are three different types of mental pursuits that are possible for planning future events and problem solving—host search, feature search, and host and feature search. The solution to a problem is often revealed more readily by the host-first pursuit in some settings, the feature-first pursuit in others, and some form of combination in others. The efficient system employs various searches by enhancing any expectations that are not being met and enhancing any hosts (events) that have the feature of unfulfilled expectations. When the infrasystem classifies the hosts as the sum of their features, the agent has access to the host or features of the host through various searches. The solution to a unique problem that relates to prior knowledge is possible because features and hosts of past experiences are available for the current setting. These are extended through extrapolation of features—identifying what is the same about this situation and others the
learner has encountered. If the samenesses are identified broadly, the system has the opportunity to formulate connections between the features of hosts in the current setting and hosts and features that have been used to solve problems of the same general class. The result is what may be classified as insight rather than trial and error. This distinction is artificial because both solutions require trial and error. The only difference is the relative lack of overt behavior. Regardless of the extent of the overt behavior, however, the operation is supported with a great deal of covert behavior: thinking and planning.
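The two extrapolation routes in the cage example from this chapter can be sketched as follows. The feature labels, the threshold `min_shared`, and the function names `feature_route` and `host_route` are assumptions made for the illustration; the chapter does not specify how many shared features are needed for the system to treat a pole as the same as a stick.

```python
# Hosts in the current setting, each listed with invented feature labels.
SETTING = {
    "chair": {"rigid", "heavy", "sit on", "tip over"},
    "bowl": {"rigid", "hollow", "hold things"},
    "pole": {"rigid", "long", "thin", "graspable", "swing", "hit from a distance"},
}

# A remembered event in which an out-of-reach problem was solved.
PAST_EVENTS = [
    {
        "problem": "out of reach",
        "tool": "stick",
        "tool_features": {"rigid", "long", "thin", "graspable"},
        "response": "poke",
    },
]


def feature_route(setting, needed):
    """Feature search: examine each host and ask whether any feature fits the need."""
    return sorted(host for host, feats in setting.items() if needed in feats)


def host_route(setting, past_events, problem, min_shared=3):
    """Host search: start from the problem feature, recall the hosts that solved it,
    then look for a present host that shares enough features with the recalled one."""
    candidates = []
    for event in past_events:
        if event["problem"] != problem:
            continue
        for host, feats in setting.items():
            if len(feats & event["tool_features"]) >= min_shared:
                candidates.append((host, event["response"]))
    return sorted(candidates)


print(feature_route(SETTING, "hit from a distance"))
print(host_route(SETTING, PAST_EVENTS, "out of reach"))
```

Both routes arrive at the pole. In either case the result is only a candidate plan; as the text notes, the pole must still be tested.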
Part IV
HUMAN LEARNING AND INSTRUCTION

Chapter 14
Human Learning
The human system is capable of learning an incredible range of patterns and content. The human system has been shaped to accommodate the corresponding elaborated functions of the agent. However, there is nothing new in the format of what is learned. The human system does not do anything that is unique to humans. It simply has the capacity to do the same things better and to extend them to areas beyond those within the purview of other species. The basic processes are the same because the learner encounters nothing but individual things and events. Any learning achieved by the system involves sameness of features and only sameness of features. Because the learner does not have preknowledge of what it will learn, the system must be designed so that it receives the raw material necessary to identify samenesses, and so it has the processes needed to extract relevant content information from the receptions. The system is able to learn more because it represents and retains more. The raw material is recorded as individuals or events and as the sum of their features. The individuals may be regrouped according to common features. In the population of 16 presented in chapter 10, one classification was based on color only. The things that were grouped were individuals. Any individual that is in this red group has features in addition to redness, and therefore could be grouped on the basis of any of these or any combination of these features. Logically, the same individual would appear in many different groups, each defined by the presence of specific features.
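The point that one individual belongs to many groups can be made concrete with a short sketch. The three-member population below is invented for illustration and is not the population of 16 from chapter 10; the function names `feature_groups` and `members` are likewise hypothetical. An individual with n recorded features falls into 2^n - 1 groups, one for every non-empty combination of its features.

```python
from itertools import combinations


def feature_groups(features):
    """Every non-empty combination of an individual's features."""
    ordered = sorted(features)
    return [frozenset(combo)
            for size in range(1, len(ordered) + 1)
            for combo in combinations(ordered, size)]


def members(population, *required):
    """Individuals that have all of the required features."""
    needed = set(required)
    return sorted(name for name, feats in population.items() if needed <= feats)


population = {
    "A": {"red", "large", "round"},
    "B": {"red", "small", "round"},
    "C": {"blue", "large", "square"},
}

print(len(feature_groups(population["A"])))   # 2**3 - 1 groups for three features
print(members(population, "red"))
print(members(population, "red", "large"))
print(members(population, "large"))
```

Individual A appears in the red group, in the large group, and in the group defined by red and large together, which is the regrouping the paragraph describes.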
This chapter presents examples that illustrate both the classification and learning phenomena as they relate to a series of events. This chapter also revisits the issue of learning to learn, which is illustrated in chapter 11. Chapter 15 extends the same phenomena to language usage and learning. The underlying theme of all these applications is that the learning and performance phenomena rely on the system’s dual classification of individuals or events as individuals and also as each feature of the individual.
RECEPTIVE AND EXPRESSIVE FUNCTIONS

We recently did a casual experiment. We observed a learner on an early morning walk (6:30 a.m.). He did not know that we were observing him. Immediately after he returned, we gave him the general direction, “Tell us about all the things you saw on your walk.” Remarkably, the learner described the route, the rate of walking, nearly all of the five people he passed, where he encountered each, what they said when they passed, the illumination of some buildings and streets, how his body felt (sore right Achilles tendon), and many other details.

After the learner described the details of the walk, we asked a series of additional questions about the various categories of things the learner reported. All of these were open-ended questions that did not name anything the learner had not already named. For example, “You mentioned the illumination near the software company. Can you tell us more about the illumination?” “You mentioned people that you passed. Can you tell us more about the people you passed?” Finally, we asked more direct questions about details related to what the learner reported, such as questions about what the people he passed were wearing. Within each round of questions, the learner provided more detail. However, the questioning failed to produce a perfectly accurate account of everything he encountered. For example, the learner failed to identify one of the people he passed, a uniformed female security guard.

After presenting the questions, we went along with the learner as he retraced the route and made additional observations, most of which were accurate. He was still unable to answer some specific questions about specific details, saying something like, “I don’t know” or “I didn’t pay any attention to that.” The learner’s performance has been reenacted many times in courtrooms, where the learner must reconstruct concrete events.
The learner is not always accurate, but is able to remember a host of details and is often able to remember more detail when returning to the scene of the event.
Reflexive Thought

The learner’s performance provides evidence that the agent and infrasystem share classification of individual events, but that if the learner is provided only a general directive, such as “Tell us about all the things you saw on your walk,” the agent may not gain access to the entire infrasystem record. For instance, during the reenactment, the learner remembered observing the uniformed security guard. He was about a block from the place where he passed her. He said, “Oh yeah, I remember. I passed her right up ahead, by those condominiums. I noticed that she disappeared sometimes. She was wearing a dark blue uniform and long pants.” (Was she wearing a hat?) “I’m not sure—maybe.” (Do you remember her hair color?) “Yeah, I think she had dark hair. I don’t think she was wearing a hat.” (She was, but everything else you said is correct.)

If information about that guard were not stored in the infrasystem, there would be no basis for the agent to recover it under any circumstances. The fact that the recollection is more likely under some conditions implies that there are properties of these situations that increase the likelihood of the agent having access to the infrasystem. Also important is that the recollection of the guard was imposed on the learner. “Oh yeah, I remember.” The agent did not direct the thought; the thought came to the agent.

Four specific implications outlined in earlier chapters are suggested by this informal experiment:

1. The learner possesses two memory records, one accessible by the agent and one solely for the infrasystem.
2. The records are not coterminous; rather, the infrasystem’s memory includes everything available to the agent, but also includes additional information that is either never available to the agent or tends to be available under some circumstances.
3. The agent’s classification system contains only references to concrete specific things and events—to features, but not in the abstract, only as parts of concrete events.
4. The fact that some thoughts are presented reflexively and not voluntarily implies that the infrasystem has some mechanical or hardwired basis for presenting the thought to the agent in specific contexts.

The only possible basis for the probability of the recollection occurring in the context of the walk, rather than the context of reconstructing the details of the walk, is the number of sameness features. The reenactment presented a larger number of features shared by the original events. So the
infrasystem used these samenesses in features as predictors of some of the other features observed. This is what the system must do if it is to learn. For the system to be efficient at learning, it must record prior events, and it must present them to the agent when correlated features are in the sensory present. The logic of the system is based on the assumption that something predicted the security guard. If the place is a predictor, the presence of the place triggers the infrasystem to present the trial content map to the agent. During the learner’s reconstruction of the place, fewer details were present to prompt the others. Complex Patterns. If there is any pattern associated with the security guard, the data that support the pattern must be available on the various appearances and nonappearances of the security guard. Given that the system does not know which pattern is the right one, the system that learns any of the possible patterns must record an enormous amount of information. We prompted the infrasystem to produce this memory under artificial circumstances—by asking the learner to list the features of the walk. However, let’s say that there was no interview and the learner did not retrace his route and answer questions. Also let’s say that the security guard followed the same route at the same time only on Thursdays. This is one of hundreds of possible patterns. Would the typical human learner be able to learn this pattern? Absolutely. Furthermore, the agent would not necessarily recall the security guard on the days that intervene between the first Thursday and the next Thursday. On the following Thursday, however, the learner would encounter the guard again. Only if there was a record of having encountered the guard earlier could the record be presented to the learner on this occasion. 
Then the infrasystem presents the agent with awareness of the features that are the same as those in a previous walk—the same guard, walking in the same area, the same direction, and exhibiting the same behavior (appearing from a distance and then periodically disappearing). At this point, the agent is alerted to the sameness in features by a reflexive presentation of the memory of the earlier event. It is experienced as insight. “Oh yeah, I saw her before.”
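Two steps in this account lend themselves to a short sketch: a stored record is presented when enough of its features recur in the sensory present, and a pattern such as "only on Thursdays" can be identified only if both appearances and nonappearances are on record. The feature labels, the recall threshold of 3, and the function names `recalled` and `consistent_predictors` are assumptions introduced for the example, not values given in the text.

```python
def recalled(memory, current, threshold):
    """Records whose overlap with the current setting reaches the threshold."""
    return sorted(name for name, feats in memory.items()
                  if len(feats & current) >= threshold)


def consistent_predictors(walks):
    """Features present on every walk with the guard and on no walk without her."""
    seen = [w["features"] for w in walks if w["guard"]]
    unseen = [w["features"] for w in walks if not w["guard"]]
    if not seen:
        return set()
    common = set.intersection(*seen)
    return {f for f in common if not any(f in feats for feats in unseen)}


memory = {
    "guard encounter": {"condominiums", "same street", "early morning",
                        "distant figure", "dark blue uniform", "figure disappears"},
}

interview = {"early morning"}                                   # few shared features
reenactment = {"condominiums", "same street", "early morning"}  # many shared features

print(recalled(memory, interview, threshold=3))
print(recalled(memory, reenactment, threshold=3))

walks = [
    {"features": {"Thursday", "6:30 a.m.", "condominiums"}, "guard": True},
    {"features": {"Friday", "6:30 a.m.", "condominiums"}, "guard": False},
    {"features": {"Monday", "6:30 a.m.", "condominiums"}, "guard": False},
    {"features": {"Thursday", "6:30 a.m.", "condominiums"}, "guard": True},
]
print(consistent_predictors(walks))
```

The interview context does not reach the threshold while the reenactment does, which parallels the learner recalling the guard only on the retraced route. In the second part, the time and place are ruled out as predictors only because the walks without the guard were also recorded.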
REINFORCERS AND PREDICTORS This process has been described in all chapters detailing basic learning. The system makes a record of the security guard and the features that accompany the encounter with the guard and enhances these features so that when they recur, the memory is presented to the agent. Because the record
is enhanced with secondary sensations, the agent is compelled to attend to it and other details of the current setting. This scenario assumes that there is some form of reinforcer. In this case, the security guard is the reinforcer. (We identify the reasons later in this chapter.) Because the learning will focus on the reinforcer, the second encounter with the security guard would not be a detached awareness, but would be accompanied by the emotional charge that characterizes all enhanced stimuli. The agent does not experience something that would be expressed as, “Oh the distant perception of the figure by the condominium sparked the memory of the security guard.” Rather, the information about the reinforcer is presented in the spirit of something worthy of attention. “I’ll bet that’s the same security guard I saw before.” The infrasystem discloses only its conclusions to the agent, not its processes. The presentation is in the form of representations of concrete events, not simply a display of features void of their historical context. In this case, the concrete event was the record of the earlier encounter.
Peripheral Reinforcers Learning a pattern for encountering the security guard is different from the learning described in earlier chapters. The major difference is not necessarily in the magnitude of the record acquired, but rather with the reinforcer. Obviously, there was no primary reinforcer to prompt the learning of the guard’s schedule. Possibly no secondary reinforcer or tertiary reinforcer is identifiable. Certainly, it is possible to create some sort of diagram that shows the generalization from early experiences to later ones and from more proximal encounters with primary reinforcers to those that are more remote. However, this diffusion from learned reinforcers is probably not a sufficient basis for the organism to generate such an extensive network of functional reinforcers. If we consider that the learner would be capable of learning any of the thousands of possible patterns that might occur on his walks, we would be hard pressed to account for why the learning was worthwhile to the learner. Many patterns would seem to be poorly grounded in reinforcers. The performance system is designed to attend to unanticipated occurrences and changes. Further, the system is designed so that attending to such detail and learning patterns is reinforcing to the agent. These aspects of learning are discussed in chapter 15. Regardless of the process, however, the security guard functions as a peripheral reinforcer. These are reinforcers learned peripherally in connection with other pursuits. They are not a central concern to the learner. The learner’s objective was not to learn about a security guard’s schedule, but to go on a walk. The learner had no goal of
interacting with the people encountered on the walk other than exchanging a greeting. For humans, peripheral reinforcers function as primary or secondary reinforcers. They are recorded, sensitized, predicted, and represented in content maps according to the same set of rules that govern any reinforcer. Like any other reinforcer, they compel attention and imply planning or thought. Priorities and Individual Differences. This description of human learning does not imply that all things are equal to the system. The system is still ruled by primary reinforcers and will still devote more attention to features that predict primary reinforcers than to those that predict peripheral reinforcers. The priorities for learning are determined by the strength of the secondary sensation. The sensations associated with primary reinforcers guarantee that all learners will learn basic relationships and basic predictions. The priorities also suggest that the behavior planned by the agent is greatly influenced by the strength of primary reinforcers. The primary reinforcers do not determine what will be learned, but circumscribe the scope and, therefore, the relative amount to be learned. If the learner is concerned with the immediate details of survival and has little time not dominated by sensations of hunger and cold, and by activities that require the learner to fully attend to details of the job, the probability of the learner learning about peripheral reinforcers is reduced. The learner learns much about the ambient details and features that predict success or failure of the current pursuit. Yet details independent of this pursuit have been preempted by the strength of the sensations related to pursuit of the primary reinforcers. If particular features of rocks predict that they will make noise if stepped on, these features will be learned because they are relevant to pursuits. 
The fossil impressions in these rocks are not relevant to, or reinforced by, the task and therefore probably will not be attended to or learned. The less the learner is dominated by immediate demands, the greater the potential for peripheral learning to occur. The morning walk requires the agent to be alert to current sensory input. However, because most of the input does not require serious changes in plans or possible unanticipated interactions, it does not place requirements on the system that preempt attention to details not directly related to the current activity. In other words, the walking is governed by what amounts to a default content map, which permits the learner to attend to peripheral, unanticipated features. Predictors of Interactions. The learner attends more to people than to other details of the setting primarily because the image of a person walking toward the learner predicts (a) an interaction—a “Good morning” or
“Hello,” and (b) a degree of uncertainty. The anticipated interaction requires a plan. This plan is different from the plan of simply walking. It involves determining what will be said and anticipating a possible response. So the record for each person encountered on a walk is necessarily more elaborate than those for the bus that goes by or the sidewalk in front of the condominiums (which the learner could not describe initially, but recalled on the “validation walk”). Memory Record. Because the record of the security guard has more features than that of the bus or sidewalk, there are logically more potential links between later events and the memory of the security guard. The reason is that any feature of the guard has more potential to activate memory of the guard (see false features discussion in chap. 13). If the memory of the guard has more features than those details that do not require uncertainty or planned interaction, there is a greater potential for the system to encounter one or more features. The presence of any of these has the potential to trigger the memory of the guard. In contrast, although the sidewalk in front of the condominiums has an unusual pattern of alternating grooved areas (one about 1 foot in length followed by one about 3 feet in length), the learner did not recall this pattern even on the validation walk. After it was pointed out to him, he reported that he had noticed it.
FEATURE-HOST SEARCHES For the learner to learn that Thursdays predict an encounter with the security guard, the system must ultimately have the guard classified as a feature of Thursday under a heading equivalent to “Things that happen on Thursday.” The system must also have Thursday identified as a feature of the guard. Only if Thursday is recorded as what functionally amounts to a feature of the guard could the system learn the correlation between Thursday and the appearance of the guard. If today is Thursday, that feature and only that feature predicts seeing the guard today. Stated differently, the intersect of Thursday and guard is not possible unless Thursday is a feature of the guard and the guard is a feature of Thursday. This arrangement makes it possible for the system to know that Thursday predicts the guard and the guard predicts Thursday. For Thursday, which is a single feature not physically or immediately related to the guard, to predict the guard, the system must have identified Thursday as a common feature of encounters with the guard. Many features of the first encounter with the guard are recorded in the infrasystem. Thursday may not be one of them. When the guard appears again, she is enhanced and presented to the agent as a reflexive thought or epiphany. “Oh yeah, I
saw her before, right in the same place.” This reflexive thought is enhanced with sensation so the agent will attend to the various details of the current setting. This is important for possible learning because enduring and predictive features in the current setting are the basis for possible patterns. On the second encounter, the learner may or may not make the connection that the first encounter was on the previous Thursday. However, because the guard received enhancement on the first encounter, the record for the second encounter is more elaborate. The learner may wonder whether the first encounter occurred on Thursday. It is possible that the feature of Thursday will not be recorded on this second encounter, but the probability is greatly increased because of the possibility that there is a pattern for predicting the encounters with the security guard. The system presents the agent with the enhanced record of the guard containing various details of the first presentation that could serve as predictors. The agent thinks about these details. In addition, the agent may direct the repertoire to produce additional details of the record and make conscious comparisons of the two encounters. Although this focus on the guard results in a relatively extensive record, it does not require more than casual attention by the agent. This attention during and following the interaction is enough for the infrasystem to make various trial content maps and imprint the record of features with secondary sensations so that it will be preserved in memory. Following the second encounter, the infrasystem has three records—one for the first encounter, one for the second, and one based on the features that are the same about both encounters. A combination of the shared features record and the features of the second encounter that may be shared with the first encounter (but that have not been confirmed) are the basis for various trial content maps. 
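The three records described above can be pictured as simple set operations. The following is a minimal sketch only, not a claim about how the infrasystem is implemented; the feature names and the function `build_records` are hypothetical and chosen purely for illustration.

```python
# Hypothetical feature sets for two encounters. The feature names are
# illustrative assumptions, not records taken from the text.
first_encounter = {"guard", "uniform", "near_condominiums", "raining"}
second_encounter = {"guard", "uniform", "near_condominiums", "thursday"}


def build_records(first, second):
    """Return the two encounter records, the shared-features record,
    and the pool of possible predictors for trial content maps."""
    shared = first & second        # features that are the same in both encounters
    unconfirmed = second - first   # in the second record, not confirmed for the first
    return {
        "first": first,
        "second": second,
        "shared": shared,
        # Shared features plus unconfirmed second-encounter features
        # form the basis for trial content maps.
        "possible_predictors": shared | unconfirmed,
    }


records = build_records(first_encounter, second_encounter)
```

In this sketch, Thursday is absent from the shared record because it was not confirmed for the first encounter, yet it still appears among the possible predictors.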
At this point, the system assumes that any features that occurred in the second encounter and may have occurred in the first are potential predictors. In this category is the feature of Thursday. It may have been in the first record; it is more likely to be in the second. If it is, it is recorded as a possible predictor. The agent is aware of at least some of these possible connections. “I wonder if I’ll see her again next Thursday.” On the third encounter, the learner may or may not recall that it is Thursday when he starts the walk. If not, this feature will be reflexively presented to the agent during the walk, possibly not until the image of the guard appears or possibly a block or more before the guard is visible. “Wait a minute. Wasn’t it on Thursday that I last saw her? Wasn’t it a Thursday the first time I saw her?” If the learner encounters the guard again on the following Thursday, the infrasystem would reconfirm the pattern and the learner would experience another epiphany. “I knew it. I’ll bet she takes this route at this time every Thursday.” The learner may plan to ask the guard if his observation is correct or may decide to say, “Good morning,” while tempted to say, “We have to stop meeting this way.”
The process of developing the content map is the same as that specified in chapter 6 for basic antecedent learning and shown in Fig. 14.1. Before the learner’s third encounter with the security guard, the system has a trial content map based on the feature of Thursday. Calculation 2 is relatively extensive because the system’s expectations and projections are extensive. In addition to projections about the route to be followed, the learner has expectations about features of the various parts of the route. With regard to people that will be encountered, the learner anticipates encountering no more than five or six because it is early morning. Most of them are anticipated to be near the parking garage for the software company. The learner anticipates greeting everyone he passes. He expects most of them to respond, with the highest percentage of nonresponders on their way from the garage to the software company. He expects not to see anyone he has seen before. The trial content maps about the various expectations are not of the permanent variety, which means they may be easily modified by additional information. On the second encounter with the security guard, the sensory reception of the guard at a distance triggers the comparison of the projected outcome with the realized outcome (Calculation 2). The projected outcome of the learner not encountering someone encountered previously is contradicted by the sensory reception of the guard. Therefore, a modified trial content map is formulated based on the contradiction (Calculation 3). Just as in other learning, there are many possible patterns for encountering the guard: seeing her only twice, seeing her on a predictable schedule, seeing her again but in no predictable pattern. The result of Calculation 3 is the revised map with revised expectations for patterns that may be confirmed or disconfirmed.
Templates Many of the features the system classifies are organized around templates or stereotypes. These templates are a kind of default representation that permits the system to classify clusters of features that tend to occur together, without specifying all of them. For instance, the system has various templates for any familiar class of host—houses, cups, women—that permit quick classification and facilitate the task of identifying features that distinguish the individual from others in the class. For instance, a template for classifying females may be based on the size and shape range of most. Those details that are unusual are listed as unique attributes of the individual. Those details that fall within limits of the template are not recorded. The result is something like, “Woman, tall, dark hair, dainty walk.”
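One way to picture template-based recording is as storage of deviations only. The sketch below is an assumption-laden illustration: the dictionary `WOMAN_TEMPLATE`, its default values, and the helper names are hypothetical, and features the template does not cover (such as hair color) are simply treated as noteworthy.

```python
# Hypothetical default values for the class template. These are
# illustrative assumptions, not values given in the text.
WOMAN_TEMPLATE = {
    "height": "average",
    "build": "average",
    "gait": "average",
    "posture": "average",
}


def record_individual(class_name, template, observed):
    """Keep only the observed features that deviate from the template
    or that the template does not cover at all."""
    noted = {
        feature: value
        for feature, value in observed.items()
        if template.get(feature) != value
    }
    return {"class": class_name, "noted": noted}


def reconstruct(record, template):
    """Fill in every unrecorded feature with the template default."""
    return {**template, **record["noted"]}


observed = {
    "height": "tall",
    "build": "average",
    "gait": "dainty",
    "posture": "average",
    "hair": "dark",
}
record = record_individual("woman", WOMAN_TEMPLATE, observed)
full_description = reconstruct(record, WOMAN_TEMPLATE)
```

The stored record here corresponds to “Woman, tall, dark hair, dainty walk”; the unremarkable build and posture are recovered from the template rather than from the record.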
FIG. 14.1. Development of content maps.
This is a receptive variation of the strategy the system uses to specify features of a response. For a response, the agent specifies a relatively broad class, such as “Walk normal,” and then adjusts only those details of the response that are responsible for performance that is not as planned. To make any necessary adjustments, the system then responds to discrepancies between projected and realized outcomes. In the receptive variation, the system templates a relatively large cluster of features that tend to occur together (in this case, the category woman). The system then highlights discrepancies between the template for woman and the current individual encountered. If the woman was unusually tall, walked with a limp, or had a prominent nose, the system would record these features as deviations from the template. By default, the record of this individual implies that the features not noted fall within the receptive template limits. The template cannot accommodate features that are variable, such as eye color. However, the system would have these features grouped by a smaller template. For instance, dark hair and brown eyes go together. Dark hair and blue eyes may be noted as a discrepancy. The security guard was possibly recorded as a woman wearing a uniform. She was noted to be fairly young, attractive, with dark hair. By default, she was average in height, build, gait, posture, movement, and many other traits. Predictive Features One of the more contraintuitive aspects of the relationships among Thursday, the pattern, and the security guard is that Thursday is the only feature of a day that predicts the security guard. So in this sense, the security guard is a feature of Thursday. If we look at the guard, however, we note that one of her features is that she appears only on Thursday. Viewed this way, Thursday is a feature of the guard. If we consider the classification of the individuals in chapter 10, we see the same relationship. 
We can create various scenarios for relating features and hosts. For instance, if Station 5 is a host, the eight individuals that go in the host are features of the host. Individual 1 is a feature of the host because Individual 1 is red. If we look at the features of an individual, such as Individual 1, we note that it has various features, one of which is redness, another one of which is that it goes in Location 5. We can reveal host and feature relationships of the guard and Thursday by directing the learner to list all the features the learner knows about Thursday. The learner’s list of features would certainly include the security guard, but the list would include many other facts the learner knows about Thursday: how it is related to the calendar and various other calendar facts, how Thursday is spelled, various historical events that occurred on Thursday, the capitalization feature, how it is pronounced, how it is expressed in different languages, conventions such as time zones that affect Thursdays, events in the learner’s life that occurred on Thursday, all the things in the learner’s life that routinely occur on Thursday (including those things that may occur on other days as well), routine events that are unique to Thursdays, Thanksgiving, songs (or poems, stories, and legends) that refer to Thursday, and so forth. The learner would be able to defend any item by explaining its relationship to Thursday. If we questioned the reference of Thursday to years, the learner might respond, “Well, there’s an average of 52 Thursdays in a year.” If we next asked the learner to list all the things he knows about the security guard, Thursday would be in the list. The list would include place, time, details of the uniform, typical greetings, walk, build, and prominent features. For this listing, the host is the guard and Thursday is a feature.
Agent-Directed Searches
If the agent tries to determine whether the first three encounters occurred on Thursday, the agent must direct each search that occurs. The process is roughly the same as that performed by the infrasystem. The primary differences are:
1. The agent must direct each step, whereas the operations of the infrasystem would, under specific conditions, perform the same searches automatically.
2. The feature is not actually abstracted from the two occurrences, but is rather a fact that describes the two individual situations. “I saw her last Thursday, and I see her again on this Thursday.”
3. The projection refers to the specific event that is anticipated, not to an abstract rule. “I bet I’ll see her next Thursday” or “I bet I’ll see her on every Thursday.”
The system allows the agent to search events for features.
To search for Thursday, the agent searches the recollection of each event in which the security guard appeared and determines whether it had the feature of Thursday. If the agent-directed search leads to a positive classification, the findings tend not to occur as a revelation out of the blue, but simply as the product of the search. The route to the conclusion may involve using indirect information. For instance, “Yes, I’m sure that it was on Thursday because it was the first day I wore those new shoes and I remember my Achilles was acting up a little bit and I was worried about those shoes. I’m sure that’s the same day I saw her the first time—Thursday.”
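The two ideas in this passage, that Thursday and the guard are each a feature of the other, and that an agent-directed search checks each remembered event for a named feature, can be sketched as follows. This is a hedged illustration only: the class `FeatureHostIndex`, the function `search_events`, and the event records are hypothetical and are not structures described in the text.

```python
from collections import defaultdict


class FeatureHostIndex:
    """Symmetric links: each item is recorded as a feature of the other."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        self._links[a].add(b)
        self._links[b].add(a)

    def features_of(self, host):
        return set(self._links.get(host, set()))


def search_events(events, host, feature):
    """Check every remembered event that contains `host` for `feature`."""
    return {
        event["label"]: feature in event["features"]
        for event in events
        if host in event["features"]
    }


index = FeatureHostIndex()
index.link("Thursday", "security guard")

# Hypothetical event records; labels and features are illustrative only.
remembered_events = [
    {"label": "first encounter", "features": {"security guard", "new shoes"}},
    {"label": "second encounter", "features": {"security guard", "Thursday"}},
    {"label": "bus passing", "features": {"bus"}},
]
findings = search_events(remembered_events, "security guard", "Thursday")
```

In this sketch the first encounter does not carry the Thursday feature directly, which is the situation in which the agent would have to fall back on indirect information such as the new shoes.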
When the agent voluntarily searches the record, the steps are quite different from those that lead to the revelation that, “Wow, there she is. And it’s Thursday.” The agent-directed search does not occur in the context of highlighted information from the infrasystem. The agent must create the directives for the search and basically access the records available to the agent. The agent identifies the criterion for the search, and the system responds by producing at least part of the record. Strength of Directives. If the agent does not make the commitment to exert a lot of effort in directing the body to move a rock, the body may not produce enough force to succeed. In the same way, the search for information that is temporally remote requires more effort than the search for recent events. Even if the information about the first encounter is currently on record, it is not readily accessible to the agent. Therefore, the agent must persist in exercising great concentration or focus on that single question, and present many trials of this search before the infrasystem reveals the information. The operations parallel what the agent would have to do to free a rock wedged between two other rocks. The agent would have to push, pull, turn, try to lever, and persist until the rock was free. The pushing and pulling in the case of recovering details of the first encounter are represented by different tactics, such as trying to construct a connection between events known to have occurred on Thursday. For something as peripheral as the security guard’s schedule, it is highly unlikely that the learner would devote much time to the search. A more casual search may leave the agent with a sense or hunch that it did occur on Thursday. However, any hunch may be inaccurate.
Search Configurations The search is limited by the agent’s ability to specify the objective of the search. The directive to search specifies a target or criterion. Therefore, the nature of the search is limited by the agent’s knowledge. If the agent understands an intricate tapestry of related discriminations, the agent may direct the search for a variety of hosts and features. If the agent understands averages or what usually occurs, it may direct the infrasystem to impose this operation on its content. For instance, “How many people do I usually encounter on a walk?” It is not possible for the agent to direct a search for something that is not in the agent’s repertoire because the agent must plan the search directive in the same way that it plans the directive to walk. If something like what usually occurs is not in the agent’s repertoire, it cannot be incorporated in a plan.
This is not to say that if the agent lacks knowledge the infrasystem lacks knowledge. For instance, a naive learner may not have the notion of “what usually happens” in its repertoire. This limitation does not mean that the infrasystem lacks understanding of the criterion. It actually uses this criterion to classify various events. However, as we illustrate later in this chapter, the fact that the infrasystem performs and therefore has a functional understanding of a particular operation does not imply that the agent shares this knowledge.
Agent Participation in Abstract Conclusions
An important difference between a strict infrasystem search and an agent-directed search is the basis for a conclusion. When the learner identifies something as yellow, it is the result of a purely infrasystem process beyond the domain of the agent. The learner cannot provide any valid reasons for knowing that it is yellow except, “I just know it.” The learner may rationalize and talk about possible ways of identifying yellow by wavelength, consensus, or other data. The learner may present some scenarios about learning histories that associate the color with its name, but these discussions would disclose nothing about how the learner knows that the object is yellow. The question we asked did not require a historical response, references to populations of people, or discussion of the abstractions used in physics. It asked simply, “How do you know?” The awareness is completely intuitive, which means that the process of gathering evidence and drawing a conclusion about the classification of basic features is completely an infrasystem function. Obversely, if the agent is able to specify facts relevant to the conclusion, the agent has access to the infrasystem’s classification of the event and may be able to use the same evidence that the infrasystem used to formulate a prediction, such as the one about Thursday. The process is greatly different from that of recognizing irreducible features.
Let’s say different colored signs were in front of the condominiums on different days. If the sign on Thursday were always yellow, the learner would be able to learn a relationship between yellow and the day. The agent would have the ability to discuss relevant details about how he knows that Thursdays predict the yellow sign. However, he still would be quite unable to describe how he knows that the sign is yellow. These considerations describe the scope of the agent’s possible participation in drawing conclusions about features. Any search criterion that the agent directs is limited to relationships, not irreducible features. For instance, the agent can search for yellow things, but not for yellow. Yellow things is a relationship that may be expressed most simply as, “This thing is yellow, and this thing is yellow.” Given the agent’s restriction to relationships, any learning that is based on known relationships may be explained
by the learner, but the learner does not necessarily know how to express every relationship that the infrasystem processes. Stated differently, (a) the agent is able to direct searches for any of the features the agent names when explaining the basis for a conclusion because these features will always be referenced to specific events, and (b) the agent is able to specify individual events that support the conclusion the agent draws. The agent shares some of the records with the infrasystem. Some of the shared records may be organized to provide reasons for drawing a conclusion. The formula does not address those things that the agent knows and is not able to communicate, only how the shared record implies a basis for the agent to reason. Let’s say that after the fourth encounter with the security guard, we direct the learner to explain, “What makes you think you’ll ever see that guard again?” (This request is parallel to the question, “What makes you think this is yellow?”) The learner responds by presenting a logical presentation of facts about the Thursday feature that led to the conclusion. “Come on, I’ve already seen her four times. I’m not sure about the first time, but I know that all of the other times have been on Thursday. And that may be true of the first time, too, but I wasn’t keeping track of the day back then. So every Thursday, there she is. And I would bet you that I’ll see her again next Thursday, or at least if I see her again, it will be on Thursday.” Clearly the evidence he presents is relevant and leads to the practical conclusion the learner articulates (unlike any rationalization that the agent provides about how he knows that something is yellow). The conclusion corresponds to the one drawn by the infrasystem. It expresses the rule that the system has formulated: “Every Thursday, there she is.” It also reveals the extent to which the agent is able to direct searches. Everything the agent names describes a possible feature search. 
For instance, he referred to the first time. The reference implies the agent would be able to identify some features of the first time (where he went, possibly what he wore, etc.). Finally, the conclusion the agent draws has been enhanced with secondary sensation, which means that the infrasystem has drawn the same conclusion. Both agent and infrasystem know that this outcome is not certain, but both know there is a substantial logical basis for assuming that the prediction will be fulfilled. Infrasystem Prompts As a general rule, only the outcomes or conclusions drawn by the infrasystem are available to the agent. These come in the form of insight. The agent’s ability to construct the argument about why the conclusion is reasonable is the product of learning. However, that learning is greatly facilitated by the infrasystem conclusions because the infrasystem has already highlighted or enhanced the features relevant to the verbal argument. The
agent does not learn to attend to highlighted features, only how to describe them through symbols that others understand. The common features of the encounters that occurred on the second, third, and fourth presentation were highlighted, so the agent attended to them. Likewise, the final conclusion, “On Thursdays, she’ll be there,” had been made available to the agent. When the learner argued his case, he simply put together the details that the infrasystem had highlighted as being relevant to the conclusion and stated the conclusion that the infrasystem had presented to the agent as a content map. The learner had to learn how to do this, but the verbal explanation was greatly prompted by the record that the infrasystem had created. The difference between this scenario and the processes by which the agent formulates content maps for action (chap. 13) is that the infrasystem supports the argument by presenting the agent with information in the form of conclusions, highlighted stimuli, and urgings. The agent may operate on these inputs in various ways. The agent may ignore them, use them as the basis for a plan, use them as the basis for establishing a routine (which requires a content map), or use them as a basis for explaining something. The explaining function is what the scenario addresses. The information that the infrasystem used to enhance features and present urges may sometimes be used as a basis for constructing a compelling argument. Abstract Applications The discovery process based on concrete events results not only in the infrasystem’s conclusions, but with the infrasystem sanctioning the conclusions with urgings and secondary sensations. It is something the agent intuitively knows. One of the problems with formal instruction is that the learner is expected to operate on rules not developed in this manner, but that are simply presented as givens. This presentation is the reverse of what the infrasystem is designed to do. 
Instead of going from examples to the rule, we are proceeding from the rule to the application of the rule. This is the direction that the system goes only after it has formulated a content map. If the rule is that the guard will appear every Thursday and today is Thursday, we can apply the rule to this specific setting and conclude that the guard will appear and we can use this information as the basis for whatever action the agent decides is appropriate. However, if we start with a rule that is not enhanced, the evidence that would support the rule is not highlighted by the infrasystem; therefore, the naive agent would have no intuitive basis for recognizing evidence that would lead to the conclusion. The learner would not be able to respond to the highlighted events that lead to the generalization expressed by the rule because there would be no highlighted events. Instead the learner would have to operate only from facts. For the naive learner (who has never learned the generalized skill of working from rules to application), the task would be an example of highly unfamiliar learning. This fact is easily demonstrated by the performance of at-risk third graders. We have observed hundreds of them wrestle with problems like this one: “Here’s a rule. All people who work at the gym are in good shape. Say the rule.” (Response.) “Listen: Jill works at the gym. So what else do you know about Jill?” (Typical response.) “Jill works at the gym.” For many children, the presentation of subsequent examples does not result in any insights, as it would if the learner were able to tap the infrasystem’s knowledge of logical processes when they are not prompted by highlighted features. Note that this response tends to occur with students who know the component concepts. They know what a gym is, what good shape is, and what the word else means. They are able to respond to tasks such as, “You know that this basket is yellow. What else do you know about it?” (Typical responses would include, “It’s got a ribbon on it,” “It’s got a handle,” “It’s got eggs in it.”) Even after several examples in which the students have been taught to apply a rule to various positive and negative examples, they may struggle with the next example: “Listen. All the baskets have a ribbon on them. Jill has a basket. What else do you know about that basket?” (Typical response.) “It’s Jill’s basket.” This performance leads to the conclusion that they completely lack the information they would need to perform the same operation their infrasystems perform hundreds of times each day. Teaching the learner to mastery on the first examples involves repeating them on four or more occasions. If the infrasystem performs this operation, but the agent requires many trials to learn it, the agent does not share the infrasystem’s knowledge of the operation.1
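The rule-to-application step that the children are asked to perform can be sketched as a single deduction. This is a minimal illustration under stated assumptions: the function `what_else` and the tuple representation of a rule are hypothetical, and the sketch says nothing about how a learner actually arrives at the answer.

```python
# A rule is represented (as an assumption for this sketch) by a pair:
# the property named in the rule and the property the rule lets us conclude.
GYM_RULE = ("works at the gym", "is in good shape")


def what_else(rule, known_facts):
    """Return what the rule adds to the facts already known about an individual.

    The typical naive response simply restates a known fact; the expected
    response is the new fact the rule licenses.
    """
    premise, conclusion = rule
    if premise in known_facts and conclusion not in known_facts:
        return {conclusion}
    return set()


facts_about_jill = {"works at the gym"}
new_facts = what_else(GYM_RULE, facts_about_jill)
```

Here the expected answer to “What else do you know about Jill?” is the conclusion of the rule, whereas the typical response quoted above corresponds to repeating an element of `facts_about_jill`.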
1 The fact that naive children have trouble with these abstractions does not necessarily lead to the conclusion that the learners should therefore be taught from the concrete to the abstract (from the examples to the rule). It simply implies something they must learn. The way to teach them most effectively is to give them practice with a sufficient number of examples and reinforce their good performance. They will learn in the same manner they learn all other skills. The point the present discussion makes is simply that, although the infrasystem is well versed in the logic of applying rules, the agent does not share the infrasystem’s knowledge and must learn the logical relationship of rule to instances of rule.
TEACHING CLASSIFICATION SKILLS TO NAIVE SUBJECTS
Another operation that the most basic infrasystems perform is classification. The difficulty of naive subjects in learning a variation of formal classification implies that the agent does not share the infrasystem’s knowledge of classification. This learning, like that of formal deductions, is not assisted or prompted by enhanced discriminative stimuli. The behavior of at-risk kindergarten children in an informal study conducted by Engelmann and his associates revealed classification to be a highly unfamiliar operation. The children learned four classes to mastery. The classes were taught in different orders to different small groups (three or four children each). A different order of introduction was presented to each group:
Group 1: vehicles, food, clothing, insects
Group 2: insects, clothing, food, vehicles
Group 3: food, insects, vehicles, clothing
Group 4: clothing, vehicles, insects, food
The sequence of activities for all the classes was identical. The presentation for each part was scripted, and the teacher followed the script closely when presenting. The same presenter taught all groups. The teaching for each class first introduced a rule for the class:
If it’s something that takes you places, it’s a vehicle.
If it’s something people eat, it’s food.
If it’s something you wear, it’s clothing.
If it’s an animal with six legs, it’s an insect.
Instruction did not start with the rule, but with the class name for the various members and the names of four members of the class. For instance, for the class name vehicles, they were taught car, boat, train, plane. To introduce the class name, the teacher pointed to a picture of each object and said, “This is a vehicle.” Then the teacher would point to each object and ask, “Is this a vehicle?” Often children would say things like, “That’s a boat.” The teacher would agree, and say, “But it’s also a vehicle. Is it a vehicle?” Once the children were at mastery at that step, the teacher would introduce the “kinds of vehicle.” The teacher would touch the object and ask, “Is this a vehicle?” Next, the teacher would introduce the kind of vehicle. “My turn. This vehicle is a car. What kind of vehicle is this?” The test for this step was whether the children responded correctly to each question pair: “Is this a vehicle?” and “What kind of vehicle?” Another discrimination addressed the negative examples of the class. The teacher would present pictures of vehicles and not vehicles. The
teacher would point to each picture and ask, “Is this a vehicle?” If the answer was yes, the teacher would ask, “What kind of vehicle?” Children were next taught the rule and provided practice in applying the rule. Work on this step started while the teacher was still working on Step 1. “Listen. If it takes you places it’s a vehicle. What is it if it takes you places?” The children practiced saying the rule. When the children were able to say it consistently (without corrections), the teacher would show how the rule applies to members and nonmembers of the class. “If it takes you places, what is it?” (A vehicle.) “A boat takes you places, so what else do you know about a boat?” (It’s a vehicle.) Work would continue on Step 2 until the children performed reliably on Steps 1 and 2. Note that this task is a deduction task, which implies that learning to perform the deduction format is logically simpler than learning to classify a set of members according to an expressed or verbalized criterion. In fact, the children made the same mistakes indicated for the deduction task. In response to the question “A boat takes you places, so what else do you know about a boat?”, responses would be, “It takes you places” or “It’s a boat.” In the last step, the children played variations of a game that required them to demonstrate knowledge of the relationship between the features shared by all members of a class, the features of specific members, and the features of the nonmembers. The first game format involved members and nonmembers. “Listen. I’m thinking of a table. Am I thinking of a vehicle?” (No.) “Listen: I’m thinking of a train. Am I thinking of a vehicle?” (Yes.) “How do you know that a train is a vehicle?” (A train takes you places.) The second game addressed the relationship of the higher order class and its members. There were three types of this game. For all three types, the teacher would present many examples of each type in an unpredictable order. 
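The first game format (members versus nonmembers, followed by “How do you know?”) can be sketched as a lookup against the class rules. This is an illustration under assumptions: the dictionaries `CLASS_RULES` and `OBJECT_PROPERTIES` and both helper functions are hypothetical, and the object properties listed are only what the sketch needs.

```python
# Each class is paired with the property stated in its rule.
CLASS_RULES = {
    "vehicle": "takes you places",
    "food": "is something people eat",
    "clothing": "is something you wear",
    "insect": "is an animal with six legs",
}

# Hypothetical knowledge about a few objects, for illustration only.
OBJECT_PROPERTIES = {
    "train": {"takes you places"},
    "boat": {"takes you places"},
    "coat": {"is something you wear"},
    "table": set(),
}


def is_member(obj, class_name):
    """Answer 'I'm thinking of a <obj>. Am I thinking of a <class_name>?'"""
    return CLASS_RULES[class_name] in OBJECT_PROPERTIES[obj]


def how_do_you_know(obj, class_name):
    """Justify a 'yes' answer by citing the rule's property; None otherwise."""
    if is_member(obj, class_name):
        return f"A {obj} {CLASS_RULES[class_name]}."
    return None


answer_table = is_member("table", "vehicle")
answer_train = is_member("train", "vehicle")
reason_train = how_do_you_know("train", "vehicle")
```

The justification for the train matches the response quoted above, “A train takes you places.”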
The first was the most difficult for the children to learn. For the first type, the teacher would secretly place a picture in a bag. The teacher would present the bag and then ask about a series of possibilities, such as, “I have a picture of something in this bag. Listen: There’s a vehicle in the bag. Is there a boat in the bag?” (Maybe.) “Listen. There’s a vehicle in the bag. Is there a bottle in the bag?” (No.) After the children gave behavioral evidence that it was possible for the object to be a boat but not possible for it to be a bottle, the teacher would then reveal what was in the bag. This procedure would be repeated for other examples. The next variation of the game presented three discriminations. “Listen: I’m thinking of a vehicle. Am I thinking of a boat?” (Maybe.) “Listen: I’m thinking of a vehicle. Am I thinking of a bottle?” (No.) “Listen: I’m thinking of a car. Am I thinking of a vehicle?” (Yes.) For other variations that involved the bag, the teacher would present questions based on the rule. “Listen: the thing that is in the bag takes you places. Is there a vehicle in the bag?” (Yes.) “Is there a car in the bag?” (Maybe.) “Is there a hat in the bag?” (No.) “Listen: There’s a car in the bag. Is it a vehicle?” (Yes.)

The learning of Formats 2 and 3 was followed by daily practice on a review set. The teacher would randomly present items involving whatever the children had learned. “Listen. Is a coat a vehicle?” (No.) “How do you know?” (It doesn’t take you places.) “Listen. Is a train an insect?” (No.) “How do you know?” (It isn’t an animal with six legs.) “Listen. Is a coat clothing?” (Yes.) “How do you know?” (It’s something you wear.) “I’m thinking of a vehicle. Am I thinking of a truck?” (Maybe.) “Am I thinking of a shoe?” (No.)

The number of trials the children required to achieve mastery on the games (and not on the introductory activities) provides an analytical indicator that the children learned the operation and the logic shared by all four classes. Table 14.1 shows the trials required for mastery for the four categories. Independent of order, the fourth class required less than half the number of trials required by either of the first two classes (73 versus 168 and 176). Therefore, the children learned the generalized operations and conventions of classifying verbal items within the context presented. The children could not have achieved this gain unless they learned (a) something that was the same about all of the classes, and (b) specific features of the examples that predicted sameness in other features.²

TABLE 14.1
Responses to Mastery on Games for Four Classes

Classes                 1st Class   2nd Class   3rd Class   4th Class
Responses to mastery    168         176         119         73

The number for the second class may be higher than the number for the first because the children were not able to identify exactly what was the same about the first class and the second class. They tended to confuse classes. Some children initially behaved as if the game were purely verbal, reciting answers rather than figuring them out. (This tendency reveals a correctable flaw in the sequence we presented.)

² There is no assumption that this experiment shows either that the items and tasks were as efficient as they might have been or that the numbers indicate the degree of acceleration that would be possible for other formats that introduce classification skills. The data only suggest that the various games required knowledge of the relationship between the criterion for membership and the fact that all the members of a class share features not possessed by items that are not members of the class. The trend clearly shows that the savings was a function of order in the sequence.
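The three answers the games call for (Yes, Maybe, No) follow from the relationship between a higher order class and its members. The sketch below is one minimal way to express that logic. It is an illustration under stated assumptions, not the procedure used in the experiment: the toy lexicon `CLASS_MEMBERS` and the helpers `is_a` and `answer` are hypothetical names introduced here.

```python
# Hypothetical toy lexicon: each higher order class maps to some of its members.
CLASS_MEMBERS = {
    "vehicle": {"boat", "car", "train", "truck"},
    "clothing": {"coat", "hat", "shoe"},
}


def is_a(item, category):
    """True when item names the category itself or one of its listed members."""
    return item == category or item in CLASS_MEMBERS.get(category, set())


def answer(known, asked):
    """Answer "Is it a <asked>?" given only that the hidden thing is a <known>."""
    if is_a(known, asked):
        return "Yes"    # a car is necessarily a vehicle
    if is_a(asked, known):
        return "Maybe"  # a vehicle could turn out to be a boat
    return "No"         # a vehicle cannot be a bottle


if __name__ == "__main__":
    for known, asked in [("vehicle", "boat"), ("vehicle", "bottle"), ("car", "vehicle")]:
        print(f"I'm thinking of a {known}. Am I thinking of a {asked}? {answer(known, asked)}")
```

The asymmetry is the point of the game: knowing the member settles the class, while knowing only the class leaves each member possible.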
The experiment required two separate types of learning—learning that involves content clearly independent of the infrasystem’s functions (e.g., the class name, vehicle) and learning processes that parallel what the infrasystem does. If the learners underwent something of an overall learning improvement strategy, they would show improvement on both content and operation. This trend was not observed. Learning the rule and names of the vehicles, for instance, showed some improvement as a function of order in the sequence; however, the improvement was small. In contrast, there was great improvement in learning of the relationship between member and nonmember and the basis for higher order classification. If the agent does not have preknowledge of the relationship between higher order classes and members, it would take exposure to more than one class for the agent to discover what is the same from one class to the next. The data suggest that it took some learners two and others three classes to learn what was the same about them. The learning of the fourth class was therefore faster than the learning of the first two classes simply because it required less learning. The learner was able to learn the unique features of the fourth class and then apply knowledge of the features shared by all the earlier classes. The details that were unique to each class (the names, the rule) had to be learned for each class. The relationship of higher order classes and members (which was the subject of the games) had to be learned only once and then applied to subsequent classes. Learning to Learn The examples of highly unfamiliar discriminations imply an operational definition of what Harlow (1952) referred to as learning to learn. Learning to learn is a strict function of the amount of learning that logically must occur. 
For the learner to achieve complete mastery of the first class introduced, the learner must learn two types of information—information that is unique to the first class and the information or processes shared by all classes. So much unique and so much shared knowledge must be learned for the first class. As the learner encounters other classes, the processes are the same, so they do not have to be relearned, simply adapted, or applied. Note that the system would need information about two classes to determine that there is a sameness in processes (that the games have the same format and deal with the same relationships). The system learns that learning a new rule and new class name are reliable predictors of a game that involves the new class. So once the learner learns this relationship, the amount that the learner must learn for a new class is reduced. Furthermore, the system is able to anticipate the game. The system knows in effect the kind of items the teacher will present. The items will involve the class name and the rule they are learning. After the
learners had learned two or three classes, they had learned about the processes (including order of events) that are the same for learning subsequent classes. The result was that the learners had to learn far less to master the games for the fourth item than they did for the first. The savings in time or trials to learn items provides a functional definition of less familiar and more familiar learning. Less familiar learning requires the learning of a larger number of processes and strategies to achieve mastery. The naive learner may require 100 trials to master a list of geological terms. The same learner may later require only 30 trials to master a list that is roughly equivalent in difficulty. The later effort provides prima facie evidence that it requires less learning. The only possible reason is that the learner has learned processes that are the same for learning such lists. The processes, once learned, apply to all future lists that share essential features with those that have been learned. The corollary is that if a learning task involves a process that applies to many other tasks, the learner who learns the first tasks more quickly may not necessarily be the smarter learner, just the one who has more relevant knowledge and is required to learn less. Let’s say we observed two groups of children playing the classification games with the class of insects. Let’s also say that we were not told that one of the groups was learning insects as the second class and the other was learning it as the fourth class. We would probably conclude that one group was not as smart as the other. If we later saw the slow group working on its fourth class (vehicles), we might draw another faulty conclusion, which is, “The class of vehicles is a lot easier than the class of insects.” To explain the difference in performance, we would need more information about changes in performance that occurred as a function of example sets. 
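The savings argument can be checked with simple arithmetic. The sketch below uses the figures from Table 14.1 and the 100-trial and 30-trial example above. The function `remaining_load` is a hypothetical additive model introduced here only to illustrate the claim that unique content must be learned for each class while shared processes are learned once; the text does not propose a specific formula.

```python
# Responses to mastery reported in Table 14.1.
RESPONSES_TO_MASTERY = {"1st": 168, "2nd": 176, "3rd": 119, "4th": 73}


def savings(baseline, later):
    """Fraction of the baseline effort that was not needed on the later task."""
    return (baseline - later) / baseline


def remaining_load(unique, shared, shared_mastered):
    """Hypothetical additive model of what is left to learn for a new class.

    Unique content (names, rule) is always learned anew. Shared processes
    count only to the extent they are not yet mastered; shared_mastered
    runs from 0.0 (nothing learned) to 1.0 (fully learned).
    """
    return unique + shared * (1.0 - shared_mastered)


if __name__ == "__main__":
    for label in ("1st", "2nd"):
        s = savings(RESPONSES_TO_MASTERY[label], RESPONSES_TO_MASTERY["4th"])
        print(f"4th class vs {label} class: {s:.0%} fewer responses")
    print(f"Geological lists, 100 trials then 30: {savings(100, 30):.0%} fewer trials")
```

Under this reading, the fourth class needed between 55 and 60 percent fewer responses than either of the first two, which is consistent with the statement that it required less than half as many.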
An implication of learning-to-learn phenomena is that if children learn how to handle a wide variety of processes that they apply to a wide variety of content, they will appear to be very smart and will be able to learn new information quite quickly. The reason is that the children have a relatively high degree of familiarity with the processes that are shared by different sets of examples. When the learners learn examples in any of these sets, they do not have to learn as much as the less-practiced, and therefore naive, learner must learn. Infrasystem Processes and Performance If any preknowledge of the processes involved in learning classification relationships were available to the agent, the children would have displayed completely different behavior. On the learning of the first class, they would have been able to play the game without additional instruction because the
agent would receive guidance from the infrasystem’s input in how to do it. If the system had some form of partial hardwiring or imprinting that required some learning of the first class, the behavior on the second class would have been quite different than it was for the first. If the system has no hardwired provisions, what is learned becomes a strict function of the information the agent receives, which means that no generalization or extrapolation occurs until the system has sufficient information to identify what was the same from one instance to the next. Given that the learning trend may be explained entirely as a function of the examples the children had received, the conclusion is that the agent has no access to the classification system of the infrasystem or the logic of the infrasystem except as it is made evident to the agent through insights and enhanced examples that are consistent with the conclusion of the insight. Agent Versus Infrasystem Processes The difference between receptive and expressive performance has long been recognized in the literature. Furthermore, we intuitively recognize that receptive tasks are easier than expressive ones. We know that the multiple-choice format that gives the answer (as one of the choices) is easier than the format that requires us to create the answer. Although this phenomenon is consistent with our experience, it does not really explain why it is easier to recognize the answer than create it. The reason has to do with the manner in which the infrasystem is configured. Its primitive functions center on responding to features in the sensory present. If certain highlighted features are present, the system produces the content map for processing those features or predicting what may occur in the setting. In other words, the basic design of the infrasystem is receptive. 
The design of the agent is expressive, creating plans, directing responses, and generally performing those operations that are beyond the scope of the infrasystem. The multiple-choice item presents one of the answers in the sensory present. The task involves comparing it with the criterion the item presents for evaluating items. This is what the infrasystem does. The correct answer is recognized and presented to the agent as correct. In contrast, the expressive task requires the agent to perform a search based on a particular feature. The agent must not simply recognize the item, but locate it from references to related features. If the item is simply, “Who was king of England in 1600?”, the task is more difficult than the same question with choices because the choices simplify the task to a comparison of something in the sensory present with the record.
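The contrast between the receptive and the expressive task can be sketched as two operations on the same record. This is only an illustration under assumptions: the dictionary `RECORD` and the functions `recognize` and `recall` are hypothetical, and a lookup table is a very rough stand-in for the infrasystem's records.

```python
# Hypothetical record: each host is stored with a set of its features.
RECORD = {
    "boat": {"takes you places", "floats"},
    "train": {"takes you places", "runs on rails"},
    "coat": {"you wear it"},
    "ant": {"animal", "six legs"},
}


def recognize(choices, criterion):
    """Receptive task: the candidates are present, so only they are compared
    with the criterion."""
    return [c for c in choices if criterion in RECORD.get(c, set())]


def recall(criterion):
    """Expressive task: nothing is presented, so the whole record is searched
    for hosts that carry the feature."""
    return sorted(host for host, features in RECORD.items() if criterion in features)


if __name__ == "__main__":
    print(recognize(["coat", "train"], "runs on rails"))
    print(recall("takes you places"))
```

In the multiple-choice case the work is a comparison over the handful of items in the sensory present. In the recall case the search space is the entire record, and the answer must be located from related features.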
SUMMARY The demands of human learning require specific emphases not implied by simpler systems. One emphasis is the extensive records of individuals, groups, and their features. For the human, the learning of any pattern is reinforcing. The features that characterize a pattern may be quite remote, abstract, and complicated. One major requirement of the system implied by such pattern learning is the memory and classification potential needed to record, compare, and identify patterns. The classification system must be designed so that it reveals relationships. A relationship is characterized by the intersect of a host with a feature. The relationship may be identified by searching various hosts for a feature or searching a host or event for its various features. Comparing events for sameness in feature reveals patterns. The search strategies employed are determined by the task that faces the system. If the task is to find out something that occurred on Thursday, Thursday is the host and what occurred are the features of the host. If the task is to find out when a particular event occurred, the event is the host; the various days examined are the features of the host. This system requires records that are incredibly elaborate because of the large number of events recorded and the large amount of detail recorded for each event. The events include not only those that involve primary or secondary reinforcers, but those that involve peripheral reinforcers. The human system extends the scope of possible reinforcers. The process of identifying patterns becomes a reinforcer for the system. The agent’s repertoire and access to the items classified are expanded. For the agent, all classifications are presented as specific events or facts (not yellow, but yellow things). The agent’s access to the classification system is not unique to humans, but is necessary for any organism that is able to think about things not prompted by the features of the current setting. 
The uniqueness of the human agent’s access to the infrasystem has more to do with the number of ways the items are classified and therefore the extent of the agent’s repertoire. The agent directs the search of the classification record by issuing the same type of directives the agent uses to direct motor responses. The infrasystem is able to dig deeper into its records if the items of a particular class have a strong reinforcing consequence, which is what happens if it is important for the learner to remember something remote. In these situations, the record may be tied to a single, possibly trivial feature. “Recount everything that happened 12 hours before the robbery.” With enhancement, the events connected to peripheral reinforcers are now connected to a strong reinforcer—the robbery. All the features associated with the robbery become enhanced. As enhanced stimuli, they tend to
stand out more in the record, which means that the system may be able to locate items in response to the agent’s directives to search. There is also an increased tendency for features in the current setting to result in the system’s reflexively presenting the thought to the agent. Looking out the window at a cleaning truck, for instance, may result in the realization, “Oh, I remember now, she called the cleaners because of some mix-up over the billing on one of her dresses.”

Although the agent may direct searches of the infrasystem for the features of an event or events grouped according to a particular feature, the agent’s invasions are limited to sensory and thought records, not to the logical operations of the system. If these operations were available to the agent, the learner would not have to learn them, simply direct them. The system makes it possible for the agent to benefit from the infrasystem’s processes by enhancing the conclusions that the infrasystem derives. The agent knows, for example, not only that something will happen, but also the various events or facts that support the prediction that it will happen. These are highlighted by the system. Therefore, they prompt the learner with what amounts to insight or knowledge. This prompting, however, does not suggest that the agent shares the infrasystem’s knowledge of either deductions or classificatory relationships. If the agent has knowledge of these, it is strictly a product of learning that is not assisted by the infrasystem any more than the learning of any other arbitrarily designated content is.

The learner is able to learn the operations used by the infrasystem, but the agent does not learn these operations without instruction. The evidence for this conclusion is that naive subjects do not exhibit behavior that suggests a generalized or unlearned understanding of classification or logical operations. The evidence comes from the amount and type of input and practice needed to teach naive subjects applications that require these operations. If the learning were prompted by the infrasystem, the operations would be intuitive. If the operations are not intuitive, but require the kind of learning that characterizes any arbitrary content, the infrasystem is involved only after the fact. What the learner learns both about the content presented and the logical operations may be accounted for completely by the input provided through the instruction. If the operations are the same from one application to the next, the learning of the operations leads to a great reduction in the number of trials required for successfully applying them to new content. This is what occurs when subjects learn how to learn. They simply learn what is common to a set of applications so they do not have to relearn it with each change in content. It is now part of their repertoire, an independent feature that may be summoned by their agent on command.
Chapter 15
Language
The preceding chapters have used examples that involve language. Language acquisition is a remarkable feat. For the analysis of inferred functions, the primary function of language is to present and standardize at least a segment of the learner’s content maps so that the learner is able to induce maps in others and create maps consistent with the map the speaker intended. Like other content maps, those that are conveyed by language specify relationships—what features to attend to and what to do. Unlike other content maps, some content maps induced through language communication do not imply action beyond creating the picture or representation of content that is conveyed by the communication. In other words, they are directives for thought, not overt action. The major issues of language performance are (a) how the structure of language permits both the speaker and learner to share content maps, and (b) how the learner acquires knowledge of both the structure and content of the language. These topics overlap because the articulation of the necessary features of language implies what the learner is required to learn.
LANGUAGE STRUCTURE

According to Chomsky (1980), language follows transformation rules that are too complicated to be accounted for by learning. Therefore, Chomsky concluded that humans are able to learn language because they have inherited knowledge of a deep structural understanding of language: a knowledge of a universal grammar that describes all languages. Chomsky derived the universal grammar from the fact that, although different languages have different words, structures, and conventions, all may be reduced to the same type of kernel sentences that may be expanded in lawful ways to create a grammar or set of rules about word order.

The present analysis holds that both the structure and content of language are governed by meaning (qualitatively unique features), and that all the meaning is necessarily learned, not inherited. Furthermore, all of the structural phenomena that Chomsky cited (such as the fact that the learner creates unique utterances that follow grammatical rules) are completely explicable in terms of learning and completely inexplicable without reference to meaning or content. Unless the analysis of language starts with meaning or specific content that refers to features, it is not possible to derive a so-called universal grammar without encountering crippling contradictions. Grammatical analyses do not refer (except obliquely) to meaning.

Meanings

Meaning is not semantics or a mechanical equivalence by which one word is substituted for another word or set of words. Meaning has to do with the set of features shared by all instances of a particular utterance or class of utterances: both the features shared by the lexical items and the features of patterns that combine various lexical items. The analysis of shared features leads to the conclusion that there are two large classes or types of language meaning: (a) referent meaning conveyed by the words and utterances, and (b) usage or syntactical meaning of the utterance. Referent meaning is the extent to which an utterance describes the features of an event. The more features that the description shares with the observed event, the more meaning is specified by the utterance.
Usage meaning refers to the rules that the speaker follows in creating descriptions and the rules that the listener follows to decode verbal messages. This type of meaning is syntactical or grammatical meaning. If the learner uses a word appropriately as a noun, it is because the learner understands that the word has a set of features common to nouns, but not common to other words. These determine how the word is used as a component of a description. Language as a Source of Content Maps Language functions as a transmitter of content maps. As such, its basic purpose is to permit the sender to create a map that describes an individual event, pattern, or entity. The agent is involved in planning and executing the
resulting utterance. The agent is not involved in directing some of the grammatical phenomena that govern the utterance. These are the product of infrasystem processes. They are presented to the agent as conclusions. When the agent encounters an unknown language element such as, “Would you reiterate what happened,” the learner knows that the unknown word is a verb that serves the same usage role as tell, state, say, condone, cause, or disagree with. The agent does not necessarily know the basis for this knowledge. Rather, it is presented as insight—a product of the infrasystem’s processing and conclusions. The language creates some content maps that result in overt behavior. All, however, serve as directives for thoughts. Shared Meanings. Language conveys content maps by describing features in sufficient detail to permit the agent to assemble the features into the representation of an individual or specific group. The features are conveyed by the words and phrases—the units of meaning that the sender selects and arranges. Some of the features are names of classes, individuals, and events. Some are labels for actions and attributes of things. Still others are combining forms that describe how the parts are to be arranged. All have meaning. The test is substitution. If the replacement of the element results in a change in the meaning, the difference describes the meaning of the element that had been replaced. She has a small dog conveys a completely different type of joining than She is a small dog. That difference describes (within this context) the meaning of is and has. Public Meaning. The units to be used in both the sending and receiving of the content maps are public, which means that both sender and receiver have a repertoire of the various features and whatever conventions are necessary to arrange the units in a manner understood by both sender and receiver. So the word, in its most rudimentary form, is an arbitrary signal used to predict a feature of events. 
Big predicts a feature shared by all members of a class. Whatever population is involved, the word big permits the sorting of members into those that have this feature and those that do not. The feature relationship goes from concrete instance to word as well as from word to concrete instance. The presence of the population grouped according to bigness has the feature of bigness and predicts words that describe this feature—large, big, bigger, greater in size, and so forth. There is individual variation in the instances of the feature, and there is variation in the words that refer to the feature. However, all of the words that refer to the feature have a unique meaning, and all of the instances of the feature have a unique quality. What language does is simply provide a marker for that feature. The marker, like the feature, must be unique, which means that the same marker could not be used to indicate both big and little. The general rule
for markers is that the language system tends to provide markers for all of the features that the learner frequently observes. So if red and tall do not always coexist, the language would have to provide a differentiated marker for tall and another for red. With separate words, the communication potential increases because the language more closely parallels the features the learner encounters. Verbal Directives as Content Maps. In its most direct applications, the language simply provides the receiving agent with the same kind of information a content map would provide. For instance, the communication directs the learner to “Put that hat on the table.” The form of this directive is exactly like that of a content map. It tells the agent the operation and outcome. It does not specify all the details, only those that would apply to all physical settings in which the directive, “Put that hat on the table,” would apply. To comply, the agent has to create a specific instance of the general directive, creating a plan about how to pick up the hat and exactly where the hat will be on the table. The plan is more specific than the directive the learner received, and the plan is necessarily created by the learner before the action commences. Verbal Descriptions. A description is just like a directive except that the plan the agent formulates does not involve response participation of the learner beyond representing a possible concrete instance consistent with the description. “So he walks up to the table, takes off his hat and puts it right in the middle of the table, right next to the wedding cake.” Like the simpler directive, this one could apply to millions of different concrete settings. All, however, would have the full set of features the description provides. All would simply have detail added to make it concrete and singular in the same way a plan of action would be. 
An extension that presents both a description and a call for action is a question or directive for the receiver to produce some form of verbal description or confirmation. “Tell me why you did that with the hat. Did you do that on purpose?” The content map created through language communication is always more general than the plan or representation the receiver creates. To respond to a directive, the agent fits the specifications of the directive into the details of the current setting. For a description, the agent creates an example that has the features specified by the description; however, the agent adds enough detail to make a singular example. An underpinning of the agent’s ability to decode utterances and transform them into representations, or create utterances that express specific relationships, is a knowledge of grammar. The agent has both a receptive and an expressive understanding of grammar. If the learner wanted to provide a verbal direction for putting the hat on the table, the system would follow syntactical rules. The agent would not create the utterance “Table, hat, put on the” because it does not fit the forms or templates that the infrasystem has for creating the desired description of the event. The agent knows these forms intuitively.

Responses to Language Representations. Another parallel between content map and language communication is that just as the content map is able to compel attention and reflexively prompt awareness of features, the language extension creates the same effect. The words I’m sorry but your father passed away this morning at 11:05 carry all the impact of firsthand experience. The system reflexively responds in the same way it would to an observation that confirmed the death. The agent is aware of the content as well as other features of the communication: the choice of the words passed away rather than died, the sympathetic tone, and the inflections.

Transforming by Combining Features. With the addition of words, a description provides more detail and therefore increases the features shared by the referent and utterance. If only the word person is presented to the learner, the representation may wander from one representation of a person to another. A 17-year-old person directs the system to create or select an individual from a smaller class. A 17-year-old person who is a girl creates an even smaller class, but the description is laborious because all girls are persons. So if the description refers to the smaller class, there is no need to refer to the broader class. A 17-year-old girl provides all the information of a 17-year-old person who is a girl and does so with fewer words. Just as the infrasystem identifies the smallest class of individuals that share a particular feature, the language communication is recognized as being more efficient if it names the smallest class of individuals that share a particular feature.
A 17-year-old who is pregnant and unmarried identifies a relatively small class. The information difference between this description and that of a person could be demonstrated by giving 100 artists directions to draw a person and 100 artists directions to draw a 17-year-old girl who is pregnant and unmarried. We would probably be able to sort the pictures at close to 100% accuracy without knowing which artist was given which set of instructions. The feature restrictions placed on the second set of instructions serve as a content map for a picture so severely restricted that it is unlikely an artist given only the directions to draw a person would select one that had features of being female, young, pregnant, and possibly troubled. Yet it would be possible for an artist given the directive to represent a person to draw such a girl. She is a person. If we consider the most efficient way to convey the same information provided by the elaborate sentence, we could say something like, “She is 17, unmarried, and pregnant.” If the person is she, the person is female, so we do not need reference to girl.

Host and Residential Features. For language to build representations of individuals from a description of various features, there must be discriminable language markers for the various residential features and the host or recipient of the features. The simplest form of combination is host plus residential features (attribute of the host): that hat. The features referred to by the words tall, old, and enormous do not specify a host, which means that the words do not describe anything the agent would ever encounter. The agent encounters events. If the description is to communicate, it must therefore indicate a host: Redwoods that are tall, old, and enormous. This classification pattern for things described by language is the same as that for other content. The structure of the language pattern provides directives to the infrasystem for making more specific classes. The kings of Spain directs the system to make a subclass of kings. The Spain of kings directs the system to make a subclass of Spain (e.g., compared with the Spain of social democracy). The main implication is that the categorization of residential features and host is as flexible for language as it is for the infrasystem. Earlier discussions have referred to the fact that the infrasystem identifies features of features, in which case the feature that supports the others becomes a host. In other discussions, we referred to feature searches. For these, the items identified are hosts that have particular features. The language system permits the construction of host–feature combinations and of specific relationships that would be commonly observed.
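The way each added feature narrows the class that a description identifies can be sketched as a filter over a population of hosts. The population `PEOPLE`, its field names, and the helper `select` are hypothetical and introduced only for illustration; they make no claim about how the infrasystem stores or searches its records.

```python
# Hypothetical population of hosts, each carrying a few residential features.
PEOPLE = [
    {"name": "Ana", "age": 17, "sex": "female", "married": False},
    {"name": "Ben", "age": 17, "sex": "male", "married": False},
    {"name": "Cy", "age": 40, "sex": "male", "married": True},
    {"name": "Dee", "age": 17, "sex": "female", "married": True},
    {"name": "Eve", "age": 30, "sex": "female", "married": False},
]


def select(population, **required):
    """Return the hosts that have every required feature value."""
    return [p for p in population
            if all(p.get(key) == value for key, value in required.items())]


if __name__ == "__main__":
    descriptions = [
        {},                                               # "a person"
        {"age": 17},                                      # "a 17-year-old person"
        {"age": 17, "sex": "female"},                     # "a 17-year-old girl"
        {"age": 17, "sex": "female", "married": False},   # "... who is unmarried"
    ]
    for features in descriptions:
        names = [p["name"] for p in select(PEOPLE, **features)]
        print(features, "->", names)
```

Each added feature can only keep the class the same size or shrink it. This is why a description that names the smallest class directly (a 17-year-old girl) carries the same information as the longer description built from the broader class.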
SYNTACTICAL MEANING
The categorization of individuals by the sum of their features is the basis for syntactical classifications and the elaborate transformations that the user of the language formulates routinely. Broader classes and narrower classes are a function of the number of features used to identify membership in the class. If a large number of features is used as the criterion for identifying members, an individual or small group of individuals is identified. If fewer features are used to sort examples, a larger population of positive examples results. Individual 1 in the population of 16 has four features. If it is grouped by all four features, there is only one member in the group. If Individual 1 is grouped according to a single feature, like red, the population is larger. This relationship applies to syntactical features. The word car has referent and usage features. Car refers to individuals that share a large number
of features. Car also has usage features. It fits the pattern, The ____ was red. The word car is not the only item that would fit this form, however. The words flower, house, dress, shoe, and wall would fit. This sameness of pattern is a feature shared by all the examples. Sorting according to the usage feature is like sorting members of the population of 16 according to redness. All the instances that fit the pattern (make sense or are consistent with a representation that the infrasystem is able to construct) have a common usage feature. Just as the learner is able to identify features of a particular morning walk or some specific features that are common to all Thursday morning walks, the system is able to identify features common to all morning walks. The learner could identify the place of origin, the articles of clothing that are common across weather conditions, the rate, his responses to people that he meets, the time requirement, and his postwalk behavior. This is parallel to usage or grammar rules. The learner understands what is the same about all instances of the form “The ____ is red.” The learner knows that substitution instances “The always is red,” and “The of is red” do not work because these words are not usually used as hosts. The model of generalization described in chapter 8 applies not only to the more narrowly framed features that are learned, but also to the broader ones implied by the usage of language. The learner would be able to do the same classification operations in response to words that the learner performs with events and component hosts because words are hosts of features. For instance, if told to identify words that rhyme with nation, the knowledgeable learner would be able to list a fairly large number of words. The behavior would be a lot like those of the learner in identifying common features of the walks. The information would be released in clusters by the infrasystem. “I can’t think of any more . . . 
Oh, here’s a whole bunch of them: regeneration, regurgitation, reaffirmation, reapplication, recapitulation, reincarnation, reciprocation. . . .” The same formula applies to knowledge. If the learner had absolutely no knowledge of the labels used to describe grammatical categories, but was a fluent user of the language, the learner would be able to create sentences that follow a particular pattern. Let’s say we present the learner with three sentences: She is eating breakfast. My dog is sitting over there. That car is speeding. We direct the learner to look over these three sentences, say them out loud, identify what is the same about all three of them, and write 100 more that follow the same pattern as these. Some would create all the sentences so they start with “She,” “My dog,” or “That car.” Others would retain all
the words of the original sentences and add words, “My dog is sitting over there on the bench,” “My little dog is sitting over there,” or “My mother’s dog is sitting over there.” Still others would generate examples of the pattern X doing Y. Still others would create a group that shares only the verb is, which would include sentences with ing verbs and sentences without: “That car is big.” The demonstration would not seem particularly astonishing. However, it clearly documents that the learner has the ability to create instances for grammatical categories. The only way the learner could create a list of appropriate instances is to identify what is the same about the three examples and use this sameness of feature to generate other examples. The point is that the way in which things are classified in the infrasystem is no different for grammatical features than it is for any other set of features. In all cases, individuals are classified as the sum of their features and may be grouped on the basis of any feature or combination of features. All the features used to sort examples on the basis of whether they rhyme, the pattern of juxtaposed words, or any other grammatical or usage phenomena imply simply that the hosts (words or utterances) have multiple features, and therefore generate multiple classifications. The learner’s behavior provides indisputable evidence of knowledge based on sameness of pattern. The performance clearly shows that the infrasystem is able to identify the abstract patterns and create instances that follow these patterns. Furthermore, the agent is able to access the infrasystem on the level of generating concrete examples.
Clear Communications
Grammar is an ancillary part of language. It comes about to satisfy the needs of users to communicate clearly. The requirements for clear communication derive primarily from features of the events to be described through language. If there is only one cat in the setting, the word cat provides an adequate feature description. If the same cat is in a context of seven other cats, the features of the targeted cat must be articulated by adding only a single feature, “the white cat,” or multiple features, “the white cat with the pug nose, the crossed eyes and the very short whiskers.” Syntactical considerations as much as the lexical meanings contribute to the clear communication. If the grammar were so poorly formulated that there was ambiguity in describing an individual event or object, the pressure to communicate clearly would tend to change the grammar so the flaw was eliminated. For example, let’s say that the grammar described any under–over relationship of a ball and box with the following two interchangeable sentences: “Under ball box” and “Under box ball.” Both would indicate that one of the objects is under the other without indicating the relative position of either object. The ambiguity could be corrected by introducing a word-order convention. This convention would simply add features to the description and thereby clarify which object was under. The clarification in meaning would be a strict function of the rules of word order. “Ball under box” would convey only one representation—that of the ball under the box.
Relationships
This clarification is achieved not by adjusting the meaning of any word, but by creating a word-order meaning convention. A particular order of words requires a particular arrangement of the features of the event. The meaning that is created by word order is strictly a product of arbitrary convention, just as is the stipulation that the utterance ball is to designate a particular range of hosts or objects. The language does not have to use a word-order convention, simply a convention that permits a representation of what is described. The language could retain the original ambiguous word-order options (“under ball box” and “under box ball”) and signal the one that is under through a pitch marker, a loudness marker, a stress marker, or any other arbitrary feature that uniquely marks one of the entities named in the sentence. The convention could call for the marked name to be the one that is under or the one that is not under. The options are negotiable so long as the communication satisfies the function of marking one of the objects and assigning a specific relative-position value based on that mark. The language could also introduce a lexical solution that created clarity. The words ball and box could still occur in either order. The one that is under, however, could be marked with an additional sound: “Ama-ball under box” and “Box under ama-ball” would have the same meaning. In a sense, however, this is not a word rule, but a grammar rule, in the same way that we use word-part markers to indicate plurals.
The affix ama is a single marker that affects more than one possible word, ama-ball as well as ama-box. Therefore, it has a rule function—a grammar function. Giant Words. The need for various pattern conventions becomes apparent if we consider a simple (and strictly illustrative) language. This language does not have word orders because it uses only single words to express all meanings—sit, go, look, stump, man, food, and so forth. These are never combined. The language also has some relationships that are expressed with words. These are giant words because they express in one word what our languages would express in a sentence. Let’s say that one giant word was relb. The meaning of relb is that one ball is currently under one box.
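The three conventions discussed for the ball-and-box example (no convention, a word-order convention, and the ama marker) can be contrasted in a short Python sketch. The function names and the representation of a reading as an (object under, object over) pair are illustrative assumptions, not part of the authors' account; the point is only that the unordered form leaves two readings open while either convention narrows them to one.

```python
# Hypothetical sketch: each reader returns the set of possible
# (object_under, object_over) readings of an utterance.

MARKER = "ama-"  # the illustrative affix used in the text


def read_unordered(utterance):
    """'Under ball box' or 'Under box ball': no convention, both readings remain."""
    first, second = [w for w in utterance.lower().split() if w != "under"]
    return {(first, second), (second, first)}


def read_word_order(utterance):
    """'Ball under box': the object named before 'under' is the one underneath."""
    under, _, over = utterance.lower().split()
    return {(under, over)}


def read_marker(utterance):
    """'Ama-ball under box': the marked object is underneath, whatever the order."""
    names = [w for w in utterance.lower().split() if w != "under"]
    under = next(w for w in names if w.startswith(MARKER))
    over = next(w for w in names if not w.startswith(MARKER))
    return {(under[len(MARKER):], over)}


if __name__ == "__main__":
    print(read_unordered("Under ball box"))    # two readings
    print(read_word_order("Ball under box"))   # one reading
    print(read_marker("Box under ama-ball"))   # one reading
```

Because the marker reader ignores position, the same affix would work unchanged for ama-box, which is the sense in which the text treats it as a grammar rule rather than a word rule.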
Principle of Feature Correspondence. The rule for feature correspondence of the language is fairly simple: If the language is to convey information about discriminations, it needs a set of words or signals as large as the set of discriminations confronting the learner. For example, to convey information about the population of 16, there would have to be some form of language markers that referred to red, tall, narrow, and bottle shaped for the language to describe the features of the objects that belong in Station 1. In the same way, if the variables involved are ball, box, and under, in all their common manifestations, the language would have to have conventions for each of the feature variables. The simplest way to identify the needs of this language is to create different concrete displays involving balls and boxes in different configurations and in different temporal contexts. If one display is different from another, the language needs some sort of convention to mark the difference. We could permit the system to have a generalized plural convention that creates only two groups—one object and more than one. We could also permit markers for broad-time categories for indicating that the ball was (but not is) under one box. If we indicate “ball was under box,” we indicate a specific subclass of ball under box. In the same way, “balls under box” specifies a different subclass. Note that the feature signaled by was is not a feature of the ball, but a feature of the presentation of “ball under box.” This is a relationship. So the word was specifies features of some examples that have the features of “ball under box.” The ones that are positive examples of the relationship happened earlier and are currently not occurring. If the language that consisted only of single words processed only arrangements involving ball and box that varied in singular and plural, present and past, and relative position, the language would need at least 16 different words:
1. One ball is under one box.
2. One ball is under more than one box.
3. More than one ball is under one box.
4. More than one ball is under more than one box.
5. One ball was under one box.
6. One ball was under more than one box.
7. More than one ball was under one box.
8. More than one ball was under more than one box.
9. One box is under one ball.
10. One box is under more than one ball.
11. More than one box is under one ball.
12. More than one box is under more than one ball.
13. One box was under one ball.
14. One box was under more than one ball.
15. More than one box was under one ball.
16. More than one box was under more than one ball.
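The count of 16 follows from four two-valued variables (relative position, tense, number of the object underneath, number of the object on top), so 2 x 2 x 2 x 2 = 16. A minimal Python sketch, with variable names chosen here only for illustration, enumerates the meanings in the same order as the list:

```python
from itertools import product

# (object underneath, object on top) for the two relative positions
PATTERNS = [("ball", "box"), ("box", "ball")]
TENSES = ["is", "was"]
NUMBERS = ["one", "more than one"]


def meanings():
    """Return the 16 sentences, with the last variable changing fastest."""
    sentences = []
    for (under, over), tense, n_under, n_over in product(
        PATTERNS, TENSES, NUMBERS, NUMBERS
    ):
        text = f"{n_under} {under} {tense} under {n_over} {over}."
        sentences.append(text[0].upper() + text[1:])
    return sentences


if __name__ == "__main__":
    for number, sentence in enumerate(meanings(), start=1):
        print(f"{number}. {sentence}")
```

Adding one more two-valued variable to the sketch would double the number of giant words required, which is the pressure the following discussion describes.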
The original giant word relb refers only to Map 1 in the list. If the system persisted with the giant-word format, it would have to construct 15 more words to accommodate the 16 meaning variations. These giant words would
be either structurally related or structurally unrelated. If they were structurally unrelated, there would be no systematic correspondence of sameness in word to sameness in feature of referent. The word for “one box is under one ball” might be clunder; the word for “one box is under more than one ball” might be ost. Related words would have variations in form that parallel variations in features of the referent. There would be great pressure for the users of the language to create such parallels because that is what the infrasystem does—identifies and classifies according to sameness of feature. As noted, this creation of parallels would not necessarily be an agent plan, but an automatic infrasystem function. However, if the language has markers that parallel features, the system would have less to learn or organize. It would learn that a particular marker refers to a particular feature. The marker is an enduring feature of the word that predicts an enduring feature of the event. The pressure to create such parallels would occur because the giant-word language already has precedents for creating language markers that signal specific features. If the language has words for sit and chair, for instance, particular sets of observable features have unique markers. Whenever a set of features recurs, the natural tendency of the learner would be to refer to the feature with the same marker. Ball and box have features in common with those objects that have assigned words in the language. All are objects. A simple extension of the language precedents established for other object names would provide them with a stable marker that predicts the object features. In the same way, the infrasystems would exert pressure for speakers to use a variation of the same conventions already established for the feature of under. If all examples of “_____ under _______” have the same observed feature (an under relationship), some unique marker could stand for this relationship.
In a sense, the language that has the word sit would already have established a perfectly parallel abstraction. If different individuals have the feature of sit, the feature is abstracted across places, time, persons, numbers, and objects. The same type of abstraction process would be applied to under. The relationship is independent of time, relative position of ball and box, number, and objects. A completely systematic set of words that followed the precedents already in the language would create words that parallel features. Words that conveyed only minimum differences in meaning, such as those for “One ball is under one box” and “One ball was under one box,” would differ in only one feature—for instance relb and relp. In the same way, words that were greatly different in form would convey greatly different meanings. The words for “One ball is under one box” and “More than one box was under more than one ball” might be represented as relb and laerap. Three sounds of the original word (rel) are in the reversed order in laerap, with a added to show plurals for box (la) and ball (ra). The sound p denotes the past tense just as it does in the word relp. Table 15.1 presents a variation of the table presented for the population of 16. The four features are pattern, tense, number over, and number under. A fifth feature is the word for the individual event. The major point this table makes is that, if it is possible to construct the same interactions used to show multiple features of 16 objects, the same learning implications that hold for objects would hold for the grammatical features or the relationships between referent and utterance. The learner potentially could learn to sort the examples on the basis of a single feature, two features, three features, or four features. Just as it is possible to sort the objects according to redness, it is possible to sort them according to present (vs. past). Just as eight objects would be in this set, eight instances of words would be in this set. Just as the sum of the features describes each of the objects, the sum of the features describes each word and each word meaning in Table 15.1. The word is designed so that it has a part and stable convention for each of the four features.
Expanding the Language
Let’s say that ball and box were not the only objects in the language that participated in over–under relationships. For instance, a chair is observed under the box. By extrapolation from the known set of words, some form of neologism would be implied—naming the chair first, using the rest of the word that indicates “under the box.” If the name for chair is san, the new word would be sanelb. This is relb with ball replaced by the chair.
Likewise, if there is a convention for marking plurals of boxes and balls in the population of 16, other instances that share the plural feature would tend to be classified as sharing this feature. Therefore, more than one chair would tend to become sana and more than one chair under one box would be sanaelb. In the same way, present and past tense of the chair under a ball could be distinguished using the same marker that occurs in the ball and box set. For example, relb is present, but relp is past. Therefore, if sanelb is present, sanelp is past. We now have something of a full-fledged grammar with names for some things, nonword markers (affixes) for others, and sentence-order rules. It has been shaped solely on the basis of attempting to make details of the communications parallel to the content processed by the infrasystem and agent.
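The regularities in these illustrative words can be written as a small construction rule. The Python sketch below is one reading of the examples, under stated assumptions: the stems r (ball), l (box), and san (chair), the plural marker a, and the tense markers b and p come from the text, while treating the sound e as the marker of the under relationship is an inference from the word forms, and the function name and parameters are hypothetical.

```python
# Hypothetical sketch of the word-building conventions implied by the examples.
STEMS = {"ball": "r", "box": "l", "chair": "san"}
PLURAL = "a"    # added after a stem to mark "more than one"
UNDER = "e"     # assumed marker for the under relationship (inferred)
PRESENT, PAST = "b", "p"


def giant_word(under, over, plural_under=False, plural_over=False, past=False):
    """Name the object underneath first, then the relation, the object on top, and tense."""
    word = STEMS[under] + (PLURAL if plural_under else "")
    word += UNDER
    word += STEMS[over] + (PLURAL if plural_over else "")
    word += PAST if past else PRESENT
    return word


if __name__ == "__main__":
    print(giant_word("ball", "box"))                       # relb
    print(giant_word("box", "ball", True, True, True))     # laerap
    print(giant_word("chair", "box", plural_under=True))   # sanaelb
```

Adding a new object to the language only requires a new stem in the sketch; the plural and tense markers carry over without change, which mirrors the extrapolation from relb to sanelb and sanelp.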
TABLE 15.1 Population of Four-Feature Individuals
Individual  Pattern  Tense  # under  # over  Word     Meaning
1           A        C      E        G       relb     One ball is under one box.
2           A        C      E        H       relab    One ball is under more than one box.
3           A        C      F        G       raelb    More than one ball is under one box.
4           A        C      F        H       raelab   More than one ball is under more than one box.
5           A        D      E        G       relp     One ball was under one box.
6           A        D      E        H       relap    One ball was under more than one box.
7           A        D      F        G       raelp    More than one ball was under one box.
8           A        D      F        H       raelap   More than one ball was under more than one box.
9           B        C      E        G       lerb     One box is under one ball.
10          B        C      E        H       lerab    One box is under more than one ball.
11          B        C      F        G       laerb    More than one box is under one ball.
12          B        C      F        H       laerab   More than one box is under more than one ball.
13          B        D      E        G       lerp     One box was under one ball.
14          B        D      E        H       lerap    One box was under more than one ball.
15          B        D      F        G       laerp    More than one box was under one ball.
16          B        D      F        H       laerap   More than one box was under more than one ball.
KEY: A = box(es) over ball(s), B = ball(s) over box(es), C = present, D = past, E = singular bottom, F = plural bottom, G = singular top, H = plural top.
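Sorting the individuals of Table 15.1 by one, two, three, or four features can be sketched in Python. The letter codes follow the table key; the field names and the helper function are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Field names are illustrative; the letter codes are those of the table key.
FEATURES = ("pattern", "tense", "n_under", "n_over")
CODES = ("AB", "CD", "EF", "GH")

# Individuals 1-16 in table order: the last feature changes fastest.
INDIVIDUALS = [dict(zip(FEATURES, combo)) for combo in product(*CODES)]


def members(**criteria):
    """Return the numbers of all individuals that have every requested feature."""
    return [
        number
        for number, individual in enumerate(INDIVIDUALS, start=1)
        if all(individual[name] == code for name, code in criteria.items())
    ]


if __name__ == "__main__":
    print(members(tense="C"))                                         # 8 members
    print(members(pattern="A", tense="C"))                            # 4 members
    print(members(pattern="A", tense="C", n_under="E"))               # 2 members
    print(members(pattern="A", tense="C", n_under="E", n_over="G"))   # 1 member
```

Each added criterion halves the class, so sorting by all four features isolates a single individual, while sorting by present tense alone yields the set of eight mentioned in the text.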
Individuals Described by Feature or Name A sentence composed of words describes something one set of features at a time. A giant word may be designed so that it conveys all the information in a sentence. As the previous discussion indicated, the problem with this scheme is that if the words do not have parts that systematically vary as a function of shared features, an enormous number of words would be needed to describe even the most elementary events. Because describing an individual event one feature at a time requires fewer words, the use of giant words would be reasonable only for relationships that occurred at a high rate (with little variation) or that had a clear referent. Even for these applications, however, the giant word may actually be a word that is capable of functioning as an ordinary word in the language. For instance, the word look accompanied by pointing serves the function of a variety of possible sentences. It may stand for, “Look at that plane near the horizon” or “Look at that large plume of smoke over there.” The need for independent words that have the potential to describe a relationship one feature at a time becomes apparent if we introduce the population of 16 individuals described in chapter 10 to the simplified language. If we required the learner to identify the objects by feature using words for the various features, a vocabulary of eight words would be needed to describe any member of the set (e.g., “Tall, red, narrow-bottle shape”). However, if we required the learner to learn names for each of the objects, we would require learning 16 names. Furthermore, these names would have no generative value beyond the set of 16. The words refer to features that are shared by thousands of presentations beyond the population of 16. Conventions for Feature Markers The conventions for grouping features in a sentence derive from the fact that the communication is ambiguous if the features of the host are not clearly marked. 
“Chevy was my red uncle’s 1968 last stolen year pickup” does not make a lot of sense because the features of the host (Chevy pickup) are not marked as belonging to the host. Therefore, a basic rule is that if more than one word is needed to describe a single host, the words are ordered in a particular way. Different groupings result in different meanings: “My uncle’s red 1968 Chevy pickup was stolen last year.” “My uncle’s stolen 1968 Chevy pickup was red last year.” “Last year” belongs to the act of stealing in the first sentence, but to when the truck was red in the second sentence. The Logic of Feature Combinations Figure 15.1 shows a content diagram for the features of a large white house with a green roof. The host of the various residential features is the house.
FIG. 15.1. House features.
It is represented as the outer circle. It includes all the relationships shown by the overlapping circles. The three residential features are large, white, and a green roof. The intersecting segments show that the three residential features may be combined in various ways. They are not temporally ordered and therefore have no presumed order in the sentence that describes them. Note that roof by itself is not a variable because a house without a roof would not really be a house. (To say “a house that had a roof and that had a green roof” would be redundant because the reference house incorporates the roof feature.) If the house were unusual, such as one with no windows or one built of cans, the outer circle would be smaller. Instead of being house, the outer circle would be “house with no windows,” or there would be two intersecting circles labeled house and no windows. The rest of the intersecting circles would be inside the outer intersecting circles. In the same way, a house with no roof would be represented with a smaller outer class or intersecting circles. The basic assumption of any communication that expresses the relationship shown in Fig. 15.1 is that it must meet the requirements of naming all features—the host and three residential features. For each feature, there must be a verbalization that signals the feature. The utterance must be designed so that the relationship of host to each residential feature is clear. The residential features must be expressed as being coordinate, and all must be features of the same host. Once the house is described by features, it may be described in an abbreviated manner as the house or it (so long as the referent is clear).
Options. Grammatical conventions ensure that the greenness is clearly affixed to the roof, but the largeness and the whiteness are not features of the roof. (The roof is neither white nor green and white.) Additionally, there are order options for referring to the host and all residential features as part of a single sentence or as a sentence. A few examples follow.
Sentence-part options
A large, white house with a green roof
A large, white, green-roofed house
A green-roofed house that is white and large
A large house that is white with a green roof
A house that is large and white and that has a green roof
A white house that is large and green-roofed
Sentence options
The house is large and white with a green roof.
It is a large house that is white with a green roof.
The large white house has a green roof.
The white house with the green roof is large.
Combining Related Hosts. The house and its specified features may be combined with another host that is related to the house. Let’s say three sisters—Emma, Ginger, and Alexis—lived in the house. Their relationship is shown in Fig. 15.2. Note that the sisters are residential features of the house. Among the many expressive options that describe this pod are the following:
Three sisters, Emma, Ginger, and Alexis
Three sisters, Ginger, Emma, and Alexis
The sisters were named Alexis, Emma, and Ginger.
Emma, Alexis, and Ginger were sisters.
FIG. 15.2. Three sisters.
If the objective of the communication is to join the house and sisters pods, hundreds of expressive options are available. Note that joining these relationships does not have to occur in a single sentence. The only requirements are that (a) the relationship between the features of the house and the girls who live in the house is made clear, and (b) the description for each of the two pods is complete and clear. The following are six of the many expressive options for describing the relationship of the hosts.
Three sisters—Ginger, Emma, and Alexis—live in a large, white house that has a green roof.
Three sisters live in a large, white house that has a green roof. The sisters are named Ginger, Alexis, and Emma.
Alexis, Emma, and Ginger are sisters who live in a large, white, green-roofed house.
Alexis, Emma, and Ginger are sisters. Their house is large and white, with a green roof.
The house is large and white and its roof is green. It is home for three sisters, Emma, Ginger, and Alexis.
For each host, a new pod is required. (The sisters form a single pod because they do not interact independently of each other.) If the floor is the host of some residential features, such as having loose boards and places that squeak loudly, the pod has the host and the pair of residential features. The joining requirement is that the floor is designated as being a feature of the house and that the loose-board and squeaking features are linked to the floor, not to the house or the girls.
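The idea that a pod is a host plus an unordered set of coordinate residential features, with many expressive options for the same content, can be sketched in Python. The class and function names are hypothetical, and the sketch deliberately simplifies: it treats "green-roofed" as a single feature and ignores linking words such as "with" or "that is," as well as the ordering preferences real English imposes on adjectives.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Pod:
    """A host with an unordered set of coordinate residential features."""
    host: str
    features: frozenset


def phrase_options(pod):
    """One phrase per ordering of the features; every ordering names the same pod."""
    return sorted(
        f"a {', '.join(order)} {pod.host}"
        for order in permutations(sorted(pod.features))
    )


house = Pod("house", frozenset({"large", "white", "green-roofed"}))

if __name__ == "__main__":
    for option in phrase_options(house):
        print(option)
    # Naming the features in a different order describes the same content.
    print(Pod("house", frozenset(["white", "green-roofed", "large"])) == house)
```

Three coordinate features yield six orderings, and because the features are stored as a set, all six map back to one representation. A floor pod with its own features would be a separate Pod whose host is itself listed as a feature of the house.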
PLANS AND CONTENT MAPS
The agent that creates any description requires knowledge of the relationship to be expressed (knowledge of the features and the relationships). A plan is formulated for producing behavior that communicates the relationship. The plan that is successful honors restrictions for describing the features. As with all instances of plans based on content, the content is broader than the plan, so there are options for the response strategy that is planned.
For the receiver, the utterance is a content map for creating a specific representation. The system assembles the features specified by the relationship and particularizes them. Just as the sender is able to create a potentially large number of behavioral strategies based on the features of the content, the receiver is able to create a great number of possible representations from the content received. The grammar provides the system for building gestalts one feature at a time. Abstract Patterns for Grouping Language Features An utterance such as “He is raking the leaves” has features that permit it to be classified in progressively abstract forms, each based on features of sentences that are in the learner’s repertoire. The following are some of the forms: He is raking the leaves. ___ is raking the leaves. ___ is ____ing the leaves. ___ is raking the _____. ____ is _____ing the _____. ____ is _____ing _______. ____ is __________. A large number of such series could be created. For instance, one would be the same except for the last pattern, which would be: ____ ____ _____ing ________. The variations would use different auxiliary forms (is/are, was/ were, etc.). Each entry conveys information. When there are no blanks in the sentence, the sentence provides sufficient information for the receiver to formulate a representation. The range of possible variation for a given entry reveals the range of possible content not conveyed by the abstract pattern. We are able to demonstrate the meanings conveyed by the pattern by introducing nonsense words for the various blanks. The nonsense words have an unknown referent. However, their position in the sentence and related words permit the language user to answer some questions based on the sentence. The range of information provided by any entry is operationally defined by the range of questions that may be answered by referring to the entry. 
The procedure involves comparing the set of questions that is logically answerable by referring to the sentence that has no unknown words with the sentence that has unknown words. The difference discloses the information provided by the sentence and the information function of the unknown words.
1. We present the learner with the sentence, “He is raking the leaves.” We then ask a series of questions based on the sentence.
What is he doing? (Raking the leaves.)
Who is raking the leaves? (He is.)
What is he raking? (The leaves.)
2. We present a parallel item that contains nonsense words. We tell the learner, “Here’s a sentence with some funny words. He is snarping the flot.”
What is he doing? (Snarping the flot.)
Who is snarping the flot? (He is.)
What is he snarping? (The flot.)
The average 5-year-old is able to perform this kind of task, which means that the average 5-year-old has language utterances classified according to the various information features conveyed by word patterns.1 This is not mere grammatical information. The simplest demonstration is to ask yes–no questions, such as “Is he snarping the flot?” The answer is “yes.” If we start with the sentence, “He is not snarping the flot” and then ask the same question, “Is he snarping the flot?”, the answer is “no.” Neither yes nor no were in the original statement. The only possible basis for the learner answering the questions is the knowledge that, for all instances of the pattern “_______ is ___ing the _____,” the answer is “yes” because the sentence describes something observable. For all instances of “ _______ is not _____ing the _______,” the answer is “no” because the act is not observable. If we asked questions that could not be answered without knowledge of the words snarping and flot, the learner would not be able to answer them. For example, the learner could not answer “When did you last snarp?” or “How many flots have you snarped?”
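The claim that the pattern alone supports answers to certain questions can be illustrated with a short Python sketch. It matches sentences of the form "___ is (not) ___ing the ___." and derives the answerable questions from the matched parts, with no lexicon of word meanings. The regular expression and the function are illustrative assumptions, not a model of the test described in the footnote.

```python
import re

# Pattern for "<agent> is [not] <verb>ing the <object>."
PATTERN = re.compile(
    r"^(?P<agent>\w+) is (?P<neg>not )?(?P<verb>\w+ing) the (?P<object>\w+)\.$"
)


def answers(sentence):
    """Map each question answerable from the pattern alone to its answer."""
    match = PATTERN.match(sentence)
    if match is None:
        return {}
    agent, verb, obj = match["agent"], match["verb"], match["object"]
    who = agent.lower()
    negated = match["neg"] is not None
    result = {f"Is {who} {verb} the {obj}?": "no" if negated else "yes"}
    if not negated:
        result[f"What is {who} doing?"] = f"{verb.capitalize()} the {obj}."
        result[f"Who is {verb} the {obj}?"] = f"{agent} is."
        result[f"What is {who} {verb}?"] = f"The {obj}."
    return result


if __name__ == "__main__":
    print(answers("He is snarping the flot."))
    print(answers("He is not snarping the flot."))
```

Questions such as "When did you last snarp?" never appear in the result, because nothing in the pattern supplies the referents of the nonsense words.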
Meaning Versus Grammar
The agent is quite aware of not knowing what snarping and flot refer to. The agent is equally aware that the sentence serves as a model for answering specific questions—those that are answerable and those that are not. If we present the statement “He is snarping the flot” and then ask, “Did you ever do what he is doing?”, the learner would possibly give some sort of troubled answer that expresses the idea “I don’t know.” The assumption of the system is that if all the words in the sentence were known, the sentence would describe something observable in either reality or fantasy.
1 This test, the Basic Language Concepts Test (Engelmann, Ross, & Bingham, 1982), presents items that involve nonsense words. The test has been normed on at-risk and no-risk populations. The average 5-year-old at-risk student performs reliably on the nonsense-word tasks.
In the final analysis, grammar is simply a system for directing the representation of events one feature at a time. Operationally, the details of the grammar correspond to observable features. A change in the detail of a sentence results in a change in the detail of the display generated by the sentence. The categories conveyed through the grammar are simply features of examples—shared samenesses of groups of individuals. The traditional analysis of grammar actually begs the question by referring to categories such as noun or verb. The groupings of words by parts of speech are based strictly on shared features of the referents of words. Names of things are nouns. The only possible way to determine whether a word in an unfamiliar language is a noun is to determine whether the referent is in fact a thing. If so it will automatically be in the class of things, which makes it a noun and affords it particular positions in sentences and particular modifiers (the, those, a, an, etc.). Coincidentally, it is possible to determine the part of speech of a word if the rules of the grammar are understood, even if the meaning of the word is not known. Snarping is a verb. Flot is a noun. These conclusions are based on extrapolation from known instances. Words of known meaning that follow is and end in ing are verbs. The word snarping is a word that follows is and ends in ing. Therefore, (a) snarping is a verb, and (b) snarping has a meaning, although its details have not been disclosed.
LANGUAGE ACQUISITION
Descriptions of language acquisition become greatly distorted if they refer exclusively to language production. This emphasis obscures the receptive knowledge of the learner. The learner cannot generate utterances in the language without having some notion of the meaning as well as the pattern.
Receptive Functions
The learner broadly learns that receptions have meaning and meanings change as the phonological details of the receptions change. The learner specifically learns that the features of the utterances that are the same across a range of individual examples signal a stable feature. For example, “Hand me the ball,” “Hand me the cup,” and “Hand me the soap” all have a verbal part that is the same and a corresponding feature of the examples that is the same. Therefore, the part of the sentences that is the same predicts the features of the example that are the same. Once the learner has learned the relationship, the learner is able to generate variations of similar series, “Me
ball,” “Me cup.” The learner must know how the setting will change as a function of each utterance. The utterance has the power of making changes in the same way that physical manipulation does. “Me, cup. Me cup” results in somebody handing the child the cup: “Here you are, honey.” Other series that build receptive knowledge present the same word in a series of utterances. “Here’s your cup. Oh, don’t drop your cup. Here, let me hold the cup for you. You want to hold the cup yourself? Okay, but hold that cup tight . . . Good drinking from the cup.” Although much of this language is babbling to the infant, the verbal element, cup, occurs at a high rate and provides the learner with the opportunity of hearing it in a variety of verbal contexts. It is being established as an independent unit. Ultimately, the child learns that all instances that name cup refer to a particular set of host features. All the instances of hand me refer to a particular set of action features. The features of the cup predict the word, and the word predicts the features. Language Instruction Some details of language learning are not readily observed in the hustle of childrearing, but work with young, language-delayed children shows both the details of the information children need to learn the various language conventions and the amount of practice required. Often these children do not understand the words yes and no or even that the purpose of a verbal communication is to provide information, rather than just a recitation of sounds. Often they do not know the meaning of simple directives such as “touch” or “stand up.” Language-delayed children show through their behavior that learning is highly unfamiliar to them. We once worked with a group of 30 four- and five-year-olds, 28 of whom had recently arrived from Portugal and did not understand any English. 
Two were disadvantaged children who were English speaking, but who came from an environment that did not provide much interaction involving language. Within 3 weeks of instruction in language and language concepts, the two disadvantaged children were the lowest performers in the group. The children from Portugal already knew many of the concepts and language conventions incorporated in English. They also knew what was expected when an adult taught something: what kind of attention was required, what type of information was to be remembered, and how the information would probably be used later. All they needed to learn was the labels for the various features and the syntactical rules for expressing relationships. The disadvantaged children had to learn not only what many of the words mean, but also how to relate them to specific features of the setting. For instance, they knew no prepositions. If told to put the block on the table, under the table, or over the table, the block would end up in the same place: on the table. Although on the first day of preschool they knew more about English than the children from Portugal did, they knew far less about language.
The performance of older disadvantaged children on reciting patterned digits also implied that they were relatively unfamiliar with learning verbal patterns. When presented with random digits, 6-year-olds could recite five or six of them. When presented with patterned digits, such as 7,7,3,3,8,8, they could not do as many. For them, the pattern served as interference. In contrast, the children from Portugal showed no advantage in recalling random digits, but were able to repeat 10 or more patterned digits such as 6,6,6,8,8,8,5,5,5,1,1,1 (see footnote 2).
Language and Logic
Learning properties and patterns of language requires hundreds of examples. The reason is that the learner must be exposed to a wide range of examples to receive adequate information about the features shared by all examples of a pattern, such as “______ is not _______.” The sentence “Homer is not hungry” provides a lot of information about an individual dog, his behavior, and how he does not feel. The sentence “This car is not a piece of junk” does not refer to feelings or behavior. It negates an assertion that this car is a piece of junk and refers to fact, not something observed as a physical feature of something. Yet certain aspects of the logic are the same for both items (given that they are true).
Only if the learner learns this logic is the learner able to create the various expressive variations that might be observed: “Mom, Homer is not staying in the back yard” and the parallel forms, “Mom, Homer won’t stay here,” “I did call him, and he didn’t come,” “Why doesn’t David go after him?” “That isn’t my job,” “Well, I ain’t going to do it.” The basic form is learned from a sufficient number of examples to show the sameness in pattern feature and a sameness in event feature. Just as extrapolation is possible on the basis of sameness of feature of an object, it is possible on the basis of sameness of feature of a pattern. This extrapolation, however, requires an understanding of the parallel between verbal pattern and event.
The underpinning of language is knowledge. If the learner knows a fact is true, there are many expressive options. The knowledge must precede the verbal expression. With the knowledge in place, the learner is able to describe multiple features of the setting one feature at a time. “No, Mom, Homer is not in the yard. He crawled under the fence, and he’s in Mrs. Jenkins’ garden. She’s chasing him with a broom.”
In summary, the basic logical form that the system uses is extrapolation based on sameness of features of the referent and corresponding features of the expression. If a referent has Feature F and it is described by Expression E, it follows that F is a predictor of E (and E is a predictor of F). Any referent that has Feature F, therefore, is assumed by the system to be described by Expression E.
2 The task of repeating patterned digits is presented by the Basic Language Concepts Test (Engelmann, Ross, & Bingham, 1982).
Expressive Language
The literature richly documents developmental trends of expressive language under relatively poorly controlled conditions. Indeed, there are norms for the number of words and sentence structures that learners of various ages produce. The norms, however, are assumed to be natural phenomena, not the product of specific input. When the child says “Milk all gone” and “Dirt all gone” and then generalizes to “Mommy all gone” as she leaves the room, the child is not simply engaging in a linguistic game. The child is using words to refer to a feature that is the same about all the settings. The “open-pivot” generalization is not merely evidence of the child’s ability to speak, but evidence of the child’s ability to perform a series of sophisticated steps based on knowledge:
1. Identify the host (mommy).
2. Identify the feature of the host that occurs in the setting (all gone).
3. Search for the appropriate language pattern or template for this relationship (_______ all gone).
4. Plan the response with the appropriate words in the appropriate sequence.
5. Direct the response: “Mommy all gone.”
Steps 1 and 2 refer to the features of the setting that must be observed and identified before any representation of these features occurs. Steps 3 and 4 deal with the features of the expression that are correlated with features of the setting: the word that refers to the host and the words that tell what change occurred.
The plan also includes the loudness and other features of the response. Step 5 directs the response. Elimination of any step would make it impossible for the response to occur. The infrasystem is highly involved in the formulation of the utterance. The feature of the setting that triggers the infrasystem functions is the presence of another person—a listener. (Verbal performance in the presence
of the listener has a history of leading to social reinforcement.) The agent is presented with the insight (from the infrasystem) both that mommy has left the scene and the content map for expressing this relationship (_____ all gone). The agent plans the response, but with prompts from the infrasystem. The learner has learned to look at the listener because the listener is the source of reinforcement. The learner talks with excitement because the relevant features of the setting are enhanced with secondary sensation. For the learner, this observation is very important.
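The five steps listed under Expressive Language for producing “Mommy all gone” can be laid out as a small Python sketch. Everything named here is a hypothetical stand-in: the LEXICON and TEMPLATES dictionaries, the setting fields, and the function names are illustrative assumptions, not structures proposed by the text. The sketch only shows the ordering of the steps and the claim that removing any one of them prevents the response.

```python
# Hypothetical stores standing in for what the learner's system has already learned.
LEXICON = {"mother": "Mommy", "milk": "Milk", "dirt": "Dirt"}  # host -> word
TEMPLATES = {"no_longer_present": "{host} all gone"}           # feature -> pattern


def identify_host(setting):
    """Step 1: identify the host the observation is about."""
    return setting.get("host")


def identify_feature(setting):
    """Step 2: identify the feature of the host that occurs in the setting."""
    if setting.get("was_present") and not setting.get("is_present"):
        return "no_longer_present"
    return None


def produce_utterance(setting):
    host = identify_host(setting)          # Step 1
    feature = identify_feature(setting)    # Step 2
    template = TEMPLATES.get(feature)      # Step 3: search for the pattern
    word = LEXICON.get(host)
    if template is None or word is None:
        return None                        # a missing step blocks the response
    plan = template.format(host=word)      # Step 4: plan words in sequence
    return plan                            # Step 5: direct the response


print(produce_utterance({"host": "mother", "was_present": True, "is_present": False}))
```

The same template serves milk, dirt, and mother, which is the sense in which the generalization rests on a feature shared by all the settings rather than on any one memorized utterance.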
Meaning and Convention The learner extrapolates on the basis of sameness of feature. The generalizations based on the examples the learner has encountered do not always comply with the meaning or usage conventions of the language. The child’s generalization that mommy is all gone is clearly based on shared features of the referents the child has encountered. When the cup no longer contains any milk, the milk is all gone. When (at last) there is no longer any applesauce in the bowl, the applesauce is all gone. When the mommy is no longer in the room, the mommy is all gone. The language convention tends to restrict the generalization of all gone to things that are consumed or exhausted. With further experiences, the learner will learn the more restricted generalization through stipulation of usage (the term only used to describe a subclass of possible referents) and through corrections based on the child’s false positives. (Yes, Mommy is no longer here.) In learning the various conventions, the learner creates a variety of mistakes based on an irrelevant feature or a feature that is too broad. An example of misgeneralization could occur with relative words, such as your and my. The child may encounter two words together, such as your shoe, and generalize that the name of any shoe is your shoe. Encountering reference to my shoe, that shoe, or daddy’s shoe presents a minimum-difference negative. “No, honey, that’s not your shoe. That’s my shoe.” This explanation may be confusing to the learner, but the shoes do not look the same, so the learner’s system identifies some of the physical features that are unique to the shoe labeled my shoe and uses this as the basis for a discrimination. Later, the child will have to abandon this interpretation for one that recognizes the relative nature of the words my and your. This occurs through the learning of a lot of things that are labeled my—my bear, my shoes, my cup, my blanket. These things are always mine. 
Nobody else can call them my bear. (That would upset me greatly because it is my bear.) The common features of the my things, therefore, are revealed through contact with a set of examples sufficient to shape the discrimination. False features are often associated with such words because of common features. If I like my bear, my shoe, and my
blanket, I like my name. Nobody can take my bear, my shoe, or my blanket. Therefore, nobody can take my name. In the same way the child may make generalization mistakes with my and your, the infant may point to the wrong person (the mailman) when saying, “See Daddy.” The learner is generalizing on specific features of Daddy. He is a large person with a deep voice; therefore, he is called “Daddy.” The convention for Daddy is like that for my shoe. Not every shoe is called “my shoe.” Not every man is called “Daddy.”
Primary Reinforcers and Words
Before the days of child car seats, a driver had his 3-year-old son on the front seat next to him. As the car approached an intersection, another car cut it off. The father slammed on the brakes, extended his right arm to prevent the child from hitting the dashboard, and issued a loud, angry epithet to the driver of the other car: “Son of a bitch.” About 3 months later, the father and son were on their way across town. As the car entered the freeway, the father realized that the car next to it was not yielding the lane. He extended his arm, and braked, without saying anything. The child pointed to the other car, frowned, and said in a loud, angry voice, “Cinnamon bitch.”
This is a classic example of how all the features of settings that present strong primary reinforcers become enhanced. The child learned a content map from a single exposure. The learning was largely influenced by strong primary reinforcers. In the original encounter, the child was frightened by the sudden noise, jostling, and behavior of the father. Something predicted this outcome. The father’s anger clearly reinforced that the other vehicle was responsible for this outcome and that what the vehicle had done was bad. So all of the details are enhanced: the car, the braking, the sequence of events, and the verbal response of the father.
Later, in the presence of some of the details—the driving context, the braking, the arm extending, the urgency, the other car—the behavioral strategy is presented reflexively to the agent, including the need for a loud, angry voice and the words to be issued—“Cinnamon bitch.” What the child learned is not simply the words, but their usage, including role behaviors that accompany the usage. The child learned to behave as any person would behave in this setting. (Daddy is reasonable. He behaved that way. Therefore, anybody would behave that way.) Psycholinguistic Trace Features of Words The classification system of the infrasystem has every entity classified according to the sum of its features. Some features derive from the sound of the word, some from the usage, and some from the referents. The previous
example illustrates that the verbal expression carries with it the secondary sensation that predicts the event. The expression has feeling in the same way that the experience from which the expression derived had feeling. We may hate the name Vincent if we had punishing experiences with someone named Vincent. If we did a search for names that we do not like, Vincent would be on the list. It would have the feature of being negative and predicting punishment. This is a trace feature of the name, one that will influence some of the learner’s behaviors, but not one that is shared by everybody. The learner who hates the name Vincent may be fully aware that there may be thousands of nice individuals who possess this name. The learner would not purposely interact in a discriminatory way simply because the person is named Vincent. In fact, if the learner likes this person, the learner may not have negative associations with this person’s name because it is not classified as simply Vincent, but primarily as Vincent Johnson. So if the learner thought of people who are positive, the name Vincent Johnson would (paradoxically) be on the list.
Demonstrations of the extent to which the system is able to cross-classify on the basis of trace features were prompted by Osgood’s notion of the semantic differential (Osgood, 1952; Osgood & Zella, 1954; Osgood, Suci, & Tannenbaum, 1957). The idea is that if two things are responded to in the same way, they are equal. The studies that grew out of this principle did not investigate central or public meanings, like understanding of red and not-red objects. Rather, they would determine the extent to which words had trace features that have no feature-based relationship to redness. They were experiential relationships. The respondent would rate the degree to which a word like anger is red.
The results of these studies suggest that the word man is rated high in aggression, the word swear high in masculinity, the word circle high in femininity, and so forth. Of course, all of this has nothing to do with the primary meaning of the word, the secondary features of the word (the subclasses of meanings), or the tertiary meanings (grammatical and usage). Rather the connections involve what must be quaternary features at best—undisclosed relationships between the respondent and some features of the word—the sound, usage, referents of the word, word users, verbal context, or contexts in which the respondent experienced the word. The learner is able to forge some relationship between a color and aggressiveness, friendliness, masculinity, or interest. This performance confirms that the infrasystem has everything classified by the sum of its features. Only if the word man is classified by basically the sum of the experiences the system retrieves from a search of man could the system create a link between square and man or aggression and man or red and man.
The results of investigations of semantic differentials proved to be quite sterile. A task that would more clearly articulate the conceptual basis for the apparent semantic equivalence (which is actually more of an experiential relationship) would present the relationship and ask how that relationship was possible. For instance, “Person A thinks of the color white when thinking of vacations. Why would a person have such an association?” Possible explanations might be winter vacation, snow-capped mountains, the name White River, and so forth. In any case, there is a basis; it is not semantic, but a peripheral relationship to words; it shows the extent to which the system is able to create relationships.
Influences of Language on Perception
Various experiments have shown how the verbal classification of ambiguous stimuli relates to how the features are represented by the system. For instance, subjects are presented with a simple drawing that has the general shape that could suggest eyeglasses or a dumbbell (Carmichael, Hogan, & Walter, 1932). The prompt that is provided to some subjects refers to glasses. The prompt for the other subjects refers to a dumbbell. When subjects later draw the stimulus, they tend to distort what was presented. One group’s drawings look more like eyeglasses, and the other’s look more like dumbbells. What happens here is extrapolation based on the features of a model. If the verbal description eyeglasses is accepted, a set of features is accepted as a gestalt, a whole that provides for a specific arrangement of details. When reproducing the stimulus, the learner refers to the gestalt for eyeglasses and uses it as the basis for creating the detail of the drawing. This type of suggestion is actually a form of propaganda. To believe the prompt is to distort the features of the referent. More elaborate versions of this type of propaganda involve features never directly observed, simply described.
The result is that the person who accepts these verbal prompts believes, for example, that the woman with the feature of blond hair is dumb, that the male with black skin is unreliable, and that the person with the surname Goldberg is rich and sinister. Like the child who encountered two “Cinnamon Bitches,” the learner responds to the reactions of the people who provided these stereotypes. If your friend tells you about blondes and shows disdain or anger, these responses are part of the verbal package that describes the individuals. One of the features of blondes becomes the disdain, just as one of the features of Cinnamon Bitch is anger. Like all the other relationships learned through the presentation of primary reinforcers, the agent may not be able to access the reasons for learning what has been learned. The learner does not know that she is generalizing on the basis of correlated features of both the setting and the words.
She only knows that, at a later time, some words fit a particular setting. The infrasystem has presented the conclusion to the learner in the form of a content map. It is not the product of agent direction, although it may involve agent participation. For the child who learned “Cinnamon Bitch,” the infrasystem presented the scene to the learner many times, and the agent rehearsed and reviewed the details of the event with sufficient frequency and intensity to learn the proper use of the expression daddy used.
SUMMARY Some have suggested that the grammar of languages is too complicated for the learner to learn and is therefore inherited. However, the learning of language is no different from that of other relationships and predictors, except that it is more extensive and involved. Ultimately, however, the most primitive language understanding is based on the notion that the word predicts the feature and the feature predicts the word. The learner must learn an extensive array of shared features—the grammatical patterns of language as well as the various lexical, inflectional, tonal components, and usage contexts. All convey meaning. The name Alexis is shorthand for a litany of features. A word like dog encompasses a great range of features—newborn pups, young puppies, adult dogs, old dogs, friendly ones, mean ones, big ones, small ones, and various breeds. The word dog stands for the sum of these subtypes. Therefore, the word by itself is limited as a communication vehicle. In some settings, such as someone pointing to a paw print and saying “Dog,” the word is edifying. Yet to describe a specific event that is not present, additional specification is necessary. The reason is simply that dog describes only a single feature of the event. More features are necessary if the receiver of the message is to represent the event. The language provides rules for using words about features to describe individual events more precisely. “Watch how high this little dog will jump to catch a Frisbee” or “My dog has fleas.” The language calls the listener’s attention to specific features of a particular dog and the relationship of those features to something else (the jumping for the Frisbee or the fleas). Learning language is basically natural for the learner because language is designed to do what any content map in the learner’s repertoire does— provide a general description of what to do. The description refers to content. The description is made more specific by the agent. 
The features of the content describe those features that are not negotiable. The features that are not specified may be specified by the agent. Just as a content map that does not involve language applies to a range of possible settings, a content map described by the words That’s a big dog applies to a wide range of settings.
The process of grouping events or individuals by feature is basically the same for language as it is for nonlanguage events, except that language creates parallels between features of things and words. Part of the classification of language items involves grammatical patterns because these are needed to convey or decode specific meanings. The patterns are abstract forms or templates that guide the speaker’s transformation of referent features into words and the listener’s transformation of words into representations that depict the features. For the speaker, the specific utterance is the product of a plan. For the receiver, the utterance is a content map for creating a representation. The utterance “Get a Phillips screwdriver and bring it to me, quickly” is perfectly parallel to any content map. It provides information about the goal and criteria to be compared with performance. Because the map does not provide all the features of the response the agent will direct, the agent must construct a plan consistent with the reference to quickly. The learner must identify a specific route to specific places, apply a criterion to find an object that has a specific set of features (a Phillips screwdriver), and plan a return route. The description implies many features of the response, which means that the person attempting to follow the directives may fail by not responding to any single feature or combination of features expressed by the directive. The person may not attend to quickly, Phillips, screw driver, or bring it to me. A simple extension of the directives for performing behaviors is creating the imagined event. The simplest extension would be, “He told his assistant to get a Phillips screwdriver and bring it to him quickly.” The receiver of this description does not perform the behavior, but formulates a representation of the behavior called for that has the same essential details as those of the directive. 
The only difference between verbal directives that call for action and verbal descriptions is the nature of the plan. For the action directive, the learner must create a specific behavioral strategy that is consistent with the provisions of the directive and then execute it. For the description, the learner simply creates the plan as if it is to be carried out. No execution of the plan is required. The learner learns the language system in the same way the learner learns about other sets of related discriminations—identifying samenesses of features. If the learner learns that the phrase a dog predicts a particular cluster of features, a cup predicts another cluster, and 10 other different words predict 10 different feature clusters, the learner’s system classifies each of these phrases according to the sum of its features. The common features across any of these phrases describe a possible basis for generalizing. All the expressions share a common usage feature and all share a sameness
in the features of the referents (all are things or hosts that are observed). The system creates the generalization that any instance of the form “a _________” predicts a thing or host. Conversely, any observed thing or host predicts a name, which is expressed as “a _______.” This trial content map is applied and later modified as the learner learns about the features of the names that predict “an ____” rather than “a ___.” The process is complicated, but completely explicable in terms of the information the learner receives and the basic operations of the learning system. Because the language has the property of depicting events and groups, it is capable of conveying the secondary sensations predicted by what it depicts. If the event would reflexively cause strong secondary sensations, a description of the event would tend to reflexively cause the same secondary sensations. A graphic description of something the learner would find highly negative, such as taking a nasty fall, would result in a negative secondary sensation. Likewise, a description of an encounter that the learner would find sexually exciting would result in the learner experiencing sexual excitement. Words or expressions presented in a context of strong secondary sensations take on the secondary sensation. If the speaker who models the word shows great anger, the learner concludes that one of the features of the referent is that it provokes anger. The learner responds accordingly.
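The trial content map for “a ___” and its later modification for “an ___” can be illustrated with a short Python sketch. It rests on a loud simplifying assumption: the first letter of a name stands in for its first sound, which is not how English actually selects the article. The function names and example phrases are hypothetical. The point is only that the trial rule overgeneralizes until encountered examples reveal the feature of names that predicts “an.”

```python
def learn_article_rule(examples):
    """Collect the initial letters of names that were heard with 'an'.

    Assumption for illustration: the initial letter approximates the initial sound.
    """
    an_initials = set()
    for phrase in examples:
        article, name = phrase.split()
        if article == "an":
            an_initials.add(name[0])
    return an_initials


def name_thing(name, an_initials=frozenset()):
    """Any observed thing or host predicts a name expressed as 'a ___' or 'an ___'."""
    article = "an" if name[0] in an_initials else "a"
    return f"{article} {name}"


# Trial content map: every thing is expressed as "a ___".
print(name_thing("egg"))                  # a egg

# Modified content map after further examples.
heard = ["a dog", "a cup", "an apple", "an egg", "an orange"]
rule = learn_article_rule(heard)
print(name_thing("egg", rule))            # an egg
print(name_thing("owl", rule))            # an owl
print(name_thing("dog", rule))            # a dog
```

The generalization to owl, a name not among the examples, is the same kind of extrapolation on shared features that the chapter describes for referents.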
Chapter 16
Human Cognitive Development
The preceding chapters have suggested some of the modifications required by human learning. Because the human is designed to learn just about anything, its system must account for basic learning as well as the more sophisticated extensions. The system, therefore, must be designed so that its hardwired underpinning allows for great possible diversity of content. This endeavor is particularly difficult for the human system because the system is provided with no hardwired knowledge. The system must be able to start with basically no knowledge, learn the basic antecedent relationships, and then learn the various extensions that are unique to human learning.
UNIQUENESS OF HUMANS
There are probably no unique functions of the human learning-performance system except the extent to which the agent functions are generalized. Virtually all organisms that learn can learn at least some patterns associated with primary reinforcers. If the human encounters virtually any kind of data that suggest a pattern, the human is capable of learning the pattern. Animals like dogs and horses are able to plan things and therefore think about things not keyed to details of the present setting. Humans do this routinely and with respect to any pattern, including those that are far removed from any obvious primary reinforcer.
Animals communicate content through calls, movements, postures, and so on. A particular posture clearly shows the observer that an attack is imminent. A particular call means it is safe for fellow birds to land here. Also mammals are able to learn specific signs that direct behavior. The movement of the master’s arm indicates the direction the spaniel is to go. The inflection of the word sit combined with some broad phonological details of the command direct the dog to sit. The discriminably different verbal command “Hoooold it” tells the animal to stay in its current position.
The extent to which humans are able to analyze features of receptions is greatly expanded. The extended analysis is not a function of superior hearing or eyesight, but simply improved capacity to identify qualitatively unique features. For the dog, sit may be replaced with words that are minimally different (sat, fat, fit, snit, slit, slat, or slot) without affecting the outcome, so long as the intonation and loudness match those that accompany sit. A human who knows no more about language than the dog does would respond the same way. Any features that are unique to all positive examples serve as a basis for identifying all positive and negative examples. If the tone, preceding events, and length of the utterance are unique to the positive examples (“Goldie, slit . . . slit.”), those are the only features the learner needs to learn to respond successfully to all positive examples and no negatives. The difference between a human and a dog is that the human is able to abstract the phonological features of the word and learn a meaning or set of meanings for each unique phonological arrangement.
The learner is able to learn how to recognize the word sit when it is said slowly (“sssiiit”) or when it is segmented (“sss----iii---t”), when it is said in a high register or low one, when it is said with rising inflection or falling inflection, when it is loud or whispered, when it is in a unique tonal melody, when it is temporally preceded by other meaning elements, when it is appended with affixes, and when it has a sound that is shared by another word. When we say “Fritz sits,” it usually comes out “fritsits,” with no pause between the words. In contrast, there is a pause between the t and s in both words. So the pauses do not provide information about the end of the words. Yet the word sits is recognized as one unit, and the word Fritz is recognized as the other.
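The contrast between responding to intonation and loudness and responding to the phonological arrangement of a word can be sketched in Python. The Utterance fields, the normalize helper, and both recognizer functions are hypothetical simplifications: spelling stands in for sound, and stretched or segmented speech is modeled by repeated letters and hyphens. This is an illustration of the distinction, not a model of speech perception.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Utterance:
    sounds: str   # spelling used as a stand-in for the phonological details
    tone: str     # e.g. "command" or "question"
    loud: bool


def responds_to_prosody(u, trained):
    """Responds whenever tone and loudness match the trained command."""
    return (u.tone, u.loud) == (trained.tone, trained.loud)


def normalize(sounds):
    """Drop segment breaks and collapse stretched sounds ('sss----iii---t' -> 'sit')."""
    stripped = sounds.replace("-", "")
    return "".join(key for key, _ in groupby(stripped))


def recognizes_word(u, word="sit"):
    """Responds to the phonological arrangement, whatever the tone or loudness."""
    return normalize(u.sounds) == word


trained = Utterance("sit", "command", True)
slit = Utterance("slit", "command", True)
whispered = Utterance("sss----iii---t", "question", False)

print(responds_to_prosody(slit, trained), recognizes_word(slit))            # True False
print(responds_to_prosody(whispered, trained), recognizes_word(whispered))  # False True
```

The first recognizer accepts slit because the only features it has learned are shared by all the positive examples it has met; the second rejects slit and still accepts a slow, segmented, whispered sit.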
DEVELOPMENTAL ORIENTATIONS
Because human growth has so many facets, different views of development are possible. The capacity of the child changes with age. A child who is 12 knows a lot more than she did when she was 9, but her IQ is probably in the same range. These are population predictions, that is, averages.
Jean Piaget is to development as Chomsky is to language. They are parallel in important ways. Both recognize some form of logical operations that
the learner performs. For Chomsky, they are the sometimes complex rules of grammar; for Piaget, they are logical patterns of reasoning that are assumed to occur if the child is to learn broad, operational skills. Both Chomsky and Piaget introduce mystical components that account for processes and operations that are learned. Chomsky’s component is the inherited knowledge of the underlying structure of language. Piaget’s component is the suggested contribution of development to content that is learned. Piaget proposed norms based on how children perform on tasks requiring different patterns of reasoning. From children’s performance, Piaget inferred processes, operations, and interactions of learner and setting (Inhelder & Piaget, 1958). Piaget noted the transformations and logical operations necessary for the child to perform as observed. His theory acknowledged that the learner has to create abstractions to perform on certain types of tasks. In brief, he proposed some of the same relations that we address through the process of inferred functions. Piaget, however, postulated that learning was subsumed by development, which broadly means that it is possible to organize tasks of varying degrees of sophistication that children probably would not learn out of order. Their learning would parallel their physical development, with some tasks performed only by older children. For example, the 5-year-old would not understand that the floating–sinking behavior of objects is predicted by the density of an object compared with the density of the medium in which it is placed. The probability is much greater that a 12-year-old would be able to perform on this task and perform on all the tasks that characterize the performance of 5-year-olds and 9-year-olds. In addition to the logical processes that characterize the learning, Piaget’s scheme introduces an amorphous something that is labeled development and that yields an order of difficulty of the tasks. 
However, the tasks have an order of difficulty without reference to development. The order of difficulty needs reference only to the component skills required to solve the tasks. If Task 1 requires knowledge components A–N, the task necessarily assumes knowledge of A–D. If another task, Task 2, requires only A–D, some learners would be able to perform correctly on Task 2, but would fail Task 1, which requires components A–N. Only learners with knowledge of A–N would perform Task 1 correctly. Therefore, learners would be screened on the basis of total knowledge. The task may imply a necessary order. If the components E–N cannot be learned without knowledge of A–D, a progression of learning is implied. However, it is not based on development, but on the probability of specific learning occurring at a particular rate and in a particular order. The present analysis, although not taking issue with any of the actual data that Piaget
and his colleagues have assembled, considers the problems of cognitive development simply problems of learning. A functional way to state the problem is that we could not communicate with the naive learner about density until the learner has a certain set of skills. Once these skills are induced, some form of articulate communication is possible. We must be able to refer to the size or volume of the objects and to their weight. We could not assume that the learner had knowledge of the fixed-unit reasoning that permits intuitive conclusions about density. Therefore, the child could not learn the concept because it would be logically impossible for us to show the shared features of the things that sink compared with the shared features of those that don’t. We could not use language, and we could not use hands-on demonstrations.
Piaget presented a more circuitous relationship involving knowledge, development, and learning. As Furth (1969) summarized:
If learning is defined as new knowledge derived from experience with particular events, physical abstraction illustrates learning in its dependence on general knowledge realized by means of formal abstraction. According to this definition, the development of general intelligence is not a process of learning, but of equilibrating through formal abstraction. (p. 249)
In other words, learning has to do with the process of learning something new, which assumes that some forms of formal abstraction are in place, but these forms are not seen as the product of learning (although they are influenced by learning). Piaget’s basic assumption is that these abstract operations are somehow content free, not that they are simply features of the examples that behave a particular way. The notion of the equilibrating through formal abstraction sums up the problem. This process apparently involves both content and some form of logical operations. If both are teachable, the skill is teachable, which means that it is not a function of development, but of learning. The abstract operation springs from the classification of the examples on the basis of sameness of features. The floating–sinking behaviors of Instances A–Z are different: A–L float, M–Z sink. Those that float have the feature of being less dense than the medium. Those that sink have the feature of being denser than the medium. The size, shape, and other features of the object are (largely) irrelevant. The type of medium is (largely) irrelevant. Logically, the operations cannot be independent of the features of the examples processed by the operation because the generalization of the operation (what is common across the positive examples) is solely a product of information about the features of the examples. Piaget identified different developmental stages, primarily the sensorimotor stage, the concrete-operational stage, and the formal-operational
stage. Each stage is characterized by what Piaget considered to be unique logical operations. In fact, all these logical operations must be learned because the infrasystem is restricted to operating reflexively. For the infrasystem to produce the same operation on various occasions, the same class of stimulus conditions must be present.
Classification by Multiple Features
In contrast to Piaget’s orientation to development and learning, the current analysis rejects the possibility of amorphous developmental variables affecting learning. What the learner learns is accounted for by the learner’s experience. This is not to say that learners do not vary in their individual capacities and tendencies. Rather, it is to say that there is an experiential base for the cognitive content learned and processed. The only preset operations the infrasystem is logically capable of performing are the classification of examples on the basis of sameness of feature and the steps required to determine the identification of those classes that predict. The classification operation involves the component operations of comparing and identifying which feature or set of features is the same. Everything presented to the infrasystem (internally generated sensation, representations that are the agent’s repertoire, input from sensors, and combinations) is treated the same way. Inputs are compared by feature and classified by feature. If something coming into the infrasystem has the common features A–N, it is classified by those features. It is also classified on the basis of its other features. The rock is not simply something that sinks. It is an individual. It is a rock. It is hard. It has a particular size, shape, and texture, and it is involved in a particular event. Classification occurs for each of these features, and therefore for any combination of them.
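As a rough illustration of classification by each feature and by any combination of features, the following Python sketch lists every class an instance falls into. The feature names and the two helper functions are hypothetical, chosen only to echo the rock example; the sketch makes no claim about how the infrasystem actually operates.

```python
from itertools import combinations


def feature_classes(features):
    """Return every nonempty combination of an instance's features.

    Each combination stands for one class the instance belongs to.
    """
    ordered = sorted(features)
    classes = []
    for size in range(1, len(ordered) + 1):
        classes.extend(frozenset(c) for c in combinations(ordered, size))
    return classes


def shared_classes(instance_a, instance_b):
    """Classes (feature combinations) that two instances have in common."""
    return set(feature_classes(instance_a)) & set(feature_classes(instance_b))


# Hypothetical feature sets for two instances.
rock = {"sinks", "hard", "rough"}
coin = {"sinks", "hard", "smooth"}

rock_classes = feature_classes(rock)
common = shared_classes(rock, coin)
```

With three features, the rock belongs to seven classes, and it shares three of them with the coin: "sinks", "hard", and the pair of both.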
The only format of classification by sameness of feature that leads to learning is one that records a relationship of more than one set of features. In its simplest form, this format involves three separate classes, each with unique features. The relationship of the three terms is the basis for classifying all the instances according to common features. Table 16.1 shows the classification that involves a three-term relationship. Set 1 is based on grouping according to the single feature A; Set 2, according to the single feature B. Relationship R is the shared feature of all instances that join Sets 1 and 2. For instance, the shared feature unique to Set 1 may be the behavior of directing the hand to the eye. All instances of Set 2 are a painful eye that corresponds to eye contact by the hand. The relationship is that Set 1 predicts Set 2. By itself, this relationship would mean little because it would not imply behavior. When it is combined with the related negative set (Table 16.2), however, behavior is implied.
TABLE 16.1
Three-Term Classification
Set 1 has Feature A | Set 2 is R related to 1 | Set 2 has Feature B

TABLE 16.2
Three-Term Classification of Negatives
Set 1′ has Feature not-A | Set 2′ is R related to 1′ | Set 2′ has Feature not-B

TABLE 16.3
Three-Term Combined Classification
Set 1 has Feature A → B | Set 1′ is P related to Set 1 | Set 1′ has Feature not-A → not-B
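The three tables can be read as a small procedure over recorded episodes. The Python sketch below is one hypothetical rendering: the episode records, the field names, and the `PAIN_IS_AVERSIVE` constant are assumptions made for illustration and are not part of the original analysis.

```python
# Each episode records whether the hand was moved to the eye (Feature A)
# and whether eye pain followed (Feature B). The data are invented.
episodes = [
    {"hand_to_eye": True, "pain": True},
    {"hand_to_eye": True, "pain": True},
    {"hand_to_eye": False, "pain": False},
    {"hand_to_eye": False, "pain": False},
]


def predicts(records, action_value, outcome_value):
    """True if every record with the given action value has the given outcome."""
    matching = [r for r in records if r["hand_to_eye"] == action_value]
    return bool(matching) and all(r["pain"] == outcome_value for r in matching)


# Table 16.1: A predicts B. Table 16.2: not-A predicts not-B.
a_predicts_b = predicts(episodes, True, True)
not_a_predicts_not_b = predicts(episodes, False, False)

# Assumed hardwired basis for the preference: pain is aversive.
PAIN_IS_AVERSIVE = True


def preferred_action(records):
    """Table 16.3: compare the two relationships and pick the preferred one.

    Returns the hand_to_eye value whose predicted outcome is not aversive,
    or None if the records do not support a choice.
    """
    for action_value in (True, False):
        leads_to_pain = predicts(records, action_value, True)
        avoids_pain = predicts(records, action_value, False)
        if PAIN_IS_AVERSIVE and avoids_pain and not leads_to_pain:
            return action_value
    return None


choice = preferred_action(episodes)
```

Here `choice` is `False`: the relationship in which the hand is not moved to the eye is the preferred one.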
In other words, the act of not bringing the hand to the eye is related to the absence of pain. By combining both of these sets (Table 16.3), the agent is able to compare the behaviors. The choice is to move the hand to the eye and experience pain or not move the hand to the eye and not experience pain. The agent tends to choose the latter. For this reorganization, Set 1 has to do with the instances of A leading to B. Set 1′ has the feature of not-A leading to not-B. The sets are P related, which means in this case that Set 1′ has the feature of being the preferred relationship, and Set 1 has the feature of being not preferred. The basis for this preference is hardwired. The preference is learned.
Human learning is predictably slow at the onset because the learning is highly unfamiliar, and there is no base for extensive comparisons. The first things the system must establish are the most rudimentary classifications. Something feels good. The infant’s infrasystem must first establish that feeling good is not something unique to the current setting, but is a variation of the same feeling that occurs on another occasion. Once this relationship is established (that the feeling is the same in some way), possible predictive features may be identified. As the knowledge of multiple features emerges, the first relationships are fashioned. Sucking feels good. The act is Set 1; the good feeling is Set 2. The relationship is that the feeling is a function of the sucking. Given that the learning involves multiple features, the potential for learning increases geometrically as multiple features are identified by the system.
Learning Abstractions
The assumption of all operations is that they are limited to specific features, even very abstract relationships, such as (X/Y) × (Y/X) = 1. X and Y refer to a specific feature of all instances of X and Y. Both refer to a number. Number
is a feature of examples. If we do not limit the set of substitution instances to the number feature (and to a specific number that is substituted for X and Y on any particular occasion), sensible conclusions do not follow. For instance, if we substituted clouds for X and hunger for Y, the equation not only would not equal 1; it would not mean anything. The operation of dividing clouds by hunger is nonsense unless we abstract the number feature from any specific set of clouds and any specific occasions of hunger experiences. If things and events did not have number properties, however, the abstraction would not be possible. Compensatory Reasoning. Piaget’s conception of conservation of substance illustrates the problems of trying to treat operational understanding as something that is content free. When a 6-year-old child is presented with a clay ball and then sees it deformed into a sausage shape, the child understands that the amount of substance is the same as it was before the ball was deformed. According to Piaget, the child applies compensatory reasoning, which holds that, although the object increased in one dimension (length), it decreased in the compensating dimension (width). Piaget assumed that the same operation applies to tasks involving water that is transferred from one glass to another of a larger diameter. The child understands that the amount of water is the same. Although the level is lower when it is transferred, it is compensated for by increased width. So the abstract operation has to do with independent dimensions that covary in a way that does not change the total amount of substance. This scheme has many problems. The first is that the compensation is not perfect. Technically, if the child referred to the surface area of the water in the narrow glass versus the wide glass, there would appear to be more water in the narrow glass. The same is true of the clay ball. The sphere has the smallest surface area. Therefore, the other appears bigger. 
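The arithmetic behind these two cases can be checked with a short Python sketch. It assumes perfectly cylindrical glasses and a clay sausage shaped as a plain cylinder; the diameters, depth, and clay volume are invented values, and the function names are hypothetical.

```python
import math


def level_after_transfer(level, from_diameter, to_diameter):
    """Water level after pouring between two cylindrical glasses.

    Volume is conserved, so
    level * from_diameter**2 == new_level * to_diameter**2.
    """
    return level * (from_diameter / to_diameter) ** 2


def cylinder_volume(diameter, level):
    """Volume of water standing at the given level in a cylindrical glass."""
    return math.pi * (diameter / 2) ** 2 * level


def sphere_surface(volume):
    """Surface area of a sphere that has the given volume."""
    radius = (3 * volume / (4 * math.pi)) ** (1 / 3)
    return 4 * math.pi * radius ** 2


def sausage_surface(volume, radius):
    """Surface area of a solid cylinder with the given volume and radius."""
    length = volume / (math.pi * radius ** 2)
    return 2 * math.pi * radius ** 2 + 2 * math.pi * radius * length


# Invented glasses: 6 cm and 9 cm across, water 12 cm deep in the narrow one.
narrow_level = 12.0
wide_level = level_after_transfer(narrow_level, from_diameter=6.0, to_diameter=9.0)

# Invented clay ball of 100 cubic centimeters rolled into a sausage of radius 1.5 cm.
ball_area = sphere_surface(100.0)
sausage_area = sausage_surface(100.0, radius=1.5)
```

The level drops in the wide glass while the volume stays the same, and the sausage has more surface area than the ball although the amount of clay is unchanged.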
An empirical problem is that children do not learn to conserve all at once. There may be months separating their ability to conserve with the balls and to conserve with the water transfer. A more central problem is that the understanding of the transformations assumes a knowledge of the features of the substance. The substance is not compressible or expandable. Furthermore, the number of units does not change during the transformation. Unless these features are known, the children who conserve could not possibly perform the operation implied by compensation. Furthermore, if the children know that the units are fixed, they do not need to perform the compensatory operation, but simply remember, “If nothing is added or subtracted, the amount is the same, no matter how it looks.” If Piaget’s interpretation of development is correct, teaching the compensatory reasoning would be futile. In fact, Inhelder and Piaget (1958)
performed technically naive teaching experiments that showed learning did not occur. They concluded that the learning could not occur because the abstract operational foundation that occurs through development was not in place. Teaching Conservation of Substance. One of Engelmann’s (1965) earlier experiments challenged this conclusion. The goal of the experiment was not simply to show that conservation of substance could be taught, but to demonstrate that a knowledge of conservation could be induced in a relatively short period of time using practices that violated every requirement Piaget suggested was necessary. The goal of the training was to teach the children compensatory reasoning as an operation that applied to a broad range of examples. The training, however, was designed so that it (a) would not consume a great deal of time, (b) would not present examples of actual transformations, and (c) would not allow for children to manipulate objects. All three conditions were considered by Piaget to be essential for the emergence of conservation of substance. The assumption of the design was that if children could learn through a presentation that provided content information without developmental accoutrements, it would validate that the learning is strictly a function of information about content and operations involving content. The first step was to assess the children’s understanding of conservation, particularly the application of compensatory reasoning. Engelmann developed a 10-item instrument, the Conservation Inventory, which evaluated children’s understanding of the compensation argument as it applied to water transfer as well as other two- and three-dimensional applications. In addition to presenting actual water-transfer examples, the instrument presented children with a model that would permit them to create examples. Figure 16.1 shows the equipment. The handles were connected to strips that could be moved up and down in the two cut-out areas. 
FIG. 16.1. Assessment device used in Engelmann’s conservation experiment.
The cut-out areas were identified by the tester as juice glasses. The strip was juice. The tester demonstrated with both glasses that the strip could be moved up to make the glass full or down to make it empty. After the child practiced making the glasses full and empty, the tester presented four items related to water transfer. Three involved the model.
1. The tester filled the left glass to the half-full mark and emptied the right glass by moving the strip all the way down. Then the tester identified the left glass. “This is your glass. See how much juice is in your glass. If you poured all that juice into my glass over here, would I have just as much juice as you have?”
2. Next the child was directed to “Show me how my glass would look if you poured all the juice into my glass.” The child adjusted the right strip to the desired level.
3. The tester then presented another example with different wording. The left strip was moved to another level and the tester said, “Show me how my glass would look if it had just as much juice as your glass has.”
4. Two identical glasses were presented to the child, each filled to the same level. The water from one of the glasses was transferred to a wide glass. The child was asked if the wide glass had as much water as the narrow glass. The child was then asked how much water the empty narrow glass would have if the contents of the wide glass were returned to it.
The results show that the learning of conservation is not all or none in nature, but is the sum of component skills. Only 3 of the 30 children identified as nonconservers (based on performance on the Conservation Inventory and another instrument that tested knowledge of transformations) missed all the items. Two thirds of the nonconservers got two or more items correct. One third got Item 1 right, but missed Item 2. They also passed Item 3. In other words, they knew that the amount was the same, but they could not show what the same amount would look like on the model.
Fifteen of the children who were identified as nonconservers received training. The training was designed so that it met these criteria:
1. It required a short period of time: four sessions of 10 to 15 minutes each, presented in one school week. The total amount of training time was 54 minutes.
2. No examples of actual liquids or liquid transfer were presented.
3. No transformations of liquids were presented. (The model used for testing was not presented during the training.)
4. No three-dimensional objects of any type were presented.
5. The children were not permitted to manipulate any concrete objects, including the two-dimensional illustrations the teacher presented.
The only time the children saw an actual transformation of any kind during the training was in the first session, when the teacher demonstrated that a chalk-laden chalkboard eraser could make horizontal and vertical impressions on the board that looked different but were the same size.
As noted, this training was not designed to induce conservation of liquid amount, which had been demonstrated by various experiments to be teachable. Rather, the training was designed to teach fixed-unit reasoning by presenting static examples of solids. The strategy was simply to teach rules that functioned as content maps. The rules were framed broadly and applied verbally. The assumption of the training was that if the rules were verbally related to specific predicted outcomes, the children would be able to perform on tasks that involved water transfer and any other concrete application of compensatory reasoning.
Part of the instruction was designed to permit clear communication with the learners. This part worked on the verbal conventions involving more, less, and same. The instruction showed how more, less, and same could refer to different dimensions: size, color, number, and so forth. The reason for including these exercises was that when the liquid is transferred to the wide glass, nonconserving children always say that the wide glass has less. If they actually failed to apply the compensation argument (as Piaget indicated), about half of them would fix on the increased width and say that the wide glass has more. The children’s problem, therefore, is not that they fail to apply compensation reasoning, but that they do not know exactly what is less. Certainly the level is lower or less.
So the verbal convention part of the training showed that dimensions of objects could be more, less, or the same. The children were shown diagrams of pockets, large and small, with the same number of pennies in them. They practiced identifying whether the pockets were the same size and whether the pockets had the same number of pennies. They were shown tall rectangles and short rectangles with the same-sized colored area at the bottom. They identified whether the rectangles were the same size and whether the colored area was the same size. They verbally sorted circles that were large or small, black or white, according to two criteria. “I want to find the circle that is the same color and same size as this one. Is this circle the same color as that one? Is this circle the same size as that one? Is this circle the same color and same size as that one? (Why not?)” The children were taught that solids are composed of fixed units and shown how to apply the fixed-unit reasoning to water. This property of water is counterintuitive. Water coming out of a sprinkler seems to expand and
take up lots of space. Similarly, water seems to be perfectly flexible because it assumes the shape of whatever vessel it is in. Therefore, some nonconservers may fail to apply compensatory reasoning not because they cannot do it, but because they do not know they should apply it to water. The instructional strategy used in the experiment called for the teacher to refer to transformations although none occurred. For instance, the teacher presented drawings of various rectangles, some wider than tall, some taller than wide. The teacher presented a rule about the compensating dimensions. “When the box is tipped over, it is longer this way, and shorter this way.” The children then identified which of the tipped-over rectangles were the same size as the targeted rectangle. Another series of tasks stressed the fixed-unit aspect of a component. Figure 16.2 shows the outcome of one of the tasks. None of the rectangles was initially colored. The teacher presented this problem: “Let’s say that I have enough paint to paint all of this smaller box. I would have enough paint to paint eight squares. How many squares could I paint? Remember that. I’m going to start at the bottom of the other rectangle and paint the same amount. Count the squares and tell me when I run out of paint.” The teacher touched the squares as the children counted them. The teacher then colored the squares. Note that the teacher did not simply present one task of each type, but many of them, one after another, so that the children would see what is the same about them. For the task format, the teacher would draw a pile of six blocks, remind children that the rule will tell what it will look like when it is presented horizontally (longer this way and shorter this way), and then tell the children to indicate what the blocks would look like if they were in a row rather than a column. No assumption was made that the training was elegant. 
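The fixed-unit tasks described above reduce to counting unit squares. The Python sketch below shows the idea under invented dimensions; the function names are hypothetical, and the grid sizes are not taken from Figure 16.2.

```python
def tip_over(width, height):
    """Tipping a rectangle over swaps its width and height."""
    return height, width


def unit_count(width, height):
    """Number of unit squares in a rectangle drawn on a grid."""
    return width * height


def paint_from_bottom(grid_width, squares_of_paint):
    """Paint unit squares row by row from the bottom of a grid.

    Returns (full_rows, extra_squares_in_next_row).
    """
    return divmod(squares_of_paint, grid_width)


# Invented dimensions: a small box 2 wide and 4 tall holds 8 squares.
small_box = (2, 4)
paint = unit_count(*small_box)
rows, extra = paint_from_bottom(grid_width=4, squares_of_paint=paint)

# A column of six blocks laid on its side becomes a row of six blocks.
column_of_blocks = (1, 6)
row_of_blocks = tip_over(*column_of_blocks)
```

The same eight squares of paint fill two full rows of a rectangle that is four squares wide, and the six blocks remain six blocks whether they are stacked or laid in a row.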
FIG. 16.2. Fixed-unit aspects of forms.
If the instruction were not severely limited by the restrictions placed on it, the teacher could have shown how solids and continuous substances are the same by putting dowel rods in a pencil sharpener. The wider the dowel rod that is ground up, the bigger the pile that results. The results of the posttest confirm that the instruction induced a generalization that led to correct responses on the Conservation Inventory. Four
fifths of the children performed correctly on all problems involving the model. Two thirds passed all water-transfer items. None of the comparison children got either all the model problems correct or all the water-transfer items correct. The fact that the performance changes were induced through 54 minutes of instruction means simply that if the instruction is configured to present relevant features of the examples in a manner consistent with the generalization to be induced, the developmental transformation from preoperational performance to concrete-operational performance may be induced in 54 minutes, not in 4 to 6 months. The results also confirm that the operations the children learn are far from being content-free. Rather, they are based on the common features of the training examples. If the rule (it gets shorter this way and longer this way) is a common feature of the various examples of rectangles, it follows that any rectangular shape change would be governed by the same conservation rule. For the children to perform as they did, they would therefore have had to classify the glass and model as having the same features as the training examples. The sameness in the feature would prompt the system to apply the analysis presented during the training. (If they are rectangles, they follow the conservation rules for rectangles.) Relative Direction. The child who understands relative direction would be able to express the idea that X is south of Y, but north of Z. Piaget suggested that this ability does not emerge until children are around 9 years old. However, Engelmann taught relative direction to children much younger (4- and 5-year-olds). As with compensatory reasoning, the problem was not one of development, but of creating communications that teach the conventions of relative direction. The total teaching time required to induce relative direction was less than 1 hour distributed over seven sessions. 
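The idea that a direction belongs to a pair of places, not to a single place, can be sketched in a few lines of Python. The coordinate convention (x grows to the east, y grows to the north), the `direction_of` function, and the map positions are all assumptions made for this illustration.

```python
def direction_of(target, start):
    """Compass direction of `target` as seen from `start`.

    Points are (x, y) map coordinates with x increasing to the east and
    y increasing to the north (an assumed convention). Only the four main
    directions are reported; the larger offset decides.
    """
    dx = target[0] - start[0]
    dy = target[1] - start[1]
    if dx == 0 and dy == 0:
        return "same place"
    if abs(dy) >= abs(dx):
        return "north" if dy > 0 else "south"
    return "east" if dx > 0 else "west"


# Invented map positions for X, Y, and Z.
places = {"X": (0, 5), "Y": (0, 8), "Z": (0, 2)}
x_from_y = direction_of(places["X"], places["Y"])
x_from_z = direction_of(places["X"], places["Z"])
```

The same point X comes out south of Y and north of Z, so the answer depends on the starting point.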
First, the children were taught the conventions for the directions on a map (north at the top, south at the bottom, etc.). Next, they were shown how to identify the direction something moved. “Which side of the map is it going to? That’s the direction it’s moving.” Children learned the convention that if the targeted place is west of where someone is, the person would have to go west to get there. The children were then introduced to the notion of relative direction. They were taught the rule that if the teacher pointed to something on the map (a tree for instance) and asked, “Is this tree north, east, south, or west?”, they would answer, “It all depends on where you start.” The teacher would then point to various possible starting points, such as the barn, house, hill, and so forth, and ask, “What direction is the tree if I start at the house? So what do you know about the tree?” (The tree is ____ of the house.) After the children responded to many examples of this type (with different maps for each problem set), they were taught to make multiple directional statements about things on the map. The teacher would say, for instance, “Tell me about the house and the gate.” (The house is east of the gate.) “Tell me about the house and the rock.” (The house is north of the rock.) “Tell me about the house and the cow.” (The house is south of the cow.) For the final task, the teacher would present the map and say, for instance, “Tell me about the cow and the other things.” The children would refer to all the other objects on the map. “The cow is east of the car. The cow is north of the horse.” At this point, the children provided behavioral evidence that they understood relative direction.
Conservation Reasoning. In a fairly elaborate experiment, Engelmann (1971) showed the extent to which at-risk 5-year-olds could learn the various conservations that Piaget had mapped for different developmental stages. These included the conservation of substance, conservation of speed, conservation of weight, conservation of volume, and, finally, the understanding of specific gravity. Piaget assumed that specific gravity problems required formal operations. These are defined as operations that require the learner to formulate propositions about propositions. However, according to the current analysis, the problems were simply based on specific features of examples. As with the conservation-of-substance experiment, the instruction Engelmann provided to induce knowledge of these operations was greatly constrained so that the instruction violated everything Piaget asserted was necessary for these operations to emerge. No three-dimensional objects were presented, no transformations were shown, no manipulation of objects occurred, and no extensive amounts of time were involved. The outcome measures were the traditional Piagetian tests for the various conservations.
To demonstrate the extent to which the children had learned generalized operations, the test of specific gravity was expanded to include a container of mercury and two steel balls, one large and one small. These balls would be tested with water as the medium and then with mercury as the medium. The results show the types of developmental anomalies that could be created through instruction. One child passed the tests of volume, weight, and specific gravity (including the generalization to the steel balls in mercury), but failed the test of conservation of substance (which theoretically involves only concrete-operational reasoning). The child’s results occurred because he was absent on the two sessions that provided instruction relevant to substance. Perhaps the most curious anomaly was that none of the children passed the test of conservation of speed. (Conservation of speed is assumed to occur before the understanding of some of the other items in the developmental sequence.) The reason all children failed this test was
that it was impossible to demonstrate the features of speed without showing things in motion. Such demonstrations would violate the ground rules of the instruction. The children’s performance on specific gravity topped the list of developmental anomalies. The test presented various floating–sinking examples and asked children why they thought something sank or floated. Nearly all the children passed the test. The instruction was based on the idea that if children apply a verbal rule about floating and sinking to the test items, they will pass the items. The trick was to design the instruction so children would generalize the rule from the training examples to the test examples. The instruction had to be designed to induce extrapolation of the verbal examples that could be presented during training. They were taught the rule, “If something sinks, it is heavier than a piece of medium the same size. If it floats, it’s lighter than a piece of medium the same size.” The training examples were presented verbally and addressed floating and sinking in air, water, milk, gasoline, and melted rock. The children would apply the rule to each example: “Listen: A ball sinks in air. What do you know about the ball?” (It’s heavier than a piece of air the same size.) “Listen: The ball floats in water. What do you know about the ball?” (It’s lighter than a piece of water the same size.) Other problems gave information about the relative density of the medium. “Listen: A piece of gasoline weighs less than a lemon that is the same size. Is the lemon heavier or lighter than the gasoline? So what will the lemon do when you put it in the gasoline? How do you know it will sink?” The children also practiced identifying the medium and the object. They worked problems such as, “A cloud is lighter than the air around it. What’s the cloud going to do? How do you know?” They also worked problems that involved decomposing homogeneous substances.
“If two things are made of the same material and the bigger one sinks in a medium, what do you know about the other one?” The tester was a Piagetian researcher. While she was presenting the traditional Piagetian test of specific gravity to one of the children, the child changed her responses. The tester had asked the girl what she thought the candle would do in the water. (Sink.) She told the child that she was going to cut the candle into a large piece and a small piece. She asked, “What do you think the small piece will do?” (Sink.) “What do you think the large piece will do?” (Sink.) As the tester started to cut the candle, the child said, “It will float.” The tester stopped and said, “Now you’re saying it will float? What will the large piece do?” (Float.) “What will the small piece do?” (Float.) “First you said it would sink. Now you say it will float. What made you change your mind?” The child explained that as the tester was cutting the candle, a small piece
flew off and landed in the water. The child said, “See? It’s floating. If part of it is floating, the whole thing will float.”
The reactions of both the children and adults to the testing were revealing. One child was asked why one of the objects sank in water. The child started to give one of the classic Piagetian animistic explanations: “Well, there are things that pull it down and. . . .” He stopped and said, “No, it’s heavier than a piece of water the same size.”
For the test involving mercury, the children first held the two steel balls. They were asked what the smaller one would do when placed in the water. Most said it would sink. After seeing the small ball sink, all said that the larger ball would sink. When asked what the small ball would do in mercury, all said it would sink. When they saw it floating like a cork, their eyes became very large. When asked what the large ball would do, all said it would float. The graduate student who was outside the testing booth keeping data on responses later reported, “Oh, when Byron said the big ball would float, I was so sure that he was wrong. That small ball was heavy, but that big one was so heavy I was sure it would sink.”
Her response shows that even adults of above-average IQ do not necessarily have any generalized operational knowledge of specific gravity, merely knowledge that is stipulated to the specific features of objects and media familiar to them. The graduate student had a stipulated understanding of sinking and floating as it applied to water, but probably did not apply it to air and to liquids of different densities. For instance, some adults believe that cream is heavier than milk, although they know that it floats in milk. Many adults have trouble conceptualizing air masses floating or sinking in other air masses as a function of density.
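The verbal rule the children learned amounts to comparing the object with an equal-sized piece of the medium, which is a comparison of densities. The following is only an illustrative sketch of that comparison, not part of the original study. The function name `predict` and the table `DENSITY` are hypothetical, and the density values are rounded textbook approximations in grams per cubic centimeter.

```python
# Approximate densities in g/cm^3. These rounded values are assumptions
# supplied for illustration; they do not come from the study.
DENSITY = {
    "air": 0.0012,
    "water": 1.0,
    "mercury": 13.5,
    "steel": 7.8,
    "candle wax": 0.9,
}


def predict(object_material: str, medium: str) -> str:
    """Apply the verbal rule: an object sinks if it is heavier than a
    piece of the medium the same size, and floats if it is lighter."""
    obj = DENSITY[object_material]
    med = DENSITY[medium]
    if obj > med:
        return "sinks"
    if obj < med:
        return "floats"
    return "stays suspended"


if __name__ == "__main__":
    for material, medium in [
        ("steel", "water"),
        ("steel", "mercury"),
        ("candle wax", "water"),
    ]:
        print(material, "in", medium, "->", predict(material, medium))
```

The size of the object never enters the function. That is the point the graduate student missed: the large steel ball and the small steel ball give the same prediction in mercury, just as the large and small pieces of candle give the same prediction in water.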
A dramatic example of the extent to which knowledge of operations is greatly influenced by the specific applications that people have encountered occurred at a conference on Piaget and Measurement. Engelmann delivered the paper on the formal-operation study and then pointed out that there were probably many members of the audience (composed largely of Piagetian researchers) who did not have a completely generalized notion of conservation. Engelmann presented the following problem to illustrate the point. You start out with a container that is 100% whiskey and another, exactly the same size, that is 100% water. You remove exactly 1 tablespoon of water and place it in the container of whiskey. You then take exactly 1 tablespoon of the whiskey–water mixture and return it to the container of pure water. Has the percentage of one of the containers changed more than the other? Both started at 100%. Does one of them now deviate more than the other from 100% or have both been transformed by the same percentage? Over half of the participants incorrectly indicated that the deviations were not the same. Clearly, they did not have an intuitive understanding of
the information required by the problem. If they did, the answer would be as obvious as a question about a teeter-totter. If one end goes up 6 inches, does the other end go down 6 inches, more than 6 inches, or less than 6 inches? The problem involves nothing more than an understanding of fixed units. Think of each glass as containing 100 balls, one set white and the other red. If a spoon contains 10 balls, we move 10 red balls to the white-ball container. Then we return 10 balls. So long as we return 10 balls, it does not matter how many are red or how many are white. Both containers will deviate from the original percentages by exactly the same amount. If nine of the balls we return are red and one is white, one container will have 99 whites and one red; the other will have 99 reds and one white. If by chance we returned five whites, one container will have 95 reds; the other 95 whites.
Why does this problem seem complicated when it is basically as simple as the teeter-totter problem? We tend not to encounter problems of this type. Therefore, it lacks the sanction of being intuitively obvious, unlike the problem types that are familiar. We are like the youngsters in the experiment who have the verbal rules and agent repertoire needed to solve the problem, but who do not have endorsements from the infrasystem. The reasoning behind the rules the children learned did not derive from common features of observed examples. The rules were presented by the teacher. The infrasystem was not involved in the formation of the rule, so the infrasystem was not able to draw conclusions on the basis of sameness of feature and present conclusions to the agent. The agent had to figure it out. When we approach the water–whiskey problem, we must remind ourselves of the context: that there are fixed units of water and fixed units of whiskey.
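The ball version of the problem can be checked by direct simulation. The sketch below is an illustration under stated assumptions, not part of the original presentation: the function name `double_transfer`, its parameters, and the use of random mixing are all hypothetical choices. It moves a spoonful of red balls into the white container, mixes, returns a spoonful drawn at random, and then counts the foreign balls in each container.

```python
import random


def double_transfer(n_per_container=100, spoon=10, seed=None):
    """Simulate the two-way transfer with fixed units (balls).

    Returns a pair: (white balls now in the red container,
                     red balls now in the white container).
    """
    rng = random.Random(seed)
    red = ["red"] * n_per_container
    white = ["white"] * n_per_container

    # First transfer: one spoonful of red balls goes to the white container.
    moved = [red.pop() for _ in range(spoon)]
    white.extend(moved)

    # Mix, then return one spoonful of whatever balls are drawn.
    rng.shuffle(white)
    returned = [white.pop() for _ in range(spoon)]
    red.extend(returned)

    # Both containers are back to their original size.
    assert len(red) == n_per_container
    assert len(white) == n_per_container

    return red.count("white"), white.count("red")


if __name__ == "__main__":
    for trial in range(5):
        in_red, in_white = double_transfer(seed=trial)
        print("white in red container:", in_red,
              "| red in white container:", in_white)
```

However the returned spoonful happens to be composed, the two counts agree on every run. Each white ball that ends up in the red container takes the place of a red ball that stayed behind in the white container, which is the teeter-totter relationship stated in terms of fixed units.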
If we dealt with applications of double-transfer problems frequently, the infrasystem would basically sanction the rule and make the conclusions seem obvious to us. When the infrasystem accepts the prediction, the problem is no longer a series of agent-directed steps that lead to the answer. It is the answer, and the answer is obvious. When the first transfer occurs, the glass of whiskey gains one spoonful and the glass of water loses one. When one spoonful is returned to the water glass, the amounts are the same, and, therefore, the deviations are the same.1
1 At least one participant of the conference argued, and possibly correctly, that if these steps are involved, the outcome could not be predicted without formulating propositions about propositions; therefore, the problem is actually one of formal operations, not concrete operations. The problem is that if we apply this criterion to concrete operations as they were taught in the experiment, they too would require propositions about propositions. The rule is, “if it becomes longer this way and shorter this way, it is the same amount.” The application then becomes, “it became longer this way and shorter this way. Therefore, it is the same amount.” Yet if the net amount transferred is considered, the results are obvious. If two red are transferred and two white are returned, obviously the change of both is the same.
Critics of Engelmann’s experiment made much of the fact that the children’s responses were stereotypic and not like those of children who are truly at the stage of formal operations. This criticism misses the point. The idea was not to convert them into older children and induce all the verbal accoutrements of older children, but simply to accelerate the rate at which they learned something. Either it is possible to induce an understanding of specific gravity to younger children or it isn’t. If they respond correctly to the tasks presented and if the training did not present items that appeared on the test or even facsimiles of these items, their performance is not a function of development, but of what they have learned. The teaching implanted a form of the same content map that would be possessed, in some fashion, by anybody who possessed knowledge of specific gravity. Because the rule is a verbal representation, it may be expressed many different ways—for example, “If the density of A is greater than B, A will sink relative to B” or “The substance with the greater weight-to-volume ratio will sink relative to the other.” These and others could serve as a model for the content map that the system derives because all refer to the same variables and predict the same outcomes across a range of objects and media. Verbal Rules. The verbal rule is simply a short cut that obviates the need for what Piaget and others have suggested is necessary—trial-by-trial learning, manipulation of concrete objects, and slow sequence of successive approximations of the ultimate relationship. Learning from examples is necessary if we have no means of communicating with the learner. For humans, however, it is possible to establish communication that provides a model of a content map. 
The map will lack the intuitive features of one that is the product of learning from concrete examples; however, if it is applied to concrete examples, it will obtain the infrasystem’s sanction of predicting. It becomes “automatic.”
Alternative Operations. If conclusions of a particular type are possible only if a particular logical operation is applied, the learner who arrives at the conclusion employs some form of the necessary logic. If the conclusion may be arrived at through more than one operation, the learner who arrives at the conclusion performs one of the possible operations. Certainly, the child may apply compensation reasoning to solve problems of conservation of substance. Ironically, however, Piaget indicates that a prerequisite is the notion of atomism (fixed units). A contradiction results because, if the child had the notion of atomism, the child would not need the compensation reasoning (for the traditional tests) because the child could conclude that the amount is the same if nothing is added or subtracted. Furthermore, if an amount is added and the substance undergoes further transformation, the amount is the original value plus the amount added. If this reasoning is possible and permits the child to pass the traditional Piagetian tests of conservation of substance, some children will learn this reasoning.
Concrete to Abstract. The main difference between Piaget’s orientation and that of the current analysis is that Piaget assumes development proceeds from the concrete to the abstract. The current analysis holds that learning must be in the form of a rule that applies to a set of examples that are the same in some ways. The rule involves a relationship. The content of the rule is a product of the features of the examples. What Piaget considers to be concrete is highly abstract, based on abstracted features and often symbols that serve as features for predicting other features shared by a set of examples. The notion of “not” is not concrete. It resides in classification operations. Like red, it is represented in propositions or descriptions. The notion of “not” is an abstract relationship that applies to any combination of features, including virtually any relationship.
FUNCTIONAL PROPERTIES OF HUMANS
The human system is designed so that the agent participates greatly in the knowledge the infrasystem possesses. (Even the reflexive functions, like breathing and blinking, are shared by the agent and infrasystem.) The agent is able to create content maps, empower these maps with secondary sensation, consult a repertoire that shares access to great segments of the infrasystem’s records, and direct thoughts involving any features in the agent’s repertoire. In addition, the system is able to learn various language functions that permit the agent to both encode thoughts to provide others with specific content maps as well as decode symbols to create representations consistent with the specifications of the communication.
The only scheme that would permit learning of this magnitude and scope would have to be capable of learning patterns of patterns. For example, a tune is a pattern. The harmony for the tune is a related pattern. The test of whether the learner has learned the pattern of simple melodies would be to present a simple melody never heard by the learner. The melody would stop before the last phrase. The person who has learned the pattern of patterns is able to sing a range of possible endings that are consistent with the melody and reject those that are not consistent with the melody. The same sort of pattern-of-pattern awareness is evident with sentence completions. We present sentences of the form, “He wanted to go on the trip to Hawaii, but . . . ,” “She decided to exercise so that she . . . ,” or “At the mall, they went shopping for shoes, and. . . .” Responses of an experienced
user of the language are both logically consistent and syntactically or stylistically acceptable. Just as the set of melodic phrases that are consistent with the new melody have particular tone-pattern and timing features, the set of verbal phrases consistent with each sentence has particular content and form features. Pattern Identification The infrasystem is designed so that the learning is strongly biased to identify patterns. This bias is achieved by making the identification or execution of a pattern reinforcing to the agent. Earlier discussions pointed out that if the agent attends to changes in the surroundings, the agent probably observes the predictors of primary reinforcers. The reflexive response to things that change unpredictably is perhaps the single most important hardwired provision of organisms that learn. It is particularly important to humans for three reasons: 1. The human, more than other organisms, lacks hardwired scaffolds and default content maps that prompt specific behavioral strategies; therefore, the human needs a generic hardwired systemic strategy that applies to any sensory input—auditory, visual, tactual, proprioceptive, olfactory-gustatory, and any combination of these. That strategy is to attend to any changes. 2. The bulk of human learning involves patterns and transformations of patterns, so the system must be designed to observe the bases for patterns, which requires attention to sensory features over time. 3. If the agent is to attend to changes in sensory input, the system must be designed to provide reinforcement to the agent for attending. The most efficient systemic solution is to key internal reinforcement to the observation of unanticipated change and record events that occur before unanticipated changes. The net result of this attention provision is that the system reinforces the agent for identifying patterns. Lack of Hardwired Scaffolds. 
The infant has basic reflexes, such as the sucking reflex, that guarantee that the infant will nurse and a grasping reflex that simplifies learning of some motor responses. However, there is no evidence of the infant possessing any specific content knowledge. Observations of the infant disclose that it is unable to track objects visually, grasp objects, move body parts in planned ways, or even hold its head in an upright position. It does not know how to reach for an object without sometimes hitting itself in the face or sticking its finger in its eye.
Furthermore, the infant gives no indication of possessing preestablished knowledge about anything. It is not imprinted to its mother and possesses no program for speeding the learning of basic motor operations or the content of language. Consider that it takes the infant possibly 13 months to learn to stand, which is a fraction of what a mountain goat learns in less than 1 hour. Clearly, motor-response learning is highly unfamiliar learning for the human, with the first steps coming slowly and these serving as scaffolds for increments in skill. The trends in cognitive development parallel those of motor development. The child often possesses knowledge of only a handful of words by 18 months; however, the child may have been provided with hundreds of exposures to other words that have not been learned. If the child had heard particular words or phrases 10 times a day since birth, the child would have been exposed to 4,900 examples of these words or phrases. Yet they have not been learned. The human ultimately has a special capacity for learning language, but the early steps have the features of any highly unfamiliar learning—slow progress and performance that is completely accounted for by the range and frequency of examples presented to the learner. Attention. The problem facing the human system is how to ensure the learning of the various skills with no specific program or content in place. The solution involves two related modifications: 1. The human system initially overenhances the positive and negative secondary sensations of primary reinforcers. 2. The scope of the primary reinforcers is broadened so that the discovery or execution of any pattern is reinforcing. As a guarantee that the infant will learn basic relationships, states associated with primary reinforcers are not neutral or even moderate to the infant. Rather, they compel the infant’s attention to the exclusion of other details. When the baby is hungry, it is not a case of slight discomfort. 
It is a colossal pain. When the infant is engaged in trying to grasp a blue whale on the mobile, the infant is not easily distracted. The infant’s response to change is also important because it predicts what the learner will attend to and what it will learn. Sudden or unpredicted change is not neutral to the infant. It is either highly reinforcing or greatly punishing. A door slamming may result in crying. Holding the baby and then suddenly lowering it (without dropping it) creates a great startle response (negative). If somebody stares at the infant and then rolls her eyes or blinks rapidly, the change is not neutral. It produces laughter and sustained attention to the face of the blinker. Such exposures result in content being learned. The face was reinforcing to the child. The face was recorded. The face has the potential of becoming a secondary or generalized reinforcer. This learning is enabled by a general default content map that makes a broad range of change reinforcing or punishing, thereby commanding the learner’s attention.
Peripheral Reinforcers. When a pattern is discovered, the infrasystem presents the agent with the knowledge and reinforcement. The knowledge is the features of the pattern. The reinforcement is the product of producing or observing the pattern. The infant bangs his cup on the tray again and again with great gusto and apparent satisfaction. That the infant repeats this behavior is evidence that it is reinforcing. Likewise, the infant repeats utterances “namoo, namoo, namoo, namoo.” The toddler shows its awareness of vigorous musical patterns by clapping and dancing. Neither the dog nor the cat responds to music in this manner. They certainly may be taught to respond differentially to a pattern, but the pattern is not initially reinforcing. For the human, however, the pattern needs no additional reinforcement. Its discovery, recognition, and execution are reinforcing.
Predictions
The pattern performance of humans is a facet of the system’s focus on predictions. The human, like other organisms, has an abiding orientation to predict. The human system extends its investigations from strict temporal-order or enduring-feature relationships to any pattern. The reason is basically that patterns are predictable. The system is designed to discover predictable relationships. Therefore, the system learns patterns. For all animals, the inability to predict is relatively more punishing than the ability to predict. That is operationally why they learn things. The inability to predict tends to be more disturbing to the system than being able to predict even painful outcomes.
Animals that receive intermittent shocks on an unpredictable schedule will develop experimental neurosis, which is not a function of the number of shocks, but of the pattern—the unpredictability of when the next shock will occur (Foa, Steketee, & Rothbaum, 1989; Mineka & Kihlstrom, 1978). Even the self-abusive autistic child, who slaps herself in the face hundreds of times an hour, is operating from predictability. The evidence is that she does not respond the same way to her own slaps as those that are unanticipated. If she is suddenly slapped by somebody else, no harder than she slaps herself, her response indicates that a nonplanned slap is far more punishing than those that are planned. Her immediate response is a startled expression, followed by a cessation of her slapping behavior, then a slow resumption of her routine (Gannon, 1982).
Humor. Perhaps the most extreme example of patterns enhanced with secondary sensation is the human response to humor. The specific kind of humor that is responded to is learned and greatly influenced by mores; however, the basis for humor is not learned. The healthy human will respond to some events or verbal presentations with visible signs of delight. Many types of humor result from the convergence of inconsistent patterns—one that is predicted and the other that occurs. The learner has information about Features A–C. These strongly predict D, which is what the learner anticipates. However, A–C may also lead to X, which the learner discovers suddenly. In the slapstick variety of humor, somebody who is making a pompous exit slips on a banana peel and falls in an ungraceful manner. People laugh. They recognize the profile the victim wanted to present—that of a suave, superior individual. In contrast, they recognize the second pattern—that the person was so concerned with his demeanor that he did not watch where he was walking. He therefore revealed himself to be a bozo, a clumsy bumpkin, a putz. Ho, ho. In the semislapstick variation, the man in a department store sees a woman from behind, assumes that she is his wife, and pats her butt, only to discover that she is a perfect stranger. The features initially revealed to the man led to the conclusion that she will have a set of other features—those of the wife. When the true pattern is revealed, it becomes apparent that the set of features that predicted the wife also predicted this perfect stranger, who understandably is not in a very understanding mood. Although, at the time, the experience may not be even remotely humorous to the man or his wife, it may later take on humorous properties largely because it has the elements of a good tale. “You won’t believe what happened at the mall last month. Andy and I were. . . .” Many verbal jokes follow the same format. 
The information that sets up the punch line clearly implies one set of expectations. The punch line, however, shows that all the information provided earlier in the joke sets up a completely unanticipated outcome that is consistent with the information already provided. For example:
Third graders were telling about their fathers’ occupations. Johnny says, “My dad is an expert carpenter. He can build anything, especially cabinets. Last summer he won an award for this great cabinet that he built.”
Next the teacher calls on Jeanie: “My dad also got an award. It was for bravery. He’s a fireman, and he saved a little boy and his dog from a burning building. He’s a great dad who plays ball with us and we have a lot of fun whenever he’s home.”
Next the teacher says, “Phillip is next. Tell us, Phillip, what does your dad do?”
“My dad is dead. He died last year.”
“Oh, I’m terribly sorry. I didn’t know. That’s awful. But Phillip, maybe you could tell the class what your father did before he died.” “Yeah. He turned blue and shit all over the floor.” Clearly, the outcome is perfectly consistent with the information presented. The teacher asked a question and Phillip answered it. However, the pattern had already suggested that, even if something funny occurs, the joke will have to do with an occupation or award. Student 1 tells about an occupation and award. Student 2 tells about an occupation and award. We are caught off guard when Student 3 responds with a nonoccupational answer. So the formula for most forms of humor is that the listener anticipates one pattern but discovers another that is unanticipated, or the listener is presented with different patterns of expectations that two people have (some form of misunderstanding). The listener finds this interplay of inconsistent patterns highly reinforcing, resulting in a smile or laughter. When the infrasystem recognizes the second pattern or result of the second pattern, it enhances it with strong secondary sensation. For the agent, the punch line of a joke does not simply reveal a second pattern. The punch line reveals a second incongruous image presented to the agent as something that is reinforcing. Note, however, that the joke about the student’s father would not be funny to someone who has recently experienced loss of a loved one or acquaintance. The negative feelings associated with the category death in their infrasystem are so strong that the second pattern is recognized, but is enhanced with negative, not positive, secondary sensation. The joke is “too close to home.” To these agents, the joke would be described as a sick joke. A case may be made, however, that the discovery of all dual patterns is reinforcing—even those that involve punishing consequences. 
We are shocked to discover that our favorite character on a soap opera, the shy librarian, is a serial killer, but if the dual pattern has been reasonably set up, the insight is reinforcing. Even discovering that one’s accountant has been guilty of fraud is a shock, but also a relief because now a series of things that had occurred make sense. Parents of missing children who may have been killed are often relieved to have the murder confirmed. The pattern is confirmed. Participation of the Agent The human agent participates more extensively in the formulation of content maps than agents in any other species. The human agent creates peripheral reinforcers and extensively influences the infrasystem to enhance specific content with secondary sensation.
Thinking. A necessary requirement that may not be immediately obvious is that the agent is constantly thinking, which means that the agent’s attention is always focused on something. This feature is not limited to humans, but is shared by any organism capable of extensive learning. The thinking may take the form of attending to stimulus features and changes that are in the sensory present, planning a response strategy that will be executed in the current setting, or (for the more advanced systems) thinking about unique features of a setting that is not present. The assertion that the organism is constantly thinking is not as radical as it may initially seem to be. When we acknowledge that the system has sensors that provide information to an infrasystem and that this input is necessary if any response is to be planned before it is to be produced, we tacitly assume that thinking is occurring. Messages from these sensors are being continually presented to the agent. Some are enhanced with secondary sensations so they compel the attention of the agent. Stated differently, if the organism is capable of voluntary responses to produce changes in the setting, the agent is engaged in some form of thinking. The only conditions under which thinking does not occur would be those in which the agent is not conscious. Even sleep, however, does not guarantee an absence of thinking. The agent may still be active, and the infrasystem may still be presenting the agent with sensory data, although it has no external sensory basis. This is basically what occurs in a dream. Central and Peripheral Attention. Because thinking implies content, a further implication of the basic learning system is that the organism is always attending to some central content. This is an axiomatic requirement of thinking. If the agent is to plan, the agent must attend to some features of the current setting and ignore others. 
The attention process is an analogue to the human’s visual ability to focus on relatively specific details of the current visual presentation and observe others only peripherally. Role of the Agent. When the agent who has a painful finger thinks about the vacation, different patterns may occur. The agent may achieve sustained attention to details of the vacation or, more likely, the thoughts of the vacation would be interrupted by thoughts of the finger. The battle is between central and peripheral attention. The infrasystem is attempting to command central attention, but it does not succeed and is relegated to peripheral-attention status. The agent provides a stronger directive to think about the vacation. (The fact that the agent’s central attention is on thoughts about the vacation is prima facie evidence that the agent’s directive is relatively stronger than the reflexively issued feelings and corresponding demands of the primary reinforcer.)
Agent-Created Secondary Sensations. One of the most important properties of thoughts is that the secondary sensation of an event or feature will accompany a representation of the event or feature. When we think about something erotic, it is not merely cognition that is presented to the agent. It is cognition plus some degree of positive, enhanced sensation. When we think of something frightening—such as discovering a poisonous spider crawling over us as we lie in bed or standing on the edge of a towering cliff when the rock suddenly gives way—the secondary sensations that would accompany the actual event accompany the thought. This fact has enormous implications for the human system. Given that the agent is able to direct voluntary thought of virtually any content, and given that some content is reinforcing, the agent is able to exercise control over the secondary sensations that the infrasystem issues. To influence the system to issue positive secondary sensations, the agent thinks of content that creates a positive secondary sensation.2 This phenomenon may be used to create false features and influence the infrasystem to respond in unnatural ways. The false feature is actually a false reinforcer. For any false-reinforcer strategy to work, the agent must respond to specific events or thoughts of events in the same way the agent would respond to a plan that had the desired sensations. The learner may respond to a neutral event as if it has a feature that makes it exciting, important, distasteful, or frightening. Youngsters scare themselves by thinking of common events like opening the door in a dimly lit room as if they are potentially life threatening. The investor convinces himself that he should follow the advice of his broker and invest in something he finds risky. False-feature strategies are particularly important for situations in which the agent must do things that are opposed by the secondary sensations of the infrasystem. 
For instance, a self-imposed regimen of exercise designed to cure a painful back injury will be executed only if the performance of the exercises is more reinforcing or less punishing than nonperformance. The sanction for doing the exercises will not come from the infrasystem because there will be no positive reinforcement. The exercises will hurt, and the pain may persist after they are finished. Functionally, if the negative urgings from the infrasystem are stronger than the urgings from the agent, the exercises will not be performed. The sanction for doing the exercises must be accompanied by strong secondary sensations in the present setting. The only source of these is the agent.
2 This fact has been extensively documented through a variety of measures, including lie-detection mechanisms. The learner experiences changes in breathing, heart rate, galvanic skin reactions, and other responses that are fairly reliable concomitants of the actual experience being thought about. They imply that the thought carries with it the emotional charge of the experience.
The process involves two steps: (a) specifying the content, and (b)
influencing the infrasystem to treat this content as if it were more reinforcing or less punishing than it actually is (based on the information the infrasystem receives). The content map for scheduling the exercise is necessary, but it must be fortified with secondary sensation. This fortification is recognized as will, which is functionally the agent’s ability to influence the infrasystem. The agent has three primary options for creating these contrary secondary sensations that overcome the messages from the infrasystem: 1. The agent may think about the exercise as if it is a positive reinforcer rather than a punisher. 2. The agent may think about the exercise as something that is negative but that predicts future positive reinforcement. 3. The agent may think about not doing the exercises as a very strong punisher. If the agent directs responses that are the same as those that predict an actual reinforcer, the infrasystem will respond in the same way it would respond to something that predicts an actual reinforcing consequence. The infrasystem will record what happened and empower the content map for exercising with secondary sensations that are strong enough to sanction the performance of the exercise. The result is that when the time of day for exercising occurs, the infrasystem will tend not only to remind the agent of the content, but present the sense of urgency needed for the agent to face the aversion that follows. Convincing the infrasystem that the exercises are positive logically requires some form of dialogue (or, more accurately, a monologue) between agent and infrasystem. The agent, in effect, must tell itself why exercising is more important than not exercising. It must respond with emotion—determination, good thoughts, or whatever works. The agent may use various strategies to convince the infrasystem that the exercises are positive. For example: (a) “Only the strong are able to do them. They’re a challenge. 
And you are good at meeting challenges”; (b) “The fact that they hurt means that you’re doing the right thing. The more they hurt, the more they tell you that you’re doing the right thing”; or (c) “You can get high on that pain. Start slow, and once the endorphins kick in, you’ll like it.” To convince the infrasystem that the exercises predict positive outcomes, the agent could use the following ploys: (a) “No pain, no gain. Stick with it and you’ll see how much better you feel doing other things. You’ll see how much easier the exercises are in a week”; (b) “Dr. Jerrit says that this is the only way to fix the problem. He’s one of the premier sport medicine doctors, so he knows. I won’t get the benefits unless I do this program the right
way—all out”; or (c) “Think of how amazed everybody’s going to be when you walk into the room without hunching or limping—particularly Heidi and her negative bullshit: ‘Oh don’t do that kind of exercise. You should rest.’ I’ll show you who should rest.” Some examples of convincing the infrasystem that not following the content map is more punishing than following it may also take many forms: (a) “If you want to show that you’re tough, you’ll do it. If you don’t do it, you’re showing yourself to be a wimp, a blowhard, a talker not a doer”; (b) “You know you’ll hate yourself a lot if you don’t do it. So do it”; or (c) “You are basically lazy and you know it. So you’re just going to take hold of yourself and do what you have to do.” Obviously, the learner may use a combination of techniques to influence the infrasystem. For the techniques to be successful, the agent would have to interpret sensory data in a way that relates to whatever false-reinforcer strategy the agent uses. Ultimately, the infrasystem will not respond to the exercises as if they are indicators of positive outcomes unless the agent responds in a consistent manner. If the back hurts, it may be interpreted as a good sign. If the exercises on one day are not as aversive as they have been, that is another good sign. In summary, the human agent has the capacity to influence the infrasystem to do things that are not supported by any content map or supported by any apparent reinforcing consequences. The agent must supply the reinforcing consequences. This effort requires agent-directed emotional responses to what occurs. From the standpoint of development, this aspect of the human potential is learned. The fact that it is learned raises the question of the extent to which the agent uses language when communicating with the infrasystem. A strong case may be made that the agent does use language and basically talks to itself or the functional equivalent. The language code is in the repertoire. 
The various lexical items in the repertoire are charged with secondary sensation. Therefore, they present the most efficient way to express relationships or content that the agent wishes to convey. Simultaneously, they convey the secondary sensation that the agent experiences during the communication. If the agent always experiences a positive sensation that is associated with the true resolve of an item, the infrasystem will tend to produce the secondary sensation that accompanies instances of true resolve when it is time to exercise or when the agent thinks about exercise. This format would apply to all humans. If the learner did not have a public language of any sort, the scope of the agent’s influence over the infrasystem would be limited in terms of content. However, the human agent would certainly have some system of representing various features and patterns, and the agent most certainly would be able to communicate with the infrasystem. The agent of a deaf-blind person might not be able to
argue about the importance that Dr. Jerrit places on doing something if this argument could not be expressed by the agent, but the agent would be able to represent what it wants to do and could formulate content maps and directives fortified with secondary sensation.

In any case, the young human agent must learn to negotiate with the infrasystem because some form is needed for any self-control and long-range goals that are opposed to the hardwired responses of the infrasystem. The manipulation of the infrasystem would have to occur if the child were to engage in any activities that involve deferred gratification not clearly associated with immediate primary reinforcers.3 If there are bruises and tears associated with learning to play football, the youngster’s agent must convince the infrasystem that these negatives are more than offset by the positives. The events that follow, however, may dissuade the agent from assuming that the goals it anticipated are realistic, in which case the learner would tend to avoid, not relish, football.

3 The extent to which organisms other than humans have this ability is difficult to determine. However, one example suggests this ability in pigs. One of the authors had a recalcitrant pig who continually escaped its electrified fence enclosure. The pig would back away from the fence, begin squealing, and then proceed to run through the electric fence. The squealing behavior was not present at other times when the pig ran in its enclosure.

Role Learning

Roles are discussed more thoroughly in chapter 17. However, role learning is a central component of human cognitive development. The infrasystem is designed to promote role performance, role adoption, and role rejection. Children imitate, but imitation is not to be confused with role performance. A child imitates, mocks, or repeats what the model presents. If the child transports the content into situations that have features observed in the settings in which the model performed, the child is not imitating, but is engaged in executing a role. The example of the child who scolded the other car and labeled it a “Cinnamon bitch” illustrates the difference between imitation and role learning. The child did not imitate because there was nothing to imitate. Rather, the child acted in the same way that the father acted and did so in response to particular features of the current setting shared by the original setting (car cutting in front, sudden braking, fear). The acting involved behaving the way the child assumed that the father would act in this situation: angry demeanor, tone, and verbal content. In fundamental ways, the child was playing father.

Another example of the difference between role performance and imitation is the parallel between learning to write what Ernest Hemingway wrote
and learning to write like Ernest Hemingway wrote. With knowledge of how Hemingway thinks, the learner’s goal is not to reorganize phrases that Hemingway wrote, but in effect to achieve the goal of being Hemingway— thinking the way he does, sharing his priorities so that it is possible to extrapolate the essence of Hemingway to a new setting and express things the way he would. During basic role learning, the agent may have no knowledge of copying or adopting some behavioral components. Rather, what is presented through the model seems the natural way to do it. If the model speaks with particular emphasis or inflectional patterns (using a high voice for questions of a particular type), this pattern seems to be the right way to speak. In the broadest context of positive models, the learner is oriented toward the future. The learner wants “to be like Mike” or other models. Peer models are important, but not all peers are models. In fact, when the learner interacts with those who know less than the learner does, the learner is the model. When the learner interacts with those who exhibit skills that the learner does not have, the learner is in the mode of learning from a model. Because the learner engages in many skill areas, the learner has many models and serves as models in various contexts. The basic logic of much role learning involves false features. The system classifies the model as positive. Therefore, all features of the model are enhanced with positive secondary sensation. If the model is on a particular team, that team is positive. If the model has a unique style of walking, that style is made attractive to the agent. Advertising schemes are based on the logic of using positive models to create false features. If you want to model yourself after Person X, you want to do the things that X does and like the things that X likes. The commercial shows that X loves Booboo Lipid chips. 
Learners who try to model themselves after X may be influenced to buy these chips.
SUMMARY

The human learning-performance system has hardwired provisions that increase the probability that appropriate or adaptive behavior will result. The system is designed so that there are no default content maps that present content, merely broad-based responses to patterns and unpredicted changes in the sensory present. The system is designed so that patterns are enhanced with secondary sensations. The pleasure–pain dimension of reinforcers is also exaggerated to promote attention to details of the surroundings. When something moves unexpectedly, it is not merely interesting to the infant. The system responds to it in a way that makes it frightening or entertaining.
Through this provision, the learner is provided with a peripheral reinforcement system that basically ensures that the child will be curious and will learn a full range of relationships. This feature is shared with other mammals, but what is learned by them is simply more proximal than the range of what the human learns. Much cognitive matter that the learner learns occurs through language. Language is designed so that it interfaces closely with the learner’s infrasystem and agent repertoire. The system classifies by features; the language has symbols and conventions for referring to features. The features that the learner learns have to do with hosts and their various residential features, many of which are relationships. The language has provisions for expressing all these features. Different schemes of development have suggested that learning is subsumed by development. In a sense, this assumption is true because what the learner is able to be taught is logically limited by what the learner already knows. In another sense, the position is self-contradictory. It assumes that the learner learns logical operations involving specific content in a manner that is different from the way in which the learner learns to identify dogs. Certainly the content is different, but the functions are the same. All learning is based on shared features. Some experiments have specifically addressed the necessity of assumed developmental influence on learning (manipulating concrete objects, spending a great deal of time, attaining a certain age range, etc.). These experiments have demonstrated that sophisticated cognitive operations may be induced through interventions that violate all the provisions that the developmental position assumes to be necessary. The implication of these experiments is that, just as the learner is not provided with preknowledge of grammar, the content learned about logical operations is strictly a function of information the system receives. 
Learners much younger than those who learn through uncontrolled developmental exposures may be taught complex logical operations through instruction that provides them with a model of a content map they are to learn. If this model is designed appropriately, it specifies the features and relationships that anyone who learns the operation would have to possess. The training provides practice with verbal examples that present symbolic facsimiles of the relevant features of the concrete examples. If the communication and application of the verbal content map is effective, the learner will later identify the relevant features and relationships of the concrete examples indicated by the content map. The content map will not have the sanctions of secondary sensation that maps created inductively (from specific events) have. Yet if the learner applies the verbally induced map to concrete settings and the map proves to be predictive, the infrasystem will enhance it and treat it in the same way it would treat maps that derived from concrete examples.
Because the human organism has the potential to learn about any pattern that exists, the learner learns how to create content maps that are supported by false reinforcers, which influence the infrasystem. The communication from agent to infrasystem is most efficiently conducted through what amounts to language. The reason is that the scope of content communicated is potentially greater with language. The agent creates the content and responds to it as it would respond to something that actually has particular reinforcing properties. The infrasystem is deceived into assuming that the fortified content predicts reinforcement. With the system’s hardwired bias toward pattern identification, the learner is predisposed to learn roles based on models that interact with the learner. The learner is future oriented. The learner wants to become more like people who the learner recognizes as superior, admirable, or capable. Role learning creates false features. If the model is valued because she is nurturing, the infrasystem sensitizes all features of the model so that a wide range of the model’s behavior tends to become the standard for the learner. Perhaps the most prominent feature of the human infrasystem that sets it apart from those of other animals is the extent to which the agent is able to participate in actually modifying what the infrasystem does in the presence of specific stimuli. The stimuli may be physically painful, but the agent may modify them to have reinforcing properties. This agent power does not suggest that the infrasystem changes its basic functions. Rather, the agent creates both some form of sensory input and accompanying secondary sensations that the infrasystem records and further enhances with secondary sensation. Stated simply, if the agent takes the steps necessary to think positively about something, the system will support the agent with enhanced positive sensations that accompany thoughts or plans that refer to this target.
Chapter 17

The Logic of Instruction
The goal of formal instruction is to change behavior in specific ways. The behavior is the indicator of what the learner has learned. Just as we are able to infer the content map of hardwired organisms and of those who have learned outside an instructional setting, we are able to identify what learners have learned from their instruction. If the instruction is effective, the learner’s performance is consistent with the content map the learner would need to perform on the full range of examples beyond those presented during training. If the instruction is ineffective, the learner’s performance implies a content map other than the one the learner would have if the content were mastered. The enterprise of instructing the learner may therefore be conceived of as a process of implanting a specific content map or cluster of related maps.

It follows that the efficiency of the instruction would weigh heavily in determining its desirability. The question of efficacy is an empirical one. If one instructional sequence is able to induce the desired content map in 20 hours of instruction and another requires 80 hours to achieve the same performance, the more efficient one is desirable, given that it does not generate any adverse side effects. If the learners who go through the faster program tend to be nervous, for example, the program has an undesirable side effect.

Effectiveness is also measured by the range of learners taught within a particular time span. If the program teaches only two thirds of the students who exhibit the skills necessary to enter the program, the program is less effective than one that teaches seven eighths of the qualified students in the same period of time. Although the ultimate determination of effectiveness is empirical, the program’s potential for being effective may largely be determined analytically.

The fundamental assumption of instruction is that the learner will learn a content map that is consistent with the information presented during instruction. If the instructional communications are consistent with the intended content, but also with unintended maps, the learner may learn an unintended interpretation. For instance, if we present a red bottle that is held in the hand and tell the naive learner “This is gluck,” the communication generates more than one possible meaning. Gluck could mean something held in the hand, a bottle held in the hand, a bottle, a red thing, and so forth. If we present bottles that are identical except for one feature (such as redness) and identify only the red bottle as gluck, the only difference is logically the only basis for one being identified as gluck. Therefore, the communication is consistent with a single interpretation with respect to the feature of redness. Gluck means red, at least in some range of applications.

To show the range of hosts that accommodate gluck, the program next identifies a range of red objects that differ in many features as “gluck”: red can, red dress, red house, red bird, red triangle. By presenting these objects, we imply through interpolation and extrapolation that gluck refers to any host. This conclusion is logical. If we arranged these items on a continuum of variation, the red triangle might be on one end and the red house or red bird on the other. Any steps that would be involved in going from one end of the continuum to the other describe an interpolated example. About the only shared feature these positive examples have is redness. So redness would be the basis for any interpolated value being positive.
The range of variation implies that things outside the range, such as a red sunset, are also gluck.

This labored demonstration would generate one and only one content map, but it would not be greatly efficient for any but the very naive learner (such as the person who is able to see things for the first time). The demonstration could be abbreviated to achieve efficiency, but it would still have to address the three main questions:

1. What is the feature, set of features, or relationship unique to all positive examples?
2. What is the range in variation of positive values?
3. What is the range of hosts that accommodate the positive examples?

The communication techniques available to forge this communication primarily show differences and samenesses. In the strictest instructional
terms, the difference between positives and negatives is demonstrated by presenting juxtaposed examples that are minimally different and treating them differently (labeling one positive and the minimally different one negative). The sameness is demonstrated by juxtaposing positive examples that are greatly different and treating them in the same way. The sameness principle applies to both the positive variation within a given host (2 above) and positive variation across hosts (3 above).
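The logic of the gluck demonstration can be sketched in a few lines of Python. This is an illustration only: the feature names, the example sets, and the function `consistent_features` are assumptions introduced here, not part of the authors' analysis. Each example is treated as a set of features, and a feature remains a candidate meaning only if every positive example has it and no negative example has it.

```python
def consistent_features(examples):
    """Return the features that could define the label.

    A feature is a candidate if every positive example has it
    and no negative example has it. (Hypothetical helper.)
    """
    positives = [features for features, label in examples if label]
    negatives = [features for features, label in examples if not label]
    if not positives:
        return set()
    # Sameness: keep only the features shared by all positive examples.
    candidates = set.intersection(*positives)
    # Difference: drop any feature that also occurs in a negative example.
    for features in negatives:
        candidates -= features
    return candidates


# One positive example alone is consistent with several meanings.
single = [({"red", "bottle", "held in hand"}, True)]

# A minimally different negative example isolates redness.
with_negative = single + [({"green", "bottle", "held in hand"}, False)]

# Greatly different positive examples show the range of hosts.
across_hosts = [
    ({"red", "can"}, True),
    ({"red", "dress"}, True),
    ({"red", "house"}, True),
    ({"red", "triangle"}, True),
]

print(sorted(consistent_features(single)))
print(sorted(consistent_features(with_negative)))
print(sorted(consistent_features(across_hosts)))
```

In this sketch the single red bottle leaves three candidate meanings, while either the minimally different negative or the widely varied positives leave only redness.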
LANGUAGE

To induce a content map that is consistent with a single interpretation, the presentation must prepare the learner for dealing with a range of concrete or specific examples. This requirement does not mean that the examples must be presented physically. If the learner understands all the components of the language that would be required to describe the positive feature, the range of positive variation, and the range of hosts, not only is language possible, it is preferable because it is more efficient. Unless we use some sort of symbol system, it is improbable that we could teach the learner about neutral hosts that are not something else. For instance, we present a ball that is on the floor and ask, “Is this ball on the table?” Without symbols of some sort, how do we call the learner’s attention to the negative feature (on the table) rather than the positive (on the floor)? Therefore, the concrete presentation sans words is ambiguous because, to create an example of something that is not on the table, we create an example of something that is in some other spatial relationship: on the floor, in the oven, over the table.

If language is used to construct content maps, the learning process may be greatly accelerated because the presentation names the relevant hosts and describes the relevant features. This aspect reduces noise and presents a more direct connection between the presentation and representation the learner is to formulate. The language describes the representation the learner is to formulate. It does not describe irrelevant features of hosts.

Teaching the Necessary Language Components

It is possible to construct language-based content maps that orient the learner to the relevant features of the presentation if the learner understands the language components of our presentation. If the learner does not understand one or more of the units in the verbal presentation, the presentation is logically compromised. For instance, we present the rule,
“the medium and the object must be ala tuscun. The medium can’t be gupified and the object can’t be gupified.” If we presented examples of this rule, the learner might learn the meaning of ala tuscun and what it is to be gupified. However, the presentation is logically not efficient because the purpose of applying the rule to examples is to show that the verbal representations predict actual outcomes. If the learner is incapable of accommodating the verbal representations, the learner cannot clearly represent the content. The only efficient process would be to identify the verbal components used in the rule and teach them in a straightforward way before introducing the rule. The efficiency of this direct-instruction approach is realized in several ways. The teaching of the components that the learner does not know is logically simpler outside the context of the rule. The reason is that if we are presenting examples of the rule and trying to teach the meaning of ala tuscun at the same time, there is much more interference between the presentation of examples than there would be if we just presented examples of ala tuscun. Here is how the combined teaching of the rule and the additional teaching of the word meanings might appear to the naive learner. The teacher presents a can filled with tuna fish, an empty can of the same size, and a fish tank of water. “This (points in the fish tank) is the medium, and these cans are the objects. Are the empty can and the other can ala tuscun? Yes, they are. Are the empty can and the fish tank ala tuscun? No, you can see that one is gupified. Which one is gupified, Tina?” “Well, look at them and you can see that the tank is really gupified. See how gupified it is?” “Look at the two cans. Is the empty can gupified? No, it isn’t. Is the other can gupified? Of course not. So the two cans show how we could get a medium and an object that are ala tuscun. Jimmy, what could we do? Anybody else? 
Doesn’t anyone know?” “All we have to do is fill the empty can with water. Watch. . . . Now we have a medium and an object that are ala tuscun. Is one of them gupified? No, Alex, one of them is not gupified. Open your eyes and pay attention.”

You may have figured out what the terms mean. If not, reread the explanation and substitute “the same size” for ala tuscun and “bigger” for gupified. The teacher assumed that if something were really bigger than something else, all one needs to do is look at them to determine the difference. That is true, but if the learner does not know what the word means, the learner may not attend to the feature that the teacher is trying to name. One is bigger than the other, one is filled with water, one is a rectangular prism, one is transparent, and so forth. Any of these feature differences could be candidates for the learner’s attention.
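The ambiguity in this exchange can be made concrete with a small Python sketch. The feature descriptions of the tank and the cans, and the helper `differing_features`, are hypothetical and chosen here only for illustration. The helper lists every feature on which two objects differ; each listed feature is a possible meaning of the unknown word.

```python
def differing_features(a, b):
    """Return the sorted names of the features on which a and b differ."""
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Assumed descriptions of the objects in the teacher's demonstration.
fish_tank = {"size": "big", "contents": "water",
             "shape": "rectangular prism", "transparent": True}
empty_can = {"size": "small", "contents": "nothing",
             "shape": "cylinder", "transparent": False}

# A minimally different pair: two cans that differ only in size.
small_can = {"size": "small", "contents": "nothing",
             "shape": "cylinder", "transparent": False}
big_can = {"size": "big", "contents": "nothing",
           "shape": "cylinder", "transparent": False}

print(differing_features(fish_tank, empty_can))
print(differing_features(big_can, small_can))
```

Under these assumed descriptions, the tank and the can differ on four features, so the word could name any of them, whereas the two cans differ on size alone.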
Basic Features. Any component that refers to basic features must be taught by presenting actual examples. A basic feature is something that cannot be taught using words alone because the student does not understand words for the feature. If we were to teach bigger, we would show examples that are bigger than others or examples that get bigger than they were. If the learner does not know what we mean by greater or larger, we cannot use any of these to teach bigger. For instance, we cannot say that something is bigger if it is larger. To the learner that would be analogous to, “If something is gupified, it is crupnurd.” We must present actual examples and label the feature.

To make the teaching efficient, we follow three basic principles: (a) Teach one thing at a time, (b) arrange the setup so that examples differ in only one feature, and (c) arrange the presentation details so that examples may be presented quickly. If a sequence involves more than one feature-name pairing, the presentation requires the learner to shift attention from one feature set to the next. This practice is not efficient when the learner is attempting to learn which features are correlated with which words. If the positive and negative examples differ in only one feature, the learner receives precise evidence of the only possible feature that makes the example positive or negative. If examples are presented quickly, the relationship may be taught in less time, thereby making the presentation relatively more efficient. Also the learner will be in a better position to remember previous examples. If the learner responds to a positive example and a pause of 1 minute follows before the next example is presented, the learner may not remember either the response or features of the example. If only 3 seconds elapse between examples, the learner is far more likely to attend to the relationship.

TEACHING RULES

Once all the verbal components of the rule have been taught, the rule is taught.
The learning of the rule involves three main steps: (a) Saying the rule, (b) applying the rule to verbal examples, and (c) applying the rule to concrete examples. This order is radically different from that of traditional instruction. Traditional instruction would follow the developmental approach, which starts with the learner experimenting with concrete examples. Next, through verbal interchanges, the learner would formulate something of a content map that applied at least to the examples presented. Possibly much later, the learner would be exposed to the rule that describes the variables that control the full range of applications. The rule-to-application sequence is far more articulate because the full range of examples to which the rule applies may be stated clearly and transmitted unambiguously to the learner without introducing time-consuming
and possibly ambiguous experiences that require the learner to revise inadequate content maps. For example, if the relationship of floating and sinking is taught through a rule, all media and objects are implied. If the learner understands the required vocabulary, a form of the full range of examples may be transmitted to the learner. We could talk about the floating and sinking of blobs of something in lava, or of blobs of lava floating or sinking in another medium of different density. We are able to portray possibly 50 examples in less time than it would require to present one actual experiment involving a fish tank and a pair of cans. We are able to use language to portray examples that the child may never see, but is certainly able to represent.

Merely because we do not necessarily use hands-on demonstrations as the basis for instruction does not imply that the children never see concrete applications of the rule. In the experiments described in chapter 16, Engelmann omitted such demonstrations only to make the point that they were not necessary. However, they are certainly desirable and ensure that the children are referring to general words like medium to expand the scope of their practice. The concrete examples, however, are best presented after the children have learned the content map. The reason is that if the children are required to use the same rule to predict both verbal and actual examples, the children see that the same rule applies to all the examples. They also see that they are able to predict outcomes by referring to the rule, and that the outcomes that occur with concrete examples are predicted by the verbal examples. This means that the content map receives the endorsement of the infrasystem. The rule predicts.

If the instruction has been effective, demonstrations of concrete examples are reinforcers because the children will certainly be able to answer the teacher’s questions, predict the outcome, and explain why the outcome occurred.
A sample demonstration might be presented as follows. Okay, we’re going to see some actual floating and sinking. Tell me, what’s the medium that we’re going to use? Yes, water. Good. We have steel balls, cans, rocks and a couple of mystery objects (which are made of Styrofoam). Pass these items around and get an idea of whether you think they’ll float or sink. If something sinks in the medium, what do you know about the thing that sinks? If two things are made of the same material and one of them floats, what will the other one do? Here we go. First the little steel ball. Watch. What did it do? So which is more dense, the ball or the medium? What will the other ball do? Watch. Were you right? Good job.
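The rule behind this demonstration lends itself to a short worked sketch in Python. The function name `predict` and the density values are assumptions made here for illustration (approximate figures in kg/m^3, not taken from the text). The sketch applies one rule, comparing the density of the object with the density of the medium, to each object in turn.

```python
# Approximate densities in kg/m^3; illustrative assumptions, not from the text.
DENSITY = {
    "water": 1000.0,
    "steel ball": 7850.0,
    "rock": 2700.0,
    "Styrofoam": 50.0,
}


def predict(obj, medium, density=DENSITY):
    """Apply the rule: less dense than the medium floats, more dense sinks."""
    d_obj, d_medium = density[obj], density[medium]
    if d_obj < d_medium:
        return "floats"
    if d_obj > d_medium:
        return "sinks"
    # Equal densities are a boundary case added here for completeness.
    return "stays suspended"


for obj in ("steel ball", "rock", "Styrofoam"):
    print(f"{obj} in water: {predict(obj, 'water')}")
```

Because the rule refers only to relative density, the same function would cover any other object and medium once their densities are supplied.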
To convince the learners’ infrasystem that the rule they have learned applies to all objects and media, we follow the sameness principle: juxtapose examples that are greatly different and treat them in the same way.
In accordance with this principle, the teacher next presents an inflated balloon. “Here’s an object in a medium. What’s the object? What’s the medium? Yes, air. Watch what happens. Did the balloon float or sink? So tell me about the density of the object and the medium.” (The balloon is more dense than air.) Next, the teacher presents an identical balloon. “This balloon does not have the same material in it that the other balloon had. It’s filled with a gas called helium. Watch. What did the balloon do?” (Float.) “Yes, it floated up to the ceiling. So tell me about the density of the helium balloon and the medium.” (The helium balloon is less dense than air.)

If the teacher reviews or applies the rule from time to time, the content maps that the teacher induced should function as well as those achieved through procedures that require great amounts of time and that tend not to work well with lower performers.

Form Versus Function

Traditional approaches do not perform logical analyses to determine efficient ways to induce and apply the content to be learned. Rather, they are based on the misunderstanding that learning proceeds from concrete to abstract, from proximal to remote. This approach places heavy emphasis on play, experimentation, and manipulation (implied by Piaget’s explanation of cognitive development). Children are seen as somehow internalizing their actions into representations. Certainly, this progression would occur in a natural setting because it is the only progression that could occur. The purpose of instruction is not to re-create the form that occurs when there is no instruction or minimal instruction. It is to re-create the functions (the desired learning outcomes) with the most efficient forms available. This pursuit requires viewing the problems of instruction logically, not phenomenologically. Application of a concrete-to-abstract maxim results in poorly designed programs.
For instance, social studies in the elementary school is often presented as a progression from the immediate to the remote. First the children learn about the familiar: facts and relationships that occur in the neighborhood. Later, they learn facts and relationships that occur in larger geographic areas. Finally, they may learn some general rules or principles that explain some of the common features shared by phenomena in all geographic areas. The approach induces serious misunderstanding because it stipulates and assumes that the children will extrapolate later. This tends to be unlikely.

Let’s say that children derive some relationships about how things are done in the neighborhood: rules or laws, services, systems of transportation, and so forth. Obviously, there is nothing wrong with learning this information. If the children are expected to formulate general principles that apply to any community in the world or facilitate such learning, however, the generalization is unlikely because any rule the child formulates would be limited to what the child knows. The problem is parallel to a presentation that presents three green examples and then expects the learner to identify the range of positive variation (see chap. 10). If the child does not know about other systems and remote places, the chances are extremely small that the learner would be able to identify what is the same about their community and others. Relearning would therefore be required at each step.

A more efficient approach would be one that induced a content map that addressed the full range of variation. The map would then be applied not only to the proximal setting, but to remote settings as well. This progression would follow the same format used for teaching specific gravity. The only difference is that there is more than one rule for the social studies foundation. The umbrella rule is that all people have needs. The next rule might be all people need shelter. We would make sure that the children knew what all people and shelter mean before presenting the rule or as part of the introduction. Then we would make sure that the children could recite the rule and apply it to a range of verbal examples.

Do all people need the same kind of shelter? (No.) How do we know if something is shelter? Yes, it protects from the weather and maybe from dangerous animals. If some people don’t have any wood to make the kind of houses we live in, what could they use to make a shelter? Do you think somebody could use ice or snow to make a shelter? Do you think somebody could use grass to make a shelter?
After the children have considered the problem in the broad sense, actual examples would be presented. “Look at this picture. What is this shelter made of? Yes, blocks of hard snow. Look at this picture. I’ll bet you don’t even see the shelter in this picture. What are they? Yes, those caves are the shelter. Pretty smart.” Following this work, the children could identify sameness of shelter features. Children could also tell why some shelters are as they are. “Look at this picture carefully. Why do you think the people didn’t use trees to build their shelter? Yes, there are no trees around.” The same format would be applied to the other needs of people—the need for food, water, and clothing; the need for laws; the need to move things from one place to another; and possibly other needs.

If our presentation is well designed, the learner will understand that the rules apply to the full range of examples. The strategy this approach employs is to refer to the features of specific examples with language that applies to all examples. If the same wording applies to the range of Examples A–N, and the infrasystem observes that it predicts for Examples A–D, it will have evidence that the rule is a predictor and generalize to E–N. The system will then respond to the rule in the same way it would respond to a rule that is generated inductively from actual contact with a few examples.
Designing the Rule

As noted, one of the problems that results from proceeding from the concrete to the abstract is that of stipulation. The learner assumes that a fact or feature applies only to a limited range of examples when it actually applies to a wide range—for example, facts like “Cows give milk.” Young children often learn this rule. Later, when they are in the fourth grade, they may learn about the properties of mammals. These properties describe the full range of mammals. One rule is that, following birth, the mother produces milk for the infant. If we present this rule to the students and then ask them to apply it to concrete examples, we see that the cow is treated as a special entity. “Is it true that a mother dog does not give milk until she has puppies?” (Yes.) “Is it true that a mother bear does not give milk until she gives birth to a cub?” (Yes.) “Is it true that a cow does not give milk until she gives birth to a calf?” (No.) “Why not?” (Cows give milk.) Not all students would respond in this way because nothing actually prevented them from applying the rule about milk to all mammals, including cows. However, a fair percentage of the students would stipulate and formulate their general misconception that cows are unique.

The same problem that occurs when children learn a rule or stipulation that applies to only a part of the range also occurs if they are permitted to learn a rule on their own. There is nothing wrong with the exercise or the idea of giving children practice in making up rules. However, there is a great deal wrong if we expect them to learn productive rules about topics they do not understand in detail. If they do not understand the full range of variation or features shared by all the examples, they have no logical basis for determining whether the rule they formulate makes sense.
It may apply to the current example and to a limited range of examples that share many features with the current example, but it probably will not apply to remote examples.

We recently worked with two groups of students. One was a group of low socioeconomic status (SES) sixth graders who had been taught through an analytically careful sequence of math skills. The other group consisted of seventh graders of much higher SES who had been in a constructivist math program since second grade. These children made up their algorithms. They would tend to work on different problem types with every assignment. The seventh graders were not only far behind the sixth graders in performance, but they had also discovered general features of math that actually prevented them from learning. After more than 2 months of careful instruction that started close to their skill level, they were still greatly dependent on the teacher. Even after they were shown how to work problems of a given type and practiced the steps with various examples, they would tend not to be able to apply the procedure the next day largely because they did not understand the basic premise of math—that the operations apply to a large range of examples. For them, every new day marked the occasion for making up something new. A typical exchange between teacher and student revealed the degree of the student’s misconceptions:

The student raised his hand.
“What is it, Lon?”
“I don’t know how to do these.”
“Those are the same type of problems we worked yesterday. Don’t you recognize them?”
“Yeah, but I don’t know how to work them.”
“Well, what did we do yesterday?”
“Do you mean you want me to work them the same way we did yesterday?”
“Yes, Lon.”

The unfortunate learner had such a jaded introduction to math that he failed to get the fundamental notion that problems of the same type are worked with the same operational steps. He was not the only one in the group who had misconceptions of this order. By the end of the school year, perhaps two thirds of the students had caught on to the real game of math, but certainly not all of them. In the meantime, the low SES group continued to learn new skills at a high rate. For them, math made sense.

Stipulated Rules

Traditional instruction is replete with examples that will inevitably teach stipulated misinterpretations. The traditional teaching of fractions, in particular, is full of serious stipulation and engineered confusion.
The first mistake the introduction may make is to introduce the nomenclature and operations at the same time, implying that the words numerator and denominator describe some special feature of fractions. Often the basic
rule about fractions refers to equal parts. This stipulation applies only to geometrical representations (4/5 of the pie). It certainly does not apply to 4/5 of the children on the bus, 4/5 of the busses, or 4/5 of the days the busses ran. Just as children who learn a stipulated rule about cows are unable to generalize, the children who learn a stipulated rule about fractions and equal parts must later relearn it if they are to understand ratios. An even greater stipulation results from the sequence of examples presented. More often than not, children are introduced to fractions through three examples that suggest a narrow range of variation: 1/2, 1/3, 1/4. These fractions are studied extensively. The children come away with some version of three stipulated misrules:

1. A fraction refers to parts of a single whole and is always less than one.
2. There is only one part of the whole that is involved in the calculation.
3. All problems are solved by counting the total number of parts in a whole.

Stipulation 1 implies that serious relearning is required for the child to accommodate fractions that are more than one whole. This stipulation is perfectly unnecessary. If the introduction is properly designed, it will induce a generalization that fractions refer to any number of wholes and any number of used parts. Merely because improper fractions were historically subsequent to proper fractions does not mean that the child’s introduction should recapitulate history. An efficient system would induce this knowledge as efficiently as possible through the fewest possible content maps. Stipulations 2 and 3 are created because the numerator 1 is shown through examples to be a constant, not a variable. One part is always shaded. So in determining whether a problem tells about a half, a third, or a fourth, the learner must attend to the only variable—the number of parts in the circle or the bottom number of the fraction.
To find the example that shows 1/4, the child simply counts the number of parts and finds the circle that has four parts. To create an example of 1/4, the child makes a circle with four parts and then follows the convention that has been stipulated for all fractions by coloring in one of the parts. As with other stipulated misrules, these are a function of the example set selected.

Teaching Fraction Rules

There is certainly more than one way to design an introduction to fractions that avoids these stipulations, teaches the children what is the same about all fractions, and provides the kind of example selection that would induce extrapolation to all other examples. The issues are (a) the specific steps the children are initially taught for analyzing any fraction, and (b) the set of examples that will imply generalization to virtually any two-symbol fraction that could be written. A related and important consideration is how this fraction analysis is used as a foundation for teaching related skills. We have developed a variety of routines and sequences, some that would promote broader generalizations than others. The ones that induce the greatest generalizations are not necessarily the ones that appear in commercial programs that we have written (Engelmann & Carnine, 1969; Engelmann, Carnine, & Steely, 1985; Engelmann, Kelly, & Carnine, 1994; Engelmann & Steely, 1978). The reason is that the teachers in the early grades are often overwhelmed with the scope of examples presented and tend to feel that the program is too radical for them.

The following is an outline of a program we have used successfully with preschoolers. The teacher teaches the children this rule: “The bottom number tells how many parts are in each group; the top number tells how many parts you use [or color].” When the teacher directs the children to apply the rule, she makes sure that the children understand that the bottom number does not tell how many groups to make. The teacher writes the fraction 2/4 on the board, asks children to identify the top number and the bottom number, and then says, “My turn: Does the bottom number tell how many groups to make? No. Does it tell how many parts I show in each group? Yes.” Then the teacher draws three circles. “Here are some groups. What do you know about each group?” (It has four parts.) “I’ll make the parts for each group.” The teacher then has the children count the parts in each group. To show “how many you use,” the teacher shades in parts.

Routines for Samenesses and Differences. The examples are sequenced according to the principle for showing sameness. If juxtaposed examples differ greatly and are treated in the same way, the presentation induces both interpolation and extrapolation on the basis of common features.
For instance, during a 10-minute segment that teaches fractions on a particular day, the children direct the teacher to construct a picture for the following fractions: 3/8, 3/2, 7/4, 2/5, 13/2. The initial routine is the same for all examples. The teacher presents exactly the same steps, in the same order, and with the same wording format. The only parts of what she says that vary from one example to the next are the variables. The presentation, therefore, provides the learners with a map of what is the same (what the teacher does and says in connection with all examples) and what is different (the unique arrangement of numbers).

Here is the presentation for 7/4. “Read the fraction.” (Seven over four.) “What’s the top number?” (Seven.) “What’s the bottom number?” (Four.) “Which number tells how many parts in each group?” (Four.) “How many parts are in each group?” (Four.) “Does the bottom number tell how many groups to make?” (No.) “So I’ll just make five groups.” (Teacher draws five circles and divides each into four parts.) “What does the top number of the fraction tell us?” (Use seven parts.) “I’ll shade in the parts. Count and tell me when to stop.” (Students say “stop” after seven parts are shaded.) “There’s a picture of the fraction. What fraction is that?” (Seven over four.)

For writing fractions from displays, the wording for all examples is the same.

(The teacher makes a picture of 9/4.) You’re going to tell me how to write the fraction for this picture. This time, the number I’ll write first is the number of parts that (we use/are in each group). Raise your hand when you know that number. How many parts (do we use/are in each group)? Is that the top number or the bottom number of the fraction? (The teacher writes the bar and the number.) Raise your hand when you know the other number of the fraction. What’s the (top/bottom) number? (The teacher writes the number.) What fraction does the picture show? (Nine over four.)
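The two routines above (making a picture from a fraction, and writing the fraction from a picture) can be sketched in code. This is only an illustrative sketch, not part of the program described: the function names and the list-of-booleans representation of a picture are assumptions introduced here, and the numbers are assumed to be non-negative whole numbers.

```python
def picture_of_fraction(top, bottom, groups=None):
    """Return a picture as a list of groups; each group is a list of
    booleans, one per part (True means the part is shaded)."""
    if bottom <= 0:
        raise ValueError("cannot make a group with zero parts")
    if top < 0:
        raise ValueError("cannot use a negative number of parts")
    # The bottom number does not say how many groups to make, so the
    # caller may ask for any number that is large enough.
    needed = max(1, -(-top // bottom))  # ceiling division
    if groups is None:
        groups = needed
    if groups < needed:
        raise ValueError("not enough groups to shade that many parts")
    picture = []
    remaining = top
    for _ in range(groups):
        shaded = min(remaining, bottom)  # shade until the top number is used up
        picture.append([True] * shaded + [False] * (bottom - shaded))
        remaining -= shaded
    return picture


def fraction_from_picture(picture):
    """Read (top, bottom) back from a picture made by picture_of_fraction."""
    bottom = len(picture[0])                                # parts in each group
    top = sum(part for group in picture for part in group)  # parts used
    return top, bottom
```

Under these assumptions, `picture_of_fraction(7, 4, groups=5)` mirrors the teacher drawing five circles of four parts and shading seven parts, and `fraction_from_picture` recovers seven over four from that picture.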
With a sufficient number of examples, the children learn the generalization that the basic analysis applies to any fraction. Following the basic analysis, the instruction focuses on different subtypes of fractions that are particularly important: (a) fractions that are more than one, (b) fractions that equal one, (c) fractions that have a denominator of one, and (d) fractions that have zero. To teach about fractions that are more than one, the teacher would write various examples on the board, some more than one and some less than one. The following routine would be repeated with the various examples: You’re going to figure out which of these fractions are more than one whole group. But we’re not going to make the groups. You’ll do it in your head. Remember, the bottom number tells how many are in each group. So if the top number is more than the bottom number, you have more parts than you can put in one group. The first fraction is five over three. How many parts are in each group? (Three.) Do you use more than three parts? (Yes.) So is this fraction more than one group? (Yes.)
To teach about fractions that equal one, the teacher would write a few examples of fractions that equal one on the board. The purpose of this analysis is to show what is the same about all of them. For each, the teacher would point out that, “This fraction is one whole group with no leftover parts. How many parts are in each group? How many parts do we use? So tell me about the fraction. Yes, it’s one group.” Following this exposure, the teacher would write different fractions on the board, such as

3/8   5/5   1/3   2/6   9/5   7/7   2/2

The teacher would then direct the analysis of each fraction and ask whether the fraction is more than one group, less than one group, or one group. The teacher would also present construction tasks. For instance, the teacher writes

[ ]/4 = 1

“I’m going to make a fraction that equals one whole group. What does the number in the fraction tell me? Yes, there are four parts in each group. If there are four parts in each group, how many parts would I have to use to have one whole group? Yes, four over four equals one whole group.” The children would shade in the parts for different fractions to validate that the analysis of the fraction predicts the fraction.

To show that fractions that have a bottom number of one equal a whole number, the teacher presents five circles with no divisions and then colors in three circles. “This is a strange fraction. Listen. There is one part in each group. You know that there’s one part in each group because I colored in three parts and how many groups did I color in?” (Three.) “So each group has one part. Remember, if there is one part in each group, every time you use a part, you use a whole group.”

By applying the analysis the children have learned, the teacher shows that it is not possible to have zero as the bottom number. The teacher writes 4/0.

Here is a fraction that is really stupid. How many parts are in each group? Yes, zero. How can I make a group with zero parts? Watch. (The teacher draws a circle.) Look at that. When I made a circle, I made one part. So, I can’t make a picture of this fraction. It’s impossible. Remember, don’t get fooled by stupid fractions. If the fraction tells you to make zero parts in each group, it’s telling you to do something that’s impossible.
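The sorting task and the special cases above reduce to one comparison of the top number with the bottom number. The sketch below is an illustration under that reading; the function name `describe_fraction` and the returned phrases are assumptions made here, and both numbers are assumed to be non-negative whole numbers.

```python
def describe_fraction(top, bottom):
    """Sort a fraction using only the rule that the bottom number tells
    how many parts are in each group and the top number tells how many
    parts are used."""
    if bottom == 0:
        # A group with zero parts cannot be drawn.
        return "impossible"
    if top > bottom:
        # More parts are used than fit in one group.
        return "more than one group"
    if top == bottom:
        # One whole group with no leftover parts.
        return "one group"
    return "less than one group"
```

When the bottom number is one, every used part is a whole group, so three over one falls under "more than one group" (it equals three whole groups).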
To show zero as the top number, the teacher simply makes the picture for the fraction. For instance, the teacher writes 0/4.

I’m going to make a picture of this fraction. How many parts are in each group? (The teacher makes four parts in each circle.) There’s a picture of four parts in each group. What does the top number of the fraction tell me to do? Yes, use zero parts. Here I go. I’m done. Did I color any parts? (No.) That’s how you use zero parts. Pretty funny, huh? Remember, you can make a picture of a fraction that has zero as the top number if you just show the parts in each group.
Automaticity. The prior examples specify the introductory formats or routines for introducing the different content maps and their related features. To induce this knowledge in children, the structure that the teacher provides must be systematically eliminated so that the children perform independently with no process directions from the teacher. The structured routine is something like a highly detailed content map. It precisely guides the learner in what to do. The structure must be systematically removed so the child generates the steps for identifying the problems and connecting them to the appropriate content maps. Therefore, the last part of the sequence of instructional steps is to shape the context so the learner performs without prompts from the teacher. The teacher would say something like, “Make pictures for all of these fractions,” “Circle all the fractions that are more than one whole,” or “Write the fraction for each picture.”

The most efficient process of eliminating the teacher-provided structure from the routine is largely an empirical issue. The children’s rate-and-accuracy performance suggests the extent and rate at which teacher direction may be removed. The goal is not simply to induce the various maps, however, but to provide sufficient practice for children to achieve a high level of automaticity, which means that the children perform the analysis quickly and correctly. As part of the process, the item sets that the children work would become less predictable. Instead of occurring in blocks of a single type, the juxtaposed problems would be of different types that require different analyses. For instance, the teacher directs the children to write the fraction for the picture she draws on the board, make the picture for a fraction, look at different fractions on the board, and write those that equal one.

Extensions. Automaticity guarantees that the children have a foundation of understanding that supports extension of the basic content they have learned. One such extension involves analyzing fractions that equal whole numbers. This instruction would occur after the children are able to identify whether fractions that are more than one have leftover parts (a circle with only some of its parts shaded).

(Teacher writes 15/5 = on the board.) This fraction equals a whole number. We can figure out the whole number by thinking. What’s the bottom number of the fraction? (Five.) So every time you count five parts, how many groups do you count? (One.) Yes, one group. So we count by five until we end up with fifteen and see how many times we count. Get ready. (Five, ten, fifteen.) How many times did we count? (Three.) So how many whole groups is fifteen over five? (Three.)
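The counting routine for fifteen over five can be written out as a short sketch. The function name `count_whole_groups` is an assumption made for illustration; it counts by the bottom number until it reaches the top number and reports each number said along the way.

```python
def count_whole_groups(top, bottom):
    """Count by the bottom number until the top number is reached.
    Returns the list of numbers said; its length is the number of
    whole groups."""
    if bottom <= 0:
        raise ValueError("cannot make a group with zero parts")
    said = []
    total = 0
    while total < top:
        total += bottom     # every count of `bottom` parts is one whole group
        said.append(total)
    if total != top:
        raise ValueError("leftover parts: the fraction is not a whole number")
    return said
```

For fifteen over five the numbers said are five, ten, fifteen, so the count is three whole groups. A fraction with leftover parts, such as seven over four, is rejected rather than rounded.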
A similar analysis would hold for problems like

[ ]/4 = 3

This problem says that the fraction equals three whole groups. But we have to figure out the top number. This calls for some big thinking. But we can do it. How many parts are in each group? (Four.) So every time we count four parts, how many groups do we count? (One.) So we just count by four, three times. Let’s do it: four, eight, twelve. So if the fraction equals three whole groups, what does the top number have to be? (Twelve.)
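The missing-top-number routine runs the same counting in the other direction: count by the bottom number once for each whole group. A minimal sketch follows, with the function name `missing_top_number` assumed for illustration.

```python
def missing_top_number(bottom, whole_groups):
    """Count by the bottom number once per whole group.
    Returns (top, said), where `said` lists the numbers counted aloud."""
    if bottom <= 0:
        raise ValueError("cannot make a group with zero parts")
    said = []
    total = 0
    for _ in range(whole_groups):
        total += bottom     # counting `bottom` parts completes one group
        said.append(total)
    return total, said
```

With four parts in each group and three whole groups, the numbers said are four, eight, twelve, and the top number is twelve.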
With little additional teaching, children can handle fractions like (A + 3)/A.

A stands for some number. It’s the same number in both the top and the bottom. It could be eight; it could be two; it could be twenty. We don’t know what number A stands for, but we can figure out whether this fraction equals more than one or less than one. How many parts are in each group? (A.) So every time we count A parts, we count one group. Look at the top number. We use A parts and three more. So does the fraction equal more than one? (Yes.)
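The children's reasoning corresponds to a one-line algebraic check. The step below is a worked restatement, assuming A is a positive number (as in the examples eight, two, and twenty): A parts make one whole group, and the three extra parts push the fraction past one.

```latex
\[
\frac{A+3}{A} \;=\; \frac{A}{A} + \frac{3}{A} \;=\; 1 + \frac{3}{A} \;>\; 1
\qquad \text{for any } A > 0 .
\]
```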
The extensions that are possible indicate the soundness of the initial analysis. If extensions, details, subtypes, or nomenclature are introduced without requiring reteaching what has been taught, the content map is well designed. The extent to which reteaching requires serious modifications of what had been taught is the extent to which the initial teaching is wanting. The amount of teaching that is required to achieve each extension suggests the implied scope and possible stipulations created by the preceding instruction. If the extensions are achieved with little instruction (although they may be conceptually hard in other contexts), the previous instruction was designed and presented well. Conventions. In the previous examples, initial teaching did not refer to numerator or denominator. The reason is simply that these are nuances that may be introduced later. They have precisely no influence on the details of the operations. They are simply conventionally dictated words for referring to the top and bottom numbers. Although the traditional approach to instruction would certainly make a concerted effort to teach these early, the terms are not necessary to communicate the content map. Also they are a potential source of confusion if taught early. Using the terms confounds relatively simple demonstrations with words the children do not know. “Remember, we start with the cladophore. If the rosterbag is larger than the cladophore, the fraction is more than one.”
These terms are easily introduced later in a way that shows they are simply words for what the learner already knows. “Listen: the top number and the bottom number of a fraction have special names. We’re going to use these new names from now on. The top number is called the numerator. What’s the top number?” (Numerator.) “And the bottom number is called the denominator. What’s the bottom number?” (Denominator.) With some repetition and firming on these words, the teaching is completed. Nothing is affected except the changes in language that the teacher will use for future applications. Because the children already have knowledge of the operations, the teacher actually reinforces the meaning of the words each time she uses them in a context familiar to the children. “How many parts are in each circle? So what number do I write for the denominator?” The decision on what goes into the initial content maps is determined by the nature of the content, not the verbiage traditionally used to describe it. If references to top number and bottom number are adequate and already in the learner’s repertoire, they are highly preferable to ensure that the demonstrations teach one thing at a time.

Reading Instruction

The same basic analysis that applies to topics in math applies to reading. The difference is simply that, for reading, the content is not as clean as it is with math.1 The literature on beginning reading is littered with bad ideas and unenlightened techniques. The major problem stems from the stigma attached to a sensible analysis of reading. Those who advocate a code-cracking approach (phonics based) are classified as conservatives who are trying to set the clock back many years, whereas those who advocate unstructured or poorly structured approaches (whole language) are seen as forward thinkers—liberals who recognize the complexity of the child and the need for child-centered approaches that allow the child to make decisions.
Analysis of the content to be learned reveals both that a phonics approach is the only one that makes sense and that the instructional-design problem is challenging, particularly if the program is to teach very naive children. The experienced reader reads words one letter at a time, although the reader’s rate may exceed 300 words per minute. The evidence is that readers are able to identify spelling mistakes in words that are read at
1Math has some irregularities in names for the counting numbers (particularly 11–19) and in the symmetry of written statements, which may result in a different meaning when read from right to left or left to right. However, the irregularities in math are minor compared with those of reading.
this rate. Therefore, the readers operate from a content map that follows the rule: Each word that is spoken has one or more unique spellings. The fact that the learner understands words in context is important to reading instruction, but has little to do with the requirements for the initial instruction of reading. If the child knows language, the teaching of the relationship between reading and language is so trivial that nothing needs to be done except going about the business of teaching the content. Any well-designed instruction will show that the words children read are spoken words. Furthermore, the only words they read are those they understand and use as spoken words. Children will initially read aloud so the written word predicts a spoken word. The quiddity of beginning reading instruction is not to teach language, but to teach the skills needed to decode the written words.

Teaching the children a single content map or strategy for all words is impossible because there is no universal regularity across words (except that they are composed of letters). Effective initial instruction would induce a content map for words that are regular (perfect correspondence between the sound of each component letter and the sound it makes in the word). The program would then introduce extensions for various words that are irregular (less than perfect correspondence between the sound of each letter and the sound it makes in the word). Ideally, the initial set of words to be taught would meet the following criteria:

1. It would be a relatively large set.
2. It would accommodate a good sample of basic words.
3. It would be designed so that later additions would create minimum relearning.
4. It would permit some form of the same analysis for all words.

These criteria are not completely attainable for reading instruction because the material is not consistent. Let’s say all the words in the initial set that have the letter a make the sound heard in cat and fan.
The learner will be able to process all the words in the set, but we will have created a stipulation that the only sound for a is the short-a sound. The learner will have to relearn some content regardless of the set we start with.

Reading Mastery Beginning Reading Program. In one of the beginning reading programs that Engelmann developed (Engelmann et al., 1995), the problem of irregularities was partially solved by introducing unique symbols for long-vowel sounds such as ā and ō and joined letters for some letter combinations.
With these conventions, the range of regular words is greatly increased. Words like me and so are now regular. The children simply say the sound for each letter and say the sounds faster to identify the word. The program had an additional convention that made it possible to spell words correctly without confusing the children. Tiny letters appeared in some words. The rule about them was that they did not make any sound. Now words like paint, boat, ate, and many others were regular. A basic problem of the learner’s content map still existed, however. If the instruction first taught the full range of examples in this set and then attempted to transition to traditional orthography, a high percentage of lower performing children would have serious problems because the instruction would have stipulated that all words are regular, which means that each symbol had one and only one sound.2 The solution to the problem is to introduce some irregular words early on in the sequence. These would not be treated as traditional sight words because the notion of sight words is untenable. The learner must attend to the features of the word that make it unique—the letters and their arrangement. So the learner would have to engage in some form of spelling to decode all words. Reading Mastery did not introduce spelling by letter names because it did not introduce letter names at all, only the sounds that letters made. Initial instruction that first teaches letter names requires substantial additional learning because saying the letter names faster does not yield the word. For instance, saying “em AAA Tee” fast yields something like emmatee, not mat. A transformation step is required. The child would have to learn both the letter name and sound for the letter. However, if the learner does not have to learn the letter names, only the sounds, the number of steps required to introduce a basic reading vocabulary is greatly reduced. Children learn the sound each letter makes. 
They then apply a single process to decoding all words. They say the sounds for the words, starting with the left letter and proceeding in order. If they say the sounds fast, they say the word at a normal speaking rate. With the early introduction of some words that are irregular (was, said, to, of, and a few others), it is possible for the children not only to decode the words within a large set of words that are regular, but also to write any word and spell it phonetically. A first grader from Texas wrote a story that showed pretty accurately how to speak with the accent for that area. He wrote about what he would do if he had a monka. The boy would .

2Several programs developed in the 1960s used a version of the international phonetic alphabet to spell words the way they sounded. Children readily learned to read words presented in this orthography, but a fairly large percentage of them had serious problems achieving the transition to regular orthography and conventional spelling.
With the orthography in place, the initial content map would be for the child to follow the directives “Sound it out” and then “Say it fast.” Some children could not perform the transformation of saying the sounds for the individual letters and then saying the word fast. Typically, if they sounded out a word like sat, they would say the sounds; but when told to say it fast, they would say at, not sat. The children who exhibited this problem needed a content map for transforming a sound sequence into a word. The simplest form involves oral words, not written words. Instead of identifying the individual sounds, children would repeat a sound sequence and then say it fast. “Listen: mmmeee. Say it with me. Mmmeee. All by yourself. Get ready.” (Mmmeee.) “Say it fast.” (Me.) Note that the children did not pause between the sounds. They did not say the sound for m, then pause before saying the sound for e. The sounding out resulted in the children saying the word slowly and then saying it fast. Reading words was delayed until the children were proficient at saying it fast. When they were introduced to reading words, they would follow the same steps except that they would refer to the letters for the sounds they produced. They would say the sound for each letter, without pausing, and then say it fast.
Because reading words was now a simple transformation of the verbal presentation, the teacher was in a position to correct mistakes. If a child sounded out the word sssaaat and said it fast as at, the teacher could correct by presenting the verbal version of “say it fast,” then repeating the task with the written word. For example, as soon as the child made the mistake, the teacher would say, “Listen: sssaaat. Say it with me. Sssaaat. All by yourself.” (Sssaaat.) “Say it fast.” (Sat.) “Now touch the sounds and do it there. Get ready.” (The child says sssaaat and says it fast, sat.)
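The two directives can be pictured as two small functions. The following is a minimal Python sketch, not part of the program itself: the table SOUNDS, the table LETTER_NAMES, the function names, and the convention of writing a stretched sound as a tripled letter (as in the teacher's script, sssaaat) are all assumptions made for illustration. It also shows why spelling by letter names would need an extra transformation step.

```python
# Hypothetical letter-to-sound table for a few letters. Continuous sounds are
# written stretched (tripled); the stop sound t is written once.
SOUNDS = {"m": "mmm", "s": "sss", "a": "aaa", "e": "eee", "t": "t"}

# Hypothetical letter-name table, used only for the contrast below.
LETTER_NAMES = {"m": "em", "a": "ay", "t": "tee"}


def sound_out(word):
    """Say the sound for each letter, left to right, without pausing."""
    return "".join(SOUNDS[letter] for letter in word)


def say_it_fast(stretched):
    """Collapse each run of a repeated sound, giving the word at normal rate."""
    fast = []
    for ch in stretched:
        if not fast or fast[-1] != ch:
            fast.append(ch)
    return "".join(fast)


def spell_by_names(word):
    """Say the letter names in order; saying them fast does not give the word."""
    return "".join(LETTER_NAMES[letter] for letter in word)
```

Under these assumptions, say_it_fast(sound_out("sat")) gives sat and say_it_fast(sound_out("me")) gives me, while spell_by_names("mat") gives a string that is not mat. Collapsing runs would mangle words with doubled letters, so this is only a toy model of blending.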
The correction procedure shows that sounding out and saying it fast are the same for the verbal task and for the printed-word task. Various details of the program addressed other preskills and sequencing issues. Reading was not delayed until the children learned the sounds for all letters. Rather, reading commenced after the introduction of seven letters. After children had developed fluency in reading stories (composed entirely of words they had been taught), the program addressed various details of the transition from the prompted orthography to traditional orthography. The transition did not occur all at once, but in several stages. It could be achieved faster for some children, but the staged transition is safe. By the end of the second level of the program, children are reading sophisticated stories. They have learned more than 1,500 words, which is many times the number traditional basals teach.
This program has been involved in various controlled comparisons. It has consistently outperformed the comparison programs. In sites that implement this program according to the developer’s guidelines, all children with an IQ of 75 or above read by the end of the kindergarten year (Apfell, Kelleher, Lilly, & Richardson, 1980; Bracey, Maggs, & Morath, 1975; Polloway, Epstein, Polloway, Patton, & Ball, 1986). If properly implemented, it produces virtually 0% dyslexic children.
Alternative Approaches. The specific strategies and prompts introduced in the program described above are not universal for effective reading programs. Engelmann and his associates designed another beginning-reading program that uses completely different prompts, a different sequence, and different formats (Engelmann, Engelmann, & Davis, 1998). This program is like Reading Mastery in that it teaches one thing at a time and accounts for all the skills that the learner is to master. However, this program introduces long-vowel sounds first, assumes that children know letter names, and uses prompts that do not dictate the pronunciation of a letter, but that alert the child to the fact that the child must apply a rule that has been taught. Both programs are effective because both induce useful content maps and avoid teaching sequences that may create serious stipulation or require extensive additional teaching. This strategy holds for the teaching of any academic skill.
It also applies to programs that reteach students. For instance, Engelmann and associates (1978) designed a corrective reading series for older students (Grades 4 and above) who have not caught on to the game of decoding. Instead of having a set of related content maps with an understanding that certain parts of words (the consonants) are usually regular, these students have a potpourri of strategies that they apply to different meaning or syntactical contexts. Most interesting, all the abortive strategies they apply are ones that teachers have taught them. For example, these students try to guess on the basis of text predictability.
This strategy is taught by many teachers. “Well, what could that word be?” These students also guess on the basis of the beginning letter of the word and the general shape of the word. These are strategies that have been explicitly taught. Finally, these students are often confused about the extent to which they need to understand the word before they can actually identify the word. This confusion was most succinctly expressed by a student in one of our first field-try-out groups working on a prepublication version of the program. A student made a word-identification mistake while reading a story. The teacher stopped him and told him, “That’s not right. Sound it out.” The student looked at the teacher and said, “Tell me the word, and I’ll sound it out.” The remark seemed to be a wisecrack. However, the student precisely revealed his confusion. If we consider the way he was taught reading, we see the basis for his remark. From the first exposures to reading, the teacher always discussed the material before any reading occurred. The
words, ideas, and events that the teacher discussed were in the story. If the teacher talked about illustrated characters, the names were the names that appeared in the story. The actions the teacher named were the actions the characters performed in the story. Furthermore, the pictures always gave strong clues about what was happening. If Janet, Billy, and their dog, Fluffy, were shown speeding downhill in a red wagon, that is what the story described. In other words, the pattern had always been to first understand or receive comprehension information, then decode. Obviously, the plan will not work because understanding a particular setting does not predict words to be read. In the same way, looking at the first letter of the word and guessing, or guessing on the basis of predictable sentence patterns, will not work. All are misrules. There is nothing in this instruction that prevents the learner from learning the true character of words. Yet if a child believes what the teacher has told him, he will think of reading as a complicated business, which somehow involves looking at the words and reciting something that should be understood, or, as the student put it, “Tell me the word and I’ll sound it out.” According to the student’s formula, before you can sound out or decode the word, you have to understand what the word means.
The successful corrective reading program reteaches the learner. Part of reteaching involves contradicting the various specious content maps that the student has and replacing them with the rule that a particular spelling is the only thing that predicts a particular pronunciation. (Exceptions involving context are introduced much later.) Because of the learner’s misconceptions, an effective program might have the following features:
1. The stories would have no pictures.
2. The program would show the sameness features of words in lists and words in stories.
3. The stories would have no highly predictable syntactical patterns.
4. The program would build skills progressively and cumulatively.
5. The program would have provisions for students to map their progress.
If the stories have no pictures, the student receives a clear demonstration that reading does not depend on pictures. It is also important to show the student that the words read in lists are the same words that occur in the story, and that they are read the same way. (Often students will be almost perfectly reliable at reading words such as a and the in a list, but when the words appear in stories, the students tend to guess.) In the corrective reading program Engelmann designed, one of the vehicles for achieving this goal was a story character named Chee. Chee was a
dog who said things that made no sense when she became nervous. The words that she said were the words that many corrective readers confuse. For instance, Chee would say things like, “Of go, my what where to me the were you.” The students initially had great difficulty reading Chee’s utterances, although they could read the words with accuracy when they were presented in lists. An objective of any reteaching program is to create provisions that permit the learner to make mistakes at a high rate. If the learner has a strategy about guessing words such as is, a, the, what, of, or, to, or that, the program constructed so that these words appear frequently will result in mistakes at a high rate when the student uses the strategy he has previously learned, but at a much lower rate if he uses the strategy he uses for decoding words in lists. The program must be designed so the student is carefully taught alternative strategies for reading the various words. The material the learner learns is presented cumulatively so that the learner must continue to apply what had been learned. The words that students have learned serve as the basis for the program to shape the context. The new skills are always integrated into a context of what had been taught. This convention restricts the vocabulary of the early stories. The preteaching of words in list form occurs as the first activity of each lesson. After a particular word has appeared in a list on two or more consecutive lessons, the word appears in stories. Students orally read the story twice. Decoding errors are immediately corrected. The students use charts to map their progress. Typically, students in the early lessons make more errors on their second reading than on their first. This trend suggests how unfamiliar the learning is for them. They are not practiced in using information from the teacher to correct errors. After possibly five lessons, the trend reverses, with fewer errors occurring on the second reading. 
At this time, the students are able to use the information the teacher provides when correcting. The teacher reviews the charts regularly and points out areas of progress. Often students who start out making six or more errors per 100 words will later read text that is more sophisticated with no more than one or two mistakes per 100 on the first reading. The program for the corrective reader is not the same as that for the initial reader because the students have greatly different skills and needs. Beginning readers do not have misunderstandings, so it is possible to teach them to identify words reliably after less than six trials. Furthermore, beginning readers rarely confuse words like of, for, and from. In contrast, the corrective reader may require more than 600 trials of reading habitually confused words in the context of stories before reaching an accuracy level of 98% (Engelmann et al., 1978).
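The error figures in this passage are rates per 100 words, so an accuracy level of 98% corresponds to two errors per 100 words. A small Python sketch of that arithmetic follows. The function names and the sample counts are hypothetical and are not taken from the program's charts.

```python
def errors_per_100_words(errors, words_read):
    """Scale a raw error count to a rate per 100 words read."""
    if words_read <= 0:
        raise ValueError("words_read must be positive")
    return 100.0 * errors / words_read


def accuracy_percent(errors, words_read):
    """Percentage of words read correctly."""
    return 100.0 - errors_per_100_words(errors, words_read)


def second_reading_improved(first_errors, second_errors):
    """True when the second oral reading has fewer errors than the first."""
    return second_errors < first_errors


# Hypothetical chart entry: a 200-word story read twice.
first_reading = errors_per_100_words(12, 200)   # 6 errors per 100 words
later_accuracy = accuracy_percent(4, 200)       # 98% accuracy
```

A chart kept this way would show the trend the text describes: second_reading_improved is expected to be false in the early lessons and true once students can use the teacher's corrections.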
Mastery
The content details of the program presented to the learner are important; however, unless other variables are controlled, the content is not effectively communicated to the learners. All the programs referred to earlier are small-step programs, which means that, on a given lesson, about 90% of the material has been previously introduced. Virtually everything that was introduced for the first time on the previous lesson appears in the current lesson. Generally, nothing is assumed to be taught until it has been presented on at least three occasions.
Because the program develops skills in small steps, the benefits of the program will be transmitted to the learner only if the learner is at mastery. If the learner is not at mastery, the program will require large, not small, steps. For instance, if the students are to begin Lesson 17 and they were not at mastery on Lesson 16, they have to learn more than 10% new material to achieve mastery on Lesson 17. An irony is that the students who are not at mastery on the preceding lesson are the lower performing, not the higher performing, students. Consequently, for these students to achieve mastery on the new lesson, they must learn more than the higher performers are required to learn during the same amount of time. Historically, however, lower performers have learned less than higher performers during any given period of time. (That is why they are behind.) So by not teaching to mastery, the teacher all but ensures that the students who would benefit most from the careful program design (the lower performers) are preempted from receiving the potential benefits of the program.
The importance of mastery was illustrated three times when we worked with full-school implementations in Utah. On these occasions, we split large groups of failing fifth graders and placed them at different points of the instructional sequence. In all cases, the teachers felt that the failed students should be placed in the fourth level of the math program.
The teachers based their assumption on the abiding pressure within schools to “get the students as close to grade level as possible.” According to the ideal program placement rules, we believed that the students should be placed at the beginning of the third level. Our position was that the students would be closer to the ideal percentages of correct responses if we placed them at the beginning of the third level. The groups were split, with half of the students starting at the beginning of the fourth level and half starting at the beginning of the third level. All groups were to be taught to mastery, which meant accelerating their rate through the lessons if their performance on first-time material greatly exceeded 80% correct and repeating lessons if their first-time performance dropped below 60%.
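The mastery rule used with the split groups can be sketched as a small decision function. This is only an illustration in Python: the source says lessons are accelerated when first-time performance greatly exceeds 80% and repeated when it drops below 60%, but it does not say how large the margin must be, so the accelerate_margin parameter (10 points here) is an assumption, as are the function name and the returned labels.

```python
def pacing_decision(first_time_pct, accelerate_margin=10.0):
    """Suggest how to pace the next lesson from first-time percent correct.

    first_time_pct: percent correct on material introduced for the first time.
    accelerate_margin: assumed size of "greatly exceeds" above 80 percent.
    """
    if not 0.0 <= first_time_pct <= 100.0:
        raise ValueError("first_time_pct must be between 0 and 100")
    if first_time_pct < 60.0:
        return "repeat"        # below 60%: repeat the lesson
    if first_time_pct > 80.0 + accelerate_margin:
        return "accelerate"    # well above 80%: move through lessons faster
    return "continue"          # otherwise keep the normal pace
```

For example, pacing_decision(55) suggests repeating the lesson and pacing_decision(95) suggests accelerating. A school applying the rule would choose its own margin.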
In all three cases, the profile of progress was the same for the split groups. By the middle of the following year, the students who had been placed in the third level had passed the students who had started in the fourth level. The reason was that the later material was not hard for the students who had been placed appropriately. Students placed in the fourth level performed somewhat acceptably for the first third of the school year. However, as the material became more complicated, the rate of corrections the teachers had to provide escalated to the point that it was obvious the students lacked the skills needed to continue.
Teacher Training
An issue related to mastery is teacher training. Teachers often do not know how to teach to mastery and often even have prejudices that prevent them from learning the skills. The prejudices come from their expectations that lower performers are somehow naturally slow or are not really expected to learn the material. The result is that the teachers create a self-fulfilling prophecy. They are sure that these learners are slow, and they teach in a way that all but ensures that the students will not achieve mastery. The outcome validates the assumption that these students do not learn like others. Teachers need to know not only how to review material that is difficult for students, but also how to key the placement and rate of moving through the program to student performance. The following are general criteria that ensure proper placement and optimum progress of students who are not practiced at learning academic material: (a) on material introduced for the first time on the lesson, students are at least 70% correct; (b) on material not introduced for the first time, students are at least 90% correct; (c) at the end of the lesson, the students are close to 100% correct on everything; and (d) the rate of errors is low enough that it is possible to get through an entire lesson in the time allotted for the period.
Teachers need practice in applying these criteria, identifying students who are misplaced in the program sequence, and providing remedies (placing them where they are about 90% correct on the content). The most important message that the training conveys is that this formula applies to anything teachers teach. If they are presenting a teacher-designed unit on Norway, for instance, the students probably will not achieve mastery unless they begin at no less than 70% correct. If they are not at mastery, their performance indicates where the teacher needs to provide more practice and where the program may be distributed over time (instead of trying to teach the whole unit during one period, the teacher presents it over five periods, devoting possibly 10 minutes of each period to the topic).
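Criteria (a) through (d) can be written as a simple check that a teacher or coach might run on one lesson's data. The Python sketch below is illustrative only. The record fields and function name are hypothetical, criterion (a) is read as "at least 70%," and "close to 100%" is represented by an assumed near_perfect threshold of 95%, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class LessonRecord:
    new_pct: float           # percent correct on first-time material
    review_pct: float        # percent correct on previously introduced material
    end_pct: float           # percent correct on everything at end of lesson
    minutes_used: float      # time the lesson actually took
    minutes_allotted: float  # time allotted for the period


def unmet_criteria(rec, near_perfect=95.0):
    """Return the labels of the placement criteria that the lesson failed."""
    unmet = []
    if rec.new_pct < 70.0:
        unmet.append("a")  # first-time material below 70% correct
    if rec.review_pct < 90.0:
        unmet.append("b")  # review material below 90% correct
    if rec.end_pct < near_perfect:
        unmet.append("c")  # not close to 100% by end of lesson (assumed cutoff)
    if rec.minutes_used > rec.minutes_allotted:
        unmet.append("d")  # lesson did not fit in the allotted period
    return unmet
```

An empty result suggests the group is placed appropriately. Several unmet criteria on consecutive lessons would point toward moving the group back to where it is about 90% correct.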
Independent Practice
The traditional prejudices about instruction assume that children learn a great deal from working on their own. For instance, the beginning reader is assumed to benefit greatly from independent practice. For beginning material, this is often logically impossible, and the impossibility is supported by empirical data (Carnine, 1977; Carnine, Carnine, & Gersten, 1984). The impossibility stems from the fact that the rules that govern reading are not rules of the physical world. They have no built-in feedback value to the learner. If the learner makes a mistake and identifies the word of as for, the learner will receive no feedback or correction on the error.
The situation is quite different if the learner is attempting to do something like learning to stand up or improve at shooting baskets. These activities are governed by the physical world. So long as the learner knows the goal of the activity, the physical environment will provide clear feedback that is correlated with the preceding response. If the learner leans too far to the left when trying to stand, the physical environment will be emphatic about demonstrating that this is a negative example, a strategy that must be modified. Likewise, if the learner shoots to the left of the basket, the physical environment provides at least indirect feedback.
In contrast, when the learner reads the word what as that, there is no basis for the learner to receive feedback of any kind. The learner does not see that the outcome is wrong and does not receive any active response from the physical environment. The word does not turn blue, physically punish the learner for making a mistake, or result in any differential consequence. Rather, the physical environment is unresponsive. This fact shows the extent to which the material being taught is artificial and therefore must be supported by artificial bases for feedback.
It may be argued that the learner has information from the context, and that if the sentence does not make sense the learner is alerted that something is wrong. That is true for the practiced reader who has a precise idea of sentence structures. In fact, beginning readers do not have this precise idea. To them the sentence seems to make some sense. For instance, the child reads, “He did not know that he would find.” The wrong version would not be as odd as it would be to the practiced reader. The reader knows that the person did not know something. For beginning skills, practice should be overt. The rationale is that we do not want the learner to mislearn the material. The earliest material is supposed to provide the learner with a foundation for what is to come. If the foundation is flawed, the learner will have to relearn it. It is far more efficient to induce each skill from the onset than to have to reteach skills and strategies. Once the learner has a solid foundation in reading aloud and silently reading material that we know the student is able to read accurately,
the child would be encouraged to read extensively. The independent practice would now be quite safe. A solid foundation is not built with practice formats alone, but with all the program details that serve as variables that influence learning: schedule, sequence, the manner in which material is presented, and possibly above all, mastery of material that promotes the appropriate generalizations.
Reinforcement
The human organism is designed to respond to peripheral reinforcers and patterns. If the learners went through a successful hands-on program that allowed extensive work with concrete examples, the learning would be stronger, more meaningful, and more firmly confirmed by the infrasystem than material presented through a teacher, text recitation, or an educational film. As noted earlier, there are two problems in progressing from specific concrete examples to generalizations. First, it would take the learner years to learn what could be learned in hours if we go to the heart of what the learner is to learn and establish content maps in the most efficient way available. Second, it is highly probable that the learner is not going to discover some relationships, such as the properties of magic squares or a sound algorithm for division that also works for fractions. In fact, the learner may never learn that division and fractions express the same three-component relationship.
The great value of the hands-on sequence is that it provides for great verification from the infrasystem. The content we expect the learner to learn is void of any sort of primary reinforcing properties. Fortifying it with positive reinforcement is particularly important for the beginning learner. The objectives of the reinforcement are (in addition to making instruction a more pleasant experience) to show the learner that the material is important and to demonstrate that the learner is capable.
If we use efficient instructional sequences, we must add reinforcement to make the instruction significant to the learner. Note that the program of reinforcement must be designed so the infrasystem accepts it. The infrasystem is necessarily data based. If we tell the learner he is smart and is doing a great job, but at the same time we provide evidence that he is not able to complete the work that the children sitting on either side of him are completing, the verbal behavior of the teacher will soon be disclosed to be just that—some form of noncontingent reinforcement (which proves to be nonreinforcement because it is not predicted by a special set of behaviors). The effective program establishes positive models, ensures that the child does not learn from negative models, and provides both the encouragement and feedback necessary to make the point to each child, “This is your role, and you are getting good at it.” The idea is not to dispense M&Ms, but
to provide each child’s system with information that permits both agent and infrasystem to create the following argument:
Anyone who performs this role is smart and important.
You are performing this role.
Therefore, you are smart and important.
There is no problem in telling the learner that the task is hard and requires work. There is a problem if the hard work is presented as coercion or unintentional punishment. The ill-advised teacher who presents her fourth graders with math problems for homework that they cannot solve is not helping her students, but systematically misteaching and punishing them. Their infrasystems are receiving repeated evidence that math problems have the principal feature of predicting failure.
Functionally, inducing rules and establishing meaningful reinforcement requires the following:
1. Grouping the students homogeneously for each subject based on their performance.
2. Setting clear performance expectations that are identified as being important, but that the children will be able to meet or exceed.
3. Providing a schedule that is adequate for teaching the material thoroughly.
4. Providing feedback both on a moment-to-moment level and with respect to long-range performance goals.
5. Using the children’s performance as evidence to show children they are smart.
If the children are grouped homogeneously, not only is the instruction more closely aligned with the performance of the children, but also each child is able to excel, and children do not encounter models that reveal that individuals may not be comparatively smart. Certainly, the children in the lowest group know there are other children who are able to do things that they cannot do, just as these children know there are others who are taller than them and faster at running. That does not mean they are not tall enough or fast enough. In the heterogeneous situation, they receive constant information that they are inferior.
In the homogeneous group, they have reason to believe the teacher when she says, “You guys are really smart, you know that?” They have evidence of performing something the teacher indicated was pretty difficult and evidence of achieving this performance often. The effective teacher of young children responds to the instruction as if it were important, talks about what the children will be able to do by the end of the year as a spectacular achievement, and reminds them of where they started, the progress they have made, and what it means. With a combination of the evidence about what the children have learned and the
teacher’s response to their achievements as something praiseworthy, the teaching sets the stage for children to conclude that they are good learners.
One of the simple devices we have used to provide a record of student progress is a thermometer chart. The chart is based on perfect papers. Each time a group of 12 students gets 12 perfect papers, the mercury on the thermometer moves up one degree. The teacher sets goals for the number of perfect papers that are expected by Halloween, Christmas, Valentine’s Day, and the end of the school year. The teacher makes it clear that, “I don’t know if you’ll be able to reach these goals because the only people who can do it work very hard and are very smart. But I’ll bet that if we work very hard, we’ll come close to reaching all the goals.” In fact, the teacher knows that the children will probably exceed the goals. At least every few days, she refers to the progress on the chart with continued amazement: “You already have more perfect papers than I thought you’d have by Halloween. What’s going on here? Is this group extra smart or what?” Oh yes, and very proud of it. The references to the thermometer chart do not consume more than possibly 1 minute a day, yet the results in effort and performance are often spectacular.
Roles
The goal is to engage each child in the role of being a good, successful student who understands what the teacher teaches and is able to apply it well. The good student recognizes that the learning is sometimes hard work, but knows that it is worth the effort. The roles are established by creating secondary reinforcers. If each lesson that the learner engages in predicts reinforcement and success, the material being learned becomes important and reinforcing. The skills become prized possessions of the learner.
The self-image of the learner who believes she is capable of learning well is important because it influences both what the learner will do with her free time and how she will approach future instruction. The learner with the strong self-image will respond to failure differently than learners who have received evidence that they are not effective learners. The competent learner will interpret mistakes as mere setbacks in the process. “If I keep working, I’ll get it.” The prophecy is fulfilled if the learner persists. In the end, she will be able to say, “See? I knew I could do it.” The same mistake for the learner with low expectations is interpreted as evidence of failure. “I knew I couldn’t do this stuff.” The prophecy is self-fulfilling because the learner quits. The choice is either to control the reinforcing variables so that all children receive information about their abilities and successes, or to leave it to chance, in which case some children will like math a lot. However, most will
get a sickening negative sensation when any type of math is mentioned. The efforts to enlist high school females to become scientists will be wishful thinking unless sensible reinforcement and teaching practices are instituted at a much earlier grade level.
As for the morality of lying to children about their abilities, it is not a lie. A child with a poor memory who requires lots of practice to learn even basic material is either low in capacity or unpracticed in the type of enterprise presented, and therefore must engage in relatively unfamiliar learning. Can the teacher, psychologist, or anybody else determine before the fact which it is? No. Therefore, the safest assumption is that the learner has the capacity to learn and that the learner should receive the kind of practice and treatment that is well designed to lead to improvement. The improvement may be spectacular, great, or slight. In any case, the child will have learned important skills and will understand that her work is important.
Functional View of Instruction
The goal of the school is not to reproduce the family or the form of learning that might occur in other settings. The teacher is not a mom interacting with a single child. The classroom is not a social group. The school program that serves at-risk populations must be scrupulously efficient. It must focus on function, not form. It must consider how the minutes of the school day may be configured so they transmit a relatively great amount of information and skill reinforcement to the students. If low-performing students are to catch up to their more advantaged age mates, they must learn at a relatively faster rate than the average student. They are thousands of trials or exposures behind in many relevant skills, particularly vocabulary (Hart & Risley, 1995).
To provide the practice required to accelerate the acquisition of skills that the children need and to make them automatic, the school program must be far more efficient than the program that processes the average student. This goal is achieved only by evaluating instruction for teachers and students according to how efficient it is analytically and how effective it has been demonstrated to be empirically.
SUMMARY
We have glossed over many of the issues associated with instruction and have primarily considered formal instruction, not motor-response instruction nor instruction in areas like learning spatial relationships, driving, or job skills. Formal instruction is the category of learning that is farthest removed from basic learning. The natural contingencies that provide the reinforcers, present predictors, and prompt the learner to learn patterns are absent from formal instruction. The content is dense of necessity. The
learner learns, in possibly 80 hours, what it took scholars of antiquity hundreds of years to learn about geometry. The beginning levels of instruction are the most important because they establish the framework for what follows. If the students who come into the fourth grade have learned all the skills they are supposed to have learned in the earlier grades, instruction is smooth, fun, engaging to the students, and rewarding for everybody. The teacher is reinforced because she has a great class. For the students, the material is interesting, the assignments are manageable, and the discussions that the children have about various topics are very engaging. There are some arguments, but the arguments are not punishing. They often lead to another assignment that gives everybody a better understanding of the issues. For a teacher who is concerned about teaching and empowering her students with skills, this is heaven—it is easy. If the same teacher is greeted with a fourth-grade class of rowdy students who follow the lead of models who are open about their disdain for the teacher, the material she is trying to present, and the school, the teacher is in hell. Certainly some of these children are potentially good students, but even the good ones are behind, and the lower students perform at the level of guess and guess again. To give these children even a chance of succeeding, the teacher would have to start the instruction on the level of their performance, which would be difficult because these fourth graders vary greatly in performance. The teacher would then have to extinguish the negative models, either by removing them or enlisting them to champion the cause of being good students. The teacher would have to engage in substantial reteaching and try to reinforce the students for working hard and learning. It can be done. The behavior can be changed and changed quickly. New roles can be induced. 
Yet it is many times harder than it would have been if the children had been properly instructed from the first day of kindergarten. Even if the teacher does a stunning job with these fourth graders, by the end of the year, the students will have gained only about a year’s worth of skills. Most of them will still be more than a year below grade level. For at-risk students, all the variables that affect student performance must be controlled if we expect them to catch up to their middle-class age mates. The centerpiece of the instructional component is the instructional sequence. If the instructional sequence introduces content maps that later have to be retaught for some students, the sequence fails because it does not permit the learner to process the full range of examples the map is supposed to accommodate. If the sequence is designed so that different teachers present the material in greatly different ways, there will be great variation in what the children learn in different classrooms and coordination from grade to grade will be impossible. Some groups will learn far less than other groups of equal ability.
The effective approach is based on standardization of operations. This does not mean that teachers are marching in line, but that all have goals that must be met for the various groups of students. This chapter illustrated the universal features of an effective instructional sequence. As early as possible, it introduces content maps that will process the basics. The fraction analysis illustrated how this is possible with specific content that is logical and consistent. The reading operation illustrated the process with a skill that is more haphazard and therefore presents more subtypes of material that must be addressed.
The fraction analysis was clean, which means that basic generalizations about features of all fractions were induced by a single map about the relationship of the top number and the bottom number. Inducing the map not only reduces the amount of training required, but also eliminates the various misrules that are generated by abortive content maps that derive from the presentation of a very narrow range of examples (e.g., 1/2, 1/3, and 1/4). These examples share features not shared by all fractions. Therefore, working on them exclusively induces serious misrules. The top number of a fraction is not always one, not all fractions are less than one, and not all fractions refer to a part of a whole; some refer to whole numbers.
Instruction for beginning reading is not clean because the population of words does not present one sound for each symbol. Rather, it presents words that are not systematically related to each other with respect to the letter–sound relationship. There are trends. Consonants like f and l tend to have the same sound value in various words. Consonants like t and c are generally reliable, but may be combined with h to make unique sounds. The vowels have an incredible range of sound–symbol variation. Consider words that have the letter o: moss, most, mother, moon, took, worse, wow, form, cloud.
The most efficient generalizable strategy for inducing beginning reading skills starts with a group of words that have the same letter–sound relationship. The children engage in a behavior that is functionally equivalent to spelling the words and then identifying them. To avoid serious stipulation, exceptions are introduced early in the sequence. These are shown to be words that have the same letters, but that do not make the predicted pronunciation and therefore must be learned as individuals (or as small families that share a common pronunciation bias). The content is sloppy, so the various content maps must be qualified by the instruction in a way that does not imply that sounding out will serve to identify all words. The illustrations dealt only with some facets of the initial instruction. The program would also have to expand skills in a relatively systematic way so that the children learn how to read silently, how to read accurately at a faster rate, and how to read for purpose: searching for specific information and reading for entertainment. Although the children should not initially read anything that they do not understand, an important purpose of reading is to learn. So at some point the well-designed program would carefully introduce the skills needed to allow the printed page to serve as a teacher.
Because formal instruction is largely artificial and dense, the total school program must be designed to ensure that the child will assume the role of a good student and will respond to what is being learned and to school as something important. Achieving this goal requires establishing models, expectations, and work rules. Above all, it must be governed by the careful application of reinforcement so that by the time children reach the third or fourth grade teaching is rewarding for the children and the teacher. If children are taught in various subjects to mastery, they will learn a generalized set of strategies for learning any new material the teacher presents in a systematic way. Even children who entered school with inadequate preparation will continue to be competent learners.
Chapter 18
Issues
One of the difficulties in writing this book has been to articulate the basic functions of the system without addressing what others have said about issues, including those that potentially have considerable impact on the significance and orientation of the analysis. This chapter addresses some of these, but not to the degree the issues may deserve. There are many relationships between phenomena that we have discussed and explanations of perception, memory, neurology, brain and learning theories, models of cognition, and various classification schemes. On the topic of attention alone, there are many experimental outcomes that have led to different models. Sheer book length prevents serious excursions into the various issues that possibly deserve discussion. This chapter attempts to redress only some of the more central issues related to the analysis.
OVERVIEW
One of the more egregious omissions in the analysis is reference to the literature on learning. We asserted in several places that the analysis is consistent with the literature, but have not extensively documented that fact. Some of the more incontrovertible axioms of learning derive from learning rote lists and maze learning. The performance observed is consistent with the rule that the items that have a greater number of unique features are more discriminable and are therefore learned more easily than the others. For learning a list of random consonant letters, such as JBVXDSMTNKR, the trend would be to learn the first letter and the last one with the fewest errors and the rest in roughly a U-shaped curve, with the sequence in the middle requiring the most trials (Murdock, 1968). If we consider simply unique sequence features of the letters, we see that the first letter is the only one that is temporally preceded by a nonletter. The last item is the only one temporally followed by a nonletter. No other letters in the sequence have these unique sequence markers. Therefore, the first and last are the most discriminable and easiest to learn. In contrast, the middle letters are preceded and followed by letters. They cannot be fixed into the sequence until other letters are fixed.
The same kind of analysis discloses why errors in mazes are eliminated in reverse order from the food (Hull, 1934). The choice of direction that occurred immediately before the presentation of the primary reinforcer is the only one that immediately predicts the primary reinforcer. It is therefore analytically the easiest to identify. Errors that are temporally remote are in the context of choices that occurred before and after them. Therefore, they are less discriminable from each other. In the same way, other experimental outcomes are explicable in terms of the content and communication. The reason is that the organism is logical and data based. Certainly salience of stimulus, memory, and decay of representations play a role in performance. However, the primary player is logic.
Inferred Entities
The analysis of functions infers entities that have particular functional roles: the infrasystem and agent. Although such inference may be eschewed by behavioral psychology, there is a philosophically sound basis for it and extensive precedent in other fields, especially in the sciences. Molecules were inferred before they were ever observed.
In his paper Constructions and Inferred Entities, Lewis White Beck (1953) illustrated the necessity for inferred entities in the sciences, using as an example the development of kinetic theory. A basic assumption of kinetic theory was the existence of entities that are inferred and have presumed properties. These were molecules. Beck developed the guidelines for the design of legitimately inferred entities. They have a data-summarizing function. They serve as a locus for known facts and relationships. These are necessarily generated from the data, but inferred entities must not simply regurgitate known data. They must have a data-generating as well as a data-summarizing function. Also they must be designed so their features are necessary, which means that the applications they predict could not be generated by entities with properties other than those attributed to inferred entities.
The following is Beck’s summary of the data-generating requirement: If the data are predicted, we still have not affirmed the existence of the inferred entity unless we know that no other inferred entity could serve as the basis for the observations obtained. . . . The inference to the existence of the [inferred entity] is justified . . . only if [the inferred entity] does more than fulfill the operational definition or constructs which provide for its original definition. . . . Its predictiveness beyond the implications of its definition is absolutely basic to the assertion of the existence of any inferred entity. (p. 378)
An example of an inferred entity not capable of generating data beyond its definition was the ether of space. Elaborate models that involved the ether of space persisted after Einstein established the theory of relativity. These models used the definition of the ether of space in referring to the same phenomena that Einstein had identified. The problem with the formulations was that the ether of space did not function as a variable in these explanations. For example, to show the relationship of mass to speed as the object approached the speed of light, the argument would effectively be reduced to this: Increase of speed in the ether of space leads to increase of mass in the ether of space. The references to the ether of space are inert and add nothing to the essential details of the relationship. By removing the reference to the ether of space, we are left with the true variables: Increase of speed leads to increase of mass. The argument that there could be an ether of space that might work in this manner is rejected because there is no database to assume that it does work that way.
The inferred entities that we define as the infrasystem and agent were formulated in strict accordance with the basic requirements set forth by Beck:
1. These inferred entities are described as functions referenced to the data on performance.
2. No other alternative hypotheses account for the data.
3. The inferred entities are capable of generating data beyond that which led to the identification of their properties.
Requirements 1 and 2 have been met in a categorical sense, but not in a final sense. In other words, the theory is recognized as something of a first cut of a more refined product. The original identification of molecules did not postulate the detailed properties of molecules as we understand them today. The original orientation described the essential features that served
as foundations for inferences (such as Graham’s law of diffusion) that could not be generated without molecules having specific properties and behaviors. (Graham’s law does not refer to mass, but to an implication of mass: the density of the gas.) To the best of our knowledge, our formulation of entities is consistent with all firmly established phenomenological relationships presented in the literature on learning, perception, memory, cognition, and instruction. The assertion that no alternative hypotheses account for the data does not imply that all the details of the infrasystem and agent are correct as we have described them. The fundamental entities and their roles or functions, however, must exist according to the logical parameters identified. Stated differently, if somebody could provide a plausible explanation of how nonreflexive behavior that is a function of consequences could occur without an agent that planned and directed responses, the agent could be eliminated from the scheme that we have inferred. By the same token, if some detail of our logic is wanting, some details of the scheme will not predict.
Biochemistry and Neurology
Requirement 3 is absolutely essential, as Beck put it. The entity must generate new data. One area that is potentially impacted by the inferred function analysis is biochemistry. The field should be able to use information about the logically necessary functions of both the infrasystem and agent to discover the specific neural machinery that different organisms use to achieve the functions. As it is now, some inferences about the loci and extent of cortical activity are drawn from measures such as MRI records. These suggest functional areas. However, inferences about function are largely determined through ablation or disablement of a particular neural nexus and then observation of the effect.
If the logically necessary functions of the system were known, however, the biochemist would have a better understanding of the functions being manipulated. For instance, the memory record that the analysis articulates calls for extensive cross-classification of millions of data points organized as features and events. This fact may help clarify some of the mysteries of the neural networks and the locus and neurochemical basis of memory phenomena. Clearly, not all organisms use the same neurological machinery to perform functions. All who produce operant responses, however, require a specific interface between agent and infrasystem that permits the infrasystem to influence the agent through reflexive processes, and for the agent to communicate with the infrasystem through the formulation of plans and projections that serve as standards for the infrasystem’s reflexive additions.
Identifying the neurological basis for the agent and infrasystem functions should be more tightly focused if investigators know what functions must exist.
Unity of Operant Systems
The analysis of the task discloses the requirements of any system that performs the task. If the task is the same, essential features of the system are the same. The historical orientation to learning and performance tends to trivialize the sophistication of lower organisms because of apparent differences (e.g., between humans and hamsters). The samenesses in cognitive functioning, however, greatly outweigh the differences. Humans process more content, a broader range of content, and extensive behavioral applications (including language). The difference in content is significant. However, the process differs no more than the basic processes of the human eye differ from those of the hamster’s eye. Perhaps the most counterintuitive aspect of how we are the same as other organisms that learn and perform is that all of us need a system that performs intricate logical operations. The requirements are disclosed by analyzing the learning and performance tasks.
The historical basis for judging sophistication of function has been the brain. If humans have such a great brain–body mass ratio and they are so smart, the brain–body mass ratio must predict that the porpoise is somehow almost our equal, the dog is something of a pinhead, and the bee is without hope. A large brain is clearly not a requirement for sophisticated learning. Bees, who have brains of around 100,000 cells (compared with the billions of the human), have learned to distinguish symmetrical displays from nonsymmetrical ones. Therefore, a tiny brain tends not to predict the sophistication of what the organism is capable of learning; rather it predicts the scope of what it is capable of learning.
If the learner must learn extensive relationships and discriminations that cannot be supported by content maps, a larger brain is required to process the possibilities and record and cross-classify the data. If the learning is limited to something that may be supported by a highly specific default content map, a more modest brain structure appears to be adequate. In other words, the porpoise needs a large brain because it learns systems of relationships that present an incredible number of possibilities and require enormous records. We do not happen to know the details of what it must learn, but we know that it must involve storage, manipulation, shaping, and rejection of an enormous number of possibilities. The dog, in contrast, may be more limited to learning roles within the context of the pack and hunting. The dog, however, may have as much capacity as the porpoise to learn the types of things we would teach.
Consciousness
Certainly the most sensitive issue in inferring cognitive functions is consciousness. The idea that a cockroach is conscious rather than being an organism impelled by undefined instinct is highly counterintuitive. According to the current analysis, degree of consciousness is a function of task requirements. The simplest task implies a central locus (the agent) that performs five basic functions:
1. receives the sensory information from internal sensors and from distance sensors,
2. receives systemic secondary sensations that motivate,
3. receives some form of content information that implies how the current setting is to change,
4. has access to a repertoire of responses that may be deployed for the current pursuit, and
5. has the capability to plan and direct voluntary responses.
Any system that meets these criteria has a form of consciousness. The system is modulated according to the type of sensory data it receives. Its models of space and features are limited accordingly. The minimum level of this consciousness does not provide for conscious memory or projections beyond the current undertaking. Therefore, this is what might be considered a stimulus-driven consciousness. During the present moment, the organism feels pleasure, pain, urgency, or compelling interest in certain features of the surrounding. The feelings change as the features of the surrounding and internal features change.
The scope of consciousness is expanded if the organism learns. The agent must have a capacity to remember things not related to the current pursuit, although the thoughts may be presented in a purely reflexive way. The more flexible the system is, the greater its capacity to attend to features related to pursuits that are currently dormant. As Jerison (1973) pointed out, the sense of reality is a function of the nervous system. Unlike the snake, we are not able to map details of space on the basis of thermal input.
Unlike the dog, we do not create olfactory profiles of every setting. So our realities differ both with respect to specific features and the extent to which we remember and manipulate them. However, all agents have knowledge categories and are aware of secondary sensations. These are as real as sensory data received by the eyes. All agents know about specific features and changes in features in the surroundings. All know the responses they are able to perform and their effect on the setting. All know how to plan strategies to avoid or escape from specific conditions that induce negative sensations and how to seek or approach specific conditions that lead to positive sensations. In other words, the content that we experience is different, but knowledge and control over the current pursuit, and the passion or interest that we experience, are shared with all other systems that have genetically solved the problem of creating voluntary, purposeful responses. There is little doubt that human consciousness has far more extensive voluntary control over the classification system shared with the infrasystem. So the human system is able to create features of specific situations that do not exist and abstractions based on a potentially large repertoire of patterns and relationships. The human agent does its job well, just as the crab’s agent does its job well.
SYSTEM DESIGNS
The necessary functions of the system imply why some design features of the human brain may be as they are. For instance, the brain undergoes growth spurts at different stages of development. These spurts may be designed to permit the human to accommodate the extensive unfamiliar learning that must occur if the infant is to learn the foundation skills needed for later extensions. If something is learned with great difficulty and many errors, it is never learned as well as it would be with fewer errors. Achieving either automaticity or total intuitive understanding is not possible because the initial learning required for language and understanding of basic physical relationships results in extensive mislearning before the system has constructed a working model that accounts for phenomena the learner has experienced. The learning curve for highly unfamiliar learning has a high-error phase followed by a low-error phase. The brain growth spurts may be coordinated with the onset of the low-error rate for basic learning. The brain growth spurt would be a clever way for the system to transfer what had been learned onto what amounts to virgin tissue, a tabula rasa that does not have all of the prior misconceptions and errors generated during the initial learning. The earlier record still exists, but functions that are more automatic may be keyed to the clean copy of recently revised classification categories. This reproduction would be intuitive and would require less agent direction.
In the same way, serious brain damage may not result in great disability if the damage occurs before another brain spurt is scheduled. Assume that the current brain has sufficient space (afferent neurons and targets) to relearn basic mental functions; however, this relearning will require an enormous amount of practice (particularly during the early stages). If a growth spurt occurs after the functions have been relearned, the products of relearning are transferred to the clean copy, and the agent again has automatic control of relearned operations. This scenario may be the major reason that younger subjects with the same type of trauma are far more likely to achieve a much higher level of relearning (sometimes to the point that it would be difficult for someone to detect any disability) than older subjects (Finger & Stein, 1982).
Learning as Content
Without intricate biological processes, neural pathways, and spinal reflexes, behavior could not occur regardless of whether the behavior relates to learning. The analysis of inferred functions, however, rejects the idea that the foundation of learning lies in these physiological processes. Learning involves content even if the content is a set of rules for behaving. So unless learning is expressed in terms of content, it is not explained. The spinal reflex has nothing to do with content except in a binary way. Therefore, the conditioned reflex is largely irrelevant to the learning of operant responses (responses influenced by primary reinforcers). What is relevant about the conditioned reflex is that specific content is involved. Conditioned reflexes simply document that the induction of content has influence on unconditioned responses as well as operant responses. This fact does not imply that conditioned reflexes represent learning that is more basic than operant learning or that operant learning somehow sprang from conditioned learning. The most basic type of learning would have to be operant because it is the only way to solve the difficult task of interacting with an environment that involves continuous variation of features and is uncertain in many ways. There are conditioned components in probably everything that is learned. In fact, if automaticity of function is achieved, there logically have to be conditioned components.
However, the essence of learning is not to be found in these components, but in how the system receives and processes information about specific content. If the conditioned reflex actually represented basic learning of some sort, it should be possible to show learning of a conditioned reflex without the learning of operant-response strategies. The theory of inferred functions predicts that it is impossible for the system to learn only one response. Rather, multiple discriminations are learned. These serve as the basis not for a single response, but for various operant-response strategies. Reports of conditioned learning typically create a distortion because they do not make reference to what is learned in addition to the intended response. Sometimes, however, the experimental design reveals the unexplained behavior. For instance, in several experiments that condition the flexion reflex of a dog’s right front paw, the dogs proved to be smarter than
the experimenters. After conditioning occurred and the dog reliably lifted its paw following a buzzer (to avoid an electric shock to the paw), the investigators restrained the paw so that it could not be moved. In response to the buzzer, the dog lifted its left leg (a response that had not been conditioned). When both front legs were restrained, the dog tried to position itself in the harness so that its rear paws would be off the floor (Bekhterev, 1932). For the theory of inferred functions, this is an issue of generalization. The presentation of the initial conditioning was consistent with the rule, “Lift your right paw when the buzzer sounds.” It was also consistent with the rule, “If you can’t lift your right paw when the buzzer sounds, lift something.” The feature of lifting was generalized. Clearly, this learning is a function of the training the dog received and is perfectly consistent with the information the learner received. The learner was never punished for lifting the left paw or the hind paws. Therefore, the dog’s content map was consistent with the information the training provided. Because most reports of conditioning do not specify the various operant responses that were induced by the training, the reports are accurate only to the extent that they describe one outcome of the training. The reports are misleading, however, because of the details they do not include. An analogy would be something like a newspaper account that said, “Fireman Bob Sweenie went up the stairs of the building at 6699 Oak Street last night at 11:35.” The reporter may be able to substantiate that the report of the time, place, and performance are absolutely accurate. 
The report, however, would be a distortion if the following facts were also true: The building was ablaze and a baby and her grandmother were trapped in an upstairs room; the captain ordered his men not to go into the building because of the danger of collapse, but Sweenie went up the stairs and saved both grandmother and baby. If we examine any concrete instance of inducing a conditioned reflex, we would be able to predict various operant behaviors that are induced even if they are not tested by the experiment or reported. We referred to an example earlier. We condition an eye-blink response in a dog. After N trials, the dog is thoroughly conditioned. The presentation of the buzzer is followed by the conditioned response (the eye blink). There is no doubt that learning has occurred, just as there is no doubt that fireman Sweenie went up the stairs. If we listed some of the other responses induced by the experiment in the same spirit that we referred to the conditioned eye blink, we would have to include the “leave the room” response, the “run down the hall” response, the “hunker down and whine” response, and, of course, the ubiquitous “head turn” response. It would be relatively easy to demonstrate any of these as a function of the training. If we take off the dog’s leash in the experimental room and
say, “Okay,” the dog would quickly exhibit the “leave the room” response. The dog had never been trained to perform this response to the conditioned stimulus “Okay” (in fact, the dog may not even wait for “Okay”). In the same way, it would be possible to open the outside door and observe the “run from the building” response. The “head turn” response is demonstrated by removing the snout strap (but retaining the harness). The dog turns its head to the side in response to the beep. Amazingly, the timing of the head turn is the same as that for the eye blink. The physiological probability of this sort of “radiation of effect” occurring is zero. What the dog learned was that various features of the setting predict a specific aversive stimulus. The dog had a clear understanding of what was going to occur, where it was going to occur, what body parts were involved, and what strategies were available for reducing the aversiveness of the shock. With the snout immobile, the only strategy is to blink. If the head is mobile, a more effective strategy is available. If all physical constraints are removed and the leader says, “Okay,” even better options are available—get the hell out of here. Unless the dog represented the particular setting, the predictor (buzzer), the timing of the predictor (one-quarter second), and what was predicted (air blast to right eye), the conditioning would be logically impossible. In fact, conditioning has never been established from simultaneous presentation of the conditioned and unconditioned stimuli or from reverse-order conditioning. If the buzzer and air blast are simultaneous, no conditioning occurs because the buzzer does not predict anything; however, it would still be possible to demonstrate the learning of all the other behaviors because the dog has represented the content of what occurs in this room. 
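The reasoning above can be illustrated with a minimal Python sketch. It is not a model taken from the conditioning literature. The function names, the ordering of strategies, and the split between buzzer-timed responses (blink, head turn) and a setting-based response (leaving the room) are assumptions made only for illustration. The sketch encodes two points from the text: a cue predicts only if it precedes the aversive event, and the learner selects the most effective strategy that its physical constraints allow.

```python
from typing import Optional


def cue_predicts(lead_seconds: float) -> bool:
    """A cue predicts the event only if it starts before the event.

    lead_seconds > 0 means the buzzer precedes the air blast; 0 means
    simultaneous presentation; a negative value means reverse order.
    """
    return lead_seconds > 0


def choose_response(lead_seconds: float, head_free: bool, unleashed: bool) -> Optional[str]:
    """Pick the most effective strategy the current constraints permit.

    Assumption for illustration: leaving the room depends on what the dog
    has represented about the setting, so it does not require the buzzer
    to be a predictor. Blinking and head turning are timed to the buzzer.
    """
    if unleashed:
        return "leave the room"
    if not cue_predicts(lead_seconds):
        return None  # the buzzer predicts nothing, so no buzzer-timed response
    if head_free:
        return "turn head"
    return "blink"


if __name__ == "__main__":
    print(choose_response(0.25, head_free=False, unleashed=False))  # blink
    print(choose_response(0.25, head_free=True, unleashed=False))   # turn head
    print(choose_response(0.0, head_free=False, unleashed=False))   # None
    print(choose_response(0.0, head_free=False, unleashed=True))    # leave the room
```

The one-quarter-second lead is the value given in the text. Everything else in the sketch is a hypothetical simplification of a single content map that supports several operant strategies.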
Artificial Intelligence
The details of the infrasystem and agent provide the field of artificial intelligence with a meta-blueprint for constructing intelligent machines that fulfill the internal functions of the two-system model. This meta-blueprint does not articulate the nuts-and-bolts construction details, but it describes design features and scope of relationships, operations, and content that must be in place for particular intelligent phenomena to occur. For example, the design of the infrasystem must have the classification and comparison functions. The test of any specific blueprint based on the required functions is whether it meets the information-processing and transformation requirements specified by the meta-blueprint.
There are many models of intelligence and many intelligent functions; however, none of the extant ones has apparently addressed the most fundamental learning of relationships: exposure to positive and negative examples and the logical processes that permit identification of what predicts the positive outcome. The smart machines are indeed smart, but the basic components of learning are not involved. The goals of performance and relationships that are to be processed have already been identified. What the machine does is manipulate the variables to achieve a preestablished goal state. In other words, it creates plans for the present setting. It may retain a record of this plan and apply it to the next occasion. However, the system does not go from raw sensory data, use information about whether each example is positive or negative to identify features of the example, and create (rather than simply follow) testable rules about the features that predict the positive or negative status of the example.
For a machine to learn basic content, it must be designed so that the system draws inferences from the fact that an example is classified as a positive. The system would next infer that the sum of the features identified for positives cannot identify a negative example. Next, the system would infer which specific features are unique to this positive example and which are irrelevant to its positive status. If the system were classifying an example as a host with residential features, it would describe the host in the most specific terms that accommodate the variables. In other words, if there were three variables (A, B, C) and three inclusive hosts (1, 2, 3, with 1 being the smallest), the system would identify 1 as the host if it accommodated the variables. The system would reject 2 and 3 on the basis that they are not necessary. Figure 18.1 shows the three inclusive classes. The system would identify house as the appropriate host. To describe the house, starting with structure is rejected because it creates two unnecessary classes: structure and building.
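The host-selection rule described above can be sketched in a few lines of Python. This is a hedged illustration, not a design from the text. The feature names (`size`, `color`, `roof_color`, `floors`, `span`), the `HOSTS` table, and both function names are hypothetical, and the idea that a host "accommodates" a variable is modeled here simply as set membership.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical inclusive hosts, listed from smallest (1) to largest (3).
# Each larger host accommodates everything the smaller ones do.
HOSTS: List[Tuple[str, Set[str]]] = [
    ("house", {"size", "color", "roof_color"}),
    ("building", {"size", "color", "roof_color", "floors"}),
    ("structure", {"size", "color", "roof_color", "floors", "span"}),
]


def most_specific_host(variables: Set[str]) -> Optional[str]:
    """Return the smallest host that accommodates every variable, or None."""
    for name, accommodated in HOSTS:  # the smallest host is checked first
        if variables <= accommodated:
            return name
    return None


def inert_hosts(variables: Set[str]) -> List[str]:
    """Return the larger hosts that are rejected as unnecessary."""
    chosen = most_specific_host(variables)
    if chosen is None:
        return []
    names = [name for name, _ in HOSTS]
    return names[names.index(chosen) + 1:]


if __name__ == "__main__":
    example = {"size", "color", "roof_color"}  # large, white, green-roofed
    print(most_specific_host(example))  # house
    print(inert_hosts(example))         # ['building', 'structure']
```

Under these assumptions, a description keyed to `house` carries no inert classes, while one keyed to `structure` would carry two.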
It says in effect, “The structure that is a building that is a house has the features of being large and white and green-roofed.” The classification that starts with building has only one inert class: “The building that is a house has the features of. . . .”

FIG. 18.1. Three features within three hosts.

The agent of an artificial intelligence engine would be a set of rules for using information to produce responses in the current setting. Depending on the degree to which the construction is to emulate the system of mammals or humans, the set of rules would be supported by some hardwired templates or default content maps keyed to specific stimulus conditions. The simplest rules would be programmed into its sense receptors. If a modality encountered something that was too extreme (too bright, too hot, too loud), the system would urge the agent to produce a minimum escape behavior.

More complex operations would be needed for the agent to plan and direct responses based not on single sensors, but on the sum of the sensory input. The system would need a working model of the diagrams presented in chapters 4 and 6. The agent functions would be to plan and direct the response. The infrasystem would use the plan as a criterion for creating a projection of the ideal outcome in the current setting. The assumption is that the current data will be transformed into the data projected for the outcome. The need for adjustments would be conveyed to the agent, and the agent would revise the plan that leads to a new projection.

The infrasystem would have to be designed so that it (a) preserved multiple classifications of the event (by feature), and (b) classified the event as a one-time event that had the sum of the various features that are classified individually.

At the heart of the system would be operations for representing and identifying features. These are best conceived of as transformations. If two examples are positive, whatever is involved in transforming one of the examples into the other is irrelevant to its classification as a positive. If the transformation from one shade of green to another creates two positives, the difference between the two marks the range of permissible variation. In contrast, the transformation from a positive example to a negative implies the essential features that are unique to positives. For instance, if the positive is green and the negative is yellow, the transformation from positive to negative would identify yellow as a feature beyond the range of acceptable positive variation.

The system that learns basic relationships would need some form of these transformation capabilities. The operation would be relatively easier (certainly not easy) to apply to a single sensor. For accommodating multiple sensors, the logic of the infrasystem would have to permit the identification of relationships across modalities. Changes in sound would be correlated with changes in visual properties. Both would be presented to the agent on the same spatial map.

All relative relationships could be identified or abstracted by another transformation. If Events A and B are viewed at different distances and the learner is testing a host for largeness, the system could convert both events to a standard distance, a predetermined reference point, and compare them for any visual relative features. For identifying sounds, the system could keep track of the background noise, adjust the background noise to a standard, and determine the relevant features of the targeted sound.

The feature analysis of enduring and transitory features would be necessary to identify the host and what it is doing on a particular occasion. Again the analysis implies that the system has transformation capacity. The current sensory input would be compared with representations of individuals within the repertoire. Basically, the entity would be transformed into individuals in the repertoire. The identification of a match is the individual not identified as a negative that requires the least transformation. By applying some variation of this formulation, the system could learn about all transformations that a particular individual undergoes as well as learn the generalizable rules of space transformations that apply to all physical things.

If the system discovers that the closest match to the observed host is not a positive example, the system would identify the various unique features that are not accounted for by standard transformations (distance, illumination, etc.) and record the new entity as the sum of its features including the unique ones (or simply as the difference in transformation between the positive). Further observations will determine which of the difference features are the reliable predictors for the new individual.

This description of possible processes does not begin to describe the complexity of the machine that would be needed to perform the functions of mammalian learning. It does not even cover all the basic functions that would have to be in place to identify enduring or transitory hosts or searches that involve features of features.
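The transformation account above can be illustrated with a deliberately small sketch. It is not a design the analysis specifies; it assumes that every feature is a single number, that "amount of transformation" can be approximated by summed feature differences, and that the names FeatureRange, learn_ranges, distinguishing_features, transformation_cost, and closest_match are hypothetical labels introduced only for this illustration. Differences between positives widen the range of permissible variation, a negative that falls outside that range marks a distinguishing feature, and a match is the stored individual, not already ruled out as a negative, that requires the least transformation.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureRange:
    """Permissible variation of one numeric feature, learned from examples."""
    low: float
    high: float
    excluded: list = field(default_factory=list)  # negative values outside the range

    def admits(self, value: float) -> bool:
        return self.low <= value <= self.high


def learn_ranges(positives, negatives):
    """Build one FeatureRange per feature (illustrative assumption: numeric features).

    A positive-to-positive difference widens the permissible range.
    A negative whose value falls outside that range is recorded as excluded.
    """
    ranges = {}
    for example in positives:
        for name, value in example.items():
            if name not in ranges:
                ranges[name] = FeatureRange(value, value)
            else:
                r = ranges[name]
                r.low, r.high = min(r.low, value), max(r.high, value)
    for example in negatives:
        for name, value in example.items():
            if name in ranges and not ranges[name].admits(value):
                ranges[name].excluded.append(value)
    return ranges


def distinguishing_features(ranges):
    """Features on which at least one negative fell outside the positive range."""
    return sorted(name for name, r in ranges.items() if r.excluded)


def transformation_cost(observed, stored):
    """Crude stand-in for 'amount of transformation': summed feature differences."""
    shared = observed.keys() & stored.keys()
    return sum(abs(observed[name] - stored[name]) for name in shared)


def closest_match(observed, repertoire, ruled_out=()):
    """Stored individual, not ruled out as a negative, needing the least transformation."""
    candidates = [name for name in repertoire if name not in ruled_out]
    return min(candidates, key=lambda name: transformation_cost(observed, repertoire[name]))


# Two green positives of different sizes and one yellow negative (made-up values).
positives = [{"hue": 120, "size": 3}, {"hue": 140, "size": 9}]
negatives = [{"hue": 60, "size": 5}]
ranges = learn_ranges(positives, negatives)
print(distinguishing_features(ranges))
```

In this toy case size varies freely among the positives and the negative's size falls inside that variation, so only hue is reported as distinguishing. A machine of the kind described would also need the standardizing transformations (distance, illumination, background noise) that this sketch leaves out.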
The system would need to create rules for various entities, rules for every class of individuals, and ultimately rules about rules.

For virtually all facets of the design, however, transformations are implied. Transformations describe the range of possible variation for the positive examples, the difference between positives and negatives, and the features that are added or subtracted to distinguish one from another. Transformations are needed for the simplest projections, for the simplest plan, and for the simplest directive to respond. Because all transformations assume specific features, the products of transformations must be expressed as specific features of individuals or groups.

If the development of artificial beings implied by this analysis is to progress from simplest to more complex, it could start with a single function like correlating responses to data. The response could be to a hardwired map to move to intensify X, where X is a feature like olfaction that requires a relative analysis that functions as a temporal analysis for the agent. If features are added to the basic core, the artificial being could accommodate functions required for more complicated pursuits.

Current attempts to produce humanlike machines tend not to follow the directions and requirements implied by the analysis of inferred functions. Brooks, Breazeal, Marjanovic, Scassellati, and Williamson (1999) assembled ideas from various fields and tried to coalesce these into a scheme that would somehow result in a machine that would learn. Machines that have been developed even have expressions, the assumption being that expressions will provide information to the other being involved in the interchange. For these efforts to approximate any of the intelligent functions of animals, however, the core of intelligent behavior must be addressed starting with basic performance and basic learning.

Abstractions and Features

Throughout the chapters, we have referred to every qualitative aspect of a host or feature as a feature. This treatment seems to imply a pandemic notion of what a feature is. (Is everything a feature?) The answer is that all qualities and relationships that have been identified would necessarily serve as features of some events. The system must possess the potential for such extensive identification of features because if all the relationships of the host H with other hosts and features are not identified as features of H, it would not be possible for the system to identify the relationship by searching for H.

Consider what would occur if the system were not designed so that all individuals were classified as the sum of their features (including relationships). Let’s say that any host represented by the system would have only a set of physical features, not relationships. The agent would search its memory for house H and the system would respond by presenting the physical features of the house.
The agent would now be able to identify the house in some detail—the patch on the west side of the roof, the windows, and all other enduring features of the house. The problem is that not all enduring features are necessarily permanent. The patch was not always on the roof. An important question is whether a feature that was once present, but no longer is, should be retained by the system. If the system recorded only enduring features, the agent would not know that the house was remodeled and painted a different color. Unless both the transitory and permanent features are recorded as features of the house, searching for the old version would not result in any information about the new. Searching the new version would not reveal information about the older version, and searching for a particular feature no longer present in the house would not be possible and would represent a contradiction in terms. The system would not know which host has this feature because the host at the earlier time is not connected with the current host.

The agent would not be able to relate the house to the people who lived in the house, the one-time events that occurred in the house, or details of the house that are contingent on some conditions—the leaky window on the second floor, the tendency of the pipes to freeze when the temperature drops below zero. If all the relationships that involved the house were not recorded as features of the house, the search for the house would not yield information involving the neighboring houses or the physical relationship of the house to Allen’s drug store.

Associations as Feature Relationships. The host must be classified both as the sum of all of its features and an instance of each of its features—including the one-time events that occurred in or in relation to the house. Therefore, the relationship of the house and freezing pipes could be accessed by the system not only through a search for the house, but also through a search for “freezing pipes,” “things that happen when the temperature drops below zero,” or “house problems.” If the system were not designed in this way, there could be nothing like what we call associations.

A simple association is a relationship between two things. There is a basis in sameness of feature for any association. Farm and cow are associated because they tend to occur in the same place. Bridge and skyscraper are associated because they are instances of the same class (manmade structures). A wide range of variation in form is possible, but for any association a relationship requires feature or host H, a qualitative relationship, and feature or host F. Unless the system is capable of conducting at least three types of searches—for H, for the relationship, and for the related host or feature— the system could not formulate associations. Relationships are the grist of the agent’s repertoire.
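The requirement of three types of searches can be pictured with a small sketch. It assumes, purely for illustration, that an association is stored as a (host, relationship, related item) triple and indexed three ways; the class name AssociationStore, its methods, and the sample relationships are hypothetical and are not taken from the analysis.

```python
from collections import defaultdict


class AssociationStore:
    """Stores each association once and indexes it by all three of its parts."""

    def __init__(self):
        self._by_host = defaultdict(set)
        self._by_relation = defaultdict(set)
        self._by_related = defaultdict(set)

    def record(self, host, relation, related):
        """Record the association as a feature of the host, the relation, and the related item."""
        triple = (host, relation, related)
        self._by_host[host].add(triple)
        self._by_relation[relation].add(triple)
        self._by_related[related].add(triple)

    def search_host(self, host):
        return set(self._by_host.get(host, ()))

    def search_relation(self, relation):
        return set(self._by_relation.get(relation, ()))

    def search_related(self, related):
        return set(self._by_related.get(related, ()))


store = AssociationStore()
store.record("house", "has problem", "freezing pipes")
store.record("house", "is near", "Allen's drug store")
store.record("farm", "occurs in same place as", "cow")

# The same association is reachable from either end.
print(store.search_host("house"))
print(store.search_related("freezing pipes"))
```

If the store were indexed by host only, a search for "freezing pipes" would return nothing, which is the situation the text describes for a system that records only physical features of hosts.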
This fact is demonstrated in conversations. Each sentence expresses a new relationship, some of which relate a number of features to a single host. “Tina, do you remember the time in March that you were having a sleepover with four of your friends, and the temperature got down to something like seven below zero?” Yes, she does, and she remembers other times the pipes froze, how Dad had rules that had to be followed when there was an impending temperature drop below zero, and a network of associated details. “Remember? Every sink had to be running overnight, and Dad got so mad when Mom would get up in the middle of the night and turn off the water in their bathroom because it made that big gurgling sound.”

Unless these memories are features of the house, no such “Do you remember” trails could be mapped by the system. The search for the freezing pipes would reveal information about freezing pipes, but it would be a listing, and about the only implications that would exist for conversations about freezing pipes would be for people to compare lists of the events that are classified in their infrasystem under freezing pipes.

Abstract Thought. We have stated on various occasions that the agent is incapable of abstract thought or thought that does not involve some concrete singular focus. When you think of a blue bridge, an individual bridge is presented immediately to you as an insight—an image that may be transformed by the agent to become a railroad bridge, a covered bridge, and so forth. However, no matter how hard you force yourself to think of all bridges, the infrasystem does not present an insight. Although the infrasystem obviously has a classification of all bridges, it would be completely meaningless to the agent because, in interactions with the real world, the agent will never encounter all bridges or even all blue bridges. The agent encounters individuals and only individuals. If the system simply presented the agent with an abstraction of all bridges, what could the agent do with it?

It may be suggested that the agent engages in abstractions when it deals with relationships that have no counterpart in the physical world. For instance, “good is to bad as hot is to. . . .” The answer is not a singular event or host. We do not picture a particular thing changing from good to bad and then from hot to cold. We do, however, recognize the task as something that is concrete and singular. It is a concrete example of a pattern, which is abstract and recorded as an abstract form by the infrasystem: A is to B as A′ is to B′. The pattern is intricate because A and A′ must be from the same basic category and undergo the same transformation to create B and B′. However, this pattern is applied to a concrete, singular packet of information. (Good is converted to its antipode as hot is converted to its antipode.) The task generates a limited set of correct answers.
Furthermore, if we verbally characterize its form as A is to B as A′ is to B′, we are not actually describing the infrasystem’s classification but approximating it with a specific template for creating concrete singular instances consistent with the pattern. It may be argued that the content is more abstract than that of bridges and blue, and in one sense it is. Words about things are one step removed from the things. In a broader sense, however, the only difference is the content conveyed to the agent. We do not present actual bridges to the learner. We present words. The bridge and its color are fashioned from these words. For both the analogy and the blue bridge, the specific content conveyed by the words serves as a criterion for the search that the agent performs, and the product is a concrete singular application that is consistent with this criterion. For the bridge example, the agent makes the bridge blue, but does not convert the bridge into a fish market. For the pattern “Good is to bad as hot is to . . . ,” the learner does not say “Tuesday.”
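The analogy task can be sketched as a search constrained by a criterion. The sketch assumes a tiny hand-built lexicon in which each word lists the transformations it can undergo; LEXICON, infer_relation, and complete_analogy are hypothetical names, and the entries are invented for the example. The transformation that converts A into B is identified first and is then applied to A′, so the product is a concrete singular answer consistent with the pattern.

```python
# Hypothetical lexicon: each word maps a transformation name to the word it produces.
LEXICON = {
    "good": {"antipode": "bad"},
    "bad": {"antipode": "good"},
    "hot": {"antipode": "cold", "milder": "warm"},
    "cold": {"antipode": "hot", "milder": "cool"},
}


def infer_relation(a, b):
    """Name of the transformation that converts a into b, or None if none is recorded."""
    for relation, target in LEXICON.get(a, {}).items():
        if target == b:
            return relation
    return None


def complete_analogy(a, b, a_prime):
    """Apply to a_prime the same transformation that converts a into b."""
    relation = infer_relation(a, b)
    if relation is None:
        return None
    return LEXICON.get(a_prime, {}).get(relation)


print(complete_analogy("good", "bad", "hot"))
print(complete_analogy("hot", "warm", "cold"))
```

Because the answer must result from the same transformation applied to A′, an unrelated word such as "Tuesday" is never produced; when no transformation is recorded, the sketch returns None rather than guessing.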
The rule of concrete specificity would seem to be contradicted by someone achieving a mental state in which thought was on a different level than it is during ordinary consciousness. Although there is no doubt that such states are possible, the fact that the person who achieves the state is able to report on the state indicates that the host is the event of achieving the altered state. If the person reports achieving the altered state, the person was able to discriminate between that state and the normal state. It had features not shared by the normal state and may be described by other features unique to this particular experience. Therefore, it was not abstract, but simply an event with specific features. These features refer to the sensations the agent received. The experience is no more abstract than one described as, “Wow, do I ever feel dizzy.” Math is abstract in the same way that word games or grammar are. Like the others, the result of a search is always a specific example that meets a set of criteria. The content of math is unique and sometimes involves rules about rules. The number of layers is interesting from a construction viewpoint. From the standpoint of how abstract the task is to the learner, it is basically no more abstract than the blue bridge task. The only difference is the pattern and set of features that are relevant to creating a concrete singular application. At an elementary level, we can require the learner to discover a pattern such as 5, 3, 6, 4, 7 or even use facts to complete analogies such as, “The power of one is to straight lines as the power of two is to. . . .” The mathematician who is trying to formulate a derivation is performing an operation parallel to creating a line of a poem. The content that the mathematician refers to is not directly observed by the senses; similarly, the meaning of the word blue is not observed directly by the senses. Blue is directly observed; the meaning conveyed by the word is not. 
In the same way, the resulting derivation is directly observed by the senses; the meaning is not. Without an understanding of rules for interpreting the derivation, it would appear to be gibberish. The abstractness of a number is based on a single, observable feature of objects or events. If we observe an object in Location A and later in Location B, we conclude that it had moved, but the property of movement is based on a single feature of the object—its current location. In the same way, a group of three events or objects is judged to be different from a group of two or four only with respect to a single feature of the group. That feature is number. This number feature is fairly easy to teach because it is analytically no different from other features that may be identified in the current setting. The language and grammar of math are indeed elaborate, but assertions or derivations have logical checks to determine their validity. Just as 3 cannot
equal “not 3,” an elaborate derivation cannot contradict established facts and relationships.

The infrasystem performs abstract operations, but the features of the infrasystem are explained in nonabstract terms. The explanations describe something not directly observed. In the same way, if inferences are based on accepted rules of math, further implications are possible, resulting in discussion of features that may be far removed from direct experience.

Ultimately, the simplest test of whether it is possible for the agent to abstract any feature is to try to think of the feature so it is independent of all others. Think of number without thinking of some things, range, or context that have more than the feature of number. It cannot be done because the system presents hosts with features to the agent.

Symbolic Logic

One reader of this work observed that it would be possible to frame the various arguments more succinctly through symbolic logic. Actually, we began with such a plan. We abandoned it, however, because it tended to displace the focus from the content to the logical operations imposed on the content. Using symbolic logic presents two primary problems:

1. The basic logic of the infrasystem is not fully consonant with some conventions of symbolic logic.
2. Symbolic logic is not designed to express the content modes that the infrasystem employs.

Problem 1 is most easily illustrated with the task of telling someone, “Give the man in the green hat $5.” The statement does not have to be phrased, “If and only if the man is wearing a green hat, give him $5.” Yet that is exactly what the statement means to the infrasystem. Symbolic logic would judge an outcome to be acceptable if the messenger gave everybody in the building $5, so long as the man with the green hat received $5.¹ The infrasystem’s if-and-only-if logic is restricted to specific content.
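The difference between the two readings of the instruction can be checked mechanically. The sketch below assumes a made-up building with three people, only one of whom wears a green hat; the names PEOPLE, implies, iff, and outcome_acceptable are hypothetical. It evaluates each possible outcome under the material conditional of symbolic logic and under the if-and-only-if reading attributed to the infrasystem.

```python
def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """Biconditional: true only when p and q have the same truth value."""
    return p == q


# Hypothetical occupants of the building: name -> wears a green hat?
PEOPLE = {"man in green hat": True, "bystander": False, "clerk": False}


def outcome_acceptable(connective, paid):
    """Check 'wears green hat <connective> receives $5' for every person."""
    return all(connective(has_hat, name in paid) for name, has_hat in PEOPLE.items())


everybody = set(PEOPLE)
only_target = {"man in green hat"}
nobody = set()

for label, paid in [("everybody", everybody), ("only target", only_target), ("nobody", nobody)]:
    print(label, outcome_acceptable(implies, paid), outcome_acceptable(iff, paid))
```

Paying everybody satisfies the conditional reading but not the biconditional one, while paying only the man in the green hat satisfies both. That asymmetry is the mismatch described as Problem 1.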
In responding to the direction “Think of a bridge,” the system complies by thinking of a bridge and other context details—rocks, water, and a particular perspective (the bridge imagined from a particular vantage point). The directive the system issued would seem to be something like “If it’s a bridge, think of it,” not “If and only if it is a bridge, think about it.” If we say “Make the bridge blue,” however, the response is not to make other things in the scene blue, but to follow the directive conservatively. Only the bridge is transformed. “If and only if it is a bridge, make it blue.” The reason for this apparent inconsistency is that the bridge is a host and blue is a single feature that resides in hosts. Hosts cannot be imagined without some form of context (perspective). Therefore, the system finds no contradiction in adding detail that would be observed in connection with the host. In contrast, blue is a single feature of hosts. If the host is specified, the feature is limited to the host. To extend blue to other hosts would contradict the intent of the instructions. For both “Think of a bridge” and “Make it blue,” the system is thinking as conservatively as possible. The difference in the pattern is a function of the inherent difference between host and feature.

¹ If the first part of an if–then rule is false and the last part is true, symbolic logic holds that the whole rule is true. For the rule, “If there is a man wearing a green hat, give him $5,” the first part is “false” if we give somebody else $5. The application is still considered to be “true” or valid.

This is not to suggest that it is impossible to specify conventions for expressing these relationships in symbolic logic; in fact, that is what the designer of a machine that performed as mammals perform would have to do. For our purposes, however, everyday language expresses the conventions of the system in a way that we understand intuitively. If we told somebody to give $5 to the man in the green hat and our messenger gave $5 to other hosts in the area, we would not conclude that the messenger was perfectly logical. We would judge that he had a serious deficiency in understanding.

Problem 2 is more serious in some ways. Symbolic logic is not readily adapted to the different logical modes that the system employs when it performs a simple act of converting the content of a content map into a directive to respond in the current setting. The system employs at least three different logical modes.
These are necessary because the system performs operations that are not logically connected. They are systemically connected, which means simply that the infrasystem is built in a way that connects them. Following symbolic logic rigorously would require additional conventions for each of the logical modes (which is achievable, but relatively clumsy). The three modes the system employs are best expressed as analogues to sentences that have the verb ought, the verb will, and the verb do. Each mode accommodates some form of the verb is.

Ought. When the agent is urged to do something by a content map (such as intensify X), the closest translation into words would be a second-person statement with the verb ought. “You ought to intensify X.” The ought is not a normative ought, but a logical one. The agent is not compelled to follow the urging. However, the urging expresses a standard that is preferred.
The derivation of an ought conclusion is possible only if the system provides an ought premise. The agent adds facts, which are best represented by sentences that have the verb is, not ought. The conclusion has the verb ought. From the agent’s standpoint, the process is expressed as follows: “I ought to move to intensify X. X is over there. Therefore, I ought to move over there.” Note that the conclusion cannot be something like, “Therefore, I will go over there.” This is intuitively what the agent understands. From the standpoint of logic, however, the conclusion is expressed with the same verb as the premise.

Will. This mode follows the same format as ought, but it involves the verb will. This is the language of projections. Projections are promises about what will occur. The “syllogism” for will involves a will premise, a fact (the verb is), and a will conclusion. “Outcome Y will occur if I do X. Doing X requires specific behaviors 1–4. Therefore, I will do behaviors 1–4.” Again the actual doing does not follow automatically. The conclusion is a plan, not an act or directive to act. It is merely a promise.

Do. In the same way, the do mode involves a do premise, a statement of fact, and a do conclusion. The do mode would be expressed as something like this: “Do what you plan. You plan behaviors 1–4. Therefore, do behaviors 1–4.”

A logical problem that applies to the ought, will, and do modes is that they involve mixed verbs and therefore are not strictly true or false. To be true or false, all parts of the argument would have to be expressed in the same verb—a form of is and, in some cases, was. The verbs will, ought, and do certainly follow conventions that are parallels of arguments expressed with the verb is, but for the system, conclusions are not judged on the basis of truth, but rather on the basis of rules the system follows to transform information about what is currently present into projected changes and finally into action. The truth of the conclusion “I will do behaviors 1–4” does not reside in the conclusion, but would have to be determined by applying external standards. After the learner produces behavior, it is possible to judge whether the promise was true or false, but the judgment derives from physical evidence, not the conclusion of the deduction.

The modes of ought and do are even farther from being true or false than is the mode of will. The conclusion of ought cannot be judged unless some standard of ought is adopted. The conclusion is not true, merely consistent with the premise or inconsistent with it. The conclusion of do is related to facts only in the sense of whether the learner is designed to do what the directive says to do. Unless we had some independent measure of what the learner was supposed to do in connection with a particular do directive, we could not judge whether the learner had fulfilled its obligation. Yet the obligation is not true or false. It is either consistent or inconsistent with the rules.

Another point is that none of these modes derives from facts of any sort. The modes are strictly impositions of the system. They are related to facts and to each other, not by logic, but by design stipulation. No amount of factual information could lead to a conclusion of ought unless the facts were related to an ought rule or premise.²

Performance Versus Learning

The analysis of inferred functions starts with performance rather than learning because all learning occurs within the context of performance. The learner may be walking, resting, or crouching when the occasion for learning occurs. It does not matter whether the behaviors the organism is currently producing are learned or hardwired—they are a necessary context for learning.
The role of performance is illustrated by the directive “Move to intensify X.” A certain degree of performance is required for the learner to apply the rule to a given setting regardless of whether the rule is learned. The details of the setting influence the specific behaviors that the learner will produce. Because these details are variables, they influence the hardwired system in the same way they would influence a system that had learned the rule. The operations involved in planning the response, projecting changes in the setting, and directing and adjusting responses on the basis of a criterion are not variables for learning. These are performance variables.

² The logical modes that the infrasystem uses are not coextensive with the grammatical modes (modal auxiliary verbs, which precede the main verb and qualify it). However, ought and will are grammatical modal verbs. Furthermore, each of the grammatical modal auxiliaries follows the same logical format as ought and will. One premise and the conclusion have the modal auxiliary verb or the functional equivalent (can, could, must, need, etc.). The other premise has the verb be (is). No other arrangement of verbs will yield the conclusion. The verb do (stated or implied in an imperative) is not classified as an auxiliary modal, but logically it has the same properties as the other modals.

Whether the learner is testing a possible rule or applying one that had been learned, the performance context is a given. Analysis of this operational context permits a more precise identification of what the learner logically must do to learn the rule. On any given learning trial, the learner will plan something, project something, and determine whether the projected outcome has been achieved. Then and only then does the system have information needed for learning.

In a sense, learning is a trivial extension of planning and projecting. Only the size of the unit that is planned changes. Instead of a plan for Situation A and a different plan for Situation B, the system identifies what is the same about A and B and attempts to make a permanent plan that applies to A, B, and any other instances that have the same features common to A and B.

Unless the requirements of performance are separated from those of learning, it is not possible to rigorously identify the minimum necessary functions of the system that address learning. The functions unique to the learning would be amalgamated with the performance variables in a way that would sometimes make the separation of functions impossible.

Probability

Any theory of learning is influenced by probability.
Although we have stated outcomes in a categorical sense, not as probabilities, practical considerations dictate that a margin of error is needed because (a) some aspect of the teaching may not have resulted in complete mastery, (b) there may have been subjects whose past learning negatively influenced their ability to learn the new material as presented, (c) there may have been children who were absent for some of the training, (d) some of the children may not have had the prerequisite skills needed to support mastery, or (e) the program used may still require some work before it is capable of being completely successful with the full range of children. Because of these potential demurs, we have to allow for some margin of error. However, the investigator’s use of probability should not be confused with the operations the learning system performs. Some explanations of learning assume that the system performs probability analyses—such as random switches in attention from one dimension to another (Estes, 1982). There is absolutely no way that an organism would be designed to learn by attending randomly to different aspects of the setting. Given that even the most pristine setting has at least hundreds of physical features, do we
expect the learner to sample all of them or just some? If it samples only some, what is the basis—salience, proximity, duration? Even if the basis were limited to two dimensions, what kind of organism would be designed so that it randomly stopped attending to something and attended to something else? The organism that is moving to intensify X needs comparative information about X. If it randomly attends to Y, the attention would be perfectly nonfunctional; it would make the calculations for X more difficult or even irrelevant.

In any case, probability models may predict overall tendencies that involve a set of individuals or occasions and may isolate relevant variables. Probability models, however, explain nothing about the design of the system, merely the observed effect.

Learning and Teaching

All teaching applications involve content. Therefore, a theory that does not address content cannot address the central variables in learning. A common result is that a theory identifies examples that are consistent with the theory, although they are not rigorously generated from the theory. Therefore, if the practice is successful, it does not really confirm the theory. It simply does not disconfirm it. Crude correlations do not confirm the theory; they simply do not disconfirm it. The fact is that, in most cases, the correlations could be generated by hundreds of different theories. If the theory is not detailed enough to imply low-probability outcomes, the theory cannot take credit for generating its successful applications.

Let’s say a theory specifies that if the correct response is produced in the presence of the discriminative stimulus, the associations will lead to learning. For this assertion to be valid, every example within the class of associations would have the same effect—that of leading to learning.
If the theory does not restrict the response any further than being produced in the presence of the discriminative stimulus, any randomly selected or contrived examples that did not result in the predicted outcome (learning the appropriate discrimination) would disconfirm the proposition. This proposition about learning is actually based on a crude correlation. Learning does not take place unless the learner is shown how to respond to the examples. The proposition is preposterous (although such interpretations are found in the literature; Anderson & Faust, 1973). Train 25 preschoolers who have no knowledge of reading or letter identification to read the words on 15 picture cards. In all cases, the picture will show its referent—dog, cat, goat, eggs, tree, and so forth. Each time the picture is presented, the children are to touch under the word and then say the word. They will quickly “learn” to identify all the words correctly. Clearly, the children produce the correct responses in the presence of the discriminative stimulus.
After the children respond correctly to all 15 pictures in any order, we switch the words and pictures so the picture of the goat is on the card with the word dog and so forth. We test the children individually: “Now don’t get fooled. Touch under the word and look at it carefully. Then tell me what that word says.” We will discover that not all the children will read the words correctly. In fact, for some groups, we may find that a high percentage of the children, perhaps even all, read all the words incorrectly. We would discover the same outcome if we tested the children by presenting only the words without the pictures. The responses may be different, with a lot of children saying, “I don’t know” or saying nothing. This prediction is not based on knowledge of the individual children, but knowledge of what has to happen if the children are to learn. The mere presence of the discriminative stimulus means little unless the presentation is consistent with only one interpretation—that the words are discriminated by letter-arrangement features and only by these features. Because the burden of the proof is on the theory to account for all the examples in an implied set, it is a lot easier to disconfirm a theory than it is to confirm it. A small percentage of unpredicted outcomes disconfirm a theory, but all examples are needed to confirm it. Confirming the theory requires ruling out competing models that could account for the outcomes. This is a less than rigorous process because the explanations that are ruled out may not represent any particular threat to the theory.
IMPLICATIONS OF INFERRED FUNCTIONS FOR TEACHING

If a theory explains the variables involved in learning, an implication is that the control or maximization of particular variables would result in accelerated learning. A symptom of nearly all theories of learning, however, is that they have not resulted in significant contributions to effective teaching. The behavioral orientation has generated more effective extensions than other approaches. Controlling reinforcing variables clearly influences behavior in predictable ways. The changes are sometimes extensive and dramatic.

One of the training exercises we have used with trainers and teachers is to change school behavior of out-of-control elementary schools quickly and without extensive interactions with the students. The plan is simple. We station people at different places in the school, such as at entrances, in the hall, on the playground. Each person has tickets (little pieces of yellow paper, for instance). When a monitor observes a student walking properly, being cooperative, or behaving properly in other ways, the monitor gives the child a ticket and indicates only why the ticket is issued: “Here’s a good work ticket
for letting that other person in front of you,” “Here’s a good work ticket for walking quietly in the hall,” or “Here’s a good work ticket for sharing your ball with the other girls.” On the first day of the intervention, the monitor also tells the student, “Give that ticket to your teacher, and she’ll have something for you.” The monitor says no more. If other children object and try to explain why they should receive a ticket, the monitor ignores them, walks to another place, and waits at least 1 minute before identifying another student who is doing something appropriate. Later the teacher praises the children who received tickets and gives them a little treat. She tells the other children that she will redeem good work tickets for everybody else who exhibits good behavior. The effects of this procedure are as close to magic as anything ever observed in behavior change. Within 2 or 3 days, playgrounds and schools that had been totally rowdy will have students playing without arguing, holding the door open for others, walking in the halls, conversing without yelling, and behaving like a completely different population of children. The teachers’ behavior is also changed. They are not nagging children, shouting at them, or calling them names. All the details of the plan derive from the principles of reinforcement. The schedule of reinforcement is random. The reinforcement is positive. The student receives clear information about why the reinforcement was provided. One of our coworkers, Wesley Becker, pioneered demonstrations of how much and how quickly behavior would change if teachers were trained to “catch kids in the act of being good” (Becker, 1971, 1986). Although the plan seems ridiculously simple, everybody involved must be trained to behave in particular ways, and they must not deviate from the rules. If they do, the implementation will fail. 
Following the rules is difficult for many teachers in bad schools because historically they have used only names, threats, and ridicule to control the students. Although this type of behavioral change derives in a fairly straightforward manner from the behavior orientation, inducing new content does not derive as neatly from the principles of reinforcement. Aspects of inducing new content are often consistent with behavioral principles, but behavioral principles are not specific enough to address the variables of content learning. This fact should not be too surprising because the programming of content involves an analysis of the content. The analysis of behavior has no provisions for analyzing content—only behavior. The traditional S-R and S-S theories of Hull, Guthrie, Spence, and others are far less generative than approaches based on Skinner’s work because these theories do not adequately explain any significant behavioral phenomena. As Skinner (1965) pointed out, these theories are guilty of reductionism. These theories create inferred entities in the form of undisclosed stimuli and unobserved “fractional anticipatory responses” that do
not meet the requirements of legitimately inferred entities because they are logically incapable of generating significant extensions. They do not address content. Furthermore, they are simply post-hoc inventions created only to explain a particular phenomenon that would remain unexplained without the invention. Like the history of conditioned reflex, each new experimental result that was unpredicted resulted in additions to the theoretical frameworks until they collapsed. These theories have not contributed to the understanding of teaching. The reason is simply that they are not designed to imply how content knowledge is induced in a rat or human.

Analysis of Content

The theory of inferred functions directly or indirectly implies a large percentage of the details of effective instructional programs. All primary or manipulable variables that influence learning are considered by the theory. Any communication designed in accordance with the analysis would be designed to achieve the following:

1. to be consistent with a single content map;
2. to provide precise information about the features that govern the discrimination and behavioral strategies required;
3. to provide differential reinforcement for correct performance; and
4. to control variables associated with the learner adopting appropriate roles, rehearsing the information, and interpolating and extrapolating to various applications.

Single Interpretation. As we indicated repeatedly, the quiddity of effective instruction is that it does not tend to generate more than one interpretation. The interpretations that the material generates are a function of the set of examples presented to the learner, the sequence of examples, and the number of examples of various types. There are many ways in which the instruction may fail. The examples (and possible related rules) may not show the intended range of generalization.
Stipulation may occur if the early examples show a restricted range or if the early examples suggest a range that is more extensive than the desired range. The examples may be well designed, but the task may provide the learner with a spurious prompt. (The teacher moves her mouth in a way that prompts the first word of the answer.) To identify possible unintended content, the designer must conduct both an analytical test (correcting obvious alternative interpretations generated by rough-draft material) and an empirical evaluation that shows how children actually respond to the instruction. Note that this is not so much
an analysis of the score, but of the specific errors the children make. The specific errors often reveal unintended interpretations that are consistent with the material presented or the manner in which it was presented.

Effective Communication. The material may be consistent with a single interpretation, but may not be communicated clearly to the learner. The examples may be well conceived, but the teacher may present them so slowly that the child does not receive enough information to see how they are the same and how they are different. The tasks may be well designed, but the teacher may not present them with proper pacing or emphasis so that children fail to appreciate the importance of what the teacher is saying. The material may be potentially interesting to the children, but the teacher may not model behavior that suggests it is interesting. The examples may be well designed, but the sequence may fail because it presents too much at one time. The material may be well sequenced, but the children may not be placed appropriately within the sequence, which results in a higher than desirable percentage of incorrect responses.

The design features that make discriminations more obvious involve orchestration of various details, some of which may not be identified at first. For example, we are trying to teach the naive learner the difference between longer and not longer. If the teacher displays three horizontal lines side by side, the presentation indeed shows one that is longer.

    -------     ----------     -------

However, the format is poor because the details that are compared are horizontally separated. If the same examples are presented vertically, the part of each line that is the same is now visually obvious.

    -------
    ----------
    -------

When the teacher identifies the middle line as longer, the word can refer to only three interpretations: it is longer, it is in the middle of two lines that are the same length, or it extends to the right of the other lines.
The second interpretation would be ruled out if the next example showed two lines, the top one of which is longer. Dynamic Presentations. One way to make the samenesses and differences of examples more obvious is to use a dynamic rather than a static presentation of examples. The three examples of longer lines are static. A dynamic variation could start with two lines that are the same length. The
teacher would point to the top one and say, “My turn: Is this line longer? No.” (She would make the line a little longer.) “Is it longer now? Yes.” (She would erase the right end, making the line a little shorter than the other line.) “Is this line longer now? No.” (She would then make the line its original length, a little shorter, a little longer, a lot longer, and then the same length as the other line.) After the variable has been demonstrated, the teacher would present a pair of other examples—snakes—to show that longer does not apply only to lines. The communication of dynamic changes is designed in a way that is consistent with the child’s infrasystem. The infrasystem attends to changes. The presentation shows changes. Some changes predict a change in what the teacher says. Some changes don’t. The learner is provided with clear information about the relationship between the examples and the labels. Because a dynamic presentation is consistent with the most fundamental operations of the infrasystem, the presentation makes it relatively easier for the learner to attend to the relevant variables and learn the correlation between the labels the teacher presents and the features of the examples. Note that the presentation has systematically ruled out possible misinterpretations. Could the learner assume that longer involved a change of a particular magnitude? No, because smaller and larger changes led to the line changing from a negative to a positive and of a positive remaining a positive. Could the learner assume that line position, absolute length, side that is longer, or other irrelevant details could be necessary? No, because the same line became both a positive and a negative. Could the learner assume that longer referred only to lines? No, because once the basis for longer had been established, a nonline example was introduced. Could the learner assume that longer referred to possible vertical examples as well as horizontal examples? 
Possibly, but this feature of longer will be established through stipulation. All examples of longer will differ in horizontal length. Objects like cars are longer based on their horizontal length. When they are oriented vertically, they are still referred to as longer (not taller). An observation about the analysis of content and design of communication is that, although the end product may seem simple (teaching longer), it is the product of serious design effort. Reinforcing Variables. There are two primary issues of content reinforcement that are being taught. One is whether there are payoffs that are extrinsic to the material. Ideally, the answer is yes. The other facet is whether the material is unnecessarily punishing. Ideally, it is not. The properties of effective reinforcement are implied by the requirements of learning. If the learner is reinforced for identifying patterns, at least some intrinsic motivation may be mobilized through the instruction.
However, if learning the pattern is connected with some primary reinforcement, its importance to the learner is both more profound and focused. Praise presented contingently functions as effective reinforcement for most healthy children. It should be temporally associated with the child’s performance so that the child has information about the unit of work that led to the reinforcement. The reinforcement tends to make that unit of work reinforcing. Issuing reinforcement may be relatively difficult if other details of the communication are not in place. If the learner tends not to produce correct responses, the teacher has fewer opportunities to reinforce the learner for good performance. Therefore, all the details that have to do with the examples, sequencing, communication, pacing, and placement of the child within the sequence must be controlled for a presentation to have a great potential for being reinforcing.

Instructional Variables. The extent to which the instructional variables need to be controlled depends on what percentage of the population is to be taught to mastery. With traditional education, only the upper half of the population was expected to learn (Greer, 1972). A fairly cavalier presentation could achieve this goal. If we are interested in ensuring that at-risk and hard-to-teach children learn the skills, controlling the variables becomes important. For instance, with a careful program, virtually all at-risk children with an IQ of 75 or more can be taught to read by the end of the kindergarten year (Berkeley, 2002; Gersten, Becker, Heiry, & White, 1984). This outcome places them ahead of their middle-class peers. It is possible to maintain some advantage and ensure that they do not fall seriously behind. They are challenged when the material they read introduces words and syntax that are foreign to them, which means that their advantage in skills will tend to lessen around Grade 3.
Note, however, that if these children did not fall behind the more advantaged children, this accomplishment would represent enormous acceleration of the population’s historical skill development and performance. The acceleration may be maintained if the children remain in a highly focused, no-frills, academic program (Berkeley, 2002). For the acceleration of lower performing children to occur, the instruction for the early grades must be particularly impeccable. If the instruction in these grades is effective, teaching in the middle grades becomes much easier, and the teachers do not have to be trained in as many techniques as teachers in the early grades. All the variables that affect performance must be tightly controlled because time must be used so efficiently that, during every clock hour, the children learn more than demography would predict. If this formula is rigorously followed, at-risk children will learn more than predicted each
school year. The result is changed children who have knowledge and skill at learning new material. Control of Instructional Variables. With the current state-of-the-art instruction in the early grades, the teacher is the primary conduit for the information the instructional sequence provides and the primary role model. Either by design or not, the teacher provides information for not only how the children are expected to behave, but also how important it is and how well the children are doing. If the teacher responds to the children’s efforts as if they are exciting, the children will respond to them in the same way. Within this framework, the printed instructional material is able to exercise control over (a) the presentation of the examples, (b) the sequence of events within the lesson, and (c) the wording that the teacher is directed to present for the various examples. The teacher is responsible for (a) the actual communication or translation of the instructional material and directions into behavior (the pacing of the presentation, the fidelity of what the teacher says), (b) the reinforcement, (c) the level of mastery that is required before moving from one task to the next (corrections, reviews), and (d) the role formation. The problem with this division of responsibility is that, unless the teacher is highly trained, many of the benefits of the instructional program are lost. Furthermore, the training required to make the teacher highly efficient is quite extensive. The teacher must be trained in the technical details of effective presentations—delivering wording, correcting responses, and responding to good performance. The teacher must learn how to use data to group the children for instruction and accelerate their performance. Teachers coming out of most colleges of education know none of these skills, which means that ongoing and effective teacher training must be provided if the intervention is to be effective. 
With CD technology (videodiscs, CD-ROMs, DVDs), it is possible to exert much greater control over the instructional variables. Control may be exercised over all the program-design variables (the examples, the sequence, wording for the various examples), all communication variables (the delivery of the presentation, the provisions for corrections, the pacing of the presentation), and some of the reinforcing variables. CD technology is able to provide dynamic changes far superior to anything the teacher could present. For example, to show the feature of longer, the program could present a single line that changes length. The narrator labels the changes: “Watch the line, I’ll tell you if it gets longer. (Change.) Did it get longer? Yes. Watch. (Change.) Did it get longer? Yes. (Change.) Did it get longer? No. Your turn. Tell me if it gets longer. Watch. (Change.) Did it get longer?”
Now the presentation more effectively exploits the way in which the infrasystem is designed. A change occurs, and the change either predicts to a positive outcome (longer) or a negative outcome (not longer).3 Also the sequence could be designed to convert the lines into other objects. Once the learner responds appropriately to the first set of examples, the line could transform into a rope, snake, garden hose, or stick. The same rules that describe a longer line describe these objects. Therefore, the scope of the generalization is conveyed.4 Presentations on subsequent lessons would amplify the comparative notion of longer. Children would be taught the meaning of “longer than ___” and would apply it to various examples. Still later, they would learn that there are two types of negative examples for longer—those that are the same length and those that are shorter. The dynamic changes possible with CD technology may be used to teach things like positive and negative inflection points, acceleration versus moving at a constant speed, energy of activation for a chemical reaction, and hundreds of other relationships that are sometimes not clearly conveyed to students. For instance, traditional math sequences often treat fractions and division as completely independent operations, rather than two representations of the same operation. Many students do not have an intuitive understanding (or possibly any understanding) that the only difference between a fraction and a division problem is the way in which it is written. After students learn to perform basic fraction operations and learn the procedures for working simple division problems, the two operations could be related through a dynamic transformation. The instruction opens with a screen that shows the division problem: 12 ÷ 3. The narrator says, “Say the division problem and the answer.” (Twelve divided by three equals four.) “Here’s a rule. 
All fraction problems are division problems that are written up and down, rather than on their side. Watch.” (12 ÷ 3 rotates clockwise as the division sign changes to show the fraction 12/3.) “I’ll say the division problem and the answer. Twelve divided by three equals four. Your turn. Say the division problem and the answer.” Students are now producing the same response to the fraction that they produced to the division problem.

3 The reason the presentation does not introduce longer and shorter to naive children during the same session is that they tend to get reversals and get into error patterns that are difficult to correct. If they learn longer now and shorter later, no contradictions occur. For objects that are not the same length, if it is not longer, it is shorter.

4 Like the teacher-directed presentation, the problem of clarifying the horizontal-orientation feature still exists, and the solution would be the same as it would be for a teacher-locus program. The future examples of longer would be horizontally oriented to stipulate that this is the context in which longer occurs. Once this stipulation is learned, the child will have a sufficiently clear content map to use the concept.
Another division problem appears—16 ÷ 8—and the same questions and transformation are used. Then the process is repeated with several other examples, including 6 ÷ 6. When the students say the fraction form of the division problem and the answer (6 ÷ 6 = 1), the narrator would point out, “You already know that, don’t you? If each group has six parts and you use six parts, you use one whole group. That’s the division problem: six divided by six equals 1.” For the next series of examples, a fraction would appear. The narrator would say, “Say the division problem and the answer.” (24 ÷ 8 = 3.) “So how many whole groups does the fraction 24 eighths equal?” The demonstration would be repeated on several lessons, with the students having a more active role in stating the relationships. The instruction is quite economical. In 2 minutes, the program presents possibly 12 problems. In the same time, a teacher-directed presentation of examples on the board might present three. The instruction requires students to use the same wording for juxtaposed examples that differ in apparent ways. Therefore (according to the sameness principle), the examples are operationally the same. The conclusion the learner draws is the one that the instruction intends: Fractions are division problems and division problems are fractions. The CD technology should be particularly important for the teaching of math to at-risk populations. A high percentage of elementary-school teachers are not literate about math. If math is accelerated in the elementary grades, children in Grade 5 would be performing sophisticated math operations. CD technology could be used to control both the presentation details and the corrections. The program would be useful not only to teach the students, but to teach the teachers to identify the kinds of mistakes that students are likely to make and learn how to correct them. 
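The relationship taught in the fraction demonstration above can also be checked mechanically. The following is only an illustrative sketch, not part of the instructional program: the function name `division_and_fraction` is hypothetical, and the example pairs are the ones mentioned in the text (12 ÷ 3, 16 ÷ 8, 6 ÷ 6, 24 ÷ 8).

```python
from fractions import Fraction


def division_and_fraction(dividend, divisor):
    """Return (answer to dividend ÷ divisor, value of the fraction dividend/divisor).

    Hypothetical helper for illustration: the two results are computed
    separately so they can be compared.
    """
    quotient = Fraction(dividend) / Fraction(divisor)  # problem written "on its side"
    fraction = Fraction(dividend, divisor)             # same numbers written "up and down"
    return quotient, fraction


# Example pairs taken from the demonstration described in the text.
for dividend, divisor in [(12, 3), (16, 8), (6, 6), (24, 8)]:
    quotient, fraction = division_and_fraction(dividend, divisor)
    # The same response applies to both written forms.
    assert quotient == fraction
    print(f"{dividend} ÷ {divisor} = {quotient}   and   {dividend}/{divisor} = {fraction}")
```

The check passes for every pair, which mirrors the point of the demonstration: the two written forms call for the same response.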
The authors have been engaged in the development and expansion of a CD beginning-reading program for nonreaders. Funnix (Engelmann, Engelmann, & Seitz Davis, 2001) requires a teacher (parent or older sibling) to monitor the responses of the learner and direct the program to repeat tasks not responded to correctly. The parent tells the answer and presses REPEAT so that the task that the child missed is repeated. Steely is working on a speech-recognition version of the program that will present corrections automatically if the child makes mistakes. For both versions, the lessons are designed so that the last activity in each lesson is highly reinforcing. The children first work on letter sounds and words presented in lists. Next, they read a story. There are no pictures for this reading. After completing the story, the children read a different version of the story (on screen), one that is something like a comic strip, with balloons showing what characters say. The second reading sometimes (unpredictably) presents animations that the children find reinforcing. It serves as a semicontingent reinforcer, which is attained by performing on the earlier tasks in the lesson. The result is that the other parts of the lesson predict the reinforcer and therefore tend to become more reinforcing than they would be in a sequence that did not have a carrot.

One of the implications of instruction that incorporates CD technology is that the teacher requires far less training. Because less training is required, it is more probable that parents or older siblings could be effective presenters of the CD lessons. This feature is particularly important where schools are failing, which is in at-risk communities.

Theory of Instruction. The summary of the instructional variables described is greatly amplified in the book, Theory of Instruction (Engelmann & Carnine, 1982). The theory is consistent with the current analysis as it is applied to teaching. It classifies the types of discriminations from the perspective of instruction, with formatted procedures or templates for designing instruction for particular classes of discriminations. The basic discriminations are categorized as noncomparatives, comparatives, and nouns. The nouns are what we have referred to as hosts. The comparatives and noncomparatives are the single-dimension residential features. The templates provide a model of how the example set may be configured to communicate one and only one interpretation. The Theory of Instruction also shows how basic discriminations are joined. The joining formats involve single transformations, double transformations, and correlated-feature relationships. Finally, the text presents rules for designing simple programs, rules for scheduling and sequencing example sets, and directions about how to organize various tracks so that each lesson consists of exercises from 4 to 10 different tracks that are continuously developed and combined.
The Theory of Instruction is different from this current text in that it considers only those variables under the designer’s control. Inferred Functions of Performance and Learning draws inferences about the system that creates the learning.

Comparison of Discovery Math and Structured Math. Current assumptions about instructional practices are frequently at odds with how learners learn. One myth is that if the learner struggles, the learner will somehow learn or benefit more. In fact, this is virtually never the case. For one popular math-instruction format, the teacher presents students with a homework problem each day. Although the students do not have the component skills necessary to understand the mathematical solution to the problem, they are expected to solve the problem using guess and check.
Even if the students sometimes succeed in finding the answer through a guess-and-check process, these explorations suggest only that math provides a low-level set of tools, and from there it is up to the practitioner to guess and check. Here is an example presented to an average fourth-grade class.

Sally has 18 coins. The coins are nickels, dimes, and quarters. She has three times as many dimes as quarters. She has four more nickels than dimes. Her nickels are worth 50¢. How much are her coins worth all together?

Six students in the class discovered the answer. Three of them could not explain how they arrived at the answer. Twenty-six students failed. Although there is nothing wrong with the learner experimenting and working a problem of this sort once in a while, it does not serve as a basis for teaching math or reinforcing what had been taught. If three quarters of the students fail the exercises consistently, it is difficult to imagine what sort of cognitive benefits would result from the instruction. Although these exercises fail from an instructional standpoint, they present students with good evidence about why math is important: so that the person does not have to go through low-level manipulations.

Students typically have difficulties with coin problems because they confuse the number of coins with the value of coins. One way to obviate this problem while presenting an algorithm that is perfectly parallel to a simultaneous equation solution is to use a table that has a row for numbers of coins and a row for dollar amounts. This is something like a default content map that would permit students to solve hundreds of problems.

    Nickels    Dimes    Quarters    Total
    #          #        #           #
    ¢          ¢        ¢           $
The learner puts in the numbers the problem gives: 18 total coins and 50¢ in nickels.

    Nickels    Dimes    Quarters    Total
    #          #        #           # 18
    50¢        ¢        ¢           $
If there is only one unknown in a row or column (other than the total column), the missing value is identified by applying a mathematical operation. Applying the rule to the first column yields the number of nickels. If there are 50¢ in nickels, there are 10 nickels.
    Nickels    Dimes    Quarters    Total
    # 10       #        #           # 18
    50¢        ¢        ¢           $
Now the student uses the information given in the problem to figure out the number of dimes and quarters. “Sally has four more nickels than dimes.” She has 10 nickels; therefore, she must have six dimes. When this number is entered, there is only one unknown in the top row: the number of quarters. So that unknown can be identified through subtraction. She has 18 coins and 16 of them are accounted for. Therefore, she has two quarters. Now it is possible to complete the table and identify the total amount she has.

Nickels    Dimes    Quarters    Total
# 10       # 6      # 2         # 18
50¢        60¢      50¢         $1.60
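The table procedure can also be written out as a short program. The following is a minimal sketch in Python, not part of the original lesson. The names COIN_VALUE and solve_sally are hypothetical, and the only facts assumed beyond the problem statement are the standard values of U.S. coins (5¢, 10¢, 25¢). Each step applies the rule that a row or column with a single unknown can be completed by one operation. The relationship the procedure does not need (three times as many dimes as quarters) is used at the end as a check.

```python
# Value of each coin in cents (standard U.S. coin values).
COIN_VALUE = {"nickels": 5, "dimes": 10, "quarters": 25}


def solve_sally():
    """Fill in the two-row coin table for Sally's problem (illustrative sketch)."""
    # Top row: number of each coin; None marks an unknown cell.
    count = {"nickels": None, "dimes": None, "quarters": None}
    total_count = 18  # given: 18 coins in all

    # Bottom row: value of each coin type in cents.
    cents = {"nickels": 50, "dimes": None, "quarters": None}  # given: 50 cents in nickels

    # Column rule: the nickels column has one unknown, so divide value by coin value.
    count["nickels"] = cents["nickels"] // COIN_VALUE["nickels"]

    # Given relationship: four more nickels than dimes.
    count["dimes"] = count["nickels"] - 4

    # Row rule: the number row now has one unknown, found by subtraction.
    count["quarters"] = total_count - count["nickels"] - count["dimes"]

    # Column rule again: each remaining value cell is count times coin value.
    for coin in count:
        cents[coin] = count[coin] * COIN_VALUE[coin]

    # The total value completes the bottom row.
    total_cents = sum(cents.values())

    # Check against the fact not used above: three times as many dimes as quarters.
    assert count["dimes"] == 3 * count["quarters"]

    return count, cents, total_cents


if __name__ == "__main__":
    count, cents, total_cents = solve_sally()
    print(count, cents, f"${total_cents / 100:.2f}")
```

Because each cell is filled by one operation on values already in the table, the steps amount to solving the equations n + d + q = 18, n = d + 4, and 5n = 50 by substitution, which is the sense in which the table is parallel to a simultaneous equation solution.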
The information is organized so that it is possible to verify that all the component values are correct. Variations of this problem would provide the students with relevant practice in using what they know about math. If sufficient practice is provided with various coins and bills and with other relationships (number of containers of different sizes, number of people in different locations, and so forth), students will learn basic strategies for organizing information so that inferences may be drawn. Equally important, all the students who are properly placed in the fourth grade would be able to succeed. The number of students in the class who solve these problems through guess and check is often lower than their performance suggests because somebody at home often helps those who succeed. It would have taken no more than an hour, distributed over possibly three consecutive lessons, to teach the solution strategy and to give students practice in successfully applying the strategy to a range of problems. This is far less time than the average student spent on the problem before giving up.

Personality

The analysis of inferred functions has implications for therapy in the area of personality disorders. The therapy is strictly behavioral. It derives, however, from an analysis of inferred functions that are parallel in some ways with processes described by Freud, Jung, and others. Freud (see, e.g., 1901/1960, 1932–1936/1964) viewed personality as having three parts that interacted—superego, ego, and id.
According to the analysis of inferred functions, there are only two divisions. The agent is basically the ego. It plans and performs the voluntary behaviors, and it creates voluntary thoughts. The infrasystem performs both the superego and id functions. For learned-role behavior, the infrasystem performs superego functions. For behavior grounded in primary reinforcers, the infrasystem functions as the id. The abiding sameness of both the superego and id functions is that both are reflexive. Neither the reproof that the infrasystem presents to the agent when the agent makes decisions that are contrary to its committed role nor the nasty thought that occurs when a sexually attractive person is present is voluntary. They are impositions of a reflex-driven system. In contrast, the agent is the nexus of voluntary control of responses and some thought.

Freud’s theory recognized conscious and unconscious operations. Freud identified paradoxical thoughts and associations that influenced behavior, but that were not revealed to the person’s consciousness. Freud identified sexual connections between behavior like slips of the tongue and underlying meanings, which are revealed through analysis of what appears to be a casual event. For Freud, the unconscious was extensive.

As the analysis of inferred functions implies, the unconscious (or set of functions not accessible to the agent) is far more extensive than Freud envisioned. Its primary design function is not to conceal, repress, or somehow disassociate emotion and content. These are simply components of an operating scheme designed to direct learning and performance. All features of an event are recorded by the infrasystem and presented reflexively to the agent under specific conditions. The agent punishes some thoughts that are in conflict with roles that the agent espouses. For instance, the system classifies the agent’s mother as an attractive woman. Attractive women share the characteristic of being approached sexually.
Therefore, the system presents awareness of this thought to the agent. The agent punishes itself for hosting the thought. It is perfectly opposed to the agent’s role. The recurrence of the thought, therefore, is consistently followed by strong counterfeelings that punish it. The infrasystem, however, has the sexual feature on record, which means that it will be presented to the agent under some setting conditions. The dilemma is solved by a compromise. The infrasystem presents the thought, but not in a way that will lead to the negative sensation or upset the agent. The content of the thought may be suppressed by the infrasystem, but the negative sensations associated with it are presented to the agent so that the affect is attached to something else in the setting. This displacement permits the agent to become upset, but not to realize the cause of the upset. The thought may also be presented in a more general form, such as love rather than sexual approach. Now the sensation is recognized as positive,
but the potential nastiness has been edited out by the infrasystem. Other scenarios are possible. However, all are based on the fundamental operating format of the infrasystem. In the presence of certain features, thoughts are reflexively presented to the agent. This is not negotiable. If they contradict an agent’s role, they are presented as features of the relationship without presenting the relationship in a way that will lead to punishment by the agent. To Freud’s credit, he recognized that the workings of the unconscious must be guided by evidence and logic, not by intuition of the agent. Freud used facts of behavior to identify the content of connected events. This strategy is basically the same one that led to the theory of inferred functions. Because the theory of inferred functions does not concur either with the scope of the unconscious or the significance of the covert relationships that may be identified, the approach categorically rejects Freud’s notion of what constitutes reasonable therapy. Psychoanalysis and related counseling therapies are based on the assumption that the unconscious and conscious should be aligned, and that the alignment is achieved through a verbal interaction of some sort in which the learner talks about the past, feelings, and the possible significance of specific events. The assumption is that this talk will lead to catharsis, which will make the agent and infrasystem more compatible. Although it is possible for the therapeutic experience of abreaction to permit the patient to be more productive and at peace, the instruction that the therapy provides violates the central requirement of effective instruction. It is not consistent with a single content map. It reinforces verbal behavior, not actual behavior. 
The therapeutic interactions, therefore, may reinforce the patient for simply talking more about “I, my past, and my feelings.”

The learner must learn a new role, understand the importance of that role, and practice that role until it becomes sufficiently automatic (which means that it has been accepted by the infrasystem). These requirements dictate the details of the therapy and practice. Ultimately, the infrasystem and agent will be realigned in a functional way, not a verbal way. The disparity between agent and infrasystem will still exist because the basic design of the system requires the agent to make decisions that may be opposed to past learning, past roles, or involuntary thoughts. The extent to which the agent learns new roles is the extent to which the infrasystem’s content cannot be aligned with that of the agent. Growing up in Samoa, as interpreted by Margaret Mead (1964), may lead to relatively less tension between infrasystem and agent because there are relatively fewer role demands. However, the task of growing up in Chicago results in significant new role learning.
SUMMARY

The theory of inferred functions has implications for the design of machines that learn in the way that organisms learn and for the design of instruction that teaches effectively. The theory challenges some strongly rooted beliefs about learning and performance. One is that inferences about the “homunculus inside” or internal functions are not scientific. Important scientific contributions have been based on inferred entities. These inferred entities not only summarize data, but imply data that will be obtained if the inferred entity exists as postulated.

A series of other beliefs is based on the assumption that learning has to do with process, not with the properties of what is to be learned and the possible ways in which these properties are revealed to the learner. These explanations imply that learning is something of an automatic process in the presence of particular stimuli. Learning, however, is not a simple extension of the spinal reflex. In fact, the essence of voluntary responses is that they have little to do with reflexive behavior.

Perhaps the most serious historical belief the theory challenges is that consciousness is somehow a human (or at least near-human) property. For the theory, the degree of consciousness is described by the nature of the task. This criterion classifies all organisms that perform operant behavior as being conscious. The agent that produces the response must have simultaneous information delivered by internal and distance sensors. It must have knowledge of multiple features of the current setting, knowledge of how the setting is to change, and knowledge of the responses that may achieve the change. Also the agent must be motivated to perform behaviors that are capable of achieving the desired change. These requirements functionally describe the minimum consciousness of the agent.
The productiveness of the theory of inferred functions may be tested by applying it to the design of intelligent machines, those that learn according to the meta-blueprints presented by the analysis. The theory maps directions different from those currently pursued in the fields of robotics and artificial intelligence. The analysis of inferred functions would require the machine to identify features, classify them, employ transformations, determine the scope of generalization, and identify features that convert positive examples into negative ones. The machine would apply the information-seeking procedures required to attend, record events that could be possible predictors, and create trial content maps. The system would have to possess the logic required to process both positive and negative examples and identify single features and combinations of features that predict the reinforcer. In addition to these basic provisions, the system may be empowered to learn through information provided by secondary and peripheral reinforcers.
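One of these requirements, identifying single features and combinations of features that predict the reinforcer, can be illustrated with a small sketch. The Python below is only an assumption about how such logic might look, not the design the theory specifies. The function name predictive_combinations, the max_size limit, and the color and shape features in the example are all hypothetical. The sketch keeps the features shared by every positive example and then returns the smallest combinations that never occur together in a negative example.

```python
from itertools import combinations


def predictive_combinations(examples, max_size=2):
    """Return the smallest feature combinations that occur in every positive
    example and never occur together in a negative example.

    `examples` is a list of (features, reinforced) pairs, where `features`
    is a set of feature names and `reinforced` is True for a positive example.
    """
    positives = [set(f) for f, reinforced in examples if reinforced]
    negatives = [set(f) for f, reinforced in examples if not reinforced]
    if not positives:
        return []

    # Only features present in all positive examples can be predictors.
    shared = set.intersection(*positives)

    # Try single features first, then larger combinations.
    for size in range(1, max_size + 1):
        found = [
            combo
            for combo in combinations(sorted(shared), size)
            # A negative example that contains the whole combination rules it out.
            if not any(set(combo) <= neg for neg in negatives)
        ]
        if found:
            return found
    return []


if __name__ == "__main__":
    # Hypothetical training set: reinforcement follows objects that are red and round.
    examples = [
        ({"red", "round", "small"}, True),
        ({"red", "round", "large"}, True),
        ({"red", "square", "small"}, False),
        ({"green", "round", "small"}, False),
    ]
    print(predictive_combinations(examples))
```

In this hypothetical set, neither "red" nor "round" alone survives the negative examples, so only the two-feature combination remains as a predictor. Size, which varies across the positive examples, is never considered.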
The theory of inferred functions carries implications for the design of effective instructional practices. The goal is to install content maps. For the maps to be effective, they must predict across the range of examples to which they apply. The fundamental requirement, therefore, is that the instruction must work across the full range of examples, and the presentation must be consistent with a single interpretation. These requirements are met through instruction designed to control the range of training examples presented and eliminate possible interpretations other than the one the learner is to learn. The resulting instruction is greatly different from traditional instruction. Instruction that is effective with the younger, naive child must control all the variables that would affect the design and fidelity of the communications intended to induce the learning. With print formats, the instructional material is able to exercise some control over the presentation of the examples, the sequence of events within the lesson, and the wording that the teacher is directed to use. The teacher is responsible for creating the actual communication, implementing the reinforcement, obtaining the level of mastery, and conveying information about the learner’s role and how successful the learner is. CD technology permits the instructional program to more tightly control the presentation variables, provide dynamic changes that show which variables are relevant to a discrimination, and increase the potential for presentations that are reinforcing. The wording that the narrator presents is standardized and closer to ideal in terms of timing, inflection, and other details that influence clarity. The pacing is carefully controlled. Corrections may be standardized and automated. The theory of inferred functions has potential implications for therapy. The therapy is the same as that suggested by a behavioral model. The learner’s behavior must change. 
The learner must relearn and practice until the new learning becomes relatively automatic. The content of what is to be learned is determined by the goals and expectations of the learner. The therapy most consistent with the human learning and performance system is designed to accommodate the present and both predict and anticipate the future. In summary, Carl Sagan (1997) was renowned for his reference to “billions and billions of stars.” The human brain is necessarily characterized by billions and billions of connections of feature to feature, host to host, host to feature, and feature to host. The network that connects the entries is fashioned of logical functions. The foundation logic of this cognitive processing system is shared by all organisms that produce organized, voluntary responses. To perform is to possess a substantial logical underpinning.
References
Anderson, R. C., & Faust, G. W. (1973). Educational psychology: The science of instruction and learning. New York: Dodd, Mead.
Apfell, J., Kelleher, J., Lilly, M., & Richardson, R. (1980). Developmental reading for moderately retarded children. Education and Training of the Mentally Retarded, 10, 229–235.
Beck, L. W. (1953). Constructions and inferred entities. In H. Feigl & M. Brodbeck (Eds.), Readings in the philosophy of science. New York: Appleton-Century-Crofts.
Becker, W. C. (Ed.). (1971). An empirical basis for change in education: Selections on behavioral psychology for teachers. Chicago: Science Research Associates.
Becker, W. C. (Ed.). (1986). Applied psychology for teachers: A behavioral cognitive approach. Chicago: Science Research Associates.
Bekhterev, B. M. (1932). General principles of human reflexology. New York: International.
Berkeley, M. (2002). The importance and difficulty of disciplined adherence to the educational reform model. Journal of Education for Students Placed at Risk, 7(2), 221–239.
Boren, J. J., & Sidman, M. (1953). Maintenance of avoidance behavior with intermittent shocks. Canadian Journal of Psychology, 11, 185–192.
Bracey, S., Maggs, A., & Morath, P. (1975). Effects of a direct phonic approach in teaching reading with six moderately retarded children: Acquisition and mastery learning stages. Slow Learning Child, 22, 83–90.
Brooks, R., Breazeal, C., Marjanovic, M., Scassellati, B., & Williamson, M. (1999). The cog project: Building a humanoid robot. Cambridge, MA: MIT Artificial Intelligence Lab.
Carmichael, L., Hogan, H. P., & Walter, A. A. (1932). An experimental study of the effect of language on the reproduction of visually perceived form. Journal of Experimental Psychology, 15, 73–86.
Carnine, D. W. (1977). Phonics versus look-say: Transfer to new words. Reading Teacher, 30(6), 636–640.
Carnine, L. M., Carnine, D. W., & Gersten, R. M. (1984). Analysis of oral-reading errors made by economically disadvantaged students taught with a synthetic-phonics approach. Reading Research Quarterly, 19(3), 343–356.
Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.
Engelmann, S. (1965). Cognitive structures related to the principle of conservation. In D. Brison & E. Sullivan (Eds.), Recent research on the acquisition of conservation of substance. Toronto: The Ontario Institute for Studies in Education.
Engelmann, S. (1971). Does the Piagetian approach imply instruction? In D. R. Green, M. P. Ford, & G. B. Flamer (Eds.), Measurement and Piaget (pp. 118–126). Carmel, CA: California Test Bureau.
Engelmann, S., & Carnine, D. (1969). DISTAR arithmetic. Chicago: SRA.
Engelmann, S., & Carnine, D. (1982). Theory of instruction: Principles and applications. New York: Irvington Publishers.
Engelmann, S., Carnine, L., Johnson, G., Meyer, L., Becker, W., Eisele, J., Haddox, P., Hanner, S., & Osborn, S. (1978, 1989, 1999). Corrective reading series. Columbus, OH: SRA/McGraw-Hill.
Engelmann, S., Carnine, D., & Steely, D. (1985). Allard, K. (Producer). Mastering fractions. Washington, DC: Systems Impact.
Engelmann, S., Engelmann, O., & Seitz Davis, K. (1998). Horizons reading series (Teacher’s Presentation Book, Student Material, Literature Guide and Teacher’s Guide). Columbus, OH: SRA/McGraw-Hill.
Engelmann, S., Engelmann, O., & Seitz Davis, K. (2001). Funnix. Eugene, OR: Royal Partnership Limited.
Engelmann, S., Kelly, B., & Carnine, D. (1994). Connecting math concepts. Columbus, OH: SRA.
Engelmann, S., Osborn, J., Osborn, S., Zoref, L., Bruner, E., & Hanner, S. (1995). Reading mastery series (Teacher’s Presentation Book, Student Material, Literature Guide and Teacher’s Guide). Columbus, OH: SRA/McGraw-Hill. (Originally published as DISTAR Reading, 1969. Science Research Associates. Chicago.)
Engelmann, S., & Rosov, R. J. (1975). Tactual hearing experiment with deaf and hearing subjects. Exceptional Children, 41(4), 243–253.
Engelmann, S., Ross, D., & Bingham, V. (1982). Basic Language Concepts Test. Tigard, OR: CC Publications.
Engelmann, S., & Steely, D. (1978). Fractions·decimals·percents. Chicago: SRA.
Estes, W. K. (1982). Models of learning, memory, and choice: Selected papers. New York: Praeger.
Finger, S., & Stein, D. G. (1982). Brain damage and recovery: Research and clinical perspectives. New York: Academic Press.
Foa, E. B., Steketee, G., & Rothbaum, B. O. (1989). Behavioral/cognitive conceptualizations of post-traumatic stress disorder. Behavior Therapy, 20(2), 155–176.
Freud, S. (1960). The psychopathology of everyday life. In J. Strachey et al. (Eds.), The standard edition of the complete psychological works of Sigmund Freud (Vol. 6). London: Hogarth Press. (Original work published 1901)
Freud, S. (1964). New introductory lectures on psycho-analysis. In J. Strachey et al. (Eds.), The standard edition of the complete psychological works of Sigmund Freud (Vol. 22). London: Hogarth Press. (Original work published 1932–1936)
Furth, H. G. (1969). Piaget and knowledge: Theoretical foundations. Englewood Cliffs, NJ: Prentice-Hall.
Gannon, P. (1982). Predicting generalization: Generalized compliance training with a severely handicapped adult. Unpublished doctoral dissertation, University of Oregon, Eugene.
Gersten, R. M., Becker, W. C., Heiry, T. J., & White, W. A. (1984). Entry IQ and yearly academic growth of children in Direct Instruction programs: A longitudinal study of low SES children. Educational and Policy Analysis, 6, 109–121.
Giurfa, M., Zhang, S., Jenett, A., Menzel, R., & Srinivasan, M. V. (2001). The concepts of “sameness” and “difference” in an insect. Nature, 410, 930–933.
Glang, A. (1987, July). Use of a diagnostic-prescriptive direct instruction approach to remediate language and memory deficits in a brain-damaged subject. Paper presented at annual meeting of International Neuropsychology Society, Barcelona.
Greer, C. (1972). The great school legend: A revisionist interpretation of American public education. New York: Basic Books.
Harlow, H. F. (1952). Learning. Annual Review of Psychology, 3, 29–54.
Hart, B., & Risley, T. (1995). Meaningful differences in the everyday experience of young American children. Baltimore, MD: P. H. Brookes.
Hull, C. L. (1934). The concept of the habit-family hierarchy and maze learning. Psychological Review, 41, 33–54, 134–152.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence. New York: Basic Books.
Jerison, H. (1973). Evolution of the brain and intelligence. New York: Academic Press.
Mead, M. (1964). Coming of age in Samoa: A psychological study of primitive youth for western civilization. Gloucester, MA: P. Smith.
Mineka, S., & Kihlstrom, J. F. (1978). Unpredictable and uncontrollable events: A new perspective on experimental neurosis. Journal of Abnormal Psychology, 87(2), 256–271.
Modaresi, H. (1990). The avoidance barpress problem: Effects of enhanced reinforcement and an SSDR-congruent lever. Learning and Motivation, 21(2), 199–220.
Murdock, B. B., Jr. (1968). Serial order effects in short-term memory. Journal of Experimental Psychology, 76(4), 1–15.
Osgood, C. E. (1952). The nature and measurement of meaning. Psychological Bulletin, 49, 197–237.
Osgood, C. E., & Luria, Z. (1954). A blind analysis of a case of multiple personality using the semantic differential. Journal of Abnormal & Social Psychology, 49, 579–591.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana: University of Illinois Press.
Polloway, E., Epstein, M., Polloway, C., Patton, J., & Ball, D. (1986). Corrective reading program: An analysis of effectiveness with learning disabled and mentally retarded students. Remedial and Special Education, 7, 41–47.
Premack, D. (1965). Reinforcement theory. In D. Levine (Ed.), Nebraska symposium on motivation (Vol. 13). Lincoln: University of Nebraska Press.
Sagan, C. (1997). Billions and billions: Thoughts on life at the brink of the millennium. New York: Ballantine.
Senden, M. V. (1932). Space and form apprehension of persons born blind before and after operation. Leipzig: Barth.
Sidman, M. (1953). Two temporal parameters of the maintenance of avoidance behavior by the white rat. Journal of Comparative & Physiological Psychology, 46, 253–261.
Sidman, M. (1955). On the persistence of avoidance behavior. Journal of Abnormal & Social Psychology, 50, 217–220.
Sidman, M., Herrnstein, R. J., & Conrad, D. G. (1957). Maintenance of avoidance behavior by unavoidable shocks. Journal of Comparative & Physiological Psychology, 50, 553–557.
Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.
Skinner, B. F. (1965). Science and human behavior. New York: Free Press.
Skinner, B. F. (1969). Contingencies of reinforcement: A theoretical analysis. New York: Appleton-Century-Crofts.
Author Index
Note: The letter n following a page number denotes a footnote.
A Anderson, R. C., 488 Apfell, J., 453
B Ball, D., 453 Beck, L. W., 467 Becker, W. C., 453, 455, 490, 494 Bekhterev, B. M., 474 Berkeley, M., 494 Bingham, V., 390n Boren, J. J., 156 Bracey, S., 453 Breazeal, C., 479 Brooks, R., 479 Bruner, E., 450
C Carmichael, L., 398 Carnine, D. W., 272, 285, 444, 458, 498 Carnine, L. M., 453, 455, 458 Chomsky, N., 372 Conrad, D. G., 156, 320
E Eisele, J., 453, 455 Engelmann, O., 453, 497 Engelmann, S., 272, 283, 285, 390n, 409, 414, 444, 450, 453, 455, 497, 498 Epstein, M., 453 Estes, W. K., 487
F Faust, G. W., 488 Finger, S., 473 Foa, E. B., 422 Freud, S., 500 Furth, H. G., 405
G Gannon, P., 422 Gersten, R. M., 458, 494 Giurfa, M., 107 Glang, A., 281 Greer, C., 494
H Haddox, P., 453, 455 Hanner, S., 450, 453, 455 Harlow, H. F., 367 Hart, B., 462 Heiry, T. J., 494 Herrnstein, R. J., 156, 320 Hogan, H. P., 398 Hull, C. L., 467
I Inhelder, B., 404
J Jenett, A., 107 Jerison, H., 471 Johnson, G., 453, 455
K Kelleher, J., 453 Kelly, B., 444 Kihlstrom, J. F., 422
L Lilly, M., 453 Luria, Z., 397
M Maggs, A., 453 Marjanovic, M., 497 Mead, M., 502 Menzel, R., 107 Meyer, L., 453, 455 Mineka, S., 422 Modaresi, H., 156 Morath, P., 453 Murdock, B. B., Jr., 467
O Osborn, J., 450 Osborn, S., 450, 453, 455 Osgood, C. E., 397
P Patton, J., 453 Piaget, J., 404 Polloway, C., 453 Polloway, E., 453 Premack, D., 219
R Richardson, R., 453 Risley, T., 462 Rosov, R. J., 283 Ross, D., 390n Rothbaum, B. O., 422
S Sagan, C., 504 Scassellati, B., 479 Seitz Davis, K., 453, 497 Senden, M. V., 292 Sidman, M., 156, 320 Skinner, B. F., 153, 154, 175, 177, 490 Srinivasan, M. V., 107 Steely, D. G., 444 Stein, D. G., 473 Steketee, G., 422 Suci, G. J., 397
T Tannenbaum, P. H., 397
W Walter, A. A., 398 White, W. A., 494 Williamson, M., 479
Z Zhang, S., 107 Zoref, L., 450
Subject Index
Note: The letter n following a page number denotes a footnote.
A Abstract thought, 481–483 Agent classifications generated by agent, 333–334 communication with repertoire, 278–283 consciousness, 318–319 content map as agent information, 35–37 directed thought, 318–320 as ego, 500–502 as expressive functions, 348–350, 369 goals and criteria imposed by agent, 329–332 resources, 80 roles in hardwired system, 20–21 searches directed by agent, 358–361 secondary sensations as reinforcers, 30–35 sensitivity to consequences, 21 Agent functions, 44–70 (chap. 3) inferring agent functions, 46 overview, 44–45 planning functions, 80–82 planning responses, 49–57 Agent and infrasystem interactions, 71–91 (chap. 4), 335–340 agent influence on infrasystem, 332–334 agent vs. infrasystem processes, 368–369
functional interactions, 78–84 hardwired interactions, 47–48 human agent, 424–429 interaction model, 79 (Fig. 4.4), 82 (Fig. 4.5) roles in basic learning, 99 Agent-generated content maps as agent resource, 332–334 (333 Fig. 13.1) Analysis of content, 491–500 effective communication, 492–493 four criteria, 491 instructional variables, 494–498 single interpretation, 491–492 reinforcing variables, 493–494 Antecedent learning, 107, 118–147 (chap. 6) correlation of features, 124 default design for, 123–124 extension of hardwired performance, 130–135 feature abstraction in, 125–126 requirements for feature identification, 136–140 three stages of, 130–136 Approach response antecedent learning for, 123–130 examples, 18, 25–28, 103–104, 164–165 and negative reinforcers, 33–35
Approach–avoidance gradient example, 320–322 Artificial intelligence example of feature–host relationships, 476–478 learning feature–host relationships, 475–479 meta-blueprint for two-system model, 475–479 transformation functions, 477–479 Attention human attention, 421–422, 425 requirement for feedback, 77–78 to unanticipated changes, 142 Automaticity, 447 Avoidance behavior, 101, 140–141, 168–169
B Basic learning, 94–117 (chap. 5), see also Learning antecedent, see Antecedent learning components of responses, 109–110 definition, 95 discrimination-only learning, 107 discriminations and responses, 111–112 gestalts, 225–226 imprinting, 107 logic, 115–116 motor-response learning, 108 reinforcement in response learning, 111 response-only learning, 108–111 response-related learning, 110–111 response-strategy, see Response-strategy learning types of, 107–112, 119 (Table 6.1) Biochemistry and inferred functions, 469–470 Brain damage and relearning, 278–283, 472–473 unfamiliar learning, 282 (Fig. 11.1) Brain growth spurts, 472–473
C Calculations, necessary for basic learning, 133–135 CD technology, 495–498
Chaining backward chaining, 267–268 forward or backward, 267–269 training vs. natural order, 267–269 Classical conditioning and operant responses, 473–475 Classification agent-generated classifications, 333–334 component operations, 406–407 by features, 233–242 features as hosts, 353–361 groups generated by different criteria, 236–237 human learning classification templates, 355–357 individual as sum of features, 234 by individuals, 355–357 by individuals and features, 238–239, 353–362 multiple classification criteria, 10 by multiple features, 237–239, 242–245, 406–407 sameness analysis applied, 279–280 search for feature, 337, 353–363 search for host, 336–337 search for host and feature, 337 by single features, 240–242 Classification system (knowledge), 325–340 access to features, 333–334 Cognition, see Human cognitive development Cognitive development assumptions discredited, 410–411, 414–418 developmental norms, 403–406 developmental performance as tasks, 404–405 Piagetian developmental stages, 404–405 Communication, see also Language context (setting) shaping, 153–156, 172, 262–263, 303–307 controlling communication variables, 297–301 difficulty as function of example set, 304–307 discriminations taught separately, 298–299 examples of communication techniques, 301–315 variables that influence clarity, 303–304 Compensation reasoning, 408–409, 410–412, 418–419
Concrete operations, 408–413 as abstract operations, 419 teaching conservation of substance, 409–413 Conscience, 331–332 Consciousness, 471–472 Consequences, see Reinforcement Conservation inventory, 409–410, 412–413 Content analysis, 491–498 content inferences, 37–39 presented to agent reflexively, 35–37 as sensation, 35–37 Content map modification, 177–186 process, 184–186 related to schedules of reinforcement, 177–181 trial content map, 134–135 Content maps, 96, 98–99 as agent information, 35–37 and agent interaction, 46–49 agent-imposed examples, 331–332 as annotated blueprint, 38–39 conflicting approach–avoidance example, 320–322 conveyed by language, 373–376, 435–437 default, see Default content map default functions, 137–140 for delayed actions, 329–331 design example, 38–40 event-based process, 160–167 general and specific representations, 193–195 as gestalts, 224–225 hardwired content map, 35–40, 96–99 implied by sameness of performance, 38–39 learned content map, 98–99, 102–104 long-range pursuit example, 322–323 and plan, 51–54, 56–57, 72, 73, 74, 160–164 plan- and event-based combinations, 167–168 requirements for feature identification, 136–140 for response strategies, 149–151, 172 response-planning templates, 50–57 and sensory data, 72, 73 spurious, 97 termination criteria, 40, 75–76 trial content map, 102–103, 116–117, 134–135
and unique sensory features, 36 Context (setting) shaping, 153–156, 172, 262–263, 303–307 Continuous variation logical requirements, 10–14
D Decisions for extensive pursuits, 322–323 formats for nonlearning machines, 9–10, 13–14 in hardwired systems, 50–56 Default content map in antecedent-learning, 123–124, 139–140 for basic learning, 104–107, 115 Default values, 57, 61–64 Descriptions ambiguities, 379–389 example of descriptive options, 376–377 individuals by feature or name, 385 Development, see Cognitive development Difference principle, 211–213, 444–447 Directed thought, 318–320, 323, 329–340 Directive function described, 6–7 model for, 61–68, 71–78 Directive to respond, 57–58 Discriminations adjusting difficulty of, 304–307 minimum-difference negatives, 298–299 multiple discriminations, 298–299 multiple-feature discrimination, 308–309, 310–315 single-feature discrimination, 125–129, 309–310 specificity, 178–179 strategies, 251–257 Discriminative stimulus (SD), 87–88, 99–103, see also Examples Dynamic presentations, 247–249, 492–493 examples, 495–497
E Enduring features and infrasystem memory, 142 and nonenduring features, 239–240 as predictors, 87–88, 140
Enhanced secondary sensations, 31–35 Equilibration, 405 Escape response escape-avoidance, 101 examples, 25–28, 33–35 Event-based process, 160–167 Examples information provided by positives and negatives, 245–247 juxtaposition of, 308–315 negative examples, 162–163, 172 sequencing examples, 307–315 simultaneous examples, 247–250 temporally ordered, 247–250 Expectations of interactions, 352–353 Experimental designs, 293–317 (chap. 12) for inducing learning, 293–294 involving minimum scaffolding, 302 and logic of learning, 295–297 multiple-subject designs, 315 unintended interpretations, 296–297 Experimental neurosis, 422 Expressive vs. receptive functions, 348–350, 369 Expressive language, 394–395 Extinction behavior, 138 of false features, 327–328 Extrapolation added and subtracted features, 218 beyond known examples, 202, 209–221 conservative and extensive extrapolation, 214 (Fig. 9.1) of the difference, 211–213 effects of reinforcing consequences, 214–215 as function of negative examples, 215–218 in gestalts, 225–226 magnitude of difference principle, 212–213 modifying the range of, 217–220 patterns for ordered examples, 219–220 positive and negative examples, 209–211, 213–215
F False features agent-created, 426–429
extinction of, 327–328 as function of uncertainty, 324 relearning example, 326–328 traumatic events example, 325–326 False positive examples, 253 Feature abstraction abstract operations and searches, 219, 479–483 process in basic antecedent learning, 125–129 Feature classification, see also Classification single-feature pursuits, 129–135 Feature identification from examples, 245–247 requirements for, 136–140 screening criteria, 85–87 strategy, 126–129 Feature sameness, across examples, 162 Feature-combination logic, 385–388 Feature-elimination logic, 126–129 Features, 218, 360–361 analysis of, 186–195, 206–207 associations as feature relationships, 360–361, 480–481 classification by, see Classification definition of, 187, 192–193 enduring, 87–88, 140, 142, 239–240 false, 324, 325–328, 426–429 as functions of populations, 240–241 generalization of features, 193, 195–199 (196, Fig. 8.2) host, see Host as hosts and predictors, 353–363 individual as sum of features, 161–162, 234 minimally different examples, 212–213 multiple-feature learning, 237–239 nonenduring features, 239–240 relative, 137 representation of, 29 residential, see Resident(ial) features search for, 337, 339 universal, 87–89 word-feature correspondence, 381–384 Feedback, 12, 253 attention requirement, 77–78 direct and indirect, 274–275 implications for adjusting responses, 58–61 based on projected sensory data, 74–75 rate, 75
Fixed-response requirements, 8–10 Fixed-unit reasoning, 408–413, 416–418 Formal instruction, 433–465 (chap. 17), see also Instruction; Teaching goals of, 433–435 Formal operations, 414–418 Functions calculation, 82–83 definition, 3 directive function, 6–7 five functions of performance machines, 6–7, 14–15 implied by continuous variation, 10–14 inferring, 4–6 modification, 79–80, 83–84 planning function, 6–7, 8, 80–82 receptive function, 6, 78–79 reflexive and consequent functions, 16–18 (18 Table 1.1) response function, defined, 6–7 roles in hardwired systems, 20–21 screening function, 6–7
G Generalization completely generalized features, 193 conservatively generalized features, 193 general and specific features, 193–195 generalizing language patterns, 389–396 learning trends in stroke victims, 280–283 logic of generalized features, 220–221 patterns, 181–183 process, 151–152, 171 sensory core and generalized features, 196 (Fig. 8.2) Gestalt, 223–227 content map or response plan, 224–225 as function of interpolation and extrapolation, 202, 223–224 pattern-recognition examples, 226–227 projections of enduring forms, 224–225 temporal patterns as, 227 Grammar for clarity, 379–385 feature-marker conventions, 385–388 giant words, 380–384 meaning vs. grammar, 390–391 syntactical meaning, 377–391
structure of language, 372–377 Grief examples, 326–328
H Hardwired systems, 24–43 (chap. 2) agent, basic roles, 20–21 agent decision, 54–57 content map specificity, 96–97 decisions, necessity for, 13–14 feedback requirements, 12–13 infrasystem, basic roles, 20–21 multisensory map, 12 operant responses, 25–28 processes and learning, 95–97 reception as sensation, 28–29 reflexive and nonreflexive units, 20–21 Host, 188–189 abstract features as, 353–363 combining related hosts, 387–388 searches for hosts, 336–337, 339–340 Host and residential features conveyed by language, 377 example, 188–189 search for, 337 Human agent agent-created secondary sensations, 426–429 agent’s central and peripheral attention, 425 communication with infrasystem, 428–429 constant thinking, 425 false reinforcers, 426–429 influence on infrasystem, 424–429 volitional thought, 323, 329–340, 425 Human cognitive development, 402–432 (chap. 16), see also Cognitive development Human learning, 347–371 (chap. 14) agent vs. infrasystem knowledge, 362–363 agent vs. infrasystem processes, 369 expressive language, 394–395 intuitive basis for learned rules, 362–363 lack of hardwired scaffolds, 420–422 learning to learn, 363–369 natural setting example, 348–363 overview, 347–348 peripheral learning, 352–353 recollection of details, 348–353
related patterns, 419–424 savings in trials, 366n, 368 teaching classification skills, 363–369 Human learning and instruction, 345–504 (Part IV) Human uniqueness, 402–403 extent of pattern learning, 420–424 humor, 423–424 Humor as pattern convergence, 423–424
I Imprinting, 107 motor-response, 120, 122–123 as quasilearning, 119–120 sensory, 120–122 Individuals dual classification, 238–239 as multiple-feature discrimination, 308–309, 310–313 as sum of features, 234, 295 Individuals and features, 233–260 (chap. 10), see also Features Inferences about basic functions, 1–15 models of space, 11 multiple features of content, 16 performance systems, 14–22 processes implied by behavior, 168–171 reflexive and consequent functions, 16–18 universal processes, 15 Inferred entities, criteria for, 467–469, 490–491 Inferred functions emphasis on content and logic, 4–6 extension of behavioral analysis, 5–6, 490–491 implications for teaching, 489–500 Information, see Content; Content map Infrasystem and abstract representations, 333–334, 483 calculation function, 82–83 classification process, 253–257 vs. Freudian theory, 500–502 functions not available to agent, 256–257 infrasystem logic versus symbolic logic, 483–484
modification function, 79–80, 83–84 prompted conclusions, 361–362 as receptive functions, 78–79, 348–350, 369 representations of features, 241–242 roles in hardwired system, 20–21 Insight parallel conclusions by agent and infrasystem, 361–362 as reflexive thought, 353–354 vs. trial and error, 338–340 Instruction, see also Formal instruction; Teaching; Training communication of more than one interpretation, 294 control of variables, 494–498 criteria for predicting performance, 313–314 designing rules, 441–442 discovery vs. structured math, 498–500 efficiency principles, 437 efficiency of program, 299–300 form vs. function, 462 independent practice, 458–459 involving minimum scaffolding, 302 language instruction example, 392–393 misinterpretation caused by stipulated rules, 442–443 self-fulfilling prophecies, 461 setting events, 459–461 Theory of Instruction, 498 Interpolation between examples, 202–208 in gestalts, 225–226 logic for positive and negative values, 207–208 three designs for, 203–206 Involuntary thought, see Reflexive thought Issues, 466–504 (chap. 18)
L Language, 372–401 (chap. 15), see also Grammar abstract patterns for generating examples, 389–390 acquisition, 391–399 ambiguity of meaning, 379–380 contrived-language example, 380–385
conveyance for host and residential features, 377 example of descriptive options, 376–377 expressive functions, 394–395 generalization of patterns, 378–379 influences of language on perception, 398–399 instruction example, 392–393 meanings, 373–391 and primary reinforcers, example, 396 receptive functions, 391–392 shared and public meanings, 374–375 as source of content maps, 373–376, 435–437 structure of, 372–377 syntactical meaning, 377–391 verbal directives and descriptions, 375–377 Learned content maps, 98–99 Learned vs. unlearned performance, 24–28 Learning, see also Antecedent learning; Basic learning; Human learning; Response-strategy learning; Secondary learning; Unfamiliar learning as adaptation, 95 and brain growth spurts, 472–473 content, 473–475 experimental designs as learning, 293–294 and hardwired processes, common features, 95–97 language acquisition, 391–399 latent responses, 473–475 limits of learning potential, 114–115 motor learning, 108–109, 271–278 multiple discriminations, 237–239, 301–315 peripheral learning, 352–353 probability of, 268 probability models, 487–488 rate of learning unfamiliar content, 280–287 relationships, 102–106 relearning lost functions, 278–283 requirements of, 97–99 role learning, 429–430 sameness of learned and unlearned content, 486–487 secondary learning, 264–269 specified pursuits, 129–135 spurious prompts, 488–489 temporal basis, 99–102
temporally prior events, 140–145 testing propositions of, 488–489 traditional learning experiments, 466–467 verbal rules, 418 Learning patterns and generalization, 174–201 (chap. 8) pattern learning, 174–186 (182 Fig. 8.1), 419–424 Logic of instruction, 433–465 (chap. 17), see also Instruction; Teaching Logic in performance/learning system combining related hosts, 387–388 extrapolation, see Extrapolation feature combinations, 385–388 generalizing features, 220–221 interpolation, see Interpolation for learning samenesses and differences, 115–116 minimum differences, 212–213 single-feature elimination strategy, 126–129 stipulation, see Stipulation Logical analysis descriptive options, 376–377 inferred from performance, 4–6 requirements of learning and predicting, 97–98 symbolic vs. infrasystem logic, 483–486 training, 295–297 logical modes: ought, will, do, 484–486
M Mastery, 456–459 criterion for, 307 independent practice, 458–459 and magnitude of program steps, 263 teacher training, 457 unintended interpretations, 301 Meanings, see also Language based on samenesses, 393–394 generated by sentences with nonsense words, 390–391 referent and usage meaning, 373–377 syntactical meaning, 377–391 Memory cumulative-trace, 162–163, 172 and decay, 142–143 record of possible predictors, 143–144
record and recall potential, 348–350, 353–354, 369 as requirement of basic learning, 98 of temporally prior events, 141–142 of temporary prior events for basic learning, 98, 103 Model of space, necessity of, 11 Motivation, 20, see also Reinforcement; Reinforcers importance of in hardwired system, 24 secondary sensations as motivators, 30, 33–35 Motor-response imprinting, 120, 122–123 Motor-response learning, 108–109, 271–278 components that influence responses, 273–274 examples of task difficulty, 276–278 isolating response components, 275–276 and knowledge of objective, 272–273 natural reinforcers for feedback, 274–275 program variables, 271–276 Multiple feature, see also Features classification, 243–245 learning, 237–239
N Negative examples, see also Examples cumulative record of, 162–163, 172 Negative reinforcer, 26, 30, 31, 39, 46–47, 144–145, 181, see also Primary reinforcer; Reinforcement; Reinforcers; Secondary sensations for approach responses, 33–35 pain as, 30–31 and punishers, 181 Neurology and inferred functions, 469–470 Nonenduring features, 239–240, see also Features Nonlearning systems, see Hardwired systems
O Operant responses and classical conditioning, 473–475 requirements for hardwired systems, 25–28
sameness of learned and nonlearned responses, 26 universal requirements, 470 Overt responses, 458–459
P Pain inferred from behavior, 31 without physical locus, 31–32 as secondary sensation, 30–31 Patterns abstract, 350 learning of patterns, 174–186, 419–424 Perception, influences of language on, 398–399 Performance fixed-response performance, 8–10 learned variation, 300–301 learned vs. unlearned, 24–28, 486–487 response to continuous variation, 10–14 samenesses imply content map, 38–39 termination criteria for pursuits, 40 Performance framework, 3–23 (chap. 1) Performance systems basic functions, 6–7, 14–15 hardwired pursuit example, 18–20 inferences about, 14–22 Peripheral reinforcers, 350–353, 422 Personality, 500–502 Piagetian developmental stages, 404–405 Plan-based learning, 210 Planning agent-imposed goals, 329–332 without behavior, 323–326 and directive functions, 8 planning function, 6–7 process, 160–164 and projected sensory data, 72, 73–74 and response directive, 72, 73, 74 response planning, 49–55 and termination criteria, 76–77 as thinking, 320–324 Population of 16 individuals, 234–241 (235 Table 10.1), 243–255, 301–315 Population of 16 neologisms, 380–384 Positive examples, see Examples Positive reinforcer, see Primary reinforcer; Reinforcement; Reinforcers
Predicting, see also Projections adjustment predictions, 183–184 continuous analysis of predictors, 185–186 and humor, 423–424 outcomes, 97–99 predicted vs. realized reinforcers, 181–183 Predictors reinforcers, 350–353 temporally prior, 140 Primary reinforcer, 95, see also Negative reinforcer; Reinforcement; Reinforcers basis for, 20–21 frequency of, 144 intensity as learning influence, 164 as unlearned/hardwired component, 25 Probability in learning, 487–488 Problem-solving, 338–340 Programs, see Formal instruction; Instruction; Reading instruction; Teaching; Teaching rules; Training Projections, see also Predicting as gestalts, 224–225 projected outcome, 134 projected sensory data, 74–75 Prompts, 302–303, 447 Psycholinguistic trace features of words, 396–398 Punisher, 138, 181, 214–215, 327–328
Q Quasilearning, see Imprinting
R Rate of learning and peripheral reinforcers, 353–355 unfamiliar content, 280–287, 363–369 Reading instruction beginning reading, 449–453 corrective reading, 453–455 extinguishing inappropriate rules, 454–455 necessary prerequisite skills, 452 progress records, 455 Recall, see Memory Reception function, 6 receptive and expressive functions, 348–350, 369
of sensory input, 78–79 Receptive language, 391–392 Reductionism in traditional learning theories, 490–491 Reflexes as components of operant responses, 48–49 Reflexive processing in hardwired system, 16–18 Reflexive thought, 323, 324–328 inferences about process, 349–350 as insight, 349, 353–354 Reinforcement in basic learning, 111 in beginning instruction, 459–461 changing reinforcing conditions, 262–267 changing school behavior, 489–490 effects on extrapolation, 214–215 hardwired internal reinforcers, 20–21 from pattern discovery, 422, 423–424 peripheral reinforcers, 350–353, 422 responses to, 174–184 schedules, 154, 175–177, 179–180 setting conditions for, 459–461 unreinforced trials, 180–181 Reinforcers for approach responses, 34 false reinforcers created by agent, 426–429 natural reinforcers for feedback, 274–275 negative reinforcer, 26, 30–31, 39, 46–47, 144–145, 181 peripheral reinforcer, 350–353, 422 positive reinforcer, see Reinforcement predicted vs. realized reinforcers, 181–183 primary reinforcer, 20–21, 25, 95, 144, 164 reinforcers and predictors, 350–353 secondary reinforcer, 264–267 Relearning performance trends, 282 (Fig. 11.1) throw-away example sets, 281 trends, 280–283 Representations, see also Classification; Features core sensory representation, 196–199 general and specific, 193–195 process of representing sensory data, 195–199
Resident(ial) features, 188–189 abstraction of, 189–190 conveyed by language, 377 feature as host, 188–189 search for, 337 Response(s) adjusting features, 58–61, 64–67 avoidance, see Avoidance behavior based on content maps, 50–57 components in basic learning, 109–110 conditioned response, 101 directives, 57–58 escape response, 25–28, 33–35, 101 imprinting, 120, 122–123 model of response cycles, 72 (Fig. 4.1) models for planning, directing, and adjusting, 62–68, 71–78 operant response, see Operant responses reflexive components, 48–49 as sum of features, 150 Response cost, 214–215 Response function, 6–7 adjustments of default values, 61–64 and competing discriminative stimuli, 67–68 creation of gestalt, 224–225 default values in planning template, 57, 62–63 response planning, 49–55 Response repertoire agent resources, 80 details, 150–157 Response shaping, 153–156, 172, 262–263, 303 Response strategy, 108–112 models, 157–158 as trial content map, 102–103, 116–117 Response-related learning, 110–111 Response-strategy learning, 108–112, 148–173 (chap. 7) coding requirements, 160 combination model, 158, 167–168 creating content maps, 149–151, 172 default content maps, 152 events-based model, 157, 164–167 plan-based model, 157–158, 160–164 specificity of content map, 158 strategy variations, 148–153 Role learning, 429–430, 461–462
S Sameness principle applied, 308, 444–445 teaching samenesses and differences, 444–447 Scaffolding, 302, 447 Schedules of reinforcement extinction, 176–177 knowledge of patterns, 179–180 learning requirements, 175–176 and response strategies, 154 Screening function, 6–7 Searches, 336–340 for abstract relationships, 360–361 agent-directed, 358–361 and concentration, 359 for intersect of concrete and abstract features, 353–363 Secondary and unfamiliar learning, 261–292 (chap. 11) Secondary learning overview, 261–269 token example, 268–269 Secondary reinforcers general description, 264–265 natural order of learning, 265–267 quasisecondary reinforcers, 265 Secondary sensations, 30–35 as agent motivators, 30 approach and escape examples, 33–35 connected to specific discriminations, 32–33 pain, 30–32 vs. primary sensations, 30–31 Self-control, 429 Sensation content maps as sensation, 35–37 distortions of, 29–35 pain, 31 reception as sensation, 28–29 secondary, see Secondary sensations Sensory data generalized, 198–199 record, 133 represented, 195–199 Sensory imprinting, 120–122 Shaping bird behavior, 153–155 context (setting) shaping, 153–156, 172, 262–263, 303–307
one discrimination at a time, 306 escape responses, 156 and mastery, 307 response shaping, 153–156, 172, 262–263, 303 Single-feature classification, 240–243, 251–257 Specificity and generalization in response strategy, 171 Stimulus, see Discriminative stimulus Stipulation, 202, 221–223 avoiding, 307 examples of, 309–310 examples in instruction, 222–223 logic of, 221 as negative generalization, 221 Superego as agent-imposed content map, 331–332 Superstitious behavior, 114, 167 Symbolic logic, 483–486 Syntactical meaning, 377–391
T Task difficulty, levels of, 276–278 Teacher training, 457 Teaching, see also Instruction; Training of classification skills, 363–369 of concrete operations, 408–413 of conservation reasoning, 414–418 of conservation of substance, 409–413 for effective algorithm, 499–500 of formal operations, 414–418 of language components, 435–437 of learner roles, 461–462 to mastery, 456–457 of math skills, 441–442 of reading, 449–455 of related discriminations, 307–315 of relative direction, 413–414 of specific gravity, 414–416 stroke victims, 280–283 unfamiliar discriminations, 283–287 Teaching presentations communication through examples, 434–435 communication of more than one interpretation, 294 dynamic and static presentations, 247–249
implications of inferred functions, 489–500 setting events, 459–461 temporally ordered presentations, 247–250 Teaching rules applications, 437–441 concrete and abstract examples, 437–441 conventional nomenclature, 448–449 designing rules, 441–442 for fractions, 443–449 misinterpretation caused by stipulated rules, 442–443 removing scaffolding, 447 Templates for default classifications, 355–357 Termination criteria for content maps, 75–76 for plans, 76–77 Theory of Instruction, 498 Therapy, implications of inferred functions, 500, 502 Thinking (thought) abstract thought, 481–483 directed or volitional thought, 318–320, 323, 329–340, 425 planning as thinking, 320–324 reflexive thought, 323–328, 349–350, 353–354 Training, see also Instruction; Teaching and basic learning, 112–115 desirable ratio of correct/incorrect responses, 299 four levels of task difficulty, 276–278, 284 for new motor responses, 271–278 reteaching stroke victims lost skills, 278–283 for teaching unfamiliar discriminations, 283–287 use of models, 308–312 Transformation of data, 202–229 (chap. 9) Transformations in artificial intelligence, 477–479 based on extrapolation, 202, 209–221, 225–226 based on interpolation, 202–208, 225–226 and gestalts, 202, 226–227 of sensory and response features, 84–89 and stipulation, 202, 221–223 universal features, 87–89 Trial content map, 102–103, 116–117 modification, 134–135 Trial and error vs. insight, 338–340
U Uncertainty, 244 Unconscious operations, 501–502 Unfamiliar discriminations, 283–287, 416–417 learning trends, 284–287 (285 Fig. 11.2) Unfamiliar learning, 267–290, see also Secondary and unfamiliar learning, 261–292 (chap. 11) categories of, 269–270 initial vs. potential performance, 286 knowledge and, 270–271 new motor responses, 271–278 throw-away example sets, 281–282
training stroke victims, 278–283 Unfulfilled expectations, 338–339 as punishment, 422 Unintended interpretations technique for contradicting, 307–308 variation across learners, 300–301 Universal processes, 15, 16–18
V Vocoder for teaching deaf children, 283–287 Volition and thought, 318–343 (chap. 13), see also Thinking
About the Authors
Siegfried Engelmann is Professor of Education at the University of Oregon and Director of the National Institute for Direct Instruction. He has authored over 100 instructional programs, ranging from beginning reading to elementary chemistry and earth sciences. His principal efforts have focused on at-risk, deaf, Down’s Syndrome, and autistic children. He has authored 50 chapters, 95 articles, and 19 books on educational psychology and instruction, including Theory of Instruction (with Carnine). He has been involved in nine major research projects and has received the 2002 Award for Education Research from the Council of Scientific Society Presidents. His other awards include the Fred Keller Award from the American Psychological Association and an honorary doctorate from Western Michigan University.
Donald Steely has spent over 30 years in education, during which he has been a classroom teacher, a teacher trainer, a bilingual program director, a consultant, a researcher, and an instructional designer/author. He has developed print and multimedia instructional programs for both hearing and deaf students in science, mathematics, reading, spelling, and history. Dr. Steely has degrees from the University of Illinois, University of Oregon, and Honolulu University and is currently a research scientist at the Oregon Center for Applied Science.