Preface
- Edward Gibbon
My motivation for writing this book grew out of a perceived need for an integrated explanatory text on the subject. This perception came from the difficulty I have had in finding a good book for a course in Autonomous Robots I have been teaching for the past ten years or so. Although an extensive body of literature has been produced on behavior-based robots, the lack of such a text made it hard to introduce students to this field without throwing them into fairly deep technical literature, generally making it accessible only to advanced graduate students. Though there are several good books of collected original papers, they proved to be only partially adequate as an introduction.
This book's intended audience includes upper-level undergraduates and graduate students studying artificial intelligence (AI) and robotics, as well as those interested in learning more about robotics in general. It assumes the ability to comprehend a course in college-level artificial intelligence. The text could be used to support a course in AI-based or autonomous robotics or to supplement a general AI course.
Acknowledgments, of course, are a necessity. Being a religious man, I'd first like to thank God and Jesus Christ for enabling me to complete this long and arduous task. Two biblical passages that have inspired me as a roboticist can be found in Matthew 3:9 and Luke 19:37-40.
My family has been most gracious in putting up with the hardships generated by authorship. My wife, to my amazement, proofread the entire work. She and my four children put up with the inevitable absences and strains caused by this time-sapping process. I love them dearly for their support.
Earlier tutorials on behavior-based robotics, presented initially by myself (at the 1991 IEEE International Conference on Systems, Man, and Cybernetics) and later with Rod Grupen (at the 1993 International Conference on Robotics and Automation, and the 1993 International Joint Conference on Artificial Intelligence) helped coalesce many of the ideas contained herein. I thank Rod for working with me on these.
A large number of students at Georgia Tech over many years have contributed in a wide range of invaluable ways. I'd expressly like to thank Robin Murphy, Doug MacKenzie, Tucker Balch, Khaled Ali, Zhong Chen, Russ Clark, Elizabeth Nitz, David Hobbs, Warren Gardner, William Carter, Gary Boone, Michael Pearce, Juan Carlos Santamaria, David Cardoze, Bill Wester, Keith Ward, David Vaughn, Mark Pearson, and others who have made this book possible. I'd also like to thank Brandon Rhodes for pointing out the short story that leads into chapter 10.
Interactions with many of my professional colleagues in residence at Georgia Tech, as well as visitors, have also helped to generate many of the ideas found in this text. Among them are Prof. Chris Atkeson, Dr. Jonathan Cameron, Dr. Tom Collins, Prof. Ashok Goel, Prof. Jessica Hodgins, Dr. Daryl Lawton, Dr. John Pani, Prof. T. M. Rao, and Prof. Ashwin Ram. My thanks to y'all.
This book would never have been possible without the research funding support that came from a variety of sources. I'd like to acknowledge each of these agencies and the cognizant funding agent: the National Science Foundation (Howard Moraff), DARPA (Eric Mettala and Pradeep Khosla), the Office of Naval Research (Teresa McMullen), and the Westinghouse Savannah River Technology Center (Clyde Ward).
The folks at MIT Press have been great to work with from the book's inception, in particular the late Harry Stanton and Jerry Weinstein. I am also indebted to Prof. Michael Arbib for so graciously contributing the foreword to this book.
Special thanks go to Jim Hendler for piloting a draft version of the book for a course at the University of Maryland.
Finally, it is impossible to list all the people within the greater research community who have contributed with encouragement, support, and insight into this endeavor. To all of you I am deeply indebted and can only hope that this text serves in some way as compensation for those efforts.
Foreword Michael Arbib
Had he turned from politics to robotics, John Fitzgerald Kennedy might well have said, "Ask not what robotics can do for you, ask what you can do for robotics." Ronald Arkin has a more balanced view, however, for his book makes us vividly aware that the interplay between robotics and a host of other disciplines proceeds richly and fruitfully in both directions.
The emphasis is on the "brains" of robots rather than on their "bodies" - thus the title "Behavior-Based Robots," which moves the details of robot sensors and actuators firmly into the background, making them secondary to the main aim of the book: to understand what behaviors we should expect robots to exhibit, and which computational mechanisms can serve to achieve these behaviors. However, I should note two features of the book that contribute greatly to its liveliness: a fine selection of epigrams and a superb gallery of photographs of robots from all around the world. Thus even while the book focuses on robot brains, we have many opportunities to admire human wit and robot bodies.
Ronald Arkin has always been fascinated by parallels between animals (including humans) and robots. This is fully expressed here in his marshaling of data from biology and psychology to show that much of the behavior of animals and thus of robots can be understood in terms of patterns emerging from a basic set of reactive modules. However, Arkin also shows us that many behaviors can be achieved only by detailed analysis of representations of the world, and of the animal's place within it, built up from long-term experience. We are thus offered an insightful analysis of robot architectures that places them along a continuum from reactive to deliberative.
Schema theory provides a powerful framework exploited here to create behaviors that are distributed in their control structures, integrative of action and perception, and open to learning. These schemas can in turn be implemented using conventional computer programs, finite state acceptors, neural networks, or genetic algorithms. In building our appreciation of this framework, Arkin uses robotics to illuminate basic issues in computer science and artificial intelligence, and to feed new insights back into our reading of biology and psychology.
The book's chapter on social behavior presents what is itself a relatively recent chapter of robotics - sociorobotics. While it is in its infancy, we can see in the studies of robot teams, inter-robot communication, and social learning the beginnings not only of a powerful new technology, but also of a new science of experimental sociology.
Finally, we are taken to that meeting place between science fiction, philosophy, and technology that attracted many of us to wonder about robots in the first place. The final chapter, "Fringe Robotics: Beyond Behavior" (a nod to the 1960s British revue "Beyond the Fringe"?), debates the issues of robot thought, consciousness, emotion, and imagination, returns to Arkin's longstanding concern with the possible utility to robots of analogs of hormones and homeostasis, and closes with an all too brief glimpse of nanotechnology.
In this way we are given a tour that impresses with the depth of its analysis of the schemas underlying robot behavior, while continually illustrating the deep reciprocity between robotics and biology, psychology, sociology, and philosophy, and the important connections between robotics and many other areas of computer science. This is a subject whose fascination can only increase in the decades ahead as many researchers build on the framework so ably presented here.
Chapter 1
Whence Behavior?
Chapter Objectives
1. To understand what intelligent robots are.
2. To review the recent history that led to the development of behavior-based robotic systems.
3. To learn and appreciate the wide spectrum of robot control methods.
1.1 TOWARD INTELLIGENT ROBOTS
Perhaps the best way to begin our study is with a question: If we could create intelligent robots, what should they be like, and what should they be able to do? Answering the first part of this question - "What should they be like?" - requires a description of both the robot's physical structure (appearance) and its performance (behavior). However, the second part of the question - "What should they be able to do?" - frames the answer for the first part. Robots that need to move objects must be able to grasp them; robots that have to traverse rugged outdoor terrain need locomotion systems capable of moving in adverse conditions; robots that must function at night need sensors capable of operating under those conditions. A guiding principle in robotic design, whether structural or behavioral, involves understanding the environment within which the robot operates and the task(s) it is required to undertake. This ecological approach, in which the robot's goals and surroundings heavily influence its design, will be a recurring theme throughout this book.
But what is a robot? According to the Robotics Industry Association (RIA), "a robot is a reprogrammable, multifunctional manipulator designed to move material, parts, tools, or specialized devices through variable programmed motions for the performance of a variety of tasks" (Jablonski and Posey 1985).
Figure 1.1 Anthropomorphic robots. (A) WASUBOT, a keyboard musician capable of reading music. (B) WHL-11, a robot that walked in excess of 65 km during one exhibition. (Photographs courtesy of Atsuo Takanishi of Waseda University.)
This definition is quite restrictive, excluding mobile robots, among other things. On the other extreme, another definition describes robotics as the intelligent connection of perception to action (Brady 1985). This seems overly inclusive but does acknowledge the necessary relationship between these essential ingredients of robotic systems.
In any case, our working definition will be: An intelligent robot is a machine able to extract information from its environment and use knowledge about its world to move safely in a meaningful and purposive manner. Hollywood has often depicted robots as anthropomorphic creatures fashioned in the image of man, having two legs, two arms, a torso, and a head. Indeed robots have actually been created that have a humanlike structure: Figure 1.1 illustrates two such robots. Robots have often been modeled after animals other than humans, however. Insectlike robots are now commonplace and are commercially available; others look more like horses, spiders, or octopi, as figure 1.2 shows.
Robots also often look like vehicles capable of operating on the ground, in the air, or undersea. The examples shown in figure 1.3 represent classes of robots generically referred to as unmanned vehicles: UAV (unmanned aerial vehicle), UGV (unmanned ground vehicle), and UUV (unmanned undersea vehicle).
Figure 1.2 Animal-like robots. (A) Ariel, a hexapod crablike robot. (Photograph courtesy of IS Robotics, Somerville, MA.) (B) Quadruped robot that trots, paces, and bounds, built at the CMU leg laboratory in 1984. (Photograph by Jack Bingham. © 1992 Marc Raibert. All rights reserved.) (C) HEGI060 Hexapod robot. (Photograph courtesy of California Cybernetics Co., Tujunga, CA.)
Robots can be differentiated in terms of their size, the materials from which they are made, the way they are joined together, the actuators they use (motors and transmissions), the types of sensing systems they possess, their locomotion system, and their onboard computer systems. But a physical structure is clearly not enough. Robots must be animate, so they must have an underlying control system to provide the ability to move in a coordinated way. This book focuses on the performance and behavioral aspects of robotics and the design of control systems that allow them to perform the way we would like. The physical design of robots is not addressed: Many good sources already cover that material (e.g., Craig 1989, McKerrow 1991).
Figure 1.3 Unmanned vehicles. (A) Unmanned undersea vehicle: the Advanced Unmanned Search System (AUSS). (B) Unmanned aerial vehicle: Multipurpose Surveillance and Security Mission Platform (MSSMP). (C) Unmanned ground vehicle: Ground Surveillance Robot (GSR). (Photographs courtesy of U.S. Navy.)
How do we realize the goal of intelligent robotic behavior? What basic science and technology is needed to achieve this goal? This book attempts to answer these questions by studying the basis and organization of behavior and the related roles of knowledge and perception, learning and adaptation, and teamwork.
1.2 PRECURSORS
People that are really weird can get into sensitive positions and have a tremendous impact on history.
- J. Danforth Quayle
To invent you need a good imagination and a pile of junk.
- Thomas Alva Edison
The significant history associated with the origins of modern behavior-based robotics is important in understanding and appreciating the current state of the art. We now review important historical developments in three related areas: cybernetics, artificial intelligence, and robotics.
1.2.1 Cybernetics
Norbert Wiener is generally credited with leading, in the late 1940s, the development of cybernetics: a marriage of control theory, information science, and biology that seeks to explain the common principles of control and communication in both animals and machines (Wiener 1948). Ashby (1952) and Wiener furthered this view of an organism as a machine by using the mathematics developed for feedback control systems to express natural behavior. This affirmed the notion of situatedness, that is, a strong two-way coupling between an organism and its environment.
In 1953, W. Grey Walter applied these principles in the creation of a precursor robotic design termed Machina Speculatrix, which was subsequently transformed into hardware form as Grey Walter's tortoise. Some of the principles that were captured in his design include:
1. Parsimony: Simple is better. Simple reflexes can serve as the basis for behavior. "The variations of behavior patterns exhibited even with such economy of structure are complex and unpredictable" (Walter 1953, p. 126).
2. Exploration or speculation: The system never remains still except when feeding (recharging). This constant motion is adequate under normal circumstances to keep it from being trapped. "In its exploration of any ordinary room it inevitably encounters many obstacles; but apart from stairs and fur rugs, there are few situations from which it cannot extricate itself" (Walter 1953, p. 126).
3. Attraction (positive tropism): The system is motivated to move towards some environmental object. In the case of the tortoise, this is a light of moderate intensity.
4. Aversion (negative tropism): The system moves away from certain negative stimuli, for example, avoiding heavy obstacles and slopes.
5. Discernment: The system has the ability to distinguish between productive and unproductive behavior, adapting itself to the situation at hand.
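The exploration, attraction, and aversion principles above can be read as a small set of reflexes. The following Python sketch is only an illustration under stated assumptions: the `Percept` fields, the intensity thresholds, the action names, and the order in which the checks are made are all hypothetical and are not taken from Walter (1953), whose tortoise was an analog circuit rather than a program.

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions of this sketch, not values from Walter 1953).
MODERATE_MIN = 0.2  # weakest light that still attracts
MODERATE_MAX = 0.7  # strongest light that still counts as "moderate"


@dataclass
class Percept:
    light: float            # normalized light intensity in [0, 1]
    obstacle_contact: bool  # True while the robot is touching an obstacle
    feeding: bool = False   # True while recharging


def select_reflex(p: Percept) -> str:
    """Map one percept to one reflex; the check order is an assumption."""
    if p.feeding:
        return "rest"            # exploration: still only while feeding
    if p.obstacle_contact:
        return "turn_away"       # aversion (negative tropism)
    if MODERATE_MIN <= p.light <= MODERATE_MAX:
        return "approach_light"  # attraction (positive tropism)
    return "explore"             # otherwise keep moving


if __name__ == "__main__":
    print(select_reflex(Percept(light=0.5, obstacle_contact=False)))
    print(select_reflex(Percept(light=0.5, obstacle_contact=True)))
    print(select_reflex(Percept(light=0.0, obstacle_contact=False)))
```

Lights outside the moderate band are simply ignored in this sketch; how the real device treated them is a matter for the circuit description, not for this example.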
Figure 1.4 Circuit of Machina Speculatrix. (From The Living Brain by W. Grey Walter. Copyright 1953 © 1963 and renewed © 1981, 1991 by W. Grey Walter. Reprinted by permission of W. W. Norton and Company, Inc.)
The tortoise itself, constructed as an analog device (figure 1.4), consisted of two sensors, two actuators, and two "nerve cells" or vacuum tubes. A directional photocell for detecting light and a bump contact sensor provided the requisite environmental feedback. One motor steered the single front driving wheel. The photocell always pointed in the direction of this wheel and thus could scan the environment. The driving motor powered the wheel and provided locomotion.
Figure 1.5 Grey Walter's tortoise, recently restored to working order by Owen Holland. (Photograph courtesy of Owen Holland, The University of the West of England.)
The tortoise exhibited the following behaviors:
• Seeking light: The sensor rotated until a weak light source was detected while the drive motor continuously moved the robot to explore the environment at the same time.
• Head toward weak light: Once a weak light was detected, the tortoise moved in its direction.
• Back away from bright light: An aversive behavior repelled the tortoise from bright light sources. This behavior was used to particular advantage when the tortoise was recharging.
• Turn and push: Used to avoid obstacles, this behavior overrode the light response.
• Recharge battery: When the onboard battery power was low, the tortoise perceived a strong light source as weak. Because the recharging station had a strong light over it, the robot moved toward it and docked. After recharging, the light source was perceived as strong, and the robot was repelled from the recharging station.
The behaviors were prioritized from lowest to highest order: seeking light, move to/from light, and avoid obstacle. The tortoise always acted on the highest priority behavior applicable, for example choosing to avoid obstacles over moving toward a light. Behavior-based robotics still widely uses this basic principle, referred to as an arbitration coordination mechanism (section 3.4.3). Walter's tortoise exhibited moderately complex behavior: moving safely about a room and recharging itself as needed (figure 1.5). One recent architecture (Agah and Bekey 1997), described in section 9.8.3, employs Walter's ideas on positive and negative tropisms as a basis for creating adaptive behavior-based robotic systems.
Valentino Braitenberg revived this tradition three decades after Walter (Braitenberg 1984). Taking the vantage point of a psychologist, he extended the principles of analog circuit behavior to a series of gedanken experiments involving the design of a collection of vehicles. These systems used inhibitory
and excitatory influences, directly coupling the sensors to the motors. As before, seemingly complex behavior resulted from relatively simple sensorimotor transformations. Braitenberg created a wide range of vehicles, including those imagined to exhibit cowardice, aggression, and even love (figure 1.6). As with Walter's tortoise, these systems are inflexible, custom machines and are not reprogrammable. Nonetheless, the variability of their overt behavior is compelling.
Eventually, scientists created Braitenberg creatures that were true robots, not merely thought experiments. In one such effort, scientists at MIT's Media Lab (Hogg, Martin, and Resnick 1991) used specially modified LEGO bricks to build twelve autonomous creature-vehicles using Braitenberg's principles, including a timid shadow seeker, an indecisive shadow-edge finder, a paranoid shadow-fearing robot, a dogged obstacle avoider, an insecure wall follower, and a driven light seeker. Even more complex creatures have been assembled, attributed with personality traits such as persistence, consistency, inhumanity, or a frantic or observant nature. Granted, it is quite a leap to attribute these traits to robots built from such extremely simple circuits and plastic toy blocks, but the mere fact that an observer can perceive these qualities, even mildly, in such simple creatures is notable.
Figure 1.6 Braitenberg Vehicles.
(A) Vehicle 1 (single motor/single sensor): Motion is always forward in the direction of the sensor stalk, with the speed controlled by the sensor. Environmental perturbations (slippage, rough terrain) produce changes in direction.
(B) Vehicle 2 (two sensors/two motors): The photophobe on the left is aversive to light (exhibiting "fear" by fleeing) since the motor closest to the light source moves faster than the one farther away. This results in a net motion away from the light. The photovore on the right is attracted to light when the wires connecting sensors and motors are merely reversed from the photophobe (exhibiting "aggression" by charging into the attractor).
(C) Vehicle 3: Same wiring as for Vehicle 2 but now with inhibitory connections. The vehicles slow down in the presence of a strong light source and go fast in the presence of weak light. In both cases, the vehicle approaches and stops by the light source (with one facing the light and one with the light source to the rear). The vehicle on the left is said to "love" the light source since it will stay there indefinitely, while the vehicle on the right explores the world, liking to be near its current attractor, but always on the lookout for something else.
(D) Vehicle 4: By adding various nonlinear speed dependencies to Vehicle 3, where the speed peaks somewhere between the maximum and minimum intensities, other interesting motor behaviors can be observed. This can result in oscillatory navigation between two different light sources (top) or by circular or other unusual patterns traced around a single source (bottom).
(Figures from Braitenberg 1984. Reprinted with permission.)
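The two wirings of Vehicle 2 in figure 1.6 can be made concrete with a few lines of code. This is a minimal sketch, assuming a differential-drive vehicle whose motor speed is directly proportional to the sensor it is wired to; the function names, the `gain` and `wheel_base` parameters, and the sign convention (positive turn rate means turning left) are assumptions made for illustration and do not come from Braitenberg (1984).

```python
def motor_speeds(left_sensor, right_sensor, crossed, gain=1.0):
    """Return (left_motor, right_motor) for excitatory sensor-to-motor wiring.

    crossed=False: each sensor drives the motor on its own side (photophobe).
    crossed=True:  each sensor drives the motor on the opposite side (photovore).
    """
    if crossed:
        return gain * right_sensor, gain * left_sensor
    return gain * left_sensor, gain * right_sensor


def turn_rate(left_motor, right_motor, wheel_base=1.0):
    """Differential-drive turn rate; positive means turning left."""
    return (right_motor - left_motor) / wheel_base


if __name__ == "__main__":
    # A light on the vehicle's left excites the left sensor more strongly.
    left_reading, right_reading = 0.9, 0.2

    fear = turn_rate(*motor_speeds(left_reading, right_reading, crossed=False))
    aggression = turn_rate(*motor_speeds(left_reading, right_reading, crossed=True))

    print("photophobe turns", "left" if fear > 0 else "right")       # away from the light
    print("photovore turns", "left" if aggression > 0 else "right")  # toward the light
```

Swapping a single boolean reverses the overt behavior, which is the point of the thought experiment: the "personality" lives in the wiring, not in any explicit plan.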
1.2.2 Artificial Intelligence
The birth of artificial intelligence (AI) as a distinct field is generally associated with the Dartmouth Summer Research Conference held in the summer of 1956. This conference's goals involved the study of a wide range of topics including language use, neural nets, complexity theory, self-improvement, abstractions, and creativity. In the original proposal (McCarthy et al. 1955), Marvin Minsky indicates that an intelligent machine "would tend to build up within itself an abstract model of the environment in which it is placed. If it were given a problem it could first explore solutions within the internal abstract model of the environment and then attempt external experiments." This approach dominated robotics research for the next thirty years, during which time AI research developed a strong dependence upon the use of representational knowledge and deliberative reasoning methods for robotic planning. Hierarchical organization for planning was also mainstream:
A plan is any hierarchical process in the organism that can control the order in which a sequence of operations is performed. (Miller, Galanter, and Pribram 1960, p. 16)
Some of the better known examples of the AI planning tradition include:
• STRIPS: This theorem-proving system used first-order logic to develop a navigational plan (Fikes and Nilsson 1971).
• ABSTRIPS: This refinement of the STRIPS system used a hierarchy of abstraction spaces to improve the efficiency of a STRIPS-type planner, refining the details of a plan as they become important (Sacerdoti 1974).
• HACKER: This system searches through a library of procedures to propose a plan, which it later debugs. The blocks-world domain (toy blocks moved about by a simulated oversimplified robotic arm) served as a primary demonstration venue (Sussman 1975).
• NOAH: This hierarchical robotic assembly planner uses problem decomposition and then criticizes the potentially interacting subproblems, reordering their planned execution as necessary (Sacerdoti 1975).
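To give a feel for the planning style these systems share, here is a minimal sketch of a planner in the STRIPS spirit: the world is a set of symbolic facts, and each operator has preconditions, an add list, and a delete list. Note the substitutions: the original STRIPS used theorem proving over first-order logic, whereas this sketch uses propositional facts and plain breadth-first search, and the two-room box-pushing domain, the fact names, and the operator names are all hypothetical.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Operator(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    add_list: FrozenSet[str]
    delete_list: FrozenSet[str]


def make_op(name, pre, add, delete):
    """Convenience constructor that freezes the three fact sets."""
    return Operator(name, frozenset(pre), frozenset(add), frozenset(delete))


def plan(initial: Iterable[str], goal: Iterable[str],
         operators: List[Operator]) -> Optional[List[str]]:
    """Breadth-first forward search; returns a shortest plan or None."""
    start, target = frozenset(initial), frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if target <= state:                    # every goal fact holds
            return steps
        for op in operators:
            if op.preconditions <= state:      # operator is applicable
                nxt = (state - op.delete_list) | op.add_list
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, steps + [op.name]))
    return None


# Hypothetical domain: a robot and a box, each in room A or room B.
OPERATORS = [
    make_op("go_A_B", {"robot_in_A"}, {"robot_in_B"}, {"robot_in_A"}),
    make_op("go_B_A", {"robot_in_B"}, {"robot_in_A"}, {"robot_in_B"}),
    make_op("push_A_B", {"robot_in_A", "box_in_A"},
            {"robot_in_B", "box_in_B"}, {"robot_in_A", "box_in_A"}),
    make_op("push_B_A", {"robot_in_B", "box_in_B"},
            {"robot_in_A", "box_in_A"}, {"robot_in_B", "box_in_B"}),
]

if __name__ == "__main__":
    print(plan({"robot_in_A", "box_in_B"}, {"box_in_A"}, OPERATORS))
```

Everything the planner knows is in the symbolic state; nothing is sensed while the plan is being built, which is the dependence on representational knowledge that the surrounding text describes.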
The classical AI methodology has two important characteristics (Boden 1995): the ability to represent hierarchical structure by abstraction and the use of "strong" knowledge that employs explicit symbolic representational assertions about the world. AI's influence on robotics up to this point was in the idea that knowledge and knowledge representation are central to intelligence, and that robotics was no exception. Perhaps this was a consequence of AI's preoccupation with human-level intelligence. Considering lower life forms seemed uninteresting.
Behavior-based robotics systems reacted against these traditions. Perhaps Brooks (1987a) said it best: "Planning is just a way of avoiding figuring out what to do next." Although initially resisted, as paradigm shifts often are, the notion of sensing and acting within the environment started to take preeminence in AI-related robotics research over the previous focus on knowledge representation and planning. Enabling advances in robotic and sensor hardware had made it feasible to test the behavior-based robotics community's hypotheses. The results captured the imagination of AI researchers around the world.
The inception and growth of distributed artificial intelligence (DAI) paralleled these developments. Beginning as early as the Pandemonium system (Selfridge and Neisser 1960), the notion began to take root that multiple competing or cooperating processes (referred to initially as demons and later as agents) are capable of generating coherent behavior. Early blackboard-based speech understanding systems such as Hearsay II (Erman et al. 1980) referred to these asynchronous, independent agents as knowledge sources, communicating with each other through a global data structure called a blackboard. Minsky's Society of Mind theory (Minsky 1986) forwarded multiagent systems as the basis for all intelligence, claiming that although each agent is as simple as it can be, through the coordinated and concerted interaction between these simple agents, highly complex intelligence can emerge. Individual behaviors can often be viewed as independent agents in behavior-based robotics, relating it closely to DAI.
1.2.3 Robotics
Mainstream roboticists have by necessity generally been more concerned with perception and action than their classical artificial intelligence counterparts. To conduct robotics research, robots are needed. Those who only work with simulations often ignore this seemingly obvious point. Robots can be complex to build and difficult to maintain. To position current research relative to them, it is worth briefly reviewing some of roboticists' earliest efforts, bearing in mind that technology in the 1960s and 1970s severely constrained these projects compared to the computational luxuries afforded researchers today.
Many other robots will be discussed throughout this book, but these systems are notable as pioneers for those that followed.
• Shakey: One of the first mobile robots, Shakey (figure 1.7), was constructed in the late 1960s at the Stanford Research Institute (Nilsson 1969). It inhabited an artificial world, an office area with objects specially colored and shaped to assist it in recognizing an object using vision. It planned an action such as pushing the recognized object from one place to another, and then executed the plan. The STRIPS planning system mentioned earlier was developed for use in this system. The robot itself was constructed of two independently controlled stepper motors and had a vidicon television camera and optical range finder mounted at the top. The camera had motor-controlled tilt, focus, and iris capabilities. Whiskerlike bump sensors were mounted at the periphery of the robot for protection. The planner used information stored within a symbolic world model to determine what actions to take to achieve the robot's goal at a given time. In this system, perception provided the information to maintain and modify the world model's representations.
Figure 1.7 Shakey. (Photograph courtesy of SRI International.)
• HILARE: This project began around 1977 at Laboratoire d'Automatique et d'Analyse des Systèmes (LAAS) in Toulouse, France (Giralt, Chatila, and Vaisset 1984). The robot HILARE (figure 1.8) was equipped with three wheels: two drive and one caster. It was rather heavy, weighing in at 400 kg. Its world contained the smooth flat floors found in a typical office environment. For sensing, it used a video camera, fourteen ultrasonic sensors, and a laser range finder. Planning was conducted within a multilevel representational space: Geometric models represented the actual distances and measurements of the world, and a relational model expressed the connectivity of rooms and corridors. Of special note is HILARE's longevity. The robot was still being used for experimentation well over a decade after its initial construction (Noreils and Chatila 1989).
Figure 1.8 HILARE. (Photograph courtesy of LAAS-CNRS, Toulouse, France.)
• Stanford Cart/CMU Rover: The Stanford Cart (figure 1.9, A) was a minimal robotic platform used by Moravec to test stereo vision as a means for navigation (Moravec 1977). It was quite slow, lurching ahead about one meter every ten to fifteen minutes, with a full run lasting about five hours. The visual processing was the most time-consuming aspect, but the cart successfully navigated fairly complex twenty-meter courses, avoiding visually detected obstacles as it went. Obstacles were added to its internal world map as detected and were represented as enclosing spheres. The cart used a graph search algorithm to find the shortest path through this abstract model. Around 1980, Moravec left for Carnegie-Mellon University (CMU), where he led the effort in constructing the CMU Rover (Moravec 1983), a smaller, cylindrical robot with three independently powered and steered wheel pairs capable of carrying a camera mounted on a pan/tilt mechanism as well as infrared and ultrasonic sensors (figure 1.9, B). This robot was followed by a long succession of other CMU robots, several of which are described in other portions of this book.
Figure 1.9 (A) Stanford Cart. (B) CMU Rover. (Photographs courtesy of Hans Moravec and the Robotics Institute, Carnegie-Mellon University.)
These and other robotic precursors set the stage for the advances and controversies to come as behavior-based robotic systems appeared in the mid-1980s.
Figure 1.10 Robot control system spectrum.
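The Stanford Cart's approach of searching an abstract obstacle model for the shortest path can be illustrated with a small example. This is a sketch under stated assumptions, not a reconstruction of Moravec's program: it flattens the world to a 2-D grid, treats each obstacle as an enclosing circle `(cx, cy, radius)` standing in for the enclosing spheres, and uses Dijkstra's algorithm as the graph search. The grid size, 8-connected moves, and function names are all hypothetical.

```python
import heapq
import math


def blocked(cell, obstacles):
    """True if the grid cell lies inside any enclosing circle (cx, cy, r)."""
    x, y = cell
    return any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles)


def shortest_path(start, goal, obstacles, width, height):
    """Dijkstra's algorithm over an 8-connected grid.

    Returns (path, cost), or (None, math.inf) if the goal is unreachable.
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1),
             (1, 1), (1, -1), (-1, 1), (-1, -1)]
    dist = {start: 0.0}
    parent = {start: None}
    queue = [(0.0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            path = []
            while cell is not None:        # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1], d
        if d > dist[cell]:
            continue                       # stale queue entry
        for dx, dy in moves:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if blocked(nxt, obstacles):
                continue
            nd = d + math.hypot(dx, dy)    # diagonal steps cost sqrt(2)
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                parent[nxt] = cell
                heapq.heappush(queue, (nd, nxt))
    return None, math.inf


if __name__ == "__main__":
    # One obstacle sits directly between start and goal.
    path, cost = shortest_path((0, 0), (4, 0), [(2, 0, 0.5)], 5, 5)
    print(path, round(cost, 3))
```

As in the cart, the quality of the result depends entirely on the map: an obstacle that was never detected, or that has moved since it was recorded, is invisible to the search.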
1.3 THE SPECTRUM OF ROBOT CONTROL
Many different techniques and approaches for robotic control have been developed. Figure 1.10 depicts a spectrum of current robot control strategies. The left side represents methods that employ deliberative reasoning and the right represents reactive control. A robot employing deliberative reasoning requires relatively complete knowledge about the world and uses this knowledge to predict the outcome of its actions, an ability that enables it to optimize its performance relative to its model of the world. Deliberative reasoning often requires strong assumptions about this world model, primarily that the knowledge upon which reasoning is based is consistent, reliable, and certain. If the information the reasoner uses is inaccurate or has changed since obtained, the outcome of reasoning may err seriously. In a dynamic world, where objects may be moving arbitrarily (e.g., in a battlefield or a crowded corridor), it is potentially dangerous to rely on past information that may no longer be valid. Representational world models are therefore generally constructed from both prior knowledge about the environment and incoming sensor data in support of deliberation.
Figure 1.11 Albus's hierarchical intelligent control system.
Deliberative reasoning systems often have several common characteristics:
• They are hierarchical in structure with a clearly identifiable subdivision of functionality, similar to the organization of commercial businesses or military command.
• Communication and control occur in a predictable and predetermined manner, flowing up and down the hierarchy, with little if any lateral movement.
• Higher levels in the hierarchy provide subgoals for lower subordinate levels.
• Planning scope, both spatial and temporal, changes during descent in the hierarchy. Time requirements are shorter and spatial considerations are more local at the lower levels.
• They rely heavily on symbolic representation world models.
1.3.1 Deliberative/Hierarchical Control
The intelligent control robotics community, whose roots precede those of reactive behavior-based systems, uses deliberative reasoning methods as its principal paradigm. Albus, at the National Institute of Standards and Technology, is one of this philosophy's leading proponents. His methods attempt to integrate both natural and artificial reasoning (Albus 1991). Figure 1.11 depicts a jukebox-like hierarchical model, with each layer consisting of four components, as outlined in Albus's theory of intelligence: sensory processing, world modeling, task decomposition, and value judgment. All layers are joined by a global memory through which representational knowledge is shared. Perhaps the most telling assertion that represents the heavy reliance on world models is reflected in Albus's views regarding the role of perception: Perception is the establishment and maintenance of correspondence between the internal world model and the external real world. Consequently, action results from reasoning over the world model. Perception thus is not tied directly to action.

In the mid-1980s, this view so dominated robotics that the government developed a standard architecture that reflected this model. Figure 1.12 shows the NASA/NIST (NBS) standard reference model for Telerobot Control System Architecture, or NASREM (Albus, McCain, and Lumia 1987). Despite the government's endorsement, NASREM has had only limited acceptance but is still being used for tasks such as creating a flight telerobotic servicer capable of performing maintenance and simple assembly tasks for NASA's space station Freedom (Lumia 1994). The six levels embodied in this system each capture a specific functionality. Simply put, from the lowest level to the highest:
1. Servo: provides servo control (position, velocity, and force management) for all the robot's actuators.
2. Primitive: determines motion primitives to generate smooth trajectories.
3. Elemental move: defines and plans for the robot paths free of collisions with environmental obstacles.
4. Task: converts desired actions on a single object in the world into sequences of elemental moves that can accomplish them.
5. Service bay: converts actions on groups of objects into tasks to be performed on individual objects, scheduling tasks within the service bay area.
6. Service mission: decomposes the overall high-level mission plan into service bay commands.
Higher levels in the hierarchy create subgoals for lower levels.

Another architectural embodiment of these same ideas, RCS (the Real-time Control System reference model architecture), has the same basic layering as NASREM but more faithfully embeds the components outlined in Albus's theory of intelligence. This approach was tested in simulation for an autonomous submarine (Huang 1996) but has not yet been fielded on the actual vehicle.

In other work along these same lines, researchers at Drexel University have focused on the theory of intelligent hierarchical control and created a control model possessing the following characteristics:
Figure 1.12 NASREM architecture. (Diagram labels: Sensory Processing, World Modeling, Task Decomposition, Sense, Act.)
• It correlates human teams and robotic control structures: A hierarchy of decision makers implements this idea. Autonomous control systems are organized as teams of decision makers.
• It assumes that the task is decomposable, that is, it can result in structured subtasks.
• Hierarchies are generated by recursion using a generalized controller.
• Preconditions are established at each level of recursion to ensure proper execution.
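The characteristics listed above (decomposable tasks, a hierarchy generated by recursion over one generalized controller, and a precondition checked at every level) can be sketched in a few lines. This is only an illustrative sketch, not the Drexel model itself: the `Controller` class, its fields, and the example task strings are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


def _non_empty(task: str) -> bool:
    # Hypothetical default precondition: the task description must not be blank.
    return bool(task.strip())


def _no_subtasks(task: str) -> List[str]:
    # Hypothetical default decomposition: a leaf controller produces no subtasks.
    return []


@dataclass
class Controller:
    """One decision maker; the same class is reused at every level (assumption)."""
    name: str
    decompose: Callable[[str], List[str]] = _no_subtasks
    precondition: Callable[[str], bool] = _non_empty
    subordinate: Optional["Controller"] = None

    def execute(self, task: str, depth: int = 0,
                trace: Optional[List[Tuple[int, str, str]]] = None):
        trace = [] if trace is None else trace
        # The precondition is checked at each level of recursion.
        if not self.precondition(task):
            raise ValueError(f"{self.name}: precondition failed for {task!r}")
        trace.append((depth, self.name, task))
        # Control flows strictly downward: subgoals go to the subordinate level.
        if self.subordinate is not None:
            for subtask in self.decompose(task):
                self.subordinate.execute(subtask, depth + 1, trace)
        return trace


def split_mission(task: str) -> List[str]:
    return [f"{task} > pick up", f"{task} > drop off"]


def split_leg(task: str) -> List[str]:
    return [f"{task} > plan path", f"{task} > follow path"]


pilot = Controller("pilot")
navigator = Controller("navigator", decompose=split_leg, subordinate=pilot)
planner = Controller("planner", decompose=split_mission, subordinate=navigator)

trace = planner.execute("deliver part")
for depth, name, task in trace:
    print("  " * depth + f"{name}: {task}")
```

Note that nothing in this sketch lets two controllers at the same level talk to each other, which mirrors the restricted, vertical flow of communication attributed to hierarchical designs.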
Figure 1.13 shows a mobile robot control system consisting of six levels. The set of nested hierarchical controllers consists of a high-level planner, navigator, pilot, path monitor, controller, and low-level control system.

In yet another representative of the intelligent controls community, research at the Rensselaer Polytechnic Institute (Lefebvre and Saridis 1992) restricted the hierarchy to three primary levels: organization level (conducts high-level planning and reasoning), coordination level (provides integration across various hardware subsystems), and execution level (supports basic control and hardware). This approach implements the principle of increasing precision with decreasing intelligence as one descends through the hierarchy. Figure 1.14 depicts a logical model of this architectural framework. Note the clear (and restrictive) flow of control and communication between levels within the hierarchy.

Hierarchical control is seemingly well suited for structured and highly predictable environments (e.g., manufacturing). Reactive systems, however, were developed in response to several of the apparent drawbacks associated with the hierarchical design paradigm, including a perceived lack of responsiveness in unstructured and uncertain environments, due both to the requirements of world modeling and the limited communication pathways; and the difficulty in engineering complete systems, as incremental competency proved difficult to achieve, that is, virtually the entire system needed to be built before testing was feasible.
1.3.2 Reactive Systems
The right side of the spectrum depicted in figure 1.10 represents reactive systems. Simply put, reactive control is a technique for tightly coupling perception and action, typically in the context of motor behaviors, to produce timely robotic response in dynamic and unstructured worlds. We further define the following:
• An individual behavior: a stimulus/response pair for a given environmental setting that is modulated by attention and determined by intention.
• Attention: prioritizes tasks and focuses sensory resources and is determined by the current environmental context.
• Intention: determines which set of behaviors should be active based on the robotic agent's internal goals and objectives.
• Overt or emergent behavior: the global behavior of the robot or organism as a consequence of the interaction of the active individual behaviors.
Figure 1.13 Nested hierarchical intelligent controller.
Figure 1.14 Logical model of hierarchical intelligent robot.

• Reflexive behavior (alternatively, purely reactive behavior): behavior that is generated by hardwired reactive behaviors with tight sensor-effector arcs, where sensory information is not persistent and no world models are used whatsoever.

Several key aspects of this behavior-based methodology include (Brooks 1991b):
• Situatedness: The robot is an entity situated and surrounded by the real world. It does not operate upon abstract representations of reality, but rather reality itself.
• Embodiment: A robot has a physical presence (a body). This spatial reality has consequences in its dynamic interactions with the world that cannot be simulated faithfully.
• Emergence: Intelligence arises from the interactions of the robotic agent with its environment. It is not a property of either the agent or the environment in isolation but is rather a result of the interplay between them.

This book focuses primarily on behavior-based reactive robotic systems, whose structure and organization chapters 3 and 4 describe in more detail. Hierarchical control, however, is also discussed further in the context of hybrid robotic architectures presented in chapter 6.
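The definitions in this section can be made concrete with a small sketch. It treats each individual behavior as a stimulus/response function, lets a list of active behavior names stand in for intention, and keeps no state between calls, so it is purely reflexive in the sense defined above. The behavior names, sensor keys, thresholds, and response strings are hypothetical, and the sketch deliberately leaves open how the triggered responses are coordinated.

```python
from typing import Callable, Dict, List, Optional

# A stimulus is just the current sensor reading; nothing is stored between calls.
Stimulus = Dict[str, float]
Behavior = Callable[[Stimulus], Optional[str]]


def avoid(stimulus: Stimulus) -> Optional[str]:
    # Responds only when something is closer than 0.5 m (hypothetical threshold).
    return "turn-away" if stimulus.get("range_m", float("inf")) < 0.5 else None


def seek_light(stimulus: Stimulus) -> Optional[str]:
    # Responds only when a light level above 0.2 is sensed (hypothetical threshold).
    return "turn-toward-light" if stimulus.get("light", 0.0) > 0.2 else None


def wander(stimulus: Stimulus) -> Optional[str]:
    # Always responds, regardless of the stimulus.
    return "move-forward"


BEHAVIORS: Dict[str, Behavior] = {
    "avoid": avoid,
    "seek_light": seek_light,
    "wander": wander,
}


def react(stimulus: Stimulus, intention: List[str]) -> List[str]:
    """Return the responses of the active behaviors that the stimulus triggers."""
    responses = []
    for name in intention:          # intention selects which behaviors are active
        response = BEHAVIORS[name](stimulus)
        if response is not None:    # the stimulus did trigger this behavior
            responses.append(response)
    return responses


print(react({"range_m": 0.3, "light": 0.5}, ["avoid", "seek_light", "wander"]))
print(react({"range_m": 2.0, "light": 0.0}, ["avoid", "wander"]))
```

The overt behavior of the robot would result from combining or arbitrating among the returned responses over time as the environment changes; that coordination step is outside the scope of this sketch.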
1.4 RELATED ISSUES
A few important issues central to understanding and appreciating the behavior-based paradigm warrant additional discussion before heading into the core of this book.
• Grounding in reality: A chronic criticism of traditional artificial intelligence research is that it suffers from the symbol grounding problem, that is, the symbols with which the system reasons often have no physical correlation with reality; they are not grounded by perceptual or motor acts. In a sense ungrounded systems can be said to be delusional: Their world is an artifactual hallucination. Robotic simulations are often the most insidious examples of this problem, with "robots" purporting to be sensing and acting but instead just creating new symbols from old, none of which truly corresponds to actual events. Embodiment, as stated earlier, forces a robot to function within its environment: sensing, acting, and suffering directly from the consequences of its misperceptions and misconceptions. "Building robots that are situated in the world crystallizes the hard issues" (Flynn and Brooks 1989). For that reason this book focuses primarily on real robotic systems implemented in hardware as exemplars for robotic control.
• Ecological dynamics: A physical agent does not reside in a vacuum but is typically immersed in a highly dynamic environment that varies significantly in both space and time. Further, these environmental dynamics, except for highly structured workplaces, are very difficult if not impossible to characterize. Nonetheless, if a situated robotic agent is to be designed properly, it must acknowledge within its design the opportunities and perils that the environment affords it. This is much easier said than done. In nature, evolutionary processes shape agents to fit their ecological niche; these time scales unfortunately are not available to the practicing roboticist. Adaptation, however, can be crucially important; chapter 8 explores this further.
• Scalability: Scalability of the behavior-based approach has been a major question from its inception. Although these methods are clearly well suited for low-level tasks requiring the competence of creatures such as insects, it has been unclear whether they would scale to conform to human-level intelligence. Tsotsos (1995), for example, argues that "the strict behaviorist position for the modeling of intelligence does not scale to human-like problems and performance." Section 7.1 considers this point further. Many of the strict behaviorists persist in their view that the approach has no limits; notably, Brooks (1990b) states that "we believe that in principle we have uncovered the fundamental foundation of intelligence." Others advocate a hybrid approach between symbolic reasoning and behavioral methods, arguing that these two approaches are fully compatible: "The false dichotomy that exists between hierarchical control and reactive systems should be dropped" (Arkin 1989d). (See also chapter 6.) Much current research focuses on testing the limits of behavior-based methods, and this theme will recur throughout this book.
1.5 WHAT'S AHEAD
This book consists of the following chapters:
1. Introduction: highlights the core issues of intelligent robotics and reviews the history of cybernetics, artificial intelligence, and robotics that led up to the development of behavior-based robotic systems.
2. Animal behavior: studies the basis for intelligence, biological systems, through the eyes of psychologists, neuroscientists, and ethologists and examines several representative robotic systems inspired by animal behavior.
3. Robot behavior: describes the basis for behavior-based robotics, including the notation, expression, encoding, assembling, and coordination of behaviors.
4. Behavior-based architectures: presents a range of robotic architectures employing the behavior-based paradigm.
5. Representational issues for behavioral systems: questions and explores the role of representational knowledge within the context of a behavior-based system.
6. Hybrid deliberative/reactive architectures: evaluates robotic architectures that couple more traditional artificial intelligence planning systems with reactive control systems in an effort to extend further the utility of behavior-based control.
7. Perceptual basis for behavior-based control: considers the issues concerning the connection of perception to action (sensor types, perceptual modules, expectations, attention, and so on) and presents perceptual design for a range of robotic tasks, including descriptions of specific applications.
8. Adaptive behavior: addresses how robots can cope with a changing world through a variety of learning and adaptation mechanisms, including reinforcement learning, neural networks, fuzzy logic, evolutionary methods, and others.
9. Social behavior: opens up behavior-based robotics to the consideration of how teams and societies of robots can function together effectively, raising new issues such as communication, interference, and multiagent competition, cooperation, and learning, and presents a case study illustrating many of these concepts.
10. Open issues: explores some open questions and philosophical issues regarding intelligence within artificial systems in general and behavior-based robots in particular.
Chapter 2 Animal Behavior
Animals, in their generation, are wiser than the sons of men; but their wisdom is confined to a few particulars, and lies in a very narrow compass.
- Joseph Addison
Chapter Objectives
1. To develop an understanding of the possible relationships between animal behavior and robot control.
2. To provide a reasonable background in neuroscience, psychology, and ethology for the roboticist.
3. To examine a wide range of biologically motivated robotic systems.

2.1 WHAT DOES ANIMAL BEHAVIOR OFFER ROBOTICS?
The possibility of intelligent behavior is indicated by its manifestation in biological systems. It seems logical, then, that the study of behavior-based robotics should begin with an overview of biological behavior. First, animal behavior defines intelligence. Where intelligence begins and ends is an open-ended question, but we will concede in this text that intelligence can reside in subhuman animals. Our working definition will be that intelligence endows a system (biological or otherwise) with the ability to improve its likelihood of survival within the real world and where appropriate to compete or cooperate successfully with other agents to do so. Second, animal behavior provides an existence proof that intelligence is achievable. It is not a mystical concept; it is a concrete reality, although a poorly understood phenomenon. Third, the study of animal behavior can provide models that a roboticist can operationalize within a robotic system. These models may be implemented with high fidelity to their animal counterparts or may serve only as an inspiration for the robotics researcher.

Roboticists have struggled to provide their machines with animals' simplest capabilities: the ability to perceive and act within the environment in a meaningful and purposive manner. Although a study of existing biological systems that already possess the ability to conduct these tasks successfully seems obviously a reasonable method to achieve that goal, the robotics community has historically resisted it for two principal reasons. First, the underlying hardware is fundamentally different. Biological systems bring a large amount of evolutionary baggage unnecessary to support intelligent behavior in their silicon-based counterparts. Second, our knowledge of the functioning of biological hardware is often inadequate to support its migration from one system to another. For these and other reasons, many roboticists ignore biological realities and seek purely engineering solutions. Behavior-based roboticists argue that there is much that can be gained for robotics through the study of neuroscience, psychology, and ethology.
The behavior-based roboticist needs to decide how to use results from these other disciplines. Some scientists attempt to implement these results as closely as possible, concerning themselves primarily with testing the underlying hypotheses of the biological models in question. Others choose to abstract the underlying details and use these models for inspiration to create more intelligent robots, unconcerned with any impact within the disciplines from which the original models arose. We will see examples of both approaches within this book.

To appreciate behavior-based robotics, it is important to have some background in biological behavior, which this chapter attempts to provide. We first overview the important concepts of neuroscience, psychology, and ethology, in that order. The chapter concludes with several exemplar robotic systems whose goals have drawn heavily on biological models for robotic implementation, a theme that continues to varying degrees throughout the remainder of the book.
2.2 NEUROSCIENTIFIC BASIS FOR BEHAVIOR
The central nervous system (CNS) is a highly complex subject whose discussion warrants at least a separate textbook. This section attempts only a gross overview. First, it highlights the component technology of neural circuitry. Next, it introduces the reader to the most basic aspects of brain function and structure and the neurophysiological pathways that translate stimulus into response, that is, produce behavior. Last, it presents abstract computational models developed within brain theory that have served as a basis for behavior-based robotic systems.
2.2.1 Neural Circuitry

Each outcry of the hunted hare
A fibre from the brain does tear.
- William Blake
The nervous system's elemental cellular component is the neuron (figure 2.1). There is no single "canonical" neuron: They come in many different shapes and sizes, but they do possess a common structure. Emanating from the cell body at the axon hillock is the axon, which after a traversal of some length branches off into a collection of synaptic terminals or bulbs. This branching is referred to as axonal arborealization. The axon is often sheathed in myelin, which facilitates the transmission of the neural impulse along the fiber. The boundary between neural interconnections, referred to as a synapse, is where chemical neurotransmitters diffuse across the synaptic cleft when the cell "fires." At the neuron's receiving end, a collection of dendrites emanates from the cell body, continuing from the other side of the synapse.

Signal transmission across the neuron occurs by the conveyance of an electrical charge from the dendrites' input surfaces through the cell body. If the total amount of electricity impinging upon the cell is below a certain threshold, the current is passively propagated through the cell up the axon, becoming weaker as it progresses. If, however, it exceeds the threshold at the axon hillock, a spike is generated and actively propagated without significant loss of current up the axon to the synaptic bulbs, causing the release of neurotransmitters across the synaptic cleft. The cell must then wait a finite amount of time (the refractory period) before it can generate another electrical spike. This spike is also referred to as the action potential. Basic neurotransmitters are of two principal types: excitatory, adding to the probability of the receiving cell's firing; and inhibitory, decreasing the likelihood of the receiving cell's firing.

Figure 2.1 Stylized representation of a neuron, showing the dendrites, axon hillock, axon, and axonal arborization. (From The Metaphorical Brain by M. Arbib. Copyright 1972 by Wiley-Interscience. Reprinted by permission of John Wiley and Sons, Inc.)

Combinations of neurons give rise to increasingly complex neural circuitry. There are many examples of specialized small systems of neurons whose function neuroscientists have elucidated. These special-purpose systems include (from Arbib 1995b):
• scratch reflexes in turtles
• bat sonar
• stomatogastric control in lobsters
• locomotion pattern generation in lampreys
• wiping reflex in frogs
• cockroach locomotion
• location of objects with electricity in electric fish
• visuomotor coordination in flies and frogs
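The firing rule described earlier in this section (excitatory input raises and inhibitory input lowers the total drive, a spike occurs only when the threshold is exceeded, and the cell is then silent for a refractory period) can be sketched as a discrete-time toy model. This is a deliberately crude abstraction rather than a physiological model; the function names, the threshold of 1.0, and the two-step refractory period are assumptions made for illustration.

```python
from typing import List, Sequence


def net_input(excitatory: Sequence[float], inhibitory: Sequence[float]) -> float:
    """Total drive on the cell: excitatory inputs add, inhibitory inputs subtract."""
    return sum(excitatory) - sum(inhibitory)


def simulate_neuron(drive: Sequence[float],
                    threshold: float = 1.0,
                    refractory_steps: int = 2) -> List[bool]:
    """Return one boolean per time step, True where a spike is generated.

    drive            -- net input at each time step (hypothetical units)
    threshold        -- a spike occurs only when the drive exceeds this value
    refractory_steps -- steps after a spike during which no spike can occur
    """
    spikes = []
    wait = 0  # remaining refractory steps
    for total in drive:
        if wait > 0:
            spikes.append(False)  # still in the refractory period
            wait -= 1
        elif total > threshold:
            spikes.append(True)   # threshold exceeded: action potential
            wait = refractory_steps
        else:
            spikes.append(False)  # subthreshold: no spike
    return spikes


drive = [0.5, 1.2, 1.5, 1.5, 1.5, 0.2]
print(simulate_neuron(drive))
```

With the drive shown, the strong inputs at steps 2 and 3 produce no spike because they fall inside the refractory period that follows the spike at step 1.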
Often roboticists can draw from these neural models to create similar forms of behavior in machines. In section 2.5, we will study a few examples of this
each dedicated to a specific function. These physically parallel columns also process information in parallel. This is of particular importance for space-preserving maps generated from sensory inputs. For example, in touch, space-preserving maps from the skin's embedded neural tactile sensors project to the brain's somatosensory cortex. This is also the case for visual input from the eyes' retinas that ultimately projects onto the visual cortex. Parallel pathways are naturally present for the processing of spatially distributed information.

One model related to the inherent parallelism in neural processing is referred to as lateral inhibition, where inhibition of a neuron's firing arises from the activity of its neural neighbors. Lateral inhibition can yield a single dominant pathway even when multiple concurrent active pathways are present. This results from amplification of the variations in activity between different neurons or neural pathways. Through strong lateral inhibition, one choice from many can be selected in a winner-take-all manner. This is of particular value in tasks such as competitive learning for pattern classification, prey recognition, disambiguation of word sense in language, solving the correspondence problem in stereo vision (finding matching features in two or more separate images), or selecting one from among many possible behavioral responses.
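A minimal sketch of winner-take-all selection through lateral inhibition follows. It assumes a simple update rule chosen for illustration: on each iteration every unit's activity is reduced in proportion to the summed activity of all the other units and clamped at zero. Under this rule the difference between any two still-active units grows by a factor of (1 + inhibition) per step, which is the amplification of variations described above. The function name and the inhibition constant are hypothetical.

```python
from typing import List, Sequence


def winner_take_all(activity: Sequence[float],
                    inhibition: float = 0.2,
                    max_iters: int = 100) -> List[float]:
    """Iterate mutual inhibition until at most one unit remains active.

    Each unit is inhibited by the summed activity of its neighbors (here, all
    other units). Exact ties are not broken by this rule, so the loop is
    bounded by max_iters.
    """
    current = list(activity)
    for _ in range(max_iters):
        total = sum(current)
        # Subtract inhibition from neighbors; activity cannot go below zero.
        updated = [max(0.0, a - inhibition * (total - a)) for a in current]
        if sum(1 for a in updated if a > 0.0) <= 1:
            return updated
        current = updated
    return current


result = winner_take_all([0.50, 0.60, 0.55])
winner = max(range(len(result)), key=lambda i: result[i])
print(winner, result)
```

Although the three starting activities differ only slightly, only the initially strongest unit (index 1) is left active when the iteration stops.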
2.2.2 Brain Structure and Function

It is said that the limbic system of the brain controls the four F's: Feeding, Fighting, Fleeing, and Reproduction.
- Karl Pribram
Animal brains obviously come in a very wide range of sizes. Simple invertebrates have nervous systems consisting of 10^3-10^4 neurons, whereas the brain of a small vertebrate such as a mouse contains approximately 10^7 neurons. The human brain has been estimated to contain 10^10-10^11 individual neurons. Despite a large variation in brain size, we can say several things generally about vertebrate brains. First, locality is a common feature. Brains are not a homogeneous mass of neurons; rather, they are structurally organized into different regions, each of which contains specialized functionality. Next, animal brains generally have three major subdivisions (figure 2.2). For mammalian brains, these generally consist of
(1) the forebrain, which comprises the
• Neocortex: associated with higher-level cognition.
• Limbic system (between the neocortex and cerebrum): providing basic behavioral survival responses.
• Thalamus: mediating incoming sensory information and outgoing motor responses.
• Hypothalamus: managing homeostasis, that is, maintaining a safe internal state (temperature, hunger, respiration, and the like).
(2) the brainstem, which comprises the
• Midbrain: concerned with the processing of incoming sensory information (sight, sound, touch, and so forth) and control of primitive motor response systems.
• Hindbrain, which consists of the
  • Pons: projecting across the brain carrying information to the cortex.
  • Cerebellum: maintaining the tone of muscle groups necessary for coordinated motion.
  • Medulla oblongata: connecting the brain and the spinal cord.
and (3) the spinal cord, containing reflexive pathways for control of various motor systems.

Finally, afferent inputs convey signals (typically sensory) toward the brain, whereas efferent signals convey commands from the brain to the body. Invertebrate neural structure is highly variable and thus fewer generalizations can be made.

Mammalian cortex has regions associated with specific sensory inputs and motor command outputs (figure 2.3). In humans, the visual cortex (sight) is toward the rear of the brain, the auditory cortex (sound) is to the side, and the somatosensory cortex (touch) is midbrain, adjacent to the motor cortex (locomotion). Space-preserving topographic mappings from the input sensory organs to the cortex are present within all these regions. It is interesting to note that these mappings are plastic in the sense that they can be reorganized after damage. This has been shown for both the somatosensory system (Florence and Kaas 1995) and for visual cortex (Kaas et al. 1990).

Subspecialization occurs within the brain as well. At this level neuroscientific models have often had an impact on behavior-based robot design. For example:
• In section 7.2.2 we encounter "what" and "where" cortical regions associated with visual processing.
• Section 6.2 discusses distinctions between deliberative (willed) and automatic behavioral control systems for managing motor actions based upon neurophysiological evidence.
• Evidence for parallel mechanisms associated with both long- and short-term memory has also been uncovered (Miller and Desimone 1994). For both cases, two distinct processing systems are seemingly present: one for facts and events, the other for learning motor and perceptual skills. Section 5.2 discusses the role of these two different types of memory for behavior-based robotic systems.

Figure 2.2 Animal brain structures (panels: primitive brain stem, reptilian brain, rat brain (not to scale), mammalian brain, chimpanzee brain, human brain; labeled regions include forebrain, midbrain, hindbrain, cerebral hemisphere, cerebellum, and brain stem). (Figure courtesy of Rod Gropen.)
Figure 2.3 Regions of sensory and motor processing in the human cortex (somatosensory cortex, motor cortex, auditory cortex; primary cortical, secondary cortical, and association areas). The general flow of information within the brain is indicated by arrows. (Figure courtesy of Rod Gropen.)
Neurobiology often argues for the hypothesis of a vectorial basis for motor control, something that can be readily translated into robotic control systems (section 3.3.2). Research at MIT (Bizzi, Mussa-Ivaldi, and Giszter 1991) has shown that a neural encoding of potential limb motion encompassing direction, amplitude, and velocity exists within the spinal cord of the deafferented frog. Microstimulation of different regions of the spinal cord generates specific force fields directing the forelimb to specific locations. These convergent force fields move the limb towards an equilibrium point specified by the region stimulated (figure 2.4). The limb itself can be considered a set of tunable springs as it moves towards its rest position (equilibrium point). Thus the planning aspects of the CNS translate into establishing the equilibrium points that implicitly specify a desired motion. Of particular interest is the observation that multiple stimulations give rise to new spatial equilibrium points generated by simple vector addition. This same principle is used in schema-based robotic control (section 4.4). New experiments in humans (intact, fortunately) have been shown to be consistent with this force-field model when applied to reaching tasks (Shadmehr and Mussa-Ivaldi 1994).

Figure 2.4 (A) Force fields generated by microstimulation of lumbar regions (A-D) of frog spinal cord (shown at left). (B) Superposition of multiple stimuli. C denotes simple vector summation of independent fields A and B. D represents actual field evoked by microstimulation of regions A and B concurrently. (Reprinted with permission from Bizzi, Mussa-Ivaldi, and Giszter 1991. Copyright 1991 by American Association for the Advancement of Science.)

Box 2.1 Memory Types
• Short-term memory (STM): Short-term memory is a process acting over an intermediate time scale that involves performing "tasks requiring temporary storage and manipulation of information to guide appropriate actions" (Guigon and Burnod 1995). STM is also referred to as working memory. Information stored in STM persists for periods of several seconds to minutes. STM is quite limited in its capacity for holding information, a significant distinction from long-term memory.
• Long-term memory (LTM): Long-term memory is what we usually refer to as memory in everyday conversation. The time scale for LTM is measured in hours, days, or years. One working definition for LTM is "retention of information longer than 24 hours" (McFarland 1981). Long-term memory is viewed by many as almost limitless in its capacity, but its recall accuracy is inferior to that of STM. The hippocampal region of the brain appears involved with the process of transfer of information from STM to LTM.

Other compelling examples arguing that computations of motion within the brain should be considered as vectors include research from the New York University Medical Center (Pellionisz and Llinas 1980). The authors contend that activity within the brain is vectorial. Intended motion is generated as an activity vector within a three-dimensional space. Brain function is considered "geometric," where the language of the brain is vectorial. The authors explain, however, that simple reflexes are not adequate to explain the entire range of complexity evidenced by the actions the brain generates.

Another example forwarding spatial vectors as an underlying representational medium for neural specification of motor behavior comes from the Johns Hopkins School of Medicine (Georgopoulos 1986). The "vector hypothesis" asserts that the changes in the activity of specific populations of neurons generate a neural coding in the form of a spatial vector for primate reaching. Experimental results have been shown to be consistent with this underlying hypothesis.

Finally, vector-based trajectory generation has served as an account for certain forms of animal navigation. Arbib and House's model (1987) explains detour behavior in toads (i.e., their circumnavigation of obstacles) by describing the animal's path planning in terms of the generation of divergence fields (directional vectors) based on the animal's perceived environment. In particular, repulsive fields surrounding obstacles, attractive forces leading to food sources, and directional vectors based on the frog's spatial orientation generate a computational model of path planning in toads consistent with observed experimental data. This model has been influential in the design of schema-based robot controllers (section 3.3.2).
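The vector-summation idea running through this discussion can be illustrated with a minimal two-dimensional sketch: an attractive field pulls toward a goal, a repulsive field pushes away from a nearby obstacle, and the commanded direction is simply their vector sum. The field shapes, gains, and radius below are assumptions chosen for illustration; they are not taken from the Bizzi et al. experiments or from Arbib and House's model.

```python
import math
from typing import Tuple

Vector = Tuple[float, float]


def attract(pos: Vector, goal: Vector, gain: float = 1.0) -> Vector:
    """Constant-magnitude vector pointing from pos toward goal (assumed shape)."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    return (gain * dx / dist, gain * dy / dist)


def repel(pos: Vector, obstacle: Vector,
          radius: float = 2.0, gain: float = 1.0) -> Vector:
    """Vector pointing away from obstacle; zero outside radius (assumed shape)."""
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= radius:
        return (0.0, 0.0)
    magnitude = gain * (radius - dist) / radius  # stronger when closer
    return (magnitude * dx / dist, magnitude * dy / dist)


def add_vectors(*vectors: Vector) -> Vector:
    """Simple vector summation of independently generated fields."""
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))


position = (0.0, 0.0)
goal = (10.0, 0.0)
obstacle = (1.0, -0.5)  # slightly to one side of the straight-line path

combined = add_vectors(attract(position, goal), repel(position, obstacle))
print(combined)
```

In this configuration the summed vector still points toward the goal but is deflected away from the obstacle, a small-scale analogue of the detour behavior described above.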
2.2.3 Abstract Neuroscientific Models
Unfortunately, our knowledge of the brain's function is still largely superficial (literally). Progress in neuroscience is proceeding at a rapid pace as new tools for understanding brain function become available. Nonetheless, it has been said that even if we possessed a complete road map of the brain's neural structure (all of its neurons and their interconnections), our understanding would still be inadequate. Brain activity over the neural substrate is highly dynamic, and information regarding processing and control would still need to be elaborated.

What then should a brain theorist do? The key for many scientists lies in their first formulating an abstraction of brain function and then looking for neural confirmation. This top-down approach characterizes many researchers in neuroscience, and has potentially high payoff for roboticists, as abstract models of brain function hypothesized by these neuroscientists can potentially lead to robotic control systems useful in their own right.

Abstract computational models used to express brain behavior have two mainstream forms: schema theory and neural networks. These two approaches are fully compatible (figure 2.5). Schema theory is a higher-level abstraction by which behavior can be expressed modularly. Neural networks provide a basis for modeling at a finer granularity, where parallel processing occurs at a lower level. Schema theory is currently more adept at expressing brain function, whereas neural networks can more closely reflect brain structure. Schemas, once formulated, may be translated into neural network models if desired or deemed necessary. In this book we study both methods in the context of what they offer behavior-based robotic control systems.
2.2.3.1 Schema-Theoretic Methods
The use of schemas as a philosophical model for the explanation of behavior dates as far back as Immanuel Kant in the eighteenth century. Schemas were defined as a means by which understanding is able to categorize sensory perception in the process of realizing knowledge or experience. Neurophysiological schema theory emerged early in the twentieth century. The first application was an effort to explain postural control mechanisms in humans (Head and Holmes 1911). Schema theory has influenced psychology as well, serving as a bridging abstraction between brain and mind. Work by Bartlett (1932) and Piaget (1971) used schema theory as a mechanism for expressing models of memory and learning. Neisser (1976) presented a cognitive model of interaction between motor behaviors in the form of schemas interlocking with perception in the context of the perceptual cycle. Norman and Shallice (1986) used schemas as a means for differentiating between two classes of behavior, willed and automatic, and proposed a cognitive model that uses contention scheduling mechanisms as a means for cooperation and competition between behaviors. Sections 6.2 and 7.2.3 discuss these last two examples further. Arbib (1981) was the first to consider the applications of schema theory to robotic

Figure 2.5 Abstract behavioral models. Schemas or neural networks by themselves can be used to represent overt agent behavior, or schemas can be used as a higher-level abstraction that is in turn decomposed into a collection of neural networks.
sion systems (Riseman and Hanson 1987).

Many definitions exist for the term "schema," often strongly influenced by its application area (e.g., computational, neuroscientific, psychological). Some representative examples:
• a pattern of action as well as a pattern for action (Neisser 1976)
• an adaptive controller that uses an identification procedure to update its representation of the object being controlled (Arbib 1981)
• a functional unit receiving special information, anticipating a possible perceptual content, matching itself to the perceived information (Koy-Oberthur 1989)
• a perceptual whole corresponding to a mental entity (Piaget 1971)

Our working definition is as follows: A schema is the basic unit of behavior from which complex actions can be constructed; it consists of the knowledge of how to act or perceive as well as the computational process by which it is enacted. Schema theory provides a method for encoding robotic behavior at a coarser granularity than neural networks while retaining the aspects of concurrent cooperative-competitive control common to neuroscientific models.

Various neurocomputational architectures have been created that incorporate these ideas. For example, work at the University of Genova (Morasso and Sanguineti 1994) has led to the development of a model that encompasses the vector-based motion-planning strategies described earlier within the posterior parietal cortex. Here, multiple sensorimotor mappings are integrated into a unitary body schema, necessary for the generation of goal-oriented movements. Vector-based potential fields (section 3.3.2) provide the currency of task specification for this integration of task intentions. Later, section 4.4 explores other models and associated methods for operationalizing schema theory as a basis for behavior-based control.
2.2.3.2 Neural Networks

Computational models for neural networks, also referred to as connectionist systems, have a rich history paralleling the development of traditional symbolic AI. Some of the earliest work in the area can be traced to the McCulloch-Pitts model of neurons (1943). McCulloch and Pitts used a simple linear threshold unit with synaptic weights associated to each synaptic input. If a threshold was exceeded, the neuron fired, carrying its output to the next neuron. This
Figure 2.6 A perceptron. A vector of binary inputs (x_1, x_2, x_3, ..., x_n) is multiplied by each component's associated synaptic weight w_i. The result is then summed (Σ) together and then subjected to a thresholding operation (θ) to determine the unit's binary output. This output is then sent on to the next cell in the network.
simple model gave rise to networks capable of learning simple pattern recognition tasks. Rosenblatt (1958) later introduced a formal neural model called a perceptron (figure 2.6) and the associated perceptron convergence proof that established provable learning properties of these network systems. In the 1960s and 1970s, however, neural network research went into a decline for a variety of reasons, including the publication of the book Perceptrons (Minsky and Papert 1969), which proved the limitations of single-layer perceptron networks. In the 1980s, however, the field resurged with the advent of multilayer neural networks and the use of backpropagation (Rumelhart, Hinton, and Williams 1986) as a means for training these systems. Many other notable efforts within connectionism during the last decade, far too numerous to review here, have yielded highly significant results. It should be remembered, however, that most neural networks are only inspired by actual biological neurons and provide poor fidelity regarding brain function. Nonetheless, these abstract computational models have relevance to the behavior-based roboticist and have been used widely in tasks ranging from visual road-following strategies (section 7.6.1) to adaptation and learning in behavioral control systems (section 8.4). Wasserman 1989 provides a more general treatment of neural networks.
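The threshold unit and the perceptron learning idea described above can be illustrated with a short sketch. This is a minimal illustration rather than code from any of the cited works: the function names, the learning rate, and the choice to adapt the threshold along with the weights are all assumptions made for the example. It trains a single unit on the logical AND function, which is linearly separable, and on XOR, which a single-layer unit cannot represent (the kind of limitation Minsky and Papert analyzed).

```python
# Hypothetical sketch: a linear threshold unit trained with the perceptron
# learning rule. Names and constants are illustrative assumptions.

def threshold_unit(inputs, weights, theta):
    """Fire (1) if the weighted sum of binary inputs exceeds theta, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > theta else 0


def train_perceptron(samples, n_inputs, rate=1.0, max_epochs=100):
    """Adjust weights and threshold after every misclassified sample."""
    weights = [0.0] * n_inputs
    theta = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            error = target - threshold_unit(inputs, weights, theta)
            if error != 0:
                errors += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                theta -= rate * error  # lowering theta makes firing easier
        if errors == 0:  # a full pass with no mistakes: training has converged
            break
    return weights, theta


def count_mistakes(samples, weights, theta):
    """Number of samples the trained unit still gets wrong."""
    return sum(threshold_unit(x, weights, theta) != t for x, t in samples)


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_weights, and_theta = train_perceptron(AND_SAMPLES, n_inputs=2)
xor_weights, xor_theta = train_perceptron(XOR_SAMPLES, n_inputs=2)
```

After training, `count_mistakes(AND_SAMPLES, and_weights, and_theta)` is zero, whereas the XOR unit always leaves at least one sample misclassified no matter how long it trains.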
2.3 PSYCHOLOGICAL BASIS FOR BEHAVIOR

Psychology is brainless, neuroscience is mindless.
- Les Karlovits
Psychology has traditionally focused on the functioning of the mind, less so the brain. It is not our intent to revisit the classical monist/dualist debate of mind and brain here but rather to look at what a psychologist's perspective can offer robotics. Certainly psychology is preoccupied with behavior. Within that scope, we focus on perception and action, as these issues are of primary concern for the roboticist, and provide a brief history of the field, beginning with the twentieth century.

Sensory psychophysics was the first to relate stimulation intensity to perception. Weber and Fechner developed physical laws that described the relationships between a stimulus's physical intensity and its intensity as perceived by an observer (Pani 1996).

Behaviorism burst upon psychology in the early 1910s. Behaviorists discarded all mentalistic concepts: sensation, perception, image, desire, purpose, thinking, and emotion, among others (Watson 1925). Behavior was defined by observation only; data was obtained from observing what an organism did or said. Everything was cast in terms of stimulus and response. This approach's main benefit was making the field more scientifically objective, moving away from the use of introspection as the primary basis for the study of mind. Its main claim was "that there is a response to every effective stimulus and that response is immediate" (Watson 1925). As behaviorism progressed, psychology as a field became more and more scientific and less philosophical, sociological, and theological by relying heavily upon empirical data (Hull 1943). B. F. Skinner (1974) eventually became behaviorism's best known proponent.

Gestalt psychology (Kohler 1947) brought physics into the fray, drawing from the tradition of sensory psychophysics while broadening behaviorism's basis. This form of psychology inverted behaviorism somewhat, concerning itself heavily with sensory input (predominantly visual) and how behavior arises as a direct consequence of the structure of the physical environment interacting with the agent itself. The term "gestalt" was derived from the German, where it referred to form or shape as an attribute. Certain gestalts enabled certain behaviors based upon the physics of retinal projection and the ability of the perceiver to organize the incoming stimuli. Gestalt psychology focused on perception
whereas behaviorism principally concerned itself with action (Neumann and Prinz 1990). Gestalters, however, felt that behaviorism was limited, arguing that levels of organization exist above the sensation itself, which an organism could use to its advantage.

Ecological psychology, as advocated by J. J. Gibson (1979), demanded a deep understanding of the environment in which the organism was situated and how evolution affected its development. The notion of affordances (discussed further in section 7.2.3) provides a means for explaining perception's roots in behavior. This psychological theory says that things are perceived in terms of the opportunities they afford an agent to act. All actions are a direct consequence of sensory pickup. This results from the tuning by evolution of an organism situated in the world to its available stimuli. Significant assertions (Gibson 1979) include:
• The environment is what organisms perceive. The physical world differs from the environment, that is, it is more than the world described by physics.
• The observer and the environment complement each other.
• Perception of surfaces is a powerful means of understanding the environment.
• Information is inherent in the ambient light and is picked up by the agent's optic array.

Later, cognitive psychology emerged, paralleling the advent of computer science, defining cognition as "the activity of knowing: the acquisition, organization, and use of knowledge" (Neisser 1976). Information processing and computational models of the mind began to play an ever increasing role. Behaviorism was relegated to the role of explaining animal behavior and became far less influential in studying human intelligence. Unifying methods of explaining the relationship between action and perception (section 7.2.3) were developed under the banner of cognitive psychology (Neisser 1976). Mentalistic terms previously abandoned could now be considered using computational processes or metaphors.
Some of the underlying assumptions of the information processing approach (Eysenck 1993) include:
• A series of subsystems processes environmental information (e.g., stimulus → attention → perception → thought processes → decision → response).
• The individual subsystems transform the data systematically.
• Information processing in people strongly correlates with that in computers.
• Bottom-up processing is initiated by stimuli, top-down processing by intentions and expectations (section 7.5.4).
Connectionism and the associated development of neural network technology (see section 2.2.3) offer another alternative computational model to explain mental processing, available to be exploited by psychologists.

Although psychology has fluctuated significantly, depending on the current school of thought, roboticists can derive considerable benefit from an understanding of these different perspectives. The roboticist's goals are generally different: Machine intelligence does not necessarily require a satisfactory explanation of human-level intelligence. Indeed, even passé psychological theories can be of value as inspiration in building behavior-based automatons.
2.4 ETHOLOGICAL BASIS FOR BEHAVIOR

Animals are stylized characters in a kind of old saga - stylized because even the most acute of them have little leeway as they play out their parts.
- Edward Hoagland

Ethology is the study of animal behavior in its natural environment. To the strict ethologist, behavioral studies must be undertaken in the wild; animals' responses have no meaning outside their natural setting. The animal itself is only one component of the overall system, which must include the environment in which it resides. Konrad Lorenz and Niko Tinbergen are widely acknowledged as the founders of the field. Tinbergen considered ethological studies to focus on four primary areas of behavior (McFarland 1981): causation, survival value, development, and evolution.

Animal behavior itself can be roughly categorized into three major classes (Beer, Chiel, and Sterling 1990; McFarland 1981):
• Reflexes are rapid, automatic, involuntary responses triggered by certain environmental stimuli. The reflexive response persists only as long as the duration of the stimulus. Further, the response intensity correlates with the stimulus's strength. Reflexes are used for locomotion and other highly coordinated activities. Certain escape behaviors, such as those found in snails and bristle worms, involve reflexive action that results in rapid contraction of specific muscles related to the flight response.
• Taxes are behavioral responses that orient the animal toward or away from a stimulus (attractive or aversive). Taxes occur in response to visual, chemical, mechanical, and electromagnetic phenomena in a wide range of animals. Chemotaxis is evident in response to chemical stimuli, as found in the trail following of ants. Klinotaxis occurs in fly maggots moving toward a light source by comparing the intensity of the light from each side of their bodies, resulting in a wavy course. Tropotaxis exhibited by wood lice results in their heading directly toward a light source through the use of their compound eyes.
• Fixed-action patterns are time-extended response patterns triggered by a stimulus but persisting for longer than the stimulus itself. The intensity and duration of the response are not governed by the strength and duration of the stimulus, unlike a reflexive behavior. Fixed-action patterns may be motivated, unlike reflexes, and they may result from a much broader range of stimuli than those that govern a simple reflex. Examples include the egg-retrieving behavior of the greylag goose, the song of crickets, locust flight patterns, and crayfish escape.

Motivated behaviors are governed not only by environmental stimuli but also by the internal state of the animal, being influenced by such things as appetite.

Ethologists such as Lorenz adopted the notion of schema as well. Schemas capture complicated combinations of reflexes, taxes, and patterns released in response to a suitable combination of stimuli. A sign stimulus is the particular external stimulus that releases the stereotypical response. Schemas, which were later renamed innate releasing mechanisms (IRMs) in an effort to clarify their meaning (Lorenz 1981), have the following traits (Lorenz and Leyhausen 1973):
• An IRM is a simplified rendering of a combination of stimuli eliciting a specific, perhaps complex, response in a particular biological situation.
• One IRM belongs to one reaction to a given situation, attuned to relatively few distinctive features of the environment and oblivious to the rest.
• Every action dependent on its own releasing schema may be elicited completely independently of all other reactions intended for the same object.
• The innate releasing mechanism provides the overall means for a specific sign stimulus to release a stereotypical response within a given environmental context.
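The three behavioral classes described earlier in this section differ mainly in how the response depends on the stimulus over time, and a small sketch can make that contrast concrete. The following is only an illustration under stated assumptions: the discrete time steps, gains, trigger level, pattern shape, and function names are invented for the example, and angles are treated as plain numbers without wraparound.

```python
# Hypothetical sketch of reflexes, taxes, and fixed-action patterns as
# discrete-time functions of a stimulus. All constants are illustrative.

def reflex(stimulus, gain=1.0):
    """Response lasts only while the stimulus does and scales with it."""
    return [gain * s for s in stimulus]


def taxis(heading, bearing_to_stimulus, attract=True, turn_rate=0.5):
    """Turn part of the way toward (attractive) or away from (aversive)
    the stimulus bearing; returns the new heading."""
    direction = 1.0 if attract else -1.0
    return heading + direction * turn_rate * (bearing_to_stimulus - heading)


def fixed_action_pattern(stimulus, trigger=0.5, pattern=(1.0, 1.0, 1.0, 1.0)):
    """Once released by a strong enough stimulus, play out the whole
    pattern, ignoring the stimulus's later strength and duration."""
    response = []
    remaining = []
    for s in stimulus:
        if not remaining and s >= trigger:
            remaining = list(pattern)  # the sign stimulus releases the pattern
        response.append(remaining.pop(0) if remaining else 0.0)
    return response


brief_stimulus = [0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
reflex_response = reflex(brief_stimulus)
pattern_response = fixed_action_pattern(brief_stimulus)
```

With the brief stimulus above, the reflex responds for a single step, whereas the fixed-action pattern continues for four steps after its release, and a stronger brief stimulus would produce exactly the same pattern.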
For Lorenz and Tinbergen (Lorenz 1981), complex systems of behavioral mechanisms had a hierarchical component, although Tinbergen considered this a weak commitment useful principally only for organizational purposes. Figure 2.7 shows an example for the display behavior stickleback fish use in protecting territory. This notion of hierarchical grouping has parallels in schema theory as well, where assemblages serve as aggregates of component schemas (section 3.4.4).
Figure 2.8 Model of maximum selection system providing lateral inhibition between behaviors. Whenever a behavior is ready and its sensory stimulus present, it generates a positive response while inhibiting other potentially active behaviors. After Lorenz 1981.
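The maximum selection idea in figure 2.8 can be sketched as a simple winner-take-all computation. This is a hedged illustration, not Lorenz's model: combining readiness and stimulus by multiplication, the behavior names, and the numeric values are all assumptions chosen for the example.

```python
# Hypothetical sketch of a maximum selecting system (winner-take-all).
# The product of readiness and stimulus is an assumed activation rule.

def maximum_selection(readiness, stimulus):
    """Return each behavior's output, with every behavior except the
    strongest candidate inhibited to zero.

    readiness and stimulus map behavior names to non-negative numbers.
    A behavior is a candidate only if it is ready and its stimulus is present.
    """
    activation = {
        name: readiness[name] * stimulus.get(name, 0.0) for name in readiness
    }
    winner = max(activation, key=activation.get)
    if activation[winner] <= 0.0:
        # No behavior is both ready and stimulated, so all stay silent.
        return {name: 0.0 for name in activation}
    return {
        name: (activation[name] if name == winner else 0.0)
        for name in activation
    }


readiness = {"feed": 0.9, "flee": 0.4, "court": 0.7}
stimulus = {"feed": 0.5, "flee": 1.0, "court": 0.0}
outputs = maximum_selection(readiness, stimulus)
active = [name for name, value in outputs.items() if value > 0.0]
```

Here only the feeding behavior remains active: fleeing has the stronger stimulus but lower readiness, and courting is ready but has no stimulus present.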
Reciprocal inhibition of parallel behaviors, a form of lateral inhibition, is also available (figure 2.8). In one model, a maximum selecting system enables one of many behaviors to dominate based on its readiness and incoming stimuli. This winner-take-all strategy is common with many of the arbitration methods used in behavior-based robotics systems (section 3.4.3). All behaviors not active at a particular moment are inhibited centrally. Supporting experimental evidence exists for this locus of superior command in animals ranging from invertebrates to primates (Lorenz 1981). Hybrid behavior-based robotic architectures, discussed at length in chapter 6, exploit the utility of this organizational concept.

Ethological studies in animal communication mechanisms are highly relevant for multiagent robotic systems (Arkin and Hobbs 1992). Display behavior, in particular, involves the signaling of information by changes in posture or activity. These stereotyped and often highly unusual displays are most often generated by fixed-action patterns (Smith 1977) and may be visible, audible, tactile, chemical, or even electrical, as in the case of the electric eel. The displays themselves include birdsong, raising of a dog's hackles, courting displays in ducks, color changes in fish, leg waving in spiders, etc. Such displays have evolutionary benefits for such activities as indirectly invoking escape behavior in the presence of predators, reducing the likelihood of fighting, and facilitating mating, among many others. The messages themselves may be behavioral selection messages that enable the recipient to respond appropriately for a given situation (i.e., what to do), such as "flee," in the case of an alarm message. They may also be so-called nonbehavioral messages, such as "who" messages for kin or sex recognition or "where" messages providing location information from the sender. These nonbehavioral messages still may ultimately affect the recipient's behavior, but not so directly as behavioral messages. Ethologists have developed methods for analyzing and representing complex and ritualistic interactions such as courtship and greeting.

One of the most important concepts for behavior-based roboticists drawn from the field of ethology is the ecological niche. As defined by McFarland (1981, p. 411), "The status of an animal in its community, in terms of its relations to food and enemies, is generally called its niche." Animals survive in nature because they have found a reasonably stable niche: a place where they can coexist with their environment. Gibson (1979) strongly asserted this mutuality of animal and environment as a tenet of his school of ecological psychology (see section 2.3). Evolution has molded animals to fit their niche. Further, as the environment is always to some degree in flux, a successful animal must be capable of adapting to some degree to these changes or it will perish.
Environmental pressures asserted by changes in habitat, climate, food sources, and the like, can profoundly influence the species' survivability.

This concept of niche is important to roboticists because of their goals. If the roboticist intends to build a system that is autonomous and can successfully compete with other environmental inhabitants, that system must find a stable niche or it (as an application) will be unsuccessful. This promulgates the view that robotic systems must find their place within the world as competitors with other ecological counterparts (e.g., people). For robots to be commonplace, they must find the ecological niches that allow them to survive and/or dominate their competitors, whether they be mechanical or biological. Often economic pressures are sufficient to prevent the fielding of a robotic system. If humans are willing to perform the same task as a robot (e.g., vacuuming) at a lower cost and/or with greater reliability, the robot will be unable to displace the human worker from the niche he already occupies. Thus, for a roboticist to design effective real-world systems, he must be able to characterize the environment effectively. The system must be targeted towards some niche. Often this implies a high degree of specialization. These same arguments are often used in economics and marketing and are generalizable to behavior-based robotics (McFarland and Bosser 1993). Section 4.5.7 presents one example of a niche-based robotic architecture.
2.5 REPRESENTATIVE EXAMPLES OF BIO-ROBOTS

Let us begin our discussion of biorobots by summarizing some important lessons animal behavior affords the roboticist:
• Complex behaviors can be constructed from simpler ones (e.g., through hierarchies or sequentially, as in fixed-action patterns).
• Perceptual strategies should be tuned to respond only to the specific environmental stimuli relevant for situation-specific responses.
• Competing behaviors must be coordinated by selection, arbitration, or some other means.
• Robotic behaviors should match their environment well, that is, fit a particular ecological niche.

We now turn to five representative examples of robotic systems heavily motivated by animal studies. The first two focus on perceptual aspects, that is, sensory devices mimicking chemotaxis in ants and the compound eye of the fly. A pair of examples then illustrates the problem of producing coordinated locomotion for a robotic cockroach and a primate swinging from trees. The last case concerns interagent communication for a robotic honeybee. These examples are but interesting pieces of the puzzle of building robots; subsequent chapters explore complete behavior-based robot design.

2.5.1 Ant Chemotaxis

Go to the ant, thou sluggard, consider her ways, and be wise.
- Proverbs 6:6
Ant behavior is of keen interest to roboticists because ants are relatively simple creatures capable of complex actions through their social behavior and biologists have studied them extensively. Excellent reference works are available on ant behavior (e.g., Holldobler and Wilson 1990). Much animal research significantly influenced multiagent robotic systems: We defer that discussion until chapter 9. For the moment we consider how chemical sensing, inspired by ants' behavior, can be used for path following in robots.

Ant communication is predominantly chemical. Visited paths are marked using a volatile trail pheromone. All ants traversing a useful path continually add this odor to the trail, strengthening and reinforcing it for future use. The variations in foraging strategies result in a wide range of species-specific collective patterns that have evolved to fit the ecological needs of the environment to which they are adapted. It could be useful for one interested in developing robots capable of foraging over long distances to consider the models forwarded by ant entomologists.

Simulation studies conducted at the University of Brussels have shown the spontaneous development of biologically plausible trails using mathematical behavior models. Internest traffic for the Argentine ant has been simulated using pheromone models (Aron et al. 1990). Deneubourg and Goss (1989) have reproduced species-specific foraging patterns for three different army ant species. Goss et al. (1990) have likewise emulated computationally the rotation of foraging trails observed in the harvester ant. These simulation studies, although encouraging, still require implementation on real robots to gain widespread acceptance as useful models for robot foraging behavior.

Researchers in Australia have taken a step forward towards more directly emulating ant behavior by creating robotic systems capable of both laying down and detecting chemical trails (Russell, Thiel, and Mackay-Sim 1994). These systems exhibit chemotaxis: detecting and orienting themselves along a chemical trail. Camphor, a volatile chemical used in mothballs, serves as the chemical scent. The application method is straightforward: the robot drags a felt-tipped pen containing camphor across the floor as it moves, depositing a trail one centimeter wide (figure 2.9a). Sensing is more complex.
The detection device contains two sensor heads separated by 50 mm (figure 2.9b). An inlet draws in air from immediately below the sensor across a gravimetric detector crystal. An air downflow surrounding the inlet ensures that the inlet air is arriving from directly below the sensor. The detector crystal is treated with a coating that absorbs camphor, and as mass is added, the crystal's resonant frequency changes in proportion to the amount of camphor absorbed. When this chemotactic system has been attached to a tracked mobile robot provided with an algorithm that strives to keep the odor trail between the two sensor inlets, the robot has been able to follow the chemotactic trail successfully for up to one-half hour after the application of the camphor trail.
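The source does not give the trail-following algorithm itself, so the following is only a sketch of one way a controller might "keep the odor trail between the two sensor inlets." Everything except the 50 mm sensor separation is an assumption: the Gaussian odor model, the placement of the sensors ahead of the robot's center, the gains, and all function names are hypothetical. The robot turns toward whichever sensor reads the stronger odor, which steers the midpoint between the sensors back over the trail.

```python
import math

SENSOR_SEPARATION_MM = 50.0  # from the text; all other values are assumed


def odor_reading(lateral_offset_mm, spread_mm=30.0):
    """Assumed sensor model: reading falls off with distance from the trail."""
    return math.exp(-((lateral_offset_mm / spread_mm) ** 2))


def turn_command(left_reading, right_reading, gain=0.05):
    """Positive turns left, negative turns right, zero keeps straight."""
    return gain * (left_reading - right_reading)


def follow_trail(start_offset_mm, steps=200, speed_mm=5.0, lookahead_mm=50.0):
    """Simulate tracking a straight trail laid along y = 0.

    y is the robot center's offset to the left of the trail, and heading is
    its angle from the trail direction (positive means pointing left).
    Returns the final lateral offset in millimeters.
    """
    y, heading = start_offset_mm, 0.0
    half = SENSOR_SEPARATION_MM / 2.0
    for _ in range(steps):
        # Sensors sit ahead of the robot center, one on each side.
        mid = y + lookahead_mm * math.sin(heading)
        left = odor_reading(mid + half * math.cos(heading))
        right = odor_reading(mid - half * math.cos(heading))
        heading += turn_command(left, right)
        y += speed_mm * math.sin(heading)
    return y


final_from_left = follow_trail(15.0)
final_from_right = follow_trail(-15.0)
```

In this toy model the robot starts 15 mm to either side of the trail and ends within a millimeter of it. Mounting the sensors ahead of the center is what damps the oscillation; with the sensors directly over the center, the same rule would weave back and forth across the trail.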
Figure 2.9 Chemotaxis hardware: (A) the camphor applicator; (B) the dual-head sensing device.

Figure 2.10 Robot equipped with a compound eye consisting of 100 facets, providing a 360-degree panoramic view of the horizon.
2.5.2 Fly Vision

Research at France's Centre National de la Recherche Scientifique (C.N.R.S.) has considered the housefly's compound eye a useful way in which a robot can view the world (Franceschini, Pichon, and Blanes 1992). The housefly's visual navigation system consists of approximately one million neurons that constantly adjust the amplitude, frequency, and twist of the wings, which are controlled by seventeen different muscles. Visual motion is used for course control. The eye of the housefly is composed of 3,000 pixels, each containing eight photoreceptors and operating in parallel. Several behavior-specific vision systems have been reported, including vision for sexual pursuit of mates and the detection of polarized light for use in navigation (Mazokhin-Porshnyakov 1969).

An in-depth study by the CNRS group has led to the development of a reactive mobile robot that uses an insect-like visual system (figure 2.10). This system's raison d'être is simpler than that of the fly: it merely is to move safely about the world, avoiding obstacles by exploiting via vision the relative motion between itself and the environment. The biological principles exploited include:
• The use of a compound optic design, generating a panoramic view. The layout is nonuniform, with the visual space in the direction of motion more densely sampled than elsewhere.
• Visuomotor control is conducted using optic flow induced by the robot's motion.
• Locomotion consisting of a succession of translational movements followed by abrupt rotations, typical of the fly's free-flight behavior.
• Motion detection circuitry based on electrophysiological analysis of the housefly (Franceschini, Riehle, and Le Nestour 1989) using analog design.
• The use of space-preserving topographic (retinotopic) mappings onto the control system.
• Modeling from an invertebrate perspective, using an exoskeleton as opposed to a backbone.

This visual system was realized in specialized hardware (figure 2.11). This small compound eye proved up to the task of supporting limited real-time navigation in a random obstacle field.

Figure 2.11 Robotic compound eye. (Reprinted with permission from Franceschini, Pichon, and Blanes 1992.)

2.5.3 Cockroach Locomotion

Long after the bomb falls and you and your good deeds are gone, cockroaches will still be here, prowling the streets like armored cars.
- Tama Janowitz
Figure 2.12 Neural model from the Artificial Insect Project. This model uses the intrinsic currents to capture the neuron's dynamic aspects by permitting time and voltage variation. The cell membrane uses a Resistor-Capacitor (RC) circuit that can temporally sum the synaptic and intrinsic inputs. A linear threshold function generates the firing frequency.
Interdisciplinary research conducted at Case Western Reserve University has studied the mechanisms of locomotor behavior in the American cockroach. In their Artificial Insect Project, Beer, Chiel, and Sterling (Beer 1990; Beer, Chiel, and Sterling 1990) developed a neural model more faithful to biology than most used in neural network research (figure 2.12). The model includes cell membrane properties, uses synaptic currents, and generates outputs in terms of the neuron's firing frequency. The individual leg neural control circuitry is composed of a small collection of these neurons based upon a biologically derived model for walking (Pearson 1976). In simulation studies, Beer and his colleagues created within the artificial insect the spontaneous generation of gaits (metachronal waves, tripod gaits) observed in the natural insect. The simulation model was endowed with higher-level behavioral controllers that used antennae and mouth sensors to extract information from the environment. The behaviors included wandering, edge following, appetitive orientation and attraction to food, and a fixed-action pattern representing food consumption (figure 2.13). These controllers were also modeled at the neural level. The overall insect was capable of exhibiting motivated behavior (exhibited through the buildup of arousal and satiation shown in feeding) and a variety of statically stable gaits, all "strikingly reminiscent" of the natural animal counterpart.
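The structure named in figure 2.12, an RC membrane that sums input currents over time followed by a linear threshold that yields a firing frequency, can be sketched as a leaky integrator. This is an illustrative simplification and not the Artificial Insect Project's actual equations: the parameter values, the Euler integration step, and the treatment of the intrinsic current as a constant extra input are assumptions.

```python
# Hypothetical sketch of an RC-membrane neuron with a linear threshold
# output. Parameter values and names are illustrative assumptions.

def firing_frequency(voltage, threshold=0.0, gain=1.0):
    """Linear threshold: silent below threshold, linear above it."""
    return max(0.0, gain * (voltage - threshold))


def simulate_neuron(synaptic_current, intrinsic_current=0.0, steps=200,
                    dt=0.001, capacitance=0.01, resistance=1.0):
    """Euler-integrate C dV/dt = -V/R + I_synaptic + I_intrinsic.

    Returns the final membrane voltage and the resulting firing frequency.
    """
    voltage = 0.0
    total_current = synaptic_current + intrinsic_current
    for _ in range(steps):
        dv_dt = (-voltage / resistance + total_current) / capacitance
        voltage += dt * dv_dt
    return voltage, firing_frequency(voltage)


excited_voltage, excited_rate = simulate_neuron(0.5)
inhibited_voltage, inhibited_rate = simulate_neuron(-0.3)
```

With a constant input, the membrane voltage settles toward the input current times the resistance, so the excited neuron fires at a rate near 0.5 while the inhibited neuron stays silent.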
Figure 2.13 Simplified schematic of cockroach behavior (sensor inputs: mouth tactile, mouth chemical, antenna chemical, antenna tactile). Lines with darkened circles show inhibition between behaviors.

Eventually Quinn and Espenschied (1993) implemented a portion of the neural simulation model on a hexapod robot about centimeters in length with a mass of approximately one kilogram (figure 2.14 A). The aspects of the biological model concerning leg control were also implemented. The robot's performance confirmed the locomotion gaits observed in the simulation studies.

Espenschied et al. (1994) later created a second biologically inspired hexapod robot with a mass of approximately 5 kilograms and about centimeters long (figure 2.14 B), capable of a continuous range of insect-like gaits and of navigating irregular terrain. Whereas the earlier robot was capable only of straight-line motion, this newer version was generalized to handle both lateral and rotational movements.
Figure 2.14 Robotic cockroaches: (A) shows the first version, which was later refined into the robot shown in (B). (Photographs courtesy of R. Quinn, R. Beer, H. Chiel, and R. Ritzmann.)
2.5.4 Primate Brachiation

Action is at bottom a swinging and flailing of the arms to regain one's balance and keep afloat.
- Eric Hoffer

Another interesting aspect of animal behavior that influenced the design of robotic systems involves a mobile robot that travels by a rather unconventional means. Most mobile vehicles have either legs, wheels, or tracks, but researchers at Nagoya University in Japan have constructed a mobile system that swings from limb to limb (brachiates) in the style of a long-armed primate such as a gibbon (figure 2.15 A). The researchers designed a heuristic controller that enables the two-link brachiating robot (figure 2.15 B) to learn appropriate motion sequences by trial-and-error methods. It can, after learning, successfully catch a target bar from any initial state and continue locomotion along a series of spaced bars. It can also recover from a missed catch, using the initial state strategy to begin again.

The level of influence of the underlying animal behavioral studies for this research is markedly different than for the cockroach model we have just studied. In the insect work, an attempt was made to model closely the underlying neural control algorithms responsible for the animal's locomotor behavior. In the brachiation work, we see that no effort is made to be faithful to the neurophysiology of the primate that has motivated it. Instead, only the most outward aspects of locomotor behavior are involved, with the animal studies serving solely as inspiration for the creation of this type of robot.
Figure 2.15 (A) Gibbon brachiation. (B) Brachiating robot. (Figures reprinted with permission from Saito, Fukuda, and Arai 1994. © 1994 IEEE.)

2.5.5 Robotic Honeybee

That which is not good for the bee-hive cannot be good for the bee.
- Marcus Aurelius

Research involving communication via dance in the honeybee (Kirchner and Towne 1994) provides an interesting twist on the relationship between robotic and animal behavior. Honeybees have long been thought to convey information regarding the whereabouts of food source discoveries in their environs by a waggle dance in the hive. The question was whether bees used sound in addition to their dance to convey location information. The debate was resolved after the construction of a robotic honeybee capable of both singing and dancing in a manner similar to that of a live bee. The bee's body was made from brass and covered with beeswax. The wings were constructed from pieces of razor blades capable of vibrating via an electromagnet. A long rod attached the body to motors capable of producing the waggle dance automatically when connected to computers.

The results indicated that the mechanical dancing bee was capable of recruiting bees to fly in the particular direction that the robot's dance indicated. This occurred only when the robot's wings were vibrating, indicating that sound played a role in the communication. Introducing variations in the robot's dance and observing the effect upon the foraging bees provided additional information about the nature of the communication contained in the dance itself.

This research example illustrates that the relationship between robotics and the study of animal behavior is mutually beneficial rather than one-sided, as robotics can contribute to the study of animal behavior in addition to benefiting from it.
2.6 CHAPTER SUMMARY

• Animal behavior provides
  - a definition for intelligence.
  - an existence proof for the creation of intelligent mobile systems.
  - models that roboticists can mimic or from which they can draw inspiration.
• Neuroscience provides a basis for understanding and modeling the underlying circuitry of biological behavior.
• The roboticist can view neuroscience from many different levels:
  - at the cellular level of neurons
  - at the organizational level of brain structure
  - at the abstract level based on computational models (e.g., schemas and neural networks) derived from the above.
• Psychological models focus on the concept of mind and behavior rather than the brain itself.
• Various (often opposing) psychological schools of thought have inspired roboticists:
  - Behaviorism: using stimulus-response mechanisms for the expression of behaviors an agent has with its environment
  - Cognitive psychology: using computational models to describe an agent's behavior within the world
• Ethology is concerned with the behavior of animals within their natural world.
• The definition of behavioral classes, including reflexes, taxes, and fixed-action patterns, provides a useful language for operationalizing robotic behavior.
• Innate releasing mechanisms (referred to earlier as schemas) provide a means for coordinating multiple competing behaviors, especially when coupled with lateral inhibition.
• The concept of an ecological niche enables the roboticist to consider how a robot is positioned within its overall environment and how it can be a successful competitor within the world.
• Many robotic systems have been heavily influenced, at various levels, by biological studies. Examples include ant chemotaxis, fly vision, cockroach locomotion, primate brachiation, and robotic honeybees.
Chapter 3
Robot Behavior

The great end of life is not knowledge, but action.
- Thomas Henry Huxley

We really only know, when we don't know; with knowledge, doubt increases.
- Johann Wolfgang von Goethe

For in much wisdom is much grief, and he that increaseth knowledge increaseth sorrow.
- Ecclesiastes 1:18
Chapter Objectives
1. To learn what robotic behaviors are.
2. To understand the methods that can be used to express and encode these behaviors.
3. To learn methods for composing and coordinating multiple behaviors.
4. To obtain a basic understanding of the design choices related to behavior-based robotic systems.

3.1 WHAT ARE ROBOTIC BEHAVIORS?
After developing an understanding of behavior's biological basis in chapter 2, we now study how to express the concepts and formalisms of behavior-based robotic systems. It is important to remember that biological studies are not necessarily viewed as constraining for robots, but nonetheless serve as inspirations for design.
Perhaps the easiest way to view simple robotic behaviors is by adopting the concept advocated by the behaviorist school of psychology. A behavior, simply put, is a reaction to a stimulus. This pragmatic view enables us to express how a robot should interact with its environment. By so doing, we are confining ourselves in this chapter to the study of purely reactive robotic systems.

3.1.1 Reactive Systems
If we had more time for discussion we would probably have made a great many more mistakes.
- Leon Trotsky

A reactive robotic system tightly couples perception to action without the use of intervening abstract representations or time history.
Reactive robotic systems have the following characteristics:
• Behaviors serve as the basic building blocks for robotic actions. A behavior in these systems typically consists of a simple sensorimotor pair, with the sensory activity providing the necessary information to satisfy the applicability of a particular low-level motor reflex response.
• Use of explicit abstract representational knowledge is avoided in the generation of a response. Purely reactive systems react directly to the world as it is sensed, avoiding the need for intervening abstract representational knowledge. In essence, what you see is what you get. This is of particular value in highly dynamic and hazardous worlds, where unpredictability and potential hostility are inherent. Constructing abstract world models is a time-consuming and error-prone process and thus reduces the potential correctness of a robot's action in all but the most predictable worlds.
• Animal models of behavior often serve as a basis for these systems. We have seen in chapter 2 that biology has provided an existence proof that many of the tasks we would like our robots to undertake are indeed doable. Additionally, the biological sciences, such as neuroscience, ethology, and psychology, have elucidated various mechanisms and models that may be useful in operationalizing our robots.
• These systems are inherently modular from a software design perspective. This enables a reactive robotic system designer to expand his robot's competency by adding new behaviors without redesigning or discarding the old. This accretion of capabilities over time and resultant reusability is very useful for constructing increasingly more complex robotic systems.

Purely reactive systems are at one extreme of the robotic systems spectrum (section 1.3). In subsequent chapters, we will see that it may be useful to add additional capabilities to reactive systems, but for now we focus on these simpler systems.
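The notion of a behavior as a simple sensorimotor pair can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sensor dictionary keys, the one-meter threshold, and the (speed, heading) response encoding are hypothetical choices and are not taken from any particular robot.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Assumed response encoding: (speed, heading in degrees).
Response = Tuple[float, float]

NEAR = 1.0  # hypothetical range (meters) inside which an obstacle counts as a stimulus


@dataclass(frozen=True)
class Behavior:
    """A sensorimotor pair: a stimulus extractor plus a reflex-like motor mapping."""
    name: str
    perceive: Callable[[dict], Optional[float]]  # raw sensor readings -> stimulus or None
    respond: Callable[[float], Response]         # stimulus -> motor response

    def __call__(self, sensors: dict) -> Optional[Response]:
        stimulus = self.perceive(sensors)
        # No world model and no memory: the output depends only on what is sensed now.
        return None if stimulus is None else self.respond(stimulus)


def perceive_near_obstacle(sensors: dict) -> Optional[float]:
    """Return the obstacle bearing if the obstacle is close, otherwise no stimulus."""
    if sensors.get("obstacle_range", NEAR) >= NEAR:
        return None
    return sensors["obstacle_bearing"]


def turn_away(bearing: float) -> Response:
    """Head directly away from the stimulus at a fixed speed."""
    return (0.5, (bearing + 180.0) % 360.0)


avoid_objects = Behavior("avoid-objects", perceive_near_obstacle, turn_away)

print(avoid_objects({"obstacle_range": 0.4, "obstacle_bearing": 30.0}))  # (0.5, 210.0)
print(avoid_objects({"obstacle_range": 5.0, "obstacle_bearing": 30.0}))  # None
```

Because each behavior is self-contained, a new competency can be added by defining another Behavior instance without touching the existing ones, which is the modularity property noted above.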
3.1.2 A Navigational Example
Let us construct an example with which we can frame the discussion to come. Consider a student going from one classroom to another. A seemingly simple task, at least for a human. Let's examine it more closely and see the kinds of things that are actually involved. These include
1. getting to your destination from your current location
2. not bumping into anything along the way
3. skillfully negotiating your way around other students who may have the same or different intentions
4. observing cultural idiosyncrasies (e.g., deferring to someone of higher priority if in conflict, with priority determined by age or gender; in the United States, passing on the right; etc.)
5. coping with change and doing whatever else is necessary

So what sounds simple (getting from point A to point B) can actually be quite complex (figure 3.1), especially in a situation where the environment is not controllable or well predicted.
Behavior-based robotics grew out of the recognition that planning, no matter how well intentioned, is often a waste of time. Paraphrasing Burns: The best laid plans of mice and men oft go astray. Oft is the keyword here. Behavior-based robotic systems provide a means for a robot to navigate in an uncertain and unpredictable world without planning, by endowing the robot with behaviors that deal with specific goals independently and coordinating them in a purposeful way.
3.1.3 Basis for Robotic Behavior
Where do robotic behaviors come from? This primary question leads to a series of subsidiary questions that must be answered to provide a robot with behavioral control:
• What are the right behavioral building blocks for robotic systems?
• What really is a primitive behavior?
• How are these behaviors effectively coordinated?
• How are these behaviors grounded to sensors and actuators?

Figure 3.1 Things may be harder than they seem. (The Far Side © 1993 Farworks, Inc. Distributed by Universal Press Syndicate. Reprinted with permission. All rights reserved.)
Unfortunately there are currently no universally agreed-upon answers to these questions. A variety of approaches for behavioral choice and design have arisen. The ultimate judge is the appropriateness of the robotic response to a given task and environment. Some methods currently used for specifying and designing robotic behaviors are described below.
1. Ethologically guided/constrained design. As previously mentioned, studies of animal behavior can provide powerful insights into the ways in which behaviors can be constructed. Roboticists can put models generated by biological scientists to good use. One such example comes from Arbib and House's (1987) studies of the navigational behavior of the toad and its relationship to Arkin's (1989a) schema-based robotic navigational system (section 4.4). In this instance, motion divergence fields are specified for a toad navigating amid a collection of poles toward a can of worms. This model provides an analogous means for representing robot behaviors using a modified potential (force) field method (figure 3.2). The key phrase for the design process here is "ethologically guided": consulting the biological literature for classifications, decompositions, and specifications of behaviors that would be useful for robotic systems, but not necessarily being constrained by them.
Other researchers, epitomized by Beer (Beer, Chiel, and Sterling 1990) (discussed earlier in section 2.5.3), look toward high-fidelity models of the neurological substrate of an animal (in Beer's case, the cockroach) in their attempt to emulate an appropriate behavioral response by a robot. These scientists choose to deliberately constrain their behavioral models to match those of the animal under study. In many ways this overconstrains the problem of producing intelligent behavior in a robot, but as a side effect this research can potentially answer interesting questions regarding actual biological behavior, for example in terms of predictive modeling.
The methodology for designing ethologically guided/constrained behaviors is illustrated schematically in figure 3.3. A model is provided from a scientific study, preferably with an active biological researcher in tow. The animal model is then modified as necessary to realize it computationally, and is then grounded within the robot's sensorimotor capabilities. The results from the robotic experiments are then compared to the results from the original biological studies, and either the biological model or its robotic alter ego are
Figure 3.2 Toad/robot navigational model. (A) Represents a model of a toad's attraction to a can of worms, avoidance from a pole fence, and an egocentric animal potential for motion. The vectors represent the most likely direction of motion for the animal at each point in space. This model was shown to be consistent with experimentally observed animal data (after Arbib and House 1987). (B) Represents a set of robotic behaviors for similar circumstances: avoid-static-obstacles and move-to-goal. An egocentric potential is not needed in this similar, yet different, representation.

Figure 3.3 ... for ethologically ...
Figure 3.4 Situated activity design methodology. (Flowchart: Assess Agent-Environment Dynamics; Partition into Situations; Create Situational Responses; Import Behaviors to Robot; Run Robotic Experiments; Evaluate Results; Enhance, Expand, Correct Behavioral Responses.)

Arbitrarily complex situations can be created and specified that may have no biological basis. Pengi (Agre and Chapman 1987) is a system that characterizes situations by their indexical-functional aspects. Indexical refers to what makes the circumstances unique, functional refers to a robotic agent's intended outcome or purpose in a given situation. This system uses lengthy phrases to characterize particular situations that demand certain responses. For example (paraphrasing the situations from the original Pengi somewhat), the-block-I-need-to-kick-at-the-enemy-is-behind-me is a situation that requires the agent to backtrack to obtain the object in question. I've-run-into-the-edge-of-the-wall requires that the robot turn and move along the edge. These situations can be highly artificial and arbitrarily large in number. Coordination in Pengi is handled by an arbitration mechanism, where one of the candidate actions is chosen (there may be many applicable) and executed. Indeed, one candidate action may be in conflict with another. Hopefully the best action is chosen, but there is no guarantee, as no planning is conducted, nor does the mechanism project the consequences of undertaking any action.
Assuming that there is no limit to the number of situational conditions that can be enumerated leads us to a more expansive version of this theory of situated activity: universal plans. Universal plans, as developed by Schoppers (1987), require the robotic agent to have the ability of recognizing each unique situation for what it is and then selecting an appropriate action for each possible world state. These universal plans cover the entire domain of interaction, use sensing to conduct the classification, and presuppose no ordering on the situations or even the type of situations that might arise for that matter (Schoppers 1989). Sensing is conducted continuously, so situational assessment, and thus appropriate response, is continuously reevaluated. To deal with the issue of the sheer bulk regarding the vast number of possible situations, the idea of caching plans is forwarded. Despite this technique, universal plans have encountered significant criticism (e.g., Ginsberg 1989) predominantly due to the immensity of the numbers of plans required and the potential irrelevancy of most. Even the harshest critics acknowledge that more limited versions of the situated activity paradigm have utility, even when it is designed to include not only routine situations but a wide range of contingent ones. The argument, however, that an enumeration of every possible situation (i.e., universal plans) is impractical at best and mathematically intractable at worst is a valid one.
Reactive action packages (RAPs) (Firby 1989) constitute an unusual variant on situation-driven execution. As with the other methods, the current situation provides an index into a set of actions regarding how to act in that environment. RAPs, however, operate at a coarser granularity than the other situated-action approaches and provide multiple methods of acting within a given context. RAPs consist of a set of methods specific to a task-situation, and for each of those methods, a sequence of steps to accomplish the task is provided (a kind of "sketchy" plan). RAPs differ from most reactive systems, however, in that they are not truly behavior based (but rather task based) and in that the system relies heavily on a strong explicit internal world model.
3. Experimentally driven design. Experimentally driven behaviors are invariably created in a bottom-up manner. The basic operating premise is to endow a robot with a limited set of capabilities (or competences), run experiments in the real world, see what works and what does not, debug imperfect behaviors, and then add new behaviors iteratively until the overall system exhibits satisfactory performance (figure 3.5).

Figure 3.5 Experimentally driven methodology for behavioral design. (Flowchart: Build Minimal System; Exercise Robot; Evaluate Results; Add New Behavioral Competence.)

An excellent example of this design paradigm appears in Brooks' (1989a) work on the design of a behavior-based controller for a legged walking robot. Initially the robot (panel (A) in figure 3.6) was provided with the ability to stand up and conduct a simple walk. This worked adequately for smooth terrain but posed problems as the robot attempted to walk over irregular surfaces. Based on the requirements of this extended capability, force balancing was added to modify the leg controllers and help the robot maintain a steady posture. Whiskers (protruding sensors in the front of the robot) were then added to provide more warning to the control system to deal with larger objects that the robot needed to climb over. A final problem was noted involving balance in situations where the robot was heavily tilted fore or aft (pitching). To compensate, an inclinometer coupled with new pitch stabilization code was added to provide even better performance as the robot maneuvered over highly irregular terrain. It was then decided to allow the robot to track warm objects such as people, so infrared sensors were added, coupled with a new behavior to provide prowling. Each of these competencies was added incrementally, based upon the results of previous experiments and the goal of providing greater utility for the robotic system. Section 4.3 details the development of this system.

Figure 3.6 (A) Original Genghis. (Photograph courtesy of Rodney Brooks.) (B) Genghis II, a robotic hexapod, commercial successor to the original Genghis. (Photograph courtesy of IS Robotics, Somerville, MA.)

In Payton et al. 1992, fault tolerance is introduced into reactive robotic systems through the design of suitable behaviors that can handle unanticipated
contingencies as they arise. This design methodology, dubbed "Do whatever works," has a goal of generating a sufficiently general set of low-level behaviors that when activated can cope with events beyond the initial designer's vision. Redundancy is the key feature, that is, allowing things to be accomplished in more than one way and then designing a controller capable of selecting the most successful behavior for the current situation. In a sense, this feature is also embodied in the RAPs system described previously, in which multiple methods are used to accomplish a task.
Ferrell (1994) developed a complex control system for another walking hexapod potentially suitable as a lunar or Mars rover using this bottom-up strategy. In this implementation, 1,500 concurrent processes supported locomotion over rough terrain and provided the requisite sensing with a significant level of fault tolerance. The entire system was constructed without any reliance on simulation technology. Earlier, Connell (1989a) demonstrated the efficacy of this experimental method with the design of a mobile manipulator (figure 3.7). In particular, the arm controller for this system consisted of fifteen independent behaviors capable of finding a soda can, then grabbing, transporting, and depositing it at another location.

Figure 3.7 Herbert, a mobile manipulator. (Photograph courtesy of Jon Connell.)

Whatever the design basis, a generic classification of robot behaviors can be used to categorize the different ways in which a robotic agent can interact with its world:

Exploration/directional behaviors (move in a general direction)
   heading based
   wandering
Goal-oriented appetitive behaviors (move towards an attractor)
   discrete object attractor
   area attractor
Aversive/protective behaviors (prevent collisions)
   avoid stationary objects
   elude moving objects (dodge, escape)
   aggression
Path following behaviors (move on a designated path)
   road following
   hallway navigation
   stripe following
Postural behaviors
   balance
   stability
Social/cooperative behaviors
   sharing
   foraging
   flocking/herding
Teleautonomous behaviors (coordinate with human operator)
   influence
   behavioral modification
Perceptual behaviors
   saccades
   visual search
   ocular reflexes
Walking behaviors (for legged robots)
   gait control
Manipulator-specific behaviors (for arm control)
   reaching
Gripper/dextrous hand behaviors (for object acquisition)
   grasping
   enveloping

Perceptual support is required to implement any of these behaviors. Chapter 7 describes how perception can be tailored to behavioral need.
3.2 EXPRESSION OF BEHAVIORS
Several methods are available for expressing robotic behavior. This book employs three: stimulus-response (SR) diagrams, functional notation, and finite state acceptor (FSA) diagrams. These methods will be used throughout the text in representing various behavior-based systems. SR diagrams will be used for graphic representations of specific behavioral configurations, functional notation for clarity in design of the systems, and FSAs whenever temporal sequencing of behaviors is required.
3.2.1 Stimulus-Response Diagrams
Stimulus-response (SR) diagrams are the most intuitive and the least formal method of expression. Any behavior can be represented as a generated response to a given stimulus computed by a specific behavior. Figure 3.8 shows a simple SR diagram.
Figure 3.9 presents an appropriate SR diagram for our navigational example (section 3.1.2). Here five different behaviors are employed in the task of getting to the classroom. The output of each behavior is channeled into a coordination mechanism (schematized here) that produces an appropriate overall motor response for the robot at any point in time given the current existing environmental stimuli. Section 3.4 discusses further the problem and methods of coordinating behaviors.
Figure 3.8 Simple SR diagram. (Diagram labels: Stimulus, Response.)

Figure 3.9 SR diagram for classroom navigation robot. (Stimuli: class location, detected object, detected student, detected path, detected elder. Output: Action.)

3.2.2 Functional Notation
Mathematical methods can be used to describe the same relationships using a functional notation, b(s) = r, meaning behavior b when given stimulus s yields response r. In a purely reactive system, time is not an argument of b, as the behavioral response is instantaneous and independent of the system's time history. A functional expression of the behaviors necessary to carry out our example navigational task of getting to the classroom would appear as:

   coordinate-behaviors [
      move-to-classroom (detect-classroom-location),
      avoid-objects (detect-objects),
      dodge-students (detect-students),
      stay-to-right-on-path (detect-path),
      defer-to-elders (detect-elders)
   ] = motor-response

The = motor-response is usually implicit and is generally not written when using this notation.
Each of the five behaviors listed can produce an output depending on the current environmental stimuli. A coordination function determines what to do with those outputs (e.g., selecting one of them or combining them in some meaningful way). It is not a trivial problem to ensure that the outputs of each behavioral function are in a form that can be coordinated, as section 3.4 will detail. Coordinated functions can also be the arguments for other coordinating functions. For example,

   coordinate-behaviors [
      coordinate-behaviors (behavioral-set-1),
      coordinate-behaviors (behavioral-set-2),
      coordinate-behaviors (behavioral-set-3)
   ]

where each of the behavioral sets is a set of primitive behaviors. Clearly this notation readily permits a recursive formulation of behavior, ultimately grounded in physical robotic hardware but able to move upward into arbitrary levels of abstraction.
Functional notation has an interesting side effect in that it is fairly straightforward to convert this representation into a computer program. Often a functional programming language such as LISP is used, although the C language also enjoys widespread usage.
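To illustrate how directly the functional notation maps onto a program, the following is a minimal sketch in Python (chosen here only for brevity; the text mentions LISP and C). It assumes that each response is a two-dimensional vector and that coordination is a plain vector sum, which is only one of the possible coordination strategies; the percept keys are hypothetical, and only two of the five behaviors are shown.

```python
from typing import Callable, Iterable, Tuple

Vector = Tuple[float, float]         # assumed response encoding: (x, y) velocity
Behavior = Callable[[dict], Vector]  # b(s) = r, with s read from a percept dictionary


def move_to_classroom(s: dict) -> Vector:
    """Respond with a unit vector toward the classroom (hypothetical percept key)."""
    gx, gy = s["classroom_direction"]
    return (gx, gy)


def avoid_objects(s: dict) -> Vector:
    """Respond with a vector pointing directly away from a detected object."""
    ox, oy = s.get("object_direction", (0.0, 0.0))
    return (-ox, -oy)


def coordinate_behaviors(behaviors: Iterable[Behavior]) -> Behavior:
    """Return a new behavior whose response is the vector sum of its arguments."""
    members = list(behaviors)

    def coordinated(s: dict) -> Vector:
        responses = [b(s) for b in members]
        return (sum(r[0] for r in responses), sum(r[1] for r in responses))

    return coordinated


journey = coordinate_behaviors([move_to_classroom, avoid_objects])
# A coordinated function is itself a behavior, so it can be nested recursively.
nested = coordinate_behaviors([journey, coordinate_behaviors([avoid_objects])])

percepts = {"classroom_direction": (1.0, 0.0), "object_direction": (0.0, 1.0)}
print(journey(percepts))  # (1.0, -1.0)
print(nested(percepts))   # (1.0, -2.0)
```

Note that time does not appear as an argument anywhere: each call depends only on the current percepts, as required of a purely reactive system.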
3.2.3 Finite State Acceptor Diagrams
Finite state acceptors (Arbib, Kfoury, and Moll 1981) have very useful properties when describing aggregations and sequences of behaviors (section 3.4.4). They make explicit the behaviors active at any given time and the transitions between them. They are less useful for encoding a single behavior, which results in a trivial FSA (figure 3.10).
In figure 3.10, the circle b denotes the state where behavior b is active and is accepting any stimulus input. The symbol a denotes all input in this case.
A finite state acceptor M can be specified by a quadruple (Q, δ, q0, F) with Q representing the set of allowable behavioral states; δ being a transition function mapping the input and the current state to another, or even the same, state; q0 denoting the starting behavioral configuration; and F representing a set of accepting states, a subset of Q, indicating completion of the sensorimotor task. δ can be represented in a tabular form where the arcs represent state transitions in the FSA and are invoked by arriving stimuli.
Figure 3.10 Trivial FSA for a single behavior b. The table lists the state q, the input, and the resulting state δ(q, input). For this trivial FSA, all inputs result in the same state.

   q     input     δ(q, input)
   b     a         b

For the trivial example shown in figure 3.10: M = {{b}, δ, b, {b}}.
FSAs are best used to specify complex behavioral control systems where entire sets of primitive behaviors are swapped in and out of execution during the accomplishment of some high-level goal (Arkin and MacKenzie 1994). Gat and Dorais (1994) have also expressed the need for sequencing behaviors. FSAs provide a ready mechanism to express these relationships between various behavioral sets and have been widely used within robotics to express control systems. In their MELDOG system, Tachi and Komoriya (1985) use an automaton map to capture actions that should be executed at various places within the world. Similar examples exist of FSA usage for guiding vision-based robots (Fok and Kabuka 1991; Tsai and Chen 1986). Brooks (1986) has also used a variation, augmented finite state machines (AFSMs), to express behaviors within his subsumption architecture (section 4.3). In this text, however, we use the notation developed in Arkin and MacKenzie 1994 based on the formalisms described in Arbib, Kfoury, and Moll 1981.
The classroom navigation example can also be expressed with FSAs, although the result is of a decidedly different character (figure 3.11). This example has four different states: start, journey, lost, and at-class. The last two are terminal states: lost is abnormal, at-class normal. Journey, the main behavioral state, actually consists of an assemblage (a coordinated collection) of the five other low-level behaviors mentioned earlier (move-to-classroom, avoid-objects, dodge-students, stay-to-right-on-path, and defer-to-elders). Specifically,

   M = {{start, journey, lost, at-class}, δ, start, {lost, at-class}}.

The FSA provides us with a higher level of abstraction by which we can express the relationships between sets of behaviors.

Figure 3.11 FSA representing classroom navigation example. (Diagram labels: Start, At-Class; arcs labeled other, not-at-class, reached-class, all.)
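The quadruple (Q, δ, q0, F) can be written down almost literally as a data structure. The sketch below encodes the classroom example with δ stored as a table. The input names not-at-class and reached-class follow the figure labels; the names "go" (start to journey) and "lost" (journey to lost) are hypothetical, since the text does not name those inputs, and treating any unlisted input as a self-transition is likewise an assumption of this sketch.

```python
class FSA:
    """Finite state acceptor M = (Q, delta, q0, F) with a tabular transition function."""

    def __init__(self, states, delta, start, accepting):
        self.states = frozenset(states)        # Q
        self.delta = dict(delta)               # (state, input) -> next state
        self.start = start                     # q0
        self.accepting = frozenset(accepting)  # F, a subset of Q
        assert self.start in self.states
        assert self.accepting.issubset(self.states)

    def step(self, state, symbol):
        # Assumption: an input with no table entry leaves the state unchanged,
        # which also covers the "all" self-loops on the terminal states.
        return self.delta.get((state, symbol), state)

    def run(self, inputs):
        """Feed a sequence of inputs; return the final state and whether it is accepting."""
        state = self.start
        for symbol in inputs:
            state = self.step(state, symbol)
        return state, state in self.accepting


classroom = FSA(
    states={"start", "journey", "lost", "at-class"},
    delta={
        ("start", "go"): "journey",                # hypothetical input name
        ("journey", "not-at-class"): "journey",
        ("journey", "reached-class"): "at-class",
        ("journey", "lost"): "lost",               # hypothetical input name
    },
    start="start",
    accepting={"lost", "at-class"},
)

print(classroom.run(["go", "not-at-class", "not-at-class", "reached-class"]))
# ('at-class', True)
```

In a full system, entering the journey state would activate the assemblage of five low-level behaviors; here the states are only labels, which is enough to show the sequencing role the FSA plays.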
To further illustrate this, let's look at an even more complex example. Figure 3.12 depicts an FSA constructed for a robot used in a competition conducted by the American Association for Artificial Intelligence. Here, a collection of high-level behaviors, each represented schematically as a state, encodes the robot's goal of moving about an arena looking for ten distinct poles, then moving to each of those poles in sequence. This robot has three major behavioral states, wander, move-to-pole, and return-to-start. Move-to-pole consists of a subset of actions for selecting a pole, orienting the robot so it points toward the pole, moving to the pole, and tracking the pole visually during motion until it is reached. In this case,

   M = {{start, find-next-pole, move-to-pole, wander, return-to-start, halt}, δ, start, {halt}},

where δ contains the transition information depicted in figure 3.12. The FSA shows the sequencing between behaviors as the robot carries out its mission (figure 3.13). More details on this task and robot can be found in Arkin et al. 1993.
Incidentally, the use of finite state descriptions in the University of Southern California's Phony Pony project on quadruped locomotion (McGhee 1967) is probably the first example of their application for specifying a robot control system.
3.2.4 Formal Methods
Formal models for behavior-based robotics can potentially provide a set of very useful properties to the robot programmer:
• They can be used to verify designer intentions.
• They can facilitate the automatic generation of robotic control systems.
• They provide a complete common language for the expression of robot behavior.
• They provide a framework for conducting formal analysis of a specific program's properties, adequacy, and/or completeness.
• They provide support for high-level programming language design.

Several formal methods have been developed for specifying and designing behavior-based robotic systems. A brief review of two representative strategies is presented below.
Figure 3.12 FSA representing robot competition example.

   q                  input              δ(q, input)
   start              compete            find-next-pole
   start              other              start
   find-next-pole     all-poles-found    return-to-start
   find-next-pole     no-pole-found      wander
   find-next-pole     pole-selected      move-to-pole
   return-to-start    not-at-start       return-to-start
   return-to-start    at-start           halt
   move-to-pole       lost               find-next-pole
   move-to-pole       at-pole            find-next-pole
   move-to-pole       not-at-pole        move-to-pole
   wander             time-not-up        wander
   wander             timeout            find-next-pole
   halt               all                halt

Figure 3.13 Robot executing behaviors at competition arena.
3.2.4.1 RS
Lyons and Arbib (1989) developed the RS (robot schema) model as a method for expressing distributed sensor-driven robot control programs. A process algebra is used that permits the composition of a network of processes called schemas (behaviors). Process composition operators have been defined as a basis for creating these networks, which include methods for conditional, sequential, parallel, and iterative structures. Preconditions are established for coordination operators to ensure a smooth flow of control during execution.
In particular, a port automata model has been adopted as the underpinning for expressing the relationships between schemas. Port automata can be viewed as an extension of FSAs with supplemental formal methods for specifying the interconnections between states. Schemas communicate with each other via predefined input-output ports using synchronous message passing techniques. A schema's behavioral description, which encodes its response to any input messages, fully determines its action. Schemas are aggregated via a nesting mechanism termed an assemblage. Each assemblage recursively encodes a unique network of schemas or other assemblages.
A high-level algebraic RS encoding for the navigational example used throughout this chapter would be:

   Class-going-robot = (Start-up ; (done?, Journey) : At-classroom)
   Journey = (move-to-classroom, avoid-objects, dodge-students, stay-to-right-on-path, defer-to-elders)

Translating, the Class-going-robot consists of a robot that, beginning from an initial start-up state, sequentially transitions to the Journey state (the sequential operator is denoted ;) and which then remains in the Journey state with a concurrent monitor process checking for arrival at the classroom (the concurrency operator is denoted with a ,). If the robot is at the classroom it transitions to the At-classroom state (denoted by the conditional operator :). Journey consists of the concurrent execution of the behaviors specified during travel from one location to the next. This method combines advantages of both the functional and FSA methods into a single syntax.
There is far more to the RS model than what is presented here. The interested reader is referred to Lyons and Arbib 1989, Lyons and Hendriks 1994, and Lyons 1992 for more details.
3.2.4.2 SituatedAutomata The situatedautomatamodel, developedby KaelblingandRosenschein ( 1991), recognizesthe fundamentalrelationshipan agenthasasa participantwithin its environment.The model employslogical formalismsasunderpinningsfor the ' designof circuitry that correspondsto a robot s goalsand intentions. The use of logic enablesreasoningover the system, which can lead to the establishment of provableproperties(Rosenscheinand Kaelbling 1987), an important goal for the designerof any type of system. Rex (Kaelbling 1986), a LISPbasedsystem, was the first languageto embody the basic tools to generate specificationsfor synchronousdigital circuitry embodyinga reactivecontrol program. Gappsis a more recentlydevelopedlanguagethat enablesgoals to be specifiedmore directly and is inherently easierto use, akin to a higher generationlanguagein conventionalprogramming. Goalsare either achieved, executedor maintained: Achieved goals are thosethat should be eventually realized; executedgoals are thosethat should be done now; and maintained , as they have alreadybeenattained. goals are thosethat shouldbe preserved are defined in a LISP like format that correspondto thesethreegoal Operators states: ( ach goal ) for achieve, (do goal ) for execute, and (maint goal ) for
Chapter3
maintain. The logical booleanoperatorsand, or , not , if are usedto create higher-level goals. Circuits aregeneratedfrom thesehigh-level goal expressions . Standardlogical methodscanbe us.edto compilethe high-level circuitry into a collectionof digital logic gates. This type of formalismprovidesa very concretegrounding onto actualdigital hardwarefor creatingsituatedautomatarobots. A simplified Gappsspecificationfor our ongoingclassroomnavigationexample would be: (defgoalr (ach in - classroom) ( if (not start - up) (maint ( and (maint (maint (maint (maint (maint ) ) ) )
move- to - classroom) avoid - objects ) dodge- students ) stay - to - right - on- path ) defer - to - elders )
This encoding states that the robot is to achieve the goal of being in the classroom. If the robot is not in start-up state, then it is to journey to the location by maintaining the concurrent goals of moving to the classroom, avoiding objects, dodging students, staying to the right, and deferring to elders. Methods also exist to prioritize goals within the Gapps language should the need arise. As in the case with RS, there is far more to the Gapps language than can be discussed here, so the interested reader is referred to Kaelbling and Rosenschein 1991 for additional information.

Gapps circuitry was used for an unmanned underwater vehicle (UUV), described in Bonasso 1992. The basic goals established for this system were
• (maint not-crashed), which established the subgoal of (ach avoid nearest obstacle) along with other behaviors such as avoiding collision with the ocean bottom.
• (ach wander), which endowed the robot with exploration capabilities.
• (ach joystick goal-point), which allowed user-directed input to control the direction of the robot.
The first two of these behaviors provided the UUV with the ability to navigate safely in a water tank. Additional behavioral goals were described using Gapps and tested in simulation, including (maint best-heading) and mission-specific tasks such as (ach record-thermal-vent-event) and (ach quiescence).
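To make the tree structure of such goal expressions concrete, the following is a minimal Python sketch that encodes the classroom specification as nested tuples and collects the primitive goals it contains. It is only an illustration under stated assumptions: the tuple encoding, the name GOAL_OPERATORS, and the function primitive_goals are hypothetical and are not part of Gapps, which compiles goal expressions into circuitry rather than walking them at run time.

```python
# Hypothetical nested-tuple encoding of the classroom goal expression.
# This is NOT Gapps itself; it only mirrors the shape of the expression.
GOAL = ("if", ("not", "start-up"),
        ("maint", ("and",
                   ("maint", "move-to-classroom"),
                   ("maint", "avoid-objects"),
                   ("maint", "dodge-students"),
                   ("maint", "stay-to-right-on-path"),
                   ("maint", "defer-to-elders"))))

# The three goal operators named in the text: achieve, execute, maintain.
GOAL_OPERATORS = {"ach", "do", "maint"}


def primitive_goals(expr):
    """Return (operator, goal-name) pairs where an operator wraps a bare name."""
    if isinstance(expr, str):
        return []  # a bare symbol such as "start-up" is a condition, not a goal
    head, *args = expr
    if head in GOAL_OPERATORS and len(args) == 1 and isinstance(args[0], str):
        return [(head, args[0])]
    found = []
    for arg in args:
        found.extend(primitive_goals(arg))  # recurse into and/or/not/if and nested goals
    return found


if __name__ == "__main__":
    for operator, name in primitive_goals(GOAL):
        print(operator, name)
```

Walking the expression this way lists the five concurrently maintained subgoals, which is the set the prose description of the encoding enumerates.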
3.3 BEHAVIORAL ENCODING
To encode the behavioral response that the stimulus should evoke, we must create a functional mapping from the stimulus plane to the motor plane. We will not concern ourselves at this point as to whether a behavioral response is appropriate to a given stimulus, only how to encode it. An understanding of the dimensionality of a robotic motor response is necessary in order to map the stimulus onto it. It will serve us well to factor the robot's motor response into two orthogonal components: strength and orientation.
• Strength denotes the magnitude of the response, which may or may not be related to the strength of a given stimulus. For example, it may manifest itself in terms of speed or force. Indeed the strength may be entirely independent of the strength of the stimulus yet modulated by exogenous factors such as intention (what the robot's internal goals are) and habituation (how often the stimulus has been previously presented). We will later see that by controlling the strength of the response to a given stimulus, inroads are created for integrating goal-oriented planning into behavioral systems (chapter 6) as well as introducing the opportunity for adaptive learning methods (chapter 8).
• Orientation denotes the direction of action for the response (e.g., moving away from an aversive stimulus, moving towards an attractor). The realization of this directional component of the response requires knowledge of the robot's kinematics. It may or may not be dependent on the stimulus's strength.

In general, kinematics is the science of objects in motion, including aspects of position, velocity, and acceleration. This includes, in particular, all of the robot's geometric and time-based physical properties. Dynamics extends kinematics to include the study of the forces that produce motion in objects.
A behavior can be expressed as a triple (S, R, β), where S denotes the domain of all interpretable stimuli, R denotes the range of possible responses, and β denotes the mapping β: S → R.

R - Range of Responses
Refining this further, the instantaneous response r (where r ∈ R) of a behavior-based reactive system can be expressed as a six-dimensional vector consisting of six subcomponent vectors. Each of the subcomponent vectors encodes the magnitude of the translational and orientational responses for each of the six degrees of freedom of motion of a general mobile robot.
A degree of freedom (DOF) refers to one of the set of independent position variables, with respect to a frame of reference, necessary to specify an object's position within the world.
An unconstrained rigid object has six DOFs, r = [x, y, z, θ, φ, ψ], where the first three components of r represent the three translational degrees of freedom (x, y, z in three-dimensional cartesian coordinates), and the last three components encode the three rotational degrees of freedom (θ for roll, φ for pitch, ψ for yaw). Often pitch is alternatively referred to as tilt, and yaw as pan, as in a pan-tilt device. This is especially true in the context of controlling the pointing of sensors such as cameras.

For ground-based mobile robots, the dimensionality is often considerably less than six DOFs. For example, a robot that moves on flat ground and can rotate only about its central axis has only three degrees of freedom, r = [x, y, θ], representing translation in the cartesian ground plane [x, y] and the one degree of rotation θ (yaw, or alternatively pan).
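As a small illustration of these two response vectors, the sketch below stores a six-DOF response and projects it onto the three DOFs of a flat-ground robot. The class name Response, its field names, and the method for_ground_robot are assumptions made for this example only; the text itself defines nothing beyond the vectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """Six-DOF response r = [x, y, z, roll, pitch, yaw] (hypothetical container)."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0
    pitch: float = 0.0   # also called tilt
    yaw: float = 0.0     # also called pan

    def as_list(self):
        """Return the full six-component vector."""
        return [self.x, self.y, self.z, self.roll, self.pitch, self.yaw]

    def for_ground_robot(self):
        """Keep only the DOFs of a flat-ground robot: [x, y, yaw]."""
        return [self.x, self.y, self.yaw]


# The all-zero vector stands for "no response".
NO_RESPONSE = Response()

if __name__ == "__main__":
    r = Response(x=1.0, z=2.0, yaw=0.5)
    print(r.as_list())
    print(r.for_ground_robot())
```

Dropping z, roll, and pitch in the projection simply reflects that a robot confined to the ground plane cannot realize those components.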
Another factor that can limit the realization of a generated behavioral response is the robot's non-holonomicity.
A non-holonomic robot has restrictions in the way it can move, typically because of kinematic or dynamic constraints on the robot, such as limited turning abilities or momentum at high velocities.

A truly holonomic robot can be treated as a massless point capable of moving in any direction instantaneously. Obviously this is a very strong assumption and does not hold for any real robot (although it is easy to make this assumption in simulation, potentially generating misleading results regarding a control algorithm's utility on an actual robot). Omnidirectional robots moving at slow translational velocities, however (that is, robots that can essentially turn on a dime and head in any direction), can often pragmatically be considered to be holonomic, but they are not in the strictest sense. Non-holonomicity is generally of great significance when there are steering angle constraints, such as in a car attempting to parallel park. If the wheels were capable of turning perpendicular to the curb, parking would be much easier, but as they cannot, a sequence of more complex motions is required to move the vehicle to its desired location. The constraints imposed by non-holonomic systems can be dealt with either during the generation of the response, by including them within the function β, or after r has been computed, translating the desired response to be within the limitations of the robot itself.

S - The Stimulus Domain
S consists of the domain of all perceivable stimuli. Each individual stimulus or percept s (where s ∈ S) is represented as a binary tuple (p, λ) having both a particular type or perceptual class p and a property of strength λ. The complete set of all p over the domain S defines all the perceptual entities a robot can distinguish, that is, those things it was designed to perceive. This concept is loosely related to affordances as discussed in section 2.3. The stimulus strength λ can be defined in a variety of ways: discrete (e.g., binary: absent or present; categorical: absent, weak, medium, strong) or real valued and continuous. In other words, it is not required that the mere presence of a given stimulus be sufficient to produce an action by the robot.
The presence of a stimulus is necessary but not sufficient to evoke a motor response in a behavior-based robot.
We define τ as a threshold value, for a given perceptual class p, above which a response is generated. Often the strength of the input stimulus (λ) will determine whether or not to respond and the magnitude of the response, although other exogenous factors can influence this (e.g., habituation, inhibition, etc.), possibly by altering the value of τ. In any case, if λ is nonzero, the stimulus specified by p is present to some degree.
Certain stimuli may be important to a behavior-based system in ways other than provoking a motor response. In particular, they may have useful side effects on the robot, such as inducing a change in a behavioral configuration even if they do not necessarily induce motion. Stimuli with this property are referred to as perceptual triggers and are specified in the same manner as previously described (p, λ). Here, however, when the stimulus p is sufficiently strong, the desired behavioral side effect is produced rather than motion. We return to our discussion of stimuli in the context of perception in chapter 7.

β - The Behavioral Mapping
Finally, for each individual active behavior we can formally establish the mapping between the stimulus domain and response range that defines a behavioral function β where β(s) → r. β can be defined arbitrarily, but it must be defined over all relevant p in S. Where a specific stimulus threshold, τ, must be exceeded before a response is produced for a specific s = (p, λ):

$$\beta : (p, \lambda) \rightarrow \begin{cases} r = [0, 0, 0, 0, 0, 0] & \text{for all } \lambda < \tau \quad \text{(no response)} \\ r = \text{arbitrary function} & \text{otherwise} \quad \text{(response)} \end{cases}$$

where [0, 0, 0, 0, 0, 0] indicates that no response is required given the current stimulus s.

Examples
Consider the example of collision avoidance behavior. If an obstacle stimulus is sufficiently far away (hence weak), no actual action may be taken despite its presence. Once the stimulus is sufficiently strong (in this case measured by proximity), evasive action will be taken. To illustrate intuitively, imagine you are walking on a long sidewalk and you see someone approaching far ahead of you. In general, you would not immediately alter your path to avoid a collision, but rather you would not react to the stimulus until action is truly warranted. This makes sense, as the situation may change significantly by the time the oncoming walker reaches you. She may have moved out of the way or even turned by herself off the path, requiring no action whatsoever on your part.
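The thresholded mapping defined above can be sketched in a few lines of Python. This is an illustration under assumptions, not a prescribed implementation: the helper make_behavior, the percept-class strings, the threshold value 0.5, and the particular response function back_away are all hypothetical choices made for the example.

```python
# Six-component zero vector meaning "no response".
NO_RESPONSE = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def make_behavior(percept_class, tau, response_fn):
    """Build beta: (p, lam) -> r for one perceptual class with threshold tau."""
    def beta(stimulus):
        p, lam = stimulus
        # Below threshold (lam < tau), or for a percept this behavior ignores,
        # the mapping yields the zero response.
        if p != percept_class or lam < tau:
            return list(NO_RESPONSE)
        return response_fn(lam)
    return beta


def back_away(lam):
    """Hypothetical 'arbitrary function': retreat along x in proportion to lam."""
    return [-lam, 0.0, 0.0, 0.0, 0.0, 0.0]


# Collision avoidance with an assumed threshold of 0.5 on stimulus strength.
avoid_collision = make_behavior("obstacle", tau=0.5, response_fn=back_away)

if __name__ == "__main__":
    print(avoid_collision(("obstacle", 0.2)))  # weak stimulus: no response
    print(avoid_collision(("obstacle", 0.8)))  # strong stimulus: evasive action
```

Returning the zero vector for unrelated percept classes is one simple way to keep the function defined over every p it may be handed, which is the requirement stated for β.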
The functional mapping between the strength of stimulus and the magnitude (strength) and direction of robotic motor response defines the design space for a particular robotic behavior. Figure 3.14 depicts two possible stimulus strength-response strength plots for the situation described above. One is a step function in which, once the distance threshold is exceeded, the action is taken at maximum strength. The other formulation involves increasing the response strength linearly over some range of stimulus strength (measured by distance here). Other functions are of course possible. The response's orientation may also vary depending on how the behavior has been constructed. For example, the motor response may be directly away from the detected object (strict repulsion: move away), or alternatively tangential to it (circumnavigation: go left or right) (figure 3.15).

Associated with a particular behavior, β, may be a scalar gain value g (strength multiplier) further modifying the magnitude of the overall response r for a given s:

$$r' = g\,r$$
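The two strength profiles and the gain can be written down directly. The sketch below is only illustrative: the function names, the 5 m threshold, and the maximum strength of 100 are assumed values chosen for the example and are not read from figure 3.14.

```python
def step_strength(distance, threshold=5.0, max_strength=100.0):
    """Step profile: full strength once the obstacle is within the threshold."""
    return max_strength if distance <= threshold else 0.0


def linear_strength(distance, near=0.0, far=5.0, max_strength=100.0):
    """Linear profile: strength rises from 0 at `far` to max_strength at `near`."""
    if distance >= far:
        return 0.0
    if distance <= near:
        return max_strength
    return max_strength * (far - distance) / (far - near)


def apply_gain(response, gain):
    """Scale every component of a response vector: r' = g * r."""
    return [gain * component for component in response]


if __name__ == "__main__":
    for d in (1.0, 2.5, 6.0):
        print(d, step_strength(d), linear_strength(d))
    print(apply_gain([1.0, 2.0], 0.5))
```

Setting the gain to 0 zeroes every component, which is how a behavior can be switched off without removing it from the system.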
These gain values are used to compose multiple behaviors by specifying their strengths relative to one another (section 3.4). In the extreme case, g can be used to turn off a behavior by setting it to 0, thus reducing r' to 0.

The behavioral mappings, β, of stimuli onto responses fall into three general categories:
• Null: The stimulus produces no motor response.
• Discrete: The stimulus produces a response from an enumerable set of prescribed choices (all possible responses consist of a predefined cardinal set of actions that the robot can enact, e.g., turn-right, go-straight, stop, travel-at-speed-5). R consists of a bounded set of stereotypical responses enumerated for the stimulus domain S and specified by β.
• Continuous: The stimulus domain produces a motor response that is continuous over R's range. (Specific stimuli s are mapped into an infinite set of response encodings by β.)

Obviously it is easy to handle the null case as discussed earlier: For all s, β: s → 0. Although this is trivial, there are instances (perceptual triggers) where this response is wholly appropriate and useful, enabling us to define perceptual processes independent of direct motor action.

The methods for encoding discrete and continuous responses are discussed in turn.
3.3.1 Discrete Encoding
[Figure 3.14: Stimulus distance/response strength plot: (A) step function, and (B) linear increase. Horizontal axis: distance from stimulus (m), 0 to 10.]

In a discrete encoding, β consists of a finite set of situation-response pairs. Sensing provides the index for finding the appropriate situation. The responses generated for a situation can be very simple, such as halt, or more complex, potentially generating a sequence of actions akin to the fixed-action patterns described in chapter 2.

Another strategy using discrete encodings involves the use of rule-based systems. Here β is represented as a collection of IF-THEN rules. These rules take the general form:

IF antecedent THEN consequent

where the antecedent consists of a list of preconditions that must be satisfied in order for the rule to be applicable and the consequent contains the motor response. The discrete set of possible responses corresponds to the set of rules in the system. More than one rule may be applicable for any given situation. The strategy used to deal with conflict resolution typically selects one of the potentially many rules to use based on some evaluation function. Many rule-based behavioral systems encode their behaviors using fuzzy rules; we will study these systems in more detail in chapter 8.

Gapps (section 3.2.4.2) uses goal-reduction rules to encode the actions required to accomplish a task. Here the antecedent specifies a higher-level goal that if necessary will require that certain subgoals be achieved. Eventually these subgoals translate into specific motor commands (or action vectors, to use their terminology). One example for an underwater robot (Bonasso 1992) used the following Gapps rule:

(defgoalr (ach wander)
  (if (not (RPV-at-wander-angle))
      (ach turn to wander angle)
      (ach wander set point)))

Here the wander behavior requires that the robot turn to a new wander angle and then move to a newly established set point.
[Figure 3.15: Two directional encodings for collision avoidance: (A) repulsive, and (B) circumnavigation, with approach from the bottom of the figure.]

In Nilsson 1994, condition-action production rules control the robot. The condition (antecedent) is based partially on sensory information, whereas the consequent encodes a response for the robot to enact. The resulting actions are themselves durative instead of discrete, here meaning that the action resulting from the invocation of the rule persists indefinitely. (Contrast the durative response of "Move at a speed of five meters/second" with the discrete response "Move fifteen meters.") This teleo-reactive method is somewhat related to the circuitry models epitomized by Gapps, yet uses distinguishing formalisms that can potentially facilitate analysis.

Another example of discrete encoding involves rules encoded for use within the subsumption architecture (Brooks 1986). To facilitate the use of this particular approach to reactive systems, the Behavior Language was created (Brooks 1990a). This language embodies a real-time, rule-based approach for specifying and encoding behavior. Using a LISP-like syntax, rules are specified with the whenever clause:

(whenever condition &rest body-forms)

where the condition portion of the rule corresponds to the antecedent and the &rest to the consequent. Behaviors are constructed by collecting a set of these real-time rules into a group, typically using the defbehavior clause:
(defbehavior name &key inputs outputs declarations processes)

where the list of rules appears in the processes location.

Similar rules were used for the control of a wheelchair in Connell and Viola 1990. A loose translation of a few of these behavioral rules includes:
• Approach: IF an object is detected beyond specified sonar range, THEN go forward.
• Retreat: IF an object is nearby according to sonar, THEN move backward.
• Stymie: IF all the front infrared sensors indicate an object immediately in front, THEN turn left.
The entire robotic system consisted of fifteen such behaviors encoding a mapping of direct sensing to motor activity. (These behaviors still require coordination, which is discussed in section 3.4.)
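The three translated rules can be sketched as a small rule table in Python to show what a discrete, rule-based β looks like in code. Everything beyond the three IF-THEN statements is an assumption made for the example: the sensor dictionary keys, the distances FAR_M and NEAR_M, the numeric priorities used as a stand-in evaluation function, and the default action "stop" are hypothetical and are not taken from Connell and Viola 1990.

```python
FAR_M = 2.0    # assumed "beyond specified sonar range" distance, in meters
NEAR_M = 0.5   # assumed "nearby" distance, in meters

# Each rule: (name, priority, antecedent, consequent).
# The priority is a hypothetical evaluation function for conflict resolution.
RULES = [
    ("stymie", 3,
     lambda s: bool(s["front_ir"]) and all(s["front_ir"]),
     "turn-left"),
    ("retreat", 2,
     lambda s: s["sonar_m"] is not None and s["sonar_m"] < NEAR_M,
     "move-backward"),
    ("approach", 1,
     lambda s: s["sonar_m"] is not None and s["sonar_m"] > FAR_M,
     "go-forward"),
]


def select_action(sensors, rules=RULES, default="stop"):
    """Return the consequent of the highest-priority applicable rule."""
    applicable = [rule for rule in rules if rule[2](sensors)]
    if not applicable:
        return default  # no antecedent satisfied
    best = max(applicable, key=lambda rule: rule[1])
    return best[3]


if __name__ == "__main__":
    print(select_action({"sonar_m": 3.0, "front_ir": [False, False]}))
    print(select_action({"sonar_m": 0.3, "front_ir": [True, True]}))
```

When both Retreat and Stymie apply, the table picks one by priority; this is just one possible evaluation function, and coordination among behaviors is treated properly in section 3.4.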
3.3.2 Continuous Functional Encoding
Continuous response allows a robot to have an infinite space of potential reactions to its world. Instead of having an enumerated set of responses that discretizes the way in which the robot can move (e.g., {forward, backward, left, right, speedup, slowdown, ..., etc.}), a mathematical function transforms the sensory input into a behavioral reaction. One of the most common methods for implementing continuous response is based on a technique referred to as the potential fields method. (We will revisit why the word based is italicized in the previous sentence after we understand the potential fields method in more detail.)

Khatib (1985) and Krogh (1984) developed the potential fields methodology as a basis for generating smooth trajectories for both mobile and manipulator robotic systems. This method generates a field representing a navigational space based on an arbitrary potential function. The classic function used is that of Coulomb's electrostatic attraction, or analogously, the law of universal gravitation, where the potential force drops off with the square of the distance between the robot and objects within its environment. Goals are treated as attractors and obstacles are treated as repulsors. Separate fields are constructed, based upon potential functions, to represent the relationship between the robot and each of the objects within the robot's sensory range. These fields are then combined, typically through superpositioning, to yield a single global field. For path planning, a smooth trajectory can then be computed based upon the gradient within the globally computed field. A detailed presentation of the potential fields method appears in Latombe 1991.
Using the inverse-square law expressing the relationship between force and distance,

$$\text{Force} \propto \frac{1}{\text{Distance}^2}$$

we construct a field for a repulsive obstacle, shown in (A) in figure 3.16. A ballistic goal attraction field, where the magnitude of attraction is constant throughout space, is depicted for an attractor located in the lower right of (B) in figure 3.16. In figure 3.17, (A) shows the linear superposition of these two fields, with (B) illustrating an example trajectory for a robot moving within this simple world.

Because potential fields encode a continuous navigational space through the sensed world (i.e., the force can be computed at any location), they provide an infinite set of possibilities for reaction. Potential fields are not without their problems, however (Koren and Borenstein 1991). In particular, they are vulnerable to local minima (locations where the robot may get stuck) or cyclic-oscillatory behavior. Numerous methods have been developed to address these problems, including the use of harmonic potential fields (Kim and Khosla 1992; Connolly and Grupen 1993), time-varying potential fields (Tianmiao and Bo 1992), random noise injected into the field (Arkin 1987a), and adaptive methods (Clark, Arkin, and Ram 1992), among others. We will examine more closely the methods for combining potential fields in section 3.4.

Another seemingly significant problem with the use of the potential fields method is the amount of time required to compute the entire field. Reactive robotic systems eliminate this problem by computing each field's contribution merely at the instantaneous position where the robot is currently located (Arkin 1989b). This is why we said earlier that these techniques are only based on the potential fields method: no field is computed at all. No path planning is conducted at all: rather, the robot's reaction to its environment is recomputed as fast as sensory processing will permit. One of the major misconceptions in understanding reactive methods based on potential fields is a failure to recognize the fact that the only computation needed is that required to assess the forces from the robot's current position within the world.
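The point that only the force at the robot's current position is needed can be shown with a short Python sketch. It evaluates an inverse-square repulsion and a constant-magnitude (ballistic) attraction at a single query position and sums them. The function names, the unit gains, and the choice to return a zero vector when the robot coincides with an object are assumptions made for this example.

```python
import math


def repulsive(robot, obstacle, gain=1.0, eps=1e-6):
    """Inverse-square repulsion on the robot, directed away from the obstacle."""
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d < eps:
        return (0.0, 0.0)  # direction undefined at the obstacle itself (assumed handling)
    magnitude = gain / d ** 2
    return (magnitude * dx / d, magnitude * dy / d)


def ballistic_attraction(robot, goal, magnitude=1.0, eps=1e-6):
    """Constant-magnitude attraction directed toward the goal."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy)
    if d < eps:
        return (0.0, 0.0)  # already at the goal
    return (magnitude * dx / d, magnitude * dy / d)


def combined(robot, goal, obstacles):
    """Superposition (vector sum) of all contributions at the robot's position."""
    fx, fy = ballistic_attraction(robot, goal)
    for obstacle in obstacles:
        rx, ry = repulsive(robot, obstacle)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)


if __name__ == "__main__":
    print(combined((2.0, 0.0), (4.0, 0.0), [(0.0, 0.0)]))
```

The cost of one evaluation grows with the number of sensed objects, not with the size of the workspace, and each object's contribution is independent of the others.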
This method is thus inherently very fast to compute as well as highly parallelizable. When the entire field is represented in a figure, it is only for the reader's edification. Reiterating, behavior-based reactive systems using potential fields do not generate plans based on the entire field but instead react only to the robot's egocentric perceptions of the world. This is of particular importance when the world is dynamic (i.e., there are moving objects, thus invalidating static planning techniques) or sensor readings are noisy or unreliable (as a plan generated on bad data is likely to be defective itself). Frequent resampling of the world helps overcome many of these deficiencies.

[Figure 3.16: Potential fields for (A) an obstacle and (B) a goal located in the lower right side of the figure.]

Slack (1990) developed another method for describing continuous pathways through navigational space. Navigational templates (NATs) are defined as arbitrary functions to characterize navigational space that do not necessarily have any correlation with typical potential fields methods. These NAT primitives rather characterize space on a task-oriented approach and are defined on an as-needed basis. An example of this method involves the use of spin-based techniques for obstacle treatment, circumventing the problem of local minima found in traditional potential fields methods. In figure 3.18, (A) shows a spin-based version for obstacle avoidance for the situation depicted in (B) in figure 3.16. The results of superpositioning this new template (as opposed to field) with (A) in figure 3.16 are shown in (B) in figure 3.18. By using knowledge of the goal's location relative to the detected obstacle, a spin direction is chosen, either clockwise (when the obstacle is to the right of the goal, viewed from above) or counterclockwise (when it is to the left). Unfortunately, the modularity of the behavior is somewhat compromised with this technique, for
the correct choice for the obstacle's spin direction requires knowledge of the goal location, which normally is used only within the goal attraction behavior.

[Figure 3.17: (A) Linear superposition of fields in figure 3.16. (B) An example trajectory through this navigational space.]

One other novel representation uses a technique called deformation zones for collision avoidance (Zapata et al. 1991). This technique defines two manifolds: the information-boundary manifold, which represents the maximum extent of sensing, and the range manifold, constructed from the readings of all active sensors, which can be viewed as a deformed version of the information-boundary manifold because of the presence of obstacles (figure 3.19). Perception produces deformation of the information-boundary manifold, and control strategies are generated in response to remove the deformation. In the example shown, the robot would steer to the left. In this approach, reactive behaviors are defined, including emergency stop, dynamic collision avoidance, displacement (orientation), and target following. One interesting aspect of this work allows for the definition of a variable collision avoidance zone dependent on the robot's velocity (i.e., when the robot is moving faster it projects further into the information-boundary manifold, and when it moves more slowly the information-boundary manifold shrinks). This enables effective control of the robot's speed by recognizing that if the deformation zone cannot be restored by steering, it can be restored by slowing the vehicle down.
104
Chapter3
. . . . . .. . .. . .. . .. . .. . .. . .. . .. . . . . . . . . .. . .. . ". . . . ". . . . .". . . .". . . ". . . . ". . . . . . . . . . . . . . . . . . . . . . . . . -. . . . . . . . . . . . . . . .. . .. . .. . .. . _ . -. ~-~-~-~-~-. . . . ~. . . . . . . . . . . . . . . . . . . . . . . ~~~ ~ .. . . . . . . , , ~~. . . . . . . . . . . . . . . . . . ". , ~~, , .............. .........-..................~ ..........~. . . . . . . . . . . . . . . . , ", , ". , , ........~.. _ ....~~. . . . . . . . . . . ... ... .. ~~~-+~ .... ..., .........~ . . . . . . ". ", . ", ' ~~~~ " " . ~ .... ~\. . .. . .. . .. . .. . .. , = ~ ~ -. . . . . . . ", , ", 1" ,\ ~ ~ ~ ... .. \ . . . . . . . . . . . . "" "" , ,;~ ~~ '-+~~~~ ~\~~\~\ \~. .. . .. . .. . .. . .. . . ". . ". "" , t t t , ~ \ ~" " " . . . . , , t t ' t ~ ~~ ~~ ~ ~ ~ ~~'~ ~~~~ , ', .' '. .' .' ' t : ~ . : . : . : . : . : t : t ~t ~1~; ' : : : : ": : , t t ",tt 't ~t ~~~,/I~ i } ~~i JJl ". . ". . t , , t ' ~~ "" "" ' , . tttt ~ ~ J4 . ". . ". , "' , ' , ~' ~ ~ J ~ ~ ~ J4 """ "' , "' "" Ji ".. .. ".. ". "" """ , '\' ~ ~ ~ ,~ ~-:-'*"""-~~~~ --~"~~'../ II "". , ' ". ". ". . . . . . , , , , ~ ~ . . . . . . . . . . , , , , , ,~ -..-.., .. ...+-.+ ~ -+-~........~........--.., , , . , , , . . . . . . . .. . .. . .. . ". "" "", ", ".."-..--..---r~ ...~..~..--..""" "". .. . .. . . . . . . . . . . . ". ", ....... ......~..~....~....~.........~. ~ " , . . ~. . ~. . . . . .. . .. . .. . .. . .. . . . .. . .. . .. . .. . ". ". .. . .. . .. . ~ ~ .................~ .~ .~ . .. . .. . .. . .. . .. . .. . .. . . . . . . . . . . . . . . . . . - . -. . . . . . . . . . . . . . . . . . -. - -. -. . . . . . . . . . . ... . ... . (A) 3.18 Figure 3.4 ASSEMBLINGBEHAVIORS Having discussed methods to describe individual behaviors, we now study methods for constructing systems consisting of multiple behaviors. 
This study requires the introduction of notational formalisms that we will use throughout this book and an understanding of the different methods for combining and coordinating multiple behavioral activity streams. We first examine, however, the somewhat controversial notion of emergent behavior.
Figure 3.18 (continued) (A) NAT spin field for the obstacle in (B) in figure 3.16. (B) Superposition of fields (A) in this figure and figure 3.16b. Also superimposed on this figure is the path for a robot using this combined field for the same start and goal as shown in (B) in figure 3.17.
3.4.1 Emergent Behavior

Emergence is often invoked in an almost mystical sense regarding the capabilities of behavior-based systems. Emergent behavior implies a holistic capability where the sum is considerably greater than its parts. It is true that what occurs in a behavior-based system is often a surprise to the system's designer, but does the surprise come because of a shortcoming of the analysis of the constituent behavioral building blocks and their coordination, or because of something else?
Figure 3.19 The use of manifolds for reactive collision avoidance. (Diagram labels: robot, direction of motion, range manifold, information boundary manifold, obstacle.)
The notion of emergence as a mystical phenomenon needs to be dispelled, but the concept in a well-defined sense can still be useful. Numerous researchers have discussed emergence in its various aspects:

- Emergence is "the appearance of novel properties in whole systems" (Moravec 1988).
- "Global functionality emerges from the parallel interaction of local behaviors" (Steels 1990).
- "Intelligence emerges from the interaction of the components of the system" (where the system's functionality, i.e., planning, perception, mobility, etc., results from the behavior-generating components) (Brooks 1991a).
- "Emergent functionality arises by virtue of interaction between components not themselves designed with the particular function in mind" (McFarland and Bosser 1993).

The common thread through all these statements is that emergence is a property of a collection of interacting components, in our case behaviors. The question arises, however: Individual behaviors are well defined functionally, so why should a coordinated collection of them produce novel or unanticipated results?
Coordination functions as defined in this chapter are algorithms and hence contain no surprises and possess no magical perspective. In some cases, they are straightforward, such as choosing the highest ranked or most dominant behavior; in others they may be more complex, involving fusion of multiple active behaviors. Nonetheless, they are generally deterministic functions and certainly computable. Why then does the ineffable quality of emergence arise when these behavior-based robotic systems are released in the world? Why can we not predict their behavior exactly?

The answer to this question lies not within the robot itself but rather in the relationship the robot has with its environment. For most situations in which the behavior-based paradigm is applied, the world itself resists analytical modeling. Nondeterminism is rampant. The real world is filled with uncertainty and dynamic properties. Further, the perception process itself is also poorly characterized: Precise sensor models for open worlds do not exist. If a world model could be created that accurately captured all of its properties then emergence would not exist: Accurate predictions could be made. But it is the nature of the world to resist just such characterization, hence we cannot predict a priori, with any degree of confidence, in all but the simplest of worlds, how the world will present itself. Probabilistic models can provide guidance but not certainty.

For example, simply consider all the things involved in something as simple as your attending a class. Lighting conditions and weather will affect perceptual processing; the traffic on the road and sidewalks can be characterized only weakly (i.e., you do not know ahead of time the location and speeds of all people and cars that you meet). The complexities of the world resist modeling, leading to the aphorism espoused by Brooks (1989b): "The world is its own best model."

Since the world cannot be faithfully modeled and is full of surprises itself, it is small wonder that behavior-based systems, which are tightly grounded to the world, reflect these surprises to their designers and observers by performing unexpected or unanticipated actions (or, by definition, actions not explicitly captured in the designer's intentions).

Summarizing, emergent properties are a common phenomenon in behavior-based systems, but there is nothing mystical about them. They are a consequence of the underlying complexity of the world in which the robotic agent resides and the additional complexity of perceiving that world. Let us now move on to methods for expressing the coordination of behavior within these robotic systems.
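3.4.2 Notation

We now consider the case where multiple behaviors may be concurrently active within a robotic system. Defining additional notation, let

- $S$ denote a vector of all stimuli $s_i$ relevant for each behavior $\beta_i$ detectable at time $t$.
- $B$ denote a vector of all active behaviors $\beta_i$ at a given time $t$.
- $G$ denote a vector encoding the relative strength or gain $g_i$ of each active behavior $\beta_i$.
- $R$ denote a vector of all responses $r_i$ generated by the set of active behaviors.

A new behavioral coordination function, $C$, is now defined such that

$$\rho = C(G * B(S)),$$

or alternatively

$$\rho = C(G * R),$$

where

$$R = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}, \quad
S = \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix}, \quad
G = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_n \end{bmatrix}, \quad \text{and} \quad
B = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix},$$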
and where $*$ denotes the special scaling operation for multiplication of each scalar component ($g_i$) by the corresponding magnitude of the component vector ($r_i$), resulting in a column vector $r'_i$ of the same dimension as $r_i$. In other words, $r'_i$ represents a response reaction in the same direction as $r_i$ with its magnitude scaled by $g_i$. (Alternatively, $G$ can be represented as a diagonal matrix.)

Restating, the coordination function $C$, operating over all active behaviors $B$, modulated by the relative strengths of each behavior specified by the gain vector $G$, for a given set of detected stimuli $S$ at time $t$, produces the overall robotic response $\rho$, where $\rho$ is the vector encoding the global response that the robot will undertake, represented in the same form as $r$ (e.g., $[x, y, z, \theta, \phi, \psi]$).

$C$ can be arbitrarily defined, but several strategies are commonly used to encode this function. They fall into two classes: competitive and cooperative. The simplest competitive method is pure arbitration, where only one behavior's output ($r_i$) is selected from $R$ and assigned to $\rho$, that is, arbitrarily choosing only one response from the many available. Several methods have been used to implement this particular technique, including behavioral prioritization (subsumption) or action selection. Cooperative methods, on the other hand, blend the outputs of multiple behaviors in some way consistent with the agent's overall goals. The most common method of this type is vector addition. Competitive and cooperative methods can be composed as well. Section 3.4.3 examines these methods in more detail.

First, however, let us revisit the classroom example. Given the robot's current perceptions at time $t$:
$$S = \begin{bmatrix}
(\text{class-location},\ 1.0) \\
(\text{detected-object},\ 0.2) \\
(\text{detected-student},\ 0.8) \\
(\text{detected-path},\ 1.0) \\
(\text{detected-elder},\ 0.0)
\end{bmatrix},$$
for each $s_i = (p, \lambda)$, where $\lambda$ is the stimulus $p$'s percentage of maximum strength. The situation above indicates that the robot knows exactly where the classroom is, has detected an object a good distance away, sees a student approaching nearby, sees the sidewalk (path) with certainty, and senses that there are no elders nearby. We can then represent the behavioral response as:
$$B(S) = \begin{bmatrix}
\beta_{\text{move-to-class}}(s_1) \\
\beta_{\text{avoid-object}}(s_2) \\
\beta_{\text{dodge-student}}(s_3) \\
\beta_{\text{stay-right}}(s_4) \\
\beta_{\text{defer-to-elder}}(s_5)
\end{bmatrix}.$$
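$R$ then is computed using each $\beta$,

$$R = \begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \end{bmatrix},$$

with component vector magnitudes equal to (arbitrarily for this case)

$$R_{\text{magnitude}} = \begin{bmatrix} 1.0 \\ 0 \\ 0.8 \\ 1.0 \\ 0 \end{bmatrix},$$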
where each $r_i$ encodes an $[x, y, \theta]$ for this particular robot expressing the desired directional response for each independent behavior. In the example, avoid-object and defer-to-elder are below threshold and generate no response, whereas move-to-class and stay-right are at maximum strength (defined as 1.0 here). Remember that the response is computed from $\beta: S \rightarrow R$.

Before the coordination function $C$ is applied, $R$ is multiplied by the gain vector $G$. For the numbers used in the example for $G$, stay-right is least important ($g_{\text{stay-right}} = 0.4$), dodge-student is deemed the most important behavior ($g_{\text{dodge-student}} = 1.5$), while move-to-class and defer-to-elder are of equal intermediate priority ($g_{\text{move-to-class}} = g_{\text{defer-to-elder}} = 0.8$) and the avoid-object behavior ranks second overall ($g_{\text{avoid-object}} = 1.2$). $R' = G * R$, where:
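$$G = \begin{bmatrix}
g_{\text{move-to-class}} \\
g_{\text{avoid-object}} \\
g_{\text{dodge-student}} \\
g_{\text{stay-right}} \\
g_{\text{defer-to-elder}}
\end{bmatrix}
= \begin{bmatrix} 0.8 \\ 1.2 \\ 1.5 \\ 0.4 \\ 0.8 \end{bmatrix},$$

yielding:

$$\rho = C(R') = C\left(\begin{bmatrix}
g_1 * r_1 \\ g_2 * r_2 \\ g_3 * r_3 \\ g_4 * r_4 \\ g_5 * r_5
\end{bmatrix}\right)$$

with scaled component vector magnitudes

$$R'_{\text{magnitude}} = \begin{bmatrix} 0.8 \\ 0 \\ 1.2 \\ 0.4 \\ 0 \end{bmatrix}.$$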
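The arithmetic of the classroom example can be checked with a short sketch. It assumes, for illustration only, that each response is reduced to its scalar magnitude and stored in a Python dictionary keyed by behavior name; the names `R_MAGNITUDE`, `GAINS`, and `scale_by_gain` are hypothetical and are not part of the notation in the text.

```python
# Response magnitudes r_i from the classroom example (illustrative layout).
R_MAGNITUDE = {
    "move-to-class": 1.0,
    "avoid-object": 0.0,
    "dodge-student": 0.8,
    "stay-right": 1.0,
    "defer-to-elder": 0.0,
}

# Gains g_i from the same example.
GAINS = {
    "move-to-class": 0.8,
    "avoid-object": 1.2,
    "dodge-student": 1.5,
    "stay-right": 0.4,
    "defer-to-elder": 0.8,
}


def scale_by_gain(magnitudes, gains):
    """Return the magnitudes of R' = G * R, one entry per behavior."""
    return {name: gains[name] * magnitudes[name] for name in magnitudes}


if __name__ == "__main__":
    for name, value in scale_by_gain(R_MAGNITUDE, GAINS).items():
        # Rounded for display; the products are ordinary floating-point values.
        print(f"{name}: {value:.1f}")
```

Only the magnitudes are scaled here; in the full formulation each $r_i$ keeps its direction, and a behavior with a zero response contributes nothing however large its gain.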
$G$ is of value only when multiple behaviors are being coordinated, as it enables the robot to set priorities through establishing the relative importance of each of its constituent behaviors.

If a simple winner-take-all coordination strategy is in place (action selection), a single component vector $g_i r_i$ (based on some metric such as greatest magnitude) is chosen by $C$, assigned to $\rho$, and executed. In the example above, this is the component with a value of 1.2, associated with the dodge-student behavior. Thus the response undertaken using this simple arbitration function is the action required to dodge an oncoming student. Note that this behavior dominates only given the current perceptual readings at time $t$ and that the behavioral response will change as the robot's perceptions of the world also change.

For a priority-based arbitration system, the highest ranked component behavior encoded in $G$ (above threshold) would be chosen and executed, independent of the individual components' relative magnitudes. In our example, using this method, dodge-student would also be chosen, assuming the stimulus is above threshold, since $g_{\text{dodge-student}}$ (1.5) is the highest-ranked behavior.

Finally, coordination functions are recursively defined to operate not only on low-level behavioral responses but also on the output of other coordination operators (which also are behavioral responses):

$$\rho' = C'\left(\begin{bmatrix} \rho_1 \\ \rho_2 \\ \vdots \\ \rho_n \end{bmatrix}\right),$$

where $\rho'$ coordinates the output of lower-level coordinative functions. One benefit of this definition is the ability to sequence sets of behavioral activities for a robot temporally.

3.4.3 Behavioral Coordination
We now turn to examine the nature of $C$, the coordination function for behaviors. This function has two predominant classes, competitive and cooperative, each of which has several different strategies for realization. The different classes may be composed together, although frequently in the behavior-based architectures described in chapter 4, a commitment is made to one type of coordination function specific to a particular approach.
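As a minimal sketch of how two competitive choices of $C$ can differ, the following applies winner-take-all selection and priority-based arbitration to the classroom example of section 3.4.2. It assumes that responses are reduced to scalar magnitudes and that a behavior counts as above threshold whenever its magnitude is nonzero; the function names and dictionary layout are illustrative assumptions rather than notation from the text.

```python
def winner_take_all(magnitudes, gains):
    """Pick the behavior whose scaled magnitude g_i * r_i is largest."""
    return max(magnitudes, key=lambda name: gains[name] * magnitudes[name])


def priority_arbitration(magnitudes, gains):
    """Pick the active behavior with the largest gain, ignoring response size."""
    # Assumption for this sketch: "active" means a nonzero response magnitude.
    active = [name for name, r in magnitudes.items() if r > 0.0]
    if not active:
        return None  # no behavior is above threshold
    return max(active, key=lambda name: gains[name])


gains = {"move-to-class": 0.8, "avoid-object": 1.2, "dodge-student": 1.5,
         "stay-right": 0.4, "defer-to-elder": 0.8}
magnitudes = {"move-to-class": 1.0, "avoid-object": 0.0, "dodge-student": 0.8,
              "stay-right": 1.0, "defer-to-elder": 0.0}

# Both strategies agree on the readings used in the text.
print(winner_take_all(magnitudes, gains))
print(priority_arbitration(magnitudes, gains))

# With a weaker (hypothetical) student stimulus the two strategies diverge.
weaker = dict(magnitudes, **{"dodge-student": 0.3})
print(winner_take_all(weaker, gains))
print(priority_arbitration(weaker, gains))
```

With the readings from the text both functions select dodge-student. The hypothetical weaker reading shows the difference between the two: winner-take-all switches to move-to-class because 1.5 × 0.3 is smaller than 0.8 × 1.0, while priority-based arbitration still selects dodge-student because only the gain ranking matters.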
3.4.3.1 Competitive Methods

Conflict can result when two or more behaviors are active, each with its own independent response. Competitive methods provide a means of coordinating behavioral response for conflict resolution. The coordinator can often be viewed as a winner-take-all network in which the single response for the winning behavior out-muscles all the others and is directed to the robot for execution. This type of competitive strategy can be enacted in a variety of ways.

Arbitration requires that a coordination function serving as an arbiter select a single behavioral response. The arbitration function can take the form of a fixed prioritization network in which a strict behavioral dominance hierarchy exists, typically through the use of suppression and inhibition in a descending manner. This is a hallmark of subsumption-based methods (Brooks 1986), the particulars of which are described in chapter 4. Figure 3.20 shows an example illustrated as a dominance hierarchy.

Figure 3.20 Arbitration via suppression network (priority-based coordination).

Action-selection methods (Maes 1990) arbitrarily select the output of a single behavior, but this is done in a less autocratic manner. Here the behaviors actively compete with each other through the use of activation levels driven by both the agent's goals (or intentionality) and incoming sensory information. No fixed hierarchy is established; rather the behavioral processes compete with each other for control at a given time. Run time arbitration occurs by the selection of the most active behavior, but no predefined hierarchy is present. Figure 3.21 captures this notionally. The concept of lateral inhibition (section 2.2.1) can easily be implemented using this method where one behavior's strong output negatively affects another's output.

Figure 3.21 Arbitration via action-selection (action-selection coordination).

Another even more democratic competitive method involves behaviors' generating votes for actions, with the action that receives the most votes being the single behavior chosen (Rosenblatt and Payton 1989). This technique is embodied in the Distributed Architecture for Mobile Navigation (DAMN) (Rosenblatt 1995). Here instead of each behavior's being encoded as a set of rule-based responses, each behavior casts a number of votes toward a predefined set of discrete motor responses. For navigation of an unmanned ground vehicle, the behavioral response set for steering consists of {hard-left, soft-left, straight-ahead, soft-right, hard-right}. Each active behavior (e.g., goal-seeking, obstacle-avoidance, road-following, cross-country, teleoperation, map-based navigation) has a certain number of votes ($g_i$) and a user-provided distribution for allocating those votes. Arbitration takes place through a winner-take-all strategy in which the single response with the most votes is enacted. In a sense, this allows a level of behavioral cooperation, but the method is still arbitrary in its choice of a single response. Figure 3.22 depicts this type of arbitration method.
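The voting scheme just described can be sketched as follows. This is an illustration of the general idea only, not the DAMN implementation: the vote counts, the distributions, the tie-breaking rule, and the names `STEERING` and `vote` are all assumptions made for the example.

```python
from collections import defaultdict

# Discrete steering responses listed in the text.
STEERING = ["hard-left", "soft-left", "straight-ahead", "soft-right", "hard-right"]


def vote(behaviors):
    """Tally weighted votes and return the steering command with the most votes.

    `behaviors` maps a behavior name to (votes, distribution), where
    `distribution` maps steering commands to fractions of that behavior's votes.
    """
    tally = defaultdict(float)
    for votes, distribution in behaviors.values():
        for command, share in distribution.items():
            tally[command] += votes * share
    # Ties go to the command listed first in STEERING (an arbitrary choice here).
    return max(STEERING, key=lambda command: tally[command])


if __name__ == "__main__":
    # Hypothetical vote counts and distributions for two active behaviors.
    behaviors = {
        "goal-seeking": (3, {"soft-right": 0.7, "straight-ahead": 0.3}),
        "obstacle-avoidance": (5, {"soft-left": 0.5, "hard-left": 0.3,
                                   "straight-ahead": 0.2}),
    }
    print(vote(behaviors))
```

In this made-up case soft-left collects 2.5 votes against roughly 2.1 for soft-right and 1.9 for straight-ahead, so soft-left is enacted. The goal-seeking behavior still shapes the tally, which is the limited sense in which voting permits cooperation even though a single response wins.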
3.4.3.2 Cooperative Methods

Cooperative methods provide an alternative to competitive methods such as arbitration. Behavioral fusion provides the ability to use concurrently the output of more than one behavior at a time. The central issue in combining the outputs of behaviors is finding a representation amenable to fusion. The potential-fields method, as described in section 3.3.2, provides one useful formalism. As shown earlier, the most straightforward method is through vector addition or superpositioning. Each behavior's relative strength or gain, $g_i$, is used as a multiplier of the vectors before addition. Figure 3.23 illustrates how this is accomplished.
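Gain-weighted vector addition is simple enough to write out directly. The sketch below assumes two-dimensional response vectors given as `(x, y)` tuples; the function name `fuse` and the sample vectors are illustrative assumptions.

```python
import math


def fuse(responses, gains):
    """Cooperative coordination: sum the gain-weighted response vectors."""
    x = sum(g * r[0] for g, r in zip(gains, responses))
    y = sum(g * r[1] for g, r in zip(gains, responses))
    return (x, y)


if __name__ == "__main__":
    goal = (1.0, 0.0)   # hypothetical unit vector toward the goal
    avoid = (0.0, 1.0)  # hypothetical unit vector away from an obstacle

    # Goal attraction weighted twice as strongly as obstacle avoidance.
    fused = fuse([goal, avoid], [2.0, 1.0])
    print(fused, math.degrees(math.atan2(fused[1], fused[0])))

    # Obstacle avoidance weighted twice as strongly as goal attraction.
    fused = fuse([goal, avoid], [1.0, 2.0])
    print(fused, math.degrees(math.atan2(fused[1], fused[0])))
```

Swapping the gains turns the fused heading from about 26.6 degrees off the goal direction to about 63.4 degrees, which illustrates how the gain vector provides a continuum between the contributing behaviors rather than an either-or choice.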
Figure 3.22 Arbitration via voting (voting-based coordination).

Figure 3.23 Behavioral fusion via vector summation.
Figure 3.24 shows a field with three active obstacles and a goal located in the lower right corner. In (A), the goal attractor behavior has twice the obstacle avoidance behavior's relative strength. The path the robot takes through this field cuts fairly close to the obstacles themselves, resulting in a shorter although more perilous path. In (B), obstacle avoidance has twice the strength of the goal attraction. The path taken is considerably longer but also further from the obstacles themselves. Vector addition provides a continuum for combining these fields controllable through the gain vector $G$.

In work at Hughes Artificial Intelligence Center (Payton et al. 1992), the issue of combining multiple disparate behavioral outputs is handled by avoiding the sole use of single discrete response values. Instead, the responses of each behavior constrain the control variables as follows:

- Zone: establishment of upper and lower bounds for a control variable
- Spike: single value designation resulting in standard priority-based arbitration
- Clamp: establishment of upper or lower bound on a control variable

In essence, the system provides constraints within which the control system must operate to satisfy its behavioral requirements. The algorithm for fusing the behaviors is as follows:

    Generate a profile by summing each active behavior's zones and spikes
    If maximum is a spike
        choose that command
    else (maximum is in zone)
        choose command requiring minimal change from current control setting
    If command chosen beyond a clamped value
        choose most dominant clamp from all clamps
        If that clamp dominates profile
            move command value to just within clamped region

In this manner, commands can be described in the control variables for each actuator (e.g., speed of a motor, angles of a control surface, etc.). Clamps limit the acceptable ranges for each actuator. This particular system has been applied to the control of an underwater robot using high-level behaviors such as stealth, safety, urgency, and efficiency, which are mapped onto low-level behaviors that control the vehicle's speed, the angle of the dive plane, proximity to the ocean bottom, and so forth, which in turn control the vehicle actuators, such as the ballast pumps and dive plane motor.

Formal methods for expressing behavioral blending are discussed in Saffiotti, Konolige, and Ruspini 1995. In this approach, each behavior is assigned a desirability function that can be combined using specialized logical operators called t-norms, a form of multivalued logic. A single action is chosen from a set of actions created by blending the weighted primitive action behaviors. As this is essentially a variation of fuzzy control, where blending is the fuzzification operation and selection is the defuzzification process, the method for generating these results is deferred until chapter 8.
It is also possible that the deformation manifolds approach described for obstacle avoidance in section 3.3.2 could be extended to behavioral fusion as the approach potentially can provide a common representation for robotic action. So far, however, it has been limited to arbitration.

3.4.4 Behavioral Assemblages
Behavioral assemblages are the packages from which behavior-based robotic systems are constructed. An assemblage is recursively defined as a coordinated collection of primitive behaviors or assemblages. Each individual assemblage consists of a coordination operator and any number of behavioral components (primitives or assemblages). The power of assemblage use arises from the notion of abstraction, in which we can create higher-level behaviors from simpler ones and refer to them henceforward independent of their constituent elements.

Abstraction has traditionally been a powerful concept in AI (Sacerdoti 1974), and it is no less so here. Abstraction enables us to reuse assemblages in an easy, modular manner to construct behavior-based systems; provides the ability to reason over them for use in hybrid architectures (chapter 6); and provides coarser levels of granularity for adaptation and learning methods to be applied (chapter 8).

Figure 3.24 (continued) Behavioral fusion: (A) goal attraction dominates; (B) obstacle avoidance dominates.
We have previously used the notation $C(G * B(S)) = \rho$, in section 3.4.2. We denote an assemblage $q_i$ such that $q_i$ symbolizes $C(G * B(S))$, hence $q_i = \rho$. For convenience, we will generally omit the $\rho$ in assemblage expression, leaving $q_i$, which maps conveniently onto a state within an FSA diagram. Although assemblages can be expressed in any of the notational formats section 3.4.2 describes, whenever there is a temporal component (i.e., the behavioral structure of the system changes over time), FSA notation is used most often.

Figure 3.12 has already shown an assemblage of assemblages in FSA form for a competition robot. The constituent behaviors for two of that robot's assemblage states are

- move-to-pole
    - move-to-goal(detect-pole)
    - avoid-static-obstacle(detect-obstacles)
    - noise(generate-direction), low gain
- wander
    - probe(detect-open-area)
    - avoid-static-obstacle(detect-obstacles)
    - noise(generate-direction), high gain
    - avoid-past(detect-visited-areas)
Each of these FSA states can be equivalently depicted as an SR diagram, an example of which appears in figure 3.25. All of these assemblages can, in turn, be bundled together in a high-level SR diagram if desired (figure 3.26). Using this alternate representation, which has merely a sequencer label for the coordination function, the explicit temporal dependencies between states (i.e., the state transition function q associated with the perceptual triggers) are lost when compared to the earlier FSA version (figure 3.12).

Another form of assemblage construction, referred to as "hierarchically mediated behaviors," appears in Kaelbling 1986. A similar hierarchical abstraction capability also appears in teleo-reactive systems (Nilsson 1994). In RS (Lyons and Arbib 1989), an assemblage is defined as a network of schemas that can be viewed as a schema itself. In all these cases, behavioral aggregations (or assemblages) can be viewed recursively (or alternatively, hierarchically) as behaviors themselves.

Assemblages, defined as hierarchical recursive behavioral abstractions, constitute the primary building blocks of behavior-based robotic systems and are ultimately grounded in primitive behaviors attached to sensors and actuators.
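The recursive definition of an assemblage (a coordination operator plus components that are themselves primitives or assemblages) maps naturally onto a small data structure. The sketch below is one possible rendering under stated assumptions: the class names, the two-dimensional response tuples, and the `largest` operator standing in for the top-level sequencer are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union

Vector = Tuple[float, float]


@dataclass
class Primitive:
    """A primitive behavior: maps stimuli to a response vector."""
    name: str
    respond: Callable[[dict], Vector]

    def __call__(self, stimuli: dict) -> Vector:
        return self.respond(stimuli)


@dataclass
class Assemblage:
    """A coordination operator plus components (primitives or assemblages)."""
    name: str
    coordinate: Callable[[List[Vector]], Vector]
    components: List[Union[Primitive, "Assemblage"]]

    def __call__(self, stimuli: dict) -> Vector:
        # An assemblage is used exactly like a behavior, so it can be nested.
        return self.coordinate([c(stimuli) for c in self.components])


def vector_sum(responses: List[Vector]) -> Vector:
    """Cooperative operator: add the component responses."""
    return (sum(r[0] for r in responses), sum(r[1] for r in responses))


def largest(responses: List[Vector]) -> Vector:
    """Competitive stand-in for a sequencer: keep the largest response."""
    return max(responses, key=lambda r: r[0] ** 2 + r[1] ** 2)


move_to_pole = Assemblage("move-to-pole", vector_sum, [
    Primitive("move-to-goal", lambda s: s["pole"]),
    Primitive("avoid-static-obstacle", lambda s: s["obstacle"]),
])
wander = Assemblage("wander", vector_sum, [
    Primitive("noise", lambda s: s["noise"]),
])
competition = Assemblage("competition", largest, [move_to_pole, wander])

if __name__ == "__main__":
    stimuli = {"pole": (1.0, 0.0), "obstacle": (0.0, 0.5), "noise": (0.2, 0.2)}
    print(competition(stimuli))
```

Because `Assemblage` and `Primitive` are both callable on the same stimuli, the top-level `competition` object treats its two sub-assemblages as ordinary behaviors. A temporal sequencer driven by perceptual triggers, as in the FSA of figure 3.12, would replace `largest` in a fuller treatment.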
Figure 3.25 SR diagram for the move-to-pole and wander assemblages.

Figure 3.26 Competition SR diagram.
3.5 CHAPTER SUMMARY

- Robotic behaviors generate a motor response from a given perceptual stimulus.
- Purely reactive systems avoid the use of explicit representational knowledge.
- Behavior-based systems are inherently modular in design and provide the ability for software reuse.
- Biological models often serve as the basis for the design of behavior-based robotic systems.
- Three design paradigms for building behavior-based systems have been presented: ethologically guided/constrained design, using biological models as the basis for behavioral selection, design, and validation; situated activity, which creates behaviors that fit specific situational contexts in which the robot will need to respond; and experimentally driven design, which uses a bottom-up design strategy based on the need for additional competency as the system is being built.
- The expression of behaviors can be accomplished in several different ways: SR diagrams, which intuitively convey the flow of control within a behavior-based system; functional notation, which is amenable to the generation of code for implementation; and FSA diagrams, which are particularly well suited for representing behavioral assemblages' time-varying composition.
- Other, more formal methods have been developed for expressing behaviors, such as RS and situated automata as epitomized by the Gapps language.
- Behaviors can be represented as triples $(S, R, \beta)$, with $S$ being the stimulus domain, $R$ the range of response, and $\beta$ the behavioral mapping between them.
- The presence of a stimulus is necessary but not sufficient to evoke a motor response in a behavior-based robot. Only when the stimulus exceeds some threshold value $\tau$ will it produce a response.
- $g_i$, a strength multiplier or gain value, can be used to turn off behaviors or increase the response's relative strength.
- Responses are encoded in two forms: discrete encoding, in which an enumerable set of responses exists; or continuous functional encoding, in which an infinite space of responses is possible for a behavior.
- Rule-based methods are often used for discrete encoding strategies.
- Approaches based on the potential fields method are often used for the continuous functional encoding of robotic response.
- There is nothing magical about emergent behavior; it is a product of the complexity of the relationship between a robotic agent and the real world that resists analytical modeling.
- Notational methods for describing assemblages and coordination functions used throughout the text have been presented.
- The two primary mechanisms for behavioral coordination are competitive or cooperative, but they can be combined if desired.
- Competitive methods result in the selection of the output of a single behavior, typically either by arbitration or action-selection.
- Cooperative methods often use superpositioning of forces or gradients generated from field-based methods, including potential field-based approaches or navigational templates.
- Assemblages are recursively defined aggregations of behaviors or other assemblages.
- Assemblages serve as important abstractions useful for constructing behavior-based robots.
Chapter 4
Behavior-Based Architectures

One can expect the human race to continue attempting systems just within or just beyond our reach; and software systems are perhaps the most intricate and complex of man's handiworks. The management of this complex craft will demand our best use of new languages and systems, our best adaptation of proven engineering management methods, liberal doses of common sense, and a God-given humility to recognize our fallibility and limitations.
- Frederick P. Brooks, Jr.

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare

Chapter Objectives

1. To characterize what a robot architecture is.
2. To understand the requirements for the design of a behavior-based robotic architecture.
3. To understand, in depth, two different reactive robotic architectures: subsumption and motor schema.
4. To review many of the other behavior-based architectural choices available to the robot system builder.
5. To develop design principles for the construction of a behavior-based robotic architecture.
4.1 WHAT IS A ROBOTIC ARCHITECTURE?

In chapter 3, we learned about robotic behaviors, including methods for expressing, encoding, and coordinating them. To design and build behavior-based robotic systems, commitments need to be made to the actual, specific methods to be used during this process. This need leads us to the study of robotic architectures: software systems and specifications that provide languages and tools for the construction of behavior-based systems.

All of the architectures described in this chapter are concerned with behavioral control. As described in chapter 1, several non-behavior-based robotic architectures appeared before the advent of reactive control, for example, NASREM (section 1.3.1). Here, however, we focus on behavior-based systems. Though considerably varied, these architectures share many common features:

- emphasis on the importance of coupling sensing and action tightly
- avoidance of representational symbolic knowledge
- decomposition into contextually meaningful units (behaviors or situation-action pairs)

Although these architectures share a common philosophy on the surface, there are many deep distinctions between them, including

- the granularity of behavioral decomposition
- the basis for behavior specification (ethological, situated activity, or experimental)
- the response encoding method (e.g., discrete or continuous)
- the coordination methods used (e.g., competitive versus cooperative)
- the programming methods, language support available, and the extent of software reusability.

In this chapter we study several common robotic architectures used to build behavior-based systems. Tables appear throughout summarizing the characteristics for each of the behavior-based architectures discussed. Two of the architectures have been singled out for closer scrutiny than the others: the subsumption architecture using rule-based encodings and priority-based arbitration; and motor schemas using continuous encoding and cooperative combination of vectors.
4.1.1 Definitions

Perhaps a good place to begin searching for our definition of robotic architectures would be with the definition of computer architectures. Stone (1980, p. 3), one of the best known computer architects, uses the following definition: "Computer architecture is the discipline devoted to the design of highly specific and individual computers from a collection of common building blocks."

Robotic architectures are essentially the same. In our robotic control context, however, architecture usually refers to a software architecture, rather than the hardware side of the system. So if we modify Stone's definition accordingly, we get:

Robotic architecture is the discipline devoted to the design of highly specific and individual robots from a collection of common software building blocks.

How does this definition coincide with other working definitions by practicing robotic architects? According to Hayes-Roth (1995, p. 330), an architecture refers to ". . . the abstract design of a class of agents: the set of structural components in which perception, reasoning, and action occur; the specific functionality and interface of each component, and the interconnection topology between components." Although her discussion of agent architectures is targeted for artificially intelligent systems in general, it also holds for the subclass with which we are concerned, namely, behavior-based robotic systems. Indeed, a surveillance mobile robot system (figure 4.1) has been developed that embodies her architectural design principles (Hayes-Roth et al. 1995). She argues that architectures must be produced to fit specific operating environments, a concept closely related to our earlier discussion of ecological niches in chapter 2 and related to the claim in McFarland and Bosser 1993 that robots should be tailored to fit particular niches.

Mataric (1992a) provides another definition, stating, "An architecture provides a principled way of organizing a control system. However, in addition to providing structure, it imposes constraints on the way the control problem can be solved." One final straightforward definition is from Dean and Wellman (1991, p. 462): "An architecture describes a set of architectural components and how they interact."
Figure 4.1 Nomad 200 robot, of the type used for surveillance at Stanford. (Photograph courtesy of Nomadic Technologies Inc., Mountain View, California.)
4.1.2 Computability
Existing robotic architectures are diverse, from the hierarchical NASREM architecture to purely reactive systems such as subsumption (section 4.3) to hybrid architectures (chapter 6). In what ways can instances chosen from the diversity of architectural solutions be said to differ from one another? In what ways can they be said to be the same? The answer to these questions is related to the distinction between computability and organizing principles. Architectures are constructed from components, with each specific architecture having its own peculiar set of building blocks. The ways in which these building blocks can be connected facilitate certain types of robotic design in given circumstances. Organizing principles underlie a particular architecture's commitment to its component structure, granularity, and connectivity.
From a computational perspective, however, we may see that the various architectures are all equivalent in their computational expressiveness. Consider, for instance, the differences between programming languages. Different choices are available to the programmer ranging from machine language to assembler to various high-level languages (such as Fortran, Cobol, C, Pascal, and LISP) to very high-level languages such as those used in visual programming. Is there any fundamental incompatibility in the idea that one language can do something that another cannot?
Consider the results that Bohm and Jacopini (1966) derived concerning computability in programming languages. They proved that if any language contains the three basic constructs of sequencing, conditional branching, and iteration, it can compute the entire class of computable functions (i.e., it is Turing equivalent). This essentially states that from a computational perspective the common programming languages have no differences. The logical extension is that since all robotic architectures provide the capability to perform tasks sequentially, allow conditional branching, and provide the ability for iterative constructs, these architectures are computationally equivalent. All behavior-based robotic architectures are essentially software languages or frameworks for specifying and controlling robots. The level of abstraction they offer may differ, but not the computability.
This does not mean, of course, that we will start writing AI programs in Cobol. It does mean that each current programming language has in turn found a niche in which it serves well and thus has survived (i.e., remained in usage) because it is well suited for that particular task. Some argue that the same holds for robotic architectures as well: each serves a particular domain (or niche) and will be subjected to the same environmental stresses for survival as are computer architectures or software languages.
Behavior-based robotic systems serve best when the real world cannot be accurately characterized or modeled. Whenever engineering can remove uncertainty from the environment, purely behavior-based systems may not necessarily afford the best solution for the task involved and hierarchical architectures (chapter 1) may prove more suitable, as, for example, in factory floor operations where the environment can be altered to fit the robot's needs. More often than not, however, much as we try, we cannot remove uncertainty, unpredictability, and noise from the world. Behavior-based robotic architectures were developed in response to this difficulty and choose instead to deal with these issues from the onset, relying heavily on sensing without constructing potentially erroneous global world models. The more the world changes during execution, the more the resulting value of any plan generated a priori decreases, and the more unstable any representational knowledge stored ahead of time or gathered during execution and remembered becomes.
At a finer level, we will see that behavior-based architectures, because of their different means of expressing behaviors and the sets of coordination functions they afford, provide significant diversity to a robotic system's designer. Each approach has its own strengths and weaknesses in terms of what it is best at doing or where it is most appropriately applied. The remainder of this chapter discusses a variety of behavior-based robot architectural solutions. Not all are expected to withstand the test of time, and many will likely suffer a fate similar to that of early programming languages (e.g., ALGOL, SNOBOL) and fade off into obscurity. Ecological pressure from sources ranging from ease of use for the designer to generalizability to public opinion to exogenous factors (political, economic, etc.) will ultimately serve as the fundamental selection mechanism, not merely an academic's perspective on their elegance, simplicity, or utility.
As an aside, we note the recent controversy Penrose (1989, 1994) stirred up by claiming that no computer program can ever exhibit intelligence as, according to Penrose, intelligence must incorporate solutions to noncomputable problems as well as those that are computable. Interestingly, he does not dismiss the attainment of intelligence as utterly impossible in a device and presents a novel, but rather speculative, approach based on quantum mechanics (a microtubule architecture, if you will), rather than a computational approach, to achieve this goal. To say the least, his position has been strongly rebutted by many within the AI community and is often cursorily dismissed as rubbish. In the book's final chapter, we will revisit this issue of what intelligence means within the context of a robotic system and what we can or should expect from these systems. Suffice it to say, for now, that all the behavior-based architectures considered in this book are computational.
4.1.3 Evaluation Criteria
How can we measure an architecture's utility for a particular problem?
A list of desiderata for behavior-based architectures is compiled below.
• Support for parallelism: Behavior-based systems are inherently parallel in nature. What kind of support does the architecture provide for this capability?
• Hardware targetability: Hardware targetability really refers to two different things. The first regards how well an architecture can be mapped onto real robotic systems, that is, physical sensors and actuators. The second is concerned with the computational processing. Chip-level hardware implementations are often preferred over software from a performance perspective. What type of support is available to realize the architectural design in silicon (e.g., compilers for programmable logic arrays [Brooks 1987b])?
• Niche targetability: How well can the robot be tailored to fit its operating environment (Hayes-Roth 1995)? How can the relationships between robot and environment be expressed to ensure successful niche occupation?
• Support for modularity: What methods does an architecture provide for encapsulating behavioral abstractions? Modularity can be found at a variety of levels. By providing abstractions for use over a wide range of behavioral levels (primitives, assemblages, agents), an architecture makes a developer's task easier and facilitates software reuse (Mataric 1992a).
• Robustness: A strength of behavior-based systems is their ability to perform in the face of failing components (e.g., sensors, actuators, etc.) (Payton et al. 1992; Horswill 1993a; Ferrell 1994). What types of mechanisms does the architecture provide for such fault tolerance?
• Timeliness in development: What types of tools and development environments are available to work within the architectural framework? Is the architecture more of a philosophical approach, or does it provide specific tools and methods for generating real robotic systems?
• Run time flexibility: How can the control system be adjusted or reconfigured during execution? How easily is adaptation and learning introduced?
• Performance effectiveness: How well does the constructed robot perform its intended task(s)? This aspect also encompasses the notion of timeliness of execution, or how well the system can meet established real-time deadlines. In other instances, specific quantitative metrics can be applied for evaluation purposes within a specific task context (Balch and Arkin 1994). These may include such things as time to task completion, energy consumption, minimum travel, and so forth, or combinations thereof.
These widely ranging criteria can be used for evaluating the relative merits of many of the architectures described in the remainder of this chapter.
4.1.4 Organizing Principles
From the discussion in chapter 3, several different dimensions for distinguishing robotic architectures become apparent, including
• Different coordination strategies, of particular note, competitive (e.g., arbitration, action selection, voting) versus cooperative (e.g., superpositioning)
• Granularity of behavior: microbehaviors such as those found in situated activity-based systems (e.g., Pengi) or more general purpose task descriptions (e.g., RAPs)
• Encoding of behavioral response: discrete, that is, a prespecified set of possible responses (e.g., rule-based systems or DAMN), or continuous (e.g., potential field-based methods)
The remainder of this chapter first discusses two architectures in some detail: the subsumption architecture and motor schema-based systems (the reactive component of the Autonomous Robot Architecture (AuRA)). Next it reviews several other significant behavior-based architectures, although at a higher level. Finally it presents design principles for constructing a behavior-based robotic system, in an architecture-independent manner as much as possible.
4.2 A FORAGING EXAMPLE
To ground the following architectural discussions, let us consider a well-studied problem in robotic navigation: foraging. This task consists of a robot's moving away from a home base area looking for attractor objects. Typical applications might include looking for something lost or gathering items of value. Upon detecting the attractor, the robot moves toward it, picks it up and then returns it to the home base. It repeats this sequence of actions until it has returned all the attractors in the environment. This test domain has provided the basis for a wide range of results on both real robots (Balch et al. 1995; Mataric 1993a) and in simulation. Foraging also correlates well with ethological studies, especially in the case of ants (e.g., Goss et al. 1990). Several high-level behavioral requirements to accomplish this task include:
1. Wander: move through the world in search of an attractor
2. Acquire: move toward the attractor when detected
3. Retrieve: return the attractor to the home base once acquired
Figure 4.2 represents these higher-level assemblages. Each assemblage shown is manifested with different primitive behaviors and coordinated in different ways as we move from one architectural example to the next.
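The sequencing of the three assemblages can be sketched as a small finite state machine. The following Python fragment is only an illustration under stated assumptions: the event names (attractor_detected, attractor_grasped, attractor_lost, delivered_home) are hypothetical and are not taken from the figure, and the attractor_lost transition is an added convenience rather than something the text specifies.

```python
from enum import Enum, auto


class State(Enum):
    WANDER = auto()    # search the world for an attractor
    ACQUIRE = auto()   # move toward a detected attractor
    RETRIEVE = auto()  # carry the attractor back to the home base


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.WANDER, "attractor_detected"): State.ACQUIRE,
    (State.ACQUIRE, "attractor_grasped"): State.RETRIEVE,
    (State.ACQUIRE, "attractor_lost"): State.WANDER,    # assumed fallback
    (State.RETRIEVE, "delivered_home"): State.WANDER,
}


def step(state, event):
    """Return the next state; events with no matching transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.WANDER):
    """Feed a sequence of events through the machine and return the state trace."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace
```

Calling run(["attractor_detected", "attractor_grasped", "delivered_home"]) walks once around the Wander, Acquire, Retrieve cycle and ends back in the Wander state. Each architecture discussed later fills in these three states with different primitive behaviors and a different coordination mechanism.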
Figure 4.2 FSA diagram for foraging.
4.3 SUBSUMPTION ARCHITECTURE
Rodney Brooks developed the subsumption architecture in the mid-1980s at the Massachusetts Institute of Technology. His approach, a purely reactive behavior-based method, flew in the face of traditional AI research at the time. Brooks argued that the sense-plan-act paradigm used in some of the first autonomous robots such as Shakey (Nilsson 1984) was in fact detrimental to the construction of real working robots. He further argued that building world models and reasoning using explicit symbolic representational knowledge at best was an impediment to timely robotic response and at worst actually led robotics researchers in the wrong direction.
In his seminal paper, Brooks (1986) advocated the use of a layered control system, embodied by the subsumption architecture but layered along a different dimension than what traditional research was pursuing. Figure 4.3 shows the distinction, with the conventional sense-plan-act vertical model illustrated in (A) and the new horizontal decomposition in (B). (The orientation of the lines that separate the components determines vertical and horizontal.)
Much of the presentation and style of the subsumption approach is dogmatic. Tenets of this viewpoint include
• Complex behavior need not necessarily be the product of a complex control system.
• Intelligence is in the eye of the observer (Brooks 1991a).
• The world is its own best model (Brooks 1991a).
• Simplicity is a virtue.
• Robots should be cheap.
• Robustness in the presence of noisy or failing sensors is a design goal.
• Planning is just a way of avoiding figuring out what to do next (Brooks 1987a).
• All onboard computation is important.
• Systems should be built incrementally.
• No representation. No calibration. No complex computers. No high-bandwidth communication (Brooks 1989b).
Figure 4.3 (A) Sense-plan-act model. (B) Subsumption (reactive) model.
This was hard to swallow for many in the AI community (and in many cases, still is). Brooks lobbied long and hard for rethinking the way intelligent robots in particular, and intelligent systems in general, should be constructed. This stance changed the direction of autonomous robotics research. Although currently many in the AI community take a more tempered position regarding the role of deliberation and symbolic reasoning (chapter 6), Brooks has not to date disavowed in print any of these principles (1991a).
Let us now move to the specifics of the subsumption architecture. Table 4.1 is the first of many tables throughout this chapter that provide a snapshot view of the design characteristics of a particular architecture in light of the material discussed in chapter 3.
4.3.1 Behaviors in Subsumption
Task-achieving behaviors in the subsumption architecture are represented as separate layers. Individual layers work on individual goals concurrently and asynchronously. At the lowest level, each behavior is represented using an augmented finite state machine (AFSM) model (figure 4.4). The AFSM encapsulates a particular behavioral transformation function β_i. Stimulus or response signals can be suppressed or inhibited by other active behaviors. A reset input is also used to return the behavior to its start conditions. Each AFSM performs an action and is responsible for its own perception of the world. There is no global memory, bus, or clock. With this design, each behavioral layer can be mapped onto its own processor (Brooks 1987b). There are no central world models or global sensor representations. Required sensor inputs are channeled to the consuming behavior.

Table 4.1 Subsumption architecture
Name                      Subsumption architecture
Background                Well-known early reactive architecture
Precursors                Braitenberg 1984; Walter 1953; Ashby 1952
Principal design method   Experimental
Developer                 Rodney Brooks (MIT)
Response encoding         Predominantly discrete (rule based)
Coordination method       Competitive (priority-based arbitration via inhibition and suppression)
Programming method        Old method uses AFSMs; new method uses Behavior Language
Robots fielded            Allen, Genghis (hexapod), Squirt (very small), Toto, Seymour, Polly (tour guide), several others
References                Brooks 1986; Brooks 1990b; Horswill 1993a

Figure 4.4 Original AFSM as used within the subsumption architecture. (Diagram labels: input wires, reset, suppressor, behavioral module, inhibitor, output wires.)

Figure 4.5 AFSMs for a simple three-layered robot (Brooks 1987b). (Diagram labels: Back-out-of-Tight-Situations Layer, Explore Layer, Lost, Reverse, Collide, Clock, Go, Wander, Forward, Run Away, SENSORS, MOTORS, BRAKES.)

Figure 4.5 shows a simple robot with three behavioral layers. The system was implemented on a radio-controlled toy car (Brooks 1987b). The lowest behavior layer, avoid-objects, either halts or turns away from an obstacle, depending upon the input from the robot's infrared proximity sensors. The explore layer permits the robot to move in the absence of obstacles and cover large areas. The highest layer, back-out-of-tight-situations, enables the robot to reverse direction in particularly tight quarters where simpler avoidance and exploration behaviors fail to extricate the robot.
As can be seen, the initial subsumption language, requiring the specification of low-level AFSMs, was unwieldy for those not thoroughly schooled in its usage. Recognizing this problem, Brooks (1990a) developed the Behavior Language, which provides a new abstraction independent of the AFSMs themselves using a single rule set to encode each behavior. This high-level language is then compiled to the intermediate AFSM representation, which can then be further compiled to run on a range of target processors.
4.3.2 Coordination in Subsumption
The name subsumption arises from the coordination process used between the layered behaviors within the architecture. Complex actions subsume simpler behaviors. A priority hierarchy fixes the topology. The lower levels in the architecture have no awareness of higher levels. This provides the basis for incremental design. Higher-level competencies are added on top of an already working control system without any modification of those lower levels.
The older version of subsumption specified the behavioral layers as collections of AFSMs, whereas the newer version uses behavioral abstractions (in the form of rules) to encapsulate a robot's response to incoming sensor data. These abstractions are then compiled into the older AFSM form, but this step is transparent to the developer.
Coordination in subsumption has two primary mechanisms:
• Inhibition: used to prevent a signal being transmitted along an AFSM wire from reaching the actuators.
• Suppression: prevents the current signal from being transmitted and replaces that signal with the suppressing message.
Through these mechanisms, priority-based arbitration is enforced.
Subsumption permits communication between layers but restricts it heavily. The allowable mechanisms have the following characteristics:
• low baud rate, no handshaking
• message passing via machine registers
• output of lower layer accessible for reading by higher level
• inhibition prevents transmission
• suppression replaces message with suppressing message
• reset signal restores behavior to original state
The world itself serves as the primary medium of communication. Actions taken by one behavior result in changes within the world and the robot's relationship to it. New perceptions of those changes communicate those results to the other behaviors.
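The two coordination mechanisms can be made concrete with a minimal Python sketch. It is an illustration only, not Brooks's implementation: messages are modeled as plain strings, None stands for an empty wire, timing is ignored, and the three-layer wiring in motor_command (with the hypothetical names layer0, layer1, layer2) is an assumption chosen to show priority, not the circuit of figure 4.5.

```python
from typing import Optional

Message = Optional[str]  # None models "nothing on the wire"


def inhibit(signal: Message, inhibitor_active: bool) -> Message:
    """Inhibition: while the inhibitor is active, the signal is blocked entirely."""
    return None if inhibitor_active else signal


def suppress(signal: Message, suppressing: Message) -> Message:
    """Suppression: a suppressing message, if present, replaces the signal."""
    return signal if suppressing is None else suppressing


def motor_command(layer0: Message, layer1: Message, layer2: Message) -> Message:
    """Hypothetical three-layer wiring: each higher layer may suppress the one below."""
    command = suppress(layer0, layer1)
    command = suppress(command, layer2)
    return command
```

With this wiring, motor_command("halt", None, None) passes the lowest layer's output through unchanged, while motor_command("halt", "forward", "reverse") yields the highest layer's message. The lower layers never inspect the higher ones, which is what allows a new layer to be added without modifying the existing ones.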
4.3.3 Design in Subsumption-Based Reactive Systems
The key aspects for design of subsumption-style robots are situatedness and embodiment (Brooks 1991b). Situatedness refers to the robot's ability to sense its current surroundings and avoid the use of abstract representations, and embodiment insists that the robots be physical creatures and thus experience the world directly rather than through simulation. Mataric 1992a presents heuristics for the design and development of this type of robot for a specific task. The basic procedure outlined is as follows:
1. Qualitatively specify the behavior needed for the task, that is, describe the overall way the robot responds to the world.
2. Decompose and specify the robot's independent behaviors as a set of observable disjoint actions by decomposing the qualitative behavior specified in step 1.
3. Determine the behavioral granularity (i.e., bound the decomposition process) and ground the resulting low-level behaviors onto sensors and actuators.
An additional guideline regarding response encoding recommends the use of small motions rather than large ballistic ones by resorting to frequent sensing. Finally, coordination is imposed by initially establishing tentative priorities for the behaviors and then modifying and verifying them experimentally.
Let us now review the example of experimentally driven subsumption-style design (Brooks 1989a) previously mentioned in section 3.1.3. The target robot is a six-legged walking machine named Genghis (figure 3.6). Its high-level behavioral performance is to be capable of walking over rough terrain and to have the ability to follow a human. This constitutes the qualitative description of task-level performance mentioned in step 1 above. The next step, involving behavioral decomposition, must now be performed. Each of the following behavioral layers was implemented, tested, and debugged in turn:
1. Standup: Clearly, before the robot can walk, it needs to lift its body off of the ground. Further decomposition leads to the development of two AFSMs, one to control the leg's swing position and the other its lift. When all six legs operate under the standup behavior, the robot assumes a stance from which it can begin walking.
2. Simple walk: This requires that the leg be lifted off the ground and swung forward (advance). A variety of sensor data is used to coordinate the motion between legs, including encoders returning the position of each leg's joints. When appropriately coordinated, a simple walk over smooth terrain (tripod gait) is achieved.
3. Force balancing: Now the issues concerning rough terrain are confronted. Force sensors are added to the legs, providing active compliance to changes in the ground's contour.
4. Leg lifting: This helps with stepping over obstacles. When required, the leg can lift itself much higher than normal to step over obstacles.
5. Whiskers: These sensors are added to anticipate the presence of an obstacle rather than waiting for a collision. This capability emerges as important through experiments with the previous behaviors.
6. Pitch stabilization: Further experiments show that the robot tends to bump into the ground either fore or aft (pitching). An inclinometer is added to measure the robot's pitch and use it to compensate and prevent bumping. Now the robot's walking capabilities are complete.
7. Prowling: The walking robot is now concerned with moving toward a detected human. The infrared sensors are tied in. When no person is present, walking is suppressed. As soon as someone steps in front of the robot, the suppression stops and walking begins.
8. Steered prowling: The final behavior allows the robot to turn toward the person in front of it and follow him. The difference in readings between two IR sensors is used to provide the stimulus, and the swing endpoints for the legs on each side of the robot are determined by the difference in strength.
The completed robot, satisfying the task criteria established for it, consists of fifty-seven AFSMs built in an incremental manner. Each layer has been tested experimentally before moving onto the next, and the results of those tests have established the need for additional layers (e.g., whiskers and pitch stabilization).
4.3.4 Foraging Example
The foraging example presented earlier also illustrates subsumption-based design. In particular, the robots Mataric (1993a) constructed for several tasks including foraging provide the basis for this discussion. The robots are programmed in the Behavior Language. The target hardware is an IS Robotics system (figure 4.6). Each behavior in the system is encoded as a set of rules (standard for the Behavior Language). The overall system has actually been developed as a multiagent robotic system (chapter 9), but for now we will restrict this discussion to a single robot foraging. The following behaviors are involved:
• Wandering: move in a random direction for some time.
• Avoiding:
  • turn to the right if the obstacle is on the left, then go.
  • turn to the left if the obstacle is on the right, then go.
  • after three attempts, back up and turn.
  • if an obstacle is present on both sides, randomly turn and back up.
• Pickup: Turn toward the sensed attractor and go forward. If at the attractor, close gripper.
• Homing: Turn toward the home base and go forward, otherwise if at home, stop.

Figure 4.6 Subsumption-based foraging robot: R1. (Photograph courtesy of IS Robotics, Somerville, MA.)

Figure 4.7 SR diagram for subsumption-based foraging robot.

Figure 4.7 illustrates the SR diagram for this set of behaviors. Priority-based arbitration is the coordination mechanism, and the robot is executing only one behavioral rule at any time. Note in particular that when the robot senses the attractor, wandering is suppressed and when the attractor is grabbed, homing then suppresses pickup (allowing the robot to ignore the potential distraction of other attractors it might encounter along the way).
4.3.5 Evaluation
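Looking back at the foraging controller of section 4.3.4, its priority-based arbitration can be sketched in a few lines of Python, which also gives a concrete picture of the fixed priority ordering that this evaluation refers to. This is a hedged illustration, not Mataric's Behavior Language code: the percept keys (obstacle, holding_attractor, attractor_sensed) are hypothetical, and placing avoiding at the top of the ordering is an assumption, since the text states only that pickup suppresses wandering and homing suppresses pickup.

```python
# Behaviors listed from highest to lowest priority. Each entry pairs a
# behavior name with a predicate over the current percepts (hypothetical keys).
PRIORITY = [
    ("avoiding", lambda p: p.get("obstacle", False)),            # assumed top priority
    ("homing", lambda p: p.get("holding_attractor", False)),     # suppresses pickup
    ("pickup", lambda p: p.get("attractor_sensed", False)),      # suppresses wandering
    ("wandering", lambda p: True),                               # default behavior
]


def arbitrate(percepts):
    """Return the name of the single behavior allowed to act this cycle."""
    for name, applicable in PRIORITY:
        if applicable(percepts):
            return name
    return None  # unreachable while the default entry is present
```

For example, arbitrate({"attractor_sensed": True, "holding_attractor": True}) selects homing, so a second attractor seen on the way home does not distract the robot. Changing the robot's behavior at run time would mean editing this fixed list, which is the kind of rigidity the run time flexibility criterion probes.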
When the criteria presented in section 4.1.3 are applied to evaluate the subsumption architecture, they identify the following strengths:
• Hardware retargetability: Subsumption can compile down directly onto programmable-array logic circuitry (Brooks 1987b).
• Support for parallelism: Each behavioral layer can run independently and asynchronously.
• Niche targetability: Custom behaviors can be created for specific task-environment pairs.
The following characteristics emerge as neither strengths nor weaknesses:
• Robustness: This can be successfully engineered into these systems but is often hard-wired (Ferrell 1994) and hence hard to implement.
• Timeliness for development: Some support tools exist for these systems, but a significant learning curve is still associated with custom behavioral design. Experimental design, involving trial-and-error development, can slow development. Also, consistent with Brooks' philosophy, simulators are not used to pretest behavioral efficiency.
Under the criteria, the following show up as weaknesses:
• Run time flexibility: The priority-based coordination mechanism, the ad hoc flavor of behavior generation, and the architecture's hard-wired aspects limit the ways the system can be adapted during execution.
• Support for modularity: Although behavioral reuse is possible through the Behavior Language, it is not widely evidenced in constructed robots. Subsumption has also been criticized on the basis that since upper layers interfere with lower ones, they cannot be designed completely independently (Hartley and Pipitone 1991). Also behaviors cannot always be prioritized (nor should they be), leading to artificial arbitration schemes (Hartley and Pipitone 1991). Commitment to subsumption as the sole coordination mechanism is restrictive.
4.3.6 Subsumption Robots
Many different robots (figure 4.8) have been constructed using the subsumption architecture. Brooks 1990b reviews many of them. They include
• Allen: the first subsumption-based robot, which used sonar for navigation based on the ideas in Brooks 1986.
• Tom and Jerry: two small toy cars equipped with infrared proximity sensors (Brooks 1990b).
• Genghis and Attila: six-legged hexapods capable of autonomous walking (Brooks 1989a).
• Squirt: a two-ounce robot that responds to light (Flynn et al. 1989).
• Toto: the first map-constructing, subsumption-based robot and the first to use the Behavior Language (Mataric 1992b).
• Seymour: a visual motion-tracking robot (Brooks and Flynn 1989).
• Tito: a robot with stereo navigational capabilities (Sarachik 1989).
• Polly: a robotic tour guide for the MIT AI lab (Horswill 1993b).
• Cog: a robot modeled as a humanoid from the waist up, and used to test theories of robot-human interaction and computer vision (Brooks and Stein 1994).
[Figure 4.8: photograph of robots built using the subsumption architecture. Legible labels include Herbert, Genghis, Attila, Squirt, Photovore, and Tom and Jerry.]
4.4 MOTORSCHEMAS Another approach, more strongly motivatedby the biological sciences , appeared on the heels of the subsumptionarchitecture. This behavior-based methodusedschematheory, which we reviewedin chapter2. We recall from that review that schematheoryprovidesthe following capabilitiesforspecifying and designingbehavior-basedsystems(adaptedfrom Arbib 1992) : . Schematheoryexplainsmotor behaviorin termsof the concurrentcontrol of manydifferent activities. . A schemastoresboth how to reactandthe way that reactioncanbe realized. . Schematheoryis a distributedmodelof computation. . Schematheoryprovidesa languagefor connectingactionandperception. . Activation levelsare associatedwith schemasthat determinetheir readiness or applicability for acting.
142
Chapter4
. Schematheory providesa theory of learningthroughboth schemaacquisition and schematuning. . Schematheory is useful for explaining the brain' s functioning as well as disbibutedAI applications(suchasbehavior-basedrobotics). Schematheory is an attemptto accountfor the commonalitiesin both neuro. biological and artificial behavior, and Arkin choseit as a suitablevehicle to implementrobotic behavior. the implicationsof schematheoryfor Arkin (1989a , 1993) addressed , 1990a autonomousrobotics: 1. Schemasprovidelargegrain modularity, in contrastto neuralnetworkmodels , for expressingthe relationshipsbetweenmotor control andperception. 2. Schemasact concurrentlyas individual disbibutedagentsin a cooperative yet competingmannerandthusarereadily mappableonto disbibutedprocessing architectures . 3. Schemasprovide a set of behavioralprimitives by which more complex . behaviors(assemblages ) canbe constructed 4. Cognitive and neuroscientificsupportexists for the underpinningsof this , as additionalneuroscientific approach.Thesecan be modified, if appropriate or cognitivemodelsbecomeavailable. -basedroboticsis to providebehavioralprimitives The overallmethodof schema that can act in a disbibuted, parallel mannerto yield intelligent robotic actionin responseto environmentalstimuli. Lyonshasalsousedschematheory in his research(sections3.2.4.1 and 6.6.3), but herewe focus on the methodology Arkin adopted. es in several The motor schemamethoddiffers from otherbehavioralapproach : significantways . Behavioralresponsesareall representedin a singleuniform format: vectors generatedusing a potentialfields approach(a continuousresponseencoding). . Coordinationis achievedthroughcooperativemeansby vectoraddition. . No predefinedhierarchyexists for coordination; instead, the behaviorsare ' configuredat run-time basedon the robot s intentions, capabilities, and environmental constraints. 
Schemascan be instantiatedor deinstantiatedat any time basedon perceptualevents, hencethe structureis more of a dynamically changingnetworkthan a layeredarchitecture. . Purearbitrationis not used; instead, eachbehaviorcanconbibutein varying ' . The relativestrengthsof the behaviors degreesto the robot s overallresponse ' . (G) determinethe robot s overall response
143
-Based Behavior Architecture~ Table4.2 Motorschema .~ Name
Motor Schemas
Background Precursors
Reactivecomponentof AuRA Architecture Arbib 1981; Khatib 1985
Principal designmethod
Ethologicallyguided RonaldArkin (GeorgiaTech)
Developer Responseencoding Coordination method Programming method Robots fielded References
Continuoususingpotentialfield analog Cooperativevia vectorsummationand normalization Parameterized behaviorallibraries MARV, George, Ren and Stimpy, Buzz, blizzards, mobile manipulator, others Arkin 1987a; Arkin 1989b; Arkin 1992a
. Perceptualuncertaintycanbe reflectedin the behavior's responseby allowing it to serveasan input within the behavioralcomputation. Table4.2 summarizestheimportantaspectsof this architecture.The remainder of this sectionstudiesthe detailsof its implementation. 4.4.1
Schema -Based Behaviors
Motor schema behaviors are relatively large grain abstractions reusable over a wide range of circumstances. Many of the behaviors have internal parameters that provide additional flexibility in their deployment. The behaviors generally are analogous to animal behaviors (section 2.4), at least those useful for navigational tasks.
A perceptual schema is embedded within each motor schema. These perceptual schemas provide the environmental information specific for that particular behavior. Perception is conducted on a need-to-know basis: individual perceptual algorithms provide the information necessary for a particular behavior to react. Chapter 7 details this sensing paradigm, referred to as action-oriented perception. Suffice it to say for now that attached to each motor schema is a perceptual process capable of providing suitable stimuli, if present, as rapidly as possible. Perceptual schemas are recursively defined, that is, perceptual subschemas can extract pieces of information that are subsequently processed by another perceptual schema into a more behaviorally meaningful unit. An example might involve recognition of a person with more than one sensor: Infrared sensors can provide a heat signature, whereas computer vision may provide a human shape. The information generated from each of these lower-level perceptual processes would be merged into a higher-level interpretation before passing it on to the motor behavior. This enables the use of multiple sensors within the context of a single sensorimotor behavior. Figure 4.9 illustrates this relationship.

Figure 4.9 Perception-action schema relationships. (Key: PS, perceptual schema; PSS, perceptual subschema; MS, motor schema; ES, environmental sensor.)

Each motor schema has as output an action vector (consisting of both orientation and magnitude components) that defines the way the robot should move in response to the perceived stimuli. This approach has been used for ground-based navigation where each vector is two dimensional (Arkin 1989b), for generating three-dimensional vectors for use in flying or underwater navigation (Arkin 1992a), and for use in mobile manipulators with many additional degrees of freedom (Cameron et al. 1993).
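Because every motor schema emits a vector, cooperative coordination reduces to a gain-weighted vector sum followed by normalization. The Python sketch below shows one way this could look for the two-dimensional case. It is an illustration under assumptions: the function names are hypothetical, and normalization is taken here to mean clipping the summed vector to a maximum speed, which is one plausible reading rather than a documented detail of the architecture.

```python
import math
from typing import Iterable, Tuple

Vector = Tuple[float, float]  # (x, y) components of an action vector


def from_polar(magnitude: float, direction: float) -> Vector:
    """Convert a (magnitude, orientation in radians) response to x, y components."""
    return (magnitude * math.cos(direction), magnitude * math.sin(direction))


def combine(outputs: Iterable[Tuple[float, Vector]], max_speed: float = 1.0) -> Vector:
    """Sum gain-weighted schema outputs, then clip the result to max_speed.

    `outputs` holds (gain, vector) pairs, one per active motor schema.
    """
    outputs = list(outputs)
    x = sum(gain * vx for gain, (vx, vy) in outputs)
    y = sum(gain * vy for gain, (vx, vy) in outputs)
    norm = math.hypot(x, y)
    if norm > max_speed:  # assumed normalization: preserve direction, cap magnitude
        x, y = x * max_speed / norm, y * max_speed / norm
    return (x, y)
```

A schema pulling east with gain 2 and output (3, 0), combined with one pulling north with gain 1 and output (0, 8), sums to (6, 8) and is clipped to roughly (0.6, 0.8) when max_speed is 1. No schema is switched off; raising or lowering a gain only changes how strongly that behavior shapes the result.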
Many different motor schemas have been defined, including

• Move-ahead: move in a particular compass direction.
• Move-to-goal: move towards a detected goal object. Two versions exist of this schema: ballistic and controlled.
• Avoid-static-obstacle: move away from passive or nonthreatening navigational barriers.
• Dodge: sidestep an approaching ballistic projectile.
• Escape: move away from the projected intercept point between the robot and an approaching predator.
• Stay-on-path: move toward the center of a path, road, or hallway. For three-dimensional navigation, this becomes the stay-in-channel schema.
• Noise: move in a random direction for a certain amount of time.
• Follow-the-leader: move to a particular location displaced somewhat from a possibly moving object. (The robot acts as if it is leashed invisibly to the moving object.)
• Probe: move toward open areas.
• Dock: approach an object from a particular direction.
• Avoid-past: move away from areas recently visited.
• Move-up, move-down, maintain-altitude: move upward or downward or follow an isocontour in rough terrain.
• Teleautonomy: allows a human operator to provide internal bias to the control system at the same level as another schema.

These are the basic building blocks for autonomous navigation within this architecture. Figure 4.10 depicts several of the schemas. Remember that although the entire field is illustrated, only a single vector needs to be computed at the robot's current location. This ensures extremely fast computation.

The actual encodings for several schemas appear below, where $V_{magnitude}$ denotes the magnitude of the resultant response vector and $V_{direction}$ represents its orientation:

• move-to-goal (ballistic):
$V_{magnitude}$ = fixed gain value.
$V_{direction}$ = towards perceived goal.

• avoid-static-obstacle:
$$V_{magnitude} = \begin{cases} 0 & \text{for } d > S \\ \dfrac{S - d}{S - R} \cdot G & \text{for } R < d \le S \\ \infty & \text{for } d \le R \end{cases}$$
where S = sphere of influence (radial extent of force from the center of the obstacle), R = radius of obstacle, G = gain, and d = distance of robot to center of obstacle.
$V_{direction}$ = radially along a line from robot to center of obstacle, directed away from the obstacle.

Figure 4.10 Representative schemas: (A) avoid-static-obstacle, (B) move-ahead (toward east), (C) move-to-goal, guarded (compare to ballistic version in figure 3.16 (B)), (D) noise, (E) stay-on-path, and (F) dodge.
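As a minimal sketch of the avoid-static-obstacle encoding, the following Python function returns the magnitude and direction of the response for a single obstacle. The function and argument names are hypothetical, and the linear falloff `(S - d) / (S - R)` between the obstacle surface and the edge of the sphere of influence is an assumption made for illustration; the symbols S, R, G, and d follow the definitions given in the text.

```python
import math


def avoid_static_obstacle(robot, obstacle, radius, sphere, gain):
    """Return (magnitude, direction) of the repulsion from one obstacle.

    robot, obstacle: (x, y) positions; radius = R, sphere = S, gain = G.
    direction is an angle in radians pointing away from the obstacle.
    """
    dx = robot[0] - obstacle[0]
    dy = robot[1] - obstacle[1]
    d = math.hypot(dx, dy)            # distance of robot to obstacle center
    direction = math.atan2(dy, dx)    # along the line from obstacle to robot
    if d > sphere:
        magnitude = 0.0               # outside the sphere of influence
    elif d <= radius:
        magnitude = math.inf          # inside the obstacle itself
    else:
        # Assumed linear falloff from G at the surface to 0 at distance S.
        magnitude = (sphere - d) / (sphere - radius) * gain
    return magnitude, direction
```

Only the robot's current position is evaluated, which matches the point made earlier that a single vector, not the whole field, needs to be computed on each cycle.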
• stay-on-path:
$$V_{magnitude} = \begin{cases} P & \text{for } d > W/2 \\ \dfrac{d}{W/2} \cdot G & \text{for } d \le W/2 \end{cases}$$
where W = width of path, P = off-path gain, G = on-path gain, and d = distance of robot to center of path.
$V_{direction}$ = along a line from robot to center of path, heading toward centerline.
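The stay-on-path encoding can be sketched in the same way. This is an illustrative sketch only: it assumes a straight path whose centerline is the x-axis, so that the robot's signed lateral offset is its y coordinate, and it assumes a linear ramp `d / (W/2)` inside the path. The function and argument names are hypothetical.

```python
import math


def stay_on_path(lateral_offset, width, off_path_gain, on_path_gain):
    """Return (magnitude, direction) pulling the robot toward the centerline.

    lateral_offset: signed distance from the centerline (assumed to be the
    x-axis); width = W, off_path_gain = P, on_path_gain = G.
    """
    d = abs(lateral_offset)
    half_width = width / 2.0
    if d > half_width:
        magnitude = off_path_gain                 # robot has left the path
    else:
        magnitude = d / half_width * on_path_gain  # assumed linear ramp
    # Head toward the centerline: downward if above it, upward if below.
    direction = -math.pi / 2.0 if lateral_offset > 0 else math.pi / 2.0
    return magnitude, direction
```

On the centerline itself the magnitude is zero, so the arbitrary direction returned there has no effect once the vectors are combined.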
• move-ahead:
$V_{magnitude}$ = fixed gain value.
$V_{direction}$ = specified compass direction.

• noise:
$V_{magnitude}$ = fixed gain value.
$V_{direction}$ = random direction changed every p time steps (p denotes persistence).

It can be seen that the actual response computations are very simple and fast.
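The persistence parameter of the noise schema is easiest to see in code. The sketch below is an assumption-laden illustration, not the original implementation: the class name, the seeded random generator, and the choice of radians in the range 0 to 2π are all hypothetical.

```python
import math
import random


class NoiseSchema:
    """Fixed-magnitude vector whose direction is redrawn every p steps."""

    def __init__(self, gain, persistence, seed=None):
        self.gain = gain                  # fixed gain value (magnitude)
        self.persistence = persistence    # p: steps a direction is held
        self._rng = random.Random(seed)   # seeded for repeatable runs
        self._steps_left = 0
        self._direction = 0.0

    def step(self):
        """Return (magnitude, direction) for the current time step."""
        if self._steps_left == 0:
            # Draw a new random heading and hold it for p time steps.
            self._direction = self._rng.uniform(0.0, 2.0 * math.pi)
            self._steps_left = self.persistence
        self._steps_left -= 1
        return self.gain, self._direction
```

A larger persistence produces longer straight excursions, whereas a persistence of 1 changes heading on every step.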
4.4.2 Schema-Based Coordination
The next issue is how coordination is accomplished with motor schemas. The answer is straightforward: vector summation. All active behaviors contribute to some degree to the robot's global motion. G, the gain vector, determines the relative composition for each behavior. The notion of schema gains is loosely aligned with the concept of activation levels mentioned previously. In a system where no learning or adaptation is permitted, the gain levels remain constant throughout the run. We will see in chapter 8 that this can be modified during execution to permit learning. Action-selection techniques, soon to be described in more detail, also reflect the notion of activation levels.
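A minimal sketch of gain-weighted vector summation follows. The function name, the dictionary-based interface, and the schema names in the usage example are hypothetical. Clipping the magnitude is used as the final normalization step, which is one simple choice among several.

```python
import math


def combine(outputs, gains, max_magnitude):
    """Sum gain-weighted schema outputs and clip the resulting vector.

    outputs: dict mapping schema name -> (vx, vy) output vector.
    gains: dict mapping schema name -> gain value.
    max_magnitude: largest magnitude the robot can execute.
    """
    vx = sum(gains[name] * v[0] for name, v in outputs.items())
    vy = sum(gains[name] * v[1] for name, v in outputs.items())
    magnitude = math.hypot(vx, vy)
    if magnitude > max_magnitude:
        # Clip the magnitude while preserving the direction.
        scale = max_magnitude / magnitude
        vx, vy = vx * scale, vy * scale
    return vx, vy


outputs = {"move_to_goal": (1.0, 0.0), "avoid_static_obstacle": (0.0, 1.0)}
gains = {"move_to_goal": 1.0, "avoid_static_obstacle": 0.5}
global_vector = combine(outputs, gains, max_magnitude=1.0)
```

Because every behavior contributes through the same sum, changing a single gain shifts the balance between behaviors without touching any schema's internal computation.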
Returning to coordination, each schema output vector is multiplied by its associated gain value and added to all other output vectors. The result is a single global vector. That vector must be normalized in some manner (often merely by clipping the magnitude) to ensure that it is executable on the robot. The resulting normalized global vector is sent to the robot for execution. This sense-react process is repeated as rapidly as possible. The schemas can operate asynchronously, each delivering its data as quickly as it can. Perceptual performance generally limits overall processing speed since the simple and distributed motor response computation is extremely rapid. Some care must be taken to ensure that the reaction produced is based on information still relevant (i.e., the perceptual data is not too old). Normally, the perceptual algorithms' action-oriented design ensures relatively prompt processing independent of the data source.

Figure 4.11 shows several different types of robot paths resulting from these methods. It is interesting to observe some of the biological parallels for these types of systems (figure 4.12).

Figure 4.11 Representative simulated robot paths: (A) stay-on-path, move-to-goal, avoid-static-obstacle, and noise schemas; (B) three-dimensional docking and avoid-static-obstacle; and (C) mobile manipulator moving to target object through obstacles (top view and front view).

Section 3.3.2 described certain problems endemic to the use of potential fields, in particular local minima and cyclic behavior. Schema-based systems are not immune to these problems, nor have the problems been ignored. One of the simplest methods to address them is through noise, injecting randomness into the behavioral system through the noise schema. Noise is a common technique used to deal with local minima in many gradient descent methods, for example, simulated annealing (Hinton and Sejnowski 1986) and mutation in genetic algorithms (Goldberg 1989), to name a few. In schema-based control, randomness works in several ways: in some cases it prevents the entry into local minima, acting as a sort of "reactive grease" (figure 4.13). In general, it is always useful to inject a small amount of noise into a schema-based system to help ensure progress.

Balch and Arkin (1993) developed another schema, avoid-past, to ensure progress is made even if the robot tends to stall. Avoid-past uses a short-term representation as it retains a time window into the past indicating where the robot has been recently. It is still reactive, however, since no path planning for the robot is ever conducted, and the output of this behavior is a vector of the same form as all the other behaviors and is combined in the same way. Repulsive forces are generated from recently visited areas that prevent the robot from stalling when not at its goal. This approach has proven very effective in even degenerate cases (figure 4.14). Additionally, chapter 8 will discuss several adaptive and learning techniques that have been applied to improve navigational performance.
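The idea behind avoid-past can be illustrated with a small sketch. This is not the formulation published by Balch and Arkin: the fixed-length memory of visited positions, the linear falloff of the repulsion, and all names below are assumptions chosen only to show how a short time window into the past can yield an ordinary output vector.

```python
import math
from collections import deque


class AvoidPast:
    """Repel the robot from positions it has visited recently."""

    def __init__(self, window, influence, gain):
        self.memory = deque(maxlen=window)  # short-term time window
        self.influence = influence          # radius of each repulsion
        self.gain = gain

    def record(self, position):
        """Remember a visited (x, y) position; the oldest entry is dropped."""
        self.memory.append(position)

    def output(self, position):
        """Return the summed (vx, vy) repulsion at the given position."""
        vx = vy = 0.0
        for px, py in self.memory:
            dx, dy = position[0] - px, position[1] - py
            d = math.hypot(dx, dy)
            if d == 0.0 or d >= self.influence:
                continue  # no defined direction, or out of range
            strength = self.gain * (self.influence - d) / self.influence
            vx += strength * dx / d
            vy += strength * dy / d
        return vx, vy
```

The output has the same form as any other schema vector, so it can be gain-weighted and summed with the rest, which is why the behavior remains reactive despite holding a little state.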
4.4.3 Design in Motor Schema-Based Systems

The design process for building a schema-based robotic system is typically as follows:

1. Characterize the problem domain in terms of the motor behaviors necessary to accomplish the task.
2. Decompose the motor behaviors to their most primitive level, using biological studies whenever feasible for guidelines.
3. Develop formulas to express the robot's reaction to perceived environmental events.
4. Conduct simple simulation studies assessing the desired behaviors' approximate performance in the proposed environment.
5. Determine the perceptual requirements needed to satisfy the inputs for each motor schema.
6. Design specific perceptual algorithms that extract the required data for each behavior, utilizing action-oriented perception, expectations, and focus-of-attention techniques to ensure computational efficiency (chapter 7).
7. Integrate the resulting control system onto the target robot.
8. Test and evaluate the system's performance.
9. Iterate and expand behavioral repertoire as necessary.

Behavioral software reuse greatly simplifies this process, because schemas developed using biological guidelines often have extensive utility in many different circumstances.

Figure 4.12 Biological parallels: (A) A school of anchovies being attacked by diving birds. (Photograph courtesy of Gary Bell.) (B) A herd of sheep in flight from their handlers. (Photograph courtesy of Temple Grandin, Colorado State University.)

Figure 4.13 Effect of noise on schema-based navigation: (A) Exact counterbalancing of forces causes robot to stall; (B) With noise added, the problem is prevented.
4.4.4 Foraging Example

Motor schemas have been used in several implementations of foraging systems. The first example, mirroring the FSA shown in figure 4.2, consists of three assemblages, each consisting of up to four behaviors (Arkin 1992b). The primitive behaviors are

• Avoid-static-obstacle: instantiated differently for either environmental obstacles or other robots.
• Move-to-goal: changes its attention from the attractor to the home base depending upon the state in which the robot finds itself.
• Noise: initially set at a high gain to ensure broad exploration of the area, then reduced greatly upon encountering an attractor.

Figure 4.15 depicts the behavioral configuration using an SR diagram. Subsequently, Balch et al. (1995) created more-complex foraging robots (figure 4.16) using a similar methodology for use in a robot competition. Chapter 9 discusses this multirobot implementation in more detail.

Figure 4.14 Avoid-past schema usage for overcoming local minima. Repulsive forces are generated at recently visited areas, preventing the robot from stalling: (A) without and (B) with avoid-past behavior; (C) deep box with avoid-past; (D) maze with avoid-past.

4.4.5 Evaluation

When evaluated using the criteria presented in section 4.1, motor schema-based robotic systems are found to have the following strengths:
• Support for parallelism: Schema theory is a distributed theory of computation involving multiple parallel processes. Motor schemas are naturally parallelizable.
• Run time flexibility: As schemas are software agents, instantiated at run time as processes and not hard-wired into the architecture, it is simple to reconfigure the behavioral control system at any time.
• Timeliness for development and support for modularity: Schemas are essentially software objects and are by definition modular. They can be stored in behavioral libraries and are easily reused (Mackenzie, Cameron, and Arkin 1995).

The following is found to be neither a strength nor a weakness:

• Robustness: As with any reactive system, schemas can cope well with change in the environment. One deficiency lies in the use of potential field analogs for behavioral combination, which has several well-known problems. Specific methods, however, such as the introduction of noise and the avoid-past behavior, have been developed to circumvent this difficulty.

The following weaknesses are identified under the evaluation criteria:

• Niche targetability: Although it is feasible to design niche robots, the generic modular nature of the primitive schemas somewhat discourages the design of very narrowly focused components.
• Hardware retargetability: Schema-based systems are essentially software architectures mappable onto hardware multiprocessor systems. They do not provide the hardware compilers that either subsumption or Gapps does. Hardware mappings are feasible, however (Collins, Arkin, and Henshaw 1993), just not as convenient as with some other systems.

Figure 4.16 Callisto, a foraging schema-based robot.

Figure 4.17 Several motor schema-based robots. First row (left to right): Io, Callisto, and Ganymede. Second row: Shannon and Sally. Third row: George, Ren, and Stimpy. Rear: GT Hummer.
4.4.6 Schema-Based Robots

The incremental development of the library of motor schemas can be traced through a series of fielded mobile robots (figure 4.17). Schemas from earlier robots were easily reused for newer machines as they became available.

• HARV: An early Denning mobile robot, named after the cocktail Harvey Wallbanger, which reflected its early behavior. This early robot was capable of a wide range of complex behaviors including:
  - Exploration: a combination of avoid-static-obstacle and noise.
  - Hall following: move-ahead in the direction of the hall coupled with avoid-static-obstacle; enabled safe navigation down a corridor.
  - Wall following or "drunken sailor" behavior: useful for going through doorways. A move-ahead schema pointing at an angle into the wall coupled with avoid-static-obstacle produced a behavior where the robot followed the wall and then passed through the first opening it found. This enabled the robot to complete nonspecific tasks such as "go down the hall and enter the first door on your right."
  - Impatient waiting: occurred while the robot waited for a door to open. It consisted of a small amount of noise and a move-to-goal behavior targeted immediately beyond a closed door, coupled with the avoid-static-obstacle schema. The robot would oscillate in its local minimum until the door opened. When the obstacle stimulus was no longer present because of the new opening, the robot moved through the doorway. This behavior is potentially useful for entering elevators, among other things. The behavior is also referred to as the fly-at-a-window behavior, since a fly at the glass is attracted towards the light yet repelled by the window, and the behavior has a noisy component (panic?) that is actually helpful to the fly in finding openings that it may not initially sense.
  - Indoor and outdoor navigation: demonstrated in several ways, including various combinations of the stay-on-path, avoid-static-obstacle, noise, move-ahead, and move-to-goal schemas.

• George: This Denning DRV-3, named after a fictitious Georgia Tech student, was the first robot to exhibit behavior-based docking (Arkin and Murphy 1990), teleautonomy (Arkin 1991), and avoid-past behaviors (Balch and Arkin 1993).
• Ren and Stimpy: A pair of Denning MRV-2 robots used for dodge, escape, forage, and multiagent behavioral research.
• Buzz: Used in the AAAI competition described in section 3.4.4 (Arkin et al. 1993).
• Io, Callisto, Ganymede: Three student-constructed small mobile robots for multiagent research and winners in a robot competition (Balch et al. 1995).
• Mobile manipulator: One of the MRV-2s fitted with a CRS+ robot arm (Cameron et al. 1993).
4.5 OTHER ARCHITECTURES

We now survey a representative sampling of the wide range of other behavior-based architectures that exist, highlighting each one's approach and contribution. All share a philosophy of commitment to sensing and action, elimination or reduction of symbolic representational knowledge, and the use of behavioral units as their primary building blocks. Each one's uniqueness arises from its choice of coordination mechanisms, the response encoding methods used, the behavioral granularity, and the design methodology employed.

4.5.1 Circuit Architecture

The circuit architecture is a hybridization of the principles of reactivity as typified by the subsumption architecture, the abstractions used in RCS (Barbera et al. 1984) and Shakey (Nilsson 1984), and the use of logical formalisms (Johnson 1983). Table 4.3 summarizes this approach. We discussed aspects of this architecture in section 3.2.4.2, in particular the role of logical formalisms and situated automata. One strength this approach provides involves the use of abstraction through the bundling of reactive behaviors into assemblages and by allowing arbitration to occur within each level of abstraction, that is, what the designers refer to as hierarchical mediation. Another advantage is the use of formal logic as a means for expressing the behaviors, permitting compilation into hardware and assisting with the verification of the performance of the resulting robotic system (Rosenschein and Kaelbling 1987). The motivations for this architecture, according to its designers, are typical for behavior-based systems in general: modularity, permitting incremental development; awareness, tightly coupling sensing to action; and robustness, being able to perform despite unanticipated circumstances or sensor failure.

Table 4.3 Circuit architecture
Name                      Circuit architecture
Background                Early reactive architecture
Precursors                Brooks 1986; Nilsson 1984; Barbera et al. 1984; Johnson 1983
Principal design method   Situated activity
Developers                L. Kaelbling and S. Rosenschein (SRI)
Response encoding         Discrete (rule based)
Coordination method       Hierarchical mediation (arbitration with abstraction)
Programming method        Rex and Gapps
Robots fielded            Flakey
References                Kaelbling 1986; Kaelbling and Rosenschein 1991
4.5.2 Action-Selection

Action-selection is an architectural approach developed by Pattie Maes in the late 1980s. It uses a dynamic mechanism for behavior selection. Instead of employing a predefined priority-based strategy typified by the subsumption approach, individual behaviors (competence modules) have associated activation levels that ultimately provide the basis for run-time arbitration. A competence module resembles a traditional AI robotic operator with preconditions, add lists, and delete lists. Additionally, an activation level is associated with the module that ultimately governs its applicability at any particular time by being above some threshold. The activation level for any particular module is affected by the current situation, higher level goals, spreading activation due to previous or potentially succeeding events in time, or inhibition from conflicting modules. Activation levels also decay over time. The module with the highest activation level is chosen for execution from the set of all modules whose preconditions are satisfied. The selection process is repeated as rapidly as possible as the world's circumstances change about the agent. Because there is no predefined layering of behaviors as in subsumption, it is harder in action-selection to predict the agent's global performance in a dynamic environment, and thus action-selection has a greater emergent quality. Several global parameters are used to tune the control system, all of which are related to the activation levels (e.g., activation threshold, amount of activation energy injected). An advantage of this strategy is flexibility and openness, as the system's responses are not hard-wired. As an agent's intentions can influence the activation parameters, higher level goals can also induce performance changes (see the Norman-Shallice model from chapter 6).

The action-selection approach also shares much in philosophy with schema theory (Arbib 1992), especially regarding the use of activation levels for controlling behavioral performance. Its primary perceived limitation is the lack of real implementations on actual robots; thus no evidence exists of how well the current competence module formats would perform in real world robotic tasks. Table 4.4 summarizes this architecture's characteristics.
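The selection step described above can be sketched as follows. This is a deliberately reduced illustration rather than Maes's algorithm: spreading activation, goals, and inhibition between modules are omitted, and the class, function, and module names are hypothetical. Only the threshold test, the precondition test, the choice of the most active module, and decay over time are shown.

```python
from dataclasses import dataclass


@dataclass
class CompetenceModule:
    name: str
    preconditions: frozenset   # propositions that must hold in the world
    activation: float = 0.0


def select(modules, state, threshold):
    """Pick the most active module that is executable and above threshold.

    state: set of propositions currently true. Returns None if no module
    qualifies on this cycle.
    """
    candidates = [
        m for m in modules
        if m.preconditions <= state and m.activation >= threshold
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.activation)


def decay(modules, rate):
    """Reduce every activation level by a fixed fraction per cycle."""
    for m in modules:
        m.activation *= (1.0 - rate)


modules = [
    CompetenceModule("grasp_can", frozenset({"can_visible", "gripper_empty"}), 0.9),
    CompetenceModule("wander", frozenset(), 0.4),
    CompetenceModule("dock", frozenset({"at_dock"}), 2.0),
]
chosen = select(modules, {"can_visible", "gripper_empty"}, threshold=0.3)
```

In this example the dock module has the highest activation but is not chosen, because its precondition does not hold in the current state; there is no fixed priority ordering among the modules.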
Table 4.4 Action-selection architecture
Name                      Action-selection
Background                Dynamic competition system
Precursors                Minsky 1986; Hillis 1988
Principal design method   Experimental
Developer                 Pattie Maes (MIT)
Response encoding         Discrete
Coordination method       Arbitration via action-selection
Programming method        Competence modules
Robots fielded            Simulations only
References                Maes 1990; Maes 1989

4.5.3 Colony Architecture

The colony architecture (Connell 1989b) is a direct descendant of the subsumption architecture that uses simpler coordination strategies (i.e., suppression only) and permits a more flexible specification of behavioral relations. The colony architecture permits a treelike ordering for behavioral priority as opposed to the total ordering of layered behaviors found in subsumption. A closer relationship to ethology was also developed, using models derived from animals such as the coastal snail to justify the use of pure suppression networks. The pinnacle of this architecture was Herbert, a robot designed to wander about the corridors of the MIT laboratory and retrieve soda cans for recycling using vision, ultrasound, and infrared proximity sensors. Table 4.5 summarizes the colony architecture's characteristics.

Table 4.5 Colony architecture
Name                      Colony architecture
Background                Descendant of subsumption architecture
Precursors                Minsky 1986; Brooks 1986
Principal design method   Ethologically guided/experimental
Developer                 John Connell (IBM)
Response encoding         Discrete rule based
Coordination method       Priority-based arbitration with suppression only
Programming method        Similar to subsumption
Robots fielded            Herbert (mobile manipulator), wheelchair
References
4.5.4 Animate Agent Architecture

The animate agent architecture adds two components to the RAPs discussed in section 3.1.3: a skills system and special-purpose problem-solving modules (Firby 1995). The skills provide continuous environmental response, typically using a NAT encoding (Slack 1990) (section 3.3.2), whereas RAPs provide an assemblage mechanism for bundling skills useful in particular situations. In a sense, RAPs are related to FSA states and can be used to sequence through a collection of skills over time (Firby and Slack 1995). Situations, however, are used to define the states (or context) and provide the overall design basis. Table 4.6 summarizes this architecture's key points.

Table 4.6 Animate agent architecture
Name                      Animate agent architecture
Background                RAP-based situated activity system
Precursors                Miller, Galanter, and Pribram 1960; Georgeff et al. 1986; Agre and Chapman 1987; Ullman 1985
Principal design method   Situated activity
Developer                 R. James Firby (University of Chicago)
Response encoding         NAT based (continuous)
Coordination method       Sequencing
Programming method        RAP language
Robots fielded            Chip (trash-cleaning robot)
References                Firby 1989, 1995; Firby and Slack 1995
4.5.5 DAMN

The Distributed Architecture for Mobile Navigation (DAMN) boasts a rather provocative name. Developed by Rosenblatt (1995), initially at the Hughes AI Center and subsequently at Carnegie-Mellon University, this behavior-based system has a unique coordination mechanism. The behaviors in DAMN, which was initially touted as a fine-grained alternative to the subsumption architecture (Rosenblatt and Payton 1989), are themselves asynchronous processes, each generating outputs as a collection of votes cast over a range of responses. The votes for each behavior can be cast in a variety of ways, including differing statistical distributions over a range of responses. The behavioral arbitration method, discussed in section 3.4.3, is a winner-take-all strategy in which the response with the largest number of votes is selected for enactment. Table 4.7 highlights its characteristics.

Also unique to the DAMN architecture are the multiple parallel arbiters for both speed and turning control. Arbitration for each of these activities occurs completely independently. Chapter 9 revisits the DAMN architecture in examining the Defense Advanced Research Project Agency's (DARPA's) Unmanned Ground Vehicle Demo II Program.

Table 4.7 DAMN architecture
Name                      DAMN (Distributed Architecture for Mobile Navigation)
Background                Fine-grained subsumption-style architecture
Precursors                Brooks 1986; Zadeh 1973
Principal design method   Experimental
Developer                 Julio Rosenblatt (CMU)
Response encoding         Discrete vote sets
Coordination method       Multiple winner-take-all arbiters
Programming method        Custom
Robots fielded            DARPA ALV and UGV vehicles
References                Rosenblatt and Payton 1989; Rosenblatt 1995

4.5.6 Skill Network Architecture
The skill network architecture is a behavior-based system developed for graphical animation rather than robotics. Indeed, the use of behavior-based techniques within animation is becoming widespread. Pioneering work by Craig Reynolds (1987) on his Boids system provided a compelling set of visual behaviors for flocks of birds and schools of fish. Recent work by Hodgins and Brogan (1994) has extended these techniques to model not only graphical creatures' responses to their fictional environments but also their dynamics. Zeltzer and Johnson (1991), however, developed the skill network architecture in a more general way, providing for a variety of computing agents: sensing agents that provide information regarding the environment, skill agents that encode behaviors, and goal agents that monitor whether certain conditions have been met. A modification of Maes' action-selection mechanism serves as the basis
Table 4.8 Skill network architecture
Name                      Skill network architecture
Background                Behavior-based animation architecture
Precursors                Maes 1989; Badler and Webber 1991
Principal design method   Ethological/experimental
Developer                 David Zeltzer (MIT)
Response encoding         Discrete
Coordination method       Action-selection
Programming method        Agent libraries
Robots fielded            Graphical animations only
References                Zeltzer and Johnson 1991
tions to minimize computation time, an essential aspect of computer-generated animation. The general characteristics of the skill network architecture appear in table 4.8.
4.5.7 Other Efforts

Several other behavioral approaches warrant mentioning.

• BART: The Behavioral Architecture for Robot Tasks was an early approach (Kahn 1991) that defined task behaviors arbitrarily and provided support for military robotic missions. Of note was its use of a focus-of-attention manager to provide situational context for the selection of relevant behaviors. A BART language was developed for specifying behavioral tasks.
• Autochthonous behaviors: Developed by Grupen and Henderson (1990), this behavioral approach uses logical impedance controllers as the basis for specifying behavioral response. Of particular interest is its focus on grasping and manipulation tasks as opposed to robot navigation. Grupen and Henderson's method also relies far more heavily on representational knowledge (3D models generated by sensor data) than typical reactive systems.
• Anderson and Donath: A behavioral approach strongly influenced by ethological studies and fielded on Scarecrow, a 400-pound robot (Anderson and Donath 1991). The system, similar in spirit to the schema-based approach, uses potential fields methods for the response encoding and vector summation for the coordination mechanism.
• SmartyCat: Defined in the SmartyCat Agent Language (Lim 1994), similar in flavor to the Behavior Language (Brooks 1990a), this behavior specification approach has been tested on a Cybermotion K2A robot.
• Dynamic Reaction: A behavior-based system capable of using goal-based constraints in dynamic or rapidly changing worlds (Sanborn 1988). This system was tested in simulation in trafficworld, a driving simulator.
• ARC (Artificial Reflex Control): In this model for robot control systems, actual biological reflexes serve as a basis for their design in robots. The model has particular relevance for rehabilitative robotics, in which a prosthetic device such as an artificial hand or limb must coordinate with an active human. This control model has been applied to hand, knee, and leg controllers for potential use within human-assistive technology (Bekey and Tomovic 1986, 1990).
• Niche Robot Architecture: This architecture, developed by Miller (1995), draws on the notion of ecological niches as espoused by McFarland (McFarland and Bosser 1993) and presented in chapter 2. It focuses more on the philosophical issues of creating robots for specific tasks rather than on rigid commitments to specific behavioral encodings or coordination strategies. Several real world robots fit into this paradigm, including Rocky III, a prototype Mars microrover; Fuddbot, a simplistic vacuum cleaning robot; and a robotic wheelchair tasked with assisting the handicapped.
4.6 ARCHITECTURAL DESIGN ISSUES

We can extract a number of common threads from this diversity of architectural approaches as well as some themes driving the development of these systems.

• Analysis versus synthesis. This methodological difference relates to the underlying assumptions regarding just what intelligence is. In some instances, intelligence is perceived as something that can be reduced to an atomic unit that when appropriately organized and replicated can yield high-level intelligent action. In other approaches, abstract pieces of intelligent systems, often extracted from or motivated by biological counterparts, can be used to construct the required robotic performance.
• Top-down (knowledge-driven) versus bottom-up (data-driven) design. This aspect relates more closely to experimentation and discovery as a design driver versus a formal analysis and characterization of the requisite knowledge a system needs to possess to manifest intelligent robotic performance. These
differences perhaps parallel to a degree the "scruffy versus neat" dichotomy in AI.
• Domain relevance versus domain independence. To some extent this characteristic captures the view that there either is or is not a single form of intelligence. Here the AI parallel is "weak versus strong" methods.
• Understanding intelligence versus intelligent machines. The fundamental difference here lies in the designer's goals. Biological constraints can be applied in the development of a robotic architecture in an effort to understand the nature of animal intelligence. This approach may compromise the utility of the resulting machine intelligence. Often robot architects who follow this path have an underlying assumption that intelligence is fundamentally independent of the underlying substrate in which it is embedded. This is merely a working hypothesis, as there is, as yet, no strong evidence to support it (nor to contradict it). Other architects are concerned with the more direct goal of building useful and productive machines with sufficient intelligence to function within the world in which they are situated. Whether these machines relate to biological systems is not their concern.

These competing goals and methods result in a wide range of architectural approaches, as we have seen throughout this chapter. Many roboticists feel that although behavior-based methods provide excellent responsiveness in dynamic environments, much is lost in eliminating the use of representational knowledge. These researchers have considered how representational knowledge in various forms can be integrated into these behavioral architectures. In chapter 5 we will encounter various methods that can introduce representational knowledge into reactive robotic architectures while maintaining most, if not all, of their desirable properties. In chapter 6 we will study hybrid architectures that attempt to supplement behavior-based architectures with not only representational knowledge but additional deliberative planning capabilities. Chapter 8 discusses how adaptation and learning can be introduced into these systems, another very important research area.
4.7 CHAPTER SUMMARY

• A wide range of architectural solutions exist under the behavior-based paradigm.
• These architectures, in general, share an aversion to the use of representational knowledge, emphasis on a tight coupling between sensing and action, and decomposition into behavioral units.
• These architectures differ in the granularity of behavioral decomposition, coordination methods used, response encoding technique, basis for development, and other factors.
• Our working definition is that robotic architecture is the discipline devoted to the design of highly specific and individual robots from a collection of common software building blocks.
• Robotic architectures are similar in the sense that they are all Turing computable but indeed differ significantly in terms of their organizational components and structure.
• Behavior-based architectures can be evaluated in terms of their support for parallelism, hardware retargetability, ecological niche fitting, modularity support, robustness, flexibility, ease of development, and performance.
• The subsumption architecture is a layered architecture that uses arbitration strategies and augmented finite state machines as its basis. It has been implemented on many robotic systems using rule-based encodings and an experimental design methodology.
• Motor schemas are a software-oriented dynamic reactive architecture that is non-layered and cooperative (as opposed to competitive). Vectors serve as the continuous response encoding mechanism with summation as the fundamental coordination strategy. Several robotic systems have been implemented, and the architecture has had significant influence from biological considerations.
• Circuit architectures predominantly use logical expressions for behavioral encoding, use abstraction coupled with arbitration, and typically follow the situated activity design paradigm.
• Action-selection architectures are dynamic rather than fixed competition systems, and they also use arbitration.
• The colony architecture is a simplified version of subsumption, more straightforward in its implementation.
• The animate agent architecture uses reactive action packages (RAPs) and sequencing methods to unfold situational responses over time.
• The DAMN architecture provides voting mechanisms for behavioral response encodings, with a winner-take-all arbitration mechanism in the style of subsumption.
• The skill network architecture is particularly well suited for graphical animation and uses action-selection techniques.
• Many other behavior-based architectures also exist, varying at some level from the other architectural systems.
• Design choices for robotic architects involve issues such as whether to use analysis or synthesis, take a top-down or bottom-up design stance, design for specific domains or be more general, and whether to consider the abstract role of intelligence in general or simply be concerned with building smarter machines.
Chapter 5
Representational Issues for Behavioral Systems
Knowledge is to embark on a journey which . . . will always be incomplete, cannot be charted on a map, will never halt, cannot be described.
- Douglas R. Hofstadter

The only justification for our concepts and systems of concepts is that they serve to represent the complex of our experiences; beyond this they have no legitimacy.
- Albert Einstein

There is no knowledge that is not power.
- Ralph Waldo Emerson

Is knowledge knowable? If not, how do we know this?
- Woody Allen
Chapter Objectives

1. To develop working definitions for knowledge and knowledge use.
2. To explore the qualities of knowledge representation.
3. To understand what types of knowledge may be representable for use within robotic systems.
4. To determine the appropriate role of world and self-knowledge within behavior-based robotic systems.
5. To study several representational strategies developed for use within behavior-based systems.
5.1 REPRESENTATIONAL KNOWLEDGE

A significant controversy exists regarding the appropriate role of knowledge within robotic systems. Oversimplifying this conflict somewhat, we can say that behavior-based roboticists generally view the use of symbolic representational knowledge as an impediment to efficient and effective robotic control, whereas others argue that strong forms of representational knowledge are needed to have a robot perform at anything above the level of a lower life form.

In this chapter, we attempt to defuse this argument by first providing some definitions and characteristics of knowledge and knowledge representations and then showing successful examples where knowledge representations of various forms have been introduced into reactive robotic systems at the behavioral level. We emphasize the appropriate use of knowledge, not merely using knowledge for knowledge's sake. Later, in chapter 6, we describe how it is possible to exploit multiple robotic control paradigms within a single architecture, with different components of the system employing knowledge representations in different ways.
5.1.1 What Is Knowledge?

Knowledge, much like intelligence, is a word notoriously difficult to define. Information arises from data, and knowledge can be said to emanate from information (figure 5.1). Tanimoto's (1990, p. 111) definition for knowledge seems particularly to the point: "information in context, organized so that it can be readily applied to solving problems, perception, and learning." Turban's (1992, p. 792) definition is also useful: "Understanding, awareness, or familiarity acquired through education or experience. The ability to use information." Knowledge involves using information intelligently. To do this effectively, knowledge must be efficiently organized, otherwise it becomes burdensome.

Of course, if we intend to use knowledge to guide robotic behavior, it must somehow be represented within the robotic system. Much of the debate regarding knowledge's role within behavior-based systems centers on how it is represented within the context of the control system. Steels (1995) considers knowledge representations to involve "physical structures (for example electrochemical states) which have correlations with aspects of the environment and thus have a predictive power for the system." Although this definition is broader than most, it does capture two very important characteristics:
[Figure 5.1: Basis of knowledge. Labels: Volume, Organization.]
1. Environmental correlation. To be useful, a representation must have some relationship with the external world. The nature of that relationship will serve as a defining characteristic for many of the knowledge representations that we consider. In particular, the temporal durability or persistence of the represented knowledge (e.g., short term, long term) and the nature of the correlational mapping itself (e.g., metric, relational) will serve as defining factors.
2. Predictive power. This predictive ability is central to the value of knowledge representation. If there is no need to predict, we can rely entirely on what is sensed, resulting in a purely reactive approach. However, if there is useful information beyond the robot's sensing capabilities that is accurate, durable, and reliable, then it can be worthwhile to provide knowledge representations that encode this information to the robot.

As we all know, however, the future is notoriously difficult to predict in the real world, even in the best of circumstances. Here lies the controversy: in the utility of knowledge at a given time within a given environment, since it has an ability not only to predict, but also to deceive should that information be inaccurate or untimely.

Figure 5.2 captures some of the trade-offs regarding sensing versus representing in various task environments. Whenever the world changes rapidly, stored knowledge becomes potentially obsolete quickly. On the other hand, continuous sensing is not free, and from a computational perspective it is better to conduct that process as few times as necessary. If the world is unlikely to change, it can be advantageous to retain previously sensed information instead of unnecessarily oversampling the environment.

[Figure 5.2: Trade-offs for knowledge use. An increase is denoted by the arrow's direction. Labels: dynamic and uncertain worlds; highly structured worlds.]

A related problem is maintaining an accurate correlation between the robot's position within the world and its representational point of view. This is not a trivial problem. When this involves the spatial location of a mobile robot it is referred to as localization. If maps of any form are maintained, the problem of resolving the robot's egocentric frame of reference and the representational frame of reference must be addressed. In other words, the question that must be answered is "Where am I?" Purely reactive systems do not address this question at all, for they are concerned only with the robot's immediate sensations. There is no projection into the future, nor any reasoning over past experience.

This chapter delves into various approaches for addressing these trade-offs, maximizing robotic utility in differing circumstances. We must keep in mind as we study these methods that they depend strongly on the environment in which the robotic agent resides and not merely on the architectural choice itself.
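The sensing-versus-representing trade-off described in this section can be made concrete with a small sketch. It assumes a hypothetical `CachedPercept` wrapper (not something from the text) that keeps the last sensed value and resamples only once that value is older than `max_age`. A small `max_age` approximates purely reactive sensing for rapidly changing worlds; a large one retains stored knowledge in highly structured worlds.

```python
class CachedPercept:
    """Hypothetical wrapper: reuse a stored reading until it is considered stale."""

    def __init__(self, sense, max_age):
        self._sense = sense        # callable that performs the (costly) sensing
        self._max_age = max_age    # how long a stored reading is trusted
        self._value = None
        self._stamp = None
        self.samples = 0           # number of times the sensor was actually used

    def read(self, now):
        """Return the stored reading, resampling only if it is older than max_age."""
        stale = self._stamp is None or (now - self._stamp) > self._max_age
        if stale:
            self._value = self._sense()
            self._stamp = now
            self.samples += 1
        return self._value


# Illustrative use with a stand-in sensor function (an assumption, not a real API).
def fake_range_sensor():
    return 1.5  # meters

structured_world = CachedPercept(fake_range_sensor, max_age=5.0)
for t in (0.0, 3.0, 6.0):
    structured_world.read(t)
```

With `max_age=5.0` the three reads above trigger two sensor samples, since the reading taken at time 0 is reused at time 3 and refreshed at time 6. The risk the text identifies remains: a reused reading can deceive if the world changed in the meantime.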
5.1.2 Characteristics of Knowledge

Early psychologists such as Hull and Sherrington, who preceded behaviorists, took the stance that "knowledge is embedded in the structure of the reflex units that generate action" (Gallistel 1980, p. 336). There were neither discrete knowledge structures nor translational processes to complete the mapping from knowledge store to action. In subsequent years, behaviorists (epitomized by B.F. Skinner) eschewed mentalistic terms such as "knowledge" or "mental representations," which have no place within the behaviorist point of view and are viewed merely as artifacts derived from complex stimulus-response couplings.

Cognitive psychologists (e.g., Neisser 1976) have recently forwarded compelling evidence and theories of knowledge representation that have displaced much of the earlier behaviorist stance. Knowledge structures are conceived as manipulable units of information involved in various ways with action generation. Cognitive maps (Gallistel 1990) are often referred to as the means for both storage of previous experience and its translation into action. Neurobiological evidence now exists for "what" and "where" centers within the brain (Mishkin, Ungerleider, and Macko 1983), providing a compelling basis for the importance of localization and object recognition and categorization. These maps may be as simple as collections of vectors of action that maintain a spatial correlation with sensory events arriving from the outside world and appropriate actions by the agent (Bizzi, Mussa-Ivaldi, and Giszter 1991; Gallistel 1990).

Learning is often inextricably bound up with the issues of knowledge representation: for example, how are these cognitive maps created and stored in the first place? We defer most of these learning issues until chapter 8, but inevitably some aspects seep into our discussion here.

This perspective on psychology clearly relates to behavior-based robotic systems in that we would like our robots to act intelligently or knowledgeably. The debate over knowledge's role resides not in whether it is useful but rather in how it appears within a robotic system.

Traditional AI is often distinguished from behavior-based systems along the knowledge representation front. Let us consider a taxonomy of knowledge representations (Dennett 1982; Malcolm and Smithers 1990):

• Explicit: Symbolic, discrete, manipulable knowledge representations typical of traditional AI.
• Implicit: Knowledge that is non-explicit but reconstructable and can be made explicit through procedural usage.
• Tacit: Knowledge embedded within the system that existing processes cannot reconstruct.

Symbolic systems use explicit knowledge as defined above; subsymbolic systems involve either implicit or tacit knowledge use.

It can be said that all intelligent systems must use knowledge to accomplish their goals. To be truly intelligent, the knowledge usage must be efficient and effective. Those within the behavior-based robotics community resist the use of explicit knowledge, as we have noted, because of another difficulty: the "symbol
grounding problem." Succinctly stated, the symbol grounding problem refers to the difficulty in connecting the meaning (semantics) of an arbitrary symbol to a real world entity or event. It is easy to create a symbol with the intention of representing something. It is difficult, however, to attach the full meaning and implications of that real world object or event to the symbol. The degeneracy is often recursive or circular. Other symbols are used to define the symbol that one is trying to anchor. This only compounds the problem by leading to a proliferation of symbols that still have no true meaning. We, as humans, are capable of grounding our symbols (or language) and extracting meaning from them partly because of our ability to perceive and manipulate the environment. Meaning arises from our interactions with objects within the world and is not intrinsic to the objects themselves. Fortunately robotics, unlike much of AI, provides us a means by which our agent can interact with the world, that is, the robot's sensors and actuators.

If we want our robots to act knowledgeably, various types of knowledge are necessary. Listed below are several possibilities:

• Spatial world knowledge: an understanding of the navigable space and structure surrounding the robot.
• Object knowledge: categories or instances of particular types of things within the world.
• Perceptual knowledge: information regarding how to sense the environment under various circumstances.
• Behavioral knowledge: an understanding of how to react in different situations.
• Ego knowledge: limits on the abilities of the robot's actions within the world (e.g., speed, fuel, etc.) and on what the robot itself can perceive (e.g., sensor models).
• Intentional knowledge: information regarding the agent's goals and intended actions within the environment (a plan of action).

Spatial world knowledge can take several forms: quantitative or metric, where some absolute measure is used to establish the robot's relationship to the world; and qualitative or relational, where information about the world is described in relative terms (e.g., the goal is just to the right of the second door on the left).

Another way in which knowledge can be characterized is in regard to its durability: how long it will be useful. Two basic forms of knowledge, persistent and transitory, can be distinguished. Persistent knowledge involves a priori information about the robot's environment that can be considered relatively
static for the mission's or task's duration. These data typically arise from object models of things the robot might expect to see within its world, models of the free space where it moves, and an ego model of the robot itself. The knowledge base within which this information resides is termed long-term memory (LTM), indicative of this data's persistence.

The robot acquires transitory knowledge dynamically as it moves through the world and stores it in short-term memory (STM). World models constructed from sensory data typically fall into this category. Although STM is rarely used directly for reactive control, it can be applied when purely reactive techniques encounter difficulties. In behavior-based robotic systems, dynamically acquired world models should be used only when the control regime fails to cope with difficult situations. Even then, STM is best used to reconfigure the control system rather than to supplant it. Transitory knowledge is typically forgotten (fades) as the robot moves away from the locale where that information was gathered.

For both persistent and transitory knowledge, the choice of representational structure and format is less important than merely the availability of the knowledge itself. Persistent knowledge allows for the use of preconceived ideas of the robot's relationship to the world, enabling more efficient use of its resources than would be accomplished otherwise. Either form of knowledge, if misused, could interfere with the simplicity and efficiency of reactive control. Nonetheless, when difficulties with a behavioral control regime arise, it is useful to provide a bigger picture to help resolve them. This can result in solutions to problems such as the fly-at-the-window situation in reactive control, when an insect strives to go toward sunlight entering from an outside window, is rebuffed by the glassy barrier, expends all of its energy trying to solve the problem with its fixed set of behaviors, and ultimately dies.
If transitory environmental models are constructed under these conditions (STM), a robot could use the information to circumnavigate the barrier.

We previously discussed STM and LTM in box 2. STM performs over an intermediate time scale, in contrast to LTM, which is more durative, and reflexive memory, which is quasi-instantaneous. Strong neuroscientific evidence exists that STM processes are distinct from LTM (Guigon and Burnod 1995). LTM is persistent and is generally viewed as the basis for learning. Figure 5.3 illustrates these relationships.

[Figure 5.3: Time horizon for knowledge. Labels: Transitory Knowledge; Persistent Knowledge; Purely Reactive (Instantaneous); Sensor-acquired Maps (Short-term Memory); A Priori Maps (Long-term Memory). Axis: Time Horizon.]

Associative memory is another form of knowledge representation, often linked with neural network models (Anderson 1995). Here input patterns, often only partially complete, evoke responses or memories encoded in a network model. Various mappings are found within the nervous system. Primary examples include retinotopic maps, space-preserving mappings of the retina onto the visual cortex; and somatotopic maps, topographic representations of the body surface projecting onto the brain (Florence and Kaas 1995). Chapter 8 revisits neural network models for behavior-based robotic systems.

The philosophies regarding knowledge use itself form two subdivisions of metaphysics. As defined by Webster 1984, epistemology is "the study or a theory of the nature and grounds of knowledge especially with reference to its limits and validity." Ontology, on the other hand, is defined as "a particular theory about the nature of being or the kinds of existents."

Epistemological considerations, in our context, concern the use and validity of knowledge within behavior-based robotic systems. Ontological considerations are more specific to the representational choices and determine what kinds of things exist (or are understandable) within the framework of our robotic system. In the remainder of this chapter, we primarily address the ontological factors in our examples of knowledge use: what knowledge is available to the system and the representational choices (i.e., vocabularies) made.

5.2 REPRESENTATIONAL KNOWLEDGE FOR BEHAVIOR-BASED SYSTEMS

A little knowledge that acts is worth infinitely more than much knowledge which is idle.
- Khalil Gibran
Any number of animals use cognitive maps. Foraging animals, such as insects and birds, are believed to use polar vector representations near their home sites (Waterman 1989). These vectors are imposed over a polar grid coordinate system created by a mosaic of recognizable landmarks. Gallistel (1990) presents compelling evidence for the use of cognitive maps in animal systems, including the following:

• Evidence for vector spaces in animals capable of spatial encoding includes tectal maps for angular deviation in birds and auditory cortex maps for encoding distance and velocity in the bat, among others.
• Many animals are capable of distance triangulation from known landmarks, including many insects, such as locusts, wasps, and bees, and other animals, such as the gerbil.
• Cognitive maps in nature have found uses in foraging, homing, puddle jumping (gobies), resource location (e.g., refinding calcium depots for the desert tortoise), avoiding nearly undetectable obstacles by remembering their location (bats), orienting towards hidden goals (rats swimming for platforms), route selection based on relative distance (chameleons), remembering places passed en route to another location (maze finding in rats), and route selection between places encountered in the past (chimpanzees).
• Of particular interest is the localization of a geometric module within the hippocampus of the rat brain, believed capable of encoding metric relations and compass sense and performing the necessary transformations to make the information useful to the rat. Below we discuss an example of a robotic representation system inspired by the rat's hippocampus (Mataric 1990).

Of course, animals' use of cognitive maps is not a sufficiently compelling argument for including them within robotic systems. At the very least, however, it should certainly give the reactive roboticist pause to consider their potential use and impact upon performance. Indeed, the principle of biological economy argues that systems are generally not included within biological systems unless they provide some utility or advantage to the animal.

Knowledge representations can be made available in several ways to the control system for behavior-based robotic systems. Each compromises the purely reactive philosophy (chapter 3) to varying degrees but provides benefits difficult to realize within the strictest view of behavior-based robotics. These representational approaches correlate closely to our notions of transitory and persistent knowledge manifested in various forms of STM and LTM.

• Short-term behavioral memory: Isolating representational knowledge to a specific behavior provides knowledge on a need-to-know basis, in a manner similar to action-oriented perception (section 4.4.1). Analogously we can refer to this use as action-oriented knowledge representation. Only that information necessary for the performance of a specific behavior is represented. This preserves the behavioral modularity and opportunity for incremental development that is so valuable from the behavior-based design perspective.
• Sensor-derived long-term cognitive maps: Information that is directly perceived from the environment and gathered only during the lifetime of the robot's experience in a particular environment is used to construct a standalone world model. This model is plastic in the sense that as new sensor data arrives the model is continuously updated and modified, attempting to maintain a close correlation with the actual world. These representations are constructed in a behavior-independent manner and may provide information to the overall behavioral control system in a variety of ways. The greatest difficulties lie in maintaining a high fidelity correspondence with the world, especially when it is dynamic and unpredictable.
• A priori map-derived representations: With these methods, information is introduced beyond the robot's sensing capabilities. It may be compiled from floorplans or external maps of the world, but it was gathered independently of the robotic agent. The strength of these representations lies in their ability to provide expectations regarding the environment even before the robot has entered it. Significant problems arise from the fact that the initial knowledge source may be inaccurate, untimely, or just plain wrong, and that the form of the initial representation may not be easily or conveniently translated into a useful robotic format.

We now review specific instances of behavior-based robotic systems capable of taking advantage of knowledge in these various forms.

5.2.1 Short-Term Behavioral Memory

Knowledge is soon changed, then lost in the mist, an echo half-heard.
- Gene Wolfe
Behavioral memory provides certain advantages to a robot: It reduces the need for frequent sensor sampling in reasonably stable environments, and it provides recent information to guide the robot that is outside of its sensory range. Representations that fit this particular category have three general characteristics. First, they are used in support of a single behavior in a behavioral control system, most often obstacle avoidance. Second, the representation directly feeds the behavior rather than directly tying it to a sensor. In essence, the memory serves as a buffer and translator for a limited number of previous sensings (figure 5.4). Third, they are transitory: the representations are constructed, used while the robot is in the environment, and then discarded. They must be reconstructed if the robot reenters the environment. Although initially this might appear to penalize the robot by making it somewhat absentminded, it is actually valuable as it eliminates much of the difficulty associated with long-term localization (i.e., having the robot maintain its bearings relative to a map) and is well suited for somewhat dynamic environments where the position of obstacles may change over time. In general, purely reactive systems still have an advantage in very dynamic worlds (e.g., navigating along a crowded sidewalk), but some behavioral STM techniques can deal quite well even with these situations.

[Figure 5.4: Behavioral memory.]

Both behavioral memory and cognitive mapping commonly use grids to represent the navigable space around the robot. Grid representations are arbitrarily tessellated regions surrounding the robot. They can vary in the following ways:

• Resolution: the amount of area each grid unit covers (e.g., an inch, a meter, or more).
• Shape: most frequently square, but also in other forms such as radial sectors (Malkin and Addanki 1990).
• Uniformity: the grid cells may all be the same size, or they may vary. The most common variable-sized grid methodology involves the use of quadtrees (Andresen et al. 1985), which are formed through the recursive decomposition of free space.

Figure 5.5 depicts a few variations of grids. Implementations of grid-based representations most often use preallocated two-dimensional arrays, but occasionally (especially for quadtrees) involve linked lists as the central data structure.

We discussed in section 4.4.2 one example of behavioral memory used for navigation: the avoid-past behavior. This behavior prevented stagnation during navigation by adding repulsion from places the robot has recently visited. A regular grid stored sensory information concerning where the robot had
recently been (figure 5.6). The original work (Balch and Arkin 1993) used dead reckoning information based on shaft encoder readings for incrementing counter values within the grid. Other, more effective sensors could also be readily used, however, such as global positioning systems (GPS), infrared bar code readers, or inertial navigation systems (INS), with no impact on the behavior or the representation itself (assuming the sensor could provide the necessary resolution). The avoid-past behavior was fully self-contained in that it was constructed with the intent of using a representation from the onset, and was completely modular and integrable with all of the other behaviors within the control system.

[Figure 5.5: Grid representations: (A) regular grid, (B) sector grid, and (C) quadtree.]

[Figure 5.6: Avoid-past. Labels: Past Positional Information (shaft encoders); Grid-Based Spatial Memory; Mapper (perceptual schema); Avoidance Response.]

Yamauchi (1990) developed two different forms of behavioral memory. The first, referred to as wall memory, uses an array of elements, corresponding to the number of ultrasonic sensors, to increase confidence over time that the robot is near a wall. The memory readings are then used to support a wall-following behavior (figure 5.7).

Yamauchi further extends the notion of behavioral memory by storing in an action memory information not only about the world (e.g., walls) but also about the robot's most recent responses. This memory permits the robot to favor
the direction in which it is currently moving by using a weighted average of past responses to bias the immediate reactive response. This tends to remove noise and reduce the possibility of premature reflexive action due to a single reading.

[Figure 5.7: Wall memory. Labels: Sonar; Wall Detector Algorithm; Response.]

Badal et al. (1994) developed an example of real-time obstacle avoidance using stereo vision with a form of behavioral memory. An instantaneous obstacle map stores detected obstacle points projected onto the ground plane. To produce the steering behavior for a high mobility multipurpose wheeled vehicle (HMMWV), these points are mapped onto a polar occupancy grid (sector based) using a coarse configuration space approach, which then generates a steering vector in the direction of the least hindrance (i.e., the direction with the farthest possible avoidance distance). This behavior is depicted schematically in figure 5.8.

[Figure 5.8: Real-time obstacle avoidance using stereo vision. Labels: Two Video Cameras; Stereo Vision; Obstacle Algorithm; Instantaneous Obstacle Map; Response.]

Borenstein and Koren (1989) patented a method for conducting obstacle avoidance in real time using sonar data and a grid-based representation. Their work centers on the use of a certainty grid, an outgrowth of earlier work by Moravec and Elfes (1985). Moravec's work focused on constructing a world
[Figure 5.9 diagram labels: Sonar Readings, Obstacle Detection, Certainty Grid, One-Dimensional Polar Histogram.]
Figure 5.9 Vector field histogram usage for obstacle avoidance.
model using sonar data coupled with probabilistic sensor models that, when given multiple readings of the world, could create a grid-based map. Traditional path planning was then conducted within this representational framework to produce a route for the robot, which would then move through the world in a nonreactive manner. Matthies and Elfes (1988) later extended this work to provide sensor fusion of both sonar and visual stereo data.
Borenstein and Koren (1989) modified this method so that many more sensor readings could be taken by greatly simplifying the world map updating process. Information could be added continuously from the incoming data. Further, they added decay processes to decrement cell values over time, ensuring currency of the data. Initially the grid took the traditional two-dimensional square format in their vector force field concept. A repulsive vector, generated in a manner analogous to that of the potential fields method (section 3.3.2), produced the steering direction and velocity for the robot. In later work (1991), Borenstein and Koren altered the representational format to include a one-dimensional polar form, the vector field histogram, further decreasing the processing time by collapsing the two-dimensional grid directly into a directional representation centered on the robot's current location. Each sector stores the polar obstacle density. The steering computation becomes trivial, simply selecting the most suitable direction in the histogram consistent with the robot's overall goal. Figure 5.9 depicts this approach.
All of the examples discussed thus far have been concerned with short-term data that is used, then discarded, and that is channeled directly to a particular behavior. We now look at representational methods that are more persistent and potentially have broader use for a variety of behavioral responses.
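The certainty grid, decay process, and polar histogram described above can be illustrated with a minimal sketch. It is not Borenstein and Koren's implementation: the dictionary-based grid, the distance weighting, the threshold value, and all function names are assumptions made purely for illustration.

```python
import math

def decay(grid, amount=1):
    """Decrement every cell value so stale readings fade; drop empty cells."""
    return {cell: c - amount for cell, c in grid.items() if c - amount > 0}

def polar_histogram(grid, robot, n_sectors=36):
    """Collapse a 2-D certainty grid into per-sector obstacle densities
    centered on the robot's current location."""
    rx, ry = robot
    hist = [0.0] * n_sectors
    width = 2 * math.pi / n_sectors
    for (x, y), certainty in grid.items():
        dx, dy = x - rx, y - ry
        dist = math.hypot(dx, dy)
        if dist == 0 or certainty <= 0:
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)
        sector = int(angle / width) % n_sectors
        # Assumed weighting: nearer cells contribute more density.
        hist[sector] += certainty / dist
    return hist

def choose_sector(hist, goal_sector, threshold=0.5):
    """Pick the low-density sector angularly closest to the goal direction."""
    n = len(hist)
    free = [s for s in range(n) if hist[s] < threshold]
    if not free:
        return None  # every direction is blocked

    def gap(s):
        d = abs(s - goal_sector) % n
        return min(d, n - d)

    return min(free, key=gap)

# Usage: one obstacle straight ahead (sector 0), goal also straight ahead.
grid = {(3, 0): 5}
hist = polar_histogram(grid, robot=(0, 0), n_sectors=8)
steer = choose_sector(hist, goal_sector=0)
```

In this toy case the goal sector itself is occupied, so the selection falls to an adjacent free sector, which is the kind of trivial steering computation the text refers to.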
5.2.2 Long-Term Memory Maps
Under some circumstances, persistent information regarding the environment may be useful for behavior-based robotic systems. In general, these long-term
representations are best used to advise a behavioral control regime rather than dictate to it.
The origin of the map data itself provides a useful way to classify these maps. Some are derived directly from sensors onboard the robot, as the robot moves about the world storing information in a particular representational format for later use. Others are constructed from information gathered independently of the robot but transformed into some useful format. These a priori maps are not as timely as sensor-derived maps but can be obtained from a broader range of resources, even including remote sensing devices such as satellites.
The map representational knowledge itself is typically encoded in one of two forms:
• Metric: in which absolute measurements and coordinate systems are used to represent information regarding the world. Latitude and longitude measurements are typical of this format.
• Qualitative: in which salient features and their relationships within the world are represented. This may support behavioral descriptions such as "turn left at the second door on the right," or "continue moving until you see the sign." There is little or no notion of any quantitative measurement within the world, only spatial or temporal relationships.
The use of any form of map knowledge is dangerous primarily in the fact that it may be untimely (i.e., the world has changed since the map was constructed) and hence inaccurate. Additionally, localization, a nontrivial perceptual activity providing environmental correlation of the robot's position within the mapped world, needs to be conducted. Map knowledge is advantageous primarily in that maps can provide guidance beyond the horizon of immediate sensing. These trade-offs need to be weighed very carefully when considering map usage for particular task environments. The behavior-based roboticist who chooses to use maps strives to permit immediate sensory activity to override any potentially erroneous map information.
5.2.2.1 Sensor-Derived Cognitive Maps
Sensor-derived maps provide information directly gleaned from the robot's experiences within the world. How long the information is retained will determine its timeliness. As the world is sampled from the robot's egocentric point of view, it is often advantageous to use qualitative representations instead of metric ones because of the inherent inaccuracies in robot motion and the sensor
Figure 5.10 Distinctive regions: (A) end of hall: three-way symmetry; (B) doorway: abrupt depth discontinuity; (C) hallway constriction: depth minimum; (D) visual constellations: unique patterns (in this case, two feature triplets) of vertical lines.
readings themselves. In behavior-based systems, since the information is intended only to supplement the reactive control system rather than to replace it, which is not the case with traditional robotic motion planning (Latombe 1991), inaccuracies in the sensor data are tolerated in a much more forgiving manner.
One of the hallmarks of qualitative navigational techniques is the notion of distinctive places, regions in the world that have characteristics that distinguish them from their surroundings (Kuipers and Byun 1988). A distinctive place is defined as "the local maximum found by a hill-climbing control strategy given an appropriate distinctiveness measure" (Kuipers and Byun 1991). Higher-level topological and geometric representations are always derived in this system from these semantically grounded sensory observations. In robotic systems, these characteristics are most often derived from the sensor readings themselves. Locales that exhibit symmetry in some manner, abrupt discontinuities in sensor readings, unusual constellations of sensor readings, or a point where a maximum or minimum sensor reading occurs are typical examples (figure 5.10). After the robot determines these observable and ultimately recognizable landmarks, they can later be used for lower-level control, such as moving to a particular point in this observation space.
As these map features are often directly tied to sensing, integration into a behavior is fairly easy. For example, the robot can readily be instructed to move ahead until an abrupt depth discontinuity occurs on the right and then switch to a move-through-door behavior. The sensor-derived qualitative map serves as the basis for behavioral configuration and action. Of course, obvious problems arise should the door happen to be closed, but that is typical of reliance on map data of any sort. The world is assumed to exist as it is modeled, which may or may not be the case.
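The definition of a distinctive place as the local maximum reached by hill climbing on a distinctiveness measure can be made concrete with a small sketch. The one-dimensional corridor, the measure, and the function names below are hypothetical stand-ins chosen for illustration; Kuipers and Byun's actual measures are computed from sensor readings.

```python
def hill_climb(start, distinctiveness, neighbors):
    """Move to the best neighboring pose until no neighbor improves the
    distinctiveness measure; the stopping pose is the distinctive place."""
    current = start
    while True:
        best = max(neighbors(current), key=distinctiveness, default=current)
        if distinctiveness(best) <= distinctiveness(current):
            return current
        current = best

# Hypothetical 1-D corridor with positions 0..9. The assumed measure peaks
# at position 6, standing in for a hallway constriction (a depth minimum).
def measure(x):
    return -abs(x - 6)

def corridor_neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 9]

place = hill_climb(2, measure, corridor_neighbors)
```

Because each step strictly increases the measure over a finite set of poses, the loop always terminates, though only at a local maximum, which is exactly what the definition asks for.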
[Figure 5.11 diagram labels: Sensors, Response.]
Figure 5.11 Integration of representation in subsumption-style architecture.
Qualitative representations' primary advantage is their relative immunity to errors in motion, especially when the robot relies heavily upon dead reckoning as its basis for localization. An example system used for outdoor navigation uses viewframes, representations constructed from visual input that possess the qualities of distinctiveness discussed earlier (Levitt and Lawton 1990). As such they constitute a visual memory of where the robot has been and provide landmarks regarding its position within the world. Path transformations link aggregations of viewframes together and describe how the robotic agent moved from one place to the other. The resulting network constitutes a sensor-derived map useful in instructing the robot in how to navigate. This form of qualitative navigation can be readily tied to behavior-based systems (Arkin and Lawton 1990; Lawton, Arkin, and Cameron 1990). In particular, reactive control can provide a safe basis for exploring unknown areas during the qualitative mapping process. Behaviors can be developed that are capable of attracting the robotic agent to areas that it has not yet surveyed; they can provide the ability to track relative to qualitative landmarks and permit recovery from disorientation by allowing the robot to search for specific perceptual events.
Mataric (1992b) demonstrated the integration of qualitative maps and behavior-based robotic systems in a subsumption-based system. Figure 5.11 depicts the subsumption-style controller. Landmarks are derived from sonar data, using features that are stable and consistent over time. In particular, right walls, left walls, and corridors are the landmark features derived from the sensors. Spatial relationships are added connecting the various landmarks through constructing a graph. Map navigation consists of following the boundaries of wall or corridor regions. When the robot, Toto, determines that it is in a particular region, it can move to different locales by traversing the
graph representation that connects different regions, effectively conducting path planning. Nonetheless, the advantages of reactive navigation are not lost, as the robot is not required to follow its path blindly through the world when other overriding sensor data indicate that it is more important to detour (e.g., for obstacle avoidance). Of additional interest is the claim that this mode of navigational behavior is a possible interpretation of the manner in which the rat's brain (specifically the hippocampus) conducts navigation (Mataric 1990).
Several other variations of qualitative maps suitable for behavior-based navigation have been developed, including
• A behavior-based controller capable of such actions as hallway following and corner turning is coupled with a finite state automata representation of the world (Basye 1992). The navigational task then becomes a depth-first search through the state transition graph (Dean et al. 1995). This system has been developed predominantly for officelike environments and to date has had only limited experimental testing.
• A model derived from panoramic visual sensing (a 360-degree field of view) based on depth recovery from the motion of a robot (Ishiguro et al. 1994). Though not specifically focused on reactive robotic systems, it is potentially useful in that regard. A two-and-a-half-dimensional outline structure of the environment (contour depth along a line) is recovered using visual depth from motion techniques and broken into consistent visual features. The resulting panoramic representation is converted into a qualitative form by segmenting the outline model into objects based on abrupt depth discontinuities. This qualitative model can be used for visual event prediction along the robot's intended navigational path or for localization purposes.
Map knowledge need not be derived solely from sensory data. Let us now investigate some other methods of map construction of potential use in behavior-based robotics.
5.2.2.2 A Priori Map-Derived Representations
A priori maps are constructed from data obtained independently from the robotic agent itself. The most compelling arguments for using this type of map knowledge arise from convenience and greater scope:
• It may be easier to compile these data directly without forcing the robot to travel through the entire world ahead of time.
• These data may be available from standard sources such as the Defense Mapping Agency or the U.S. Geological Survey, among others.
• Precompiled sources of information may be used, such as blueprints, floor plans, and roadmaps, that need only to be encoded for the robot's use.
Of course, the perils to the accuracy of this data are different since it comes from other sources:
• Errors may be introduced in the process of encoding the new data.
• The data may be relatively old compared to recent robotic sensor readings.
• The frame of reference for the observations may be somewhat incompatible with the robot's point of view.
These are just some of the trade-offs that must be considered when using a priori map knowledge. Clearly, a behavior-based roboticist's goal would be to include this information in a manner that does not impede the robot's reactive performance, but that allows for guidance from knowledge outside the robot's direct experience.
Payton (1991) provides one of the most compelling examples of use of a priori map knowledge within the behavior-based robotics paradigm. In Payton's research, fielded in DARPA's Autonomous Land Vehicle Program, a map of the environment containing known obstacles, terrain information, and a goal location is provided in a grid-based format derived from a digital terrain map. A cost is associated with each grid cell based on mission criteria; the cost can take into account such factors as traversability, visibility to the enemy, ease of finding landmarks, and impact on fuel consumption, among others. A gradient field is computed over the entire map from start point to goal point with the minimum cost direction represented within each cell to get to the goal. Figure 5.12 depicts an example map. The gradient field represents what is referred to as an internalized plan, since it contains the preferred direction of motion to accomplish the mission's goals.
The key to this approach's success is its integration with the other behaviors within the overall control system. The internalized plan acts like another behavior sitting atop a subsumption-style architecture (figure 5.13). The lower-level behaviors guide the vehicle when the situation warrants, but if the vehicle is proceeding normally, the highest-level action that corresponds to the internalized plan representing the overall mission is enacted. This method of injecting map knowledge into reactive control is also readily applicable to other behavior-based architectures.
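To make the idea of a gradient field concrete, the following sketch computes, for every cell of a small cost grid, the step that leads toward the goal along a minimum-cost route. It uses Dijkstra's algorithm expanded outward from the goal, which is a swapped-in technique chosen for brevity and not necessarily how Payton's system computed its field; the four-connected grid, the convention of paying the cost of the cell being entered, and all names are assumptions.

```python
import heapq

def gradient_field(cost, goal):
    """Return (best, direction): the minimum cost-to-goal for each reachable
    cell and the (d_row, d_col) step to take from that cell.
    cost[r][c] is the price of entering cell (r, c); None marks an
    impassable cell (an assumed convention)."""
    rows, cols = len(cost), len(cost[0])
    best = {goal: 0.0}
    direction = {goal: (0, 0)}
    frontier = [(0.0, goal)]
    while frontier:
        dist, (r, c) = heapq.heappop(frontier)
        if dist > best[(r, c)]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if cost[nr][nc] is None:
                continue
            # Stepping from the neighbor into (r, c) pays cost[r][c].
            new = dist + cost[r][c]
            if new < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new
                direction[(nr, nc)] = (-dr, -dc)  # points back toward (r, c)
                heapq.heappush(frontier, (new, (nr, nc)))
    return best, direction

# Usage: a 3x3 map whose center cell is expensive (e.g., exposed terrain).
terrain = [[1, 1, 1],
           [1, 9, 1],
           [1, 1, 1]]
best, direction = gradient_field(terrain, goal=(0, 2))
```

Once the field is stored, a behavior only has to look up `direction` for the robot's current cell, so the plan stays usable even when lower-level behaviors push the vehicle off its nominal route.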
Figure 5.12 Gradient field representing an internalized plan. (Reprinted from Payton, D., "Internalized Plans: A Representation for Action Resources," 1991, pp. 94, with kind permis... Burgerhartstraat 24, 1055 KV, Amsterdam, The ...)
[Figure 5.13 diagram labels: Internalized Plans, Response.]
Figure 5.13 Behavioral control using internalized plans.
[Figure 5.14 diagram labels: Goals, Sensors, Response.]
Figure 5.14 Action-selection control architecture (Simmons and Koenig 1995).
Simmons and Koenig (1995) present a different approach to encoding both topological and metric information from maps obtained from floor plans. The base representation is constructed from topological (connectivity) models of the environment, common knowledge about office structure (e.g., corridors are straight), and approximate measurement of the environment regarding width of passageways and distances between turning (decision) points. The resulting graph represents the world. Markov models (specialized probabilistic models) encode the actions that a robot can take at different locations within the model. As the layout of the office chosen is simple, the allowable actions are simply turning right or left 90 degrees or proceeding straight ahead for one meter. Obstacle avoidance is handled using the same approach as in Arkin 1987a, and supplemented with planned actions based on where the robot needs to proceed according to its mission. The planner itself uses an A* search algorithm to specify the actions to be taken at each point within the topological model.
The Markov model-based planner issues directives to the robot for turning left or right, moving forward or stopping. The navigational architecture uses an action-selection mechanism (figure 5.14). The specific arbitration mechanism used is a best-action strategy, based on the highest probability for each possible directive. The probability for a directive depends on the probabilities derived from sensing that assess the location of the robot within the map. This system has been fielded on a mobile robot named XAVIER (built on a RWI B24 base) (figure 5.15) and is reported to have successfully completed 88 percent of its missions using this strategy in more than a kilometer of total distance traveled.
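One plausible reading of the best-action strategy is sketched below: the robot holds a probability for each candidate location, each location has a planned directive, and the directive whose supporting locations carry the most total probability wins. The state names, the probabilities, and the function name are invented for illustration and are not taken from the XAVIER system.

```python
from collections import defaultdict

def best_action(belief, planned_directive):
    """belief: location -> probability that the robot is there.
    planned_directive: location -> directive the planner assigns there.
    Returns the directive with the highest accumulated probability."""
    mass = defaultdict(float)
    for state, p in belief.items():
        mass[planned_directive[state]] += p
    return max(mass, key=mass.get)

# Usage with hypothetical locations: the single most likely location says
# "forward", but the two other candidates together outweigh it.
belief = {"corridor_A": 0.40, "junction_B": 0.35, "corridor_C": 0.25}
plan = {"corridor_A": "forward",
        "junction_B": "turn_left",
        "corridor_C": "turn_left"}
directive = best_action(belief, plan)
```

The example shows why arbitration over directives can differ from simply acting on the most probable location.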
Figure 5.15 XAVIER. (Photograph courtesy of Reid Simmons, The Robotics Institute, Carnegie-Mellon University.)
Figure 5.16 Navlab II. (Photograph courtesy of The Robotics Institute, Carnegie-Mellon University.)
Another system capable of integrating a priori map knowledge into a behavior-based control system was fielded on the Navlab II robot testbed (Stentz and Hebert 1995) (figure 5.16). An eight-connected cartesian grid-based representation capable of storing complete, partially complete, or no knowledge of the world is available to a Dynamic A* or D* planner (Stentz 1994). D* has the special ability to replan very efficiently should sensor data update the stored map representation. This in essence provides a variant on Payton's work, discussed earlier, that can provide rapid updates to the map based on incoming sensory data. In the overall system, an obstacle avoidance behavior based on range data provides local navigation abilities. The DAMN steering arbiter (section 4.5.5) chooses the correct behavioral action
[Figure 5.17 diagram labels: Sensors, Obstacle Avoidance, Goal Seeking, Grid, Cell Updates, DAMN Arbiter, Response.]
Figure 5.17 Navigational control system using D* as a behavior.
for the circumstances (figure 5.17). The system was tested at a slag heap near Pittsburgh, driving successfully in excess of 1. kilometers in a cluttered environment on its way to the goal. The authors claim it to be the first system that exhibits on a real robot both efficient goal acquisition and obstacle avoidance in an unstructured outdoor environment.
Another type of map, the purposive map (Zelinsky et al. 1995; Zelinsky and Kuniyoshi 1996), stores information regarding the utility of behaviors during navigation within the world (as opposed to spatial information regarding the world's structures). It is similar in spirit to Yamauchi's action memory described earlier. In Zelinsky's research, the behavioral controller consists of a subsumption-like arbiter managing behaviors such as wander, collide, wall-follow, find an opening, move to goal, and the like. The purposive map monitors and coordinates behavioral state. The map itself lists features and associated scalar quantities that estimate the spatial relationships between features. Associated with each feature is an action to be used when it is encountered. The system designer enters the information in the map manually. Actually, this map is more of a plan than an environmental model since it ties task-specific actions with recognizable environmental conditions, an interesting twist on the use of representations, where the knowledge stored for later use is dependent not only on the state of the world but the robot's intentions as well.
5.3 PERCEPTUAL REPRESENTATIONS
The representations we have discussed thus far have been concerned with the where issues: where in the world the robot is located and where it is going. As chapter 2 discussed, there is strong neurophysiological evidence for
dual cortical pathways: one concerned with spatial issues (where), the other with recognition (what) (Mishkin, Ungerleider, and Macko 1983). We briefly discuss in this section some representational issues as they relate to perception for behavior-based robotic systems. This discussion is a preview of a larger discourse in chapter 7, but it is important at this point to understand certain aspects of representational use for object recognition.
A very large body of work exists on model-based computer vision, which typically is concerned with object identification based on geometric models. Although this strategy is appropriate for certain classes of problems, traditional geometric models have less utility in the context of behavioral systems. Recall from chapter 3 that perception in the context of behavior-based robotic systems is best conducted on a need-to-know basis. Because perception is strongly related to the actions that the robot needs to undertake, we now briefly review some representational strategies that take into account the robot's ability to move within the world and that can reflect its intentional position. These representations are consistent with our notion of action-oriented perception and in some sense try to capture the concept of affordance introduced in chapter 7.
The main class of representations we consider for sensory recognition are function based, addressing the problem of recognizing environmental objects that fit specific functions of value to the robotic agent at a particular time. These may include affordance-like functions such as sittable, throwable, provide-support, provide-storage-space, or what-have-you. A simple example involves assuming that you have a book in your hand and you need to place it down somewhere safely in a room in which you have never been before. Once you walk in, you might see many candidate surfaces (shelves, tables, beds, etc.)
each of which to some degree will perform that function. How is the functional model represented for book stowability, and how can goodness of fit be measured with each candidate's environmental features? How can a decision be made as to the best place to put down the book? Functional models are concerned with these kinds of issues.
Stark and Bowyer (1991, 1994) developed one of the first examples of a functional representation. In this first attempt at a function-based model, no explicit geometry or structure is used. Instead a set of primitives (relative orientation, dimensions, stability, proximity, and clearance) defines functional properties. The flow of control for this system (figure 5.18) first takes as input a three-dimensional description of the object in terms of faces and vertices, then identifies potential functional elements (such as surfaces), attaches functional
Figure 5.18 Control flow for functional interpretation.
labels to the individual features, and finally conducts a generate-and-test evaluation to determine if it indeed fits the functional criteria in question. The main example used for Stark and Bowyer's research has been identifying a chair as something that both "is sittable" and "provides stable support." An armchair would add the functions of "provides back support" and "provides arm support." In this strategy, objects with no express geometric models can be considered chairs if they meet the constraints the functional models for that object impose. Stark and Bowyer's work has yet to be integrated into a robotic system, but it appears to hold promise especially for behavior-based systems that involve affordance-like perceptual strategies.
More recently, Bogoni and Bajcsy (1994a) have studied functionality from an active-perception perspective. Functionality is related to observability and provides a basis for robotic sensors to investigate actively to determine the actual object's nature. Another study (Budenske and Gini 1994) provides a basis for recognizing doorways through a series of exploratory motions. Sonar data, which can be applied for this purpose, are notoriously misleading and error prone because of specular reflections, a wide sampling cone, and other artifacts of the sensing process (Everett 1995). In Budenske and Gini's work, initial sonar readings indicate a functional passageway that is potentially a door, but this is confirmed by other investigatory sensor-motor activities before the robot navigates through it. These activities include positioning the robot adjacent to the door opening, having the robot move back and forth in front of the opening to confirm its size and position, and centering the robot in the opening prior to passage. This information is then tied to a behavior-based robot, using behaviors referred to as logical sensors/actuators (LSAs), and the
passage completed. The doorway per se is not directly represented; rather, the configuration of actions determines the doorway's function.
In research at the University of California at Berkeley, Stark developed the theory of scanpaths, another active representational strategy potentially suitable for use in behavior-based robotic systems (Noton and Stark 1971; Stark and Ellis 1981). In this theory of object recognition, models are created by the paths taken by the eye as its foveal region moves about the entire object. The connectivity of the focusing points and the order in which they occur constitute the scanpath. This ordering, which can be modeled on a computer by creating a network of visual features, can provide guidance as to where to focus perceptual processing to distinguish one object from another. Stark's work has yet to be tied to robotic systems but provides a clear alternative strategy to geometric model-based recognition for use in behavioral control systems.
There is much more to be said on the role of perceptual activity in the context of behavior-based robotics. We have touched on only some of the perceptual representation issues here. Chapter 7 investigates more thoroughly perceptual processing and control for this class of robots.
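The book-stowability question raised earlier in this section can be sketched as a simple generate-and-test over candidate surfaces, scored by functional primitives rather than by matching a geometric model. Everything in the sketch is hypothetical: the surface attributes, the thresholds, the scoring formula, and the candidate values are illustrative assumptions and do not reproduce Stark and Bowyer's system.

```python
from dataclasses import dataclass

@dataclass
class Surface:
    name: str
    height_m: float       # height of the surface above the floor
    width_m: float
    depth_m: float
    tilt_deg: float       # deviation from horizontal
    clear_above_m: float  # free space above the surface

def book_stowability(s, book=(0.25, 0.20, 0.05)):
    """Return a goodness of fit in [0, 1] for placing a book on surface s.
    Hard tests (dimensions, clearance, stability) reject a candidate;
    soft terms (levelness, reachability) rank the survivors."""
    length, width, thickness = book
    if s.width_m < length or s.depth_m < width:
        return 0.0  # dimensions: the book does not fit
    if s.clear_above_m < thickness:
        return 0.0  # clearance: no room above the surface
    if abs(s.tilt_deg) >= 10:
        return 0.0  # stability: the book would slide off
    levelness = 1.0 - abs(s.tilt_deg) / 10.0
    reach = max(0.0, 1.0 - abs(s.height_m - 1.0))  # assumed ideal height: 1 m
    return levelness * reach

candidates = [
    Surface("table", 0.75, 1.2, 0.8, 0.0, 2.0),
    Surface("bed", 0.5, 2.0, 1.5, 3.0, 2.0),
    Surface("shelf", 1.0, 0.8, 0.15, 0.0, 0.3),  # too shallow for the book
]
best = max(candidates, key=book_stowability)
```

No candidate is recognized as a "table" or a "shelf" here; each is judged only by whether it can perform the needed function, which is the point of the function-based approach.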
5.4 CHAPTER SUMMARY
• The more predictable the world is, the more useful knowledge representations are.
• Two important characteristics of knowledge include its predictive power and the need for the information stored to correlate with the environment in some meaningful way.
• Knowledge can be characterized into three primary forms: explicit, implicit, and tacit.
• Knowledge can be further characterized according to its temporal durability:
  • transitory knowledge, which is derived from sensory data and corresponds to cognitive short-term memory.
  • persistent cognitive maps, which may originate from either a priori knowledge or sensory data and corresponds to long-term memory.
• Significant evidence exists from cognitive psychology that mental processing involves various forms of knowledge representation.
• Using representational knowledge has several potential drawbacks within behavior-based systems:
  • The stored information may be inaccurate or untimely.
  • The robot must localize itself within the representational framework for the knowledge to be of value.
• Representational knowledge's primary advantage lies in its ability to inject information beyond the robot's immediate sensory range into the robotic control system.
• Examples of explicit representational knowledge use in behavior-based robots include short-term behavioral memory, sensor-derived cognitive maps, and a priori map-derived representations.
• Short-term behavioral memory extends behavioral control beyond the robot's immediate sensing range and reduces the demand for frequent sensory sampling. This form of representation is tied directly to an individual behavior within the control system.
• Grid-based representations are often used for short-term behavioral memory. The grids are typically either sector-based or regular and have resolutions that depend on the robot's environment and source of the grid data.
• Grid representations of STM have been used to remember the robot's past positions, to buffer sensor readings for wall recognition, and to store observations of obstacles.
• LTM maps are either metric or qualitative. Metric maps use numeric values to store the positions of observed events; qualitative maps use relational values.
• The notion of distinctive places is central to the use of sensor-derived cognitive maps. Locations are remembered that incoming sensor data determine to be unique in some way.
• Qualitative maps support general navigational capabilities and can provide behavioral support for moving to a goal, avoiding obstacles, invoking behavioral transitions, localization, and other related activities.
• A priori map-derived representations offer the robot information regarding places where it has never been before. The data for the representations may come from preexisting maps, blueprints, floor plans, and the like.
• Internalized plans inject a priori grid-based map knowledge directly into a behavior-based control system.
• The D* method improves on the gradient map strategy used in internalized plans by permitting efficient sensor updates to the stored world knowledge.
• STM and LTM cognitive maps address the "where" aspects of memory; function-based perceptual representations address the "what" aspects.
• Function-based perceptual representations are most closely related to the affordance-based approach reactive robotics commonly uses.
• Function-based methods are not based on standard geometric methods and require analysis of incoming sensor readings in terms of what the robotic agent needs to accomplish.
Chapter 6
Hybrid Deliberative/Reactive Architectures
In preparing for battle I have always found that plans are useless, but planning is indispensable.
- Dwight Eisenhower
It is a bad plan that admits of no modification.
- Publilius Syrus
Everybody's got plans . . . until they get hit.
- Mike Tyson
Few people think more than two or three times a year; I have made an international reputation for myself by thinking once or twice a week.
- George Bernard Shaw
Chapter Objectives
1. To understand the limitations of purely reactive and purely deliberative methods when each is considered in isolation.
2. To study biological models of hybrid reactive/deliberative systems.
3. To recognize the issues in establishing interfaces between reactive control and deliberative planners and to express several models for these interfaces.
4. To study several representative hybrid architectures, especially AuRA, Atlantis, Planner-Reactor, and PRS.
6.1 WHY HYBRIDIZE?
We have seen that reactive behavior-based robotic control can effectively produce robust performance in complex and dynamic domains. In some ways, however, the strong assumptions that purely reactive systems make can serve as a disadvantage at times. These assumptions include
1. The environment lacks temporal consistency and stability.
2. The robot's immediate sensing is adequate for the task at hand.
3. It is difficult to localize a robot relative to a world model.
4. Symbolic representational world knowledge is of little or no value.
In some environments, however, these assumptions may not be completely valid. Purely reactive robotic systems are not appropriate for all robotic applications. In situations where the world can be accurately modeled, uncertainty is restricted, and some guarantee exists of virtually no change in the world during execution (such as an engineered assembly work cell), deliberative methods are often preferred, since a complete plan can, most likely, be effectively carried out to completion. In the real world in which biological agents function, however, the conditions favoring purely deliberative planners generally do not exist. If roboticists hope to have their machines performing in the same environments that we, as humans, do, methods like behavior-based reactive control are necessary. Many researchers feel, however, that hybrid systems capable of incorporating both deliberative reasoning and behavior-based execution are needed to deliver the full potential of behavior-based robotic systems.
We saw in chapter 5 that introducing various forms of knowledge into a robotic architecture can often make behavior-based navigation more flexible and general. Deliberative systems permit representational knowledge to be used for planning purposes in advance of execution. This potentially useful knowledge may take several forms:
• Behavioral and perceptual strategies can be represented as modules and configured to match various missions and environments, adding versatility.
• A priori world knowledge, when available and stable, can be used to configure or reconfigure these behaviors efficiently.
• Dynamically acquired world models can be used to prevent certain pitfalls to which non-representational methods are subject.
Hybrid deliberative/reactive robotic architectures have recently emerged combining aspects of traditional AI symbolic methods and their use of abstract representational knowledge, but maintaining the goal of providing the responsiveness, robustness, and flexibility of purely reactive systems. Hybrid architectures permit reconfiguration of reactive control systems based on available world knowledge through their ability to reason over the underlying behavioral components. Dynamic control system reconfiguration based on deliberation (reasoning over world models) is an important addition to the overall competence of general purpose robots.
Building such a hybrid system, however, requires compromise from both ends of the robotic systems spectrum (section 1.3). Furthermore, the nature of the boundary between deliberation and reactive execution is not well understood at this time, leading to somewhat arbitrary architectural decisions.
6.2 BIOLOGICAL EVIDENCE IN SUPPORT OF HYBRID SYSTEMS
Psychological and neuroscientific models of behavior provide an existence proof for the success of an integrative strategy involving elements of deliberative reasoning and behavior-based control. Flexibility in our use of the models scientists in other fields have developed is important, however, since we, as roboticists, are concerned primarily with creating functioning autonomous agents that may have some behavioral overlap with their biological counterparts, but not necessarily with reproducing their control and execution strategies verbatim.
Just as many psychologists moved from behaviorism (Watson 1925; Skinner 1974) to cognitive psychology (Neisser 1976) as an acceptable description of human information processing, research in the use of hybrid systems has expanded to include many concepts forwarded by this school of thought. The experimental evidence is compelling. Shiffrin and Schneider (1977) have indicated the existence of two distinct modes of behavior: willed and automatic. Norman and Shallice (1986) have modeled the coexistence of two distinct systems concerned with controlling human behavior.
One system models "automatic" behavior and is closely aligned with reactive systems. This system handles automatic action execution without awareness, starts without attention, and consists of multiple independent parallel activity threads (schemas). The second system controls "willed" behavior and provides an interface between deliberate conscious control and the automatic system. Figure 6.1 illustrates this model. This research characterized the tasks requiring willed control in humans, involving deliberate attentional resources:
Figure 6.1 Model for integrated automatic and willed behavior (after Norman and Shallice 1986). [Diagram; recoverable label: vertical control threads.]
• planning or decision making
• troubleshooting
• novel or poorly learned actions
• dangerous or difficult actions
• overcoming habit or temptation

Other motor tasks are typically automatic and occur without the use of attention. Their modeling incorporates a contention scheduling mechanism for coordinating the multiple active motor schemas. Higher-level deliberative processes involving attention alter the threshold values for schemas (behaviors), dynamically changing the interplay between them. Psychological support for schema use in this strictly horizontal manner is well established (Schmidt 1975).

The Norman-Shallice model incorporates aspects of both vertical and horizontal control threads. The horizontal control threads are used in a similar manner in the subsumption architecture (Brooks 1986). Deliberative influence is introduced when multiple horizontal behaviors are mediated by vertical threads that interconnect the various behaviors and allow for their dynamic modulation as a result of attentional resources, such as planning, troubleshooting, etc. Perceptual events trigger the schemas themselves, but attentional processes modulate them. This provides a coherent psychological model for integrating multiple concurrent behaviors controlled by higher-level processing.

The Norman-Shallice model points out several connections between deliberate and automatic control:
• Automatic schemas are modulated by attention arising from deliberate control.
• Schemas (behavioral tasks) compete with each other.
• Schema selection is deliberate control's principal function. Vertical threads provide the selection mechanism.
• Neuropsychological experiments are consistent with this model.

The evidence supporting the existence of a distinct supervisory attentional system is considerable. The model lacks a mechanism by which the deliberative process is conducted; this is left for others to elucidate. Even if this theory ultimately fails to explain the basis for human psychological motor behavior and planning, the model may nonetheless prove useful in its own right as a basis for integrating deliberative and behavioral control systems in robots.
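For a robotics reader, the threshold modulation in the Norman-Shallice model can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the `Schema` class, the `contention_schedule` function, the schema names, and all numeric values are hypothetical and do not come from the original model. Perceptual events set each schema's activation, while the deliberative (vertical) thread only shifts thresholds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Schema:
    name: str
    threshold: float          # activation a schema must exceed before it can run
    activation: float = 0.0   # set by perceptual events (horizontal thread)


def contention_schedule(schemas: List[Schema],
                        attention: Optional[Dict[str, float]] = None) -> List[str]:
    """Return names of schemas that exceed their threshold, largest margin first.

    `attention` maps a schema name to a threshold shift supplied by the
    deliberative (vertical) thread: positive inhibits, negative facilitates.
    """
    attention = attention or {}
    winners = []
    for s in schemas:
        effective_threshold = s.threshold + attention.get(s.name, 0.0)
        margin = s.activation - effective_threshold
        if margin > 0:
            winners.append((margin, s.name))
    winners.sort(reverse=True)
    return [name for _, name in winners]


# Hypothetical schemas: a strongly triggered habit and a weaker alternative.
schemas = [
    Schema("follow_habitual_route", threshold=0.3, activation=0.9),
    Schema("take_detour", threshold=0.3, activation=0.5),
]
automatic = contention_schedule(schemas)
willed = contention_schedule(schemas, attention={"follow_habitual_route": 0.8})
```

With no attentional input the habitual schema dominates. Raising its threshold, as in "overcoming habit," leaves only the alternative active, without deliberation ever setting an activation directly.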
6.3 TRADITIONAL DELIBERATIVE PLANNERS

Deliberative planners are often aligned with the hierarchical control community within robotics. (Hierarchical control is also referred to as intelligent control; see section 1.3.1.) Hierarchical planning systems typically share a structured and clearly identifiable subdivision of functionality relegated to distinct program modules that communicate with each other in a predictable and predetermined manner. Numerous examples illustrate this deliberative/hierarchical planning strategy (e.g., Albus, McCain, and Lumia 1987; Saridis and Valvanis 1987; Meystel 1986; Keirsey et al. 1984; Ooka et al. 1985). A generalized model appears in figure 6.2.

A typical subdivision of functionality depends on both the spatial planning scope and temporal constraints. At a hierarchical planner's highest level, the most global and least specific plan is formulated. The time requirements for producing this plan are the least stringent. As one proceeds down the planning hierarchy, the scope becomes narrower, focusing on smaller regions of the world but requiring more rapid solutions. At the lowest levels, rapid real-time response is required, but the planner is concerned only with its immediate surroundings and has lost sight of the "big picture." Meystel (1986) has developed a theory for hierarchical planning that emphasizes the significance of scope and invokes the concept of nested controllers.

Figure 6.2 Deliberative/hierarchical planning. [Diagram; recoverable labels. Hierarchical planner, top to bottom: Strategic Global Planning, Tactical Intermediate Planning, Short-Term Local Planning, Actuator Control. World model, top to bottom: Global Knowledge, Local World Model, Immediate Sensor Interpretations. Time horizon: Long-Term at the top, Real-Time at the bottom. Other labels: Actions, Sensing.]

Hierarchical planners rely heavily on world models, can readily integrate world knowledge, and have a broader perspective and scope. Behavior-based control systems, on the other hand, afford modular development, real-time robust performance within a changing world, and incremental growth, and are tightly coupled with arriving sensory data. Hybrid robotic architects believe that a union of the deliberative and behavior-based approaches can potentially yield the best of both worlds. If done poorly, however, it can yield the worst of both worlds. The central issue then becomes how to develop a unifying architectural methodology that will ensure a system capable of robust robotic plan execution yet take into account a high-level understanding of the nature of the world and a model of user intent.
6.4 DELIBERATION: TO PLAN OR NOT TO PLAN?
The integration of knowledge-based deliberation and reactive control requires the confrontation of many difficult problems. Each of these methods addresses different subsets of the complexities inherent in intelligent robotics. The hybrid system's architect contends that neither approach is entirely satisfactory in isolation but that both must be taken into account to produce an intelligent, robust, and flexible system.

The hierarchical approach is best suited for integrating world knowledge and user intent to arrive at a plan prior to its execution. Replanning with this method, however, at levels where sensory data is merged into world models is cumbersome at best. Deliberative planning without consideration for the difficult issues of plan execution can lead to restricted usage within very narrow problem domains (i.e., the ecological niche is extremely small and focused) and extremely brittle robotic systems. A robot must have the ability to respond rapidly and effectively to dynamic and unmodeled changes that occur within its world. If a purely deliberative system attempts to model and preplan for all eventualities, it risks becoming so bogged down that the planning process never terminates (the qualification problem) (see box 6.1). It is also unsafe for a robot to make gross assumptions about the world that do not reflect its dynamic nature.
Box 6.1
The "qualification problem" is related to a never-ending stream of "what if?" questions. Qualifications to a plan's utility make it more and more restrictive and less general. There are just too many "what ifs" (preconditions) to be able to enumerate them all in advance for any real-world domain. Thus, since we cannot adequately qualify a plan's applicability, it may possibly fail when applied.
The reactive approach, on the other hand, is well situated to deal with the immediacy of sensory data but is less effective in integrating world knowledge. A clear-cut distinction can be seen in the hierarchical planner's heavy reliance on world models (either a priori or dynamically acquired) as compared to the avoidance in most reactive behavior-based systems of world representations entirely. When reactive behavior-based systems are considered in isolation, robustness is gained at the expense of some very important characteristics: flexibility and adaptability. The issues of action and perception are addressed but cognition is ignored, often limiting these robots to mimicking low-level life forms. Hybrid system research assumes that representational knowledge is necessary to enhance and extend the behaviors of these machines into more meaningful problem domains. This includes the incorporation of memory and dynamic representations of the environment. Dynamic replanning must be effected not only in a reactive manner but also in the context of a more abstract plan, one representing the robot's goals and intents at a variety of planning levels. The research issues for these designers do not center on reactive versus preplanned deliberative control but rather on how to synthesize effectively a control regime that incorporates both methodologies.

The terms signifying each of the two major components of these hybrid architectures vary widely. Lyons (1992) uses planner and reactor. Malcolm and Smithers (1990) prefer cognitive and subcognitive systems, with the cognitive component performing high-level functions such as planning and the subcognitive portion controlling the robot's sensors and actuators. In this book we generally use deliberative and reactive to distinguish the two systems.

The central issue in differentiating the many approaches to hybrid architectures discussed in this chapter focuses on interface design: What is the appropriate boundary for the subdivision of functionality? How is coordination effectively carried out? This is one of the most interesting and pressing research areas in intelligent robotics today.

Lyons (1992) describes three different ways in which planning and reaction can be tied:
• Hierarchical integration of planning and reaction: Deliberative planning and reactive execution are involved with different activities, time scales, and spatial scope. Hence a multilevel hierarchical system can be structured that integrates both activities (panel (A) in figure 6.3). Planning or reacting depends on the situation at hand.
In many ways, this is closely aligned with the traditional deliberative approach with one fundamental distinction: the higher, deliberative level(s) are epistemologically distinct from the lower, reactive one(s), that is, the nature and type of knowledge and reasoning is distinct.
• Planning to guide reaction: Another alternative model involves permitting planning to configure and set parameters for the reactive control system. Execution occurs solely under the reactive system's auspices, with planning occurring both prior to and concurrent with execution, in some cases projecting the outcome of continuously formulated plans and reconfiguring the reactive system as needed (panel (B) in figure 6.3).
• Coupled planning-reacting: Planning and reacting are concurrent activities, each guiding the other (panel (C) in figure 6.3).
Figure 6.3 Typical deliberative/hierarchical planning strategies. [Diagram with panels (A), (B), and (C); recoverable labels: More Deliberative, More Reactive, Level 2, Level 1, Level 0, Planner, Deliberation, Projection, Behavioral Configurations, Parameters, Advice.]
6.5 LAYERING

One outcome of a 1995 workshop on robot architectures (Hexmoor et al. 1995) was the observation that a multilayered hybrid architecture comprising a top-layer planning system and a lower-level reactive system is emerging as the architectural design of choice (Hexmoor and Kortenkamp 1995). It was further observed that the interface or middle layer between the two components of such an architecture is the key function, linking rapid reaction and long-range planning.

Hybrid system designers look toward a synthetic, integrative approach that applies both of these paradigms (reaction and planning) to the issues of robot control, using each where most appropriate. After the decision has been made that both deliberative and reactive functionality are important for a particular application, the question arises as to how to effectively partition these functions. In general, two layers are needed at a minimum: one to represent deliberation and the other reactivity. Another common approach involves introducing an explicit third layer, concerned with coordinating the two components. Section 6.6 looks at specific instances of these hybridized architectures and gives examples of both. Indeed, in some cases, further resolution is added, producing even more layers. The bottom line, however, is that deliberation and reactivity need to be coordinated, and the architect decides where and how to implement this function.
6.6 REPRESENTATIVE HYBRID ARCHITECTURES

Four principal interface strategies are in evidence for the various hybrid architectural designs:
• Selection: Planning is viewed as configuration. The planning component determines the behavioral composition and parameters used during execution. The planner may reconfigure them as necessary because of failures in the system.
• Advising: Planning is viewed as advice giving. The planner suggests changes that the reactive control system may or may not use. This is consistent with the "plans as advice" view (Agre and Chapman 1990) in which plans offer courses of action but the reactive agent determines whether each is advisable.
• Adaptation: Planning is viewed as adaptation. The planner continuously alters the ongoing reactive component in light of changing conditions within the world and task requirements.
• Postponing: Planning is viewed as a least commitment process. The planner defers making decisions on actions until as late as possible. This enables recent sensor data, by postponing reactive actions until absolutely necessary, to provide a more effective course of action than would be developed if an initial plan were generated at the beginning. Plans are elaborated only as necessary.

This section presents four major hybrid architectures, each of which typifies one of these strategies: AuRA for selection; Atlantis for advising; Planner-Reactor for adaptation; and PRS for postponement. A survey of several other hybrid architectures is then presented. Judging from the high level of activity, the hybrid approach is currently a particularly important research topic. Note especially the distinctions in how deliberation and reactivity are interfaced, as these are hallmark characteristics for each approach.

6.6.1 AuRA
Arkin (1986, 1987b) was among the first to advocate the use of hybrid deliberative (hierarchical) and reactive (schema-based) control systems within the Autonomous Robot Architecture (AuRA). Incorporating a conventional planner that could reason over a flexible and modular behavior-based control system, Arkin found that specific robotic configurations could be constructed that integrated behavioral, perceptual, and a priori environmental knowledge (1990b). Hybridization in this system arises from two distinct components: a deliberative hierarchical planner, based on traditional AI techniques, and a reactive controller, based on schema theory (Arbib 1992). Arkin's was the first robot navigational system to be presented in this integrative manner (1989d).

Figure 6.4 depicts the components of AuRA schematically. AuRA has two major planning and execution components: a hierarchical system consisting of a mission planner, spatial reasoner, and plan sequencer coupled with a reactive system, the schema controller. In the style of a traditional hierarchical planning system (Albus, McCain, and Lumia 1987; Meystel 1986; Saridis and Valvanis 1987), the highest level of AuRA is a mission planner concerned with establishing high-level goals for the robot and the constraints within which it must operate. In AuRA-based systems constructed to date, the mission planner has acted primarily as an interface to a human commander. The spatial reasoner, originally referred to as the navigator (Arkin 1987b), uses cartographic knowledge stored in long-term memory to construct a sequence of navigational path legs that the robot must execute to complete its mission. In the first implementation of AuRA, this was an A* planner operating over a meadow map (hybrid free space/vertex graph) representation (Arkin 1989c). The plan sequencer, referred to as the pilot in earlier work, translates each path leg the spatial reasoner generates into a set of motor behaviors for execution. In the original implementation, the plan sequencer was a rudimentary rule-based system. More recently it has been implemented as a finite state sequencer (MacKenzie, Cameron, and Arkin 1995). Finally, the collection of behaviors (schemas), specified and instantiated by the plan sequencer, is then sent to the robot for execution. At this point, deliberation ceases, and reactive execution begins.

The schema manager is responsible for controlling and monitoring the behavioral processes at run time. Each motor behavior (or schema) is associated with a perceptual schema capable of providing the stimulus required for that particular behavior.
This action-oriented perception is the basis for this form of behavior-based navigation (Arkin 1990a). As described in section 4.4, each behavior generates a response vector in a manner analogous to the potential fields method. The schemas operate asynchronously, transmitting their results to a process (move-robot) that sums and normalizes these inputs and transmits them to the low-level control system for execution.

Within AuRA, a homeostatic control system (tested only in simulation to date) is interwoven with the motor and perceptual schemas (Arkin 1992c). Internal sensors, such as fuel level and temperature transducers, provide information over a broadcast network monitored by behaviors containing suitable receptors. These internal messages change the overall motor response's performance by altering the behaviors' internal parameters and relative strengths in an effort to maintain balance and system equilibrium (homeostasis). Chapter 10 discusses homeostatic control further.

Figure 6.4 High-level AuRA schematic. [Diagram; recoverable labels: Hierarchical Component, Reactive Component, User Input, User Intentions, User Profile, Spatial Goals, Mission Alterations, Teleautonomy, Learning, Plan Recognition, Spatial Learning, Opportunism, On-line Adaptation.]

Once reactive execution begins, the deliberative component is not reactivated unless a failure is detected in the reactive execution of the mission. A typical failure is denoted by lack of progress, evidenced either by no motion or a time-out. At this point the hierarchical planner is reinvoked one stage at a time, from the bottom up, until the problem is resolved. First, the plan sequencer attempts to reroute the robot based on information obtained during navigation and stored in short-term memory (STM). Original implementations used sonar maps produced by the Elfes-Moravec algorithm for spatial world modeling (Elfes 1986). If for some reason this proves unsatisfactory (e.g., the route is completely blocked within this local context), the spatial reasoner is reinvoked and attempts to generate a new global route that bypasses the affected region entirely. If this still fails to be satisfactory, the mission planner is reinvoked, informing the operator of the difficulty and asking for reformulation or abandonment of the entire mission.

Modularity, flexibility, and generalizability as a result of hybridization constitute AuRA's principal strengths. The value of each aspect has been demonstrated both in simulation and on real robotic systems.
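The bottom-up failure recovery described above can be sketched in a few lines. This is a minimal illustration, not AuRA's implementation: the function names, the string-based "problem" description, and the numeric thresholds are all assumptions made for the example. Each stage either returns a new plan or `None`, in which case the next stage up is reinvoked.

```python
from typing import Callable, List, Optional, Tuple

Replanner = Callable[[str], Optional[str]]


def lack_of_progress(moved_m: float, elapsed_s: float,
                     min_move_m: float = 0.05, timeout_s: float = 30.0) -> bool:
    """Failure test: no motion or a time-out (thresholds are illustrative)."""
    return moved_m < min_move_m or elapsed_s > timeout_s


def plan_sequencer(problem: str) -> Optional[str]:
    # Local reroute from short-term memory; gives up if the local route is blocked.
    return None if "blocked" in problem else "local reroute"


def spatial_reasoner(problem: str) -> Optional[str]:
    # New global route that bypasses the affected region, if one exists.
    return None if "no alternative" in problem else "new global route"


def mission_planner(problem: str) -> Optional[str]:
    # Last resort: hand the decision back to the human operator.
    return "ask operator to reformulate or abandon mission"


BOTTOM_UP: List[Tuple[str, Replanner]] = [
    ("plan sequencer", plan_sequencer),
    ("spatial reasoner", spatial_reasoner),
    ("mission planner", mission_planner),
]


def recover(problem: str) -> Tuple[str, str]:
    """Reinvoke one stage at a time, lowest first, until one resolves the problem."""
    for stage, replan in BOTTOM_UP:
        plan = replan(problem)
        if plan is not None:
            return stage, plan
    raise RuntimeError("no stage could resolve: " + problem)
```

For instance, `recover("route blocked")` skips the plan sequencer and is resolved by the spatial reasoner, while a problem that also reports no alternative route reaches the mission planner.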
AuRA is highly modular by design. Components of the architecture can be replaced with others in a straightforward manner. This is particularly useful in research. Some examples include
• A specialized mission planner developed for an assembly task where boxes are pushed together into a specified arrangement. This planner was ported to a Denning mobile robot that competed in the 1993 American Association for Artificial Intelligence (AAAI) mobile robot competition. Stroulia (1994) further extended the planner to learn and reason over more general planning tasks.
• The original A* spatial reasoner has been replaced with Router (Goel et al. 1994), a multistrategy planner. Router models navigable routes as topological links between nodes instead of the metric meadow map representation used previously. The system was tested on a Denning mobile robot that successfully navigated from room to room and down corridors in an office and laboratory building.
• Perceptual schemas have been expanded to incorporate specialized action-oriented sensor fusion methods (Murphy and Arkin 1992) (section 7.5.7). Because of the recognition that in many cases multiple sensor sources are better than individual ones, specialized strategies were developed to fuse data within the context of action-oriented perception. Dempster-Shafer statistical methods provided the basis for evidential reasoning (Murphy 1991).
• The original rule-based plan sequencer has been replaced with a temporal sequencer (Arkin and MacKenzie 1994) based on FSAs (MacKenzie, Cameron, and Arkin 1995). The FSA is an expression of a plan, in which each state represents a specific combination of behaviors that accomplish one step of the task. Transitions are made from one state to another when significant perceptual events trigger them.

Another strength of AuRA is the flexibility it provides for introducing adaptation and learning methods. Chapter 8 will discuss these and other methods further. In early implementations of AuRA, learning arose only from STM of spatial information used for dynamic replanning.
Since then, a variety of learning techniques have been introduced, including
• on-line adaptation of motor behaviors using a rule-based methodology (Clark, Arkin, and Ram 1992)
• case-based reasoning methods to provide discontinuous switching of behaviors based on the recognition of new situations (Ram et al. 1997)
• genetic algorithms that configure the initial control system parameters efficiently (Ram et al. 1994) and allow a robot to evolve toward its ecological niche in a given task environment
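The FSA-based temporal sequencer mentioned earlier in this section, in which each state names a combination of behaviors and perceptual events trigger transitions, can be sketched as a pair of lookup tables. The state names, behavior names, and events below are hypothetical and chosen only to make the idea concrete.

```python
from typing import Dict, FrozenSet, List, Tuple

# Each state names the combination of behaviors active during one step of the task.
BEHAVIORS: Dict[str, FrozenSet[str]] = {
    "travel_leg": frozenset({"move_to_goal", "avoid_obstacles"}),
    "enter_door": frozenset({"dock", "avoid_obstacles"}),
    "done": frozenset(),
}

# (current state, perceptual event) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("travel_leg", "door_detected"): "enter_door",
    ("enter_door", "through_door"): "done",
}


def step(state: str, event: str) -> str:
    """Follow a transition if the event is significant in this state, else stay."""
    return TRANSITIONS.get((state, event), state)


def run(events: List[str], start: str = "travel_leg") -> List[str]:
    """Return the sequence of states visited while consuming the events."""
    state = start
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace


trace = run(["sonar_noise", "door_detected", "through_door"])
```

Events with no entry in `TRANSITIONS` leave the state, and therefore the active behavior set, unchanged, so the reactive layer keeps running between plan steps.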
AuRA's generalizability to a wide range of problems is another strength. Various architectural components have been applied in a variety of domains, including
• manufacturing environments (Arkin et al. 1989; Arkin and Murphy 1990)
• three-dimensional navigation as found in aerial or undersea domains (Arkin 1992a)
• indoor and outdoor navigation (Arkin 1987b)
• robot competitions (Arkin et al. 1993; Balch et al. 1995)
• vacuuming (MacKenzie and Balch 1993)
• military scenarios (MacKenzie, Cameron, and Arkin 1995; Balch and Arkin 1995)
• mobile manipulation (Cameron et al. 1993)
• multirobot teams (Arkin 1992b; Balch and Arkin 1994)

AuRA's major strength results from the power of wedding two distinct AI paradigms: deliberation and reactivity. AuRA provides a framework for the conduct of a wide range of robotic research including deliberative planning, reactive control, homeostasis, action-oriented perception, and machine learning. It has been motivated but not constrained by biological studies, drawing insight wherever available as a guideline for system design.

AuRA's strengths lie in its modularity, which permits ready integration of new approaches to various architectural components; flexibility, as evidenced by the ease of introduction of various learning methodologies and novel behaviors; generalizability, demonstrated by its applicability to a wide range of domains, including robot competitions, among others; and most importantly, use of hybridization to exploit the strengths of both symbolic reasoning and reactive control.

6.6.2 Atlantis
At the Jet Propulsion Laboratory (JPL), Gat (1991a) developed a three-level hybrid system, Atlantis, that incorporates a deliberator that handles planning and world modeling, a sequencer that handles initiation and termination of low-level activities and addresses reactive-system failures to complete the task, and a reactive controller charged with managing collections of primitive activities (figure 6.5). The architecture is both asynchronous and heterogeneous. None of the layers is in charge of the others, and activity is spread throughout the architecture.
Figure 6.5 The Atlantis architecture. [Diagram; recoverable labels: Sensors, Actuators, Status, Results, Invocation, Deliberative.]

Atlantis's control layer is implemented in ALFA (Gat 1991b), a LISP-based programming language used to program reactive modules configured in networks connected via communication channels. ALFA is most closely related to Kaelbling's Rex (1987), a circuit-based language (section 3.2.4.2). This system was initially tested on Tooth (figure 6.6, panel (A)), a small precursor to the Mars microrovers used for NASA's Pathfinder program (Shirley and Matijevic 1995) (figure 6.6, panels (B) and (C)), and a Real World Interface (RWI) base for indoor navigational experiments.

The sequencing layer of Atlantis is modeled after Firby's RAPs (section 3.1.3). Conditional sequencing occurs upon the completion of various subtasks or the detection of failure. In particular, the notion of cognizant failure is introduced (Gat and Dorais 1994), referring to the robot's ability to recognize on its own when it has not or cannot complete its task. Monitor routines are added to the architecture to determine if things are not going as they should and then interrupt the system if cognizant failure occurs. Often these monitor routines are very task specific, such as checking alignment conditions when conducting wall following, but they can be more general, such as a time-out for the overall completion of a task.

Deliberation occurs at the sequencing layer's request (Gat 1992). The deliberator consists of traditional LISP-based AI planning algorithms specific to the task at hand. The planner's output is viewed only as advice to the sequencer layer: it is not necessarily followed or implemented verbatim.
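Two ideas from this description, monitor routines that signal cognizant failure and deliberator output treated as advice, can be illustrated with a short sketch. It is written in Python rather than ALFA or LISP, and every name in it (`CognizantFailure`, `timeout_monitor`, `alignment_monitor`, `next_activity`, the status keys, and the thresholds) is a hypothetical stand-in rather than part of Atlantis.

```python
from typing import Callable, Dict, List, Optional

Status = Dict[str, float]
Monitor = Callable[[Status], Optional[str]]


class CognizantFailure(Exception):
    """The robot has recognized on its own that the task is not succeeding."""


def timeout_monitor(limit_s: float) -> Monitor:
    # General-purpose monitor: overall time-out for completing a task.
    return lambda st: "time-out" if st["elapsed_s"] > limit_s else None


def alignment_monitor(max_error_rad: float) -> Monitor:
    # Task-specific monitor: alignment check during wall following.
    return lambda st: ("lost wall alignment"
                       if abs(st["wall_angle_rad"]) > max_error_rad else None)


def check(status: Status, monitors: List[Monitor]) -> None:
    """Interrupt the current activity if any monitor reports a problem."""
    for monitor in monitors:
        reason = monitor(status)
        if reason is not None:
            raise CognizantFailure(reason)


def next_activity(advice: Optional[str], runnable: List[str], fallback: str) -> str:
    """Treat deliberator output as advice: use it only if it is runnable now."""
    return advice if advice in runnable else fallback
```

A sequencer loop would call `check` on each cycle, catch `CognizantFailure`, and then pick its next activity with `next_activity`, which ignores advice that cannot currently be executed.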
Figure 6.6 (A) Tooth, (B) Rocky 4, and (C) Sojourner.
Design in Atlantis proceeds from the bottom up: low-level activities capable of being executed within the reactive-controller level are first constructed. Suitable sequences of these primitive behaviors are then constructed for use within the sequencing level, followed by deliberative methods that assist in the decisions the sequencer makes.

Experiments have been performed on a large outdoor JPL Mars rover testbed called Robby (figure 6.7) (Gat 1992), which successfully undertook various complex navigational tasks in rough outdoor terrain. The primitive activities used in the reactive controller were based on Slack's NATs (section 3.3.2) and were guided by a strategic plan constructed by the deliberator. The sequencer was then able to abandon intermediate-level navigational goals if they became untenable as noted by advice from the deliberator.
Figure 6.7 Robby, a JPL Mars rover prototype.

Summarizing Atlantis's important features:
• Atlantis is a three-layered architecture consisting of controller, sequencer, and deliberator.
• Asynchronous, heterogeneous reactivity and deliberation are used.
• The results of deliberation are viewed as advice, not decree.
• Classical AI is merged effectively with behavior-based reactive control methods.
• Cognizant failures provide an opportunity for plan restructuring.
• The system has been exercised successfully on both indoor and outdoor robotic systems.

6.6.3 Planner-Reactor Architecture
Lyons and Hendriks (1992, 1995) forward the Planner-Reactor architecture as another means for integrating planning and reactivity. Their philosophy advocates the use of a planner as a mechanism to continuously modify an executing reactive control system. Figure 6.8 depicts this approach. The planner is in essence an execution monitor that adapts the underlying behavioral control system in light of the changing environment and the agent's underlying goals.
Figure 6.8 Planner-Reactor architecture. [Diagram; recoverable labels: World, Action, Sensing.]
The RS model, discussed in section 3.2.4.1, is used both to model and to implement the reactor component. It is assumed that a suboptimal reactor may be present at any time and that the planner's goal is to improve the performance of the reactor at all times. Loosely speaking, this is a form of anytime planning, where a significantly suboptimal solution may be initially chosen, then improved on during execution.
Anytime planners provide approximate answers in a time-critical manner such that:
• At any point, a plan is available for execution, and
• The quality of the plan available increases over time (cf. Dean and Wellman 1991).
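The two anytime properties can be demonstrated on a toy problem. The sketch below is not the Planner-Reactor planner: it assumes a hypothetical task of visiting positions along a line starting from 0, uses simple pairwise-swap hill climbing as the improvement step, and measures the deadline in improvement steps rather than wall-clock time so that the behavior is deterministic.

```python
from typing import Iterator, List, Sequence, Tuple


def cost(order: Sequence[float]) -> float:
    """Total travel distance when visiting 1-D positions in order, starting at 0."""
    position, total = 0.0, 0.0
    for stop in order:
        total += abs(stop - position)
        position = stop
    return total


def anytime_plans(stops: Sequence[float]) -> Iterator[Tuple[List[float], float]]:
    """Yield a usable plan immediately, then strictly cheaper ones as they are found."""
    best = list(stops)
    best_cost = cost(best)
    yield best[:], best_cost              # property 1: a plan is always available
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for j in range(i + 1, len(best)):
                candidate = best[:]
                candidate[i], candidate[j] = candidate[j], candidate[i]
                candidate_cost = cost(candidate)
                if candidate_cost < best_cost:   # property 2: quality only improves
                    best, best_cost = candidate, candidate_cost
                    improved = True
                    yield best[:], best_cost


def plan_within(stops: Sequence[float], max_improvements: int) -> Tuple[List[float], float]:
    """Return the best plan found after at most `max_improvements` refinements."""
    result = (list(stops), cost(stops))
    for count, plan in enumerate(anytime_plans(stops)):
        result = plan
        if count >= max_improvements:
            break
    return result
```

Calling `plan_within(stops, 0)` returns the initial, possibly poor, ordering at once; larger budgets return plans whose cost is never higher.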
Situations provide the framework for structuring sets of reactions. They can be hierarchically defined and often denote the state the robotic agent is currently in regarding a task. For the primary task studied in Lyons and Hendriks' (1993) work, parts assembly, a situational hierarchy can be structured as depicted in figure 6.9. Here, the situation where the robot needs to build kits consists of various constituent situations, each of which may in turn consist of further situational specifications. This hierarchy is not unlike the task/subtask hierarchies developed by traditional AI planning systems such as Noah (Sacerdoti 1975), but differs in that the situations specify behavioral structures for use in the reactor and not specific robotic commands.
[Figure 6.9: situational hierarchy for the parts assembly task. Recoverable labels: BUILD KITS, BUFFER PARTS, HANDLE TRAY, FIND TRAY, GET TRAY, MOVE TRAY, PLACE PARTS, HANDLE PART A, HANDLE PART B, HANDLE PART C.]
The situations themselves are also encoded using RS formalisms. Planning is viewed as a form of adaptation (Lyons and Hendriks 1994). A reactor executes under a set of operating assumptions. If any assumptions are violated regarding the utility of a particular reactor configuration, the planner modifies the reactor's control system to remove the assumption violation. If the violation occurs as a result of environmental changes, the strategy is referred to as forced assumption relaxation. Planner-directed relaxation of assumptions can also occur because of a change in high-level goals (e.g., from user input). The assumptions used within the Planner-Reactor architecture are generally highly domain specific (i.e., strong knowledge).
..Strong information to a dom involves Knowledge peculiar particul proble and has little or no . general utility can be used across domai and has Weak is information Knowledge many . that broad utility For partsassemblyby a robot in a manufacturingwork cell, someof these assumptionswithin the Planner-Reactorarchitectureinclude . Partquality: Eachpart meetsthe necessarycriteria or specificationsfor use . in the assembly . Non-substitutabilityof parts: Eachpart hasonly onetype. . No partsmotion: Partsdo not moveoncedeliveredinto the work space.
- No downstream disturbance: Subsequent manufacturing processes are always ready to receive the assembled parts.
- Filled tray: All the parts are delivered to the work cell.
- Tray disturbance: The tray is not moved after arrival.
- Parts homogeneity: Parts arrival is evenly distributed.

Clearly, in the real world, violations of these assumptions are not only possible but likely. Each assumption has a monitor associated with it during run time to ensure its validity. If, for whatever reason, an assumption violation is detected, the planner relaxes the assumption and adapts the control system to deal with the new situation. These violations often occur because of environmental factors beyond the robot's control. The planner can reinstate assumptions later, once the original situation has been restored, along with a reactor reconfiguration and reinstantiation of a suitable assumption monitor. Figure 6.10 depicts the flow of control in this architecture. This process is recursive, as an adapted reactor can be further adapted by the planner.

A variation of the Planner-Reactor architecture has been developed for planning and controlling a multifingered robotic hand (Murphy, Lyons, and Hendriks 1993). The deliberative planner is referred to as the grasp advisor and has an associated grasp reactor. Grasp selection is based (ideally) on the task requirements, the feasibility of acquiring the part using the proposed grasp, and the stability afforded the part once grasped in that manner. The initial implementation, however, is concerned only with the stability criterion. Typical behavioral components for a reactive grasping system include find-objects, grasp-objects, and avoid-obstacles, which are all self-explanatory in function.
The deliberative grasp advisor, using environmental knowledge such as part information obtained through vision, communicates global constraints to the reactor, which then biases the actual grasp strategy used for initial contact with the part. This example task does not currently use assumptions within the grasp advisor in the same way that the kitting assembly system does but nonetheless exemplifies how deliberation and reactivity can be effectively integrated.
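The assumption-monitoring cycle described above for the kitting work cell (monitor, relax, adapt, and later reinstate) can be illustrated with a small sketch. This is not the RS formalism or the authors' implementation: the class names, the `adaptations` table, and the behavior names `place-parts` and `relocate-part` are all hypothetical, chosen only to show the control flow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Assumption:
    """An operating assumption plus its run-time monitor (hypothetical structure)."""
    name: str
    holds: Callable[[Dict], bool]  # monitor predicate over the sensed world state
    relaxed: bool = False


@dataclass
class Reactor:
    """Stand-in for the reactive control system: just a list of active behaviors."""
    behaviors: List[str] = field(default_factory=list)


class Planner:
    def __init__(self, assumptions: List[Assumption], adaptations: Dict[str, str]):
        self.assumptions = assumptions
        # Assumed mapping: assumption name -> behavior added while it is relaxed.
        self.adaptations = adaptations

    def step(self, world: Dict, reactor: Reactor) -> List[Tuple[str, str]]:
        """Run every monitor once and adapt the reactor where needed."""
        events = []
        for a in self.assumptions:
            ok = a.holds(world)
            if not ok and not a.relaxed:
                # Violation detected: relax the assumption and adapt the reactor.
                a.relaxed = True
                reactor.behaviors.append(self.adaptations[a.name])
                events.append(("relaxed", a.name))
            elif ok and a.relaxed:
                # Original situation restored: reinstate and undo the adaptation.
                a.relaxed = False
                reactor.behaviors.remove(self.adaptations[a.name])
                events.append(("reinstated", a.name))
        return events


planner = Planner(
    [Assumption("no_parts_motion", lambda w: not w["part_moved"])],
    {"no_parts_motion": "relocate-part"},
)
reactor = Reactor(["place-parts"])
first = planner.step({"part_moved": True}, reactor)    # assumption violated
second = planner.step({"part_moved": False}, reactor)  # situation restored
print(first, second, reactor.behaviors)
```

In this sketch the planner never halts the reactor; it only adds or removes behaviors while the reactor keeps running, which is the sense in which adaptation is an on-line process.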
Summarizing, the key features of the Planner-Reactor methodology are as follows:
- Deliberation and reactivity are integrated through asynchronous interaction of a planner and a concurrent reactive control system.
- Planning is viewed as a form of reactor adaptation.
- Adaptation is an on-line process rather than an off-line deliberation.
- Planning is used to remove errors in performance when they occur.
- The reactor undergoes situationally dependent on-line performance improvement.
- The basic techniques, tested in both assembly work cell tasks and grasp planning for a robotic hand, are believed applicable to a broad range of applications, including mobile robot navigation and emergency response planning.

Figure 6.10 FSA representing flow of control for assumption-based planning. [State labels: REACTOR PERFORMANCE WITH MONITORING; REACTOR ADAPTED BY PLANNER AND ASSUMPTIONS RELAXED; RESTORE INITIAL; HALT REACTOR.]

6.6.4 The Procedural Reasoning System
The Procedural Reasoning System (PRS) (Georgeff and Lansky 1987) provides an alternative strategy for looking at the integration of reactivity and deliberation. Reactivity in this system refers to the postponement of the elaboration of plans until it is necessary, a type of least-commitment strategy.
A least-commitment strategy defers making a decision until it is absolutely necessary to do so. The information necessary to make a correct decision is assumed to become available later in the process, thus reducing the need for backtracking.
Figure 6.11 The Procedural Reasoning System (PRS).
In PRS, plans are the primary mode of expressing action, but these plans are continuously determined in reaction to the current situation. Previously formulated plans undergoing execution can be interrupted and abandoned at any time. Representations of the robot's beliefs, desires, and intentions are all used to formulate a plan. The plan, however, represents the robot's desired behaviors instead of the traditional AI planner's output of goal states to be achieved. Figure 6.11 depicts the overall PRS architecture. The interpreter drives system execution, carrying out whatever plan is currently deemed suitable. As new beliefs, desires, or intentions arise, the plan may change, with the interpreter handling the plan switching. A symbolic plan always drives the system, however, so it is not reactive in the normal sense of tight sensorimotor pair execution, but it is reactive in the sense that perceived changing environmental conditions permit the robotic agent to alter its plans on the fly.

The system was tested and developed on SRI's robot Flakey for use in office navigation tasks (figure 6.12). UM-PRS (Lee et al. 1994) is a later variation of this PRS system that has been applied to the Defense Advanced Research Projects Agency Unmanned Ground Vehicle (DARPA UGV) Demo II project for outdoor off-road military scouting missions using HMMWVs. We revisit this system as a piece of the DARPA UGV Demo II program in chapter 9.

Figure 6.12 Flakey. (Photograph courtesy of Kurt Konolige and SRI International.)
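A least-commitment interpreter of the kind described for PRS can be sketched in a few lines. The sketch below is only an illustration under stated assumptions, not the PRS implementation: the `Plan` structure, the `("achieve", subgoal)` step encoding, and the office-delivery plan library are hypothetical. It shows one idea only, namely that a subgoal is elaborated at the moment it is reached, using the beliefs current at that moment. Interruption and abandonment of running plans are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple, Union

# A step is either a primitive action name or a deferred subgoal.
Step = Union[str, Tuple[str, str]]  # e.g. "pick-up" or ("achieve", "reach-office")


@dataclass
class Plan:
    goal: str
    context: Callable[[Dict], bool]  # applicability test over current beliefs
    body: List[Step]


def select(goal: str, beliefs: Dict, library: List[Plan]) -> Optional[Plan]:
    """Return the first plan for `goal` whose context holds right now."""
    for plan in library:
        if plan.goal == goal and plan.context(beliefs):
            return plan
    return None


def run(goal: str, belief_stream: List[Dict], library: List[Plan]) -> List[str]:
    """One interpreter cycle per belief snapshot; returns the executed actions."""
    stack: List[Step] = [("achieve", goal)]
    trace: List[str] = []
    for beliefs in belief_stream:
        if not stack:
            break
        step = stack.pop()
        if isinstance(step, tuple):
            # Elaborate the subgoal only now, with the latest beliefs.
            plan = select(step[1], beliefs, library)
            if plan is None:
                stack.append(step)  # nothing applicable yet; retry next cycle
            else:
                stack.extend(reversed(plan.body))
        else:
            trace.append(step)  # "execute" a primitive action
    return trace


library = [
    Plan("deliver", lambda b: True,
         ["pick-up", ("achieve", "reach-office"), "drop-off"]),
    Plan("reach-office", lambda b: b["door_open"], ["go-through-door"]),
    Plan("reach-office", lambda b: not b["door_open"], ["take-corridor"]),
]
# The door closes before the robot gets around to elaborating "reach-office".
beliefs_over_time = [{"door_open": True}] * 2 + [{"door_open": False}] * 3
trace = run("deliver", beliefs_over_time, library)
print(trace)
```

Because the route subgoal is expanded only on the third cycle, the choice between the two `reach-office` plans reflects the door's state at that time rather than its state when the top-level plan was adopted.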
6.6.5 Other Hybrid Architectures

We now survey other efforts in the development of hybrid deliberative/reactive architectures. The solutions being explored are diverse, especially in regard to where deliberation should end and reactivity begin and whether planning should be viewed as selection, advising, adaptation, postponement, or something else.
- SSS (Connell 1992): SSS, developed at the IBM T.J. Watson Research Center, is a hybrid architecture that descended directly from the subsumption architecture (section 4.3). The letters in SSS stand for each of its three layers: servo, subsumption, and symbolic. The interface between the servo layer and the symbolic is not particularly new: together they provide behavioral modularity and flexibility to the underlying servomotor controllers by providing parameters and set points for the servo loops in the same manner as subsumption. SSS's novelty lies in its use of world model representations, which are viewed as a convenience, but not a necessity, for certain tasks. The symbolic (deliberative) layer provides the ability to selectively turn behaviors on or off as well as provide parameters for those that require them. Once the behaviors are configured, they continue to execute without any intervention of the symbolic level. Restating, the symbolic level predetermines the behavioral configuration used during execution. The system was tested on a small mobile robot, TJ (figure 6.13), capable of moving at an average speed of a little under three feet per second in an indoor office environment. The symbolic level handles where-to-go-next decisions (strategic), whereas the subsumption level handles where-to-go-now choices (tactical). A coarse geometric map of the world is present at the strategic level, and route planning is conducted within this representation. Piecewise segmentation of the route in a manner similar to that of AuRA provides the behavioral configuration for each leg of the overall journey.
Also worth mentioning is an earlier system (Soldo 1990), considerably less developed, that advocates the use of behavioral experts coordinated by an AI planner using a world map. This system provides a framework for integrating deliberative planning and reactive control by allowing the planner to choose appropriate behaviors along the same lines as both SSS and AuRA.
Figure 6.13 TJ. (Photograph courtesy of Jon Connell.)
- Multi-Valued Logic (Saffiotti et al. 1995): Researchers at SRI International, drawing heavily on many earlier ideas and synthesizing them formally, have developed a novel hybrid architecture that uses a multivalued logic (MVL) representation for behaviors (motor schemas) as the reactive component coupled with gradient fields as goals in the manner of Payton's work (section 5.2.2). Multivalued logic provides the ability to have a variable planner-controller interface that is strongly context dependent. In other words, the decision when to plan and when to react reflects the nature of the environment. Further, behavioral plans are included that draw inspiration from PRS-style deliberation. These provide a form of preplanned behavior that can
be invoked and elaborated as necessary. This system has been successfully tested on the robot Flakey (figure 6.12) in various indoor office environments.
- SOMASS Hybrid Assembly System (Malcolm and Smithers 1990): The SOMASS system is an assembly system consisting of two parts: the cognitive (deliberative) and the subcognitive (reactive) components. The cognitive part consists of a symbolic planner designed to be as ignorant as possible, a virtue according to the system designers. The intent is to avoid clogging the reasoner with unnecessary knowledge. The planner itself is hierarchical in structure, concerned first with finding a suitable ordering of parts to produce the required assembly, then subsequently with determining gravitational stability consistent with the ordering, producing suitable grasps to acquire the part, ensuring that assembly tolerances for handling errors are met, and translating the plan into executable robot code consisting of parameterized behavioral modules suitable for execution. The subcognitive component is concerned with the actual execution of the behaviors after the plan is downloaded to the robot. In this system, implemented on a working robotic arm, there is a clear-cut division between deliberation and planning but a limited ability to exchange information between them.
- Agent Architecture (Hayes-Roth et al. 1993): Plans in this architecture are considered descriptions of intended courses of behavior. Two levels are specified within the agent architecture: the physical level, concerned with perception and action within the environment, and the cognitive level, for higher-level reasoning needs such as problem-solving and planning. According to the designers, finer resolution could yield more than two levels, but currently this number seems adequate. A plan is communicated from the cognitive level to the physical level, with feedback from the execution of the plan returned to the cognitive level.
The designers claim that reactive and deliberative (planning) behaviors can coexist within each level, so the standard partitioning of reactivity and deliberation does not pertain. As examples, situation assessment can occur within the cognitive level, and limited path planning can occur within the physical level. The difference between levels is essentially epistemological and temporal, based on the following distinctions:
  - Whether symbolic reasoning (cognitive) versus metric (physical) is employed.
  - Time horizons for history and reaction are both significantly shorter for the physical level compared to the cognitive one.
  - Greater abstraction is present at the cognitive level.
The system has been tested on mobile robots for both surveillance and delivery tasks.
- Theo-Agent (Mitchell 1990): "Reacts when it can, plans when it must" is the motto for Theo-Agent. This hybrid system, developed at Carnegie-Mellon University, focuses predominantly on learning: in particular, learning how to become more reactive, more correct in its actions, and more perceptive about the world's features relevant to its performance. Stimulus-response (behavioral) rule selection and execution are the basis for reactive action, using an arbitration mechanism to choose the most appropriate rule. If no rules apply, then and only then is the planner invoked to determine a suitable course of action. As a result of the planning process, a new stimulus-response rule is added to the existing rule set. This new rule can be used again later should the same or similar situation arise, this time without the need for planning. Theo-Agent was tested on a Hero 2000 mobile robot given the task of locating garbage cans. A stimulus-response reaction took on the order of 10 milliseconds, whereas the planner required several minutes. Hence if the robot begins with few or no rules, its reaction time decreases from initially several minutes to under a second (about two orders of magnitude) as it experiences more and more of the world.
- Generic Robot Architecture (Noreils and Chatila 1995): This hybrid architecture, developed in France, consists of three levels that bridge the spectrum of planning to reactivity:
  - Planning level: Generates sequences of tasks to achieve the robot's high-level goals using a STRIPS-like planning system (Nilsson 1980).
  - Control system level: Translates the plan into a set of tasks and configures the functional level prior to execution.
  - Functional level: Corresponds to a set of functional modules (servo processes) analogous to behaviors concerned with reactive execution. Implemented behaviors include obstacle avoidance, wall following, and visual tracking.
This system is similar in structure to several of the selection-style architectures already encountered in this chapter.
A significant contribution of this work lies in the development of a task description language that provides a formal method for designing and interfacing these modules. Special attention has also been paid to diagnostic and error recovery procedures. The system has been tested on the Hilare series of robots, using vision for tracking tasks.
- Dynamical Systems Approach (Schöner and Dose 1992): This approach has been significantly influenced by biological systems research as a basis for providing an integrative hybrid approach for reacting and planning. Suitable vector fields, designed using potential field methods (section 3.3.2), serve as the basis for planning and provide a clean bridge to behavior-based reactive execution. The deliberative planner operates within the reactive controller's
representational space, dealing with the underlying controller's mathematics and dynamics rather than reasoning symbolically. Planning consists of selecting and providing parameters for each of the associated behavioral fields and determining their relative strength for summation purposes in light of the task's constraints. This work has been demonstrated in simulation only to date, but it provides an interesting way of rethinking the representational space in which high-level deliberative planning can operate.
- Supervenience Architecture (Spector 1992): The supervenience architecture provides an environment for integrating reaction and deliberation based on abstraction, in particular the "distance from the world" (supervenience) that the abstracted concept represents. Although the ramifications of this work are often more concerned with philosophy than robotics, a multilevel implementation of the architecture referred to as the abstraction-partitioned evaluator (APE) has been implemented. It consists of multiple levels ranging from the perceptual/manual (lowest level), to spatial, temporal, causal, and finally conventional (highest level), connected in a strict hierarchy. To test these ideas, a simulated homebot capable of actions such as grab, move-object, move-right, rotate, and the like has been used. The main premise is that reactivity and deliberation are differentiated primarily by their levels of abstraction and how far they are removed from the real world. Supervenience provides an integrated formalism for describing these many levels of abstraction. As such, it somewhat blurs the distinctions that other hybrid architectures make and thus leans towards a more traditional hierarchical design (e.g., Albus 1991, Meystel 1986).
- Teleoreactive Agent Architecture (Benson and Nilsson 1995): This hybrid deliberative/reactive architecture is based on the construction of a plan in the form of a set of teleoreactive (TR) operators (section 3.3.1) which an arbitrator then selects for reactive execution.
The deliberative component involves hierarchical planning, yielding a tree-like structure that consists of TR programs. The TR formalism provides the unifying representation for both reasoning and reacting. The system has been tested in a simulated botworld environment.
- Reactive Deliberation (Sahota 1993): The reactive deliberation architecture consists of two distinct layers: the deliberator and the reactive executor. The executor consists of action schemas operating at a level similar to that of RAPs (section 3.1.3). The deliberator enables a single action schema at a time and gives it parameters. Deliberation in this architecture refers not to higher-level abstract reasoning but rather to the selection of one of the many potential behaviors currently appropriate for execution in the given situation consistent with the agent's goals. In many respects this is merely an elaborated version of an action-selection mechanism, but it provides us with another way to think
about the interface between planning and reactivity. This system has been tested using small robots for playing tabletop soccer. The behaviors for this domain consist of activities such as shoot, defend-red-line, clear, go-to-home-line, and so on.
- Integrated Path Planning and Dynamic Steering Control (Krogh and Thorpe 1986): In early work demonstrated in simulation, a strategy for using path planning methods using relaxation over a grid-based world model was coupled with a potential fields-based steering controller. The path planner generated subgoals referred to as critical points for the steering controller. Potential fields methods (Krogh 1984), modified to provide real-time feedback similar to those used in AuRA, provided the local navigational capabilities for achieving the series of subgoals the path planner established. Though not really a behavior-based model, this early example provides a clear integration between path planning and reactive control.
- UUVs: Hybrid architectures have been applied to undersea navigational tasks by several researchers. The rational behavioral model (Byrnes et al. 1996), a three-layer architecture consisting of execution, tactical, and strategic layers, reasons over behaviors at different levels. Although cast in a more hierarchical framework (cf. Saridis 1983), it uses primitive behaviors as the primary object for planning. Another system (Bonasso 1991) uses Gapps/Rex (section 3.2.4.2) as the underlying reactive control methodology and subsumption competences (behaviors) as the primary operators. This is more an example of hybridizing two different reactive strategies than of deliberation and reactivity. The target vehicle for both architectures is an undersea robot.
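Of the architectures surveyed in this list, Theo-Agent's "reacts when it can, plans when it must" policy is perhaps the easiest to make concrete. The sketch below is a deliberately reduced illustration, not Mitchell's system: situations are assumed to be hashable labels, rule arbitration is reduced to an exact dictionary lookup, and `slow_planner` is a hypothetical stand-in for a planner that would take minutes on the real robot.

```python
from typing import Callable, Dict, Hashable


class ReactWhenPossibleAgent:
    """Caches the planner's answers as stimulus-response rules (hypothetical names)."""

    def __init__(self, planner: Callable[[Hashable], str]):
        self.rules: Dict[Hashable, str] = {}  # learned stimulus -> response rules
        self.planner = planner
        self.plan_calls = 0                   # how often deliberation was needed

    def act(self, situation: Hashable) -> str:
        if situation in self.rules:
            # A rule applies: react immediately, no planning.
            return self.rules[situation]
        # No rule applies: plan, then compile the result into a new rule.
        self.plan_calls += 1
        action = self.planner(situation)
        self.rules[situation] = action
        return action


def slow_planner(situation: Hashable) -> str:
    # Placeholder for an expensive deliberative planner.
    return "approach" if situation == "can-visible" else "search"


agent = ReactWhenPossibleAgent(slow_planner)
situations = ["can-visible", "no-can", "can-visible", "can-visible"]
actions = [agent.act(s) for s in situations]
print(actions, agent.plan_calls)
```

The planner is consulted only the first time each situation is met, so the average response time falls as the rule set grows. The real system also generalizes rules to similar situations, which an exact-match lookup does not capture.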
6.7 CHAPTER SUMMARY
- Both deliberative planning systems and purely reactive control systems have limitations when each is considered in isolation.
- Deliberative planning systems provide an entry point for the use of traditional AI methods and symbolic representational knowledge in a reactive robotic architecture.
- The interface between deliberation and reactivity is poorly understood and serves as the focus of research in this area.
- Strong evidence exists that hybrid deliberative and behavior-based systems are found in biology, implying that they are compatible, symbiotic, and potentially suitable for use in robotic control.
- Hybrid models include hierarchical integration, planning to guide reaction, and coupled planning and reacting.
- Another important design issue concerns the number of layers present within the overall architecture, with two or three currently being the most common.
- AuRA is an early hybrid deliberative/reactive system using motor schemas and a traditional AI spatial planner. The planner configures the reactive control system prior to execution and reconfigures it in the event of task failure.
- Atlantis is a three-layer hybrid architecture based on RAPs and NATs. It introduced the concept of cognizant failure in which a robot becomes aware of its inability to complete a task. Plans are viewed as advice rather than commands or instructions in this system.
- The Planner-Reactor architecture consists of two major components. Planning is viewed as continuous adaptation of the reactive component. The RS model provides the underpinnings of all the architecture's components. Situations provide the context for sets of reactive actions.
- PRS uses a least-commitment strategy to delay the elaboration of plans for execution until necessary. Although not strictly behavior based, it does react to changes in the environment detected via sensing and develops plans consistent with the robot's current observations, beliefs, desires, and intentions.
- Selection (AuRA), advising (Atlantis), adaptation (Planner-Reactor), and postponing (PRS) are four major interface strategies frequently used in various hybrid architectures.
- Many other hybrid architectures have also been developed along similar lines: SSS, MVL, agent architecture, Theo-Agent, Supervenience, and teleoreactive agent architecture among others.
Chapter 7
Perceptual Basis for Behavior-Based Control
We don't see things as they are, we see them as we are. - Anaïs Nin

Obje Chap an mo be bet intim the unde pe .2 1 To rela rob the info can how pr re de biol appr . of perc . mo of the To 3 .and str per utilit reco of att , foc , role se 4 , pe exp expl . b as beh with fusio sens o th cl fo see seve de 5 pe exa repr .syst robo

It would be as useless to perceive how things actually look as it would be to watch the random dots on untuned television screens. - Marvin Minsky

We have to remember that what we observe is not nature in itself but nature exposed to our method of questioning. - Werner Karl Heisenberg
7.1 A BREAK FROM TRADITION

Without a doubt, a robot's ability to interpret information about its immediate surroundings is crucial to the successful achievement of its behavioral goals. To react to external events, it is necessary to perceive them. The real world is often quite hostile to robotic systems. Things move and change without warning, at best only partial knowledge of the world is available, and any a priori information available may be incorrect, inaccurate, or obsolete.

Machine perception research, and in particular computer vision, has a long and rich tradition, with a great part of it dissociated from the issues of real-time control of a robotic system. This work has focused on taking input sensor readings and producing a meaningful and coherent symbolic and/or geometric interpretation of the world. The top panel of figure 7.1 represents this viewpoint. Much of this research has ignored the fact that perceptual needs are predicated upon the consuming agent's motivational and behavioral requirements. Certainly scapegoats can be found for the lack of progress in producing real-time robotic perception under this paradigm: Computer architectures were too primitive, or neuroscientists have not provided an adequate understanding of human vision. Perhaps, however, the means were not at fault, but rather the desired ends. The traditional approach has significant problems:
- Perception considered in isolation: Is it wise to consider the perceiver as a disembodied process? This is perhaps similar to studying a living creature by chopping it up and handing out the pieces to different scientists. Perception is better considered as a holistic, synergistic process deeply intertwined with the complete agent's cognitive and locomotion systems.
- Perception as king: There has been some elitism regarding much of the research in perceptual processing, computer vision in particular. Unquestionably, vision is a hard problem. Nonetheless perceptual activities need to be viewed as only one of the many requisite needs for a functioning intelligent agent.
  Vision researchers can benefit greatly by considering these other system components as partners, as opposed to servants.
- The universal reconstruction: Much perceptual research has focused on creating three-dimensional world models. These models are often built without regard for the robot's needs. A deeper question is whether these reconstructive models are really needed at all.

Roboticists (Brooks 1991b) and psychologists (Neisser 1993) alike lament the pitfalls associated with the traditional approach to machine vision. Overcoming these difficulties requires a shift toward a new (or rather rediscovered) paradigm: viewing perception as a partner process with action. More accurately, a duality exists: The needs of motor control provide context for perceptual processing, whereas perceptual processing is simplified through the constraints of motor action. In either case, action and perception are inseparable.

Figure 7.1 Approaches to perception. (Figure courtesy of Bob Bolles.) [Panel labels: OLD PARADIGM (top), NEW PARADIGM (bottom).]

Recently, in developments paralleling the advent of behavior-based robotic systems, new approaches have emerged that take this interplay into account. These methods are guided by the adage:

Perception without the context of action is meaningless.

The reflections of the new perceptual paradigm include:
- Action-oriented perception: An agent's perceptual processing is tuned to meet its motor activities' needs.
- Expectation-based perception: Knowledge of the world can constrain the interpretation of what is present in the world.
- Focus-of-attention methods: Knowledge can constrain where things may appear within the world.
- Active perception: The agent can use motor control to enhance perceptual processing by positioning sensors in more opportune vantage points.
- Perceptual classes: These partition the world into various categories of potential interaction.

The bottom panel of figure 7.1 captures some aspects of this new approach. Perception now produces motor control outputs, not representations. Multiple parallel processes that fit the robot's different behavioral needs are used. Highly specialized perceptual algorithms extract the necessary information and no more: Perception is thus conducted on a need-to-know basis. To further advance this position, complexity analysis of the general task of visual search has provided illuminating results.
Bottom-up visual search where matching is entirely data driven has been shown to be NP-complete and thus computationally intractable, whereas task-directed visual search has linear-time complexity (Tsotsos 1989). This tractability results from optimizing the available resources dedicated to perceptual processing (Tsotsos 1990). Attentional mechanisms that result from exploitation of the knowledge of the specific task provide just such constraints. The significance of these results for behavior-based robotic systems cannot be overestimated:

"Any behaviorist approach to vision or robotics must deal with the inherent computational complexity of the perception problem: otherwise the claim that those approaches scale up to human-like behavior is easily refuted." (Tsotsos 1992, p. 140)
The net outcome is that a primary purpose of perceptual algorithms is to support particular behavioral needs. In earlier chapters, we have seen that behaviors and their attendant perceptual processes can be executed in parallel. In reactive control, sensor information is not fused into a single global representation over which other planning processes then reason. This is in marked contrast to more traditional hierarchical views of robotic control which assume that perception's purpose is to construct a global world model (Barbera et al. 1984). The inherent parallelism and more targeted processing of behavior-based robotics permits much more efficient sensor processing.

To emphasize further the importance of perception itself, we revisit the symbol grounding problem in AI (chapter 1). Perception provides perhaps the only opportunity for us to provide physical grounding for the objects within an agent's world. The agent's interaction with these objects completes the grounding process by providing meaning through its resulting actions.
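The need-to-know style of perception argued for in this section can be sketched as behaviors that each own a small perceptual routine and emit a motor vector directly, with no shared world model in between. The following is a minimal illustration under stated assumptions: the function names, the `(bearing, range)` scan format, the linear repulsion profile, and the simplification that the robot's heading is aligned with the world x-axis are all hypothetical and not taken from any system described in the text.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]
Scan = List[Tuple[float, float]]  # (bearing in radians, range in meters)


def nearest_obstacle(scan: Scan) -> Tuple[float, float]:
    """Perceptual routine for avoidance: only the closest return matters."""
    return min(scan, key=lambda reading: reading[1])


def avoid_obstacle(scan: Scan, safe: float = 1.0) -> Vector:
    """Repulsive vector pointing away from the nearest obstacle, if it is too close."""
    bearing, rng = nearest_obstacle(scan)
    if rng >= safe:
        return (0.0, 0.0)
    magnitude = (safe - rng) / safe  # grows linearly as the obstacle gets closer
    return (-magnitude * math.cos(bearing), -magnitude * math.sin(bearing))


def move_to_goal(position: Vector, goal: Vector) -> Vector:
    """Perceptual routine for goal seeking: only the goal direction is extracted."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    dist = math.hypot(dx, dy)
    return (dx / dist, dy / dist) if dist > 0 else (0.0, 0.0)


def combine(vectors: List[Vector]) -> Vector:
    """Sum the behaviors' outputs into one motor command."""
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))


scan = [(0.0, 0.5), (math.pi / 2, 3.0)]  # obstacle 0.5 m dead ahead
command = combine([avoid_obstacle(scan), move_to_goal((0.0, 0.0), (2.0, 0.0))])
print(command)
```

Neither behavior ever sees the other's data: the avoidance routine ignores the goal, and the goal routine ignores the range scan. Each extracts only what its own motor response requires.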
7.2 WHAT DOES BIOLOGY SAY?

A wide range of disciplines within the biological sciences have addressed the issues of perception as related to behavior. For roboticists, significant insights can be gleaned from these studies. This section provides an overview of a few important results from research in perception, neuroscience, psychology, and ethology of particular relevance to our study of behavior-based robotic systems.
7.2.1 The Nature of Perceptual Stimuli

To begin with, it is useful to distinguish between the different ways of categorizing perceptual stimuli. One such distinction can be based upon the origin of the received stimuli. Proprioception refers to perception associated with stimuli arising from within the agent. This includes information such as tendon or muscle tension, from which limb position or the number of times a particular action has been repeated (such as a leg movement) might be computed. Exteroception refers to perception associated with external stimuli. Here the environment transmits information to the agent via vision, audition, or some other sensor modality. The most common industrial robotic arms that compute their end effector positions through inverse kinematics rely on proprioceptive information. If, however, a vision system is coupled to the robot, exteroceptive data can provide environmental feedback as to where within the world the robot needs to move. Clearly, reactive robots
tightly coupled to the environment through sensorimotor behaviors rely heavily on exteroceptive perception. Nonetheless, proprioceptive control has been widely observed by biologists in animals' generation of navigational trajectories. One of many such examples occurs in insects such as millipedes (Burger and Mittelstaedt 1972) and in desert spiders (Mittelstaedt 1985), where homing behavior is based upon proprioceptive sensations and is generated from the "sum of the momentary peripheral afferent inputs" (Burger and Mittelstaedt 1972). In other words, the distances traveled by the insects and spiders are believed to be stored in some manner, then used later by the organisms to return to their home sites. This process, referred to as path integration, relies entirely on proprioceptive inputs. The sensory data, generated by an accumulation of the animal's past movements and used to orient the animal within the world, is referred to as ideothetic information (Mittelstaedt 1983). In contrast, orientation information generated by landmarks, sun position, or other external cues is referred to as allothetic information. Allothetic information supports closed-loop control based on continuous feedback from an external source, whereas ideothetic information provides only open-loop control and is thus subject to greater error due to the inevitable noise during locomotion.
A closed-loop control system uses feedback from the results of its output actions to compute the deviation between what it was commanded to do and what it actually accomplished. This feedback is used as one of the inputs to the controller. An open-loop control system has no means to evaluate the difference between the commanded action and the actual result, that is, no feedback is available.
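Path integration as described above amounts to summing self-sensed displacements and negating the result to obtain a homing vector. The sketch below is a textbook dead-reckoning calculation offered as an illustration, not a model of any particular animal or of the cited studies; the `(heading, distance)` step format and the 0.05-radian heading bias are assumptions made for the example.

```python
import math
from typing import Iterable, List, Tuple

Step = Tuple[float, float]  # (heading in radians, distance traveled)


def integrate_path(steps: Iterable[Step]) -> Tuple[float, float]:
    """Accumulate proprioceptively sensed steps into a position estimate."""
    x = y = 0.0
    for heading, dist in steps:
        x += dist * math.cos(heading)
        y += dist * math.sin(heading)
    return x, y


def homing_vector(steps: Iterable[Step]) -> Tuple[float, float]:
    """Bearing and distance from the estimated position back to the start."""
    x, y = integrate_path(steps)
    return math.atan2(-y, -x), math.hypot(x, y)


outbound: List[Step] = [(0.0, 3.0), (math.pi / 2, 4.0)]  # 3 units east, then 4 north
bearing, distance = homing_vector(outbound)
print(bearing, distance)

# Open-loop drift: a small constant heading bias is never corrected, because
# no external (allothetic) cue is consulted along the way.
biased = [(h + 0.05, d) for h, d in outbound]
bx, by = integrate_path(biased)
drift = math.hypot(bx - 3.0, by - 4.0)
print(drift)
```

The second half shows why purely proprioceptive estimates degrade: the biased estimate differs from the true position, and nothing in the computation can detect or reduce that difference. A closed-loop scheme would compare the estimate against a landmark or sun-position fix and correct it.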
In animal navigation, it is believed that both allothetic and ideothetic information are in use and integrative mechanisms are provided to reconcile the inevitable differences between them smoothly. We will study the issues concerning the combination of multiple, potentially conflicting data sources in section 7.5.7.

7.2.2 Neuroscientific Evidence
information is processed. Unfortunately, we do not know as much as we would like about the actual processing of perceptual information within the brain. Nonetheless, several relevant observations derived from neuroscience may be helpful in our understanding of perceptual processing's role in behavior.

Individual sensor modalities have spatially separated regions within the brain. Sight, hearing, and touch all have distinct processing regions. Even within a specific sensor type, spatial segregation is present: "Perhaps the most striking finding is that there is no single visual area in the brain. Different areas of the brain specialize in different aspects of vision such as the detection of pattern, color, movement, and intensity . . ." (McFarland 1981, p. 593). This observation holds not only for the human brain but that of lower animals as well. For example, a distinct neural region exists for looming detection in frogs associated with predator avoidance (Cervantes-Perez 1995). Neural structures associated with prey selection for these animals have also been observed (Fite 1976). In the human and primate brain, visual processing is channeled into two distinct vision streams (Nelson 1995): the object vision stream, concerned with recognition of objects and foreground-background separation, and the spatial vision stream, which provides positional information useful for locomotion. The initial evidence for these "what" and "where" visual systems came from lesion studies conducted on primates (Mishkin, Ungerleider, and Macko 1983) indicating that the object stream is localized to the temporal area of the cortex, whereas the parietal regions are associated with spatial vision.

Further specialization occurs within the cortex itself. Orientation sensitivity to a particular stimulus occurs throughout layers of the visual cortex. A neuron at a particular level is sensitive to a stimulus at a preferred orientation, as has been observed in cats and macaque monkeys (Lund, Wu, and Levitt 1995).
Echolocation,analogousto sonarsensingin robots, alsohasspecializedneural regionsassociatedwith it. In particular, the auditorycortexof the mustached bat is dedicatedto this type of processingbut hasadditionalparceUationwithin itself (Suga and Kanwal 1995). The subdivisionsare associatedwith varying rangesto targets, eachof which likely hasa differing behavioralresponse associatedwith it. Further analysishas revealeddifferent specializedregions associatedwith targetsizeandvelocity. One final observationthat we mention is the space-preservingnature of the connectionsbetweenthe brain and the sensingsystemitself. Thesemappings areprevalentand areexemplifiedby the retinotopicmapsprojectingthe ' eyes output throughthe lateral geniculatenucleusonto the visual cortex; somatotopicmapsprojecting the peripheralinputs generatedby touch onto its
244
Chapter?
associatedcortical regions; and tonotopicmapsfound preservingspatialrelations producedby audition. Sensoryinformation impingesupon the brain in a mannersimilar to its externalsource.
7.2.3 Psychological Insights

The observer, when he seems to himself to be observing a stone, is really, if physics is to be believed, observing the effects of the stone upon himself.
- Bertrand Russell

Finally, taking a psychological perspective, we can obtain additional insight. In particular we draw heavily upon the theories of J. J. Gibson and Ulric Neisser regarding perception's role in generating behavior.
7.2.3.1 Affordances

A relevant and important concept lies in the meaning of objects in relation to an organism's motor intents, a concept Gibson (1979) first introduced as affordances. As defined by Gardner (1985, p. 310), "Affordances are the potentialities for action inherent in an object or scene - the activities that can take place when an organism of a certain sort encounters an entity of a certain sort." The Gibsonian concept of affordances formulates perceptual entities not as semantic abstractions but rather by what opportunities the environment affords. The relationship between an agent and its environment afforded by a potential action is termed an affordance. All information needed for the agent to act resides within the environment, and mental representations are not used to codify perception.

A chair can be perceived differently at different times, as something useful to sit in, as something blocking the way, or as something to throw if attacked. The way the environment is perceived depends on what we intend to do, not on some arbitrary semantic labeling (e.g., chair). A chair need not be explicitly recognized as a chair if it is serving only as a barrier to motion. Under those circumstances, it need be recognized only as an obstacle. If the agent is tired, it need be recognized only as a place to rest.

From a robot designer's perspective this translates into designing algorithms that detect things that impede motion, afford rest or protection or other capabilities, but not into designing algorithms that do semantic labeling and categorization.

The following steps characterize affordance research (Adolph, Gibson, and Eppler 1990):
1. Describe the fit between the agent and its environment.
2. Determine the agent-environment relationships regarding both the optimal performance of the action and the transitions between actions.
3. Analyze the correspondence between the actual and perceived agent-environment fit.
4. Determine the perceptual information required to specify the affordance.
5. Evaluate how to maintain and adapt action as necessary.

These guidelines, developed for psychologists, can also be of benefit in the design of perceptual algorithms to support behavior-based robotic systems.

Many of the more radical camp that strictly maintain Gibson's views have lately fallen on hard times within the mainstream psychological community, but that in no way diminishes the potential value of his ecological stance on agent-environment interactions as a basis for robot perceptual algorithm generation (Pahlavan, Uhlin, and Eklundh 1993; Blake 1993; Ballard and Brown 1993; Arkin 1990a). We, as roboticists, will use the term affordance to denote a perceptual strategy used to interact with the world, satisfying the need of some specific motor action.
7.2.3.2 A Modified Action-Perception Cycle

While I am not sure that access to movement-produced information and affordances would be sufficient to produce perceptual awareness in a machine, it is a necessary condition . . . (Neisser 1993, p. 29)
Neisser (1989) modifies the Gibsonian stance somewhat to permit Gibson's ecological perspective to account for the spatial vision stream (for locomotion) discussed earlier in this section, while forwarding a cognitive explanation for the object vision stream (for recognition). This approach acknowledges recent neurophysiological findings and presents a two-pronged explanation of vision reasonably consistent with the approaches used in some of the hybrid robotic architectures we encountered in chapter 6. Other robotics researchers have recognized these parallel pathways and used them to construct separate yet coordinated vision systems for determining what an object is apart from where it is located (Kelly and Levine 1995). If indeed there are multiple parallel perceptual systems in the brain as the evidence indicates, it is certainly possible that different methods exist as well of processing that information for action.

Neisser's perspective arises from the school of cognitive psychology (chapter 2) and leads us to the notion of action-oriented perception. This school of thought acknowledges the fact that perception and action are intimately intertwined. Neisser (1976) elaborates the action-perception cycle as the basis by which humans interact with their environment (figure 7.2 presents a modified version). In this cycle, perceptions arising from interaction with the world modify the organism's internal expectations and behaviors, which in turn result in new exploratory activities that result in new perceptions. Anticipatory schemas play a crucial role in providing both the direction and context for interaction with the world. Neisser's initial version of this cycle does not include the reactive shunt, which, in my estimation, more directly ties perception and action together while still permitting the coexistence of plans for actions. The version presented here is believed to be more consistent with his later publications (Neisser 1989), which have evolved since the earlier exposition in Neisser 1976. In any case, this liberty was taken to provide a better reflection on how this cycle can be related to behavior-based robotic systems.

Figure 7.2 Modified action-perception cycle. (Diagram labels: action, plans, world, behaviors, reactive shunt, direction, models, memory, cognition.)

7.2.4 Perception as Communication - An Ethological Stance
Consider that the world is trying to tell us something, if only we knew how to listen. Sensing can thus be viewed as a form of communication in which information flows from the environment to the attending agent. Obviously, if we don't know what to attend to we will have a hard, if not an impossible, time discerning the messages that the world is providing. The world is telling us something if only we would pay attention. Where and how our attentional and perceptual resources are directed depends strongly on our motivation or intentional state.
The ethological literature is replete with examples of sensed information providing cues for evoking behavior (e.g., Smith 1977; Tinbergen 1953). Indeed, evolution has provided biological agents with highly tuned apparatus to pick up efficiently the information necessary to carry out useful actions. The looming and prey detectors (Ewert 1980) mentioned earlier for guiding visual response in the frog are good examples.

In recognition behavior, we find that some agents are capable of discerning things others simply cannot (e.g., intraspecies kin recognition among birds (Colgan 1983)). Perceptual cues necessary for an organism's survival and routine functioning are extracted cheaply and efficiently from the environment, whereas irrelevant information is not processed at all (i.e., it is not even discarded: pickup never occurs). In other words, these agents have evolved mechanisms that enable efficient communication with the world's salient features (salient, that is, in the context of that agent's needs). This implies that we need to have our robotic agents attend to what is necessary in the context of their (not our) needs. Depending on their internal conditions, motivational state or goals, and sensory limitations, we can develop algorithms that provide useful and focused information for these actors.
7.3 A BRIEF SURVEY OF ROBOTIC SENSORS

Sensor technology has advanced rapidly in the last decade, resulting in many low-cost sensor systems that can be readily deployed on behavior-based robots. Sensors can be categorized, in terms of their interaction with the environment, as either passive or active. Passive sensors use energy naturally present in the environment to obtain information. Computer vision is perhaps the most typical form of passive sensing. Passivity is particularly important in military applications, where detection of the robot should be avoided. Active sensors, on the other hand, involve the emission of energy by a sensor apparatus into the environment, which is then reflected back in some manner to the robot. Ultrasonic sensing and laser range finding are two common active sensor modalities used for behavior-based robots.

A very brief discussion of the operation of several representative sensor systems follows, including the use of shaft encoders for dead reckoning. Shaft encoders are not environmental sensors in the strictest sense, since they measure only the rotations of the robot's motors (i.e., they provide proprioceptive information). Nonetheless, they are widely used for positional estimation and warrant further discussion. The reader interested in more detailed information on a wide range of sensors useful for robots is referred to Everett 1995.
7.3.1 Dead Reckoning

Dead reckoning (derived originally from deduced reckoning) provides information regarding how far a vehicle is thought to have traveled based on the rotation of its motors, wheels, or treads (odometry). It does not rely on environmental sensing. Two general methods for dead reckoning are available: shaft encoders and inertial navigational systems.

Shaft encoders are by far the most frequently used method of dead reckoning because of their low cost. These operate by maintaining a count of the number of rotations of the steering and drive motor shafts (or wheel axles) and converting these data into the distance traveled and the robot's orientation. Although shaft encoders can provide highly reliable positional information for robotic arms, which are fixed relative to the environment, through direct kinematics, in mobile systems they can be extremely misleading.
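As a rough illustration of how encoder counts become a pose estimate, the sketch below integrates left and right wheel encoder ticks for a differential-drive robot. The drive geometry, the constants TICKS_PER_REV, WHEEL_RADIUS_M, and WHEEL_BASE_M, and the function name update_pose are assumptions introduced here for illustration; none of them come from the text.

```python
import math

# Hypothetical robot parameters (assumptions for illustration only).
TICKS_PER_REV = 500      # encoder ticks per wheel revolution
WHEEL_RADIUS_M = 0.05    # wheel radius in meters
WHEEL_BASE_M = 0.30      # distance between the two drive wheels in meters


def update_pose(x, y, theta, left_ticks, right_ticks):
    """Advance a pose estimate (x, y, theta) from one pair of encoder readings.

    The estimate assumes the wheels never slip, which is exactly the
    assumption that fails for a mobile robot on loose or slick ground.
    """
    meters_per_tick = 2.0 * math.pi * WHEEL_RADIUS_M / TICKS_PER_REV
    d_left = left_ticks * meters_per_tick
    d_right = right_ticks * meters_per_tick
    d_center = (d_left + d_right) / 2.0            # distance moved by the robot center
    d_theta = (d_right - d_left) / WHEEL_BASE_M    # change in heading (radians)
    # Midpoint approximation: move along the average heading of the step.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta = (theta + d_theta) % (2.0 * math.pi)
    return x, y, theta
```

Because each update adds to the previous estimate, any error from wheel slip is carried forward and accumulates, which is one way to see why odometry alone can mislead a mobile robot.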
Direct kinematics involves solving the necessary equations to determine where the robot's end effector is, given the joint coordinates (angles and length of links) of the robot arm.
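A minimal sketch of direct kinematics, assuming a planar arm with revolute joints in which each joint angle is measured relative to the previous link. The function name and the planar restriction are illustrative assumptions, not details taken from the text.

```python
import math


def forward_kinematics(link_lengths, joint_angles):
    """Return the (x, y) position of the end effector of a planar serial arm.

    link_lengths: length of each link, base to tip.
    joint_angles: angle of each joint in radians, relative to the previous link.
    """
    x = y = 0.0
    heading = 0.0  # absolute orientation of the current link
    for length, angle in zip(link_lengths, joint_angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y
```

For a fixed-base arm the joint readings determine the end-effector position uniquely, which is why encoders are reliable there; a mobile base has no such fixed reference.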
If you have ever been stuck in mud or snow in an automobile, you can recognize that the information as to how many times the drive wheels have turned does not necessarily correlate well with the car's actual positional changes. Because shaft encoders are proprioceptive and only measure changes in the robot's internal state, they must be supplemented with environmental sensing to produce reliable results when determining the robot's actual location within the world.

An inertial navigational system (INS) does not measure the rotation of wheels or shafts but rather tracks the accelerations the robot has undergone, converting this information into positional displacements. This results in far more accurate dead reckoning systems, but with one major penalty: higher cost. INS is also prone to internal drift problems and must be periodically recalibrated to yield accurate information. The quality of the data makes it far more desirable than shaft encoders, but cost and power requirements frequently prevent its deployment.

Although not based on dead reckoning, global positioning systems (GPSs) can also provide geographic information as to the robot's whereabouts within the world. A battery of twenty-four Department of Defense earth-orbiting satellites relay positional data whereby the robot can deduce its position relative to a world coordinate system. Time of flight of the GPS signals and triangulation with three transmitting satellites enables the robot's altitude, longitude, and latitude to be computed. Global positioning systems are rapidly decreasing in cost but cannot be used inside buildings where the satellites' signals are blocked. Differential GPS (DGPS) can provide higher positional resolution than the standard nonmilitary GPS service, which is limited to approximately 100-meter accuracy (best case) (Everett 1995) because of an intentionally degraded public usage signal preventing unintended use by hostile military powers. DGPS requires a ground-based transmitter to supplement the satellites and can easily yield relative accuracies in submeter ranges.
7.3.2 Ultrasound

Sonar (ultrasonic sensing) is a form of active sensing. It operates on the same basic principle by which bats navigate through their environment. A high-frequency click of sound is emitted that reflects off a nearby surface and returns later at a measurable time. The delay time for receipt of the returning signal can be used to compute the distance to the surface that reflected the sound if the velocity of the sound wave is known. A typical ultrasonic sensor (Polaroid) emits a beam that receives echoes from a region approximately 30 degrees wide emanating from its source. These sensors can operate rapidly, returning ten or more depth data points per second. Accuracy for many working systems is typically on the order of centimeters (0.1 foot) over a maximum range up to [...] meters. A wide range of sensors is commercially available covering a broad range of frequencies, each with variations in beamwidth and distance. Figure 7.3 shows a Nomadic Technologies sonar ring equipped with sixteen sensors.

Ultrasonic sensing has decided advantages: It is of low cost, provides coarse three-dimensional environmental information (distance to an object), and returns a tractable amount of data for interpretation. Its disadvantages are substantial as well: It has much poorer discriminatory ability than vision, is significantly susceptible to noise and distortion due to environmental conditions, frequently produces erroneous data because of reflections of the outgoing sound waves, and the sonar beam is prone to spread. Sonar has found its best use in obstacle detection and avoidance at short range. Its difficulty in discriminating different types of objects, for example, between an obstacle and a goal, limits its applicability. Sonar cannot be used in outer space, as it requires a medium for the transmission of the sound wave. Progress has been made in the use of phased sonar arrays to provide greater information regarding the environment, but these currently are not in widespread use in robotic systems.

Figure 7.3 Ultrasonic sensors. (Photograph courtesy of Nomadic Technologies Inc., Mountain View, California.)

7.3.3 Computer Vision
Video technology has been available for at least half a century. Only recently, however, has charge-coupled device (CCD) camera technology advanced rapidly in terms of miniaturization and greatly lowered cost. Color imagery is now available at very affordable prices.

Although some digital cameras are available, most robot vision systems consist of one or two black-and-white or color analog output CCD cameras, one or more digitizers, and an image processor. Because robots need real-time interpretation of incoming video data, they often require specialized architectures found in image processors for serious research and applications. Certain techniques provided through the use of behavior-based design, however, can provide very low-cost, complete vision systems for robots (Horswill and Yamamoto 1994). Figure 7.4 shows a high-end vision system used for real-time tracking of people in an autonomous helicopter. It consists of a low-light camera or conventional video mounted on a pan-tilt system with specialized image stabilization and real-time motion detection hardware (Cardoze and Arkin 1995).

A behavior-based helicopter, designed by Montgomery, Fagg, and Bekey (1995) at the University of Southern California, won the 1994 International Aerial Robotics Competition. This vehicle integrated three sonar sensors for altitude measurements, a compass for heading control, three gyroscopes for controlling attitude, and a video camera for recognizing target objects.

The sheer volume of video information generated can be staggering. For a single camera, typical image resolution after digitization is on the order of 512 by 512 pixels (picture elements), with each pixel consisting of eight bits of information encoding 256 intensity levels. Multiply this value by three for color images (one image plane each for red, green, and blue) and then attempt to process at frame rate (30 times per second). We now have a receiving bandwidth of approximately 24 megabytes per second! Specialized and often costly image processing hardware can make this data flow tractable. We will see, however, that behavior-based robotic perception provides techniques, such as the use of expectations and focus-of-attention mechanisms, that constrain the amount of raw data that must be analyzed, significantly reducing the overall processing requirements. These behavior-based perceptual algorithms are generally designed to exploit task and behavioral knowledge wherever possible. Adaptive techniques that track features over multiple frames are also commonly used. Full-scale scene interpretation, the hallmark goal of mainstream image understanding research, is generally not required.
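The data-rate figure quoted for color video can be checked with a few lines of arithmetic. The function name video_data_rate and its parameter names are illustrative assumptions; the default values are the ones stated in the text.

```python
def video_data_rate(width=512, height=512, bits_per_pixel=8,
                    planes=3, frames_per_second=30):
    """Return the raw video data rate in bytes per second."""
    bytes_per_frame = width * height * planes * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second


rate = video_data_rate()
print(rate)        # 23592960 bytes per second
print(rate / 1e6)  # roughly 23.6 megabytes per second
```

The result, about 23.6 million bytes per second, agrees with the rounded figure of approximately 24 megabytes per second. Dropping to a single image plane (planes=1) cuts the rate to a third, which hints at why restricting what is processed matters so much.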
7.3.4 Laser Scanners
Laser scanners are active sensors, emitting a low-powered laser beam that is scanned over a surface. Through techniques such as phase amplitude modulation, the distance to the individual points can be computed with the net result an array of image points, each of which has an associated depth. In effect, a three-dimensional image is obtained. Reflectance data is often also available, providing data regarding the nature of the surface as well. The product is an extremely rich three-dimensional data source. Figure 7.5 illustrates a representative commercially available system.
Figure 7.4 Vision system for an autonomous helicopter. The helicopter (A) has a color CCD video camera mounted on a pan-tilt mechanism on the nose. The video is transmitted to the ground station system (B), where the real-time tracking is conducted. (Photograph courtesy of Mark Gordon.)

Figure 7.5 LASAR™ laser scanning system. (Photograph courtesy of Perceptron, Inc.)
Laser systems are not without their drawbacks. First and foremost is their cost, many orders of magnitude over ultrasonic ranging devices. Another problem arises from the mechanical instabilities associated with many current implementations of these devices (e.g., problems with nodding mirror designs used for mechanically scanning the laser beam). More subtle difficulties arise as a result of the sparse data sampling found at longer distances (a consequence of the imaging geometry), which can cause certain objects to go undetected, and problems with range periodicities when using phase modulated systems that force interpretation ambiguities on the incoming data regarding distance to the surface. Lower-cost linear array laser scanners are available, but they provide considerably less information. Nonetheless, as the underlying sensor technology improves in the next several years, these devices are expected to become more and more useful and commonplace. Their nature and use of low-powered lasers nonetheless pose some difficulties for various applications in terms of both safety and stealth.
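The range periodicity problem can be made concrete with the standard relation for an amplitude-modulated, phase-measuring rangefinder: the round trip shifts the modulation phase by 4πfd/c, so range is recovered only modulo c/(2f). The sketch below is a hedged illustration of that relation; the 10 MHz modulation frequency in the usage note and all function names are assumptions, not parameters of any scanner mentioned in the text.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def ambiguity_interval(mod_freq_hz):
    """Range span after which the measured phase repeats (meters)."""
    return SPEED_OF_LIGHT_M_S / (2.0 * mod_freq_hz)


def measured_phase(true_range_m, mod_freq_hz):
    """Phase shift of the returned modulation, wrapped to [0, 2*pi)."""
    phase = 4.0 * math.pi * mod_freq_hz * true_range_m / SPEED_OF_LIGHT_M_S
    return phase % (2.0 * math.pi)


def range_from_phase(phase_rad, mod_freq_hz):
    """Range implied by a wrapped phase, assuming the target lies in the first interval."""
    return SPEED_OF_LIGHT_M_S * phase_rad / (4.0 * math.pi * mod_freq_hz)
```

With an assumed modulation frequency of 10 MHz, ambiguity_interval(10e6) is about 15 meters, so a surface at 5 meters and one at roughly 20 meters produce the same wrapped phase and are reported at the same range. This is the interpretation ambiguity referred to above.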
7.4 MODULAR PERCEPTION

A child of five would understand this. Send someone to fetch a child of five.
- Groucho Marx
One of the essential characteristics of behavior-based robotics is the design of multiple parallel motor behaviors to provide overall control. This leaves two choices for tying in perception: generalized perception or modular perception.

Revising Marr's (1982) definition of general vision, generalized perception is a process that creates, given a set of input sensing, a complete and accurate representation of the scene and its properties. Thus stated, perception is conducted without regard for the agent's intentions or available repertoire of behaviors. But is there really a need for such a scene representation? The behavior-based roboticist needs only to identify the necessary perceptual cues within the environment required to support the needed motor actions. Much of the difficulty inherent in the general perception problem vanishes, since there is no need to perform the complex and arduous task of full-fledged scene reconstruction.

As an alternative, modular perception advocates the design of perceptual specialists dedicated to extract the relevant information for each active behavior. This reprises the theme heard in Minsky's Society of the Mind: "Each mental agent by itself can only do some simple thing that needs no mind or thought at all. Yet when we join these agents in societies - in certain very special ways - this leads to true intelligence" (Minsky 1986, p. 17). We have already studied the behavioral agents comprising our robotic designs; we now focus on the individual modules which, in toto, constitute perception.

Once committed to the paradigm of modular perception, we must determine what constitutes a module. Various definitions have been forwarded, which we now review.
7.4.1 Perceptual Schemas

Schema theory has a long and rich history, which we reviewed in chapter 2. We focus now on those aspects of schema theory that apply to perception and can be generalized to robotic systems. According to Arbib (1995a, p. 831), "A perceptual schema embodies processes for recognizing a given domain of interaction, with various parameters representing properties such as size, location, and motion." Extracting several of the major features of schema theory relevant to modular perception (Arbib 1992):

• Schemas are a set of multiple concurrent active, not passive, processes focused on differing perceptual activities.
• Schemas contain both the control and knowledge required to complete a perceptual task.
• Schemas form an active network of processes functioning in a distributed manner specific for a particular situation and a set of agent intentions.
• Schema theory provides the basis for languages to define action-oriented perception (e.g., RS (Lyons and Arbib 1989) and the Abstract Schema Language (Weitzenfeld 1993)).
• The activation level associated with a schema can be related to the degree of belief in a particular perceptual event.

Perceptual schemas are somewhat related to Gibson's affordances, serving a similar purpose within an organism (Arbib 1981, Arkin 1990a). For the roboticist, the difficulty lies in how to operationalize this notion of affordance as a schema.

Each individual perceptual schema is created to produce only the information necessary for the particular task at hand. Remember that perceptual schemas are embedded within motor schemas, providing the information required for them to compute their reaction to the world (section 4.4). The question becomes one of saliency: How do we know what features of the environment are the correct ones to support a particular behavior? Even assuming that these can be clearly identified, we must then assess if it is feasible to extract this information in real time, as is necessary for robotic control, given the limitations of existing sensor and computational technology. Gibsonian affordances are preoccupied with the role of optic flow in navigational tasks.
Although some progress has been made in using optic flow fields for behavior-based control (Duchon, Warren, and Kaelbling 1995), often other, more computationally feasible feature extraction algorithms are preferred for schema-based perception.

Let us examine some practical cases from our robot's world. If an avoid-static-obstacle schema is active, as is usually the case so the robot will not crash into things, some means for perceiving obstacles is necessary. An obstacle is defined as something that provides a barrier to the robot's motion, that is, it occupies space in the intended direction of motion. Objects within the environment are not considered obstacles until they get in the way. Further, they do not have to be semantically labeled as chairs, people, or whatever: they merely need to be recognized as an impediment to motion. The sensor or algorithmic source of the perception of the obstacles is of no concern to the motor schema, only the information regarding where the obstacles are located and, if available, a measure of the certainty of their perception. It is irrelevant to the motor schema if the obstacle information arises from vision, ultrasound, or some other form of sensing. From a design standpoint, it is necessary only to choose one or more of these sensor algorithms, embedding it within the motor schema itself. The obstacle-detection algorithm is unconcerned with other perceptual processing for other active motor schemas and thus can run asynchronously within the context of the avoid-static-obstacle schema. No world model of the environment is required, only reports as to where any obstacles are currently located relative to the robot. As this information is egocentric (centered on the robot's position), it eliminates the need for absolute coordinate frames of reference.

Another example involves road following when using a stay-on-path schema (Arkin 1990a). The question of what perceptual features make a road or path recognizable to a robot has no single answer. Roads and paths vary widely in appearance, and weather and time of day further alter their visual presentation. For certain conditions, a possible solution involves tracking the path's boundaries. One particular method uses a fast line-finding algorithm and is most successful when applied to well-defined roads, paths, or hallways (figure 7.6). An alternative perceptual schema, for use in different situations, exploits a fast region segmentation algorithm that is robust when the path boundaries are ill defined (figure 7.7). In some instances it may be preferable to have both of these perceptual processes active, arbitrarily choosing the most believable or fusing their results in some meaningful way (section 7.5.7). Many other more sophisticated ways exist to perform a road-following behavior, some of which are discussed in section 7.6.1.
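The idea of embedding a perceptual schema inside a motor schema can be sketched as follows. The motor schema accepts any callable that returns egocentric obstacle reports (range, bearing, certainty) and neither knows nor cares which sensor produced them. The linear repulsion rule, the influence radius, the gain, and every name here are assumptions for illustration, not the formulation given elsewhere in the book.

```python
import math
from typing import Callable, List, Tuple

# Egocentric obstacle report: (range in meters, bearing in radians, certainty in 0..1).
ObstacleReport = Tuple[float, float, float]
PerceptualSchema = Callable[[], List[ObstacleReport]]


def make_avoid_static_obstacle(perceive: PerceptualSchema,
                               influence_radius: float = 2.0,
                               gain: float = 1.0):
    """Build a motor schema with the given perceptual schema embedded in it."""

    def motor_schema() -> Tuple[float, float]:
        vx = vy = 0.0
        for rng, bearing, certainty in perceive():
            if rng >= influence_radius:
                continue  # too far away to be treated as an impediment
            # Assumed rule: repulsion grows linearly as the obstacle gets closer,
            # scaled by how certain the perceptual schema is about it.
            magnitude = gain * certainty * (influence_radius - rng) / influence_radius
            vx -= magnitude * math.cos(bearing)
            vy -= magnitude * math.sin(bearing)
        return vx, vy

    return motor_schema


def sonar_obstacles() -> List[ObstacleReport]:
    """Hypothetical perceptual schema; a vision-based one could be swapped in."""
    return [(1.0, 0.0, 0.8), (5.0, math.pi / 2, 1.0)]


avoid = make_avoid_static_obstacle(sonar_obstacles)
```

Calling avoid() here yields a vector pointing directly away from the obstacle one meter ahead, while the distant report is ignored. Replacing sonar_obstacles with another function of the same shape changes the sensing without touching the motor schema.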
7.4.2 Visual Routines

Visual routines, as developed by Ullman (1985), are another method for describing modular perception. In this approach, a collection of elemental perceptual operations can be assembled into a rich set of visual routines. These routines can be created to serve specific perceptual goals; they can share common elemental operations and can be applied at different spatial locations within the image. Control methods must be provided for sequencing the routine operations correctly and applying them at suitable locations within the image.
Visual routines embody both sequential processing through the choice of an ordered set of elemental operations as well as spatial, temporal and functional (i.e., specialization) parallelism. Base representations are first created to which the visual routines are then applied. This approach has a more bottom-up processing flavor than perceptual schemas since a continually available substrate of low-level perceptual processing is assumed to be available. This strategy is consistent with Marr's (1982) primal and two-and-a-half-dimensional sketches. Box 7.1 describes these aspects of Marr's theory of vision.

Box 7.1
Marr's (1982) theory of vision uses three levels of representation:
• The primal sketch makes explicit information regarding changes in brightness (intensity), such as edges, blobs, and textures.
• The two-and-a-half-dimensional sketch makes explicit information about surfaces (e.g., surface orientation) at each point within the image.
• The three-dimensional model makes explicit information regarding an object's shape (e.g., volume) within the world.
In Ullman's theory, a set of universal routines provides initial analysis of any image. These universal routines bootstrap the interpretation process, providing indices and thus guidance in the application of more specialized routines. There is inherently less reliance on expectations to provide effective choice in applying the set of perceptual modules. The application of visual routines is not purely bottom up, however, because routines are assembled on an as-needed basis from a finite set of elemental operations. Although a definitive set of elemental operations has not been created, several plausible operations include

• shifting the focus to different locations in the base representation
• bounded activation, restricting the applicability over the spatial extent of the base representations
• boundary tracing
• location marking
• indexing, based upon a location's sufficient distinctiveness from its surroundings
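One of the elemental operations listed above, bounded activation, can be sketched as activation spreading outward from a seed location until it is stopped by boundary cells in a base representation. The grid encoding (1 for boundary, 0 for free), the four-connected spread, and the function names are assumptions made for this illustration rather than details from Ullman's account.

```python
from collections import deque


def bounded_activation(boundary, seed):
    """Spread activation from seed over free cells; boundary cells block the spread.

    boundary: list of rows, each a list of 0 (free) or 1 (boundary).
    seed: (row, col) of a free cell.
    Returns the set of activated (row, col) cells.
    """
    rows, cols = len(boundary), len(boundary[0])
    activated = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not boundary[nr][nc] and (nr, nc) not in activated:
                activated.add((nr, nc))
                frontier.append((nr, nc))
    return activated


def is_enclosed(boundary, seed):
    """True if the activation started at seed never reaches the image border."""
    rows, cols = len(boundary), len(boundary[0])
    region = bounded_activation(boundary, seed)
    return all(0 < r < rows - 1 and 0 < c < cols - 1 for r, c in region)
```

A routine built from this operation can answer a question such as "is this location inside a closed contour?" by seeding the activation and checking whether it stays contained, without labeling anything in the scene.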
258
Chapter?
(A ) Figure 7.6 Fast line finding. (A ) A Denning robot conductssidewalkfollowing using a fast line finder. The algorithmusesexpectationsto anticipatedie positionof die sidewalkboundaries ( bod1in tenDS of spatialand orientationconstraints). ( B) and (C) The resultsof die groupingprocessand die computedcenterline. Theseresultsare fed forward into die next incomingframe asexpectationsasdie robot proceedsdown the sidewalk.
This theory, intended primarily as an explanation of biological perception, has led to the development of several approaches with applicability for robotics. Chapman (1990) developed a perceptual architecture, SIVS, inspired by Ullman's visual routine theory. Similar methods using visual routines were also deployed in Chapman and Agre's earlier system, Pengi (Agre and Chapman 1987). Figure 7.8 depicts the SIVS architecture. Control inputs select from among the primitive visual operators guiding the overall processing performed on the substrates, referred to as early (retinotopic) maps in the figure. The routines are selected in a task-specific, top-down manner based on the agent's action requirements. The operators that have been defined in SIVS extend those proposed by Ullman:
• Visual attention and search: where to look within the scene
• Tracking: following moving objects through the use of visual markers
• Spatial properties: the ability to compute distances, angles, and directions to objects within the world, thus constructing concrete spatial relationships
• Activation: determining whether a selected region is bounded through spreading activation
• Others: including various housekeeping and marker manipulation operators.
This system was tested only in a video game environment and thus bypassed much of the difficulty of dealing with real perceptual algorithms. Nonetheless, the overall control architecture is faithful to Ullman's approach and provides a solid step towards operationalizing it on a real robot.
Reece and Shafer (1991) implemented visual routines for potential use in robotic driving at Carnegie-Mellon University. Their system, called Ulysses, was inspired by Agre and Chapman's work (1987) and used fourteen routines (table 7.1) to provide the necessary control information for selecting steering and speed maneuvers for tactical driving. It was tested on a traffic simulator and showed that special-purpose perceptual routines reduce the tactical driving task's computational complexity by several orders of magnitude.
Figure 7.7 Fast region segmentation. This sequence shows the results of a fast region segmenter when used for path following. It is particularly useful when road edges are weak and the fast line finder is not suitable. Only the central portion of the path is extracted to obtain a more consistent segmentation. (A) shows the robot located on a gravel path whose edges are covered by grass. (B) shows the extracted region and (C) the computed path edges (least-squares fit) and the computed centerline.
Figure 7.8 SIVS perceptual architecture. (Diagram labels: early maps, visual operators, control system.)
7.4.3 Perceptual Classes
The notion of perceptual classes is particularly useful for describing perceptual requirements in behavior-based systems, because it permits defining perceptual tasks based on the agent's needs. The partitioning of perceptual events within the world into equivalence classes is based on the needs of a motor action: obstacle or nonobstacle; road or nonroad; landmark or anything else; moving object or stationary. The perceptual task directs the appropriate sensory processing mechanism to the consuming motor behavior. By channeling these perceptual tasks directly, the ability to execute them in parallel on separate processors is obtained, thus enhancing computational performance.
Table 7.1 Perceptual routines in Ulysses
Find-current-lane          Find-next-car-in-lane
Mark-adjacent-lane         Find-next-lane-marking
Track-lane                 Profile-road
Find-next-sign             Find-path-in-intersection
Find-back-facing-signs     Find-next-car-in-intersection
Find-overhead-signs        Find-signal
Find-intersection-roads    Find-crossing-cars
Donald and Jennings (1991a, 1991b) have contributed formalizations for this concept. They view the design of perception as the construction of recognizable sets: places or things the robot is capable of perceiving. Task-directed strategies for perception are a natural consequence, because the robot can be provided with expectations regarding what it should encounter (i.e., a definition of the characteristics of the perceptual class being sought) (Donald, Jennings, and Brown 1992). Sensing is considered a mapping of the robot's view of the world onto the set of possible interpretations, defined by the perceptual classes themselves. Careful construction of the perceptual classes can make this mapping easier than in an unconstrained interpretation. In defining the perceptual classes, what the robot is permitted to sense and understand is denoted. Correlating the resulting perceptual classes to the needs of the behaviors permits the extraction of information to be limited to only that required for a particular task. The notion of perceptual equivalence, whereby a large, disparate set of uncertain sensory readings is reduced to members of particular perceptual classes, renders computational tractability to the otherwise unduly complex interpretation problem. The notion of information invariants as the basis for equivalences has extended this work even further, pointing towards the eventual development of a calculus whereby robot sensor systems could be evaluated analytically (Donald 1993). This work involves the use of approaches employing computational theory permitting the reduction of one sensor into another.
The perceptual equivalence class method permits the minimization or elimination of map construction in a manner strongly supportive of purely reactive systems (Donald and Jennings 1991a). This research has focused primarily on spatial landmark recognition (where the robot is) rather than on which objects afford what actions. Nonetheless, the principles appear readily extensible to this broader problem.
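The reduction of many raw readings to a few perceptual classes can be illustrated with a minimal Python sketch. It assumes a hypothetical range scan and a hypothetical distance threshold; none of the names below come from Donald and Jennings. The only point is that very different sensor readings collapse into the single class that a motor behavior actually needs.

```python
from typing import Sequence

# Hypothetical threshold, chosen only for illustration.
OBSTACLE_RANGE_M = 0.5


def obstacle_class(ranges_m: Sequence[float]) -> str:
    """Map a raw range scan onto the class an avoid-obstacle behavior needs."""
    return "obstacle" if min(ranges_m) < OBSTACLE_RANGE_M else "nonobstacle"


# Two quite different scans fall into the same equivalence class, so the
# consuming behavior treats them identically.
scan_a = [0.3, 2.0, 2.1, 1.9]
scan_b = [1.5, 1.4, 0.45, 0.2]
scan_c = [1.5, 1.4, 1.6, 2.2]

print(obstacle_class(scan_a), obstacle_class(scan_b), obstacle_class(scan_c))
```

A landmark-or-anything-else classifier for a different behavior would be a separate function over the same readings, which is what allows the perceptual tasks to run independently.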
7.4.4 Lightweight Vision
Horswill (1993a) has forwarded a considerably less theoretical, more pragmatic, but no less important approach to the design of perceptual algorithms. The overall approach, dubbed lightweight vision, focuses on specializing individual perceptual processes tailored to the behavioral tasks at hand. Horswill argues that although this specialization might lend the appearance of a collection of disjointed, ad hoc solutions, there are principled means by which these specialized modules can be analyzed to produce both generalization and potential reusability. This is accomplished in part by making explicit the assumptions that underlie the application of a perceptual algorithm for a given task environment.
Lightweight vision incorporates both task and environmental constraints. These explicit constraints provide the basis for design of the specialized perceptual module. The claim is made that for most real-world task-environment pairs, a potentially large number of perceptual solutions exist. Each solution within that solution space is referred to as a "lightweight" system.
Lightweight vision is loosely related to Donald's methods in perceptual equivalence classes (section 7.4.3). An equivalence class is concerned with the theoretical equivalence of multiple perceptual systems in a task's context. Horswill acknowledges that many systems provide an equivalent solution for a given perceptual class (as defined by a task environment); it becomes the engineer's goal, however, to define a low-cost, highly efficient perceptual solution within that space of potential systems. Thus lightweight vision provides a design methodology for constructing specialized perceptual algorithms for use in behavior-based robotic systems (i.e., those that do not require reconstruction of the environment in some abstract representational form). Design is accomplished through the explicit declaration of constraints imposed on the perceptual task.
Habitat constraints refer to the set of environments within which the specialized system will operate. These constraints lead to the formulation of a computational problem that then lends itself to optimization. Polly (figure 7.9) is a robot whose purpose is to roam through the corridors of the MIT AI lab and provide tours for visitors as needed. A partial list of the percepts required for this task appears in table 7.2.
Table 7.2 Partial list of percepts used within Polly.
open-left?       open-right?          blocked?
open-region?     blind?               vanishing-point?
light-floor?     dark-floor?          farthest-direction?
person-ahead?    person-direction?    wall-ahead?
Table 7.3 elaborates the habitat constraints that the environment provides for computation of depth recovery (needed for obstacle avoidance) and vanishing point (needed for heading information for navigation). These constraints pose tractable computational problems that in turn lend themselves to efficient, low-cost lightweight solutions for each of the perceptual modules required for the laboratory tour.
Polly's architecture consists of a coupled low- and high-level navigation system. The low-level architecture is composed of speed control, corridor follower, wall follower, and ballistic turn behaviors using a subsumption-style arbitration mechanism. The high-level navigation system provides directions as to where to go next in the tour and conducts place recognition to assist in verifying where the robot is, at various places within the tour script.
Polly is a very robust system, having given in excess of 50 tours in two different laboratories (MIT and Brown University). Perceptual specialization (also known as lightweight vision) has enabled a small, computationally weak robot to conduct a complex visual navigational task by permitting the design of algorithms that fit a specific task-environment pair. The vision modules may also be reused in similar tasks and environments. This system has clearly demonstrated the pragmatism of modular vision in behavior-based robotic design.
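The ground-plane constraint used for depth recovery can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Horswill's implementation: the image is a small grid of booleans in which True marks a pixel that differs from the uniform floor texture, row 0 is the top of the image, and the function names are hypothetical. With a level floor and a forward-looking camera, the higher in the image the first non-floor pixel appears in a column, the farther away the nearest obstacle in that direction is.

```python
from typing import List


def column_free_space(non_floor: List[List[bool]]) -> List[int]:
    """For each image column, count floor pixels from the bottom row upward
    until the first non-floor pixel. Larger counts mean more free space."""
    rows, cols = len(non_floor), len(non_floor[0])
    free = []
    for c in range(cols):
        count = 0
        for r in range(rows - 1, -1, -1):  # scan from bottom to top
            if non_floor[r][c]:
                break
            count += 1
        free.append(count)
    return free


def farthest_direction(free: List[int]) -> int:
    """Index of the (first) column with the most free floor."""
    return max(range(len(free)), key=lambda c: free[c])


image = [
    [True,  True,  True,  True],
    [True,  False, False, True],
    [False, False, False, True],
    [False, False, False, False],
]
free = column_free_space(image)
print(free)                      # free floor pixels per column
print(farthest_direction(free))  # column offering the most free space
```

The sketch never reconstructs metric depth; image height alone orders the columns by distance, which is the kind of optimization the habitat constraint is meant to license.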
7.5 ACTION AND PERCEPTION
Perceptual modules and motor behaviors can be bundled together in different ways. To some extent, one may question whether perception is driving action or vice versa. Two different forms of behavioral perception have resulted:
• action-oriented perception, in which behavioral needs determine the perceptual strategies used, and
• active perception, in which perceptual requirements dictate the robot's actions.
Figure 7.9 Polly, a robotic tour guide. (Photograph courtesy of Rodney Brooks.)
Table 7.3 Two sets of exemplar habitat constraints and their use in Polly's task.
Perceptual Need | Constraint | Computational Problem | Optimization
Depth recovery for navigation | Ground plane assumption (indoor level floor) |  | use height in image
 | Background texture constant (carpet) |  | use texture algorithm
Corridor vanishing point for heading | Long corridor edges (office building) | line finding | use pixels as lines
 | Strong corridor edges | edge detection | low-cost detector
 | Known camera tilt | clustering | 1D clustering
To some extent, this is a "chicken and egg" question, relating to the origin of perception in animals: Did perception result from the requirements of locomotion, or did the evolution of perception enable locomotory capabilities? We leave this question for others to answer and look instead at the impact of both of these approaches on the design of behavior-based robotic systems.
7.5.1 Action-Oriented Perception
Action-oriented perception requires that perception be conducted in a top-down manner on an as-needed basis, with perceptual control and resources determined by behavioral needs. As stated earlier, this is in contrast to more traditional computer vision research, which to a large extent takes the view that perception is an end in itself or that its sole purpose is to construct a model of the world without any understanding of the need for such a model. These non-action-oriented strategies burden a robotic system with unnecessary processing requirements that can result in sluggish performance.
Action-oriented perception is not a new concept. As we have seen, it has roots in both cybernetics (Arbib 1981) and cognitive psychology (Neisser 1976). The underlying principle is that perception is predicated on the needs of action: Only the information germane for a particular task need be extracted from the environment. The world is viewed in different ways based upon the agent's intentions.
Action-oriented perception has many aliases: selective perception (Simmons 1992), purposive vision (Aloimonos and Rosenfeld 1991), situated vision (Horswill and Brooks 1988), and task-oriented perception (Rimey 1992) are a few. The underlying thesis for this method is that the nature of perception is highly dependent on the task that is being undertaken. The task determines the perceptual strategy and processing required. This approach, developed in conjunction with reactive robotic systems, tailors perception to fit the needs of individual motor behaviors. Instead of trying to solve the so-called general vision problem by attempting to interpret almost everything an image contains, an advantage is gained by recognizing that perceptual needs depend on what an agent is required to do within the world. Avoiding the use of mediating global representations in the pathway between perception and action removes a time-constraining bottleneck from robotic systems.
In action-oriented perception, the motor behaviors provide the specifications for a perceptual process: what must be discerned from environmental sensing and constraints as to where it may be located. Focus-of-attention mechanisms play a role in directing the perceptual process as to where to look, and expectations provide clues (e.g., models) as to what the appearance of the event being sought is. How to perceive the desired event is captured in the perceptual module's computational process.
Behavior-based systems can organize perceptual information in three general ways: sensor fission (perceptual channeling), action-oriented sensor fusion, and perceptual sequencing (sensor fashion) (figure 7.10). Sensor fission is straightforward: a motor behavior requires a specific stimulus to produce a response, so a dedicated perceptual module is created that channels its output directly to the behavior. A simple sensorimotor circuit, numerous examples of which we have already encountered in chapter 3, results.
Action-oriented sensor fusion permits the construction of transitory representations (percepts) local to individual behaviors. Restricting the final percept to a particular behavior's requirements and context retains the benefits of reactive control while permitting more than one sensor to provide input, resulting in increased robustness. Section 7.5.7 explores these issues further.
Sometimes fixed-action patterns require varying stimuli to support them over time and space. As a behavioral response unfolds, different sensors or different views of the world may modulate it. Perceptual sequencing allows the coordination of multiple perceptual algorithms over time in support of a single behavioral activity. Perceptual algorithms are phased in and out based on the agent's needs and the environmental context in which it is situated. The phrase sensor fashion coarsely captures this notion of the significance of differing perceptual modules changing over time and space. Section 7.5.6 studies this aspect of coordinated perception in more detail.
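The three organizations can be contrasted in a short Python sketch. All sensor names, thresholds, and behaviors here are hypothetical and chosen only for illustration; the sketch shows the wiring between perceptual modules and behaviors, not any particular robot's implementation.

```python
from typing import Callable, Dict, List

Readings = Dict[str, float]


# Sensor fission: one dedicated perceptual module channels into one behavior.
def detect_bump(readings: Readings) -> bool:
    return readings["bumper"] > 0.5


def back_up(readings: Readings) -> str:
    return "reverse" if detect_bump(readings) else "idle"


# Action-oriented sensor fusion: several sensors feed one transitory percept
# that is local to a single behavior.
def fused_obstacle_range(readings: Readings) -> float:
    # Conservative fusion: trust whichever sensor reports the nearer obstacle.
    return min(readings["sonar_m"], readings["laser_m"])


def avoid_obstacle(readings: Readings) -> str:
    return "turn" if fused_obstacle_range(readings) < 1.0 else "forward"


# Perceptual sequencing: one behavior, different perceptual modules over time.
def find_dock_far(readings: Readings) -> bool:
    return readings["beacon_strength"] > 0.8


def find_dock_near(readings: Readings) -> bool:
    return readings["sonar_m"] < 0.3


def docking_step(stage: int, readings: Readings) -> int:
    """Advance to the next perceptual module when the current one fires."""
    modules: List[Callable[[Readings], bool]] = [find_dock_far, find_dock_near]
    if stage < len(modules) and modules[stage](readings):
        return stage + 1
    return stage


readings = {"bumper": 0.0, "sonar_m": 0.8, "laser_m": 1.4, "beacon_strength": 0.9}
print(back_up(readings), avoid_obstacle(readings), docking_step(0, readings))
```

In each case the percept stays private to the behavior that consumes it; nothing is written to a shared world model.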
Figure 7.10 Dimensions of action-oriented perception: (A) Sensor fission: multiple independent motor behaviors, each with its own perceptual module; (B) Sensor fusion: multiple perceptual submodules supporting a single perceptual module within the context of a single motor behavior; and (C) Sensor fashion: multiple perceptual modules are sequenced within the context of a single motor behavior. The different dimensions can be composed together.
Crowley et al. (1994) have tried to formalize reactive visual processes. In their approach, a set of virtual sensors creates mappings onto an action space determined by actuators and their associated controllers. The behavior itself provides the mapping from perceptual space onto action space. Supervisory control permits the selection and control of the sequencing of individual perceptual processes in the style of hybrid architectures (chapter 6).
7.5.2 Active Perception
Active perception focuses primarily on the needs of perception, rather than the needs of action. The question changes from the action-oriented perspective of "How can perception provide information necessary for motor behavior?" to "How can motor behaviors support perceptual activity?" These two viewpoints are not mutually exclusive; indeed, active perception and action-oriented perception are intimately related. What the agent needs to know to accomplish its tasks still dictates perceptual requirements, but active perception provides the perceptual processes with the ability to control the motor system to make its task easier as well. In contrast, Blake (1995) characterizes nonactive perceptual strategies that rely upon a single vantage point as the equivalent of a "seeing couch potato."
According to Pahlavan, Uhlin, and Eklundh's useful definition (1993, p. 22), "An active visual system is a system which is able to manipulate its visual parameters in a controlled manner in order to extract useful data about the scene in time and space." Bajcsy's (1988) seminal paper on active perception characterizes it as the application of intelligent control to perception using feedback from both low-level sensory processes and high-level complex features. Useful a priori knowledge of the world may be either numeric or symbolic and encodes sensor models and expectations. Active perception shares the view with action-oriented perception that sensory modules are a principal commodity, with perceptual goals embedded in these modules. Active perception is thus defined as an intelligent data acquisition process, intelligent in its use of sensors guided by feedback and a priori knowledge.
Work at the University of Toronto epitomizes this approach. Wilkes and Tsotsos (1994) define behaviors for controlling a video camera to extract the necessary information for object recognition. Three specific behaviors are defined for this purpose: image-line-centering, providing rotation to the camera to orient the object vertically within the image (rotational control); image-line-following, providing translations parallel to the image plane to track object features (tangential control); and camera-distance-correction, providing translation along the optical axis, in essence, zooming (radial control). Because these behaviors permit the camera to control its position in space, it can use active exploration strategies to confirm or annul hypotheses generated from the candidate object models available to the system. Figure 7.11 captures the system architecture. The camera actively pursues its goal, moving about at the end of a robotic arm to continually obtain more advantageous viewpoints until an identification is achieved. The system was tested successfully in complex scenes that suffered from high levels of occlusion. The ability to have multiple views guided by feedback from the recognition process reduces the computational complexity substantially over what may otherwise be an impossible task when data is from only a single or an arbitrary collection of prespecified camera viewpoints. Similar research has also been conducted at the University of Wisconsin (Kutulakos, Lumelsky, and Dyer 1992).
Figure 7.11 Exemplar active vision architecture. (Diagram labels: image, low-level features, correspondence-based tracking, controller, action, rotational motion, radial motion, tangential motion, index of object candidates, new view (if uncertain), object identity (if certain).)
Ballard (1989) has summarized the important characteristics of active vision (which he refers to as animate vision):
• An active vision system has control over its cameras: it can search, move in, change focus, and so forth.
• Active vision can move the camera in preprogrammed stereotypical ways, if needed.
• Image features can be isolated and extracted reliably without the use of models.
• Alternate coordinate systems become available for exploration instead of merely an egocentric frame of reference, as is often the case with purely reactive systems. Action and perception can be conducted with an object-centered frame of reference or with respect to some independent feature within the world.
• Fixation points provide the ability to servo within the visual frame of reference.
• Object-centered coordinate systems have inherent advantages due to their invariance with respect to observer motion.
Active vision has been treated in several other books (Aloimonos 1993; Christensen, Bowyer, and Bunke 1993) to which the interested reader is referred. Our discussion now returns to a more behavior-based action-oriented perspective in which an agent's intentions directly determine perceptual requirements.
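Before leaving active perception, the confirm-or-annul loop of figure 7.11 can be sketched roughly in Python. This is a toy illustration, not the Wilkes and Tsotsos system: the candidate objects, the per-viewpoint feature table, and the certainty rule (stop when exactly one candidate remains) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical object models: the feature each candidate shows from each view.
MODELS: Dict[str, List[str]] = {
    "mug": ["cylinder", "handle", "cylinder"],
    "can": ["cylinder", "cylinder", "cylinder"],
    "box": ["rectangle", "rectangle", "rectangle"],
}


def identify(observe: Callable[[int], str], n_views: int) -> Optional[str]:
    """Take new views until a single object hypothesis survives."""
    candidates: Set[str] = set(MODELS)
    for view in range(n_views):
        seen = observe(view)  # stands in for moving the camera and looking
        candidates = {c for c in candidates if MODELS[c][view] == seen}
        if len(candidates) == 1:  # certain: report the object identity
            return next(iter(candidates))
        if not candidates:  # every hypothesis has been annulled
            return None
    return None  # still uncertain after all available views


# Simulated world containing a mug.
result = identify(lambda view: MODELS["mug"][view], n_views=3)
print(result)
```

The first view leaves two hypotheses standing; only the second, taken from a more informative vantage point, separates them, which is the benefit the multiple guided views are meant to provide.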
7.5.3 What About Models?
A priori knowledge of the world can play a significant role during perceptual processing. Particular cases favor the appropriate use of this form of knowledge. These uses include providing expectations for perceptual processes: what to look for (section 7.5.4); providing focus-of-attention mechanisms: where to look for it (section 7.5.5); sequencing perceptual algorithms correctly: when to look for it (section 7.5.6); and initially configuring sensor fusion mechanisms: how to look for it (section 7.5.7). These uses are entirely proper within the context of motor behavior. It is not so much in what form the representation appears but rather how it is used that distinguishes the action-oriented approach from other methods.
There is also the issue of percepts, events perceived within the environment, and how they should be encoded, if at all. The dictionary defines a percept as "an impression of an object obtained by use of the senses; sense-datum" (Webster 1984). A question arises as to whether it is at all necessary to create an abstract representation encoding such an event or impression. The pure reactivist would argue that the signal on a sensor circuit wire is the encoding of the sensor output and need not be made more persistent or abstract than that. Certainly stimulus-response pairs can be characterized in this manner, so why add any additional representation from a perceptual perspective? A recent response to this position (Agre and Chapman 1990) advocates the use of transient deictic representations, which capture the agent's relationship with its environment at any particular point. An example from their Pengi system (Agre and Chapman 1987), a video game environment including various animal agents such as penguins and bees, is the-bee-I-am-chasing. The agent's intentions and the situation in which the agent finds itself define the particular bee in question. Pengi uses visual routines (section 7.4.2) to extract the relevant data for the situated task pair.
These fleeting action-oriented representations are constructed as necessary in support of particular behavioral needs. The routines used in this system, however, were not physically tied to an operating sensor system.
Hayhoe, Ballard, and Pelz's (1994) work at Rochester embodied these ideas in a real system. Camera fixations provided the contextual bindings for the deictic representations. Perceptual information, instead of being stored as a history, is required immediately prior to its use. Descriptions created are relevant only to a particular task, avoiding the memory-consuming representations needed should traditional geometric models be stored. To some extent, the environment itself serves as the representational storehouse, requiring only suitable perceptual algorithms to extract the information as needed. The system was tested in a simple blocks world environment, with simple perceptual routines to GetColor and GetLocation to move the blocks from one location to another. Cognitive models of human performance motivated this study, which concluded that the visual representations humans use are "extremely scant" with "minimal information carried over between fixations." It can be argued that these conclusions serve as solid guidelines in the use of visual representations for behavior-based systems as well.
Brill's (1994) research at the University of Virginia argues for the extension of deictic representations to include local spatial memory. The impetus is to provide information that goes beyond the limits of reflex action, providing to the control system some history of events within the world that go beyond the sensory system's immediate range. Deictic representations again serve as the basis for the basic constructs, but added memory provides the robot with a broader perceptual layout. The system thus can both act from immediate sensory data, as in a typical reactive system, and react to spatial memory constructed in a task-oriented manner (figure 7.12), reminiscent of the structure seen in the modified action-perception cycle discussed earlier (section 7.5.1). It also resembles Payton's methods for internalizing plans (section 5.2.2.2) but is concerned with sensor-derived deictic models as opposed to incorporating a priori map knowledge. Brill's conceptual framework has been explored only in a preliminary manner thus far.
Figure 7.12 Control system using deictic spatial memory. (Diagram labels: sensors, perception, memory, action.)
We now move on to investigating how information about the world, however it may be represented or derived, can help constrain the inherent complexity of perceptual interpretation as we examine the intertwined roles of expectations and focus-of-attention mechanisms.
7.5.4 Expectations
Expectations enable limited computational resources to perform effectively in real time. Psychological (e.g., Neisser 1976) and neuroscientific (e.g., Spinelli 1987) studies have consistently reaffirmed expectations' role in biological systems. Expectations can tell perceptual processes at least two things: where to look for a particular perceptual event and how that particular perceptual event will appear. Temporal continuity and consistency assumptions can enable a system to restrict the possible location of an object from one point in time to the next. Things generally do not disappear in one place and materialize in another. This fact, coupled with knowledge of the system's egomotion, allows significant constraints to be applied to sensory processing.
One way expectations can be exploited is by a recognition that much perceptual activity involves two major components: recognition and tracking. The recognition or discovery phase is often a model-driven approach based on the anticipation of what the object in question might look like. Its representation or characteristics can be retrieved from memory or may be hard-wired into a specific object recognition behavior. In some cases, as in the case of obstacles, a recollection of how specific obstacles actually appear need not be invoked but rather a recall of the affordances that make an obstacle an obstacle. Once recognition is achieved, this phase is abandoned in favor of the newly arrived percepts forming the basis for tracking the whereabouts of the object in question. This tracking phase is computationally far less demanding than discovery, because the expectations derived from perception are more immediate and hence more reliable and accurate than those long-term recall or a more abstract model provides.
Illustrating this process by example, suppose that an agent's instruction is to turn right just past the first oak tree. If the agent has previously traveled the route, some mental image may exist of what the particular tree in question looks like and where it might be located.
This recollection may be imperfect, having been formed under different environmental conditions (night instead of day, fog versus sunshine, etc.). Additionally, the tree may have grown, been damaged, or been altered in some manner since last observed. Thus we must assume that our model can provide at best only limited knowledge of how the tree should appear.
It is also entirely possible that the agent may never have traveled the route before, and thus some generic model for an oak tree will need to be invoked. This forces greater computational demand for recognition, as fewer constraints can be placed upon the interpretation of incoming sensor data. Nonetheless, some restrictions can be placed upon the scene interpretation regarding where oak trees are likely to occur in general and their typical characteristics (e.g., height, color, breadth, leaf shape, etc.).
Let us now assume that the recognition phase has successfully discerned the oak tree in question. The task remaining is simply to track the position of this tree until the next change in motor activity, which is the right turn, can occur. We no longer need to continually distinguish what type of tree it is; rather, all that is needed is to track its position so that the turn can be made when the opportunity presents itself. The perceptual processing now concentrates on the tree's immediately extracted perceptual features and not on some vague or outdated representation. Color or shape may be adequate for tracking purposes in a setting where the tree is solitary. By exploiting the assumptions that the tree remains anchored in place and its reflectance is relatively constant, it can readily be tracked as the robot moves about. Although lighting variations and other environmental changes can and do occur, the expectations of the tree's tracked features are continuously updated largely independent of any a priori model. No independent sensor-based representation of the tree is maintained or needed at the reactive level. This form of adaptive tracking remains in effect until either the task is completed or the recognition occurs that some perceptual criterion has been violated, forcing a recall of the recognition phase.
Researchers at the University of Maryland (Waxman, Le Moigne, and Srinivasan 1985; Waxman et al. 1987) have looked at the recognition and following of roadways as part of the DARPA autonomous land vehicle project. Their process for road following is broken down into two distinct phases: bootstrap image processing and feed-forward processing. The bootstrap system's purpose is to find the road's location without prior information regarding the vehicle's position.
Once the road has been successfully identified and the vehicle is in motion, a feed-forward strategy is employed that relies heavily on the inertial guidance system for dead reckoning. Where the road should appear within the image is predicted based on its last appearance and the motions sensed by the INS. These predictions restrict the possible location of features so that processing can be limited to small subwindows centered on their expected positions, reducing the computation dramatically. The actual constraints based on error studies have been set at 0.25 meters and 1 degree of orientation, fine for inertial guidance but impractical for most lower-cost systems. To maintain these tolerances, an expensive pan-tilt mechanism is required as well. Line extraction, combining evidence from multiple image subwindows to yield the long parallel lines of the roadsides, is the principal feature used for road identification.
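The feed-forward idea can be illustrated with a small Python sketch. It assumes a single image row, a hypothetical brightness threshold marking the road edge, and a pixel shift supplied by dead reckoning; it is not the Maryland system, only an illustration of restricting the search to a subwindow around the predicted position and falling back to a full search when the expectation fails.

```python
from typing import List, Optional


def find_edge(row: List[int], lo: int, hi: int, threshold: int = 128) -> Optional[int]:
    """Return the first index in [lo, hi) whose value exceeds threshold."""
    for i in range(max(lo, 0), min(hi, len(row))):
        if row[i] > threshold:
            return i
    return None


def track_edge(row: List[int], last_pos: int, predicted_shift: int,
               half_window: int = 2) -> Optional[int]:
    """Search only a small subwindow centered on the expected position.

    Falls back to a full-row search (standing in for the costly bootstrap
    phase) if nothing is found where the edge was expected.
    """
    expected = last_pos + predicted_shift
    found = find_edge(row, expected - half_window, expected + half_window + 1)
    if found is None:
        found = find_edge(row, 0, len(row))
    return found


row = [0, 0, 0, 0, 0, 0, 200, 0, 0, 0]
print(track_edge(row, last_pos=4, predicted_shift=1))  # edge lies inside the window
print(track_edge(row, last_pos=0, predicted_shift=0))  # expectation fails, full search
```

With an accurate prediction only five pixels are examined instead of the whole row; the tighter the motion estimate, the smaller the window can be made.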
7.5.5 Focus of Attention
Focus of attention, closely related to expectations, provides information regarding where an agent should expend its perceptual resources, guiding where to look within the image or determining where to point the sensors. This in essence is a search reduction strategy. The problem can be phrased as how to prune the search space within which the desired perceptual event appears. Several techniques use hardware strategies such as foveated cameras or focus control; others exploit knowledge of the world to constrain where to look. We first examine the strong evidence supporting attentional mechanisms in human visual processing. The advantages from a computational perspective are presented, followed by examples of both hardware and software systems these issues have influenced.
7.5.5.1 The Role of Attention in Human Visual Processing
Vision in natural environments confronts the observer with a large number of potential stimuli within the field of view. Biederman (1990) claims that there are at least three reasons attentional mechanisms must select one entity for observation from the many potentially available:
• The eye itself is foveated; the retina contains only a small region (the fovea) capable of resolving fine detail. This is manifested in the spotlight metaphor, in which information is filtered so that attention is paid only at the center of attraction (Olshausen and Koch 1995).
• Shifting attention from one region to another does not necessarily require eye movements and thus can be done at very high speeds.
• Serial shifting of attention provides the ability to integrate and use different features, such as color and shape. This serialization has computational advantages as a consequence of the highly constrained manner in which a scene is explored.
Culhane and Tsotsos (1992) at the University of Toronto developed a strategy to reduce the computational complexity of visual search in machine vision. An attentional "beam," a localized region within an image, is used within a multilevel abstraction hierarchy that selectively inhibits irrelevant image regions (figure 7.13). The hierarchy represents various levels of visual processing, with the attentional beam applied to the most abstract level. This in turn controls low-level processing, thus reducing dramatically the amount of image processing required.
Figure 7.13 Attentional beam (after Culhane and Tsotsos 1992). (Diagram labels: attentional spotlight, most abstract layer, least abstract layer, pass zone, inhibition zones.)
Culhane and Tsotsos characterize their method as a "continuous and reactive mechanism." This system focuses primarily on early vision and seems most appropriate for tasks such as target tracking. After the initial target location is determined, attention constrains processing at lower regions in the hierarchy by inhibiting computation in regions outside of where the target would be expected to appear within the next incoming image. This model can also handle object recognition via scan path methods (Stark and Ellis 1981) (section 5.3). A prioritized search for different features at different spatial locations within an image can be conducted easily by serially pointing the attentional beam at different image regions.
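The effect of the beam can be illustrated with a two-level Python sketch. This is a simplification made for illustration, not the Culhane and Tsotsos algorithm: the hierarchy has only two levels, the abstract level is built by taking block maxima, and the function names are hypothetical. The winner at the abstract level selects a pass zone, and the subsequent fine-level search is confined to that zone.

```python
from typing import List, Tuple

Image = List[List[int]]


def coarse_level(img: Image, block: int) -> Image:
    """Build the more abstract layer: the maximum of each block x block tile."""
    rows, cols = len(img), len(img[0])
    return [
        [max(img[r][c]
             for r in range(br, min(br + block, rows))
             for c in range(bc, min(bc + block, cols)))
         for bc in range(0, cols, block)]
        for br in range(0, rows, block)
    ]


def attend(img: Image, block: int = 2) -> Tuple[Tuple[int, int], int]:
    """Locate the strongest response, searching fine pixels only in the beam.

    Returns the (row, col) of the winner and the number of fine-level pixels
    inspected after the coarse winner was chosen.
    """
    coarse = coarse_level(img, block)
    br, bc = max(
        ((r, c) for r in range(len(coarse)) for c in range(len(coarse[0]))),
        key=lambda rc: coarse[rc[0]][rc[1]],
    )
    best, inspected = (br * block, bc * block), 0
    for r in range(br * block, min((br + 1) * block, len(img))):
        for c in range(bc * block, min((bc + 1) * block, len(img[0]))):
            inspected += 1
            if img[r][c] > img[best[0]][best[1]]:
                best = (r, c)
    return best, inspected


image = [
    [1, 2, 0, 0],
    [0, 1, 0, 3],
    [0, 0, 9, 1],
    [2, 0, 1, 0],
]
print(attend(image))
```

In a tracking setting the pass zone for the next frame would be placed where the target is expected to appear, so that any expensive fine-level processing is applied to a quarter of this image rather than all of it.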
7.5.5.2 Hardware Methods for Focus of Attention
Hardware solutions for providing attention have also been studied. One strategy is to embed the spotlight notion directly into the hardware, using a foveated camera, similar in principle to the eye, from the onset. Kuniyoshi et al. (1995) review several such methods, where
• multiple images are acquired using a zoom lens, sequentially focusing in on the object being attended to.
• multiple cameras use differing fields of view (e.g., wide-angle and telephoto lens) while coordinating the cameras.
• a CCD chip is designed as an artificial retina with a dense foveated region.
• a specially designed lens is used with a conventional CCD array to produce a foveated space-varying image.
Although progress has been made in developing specialized hardware consistent with the eye and potentially capable of exploiting comparable focus-of-attention mechanisms, little work to date has incorporated them into complete robotic systems.
7.5.5.3 Knowledge-Based Focus-of-Attention Methods
Knowledge about the world can also guide the application of attentional processing. In research conducted at the University of Chicago, Fu, Hammond, and Swain (1994) have developed methods to link regularities present in the everyday world with perceptual activity. In their GroceryWorld example, in which an agent is shopping for food at a supermarket, domain-specific knowledge is exploited to guide perceptual processing. For example, foods are typically grouped together by types (soups, cereals, meats); within types, brands tend to be colocated; big items are stored low on the shelves, perishable items in freezers, and so forth. This heuristic knowledge generates expectations as to where in a store to find certain items. Perceptual activities are then structured in a manner that yields the desired outcome: finding the food item on the shelf. Visual routines in this system include a type-recognizer and an item-recognizer built from three simple and fast vision algorithms. These routines are highly specific to the task-environment in which the shopper operates (an image database of a grocery store). By exploiting the regularities within this domain, very simple perceptual algorithms can accomplish what would otherwise be a very difficult task.
Another approach is to introduce deep, causal knowledge about the world to focus attention. In a blocks world example using the BUSTER system developed at Northwestern University, Birnbaum, Brand, and Cooper (1993) applied causal analysis to explain why a complex configuration of blocks is capable of remaining standing.
Expectations regarding the causes of physical support drive the investigation, allowing attention to be applied to regions of the image most likely to provide evidence for an explanation. Attention is first directed to lower parts of the image, where physical support begins. After establishing support at these lower regions, image processing proceeds upward, providing explanations as to why or how a particular blocks world configuration can indeed remain standing. This yields a visual attentional trace through the image not unlike the scan paths discussed in section 5.3. Here, explanation drives visual attention: The agent's goal is to determine why something is the way it is. Causal semantics eventually will be embedded in a complete robotic system, and the resulting progress will certainly be of interest.

7.5.6 Perceptual Sequencing
It is entirely possible, and in many instances highly desirable, to have more than one perceptual algorithm associated with a single motor behavior at different stages during its activation. Sensor fission is concerned with multiple parallel independent concurrent algorithms, and sensor fusion is concerned with combining multiple related concurrent perceptual algorithms. Sensor fashion (perceptual sequencing), on the other hand, is concerned with the sequencing of multiple related perceptual algorithms.
Sensor fashion, an aspect of action-oriented perception, recognizes that the perceptual requirements for even a single motor behavior often change over time and space. For example, different perception is required to recognize an object when it is far away than when it is close. Often, entirely different perceptual cues may be used over the course of a single behavior.
Assume, for example, that a mobile robot's task is to dock with a workstation in a factory (Arkin and MacKenzie 1994). One of the behaviors necessary for this task is Docking, which provides the response regarding how to move relative to the perceived workstation stimulus. Because this operation is carried out over a wide range of distances (from as much as 100 feet away to immediately in front of the workstation), no single perceptual algorithm is adequate for the task. Instead, four distinct perceptual algorithms are coordinated sequentially.
When the robot is far away from the workstation, the limitations of the video lens make it impossible to discern the dock's structure. The first perceptual algorithm cues either from lighting conditions at the workstation or from motion, to generate a hypothesis about the workstation's location. The docking behavior influences the robot and it starts moving toward the hypothesized target. As the robot approaches the workstation, it must positively identify it. A more computationally expensive algorithm, exploiting a spatially constrained version of the Hough transform (a model-based object recognition algorithm), is used to make a positive identification, confirming that the perceptual event in question is truly the workstation. As soon as the workstation is positively identified, an adaptive tracking methodology is used, based on region segmentation. At the final stages of docking, ultrasound positions the robot.
The primary issues in perceptual sequencing are when to use each of these algorithms and how to determine the best time to switch perceptual strategies. In our example, while the robot is far from the dock, we use the long-range detection algorithm, heading directly towards the dock; when within an expected recognition range, the model-based strategy is used with the resulting cue triggering the approach of the docking behavior. As the range closes, a transition from vision to ultrasound occurs as the camera's field of view becomes useless at close range.
Related to this is the notion of a perceptual trigger, a perceptual process that invokes a change in either the behavioral or perceptual state of the robot. The behavior-based robot's control system is reconfigured when a perceptual trigger fires, changing either the set of active behaviors, the set of active perceptual processes, or both. These triggers can be quite simple, such as proximity information or detection of a color or motion (e.g., a cloak waved in front of a bull), or they can be more complex, like the dock recognition algorithm mentioned above. We have seen, in section 4.4.4, how perception can be used to trigger different behaviors for a foraging task.
An experimental run for our example shows the robot using four different perceptual schemas in the course of traversing a distance from thirty feet to one-half foot. A finite state acceptor (FSA) (figure 7.14) is used to express the relationships among the individual perceptual strategies in the context of the docking behavior. Allowing failure transitions to be present within the FSA ensures robustness. Figures 7.15 and 7.16 illustrate the example docking run.
At the University of Pennsylvania, Kosecka, Bajcsy, and Mintz (1993) have developed a similar approach based on discrete event systems (DES). This method uses supervisory control to enable and disable various perceptual and motor events: Abrupt sensory events trigger different strategies. Bogoni and Bajcsy's (1994b) extension of the DES approach to the investigation of functionality uses a supervisory overseer for controlling, arbitrating, and fusing evidence from multiple sensory vantage points. In their work, developed for the manipulation domain, an overseer uses FSAs to control multiple sensors (figure 7.17). The overseer's state is reflected to the two subordinate sensors (in the example shown, tactile and vision). Tactile sensing provides only contact information whereas vision can be informative for all states. Figure 7.18 shows a considerably more complex mapping for a piercing task (detecting when a screwdriver has pierced a styrofoam box).
Figure 7.14
FSA encoding temporal sequencing used for docking behavior.
States: q0: ballistic perception; q1: exteroceptive cue; q2: adaptive tracking; q3: final positioning; q4: normal termination; q5: abnormal termination.
Transitions: n: normal; t: terminal; r: recoverable error; f: fatal error; a: all.
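A finite state acceptor of this kind can be sketched in a few lines of Python. The state and event names come from the figure 7.14 legend, but the transition structure is an assumption made for illustration, since the text does not spell out the arcs: a normal event (n) keeps the current state, a terminal event (t) advances to the next perceptual strategy, a recoverable error (r) falls back to the previous strategy, a fatal error (f) leads to abnormal termination, and the two termination states absorb all events (a). The tables `ADVANCE` and `FALL_BACK` and the functions `step` and `run` are hypothetical names.

```python
STATES = {
    "q0": "ballistic perception",
    "q1": "exteroceptive cue",
    "q2": "adaptive tracking",
    "q3": "final positioning",
    "q4": "normal termination",
    "q5": "abnormal termination",
}

# Assumed arcs: 't' moves forward one stage, 'r' moves back one stage.
ADVANCE = {"q0": "q1", "q1": "q2", "q2": "q3", "q3": "q4"}
FALL_BACK = {"q0": "q0", "q1": "q0", "q2": "q1", "q3": "q2"}
TERMINAL_STATES = {"q4", "q5"}


def step(state, event):
    """Return the next state for one event: n, t, r, or f."""
    if state in TERMINAL_STATES:
        return state              # termination states absorb all events
    if event == "n":
        return state              # normal: keep using the current strategy
    if event == "t":
        return ADVANCE[state]     # terminal: this stage is done, move on
    if event == "r":
        return FALL_BACK[state]   # recoverable error: retry the earlier stage
    if event == "f":
        return "q5"               # fatal error: abnormal termination
    raise ValueError(f"unknown event: {event!r}")


def run(events, state="q0"):
    """Feed a sequence of events to the FSA and return the visited states."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace
```

For instance, `run("nttnrttt")` visits q2, drops back to q1 on the recoverable error, and then proceeds to q4, which loosely mirrors a run in which tracking is lost and the dock is recognized again before final positioning.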
Figure 7.15
Trace of docking run with an obstacle present. Active behaviors include docking, noise, and avoid-static-obstacles. Diagram labels: Obstacle, Light, Robot, Dock.
Figure 7.16
Perceptual Sequencing for Docking: (A) Phototropic long-range dock detection; (B) Successful model-based dock recognition; (C) Adaptive tracking using region segmentation; (D) Loss of region due to obscuration; (E) Rerecognition using model-based technique; and (F) Final adaptive tracking image followed by final positioning using ultrasound.

7.5.7 Sensor Fusion for Behavior-Based Systems
A man with a watch knows what time it is; a man with two watches isn't so sure.
- Anonymous
Sensor fusion's traditional role has been to take multiple sources of information, fuse them into a single global representation, and then reason over that representation for action, an approach at odds with the basic view of representation's role in behavior-based robotics. This does not mean, however, that sensor fusion no longer has any place within behavior-based robots, for multiple sources of information can significantly enhance the way an agent acts within the world. Action-oriented sensor fusion advocates that sensor reports be fused only within the context of motor action and not into some abstract, all-purpose global representation. Fusion is based on behavioral need and is localized within the perceptual processes that support a particular behavior.
Figure 7.17
In this simple example of the DES approach, the overseer's states are mapped into two sensors, vision and force, each contributing when it can. The shaded area indicates no useful information is available for that particular set of states. State labels: A: Approach; N: Near Contact; F: Failed Contact; C: Contact.

Sensor fusion, in any case, is not as simple as it sounds; incoming evidence may be complementary (i.e., in support of other observations) or competitive (i.e., in contradiction). Further, the incoming evidence may be arriving at different times (asynchronously), as some sensors take longer to process than others. Often there are qualitative distinctions in the nature of the information provided: Vision may yield color information regarding the presence of a soda can, a laser striper may yield shape data, and a tactile sensor the can's surface texture. The information may also be coming from widely separated viewpoints. Deciding what to believe is a complex task, and the behavior-based roboticist's goal is to provide a single coherent percept consistent with the incoming evidence. Pau (1991) considered the r
Figure 7.18
The more complex piercing task requires the coordination of three sensors: vision, position, and force. The figure shows the mappings from the overseer onto the individual sensors. When the overseer changes state, the sensor states change as well.
Operating states: S: Start; G: Goal (Success); A: Approach; D: Depart; C: Contact (generic); CI: Contact (Insert); CE: Contact (Extract); E: Extract; P: Piercing; I: Tool In Object; T: Force Sensed; NF: No Force; NC: No Contact.
Failure states: Fc: Contact failure; Fp: Piercing failure; Fg: Goal failure; Fe: Extraction failure; Fs: System failure.
Kluge and Thorpe's (1989) work at Carnegie-Mellon University took a more practical approach to coordinating multiple sources of information. The FERMI system, used for road following, employs a collection of road trackers, each tracking different features about the road, with all providing their information to the robot controller for road following. Five different road trackers have been developed: two edge-based methods oriented to a particular feature, another that extracts linear features such as road stripes, a boundary detector using color information, and a feature-matching algorithm. All of the trackers provide information regarding the road's location. Trackers are fused using a Hough-based method in which each tracker votes for a particular location of a road spine (centerline). This voting process determines the winning candidate road position.
Arbib, Iberall, and Lyons (1985) were among the first to introduce the notion of a task-dependent representation in the context of robotics. Their work advocated the coordinated use of a set of perceptual schemas to provide the necessary information for a motor task. Murphy (1992) has furthered this notion of coordinated perceptual schemas as the basis for action-oriented sensor fusion. Her model draws heavily on psychological theories of sensor fusion (Bower 1974; Lee 1978) and uses a state-based mechanism to control important sensor interactions such as cooperation, competition, recalibration, and suppression. Contributing perceptual subschemas dedicated to individual sensors and funneled into a controlling parent perceptual schema provide the means for expressing this model. Murphy's work coalesced in the development of the SFX architecture (Murphy and Arkin 1992). SFX's analog of the underlying psychological model has three states:
State 1. Complete Sensor Fusion: All sensors cooperate with each other in determining a valid percept.
State 2. Fusion with the possibility of discordance and resultant recalibration of dependent perceptual sources: Recalibration of suspect sensors occurs rather than the forced integration of their potentially spurious readings into the derived percept.
State 3. Fusion with the possibility of discordance and a resultant suppression of discordant perceptual sources: Spurious readings are entirely ignored by suppressing the output stream of the sensor(s) in question.
The task-specific perceptual schemas used for fusion yield percepts directly related to the needs of a motor behavior. Perceptual subschemas feed their parent schema, which in turn supports higher-level schemas. Ties to the actual sensor data emanating from each source eventually ground this recursive formulation. A parent perceptual schema combines the incoming subschema information using statistical uncertainty management techniques (Murphy 1991) to produce a percept and a measure of its belief to be used within the motor schema itself.
Figure 7.19 depicts the overall control flow within the SFX architecture. The perceptual schema and subschema arrays are configured prior to execution during the fusion process's investigatory phase based on the active motor behavior's needs. This investigatory phase is analogous to the preexecution configuration of behaviors found in many hybrid architectures (chapter 6). The performatory phase of sensor fusion similarly matches the execution aspects of reactive control and proceeds without hierarchical supervision. An observation directed acyclic graph (oDAG) incorporates the sensing activity (algorithms and sensors) to generate the necessary behavioral percept (figure 7.20). This control scheme (Murphy 1992) has been developed to provide unique error recovery capabilities in light of potential sensor failures or uncertain readings.

Figure 7.19
The SFX architecture. Diagram labels: Requirements from Motor Behavior, Exception Handling, Sensor Data, Percept for Behavior.
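The three fusion states can be made concrete with a small Python sketch. This is an illustration only and not Murphy's mechanism: it assumes scalar readings of one quantity, uses the low median as the consensus, flags a sensor as discordant when it differs from the consensus by more than a tolerance, and averages the trusted readings in place of the statistical uncertainty management the text mentions. The function `fuse`, the `tolerance` parameter, and the `recalibratable` set are hypothetical.

```python
from statistics import mean, median_low


def fuse(readings, tolerance, recalibratable=()):
    """Fuse scalar sensor readings.

    readings: dict mapping sensor name -> measured value.
    Returns (state, percept, biases), where state is 1, 2, or 3 and
    biases holds an estimated offset for each recalibrated sensor.
    """
    # median_low always returns an actual reading, so at least one sensor is trusted.
    consensus = median_low(readings.values())
    suspects = {name for name, value in readings.items()
                if abs(value - consensus) > tolerance}
    trusted = [value for name, value in readings.items() if name not in suspects]
    percept = mean(trusted)  # suspect readings are never forced into the percept

    if not suspects:
        return 1, percept, {}            # state 1: complete sensor fusion
    if all(name in recalibratable for name in suspects):
        biases = {name: readings[name] - percept for name in suspects}
        return 2, percept, biases        # state 2: recalibrate suspect sensors
    return 3, percept, {}                # state 3: suppress discordant sensors
```

In this sketch the only difference between states 2 and 3 is what happens to a discordant sensor afterward: in state 2 an offset is estimated so that its later readings could be corrected, whereas in state 3 its output is simply ignored.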
Figure 7.20
This oDAG represents the generation of a percept associated with an object using evidence gathered from three different sensors: ultrasound, video, and thermal imagery. Features and their associated belief values are propagated upward to generate the belief in the overall percept. Diagram labels: Object Percept; Ultrasonic Range Profile; Visible Light CCD Camera; Thermal Image Infrared Camera; Nearest Surface; Feature Constellation.
7.6 REPRESENTATIVE EXAMPLES OF BEHAVIOR-BASED PERCEPTION
Perceptual processing for mobile systems is notoriously difficult. Working in a partially known and uncertain environment with sensors that are in motion and subjected to bouncing, the perceptual system must provide information to a robot that is both useful and accurate. In this section we investigate three representative tasks: road following, visual tracking, and robotic head control.
7.6.1 Road Following
Driving is a spectacular form of amnesia. Everything is to be discovered, everything to be obliterated.
- Jean Baudrillard
Many research centers have expended and continue to expend a significant effort on providing perceptual support for road following. We will survey only two of the most successful efforts, one in Europe and the other in the United States.
Dickmanns and Zapp (1985) have developed a high-speed road following system at the Military University of Munich. This system operates using a windowing technique to enforce focus-of-attention mechanisms to meet real-time processing constraints. High-speed vehicle dynamics have been considered in their initial testbed, a panel truck called VaMoRS (panel (A) of figure 7.21). Autonomous road following has been achieved at speeds up to the vehicle's limit (100 km/hour). A dedicated microprocessor assigned to each feature tracks it in its own window. Anticipatory control is utilized via a "preview" window, based on the vehicle's modeled dynamics. Figure 7.22 illustrates the overall control architecture. Limited obstacle recognition has also been provided (Graefe 1990). An even faster vehicle, VaMP (panel (B) of figure 7.21), further pushes the envelope of high-speed, autonomous road following. Additional research has extended this method to autonomous landing of an airplane (Dickmanns 1992).
In the United States, Carnegie-Mellon University has a long track record in road-following autonomous robots. Figure 7.23 shows one of their earliest systems, Terregator (TERREstrial naviGATOR), that was used, among other things, for following sidewalks on campus (Wallace et al. 1985). Initially, the only environmental sensor used was a single black-and-white video camera, but it was later upgraded to color (Wallace et al. 1986).
Wallace's (1987) subsequent work at Carnegie-Mellon on the Navlab (figure 7.24) demonstrated the ability to follow roads streaked with shadows and poorly registered in the color spectrum. A pattern classification scheme based on pixel values on a color surface distinguishes sunny and shaded road from nonroad regions. The image pixels associated with the road are grouped into a single region that is then used to steer the vehicle. This approach evolved into the SCARF system described below.
We have already discussed two other recent road-following methods developed at Carnegie-Mellon: Ulysses, in section 7.4.2, and FERMI, in section 7.5.7. We will now review several others.
The SCARF system was initially developed at CMU by Crisman (1991).
This road-following system relies on color imagery to detect roads with which feature-based algorithms had serious difficulties: for example, roads with degraded edges and surfaces, and those with significant shadowing. No three-dimensional road model is constructed (consistent with the behavior-based approach) while SCARF robustly provides control feedback regarding road position. In its most recent implementation (Zeng and Crisman 1995), color categories are provided within the RGB (red-green-blue) space of the imagery. Pixels are mapped onto these categories as received, reducing the 24-bit RGB color image to a 6-bit format. Statistics are respectively gathered on those pixels that correspond to road and nonroad areas, and in some cases these are further partitioned into shaded and nonshaded areas. A Gaussian road model, generated from the statistics of previous imagery, is then used to classify the 6-bit format into road and nonroad regions, providing the necessary information on road location for steering purposes.

Figure 7.21
High-speed road-following autonomous robots: (A) VaMoRS, (B) VaMP, and (C) VaMP vision system. (Photographs courtesy of Ernst Dickmanns.)
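The reduction from 24-bit color to a 6-bit code, followed by road/nonroad classification from gathered statistics, can be sketched as follows in Python. Two things here are assumptions rather than SCARF's actual design: the 6-bit code is formed by keeping the top two bits of each 8-bit channel (the text does not say how the color categories are chosen), and a simple comparison of per-code relative frequencies stands in for the Gaussian road model. The names `quantize_rgb`, `train`, and `is_road` are hypothetical.

```python
def quantize_rgb(r, g, b):
    """Map an 8-bit-per-channel pixel to a 6-bit code (0..63): 2 bits per channel."""
    return ((r >> 6) << 4) | ((g >> 6) << 2) | (b >> 6)


def train(labeled_pixels):
    """Gather per-code counts for road and nonroad pixels.

    labeled_pixels: iterable of ((r, g, b), label) with label True for road.
    """
    road = [0] * 64
    nonroad = [0] * 64
    for (r, g, b), label in labeled_pixels:
        counts = road if label else nonroad
        counts[quantize_rgb(r, g, b)] += 1
    return road, nonroad


def is_road(pixel, road, nonroad):
    """Classify a pixel by comparing the relative frequency of its code in each class."""
    code = quantize_rgb(*pixel)
    p_road = road[code] / max(1, sum(road))
    p_nonroad = nonroad[code] / max(1, sum(nonroad))
    return p_road > p_nonroad
```

Because classification works on 64 codes rather than on roughly 16.7 million colors, the statistics can be re-gathered from each new image at little cost, which is the property that lets a model built from previous imagery follow gradual changes in lighting.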
Figure 7.22
Architecture used in VaMoRS project. Note the partitioning along the lines of hybrid systems discussed in chapter 6. Diagram labels: Supervisory (Situation Assessment and Planning); Mode Selection (Monitoring and Rule Selection, Feed-forward Programs); Reflexive Behavior (State Estimation, Feedback Control Laws); Sensors; Actuators.

Figure 7.23
Terregator. (Photographs courtesy of The Robotics Institute, Carnegie-Mellon University.)

Also at Carnegie-Mellon, Pomerleau (1993) developed another classification system, based on neural nets, called ALVINN (Autonomous Land Vehicle in a Neural Network). An image array or retina of 30 x 32 pixels served as the input layer. This was connected to four hidden units that in turn projected onto the output layer, quantizing the steering into thirty discrete units ranging from sharp-right to straight-ahead to sharp-left (figure 7.25). The system was tested on the Navlab system (panel (A) of figure 7.24) and later ported for use for road-following tasks for the UGV Demo II project (chapter 9). For training purposes, a human driver first takes the vehicle over the road. During driving, the back-propagation method trains the network (section 8.4). After training, the system is capable of following the road types and conditions it encountered. ALVINN has had successful runs up to 90 miles without human intervention on highways and has also been successfully used on single- and multilane dirt and paved roads of various types.
The No Hands Across America Navlab 5 USA tour is one of the most ambitious exhibitions of autonomous driving to date (panel (B) of figure 7.24). The vehicle, a modified 1990 Pontiac Trans Sport, drove almost 3,000 miles autonomously, from Pittsburgh, Pennsylvania, to San Diego, California. Autonomous visual steering was used for 98.2 percent of the trip at speeds averaging 57 miles per hour. The vision system used for this project, developed at Carnegie-Mellon, and called RALPH (Pomerleau 1995), has a simple control process:
1. An image is acquired.
2. Irrelevant portions of the image are discarded (i.e., a focus-of-attention trapezoid is used to constrain the processed data).
3. The remaining image is subsampled to yield a 30 x 32 image array that includes important road features.
4. The road curvature is then computed using a generate-and-test strategy, shifting the rows of the low-resolution imagery by a predicted curvature until it becomes "straightened." The curvature hypothesis with the straightest features wins.
5. The vehicle's lateral offset relative to the road's centerline is then computed using template matching (from a template created when the vehicle is centered in the lane) on a one-dimensional scan line across the road.
6. A steering command is then issued, and the process begins again from step 1.
Only the template in step 5 needs to be modified when RALPH encounters new road types. This template can be created as needed during driving without human intervention, using look-ahead and rapid adaptation, and swapped in automatically as needed. The highest speed RALPH has achieved on the Navlab 5 is 91 miles per hour on a test track.
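The generate-and-test curvature search of step 4 can be sketched in Python under stated assumptions. The text does not say how rows are shifted or how "straightness" is scored, so this sketch assumes that a row at distance d from the bottom of the image is shifted by the rounded product of the curvature hypothesis and d squared, and that straightness is the sum of squared differences between adjacent column sums (a sharply peaked column profile means the features line up vertically). The names `shift_row`, `straighten`, `straightness`, and `estimate_curvature` are hypothetical.

```python
def shift_row(row, offset):
    """Shift a row right by offset pixels (left if negative), padding with zeros."""
    out = [0] * len(row)
    for col, value in enumerate(row):
        target = col + offset
        if 0 <= target < len(row):
            out[target] = value
    return out


def straighten(image, curvature):
    """Undo a hypothesized curvature; rows farther from the bottom shift more."""
    last = len(image) - 1
    return [shift_row(row, -round(curvature * (last - r) ** 2))
            for r, row in enumerate(image)]


def straightness(image):
    """Score how well features line up in columns (assumed measure)."""
    column_sums = [sum(column) for column in zip(*image)]
    return sum((a - b) ** 2 for a, b in zip(column_sums, column_sums[1:]))


def estimate_curvature(image, hypotheses):
    """Generate-and-test: return the hypothesis whose straightened image scores best."""
    return max(hypotheses, key=lambda k: straightness(straighten(image, k)))
```

Squared differences are used here because a sum of absolute differences cannot distinguish one tall column from several isolated single-pixel columns in a synthetic binary image; on real imagery other measures may serve equally well.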
7.6.2 Visual Tracking
Even if you're on the right track, you'll get run over if you just sit there.
- Will Rogers
Visual tracking has been a heavily researched area for several decades. Instead of attempting a comprehensive survey, we present just three examples of recent work that have been fielded on actual robotic hardware and that exploit, at some level, the notion of task directedness.
Figure 7.24
(A) Navlab 1 and (B) Navlab 5. (Photographs courtesy of The Robotics Institute, Carnegie-Mellon University.)
Woodfill and Zabih's research at Stanford University (1991) produced a motion-based tracking algorithm for keeping a moving person in view of a mobile robot by panning the robot's camera as necessary. Correlational matching was performed on a pixel by pixel basis to compute an initial disparity map that was subsequently smoothed by comparing each pixel's values to its neighbors' and assigning it the most popular one for the neighborhood. This algorithm ran on a 16,000-node Connection Machine (a very powerful computer) processing 15 pairs of images per second. Segmentation of the moving object from the motion field was then performed using histogramming techniques. The tracking process also projected the object's location to constrain the interpretation of the motion field. This system worked for both indoor and outdoor scenes sufficiently textured to produce a rich motion field.
Prokopowicz, Swain, and Kahn's work at the University of Chicago (1994) takes advantage of the wide range of well-developed tracking algorithms available, including a simplified version of the Woodfill and Zabih algorithm described above, a correlation-based tracker, and a color histogram tracker that takes advantage of the target's color properties. Additional trackers based on binocular vision are being developed. This approach focuses on understanding the relationship between the algorithms, the target, and the environment within which the agent operates. Three behavioral tracking tasks are defined:
• Watch: where the robot is stationary and the target is moving.
• Approach: where the robot is moving and the target is stationary.
• Pursue: where both the robot and target are moving.
By defining the target class properties (speed, locale, appearance changes, etc.) for a wide range of objects, a set of environmental background characteristics (background motion, uneven lighting, visual busyness, and so forth), and the conditions under which the various tracking algorithms succeed or fail, selection of the algorithms at run time based on agent intention, target properties, and environmental conditions becomes feasible. This approach explicitly acknowledges these interdependencies, and by doing so, significantly enhances the system's overall robustness.

Figure 7.25
ALVINN's neural network architecture. Diagram labels: video input image (30 x 32), hidden units, 30 output units ranging from sharp-left through straight-ahead to sharp-right, steering command.
Figure 7.26
Rochester balloon-juggling robot. (Photograph courtesy of Brian Yamauchi.)

Another interesting tracking system, developed at the University of Rochester (Yamauchi and Nelson 1991), was used to control a robotic manipulator for the task of juggling a balloon (figure 7.26). Three concurrent parallel behavioral agents were implemented:
• Rotational tracker: keeps the robot facing in the balloon's direction.
• Extensional tracker: keeps the arm extended so that it remains beneath the balloon's position.
• Hitter: provides the upward force to the balloon when the balloon is immediately above the paddle held by the arm.
The balloons were partially inflated with helium to slow them during fall. Averaged over 20 experimental runs, the system achieved 7.0 bounces over a 39.4-second duration, with a maximum run of 20 bounces over a 104-second duration. Other work in juggling systems with conventional balls abounds (e.g., Aboaf, Drucker, and Atkeson 1989; Buhler, Koditschek, and Kindlmann 1989) but it often employs the reasonably predictable physics of the situation. Balloon juggling is inherently more complex and unpredictable, making it more suitable for behavior-based solutions.
7.6.3 Robotic Heads
All I ask of my body is that it carry around my head.
- Thomas Alva Edison

To many, the pinnacle of visual sensing occurs in the design and development of a robotic head. A typical head consists of two video cameras, each of which has several controllable degrees of freedom (DOFs). To coordinate such a complex system, several behavioral control subsystems may be used (Brown 1991):
• Saccade: the open-loop rapid slewing of the camera in a given direction, often used to reposition a camera when a target moves out of the field of view.
• Smooth pursuit: continuous tracking of a moving target.
• Vergence: measuring the disparity between the two cameras focused on a target and then moving one of the cameras to reduce or eliminate it.
• Vestibulo-ocular reflex: open-loop control used to stabilize the head cameras relative to body movement, eliminating apparent motion due to translation or rotation of the robot base.
• Platform compensation: used to prevent the camera positioning systems from reaching their limits. When a limit is approached for any particular DOF, open-loop motion repositions the joints of the robotic platform without moving the cameras themselves.
Robotic heads can range widely in complexity and cost. At the low end of the price scale, we can find a robotic head that costs less than $1000 (Horswill and Yamamoto 1994). It includes two low-resolution CCD cameras and a four-DOF active head. It has been used for experiments in vergence and saccadic motion.
Ferrier and Clark (1993) at Harvard have developed a more complex head capable of saccades, pursuit, and vergence (panel (A) of figure 7.27). An even more complicated system (panel (B) of figure 7.27) with thirteen DOF has been developed in Sweden (Pahlavan and Eklundh 1993). Primary reflexes developed for this KTH (Kungl Tekniska Högskolan) head include smooth pursuit, vergence, and involuntary saccades. A large number of robotic heads have been developed. The interested reader is referred to Christensen, Bowyer, and Bunke 1993 for more information.
Perhaps the most ambitious robotic head (and torso as well) project to date is Cog, a robot constructed by Brooks and Stein (1994) at the MIT AI Laboratory (figure 7.28). This robot is equipped with auditory as well as visual sensors (with pan-tilt, vergence, and saccade capabilities) and has in its initial configuration two six-DOF arms and a torso with a three-DOF hip and a three-DOF neck. Conductive rubber sensors will be used later to impart touch sensing to the robot. Cog is intended ultimately to encompass most of the aspects of behavior-based visual perception we have discussed throughout this section: saccades, smooth pursuit, vergence, vestibulo-ocular reflex, visual routines, and head-body-eye coordination. This project confronts head on the issues of scalability in subsumption-based behavioral methods.
7.7 CHAPTER SUMMARY
• Traditional perception has been concerned with constructing an intention-free model of the world. Newer approaches, generated for behavior-based systems, take the system's motor requirements into account in their design.
• These behavior-based algorithms are highly modular, providing targeted capabilities for specific motor needs and environments.
• Biological studies have provided significant insight into the design of these systems:
  • Affordances provide a new way of thinking about how to perceive based on what environmental opportunities are afforded the robotic agent.
  • The dual systems of what and where can be associated with hybrid architectural design.
• Common robotic sensors include shaft encoders for dead reckoning, inertial navigation systems, global positioning systems, ultrasound, video, and laser scanners.
• Schema theory, visual routines, perceptual classes, and lightweight vision provide differing methods for describing perceptual modules.
• Action-oriented perception provides three means by which perceptual modules can be coordinated:
  • Sensor fission: Individual modules are dedicated to each behavior.
  • Action-oriented sensor fusion: Recursively defined modules combine multiple sources of information into a single percept.
  • Sensor fashion (perceptual sequencing): Various perceptual modules are activated and coordinated at differing points in time and space as needed.
• Active perception enables the perceptual process to control supporting motor behaviors.
• Expectations and focus of attention can be used to constrain the perceptual process's inherent complexity.
• Significant results have been achieved in the areas of high-speed robotic road following, visual tracking, and the design of complex robotic heads.
• The following list summarizes the design principles for perceptual algorithms in support of reactive robotic systems:
  • Don't design one algorithm that does everything: rather, tailor perception modularly to meet motor requirements.
  • Closely couple perception and motor control.
  • Action-oriented perception is central to achieving rapid response.
  • Exploit expectation knowledge when available (from previous images, object models, etc.).
  • Use focus-of-attention mechanisms to constrain search; use computational power where it is most likely to yield results.
  • Organize perceptual strategies using sensor fission, fusion, or fashion as needed.
Figure 7.27 (continued) (A) Harvard head. (Photograph courtesy of Nicola Ferrier.) (B) KTH head. (Photograph courtesy of Jan-Olof Eklundh.)
Figure 7.28 (continued) Cog: (A) full view; (B) head. (Photographs courtesy of Rodney Brooks.)
Chapter 8 Adaptive Behavior
It is impossible to begin to learn that which one thinks one already knows.
- Epictetus

A mind once stretched by a new idea never regains its original dimension.
- Oliver Wendell Holmes

The reasonable man adapts himself to the world; the unreasonable man persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man.
- George Bernard Shaw
Chapter Objectives
1. To understand why robots need to have learning capabilities.
2. To recognize the opportunities for learning within behavior-based systems.
3. To understand the major types of learning algorithms used for behavior-based systems, including reinforcement learning, neural networks, genetic algorithms, and learning in fuzzy behavioral control.
8.1 WHY SHOULD ROBOTS LEARN?

Learning is often viewed as an essential part of an intelligent system. Indeed, some argue that without this ability, there cannot be intelligence present at all: "Learning is, after all, the quintessential AI issue. . . . I will now give a definition of AI that most of our programs will fail. AI is the science of endowing programs with the ability to change themselves for the better as a result of their own experiences" (Schank 1987, pp. 63-64).
But what do we mean by learning or adaptation? As with many other terms we have encountered, there is no universal definition:
• "Modification of a behavioral tendency by experience" (Webster 1984).
• "A learning machine, broadly defined, is any device whose actions are influenced by past experiences" (Nilsson 1965).
• "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population" (Simon 1983).
• "An improvement in information processing ability that results from information processing activity" (Tanimoto 1990).

Our operational definition will be:

  Learning produces changes within an agent that over time enable it to perform more effectively within its environment.

Although this definition will not satisfy all, it provides us with a means for measuring learning by defining performance metrics against which an agent can be measured before, during, and after learning has occurred.

How then can we relate learning and adaptation? Adaptation refers to an agent's learning by making adjustments in order to be more attuned to its environment. Phenotypic adaptation occurs within an individual agent, whereas genotypic adaptation is genetically based and evolutionary. Adaptation can also be differentiated on the basis of time scale: acclimatization is a slow process, but homeostasis is a rapid, equilibrium-maintaining process. We differentiate four types of adaptation (adapted from McFarland 1981):
• Behavioral adaptation: An agent's individual behaviors are adjusted relative to one another.
• Evolutionary adaptation: Descendants change over long time scales based on the success or failure of their ancestors in the environment.
• Sensor adaptation: An agent's perceptual system becomes more attuned to its environment.
• Learning as adaptation: Essentially anything else that results in a more ecologically fit agent.

Adaptation may produce habituation, an eventual decrease in or cessation of a behavioral response when a stimulus is presented numerous times. This process is useful for eliminating spurious or unnecessary responses. Sensitization is the opposite, an increase in the probability of a behavioral response when a stimulus is repeated frequently. Habituation is generally associated
with relatively insignificant stimuli such as loud noise, whereas sensitization occurs with more dire stimuli like electric shocks.

From a controls perspective, we can more easily differentiate adaptation and learning. Adaptive control, an early 1950s example of a system's changing to better fit its environment, uses feedback to adjust the controller's internal parameters (Astrom 1995). Figure 8.1 illustrates an adaptive system that uses feedback in both the traditional sense and for internal modification of the controller itself.

Figure 8.1 Adaptive control system. (Diagram labels: Controller, Controlled System, Learning Element, conventional feedback, parametric adjustment.)

Learning, on the other hand, can improve performance in additional ways, by
• introducing new knowledge (facts, behaviors, rules) into the system.
• generalizing concepts from multiple examples.
• specializing concepts for particular instances that are in some way different from the mainstream.
• reorganizing the information within the system to be more efficient.
• creating or discovering new concepts.
• creating explanations of how things function.
• reusing past experiences.

Artificial intelligence research has devoted considerable effort to determining the mechanisms by which a robotic system can learn some of these things, leading to a wide range of learning systems, including
• Reinforcement learning: Rewards and/or punishments are used to alter numeric values in a controller.
• Neural networks: This form of reinforcement learning uses specialized architectures in which learning occurs as the result of alterations in synaptic weights.
• Evolutionary learning: Genetic operators such as crossover and mutation are used over populations of controllers, leading to more efficient control strategies.
• Learning from experience:
  • Memory-based learning: Myriad individual records of past experiences are used to derive function approximators for control laws.
  • Case-based learning: Specific experiences are organized and stored as a case structure, then retrieved and adapted as needed based on the current situational context.
• Inductive learning: Specific training examples are used, each in turn, to generalize and/or specialize concepts or controllers.
• Explanation-based learning: Specific domain knowledge is used to guide the learning process.
• Multistrategy learning: Multiple learning methods compete and cooperate with each other, each specializing in what it does best.

Many of the above learning methods have been explored to varying degrees in behavior-based robotic systems. In this chapter we look at how learning methods can be effectively exploited in behavior-based robots and study a wide range of constructed systems that provide these agents with the ability to improve their performance within their environments.

8.2 OPPORTUNITIES FOR LEARNING IN BEHAVIOR-BASED ROBOTICS

Learning is not compulsory. Neither is survival.
- W. Edwards Deming

Where can learning occur within a behavior-based robotic control system? To answer this effectively, we need to revisit some of the notation we developed in chapter 3. Recall that a functional mapping $\beta_i$ defines an individual behavior that acts upon a given stimulus $s_i$ to produce a specific response $r_i$. A gain value $g_i$ is used to modify the response's overall strength as a multiplicative constant:

\[ r_i = g_i * \beta_i(s_i) \]

Within the context of the individual behavior, what can be learned?
1. What is a suitable stimulus for a particular response? That is, given a desired $r_i$, what $s_i$ is appropriate?
2. What is a suitable response for a given stimulus? That is, given a particular $s_i$, what is $r_i$? This identifies a point behavioral response for a single stimulus but does not specify $\beta_i$ in its entirety.
3. What is a suitable behavioral mapping between an existing stimulus domain and range of responses? That is, what is the form of $\beta_i$?
4. What is the magnitude of the response? That is, what is the value of $g_i$?
5. What constitutes a whole new behavior for the robot (i.e., new stimuli and/or responses)?

Recall further (section 3.4.2) that behaviors are grouped into assemblages that specify the global response $\rho$ of the robot for a set of behaviors $B$ with associated gains $G$ and a given set of stimuli $S$ when subjected to a coordination function $C$:

\[ \rho = C(G * B(S)). \]

$C$ is generally either competitive (arbitration) or cooperative (fusion) or some combination of the two.

Within the context of a behavioral assemblage, what can be learned?
1. What set of behaviors $\beta_i$ constitutes the behavioral component of an assemblage $B$?
2. What are the relative strengths of each behavior's response within the assemblage? That is, what is $G$?
3. What is a suitable coordination function $C$?

Problems often accompany opportunities, and learning is not different in this respect.
• Credit assignment problem: How is credit or blame assigned to a particular piece or pieces of knowledge in a large knowledge base or to the component(s) of a complex system responsible for either the success or failure of an attempt to accomplish a task?
• Saliency problem: What features in the available input stream are relevant to the learning task?
• New term problem: When does a new representational construct (concept) need to be created to capture some useful feature effectively?
• Indexing problem: How can a memory be efficiently organized to provide effective and timely recall to support learning and improved performance?
• Utility problem: How does a learning system determine that the information it contains is still relevant and useful? When is it acceptable to forget things?
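The notation recalled in this section can be made concrete with a short sketch. The Python below is illustrative only: it assumes responses are 2-D vectors, and the behaviors `move_ahead` and `avoid`, the gain values, and the two coordination functions `fuse` and `arbitrate` are hypothetical stand-ins rather than parts of any system described here. Its purpose is to show which quantities (the mapping $\beta_i$, the gains $G$, the coordination function $C$) a learner could in principle adjust.

```python
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, float]          # assumed 2-D response, e.g. a velocity vector
Behavior = Callable[[float], Vector]  # beta_i: stimulus strength -> response


def respond(gain: float, beta: Behavior, stimulus: float) -> Vector:
    """r_i = g_i * beta_i(s_i): scale the behavior's raw response by its gain."""
    x, y = beta(stimulus)
    return (gain * x, gain * y)


def fuse(responses: Sequence[Vector]) -> Vector:
    """Cooperative coordination (fusion): vector-sum all responses."""
    return (sum(r[0] for r in responses), sum(r[1] for r in responses))


def arbitrate(responses: Sequence[Vector]) -> Vector:
    """Competitive coordination (arbitration): keep only the strongest response."""
    return max(responses, key=lambda r: r[0] ** 2 + r[1] ** 2)


def global_response(coordinate, gains, behaviors, stimuli) -> Vector:
    """rho = C(G * B(S))."""
    return coordinate([respond(g, b, s) for g, b, s in zip(gains, behaviors, stimuli)])


def move_ahead(_stimulus: float) -> Vector:   # hypothetical behavior
    return (1.0, 0.0)


def avoid(obstacle_strength: float) -> Vector:  # hypothetical behavior
    return (0.0, -obstacle_strength)


gains = [1.0, 2.0]        # G: candidate for learning (question 2 for assemblages)
behaviors = [move_ahead, avoid]
stimuli = [0.0, 0.8]
rho_fused = global_response(fuse, gains, behaviors, stimuli)
rho_arbitrated = global_response(arbitrate, gains, behaviors, stimuli)
```

Under these assumed values, fusion blends both responses while arbitration passes through only the avoidance response; changing `gains` or swapping the coordination function changes the global response without touching the individual behaviors.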
Robots can potentially learn how to behave by either modifying existing behaviors (adaptation) or by learning new ones. This type of learning can be
related to Piaget's theory of cognitive development (Piaget 1971), in which assimilation refers to the modification or reorganization of the existing set of available behaviors (schemas) and accommodation is the process involved with the acquisition of new behaviors. Robots can also learn how to sense correctly by either learning where to look or determining what to look for.

A robot can also learn about the world's spatial structure. We discussed this in chapter 5 in the context of short-term behavioral memory and long-term memory maps. We do not discuss spatial memory and learning further in this chapter and focus instead on learning and adaptation within the behavioral control system.

As we have seen, robots learn by widely varied methods that can be classified along several dimensions (Tan 1991):
• Numeric or symbolic: Symbolic learning associates representations with the numbers inherently generated within control systems. These can result in symbolic structures such as logical assertions, production rules, and semantic networks. Numeric methods of learning manipulate numeric quantities. Neural networks and statistical methods are prime examples of numeric approaches.
• Inductive or deductive: Generalizing as a result of learning from examples or experiments is typical of inductive learning. Deductive learning produces a more efficient concept from an initial one originally provided to the robot.
• Continuous or batch: Learning can occur either during the robot's interaction with the world (continuous and on-line) or instead through its acquisition of a large body of experience prior to making any changes within the behaviors (batch).

Most of the methods applied to behavior-based robotic control systems to date are both numeric and inductive. Some of these methods are on-line and continuous and others are more batch oriented. The remainder of this chapter reveals more about various types of learning paradigms successfully applied to different aspects of behavioral control systems.
8.3 REINFORCEMENT LEARNING

The man who sets out to carry a cat by its tail learns something that will always be useful and which will never grow dim or doubtful.
- Mark Twain
Reinforcement learning is one of the most widely used methods to adapt a robotic control system. It is numeric, inductive, and continuous. It is motivated by an old psychological concept, the Law of Effect, which states: "Applying a
reward immediately after the occurrence of a response increases its probability of reoccurring, while providing punishment after the response will decrease the probability" (Thorndike 1911). This notion of reward and punishment reinforces the causative behavior. Modern psychology largely disputes the Law of Effect as the major basis for animal learning, but the law still provides a useful model for robotic behavioral modification.

To send the necessary reinforcement signal to the control system, we need a component capable of evaluating the response, which we will refer to as the critic. The critic applies reinforcement to the control system in light of its evaluation (figure 8.2).

Figure 8.2 Reinforcement learning system. (Diagram labels: Controller, Controlled System, Critic, conventional feedback, reinforcement.)

This form of unsupervised learning includes no notion of a particular target goal state that the system is trying to achieve, in contrast to supervised learning, in which there is an explicit notion of correctness in terms of an optimal behavior established a priori. In reinforcement learning, the feedback to the control provides information regarding the quality of the behavioral response. It may be as simple as a binary pass/fail or a more complex numeric evaluation. There is no specification as to what the correct response is, only how well the particular response worked.

One of the major problems associated with reinforcement learning is credit assignment. Suppose, for example, we have a collection of $N$ active behaviors $\beta_i(s_i)$ generating a global response $\rho$. The critic evaluates the results from executing $\rho$ and determines that a reward or punishment needs to be applied, perhaps by changing the components of $G$, the relative strengths of each behavioral component. How can the critic either increase or decrease its strength in accordance with the Law of Effect? It is hard to determine directly which of the individual components is largely responsible for the success or failure of a response. We will see how each of the actual learning systems
presented addresses this particular problem. Neural network architectures, a special case of reinforcement learning, are discussed separately in the following section.

A common means for expressing reinforcement learning involves a decision policy. A robotic agent may have many possible actions it can take in response to a stimulus, and the policy determines which of the available actions the robot should undertake. Reinforcement is then applied based on the results of that decision, and the policy is altered in a manner consistent with the outcome (reward or punishment). The ultimate goal is to learn an optimal policy that chooses the best action for every set of possible inputs.

The issues in the design of robotic reinforcement learning systems can be summarized as follows (Krose 1995):
• Which reinforcement learning algorithm should we choose? Two types predominate:
  • Adaptive Heuristic Critic (AHC) learning: The process of learning the decision policy for action is separate from learning the utility function the critic uses for state evaluation (panel (A) in figure 8.3). Section 8.4.2 discusses connectionist variations of this algorithm, typically used for behavior-based robotics.
  • Q-learning: A single utility Q-function is learned to evaluate both actions and states (panel (B) in figure 8.3) (Watkins and Dayan 1992). Lin's (1992) study comparing Q- and AHC-based reinforcement learning methods for enhancing an agent's survivability in a simulated dynamic world provides evidence that Q-learning is the superior method for reactive robotic applications. Indeed, Q-learning currently dominates behavior-based robotic reinforcement learning approaches.
• How do we approximate the control function most effectively? Should we use lookup tables or discrete or continuous approximations, and what aspects of the control states do we need to represent?
• How fast do we need to learn? This is strongly dependent on the problem domain in which the robot is operating.
Learning that is too slow may not be worth the extra computational overhead, particularly if the environment in which the agent is operating is also subject to change.

Figure 8.3 Learning architectures: (A) Adaptive Heuristic Critic; (B) Q-learning. (Diagram labels: state information, utility, action.)

Although a large body of literature exists on robotic learning, much research to date has been carried out only in simulation. Because there is often a large leap from implementing studies successfully in simulation to constructing actual robotic hardware, we restrict our discussion primarily to learning systems that have actually been fielded on robots. The remainder of this section offers a representative sampling of a range of tasks that benefit from reinforcement learning strategies. Many additional simulation studies of behavioral learning systems have been conducted within the artificial life community. Because this book is concerned with robotics, we focus on real world systems.
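The critic-and-policy loop described earlier in this section can be illustrated with a very small sketch. Everything specific here is an assumption made for illustration: the policy is a lookup table of action preferences, the critic gives a binary pass/fail signal, the action names and the toy world (in which only "forward" succeeds) are hypothetical, and the update simply adds a fraction of the reinforcement signal in the spirit of the Law of Effect. It is not the algorithm of any system discussed in this chapter.

```python
import random

ACTIONS = ("left", "forward", "right")  # hypothetical action set


def critic(succeeded: bool) -> float:
    """Binary pass/fail evaluation: +1 as reward, -1 as punishment."""
    return 1.0 if succeeded else -1.0


def choose(policy: dict, stimulus: str, rng: random.Random) -> str:
    """Pick the currently preferred action for a stimulus, breaking ties at random."""
    best = max(policy.get((stimulus, a), 0.0) for a in ACTIONS)
    candidates = [a for a in ACTIONS if policy.get((stimulus, a), 0.0) == best]
    return rng.choice(candidates)


def reinforce(policy: dict, stimulus: str, action: str, signal: float,
              rate: float = 0.1) -> None:
    """Law of Effect: reward raises, punishment lowers, the preference for the action taken."""
    key = (stimulus, action)
    policy[key] = policy.get(key, 0.0) + rate * signal


# Toy world (an assumption): only "forward" succeeds when the stimulus is "clear".
rng = random.Random(0)
policy = {}
for _ in range(50):
    action = choose(policy, "clear", rng)
    reinforce(policy, "clear", action, critic(action == "forward"))
```

Note that the critic never states which action was correct; it only scores the action that was tried, which is the distinction from supervised learning drawn above.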
8.3.1 Learning to Walk

We studied earlier (section 2.5.3) the problem of coordinating multiple leg controllers in a legged robotic system. Obtaining efficient gaits is a nontrivial problem. We have seen also in section 2.5.3 that a neural controller can generate gaits that correspond to those of biological systems such as the cockroach. Numeric reinforcement learning methods can also be applied in learning legged locomotion in a behavior-based control system.

Maes and Brooks (1990) studied learning using Genghis, a robot hexapod (figure 3.6). They used a rule-based subsumption architecture for the controller, which consists of thirteen high-level behaviors using two sensor modalities for feedback: two touch sensors located on the bottom of the robot (fore and aft) to determine when the body of the robot hits the floor and a trailing wheel to measure forward progress (figure 8.4). Genghis's task was to learn to move forward. Negative feedback results when either of its touch sensors makes contact with the ground and positive feedback when its measurement wheel indicates that the robot is moving forward. These feedback results are binary (i.e., the sensors are either on or off). High-level behaviors include six swing forward behaviors, six swing backward behaviors, and a horizontal balance behavior that corrects all of the legs to produce a stable horizontal position.
Figure 8.4 Sensors used in Genghis for reinforcement learning. Underbelly touch sensors to detect ground contact provide negative feedback and the trailing wheel provides positive feedback when the robot moves forward.
8.3.1.1 The Learning Algorithm

Reinforcement is used to alter the precondition list of the subsumption behaviors on the criteria of their being
• relevant, that is, positive feedback is received more frequently when the behavior is more active than when it is not, and negative feedback is not received at all when the behavior is active or is received less frequently when the behavior is more active than not; or
• reliable, that is, the feedback results (either positive or negative) are consistent when the behavior is active (i.e., the probability of the behavior's occurrence approaches either 0 or 1).

Each behavior is modified independently, beginning with a minimally restrictive precondition list. Each maintains its own performance record, a matrix consisting of the number of times that feedback (both positive and negative) is on or off and whether or not the behavior is active during that time. The results are decayed over time so that only a recent history is maintained. The correlation for positive feedback for a behavior $B$ is computed by

\[ \mathrm{corr}(P, B) = \frac{j\,m - l\,k}{\sqrt{(m + l)(m + k)(j + k)(j + l)}}, \tag{8.1} \]

where
$j$ = number of times the behavior is active when positive feedback is present;
$k$ = number of times the behavior is not active when positive feedback is present;
$l$ = number of times the behavior is active when no positive feedback is present; and
$m$ = number of times the behavior is not active when no positive feedback is present.
The same equation is used for correlating negative feedback simply by substituting negative feedback for positive feedback in equation 8.1. A value near -1 for the correlation indicates that feedback is not likely when the behavior is active and +1 indicates that it is quite likely.

The behavior's relevance is determined by computing the difference between the correlations for positive and negative feedback:

\[ \mathrm{corr}(P, B) - \mathrm{corr}(N, B). \tag{8.2} \]
If this value approaches +2, the behavior is highly likely to be active and is considered relevant, that is, positive feedback is received when the behavior is active. The behavior's reliability is then computed by

\[ \min\left( \max\left( \frac{j_p}{j_p + l_p}, \frac{l_p}{j_p + l_p} \right), \max\left( \frac{j_n}{j_n + l_n}, \frac{l_n}{j_n + l_n} \right) \right), \tag{8.3} \]
where $j$ and $l$ have the same meaning as before and the subscripts $p$ and $n$ denote positive and negative feedback, respectively. If this value is close to 1, the behavior is considered highly reliable, but very unreliable if it is near 0.

If the behavior proves relevant but not reliable, a different perceptual condition is monitored to see if it is responsible for the lack of reliability. If it is, the behavioral rule's precondition list will be altered. New performance statistics are gathered correlating the robot's performance when the new perceptual condition is either present or absent. A correlation with feedback is then computed, as in equation 8.1, by substituting the new perceptual condition's being either on or off for the behavior's being either active or inactive. If the resultant correlation value is near +1, the precondition list of the behavior is modified to require the new perceptual condition to be on; if the correlation value is near -1, the requirement that the new perceptual condition be off is added. If no strong correlation either way is apparent, a new perceptual condition is monitored. These conditions are continually reevaluated until a sufficiently high reliability is achieved.
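The statistics in equations 8.1 through 8.3 can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the authors' code: equation 8.1 is taken to be the standard correlation (phi) coefficient of the 2x2 table of counts $j$, $k$, $l$, $m$, which is an assumption of this example; the function names, the zero-denominator handling, and the sample counts are all hypothetical, and the decay of old statistics is omitted.

```python
import math


def correlation(j: float, k: float, l: float, m: float) -> float:
    """Assumed form of equation 8.1: phi coefficient of (behavior active?, feedback present?)."""
    denom = math.sqrt((m + l) * (m + k) * (j + k) * (j + l))
    # Returning 0 when a row or column is empty is a choice made for this sketch.
    return 0.0 if denom == 0 else (j * m - l * k) / denom


def relevance(pos_counts, neg_counts) -> float:
    """Equation 8.2: corr(P, B) - corr(N, B); each argument is a (j, k, l, m) tuple."""
    return correlation(*pos_counts) - correlation(*neg_counts)


def reliability(jp: float, lp: float, jn: float, ln: float) -> float:
    """Equation 8.3: how consistent each kind of feedback is while the behavior is active."""
    def consistency(j: float, l: float) -> float:
        total = j + l
        return 0.0 if total == 0 else max(j / total, l / total)
    return min(consistency(jp, lp), consistency(jn, ln))


# Hypothetical record: positive feedback mostly accompanies the active behavior,
# negative feedback mostly accompanies the inactive behavior.
positive = (9, 1, 1, 9)   # (j, k, l, m) for positive feedback
negative = (1, 9, 9, 1)   # (j, k, l, m) for negative feedback
rel = relevance(positive, negative)
reli = reliability(jp=9, lp=1, jn=1, ln=9)
```

With these made-up counts the behavior scores as both relevant (a value well above zero, toward the +2 end of the range) and fairly reliable.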
During execution, the behaviors are grouped based on common actuator control. Within each group, a probabilistic selection is made based on each behavior's relevance, reliability, and newness. The selected behaviors are then activated and feedback obtained. These probabilistic aspects ensure experimentation to avoid premature convergence to a suboptimal solution.
(Margin note) Convergence in a learning system refers to the point at which additional training does not result in any additional improvement in performance.
8.3.1.2 Robotic Results

The following results were produced using this learning algorithm:
• Using solely negative feedback with the balance and six swing forward behaviors, Genghis learned to adopt a stable tripod stance, keeping three legs on the ground at all times (the middle leg from one side and the front and rear from the other).
• A second experiment, using both positive and negative feedback, resulted in Genghis's walking using the tripod gait, in which it alternately swings two sets of three legs forward as the robot moves.
8.3.2 Learning to Push

At IBM, Mahadevan and Connell (1991) used Q-learning to teach a behavior-based robot how to push a box. The robot, Obelix, was built on an RWI 12-inch-diameter base. There are eight ultrasonic sensors, four of which look forward, two each to the left and right. Sonar output is quantized into two ranges: NEAR (from 9-18 inches) and FAR (18-30 inches). A forward-looking infrared detector with a range of four inches gives a binary response used to indicate when the robot is in a BUMP state. The current to the drive motors is also monitored to determine if the robot has become physically STUCK (the input current exceeds a threshold). Only 18 bits of sensor information are available: 16 bits from the ultrasonic sensors (NEAR or FAR) and two for BUMP and STUCK. Motor control outputs are limited to five choices: moving forward, turning left or right 22 degrees, or turning more sharply left or right at 45 degrees. The robot's learning problem involves deciding, for any of the approximately
250,000 perceptual states, which of the five possible actions will enable it to find and push boxes around a room efficiently without getting stuck.

The behavioral controller, based on Connell's colony architecture (section 4.5.3), consists of three behaviors (figure 8.5):
• Finder behavior, which is intended to move the robot toward possible boxes. This behavior is rewarded whenever the input vector contains NEAR bits. If the robot moves forward and the forward-looking NEAR bits are turned on, a +3 reward is given. If any NEAR bits that were already on are turned off, a -1 punishment is applied. Finder is active only when neither of the other behaviors is active.
• Pusher behavior occurs after BUMP results from a box find and continues at least until the box is wedged against an immovable object, such as a wall. The robot's reward is +1 if it continues to move forward and remains in BUMP, and its punishment is -3 if it stops being in the BUMP state (loses the box). The pusher behavior remains active for a short time after BUMP contact is lost so that it may recover from possible small errors in pushing. Boxes tend to rotate when pushed by the circular robot if not pushed directly through the center of drag, making this task considerably more difficult than it might first appear.
• Unwedger behavior removes the robot when the box becomes no longer pushable. The robot's reward is +1 if the STUCK state goes to 0, or its punishment is -3 if it persists in the STUCK state. This behavior is active whenever the STUCK bit is on and persists for a short time after it goes off to ensure safe extrication.

Figure 8.5 Obelix's behavioral controller (colony architecture). (Diagram labels: reward/punishment, response.)
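The reward schedule for the three behaviors can be written down compactly. The sketch below is an interpretation, not the original implementation: the `Percept` structure, the sonar indices in `FORWARD_SONARS`, and the neutral reward of 0 in cases the text does not mention are assumptions made for illustration. It also checks the state count: 18 binary sensor bits give 2^18 = 262,144 states, which matches the "approximately 250,000" quoted above.

```python
from dataclasses import dataclass

N_SENSOR_BITS = 18                         # 16 sonar bits + BUMP + STUCK
N_STATES = 2 ** N_SENSOR_BITS              # 262,144 perceptual states
FORWARD_SONARS = frozenset({0, 1, 2, 3})   # hypothetical indices of forward-looking sonars


@dataclass(frozen=True)
class Percept:
    near: frozenset   # indices of sonars whose NEAR bit is on
    bump: bool
    stuck: bool


def finder_reward(prev: Percept, cur: Percept, moved_forward: bool) -> int:
    if moved_forward and (cur.near & FORWARD_SONARS):
        return 3      # forward-looking NEAR bits are on after moving forward
    if prev.near - cur.near:
        return -1     # a NEAR bit that was on has been turned off
    return 0          # assumed neutral otherwise


def pusher_reward(prev: Percept, cur: Percept, moved_forward: bool) -> int:
    if prev.bump and not cur.bump:
        return -3     # stopped being in BUMP: the box was lost
    if moved_forward and cur.bump:
        return 1      # still moving forward while in BUMP
    return 0          # assumed neutral otherwise


def unwedger_reward(prev: Percept, cur: Percept) -> int:
    if cur.stuck:
        return -3     # STUCK persists
    if prev.stuck:
        return 1      # STUCK went to 0
    return 0          # assumed neutral otherwise
```

Keeping one reward function per behavior mirrors the modular structure of the controller: each behavior is reinforced only on the criterion it is responsible for.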
8.3.2.1 The Learning Algorithm

Q-learning provides the ability to learn, by determining which behavioral actions are most appropriate for a given situation, the correct global robotic response $\rho$ for a given set of stimuli $S$ presented by the world. An update rule is used for the utility function $Q(x, a)$, where $x$ represents the states and $a$ the resulting actions:

\[ Q(x, a) \leftarrow Q(x, a) + \beta \, (r + \lambda E(y) - Q(x, a)), \tag{8.4} \]
where
$\beta$ is a learning rate parameter;
$r$ is the payoff (reward or punishment);
$\lambda$ is a parameter, called the discount factor, ranging between 0 and 1;
$E(y)$ is the utility of the state $y$ that results from the action and is computed by $E(y) = \max_a Q(y, a)$ over all actions $a$.

Reward actions are also propagated across states so that rewards from similar states can facilitate their learning as well. The issue here is what constitutes a similar state. One approach uses the weighted Hamming distance as the basis for the similarity metric. The 18-bit state representation is compacted to 9 bits, then the difference between the number of set bits between different states is computed. Some state characteristics are considered more important than others in preserving distinctiveness. In particular, BUMP and STUCK each have a weight of five, NEAR sonar has a weight of two, and all other bits have a unit weight. Arbitrarily, two states are considered similar if their weighted Hamming distance is less than three.

The utility function is used to modify the robot's behavioral responses as follows:

    Initialize all Q(x, a) to 0.
    Do Forever
        Determine current world state x via sensing
        90% of the time choose action a that maximizes Q(x, a)
            else pick random action
        Execute a
        Determine reward r
        Update Q(x, a) as above
        Update Q(x', a) for all states x' similar to x
    End Do
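The update rule and the loop body above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the original implementation: states are 9-bit tuples whose bit layout (and hence the `WEIGHTS` vector) is hypothetical, the values of the learning rate and discount factor are placeholders, Q is held in a lookup table, and "similar states" are searched only among states already seen. Sensing, acting, and the reward signal are left to the caller.

```python
import random
from collections import defaultdict

ACTIONS = ("forward", "left22", "right22", "left45", "right45")
# Hypothetical 9-bit layout: four NEAR bits, three other bits, then BUMP and STUCK.
WEIGHTS = (2, 2, 2, 2, 1, 1, 1, 5, 5)
SIMILARITY_THRESHOLD = 3   # "similar" means weighted Hamming distance below this


def utility(Q, y):
    """E(y) = max over actions a of Q(y, a)."""
    return max(Q[(y, a)] for a in ACTIONS)


def update(Q, x, a, r, y, beta=0.5, lam=0.9):
    """Equation 8.4; beta and lam are placeholder values, not taken from the text."""
    Q[(x, a)] += beta * (r + lam * utility(Q, y) - Q[(x, a)])


def weighted_hamming(x, y, weights=WEIGHTS):
    """Sum the weights of the bit positions where the two states differ."""
    return sum(w for xb, yb, w in zip(x, y, weights) if xb != yb)


def similar(x, y):
    return weighted_hamming(x, y) < SIMILARITY_THRESHOLD


def choose_action(Q, x, rng, epsilon=0.1):
    """90% of the time take the greedy action, otherwise a random one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(x, a)])


def learn_step(Q, seen, x, a, r, y):
    """One pass of the loop body after executing a in x, receiving r, reaching y."""
    update(Q, x, a, r, y)
    for other in seen:                      # propagate to similar, previously seen states
        if other != x and similar(x, other):
            update(Q, other, a, r, y)
    seen.add(x)


Q = defaultdict(float)                      # "Initialize all Q(x, a) to 0"
seen = set()
```

A driver loop would repeatedly sense `x`, call `choose_action`, execute the action on the robot, compute `r` from the active behavior's reward rule, sense the resulting state `y`, and call `learn_step`.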
Another variant of Q-learning that uses statistical clustering for the similarity metric instead of weighted Hamming distances has also been investigated.

8.3.2.2 Robotic Results

The methods described were tested on the robot Obelix. It was observed that using Q-learning over a random agent substantially improved box pushing. The robot's performance using Q-learning was also compared to its performance when controlled by a hand-coded, hand-tuned behavioral controller. Mahadevan and Connell (1991) state that the robot was "fairly successful at learning to find and push boxes and unwedge from stalled states" (p. 772). The ultimate performance after learning was said to be "close to or better than the hand-coded agent" (ibid.). This work's importance lies in its empirical demonstration of Q-learning's feasibility as a useful approach to behavior-based robotic learning.
8.3.3 Learning to Shoot

Researchers at Osaka University (Asada et al. 1995) have applied vision-based reinforcement learning to the task of shooting a ball into a goal, using Q-learning as their system's underlying methodology. In this instance the set of states for the utility function $Q(x, a)$ is defined in terms of the input visual image obtained from a camera mounted on the robot. The ball's location within the image is quantized in terms of position (left, center, or right) and distance (large/near, middle, or small/far). The goal's location is quantized in terms of the same two qualities plus relative angle (left-, right-, or front-oriented). The set of states defined by all allowable combinations of these substates totals 319. Two additional states each exist for when the ball or goal is lost either to the right or left. These states assist in guiding the robot to move in the correct direction, as opposed to moving randomly, should the stimulus not be directly present within the image.

The robot's action set consists of the inputs for each of the two independent motors that power each wheel. Each wheel's action subset consists of three commands: forward, stop, and back. This yields a total set of nine possible actions for the robot. The robot continues in a selected action until its current state changes.

The reward value is set to 1 if the ball reaches the goal and 0 otherwise. The discount factor, $\lambda$, is set to 0.8. Because of the difficulty in associating a particular state action in the past with the reward generated, convergence is
very hard to achieve if the robot is started in a random state. This difficulty, referred to as the delayed reinforcement problem, is related to the credit assignment problem discussed earlier in the chapter. The problem is addressed here by having a good teacher who provides an intelligent design for learning experiences. The trainer devises easy situations (such as head-on approaches that are near) to allow the system initially to improve its performance in this constrained situation. As this level of competency is mastered, more difficult and challenging states are progressively introduced. This procedure improves learning performance convergence dramatically. Suitable training instance selection is used to facilitate many types of learning. An early example is Winston's (1975) ARCH program, which illustrated the power of using proper training sequences for achieving convergence in inductive concept formation tasks.
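The state and action counts quoted earlier in this section can be checked by enumeration. The sketch below rests on one inference that the text does not state outright: the total of 319 comes out exactly if the two "lost" conditions are counted as extra substates of the ball and of the goal, giving (9 + 2) x (27 + 2) = 319. The label strings are illustrative.

```python
from itertools import product

POSITION = ("left", "center", "right")
SIZE = ("large", "middle", "small")             # distance, read from apparent size
ORIENTATION = ("left", "right", "front")        # goal only
LOST = ("lost-left", "lost-right")              # stimulus out of view to one side
WHEEL_COMMANDS = ("forward", "stop", "back")

ball_states = list(product(POSITION, SIZE)) + list(LOST)               # 9 + 2 = 11
goal_states = list(product(POSITION, SIZE, ORIENTATION)) + list(LOST)  # 27 + 2 = 29
states = list(product(ball_states, goal_states))                       # 11 * 29
actions = list(product(WHEEL_COMMANDS, repeat=2))                      # one command per wheel

print(len(states), len(actions))
```

A lookup-table Q-function over this space therefore has 319 x 9 = 2,871 entries, small enough to store directly.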
8.3.3.1 Robotic Results

The system was implemented on a tracked robot equipped with a radio/video link to an offboard real-time image processing system and Sun control computer. The ball was painted red and the goal box blue to make feature extraction simpler. The experiments were run on the laboratory floor. The actual success rate was about 50 percent as compared to 70 percent predicted by simulation studies. Such a difference in actual and predicted success rates is not uncommon because of the significant inaccuracies in simulation models and noise in the input sensors. Looking at the final action-state data, approximately 60 percent of the stimulus-response mappings were correctly created. Considering that the robot had no ability whatsoever to get the ball into the goal initially, these results are encouraging and support earlier claims that Q-learning is a feasible method for behavior-based robotic learning.
8.4 LEARNING IN NEURAL NETWORKS

Although learning in neural networks can be viewed as a form of reinforcement learning, neural networks are sufficiently distinct to warrant a discussion on their own. In chapter 2 we presented the rudiments of neural network systems. In this section, we discuss methods for encoding behavior-based robotic control using neural networks and the adaptation methods that can be used to modify the synaptic weights that encode the means by which the robot can respond.
Hebb (1949) developed one of the earliest training algorithms for neural networks. Hebbian learning increases synaptic strength along the neural pathways associated with a stimulus and a correct response, strengthening frequently used paths. Specifically,

\[ w_{ij}(t + 1) = w_{ij}(t) + \eta * o_i o_j, \tag{8.5} \]
where: Wij ( t ) and Wij ( t + 1) are the synaptic weights connecting neurons i and j before and after updating , respectively ; 1] is the learning -rate coefficient ; and Oi and OJ are the outputs of neurons i and j , respectively. .
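The Hebbian update in equation 8.5 can be illustrated with a minimal sketch. The function name `hebbian_update`, the three-neuron example, and the choice to update every pair (including self-connections) are assumptions made only for illustration; they are not taken from Hebb's formulation or from any particular robot system.

```python
def hebbian_update(weights, outputs, eta=0.1):
    """One Hebbian step: w_ij(t+1) = w_ij(t) + eta * o_i * o_j (equation 8.5)."""
    n = len(outputs)
    return [[weights[i][j] + eta * outputs[i] * outputs[j] for j in range(n)]
            for i in range(n)]


# Hypothetical example: neurons 0 and 1 fire together, neuron 2 stays silent.
weights = [[0.0] * 3 for _ in range(3)]
outputs = [1.0, 1.0, 0.0]
for _ in range(5):
    weights = hebbian_update(weights, outputs)

# The 0-1 connection has grown; connections to the silent neuron have not.
print(weights[0][1], weights[0][2])
```

Only pathways whose two endpoints are active at the same time are strengthened, which is the "frequently used paths" effect described above. Note that nothing in this rule ever reduces a weight.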
Perceptrons, introduced in section 2.2.3.2, have also been used for robotic learning. Perceptron learning uses a method different from Hebbian learning for synaptic adjustment. The overall training procedure is as follows:

Repeat
1. Present an example from a set of positive and negative learning experiences.
2. Verify the output of the network as to whether it is correct or incorrect.
3. If it is incorrect, supply the correct output at the output unit.
4. Adjust the synaptic weights of the perceptrons in a manner that reduces the error between the observed output and correct output.
Until satisfactory performance, as manifested by convergence, is achieved (i.e., the network has reached its limit of performance) or some other stopping condition is met.

Various methods are used for updating the synaptic weights in step 4. One such method, the delta rule, is used for perceptrons without hidden layers. It modifies synaptic weights according to the formula

$$\Delta w_{ij} = \eta\, w_{ij}\, (t_j - o_j), \qquad (8.6)$$

where $\Delta w_{ij}$ is the synaptic adjustment applied to the connection between neurons $i$ and $j$; $\eta$ is the learning-rate coefficient; and $t_j$ and $o_j$ are the correct (target) and observed outputs, respectively.
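The Repeat/Until training procedure above can be sketched as a short loop. This is a sketch under stated assumptions, not the implementation used in any system discussed here: it uses the common textbook error-correction update, which scales the error by the input activation, rather than the weight-scaled form printed in equation 8.6. The names `step`, `train_perceptron`, and `predict`, the bias term, and the logical-AND training set are all hypothetical.

```python
def step(net):
    """Binary threshold activation."""
    return 1 if net >= 0 else 0


def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(examples, eta=1.0, max_epochs=100):
    """Repeat steps 1-4 until an epoch produces no errors (or max_epochs is hit)."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in examples:              # step 1: present an example
            output = predict(weights, bias, inputs)  # step 2: check the output
            error = target - output                  # step 3: compare with the correct output
            if error != 0:
                errors += 1
                # step 4: assumed textbook update, delta w_i = eta * (t - o) * x_i
                weights = [w + eta * error * x for w, x in zip(weights, inputs)]
                bias += eta * error
        if errors == 0:                              # stopping condition: convergence
            break
    return weights, bias


# Hypothetical training set: logical AND, which is linearly separable.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(examples)
print([predict(weights, bias, x) for x, _ in examples])
```

The stopping test here is the simplest possible one (an error-free pass over the examples); a performance plateau or an epoch limit would serve equally well as the "other stopping condition" mentioned in the procedure.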
The delta rule strives to minimize the error term $(t_j - o_j)$ using a gradient descent approach.
Gradient descent refers to learning methods that seek to minimize an objective function by which system performance is measured. At each point in time, the policy is to choose the next step that yields the minimal objective function value. This in essence states that one should continue to improve one's condition (state) as long as one can. The learning rate parameters in these algorithms commonly refer to the step size taken at each point in time. Each step is computed on the basis of local information only, which is extremely efficient and fast but introduces a myopia that can result in traps at local minima found during the optimization process. Hill climbing is an analogous procedure whereby the objective function is maximized.
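The local-minimum trap that gradient descent can fall into is easy to see on a one-dimensional example. The objective function below is a hypothetical quartic chosen only because it has two minima; the function names, step size, and starting points are assumptions for illustration.

```python
def objective(x):
    """Hypothetical objective with a global minimum near -1.30 and a local one near 1.13."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, step_size=0.01, steps=5000):
    """Repeatedly take the locally best step; step_size plays the role of the learning rate."""
    for _ in range(steps):
        x -= step_size * gradient(x)
    return x


x_from_right = gradient_descent(2.0)    # settles in the nearer, shallower minimum
x_from_left = gradient_descent(-2.0)    # settles in the deeper minimum

print(x_from_right, objective(x_from_right))
print(x_from_left, objective(x_from_left))
```

Both runs use identical, purely local information; only the starting point differs. The run started on the right stops at a point from which no small step improves the objective, even though a much better solution exists elsewhere.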
Back-propagation is probably the most commonly used method for updating synaptic weights. (See Werbos 1995 for a review.) It employs a generalized version of the delta rule for use in multilayer perceptron networks (a common form of neural networks used in robotic control and vision). Usually the synaptic weights' initial values are set randomly, then adjusted by the following update rule as training instances are provided:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j\, o_i, \qquad (8.7)$$

where $\delta_j = o_j (1 - o_j)(t_j - o_j)$ for an output node, and $\delta_j = o_j (1 - o_j) \sum_k \delta_k w_{jk}$ for a hidden layer node. The errors are propagated backward from the output layer.
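A single back-propagation step following equation 8.7 can be sketched for a small network with one hidden layer. The sigmoid activation is assumed because the factor $o_j(1 - o_j)$ is its derivative; the network size, the fixed initial weights (used instead of random ones so the run is repeatable), the absence of bias terms, and all function names are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_output):
    """w_hidden[i][j]: input i -> hidden j; w_output[j][k]: hidden j -> output k."""
    hidden = [sigmoid(sum(w_hidden[i][j] * inputs[i] for i in range(len(inputs))))
              for j in range(len(w_hidden[0]))]
    outputs = [sigmoid(sum(w_output[j][k] * hidden[j] for j in range(len(hidden))))
               for k in range(len(w_output[0]))]
    return hidden, outputs


def backprop_step(inputs, targets, w_hidden, w_output, eta=0.5):
    hidden, outputs = forward(inputs, w_hidden, w_output)
    # Output nodes: delta_k = o_k (1 - o_k)(t_k - o_k)
    delta_out = [o * (1 - o) * (t - o) for o, t in zip(outputs, targets)]
    # Hidden nodes: delta_j = o_j (1 - o_j) * sum_k delta_k w_jk (errors flow backward)
    delta_hid = [hidden[j] * (1 - hidden[j]) *
                 sum(delta_out[k] * w_output[j][k] for k in range(len(delta_out)))
                 for j in range(len(hidden))]
    # Weight update: w_ij(t+1) = w_ij(t) + eta * delta_j * o_i
    for j in range(len(hidden)):
        for k in range(len(delta_out)):
            w_output[j][k] += eta * delta_out[k] * hidden[j]
    for i in range(len(inputs)):
        for j in range(len(hidden)):
            w_hidden[i][j] += eta * delta_hid[j] * inputs[i]


def squared_error(inputs, targets, w_hidden, w_output):
    _, outputs = forward(inputs, w_hidden, w_output)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))


# Hypothetical 2-2-1 network trained on a single instance.
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_output = [[0.2], [-0.1]]
inputs, targets = [1.0, 0.0], [1.0]

error_before = squared_error(inputs, targets, w_hidden, w_output)
for _ in range(200):
    backprop_step(inputs, targets, w_hidden, w_output)
error_after = squared_error(inputs, targets, w_hidden, w_output)
print(error_before, error_after)
```

The hidden-layer deltas are computed before the output weights are changed, so that the backward pass uses the same weights as the forward pass.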
8.4.1 Classical Conditioning

Classical conditioning, initially studied by Pavlov (1927), assumes that unconditioned stimuli (US) automatically generate an unconditioned response (UR). The US-UR pair is defined genetically and is appropriate to ensure survival in the agent's environment. In Pavlov's studies, the sight of food (US) would result in a dog's salivation (UR). Pavlov observed that associations could be developed between a conditioned stimulus (CS), which has no intrinsic survival value, and the UR. In the dog's case, when a bell was rung repeatedly
Figure 8.6 Learning architecture supporting classical conditioning. (Diagram labels: sensors, comprising collision detectors, range finder, and target detector; aversive unconditioned stimulus; appetitive unconditioned stimulus; unconditioned response; robot motors.)
with the sight of food, over time an association was made so that the ringing of the bell alone was sufficient to induce salivation. Hebbian learning can produce classical conditioning.

An international research group (Vershure, Krose, and Pfeifer 1992; Vershure et al. 1995) has looked at classical conditioning methods as a basis for the self-organization of a behavior-based robotic system. Instead of hard-wiring the relationships between stimuli and responses, the learning architecture (figure 8.6) permits these associations to develop over time.

In their early simulation studies (1992), Vershure, Krose, and Pfeifer divide the positive US fields into four discrete areas in which the attractive (appetitive) target may appear: ahead, behind, left, and right. A significant turning response is required when the target is considerably to the left or right, a lesser one (if any) if it is to the front or rear. The UR set consists of six possible commands: advance, reverse, turn right 9 degrees, turn left 9 degrees, turn right 1 degree, and turn left 1 degree. Two additional collision sensors serve as negative US, producing a response consisting of a reverse and turning 9 degrees away from the direction of the collision. The negative (aversive) US can inhibit the positive US, ensuring that managing collisions takes precedence over target acquisition.

The CSs use a range sensor capable of producing a distance profile of 180 degrees in the direction in which the robot is heading. The readings are divided into varying discrete levels of resolution based on the robot's heading: for the forward area, ranging from -30 to +30, there are twenty units covering 3 degrees each; for the area to the right, ranging from +30 to +60, there are five units covering 6 degrees each; and for the area to the far right, ranging
from +60 to +90, there are three units covering 10 degrees each. Areas to the left and far left have resolutions similar to those of the right and far right, respectively, making a grand total of 36 units of range data.

For the neural network implementation, perceptron-like linear threshold units with binary activation values are used. The synaptic weights are updated by the following rule:

$$\Delta w_{ij} = \frac{1}{N}\left(\eta\, o_i o_j - \epsilon\, \bar{o}\, w_{ij}\right), \qquad (8.8)$$
where $\eta$ is the learning rate; $\epsilon$ is the decay rate; $N$ is the number of units in the CS field; $o_i$ and $o_j$ are the binary output values of units $i$ and $j$, respectively; and $\bar{o}$ is the average activity of the US field.

The robot's task is to learn useful behaviors by associating perceptual stimuli with environmental feedback. The behaviors include avoidance, in which the robot learns not to bump into things, and avoidance combined with approach to a desired target. Note that the robot has no a priori understanding of how to use the range data to prevent collisions from occurring in the US set: This must be learned from the CS. The simulation studies indicate that successful emergent behavior does occur in a manner consistent with the agent's goals. The authors likened these results to the development of an adaptive field by the construction of a sensor-driven control schema. These learned behaviors are consistent with specialized variants of schema-based navigation, discussed in section 4.4, in which the agent constructs the required responses through environmental interactions.

More recently, this work has been extended to vision-based navigation for an actual mobile robot, NOMAD (Vershure et al. 1995). The robot's learning task involves sorting colored blocks that conduct electricity, either strongly or weakly, based on feedback from an aversive or appetitive response. Because the robot can obtain the US only when it is in actual contact with the block, its goal is to learn the correct response to the color characteristics associated with each block type. In this case the UR arises from a conductivity sensor, and the CR uses color vision. The synaptic update rule used is
$$w_{ij}(t+1) = w_{ij}(t) + v(t)\, a_i(t)\, \bigl(\eta\, a_j(t) - \epsilon\, w_{ij}(t)\bigr), \qquad (8.9)$$

where $v(t)$ is the average activation of the value units, where the value units provide constraints on the system; $a_i$ is the neuron's firing rate; $a_j$ is the firing rate of the presynaptic neuron $j$; and $\eta$ and $\epsilon$ are the learning rate parameters for potentiation and depression of synaptic strength, respectively.

Three different variations were used in the study:

1. A value system is present. (Equation 8.9 is used as is.) This value system exerts an influence on alterations in synaptic strength based on relevant sensory inputs independent of whether they are aversive or appetitive.
2. The value system is turned off by setting $v(t) = 1$ in equation 8.9.
3. A model is used that corresponds to Hebbian learning, with no value units and a dropping of the dependency of synaptic depression on $a_j$ (i.e., the last term in equation 8.9 becomes $\eta\, a_i(t)\, a_j(t) - \epsilon\, w_{ij}(t)$).

In the study, the robot successfully learned to accomplish the task. The learning performance varied depending on the method chosen, with method 2 resulting in the best performance and fastest development of conditioned responses.

Others have used classical conditioning methods: Scutt (1994), for example, taught a Braitenberg-like robot (section 1.2.1) consisting of a neural net of only five neurons to seek light. In another example, Gaussier and Zrehen (1994) used a variation of Hebbian learning to teach a small mobile robot, using a neural network, to develop a topological map, learning suitable responses at different locations within its world. These researchers showed that learned encodings similar to potential fields could be developed using classical conditioning methods capable of representing learned landmark positions (Gaussier and Zrehen 1995).
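The three variations of equation 8.9 differ only in how the weight change is computed, which a short sketch makes concrete. The function name `weight_change`, the `variant` argument, and the numeric values are assumptions for illustration; the study's actual parameter values are not given in the text.

```python
def weight_change(w_ij, a_i, a_j, v, eta, eps, variant):
    """Change in w_ij for one time step under the three variations described above."""
    if variant == 1:
        # Value system present: equation 8.9 as is.
        return v * a_i * (eta * a_j - eps * w_ij)
    if variant == 2:
        # Value system turned off: v(t) = 1.
        return a_i * (eta * a_j - eps * w_ij)
    if variant == 3:
        # Hebbian model: no value units; decay no longer gated by activity.
        return eta * a_i * a_j - eps * w_ij
    raise ValueError("variant must be 1, 2, or 3")


# Hypothetical values: both neurons active, weak value signal.
params = dict(w_ij=0.5, a_j=1.0, v=0.2, eta=0.1, eps=0.05)
active = [weight_change(a_i=1.0, variant=k, **params) for k in (1, 2, 3)]
silent = [weight_change(a_i=0.0, variant=k, **params) for k in (1, 2, 3)]
print(active)
print(silent)
```

With neuron $i$ silent, variations 1 and 2 leave the weight untouched, whereas the Hebbian variation still decays it; with both neurons active, the value signal in variation 1 simply scales the size of the change.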
8.4.2 Adaptive Heuristic Critic Learning

With adaptive heuristic critic (AHC) reinforcement learning methods, a critic learns the utility or value of particular states through reinforcement. These learned values are then used locally to reinforce the selection of particular actions for a given state (panel (A) in figure 8.3). A research group in Spain (Gachet et al. 1994) has studied the use of connectionist AHC reinforcement
as a basis for learning the relative strengths (i.e., G) of each behavior's response within an active assemblage. The study's specific goals were to learn how to coordinate effectively a robot equipped with the following behaviors: goal attraction, two perimeter-following behaviors (left and right), free-space attraction, avoiding objects, and following a path. The output for each of these behaviors is a vector that the robot sums before execution, in a manner very similar to that of the schema-based methods discussed earlier (section 4.4).

The AHC network (figure 8.7) starts with a classification system that maps the incoming sonar data onto a set of situations, either thirty-two or sixty-four depending upon the task, that reflect the sensed environment. An output layer containing a weight matrix W, called the associative search element (ASE), computes the individual behavioral gains $g_i$ for each behavior. Each element $w_{ki}$ of the weight matrix is updated as follows:

$$w_{ki}(t+1) = w_{ki}(t) + \alpha\, b_i(t)\, e_{ki}(t), \qquad (8.10)$$
where $\alpha$ is the learning rate and $e_{ki}(t)$ is the eligibility of the weight $w_{ki}$ for reinforcement (maintained and updated as a separate matrix). A separate adaptive critic element (ACE) determines the reinforcement signal $b_i(t)$ to be applied to the ASE. Its weights V are updated independently as follows:

$$v_{ki}(t+1) = v_{ki}(t) + \beta\, b_i(t)\, x_k(t), \qquad (8.11)$$
where $\beta$ is a positive constant and $x_k$ is the eligibility for the ACE. This partitioning of the reinforcement updating rules from the action element updating is a characteristic of AHC methods in general.

The task for the non-holonomic robot (named Robuter) is to learn the set of gain multipliers G for a particular task-environment. Three different missions are defined for the robot. The first mission involves learning to explore the environment safely. The second involves learning how to move back and forth between alternating goal points safely, and the third is to follow a predetermined path, also without collisions. In the exploration mission, initially the robot moves randomly within the world. When a collision occurs, negative reinforcement is applied and the robot is moved back to a position it occupied N steps earlier (N = 30 for the simulations, 10 for the actual robot because of tight quarters). For the other goal-oriented missions, negative reinforcement occurs under two conditions: when a collision is imminent, and when the robot is pointing away from the goal (or next path point) and no obstacles nearby are blocking its way to the goal. Tests on Robuter in a relatively confined laboratory environment showed
successful learning for the first and second missions. Successful simulation results were reported for the third mission as well.

Figure 8.7 Connectionist AHC learning system (after Gachet et al. 1994). (Diagram labels: input layer; weight matrix; local reinforcement signals from critic; adaptive search element; output behavioral gains.)

Milian (1994) also used AHC learning methods for navigation tasks. A Nomadics robot named Teseo used temporal differencing methods (Sutton 1988) as a specific instance of AHC learning. In this work, the robot learns behavioral patterns efficient for a particular task, such as moving from one location to another. The robot learns to avoid unfruitful paths (e.g., dead ends) on the way to achieving its navigational goals. Basic reactive reflexes are used as primitive behaviors. Similarly, researchers at the University of Southern California (Fagg et al. 1994) have used temporal differencing as a form of connectionist AHC learning to permit a robot to learn collision avoidance, wall following, and environmental exploration.
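The separation between the action element and the critic in AHC methods shows up directly in the two update rules (equations 8.10 and 8.11), which can be sketched as two independent functions. How the reinforcement signal $b_i(t)$ and the eligibilities are produced is not specified here, so they are simply passed in; the function names, matrix sizes, and numbers are assumptions for illustration.

```python
def update_ase(W, b, E, alpha):
    """Equation 8.10: w_ki(t+1) = w_ki(t) + alpha * b_i(t) * e_ki(t)."""
    return [[W[k][i] + alpha * b[i] * E[k][i] for i in range(len(b))]
            for k in range(len(W))]


def update_ace(V, b, x, beta):
    """Equation 8.11: v_ki(t+1) = v_ki(t) + beta * b_i(t) * x_k(t)."""
    return [[V[k][i] + beta * b[i] * x[k] for i in range(len(b))]
            for k in range(len(V))]


# Hypothetical sizes: 2 situations (rows k) and 3 behaviors (columns i).
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # ASE weights
V = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # ACE weights
b = [1.0, -1.0, 0.0]                     # reinforcement per behavior (assumed given)
E = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]   # ASE eligibilities: only situation 0 is eligible
x = [1.0, 0.0]                           # ACE eligibilities per situation

W = update_ase(W, b, E, alpha=0.5)
V = update_ace(V, b, x, beta=0.1)
print(W)
print(V)
```

Only the rows for the eligible situation change: the first behavior's weights are raised, the second behavior's are lowered, and the third behavior, which received no reinforcement, is left alone.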
A research team at the University of Karlsruhe (Ilg and Berns 1995) applied AHC learning methods to teach leg coordination and control to a hexapod robot named LAURON (figure 8.8). Here also a critic element generates a reward for the current state and a separate action element generates the choice of how to act next. Individual leg actions were learned (return and power strokes) as well as coordinated control between the legs.
8.4.3 Learning New Behaviors Using an Associative Memory

A research group at the University of Edinburgh (Nehmzow, Smithers, and McGonigle 1993; Nehmzow and McGonigle 1994) has studied ways to increase a robot's behavioral repertoire (i.e., learn new $\beta_i$) through a connectionist associative memory. Two of their robots, named Alder and Cairngorm, were equipped with two whisker sensors, one each on the vehicle's left and right front, used to signal collisions. The robot was also able to sense when it was moving forward. In later work a more sophisticated robot, the Edinburgh R2, used five bump sensors, eight infrared sensors for proximity detection, and six light-dependent resistors to detect the presence of light.

Instinct rules, the first approach taken, were established for the agent, which resulted in various forms of locomotion. The motor responses that became associated with these rules correspond closely to behaviors in a priority-based system such as the subsumption architecture. Example behaviors learned by Alder and Cairngorm include (Nehmzow, Smithers, and McGonigle 1993):

• Rule 1 - Keep forward motion sensor on: The robot learns to move forward.
• Add Rule 2 - Keep whiskers straight: The robot learns to avoid bumping into obstacles.
• Add Rule 3 - Make whiskers respond after four seconds: The robot learns wall-following behavior.
• Add Rule 4 - Make alternate whiskers respond: The robot learns corridor-following behavior.

Figure 8.9 depicts the learning architecture for the Edinburgh R2 (Nehmzow and McGonigle 1994). As is the case for their other robots, the two-layer perceptron-based neural network's goal is to associate novel sensor information with recently undertaken actions. A novelty detector determines when conditions have changed sufficiently to warrant a change in action, which occurs when either the robot's heading has changed significantly or there is a significant net difference in the proximity sensor data. The associative memory acts on a sensor vector containing the proximity data (quantized into a range
Figure 8.8 (A) Hexapod robot LAURON and (B) its successor LAURON II. (Photographs courtesy of Karsten Berns.)

Figure 8.9 Two-layer perceptron learning architecture. (Diagram label: two-layer perceptron pattern associator.)
between 0 and 3) and light-detector data (digitized into a normalized range between 0 and 255). The output of this associative memory passes into an arbiter that selects the most appropriate action. Only three actions are defined: swift left turn, swift right turn, and move forward. This two-layer perceptron (eleven input, three output nodes), in contrast to the multilayer neural networks already discussed, has the advantage of rapid learning, often with only one positive example. These perceptron networks, however, can learn only linearly separable functions (Minsky and Papert 1969), although this has not yet proven to be a problem for this type of application. Scalability issues are as yet unresolved. The synaptic weight update rule used appears in equation 8.6.

For the R2, teaching is supervised by a human, in contrast to the instinct rules used earlier. The experimenter provides critical feedback on its performance directly to the robot by covering or uncovering a light sensor on its top. The robot considers itself to be behaving correctly while the sensor is dark (positive feedback). When the sensor is uncovered (negative feedback) the robot tries different actions until it finds a suitable one, at which point the sensor is covered again. Clearly a good teacher is essential for proper learning in this instance. This process of applying external feedback is likened to "shaping," a process of behavior modification used for training animals.

This robot has learned four basic behaviors: obstacle avoidance, box pushing, wall following, and light seeking. Combinations of these behaviors are also teachable, permitting the robot to learn how to navigate through a maze,
for example, by using avoidance, light seeking, and wall following in a learned order.
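The teaching loop described in this section (arbiter proposes an action, the teacher's light-sensor feedback accepts or rejects it, and the associator is trained on the accepted action) can be sketched as follows. Everything specific here is an assumption for illustration: the three-element sensor vector (the R2 used eleven inputs), the `teacher_approves` callback standing in for the covered light sensor, the linear output units, and the error-correction update, which is a common textbook form rather than the rule printed in equation 8.6.

```python
ACTIONS = ["swift left turn", "swift right turn", "move forward"]


def scores(weights, sensors):
    """Linear output of the pattern associator, one score per action."""
    return [sum(w * s for w, s in zip(row, sensors)) for row in weights]


def arbiter(weights, sensors):
    """Action indices ordered from most to least strongly recommended."""
    s = scores(weights, sensors)
    return sorted(range(len(ACTIONS)), key=lambda k: -s[k])


def teach(weights, sensors, teacher_approves, eta=0.5):
    """Try actions until the teacher approves one, then reinforce that association.

    Assumes the teacher approves exactly one action per situation.
    """
    for action in arbiter(weights, sensors):
        if teacher_approves(sensors, action):   # sensor covered: positive feedback
            break
    current = scores(weights, sensors)
    for k, row in enumerate(weights):
        target = 1.0 if k == action else 0.0
        for i, s in enumerate(sensors):
            row[i] += eta * (target - current[k]) * s   # assumed error-correction update
    return action


# Hypothetical lessons: (obstacle left, obstacle right, constant 1) -> approved action index.
lessons = {(1, 0, 1): 1, (0, 1, 1): 0, (0, 0, 1): 2}


def teacher_approves(sensors, action):
    return lessons[tuple(sensors)] == action


weights = [[0.0] * 3 for _ in ACTIONS]
for _ in range(200):
    for sensors in lessons:
        teach(weights, sensors, teacher_approves)

for sensors in lessons:
    print(sensors, ACTIONS[arbiter(weights, sensors)[0]])
```

In this toy setting the associator already proposes the approved action for all three situations after a single pass through the lessons, which echoes the rapid learning noted for the two-layer perceptron; the remaining passes only sharpen the scores.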
8.5 GENETIC ALGORITHMS
The genetics of behavior has been well studied in the context of biological systems. Breeds of animals are created to possess certain useful behavioral properties such as disposition (e.g., viciousness or friendliness in dogs). It is also possible to use computational analogies of behavioral genetics to configure robotic characteristics. In this section, we first review how certain classes of genetic algorithms operate, then look specifically at how they can be used within robotic systems. We also examine how their strengths can be combined with the strengths of neural networks to yield hybrid adaptive robotic control systems.
8.5.1 What Are Genetic Algorithms?

Genetic algorithms (GAs) form a class of gradient descent methods in which a high-quality solution is found by applying a set of biologically inspired operators to individual points within a search space, yielding better generations of solutions over an evolutionary timescale (Goldberg 1989). The fitness of each member of the population (the set of points in the search space) is computed using an evaluation function, called the fitness function, that measures how well each individual performs with respect to the task. The population's best members are rewarded according to their fitness, and poorly performing individuals are punished or deleted from the population entirely. Over generations, the population improves the quality of its set of solutions. Although GAs, and gradient descent methods in general, are not guaranteed to yield an optimal global solution, they generally produce high-quality solutions within reasonable amounts of time for certain problem spaces. As we will see, this includes the learning of control strategies for behavior-based robots.

Genetic algorithms usually require specialized knowledge representations (encodings) to facilitate their operators' operation. The encodings typically take the form of position-dependent bit strings in which each bit represents a gene in the string chromosome. An initial population (a representative set of bit strings) is established by some means, often by randomization. Genetic
Figure 8.10 The genetic operators reproduction, crossover, and mutation. (Panels: Reproduce, Crossover, Mutate.)
operators are then applied to the bit string encoding of the population members. The three most frequently used operators are reproduction, crossover, and mutation (figure 8.10).

Prior to the operator's application, each individual's fitness is computed using the fitness function. For a behavior-based system, this may involve running a robot through a series of experiments, using the encoding of the behavioral controller represented by the particular individual bit string encoding being evaluated. The fitness function returns a value capturing the robot's overall performance for the set of conditions being tested.

Using the reproduction operator, the fittest individuals are copied exactly and replace less-fit individuals. This is done probabilistically, usually using weighted roulette-wheel selection, increasing the likelihood of but not guaranteeing the fittest individuals' reproduction. This operator's net effect is an increase in the ratio of highly fit individuals relative to the number of poor performers, loosely following the Darwinian principle of survival of the fittest.

Crossover involves two individual encodings, exchanging information through the transfer of some part of their representation to another individual. This process creates new individuals that may or may not perform better than the parent individuals. Which individuals to cross over and what bit string parts to exchange are usually chosen randomly. The net effect is an increase in the overall population.

Mutation, a simple probabilistic flipping of bit values in the encoding, affects an individual only and does not increase the overall population size. This
random effect provides the ability to escape local minima, a common problem associated with gradient descent methods. Just as in biological mutations, most mutations will lead to inferior individuals, but occasionally a more fit one will emerge. Because the probability of mutation is generally very low and copies of the most fit individuals result from reproduction, this randomness permits high-quality solutions to emerge that would otherwise be unattainable.

The use of these genetic operators results in a varying population over time. Some individuals created have a lower fitness than their parents, but on average the entire population's overall fitness as well as that of the best individuals improves with successive generations. If properly designed, the learning system eventually settles on a set of highly fit, near-optimal individuals with similar bit strings. The final solution's quality and length of time to obtain it depend heavily on the nature of the problem and the values for the many parameters that control the GA. Fortunately, control problems, and in particular behavior-based control methods, are highly compatible with these methods, as they generally have a reasonable parameter set size.
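The three operators described in this section can be sketched on plain bit strings. This is a minimal illustration under stated assumptions, not a reconstruction of any system in this chapter: the fitness function simply counts 1 bits (a stand-in for running a robot through experiments), crossover is single-point and replaces the two parents with their offspring so that the population size stays constant, and all names and parameter values are hypothetical.

```python
import random


def fitness(bits):
    """Hypothetical fitness: the number of 1 bits in the encoding."""
    return sum(bits)


def reproduce(population, rng):
    """Weighted roulette-wheel selection: fitter individuals are more likely to be copied."""
    weights = [fitness(ind) + 1e-9 for ind in population]   # avoid an all-zero wheel
    chosen = rng.choices(population, weights=weights, k=len(population))
    return [list(ind) for ind in chosen]


def crossover(a, b, rng):
    """Single-point crossover: exchange the tails of two encodings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with a small probability."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def evolve(pop_size=20, length=16, generations=40, rate=0.01, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        population = reproduce(population, rng)
        rng.shuffle(population)                    # pair individuals at random
        for i in range(0, pop_size - 1, 2):
            population[i], population[i + 1] = crossover(
                population[i], population[i + 1], rng)
        population = [mutate(ind, rate, rng) for ind in population]
    return population


final_population = evolve()
print(max(fitness(ind) for ind in final_population))
```

Because selection is probabilistic, the best individual of one generation is not guaranteed to survive into the next; implementations that need that guarantee usually copy the best individual across unchanged, which this sketch does not do.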
8.5.2 Genetic Algorithms for Learning Behavioral Control

GAs, although a powerful technique for developing control systems, require some restrictions on implementation compared to the individual learning methods we have already investigated. Since these methods typically require a significant population of robots for fitness testing, and the robots must be further tested over many, many generations, much of the learning in genetic algorithms is of necessity conducted in simulation off line. As an evolutionary timescale is needed, it is generally infeasible to conduct real-time learning. Simulated learning, fortunately, can generally be conducted at speeds orders of magnitude faster than real-world testing. Assuming that a simulation has a reasonable degree of fidelity to the real robot and environment, the control parameters from the fittest simulated individual developed over many generations can then be transferred to the actual robot for use.

An example from such a simulation appears in figure 8.11 (Ram et al. 1994). In this particular system, GA-Robot, a schema-based behavioral controller is evolved using genetic algorithms. An encoding is created that represents the individual gains of the component behaviors (goal attraction, obstacle avoidance, and noise) and additional parameters internal to certain behaviors (obstacle sphere of influence, noise persistence). In this work, instead of using a more slowly converging bit string, an encoding using floating-point values for the gains and parameters is used.
    begin
      Obstacles.Create;                     /* Make a new environment */
      Population.Build;                     /* Make a new population */
      for 1 to NUMBER_GENERATIONS do
      begin
        for 1 to RUNS_PER_GENERATION do     /* Let Robots try to reach goal */
        begin
          for 1 to MAX_NUMBER_STEPS do
          begin
            Robots.Move;
          end
          Obstacles.Recreate;               /* Update environment */
        end
        /* Prepare next generation */
        Robots.Reproduce;
        Robots.Crossover;
        Robots.Mutate;
      end
    end

Figure 8.11 GA-Robot's main evolutionary algorithm.

Fitness for an individual is defined as a function of weighted penalties:

    raw_fitness = collision_weight * number_of_collisions
                + time_weight * number_of_steps
                + distance_weight * distance_traveled

By altering the penalty weights for each component of the fitness function, three different classes of robots are evolved, each specialized for a particular ecological niche (figure 8.12):

• Safe: optimized to avoid hitting obstacles while still attaining the goal.
• Fast: optimized to take the least amount of time to attain the goal.
• Direct: optimized to take the shortest path (which may be slower because of reduced speeds in cluttered areas).

The different behavior of a single class of evolved robots across differing environments is also in evidence in figure 8.13. Although these robots are optimized to avoid collisions, they can still find relatively direct paths in low-clutter environments. As the clutter increases, however, the paths begin to diverge, until the robots find many indirect and slow but safe routes through the obstacle field. Using GAs in this way permits an environment-specific control
Figure 8.12 Final paths through 25% cluttered general worlds of (A) safe, (B) fast, and (C) direct robots.

Figure 8.13 Final paths of safe robots through (A) 1%, (B) 10%, and (C) 25% cluttered general worlds.
system to be evolved if desired, filling targeted ecological niches (e.g., fast robots in highly cluttered worlds).

8.5.3 Classifier Systems
Other variations on representational encodings are also used. One common alternative strategy is the use of a classifier system (Booker, Goldberg, and Holland 1989). Here, genetic operators act upon a set of rules encoded by bit strings. The performance element of a classifier system works in the manner of a production system: Preconditions for a set of rules are checked to determine their applicability given the current situational context. The preconditions have fixed-length bit encodings with values 0, 1, or # (don't care). The action side of the rule is also a fixed-length encoding with values of 0, 1. Conflict resolution methods are used to select which one from a set of potential rules will be used. The performance element then executes the selected rule. Credit assignment is performed by a separate critic module that evaluates the results of the chosen actions (i.e., the fitness) and is used, as always, to guide learning. A learning element creates new rules using genetic operators.

We saw earlier (section 3.3.1) that rule-based systems are a useful encoding method for behavior-based robotic systems. GA-based classification systems are a natural fit. One international research group centered in Milan has used these methods to evolve behavioral control in a system called ALECSYS (figure 8.14). Testing on a series of small robots (AutonoMouse), the researchers have demonstrated phototaxis: learning to approach both stationary and moving light sources (Colombetti and Dorigo 1992). In simulation, they have further demonstrated the coordination of three different primitive behaviors (approaching, chasing, and escaping) using several different potential coordination operators (e.g., combination, suppression, and sequencing). The chase behavior operates along the following lines: A sensor encodes in a four-bit string the location of the object to be chased.
This particular encoding can serve as the precondition for a rule that has an action encoding consisting of five bits, the first three encoding the direction to move, one bit for whether or not to move, and a one-bit flag to notify the behavior coordinator that an action is recommended by the rule. These bit strings representing rules evolve using the genetic operators described earlier, creating new rules as necessary and deleting useless ones according to the results of the critic's fitness assessment. ALECSYS has also been applied to manipulator control, learning to coordinate vision (exteroceptive) and encoder (proprioceptive) sensors to
Figure 8.14 ALECSYS learning architecture. (Diagram labels: performance element; sensors; messages; classifier rule selection; critic; conflict resolution; actuators; genetic algorithm.)
produce gross motion guiding the manipulator to a target object (Patel et al. 1995).

Another important approach combining the power of production rules and genetic algorithms is SAMUEL, a system developed at the Naval Research Laboratories and tested on a Nomad robot (Grefenstette and Schultz 1994). The task for this system is to learn how to avoid obstacles and to safely navigate to a goal in a cluttered environment using sonar and infrared sensors. At each time step an action is produced consisting of moving at a linear speed between -1 and 5 inches per second and a turning speed of -40 to +40 degrees per second. The standard set of genetic operators of selection, crossover, and mutation are employed. Performance for the top five individuals improved significantly (from approximately 72 percent to 93 percent success) for a set of fifty individual rule sets evaluated over twenty trials each and evolved over fifty generations. The initial rule set was obtained from a set of human-created rules and machine-generated variations. Significant simulation results were also generated for behavioral scenarios involving evasion, dogfighting, minefield navigation, and prey tracking (Schultz and Grefenstette 1992).

8.5.4 On-Line Evolution
Though most GAs evolve different individuals over time, it is also possible to permit the continuous evolution of the control system during execution using genetic algorithms. Steels (1994) accomplishes on-line adaptation by treating the behavioral controller as a population of concurrent active behavioral processes. The goal for the robotic agent (in this case a small machine constructed
339
AdaptiveBehavinr
from LegoTechnics) is to survive; this meansfinding adequateenergysources (which are depletable ) and not getting stuck in obstacletraps while seeking themout. An initial populationof behavioralprocesses is generatedthat compete for actuatorconb"oil, producing changesin speedor turning. Fitnessis evaluatedover somehistory window, typically looking aboutone secondinto the past, accordingto how well the systemrespondsto satisfyingits survival needs. During this time interval eachprocess's contributionto (impact on) the robot' s control is logged. The fitnessand impact a processhad during its last history window guidesreproduction.New processcreationinvolving GA operators is usedto keepthetotal numberof activeprocess esin the systemconstant. over initial behavioralconfigurations Significant performanceenhancements havebeenachievedusingthis approach. In this system, an individual agenthas the ability to respondto continual changesin its environment.Most GA methodsassumethat the behavioralcontroller is fixed during a particularagent's lifetime. This methodpennitscontinuous adaptation, which can potentially provide greaterflexibility in adapting to evolving ecologicalnichesand it explicitly recognizesan agent's needto . changein responseto ongoingenvironmental.change s. sis
8.5.5 Evolving Form Concurrently with Control
Sims (1994) has developed an interesting approach to evolving entire robotic creatures using GAs. Given the question being studied, Sims's work has of necessity been tested only in simulation. Sims allows for the evolution of not only the agent's controller, but also its morphology (form). Genotypes encoded as directed graphs are used to produce phenotypic structures that constitute the corresponding three-dimensional kinematic systems. Rigid, revolute, twist, spherical, and other joint types are permitted. The genotype encoding determines points of attachment. Sensors attached to the independently evolved control system include contact, joint angle, and photosensors. A neural controller maps sensor inputs onto effector outputs.

To evaluate fitness, a high-fidelity physical simulation is created. Various objective functions are established: for swimming and walking, the distance traveled by the agent's center of mass per unit time; for jumping, the maximum clearance achieved by the lowest part of the agent; and for following, the average speed of approach to a light source. Figure 8.15 shows several successfully evolved creatures for swimming and walking.

Reproductive selection is based upon fitness, as is normally the case. Crossover operations on genotypic encodings involve combining components of the directed graph rather than bit string manipulation. Mutation is also conducted by adding random nodes to the genotype graph.

The results of these simulation studies are impressive. It is hard to envision, given the current state of the art, how these kinds of kinematic variations could be conducted autonomously. Nonetheless, it is certainly important to recognize that behavioral robotics is limited if one considers only a fixed spectrum of physical robotic structures to control. Nature provides this flexibility to its creatures. Roboticists would also do well to consider these aspects of morphological adaptation, at the very least within the design process.
Figure 8.15 Evolved creatures: (A) walking creatures, (B) evolutionary ancestors of the water snake, and (C) swimming water snakes. (Photographs courtesy of Karl Sims.)

8.5.6 Hybrid Genetic/Neural Learning and Control

Several systems have combined the power of neural controllers and genetic algorithms. Researchers at the Center for Neural Engineering at the University of Southern California (Lewis, Fagg, and Bekey 1994) have used GA methods to evolve the weights for a neural controller for a robotic hexapod, named Rodney, rather than using traditional neural learning (e.g., back-propagation). Fitness functions were defined for first learning oscillatory control for each of the legs and then coordinating the oscillations to produce effective gaits. The tripod gait manifested itself, but surprisingly the robot preferred to walk backward rather than forward: evidently this was more efficient for this particular mechanical structure. Similar results for a hybrid genetic/neural system were obtained at Case Western Reserve University (Gallagher and Beer 1992).

In other work, a Braitenberg-style neural controller was implemented on a small commercially available Khepera robot equipped with three ambient light sensors pointed to the floor and eight infrared proximity sensors (Mondada and Floreano 1995; Floreano and Mondada 1996) (figure 8.16). Fitness functions were defined for various behaviors, including

• Navigation and obstacle avoidance: Fitness maximizes motion and distance from obstacles.
• Homing: Fitness ensures that the power is kept at adequate levels by adding a light-seeking behavior to guide it to its black recharging area when power becomes low.
• Grasping of balls using an added gripper: Fitness maximizes the number of objects (balls) gripped in an obstacle-free environment.

Genetic algorithms are used to evolve the synaptic weights for the neural controller. In all three cases, the targeted behavior is learned to varying degrees. Here also backward locomotion is a preferred method for the evolved mobile gripper controller. The most successful individual backed up until it encountered something, then turned around and attempted to grip it.
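The idea of evolving synaptic weights instead of training them can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the controller or fitness function used in the Khepera experiments: the two-input, two-wheel network, the four sensor situations in `SITUATIONS`, the fitness terms, and all GA settings are hypothetical.

```python
import math
import random

N_WEIGHTS = 6  # 2 wheels x (2 proximity inputs + 1 bias)

# Hypothetical (left_proximity, right_proximity) readings in [0, 1].
SITUATIONS = [(0.0, 0.0), (0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]


def controller(weights, left_prox, right_prox):
    """Single-layer network mapping two proximity readings to two wheel speeds."""
    inputs = (left_prox, right_prox, 1.0)
    left = math.tanh(sum(w * x for w, x in zip(weights[0:3], inputs)))
    right = math.tanh(sum(w * x for w, x in zip(weights[3:6], inputs)))
    return left, right


def fitness(weights):
    """Toy fitness: reward forward motion when clear and turning away from obstacles."""
    score = 0.0
    for lp, rp in SITUATIONS:
        left, right = controller(weights, lp, rp)
        score += 0.5 * (left + right) * (1.0 - max(lp, rp))  # motion in free space
        score += (left - right) * (lp - rp)                   # turn away from nearer side
    return score


def evolve(pop_size=30, generations=40, sigma=0.3, seed=0):
    """Evolve weight vectors with selection, crossover, and mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]                         # elitism: keep the best unchanged
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, N_WEIGHTS)
            child = a[:cut] + b[cut:]                  # one-point crossover
            child = [w + rng.gauss(0.0, sigma) if rng.random() < 0.2 else w
                     for w in child]                   # Gaussian mutation
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases; on a real robot each fitness evaluation would instead be a timed trial, which is what makes this style of learning slow.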
8.6 FUZZY BEHAVIORAL CONTROL

Any fool can make a rule
And every fool will mind it.
- Henry D. Thoreau

In this section, we belatedly introduce fuzzy behavioral control, a variant of discrete rule-based encodings (section 3.3.1). We review first the basic principles of fuzzy logic and then its applications specific to robotic systems. Some aspects of learning of fuzzy control are then presented for reactive robots.
Figure 8.16 Khepera robot. (Photograph courtesy of E. Franzi, F. Mondada, and A. Guignard. Photographed by Alain Herzog.)
8.6.1 What Is Fuzzy Control?

Fuzzy control systems produce actions using a set of fuzzy rules based on fuzzy logic, which is different from conventional predicate logic. In conventional logic, assertions about the world are either true or false: there is nothing in between. Values such as true and false are referred to as crisp, that is, they have one exact meaning. Fuzzy logic gives us a different perspective, allowing variables to take on values determined by how much they belong to a particular "fuzzy set" (defined by a membership function). In fuzzy logic these variables are referred to as linguistic variables, which have noncrisp meanings (e.g., fast, slow, far, near, etc.). Membership functions measure numerically the degree to which an instance of a variable belongs to its associated fuzzy set.

A fuzzy logic control system (figure 8.17) consists of the following:

• Fuzzifier: which maps a set of crisp sensor readings onto a collection of fuzzy input sets.
• Fuzzy rule base: which contains a collection of IF-THEN rules.
• Fuzzy inference engine: which maps fuzzy sets onto other fuzzy sets according to the rule base and membership functions.
• Defuzzifier: which maps a set of fuzzy output sets onto a set of crisp actuator commands.

Figure 8.17 Fuzzy logic control system architecture (blocks shown include sensors, fuzzy inference engine, defuzzifier, and actuators).
Consider an example of linguistic variables that may be useful for behavior-based robotics: steering control. It might be useful to instruct the robot to turn in some direction, but we may not want the behavior to specify the value of a turn too crisply (e.g., turn right 16.3 degrees). Instead it may be desirable to have a fuzzy output, such as turn-hard-right, or slightly-right, or don't turn (and similarly for the left). Membership functions encoding this information might appear somewhat as shown along the horizontal axis in figure 8.18. Note the overlap in membership between linguistic classes. Suppose similarly we have obstacle detection sensors that provide linguistic information such as clear-ahead, obstacle-near-right, obstacle-far-right, and similarly for the left. (Example membership functions are also shown on the vertical axis in figure 8.18.) Simple fuzzy rules can then be created, such as

• IF clear-ahead THEN don't turn.
• IF obstacle-near-right THEN turn-hard-right.
• IF obstacle-far-right THEN turn-slightly-right.

A fuzzy control system of this sort would start with crisp sensor readings (e.g., numeric values from ultrasound); translate them into linguistic classes in the fuzzifier; fire the appropriate rules in the fuzzy inference engine, generating a fuzzy output value; then translate these into a crisp turning angle in the defuzzifier, as ultimately the motor must be commanded to turn at a particular discrete angle.
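The fuzzify, infer, defuzzify pipeline just described can be sketched as follows. This is a minimal sketch under stated assumptions: the single crisp input (a range reading in meters from a right-side sensor), every membership breakpoint, and the output set centers in degrees are hypothetical, and defuzzification uses a center-average rule, a common simplification of centroid defuzzification. The rule consequents are copied as printed in the list above and live in a plain dictionary so they are easy to change.

```python
def falling(x, full, zero):
    """Membership that is 1 up to `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership that is 0 up to `zero` and rises linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangle(x, left, peak, right):
    """Triangular membership: 0 at `left` and `right`, 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(range_right_m):
    """Fuzzifier: crisp range reading -> degrees of membership (overlapping sets)."""
    return {
        "obstacle-near-right": falling(range_right_m, 0.5, 1.5),
        "obstacle-far-right": triangle(range_right_m, 0.5, 1.5, 3.0),
        "clear-ahead": rising(range_right_m, 1.5, 3.0),
    }


# Rule base: antecedent -> consequent, as listed in the text.
RULES = {
    "clear-ahead": "dont-turn",
    "obstacle-near-right": "turn-hard-right",
    "obstacle-far-right": "turn-slightly-right",
}

# Hypothetical centers of the output sets, in degrees (positive = right).
OUTPUT_CENTERS_DEG = {
    "dont-turn": 0.0,
    "turn-slightly-right": 15.0,
    "turn-hard-right": 45.0,
}


def infer(memberships):
    """Inference engine: each rule fires with the strength of its antecedent."""
    fired = {}
    for antecedent, consequent in RULES.items():
        fired[consequent] = max(fired.get(consequent, 0.0), memberships[antecedent])
    return fired


def defuzzify(fired):
    """Defuzzifier: center-average of the fired output sets -> crisp angle."""
    total = sum(fired.values())
    if total == 0.0:
        return 0.0
    return sum(s * OUTPUT_CENTERS_DEG[name] for name, s in fired.items()) / total


def steer(range_right_m):
    return defuzzify(infer(fuzzify(range_right_m)))


print(steer(1.0))  # 30.0: near and far each fire at 0.5, blending 45 and 15 degrees
```

The overlap between the near and far sets is what produces the smooth transition: as the reading moves from 0.5 m to 1.5 m the commanded angle slides continuously from the hard-turn center to the slight-turn center.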
Figure 8.18 Fuzzy logic for steering control, showing input and output membership functions and fuzzy rules relating them. (Vertical axis: input, obstacle location. Horizontal axis: output, heading.)
Fuzzy systems have more flexibility than conventional rule-based methods and permit more robust integration of sensorimotor commands than conventional production systems. Fuzzy control systems are now pervasive in consumer products: washing machines, camcorders, and VCRs, to name a few. Additional introductory information on fuzzy logic can be found in Kosko and Isaka 1993.
8.6.2 Fuzzy Behavior-Based Robotic Systems

Successfully fielded systems show the advantages of behavioral fusion or blending using fuzzy logic. We review two of these efforts.
8.6.2.1 Flakey

At SRI, Saffiotti, Ruspini, and Konolige (1993b) have designed a reactive fuzzy controller for the robot Flakey (figure 6.12). Specific behaviors are encoded as collections of fuzzy rules. One such example rule for obstacle avoidance is:

• IF obstacle-close-in-front AND NOT obstacle-close-on-left THEN turn sharp-left.

Fuzzy rules can also invoke whole behavioral rule sets, providing context for interpreting sensor data. These metarules describe the applicability of a behavior for a given situation: IF context_i THEN apply(B_i). For example, the rules below specify which behaviors should be active depending on whether a collision is imminent.

• IF collision-danger THEN apply(keep-off)
• IF NOT(collision-danger) THEN apply(follow)

Remember that the applied behaviors themselves are fuzzy, so depending on the membership function structure, each may be active to varying degrees, thus smoothly blending their responses. The behaviors are implemented as schemas, each consisting of three components:

• Context determines a particular behavior's relevancy to a given situation.
• A desirability function is implemented as a set of rules specifying the control regime.
• A descriptor set defines the objects that must be perceived or acted upon during execution (e.g., places and things in the world).

Using fuzzy control, Flakey can pursue multiple goals, blending behaviors using rules without requiring strict arbitration. Fuzzy control also permits integration of a deliberative planner to yield a hybrid architecture (Saffiotti, Ruspini, and Konolige 1993a). Using this controller, Flakey, deployed primarily in an office environment, successfully competed at the first AAAI mobile robot competition in San Jose, winning second place in a competition emphasizing the ability to navigate in an obstacle-strewn environment.
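The two metarules above can be read as a context-dependent blend, sketched here in Python. This is an illustration under stated assumptions rather than Flakey's controller: the membership function for collision-danger, its breakpoints, and the reduction of each behavior's desirability function to a single preferred turn rate are hypothetical simplifications; fuzzy NOT is taken as 1 minus the membership, which is the standard complement.

```python
def collision_danger(front_range_m, danger_m=0.5, safe_m=2.0):
    """Degree to which a collision is imminent, from a crisp front range reading."""
    if front_range_m <= danger_m:
        return 1.0
    if front_range_m >= safe_m:
        return 0.0
    return (safe_m - front_range_m) / (safe_m - danger_m)


def fuzzy_not(mu):
    """Standard fuzzy complement."""
    return 1.0 - mu


def blend(front_range_m, preferred_turn):
    """Apply the metarules and blend the behaviors' preferred turn rates.

    preferred_turn maps behavior name -> preferred turn rate (degrees per second).
    """
    danger = collision_danger(front_range_m)
    activation = {
        "keep-off": danger,            # IF collision-danger THEN apply(keep-off)
        "follow": fuzzy_not(danger),   # IF NOT(collision-danger) THEN apply(follow)
    }
    total = sum(activation.values())
    return sum(activation[name] * preferred_turn[name] for name in activation) / total


# Halfway between the danger and safe ranges both behaviors are half active.
print(blend(1.25, {"keep-off": -30.0, "follow": 10.0}))  # -10.0
```

No arbiter ever picks a winner here: as the front range shrinks, control passes gradually from the follow behavior to the keep-off behavior.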
8.6.2.2 MARGE

Another robot using fuzzy logic, MARGE (figure 8.19), developed at North Carolina State University (Goodridge and Luo 1994), was a winner in the following year's AAAI robot competition. MARGE used fuzzy logic differently, however. Instead of allowing context to enable and disable behavioral rule sets, MARGE's controller uses a networked collection of distributed fuzzy agents, all independent and concurrent. Implemented fuzzy behavioral agents include goal seeking, obstacle avoidance, wall following, and docking. Fuzzy behavioral fusion is conducted using additional fuzzy controllers as multiplexers to adjust the gains (g_i) for each behavior. Weighted vector summation is the method for producing the final defuzzified command signal. A finite state machine sequences between fuzzy controllers suitable for the competition's tasks, such as office rearrangement, which required moving boxes from one location to another in a cluttered world. MARGE won first place in this event.

Figure 8.19 MARGE. (Reprinted with permission from Janet et al. 1996. © 1996 IEEE.)
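Weighted vector summation with per-behavior gains g_i can be written compactly. The sketch below only illustrates the summation step; the behavior names follow the text, but the two-dimensional vectors, the gain values, and the conversion to a speed and heading are assumptions made for the example, and the fuzzy multiplexers that would set the gains are not modeled.

```python
import math


def fuse(vectors, gains):
    """Return the gain-weighted sum of the behaviors' 2-D output vectors."""
    x = sum(gains[name] * v[0] for name, v in vectors.items())
    y = sum(gains[name] * v[1] for name, v in vectors.items())
    return x, y


def to_command(x, y):
    """Convert a summed vector into (speed, heading in degrees)."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))


# Hypothetical outputs of two concurrent fuzzy agents and their current gains.
vectors = {"goal-seeking": (1.0, 0.0), "obstacle-avoidance": (0.0, 1.0)}
gains = {"goal-seeking": 0.75, "obstacle-avoidance": 0.25}

print(fuse(vectors, gains))  # (0.75, 0.25)
```

Raising the obstacle-avoidance gain rotates the fused vector toward that behavior's output without ever switching the goal-seeking behavior off, which is the sense in which fusion differs from arbitration.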
8.6.3 Learning Fuzzy Rules

Learning in fuzzy control systems for behavior-based robots has predominantly focused on learning the fuzzy rules themselves. In work conducted at the Oak Ridge National Laboratory, Pin and Watanabe (1995) provide one example for automatically generating a fuzzy rule base from a user-provided qualitative description of behavior. Learning is viewed differently in this system, which is given the ability to reflect the user's intentions more effectively. A traditional rule-based learning system, TEIRESIAS (Davis 1982), that was layered on top of production expert systems served a similar purpose: to facilitate the transfer of knowledge into a usable form and assist in the development of a rule base. In this fuzzy approach, the rule base is generated automatically by the following process:

1. The user enters the rule strategy for reacting to a given stimulus (the base behavior) in a qualitative form using a template.
2. The user defines the input membership functions for the stimulus specifically for each behavior.
3. The system creates a skeleton rule base for this information and verifies its completeness regarding coverage of the stimulus-and-response space. Output membership functions are initially set to a standard value.
4. Specialized metarules are then generated to suppress or inhibit behaviors (in the subsumption sense). The membership functions of the rules are automatically adjusted to reflect the desired dominance relationships.

Successful results have been achieved using automatically generated fuzzy rule bases for both indoor and outdoor robotic systems.

In a system more closely related to the other types of learning discussed earlier in this chapter, work at the Laboratoire d'Informatique Fondamentale et d'Intelligence Artificielle (LIFIA) in France (Reignier 1995) has used supervised incremental methods for learning fuzzy rules. The task here is to find a collection of fuzzy rules that captures the robot's existing supervised behavior. Temporal difference learning methods (discussed in section 8.4.2) are used. In particular, the system is capable of rule creation, adaptation (parameter modification of the THEN part of a rule), and generalization (modification of the IF part). Positive reinforcement results when the robot reaches the goal, negative reinforcement when it bumps into something. Learning occurs while the robot moves through the world on the way to the goal. At this writing only preliminary simulation results were available, but nonetheless this work shows more traditional reinforcement learning methods' extensibility to more unusual behavioral control regimes such as fuzzy logic.

8.7 OTHER TYPES OF LEARNING

Several other methods for learning have been applied to or have potential for application in behavior-based systems. A brief survey follows.

8.7.1 Case-Based Learning

Case-based learning methods use the results of past experiences to guide future action (Kolodner 1994). Experiences are stored as structured cases. The basic algorithm for case-based learning and acting is as follows:

1. Classify the current problem.
2. Use the resulting problem description to retrieve similar case(s) from case memory.
3. Adapt the old case's solution to the new situation's specifics.
4. Apply the new solution and evaluate the results.
5. Learn by storing the new case and its results.

At Georgia Tech, Ram et al. (1997) have applied these methods in simulation only to a schema-based behavioral controller called ACBARR. Cases comprise three components: a set of gains G and several internal parameters used for wandering and obstacle avoidance that represent a particular behavioral assemblage; environmental information indicating when this configuration was in use; and some local bookkeeping information.
The goal becomes learning which situations should be associated with which case's behavioral configurations. Figure 8.20 shows the overall system architecture.

The system begins with a particular configuration determined by the parameters of the behaviors. When performance inadequacies are determined by such criteria as not making progress toward the goal or not moving sufficiently, a new case more appropriate for the task is selected. On-line adaptation of the case occurs by a method referred to as learning momentum (Clark, Arkin, and Ram 1992). This method, succinctly summarized, states that if the system is doing well, do the same thing a little more strongly; if doing poorly, alter the behavioral composition to improve its performance. For example, if the system moves from a relatively obstacle-free area to a more cluttered one, obstacle avoidance begins to increase and goal attraction to decrease until satisfactory performance is achieved. Whenever ACBARR encounters a sufficiently novel environment or significantly modifies the original case retrieved, that information is stored for future reference, i.e., it learns to use those behaviors the next time it encounters a similar situation. The system is capable of escaping box canyons and navigating complex mazes using these methods, which a purely reactive system would not normally be able to do (figure 8.21).
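The retrieve, adapt, and store steps, together with a learning-momentum style adjustment, might be sketched as below. None of this is ACBARR's actual code: the two-feature situation description, the gain names, the thresholds, the step size, and the specific adjustment rules are hypothetical, and classification (step 1) is assumed to have already produced the situation features. For brevity the sketch stores a case only when the situation is novel.

```python
import math
from dataclasses import dataclass


@dataclass
class Case:
    situation: tuple  # hypothetical features: (obstacle_density, progress), each in [0, 1]
    gains: dict       # behavioral gains, e.g. {"goal": 1.0, "obstacle": 0.5, "noise": 0.1}


def retrieve(memory, situation):
    """Step 2: return the stored case nearest to the current situation."""
    return min(memory, key=lambda c: math.dist(c.situation, situation))


def adapt(gains, obstacle_density, progress, step=0.1):
    """Step 3: learning-momentum style adjustment of the retrieved gains."""
    new = dict(gains)
    if progress > 0.5:
        # Doing well: do the same thing a little more strongly.
        dominant = max(new, key=new.get)
        new[dominant] += step
    elif obstacle_density > 0.5:
        # Doing poorly in clutter: raise avoidance, lower goal attraction.
        new["obstacle"] += step
        new["goal"] = max(0.0, new["goal"] - step)
    else:
        # Doing poorly in the open: add noise to shake the robot loose.
        new["noise"] += step
    return new


def act_and_learn(memory, situation, novelty_threshold=0.3):
    """Steps 2 to 5: retrieve, adapt, and store the result if the situation is novel."""
    case = retrieve(memory, situation)
    gains = adapt(case.gains, *situation)
    if math.dist(case.situation, situation) > novelty_threshold:
        memory.append(Case(tuple(situation), gains))
    return gains
```

A caller would apply the returned gains to the running behaviors (step 4), measure progress again, and call `act_and_learn` on the next evaluation cycle.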
8.7.2 Memory-Based Learning

Memory-based learning can perhaps be viewed as case-based learning taken to the extreme, in which explicit numerical details of every experience are remembered and stored. Although it has not yet been applied in behavioral controllers, it has proven an effective technique in robots for learning functional control law approximators for tasks such as pole balancing, juggling, and billiards (Atkeson, Moore, and Schaal 1997; Moore, Atkeson, and Schaal 1995). Simply speaking, in memory-based learning, complex control functions are approximated by the interpolation of locally related past successful experiences. These lazy learning methods are well suited for complex domains with large amounts of data.
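Interpolating from locally related experiences can be illustrated with a kernel-weighted average, one simple member of the lazy learning family. The sketch below is an assumption-laden example rather than the method of the cited work: the class name, the Gaussian kernel, and the bandwidth value are illustrative choices.

```python
import math


class MemoryBasedModel:
    """Stores every (state, command) experience and answers queries by local interpolation."""

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth
        self.memory = []  # all experiences are kept verbatim; nothing is fit in advance

    def remember(self, state, command):
        self.memory.append((tuple(state), float(command)))

    def predict(self, query):
        """Gaussian-kernel weighted average of the stored commands near `query`."""
        weights = [
            math.exp(-((math.dist(state, query) / self.bandwidth) ** 2))
            for state, _ in self.memory
        ]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("no stored experience is close enough to the query")
        return sum(w * cmd for w, (_, cmd) in zip(weights, self.memory)) / total


model = MemoryBasedModel(bandwidth=0.5)
model.remember((0.0,), 0.0)
model.remember((1.0,), 2.0)
print(model.predict((0.5,)))  # 1.0: the query is equidistant from both experiences
```

All the work happens at query time, which is why these methods are called lazy: adding an experience is just an append, while each prediction scans the stored data.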
Figure 8.21 Effects of adaptation on box canyon performance: (A) purely reactive, no learning or adaptation; (B) on-line adaptation only, no cases used; and (C) case-based reasoning and on-line adaptation (ACBARR). (Each panel lists parameter readouts, including goal gain, noise gain, noise persistence, object gain, sphere of influence, steps, and contacts.)

8.7.3 Explanation-Based Learning

Explanation-based learning (EBL) methods use models (typically symbolic) of the domain to guide the generalization and specialization of a concept by induction. Learning occurs on an instance-by-instance basis, with refinement of the underlying model occurring at all steps in the process, guided by an underlying model or theory (explanation of the world). Domain-specific knowledge is crucial for this process to operate effectively, contrary to the numeric methods of reinforcement and neural learning we discussed earlier. The robot generates a plan of action based on its goal, current perceptions, and underlying theory of the world it inhabits. Based on the plan's success or failure, future plans are chosen as guided by the theory.

Learning sequences of operations in manipulation tasks seems to be the domain of choice for EBL scientists studying robotics (Segre and DeJong 1985). Other research in the manipulation domain has demonstrated that the underlying theory required in EBL can also be learned (Christiansen, Mason, and Mitchell 1991). Here, not only is the plan selection process affected, but the underlying theoretical explanation of action is learned as well.

Thrun (1995) has demonstrated hybrid EBL, Q-learning, and neural learning for the navigation of the mobile robot XAVIER (figure 5.15), whose particular task here is to recognize and then move to a specific target (green soda can) using sonar, vision, and laser stripe range data. Navigation consists of a sequence of specified actions, rather than being controlled by a set of active behaviors. Seven actions are permissible, including sharp turns, moving forward, and a specialized, hard-coded obstacle avoidance routine should something get in the way. A learning episode consists of the robot starting at some point within the lab and terminating when either the robot is directly in front of the target soda can (rewarded) or the target leaves the field of view (penalized). Q-learning is used to determine the action policies. The domain theory, instead of being symbolically represented by rules, which is usually the case in EBL, is captured in a neural network in advance using back-propagation training. The training set for Thrun's experiments includes 3,000 instances from 700 navigation episodes. These neural networks are used to predict (explain) the reinforcement that would result from the application of a particular action. The learning process proceeds first by an ex post facto explanation using the domain theory of the current training example's result.
Generalization then occurs based on the explanation of the training instance in accordance with the existing weight space derived from previous examples. Finally, refinement occurs by minimizing the error between the training example and the synaptic weights in the networks. The net effect in Thrun's trials was successful and relatively rapid on-line learning (less than ten minutes) and navigation for this particular task. EBL has yet to be extended to behavior-based systems, but there appears to be significant potential for its use, based upon these other results.
8.8 CHAPTER SUMMARY

• Robots need to learn in order to adapt effectively to a changing and dynamic environment.
• Behavior-based robots can learn in a variety of ways:
    • They can learn entire new behaviors.
    • They can learn more effective responses.
    • They can learn to associate more appropriate or broader stimuli with a particular response.
    • They can learn new combinations of behaviors (assemblages).
    • They can learn more effective coordination of existing behaviors.
• Learning can either be continuous and on-line or be conducted at the end of an episode or many episodes.
• Reinforcement learning is a battery of numerical techniques that can be effectively used in adaptive behavior-based robots:
    • Using statistical correlation to associate rewards with actions.
    • Adaptive heuristic critic methods, in which the decision policy is learned independently from the utility cost function for state evaluation. These AHC methods often are implemented in neural network systems.
    • Q-learning, in which actions and states are evaluated together.
• Neural networks, a form of reinforcement learning, use specialized, multinode architectures. Learning occurs through the adjustment of synaptic weights by an error minimization procedure such as Hebb's rule or back-propagation.
• Classical conditioning, in which a conditioned stimulus is eventually, over time with suitable training, associated with an unconditioned response, can be manifested in robotic systems as well.
• Simple associative memories implemented as two-layer perceptrons can produce rapid learning for simple tasks.
• Genetic algorithms operate over sets of individuals over multiple generations using operators such as selection, crossover, and mutation.
• Effective fitness functions must be defined for the particular task and environment for successful evolutionary learning. By suitable selection, particular ecological niches can be defined for various behavioral classes of robots (e.g., safe, fast, etc.).
• A classifier system uses fixed-length bit string rule-based representations for discrete behavioral encodings for use with genetic operators.
• Evolutionary strategies have been used for on-line adaptation and changes in physical structure in addition to the more common application to off-line learning of control system parameters.
• Fuzzy control uses rule-based methods that involve taking crisp sensor inputs, fuzzifying them, conducting fuzzy inference, and then producing a crisp response.
• Membership functions map the inputs onto the degree of membership for a particular linguistic variable.
• Learning can be accomplished in fuzzy behavior-based robot systems by capturing designer intentions through the automatic generation and refinement of suitable fuzzy rules for an application or by adapting the rules directly using reinforcement learning methods.
• Many other powerful learning methods have just begun to be explored in the context of robotics, including memory-based, case-based, and explanation-based learning.
• Behavioral learning systems have enabled robots to learn to walk, to push boxes, to shoot a ball into a goal, and to navigate safely toward a goal, among other things.
Chapter 9
Social Behavior
The mob has many heads but no brains.
- English proverb

A team effort is a lot of people doing what I say.
- Michael Winner
Chapter Objectives

1. To understand the benefits and complexities of multiagent robotic systems.
2. To be able to characterize the different dimensions along which teams of robots can be organized.
3. To recognize the differences in communication, perception, learning, and adaptation associated with social behavior when compared to solitary robotic agents.

9.1 ARE TWO (OR N) ROBOTS BETTER THAN ONE?

When is it better to go it alone, and when to have teammates? This question applies not only to human endeavors but to robotics as well. As expected, teaming robots together has both an upside and a downside.

The positive aspects:

• Improved system performance: Where tasks are naturally decomposable, the "divide and conquer" strategy is wholly appropriate. By exploiting the parallelism inherent in teaming, tasks can be completed considerably more efficiently overall for a wide range of tasks and environments using groups of robots working together.
• Task enablement: The ability to do certain tasks that would be impossible for a single robot.
• Distributed sensing: Information sharing beyond the range of an existing sensor suite on an individual robot.
• Distributed action at a distance: A robot team can simultaneously carry out actions at many different locations.
• Fault tolerance: Agent redundancy and reduced individual complexity can increase overall system reliability.

The negative aspects:

• Interference: The old adage "Too many cooks spoil the broth" pretty much sums it up. The fact that actual robots have physical size provides the opportunity for blockage or robot-robot collisions. The volume of the agents themselves results in an overall reduction of navigational free space when more than one robot is used. This is especially significant in tight quarters.
• Communication cost and robustness: Communication is not free. It generally requires additional hardware, computational processing, and energy. Communication can also suffer because of noisy channels, electronic countermeasures, and deceit by other agents, complicating reliability.
• Uncertainty concerning other robots' intentions: Coordination generally requires knowing what the other agent is doing, at least to some extent. When this is unclear because of lack of knowledge or poor communication, robots may compete rather than cooperate.
• Overall system cost: In some cases, two robots may cost more than one. If the team can be designed using simpler, less complex robots than would be required individually, this is not necessarily the case.

In light of the potentially significant advantages afforded by multirobot teams despite the potential drawbacks, researchers are investigating cooperative societies, bringing a wide range of perspectives to bear on social behavior:

• Ethological: Studying how animals cooperate and communicate (Arkin and Hobbs 1992).
• Organizational: Looking at how human organizations are structured (Carley 1995).
• Computational models: Drawing from computer science in the areas of multiprocessing and parallel system design (Wang 1995).
• Distributed artificial intelligence: Dealing with the problems of agency and cooperation using negotiation, deception, and methods for communication (Lesser 1995).
• Motion planning: Addressing the geometric and kinematic problems of multiple objects moving about in space (Latombe 1991).
• Artificial life: Studying the relationships the multiagent teams form with their environments, typically including aspects of competition as well as cooperation (Langton 1995).

Independent of the perspective taken, many potentially useful jobs for robotic societies have been identified. Some of the most commonly studied tasks for multiagent robotic systems include

• Foraging, where randomly placed items are distributed throughout the environment, and the team's task is to carry them back to a central location.
• Consuming, which requires the robots to perform work on the desired objects in place, rather than carrying them back to a home base. This may involve assembly or disassembly operations, such as in a land mine field.
• Grazing, which requires a robot team to cover an environmental area adequately. The potential applications of this social behavior include lawn mowing, surveillance operations for search and rescue, and cleaning operations such as vacuuming.
• Formations or flocking, which require the team of robots to assume a geometric pattern (approximate in the case of flocking, specific in the case of formations) and maintain it while moving about the world. Early work in this area has been concerned with theoretical and simulated results (e.g., Sugihara and Suzuki 1990; Parker 1992; Chen and Luh 1994). Behavior-based methods for coordinating multiple graphical agents have also had a significant impact within the computer animation community (Reynolds 1987; Hodgins and Brogan 1994). Real robots have demonstrated both formation (Balch and Arkin 1995) and flocking (Mataric 1993a) behaviors.
• Object transport, which probably can be viewed as a subtask of certain types of foraging, typically requires the distribution of several robots around the desired object with the goal being to move it to a particular location. Particular examples include box pushing (Kube and Zhang 1992) and coordinated pallet lifting and transport (Johnson and Bay 1995).
Scientists and engineers in Japan were among the first to study coordinated mobile multirobot systems. An early cellular robotic (CEBOT) system (Fukuda et al. 1989), involving the docking of several small robot units to produce a larger robot, illustrated communication mechanisms that can be used to support coordinated behavior. Interrobot communication devices included infrared photodiodes used for messaging that provided positional information regarding dock location. The CEBOT program research has continued over the years, resulting in an architecture (Cai et al. 1995), depicted in figure 9.1, that consists of multiple parallel behaviors using vector summation as the basis for behavioral integration (section 3.4.3.2). Small teams of mobile robots (figure 9.2) were successfully tested for multirobot goal-oriented navigation among obstacles. These agents use a behavioral suite consisting of go-to-goal, avoid-obstacles (using infrared and ultrasonic sensors), and an avoid-robot-collision behavior that produces a right turn whenever another robot is within a certain distance.

Figure 9.1 CEBOT Mark V control system (blocks shown include input devices and output devices).

At the University of Tsukuba, Premvuti and Yuta (1995) experimented with a multiagent robotic system using Yamabico robots equipped with ultrasound, dead reckoning for position estimation, and a communication network capable of transmitting position information. This work focused on cooperation between robots to avoid collisions as they moved about corridors and through intersections.
9.2 ETHOLOGICAL CONSm ERATI ONS As hasbeenour practicethroughoutthis book, we look towardbiological systems , wheneverfeasible, to provide insightsregardingthe designof behaviorbasedrobotic systems. Ethological studiesshowclearly that multiagentsocieties offer significant advantagesin the achievementof community tasks. A wide range of animal social structuresexists to supportagent-agentinteractions . For example, urn-level organizationsare found in schoolingfish, hierarchical systemsarefound in baboonsocieties,andcastesystemsaretypified by manyinsectcolonies(e.g., bees). The relationshipsbetweentheseagentsoften
363
SocialBehavior
Figure9.2 CEBOTMarkV robotteam.(Reprinted with pennission fromCalet al. 1995.@1995 IEEE.) determinethe natureandtype of communicationessentialfor the socialsystem to succeed . The conversealsoholds in that the communicationabilities somewhat determinethe mosteffectivesocialorganizationsfor a particularclassof agents. ' Tinbergens ( 1953) influential work on animal behaviordescribesa broad rangeof socialactivity: . Simple socialbehaviors . Symp~thetic induction, or doing the samethings as others(e.g., the yawn) yawn response . Reciprocalbehavior, suchascoital activity or feedingyoungthroughinduced regurgitation . Antagonisticbehavioror simpleconflict . Mating behaviors . Persuasionandappeasement . Orientationor approach . Family andgrouplife behaviors . Flocking and herding defense-relatedbehaviors, such as communalattack (mobs), warning (the flock is as alert as the most observantindividual), and ) crowding(reducingvulnerability by confusingpredators . Congregation smell or vision , using
  • Infectious behaviors that spread throughout the society, such as alarm, sleep, and eating
• Fighting behaviors
  • Reproductive fighting: Preventing rivals from being at the same location
  • Mutual hostility: Spreading the society over a region
  • Peck order: Establishing a dominance hierarchy and ultimately reducing fighting

One of the most commonly studied social biological systems is that of ants. Excellent references on their social organization and communication methods include Hölldobler and Wilson 1990 and Goetsch 1957. Ants typically use chemical communication to convey information to one another. We have seen earlier (section 2.5.1) an example of a robotic system capable of a primitive form of chemotaxis inspired by ants' communication methods. Foraging mechanisms are considerably more sophisticated in ant colonies, however, than in any robotic system thus far developed. Foraging ants lay down chemical trails, dramatically increasing the efficiency of foraging while avoiding the need for explicit memory in the organism. Decision making is a collective effort rather than a master-slave decision (Deneubourg and Goss 1989). This pattern is consistent with the goal of avoiding hierarchical decisions in a behavior-based robotic society. Different foraging patterns for different ant species have been simulated, exploiting their tendencies of collecting different-sized food particles, among other characteristics (Goss et al. 1990). One study (Franks 1986) has looked in particular at the behavior of army ants in the context of group retrieval of prey, evaluating the relationship of the mass of retrieved objects and the velocity of their return.

A sampling of other interesting social ethological studies includes:

• the impact of environmental factors such as food supply, hunger, danger, and competition on the foraging behavior of fish (Croy and Hughes 1991).
• mob behavior and communication in the whip-tail wallaby, illustrating the emergent organization of multiple agents and the nature of communication that supports this group behavior (Kaufmann 1974).
• primate studies regarding the organization of colonies (Altmann 1974) relative to their environment.
• the role of display behavior for parsimonious communication mechanisms (e.g., Moynihan 1970).

(See also section 2.4 for additional discussion of ethological studies' influence on behavior-based robotics.)
9.3 CHARACTERIZATION OF SOCIAL BEHAVIOR

Designing a society of robots involves many different considerations. A team of agents (either animal or robotic) can be characterized along a number of dimensions, including reliability, organization, communication, spatial distribution, and congregation.
9.3.1 Reliability

System reliability is defined as the probability that the system can act correctly in a given situation over time. Parallelism, in general, increases reliability, eliminating the potential for single-point failures that would be found in serially structured systems. Hölldobler and Wilson (1990), based on their studies of ants, argue that redundancy within an organization should occur at low levels rather than high levels. Redundancy at the agent level itself, as opposed to redundant teams of agents, is considered more important. The agents must be predisposed in some manner to work together as well. This may be something as simple as not interfering with each other by staying out of each other's way or as complex as the development of a complex vocabulary for exchanging messages.
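As a rough numerical illustration of the redundancy argument, the sketch below compares duplicating individual agents with duplicating whole teams. It assumes independent failures, a task that needs every role filled (roles in series), and the standard series and parallel reliability formulas. The function names and the example numbers are illustrative and are not taken from Hölldobler and Wilson.

```python
from math import prod


def serial(reliabilities):
    """Every element must work, so the reliabilities multiply."""
    return prod(reliabilities)


def parallel(reliabilities):
    """One working element suffices: 1 minus the chance that all fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def agent_level(r, roles, copies):
    """Assumed layout: each role is backed by `copies` interchangeable agents."""
    return serial([parallel([r] * copies) for _ in range(roles)])


def team_level(r, roles, copies):
    """Assumed layout: `copies` complete teams, each a serial chain of roles."""
    return parallel([serial([r] * roles) for _ in range(copies)])


if __name__ == "__main__":
    r, roles, copies = 0.9, 3, 2  # illustrative values only
    print(f"agent-level redundancy: {agent_level(r, roles, copies):.4f}")
    print(f"team-level redundancy:  {team_level(r, roles, copies):.4f}")
```

Both layouts use six agents in this example, yet under these assumptions the agent-level layout is the more reliable of the two (about 0.970 versus 0.927), which is the direction of the argument made above.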
9.3.2 Social Organization

Animal societies are very diverse. Wilson (1975) established ten "qualities of sociality": group size, demographic distribution, cohesiveness, amount and pattern of connectedness, permeability, compartmentalization, differentiation of roles, integration of behavior, information flow, and fraction of time devoted to social behavior. Deegener defined over forty categories of animal societies (Allee 1978). A few distinct examples of social organization include multilevel hierarchical structures (e.g., ant caste systems), flat single-level structures (e.g., schooling fish), dynamic, loosely structured mobs (e.g., whip-tail wallabies), and dominance systems or peck orders (e.g., roosting place competition for fowl).

The number and types of constituent agents ultimately determine the society's performance. Specialization (heterogeneity) should be based upon societal need. One heuristic states that if an event occurs regularly within the society's lifetime, a particular class of agents should be present to handle it (Wilson 1975), implying, for robotics, that heterogeneous societies should be developed if there is a demand for specialized skills. For example, it might be better to create robots that are experts at gathering material and then delivering it to robots that are experts at assembling structures than to try to make all of them competent for both tasks. The cliche "jack of all trades and master of none" may apply to robot societies as well.

9.3.3 Communication

Communication has two major aspects:

• Information content: Most animal communication mechanisms operate at a very low bandwidth. Even when vision is used, the signals between agents generally have a low information content. Messages in animal societies are often very limited: For ants, there are typically ten to twenty different chemical signals (Hölldobler and Wilson 1990); and mammals, birds, and fish have been estimated to have approximately fifteen to thirty-five distinct major display behaviors (Moynihan 1970). Note that these may be graded by intensity.
• Mode: Different animal societies use a surprisingly wide range of communication mechanisms, including chemical, bioluminescence, reflected-light, tactile, acoustic, echolocation, infrared, and electric communication (table 9.1).

Table 9.1 Modes of Animal Communication
Columns: Mode, Directionality, Distance, Relevant Uses.
Modes: audition, luminescence, chemical, reflected light, tactile, electric.
Relevant uses: alarm, individuality, location, mass communication, social distance (Box 1973), contact, aggression.

9.3.4 Spatial Distribution
Spatial distribution is particularly important for activities such as foraging for food. Spatial considerations include small versus large groups or overlapping versus non-overlapping foraging ranges. Resource density, an environmental factor, often has a direct bearing on overall society size (the more resources, the larger the group) and also influences the foraging patterns, so that the more restricted the resource, the greater the overlap of foraging ranges (Altmann 1974; Carr and MacDonald 1986). A generalization resulting from this relationship is Horn's Principle of Group Foraging, which states that if a resource is evenly distributed, it is better for the agents (in this case, birds) to form individual, non-overlapping foraging ranges instead of roosting and foraging together (Wilson 1975). Various models for ant foraging have also been developed relating foraging ranges and strategies to resource density and distribution (Deneubourg and Goss 1989; Goss et al. 1990). Similar considerations of resource-task allocation should also affect robot societies' social behavior.
9.3.5 Congregation

Coordinating activity is important for a society. How can the society remain together over time? Simple tasks such as finding other agents can be difficult in a large group or broad area. Animals use various strategies to accomplish this task:

• By defining a colony location as a predefined meeting point recognized by other agents, agents can converge at this location. Colonies have the advantages of having common storage of resources and good defense capabilities.
• Lekking is a group behavior that involves the generation of a loud noise by a number of similar agents (e.g., animals of the same sex), simultaneously increasing the likelihood of other agents' hearing and then joining and strengthening the group's lekking.
• Distinctive calls can be used to help find lost agents or to indicate that an agent is lost.
• Specific assembly calls by a single agent can also muster a group of agents that is widely dispersed.
9.3.6 Performance

To effectively evaluate societal system performance, specific metrics must be introduced. One useful metric is speedup (S[i, j]), a measure of the performance of a team of N robots relative to N times the performance of a single robot. Formally, the speedup for a team of i robots carrying out j task actions is

$$S[i,j] = \frac{P[1,j]}{i \cdot P[i,j]} \tag{9.1}$$

where P[i, j] is a performance measure. P[i, j] can be measured in many different ways, depending on what is important to the task. It could be the total time taken to complete a task, the total length of travel for the robots, the energy used during task achievement, or various combinations of these or other metrics (Balch and Arkin 1994). Speedup results can be categorized into sublinear performance (S[i, j] < 1), where multiples of a single agent perform better than a team; superlinear performance (S[i, j] > 1), where a team performs better; and linear performance (S[i, j] = 1), a break-even point where the overall performance is comparable (Mataric 1992c).
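The speedup categories can be made concrete with a few lines of code. This is a minimal sketch that assumes the speedup takes the form S[i, j] = P[1, j] / (i · P[i, j]) and that P is a cost-type measure such as completion time, so that smaller values are better. The function names and the timing numbers are hypothetical.

```python
def speedup(p_single, p_team, team_size):
    """S[i, j] = P[1, j] / (i * P[i, j]), with P a cost such as completion time."""
    if team_size < 1 or p_single <= 0 or p_team <= 0:
        raise ValueError("team_size must be >= 1 and costs must be positive")
    return p_single / (team_size * p_team)


def classify(s, tol=1e-9):
    """Label a speedup value using the three categories from the text."""
    if s > 1.0 + tol:
        return "superlinear"
    if s < 1.0 - tol:
        return "sublinear"
    return "linear"


if __name__ == "__main__":
    single_robot_time = 120.0  # hypothetical P[1, j]
    for team_time in (25.0, 30.0, 40.0):  # hypothetical P[4, j] values
        s = speedup(single_robot_time, team_time, team_size=4)
        print(team_time, round(s, 3), classify(s))
```

With a single-robot time of 120, a four-robot team breaks even at a team time of 30; anything faster is superlinear and anything slower is sublinear.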
9.4 WHAT MAKES A ROBOTIC TEAM?

The issue of what makes a robotic team influences how designers of robotic systems and societies make intelligent decisions regarding what organization, communication, behavioral strategies, and the like are appropriate for a particular task environment. We now examine some ways in which robot teams can be structured and in so doing help define the design space for these societies.

Early researchers in Tsukuba, Japan (Premvuti and Yuta 1990; Yuta 1993), categorized several important aspects regarding the organization of multirobot teams, delineating each society according to

• Active or non-active cooperation: Robots either share or do not share a common goal.
• Level of independence: Control is either distributed or centralized (the robots' decisions are made either locally or by some external global agent) or some combination of both.
• Types of communication: Communication is either
  • explicit (where a signal is intentionally shared between two or more robots) or
  • implicit (where information is shared by observation of other agents' actions).

These first steps toward a taxonomy as well as a proliferation of multiagent robotics research led a Canadian research group (Dudek et al. 1993) to propose a more complete taxonomy capable of categorizing the variety of multiagent robotic systems being created by laboratories around the world. It characterizes teaming along these lines:
• Team size: Refers to the number of robots and consists of the following subclasses: alone (one robot), pair (two robots), limited group (a relatively small number of robots given the magnitude of the task), or infinite group (for all practical purposes an infinite number of robots).
• Communication range: Refers to each robot's ability to communicate directly with other team members and consists of the following subclasses: none (no direct communication), near (only robots within a short distance can be communicated with directly), and infinite (no limit to the robots' direct communication capabilities).
• Communication topology: Refers to the pathways by which communication can occur. The subclasses are broadcast (all information is sent and received by all robots within range), addressed (direct messaging is allowed on a named basis), tree (only hierarchical communication is permitted), and graph (arbitrary communication pathways can be established).
• Communication bandwidth: Refers to the amount of communication available. The subclasses are high (communication is for all practical purposes free), motion-related (motion and communication costs are approximately the same), low (communication costs are very high), and zero (no communication is available).
• Team reconfigurability: Refers to the flexibility regarding the structure and organization of the team, with subclasses static (no changes are permitted), communication coordinated (robots in communication with each other can reorganize), and dynamic (arbitrary reorganization is permitted).
• Team unit processing ability: Refers to the underlying computational model used. Subclasses include non-linear summation units, finite state automata, pushdown automata, and Turing machine equivalent.
• Team composition: Refers to the composition of the agents themselves. The subclasses are homogeneous (all the same) and heterogeneous (more than one type).

Cao et al. (1995) at UCLA's Commotion Lab made another attempt at capturing the design space of multirobot systems. It describes four principal research axes:

• Architecture: Whether the system's control is centralized or decentralized.
• Differentiation of agents: Whether the constituent agents' structure and control systems are identical (homogeneous) or different (heterogeneous).
• Communication structures:
  • Via environment: for example, a trail left by the robot
  • Via sensing: by observing other robots' actions
  • Via communication: by intentional signaling
• Models of other agents' intentions, capabilities, states, or beliefs: This aspect incorporates ideas from the distributed AI community.

It is probably premature to assume that any one of these categorizations or taxonomies can adequately express the wide range of robotic team possibilities. In the remainder of this chapter, we will focus instead on social organization and structure, interrobot communication, distributed perception, and societal learning, concluding with a case study of a successful application. As in previous chapters, despite the large body of simulation studies, we concentrate on those systems tested on actual robotic hardware.
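The taxonomy of Dudek et al. (1993) described earlier in this section maps naturally onto a record with one enumerated field per axis. The sketch below is only one possible encoding: the class, field, and member names are invented here, and the example team is hypothetical rather than a classification of any system discussed in this chapter.

```python
from dataclasses import dataclass
from enum import Enum

# One enumeration per axis; member names paraphrase the subclasses in the text.
TeamSize = Enum("TeamSize", "ALONE PAIR LIMITED_GROUP INFINITE_GROUP")
CommRange = Enum("CommRange", "NONE NEAR INFINITE")
CommTopology = Enum("CommTopology", "BROADCAST ADDRESSED TREE GRAPH")
CommBandwidth = Enum("CommBandwidth", "HIGH MOTION_RELATED LOW ZERO")
Reconfigurability = Enum("Reconfigurability", "STATIC COMM_COORDINATED DYNAMIC")
Processing = Enum("Processing", "NONLINEAR_SUMMATION FINITE_STATE PUSHDOWN TURING")
Composition = Enum("Composition", "HOMOGENEOUS HETEROGENEOUS")


@dataclass(frozen=True)
class TeamDescriptor:
    size: TeamSize
    comm_range: CommRange
    comm_topology: CommTopology
    comm_bandwidth: CommBandwidth
    reconfigurability: Reconfigurability
    processing: Processing
    composition: Composition

    def can_communicate_directly(self):
        """Direct messaging needs both a nonzero range and nonzero bandwidth."""
        return (self.comm_range is not CommRange.NONE
                and self.comm_bandwidth is not CommBandwidth.ZERO)


# Hypothetical team used only to exercise the record type.
example_team = TeamDescriptor(
    size=TeamSize.LIMITED_GROUP,
    comm_range=CommRange.NEAR,
    comm_topology=CommTopology.BROADCAST,
    comm_bandwidth=CommBandwidth.LOW,
    reconfigurability=Reconfigurability.STATIC,
    processing=Processing.FINITE_STATE,
    composition=Composition.HOMOGENEOUS,
)

if __name__ == "__main__":
    print(example_team.can_communicate_directly())
```

A frozen dataclass is used so that a descriptor can serve as a dictionary key when tabulating systems by category; that choice is a convenience, not part of the taxonomy.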
9.5 SOCIAL ORGANIZATION AND STRUCTURE

The behavioral architecture for a robotic society's constituent agents is only one of many commitments made during team design. The permissible communication protocols between team members and the societal structure (homogeneous or heterogeneous agents) are also extremely important. We now look at a range of architectural strategies for robotic societies. Often the issues discussed within these particular systems transcend the individual agents' behavioral architecture. The systems described are only representative of the field and do not constitute a complete survey by any means.
9.5.1 The Nerd Herd

An intellectual descendant of Brooks, Mataric has expanded subsumption-style architectures for applications of robotic teams (Mataric 1994a). We encountered this multiagent approach in the context of a subsumption-based foraging system in section 4.3.4. A broad range of basic social behaviors has been specified, including

• Homing: Each agent strives to move to a common home base.
• Aggregation: Agents try to gather while maintaining a specified separation.
• Dispersion: Agents cover a large area, establishing and maintaining a minimum separation between robots.
• Following: Robots follow one after the other.
• Safe wandering: Robots move around while avoiding collisions with obstacles and each other.

As is standard for subsumption, a rule-based encoding is used. For example:
Aggregate:
  If an agent is outside the aggregation distance
    turn toward the aggregation centroid and go.
  Else
    stop.

Similar simple rules are constructed for the other behaviors.

Figure 9.3 The Nerd Herd. (Photograph courtesy of M. Mataric.)

Two different coordination mechanisms are used: direct combination, which is a vector summation process, and temporal combination, which sequences through a series of behavioral states. Perceptual information is encoded as a series of predicates (e.g., at-home?, have-puck?, crowded?, behind-kin?, sense-puck?) used to encode the sensory data needed to activate the relevant behaviors. The system has been evaluated both in simulation and on a set of up to twenty small mobile robots, the so-called Nerd Herd (figure 9.3).

The basic behaviors, described earlier, can be combined to yield more complex social interactions, including

• flocking, consisting of safe wandering, aggregation, and dispersion.
• surrounding, consisting of safe wandering, following, and aggregation.
• herding, consisting of safe wandering, surrounding, and flocking.
• foraging, consisting of safe wandering, dispersion, following, homing, and flocking.

Note that in herding, for example, composite behaviors such as surrounding are also used as building blocks.

The contributions of this work lie not in architectural advances but rather in the study of new rule-based behaviors for multiple physically embodied agents capable of interacting with each other. This social interaction can lead to physical interference with each other's goals, complicating overall societal task completion.

9.5.2 Alliance Architecture
Another offshoot of the subsumption approach is the Alliance architecture, which includes special consideration for heterogeneous teams of robots (Parker 1994, 1995). Alliance varies significantly from subsumption in its addition of behavior sets and a motivational system. Behavior sets enable different groups of low-level behaviors to be active together or to hibernate, permitting a reconfigurability atypical of subsumption-style architectures. Motivational behaviors enable or disable these behavior sets. They operate by accepting, in addition to the normal inputs from sensors and inhibition from other behaviors commonplace in subsumption architectures, information from interrobot communication and the existing agent's internal motivational state. Internal motivation allows the robot to respond effectively when trapped by permitting it to become impatient or by allowing it to acquiesce (give up) on a task if it is overly difficult or unachievable. To some extent Alliance can be viewed as adding a layer above the subsumption architecture that embodies these new capabilities (figure 9.4).

Figure 9.4 The Alliance architecture.

The direct input of communication signals from other robots into an agent's active behaviors facilitates cooperation between agents. Explicit models of interrobot communication provide predicates supporting information transfer between two robots regarding a specific task over a given time period.

Impatience relates to a robot's waiting for the completion of a task by another robot that is a prerequisite for the impatient robot's next action. In Alliance, it is implemented using an impatience rate function and a binary impatience_reset function. These functions control the motivational variable representing impatience within the robot. Acquiescence is similar but determines when to change behavior in deference to another robot. Each robot's (r_i) overall motivation for a behavioral set a_ij is computed by the following equations:

$$m_{ij}(0) = 0,$$

and

$$m_{ij}(t) = [m_{ij}(t-1) + \mathit{impatience}_{ij}(t)] \times \mathit{sensory\_feedback}_{ij}(t) \times \mathit{activity\_suppression}_{ij}(t) \times \mathit{impatience\_reset}_{ij}(t) \times \mathit{acquiescence}_{ij}(t),$$

where impatience_ij(t) is the impatience rate function that determines how quickly the robot becomes impatient; sensory_feedback_ij(t) is a binary predicate that indicates whether the preconditions for the behavioral set are satisfied or not; activity_suppression_ij(t) is a binary predicate indicating whether or not another behavioral set a_ik, j ≠ k, is active at time t; impatience_reset_ij(t) is a binary predicate that is 0 when another robot is making progress on the task that the robot is waiting on, and otherwise 1; and acquiescence_ij(t) is a binary predicate that determines whether to give up on a task or not. Thus the motivation for a behavioral set will continue to grow unless sensor data indicates that it is not needed, another competing behavior is active, another robot has taken over the task, or the robot gives up on the task. When the motivation value crosses an arbitrarily predefined threshold, then behavioral set a_ij becomes active in robot r_i. The robot then concurrently and periodically broadcasts to all other robots the fact that a_ij is active.

Alliance has been used for a wide range of mission scenarios, as table 9.2 shows. Figure 9.5 shows two snapshots of the hazardous waste cleanup mission being conducted by three R-2 robots.

Table 9.2 Example tasks for Alliance

Task                      Robots      Behavioral sets
Box pushing               Genghis     push, go-home
                          R-2         push-left, push-right
Hazardous waste cleanup   3 R-2       find-locations-methodically, find-locations-wander, move-spill, report-progress
Janitorial service        Simulation  empty-garbage, dust-furniture, clean-floor
Bounding overwatch        Simulation  join-group, emerge-leader, follow-leader, lead-to-waypoint, overwatch
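The motivation update described in this section can be sketched in a few lines. The sketch assumes the product form m(t) = [m(t-1) + impatience] × sensory_feedback × activity_suppression × impatience_reset × acquiescence, a constant impatience rate, and the convention that every binary gate is 1 when motivation may keep growing and 0 when it should be reset. The function names, the gate ordering, and the numbers are illustrative and are not taken from Parker's implementation.

```python
def update_motivation(m_prev, impatience, sensory_feedback,
                      activity_suppression, impatience_reset, acquiescence):
    """One step of m_ij(t); any gate equal to 0 resets the motivation to 0."""
    return ((m_prev + impatience) * sensory_feedback * activity_suppression
            * impatience_reset * acquiescence)


def first_activation(gate_trace, impatience_rate, threshold):
    """Return the first time step at which motivation reaches the threshold, else None."""
    m = 0.0  # m_ij(0) = 0
    for t, gates in enumerate(gate_trace, start=1):
        m = update_motivation(m, impatience_rate, *gates)
        if m >= threshold:
            return t
    return None


if __name__ == "__main__":
    go = (1, 1, 1, 1)             # all four gates open
    teammate_busy = (1, 1, 0, 1)  # impatience_reset = 0: a teammate is making progress
    print(first_activation([go] * 9, 1.0, 5.0))
    print(first_activation([go] * 3 + [teammate_busy] + [go] * 5, 1.0, 5.0))
```

In the second trace a single report of progress from a teammate at step 4 zeroes the accumulated motivation, so the behavior set activates at step 9 instead of step 5.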
Figure 9.5 In a mock-up mission, a team of robots (A) retrieve and (B) deliver spill objects (the dark pucks) to the disposal area (the square region in the foreground). (Photographs courtesy of Lynne Parker.)

9.5.3 Stagnation Behaviors

Kube and Zhang (1994) of the University of Alberta have studied a common problem in multirobot tasks: avoiding stagnation during task completion. Stagnation occurs when team members are not cooperating effectively with each other. Alliance addresses this problem through motivational variables such as acquiescence and impatience, relying to a large degree on interrobot broadcast communication. An alternative approach adds a new stagnation behavior to the overall architecture consisting of one or more specific strategies used to overcome the particular difficulty confronting the team. In box pushing, for example, stagnation may result from individual agents' pushing in opposite directions, effectively canceling each other's forces.

Figure 9.6 depicts a three-behavior arbitration-based architecture for a box-pushing task. Stagnation is defined in this context as when a robot is in contact with the box, but the box is not moving. To handle this potential event, each agent's stagnation behavior is composed of several strategic behaviors including realignment, which changes the direction in which the robot is pushing, and repositioning, which moves the robot to a different random location along the box's perimeter. These stagnation strategies can be assigned priorities according to the length of time the stagnation condition has persisted. (In this case, realignment is attempted before repositioning.)

Figure 9.6 Behavioral architecture incorporating stagnation behavior.

In contrast to Alliance, no explicit communication is required between robots, nor knowledge of the other agent's intentions, to eliminate the stagnation condition. The behavioral control architecture has been successfully verified in experiments with teams of small robots (figure 9.7).

9.5.4 Societal Agents
Multiagent schema-based robotic architectures have also been developed and fielded. In the Societal Agent Theory (MacKenzie 1996), a single representational syntax is used to express not only primitive sensorimotor behaviors and assemblages but also teams of physical agents. This approach, inspired by Minsky's Society of Mind (Minsky 1986), makes no distinction between the methods used to deploy intra- and interagent behaviors. A society consists of a collection of behaving agents that may or may not be spatially distributed (i.e., may or may not have multiple physical embodiments). A team of robots can thus be viewed as an assemblage itself.

Figure 9.7 A team of small robots pushing a box.

Earlier work at Georgia Tech (Arkin 1992b) provided conceptual proof that robotic team cooperation was feasible in the absence of any explicit robot-to-robot communication. This was shown first for foraging tasks, then extended to consuming and grazing scenarios (Balch and Arkin 1994). A straightforward extension of schema-based reactive control (section 4.4) was used, but there was no direct modeling of the robot society as an entity itself.

The subsequent Societal Agent Theory provides a new means for expressing both homogeneous and heterogeneous teams. This in turn facilitates the design of robot teams and has been incorporated into a multiagent design and specification system called MissionLab (section 9.9.2) (MacKenzie, Cameron, and Arkin 1995).

Multiagent behaviors such as formation control, which allows cooperative motion of robots relative to each other (Balch and Arkin 1995), and team teleautonomy, where a single human operator can effectively influence the behavior of an entire team of robots (Arkin and Ali 1994), have been developed within this framework (sections 9.9.1 through 9.9.3). These behaviors have been tested in simulation, on Denning mobile robots (figure 9.8), and on military vehicles as part of the Defense Advanced Research Projects Agency's (DARPA's) Unmanned Ground Vehicle (UGV) Demo II Program (section 9.9).

Figure 9.8 Two Denning robots moving across the laboratory, initially starting in line formation (side by side), then transitioning to column formation (one following the other); note the change in orientation relative to the stripes on the floor. (A) Robots in line formation.
Figure 9.8 (continued) (B) Robots in column formation.

9.5.5 Army Ant Project

At Virginia Tech, Johnson and Bay (1995) have focused on cooperation by teams of robots in payload transportation. A controller (figure 9.9) consisting of four behaviors using a vector summation coordination mechanism has been developed for each of the agents to direct a transport task that involves lifting a pallet containing material and moving it to a goal location. The orientation behavior strives to keep the pallet level, independent of its height. A force behavior coordinates the forces exerted by the individual robot with those of other members of its team using interagent broadcast communication to distribute the load as evenly as possible. The pallet-contact behavior ensures that the robot maintains contact with the pallet as it moves, while the height behavior determines the level at which the pallet should be held. Although tested only in simulation to date, this research provides compelling results for cooperation in lifting, transporting, and lowering payloads over rough terrain.
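Vector summation, the coordination mechanism named above, simply adds the gain-weighted output vectors of all active behaviors. The sketch below illustrates the mechanism only: the four behavior names come from the text, but the gains, the three-component output vectors, and their interpretation are placeholders invented for this example and say nothing about Johnson and Bay's actual controller.

```python
def vector_sum(weighted_outputs):
    """Sum gain-scaled behavior vectors component by component."""
    dims = len(weighted_outputs[0][1])
    return tuple(sum(gain * vec[k] for gain, vec in weighted_outputs)
                 for k in range(dims))


if __name__ == "__main__":
    # (gain, output vector) per behavior; all numbers are hypothetical.
    outputs = [
        (1.0, (0.0, 0.0, 0.5)),    # orientation
        (0.5, (2.0, 0.0, 0.0)),    # force
        (1.0, (0.0, 1.0, 0.0)),    # pallet-contact
        (1.0, (0.0, 0.0, -0.25)),  # height
    ]
    print(vector_sum(outputs))
```

Because every behavior contributes on every cycle, no single behavior has to know about the others; changing a gain shifts the balance between, say, keeping contact and correcting height without rewriting any behavior.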
Figure 9.9 Behavioral controller for payload transport.

9.6 INTERROBOT COMMUNICATION

Communication between robots is an extremely important consideration in the design of a multirobot society. In this section we will consider the following issues:

• Whether communication is needed at all
• Over what range communication should be permitted
• What the information content should be
• What guarantees can be made regarding communication and performance

9.6.1 The Need for Communication
A fundamental issue in effecting cooperative behavior in a team of robots is the level of appropriate interagent communication. Communication is not free and can be undependable. It can occur explicitly, through direct channels, or indirectly, through the observation of behavioral displays or changes left in the environment (e.g., trail marking). In hostile environments, electronic countermeasures may be in effect, jamming information flow between agents or introducing deceit into the information stream.

So what should a robot listen to and believe if it is to work together with other robots? How should a designer of a multiagent robot system incorporate communication into the system? Identifying the major roles of communication in robot teams should help in the design process. They include (Fukuda and Sekiyama 1994):

• The synchronization of action: Certain tasks require certain actions to be performed in a particular sequence or simultaneously. Communication between agents provides the ability to coordinate these activities.
• Information exchange: Different agents have varying perspectives on the world based on their spatial position or knowledge of past events. It is often useful to share this information.
• Negotiations: Decisions may need to be made regarding who should do what. This avoids the duplication of effort, yielding a more efficient society. The communication and sharing of goals and intentions can lead to productive changes in behavior based on other agents' projected actions.

Is communication important for cooperation? Werner and Dyer (1990) have studied the evolution of communication in synthetic agents. They demonstrated that directional mating signals can evolve in these systems given the presence of societal necessity. MacLennan (1991) has also studied this problem and has similarly concluded that communication can evolve in a society of simple robotic agents. In his studies, the societies in which communication evolved were 84 percent fitter than those in which communication was suppressed. In simulation research conducted at the Environmental Research Institute of Michigan, Franklin and Harmon (1987) used a rule-based cooperative multiagent system to study the role of communication, cooperation, and inference and how these relationships lead to specialized categories of cooperative systems. Regarding communication, they recognized that information need not be explicitly requested by a receiver to be potentially useful to a multiagent system as a whole.

All of these studies argue for the utility of some level of communication in robotic teams. Nonetheless, Arkin (1992b) has established that for certain classes of tasks, explicit communication is not a prerequisite for cooperation.
9.6.2 Communication Range

A tacit assumption is often made that louder is better; that is, that the wider a robot's communication range, the better its performance will be. This is not necessarily the case. Agah and Bekey (1995a) studied calls for help in the context of a multirobot object-carrying task at the University of Southern California. In their tropism system cognitive architecture, a set of attractive and aversive actions is selected based on an agent's current sensory input. From this set, a single action is chosen based on a weighted roulette wheel strategy that incorporates a degree of nondeterminism into the robot's response. In one example, two small robots have been constructed to carry out the task of pipe transport (figure 9.10). In a simulated cooperative foraging task using homogeneous robots, it was demonstrated that societal performance can decrease substantially with increases in a robot's communication radius. The trade-off is that too weak a call for help prevents an agent from being heard, but too strong a call brings the entire colony together and prevents effective exploration of the environment. Loudest is indeed not best for all tasks.

Figure 9.10 Two small robots cooperatively carrying a pipe. (Photograph courtesy of A. Agah and G. Bekey.)

A probabilistic approach to determining the optimal communication range for multirobot teams under different conditions appears in Yoshida et al. 1995. This range is determined by minimizing the communication delay time between robots, assuming they are moving randomly. If more robots send information than the receiving agent can handle, resulting in complete blockage of communication flow, the optimal range X_optimal (represented as the average number of robots within the output range) is computed as follows:

$$X_{optimal} = \frac{\sqrt[c]{c!}}{p} \tag{9.2}$$

where c is the information acquisition capacity, an integer representing the upper limit on the number of robots that can be received at any one time without loss of information, and p is the probability of information output for each robot.
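A short calculation shows how the optimal range responds to its two parameters. The sketch assumes that equation 9.2 has the form X_optimal = (c!)^(1/c) / p, with c a positive integer and p a probability in (0, 1]. The function name and the sample values are hypothetical.

```python
from math import factorial


def optimal_range(c, p):
    """X_optimal = (c!)^(1/c) / p, as an average number of robots in output range."""
    if c < 1 or not 0.0 < p <= 1.0:
        raise ValueError("c must be a positive integer and p must lie in (0, 1]")
    return factorial(c) ** (1.0 / c) / p


if __name__ == "__main__":
    for c, p in [(1, 0.5), (2, 0.1), (4, 0.1)]:  # hypothetical settings
        print(c, p, round(optimal_range(c, p), 3))
```

Under this form, a receiver that can absorb more simultaneous messages (larger c) tolerates a wider range, while robots that transmit more often (larger p) call for a narrower one.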
9.6.3 Communication Content But what should be said betweenrobots? Yanco and Stein ( 1993), at MIT , studiedcommunicationspecificallyin the contextof robotic systems.In their research , a task is defined that requires communicationto coordinatetwo robots, Ernie (the follower) and Bert (the leader) (figure 9.11). The robots have an extremely limited vocabulary(two words) that self-organizesover time, improving the performance . The targettaskinvolvesthe follower robot' s ' mimicking the leaders behaviorby either spinningor moving forward. Both
383
SocialBehavior
Figure9.11 Ernie andBert. ( photographcourtesyof Holly Yanco.)
TASKSIGNAL REINFORCEMENt SIGNAL
LEADER ROBOT
ACTION
ROBOT LANGUAGE SIGNAL FOLLOWER ROBOT
ACTION
Figure 9.12 Learningcoordinationvia communication.
robotsreceivereinforcementfrom a humaninstructor. Figure 9.12 depictsthe relationshipsbetweenthe robotsandinstructor-providedreinforcement. At GeorgiaTech, researchersstudiedcommunication's impact on the performance of multiagent.robotic teams. Initial studies(Arkin 1992b) indicated that robots could cooperatein foraging tasksevenin the absenceof explicit communication.Cooperationin this contextis evidencedasthe phenomenaof recruitment, the sharedeffort of manyrobotsto perform a task. Holldobler and " Wilson ( 199O , p. 265) havedefinedrecruitmentas communicationthat brings nestmatesto somepoint in spacewherework is required." Although communication mechanismscanenhancethe speedat which multiple agentsconverge at a commonwork location, recruitment-like behaviorin the absenceof direct
communication between the agents has also been demonstrated. This result argues that although communication may be useful, it is not necessary for certain types of tasks. For example, in a foraging task, each of the agents, operating independently, can discover a common attractor. As discovery occurs, more agents acquire the same object and work together to transport it to a common goal. As they converge on the attracting object, the speed at which it is retrieved increases because of the larger number of actors transporting it, yielding a cooperative effect. In most cases, objects too large for movement by a single agent can still be recovered successfully after multiple agents have arrived at the work site.

But what is gained if communication ability is added? Embarking from the minimalist approach (i.e., what can be accomplished in the absence of any communication), additional studies were performed to quantify performance improvements based on adding explicit communication to foraging, consuming, and grazing tasks (section 9.1). Two new classes of communication between agents were introduced (Balch and Arkin 1994):

• State communication: A single bit of information is transmitted, indicating which state(s) the transmitting agent is in. Figure 9.13 shows a partitioning of the foraging FSA, where transmission of a 0 indicates that the robot is wandering, and transmission of a 1 indicates that it is goal-directed: either acquiring the detected attractor or returning it to the home base. Instead of heading directly toward the attractor object, the robot moves toward the transmitting agent, following it until within detection range of the object itself. This type of communication is analogous to display behavior in animals.
• Goal communication: Going one step further, the location of the detected attractor is transmitted to the attending agent. Here the robot can move directly to the goal object without following the other agent.
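The difference between the two classes can be sketched as a choice of movement target for the receiving robot. This is a minimal illustration, not the Balch and Arkin implementation: the `Message` type, the `choose_target` function, and the use of 2-D tuples for positions are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Message:
    """Hypothetical message format covering both communication classes."""
    sender_position: Point                  # assumed to be observable by the receiver
    state_bit: int                          # 0 = wandering, 1 = goal-directed
    goal_position: Optional[Point] = None   # filled in only for goal communication


def choose_target(msg: Optional[Message],
                  detected_attractor: Optional[Point]) -> Optional[Point]:
    """Return the point to head toward, or None to keep wandering."""
    if detected_attractor is not None:
        # The robot's own perception takes over once the object is in range.
        return detected_attractor
    if msg is None or msg.state_bit == 0:
        # No communication, or the sender is only wandering.
        return None
    if msg.goal_position is not None:
        # Goal communication: move directly to the attractor.
        return msg.goal_position
    # State communication: follow the transmitting agent.
    return msg.sender_position
```

With state communication the receiver only learns that following the sender is worthwhile; with goal communication it can skip the following step entirely.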
Some examples from simulation studies qualitatively show the variation in performance with these methods: figure 9.14 for foraging behavior shows in a Rorschach-like display the reduction of effort in accomplishing the society's task as more communication is introduced; figure 9.15, showing the consuming task, depicts a marked reduction for state communication but no noticeable difference when state communication is replaced by goal. These results were ported to Denning mobile robots for further experimentation (figures 9.16 and 9.17). Quantitative analysis of extensive simulation studies yielded the following conclusions regarding communication content (Balch and Arkin 1994):
Figure 9.13 State communication during foraging task.
• Communication improves performance significantly in tasks involving little implicit communication (foraging and consuming). In the grazing task, robots leave evidence of their passage, since the places they visit are modified. This fact is observable by the other robots. These types of communication are referred to as implicit, since they require no deliberate act of transmission.
• Communication is not essential in tasks that include implicit communication.
• More complex communication strategies (goal) offer little benefit over basic (state) communication for these tasks, confirming that display behavior is indeed a rich communication method.
9.6.4 Guaranteeing Communication
Formal theoretical methods have been applied in a limited way to ensure the quality of communication in multiagent robotic systems. At the University of California at Riverside, Wang (1995) has looked at distributed mutual exclusion techniques for coordinating multirobot systems. In Wang's research, no centralized clock or shared memory is used between agents. Only limited communication in immediate neighborhoods is required to provide deadlock
Figure 9.14 Typical run for foraging task with (A) no, (B) state, and (C) goal communication. The figures show the paths two robots took in retrieving seven attractors. Note that moving from (A) through (C) the society becomes progressively more goal directed. The simulations required 5,145, 4,470, and 3,495 steps, respectively, to complete. Additional communication consistently improves performance.
Figure 9.15 The consuming task with (A) no, (B) state, and (C) goal communication. The simulations required 9,200, 8,340, and 8,355 steps, respectively, to complete. Note that state and goal communication performance are approximately equal in this task.
Figure 9.16 (A) Two Denning robots, Ren and Stimpy, demonstrate the foraging task, in this case without explicit communication. (B) Ren tags an attractor. (C) Stimpy tags an attractor. (D) Ren and Stimpy deliver the attractors to home base.
Figure 9.17 (continued) (C) A reconstruction of the path taken (from previous figure) in the foraging demonstration. Note the cooperation in retrieving the last object in figure (C). (Diagram labels: home base, obstacle, attractor.)
detection (a stagnation condition) or to coordinate multiple agents competing for a single resource (e.g., passing through a narrow corridor or crossing at an intersection). Robots use "signboard" communication (Wang 1994), a specific low-bandwidth message protocol displayed by a device on each robot and perceivable only by nearby robots. Although the results have yet to be fielded on actual robots, they can be proven correct within their formal framework.

Lin and Hsu (1995) at the National Taiwan University developed a deadlock-free cooperation protocol for a multiagent object-sorting task. The protocol uses broadcast communication to sort agent priorities after a deadlock condition has been detected. Specialized behavioral strategies have been developed for helping other robotic agents, for performing load-balancing among the robots, and for selecting partners for the task of moving a set of randomly placed objects to specific goal locations. An object, O_i, requires i robots to move it to its goal location. Deadlock can easily occur when the robots do not help each other. This is a variation on the foraging task and can be represented as an FSA, as in figure 9.18.
Figure 9.18 Simplified FSA for object sorting. A help request is emitted when the robot enters the wait state after detecting an object too large to move by itself.
The robots are capable of broadcasting requests for help and sending out point-to-point offers to help in response to particular requests. In one strategy, deadlock is prevented by establishing priorities for the objects to be moved and helping to move the one with the highest priority first. Priorities are based on the object's distance to its final destination. If there is still a conflict, x-coordinates are compared, and if still needed, y-coordinates are used for tie breaking. Since no two objects can possess the same x- and y-coordinates, deadlock cannot occur as long as sufficient agents are available to help move the largest possible object. More complex strategies involve detecting deadlock conditions after they occur and then remedying the situation or generating a feasible sequence of actions that prevents deadlock from occurring at all while simultaneously balancing the workload among the available robots.
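The priority rule in the first strategy amounts to a lexicographic comparison, which can be sketched as a sort key. This is an illustration only: the text does not say whether the nearer or the farther object ranks higher, so the sketch assumes that a smaller distance (and then a smaller coordinate) means higher priority. The `SortObject` type and function names are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class SortObject(NamedTuple):
    """Hypothetical record for an object awaiting transport."""
    name: str
    x: float
    y: float
    goal_x: float
    goal_y: float


def priority_key(obj: SortObject) -> Tuple[float, float, float]:
    """Distance to goal first, then x, then y as tie-breakers."""
    distance = math.hypot(obj.goal_x - obj.x, obj.goal_y - obj.y)
    return (distance, obj.x, obj.y)


def highest_priority(objects: List[SortObject]) -> SortObject:
    """Assumption: the smallest key is the highest priority."""
    return min(objects, key=priority_key)
```

Because no two objects share both coordinates, the key is unique for every object, so all robots that evaluate it arrive at the same ordering without further negotiation.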
9.7 DISTRIBUTED PERCEPTION

Our discussions of communication have focused on how information can be shared among robotic agents. Further issues involve how perceptual activity
can be coordinated among a team of robots, what sensory or perceptual information is worth sharing, and how a team of robots, as distinct from an individual robot, should view the world.

One important perceptual task, relating to the notion of perceptual classes (section 7.4.3), is that of distinguishing team members from other environmental features. Kin recognition is the term used to refer to this particular perceptual ability (Mataric 1993b). In biology, specialized neural circuitry often accomplishes this task. Prosopagnosia, the inability to recognize faces, occurs in humans when there are lesions, typically due to strokes, on the underside of the occipital lobes, providing strong evidence that "some neural network within this region is specialized for the rapid and reliable recognition of human faces" (Geschwind 1979, p. 112). In multiagent robotic systems, kin recognition may or may not be useful, depending on the circumstances. A robotic team that does not have this ability often has only two distinct perceptual classes: obstacles and target objects. Team member robots are indistinguishable from obstacles. Kin recognition, however, when implemented, provides information regarding other robots' position and, if needed, their identity. Thus behaviors can be developed that allow the robots to interact more effectively and minimize interference with each other (Mataric 1992c). This information can be provided in various ways: for example, by transmitting the positional information directly (Mataric 1992c) or instead using specific perceptual cues, such as making the robots a unique color relative to their environment (e.g., green in the case of the 1993 AAAI mobile robot competition [Balch et al. 1995]).

The information shared by perception between agents can go far beyond simply recognizing each other; an agent's intentions can also be discerned. Cooperation by observation refers to this sharing of perceived information in the absence of explicit communication.
More specifically, cooperation by observation involves observing another agent's action, then choosing appropriate actions based on the observed action and the current task situation (adapted from Kuniyoshi et al. 1994). This is closely related to the notion of plan recognition in the distributed AI community, in which an agent's intentions are inferred by observing its actions (Huber and Durfee 1995). In work at Japan's Electrotechnical Laboratory, Kuniyoshi (1995), using a team of small robots equipped with stereo vision, has defined several perceptual functions to accomplish the task of figuring out what another robot is doing:
• Find: Directs the observing agent's attention to a new target.
• Track: Follows the target agent as it moves through the world.
• Anticipate: Recognizes potential collisions or other hazardous situations and prevents them.
• Event Detection: Recognizes certain preconditions to synchronous action with other robots. This may involve spatial coordination or a team member's release of an object. The event detection algorithms must be designed in a task-specific manner to provide the information necessary for the job at hand (section 7.5.1).

A number of kinds of applications require multiagent perception, including

• Convoying: A common task for teams of robots traveling from one location to another is follow-the-leader or convoying. This task has use in both intelligent vehicle highway systems and military logistical operations. In some cases, a mixed human-robot system may be deployed, with the human driver leading the convoy and the other robots following safely behind. One representative perceptual system, developed at the University of Tennessee for two heterogeneous robots, uses a 10-DOF robotic head performing correlation-based visual tracking tied to a fuzzy logic controller (Marapane, Holder, and Trivedi 1994). Numerous other examples include work at Georgia Tech (Balch and Arkin 1995) that uses GPS sensor data for maintaining column formation in a team of unmanned ground vehicles (section 9.9.1).
• Landmine detection: This task involves coordinated spatial exploration and probing of a geographically bounded region. A wide range of sensors is available for this task, including magnetic, X-ray, acoustic subsoil sensors, and ground-penetrating radar systems. A team of UCLA and U.S. Army researchers (Franklin, Kahng, and Lewis 1995) have proposed a prototype heterogeneous society consisting of ten R-3 robots and a small X-Cell 60 aerial robot.
Clearly, for this mission, it is crucially important that agents share information to prevent their traversing an already-detected mine and also to increase the efficiency and coverage of an area by avoiding redundant search.
• Reconnaissance and surveillance: Teams of robots concerned with monitoring an area for incursion by an intruder must have the ability to coordinate their spatial and perceptual activities. It is not advantageous for robots to spend time looking at the same location while ignoring others. Gage (1992) has defined a variety of coverage methods prescribing the location of robots for surveillance, including
  • blanket coverage, in which each robot takes up a station and remains there to watch for intruders.
  • barrier coverage, in which a static line of robots is created to prevent crossing of the barrier without detection.
  • sweep coverage, in which a team of robots moves through an area attempting to ensure that no enemy or intruder activity is present.
Sensor pointing must be controlled in a manner consistent with the other robots' positions and sensor deployments. Section 9.9.1 describes decision-theoretic methods for accomplishing this.
• Map making: Expanding upon the notion of cooperation by observation, behavior-based navigation using shared maps by a team of robots has been developed in a joint U.S.-Japan effort (Barth and Ishiguro 1994). In contrast to the use of panoramic vision for global localization discussed in section 5.2.2.1, these robots can cooperate by providing to one another information regarding the relative whereabouts of team members. A schema-based control approach is used, with behaviors including avoid-obstacle, avoid-other-robot, group-moment-attraction (drawing the robots toward the center of mass of the other perceived robots to help keep the group together), and object-range-uncertainty-attraction (guiding the robot toward regions of uncertainty to gather more information). Small mobile robots with panoramic vision systems (figure 9.19) are being developed to provide exploration and formation capabilities that realize the results of Barth and Ishiguro's simulations.
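The attraction toward the group's center of mass mentioned in the map-making item can be sketched as a simple motor-schema output. This is not Barth and Ishiguro's formulation: the function name, the constant-magnitude output, and the `gain` parameter are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def group_attraction(own: Point, others: List[Point], gain: float = 1.0) -> Point:
    """Vector of length `gain` pointing from this robot toward the
    center of mass of the other perceived robots."""
    if not others:
        return (0.0, 0.0)  # nothing perceived, no attraction
    cx = sum(x for x, _ in others) / len(others)
    cy = sum(y for _, y in others) / len(others)
    dx, dy = cx - own[0], cy - own[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)  # already at the center of mass
    return (gain * dx / dist, gain * dy / dist)
```

In a schema-based controller this vector would be combined with the outputs of the other behaviors, such as avoid-obstacle, rather than used on its own.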
9.8 SOCIAL LEARNING

Teams of robots offer new opportunities for learning, particularly how to self-organize and become more cooperative over time. Mataric (1994b) defines the basic forms of social learning as imitation or mimicry, in which one agent acquires the ability to repeat or mimic another's behavior, and social facilitation, in which existing behaviors are expressed more effectively as a direct consequence of social interactions. An inherent tension exists between individual and group needs. Agents may be strongly self-interested and have no concern for the society's overall well-being. What ecological pressures can be brought to bear that encourage nongreedy strategies that benefit the society and yet may be detrimental to the individual, and how can social rules be developed that transcend an individual agent's goals?
Figure 9.19 Robot equipped with panoramic vision (A) and a panoramic view of a laboratory (B). (Reprinted with permission from Barth and Ishiguro 1994. © 1994 IEEE.)

9.8.1 Reinforcement Learning
Reinforcement learning (section 8.3), a common strategy used for individual robotic learning, has been applied in multirobot contexts as well. Optimization functions in social robotics typically center on minimizing interference between agents and maximizing the society's reward. Reinforcement can result from an agent's actions directly, from observation of another agent's actions, or from observation of the reinforcement another agent receives (vicarious reinforcement).

Mataric (1994a) conducted experiments in social learning within a foraging context. A team of four robots was equipped with adaptive behaviors for safe wandering, dispersion, resting, and homing. Perception was encoded as a set of predicates: have-puck? at-home? near-intruder? night-time? The robots learned over time to associate the correct perceptual preconditions with the appropriate behavior in this societal context. Both delayed-reinforcement Q-learning (section 8.3) and a progress estimator reinforcement summation algorithm were
tested. The progress estimator approach yielded better results, presumably due to the task's non-Markovian nature (a consequence of the inherent noise in perception and actuation). Subsequent work compared learning using two additional social rules: yielding, in which a robot yields the right-of-way when on one side of an oncoming teammate and continues when on the other, and sharing, in which a robot broadcasts information to other robots. Reinforcement learning using these social rules always improved performance over those methods that used solely greedy strategies.
9.8.2 L-Alliance

Parker (1994) has extended the Alliance architecture described in section 9.5.2 to include learning mechanisms. Learning in L-Alliance involves parametric adjustment and improves team and mission performance, eliminating the need for a human operator to tune behavioral parameter settings. Each team member within L-Alliance maintains statistical data regarding its own past performance as well as that of each of its teammates. The time history is relatively small, typically five previous trials. This small history window permits rapid responsiveness yet allows reasonable predictions to be made regarding future performance requirements.

One learning problem unique to multirobot systems concerns optimizing task distribution over the available robotic agents, which is not unlike load balancing in multiprocessor systems. Indeed, different task allocation strategies based on results from the parallel processor community were tested. Techniques such as trying to accomplish the longest task first (based on descending first fit [Garey and Johnson 1979]) were found to result in terrible performance, however, because multirobot tasks had a high rate of failure during execution. Other allocation strategies using shortest task first or random selection produced better results. The learned parametric values include influence and motivation parameters that affect task selection and impatience and acquiescence values that affect task completion. The metrics used to measure performance were time and energy consumption.

During the initial active learning phase, the robots are maximally patient and minimally acquiescent. In the subsequent adaptive learning phase, the robots start with the parametric values learned during the active learning phase. Ad hoc update equations specific for each of the adjusted parameters have been used to achieve results within 20 percent of optimal for one particular simulated control strategy.
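The short performance history kept for each teammate can be sketched with a fixed-length queue. This is a hedged illustration rather than Parker's code: the `TaskHistory` class is hypothetical, and predicting with the plain mean of the window is an assumption, since the actual update equations are described only as ad hoc.

```python
from collections import deque
from statistics import mean
from typing import Optional


class TaskHistory:
    """Completion times for one (robot, task) pair over the last few trials."""

    def __init__(self, window: int = 5) -> None:
        # A deque with maxlen silently discards the oldest trial when full.
        self._times = deque(maxlen=window)

    def record(self, completion_time: float) -> None:
        self._times.append(completion_time)

    def expected_time(self) -> Optional[float]:
        """Assumed predictor: the mean of the trials still in the window."""
        return mean(self._times) if self._times else None
```

A window of five means one unusually slow trial is forgotten after five further trials, which is the responsiveness the text attributes to the small history.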
Figure 9.20 Tropism-based cognitive architecture (after Agah and Bekey 1997). (Diagram labels: sensors, actuators.)

9.8.3 Tropism System Cognitive Architecture
Figure 9.20 depicts the tropism system cognitive architecture, developed at the University of Southern California and introduced in section 9.6.2. A weighted roulette wheel action-selection mechanism based on each action's strength (its tropism value) arbitrates between whatever tropisms are matched with current sensory inputs. Recall that tropisms represent the robot's "likes and dislikes." This architecture introduces three types of learning (Agah and Bekey 1995b):

• In perceptual learning, a new tropism is created, and the oldest one is removed from the tropism set. The new tropism consists of the four-tuple (σ, ρ, α_random, τ_initial), where σ is the novel sensed entity, ρ is the state, α_random is a random action, and τ_initial is an initial tropism value used by the action-selection mechanism.
• Learning from success resembles Q-learning (section 8.3), gradually increasing the tropism value τ by a fixed increment up to a maximum value, making it more likely to occur the next time the system finds itself in the same situation.
• Learning from failure changes α_random in the four-tuple to a new action when the last randomly generated one has proven unfruitful.

In contrast to L-Alliance, in which each robot maintains statistical data on other robots' performance through communication, here each robot learns independently of the others. Performance improvement was demonstrated in
simulation studies using three different metrics: total number of tasks performed, total energy consumption of the colony, and energy consumed per task completed.

Genetic algorithms (section 8.5) have also been applied within this architecture. These methods strongly resemble classifier systems in their approach, with the tropisms encoded similarly to production rules. Fitness evaluation is based on the number of tasks completed and energy consumption. The major difference between these and other GA approaches lies in the performance's being measured at the societal level, rather than at the individual. In addition to simulation studies, a small set of robots in an actual implementation learned to gather objects and cooperate in carrying them (Agah and Bekey 1997).
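The weighted roulette wheel selection and the three learning rules of section 9.8.3 can be sketched as follows. This is a minimal sketch and not Agah and Bekey's implementation: the `Tropism` record, the function names, the default increment, maximum, and initial values, and the use of a list kept in creation order are all assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tropism:
    entity: str   # sensed entity (sigma)
    state: str    # state (rho)
    action: str   # action (alpha)
    value: float  # tropism value (tau), assumed positive


def select_action(tropisms: List[Tropism], entity: str, state: str,
                  rng: random.Random) -> Optional[Tropism]:
    """Roulette wheel: each matching tropism is chosen with probability
    proportional to its tropism value."""
    matched = [t for t in tropisms if t.entity == entity and t.state == state]
    if not matched:
        return None
    return rng.choices(matched, weights=[t.value for t in matched], k=1)[0]


def learn_from_success(t: Tropism, increment: float = 1.0,
                       maximum: float = 10.0) -> None:
    """Raise the value by a fixed increment, capped at a maximum."""
    t.value = min(t.value + increment, maximum)


def learn_from_failure(t: Tropism, actions: List[str],
                       rng: random.Random) -> None:
    """Replace the unfruitful action with a different random one."""
    alternatives = [a for a in actions if a != t.action]
    if alternatives:
        t.action = rng.choice(alternatives)


def perceptual_learning(tropisms: List[Tropism], entity: str, state: str,
                        actions: List[str], rng: random.Random,
                        initial_value: float = 1.0) -> None:
    """Drop the oldest tropism and add one for the novel entity."""
    if tropisms:
        tropisms.pop(0)  # list is kept in creation order, oldest first
    tropisms.append(Tropism(entity, state, rng.choice(actions), initial_value))
```

Passing the random generator in explicitly keeps a run reproducible, which is convenient when comparing learning settings in simulation.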
9.8.4 Learning by Imitation

Imitation involves first observing another agent's actions (either human or robot), then encoding that action in some internal representation, and finally reproducing the initial action. According to Bakker and Kuniyoshi (1996), observation of an action involves

• being motivated to find a teacher.
• finding a good teacher.
• identifying what needs to be learned from the teacher.
• perceiving the teacher's actions correctly.

Representation of an action involves

• selecting a suitable encoding that matches the observation to the action.
• capturing a particular observation in the chosen representational format.

Reproduction of an action involves

• being motivated to act in response to an observation.
• selecting an action for the current context.
• adapting the action to the current environment.

We have already seen an example of action-imitation in Yanco's work (section 9.6.3), when Ernie the robot learns to coordinate itself with Bert's motor activities. Another example of learning by imitation involves a robot imitating a robotic teacher moving within a maze (Hayes and Demiris 1994). During training, the learning agent observes whether the leader either turns 90 degrees or continues moving forward at a given point within the maze. It then encodes a rule associating the environment with that particular action. Later the agent is
able to determine, via sensing, which action rule is appropriate for its current position within the maze.
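The maze example can be sketched as a lookup table from what the learner senses to what the teacher did there. This is an assumption-laden illustration, not the Hayes and Demiris system: the `MazeImitator` class, the encoding of a percept as a tuple of wall flags, and the default action are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical percept: (wall_left, wall_ahead, wall_right)
Percept = Tuple[bool, bool, bool]


class MazeImitator:
    def __init__(self) -> None:
        self.rules: Dict[Percept, str] = {}

    def observe(self, percept: Percept, teacher_action: str) -> None:
        """Training: associate the sensed environment with the teacher's action."""
        self.rules[percept] = teacher_action

    def act(self, percept: Percept, default: str = "forward") -> str:
        """Later: pick the rule that matches the current sensing, if any."""
        return self.rules.get(percept, default)
```

The table covers the representation and reproduction stages listed above; finding and perceiving the teacher are taken as given.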
9.9 CASE STUDY: UGV DEMO II

DARPA conducted a research program in the mid-1990s focused on providing support for battlefield scouting operations. Behavior-based robotic systems clearly can play a role in this highly uncertain and dynamic domain. The UGV Demo II program employed a team of unmanned ground vehicles as scouts, capable of conducting reconnaissance, surveillance, and target acquisition operations in a coordinated manner. In conventional military operations, motorized scouts typically move in advance of the main force to report on enemy positions and capabilities.

Incorporated on each individual vehicle (HMMWV: High Mobility Multipurpose Wheeled Vehicle; see figure 9.21) is an architecture consisting of a suite of behaviors (figure 9.22). These include

• Stripe: A teleoperation behavior used by the operator to establish intermediate way points for the vehicle, then automatically create a path that the vehicle strives to follow.
• Crosscountry: A path-following behavior that uses GPS data for localization.
• Ranger: A navigational behavior using geometric data derived from both sensor and map data.
• Ganesha: An obstacle avoidance behavior that uses a local map derived from laser range finder data (Langer, Rosenblatt, and Hebert 1994).
• Safety: Obstacle detection and avoidance using stereo vision (Chun et al. 1995).
• Alvinn: A neural network road-following behavior (section 7.6.1).
• Formation: Formation control for multiple vehicles (section 9.9.1).

The DAMN arbiter (section 4.5.5) is used to coordinate these behaviors.

At Demo A, the first of a series of demonstrations, in 1993, a single vehicle showed the capability of road following using Alvinn and Stripe teleoperation (Chun and Jochem 1994). Off-road navigational capabilities were added at Demo B in 1994 using stereo vision for obstacle avoidance (Chun et al. 1995). Since this chapter discusses multiagent systems, we focus on the capabilities developed for Demo C's multivehicle demonstrations, conducted in 1995, including formation control, multiagent mission specification and planning, and team teleautonomy.
Figure 9.21 Unmanned ground vehicles from the DARPA UGV Demo II program: (A) Single HMMWV scout; (B) Entire Demo II HMMWV UGV team. (Photographs courtesy of Lockheed-Martin, Denver, Colorado.)

Figure 9.22 UGV Demo II software architecture (behavioral components). (Diagram labels: behaviors, DAMN arbiter, sensors, shared memory.)
9.9.1 Formation Behaviors

Formation control has significant utility for a wide range of potential applications in the unmanned vehicle community. In UGVs, formation control can be used for military scouting missions and logistical support in convoying. Scout teams employ specific formations for particular tasks. Column formation is usually associated with road-following activities, whereas line formations are used to cross large expanses of open terrain. Robotic behaviors were implemented to accomplish the four primary formations for scout vehicles listed by U.S. Army manuals (1986): diamond, wedge, line, and column (figure 9.23). These formation control behaviors were successfully tested first in simulation studies, then on Denning mobile robots in the Georgia Tech Mobile Robot Laboratory and on two HMMWVs at Demo C of the UGV Demo II program in July 1995 in Denver. Figure 9.24 depicts a sequence of images from the demonstration in which the vehicles started initially in a column formation,
Figure 9.23 Four simulated robots in leader-referenced (A) diamond, (B) wedge, (C) line, and (D) column formations executing a 90-degree turn in an obstacle field.

Figure 9.24 Demo C formation tech demo: Two UGVs traveling in (A) column, (B) wedge, then (C) line formation.
then transited to wedge formation, then moved to line formation, and finally reverted to column formation with the vehicles' initial position reversed. Waypoint navigation and formation maintenance were concurrently active, and the vehicles switched among formations smoothly and autonomously at set GPS-designated points. Differential GPS provided the vehicles' position relative to each other.

The formation behavior itself is comprised of two main components: a perceptual schema, detect-formation-position, and a motor schema, maintain-formation. The perceptual schema determines the robot's desired location for the formation type in use, its relative position in the overall formation, and the other robots' locations. Maintain-formation computes a vector toward this position whose magnitude is based on how far out of position the robot is. Three zones are defined (figure 9.25):

• Ballistic zone: The robot is far from the desired position, so the output vector's magnitude is set at its maximum, which equates to the schema's gain value, with its directional component pointing toward the center of the computed dead zone.
• Controlled zone: The robot is midway to slightly out of position, with the vector's magnitude decreasing linearly from a maximum at the zone's farthest edge to zero at the inner edge. The directional component points toward the dead zone's center.
• Dead zone: The robot is within acceptable positional tolerance. Within the dead zone, the vector's magnitude is always zero.

In figure 9.25, Robot 3 attempts to maintain a position to the left of and abeam Robot 1. Robot 3 is in the controlled zone, so the behavior generates a moderate force toward the desired position (forward and right).

Each robot must compute its own position in the formation continuously. Three techniques for this computation have been identified:

• Unit-center-referenced: A unit-center is computed by averaging the x- and y-coordinates of all robots involved in the formation. Each robot determines its own formation position relative to that center.
• Leader-referenced: Each robot determines its formation position in relation to a designated lead robot. The leader does not attempt to maintain formation;
the other robots are responsible for maintaining suitable offsets relative to the leader.
• Neighbor-referenced: Each robot maintains its position relative to an adjacent robot.

Some interesting observations from this research (Balch and Arkin 1995) provide guidance in the use of formations. For example, the unit-center approach requires a transmitter and receiver for each robot and a protocol for exchanging position information. On the other hand, the leader-referenced approach requires only one transmitter for the leader and one receiver for each following robot, reducing communications bandwidth requirements significantly and making it a preferable approach in communications-restricted applications. Also, regarding the use of kin recognition instead of explicit communication: unit-center-referenced formations place a heavy demand on any passive sensor systems (e.g., vision) used. In a four-robot visual formation, for instance, each robot would have to track three other robots that may be spread across an extremely wide field of view. Leader- and neighbor-referenced formations, on the other hand, require tracking only one other robot.

A related question is how the available sensor resources can be used effectively to perform efficient scout reconnaissance for a given formation. Research at the University of Texas at Arlington considers how to coordinate perception over multiple vehicles (Cook, Gmytrasiewicz, and Holder 1996). Figure 9.26 shows a simulation of sensor-pointing algorithms fielded on a team of two robotic vehicles as part of DARPA's UGV Demo II program.

Using decision-theoretic methods, utility values for individual fields-of-regard are assigned to each member of a formation, allocating specific time in each relative position (figure 9.27). The utility value for scanning an area A using sensor S from position P is determined by the following equation:

$$U_{scan}(A, S, P) = \int_A \sum_k P_{1k}(x, y)\, P_{2k}(x, y)\, VI_k \, dx\, dy \qquad (9.3)$$

where $P_{1k}(x, y)$ is the conditional probability that a target at location (x, y) will be identified correctly from position P using sensor S; $P_{2k}(x, y)$ is the prior probability that a target of type k exists at location (x, y); and $VI_k$ is the value of information regarding a target of type k. These utility values are continuously recomputed, taking into account dynamic factors such as terrain, security, and focus of attention, and are used to update the field-of-regard selection weights. Weighted roulette wheel methods are used for the selection
of new fields-of-regard values, introducing nondeterminism into the observational strategy that increases the likelihood of hostile target detection while reducing the effectiveness of countermeasures.

Figure 9.26
A simulation of sensor pointing as a team of four robots travels left to right in a diamond formation conducting reconnaissance operations. Triangles looking outward from each robot represent the video sensors' fields of view. All targets (represented as circles) are successfully detected during the mission. (Figure courtesy of Diane Cook.)

9.9.2
Multiagent Mission Specification
To effectively design a mission for a team of robots, suitable tools must be available. Several different approaches were developed for use within the Demo II program addressing different aspects of this problem. This section focuses on the MissionLab tool set. At the University of Michigan, Lee et al.
Figure 9.27
Example fields of regard for various formations (line, column, diamond, and wedge; vehicles UGV-1 through UGV-4). The number in each sector represents the percentage of time spent on reconnaissance activity within that area.
(1994) developed another system, UM-PRS, based on the Procedural Reasoning System (section 6.6.4). It elaborates plans for the current environmental context consistent with long-term goals. Although successfully fielded on a robot named MAVERIC (Kenny et al. 1994), only limited results have been reported to date for its use in multivehicle missions.

MissionLab,1 a multiagent mission specification system developed at Georgia Tech, uses an agent-oriented philosophy as the underlying methodology, permitting the recursive formulation of societies of robots. It includes a graphical configuration editor, a multiagent simulation system, and two different architectural code generators. This software system embodies the Societal Agent Theory described in section 9.5.4. A society is viewed as an agent that consists of a collection of either homogeneous or heterogeneous robots. Each individual robotic agent consists of assemblages of behaviors, coordinated in various ways. Temporal sequencing (Arkin and MacKenzie 1994) affords transitions between various behavioral states naturally represented as a finite state acceptor. Coordination of parallel behaviors can be accomplished via fusion (vector summation), action-selection, priority (e.g., subsumption), or other coordination operators as necessary. These individual behavioral assemblages consist of groups of primitive perceptual and motor behaviors ultimately grounded to the robot's physical sensors and actuators.

Creating a multiagent robot configuration involves three steps: determining an appropriate set of skills for each of the vehicles; translating those mission-oriented skills into sets of suitable behaviors (assemblages); and constructing or selecting suitable coordination mechanisms to ensure that the correct skill assemblages are deployed for the mission's duration.

An important feature of MissionLab is its ability to delay binding to a particular behavioral architecture (e.g., schema-based, UGV Demo II, subsumption) until after the desired mission behavior has been specified. Binding to a particular physical robot occurs after specification as well, permitting the design to be both architecture- and robot-independent.

Figure 9.28 shows the MissionLab system, which has separate software libraries for abstract behaviors, specific architectures, and various robots. The user interacts through a design interface tool (the configuration editor) that permits him to visualize a specification as it is created. Specifications are represented graphically as icons, which can be created as needed or reused from an existing repertoire available in the behavioral library.

1. MissionLab is available via the world-wide web at: http://www.cc.gatech.edu/ai/robot-lab/research/MissionLab.html.
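The coordination operators and temporal sequencing described in this section can be made concrete with a small sketch. The Python below is illustrative only and is not MissionLab code: the names BehaviorOutput, fuse, prioritize, FiniteStateAcceptor, and the trigger labels in SCOUT_TRANSITIONS are all hypothetical. It assumes each behavior emits a two-dimensional motion vector, shows fusion as gain-weighted vector summation and priority arbitration as winner-take-all, and represents a mission that moves through line, column, wedge, and diamond formations as a finite state acceptor.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vector = Tuple[float, float]


@dataclass
class BehaviorOutput:
    """Output of one primitive behavior (hypothetical structure)."""
    vector: Vector        # desired direction and magnitude of motion
    gain: float = 1.0     # relative strength used during fusion
    priority: int = 0     # larger values win under priority arbitration


def fuse(outputs: List[BehaviorOutput]) -> Vector:
    """Cooperative coordination: gain-weighted vector summation."""
    x = sum(o.gain * o.vector[0] for o in outputs)
    y = sum(o.gain * o.vector[1] for o in outputs)
    return (x, y)


def prioritize(outputs: List[BehaviorOutput]) -> Vector:
    """Competitive coordination: the highest-priority behavior takes control."""
    if not outputs:
        return (0.0, 0.0)
    return max(outputs, key=lambda o: o.priority).vector


class FiniteStateAcceptor:
    """Temporal sequencing: perceptual triggers move between behavioral states."""

    def __init__(self, start: str, transitions: Dict[Tuple[str, str], str]):
        self.state = start
        self.transitions = transitions

    def step(self, trigger: str) -> str:
        # Unknown triggers leave the current state unchanged.
        self.state = self.transitions.get((self.state, trigger), self.state)
        return self.state


# Assumed trigger names for a line -> column -> wedge -> diamond mission.
SCOUT_TRANSITIONS = {
    ("line", "reached_passage_point"): "column",
    ("column", "cleared_gap"): "wedge",
    ("wedge", "reached_objective"): "diamond",
}
```

Keeping the coordination operator separate from the behaviors it combines mirrors, in a very small way, the idea of delaying the choice of architecture until after the mission has been specified: the same list of behavior outputs can be handed to either fuse or prioritize.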
Figure 9.28
MissionLab system architecture.

Multiple levels of abstraction, which can be targeted to the designer's abilities, range from entire robot mission configurations down to the low-level language for a particular behavior. After the behavioral configuration is specified, the architecture and robot types are selected. Then compilation occurs, generating the robot executables, which can be run within a simulation environment provided by MissionLab or, through a software switch, downloaded to the actual robots for execution.

Panel (A) of figure 9.29 depicts a finite state diagram specifying a simple military scouting mission. In this case, explicit GPS coordinates are used as the destinations. The remaining panels show four robots during execution of the scout mission in the MissionLab simulator. Notice that the robots begin moving in line formation from the bottom left corner. They then switch to column
formation to traverse the gap in the forward lines (passage point). The robots travel along the axis of advance in wedge formation and finally occupy the objective in a diamond formation.

9.9.3 Team Teleautonomy
Another important aspect of multiagent control involves introducing an operator's intentions into an autonomous robotic team's ongoing performance. Software developed as part of the UGV Demo II program provides this capability in two ways (Arkin and Ali 1994):
Figure 9.29
A finite state configuration, constructed within MissionLab and corresponding to a scouting mission, appears in (A). The mission, consisting of a sequence of coordinated actions in differing formations, is shown at various stages of execution in (B) and (C) (moving initially upward in line formation, then to the right in a column, then changing to wedge, and finally occupying the objective in a diamond formation).

Figure 9.30
(A) On-screen joystick for teleautonomous directional control; (B) Personality slider bars for team behavioral modification.
• The operator as a behavior: In this approach, a separate behavior is created that permits the operator to introduce a heading for the robot team using an on-screen joystick (panel (A) of figure 9.30). This biases the ongoing autonomous behavioral control for all of the robots in a particular direction. Indeed, all other behaviors are still active, typically including obstacle avoidance and formation maintenance. The output of this behavior is a vector representing the operator's directional intentions and command strength. All of the robotic team members generate the same teleautonomous behavioral response to the operator's intentions. The entire team acts in concert without any knowledge of one another's behavioral state.
• The operator as a supervisor: Using this method, the operator is permitted to conduct behavioral modifications during run-time. This can occur at two levels:
Figure 9.31
Teleautonomous extrication from a box canyon of a team of 2 Denning mobile robots (viewed from above). (A) shows robots trapped in box canyon, and (B) after teleautonomous removal. (C) provides the execution trace of the robotic run (rotated 90 degrees clockwise relative to the photographs).
  • The knowledgeable operator can adjust the low-level gains and parameters of the active behavioral set for the entire team directly if desired, varying the relative strengths and behavioral composition as the mission progresses.
  • For the normal operator, team behavioral traits ("personality characteristics") are abstracted and presented on screen for adjustment (panel (B) of figure 9.30). These include characteristics such as aggressiveness (inversely adjusting the relative strength of goal attraction and obstacle avoidance) and wanderlust (inversely varying the strength of noise relative to goal attraction and/or formation maintenance). These abstract qualities are more natural for an operator unskilled in behavioral programming.

This approach permits the concurrent behavioral modification of all of the robots in a team according to the operator's wishes.

An example illustrating the utility of the directional control approach involves the extrication of teams from potential traps. Panels (A) and (B) of figure 9.31 show (from above) a run using two Denning mobile robots. The active behaviors include avoid-static-obstacle, move-to-goal, and column-formation. The robots wander into the box canyon and become stuck trying to make their
way to the goal point specified just behind the box canyon. The operator intervenes, using the joystick to direct the robots to the right. While moving, they continue to avoid obstacles and maintain formation. Once clear of the trap, the operator ceases directing the robots and they proceed autonomously to their earlier prescribed goal. The overall execution trace is depicted in panel (C) of the figure.
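The two forms of operator involvement described in this section can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the UGV Demo II software: the function names, the behavior labels, and the linear slider-to-gain mapping in personality_gains are all hypothetical. It treats the joystick as one more behavior whose output vector is summed with the autonomous behaviors, and it treats the personality sliders as a way of setting behavior gains, with aggressiveness trading goal attraction against obstacle avoidance and wanderlust trading noise against goal attraction and formation maintenance.

```python
import math
from typing import Dict, Tuple

Vector = Tuple[float, float]


def operator_vector(heading_deg: float, strength: float) -> Vector:
    """Operator as a behavior: joystick heading and command strength as a vector."""
    rad = math.radians(heading_deg)
    return (strength * math.cos(rad), strength * math.sin(rad))


def personality_gains(aggressiveness: float, wanderlust: float) -> Dict[str, float]:
    """Operator as a supervisor: map two sliders in [0, 1] to behavior gains.

    The linear mapping and its constants are assumptions for illustration.
    """
    return {
        "move_to_goal": 0.5 + aggressiveness - 0.25 * wanderlust,
        "avoid_obstacle": 1.5 - aggressiveness,
        "noise": 0.1 + wanderlust,
        "formation": 1.0 - 0.5 * wanderlust,
    }


def robot_command(behaviors: Dict[str, Vector],
                  gains: Dict[str, float],
                  operator: Vector = (0.0, 0.0)) -> Vector:
    """Sum the gain-weighted autonomous behaviors, then add the operator's vector.

    Every robot on the team applies the same operator vector, so the team is
    biased in one direction while obstacle avoidance and formation keeping
    remain active.
    """
    x = sum(gains.get(name, 1.0) * v[0] for name, v in behaviors.items())
    y = sum(gains.get(name, 1.0) * v[1] for name, v in behaviors.items())
    return (x + operator[0], y + operator[1])
```

In the box canyon example, goal attraction and obstacle repulsion roughly cancel, so the summed command is near zero; adding a rightward operator vector is enough to move the robots out while the other behaviors continue to contribute.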
9.10 CHAPTER SUMMARY

• Teams of robots afford significant advantages over individual robots in terms of performance, sensing capabilities, and fault tolerance. Problems involving interference, communication costs, and uncertainty in the actions of others can prevent full realization of the benefits of teaming.
• Typical generic tasks for societies of robots include foraging, flocking, consuming, moving material, and grazing.
• Ethological studies provide insights into social behaviors and interagent communication.
• Multiagent robotic systems can be characterized along the lines of reliability, social organization, communication content and mode, spatial distribution, congregation, and performance.
• Useful taxonomies exist for defining the relationships in robotic teams.
• Behavior-based architectures have been expanded to include social behavior.
• The Nerd Herd and Alliance are variants of the subsumption architecture.
• The Societal Agent Theory extends schema theory to multiagent robotics.
• Communication plays a central role in coordinating teams of robots.
• Communication is not necessary for cooperation but is often desirable.
• Range, content, and guarantees for communication are important factors in the design of social behavior.
• Distributed perception over multiple robots involves sharing discerned information between them.
• Various forms of machine learning have been applied to robotic teams, including reinforcement learning and imitation.
• The UGV Demo II program provides an example of teams of robots in action, including aspects of mission specification, formation maintenance, and team teleautonomy.
Chapter 10

Fringe Robotics: Beyond Behavior

"They're made out of meat."
"Meat?"
"Meat. They're made out of meat."
"Meat?"
"There's no doubt about it. We picked up several from different parts of the planet, took them aboard our recon vessels, and probed them all the way through. They're completely meat."
"That's impossible. What about the radio signals? The messages to the stars?"
"They use the radio waves to talk, but the signals don't come from them. The signals come from machines."
"So who made the machines? That's who we want to contact."
"They made the machines. That's what I'm trying to tell you. Meat made the machines."
"That's ridiculous. How can meat make a machine? You're asking me to believe in sentient meat."
"I'm not asking you, I'm telling you. These creatures are the only sentient race in that sector and they're made out of meat."
"Maybe they're like the orfolei. You know, a carbon-based intelligence that goes through a meat stage."
"Nope. They're born meat and they die meat. We studied them for several of their life spans, which didn't take long. Do you have any idea what's the life span of meat?"
"Spare me. Okay, maybe they're only part meat. You know, like the weddilei. A meat head with an electron plasma brain inside."
"Nope. We thought of that, since they do have meat heads, like the weddilei. But I told you, we probed them. They're meat all the way through."
"No brain?"
"Oh, there's a brain all right. It's just that the brain is made out of meat! That's what I've been trying to tell you."
"So . . . what does the thinking?"
"You're not understanding, are you? You're refusing to deal with what I'm telling you. The brain does the thinking. The meat."
"Thinking meat! You're asking me to believe in thinking meat!"
"Yes, thinking meat! Conscious meat! Loving meat. Dreaming meat. The meat is the whole deal! Are you beginning to get the picture or do I have to start all over?"
- Terry Bisson, Nebula Award Nominee, from "They're Made out of Meat" (OMNI Magazine, April 1991). (Reprinted with the permission of Mr. Bisson.)
Chapter Objectives
1. To explore the concept and ramifications of a robotic mind with particular regard to thought, consciousness, emotion, and imagination.
2. To consider unusual aspects of robotic control, including homeostasis, immune systems, and nanotechnology.
3. To entertain the notion of human-robot equivalence.

In this chapter, we move relatively far from mainstream robotics and probe its fringe areas. Quite often, practicing roboticists become enmeshed in pragmatic issues and ignore their work's philosophical, ethical, and even metaphysical ramifications. We now explore some of the deeper questions regarding the potential for robotic intelligence and in so doing discover a wide range of views on the subjects of robot mind and body. Certainly interesting subjects, often controversial, with nothing broadly agreed upon: Intelligence may or may not be achievable by computation; robots may or may not be able to attain consciousness; the roles, if any, that emotions and imagination play in artificial systems; what other biological models, such as hormonal and immune control systems, have to offer; what of incredibly small nanorobots; and in what sense, if any, a robot can be viewed as equivalent to or a successor to human beings.
10.1 ISSUES OF THE ROBOT MIND

The idea of artificial minds has been a central philosophical question in AI research since its inception. Attributing a mind to a robot is viewed by many as a substantial leap of faith for a scientist. The very concept may be disturbing to many of us, for it can raise fears regarding robots as potential competitors at all levels of human endeavor. Can robots think? Can robots be conscious or self-aware? Can they feel or dream as we do? It is certainly easy, at this point, to dismiss these questions. It is useful, however, to examine a broad perspective of these issues, often less scientific and more philosophical than what we have become accustomed to, but nonetheless worthy of exploration with an open mind.
10.1.1 On Computational Thought

We begin with a caveat: This section is not intended as a discourse on philosophy but rather a brief review of several notable opinions on this and other questions regarding mind as it pertains to machines in general and robots in particular.

Should a machine be said to think if it can fool a human who is merely observing it into believing it is capable of thought? This performance-based approach is the basis for the classic Turing Test of machine intelligence, in which the definition of thought is based not on machine consciousness but rather on human fallibility (Turing 1950; Epstein 1992). This definition by deception carries enough weight to prompt the annual sponsoring of the Loebner Competition, which offers a $100,000 prize to a machine that can successfully pass the Turing Test.
One popular version of the Turing Test of intelligence involves a person sitting in front of two terminals. This tester is free to ask any questions of the respondents on the other end of the terminals, one of which is a human being and the other a computer. If the questioner cannot discriminate between the human subject and the computer, the computer is said to have passed the Turing Test.
Those for whom the proof of possession of thought is based solely upon observed action have little reason to read further. For those who believe that a thinking agent must possess more than merely the ability to exhibit plausible actions for a wide range of situations in order to be considered able to think, we proceed.

Bellman (1978) states that since no one knows what "think" really means anyway, we cannot fairly answer the question of whether a machine can truly think. He argues from a mathematical perspective that computers can perform processes representative of human thought (i.e., decision making and learning), but since no one can precisely define what constitutes thinking, the question cannot be rigorously answered.
A different stance expressed by Weizenbaum and echoed by Albus (1981, p. 297) states: "For robots to truly understand humans they would have to be indistinguishable from humans in bodily appearance as well as physical and mental development and remain so throughout a life cycle identical of humans." This leads to the strong conclusion that robots will never be able
to comprehend humans' values, and thus never be able to think as humans do. Brooks (1991, p. 22), on the other hand, argues that these aspects of intelligence are a naturally occurring by-product of a behavior-based effort: "Thought and consciousness will not need to be programmed in. They will emerge."

Roger Penrose can easily be characterized as AI's most ardent opponent. He denies the possibility that computational processes can ever lead to thought and presents four alternate perspectives on the issues of thinking and awareness (Penrose 1994):

1. All thinking is computation, thus computers are capable of thought. (This position is often referred to as Strong AI.)
2. Thought is a result of the brain's physical actions. Computers can simulate this action, but a simulation is never the same as the thing simulated, and thus computers cannot think.
3. The brain's actions cannot even be simulated computationally.
4. Awareness cannot be explained by any scientific approach and thus is unattainable computationally.

In The Emperor's New Mind (1989), Penrose presents strong arguments against computational intelligence based on pillars of mathematics and computational theory, Gödel's Incompleteness Theorem and the Church-Turing Thesis, among others. The reader is referred to Penrose's books for the full development of his dismissal of AI as a means for producing thought, consciousness, and self-aware machines.

Obviously Penrose's point of view has met stiff resistance within the AI community. Most (e.g., Brooks and Stein 1994) point out that his arguments are fundamentally flawed and have been contradicted by earlier mathematical investigations (Arbib 1964). Caustically, Brooks states:

". . . Penrose . . . not only makes the same Turing-Gödel error, but then in a desperate attempt to find the essence of mind and applying the standard methodology of physics, namely to find a simplifying underlying principle, resorts to an almost mystical reliance on quantum mechanics" (Brooks and Stein 1994, p. 23).

Nonetheless, Penrose steadfastly devotes a major portion of his second book, Shadows of the Mind (1994), to disputing the plethora of arguments against his position, dismissing some far more easily than others. He states that intelligence requires understanding, and understanding requires awareness, an aspect of consciousness. Most puzzling is his optimism in holding out for the possibility of a type of noncomputational intelligence based on a new science
of quantum physics. He contends that intelligence is a consequence of this sort of activity naturally manifested within the microtubules located within the brain's neurons. Much of this argument is speculative, but nonetheless intriguing, especially given the recent advances in the theory of quantum computation (Lloyd 1996; Hogg 1996).
Quantum computers are currently hypothetical devices that operate on the scale of atoms. They exploit the wave nature of quantum particles, which can exist in a superposition of a set of discrete states. This ability endows them with the capacity to do everything classical computers can do and more. Factoring and the simulation of quantum systems using quantum logic are two examples (Lloyd 1996) of problems that appear to have inherently efficient solutions on this class of machine.
Despite the raging debate on the very possibility of thinking machines, there is demonstrable value in pursuing their development. The successes achieved in behavior-based robotic systems generally do not lay claim to human-level intelligence, nor do they argue that the systems created are aware. To most robotics researchers this is an irrelevant question, their goal being to build useful and, at the very least, debatably intelligent machines. The commonly held belief that engineers and scientists have successfully proven that a bumblebee is incapable of flight may also have implications for the issues surrounding philosophers and thinking robots.

10.1.2 On Consciousness
Chalmers (1995, p. 81) characterizes the most mysterious aspect of consciousness as "how physical processes in the brain give rise to subjective experience," i.e., the experiences of color, pain, emotion, and feelings in general. Some philosophers boldly argue that they have unshrouded this mystery. In Consciousness Explained, Dennett (1991, p. 433) argues, using reductionism, that "all that complicated slew of activity in the brain amounts to conscious experience." His conclusions assume that the brain can be viewed as an information-processing system, a computer if you will, from which he further concludes that software-based computer systems can give rise to consciousness.

Perhaps the single most disputed thought experiment used to deflate the potential of machine consciousness is Searle's Chinese Room (1980):
An English speaker (the CPU), who can speak no other language, is locked in a room with a rule book (the program) written in English that says what to do with Chinese characters. A slip of paper containing Chinese characters (the input) is passed in through a small opening in the door. The human uses the rule book to produce new Chinese characters corresponding to the input slip, writes them on a slip of paper (the output), and passes it out through the small opening. The room can be viewed as a system that would successfully pass the Turing Test if a computer took the place of the human. Searle contends that the translator has no understanding of what the slips of paper say. Merely coming up with the right output does not constitute possessing understanding.
Searle ultimately asserts that no robot could ever be conscious (Boden 1995). Just as Penrose's arguments met with widespread disclaim, so did the Chinese Room refutation of strong AI. Dennett counters that these types of thought experiments "work precisely because they dissuade the reader from trying to imagine, in detail, how software could accomplish this" (Dennett 1991, p. 435).

The believers in computational consciousness persist. McCarthy (1995) believes consciousness is not only possible for robots, but necessary. This consciousness is intended to be different from a human's and, according to McCarthy, should not have humanlike emotions, in order to make robots more servile. A robot consciousness requires the ability to observe many things, including its own physical body, the extent of its knowledge (what it does or does not know), its goals and intentions, the history or basis for its beliefs, and what it is capable of achieving.

Bellman (1978, p. 94) attempts to draw mathematics into the fray with the observation that although "the general area of consciousness cannot be made precise . . . many aspects of consciousness can be treated by mathematical means, which means that we can have a computer be conscious in certain ways." The result is a characterization of consciousness as a control process.

Moravec (forthcoming) states that a robot's consciousness will eventually exceed that of people: "Some configurations will make a robot more thoroughly conscious than the average human. . . ." This view is consistent with his long-time articulated expectation that robots are humans' natural successors. These machines constitute the next logical step in evolution, our "mind children," ultimately capable of transcending human biological frailty (Moravec 1988).

Consciousness may be overrated anyway. Minsky (1986, p. 29) argues "In general, we're least aware of what our minds do best" and that consciousness arises when our automatic systems begin to fail. Moravec (1988, p. 44), despite his own tendencies to the contrary, observes that "robotics research is too practical to seriously set itself the explicit goal of producing machines with such nebulous and controversial characteristics as emotion and consciousness." Most roboticists are more than happy to leave these debates on consciousness to those with more philosophical leanings.
10.1.3 On Emotions

Several roboticists, however, recently have paid attention to emotional state and its impact on behavior. We can intuitively understand that emotions indeed influence behavior. When someone is angry, they generally behave differently than when they are happy. The effect may be anywhere from subtle to quite strong, depending upon the individual and the strength of the emotion. But what is this emotional stuff, and why would it be of possible importance or use to a robot?

We have already seen examples of the attribution of emotion to behavior-based systems. Braitenberg, in particular, unabashedly describes his vehicles as possessing fear, love, and aggression (as discussed in section 1.2.1). Albus (1981, p. 208) states: "Emotions play a crucial role in the selection of behavior." Though lower life forms exhibit simple emotions (pleasure or pain), humans apparently have a much broader range (hate, love, anger, fear, happiness, disgust, among others). Neurologically, it is generally acknowledged that human emotion originates within the brain's limbic system.

Modifying Associate U.S. Supreme Court Justice Potter Stewart's famous quotation, we can't define emotion, but we know it when we see it. And perhaps that is the crux of the argument that robots can indeed possess emotions. Moravec (1988) contends that a robot is actually experiencing fear when, upon encountering a stairwell, it backs away from the danger because of a detect-cliff sensing system coupled with a deal-with-cliff action system. Is emotion then in the eye of the observer? For each of us, speaking of ourselves, we would clearly answer that it certainly is not. We know when we are angry, happy, or whatever, independent of external observation. But we ascribe this emotional capacity to others primarily through observation. So how do we know whether a robot is or is not experiencing emotion?
Minsky (1986, p. 163) states: "The question is not whether intelligent machines can have any emotions, but whether machines can be intelligent without any emotions." Emotional capacity may better adapt robots to deal with the world: Love can provide social behavior useful when cooperating with other agents (human or robotic); anger can be useful when competing with other agents; and pain or pleasure can be used for reinforcement learning and self-protection. As Moravec (forthcoming) observes: "In general, robots will exhibit some of the emotions found in animals and humans because those emotions are an effective way to deal with the contingencies of life in the wide, wild world."

To date, Japanese scientists have conducted most of the pragmatic research on giving robots emotions. Frustration, an emotion not uncommon to robotics researchers, often serves as the basis for emotional control. Research at Nagoya University (Mochida et al. 1995) has looked at robots that can experience two states: pleasantness and unpleasantness. A variable representing frustration represents the states: low frustration is pleasant, but high frustration is to the contrary. Simulations have been conducted using a Braitenberg-style architecture supplemented with a neural emotional model that alters the system's behavior as it becomes more or less frustrated. This enables it to escape traps, such as box canyons, which it cannot accomplish without this emotional behavioral switching.

Research at MITI in Japan (Shibata, Ohkawa, and Tanie 1996) extends the use of frustration to multirobot systems. In the context of empty-can collection tasks, frustration arises not only from the agent's own behavior but also from other team members' performance. The frustration level alters the action-selection process, appearing to result in greater cooperation than would occur otherwise. Section 10.2.1 describes an actual vacuum cleaning robot, Sozzy, that also uses emotions as a basis for action-selection.
Of course, the debate will continue as to whether these robots really experience emotion. An important point to take from this discussion, however, is that biological emotional control systems may have some utility in the context of behavior-based robotics, even if they serve merely to inspire models that are quite limited or rather far afield.
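The idea of a single frustration variable that switches behavior can be illustrated with a deliberately simple sketch. The Python below is not the neural emotional model of Mochida et al.; it is a hypothetical stand-in in which frustration rises while the robot makes no progress toward its goal, decays while it does, and triggers a switch from goal seeking to an escape behavior once a threshold is crossed. The class name, the behavior labels, and all numeric constants are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrustrationModel:
    """Scalar frustration in [0, 1]; low is 'pleasant', high is 'unpleasant'."""
    rise: float = 0.25        # increase per step without progress (assumed)
    decay: float = 0.125      # decrease per step with progress (assumed)
    threshold: float = 0.5    # level at which behavior switches (assumed)
    level: float = 0.0

    def update(self, progress: float) -> float:
        """Update frustration from the progress made toward the goal this step."""
        if progress > 0.0:
            self.level = max(0.0, self.level - self.decay)
        else:
            self.level = min(1.0, self.level + self.rise)
        return self.level

    def select_behavior(self) -> str:
        """Emotional behavioral switching: escape when frustrated, else seek the goal."""
        return "escape" if self.level >= self.threshold else "seek_goal"
```

A robot stalled in a box canyon would report no progress for several steps, cross the threshold, and switch to the escape behavior; once it is moving toward the goal again, frustration decays and goal seeking resumes.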
10.1.4 On Imagination

What of imagination? "Imagination gives us the ability to think about what we are going to do before committing ourselves to action" (Albus 1981). This capacity for simulation (imaging future actions) provides potentially useful feedback as to the utility and relevance of any plans under consideration. The quality of feedback is directly related to the quality of the simulation itself and the accuracy of its underlying assumptions about the world.

At MIT, imagination, in at least one sense, has been integrated into the subsumption architecture (Stein 1994). Cognition, viewed here as high-level deliberative reasoning or planning, is treated as imagined interaction with the world. This cognitive system is not disjoint from the robot control architecture, as is the case in many hybrid architectures (chapter 6), but rather uses the underlying behavioral architecture as the simulator itself.

Figure 10.1
MetaToto's architecture. Highlighted components are those added above and beyond Toto's core capabilities.

This novel approach was embodied in MetaToto, extending the earlier navigational work using representation within the subsumption architecture of Toto (section 5.2.2). Figure 10.1 depicts the robot's controller, which reuses Toto's architecture in its entirety, wrapping the imagining simulator around it. Sensing is imagined using very simple sonar models and a straightforward scan-line algorithm. Acting is imagined by the updating of three positional variables: x-position, y-position, and heading. In essence this is a rudimentary simulator. MetaToto, however, is capable of exploring floor plan drawings and imagining how it would move in these previously unexplored environments, whereas Toto could not. The actual sonar and compass readings in the robot architecture
have been replaced with imagined ones. Although simulators that use the same control code as actual robots are not new (e.g., MissionLab, section 9.9.2), the cognitive framework in which this work is couched is interesting. Whether this is truly an example of robot imagination is subject to debate, just as were the notions of robotic thought, consciousness, and emotion discussed in the preceding sections.
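A rudimentary simulator of the kind described for MetaToto, in which acting updates three positional variables and sensing is computed from a floor plan, can be sketched briefly. The Python below is an illustration under stated assumptions and is not Stein's implementation: Pose, imagine_act, and imagine_sonar are hypothetical names, the floor plan is assumed to be a list of strings with '#' marking walls, and a simple ray march stands in for the scan-line algorithm mentioned in the text.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Pose:
    x: float        # x-position, in floor-plan cells
    y: float        # y-position, in floor-plan cells
    heading: float  # heading, in radians


def imagine_act(pose: Pose, distance: float, turn: float) -> Pose:
    """Imagined acting: turn, then move forward, updating x, y, and heading."""
    heading = (pose.heading + turn) % (2.0 * math.pi)
    return Pose(pose.x + distance * math.cos(heading),
                pose.y + distance * math.sin(heading),
                heading)


def imagine_sonar(floor_plan: List[str], pose: Pose, bearing: float,
                  max_range: float, step: float = 0.25) -> float:
    """Imagined sensing: march along a ray until a wall cell or the map edge."""
    angle = pose.heading + bearing
    r = 0.0
    while r < max_range:
        r += step
        col = math.floor(pose.x + r * math.cos(angle))
        row = math.floor(pose.y + r * math.sin(angle))
        outside = (row < 0 or row >= len(floor_plan)
                   or col < 0 or col >= len(floor_plan[0]))
        if outside or floor_plan[row][col] == "#":
            return r
    return max_range
```

Because the imagined readings have the same form as real sonar ranges and compass headings, the same control code could in principle consume either source, which is the property the text highlights.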
10.2 ISSUES OF THE ROBOT BODY

We now examine some fringe areas not directly related to the debate regarding the robot mind. Nonneural control systems have the potential for contributing to intelligent robotic systems. In animals, the endocrine system uses chemical messages for autoregulatory purposes whereas the immune system responds to external events in a defensive manner. This section explores the implications of these alternate control paradigms for robotics, then examines the issue of scale: What if robots could be made to operate at a molecular level? This draws us into the field of nanotechnology, currently little more than a dream but of potentially great importance.
10.2.1 Hormones and Homeostasis

The endocrine system in mammals, using hormones as chemical messengers, serves as a means for both information processing and control. This homeostatic control system is concerned with maintaining a safe and stable internal operating environment, whether for animal or machine. Homeostasis is a term typically applied to biological systems for the process by which that safe and stable state is achieved and maintained.

Gerald (1981) describes the endocrine system's basic role as follows:

"[The endocrine system] may be compared to an orchestra, in which, when one instrument is out of tune, a perfect ensemble is impossible. . . . It is constantly monitoring the internal environment, and it is ideally situated to function in response to psychic stimuli." (Boyd 1971, p. 421).

A basic biological function of the endocrine system is to maintain the organism's internal self-consistency (homeostasis). This is accomplished by the endocrine (ductless) glands' directly secreting their chemical messengers (hormones) into the bloodstream. The circulatory system then carries these messengers (in essence, broadcasts) to all the organism's cells. Different cells have differing responses to endocrine secretions, some reacting markedly to certain
431
FringeRobotics: BeyondB.ehavior
hormones, othersnot at all. Selectivetissues, called target tissues, are selectively arousedby a specifichormone. An exampleis a hormonesecretedby the thyroid glandthat targetsbonetissuespecifically. Somehormonesarenonspecific , suchasinsulin, which actson almostall cells (with the notableexception of mostbrain cells), affeCtingtheir energy(glucose) uptake. Negativefeedbackmechanismsare fundamentalto endocrinesystemcontrol . Onetypical example( Boyd 1971) illustrateshow the hypothalamusmonitors the releaseof severaldifferent hormonesand coordinatestheir effects basedon central nervoussysteminputs. By applying this negativefeedback control regime, the system, whenstressed , canbe restoredto a steadystate, an crucial to the of homeostasis . ability process The biological endocrinesystemis concernedwith three different areas: , nervoussystemfunction, andmetabolismregulation, . growthanddevelopment with the latter perhapspaying the highest dividendsin robotics. Lehninger " ( 1975, p. 363) definesmetabolismas a highly coordinatedpurposefulactivity in which many setsof interrelated. . . systemsparticipate, exchangingboth matter and energy betweenthe cell and its environment." This is resource managementat a very low level. The questionis what resourceswe should be concernedwith in the roboticsdomain. ' Energymanagementis onechoice. Justasglucosefuels most of the body s cells, someform of energymustbe madeavailablefor the robot. In mammals, two typical modesof glucosemetabolismarefound, "feast or famine." Cellular esarealwaysdrawingenergyfrom their environment.Whenglucoseis process abundantin the blood, insulin is released , signalingto the cells that their energy If . the glucoselevel dropsoff , the hormonelevel uptakecanbe increased also drops, signifying a fasting state. This hormonalreleasealso affectsbiological organtissuesmarkedly, in additi
• Global stress: Heat must be exchanged between the robot and its surroundings to restore acceptable conditions (e.g., a robot on the sunny side of Mercury).
• Local stress: Heat must be redistributed within the robot to maintain reliable operation of a particular subsystem. The failure of a single subsystem could have a domino effect, ultimately resulting in the robot's complete failure. An example of local stress would be the overheating of a robot's arm while servicing a furnace.

In either case, temperature must be regulated by a control system. Since this regulation occurs unconsciously in mammals (global stress through sweating or panting, local stress by dilation or constriction of blood vessels), homeostatic control can effectively manage this function.

Emergency notification and resultant behavioral parameter alterations can also be carried out quickly and efficiently using a broadcast communication mechanism. This loosely parallels the secretion of epinephrine (adrenaline) that markedly and rapidly increases the rate of a creature's processes in response to an unanticipated event. The fact that dormancy can be induced rapidly, over the same channels, should not be overlooked. Indeed much of the "fight or flight" response can be embedded in this manner.

10.2.1.1 The Homeostat
Ashby (1952) was among the first to develop the notion of homeostasis in a cybernetic context, extending the principles of biology to machines. In particular, he argued that adaptation is essential and that it is achieved by maintaining certain essential state variables within acceptable physiological limits, citing blood glucose level maintenance and thermoregulation as two examples, among others. Because he saw maintaining stability as essential to a system's survival, he created an unusual device called a homeostat that embodies these principles. It consists of four interacting units each containing an electromagnet and a water-based potentiometer (figure 10.2). The units are fully interconnected: each receives inputs from and each sends outputs to the others. In tests of the system, certain settings produced stable behavior (with the magnets moving to a central position and resisting displacement) whereas other settings yielded runaway instability (with the magnets' velocities increasing uncontrollably). Although seemingly an uninteresting device given the complexity of today's robots, the homeostat provided a testbed for the notions of homeostatic stability in machines in the 1950s.
Figure 10.2 Ashby's homeostat: (A) the actual device; (B) the circuit for a single unit.
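The distinction between stable settings and runaway settings can be illustrated with a small numerical sketch. This is not a model of Ashby's actual circuit: it assumes, purely for illustration, that each unit's magnet position changes at a rate given by a weighted sum of all four positions, and the names `simulate_homeostat`, `is_stable`, `STABLE`, `RUNAWAY`, and `START`, along with every numeric setting, are hypothetical.

```python
# Hypothetical linear sketch of four fully interconnected units.
# Assumption: dx_i/dt = sum_j W[i][j] * x_j, integrated with Euler steps.

def simulate_homeostat(weights, start, steps=2000, dt=0.01, limit=1e3):
    """Return final magnet positions; stop early on runaway deflection."""
    x = list(start)
    n = len(x)
    for _ in range(steps):
        # Each unit receives input from every unit, including itself.
        dx = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
        if max(abs(v) for v in x) > limit:
            break  # positions are growing without bound
    return x


def is_stable(weights, start, tol=1e-3):
    """True if every magnet has settled near the central position (zero)."""
    final = simulate_homeostat(weights, start)
    return max(abs(v) for v in final) < tol


def make_settings(self_weight, coupling, n=4):
    """Build an n-by-n weight matrix with uniform self and cross weights."""
    return [[self_weight if i == j else coupling for j in range(n)]
            for i in range(n)]


START = [1.0, -0.5, 0.25, 0.8]       # initial displacements (illustrative)
STABLE = make_settings(-1.0, 0.1)    # self-restoring units, weak coupling
RUNAWAY = make_settings(0.5, 0.1)    # self-reinforcing units

if __name__ == "__main__":
    print("stable settings settle:", is_stable(STABLE, START))
    print("runaway settings settle:", is_stable(RUNAWAY, START))
```

With the first setting every displacement decays toward the center; with the second the displacements grow until the simulation stops them, which loosely mirrors the two outcomes described for the device.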
10.2.1.2 Schema-Based Homeostatic Control

The addition of a new class of behavioral control units called signal schemas (Arkin 1988) provides a means for a robot to sense and transmit to motor behaviors information regarding its own internal state. These signal schemas are of two types: transmitter schemas, associated with specific internal sensors, and receptor schemas, embedded within motor schemas. The hormonal concept of targetability is achieved by allowing the transmitter schemas to broadcast their information to all active behaviors. Only those motor schemas whose activity is dependent on a particular type of information contain receptor schemas sensitive to those specific broadcast messages.

Transmitter schemas send information pertaining to one particular aspect of the robot's internal state. Their role is to provide the feedback required to achieve homeostatic control. For example, a sensor can measure a robot's available fuel reserves. In the case of battery-powered vehicles, this might involve an ammeter; for petroleum-powered vehicles, a fuel tank measuring device could be used. The rate of consumption can also be monitored, providing additional information for negative feedback analysis.

Receptor schemas, embedded within the motor schemas, provide the mechanism for modulating the motor behavior itself. In response to the information the transmitter schema broadcasts, the receptor schema alters parameters within its motor schema, changing its output. If fuel reserves are running low, motor rates are tuned to run at more efficient levels. Internal changes can produce shorter, albeit more risky, paths when fuel depletion warrants the risks. The unit has one receptor schema for each transmitter to which it is sensitive, implementing the concept of targetability by specifying which, if any, of the transmitted signals the behavioral controller should be aware of. Figure 10.3 depicts these relationships.

In the case of energy management, a transmitter message emanates from an internal sensor reporting available fuel reserves. Since this message is transmitted globally, it affects all targeted motor behaviors uniformly. In the case of energy reduction, this produces smooth, more efficient (albeit slower) motion. In thermoregulation, decreasing the rate of motion reduces the amount of heat produced per unit of time, allowing the motors to dissipate heat more effectively and use power more efficiently.

As an example, figure 10.4 shows the effects on navigation as the robot's initial fuel reserves range from full to almost empty. As the energy supplies dwindle, the robot comes closer and closer to the obstacles, moving at a slower and more efficient speed. Eventually, the course of the path taken actually switches, producing a much shorter path (in terms of distance but not time) in reaction to low fuel conditions. This set of paths clearly indicates the impact of available fuel reserves on the schema-based navigation process. Additional results, including those concerning thermoregulation, appear in Arkin 1992c.

Figure 10.3 Homeostatic schema-based control architecture. The highlighted components pertain to homeostatic control. Key: PS, perceptual schema; PSS, perceptual subschema; MS, motor schema; ES, environmental sensor; RS, receptor schema; TS, transmitter schema; IS, internal sensors.
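The broadcast-and-target relationship between transmitter and receptor schemas can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation reported in Arkin 1988 or 1992c: the class names, the use of a single gain parameter per motor schema, the normalized fuel level, and the particular scaling functions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumed message format: (signal type, level normalized to [0, 1]).
Signal = Tuple[str, float]


@dataclass
class TransmitterSchema:
    """Reads one internal sensor and broadcasts its value to all behaviors."""
    signal_type: str
    read_sensor: Callable[[], float]

    def broadcast(self) -> Signal:
        return (self.signal_type, self.read_sensor())


@dataclass
class MotorSchema:
    """A motor behavior whose gain is modulated by embedded receptor schemas."""
    name: str
    base_gain: float
    # One receptor per signal type this behavior is sensitive to; each maps
    # a broadcast level to a gain multiplier (hypothetical parameterization).
    receptors: Dict[str, Callable[[float], float]] = field(default_factory=dict)
    gain: float = field(init=False)

    def __post_init__(self) -> None:
        self.gain = self.base_gain

    def receive(self, signals: List[Signal]) -> None:
        scale = 1.0
        for signal_type, level in signals:
            receptor = self.receptors.get(signal_type)
            if receptor is not None:  # untargeted behaviors ignore the signal
                scale *= receptor(level)
        self.gain = self.base_gain * scale


def run_example(fuel_level: float) -> Dict[str, float]:
    """Broadcast one fuel reading and return the resulting behavior gains."""
    fuel = TransmitterSchema("fuel", lambda: fuel_level)
    behaviors = [
        # Low fuel slows goal-directed motion to a more efficient rate.
        MotorSchema("move_to_goal", 1.0, {"fuel": lambda f: 0.5 + 0.5 * f}),
        # Low fuel also lets the robot pass closer to obstacles.
        MotorSchema("avoid_obstacle", 2.0, {"fuel": lambda f: 0.25 + 0.75 * f}),
        # No receptor schema: this behavior is not a target of the signal.
        MotorSchema("noise", 0.3),
    ]
    signals = [fuel.broadcast()]
    for behavior in behaviors:
        behavior.receive(signals)
    return {b.name: b.gain for b in behaviors}


if __name__ == "__main__":
    print(run_example(1.0))
    print(run_example(0.25))
```

Every behavior receives the same broadcast, but only those holding a receptor for the "fuel" signal change their output, which is one way to read the targetability idea described above.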
10.2.1.3 Subsumption-Based Hormonal Control

Another application of the notion of hormonal control involves the development of a hormone-driven autonomous vacuum cleaner named "Sozzy" (figure 10.5) (Yamamoto 1993). This system is really a mixed metaphor in which hormonal analogies are used to modify the robot's "emotions," specifically fatigue, sadness, desperation, and joy. Figure 10.6 shows the means by which an emotion-suppressing behavior is added to an underlying subsumption-style control system that modifies the underlying active behavioral constituency. Hormonal state variables that reflect the various emotional states are maintained within the emotion-suppressing behavior. A function that receives both internal (e.g., battery level, time expired) and external stimuli (e.g., loss of beacon) regulates these states. The net result is behavioral switching as opposed to behavioral modification as seen in the schema-based hormonal controller.

A robot implementing the system was tested in a laboratory setting where the hormone levels corresponding to the various emotional states rose and fell over time. This changed the robot's overall behavior, giving it the subjective appearance of being "more friendly and more lively" (Yamamoto 1993, p. 221) than when the hormonal system was inactive.

Figure 10.4 Collection of paths reflecting different fuel reserves. Note as the fuel reserves become depleted, the paths get closer and closer to the obstacles, finally resulting in a complete change in the path's general quality as it eventually changes from detouring to the upper regions to moving more closely around the obstacles at slower speeds.
Figure 10.5 Sozzy: A hormone-driven robot. (Photograph courtesy of Masaki Yamamoto.)

Figure 10.6 Hormonal behavioral switching. Inhibition from the emotion-suppressing behavior keeps the robot inactive until an emotion is enabled, which in turn inhibits the inhibition, effectively selecting a suitable behavioral set.
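The contrast between behavioral switching and behavioral modification can be made concrete with a short sketch. It is only a guess at the general shape of such a controller, not a description of Sozzy's actual design: the update rule, the decay and threshold constants, the stimulus increments, and the behavior names are all hypothetical.

```python
# Hypothetical mapping from each emotion to the behavioral set it enables.
EMOTION_BEHAVIORS = {
    "fatigue": "return_to_charger",
    "sadness": "search_for_beacon",
    "desperation": "wander",
    "joy": "clean_floor",
}


class HormonalSwitch:
    """Keeps one hormone-like state variable per emotion and selects a behavior."""

    def __init__(self, decay=0.9, threshold=0.5):
        self.decay = decay          # fraction of each level kept per step
        self.threshold = threshold  # level needed to enable an emotion
        self.levels = {emotion: 0.0 for emotion in EMOTION_BEHAVIORS}

    def update(self, battery_level, beacon_visible, time_expired=False):
        """Regulate the levels from internal and external stimuli."""
        for emotion in self.levels:
            self.levels[emotion] *= self.decay
        # Internal stimulus: a depleted battery raises fatigue.
        self.levels["fatigue"] += 1.0 - battery_level
        # External stimulus: seeing or losing the beacon.
        if beacon_visible:
            self.levels["joy"] += 0.3
        else:
            self.levels["sadness"] += 0.3
        # Internal stimulus: running out of time raises desperation.
        if time_expired:
            self.levels["desperation"] += 0.5

    def active_behavior(self):
        """Return the behavior enabled by the strongest emotion, or None
        while every emotion is below threshold (the robot stays inactive)."""
        emotion, level = max(self.levels.items(), key=lambda item: item[1])
        if level < self.threshold:
            return None
        return EMOTION_BEHAVIORS[emotion]


if __name__ == "__main__":
    switch = HormonalSwitch()
    for battery in (1.0, 1.0, 0.1):
        switch.update(battery_level=battery, beacon_visible=True)
        print(battery, switch.active_behavior())
```

The controller never adjusts a behavior's parameters; it only decides which behavioral set is released, in contrast to the gain modulation used in the schema-based approach.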
10.2.2 Immune Systems
Another biological control system parallel explored in the context of robotics involves the immune response. Immune networks detect and attack antigens (alien nonself materials such as bacteria) by producing antibodies. The bone marrow and thymus gland create lymphocytes of various types that regulate the production of antibodies and circulate throughout the lymphatic system in mammals. Specific antibodies attack specific antigens by recognizing particular antigenic determinants, forming a complex control system capable of recognizing and eliminating both previously encountered and new antigens.

In Japan, researchers have applied principles inspired by immune networks to robotic control problems. Using immune system models initially derived for fault tolerance (Mizessyn and Ishida 1993), the simulated learning of gait acquisition for a six-legged robot has been achieved (Ishiguro, Ichikawa, and Uchikawa 1994). Immune system models have been extended to include multiagent robotic systems (Mitsumoto et al. 1996). Here parallels are drawn at several levels: The robot and its environment are modeled as a stimulating antibody-antigen relationship, and robot-robot interactions can be both stimulating and suppressing (analogous to antibody-antibody relations). Each robot decides its next action based on these relationships with other robots and the world, organizing itself to effectively conduct the task. The system has been tested in simulation only on a foraging task, with an eye toward moving it onto a six-agent microrobot colony.

Both of these examples are based loosely on the idiotype network model (Jerne 1973), in which novel, randomly created antibodies are initially treated as antigens, resulting in systemwide stimulation or suppression of other antibodies, eventually resulting in steady-state conditions. The presence of other antigenic material can disrupt this equilibrium and, as in homeostasis, the system then responds in a manner to restore or achieve a new steady state condition.
Others (e.g., Bersini 1992) have used immune response models as an inspiration for reinforcement learning methods similar to Q-learning (section 8.3), but this has as yet not been applied to actual robot control systems.
10.2.3 Nanotechnology

What if we could build really small robots, robots so small that they could operate at the molecular level? As fantastic sounding as it is, this is the domain of nanotechnology, where machines operate on atomic scales. It is not as absurd as it first sounds. One could argue that nanomachines already exist operating within cells. These however result from natural sources. Protein enzymes routinely are involved in the assembly and disassembly of the stuff of which we are made. Is it really so implausible that engineered machines could be devised to serve similar purposes?

Drexler's vision of nanotechnology has served as the foundation of the field. In Engines of Creation (1986) Drexler describes a future in which molecular machines are commonplace. Cell repair machines, for example, have the ability to cure disease, reverse aging, or serve as active shields against infection. He also raises the spectre of their being used for evil ends or destructive purposes. Drexler's subsequent book Nanosystems (1992) provides more of a scientific basis for his earlier vision, describing the techniques by which molecular manufacturing could potentially be achieved. Molecular manufacturing has more in common with biochemistry than with engineering in terms of precision, control, defect rate, product size, and cycle time.

Moravec (1988) envisions nanotechnology as the means by which robots could truly become self-assembling. For example, a robot bush (figure 10.7) could self-construct and would have a structure unlike anything we've seen thus far. Robotic cilia would propel the bush about, and its shape could change dynamically. Local reflexes might handle much of the control.

Nothing close to an actual nanorobot has yet been produced, but many researchers are still working on a very small scale. This is the domain of microrobots, very small robotic systems that can do fundamentally different tasks in different ways than the more conventional systems we have already studied. On these scales, friction becomes the dominant force rather than gravity.

Gnat robots have been proposed (Flynn 1987) and built (Flynn et al. 1989) that can fit on a single electronic chip. These systems have numerous potential uses: bugs for the CIA, multiagent swarms for space exploration, autonomous billboards, eye microsurgery, and patching holes in or removing barnacles from a ship's hull (Flynn, Brooks, and Tavrow 1989).

MIT's AI Lab has developed several microrobot systems:

• Squirt: a low-cost prototype microrobot built for robotics education (Flynn et al. 1989).
• The Ants: a colony of microrobots being developed for explosive ordnance disposal applications (figure 10.8).
• The Rockettes: a colony of microrobots on the order of 10 grams each being developed for planetary exploration.

Perhaps the most unusual microrobot encountered thus far is the hybrid insect robot Takeuchi (1996) has developed at the University of Tokyo. This system is part robot and part cockroach: two severed cockroach legs serve as the actuators for a single chip microcontroller. It is able to walk when an artificial body is attached to the legs. This roboroach is reportedly capable of functioning for approximately one hour and uses four electrodes inserted into the cockroach legs.

Figure 10.7 A self-assembling robot bush. (Figure courtesy of Hans Moravec.)
10.3 ON EQUIVALENCE (OR BETTER)

We now briefly examine some issues surrounding the ultimate relationship between robots and humans. Will robots inherit the earth? Will humans ultimately reside in robotic form? This section discusses the positions of those who believe either (or both) of those things will happen. Although it may be easy to disregard these points of view, let us open-mindedly explore these positions.
Figure 10.8 Ant microrobots. (Photograph courtesy of Rodney Brooks.)

Minsky (1994, p. 109) responds to the question, "Will robots inherit the earth?": "Yes, as we engineer replacement bodies and brains using nanotechnology. We will then live longer, possess greater wisdom and enjoy capabilities as yet unimagined." Minsky's answer also touches on the issue of humans as robots. According to this view, the transcendence from biology to technology is a cause for celebration rather than fear. We will ultimately be freed from our biological limitations. Even the option of immortality is posed. Humans become machines and vice versa; ultimately there is no distinction. This somewhat radical viewpoint flies in the face of the counterarguments against machine intelligence, thought, and consciousness we encountered earlier in this chapter, but we continue our exploration nonetheless.

Moravec (1988) argues that this is our destiny, that, as stated earlier, these future robots are our mind children. But how do we become our robotic equivalent? This process of a person's becoming a machine is referred to as transmigration, which might occur in several ways (according to Moravec):

1. One approach might involve a high-fidelity surgical neuron-by-neuron replacement of your brain with an electronic neuron counterpart. As this would be a step-by-step process conducted with verifying simulations at every replacement step, at what point do you stop being a human and become a robot?
2. Another strategy would involve a high-resolution brain scan that, in a single operation, would create a new you "while you wait."
3. Perhaps a computer that you would wear throughout your lifetime would record all of your life experiences. Once it learned what it was like to be you and could act equivalently to you, the record could be transferred to a machine substrate.
4. Another approach would be to sever the corpus callosum, which connects the brain's two hemispheres, and attach each end to a computer that at first passes the messages through while also recording them. Eventually your biological brain would die, but during your lifetime this computer would have learned how to be you and could continue in that capacity indefinitely.

The net result, according to this view, is that the sum total of you is a program independent of where the program resides: in carbon-based life forms, in silicon, or perhaps in something else. You could run (think?) at speeds millions of times faster than the limitations biology imposes on you. It matters not if your frail biological life form dies; you still exist: "If the machine you inhabit is fatally clobbered, the [backup] tape can be read into a blank computer, resulting in another you, minus the experiences since the copy. With enough copies, permanent death would be very unlikely" (Moravec 1985, p. 145).

Fringe robotics indeed. For this point of view Moravec has been labeled a "DNA traitor" by many who either fear or dismiss the consequences of these thoughts.
10.4 OPPORTUNITIES

In concluding this book, it might be wise to provide some guidance to those who follow by describing what problems remain to be solved to advance the science of robotics. Many open questions warrant further investigation. Just a few are mentioned below:

• Identifying ecological niches where robots can successfully compete and survive, making them sufficiently adaptable to changes in the world they inhabit.
• Increasing accessibility: bringing robotics to the masses through suitable interface, specification, and programming systems.
• Representing and controlling sensing by viewing it as a form of dynamic agent-environment communication.
• Improving perception: new sensors, selective attention mechanisms, gaze control and stabilization, improved eye-hand coordination, foveal vision, and specialized hardware, among others.
• Further exploitation of expectations, attention, and intention in extracting information about the world.
• Understanding more deeply the relationship between deliberation and reaction, leading to more effective and adaptive interfaces for hybrid architectures.
• Evaluating, benchmarking, and developing metrics: In order to be more accurately characterized as a science, robotics needs more effective means for evaluating its experiments. Although progress is beginning to be made in this area (Gat 1995) much more remains to be done.
• Satisfying the need for far more advanced learning and adaptation capabilities than are currently available.
• Creating large societies of multiagent robots capable of conducting complex tasks in dynamic environments.
• Using robots as instruments to advance the understanding of animal and human intelligence by embedding biological models of ever-increasing complexity in actual robotic hardware.

Certainly, the roboticist's plate is full of a myriad of important and exciting problems to explore.
10.5 CHAPTER SUMMARY

• The issue of whether or not robots are capable of intelligent thought or consciousness is quite controversial, with a broad spectrum of opinion ranging from "absolutely not" to "most assuredly so."
• Robotic emotions may play a useful role in the control of behavior-based systems, although their role is just beginning to be explored.
• Homeostatic control, concerned with managing a robot's internal environment, can also be useful for modulating ongoing behavior to assist in survival.
• Immune systems are also beginning to be explored as a means for controlling both individual and group robot behavior.
• Nanorobots and microrobots can revolutionize the way in which we think about robotic applications.
• One school of thought in robotics asserts that these machines are mankind's natural successors.
• As in any important endeavor, there is a wide range of questions waiting to be answered as well as opportunities to be explored.
References
Aboaf, E., Drucker, S., and Atkeson, C. 1989. "Task-Level Robot Learning: Juggling a Tennis Ball More Accurately," Proceedings of the International Conference on Robotics and Automation, Scottsdale, AZ, pp. 1290-95.
Adolph, K., Gibson, E. J., and Eppler, M. A. 1990. "Perceiving Affordances of Slopes: The Ups and Downs of Toddlers' Locomotion," Emory Cognition Project Report #16, Department of Psychology, Emory University.
Agah, A., and Bekey, G. 1995a. "In a Team of Robots the Loudest Is Not Necessarily the Best," Proceedings of the International Conference on Systems, Man, and Cybernetics, Vancouver, B.C.
Agah, A., and Bekey, G. 1995b. "Learning from Perception, Success, and Failure in a Team of Autonomous Mobile Robots," Proceedings of the Seventh Portuguese Conference on Artificial Intelligence (EPIA 1995), Madeira Island, Portugal.
Agah, A., and Bekey, G. 1997. "Phylogenetic and Ontogenetic Learning in a Colony of Interacting Robots," Autonomous Robots, Vol. 1, No. 4, January, pp. 85-100.
Agre, P. E., and Chapman, D. 1987. "Pengi: An Implementation of a Theory of Activity," Proceedings of the American Association of Artificial Intelligence Conference (AAAI-87), pp. 268-71.
Agre, P. E., and Chapman, D. 1990. "What Are Plans For?" Robotics and Autonomous Systems, Vol. 6, pp. 17-34.
Albus, J. 1981. Brains, Behavior, and Robotics, BYTE Books, Peterborough, NH.
Albus, J. 1991. "Outline for a Theory of Intelligence," IEEE Transactions on Systems, Man and Cybernetics, Vol. 21, No. 3, May-June, pp. 473-509.
Albus, J., McCain, H., and Lumia, R. 1987. "NASA/NBS Standard Reference Model for Telerobot Control System Architecture (NASREM)," NBS Technical Note 1235, Robot Systems Division, National Bureau of Standards.
Allee, W. 1978. Animal Aggregations, Univ. of Chicago Press, Chicago, IL.
Aloimonos, Y. (ed.) 1993. Active Perception, Lawrence Erlbaum Associates, Hillsdale, NJ.
Aloimonos, Y., and Rosenfeld, A. 1991. "Computer Vision," Science, Vol. 253, pp. 1249-53, September.
Altmann, S. A. 1974. "Baboons, Space, Time, and Energy," American Zoologist, Vol. 14, pp. 221-48.
Anderson, J. A. 1995. "Associative Networks," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 102-7.
Anderson, T., and Donath, M. 1991. "Animal Behavior as a Paradigm for Developing Robot Autonomy," in Designing Autonomous Agents, ed. P. Maes, MIT Press, Cambridge, MA, pp. 145-68.
Andresen, F., Davis, L., Eastman, R., and Kambhampati, S. 1985. "Visual Algorithms for Autonomous Navigation," Proceedings of the IEEE International Conference on Robotics and Automation, St. Louis, MO, pp. 856-61.
Arbib, M. A. 1964. Brains, Machines, and Mathematics, McGraw-Hill, New York.
Arbib, M. A. 1981. "Perceptual Structures and Distributed Motor Control," in Handbook of Physiology - The Nervous System II: Motor Control, ed. V. B. Brooks, American Physiological Society, Bethesda, MD, pp. 1449-80.
Arbib, M. A. 1992. "Schema Theory," in The Encyclopedia of Artificial Intelligence, 2nd ed., ed. S. Shapiro, Wiley-Interscience, New York, N.Y., pp. 1427-43.
Arbib, M. A. 1995a. "Schema Theory," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 830-34.
Arbib, M. A. 1995b. The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA.
Arbib, M., and House, D. 1987. "Depth and Detours: An Essay on Visually Guided Behavior," in Vision, Brain, and Cooperative Computation, ed. M. Arbib and A. Hanson, MIT Press, Cambridge, MA, pp. 129-63.
Arbib, M., Iberall, T., and Lyons, D. 1985. "Coordinated Control Programs for Movements of the Hand," in Hand Function and the Neocortex, eds. A. Goodman and I. Darian-Smith, Springer-Verlag, New York, pp. 135-70.
Arbib, M. A., Kfoury, A. J., and Moll, R. N. 1981. A Basis for Theoretical Computer Science, Springer-Verlag, New York.
Arkin, R. C. 1986. "Path Planning for a Vision-Based Autonomous Robot," Proceedings of the SPIE Conference on Mobile Robots, Cambridge, MA, pp. 240-49.
Arkin, R. C. 1987a. "Motor Schema Based Navigation for a Mobile Robot: An Approach to Programming by Behavior," Proceedings of the IEEE Conference on Robotics and Automation, Raleigh, NC, pp. 264-71.
Arkin, R. C. 1987b. "Towards Cosmopolitan Robots: Intelligent Navigation in Extended Man-Made Environments," Ph.D. Dissertation, COINS Technical Report 87-80, University of Massachusetts, Department of Computer and Information Science.
Arkin, R. C. 1988. "Homeostatic Control for a Mobile Robot: Dynamic Replanning in Hazardous Environments," Proceedings of the SPIE Conference on Mobile Robots III, Cambridge, MA, pp. 407-13.
Arkin, R. C. 1989a. "Neuroscience in Motion: The Application of Schema Theory to Mobile Robotics," in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, eds. J.-P. Ewert and M. Arbib, Plenum Press, New York, pp. 649-72.
Arkin, R. C. 1989b. "Motor Schema-Based Mobile Robot Navigation," International Journal of Robotics Research, Vol. 8, No. 4, pp. 92-112.
Arkin, R. C. 1989c. "Navigational Path Planning for a Vision-based Mobile Robot," Robotica, Vol. 7, pp. 49-63.
Arkin, R. C. 1989d. "Towards the Unification of Navigational Planning and Reactive Control," working notes, AAAI Spring Symposium on Robot Navigation, Stanford University, CA, March.
Arkin, R. C. 1990a. "The Impact of Cybernetics on the Design of a Mobile Robot System: A Case Study," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 20, No. 6, November/December, pp. 1245-57.
Arkin, R. C. 1990b. "Integrating Behavioral, Perceptual, and World Knowledge in Reactive Navigation," Robotics and Autonomous Systems, Vol. 6, pp. 105-22.
Arkin, R. C. 1991. "Reactive Control as a Substrate for Telerobotic Systems," IEEE Aerospace and Electronics Systems Magazine, Vol. 6, No. 6, June, pp. 24-31.
Arkin, R. C. 1992a. "Behavior-Based Robot Navigation for Extended Domains," Adaptive Behavior, Vol. 1, No. 2, pp. 201-225.
Arkin, R. C. 1992b. "Cooperation without Communication: Multiagent Schema Based Robot Navigation," Journal of Robotic Systems, Vol. 9, No. 3, April, pp. 351-64.
Arkin, R. C. 1992c. "Homeostatic Control for a Mobile Robot: Dynamic Replanning in Hazardous Environments," Journal of Robotic Systems, Vol. 9, No. 2, March, pp. 197-214.
Arkin, R. C. 1993. "Modeling Neural Function at the Schema Level: Implications and Results for Robotic Control," in Biological Neural Networks in Invertebrate Neuroethology and Robotics, eds. T. McKenna, R. Ritzmann, and R. Beer, San Diego, CA, pp. 383-410.
Arkin, R. C., and Ali, K. 1994. "Integration of Reactive and Telerobotic Control in MultiAgent Robotic Systems," Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB94) [From Animals to Animats], Brighton, UK, August, pp. 473-78.
Arkin, R. C., Balch, T., Collins, T., Henshaw, A., MacKenzie, D., Nitz, E., Rodriguez, R., and Ward, K. 1993. "Buzz: An Instantiation of a Schema-Based Reactive Robotic System," Proceedings of the International Conference on Intelligent Autonomous Systems (IAS-3), Pittsburgh, PA, February, pp. 418-27.
Arkin, R. C., and Hobbs, J. D. 1992. "Dimensions of Communication and Social Organization in MultiAgent Robotic Systems," From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, Honolulu, HI, December, MIT Press, Cambridge, MA, pp. 486-93.
Arkin, R. C., and Lawton, D. 1990. "Reactive Behavioral Support for Qualitative Visual Navigation," Proceedings of the IEEE International Symposium on Intelligent Motion Control, Istanbul, Turkey, 1990, pp. IP21-28.
Arkin, R. C., and MacKenzie, D. 1994. "Temporal Coordination of Perceptual Algorithms for Mobile Robot Navigation," IEEE Transactions on Robotics and Automation, Vol. 10, No. 3, June, pp. 276-86.
Arkin, R. C., and Murphy, R. R. 1990. "Autonomous Navigation in a Manufacturing Environment," IEEE Transactions on Robotics and Automation, Vol. 6, No. 4, August, pp. 445-54.
Arkin, R. C., Murphy, R., Pearson, M., and Vaughn, D. 1989. "Mobile Robot Docking Operations in a Manufacturing Environment: Progress in Visual Perceptual Strategies," Proceedings of the IEEE International Workshop on Intelligent Robots and Systems, Tsukuba, Japan, pp. 147-54.
Aron, S., Deneubourg, J., Goss, S., and Pasteels, J. 1990. "Functional Self-Organization Illustrated by Inter-Nest Traffic in Ants: The Case of the Argentine Ant," in Biological Motion, eds. W. Alt and G. Hoffmann, Springer-Verlag, Berlin, pp. 533-47.
Asada, M., Noda, S., Tawaratsumida, S., and Hosoda, K. 1995. "Vision-Based Reinforcement Learning for Purposive Behavior Acquisition," Proceedings of the IEEE International Conference on Robotics and Automation, May, pp. 146-53.
Ashby, W. R. 1952. Design for a Brain: The Origin of Adaptive Behavior, J. Wiley, New York, 1952 (Second edition 1960).
Åström, K. 1995. "Adaptive Control: General Methodology," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 66-69.
Atkeson, C., Moore, A., and Schaal, S. 1997. "Locally Weighted Learning for Control," Artificial Intelligence Review, February, Vol. 11, No. 1-5, pp. 11-73.
Badal, S., Ravela, S., Draper, B., and Hanson, A. 1994. "A Practical Obstacle Detection and Avoidance System," Proceedings of the Second IEEE Workshop on Applications of Computer Vision, Sarasota, FL, December, pp. 97-104.
Badler, N., and Webber, B. 1991. "Animation from Instructions," in Making Them Move: Mechanics, Control and Animation of Articulated Figures, eds. Badler, Barsky, and Zeltzer, Morgan Kaufmann, San Mateo, CA, pp. 51-93.
Bajcsy, R. 1988. "Active Perception," Proceedings of the IEEE, Vol. 76, No. 8, August, pp. 996-1005.
Bakker, P., and Kuniyoshi, Y. 1996. "Robot See, Robot Do: An Overview of Robot Imitation," AISB Workshop on Learning in Robots and Animals, Brighton, UK, April.
Balch, T., and Arkin, R. C. 1993. "Avoiding the Past: A Simple but Effective Strategy for Reactive Navigation," Proceedings of the IEEE International Conference on Robotics and Automation, Atlanta, GA, May, Vol. 1, pp. 678-85.
Balch, T., and Arkin, R. C. 1994. "Communication in Reactive Multiagent Robotic Systems," Autonomous Robots, Vol. 1, No. 1, pp. 27-52.
Balch, T., and Arkin, R. C. 1995. "Motor Schema-Based Formation Control for Multiagent Robot Teams," Proceedings 1995 International Conference on Multiagent Systems, San Francisco, CA, pp. 10-16.
Balch, T., Boone, G., Collins, T., Forbes, H., MacKenzie, D., and Santamaria, J. 1995. "Io, Ganymede, and Callisto - A Multiagent Robot Trash-Collecting Team," AI Magazine, Vol. 16, No. 2, Summer, pp. 39-51.
Ballard, D. 1989. "Reference Frames for Animate Vision," Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89), Detroit, MI, pp. 1635-41.
449
References Ballard, D., and Brown, C. 1993. " Principlesof Active Perception," in Active Perception , ed. Y. Aloimonos, LawrenceErlbaumAssociates , Hillsdale, NJ, pp. 245- 82. Barbera, A., Fitzgerald, M., Albus, J., and Haynes, L . 1984. " RCS: The NBS Realtime Control System," Proceedingsof the Robots8 Conference , Detroit, MI , June, pp. 19.1- 19.38. Barth, M ., and Ishiguro, H. 1994. " Distributed PanoramicSensingin Multiagent Robotics," Proceedingsof the IEEE International Conferenceon MultisensorFusion and Integrationfor Intelligent Systems , Las Vegas,NY, October, pp. 739- 46. Bartlett, F. C. 1932. Remembering : A Studyin Experimentaland Social Psychology, London, CambridgeUniversity Press. " " Basye, K. 1992. An Automata-BasedApproachto Robotic Map Learning, working notes, AAAI Fall Symposiumon Applicationsof AI to Real-WorldAutonomousMobile Robots. Beer, R. 1990. Intelligence as Adaptive Behavior: An Experimentin Computational , AcadelnicPress, New York, NY. Neuroethology Beer, R., Chiel, H., and Sterling, L . 1990. "A Biological Perspectiveon Autonomous " , Vol. 6, pp. 169- 86. Agent Design, RoboticsandAutonomousSystems " " Bekey, G., and Tomovic, R. 1986. Robot Control by Reflex Actions, IEEE International Conferenceon Roboticsand Automation, SanFrancisco, CA , April , pp. 240- 47. " " , Bekey G., andTomovic, R. 1990. Biologically BasedRobotControl, Proceedingsof theAnnual International Conferenceof the IEEE Engineeringin Medicineand Biology Society, Vol. 12, No. 5, pp. 1938- 39. Bellman, R. 1978. Artificial Intelligence: Can ComputersThink? Boyd and Frasier PublishingCo., Boston, MA . Benson, S., andNilsson, N. 1995. " Reacting, Planning, andLearningin anAutonomous " , D. Michie, and S. Muggleton, Agent, in MachineIntelligence14, eds. K. Furukawa ClarendonPress, Oxford, UK. Bersini, H. 1992. " Immune Network and Adaptive Control," Proceedingsof the First EuropeanConferenceon Artificial Ufe , Paris, France, pp. 217- 26. Biedennan, I. 1990. 
" Higher-Level Vision," in VisualCognitionand Action, Vol. 2, ed. N. Osherson , S. Kosslyn, andJ. Hollerbach, MIT Press, Cambridge, MA . Birnbaum, L., Brand, M ., and Cooper, P. 1993. " Looking for Trouble: Using Causal Semanticsto Direct Focusof Attention," Proceedingsof the FourthInternational Conference on ComputerVision(ICCV-93), Berlin, Gennany, May, pp. 49- 56. Bizzi, E., Mussa-Ivaldi, F., and Giszter, S. 1991. " ComputationsUnderlying the Execution " of Movement: A Biological Perspective , Science,Vol. 253, July, pp. 287- 91. Blake, A. 1993. " ComputationalModelling of Hand-Eye Coordination," in Active Perception , ed. Y. Aloimonos, LawrenceErlbaumAssocociates , Hillsdale, NJ, pp. 227- 44. Blake, A. 1995. "Active Vision," in The Handbookof Brain Theoryand Neural Networks , ed. M. Arbib , MIT Press, Cambridge, MA , pp. 61- 63. Boden, M. 1995. "AI ' s Half-Century," AI Magazine, Vol. 16, No. 2, Wmter, pp. 96- 99.
450
References . "Active Investigationof Functionality," Proceedings Bogoni, L., andBajcsy, R. 1994a on the Role of the Workshop of Functionality in Object Recognition, 1994Conf. on Vision and Pattern Computer Recognition, Seattle, WA, June. " Bogoni, L., and Bajcsy, R. 1994b. FunctionalityInvestigationUsing a DiscreteEvent " Roboticsand Autonomous , Vol. 13, No. 3, October, pp. Systems SystemApproach, 173- 96. Bohm, C., and Jacopini, G. 1966. " Flow Diagrams, Thring Machines, and Languages with Only 1WoFormationRules," Communications of theACM, May. Vol. 9, No. 5, pp. 366- 71. Bonasso,P. 1991. " UnderwaterExperimentsUsing a ReactiveSystemfor Autonomous Vehicles," Proceedingsof theAAAI , pp. 794- 800. Bonasso, P. 1992. " ReactiveControl of UnderwaterVehicles," Applied Intelligence, Vol. 2, No. 3, September , pp. 201- 04. Booker, L., Goldberg, D., andHolland, J. 1989. " ClassifierSystemsandGeneticAlgorithms " , Artificial Intelligence, Vol. 40, No. 1-3, pp. 235- 82. Borenstein, J., and Koren, Y. 1989. " Real-Time ObstacleAvoidancefor Fast Mobile Robots," IEEE Transactionson Systems , Man, and Cybernetics , Vol. 19, No. 5, September , pp. 1179- 87. Borenstein,J., andKoren, Y. 1991. "' TheVectorField Histogram- FastObstacleAvoidance for Mobile Robots," IEEE Transactionson RoboticsandAutomation, Vol. 7, No. 3, June, pp. 278- 88. Bower, T. 1974. "' TheEvolution of SensorySystems," in Perception: Essaysin Honor of JamesJ. Gibson, eds. R. MacLeodandH. Pick, Cornell UniversityPress,Ithaca, NY, p. 141. : Experimentaland Naturalistic Box, H. 1973. Organisationin Animal Communities Studiesof the SocialBehaviorof Animals, Butterworths, London. , Boyd, W. 1971. An Introductionto the Studyof Disease. Lea & Febiger, Philadelphia PA. " " Brady, M . 1985. Artificial Intelligence and Robotics, Artificial Intelligence and Robotics, Vol. 26, pp. 79- 121. : Experimentsin SyntheticPsychology , MIT Press,Cambridge Braitenberg,V. 1984. Vehicles , MA . " Brill , F. 1994. 
" Perceptionand Action in a Dynamic Three-DimensionalWorld, Proceedings of the IEEE Workshopon VisualBehaviors, Seattle, WA, June, pp. 60- 67. " Brooks, R. 1986. "A Robust Layered Control System for a Mobile Robot, IEEE Journal of RoboticsandAutomation, Vol. RA 2, No. I , pp. 14- 23. " Brooks, R. 1987a. " Planningis Justa Wayof Avoiding Figuring Out What to Do Next, . WorkingPaper303, MIT AI Laboratory, September " Brooks, R. 1987b. A HardwareRetargetableDistributed Layered Architecture for Mobile RobotControl," Proceedingsof theIEEE InternationalConferenceon Robotics and Automation, Raleigh, NC, May, pp. 106- 10.
451
References Brooks, R. 1989a."A RobotThat Walks: EmergentBehaviorsfrom a CarefullyEvolved " Network, Proceedingsof the IEEE International Conferenceon RoboticsandAutomation , May, pp. 692- 94. Brooks, R. 1989b. "The Whole Iguana," in RoboticsScience,ed. M . Brady, MIT Press, Cambridge, MA , pp. 432- 56. Brooks, R. 1990a. "The BehaviorLanguage," A.I. MemoNo. 1227, MIT AI Laboratory, April . Brooks, R. 1990b. " ElephantsDon' t Play Chess," in DesigningAutonomousAgents, ed. P. Maes, MIT Press, Cambridge, MA , pp. 3- 15. Brooks, R. 1991a. " IntelligenceWithout Reason," A.I. MemoNo. 1293, MIT AI Laboratory , April . Brooks, R. 1991b. " New Approaches to Robotics," Science, Vol. 253, September , pp. 1227- 32. . Brooks, R., and Flynn, A. 1989. " Robot Beings," Proceedingsof the IEEE/ RSJInternational Conferenceon Intelligent Roboticsand Systems(IROS-89), Tsukuba, Japan, . 2 10. pp Brooks, R. A., and Stein, L . 1994. " Building Brains for Bodies," AutonomousRobots, Vol. 1, No. 1, pp. 7- 25. Brown, C. 1991. " GazeBehaviorsfor Robots," in Active Perceptionand Robot Vision, eds. A. SoodandH. Wechsler,Springer-Verlag, Berlin, pp. 115- 39. " Budenske , J., and Gini, M . 1994. Why Is It So Difficult for a Robot to Passthrough a DoorwayUsing illtrasonic Sensors ?" Proceedingsof the IEEE International Conference on Roboticsand Automation, pp. 3124- 29. Buhler, M., Koditschek , D., and Kindlmann, P. 1989. "A Family of Robot Control " , Proceedingsof the International Strategiesfor IntemrittentDynamicalEnvironments , AZ , pp. 1296- 1301. Conferenceon Roboticsand Automation, Scottsdale " CourseControl Stored M. and H. 1972 . , , Mittelstaedt , Burger by ProprioceptiveInformation in Millipedes," Biocybernetics , Vol. IV, ed. H. DrischelandP. Dettmar, FischerVerlag, Berlin, June 1972. Byrnes, R., Healey, A., McGhee, R., Nelson, R., Kwak, S., and Brotzman, D. 1996. "The RationalBehaviorSoftwareArchitecturefor " Intelligent Ships, Naval Engineers Journal, Vol. 108, No. 2, March, pp. 
43- 55. " Cal, A ., Fukuda , T., Araj, F., Ueyama , T., andSakal, A. 1995. HierarchicalControl Architecture for Cellular RoboticSystem- SimulationsandExperiments," Proceedingsof the IEEE International Conferenceon RoboticsandAutomation, June, pp. 1191- 96. Cameron,J., MacKenzie, D., Ward, K., Arkin , R., andBook, W. 1993. " ReactiveControl for Mobile Manipulation," Processingof the International Conferenceon Robotics and Automation, Atlanta, GA, pp. 228--35. " Cao, Y., Fukunaga , A., Kabog, A., andMeng, F. 1995. CooperativeMobile Robotics: " Antecedentsand Directions, Proceedingsof the IEEE/ RSJInternational Conference on Intelligent Roboticsand Systems(IROS ' 95), Pittsburgh, PA, pp. 226--34.
452
References Cardoze, D., and Arkin , R. C. 1995. " Developmentof Visual TrackingAlgorithms for an AutonomousHelicopter," Procee~ingsof theMobile RobotsX , Philadelphia,PA, pp. 145- 56. " Carley, K. 1995. Computationaland MathematicalOrganizationTheory: Perspective " and Directions, Computationand MathematicalOrganizationTheory, Vol. 1, No. 1, pp. 39- 56. Carr, G. M ., and MacDonaldD . 1986. "The Sociality of Solitary Foragers: A Model Basedon ResourceDispersion," Animal Behavior, Vol. 34, pp. 1540- 49. -Perez, F. 1995. " VisuomotorCoordinationin FrogsandToads," in TheHandbook Cervantes of Brain Theoryand Neural Networks, ed. M. Arbib, MIT Press, Cambridge,MA , pp. 1036- 42. Chalmers, D. 1995. "The Puzzleof ConsciousExperience," ScientificAmerican, Vol. . 273, No. 6, pp. 80- 86, December " " lntennediateVision: Architecture . D. 1990 , Implementation, and Use, Chapman, TechnicalReportTR-90- 06, TeleosResearch , PaloAlto , CA, October. " . Coordination and Control of a Group of Small Robots," L. and Lob J. 1994 Chen, , , Proceedingsof the International Conferenceon Roboticsand Automation, May, pp. 2315- 20. Christensen , H., Bowyer, K., and Bunke, Heds . 1993. Active Robot Vision: Camera . Heads, Model-BasedNavigationand ReactiveControl, World Scientific, Singapore " Christiansen , A., Mason, M., and Mitchell , T. 1991. LearningReliableManipulation ' , Vol. 8, Strategieswithout Initial PhysicalModels: RoboticsandAutonomousSystems No. 1, pp. 7- 18. " Chon, W., and Jochem , T. 1994. UnmannedGroundVehicleDemo II : Demonstration Wmter A ," UnmannedSystems , pp. 14- 20. , " Chun, W., Lynch, R., Shoemaker , C., and Munkeby, S. 1995. UGV- Demonstration " . 2025. B , UnmannedSystems , Summer, pp " Clark, R. J., Arkin , R. C., and Ram, A . 1992. Learning Momentum: On-Line Performance " Enhancementfor ReactiveSystems, Proceedingsof the IEEE International Conferenceon RoboticsandAutomation, Nice, France, May, pp. 111- 16. Colgan, P. 1983. ComparativeSocialRecognition, J. 
Wiley, New York. " Collins, T. R., Arkin , R. C., and Henshaw , A. M . 1993. Integrationof ReactiveNavigation " with a Flexible ParallelHardwareArchitecture, Proceedingsof the IEEE International Conferenceon Roboticsand Automation, Atlanta, GA, May, Vol. 1, pp. 271- 76. Colombetti, M., and Dorigo, M. 1992. " Learning to Control an AutonomousRobot " by DistributedGeneticAlgorithms, From Animals to Animats 2: Proceedingsof the SecondInternational Conferenceon Simulationof Adaptive Behavior, Honolulu, HI , December , MIT Press, CambridgeMA , pp. 305- 12. " Connell, J. 1987. " CreatureBuilding with the SubsumptionArchitecture, Proceedings IJCAI 87 on ), Milan , Italy, of the InternationalJoint Conference Artificial Intelligence( 1124 26. . pp
453
References Connell, J. 1989a . "A Behavior-BasedArm Controller," lE EETransactionson Robotics and Automation, Vol. 5, No. 6, December , pp. 784- 91. Connell, J. 1989b. "A Colony Architecturefor an Artificial Creature," TechnicalReport No. 1151, MIT AI Laboratory, August. Connell, J. 1992. " SSS: A Hybrid ArchitectureApplied to RobotNavigation," Proceedings of the IEEE International Conferenceon Roboticsand Automation, Nice, France, pp. 2719--24. Connell, J., and Viola P. 1990. " CooperativeControl of a Semi-AutonomousMobile Robot," Proceedingsof the IEEE International Conferenceon Roboticsand Automation , pp. III8 --21. " Connolly, C., and Gropen, R. 1993. On the Applications of Harmonic Functionsto " Robotics, Journal of RoboticSystems , Vol. 10, No. 7, pp. 931- 46. D. Cook, , Gmytrasiewicz, P., and Holder, L. 1996. " Decision-TheoreticCooperative ,SensorPlanning," IEEE Transactionson PatternAnalysis and Machine Intelligence, Vol. 18, No. 10, October, pp. 1013- 23. Craig, J. 1989. Introduction to Robotics: Mechanicsand Control, 2nd Ed., AddisonWesley, Reading, MA . Crisman, J. 1991. " Color RegionTracking for VehicleGuidance," Active Vision, MIT Press, Cambridge, MA , pp. 107- 20. " , M . 1994. IntegrationandControl Crowley, J., Bedrune, J., Bekker, M ., andSchneider " of Visual Process es, Proceedingsof the IEEE Workshopon VisualBehaviors, Seattle, WA, June, pp. 45- 52. " Croy, M ., and Hughes, R. 1991. Effects of Food Supply, Hunger, Danger, and Competition " on Choice of ForagingLocation by the Fifteen-spinedStickleback , Animal Behavior, Vol. 42, pp. 131- 39. Culhane, S., and Tsotsos, J. 1992. "An Attentional Prototypefor Early Vision," Proceedings of the SecondEuropeanConferenceon Computer Vision, ed. G. Sandini, LNCS-SeriesVol. 588, Springer-Verlag, Berlin, May, pp. 551- 60. Davis, R. 1982. "Applicationsof MetaLevel Knowledgeto the Construction, Maintenance " , and Useof Large KnowledgeBases, in Knowledge-BasedSystemsin Artificial -Hill , New York, pp. 
229- 490. R. eds . Davis and D. Lenat McGraw , , Intelligence Dean, T., Angluin, D., Basye, K., Engelson, S., Kaelbling, L., Kokkevis, E., and Maron, O. 1995. " Inferring Finite Automatawith StochasticOutput Functionsand an " Application to Map Learning, MachineLearning, Vol. 18, No. I , pp. 81- 108, Jan. Dean, T., and Wellman, M. 1991. Planning. and Control, Morgan-Kaufmann, San Mateo, CA. " " , J., and Goss, S. 1984. CollectivePatternsand Decision-Making, EtholDeneubourg ogy, Ecology, and Evolution, Vol. 1, pp. 295 311. " " Dennett, D. 1982. Stylesof Mental Representation , Proceedingsof the Aristotelian Society, Vol. LXXXIII , pp. 213 16. Dennett, D. 1991. Consciousness Explained, Little , Brown, andCo., Boston, MA .
454
References
Dickmanns, E. 1992. "A GeneralDynamic Vision Architecturefor UGV and UAV," , pp. 251- 70. Applied Intelligence, Vol. 2, No. 3, September " Dickmanns, E., andZapp, A. 1985. Guiding Land VehiclesalongRoadwaysby Computer Vision," AFCET Conference , Toulouse, France. " Donald, B. 1993. Information Invariantsin Robotics: Partll - SensorsandComputation " , Proceedingsof the International Conferenceon RoboticsandAutomation, Vol. 3, pp. 284- 90. Donald, B., andJennings,J. 1991a. " SensorInterpretationandTask-DirectedPlanning " , Proceedingsof the International Conference Using PerceptualEquivalenceClasses on Roboticsand Automation, Anahelm , CA, pp. 190- 97. Donald, B., andJennings,J. 1991b. " PerceptualLimits, PerceptualEquivalenceClasses , and a Robot' s Sensori -ComputationalCapabilities," Proceedingsof the IEEE/ RSJInternational ' Conferenceon IntelligentRoboticsand Systems(IROS 91), pp. 1397- 1405. " Donald, B., Jennings,J., and Brown, R. 1992. ConstructiveRecognizabilityfor Task" DirectedRobotProgramming , RoboticsandAutonomousSystems , Vol. 9, No. 1- 2, pp. 41 74. Drexler, E. 1986. Enginesof Creation: The ComingEra of Nanotechnology , Anchor Press/ Doubleday, New York. Drexler, E. 1992. Nanosystems : Molecular Machinery, Manufacturingand Computation , Wiley-interscience,New York. Duchon A., WarrenW., and Kaelbling, L. 1995. " Ecological Robotics: Controlling AMual Conferenceof the Behaviorwith Optic Flow," Proceedingsof the Seventeenth . Science 164 69. Society, pp Cognitive " Dudek, G., Jenkin, M ., Milios , E., and Wilkes, D. 1993. A Taxonomyfor Swarm Robots," Proceedin?,s C?f the IEEE/ RSJInternational Conferenceon Intelli ?,ent Robots
' andSystems , Japan , pp. 441- 47. (IROS93), Yokohama " " BIfes, A. 1986 . A SonarBasedMappingandNavigation , Proceedings of the System IEEEInternationalConference San Francisco on RoboticsandAutomation , , CA, pp. - 56. 1151 " Think'!" AI Magazine , Vol. 13, No. 2, Summer , pp. , R. 1992. CanMachines Epstein 80- 95. -Roth, F., Lesser Ennan, L., Hayes , V., andReddy , D. 1980."The HearsayII Speech " : IntegratingKnowledge to ResolveUncertainty , Computing Understanding System . 12 No. 2 . 213 -53. Vol , , pp , Surveys " -InspiredHexapod , K., Quinn,R., Chiel, H., andBeer,R. 1994. Biologically Espenscheid " onRobotics and RobotControl, Proceedings of theFifthInternational Conference ' , Maul, HI, pp. 89- 102. (ISRAM94), August Manufacturing Everett MobileRobots , MA. , A.K. Peters , Wellesley , B. 1995.Sensorsfor Fundamentals Ewert, J-P. 1980.Neuroethology : An Introductionto theNeurophysiological -Verlag , Berlin. , Springer of Behavior ErlbaumAssociates , , M. 1993.Principlesof Cognitive , Lawrence Psychology Eysenck Hove,UK.
455
References
" -Learning Approach Fagg, A., Lotspeich, D., and Bekey, G. 1994. A Reinforcement to ReactiveControl Policy Designfor AutonomousRobots," Proceedingsof the IEEE International Conferenceon Roboticsand Automation, pp. 39- 44. Ferrell, C. 1994. " RobustAgent Control of an AutonomousRobot with Many Sensors andActuators," MiS. Thesis, MIT AI Laboratory, Cambridge,MA . Ferrier, N., andClark, J. 1993. "The HarvardBinocularHead," in Active Robot Vision: CameraHeads, Model-BasedNavigation and ReactiveControl, eds. H. Christensen , K. Bowyer, andH. Bunke, World Scientific, Singapore , pp. 9- 31. Fikes, R., and Nilsson, N. 1971. " STRIPS: A New Approachto the Application of TheoremProvingto ProblemSolving," Artificial Intelligence, Vol. 2, pp. 189- 208. " " Firby, R. J. 1989. AdaptiveExecutionin ComplexDynamic Worlds, PhiD. Dissertation Technical YALEU/CSD / RR #672 Yale New Haven, CT. , , University, Report " " Firby, R. J. 1995. LessonsLearnedfrom the Animate Agent Project(So Far), working notes, AAAl Spring Symposiumon LessonsLearnedfrom ImplementedSoftware Architecturesfor PhysicalAgents, PaloAlto , CA, March, pp. 92- 96. " Firby, R. J., and Slack, M. 1995. Task Execution: Interfacingto ReactiveSkill Networks " , working notes, AAAl SpringSymposiumon LessonsLearnedfrom Implemented SoftwareArchitecturesfor PhysicalAgents, PaloAlto , CA, March, pp. 97- 111. : A Multidisciplinary Approach, Academic Fite, K. 1976. TheAmphibianVisualSystem Press, New York. " Floreano, D., and Mondada , F. 1996. Evolution of Homing Navigation in a Real " Mobile Robot, IEEE Transactionson Systems , Man, and Cybernetics , Vol. 26, No. 3, June, pp. 396- 407. FlorenceS ., and Kaas, J. 1995. " Somatotopy: Plasticity of SensoryMaps," in The Handbookof Brain Theoryand Neural Networks, ed. M . Arbib, MIT Press, Cambridge MA , pp. 888- 91. " " Flynn, A. 1987. GnatRobots(andHow They Will ChangeRobotics), Working Paper No. 295, MIT AI Laboratory, Cambridge, MA , June. " " Flynn, A., and Brooks, R. 1989. 
Battling Reality, AI Memo No. 1148, MIT AI Laboratory, Cambridge, MA , October. " : A Gnat Flynn, A., Brooks, R., andTavrowL. 1989. 1Wilight ZonesandCornerstones " Robot Double Feature, A.I. Memo No. 1126, MIT AI Laboratory, Cambridge, MA , July. " Flynn, A., Brooks, R., Wells, W., and Barrett, D. 1989. Squirt: The Prototypical " Mobile Robot for AutonomousGraduateStudents, A.I. Memo No. 1120, MIT AI Laboratory, CambridgeMA , July. " Fok, K.-Y., and Kabuka, MR . 1991. An Automatic Navigation Systemfor Vision " GuidedVehiclesUsing a DoubleHeuristicanda Finite StateMachine, IEEE Transactions on Roboticsand Automation, Vol. 7, No. 1, February, pp. 181- 88. " Franceshini,N., Pichon, J., andBianes, C. 1992. " From InsectVision to RobotVision, ThePhilosophicalTransactionsof theRoyalSocietyof LondonB, Vol. 337, pp. 283- 94.
456
References Franceshini, N., Riehle, A., and Le Nestour, A. 1989. " Directionally SelectiveMotion Detectionby InsectNeurons," in Facetsof Vision, eds. Slavengaand Hardie, SpringerVerlag, Berlin, pp. 360- 90. Franklin, R. F., and Hannon, L . A. 1987. Elementsof CooperativeBehavior. Internal Researchand DevelopmentFinal Report 655404-1-F , EnvironmentalResearchInstitute of Michigan ( BRIM), Ann Arbor, MI . FranklinD ., Kahng, A., and Lewis, M . 1995. " DistributedSensingand Probing with " Multiple SearchAgents: TowardSystem-Level LandmineDetectionSolution, in Proceedings DetectionTechnologies for Mines and Minelike Targets, SPIE Vol. 2496, pp. 69&- 709. Franks, N. 1986. "TeaIns in Social Insects: Group Retrievalof Prey by Anny Ants," BehavioralEcologyand Sociobiology, Vol. 18, pp. 425- 29. Fu, D., Hammond K., and Swain, M. 1994. " Vision and Navigation in Man-made Environments: Looking for Syrup in All the Right Places," Proceedingsof the IEEE Workshopon VisualBehaviors, Seattle, WA, June, pp. 20- 26. Fukuda, T., Nakagawa , S., Kawauchi, Y., and Buss, M . 1989. " StructureDecision for Self OrganisingRobots Basedon Cell Structures- CEB( Jf," IEEE International , AZ , pp. 695- 70. Conferenceon Roboticsand Automation, Scottsdale " Fukuda, T., and Sekiyama, K. 1994. CommunicationReductionwith Risk Estimate for Multiple Robotic System," Proceedingsof the IEEE International Conferenceon Roboticsand Automation, pp. 2864- 69. " Gachet, D., Salichs, M ., Moreno, L., andPi mental , J. 1994. LearningEmergentTasks " for an AutonomousMobile Robot, Proceedingsof the International Conferenceon ' Intelligent Robotsand Systems(IROS 94), Munich, Germany,September , pp. 290- 97. " " Gage, D. 1992. SensorAbstractionsto SupportMany-Robot Systems, Proceedings of Mobile RobotsVII , Boston, MA , November,pp. 235--46. " Gallagher, J., and Beer, R. 1992. 
A QualitativeDynaInical Analysisof EvolvedLocomotion " Controllers, From Animals to Animats2: Proceedingsof the SecondInternational Conferenceon Simulationof AdaptiveBehavior, Honolulu, In , December , MIT Press, CambridgeMA , pp. 71- 80. Gallistel, C. R. 1980. TheOrganizationof Action: A NewSynthesis , LawrenceErlbaum Associates , Hinsdale, NJ. Gallistel, C. R. 1990. The Organizationof Learning, MIT Press, Cambridge, MA . Gardner, H. 1985. The Mind 's New Science : A History of the CognitiveRevolution, BasicBooks, New York. Garey, M ., andJohnson,D. 1979. Computersand Intractability: A Guideto the Theory , W.H. FreemanandCo., SanFrancisco, CA. of NP-Completeness Gat, E. 1991a . " Reliable Goal-Directed Reactive Control of Autonomous Mobile Robots," Ph. D. Dissertation, Virginia Polytechnic Institute and State University, Blacksburg.
Gat, E. 1991 b. "ALFA: A Language for Programming Reactive RoboticControlSystems " onRobotics andAutomation , Proceedings , of theIEEEInternational Conference Sacramento , CA, pp. 1116-20.
457
References
" Gat, E. 1992. IntegratingPlanningandReactionin a Heterogeneous Asynchronous " for ControllingReal-WorldMobileRobots Architecture , Proceedings of theAAAl . Gat, E. 1995, "TowardsPrincipledExperimentalStudy of AutonomousMobile Robots Robots , Vol. 2, No. 3, pp. 179-89. ," Autonomous " ~ RobotNavigation Gat, E., andDorais,G. 1994 ," Proceedings Sequencing byConditional - 99. . on Robotics and Automation the IEEE International , pp 1293 Conference of " : NeuralMapfor On LineLearning Gaussier , S. 1994. A Topological , P., andZrehen in a MobileRobot," Proceedings Avoidance of Obstacle of theThirdConference Emergence on Simulation , of AdaptiveBehavior(FromAnimalsto Animats3), MIT Press - 90. MA . 282 , , Cambridge pp " to ControlArtificial Gaussier , P., andZrehen , S. 1995. PerAc: A NeuralArchitecture Animals," RoboticsandAutonomous , Vol. 16, pp. 291- 320. Systems " " andPlanning , A. 1987. Reactive , M. andLansky of Proceedings Reasoning Georgeff .theAAAl -87, pp. 677- 82. " andPlanningin Dynamic , A., andSchoppers , M. 1986. Reasoning , M., Lansky Georgeff NoteNo. 380, : An Experiment with a MobileRobot," SRITechnical Domains . AI Center , SRIInternational " " , Vol. 9, pp. , AnnualReviewof Neuroscience , A. 1986. On Reaching Georgopoulos 14770. -Hall, Engle: An Introductionto Drugs, Prentice Gerald , M. C. 1981.Pharmacology woodCliffs, NJ. " of theHumanBrain," in TheBrain, W.H. Freeman Geschwind , N. 1979. Specializations , NewYork, pp. 108-17. Gibson , HoughtonMifflin, , J. J. 1979.TheEcologicalApproachto VisualPerception Boston , MA. " : An (Almost) UniversallyBadIdea," A/ Magazine , M. 1989. UniversalPlanning Ginsberg , Vol. 10, No. 4, Winter,pp. 40- 44. " andMotion Giralt, G., Chatila , M. 1984. An Integrated , R., andVaisset Navigation " ControlSystemfor Autonomous , First International MultisensoryMobile Robots Research onRobotics , ed. M. BradyandR. Paul,pp. 191- 214. Symposium Goel, A., Ali , K., Donnellan, M., GomezdeSilvaGarza , T. 1994. 
, A., andCallantine " , pp. ," IEEEExpert, Vol. 9, No. 6, December Multistrategy AdaptivePathPlanning 57- 65. Goetsch , AnnArbor, MI. , W. 1957.TheAnts. Universityof MichiganPress in . D. Genetic , andMachineLearning , Optimization , 1989 Algorithms Search Goldberg MA. Addison , , WesleyReading " Controlof an , S. andLuo, R. 1994. FuzzyBehaviorFusionfor Reactive Goodridge IEEE International the Autonomous MobileRobot:MARGE," Proceedings Conference of - 27. andAutomation onRobotics , pp. 1622 Goss , J. 1990. " How Trail , J., Arnn, S., andPasteels , R., Deneubourg , S., Beckers " Mechanisms Problems , in Behavioral LayingandTrailFollowingCanSolveForaging . 661- 78. . R. ed Food Selection , , , , , Germany pp Heidelberg Verlag HughesSpringer of
458
References Graefe, V. 1990. "An Approach to ObstacleRecognition for AutonomousMobile Robots," Proceedingsof the IEEE International Workshopon Intelligent Robotsand ' Systems(IROS 90), pp. 151- 58. " Grefenstette , J. and Schultz, A. 1994. An Evolutionary Approach to Learning in " Robots, MachineLearning Workshopon Robot Learning, New Brunswick, NJ, July. Also availableas NCARAl ReportAlC -94-014, Navy Centerfor Applied Researchin Artificial Intelligence, Washington , DC. " , T. 1990. AutochthonousBehaviors- MappingPerception Grupen, R., andHenderson " to Action, in Traditional and Non Traditional Robotic Sensors , ed. T. Henderson , NATO ASI Series, Vol. F-63, Springer-Verlag, Berlin, pp. 285- 311. " " Guigon, E., and Burnod, Y. 1995. Short-Term Memory, in The Handbookof Brain Theoryand Neural Networks, ed. M . Arbib, MIT Press, Cambridge, MA , pp. 867- 71. " " " Hartley, R., and Pipitone, F. 1991. Experimentswith the SubsumptionArchitecture, Proceedingsof theIEEE InternationalConferenceon RoboticsandAutomation, Sacramento , CA , pp. 1652- 8. " " Hayes, G., and Demiris, J. 1994. A Robot Controller Using Learning by Imitation, Proceedingsof the SecondInternational Symposiumon Intelligent Robotic Systems , Grenoble, France, pp. 198- 204. " " Hayes-Roth, B. 1995. An Architecture for Adaptive Intelligent Systems, Artificial Intelligence, Vol. 72, No. 1 2, January,pp. 329- 65. " Hayes-Roth, B., Lalanda, P., Morignot, P., Pfleger, K., andBalabanovic,P. 1993. Plans " andBehaviorin IntelligentAgents, TechnicalReportKSL- 93-43, KnowledgeSystems Laboratory, StanfordUniversity, Stanford, CA. Hayes-Roth, B., Pfleger, K., Morignot, P., Lalanda, P., and Balabanovic, M. 1995. "A Domain" , IEEE Specific SoftwareArchitecturefor Adaptive Intelligent Systems Transactionson SoftwareEngineering, Vol. 21, No. 4, April , pp. 288- 301. " in NaturalTasks," Hayhoe, M ., Ballard, D., and Pelz, J. 1994. 
Visual Representations Proceedingsof the IEEE Workshopon VisualBehaviors, Seattle, WA, June, pp. 1- 9. Head, H. andHolmes, G., 1911. " SensoryDisturbancesfrom CerebralLesions," Brain, Vol. 34, p. 102. Hebb, D. 1949. TheOrganizationof Behavior, New York, Wiley. Hexmoor, H. and Kortenkamp, D. 1995. " Issueson Building Softwarefor Hardware " Agents, KnowledgeEngineeringReview, Vol. 10, No. 3, pp. 301- 04. Hexmoor, H., Kortenkamp, D., Arkin , R., Bonasso, P., and Musliner, Deds . 1995. Working Notes, LessonsLearnedfrom ImplementedSoftwareArchitecturesfor Physical Agents, AAAI Spring SymposiumSeries, March, Stanford, CA. Hillis , D. 1988. " Intelligence as EmergentBehavior, or the Songsof Eden," in The , MIT Press, Cambridge, Artificial IntelligenceDebate: FalseStarts, Real Foundations MA . Hinton, G., and Sejnowski, T. 1986. " Learning and Relearningin Boltzmann Machines " , in Parallel DistributedProcessing , Vol. 1, eds. D. RumelhartandJ. McClellan, MIT Press, Cambridge, MA , pp. 282- 317.
459
References " Hodgins, J., and Brogan, D. 1994. Robot Herds: Group Behaviorsfor Systemswith " SignificantDynamics, Artificial Ufe IV , MIT Press, Cambridge, MA , pp. 319- 24. " " , MIT EpistemolHogg, D., Martin, F., and Resnick, R. 1991. BraitenbergCreatures and Me morandum No. 13 MA . , Cambridge, ogy Learning " " Hogg, T. 1996. QuantumComputingandPhaseTransitionsin CombinatorialSearch, Journal of Artificial IntelligenceResearch , Vol. 4, pp. 91 128. Holldobler, B., andWilson, E. 1990. TheAnts, BelknapPress, Cambridge,MA . Horswill, I. 1993a. " Specializationof PerceptualProcess es," PhiD. Dissertation, Department of Electrical Engineeringand ComputerScience, Massachusetts Institute of Technology,Cambridge, MA May. Horswill, I. 1993b. " Polly, A Vision-BasedArtificial Agent," Proceedingsof theAAAl 93, Washington , DC, pp. 824- 29. Horswill, I., and Brooks, R. A . 1988. " SituatedVision in a Dynamic Environment: . " ChasingObjects, Proceedingsof the SeventhNational Conferenceon Artificial Intelligence ' (AAAl 88), St. Paul, MN , August, pp. 796- 800. " Horswill, I., and Yamamoto, M . 1994. "A $1000Active StereoVision System , Proceedings the IEEE on Visual WA . Behaviors Seattle June 107 10. , , , , pp of Workshop " " Huang, H M. 1996. An Architectureanda Methodologyfor IntelligentControl, IEEE Expert: Intelligent Systemsand Their Applications, Vol. II , No. 2, April , pp. 46- 55. Huber, M., and Durfee, E. 1995. " Deciding When to Commit to Action During -BasedCoordination," Proceedingsof the First International Conference Observation on MultiagentSystems(ICMAS '95), SanFrancisco, CA, pp. 163- 69. Hull , C. 1943. Principlesof Behavior: An Introductionto BehaviorTheory, AppletonCentury-Crofts, New York. " fig , W., andBerns, K. 1995. A LearningArchitectureBasedon ReinforcementLearning for Adaptive Control of the Walking Machine LAURON," Roboticsand Autonomous , Vol. 15, October, pp. 321- 34. Systems " Ishiguro, A., Ichikawa, S., and Uchikawa, Y. 1994. 
A Gait Acquisition of a Six" LeggedRobotUsing ImmuneNetworks, Proceedingsof theInternational Conference on Intelligent Roboticsand Systems(IROS ' 94), Munich, Germany,pp. 1034- 41. " Ishiguro, H., Maeda, T., Miyashita, T., and Tsuji, S. 1994. A Strategyfor Acquiring an EnvironmentalModel with PanoramicSensingby a Mobile Robot," Proceedingsof the IEEE International Conferenceon Roboticsand Automation, SanDiego, CA , pp. 724- 29. Jablonski, J., and Posey, J. 1985. " RoboticsTerminology," in Handbookof Industrial Robotics, ed. S. Nof, J. Wiley, New York, pp. 1271- 1303. " Janet, J., Schudel, D., White, M ., England, A ., and Snyder, W. 1996. Global SelfLocalizationfor Actual Mobile Robots: GeneratingandSharingTopographicalKnowledge " Using the Region-FeatureNeural Network, Proceedingsof the IEEE International . , DC, December Conferenceon MultisensorFusionand Integration, Washington " " Jeme, N. 1973. The ImmuneSystem , ScientificAmerican, Vol. 229, pp. 52- 60.
460
References
Johnson, P., and Bay, J. 1995. "Distributed Control of Simulated Autonomous Mobile Robot Collectives in Payload Transportation," Autonomous Robots, Vol. 2, No. 1, pp. 43-64.
Johnson, S. D. 1983. Synthesis of Digital Designs from Recursion Equations, MIT Press, Cambridge, MA.
Kaas, J., Krubitzer, L., Chino, Y., Langston, A., Polley, E., and Blair, N. 1990. "Reorganization of Retinotopic Cortical Maps in Adult Mammals after Lesions of the Retina," Science, Vol. 248, April, pp. 229-31.
Kaelbling, L. 1986. "An Architecture for Intelligent Reactive Systems," SRI International Technical Note No. 400, Menlo Park, CA, October.
Kaelbling, L. 1987. "REX: A Symbolic Language for the Design and Parallel Implementation of Embedded Systems," Proceedings of the AIAA Conference on Computers in Aerospace VI, Wakefield, MA, pp. 255-60.
Kaelbling, L., and Rosenschein, S. 1991. "Action and Planning in Embedded Agents," in Designing Autonomous Agents, ed. P. Maes, MIT Press, Cambridge, MA, pp. 35-48.
Kahn, P. 1991. "Specification and Control of Behavioral Robot Programs," Proceedings of the SPIE Sensor Fusion IV, Boston, MA, November.
Kaufmann, J. 1974. "Social Ethology of the Whiptail Wallaby, Macropus Parryi, in Northeastern New South Wales," Animal Behavior, Vol. 22, pp. 281-369.
Keirsey, D., Mitchell, J., Payton, D., and Preyss, E. 1984. "Multilevel Path Planning for Autonomous Vehicles," SPIE Vol. 485, Applications of Artificial Intelligence, pp. 133-37.
Kelly, M. and Levine, M. 1995. "Where and What: Object Perception for Autonomous Robots," Proceedings of the IEEE International Conference on Robotics and Automation, pp. 261-67.
Kenny, P., Bidlack, C., Kluge, K., Lee, J., Huber, M., Durfee, E., and Weymouth, T. 1994. "Implementation of a Reactive Autonomous Navigation System on an Outdoor Mobile Robot," Proceedings of the Association of Unmanned Vehicle Systems Annual Symposium, Detroit, MI, May, pp. 233-39.
Khatib, O. 1985. "Real-Time Obstacle Avoidance for Manipulators and Mobile Robots," Proceedings of the IEEE International Conference on Robotics and Automation, St. Louis, MO, pp. 500-05.
Kim, J., and Khosla, P. 1992. "Real-Time Obstacle Avoidance Using Harmonic Potential Functions," IEEE Transactions on Robotics and Automation, Vol. 8, No. 3, June, pp. 338-49.
Kirchner, W., and Towne, W. 1994. "The Sensory Basis of the Honeybee's Dance Language," Scientific American, June, pp. 74-80.
Kluge, K., and Thorpe, C. 1989. "Explicit Models for Robot Road Following," Proceedings of the International Conference on Robotics and Automation, Scottsdale, AZ, pp. 1148-54.
Kohler, W. 1947. Gestalt Psychology: An Introduction to New Concepts in Modern Psychology, Liveright Publishing Co., New York.
Kolodner, J. 1994. Case-Based Reasoning, Morgan Kaufmann, San Mateo, CA.
Koren, Y., and Borenstein, J. 1991. "Potential Field Methods and Their Inherent Limitations for Mobile Robot Navigation," Proceedings of the IEEE International Conference on Robotics and Automation, Sacramento, CA, pp. 1398-1404.
Kosecka, J., Bajcsy, R., and Mintz, M. 1993. "Control of Visually Guided Behaviors," Technical Report MS CS93 101, Department of Computer Science, University of Pennsylvania, Philadelphia.
Kosko, B., and Isaka, S. 1993. "Fuzzy Logic," Scientific American, Vol. 268, No. 1, July, pp. 76-81.
Koy-Oberthur, R. 1989. "Perception by Sensorimotor Coordination in Sensory Substitution for the Blind," in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, eds. J. P. Ewert and M. A. Arbib, Plenum, New York, pp. 397-418.
Krogh, B. 1984. "A Generalized Potential Field Approach to Obstacle Avoidance Control," SME-RI Technical Paper MS84-484, Society of Manufacturing Engineers, Dearborn, Michigan.
Krogh, B. and Thorpe, C. 1986. "Integrated Path Planning and Dynamic Steering Control for Autonomous Vehicles," Proceedings of the IEEE International Conference on Robotics and Automation, San Francisco, CA, pp. 1664-69.
Krose, B. 1995. "Learning from Delayed Rewards," Robotics and Autonomous Systems, Vol. 15, No. 4, October, pp. 233-36.
Kube, C. R. and Zhang, H. 1992. "Collective Robotic Intelligence," From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, Honolulu, HI, December, MIT Press, Cambridge, MA, pp. 460-68.
Kube, C. R. and Zhang, H. 1994. "Stagnation Recovery Behaviors for Collective Robotics," Proceedings of the International Conference on Intelligent Robots and Systems (IROS '94), pp. 1883-90.
Kuipers, B., and Byun, Y-T. 1988. "A Robust, Qualitative Method for Robot Spatial Learning," Proceedings of the Seventh National Conference on Artificial Intelligence, pp. 774-79.
Kuipers, B., and Byun, Y-T. 1991. "A Robot Exploration and Mapping Strategy Based on a Semantic Hierarchy of Spatial Representations," Robotics and Autonomous Systems, Vol. 8, pp. 47-63.
Kuniyoshi, Y. 1995. "Behavior Matching by Observation for Multi-Robot Cooperation," International Symposium of Robotics Research, Herrsching am Ammersee, Germany, pp. 343-52.
Kuniyoshi, Y., Nobuyuki, K., Sugimoto, K., Nakamura, S., and Suehiro, T. 1995. "A Foveated Wide Angle Lens for Active Vision," Proceedings of the IEEE International Conference on Robotics and Automation, Nagoya, Japan, May, pp. 2982-88.
Kuniyoshi, Y., Rougeaux, S., Ishii, M., Kita, N., Sakane, S., and Kakikura, M. 1994. "Cooperation by Observation: The Framework and Basic Task Patterns," Proceedings of the IEEE International Conference on Robotics and Automation, pp. 767-73.
Kutulakos, K., Lumelsky, V., and Dyer, C. 1992. "Object Exploration by Purposive, Dynamic Viewpoint Adjustment," Technical Report No. 1124, Computer Science Department, University of Wisconsin, Madison, November.
Langer, D., Rosenblatt, J., and Hebert, M. 1994. "A Behavior-Based System for Off-Road Navigation," IEEE Transactions on Robotics and Automation, Vol. 10, No. 6, December, pp. 776-83.
Langton, C. (ed.) 1995. Artificial Life: An Overview, MIT Press, Cambridge, MA.
Latombe, J. C. 1991. Robot Motion Planning, Kluwer Academic Publishers, Boston.
Lawton, D., Arkin, R. C., and Cameron, J. 1990. "Qualitative Spatial Understanding and Reactive Control for Autonomous Robots," Proceedings of the IEEE International Workshop on Intelligent Robots and Systems (IROS '90), Ibaraki, Japan, pp. 709-14.
Lee, D. 1978. "The Functions of Vision," Modes of Perceiving and Processing Information, eds. H. Pick and E. Saltzman, Wiley, New York.
Lee, J. L., Huber, M., Durfee, E., and Kenny, P. 1994. "UM-PRS: An Implementation of the Procedural Reasoning System for Multirobot Applications," Proceedings of the Conference on Intelligent Robotics in Field, Factory, and Space (CIRFFSS '94), Houston, TX, March, pp. 842-49.
Lefebvre, D. and Saridis, G. 1992. "A Computer Architecture for Intelligent Machines," Proceedings of the IEEE International Conference on Robotics and Automation, Nice, France, May, pp. 2745-50.
Lehninger, A. 1975. Biochemistry, 2nd ed., Worth, New York.
Lesser, V. (ed.) 1995. Proceedings of the International Conference on Multiagent Systems, San Francisco, CA, June.
Levitt, T. and Lawton, D. 1990. "Qualitative Navigation for Mobile Robots," Artificial Intelligence, Vol. 44, No. 3, pp. 305-60.
Lewis, M., Fagg, A., and Bekey, G. 1994. "Genetic Algorithms for Gait Synthesis in a Hexapod Robot," in Recent Trends in Mobile Robots, ed. Y. Zheng, World Scientific, Singapore, pp. 317-31.
Lim, W. 1994. "An Agent-Based Approach for Programming Mobile Robots," Proceedings of the IEEE International Conference on Robotics and Automation, San Diego, CA, pp. 3584-89.
Lin, L. 1992. "Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching," Machine Learning, Vol. 8, pp. 293-321.
Lin, F. and Hsu, J. 1995. "Cooperation and Deadlock-Handling for an Object-Sorting Task in a Multi-Agent Robotic System," Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2580-85.
Lloyd, S. 1996. "Universal Quantum Simulators," Science, Vol. 273, pp. 1073-78.
Lorenz, K. 1981. The Foundations of Ethology, Springer-Verlag, New York.
Lorenz, K., and Leyhausen, P. 1973. Motivation of Human and Animal Behavior: An Ethological View, Van Nostrand Reinhold Co., New York.
Lumia, R. 1994. "Using NASREM for Real-Time Sensory Interactive Robot Control," Robotica, Vol. 12, pp. 127-35.
Lund, J., Wu, Q., and Levitt, J. 1995. "Visual Cortex Cell Types and Connection," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 344-48.
Lyons, D. 1992. "Planning, Reactive," Encyclopedia of Artificial Intelligence, ed. S. Shapiro, 2nd ed., John Wiley and Sons, New York, pp. 1171-82.
Lyons, D., and Arbib, M. 1989. "A Formal Model of Computation for Sensory-Based Robotics," IEEE Transactions on Robotics and Automation, Vol. 6, No. 3, June, pp. 280-93.
Lyons, D., and Hendriks, A. 1992. "Planning for Reactive Robot Behavior," Proceedings of the IEEE International Conference on Robotics and Automation, Nice, France, pp. 2675-80.
Lyons, D., and Hendriks, A. 1993. "Safely Adapting a Hierarchical Reactive System," Proceedings of SPIE Intelligent Robots and Computer Vision XII: Active Vision and 3D Methods, Vol. 2056, pp. 450-59.
Lyons, D., and Hendriks, A. 1994. "Planning by Adaptation: Experimental Results," Proceedings of the IEEE International Conference on Robotics and Automation, San Diego, CA, pp. 855-60.
Lyons, D., and Hendriks, A. 1995. "Planning as Incremental Adaptation of a Reactive System," Robotics and Autonomous Systems, Vol. 14, No. 4, pp. 255-88.
MacKenzie, D. 1996. "A Design Methodology for the Specification of Behavior-Based Robotic Systems," Ph.D. Dissertation, College of Computing, Georgia Institute of Technology, Atlanta.
MacKenzie, D., and Balch, T. 1993. "Making a Clean Sweep: Behavior Based Vacuuming," in working notes, AAAI Fall Symposium: Instantiating Real-world Agents, Raleigh, NC, October, pp. 93-98.
MacKenzie, D., Cameron, J., and Arkin, R. 1995. "Specification and Execution of Multiagent Missions," Proceedings of the International Conference on Intelligent Robotics and Systems (IROS '95), Pittsburgh, PA, pp. 51-58.
MacLennan, B. 1991. "Synthetic Ethology: An Approach to the Study of Communication," in Artificial Life II, SFI Studies in the Sciences of Complexity, Vol. XI, ed. Farmer et al., Addison-Wesley, Reading, MA.
Maes, P. 1989. "The Dynamics of Action Selection," Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89), Detroit, MI, pp. 991-97.
Maes, P. 1990. "Situated Agents Can Have Goals," Robotics and Autonomous Systems, Vol. 6, pp. 49-70.
Maes, P., and Brooks, R. 1990. "Learning to Coordinate Behaviors," Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI '90), Boston, MA, August, pp. 796-802.
Mahadevan, S., and Connell, J. 1991. "Automatic Programming of Behavior-Based Robots Using Reinforcement Learning," Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI '91), Anaheim, CA, July, pp. 768-73.
Malcolm, C., and Smithers, T. 1990. "Symbol Grounding via a Hybrid Architecture in an Autonomous Assembly System," in Designing Autonomous Agents, ed. P. Maes, MIT Press, Cambridge, MA, pp. 123-44.
Malkin, P. and Addanki, S. 1990. "LOGnets: A Hybrid Graph Spatial Representation for Robot Navigation," Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA, pp. 1045-50.
Marapane, S., Holder, M., and Trivedi, M. 1994. "Coordinating Motion of Cooperative Mobile Robots through Visual Observation," Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, San Antonio, TX, pp. 2260-65.
Marr, D. 1982. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, W.H. Freeman, San Francisco, CA.
Mataric, M. 1990. "Navigating with a Rat Brain: A Neurobiologically-inspired Model for Robot Spatial Representation," From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, pp. 169-75.
Mataric, M. 1992a. "Behavior-Based Control: Main Properties and Implications," Proceedings of Workshop on Intelligent Control Systems, International Conference on Robotics and Automation, Nice, France, May.
Mataric, M. 1992b. "Integration of Representation into Goal-Driven Behavior-Based Robots," IEEE Transactions on Robotics and Automation, Vol. 8, No. 3, June, pp. 304-12.
Mataric, M. 1992c. "Minimizing Complexity in Controlling a Mobile Robot Population," IEEE International Conference on Robotics and Automation, Nice, France, pp. 830-35.
Mataric, M. 1993a. "Synthesizing Group Behaviors," Proceedings of the Workshop on Dynamically Interacting Robots, International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, August, pp. 1-10.
Mataric, M. 1993b. "Kin Recognition, Similarity, and Group Behavior," Proceedings of the Fifteenth Annual Cognitive Science Conference, Boulder, CO, June, pp. 705-10.
Mataric, M. 1994a. "Interaction and Intelligent Behavior," Ph.D. Dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, May.
Mataric, M. 1994b. "Learning to Behave Socially," From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, Brighton, UK, pp. 453-62.
Matthies, L., and Elfes, A. 1988. "Integration of Sonar and Stereo Range Data Using a Grid-Based Representation," Proceedings of the International Conference on Robotics and Automation, Philadelphia, PA, April, pp. 727-33.
Mazokhin-Porshnyakov, G. 1969. Insect Vision, Plenum Press, New York.
McCarthy, J. 1995. "Making Robots Conscious of their Mental States," Computer Science Report, Stanford University, CA, July 24.
McCarthy, J., Minsky, M., Rochester, N., and Shannon, C. 1955. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31.
McCulloch, W., and Pitts, W. 1943. "A Logical Calculus of the Ideas Immanent in Nervous Activity," Bulletin of Mathematical Biophysics, Vol. 5, pp. 115-33.
McFarland, D. 1981. The Oxford Companion to Animal Behavior, Oxford University Press.
McFarland, D., and Bosser, U. 1993. Intelligent Behavior in Animals and Robots, MIT Press, Cambridge, MA.
McGhee, R. 1967. "Finite State Control of Quadruped Locomotion," Simulation, pp. 135-40, September.
McKerrow, P. 1991. Introduction to Robotics, Addison-Wesley, Reading, MA.
Meystel, A. 1986. "Planning in a Hierarchical Nested Controller for Autonomous Robots," Proceedings of the Twenty-fifth Conference on Decision and Control, Athens, Greece, pp. 1237-49.
Millan, J. 1994. "Learning Efficient Reactive Behavioral Sequences from Basic Reflexes in a Goal-Directed Autonomous Robot," (From Animals to Animats 3), Proceedings of the Third Conference on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, pp. 266-74.
Miller, D. 1995. "Experiences Looking into Niches," Working Notes, 1995 AAAI Spring Symposium: Lessons Learned from Implemented Software Architectures for Physical Agents, Palo Alto, CA, March, pp. 141-45.
Miller, E., and Desimone, R. 1994. Science, Vol. 263, January, pp. 520-22.
Miller, G., Galanter, E., and Pribram, K. 1960. Plans and the Structure of Behavior, Holt, Rinehart, and Winston, New York.
Minsky, M. 1986. The Society of Mind, Simon and Schuster, New York.
Minsky, M. 1994. "Will Robots Inherit the Earth?" Scientific American, Vol. 271, No. 4, October, pp. 109-13.
Minsky, M., and Papert, S. 1969. Perceptrons: An Essay in Computational Geometry, MIT Press, Cambridge, MA.
Mishkin, M., Ungerleider, L., and Macko, K. 1983. "Object Vision and Spatial Vision: Two Cortical Pathways," Trends in Neurosciences, Vol. 6, pp. 414-17.
Mitchell, T. 1990. "Becoming Increasingly Reactive," Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA, pp. 1051-58.
Mitsumoto, N., Fukuda, T., Arai, F., Tadashi, H., and Idogaki, T. 1996. "Self-Organizing Multiple Robotic System," Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, April, pp. 1614-19.
Mittelstaedt, H. 1983. "Introduction into Cybernetics of Orientation Behavior," Biophysics, eds. W. Hoppe et al., Springer-Verlag, Berlin, pp. 794-801.
Mittelstaedt, H. 1985. "Analytical Cybernetics of Spider Navigation," Neurobiology of Arachnids, ed. F. Barth, Springer-Verlag, Berlin, pp. 298-316.
Mizessyn, F., and Ishida, Y. 1993. "Immune Networks for Cement Plants," Proceedings of the International Symposium on Autonomous Decentralized Systems, Kawasaki, Japan, pp. 282-88.
Mochida, T., Ishiguro, A., Aoki, T., and Uchikawa, Y. 1995. "Behavior Arbitration for Autonomous Mobile Robots Using Emotion Mechanisms," Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '95), Pittsburgh, PA, pp. 516-21.
Mondada, F., and Floreano, D. 1995. "Evolution and Neural Control Structures: Some Experiments on Mobile Robots," Robotics and Autonomous Systems, Vol. 16, No. 2-4, pp. 183-96.
Montgomery, J., Fagg, A., and Bekey, G. 1995. "The USC AFV-I: A Behavior-Based Entry in the 1994 International Aerial Robotics Competition," IEEE Expert, Vol. 10, No. 2, April, pp. 16-22.
Moore, A., Atkeson, C., and Schaal, S. 1995. "Memory-Based Learning for Control," CMU Robotics Institute Technical Report CMU-RI-TR-95-18, Carnegie-Mellon University, Pittsburgh, PA, April.
Morasso, P., and Sanguineti, V. 1994. "Self-Organising Body-Schema for Motor Planning," Journal of Motor Behavior, Vol. 26, pp. 131-48.
Moravec, H. 1977. "Towards Automatic Visual Obstacle Avoidance," Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Cambridge, MA, August, p. 584.
Moravec, H. 1983. "The Stanford Cart and the CMU Rover," Proceedings of the IEEE, Vol. 71, No. 7, pp. 872-84.
Moravec, H. 1985. "Robots that Rove," in Autonomous Mobile Robots Annual Report, Robotics Institute Technical Report No. CMU-RI-TR-86-4, Carnegie-Mellon University, Pittsburgh, PA, February.
Moravec, H. 1988. Mind Children: The Future of Robot and Human Intelligence, Harvard University Press, Cambridge, MA.
Moravec, H. Forthcoming. Mind Age: Transcendence through Robots, Oxford University Press.
Moravec, H., and Elfes, A. 1985. "High Resolution Maps from Wide Angle Sonar," Proceedings of the ASME International Computers in Engineering Conference, Boston, MA, pp. 375-80.
Moynihan, M. 1970. "Control, Suppression, Decay, Disappearance, and Replacement of Displays," Journal of Theoretical Biology, Vol. 29, pp. 85-112.
Murphy, R. R. 1991. "An Application of Dempster-Shafer Theory to a Novel Control Scheme for Sensor Fusion," Proceedings of SPIE Stochastic Methods in Signal Processing, Image Processing, and Computer Vision, San Diego, CA, July, pp. 55-68.
Murphy, R. R. 1992. "An Architecture for Intelligent Robotic Sensor Fusion," Ph.D. Dissertation, Technical Report No. GIT-ICS-92/42, College of Computing, Georgia Institute of Technology, Atlanta.
Murphy, R. R., and Arkin, R. C. 1992. "SFX: An Architecture for Action-Oriented Sensor Fusion," Proceedings of the International Conference on Intelligent Robotics and Systems (IROS '92), Raleigh, NC, July, pp. 1079-86.
Murphy, T., Lyons, D., and Hendriks, A. 1993. "Visually Guided Stable Grasping with a Multi-Fingered Robot Hand: A Behavior-Based Approach," Proceedings of the SPIE
Intelligent Robots and Computer Vision XII: Active Vision and 3D Methods, Vol. 2056, pp. 252-63.
Nauta, W., and Feirtag, M. 1979. "The Organization of the Brain," in The Brain, W.H. Freeman, San Francisco, pp. 40-55.
Nehmzow, U., and McGonigle, B. 1994. "Achieving Rapid Adaptations in Robots by Means of External Tuition," From Animals to Animats 3: Proceedings of the Third Conference on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, pp. 301-08.
Nehmzow, U., Smithers, T. and McGonigle, B. 1993. "Increasing Behavioral Repertoire in a Mobile Robot," From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, Honolulu, HI, MIT Press, Cambridge, MA, pp. 291-97.
Neisser, U. 1976. Cognition and Reality: Principles and Implications of Cognitive Psychology, W.H. Freeman, San Francisco.
Neisser, U. 1989. "Direct Perception and Recognition as Distinct Perceptual Systems," text of address presented to the Cognitive Science Society, August.
Neisser, U. 1993. "Without Perception, There Is No Knowledge: Implications for Artificial Intelligence," in Natural and Artificial Minds, ed. R. G. Burton, State University of New York Press, Albany, pp. 147-64.
Nelson, J. I. 1995. "Visual Scene Perception: Neurophysiology," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 1024-28.
Neumann, O., and Prinz, W. 1990. "Prologue: Historical Approaches to Perception and Action," in Relationships between Perception and Action, eds. O. Neumann and W. Prinz, Springer-Verlag, Berlin, pp. 5-19.
Nilsson, N. 1965. Learning Machines: Foundations of Trainable Pattern-Classifying Systems, McGraw-Hill, New York.
Nilsson, N. 1969. "A Mobile Automaton: An Application of Artificial Intelligence Techniques," Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-69), Washington D.C., May. Reprinted in Autonomous Mobile Robots, Vol. 2, eds. S. Iyengar and A. Elfes, IEEE Computer Society Press, Los Alamitos, 1991, pp. 233-44.
Nilsson, N. 1980. Principles of Artificial Intelligence, Tioga, Palo Alto, CA.
Nilsson, N. 1984. "Shakey the Robot," Technical Note No. 323, Artificial Intelligence Center, SRI International, Menlo Park, CA.
Nilsson, N. 1994. "Teleo-Reactive Programs for Agent Control," Journal of Artificial Intelligence Research, Vol. 1, pp. 139-58.
Nilsson, N. 1995. "Eye on the Prize," AI Magazine, Vol. 16, No. 2, Summer, pp. 9-17.
Noreils, F., and Chatila, R. 1989. "Control of Mobile Robot Actions," Proceedings of the IEEE International Conference on Robotics and Automation, pp. 701-07.
Noreils, F., and Chatila, R. 1995. "Plan Execution Monitoring and Control Architecture for Mobile Robots," IEEE Transactions on Robotics and Automation, Vol. 11, No. 2, April, pp. 255-66.
Norman, D., and Shallice, T. 1986. "Attention to Action: Willed and Automatic Control of Behavior," in Consciousness and Self-Regulation: Advances in Research and Theory, Vol. 4, eds. R. Davidson, G. Schwartz, and D. Shapiro, Plenum Press, New York, pp. 1-17.
Noton, D., and Stark, L. 1971. "Scanpaths in Saccadic Eye Movements While Viewing and Recognizing Patterns," Vision Research, Vol. 11, pp. 929-42.
Olshausen, B., and Koch, C. 1995. "Selective Visual Attention," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 837-40.
Ooka, A., Ogi, K., Wada, Y., Kida, Y., Takemoto, A., Okamoto, K., and Yoshida, K. 1985. "Intelligent Robot System II," Robotics Research, Second International Symposium, eds. H. Hanafusa and H. Inoue, MIT Press, Cambridge, MA, pp. 341-47.
Pahlavan, K., and Eklundh, J. 1993. "Heads, Eyes and Head-Eye Systems," in Active Robot Vision: Camera Heads, Model-Based Navigation and Reactive Control, eds. H. Christensen, K. Bowyer, and H. Bunke, World Scientific, Singapore, pp. 33-49.
Pahlavan, K., Uhlin, T., and Eklundh, J. 1993. "Active Vision as a Methodology," in Active Perception, ed. Y. Aloimonos, Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 19-46.
Pani, J. 1996. Personal communication, May.
Parker, L. 1992. "Local versus Global Control Laws for Cooperative Agent Teams," AI Memorandum No. 1357, MIT AI Lab, Cambridge, MA, March.
Parker, L. 1994. "Heterogeneous Multi-Robot Cooperation," Ph.D. Dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, February.
Parker, L. 1995. "Alliance: An Architecture for Fault Tolerant Multi-Robot Cooperation," Oak Ridge National Laboratory Technical Memorandum No. 12920, February.
Patel, M., Colombetti, M., and Dorigo, M. 1995. "Evolutionary Learning for Intelligent Automation: A Case Study," Intelligent Automation and Soft Computing, Vol. 1, No. 1, pp. 29-42.
Pau, L. 1991. "Behavioral Knowledge in Sensor/Data Fusion Systems," in Active Perception and Robot Vision, eds. A. Sood and H. Wechsler, Springer-Verlag, Berlin, pp. 357-72.
Pavlov, I. 1927. Conditioned Reflexes, Oxford University Press, London.
Payton, D. 1991. "Internalized Plans: A Representation for Action Resources," in Designing Autonomous Agents, ed. P. Maes, MIT Press, Cambridge, MA, pp. 89-103.
Payton, D., Keirsey, D., Kimble, D., Krozel, J., and Rosenblatt, J. 1992. "Do Whatever Works: A Robust Approach to Fault-tolerant Autonomous Control," Applied Intelligence, Vol. 2, No. 3, September, pp. 225-50.
Pearson, K. 1976. "The Control of Walking," Scientific American, Vol. 235, pp. 72-86.
Pellionisz, A., and Llinas, R. 1980. "Tensorial Approach to the Geometry of Brain Function: Cerebellar Coordination via a Metric Tensor," Neuroscience, Vol. 5, pp. 1125-36.
Penrose, R. 1989. The Emperor's New Mind, Oxford University Press, New York.
Penrose, R. 1994. Shadows of the Mind, Oxford University Press, New York.
Piaget, J. 1971. Biology and Knowledge: An Essay on the Relations between Organic Regulations and Cognitive Processes, University of Chicago Press.
Pin, F., and Watanabe, Y. 1995. "Automatic Generation of Fuzzy Rules Using the Fuzzy Behaviorist Approach: The Case of Sensor-Based Robot Navigation," Intelligent Automation and Soft Computing, Vol. 1, No. 2, pp. 161-78.
Pomerleau, D. 1993. Neural Network Perception for Mobile Robot Guidance, Kluwer Academic Publishers, Boston.
Pomerleau, D. 1995. "Ralph: Rapidly Adapting Lateral Position Handler," Proceedings of the IEEE Symposium on Intelligent Vehicles, Detroit, MI, September, pp. 506-11.
Premvuti, S., and Yuta, S. 1990. "Consideration on the Cooperation of Multiple Autonomous Mobile Robots," IEEE International Workshop on Intelligent Robots and Systems (IROS '90), Tsuchiura, Japan, pp. 59-63.
Prokopowicz, P., Swain, M., and Kahn, R. 1994. "Task and Environment-Sensitive Tracking," Proceedings of the IEEE Workshop on Visual Behaviors, Seattle, WA, June, pp. 73-78.
Quinn, R., and Espenschied, K. 1993. "Control of a Hexapod Robot Using a Biologically Inspired Neural Network," in Biological Neural Networks in Invertebrate Neuroethology and Robotics, eds. R. Beer, R. Ritzmann, and T. McKenna, Academic Press, San Diego, CA, pp. 365-81.
Ram, A., Arkin, R., Boone, G., and Pearce, M. 1994. "Using Genetic Algorithms to Learn Reactive Control Parameters for Autonomous Robotic Navigation," Journal of Adaptive Behavior, Vol. 2, No. 3, pp. 277-305.
Ram, A., Arkin, R. C., Moorman, K., and Clark, R. 1997. "Case-Based Reactive Navigation: A Case-Based Method for On-Line Selection and Adaptation of Reactive Control Parameters in Autonomous Robotic Systems," IEEE Transactions on Systems, Man, and Cybernetics, Part B, Vol. 27, No. 3, pp. 376-94.
Reece, D., and Shafer, S. 1991. "Active Vision at the System Level for Robot Driving," Working Notes, AAAI Fall Symposium on Sensory Aspects of Robotic Intelligence, Asilomar, CA, November, pp. 70-77.
Reignier, P. 1995. "Supervised Incremental Learning of Fuzzy Rules," Robotics and Autonomous Systems, Vol. 16, pp. 57-71.
Reynolds, C. 1987. "Flocks, Herds, and Schools: A Distributed Behavioral Model," Computer Graphics, Vol. 21, No. 4, pp. 25-34.
Rimey, R. 1992. "Task-Oriented Vision with Multiple Bayes Nets," in Active Vision, eds. A. Blake and A. Yuille, MIT Press, Cambridge, MA.
Riseman, E., and Hanson, A. 1987. "General Knowledge-Based Vision Systems," in Vision, Brain and Cooperative Computation, eds. M. Arbib and A. Hanson, MIT Press, Cambridge, MA, p. 287.
Rosenblatt, F. 1958. "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain," Psychological Review, Vol. 65, pp. 386-408.
Rosenblatt, J. 1995. "DAMN: A Distributed Architecture for Mobile Navigation," Working Notes, AAAI 1995 Spring Symposium on Lessons Learned for Implemented Software Architectures for Physical Agents, Palo Alto, CA, March, pp. 167-78.
Rosenblatt, J., and Payton, D. 1989. "A Fine-Grained Alternative to the Subsumption Architecture for Mobile Robot Control," Proceedings of the International Joint Conference on Neural Networks, June, pp. 317-23.
Rosenschein, S., and Kaelbling, L. 1987. "The Synthesis of Digital Machines with Provable Epistemic Properties," SRI International Technical Note No. 412, Menlo Park, CA, April.
Rumelhart, D., Hinton, G., and Williams, R. 1986. "Learning Internal Representations by Error Propagation," in Parallel Distributed Processing, Vol. 1, eds. D. Rumelhart and J. McClelland, MIT Press, Cambridge, MA, pp. 318-62.
Russell, A., Thiel, D., and Mackay-Sim, A. 1994. "Sensing Odour Trails for Mobile Robot Navigation," Proceedings of the IEEE International Conference on Robotics and Automation, San Diego, CA, May, pp. 2672-77.
Sacerdoti, E. 1974. "Planning in a Hierarchy of Abstraction Spaces," Artificial Intelligence, Vol. 5, No. 2, pp. 115-35.
Sacerdoti, E. 1975. "A Structure for Plans and Behavior," Ph.D. Dissertation, Technical Note No. 109, AI Center, SRI International, Menlo Park, CA.
Saffiotti, A., Ruspini, E., and Konolige, K. 1993a. "Blending Reactivity and Goal-Directedness in a Fuzzy Controller," Second IEEE International Conference on Fuzzy Systems, San Francisco, CA, March, pp. 134-39.
Saffiotti, A., Ruspini, E., and Konolige, K. 1993b. "Robust Execution of Robot Plans Using Fuzzy Logic," Proceedings of the Workshop on Fuzzy Logic in Artificial Intelligence, International Joint Conference on Artificial Intelligence (IJCAI '93), Chambery, France, pp. 24-37.
Saffiotti, A., Konolige, K., and Ruspini, E. 1995. "A Multi-Valued Logic Approach to Integrating Planning and Control," Artificial Intelligence, Vol. 76, No. 1-2, July, pp. 481-526.
Sahota, M. 1993. "Real-Time Intelligent Behavior in Dynamic Environments: Soccer-Playing Robots," M.S. Thesis, Department of Computer Science, University of British Columbia, Vancouver.
Saito, F., Fukuda, T., and Arai, F. 1994. "Swing and Locomotion Control for a Two-Link Brachiation Robot," IEEE Control Systems Magazine, Vol. 14, No. 1, February, pp. 5-11.
Sanborn, J. C. 1988. "A Model of Reaction for Planning in Dynamic Environments," M.S. Thesis, Computer Science Department, University of Maryland, College Park, May.
Sarachik, K. 1989. "Characterizing an Indoor Environment with a Mobile Robot and Uncalibrated Stereo," Proceedings of the IEEE International Conference on Robotics and Automation, Scottsdale, AZ, pp. 984-89.
Saridis, G. 1983. "Intelligent Robotic Control," IEEE Transactions on Automatic Control, Vol. 28, No. 5, May, pp. 547-56.
Saridis, G., and Valavanis, K. 1987. "On the Theory of Intelligent Controls," Proceedings of the SPIE Conference on Advances in Intelligent Robotic Systems, Cambridge, MA, October, pp. 488-95.
Schank, R. 1987. "What's AI, Anyway?" AI Magazine, Vol. 8, No. 4, Winter, pp. 59-65.
Schmajuk, N. 1995. "Cognitive Maps," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 197-200.
Schmidt, R. 1975. "A Schema Theory of Discrete Motor Skill Learning," Psychological Review, Vol. 82, pp. 225-60.
Schoner, G., and Dose, M. 1992. "A Dynamical Systems Approach to Task-Level System Integration Used to Plan and Control Autonomous Vehicle Motion," Robotics and Autonomous Systems, Vol. 10, pp. 253-67.
Schoppers, M. 1987. "Universal Plans for Reactive Robots in Unpredictable Environments," Proceedings of the Tenth International Joint Conference on Artificial Intelligence (IJCAI-87), pp. 852-59.
Schoppers, M. 1989. "In Defense of Reaction Plans as Caches," AI Magazine, Vol. 10, No. 4, Winter, pp. 51-60.
Schultz, A., and Grefenstette, J. 1992. "Using a Genetic Algorithm to Learn Behaviors for Autonomous Vehicles," Naval Research Laboratory NCARAI TR #AIC-92-009, Washington, DC.
Scutt, T. 1994. "The Five Neuron Trick: Using Classical Conditioning to Learn How to Seek Light," From Animals to Animats 3: Proceedings of the Third Conf. on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, pp. 364-70.
Searle, J. 1980. "Minds, Brains, and Programs," Behavioral and Brain Sciences, Vol. 3, pp. 417-57.
Segre, A., and DeJong, G. 1985. "Explanation-Based Manipulator Learning: Acquisition of Planning Ability through Observation," Working Paper 62, AI Research Group, Coordinated Science Lab, University of Illinois at Urbana-Champaign, March.
Selfridge, O., and Neisser, U. 1960. "Pattern Recognition by Machine," Scientific American, Vol. 203, pp. 60-68.
Shadmehr, R., and Mussa-Ivaldi, F. 1994. "Geometric Structure of the Adaptive Controller of the Human Arm," MIT AI Memo No. 1437, Cambridge, MA, March.
Shibata, T., Ohkawa, K., and Tanie, K. 1996. "Spontaneous Behavior of Robots for Cooperation - Emotionally Intelligent Robot System," Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, April, pp. 2426-31.
Shiffrin, R., and Schneider, W. 1977. "Controlled and Automatic Human Information Processing: II," Psychological Review, Vol. 84, pp. 127-90.
Shirley, D., and Matijevic, J. 1995. "Mars Pathfinder Microrover," Autonomous Robots, Vol. 2, pp. 283-89.
Shor, P. 1994. "Algorithms for Quantum Computation: Discrete Logarithms and Factoring," Proceedings of the Thirty-fifth Annual Symposium on the Foundations of Computer Science, Santa Fe, NM, November, pp. 124-34.
Simmons, R. (ed.) 1992. Working notes, AAAI 1992 Spring Symposium on Control of Selective Perception, Stanford University, March, pp. 138-41.
Simmons, R., and Koenig, S. 1995. "Probabilistic Robot Navigation in Partially Observable Environments," Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, CA, August, pp. 1080-87.
Simon, H. 1983. "Why Should Machines Learn?" in Machine Learning: An Artificial Intelligence Approach, Vol. 1, eds. R. Michalski, J. Carbonell, and T. Mitchell, Tioga Publishing, Palo Alto, CA, pp. 25-39.
Sims, K. 1994. "Evolving Virtual Creatures," Proceedings of the ACM Siggraph 94, pp. 15-22.
Skinner, B.F. 1974. About Behaviorism, Alfred Knopf, New York.
Slack, M. 1990. "Situationally Driven Local Navigation for Mobile Robots," JPL Publication No. 90-17, NASA Jet Propulsion Laboratory, Pasadena, CA, April.
Smith, W. J. 1977. The Behavior of Communicating: An Ethological Approach, Harvard University Press, Cambridge, MA.
Soldo, M. 1990. "Reactive and Preplanned Control in a Mobile Robot," Proceedings of the IEEE International Conference on Robotics and Automation, Cincinnati, OH, pp. 1128-32.
Spector, L. 1992. "Supervenience in Dynamic-World Planning," Ph.D. Dissertation, UMIACS-TR-92-55, University of Maryland, College Park.
Spinelli, D. N. 1987. "A Trace of Memory: An Evolutionary Perspective on the Visual System," in Vision, Brain and Cooperative Computation, eds. M. Arbib and A. Hanson, MIT Press, Cambridge, MA, pp. 165-82.
Stark, L., and Bowyer, K. 1991. "Achieving Generalized Object Recognition through Reasoning about Association of Function to Structure," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, No. 10, October, pp. 1097-1103.
Stark, L., and Bowyer, K. 1994. "Function-Based Generic Recognition for Multiple Object Categories," CVGIP: Image Understanding, Vol. 59, No. 1, January, pp. 1-21.
Stark, L., and Ellis, S. 1981. "Scanpaths Revisited: Cognitive Models Direct Active Looking," in Eye Movements: Cognition and Visual Perception, eds. C. Fisher, R. Monty and J. Senders, Lawrence Erlbaum Associates, Mahwah, NJ, pp. 193-226.
Steels, L. 1990. "Exploiting Analogical Representations," in Designing Autonomous Agents, ed. P. Maes, MIT Press, Cambridge, MA, pp. 71-88.
Steels, L. 1994. "Emergent Functionality in Robotic Agents through On-Line Evolution," Proceedings on Artificial Life IV, MIT Press, Cambridge, MA, pp. 8-14.
Steels, L. 1995. "When Are Robots Intelligent Autonomous Agents?" Robotics and Autonomous Systems, Vol. 15, pp. 3-9.
Stein, L. 1994. "Imagination and Situated Cognition," Journal of Experimental and Theoretical AI, Vol. 6, No. 4, October-December, pp. 393-407.
Stentz, A. 1994. "Optimal and Efficient Path Planning for Partially-Known Environments," Proceedings of the IEEE International Conference on Robotics and Automation, May, San Diego, CA, pp. 3310-17.
Stentz, A., and Hebert, M. 1995. "A Complete Navigation System for Goal Acquisition in Unknown Environments," Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '95), Pittsburgh, PA, pp. 425-32.
Stone, H. (ed.), 1980. Introduction to Computer Architecture, 2nd ed., SRA, Chicago.
Stroulia, E. 1991. "Reflective Self-Adaptive Systems," Ph.D. Dissertation, College of Computing, Georgia Institute of Technology, Atlanta, GA.
Suga, N., and Kanwal, J. 1995. "Echolocation: Creating Computational Maps," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 344-48.
Sugihara, K., and Suzuki, I. 1990. "Distributed Motion Coordination of Multiple Mobile Robots," Proceedings of the Fifth International Symposium on Intelligent Control, Philadelphia, PA, pp. 138-43.
Sussman, G. 1975. A Computer Model of Skill Acquisition, American Elsevier, New York.
Sutton, R. 1988. "Learning to Predict by the Methods of Temporal Differences," Machine Learning, Vol. 3, pp. 9-44.
Tachi, S., and Komoriya, K. 1985. "Guide Dog Robot," Robotics Research, Second International Symposium, eds. H. Hanafusa and H. Inoue, MIT Press, Cambridge, MA, pp. 333-40.
Takeuchi, S. 1996. "Hybrid Insect Robot," World Wide Web URL http://scorpio.leopard.t.u-tokyo.ac.jp/takeuchi/abstract.html, University of Tokyo.
Tan, M. 1991. "Cost-Sensitive Robot Learning," Ph.D. Dissertation, Technical Report CMU-CS-91-134, School of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, May.
Tanimoto, S. 1990. The Elements of Artificial Intelligence, Computer Science Press, New York.
Thorndike, E. 1911. Animal Intelligence, Hafner, Darien, CT.
Thrun, S. 1995. "An Approach to Learning Mobile Robot Navigation," Robotics and Autonomous Systems, Vol. 15, No. 4, pp. 301-20.
Tianmiao, W., and Bo, Z. 1992. "Time-Varying Potential Field Based on Perception-Action Behaviors of Mobile Robot," Proceedings of the 1992 IEEE International Conference on Robotics and Automation, Nice, France, pp. 2549-54.
Tinbergen, N. 1953. Social Behavior in Animals, Methuen, London.
Tsai, W., and Chen, Y. 1986. "Adaptive Navigation of Automated Vehicles by Image Analysis Techniques," IEEE Transactions on Systems, Man and Cybernetics, Vol. 16, No. 5, September, pp. 730-40.
Tsotsos, J. 1989. "The Complexity of Perceptual Search Tasks," Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI '89), Detroit, MI, pp. 1571-77.
Tsotsos, J. 1990. "Analyzing Vision at the Complexity Level," Behavioral and Brain Sciences, Vol. 13, pp. 423-69.
Tsotsos, J. 1992. "On the Relative Complexity of Active versus Passive Visual Search," International Journal of Computer Vision, Vol. 7, No. 2, pp. 127-41.
Tsotsos, J. 1995. "Behaviorist Intelligence and the Scaling Problem," Artificial Intelligence, Vol. 75, No. 2, June, pp. 135-60.
Turban, E. 1992. Expert Systems and Applied Artificial Intelligence, MacMillan, New York.
Turing, A. 1950. "Computing Machinery and Intelligence," Mind, Vol. 59, pp. 433-60.
Ullman, S. 1985. "Visual Routines," in Visual Cognition, ed. S. Pinker, MIT Press, Cambridge, MA, pp. 97-159.
U.S. Army. 1986. Field Manual No. 7-7J. Department of the Army, Washington, DC.
Vershure, P., Krose, B., and Pfeifer, R. 1992. "Distributed Adaptive Control: The Self-Organization of Structured Behavior," Robotics and Autonomous Systems, Vol. 9, pp. 181-96.
Vershure, P., Wray, J., Sprong, O., Tononi, G., and Edelman, G. 1995. "Multilevel Analysis of Classical Conditioning in a Behaving Real World Artifact," Robotics and Autonomous Systems, Vol. 16, 1995, pp. 247-65.
Wallace, R. 1987. "Robot Road Following by Adaptive Color Classification and Shape Tracking," Proceedings of the IEEE International Conference on Robotics and Automation, Raleigh, NC, pp. 258-63.
Wallace, R., Matsuzaki, K., Crisman, J., Goto, Y., Webb, J., and Kanade, T. 1986. "Progress in Robot Road-Following," Proceedings of the IEEE International Conference on Robotics and Automation, San Francisco, CA, pp. 1615-21.
Wallace, R., Stentz, A., Thorpe, C., Moravec, H., Whittaker, W., and Kanade, T. 1985. "First Results in Robot Road-Following," International Joint Conference on Artificial Intelligence (IJCAI 85), Vol. 2, Los Angeles, CA, pp. 1089-95.
Walter, W. G. 1953 (reprinted 1963). The Living Brain, Norton, New York.
Wang, J. 1994. "On Sign-Board Based Inter-Robot Communication in Distributed Robotic Systems," Proceedings of the IEEE International Conference on Robotics and Automation, pp. 1045-50.
Wang, J. 1995. "Operating Primitives Supporting Traffic Regulation and Control of Mobile Robots under Distributed Robotic Systems," Proceedings of the IEEE International Conference on Robotics and Automation, Nagoya, Japan, June, pp. 1613-18.
Wasserman, P. 1989. Neural Computing: Theory and Practice, Van Nostrand Reinhold, New York.
Waterman, T. 1989. Animal Navigation, Scientific American Books, New York.
Watkins, C., and Dayan, P. 1992. "Q-Learning," Machine Learning, Vol. 8, pp. 279-92.
Watson, J. B. 1925. Behaviorism, People's Institute Publishing Co., New York.
Waxman, A. M., LeMoigne, J., and Srinivasan, B. 1985. "Visual Navigation of Roadways," Proceedings of the IEEE International Conference on Robotics and Automation, St. Louis, MO, pp. 862-67.
Waxman, A., LeMoigne, J., Davis, L., Srinivasan, B., Kushner, T., Liang, E., and Siddalingaiah, T. 1987. "A Visual Navigation System for Autonomous Land Vehicles," IEEE Journal of Robotics and Automation, Vol. RA-3, No. 2, April, pp. 124-41.
Webster's Ninth New Collegiate Dictionary, 1984. Merriam-Webster, Springfield, MA.
Wiener, N. 1948. Cybernetics, or Control and Communication in Animals and Machines, Wiley, New York.
Weitzenfeld, A. 1993. "A Hierarchical Computational Model for Distributed Heterogeneous Systems," Technical Report TR-93-02, Center for Neural Engineering, University of Southern California, Los Angeles.
Werbos, P. 1995. "Backpropagation: Basics and New Developments," in The Handbook of Brain Theory and Neural Networks, ed. M. Arbib, MIT Press, Cambridge, MA, pp. 134-39.
Werner, G., and Dyer, M. 1990. "Evolution of Communication in Artificial Organisms," Technical Report UCLA-AI-90-06, AI Laboratory, University of California, Los Angeles.
Wilkes, D., and Tsotsos, J. 1994. "Integration of Camera Motion Behaviors for Active Object Recognition," Proceedings of the IEEE Workshop on Visual Behaviors, Seattle, WA, June, pp. 10-19.
Wilson, E. O. 1975. Sociobiology: The New Synthesis, Belknap Press, Cambridge, MA.
Winston, P. 1975. "Learning Structural Descriptions from Examples," in Psychology of Computer Vision, ed. P. Winston, MIT Press, Cambridge, MA, pp. 157-209.
Woodfill, J., and Zabih, R. 1991. "Using Motion Vision for a Simple Robotic Task," Working Notes, 1991 AAAI Fall Symposium on Sensory Aspects of Robotic Intelligence, Asilomar, CA, November, pp. 162-65.
Yamamoto, M. 1993. "Sozzy: A Hormone Driven Autonomous Vacuum Cleaner," Proceedings of Mobile Robots VIII, pp. 211-23.
Yamauchi, B. 1990. "Behavioral Memory Techniques for Robot Navigation," unpublished report, Artificial Intelligence Center, Hughes Research Laboratories, Malibu, CA.
Yamauchi, B., and Nelson, R. 1991. "A Behavior-Based Architecture for Robots Using Real-Time Vision," Proceedings of the International Conference on Robotics and Automation, Sacramento, CA, pp. 1822-27.
Yanco, H., and Stein, L. 1993. "An Adaptive Communication Protocol for Cooperating Mobile Robots," From Animals to Animats: Proceedings of the Second International Conference on the Simulation of Adaptive Behavior, Honolulu, HI, MIT Press/Bradford Books, Cambridge, MA, pp. 478-85.
Yoshida, E., Yamamoto, M., Arai, T., Ota, J., and Kurabayashi, D. 1995. "A Design Method of Local Communication Area in Multiple Mobile Robot System," Proceedings of the IEEE International Conference on Robotics and Automation, June, pp. 2567-72.
Yuta, S. 1993. "Cooperative Behavior of Multiple Autonomous Mobile Robots," Workshop on Needs for Research in Cooperating Robots, IEEE International Conference on Robotics and Automation, Atlanta, GA.
Zadeh, L. 1973. "Outline of a New Approach to the Analysis of Complex Systems and Decision Processes," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 3, No. 1, January, pp. 28-44.
Zapata, R., Perrier, M., Lepinay, P., Thompson, P., and Jouvencal, B. 1991. "Fast Mobile Robots in Ill Structured Environments," Proceedings of the IEEE/RSJ International Conference on Intelligent Robotics and Systems (IROS '91), Osaka, Japan, pp. 793-98.
Zelinsky, A., and Kuniyoshi, Y. 1996. "Learning to Coordinate Behaviors for Robot Navigation," Advanced Robotics, Vol. 10, No. 2, pp. 143-59.
Zelinsky, A., Kuniyoshi, Y., Suehiro, T., and Tsukune, H. 1995. "Using an Augmentable Resource to Robustly and Purposefully Navigate a Robot," Proceedings of the IEEE International Conference on Robotics and Automation, Nagoya, Japan, pp. 2586-92.
Zeltzer, D., and Johnson, M. 1991. "Motor Planning: An Architecture for Specifying and Controlling the Behavior of Virtual Actors," Journal of Visualization and Computer Animation, Vol. 2, pp. 74-80.
Zeng, N., and Crisman, J. 1995. "Categorical Color Projection for Robot Road Following," IEEE International Conference on Robotics and Automation, Nagoya, Japan, pp. 1080-85.
Name Index
Agah, A., 381-382, 399
Agre, P., 73, 169, 214, 258, 260, 272
Albus, J., 21-22, 423
Ali, K., 378, 415
Anderson, T., 171
Arbib, M., 40, 42-43, 69, 82, 86, 118, 141, 143, 254, 286
Arkin, R., 28, 69, 82, 84, 142-143, 155, 214, 283, 377, 381, 383, 409, 412, 432
Ashby, W., 8, 133, 432
Badler, N., 171
Bajcsy, R., 202, 270, 280
Bakker, P., 400
Balch, T., 129, 155, 161, 377, 408
Ballard, D., 271, 272
Barbera, A., 166
Bartlett, F., 42
Beer, R., 57, 69
Bekey, G., 172, 381-382, 399
Bellman, R., 423, 426
Biederman, I., 276
Birnbaum, L., 278
Blake, A., 270
Bogoni, L., 202
Bohm, C., 127
Bonasso, P., 88, 234
Borenstein, J., 189
Bosser, U., 106, 125, 172
Bowyer, K., 201-202, 298
Brady, M., 2
Braitenberg, V., 10-14, 133, 325, 342, 427-428
Brand, M., 278
Brill, F., 273
Brogan, D., 170
Brooks, R., 15, 28, 74, 82, 106, 131-134, 166, 168, 170, 298, 313, 424
Budenske, J., 202
Bunke, H., 298
Cameron, J., 217-218
Cao, Y., 369
Chalmers, D., 425
Chapman, D., 73, 169, 214, 258, 260, 272
Chatila, R., 232
Chiel, H., 57, 69
Christensen, H., 298
Connell, J., 77, 98, 167, 229-230, 316
Cook, D., 409-410
Cooper, P., 278
Crisman, J., 289
Crowley, J., 269
Culhane, S., 276-277
Dean, T., 125
Deneubourg, J., 53
Dennett, D., 181, 425
Dickmanns, E., 288
Donald, B., 263
Donath, M., 171
Drexler, E., 39
Dyer, M., 380
Eklundh, J.-O., 270, 298
Elfes, A., 189, 216
Espenscheid, K., 58
Ferrell, C., 77, 129
Ferrier, N., 298
Firby, J., 74, 169, 219
Flynn, A., 27, 439
Franklin, R., 381
Franceshini, N., 55-56
Fu, D., 278
Gage, D., 394
Galanter, E., 169
Gallistel, C., 184
Gardner, H., 244
Gat, E., 82, 218
Gaussier, P., 325
Georgeff, M., 169
Gerald, M., 430
Gibson, J. J., 46, 51, 244-245, 255
Gini, M., 202
Goel, A., 217
Goss, S., 53, 130
Grupen, R., 171
Hammond, K., 278
Harmon, L., 381
Hayes-Roth, B., 125, 129, 231
Hebb, D., 321
Henderson, T., 171
Hendricks, A., 222-223
Hillis, D., 168
Hodgins, J., 170
Holldobler, B., 52, 364-365, 383
Horswill, I., 129, 133, 264, 298
House, D., 40
Hsu, J., 391
Hull, C., 180
Iberall, T., 286
Jacopini, G., 127
Jennings, J., 263
Johnson, S., 166
Kaelbling, L., 87-88, 118, 166
Kahn, P., 171
Kahn, R., 295
Kant, I., 41
Khatib, O., 98, 143
Kluge, K., 286
Koenig, S., 197
Komoriya, K., 82
Konolige, K., 115, 346
Koren, Y., 189
Kosecka, J., 280
Krogh, B., 98, 234
Krose, B., 323
Kube, R., 374
Kuniyoshi, Y., 200, 277, 393, 400
Latombe, J.-C., 98
Lim, W., 172
Lin, F., 391
Lorenz, K., 47-48
Lyons, D., 86-87, 118, 142, 212, 222-223, 286
McCarthy, J., 14, 426
McFarland, D., 51, 106, 172
MacKenzie, D., 82, 376, 412
MacLennan, B., 381
Mahadevan, S., 316
Maes, P., 112, 167-168, 170-171, 313
Malcolm, C., 181, 212, 231
Marr, D., 254, 256
Martial, H., 199
Mataric, M., 125, 129, 136, 393, 395-396
Meystel, A., 210
Millan, J., 327
Miller, D., 172
Miller, G., 169
Minsky, M., 14-15, 44, 168, 254, 330, 376, 427-428, 441
Mintz, M., 280
Mitchell, T., 232
Mittelstaedt, H., 242
Moravec, H., 18-19, 106, 189, 216, 426-428, 439, 441, 442
Murphy, R., 217-218, 286-288
Neisser, U., 42-43, 207, 244-246
Nelson, R., 297
Nilsson, N., 96, 118, 166, 306
Noreils, F., 232
Norman, D., 42, 207-209
Pahlavan, K., 270, 298
Pau, L., 284
Pavlov, I., 322
Payton, D., 75, 112, 114, 129, 170, 195, 199, 273
Penrose, R., 128, 424
Pfeifer, R., 323
Piaget, J., 42-43, 310
Pin, F., 347
Pomerleau, D., 291
Premvuti, S., 362, 368
Pribram, K., 169
Prokopowicz, P., 295
Quinn, R., 58
Ram, A., 217, 349
Reece, D., 260
Reynolds, C., 170
Rosenblatt, F., 44
Rosenblatt, J., 112-113, 170
Rosenschein, S., 87-88, 166
Ruspini, E., 115
Saffiotti, A., 115, 230, 346
Saridis, G., 24, 234
Schank, R., 305
Schneider, W., 207
Schoppers, M., 74
Scutt, T., 325
Searle, J., 425
Shafer, S., 260
Shallice, T., 207-209
Sherrington, C., 180
Shiffrin, R., 207
Simmons, R., 197
Sims, K., 339
Skinner, B. F., 45, 180, 207
Slack, M., 100, 169, 221
Smithers, T., 181, 212, 231
Stark, Lawrence, 203, 277
Stark, Louise, 201-202
Steels, L., 106, 178, 338
Stein, L., 298, 382, 429
Stentz, A., 199
Stone, H., 125
Stroulia, E., 217
Swain, M., 278, 295
Tachi, S., 82
Takeuchi, S., 439
Tanimoto, S., 178
Thorpe, C., 286
Thrun, S., 355
Tinbergen, N., 47-48, 363
Tsotsos, J., 28, 240, 270, 276-277
Turban, E., 178
Uhlin, T., 270
Ullman, S., 169, 256-258, 260
Vershure, P., 323
Viola, P., 98
Wallace, R., 289
Walter, W. Grey, 8-10, 133
Wang, J., 385
Watanabe, Y., 341
Webber, B., 111
Weitzenfeld, A., 255
Wellman, M., 125
Werner, G., 380
Wiener, N., 8
Wilkes, D., 210
Wilson, E., 52, 364-365, 383
Winston, P., 320
Woodfill, J., 295
Yamauchi, B., 188, 200, 297
Yanco, H., 382, 400
Yoshida, E., 381
Yuta, S., 362, 368
Zabih, R., 295
Zadeh, L., 170
Zapp, A., 288
Zelinsky, A., 200
Zeltzer, D., 170
Zhang, H., 374
Zrehen, S., 325
Subject Index
A* , 197, 199,215, 217 Arbitration . SeeCoordination , behavioral , Abstractschema arbitration , 255 language . Abstraction ARCHprogram , 117- 118, 135, 143, 166, 231, . 320 233, 414 Architecture -partitioned -selection Abstraction evaluator action - 110 , 233 , 109 , 149 , 112 , - 168 ABSTRIPS 167 , 14 , 174 ACBARR , 349- 351 , 231 agent Accommodation ALLIANCE , 310 , 372-375,420 animate , 372- 373, 398 , 169 , 174 Acquiescence agent Action-perception ARC,172 cycle, 245-246, 273 Activation Atlantis , 214,218-222,235 levels AuRA , 141, 149, 167 , 130 , 214-218,229,234-235 BART , 167 , 171 spreading values . 324 circuit , 87-89, 166 , 174 - 168 , 167 , 174 , 317 (seealsoLearning ) Adaptation colony - 113 DAMN - 170 , 306 , 112 , 169 , 174 , 199 , 401 evolutionary on-line, 217, 225, 338-339, 350 definitions , 125 deliberative Jhierarchical , 21- 24 planningas, 214, 224 reaction , 326 , 172 Adaptivecriticelement dynamic evaluation - 129 , criteria . 128 Adaptiveheuristiccritic(AHC). SeeLearning heuristiccritic - 128 , 126 adaptive expressiveness , 214, 222 robot , 232 Advising generic Afferentinputs,36 , 28, 117 , 206-207,346,429 hybrid Affordances . 46. 91. 201- 202. 244- 245. 255. L-ALLIANCE , 398-399
299 AFSM - 137 , 132 , Agent AgentSeeSoftware Alder , 328 - 338 ALECSYS , 337 ALFA , 219 Allen, 133 , 140 Allothetic , 242
ALV. See DARPA autonomous land vehicle program
ALVINN, 291, 293, 401
Anchovies, 154
Animation, computer, 170-171, 361
Ant, 52-54, 130, 364-365
Antibody, 438
Antigen, 438
NASREM , 22, 124 , 126 -based niche , 52,172 - 130 , 126 , 129 organizing principles -reactor , 214,222-226,235 planner reactive deliberation , 233-234 - 165 scbemas , motor , 141 , 174 SFX,286-288 SIVS , 258,262 skillnetwork - 171 , 170 , 174 , 172 SiDartyCat SSS , 229 - 141 , 109 , 112 , 126 , 130 , 164 , subsumption 166 - 167 , 313,370,372,435 , 174 , 233 supervenience teleo - reactive , 233 agent , 381, 399- 401 tropismsystem cognitive
SubjectIndex Architecture (cont.) UGVDemoD, 401- 403 Ariel, 4 , 378-379 ArmyAnt project ArtificialInsectproject , 57 Artificialintelligence (AI) " distributed , IS, 142, 360, 393 origins, 14- 15
fighting,51 , 165, 183 fly-at-a-window gain, 93, 108, 113- 114, 120, 142, 149- 150, 326 , 129, 136, 142, 166 granularity , 185,242, 370 boming , 132, 136- 137 layers letting, 367 libraries , 424,426 , 143, 163, 164, 171 strong Artificiallife. 361 , 92-93 mapping . SeeBehavior , assemblage , 51 Assemblage mating micro-, 72 , 206, 223-225, 231 Assembly Assimilation , 310 network , 142 Attention , 270 objectrecognition definition , 24 , 230 preplanned focusof, 171, 240, 251, 273, 276- 279, , 116, 142 primitives 288 reflex,8, 26, 327 in humans roadfollowing(driving), 256, 286 , 276- 277 resources , 207- 208 robot, 57, 65 requiring visual,258 social,363- 364, 428 Attila, 140 , 374, 376 stagnation AuRA(Autonomous RobotArchitecture , 296-297(SeealsoVision, ). See tracking Architecture , AuRA , tracking ) computer AUSS,6 traits, 419 Autonomouse visual,298 , 337 Awareness willed, 207 , 166,207 -basedsystems Behavior Baboon , 362 , 26-27 aspects Back-propagation , 44, 322, 355- 356 , 69-79, 141- 142, 172- 173 design Bat, 34, 185,243 dogma , 131 Bee, 60, 185, 362, 425 features , l24 Behavior , 163 reconfiguration animal nIle-based , 21- 22, 39- 41, 47- 52, 58, 62, 154, , 95-97, 120, 313, 370- 372 365- 366 usefulness , 127- 128 , 48, 87, 116- 118, 120, 130, 309, BehaviorLanguage , 97, 133- 134, 137, 140, assemblage 172 326, 376 Belief, 227 , 104- 119,370 assembling autochthonous Birds, 154, 185, 365, 367 , 171 Bolds, 170 automatic , 207 chase Brachiation , 337 , primate , 60 Brain classification , robot,77- 79 continuous functionalencoding , 98- 104, , 36, 185, 243 auditorycortex function 120, 143 , 36, 142 , 9- 10 , 185, 194 cybernetic hippocampus definition , 24 , 36, 431 hypothalamus discrete limbicsystem , 93-98, 120, 133, 166, , 35- 36, 427 encoding 168. 
170- 171 , 43 parietalcortex structure , 35-37 , 50, 364 display visualareas , 279, 361 , 35-36, 184,243- 244 docking dnmkensailor, 165 BrownUniversity , 265 BUSTER , 278 , 24, 27, 105- 107, 120, 167 emergent Buzz, 143, 165 , 89- 103, 130 encoding , 51 escape C Language , 229 , 81 experts , 328 , 79- 89 expression Cairngorm
SubjectIndex -MellonUniversity(CMU ) , 18-20, Carnegie 169, 198- 199,232, 286, 289, 291 Case -basedreasoning . SeeLearning , -based case CaseWestern Reserve , 57 University Cat, 243 CCDcamera , 250, 278 CEBOT , 361- 363 Centralnervous system(CNS), 33- 41 (See alsoBrain) CertaintyGrid, 189- 190 Chemotaxis , 53-54 Chinese roomargument , 425- 426 Chip, 169 Chromosome . 331 -Thringthesis Cburch , 424 Classifier , 337- 338, 356 systems CMURover , 18-20 CNRS,55 Cockroach , 34, 56- 60, 69, 440 Cog, 140, 298-299, 302- 303 , 42 Cognitivemodels failure,219 Cognizant rnmmllnicat1nn animal , 50, 60- 62, 362- 364, 366 broadcast , 392, 432 content , 382- 385 evolutionof, 380- 381
goal,384-387 - 392 guaranteeing - , 385
, 60- 61 honeybee implicit, 385 interrobot(SeeMultiagentrobotics , communication ) , 131 minimizing , 381- 382 range , 391 signboard state , 384-387 , 135 subsumption , 74, 167- 168 Competences , environmental , 51 Competition , 240, 263 , computational Complexity , 322- 325, 356 , classical Conditioning Conflictresolution , 94 . SeeNeuralnetworks Connectionism Consciousness , 422, 424- 427 . SeeNavigation , consuming Consuming Control , 307 adaptive automatic , 207- 209 -loop, 242 closed feedback , 8, 431, 434 fuzzy, 115, 342- 349, 356 hierarchical ) (SeeDeliberative systems -loop, 242 open
reactive , 24 threads , 208-209 willed, 207- 20S , 316, 321, 333 Convergence , 393 Cooperation by observation Coordination , behavioral arbitration , 10, 74, 111, 113, 133, 139- 140, 142, 166, 16S,170, 197, 376 , lOS, 111- 113, 129, 133, 16S competitive , lOS, 113- 115, 129, 142- 143 cooperative functions , Sl, lOS,309, 412 fusion, 113- 116, 142 maximum selection , 50 -based schema , 149- 155,326, 377 , 109, 135 subsumption - 114, 169- 170 voting, l12 ' Coulombs law, 98 Creditassignment , 309, 311 problem Critic, 311, 329, 337 Crossover , 332, 339 K2A robot, 172 Cybermotion , 8- 14, 267 Cybernetics D* , 199-200, 204 DAI. SeeArtificialintelligence , distributed DARPAautonomous landvehicle(ALV) , 195,275 program DARPAUGVDemon program , 170, 229, 401- 403, 410- 420 Deadlock , 385, 391- 392 Deadreckoning , 247- 248 Decisiontheory , 409- 410 Defense , 195 MappingAgency Deformation zones , 102- 103 of freedom( DOF ) , 90 Degree Deliberative systems characteristics , 21 , 209 examples use, 206 knowledge , deliberative ) (SeePlanning planners Deltanile, 321- 322 -Shafermethods , 217 Dempster Denningrobot, 165, 217, 378- 379, 384, 418- 419 of Defense , 249 Deparbnent Design /constrained , 69-72, ethologically guided 120, 143, 168, 171 driven,74-79, 120, 133, experimentally 136- 137, 168, 170- 171 -based motorschema , 155- 156 situatedactivity-based , 72- 74, 120, 166, 169 , 136 subsumption Desires , 227
SubjectIndex , 90 Dimensionality Discrete eventsystems , 280, 284-28~ . SeeBehavior , display Displaybehavior Distinctive , 192, 204 places Dog, 322, 331 DOF.SeeDegree of freedom DrexelUniversity , 22- 23 , 232- 233 Dynamical systems approach , 89-90 Dynamics Echolocation , 243 Ecological ,I approach dynamics , 27 niche, 27, 51- 52, 125, 174, 211, 217, 334- 337, 339, 442 , 128 pressure robotics ,1 R2, 328, 330 Edinburgh Efferentsignals , 36 Electrotechnical ( ETL), 393 Laboratory Embodiment , 26, 135 . SeeBehavior , emergent Emergence Emotions . SeeRobot , emotions Endocrine , 430- 431 system , 431 Energymanagement Environmental Research Instituteof Michiga E
( ERIM ), 381 , 184 Epistemology , 440-442 Equivalence ErnieandBert,382-383 , 32, 47- 52, 360 Ethology Evolution , 51 , 46, 240,246,251,272,274-275, Expectations 278 , 241-242 Exteroception Faulttolerance , 75, 77, 360,420,438 FERMI , 286 Finitestateacceptor , 81-86, (FSA ) diagrams 118 , 120 , 131 , 157 , 169 , 194 , 217,226, 280-281,384-385,391-392 Fish,154 , 362,364 Fitness function , 332-334,342,356 Fixation , 272-273 -action Fixed , 48, 51, 57, 94, 268 patterns , 166 , 227-228,346 Flakey . SeeNavigation , flocking Flocking , 55 Fly, house Focus of attention . SeeAttention , focusof . SeeNavigation , foraging Foraging Forced relaxation , 224 assumption Formal Methods , 84 Formations . SeeNavigation , fonnation Fovea , 203,276,277
Frame of reference
  egocentric, 256, 271
  object-centered, 271
Frog/toad, 34, 38-41, 69-72, 243, 247
FSAs. See Finite state acceptor diagrams
Fuddbot, 172
Functional notation, 80-81, 120
GA-Robot, 333
Gaits. See Walking, gaits
Ganesha, 401
Gapps, 87-89, 95-96, 120, 164, 166, 234
General vision problem, 268
Genetic algorithms, 155, 217, 331-342, 356, 400
Genghis, 76, 133, 136-137, 313-316, 374
George, 143, 164-165
Georgia Tech, 143, 349, 377, 383, 412
Gerbil, 185
Gibbon, 60-61
Goedel's Incompleteness Theorem, 424
GPS (Global Position System). See Sensor, GPS
Gradient descent, 322, 331
Gradient field, 195-196
Habituation, 306-307
HACKER, 14
Hamming distance, 318-319
HARV, 143, 164
Harvard University, 298, 300
Hearsay II, 15
HEGI060, 5
Helicopter, autonomous, 251
Herbert, 78, 168
Hero2, 232
Hexapod. See Robot, hexapod
Hierarchical control. See Deliberative systems
HILARE, 17-18, 232
Hill climbing, 192, 322
HMMWV, 229, 401-403
Holonomic, 90
Homeostasis, 216, 306, 430-432, 438
Homeostat, 432-433
Homeostatic control, 215-216, 443
Hormones, 430-432, 435-437
Hough transform, 279
Hughes Artificial Intelligence Center, 114
Hybrid (deliberative/reactive) systems
  biological evidence, 207-209
  components, 212
SubjectIndex connections , 209, 212 interface , 214 strategies
, 213,218 layers . 212 synthesis
IBM, 229, 316
Ideothetic, 242
Idiotype network, 438
Imagination. See Robot, imagination
Immune systems, 438
Impatience, 165, 372-373, 398
Indexical-functional, 73
Infrared sensor. See Sensor, infrared
Indexing problem, 309
INS (Inertial navigation system). See Sensor, INS
Insects, 185, 362
Intelligence, 131
Intention(s), 24, 43, 46, 142, 167, 182, 201, 227, 360
Lateralinhibition,35, 112 Lawof EtIect, 310- 311 Lawof Universal Gravitation , 98 ) Learning(seealsoAdaptation
- 313 heuristic critic(AHC ), 312 , adaptive 325 -329 . 356
, 224 , 15, 173 strong , 181 taxonomy timehorizon , 184 tradeoffs , 180 - 183 , 182 , 186 , 203 transitory , 182 types weak , 224 , 173 KTH, 298,301
batch , 310 case -based , 217, 308, 349-354 communication , 382- 383 continuous , 310 deductive , 310 definitions , 306 dimensions , 310 , 308, 333- 337 evolutionary -based , 350-355 explanation fuzzyrules, 347- 349 Hebbian , 321, 323, 325 /neural , 340, 342 hybridgenetic imitation , 395, 400- 401 inductive , 310 lazy, 350 machine , 218 -based , 308, 350 memory momentum , 350 , 308 multistrategy , 37 neurobiological for, 308- 310 opportunities , 309 problems Q- , 312-313, 316, 318-320, 355, 356, 396, 399, 438 reinforcement , 307, 310- 320, 356, 396-398, 438 social,395- 401 , 311, 330, 347 supervised , 311 unsupervised Leastcommitment , 214, 216- 217, 235 , lekking Lekking.SeeBehavior LIFIA, 347 USP, 81, 87, 97, 219 Localminima , 99, 151 Localization , 180- 181, 187, 395 Locusof superiorcommand , 50 Loebnercompetition , 423 Logic formal, 166 fuzzy, 115, 342- 343, 345--346 multivalued , 115,230- 231 /actuators ( LSAs ), 202 Logicalsensors LTM. SeeMemory , long-term
LAAS, 17 Landmarks , 184, 193,264, 325 LAURON , 328-329 Laserscanner . SeeSensor , laserscanner Lateralgeniculate nucleus , 243
Manifolds , 102- 103 Map a priori, 186, 194-200 -204 , 181, 184- 186. 203 cognitive , 229 geometric
Io, Callisto, Ganymede, 163-165
IRM (Innate releasing mechanisms), 48, 63
IS Robotics, 137-138
JetPropulsion (JPL), 218 Laboratory School of Medicine Johns Hopkins , 40 Khepera , 342- 343 Kin recognition , 51, 247, 393, 4(1) Kinematics , 89-90, 248, 339- 340 Knowledge , 66, 131, 211, 263, avoidingrepresentational 268 basis , 179 characteristics , 179- 184, 203 conb ' Oversy , 178 definitions , 178 , 212 necessity -to-know, 185 need , 272- 273 perceptual persistent , 179, 182- 183, 190, 203
SubjectIndex Map (cont.) instantaneous obstacle (10M), 189 meadow , 215 perilsin using, 195 , 200 purposive Qualitative , 192- 193 sonar , 216 , 180, 186, 193 spatial tectal, 185 Mapping behavioral , 92-93 , 184,243, 258 retinotopic as, 263, 269 sensing , 36, 184,243 somatotopic , 244 tonotopic , 36, 56 topographic MARGE,346-348 Markovmodels , 197 Mars/LunarRover , 77, 172, 221 Massachusetts Instituteof Technology ( MIT) , II , 38, 130, 133, 141, 168, 264, 298, 392, 429, 439 MAVERIC , 412 Maximumselecting , 50 system MELDOG , 82 function , 343- 344, 356 Membership Memory action,188- 189 associative , 183, 328, 330, 356 behavioral . 186- 190.204 , 183 forgetting global,22 long-term, 37. 40, 183. 185- 186. 204, 310 short-term. 37, 40, 155. 183, 185, 187. 204. 216- 217, 310 wall, 188 Metabolism . 431 MetaToto, 429 MilitaryUniversityof Munich,288 , 242 Millipedes Mission Lab. 377, 410- 415, 430 MITI. 428 Mobilemanipulator . 77, 144. 153, 165. 168 . SeeSoftware , modularity Modularity Molecularmanufacturing , 439 . macaque . 243 Monkey MSSMP .6 robotics Multiagent . 359- 360 advantages , 394 applications , 376 assemblage communication . 50- 51. 360- 364, 366, 369. 372, 378-392. 443 . 367 congregation methods . 394-395 coverage
designdimensions,365- 368 , 360 disadvantages emotionsin, 428 , 362- 364 edtologicalconsiderations , 365,369 heterogeneity immunesystemsin, 438 interference , 360 metrics, 367- 368 micro-, 439, 441 missionspecification,41(}..415 system, 137 teamsize, 369 , 377, 417- 420 teleautonomy Mutation, 332- 333 Mutual exclusion, 385
, 60, 428 Nagoya University , 438-440 Nanotechnology NASA , 22, 219 NASREM . SeeArchitecture , NASREM National Institute ofStandards andTechnology (NIST ), 21- 23 National Taiwan , 391 University NATs.SeeNavigational templates NavalResearch Laboratories , 338 Navigation animal , 40- 41, 184- 185,242, 362- 364 classroom , 67, SO , 82- 84, 87, 109- 111 , 361, 385 consuming , 202 doorway flocking,361, 363, 371 , 130- 131, 137- 139, 157, 161, 163, foraging 185, 361, 364, 366, 372, 381, 384-386 fonnations , 361, 377, 401, 403- 410 , 361 grazing , homing ) homing(SeeBehavior indoor,165,218, 229, 265, 327 office, 16- 17, 194, 197, 227, 229, 231 outdoor , 165, 195, 199-200, 218 , 192- 193,204 qualitative robot, 56, 152, 165, 194 -based schema , 69, 143- 161, 333, 395 three-dimensional , 144, 152 , 100,221, 235 Navigational templates Navlab(I), 289, 291, 294 NavlabD, 199 Navlab5, 293, 295 NerdHerd, 370- 372, 420 Neural circuits,34, 393 model , 57 networks , 43- 44, 47, 183- 184, 307, 320-331, 356 Neuron , 32- 33 -Pittsmodel McCulloch , 43
SubjectIndex Neuroscience , 32- 44 Newtermproblem , 309 NewYorkUniversityMedicalCenter , 40 NOAH, 14, 223 NoHandsAcrossAmerica , 293 Noise,99, 151, 155- 157, 163, 165 NOMAD, 324 NomadicTechnologies , 126, 200, 249- 250, 327, 338 Nondetenninism , 107, 381 Non-holonomic , 90, 326 NorthCarolinaStateUniversity , 346 Northwestern , 278 University
Opticflow, 55, 255 OsakaUniversity , 319 Pandemonium 15 Pathfinder , 219 Pathintegration , 242 , 73-74, 129, 258, 272 Pengi , 264-265, 268, 272, 287- 288 Percept Perception action-oriented , 143, 151, 201, 215, 240, 245- 246, 265-270, 279, 299 active , 202, 240, 265, 269- 272, 299 ascommunication , 246-247 distributed , 360, 392- 396, 409- 410, 420 ethologyand, 246- 247 , 254 generalized modular , 254- 265 -to-know, 143, 201, 240 need neuroscience and, 242- 244 and, 244-246, 286 psychology ron, 44, 321- 322, 324, 330 Percept Perceptual classes , 240, 262- 264, 393 , 263 equivalence , 150 performance schema , perceptual ) (SeeSchema , 268- 269, 273, 279-283, 299 sequencing , 240 specialization trigger,92- 93, 209, 217, 280 Pheromone , 53 , 84 PhonyPony Phototaxis , 337 Plan definition , 14 digitalterrain, 195
elaboration , 226-227, 235 asFSA, 217 expression global,209 internalized , 195- 196,203, 273 , 393 recognition , 74 sketchy Universal , 74 Planning , 223 anytime , 15, 67 avoiding deliberative , 209-212 hierarchical , 209-212 in hybridsystems , 214 mission . 215-216 motion,360 multisb ' ategy , 217 path, 98, 231, 234 route,215- 216 , 21 scope Policy,312 Polly, 133, 140, 264- 266 Portautomata , 86
Potential fields . 43.69.98-99. 113 . 151 . 120 . 172. 190. 232. 234. 325 Primate , 243, 364 Procedural ), 214, Reasoning System(PRS 226- 230, 235 , 241- 242, 248 Proprioception , 393 Prosopagnosia Prosthetics , 172 , 45- 47 Psychology behaviorism,45, 180- 181, 207 cognitive, 46, 181, 203, 207- 209, 267 definition, 32 ecological, 46, 51, 245 gestalt, 44 gibsonian, 244- 245 modelfor sensorfusion, 286 , 45 sensorypsychophysics
Q-learning. See Learning, Q
Qualification problem, 211
Qualitative navigation. See Navigation, qualitative
Qualities of sociality, 365
Quantum computers, 425
R-1 robot, 138
R-2 robot, 374-375
R-3 robot, 394
RALPH, 293
Randomness, 155
Ranger, 401
RAPs, 74, 77, 129, 169, 219, 234-235
Rat, 185, 194
RCS, 22, 166
Reactive action packages. See RAPs
Reactive systems
  assumptions, 206
  characteristics, 66-67
  definitions, 24, 66
Recruitment, 383-384
Reflexes, 47 (See also Behavior, reflex)
Ren and Stimpy, 143, 164-165
Rensselaer Polytechnic Institute (RPI), 24
Representation (see also World model)
  deictic, 272-273
  drawbacks, 203
  function-based, 201-202, 204
  geometric, 189, 201, 203
  grid, 187-189, 195, 199, 204, 234
  in genetic algorithms, 331
  mental, 244
  metric, 179, 182, 197, 204
  perceptual, 200-203
  qualitative, 192-193, 204
  relational, 18, 179, 182
  road model, 289-290
  spatial, 182, 186, 193, 199
  topological, 192, 197, 325
Reproduction, 332-334, 363-364
Resource distribution, 366-367
Response
  categories, 93
  continuous, 98
  fight-or-flight, 432
  global, 108, 318
  instantaneous, 90
  orientation, 89, 93, 144
  range, 90-91
  strength (magnitude), 89, 144
  unconditioned, 322, 324
Reuse. See Software, reuse
Rex, 87, 166, 219, 234
Road following. See Robots, road following
Robby, 221
Roboroach, 439-440
Robot
  aerial, 218, 394
  arm, 231, 241, 270, 297
  bio-robots, 52-62
  biped, 2-3
  box-pushing, 316-319, 374, 376-377
  cell repair, 439
  cellular, 361
  cockroach, 58-59, 439-440
  competitions, 84, 86, 118-119, 165, 217-218, 251, 346, 348, 393
  control spectrum, 20, 67, 207
  convoying, 394
  definitions, 1, 2
  delivery, 231
  docking, 279-283
  education, 439
  emotions in, 11, 427-428, 443
  exploration, 134, 326-327, 439
  explosive ordnance disposal, 439
  gnat, 439
  grasping/manipulation, 171, 225, 231, 337, 355
  hand, 225
  hazardous waste cleanup, 374
  head, 298-303
  hexapod, 4-5, 58-59, 75-77, 136-137, 140, 313-316, 328-329, 342, 438
  imagination, 428-430
  industrial, 241
  juggling, 297, 350
  landmine detection, 394
  LEGO, 11
  micro-, 439-441
  military, 218, 247
  mind, 422-430
  nano-, 438-439
  non-holonomic, 90-91
  omnidirectional, 91
  pallet moving, 378-380
  quadruped, 4
  reconnaissance, 394, 401, 409-410
  road following (driving), 44, 256, 260-262, 275, 286, 288-295
  schema-based, 163-165, 171, 395
  self-assembling, 439-440
  shooting, 319-320
  subsumption, 140-141
  surveillance, 125, 231, 394, 401
  taxonomies, 368-370
  teams (See Multiagent robotics)
  tour guide, 140, 264-266
  vacuuming, 218, 428-429, 435-437
  wheelchair, 98, 168, 172
Robotics
  origins, 15-20
  rehabilitative, 172
Robustness, 129, 131, 163, 166
Robuter, 326
Rockettes, 439
Rocky III, 172
Rocky IV, 220
Rodney, 342
Router, 217
RS (Robot Schema) model, 86-88, 118, 120, 223-224, 255
Rules
  behavioral, 137, 139, 232, 313-315, 370-371
  classifier, 337-338
  fuzzy, 343-344, 346-349, 356
  If-then, 94, 98
  instinct, 328
  production, 96, 347
RWI robot, 197, 219, 316
Saliency problem, 309
SAMUEL, 338
Satellites, 249
Scale, issues of, 28
Scanpaths, 203, 277, 279
Scarecrow, 171
SCARF, 289
Schema
  anticipatory, 246
  avoid-past, 145, 155, 158-161, 163, 165, 187-188
  definitions, 43, 48, 255
  encodings, 145-148
  motor, 141-165, 174, 230, 256
  perceptual, 143-144, 215, 217, 254-256, 407
  receptor, 434
  signal, 434
  theory, 41-43, 163, 167, 254-255
  transmitter, 434
Schooling, 362, 365
Sense-Plan-Act paradigm, 130
Sensing, passive and active, 247, 249, 409
Sensitization, 306-307
Sensor
  compass, 429
  DGPS, 249, 407
  fuel, 215, 434
  fashion (See Perceptual sequencing)
  fission, 268-269, 299
  fusion, 268-269, 283-288, 299
  GPS, 188, 248-249, 407, 414
  infrared, 19, 168, 342
  INS, 188, 248
  internal, 215
  laser scanner, 247, 251-254
  models, 107
  photocell, 9
  shaft encoders, 247-248
  temperature, 215
  touch, 313
  ultrasonic, 18-19, 168, 188-190, 193, 202, 216, 243, 247, 249-250, 253, 316, 326, 429
  vision (See Vision, computer)
  whisker, 328
Seymour, 133, 140
SFX. See Architecture, SFX
Shaft encoders. See Sensor, shaft encoders
Shakey, 16-17, 130, 166
Shaping, 330
Sheep, 154
Simulated annealing, 155
Simulations, 27, 53, 136, 168, 172, 333, 337-342, 349, 379, 381, 428
Situated automata model, 87-89
Situatedness, 8, 26, 135
Situational hierarchy, 223
SIVS. See Architecture, SIVS
Societal Agent Theory, 376-378, 412, 420
Society of Mind, 15, 376
Software
  agent, 163, 170-171
  modularity, 66, 129, 163, 166, 218
  objects, 163
  reuse, 156, 163, 264
Sojourner, 220
SOMASS, 231
Sonar. See Sensor, ultrasound
Sozzy, 428-429, 435
Spatial reasoning, 215
Speedup, 367-368
Spotlight metaphor, 276
Spreading activation. See Activation, spreading
Squirt, 133, 140, 439
SR diagrams. See Stimulus-response diagrams
SRI (Stanford Research Institute), 16, 166, 227, 231, 346
Stability, 432
Stagnation, 187, 391
Stanford Cart, 18-19
Stanford University, 295
Stickleback Fish, 48-49
Stimulus
  conditioned, 322-324
  domain, 91-92
  environmental, 142
  perceptual, 241-242
  plane, 89
  strength, 91, 93
  unconditioned, 322-324
Stimulus-response diagrams, 79-80, 118-120, 139, 161-162
STM. See Memory, short-term
Stress, 432
Stripe, 401
STRIPS, 14, 17, 232
Subsumption architecture. See Architecture, subsumption
Superpositioning. See Vector, summation
Survival, 51
Symbol grounding problem, 27, 181-182, 241
T-norms, 115
Target tissues
Targetability, 434
Taxes, 47-48
TEIRESIAS, 347
Teleautonomy, 145, 165, 377, 415-420
Teleo-reactive, 96, 118, 233
Terregator, 289, 292
Teseo, 327
Theo-agent, 232
Theory of intelligence, Albus's, 21-22
Thermoregulation, 431-432
Thought, computational, 423-425
Thyroid, 431
Tito, 140
TJ, 229-230
Tom and Jerry, 140
Tooth, 219-220
Tortoise, Grey Walter's, 8-10
Toto, 133, 140, 193
Tracking. See Vision, tracking
Transmigration, 441-442
Tropism, 8, 381, 399-401
Turing Test, 423
UAV. See Unmanned vehicles, aerial
UGV. See Unmanned vehicles, ground
Ultrasonic sensors. See Sensor, ultrasound
Ulysses, 260-263
UM-PRS, 227, 412
Universal plans. See Plan, universal
University of Alberta, 374
University of Brussels, 53
University of California, Berkeley, 203
University of California, Los Angeles (UCLA), 369, 394
University of California, Riverside, 385
University of Chicago, 278, 295
University of Edinburgh, 328
University of Genova, 43
University of Karlsruhe, 328
University of Maryland, 275
University of Michigan, 410-412
University of Pennsylvania, 280
University of Rochester, 272, 297
University of Southern California, 84, 251, 327, 340, 381, 399
University of Texas, Arlington, 409-410
University of Toronto, 276
University of Tsukuba, 362, 368
University of Virginia, 273
University of Wisconsin, 271
Unmanned vehicles
  aerial (UAV), 3, 6, 251-252, 289
  ground (UGV), 3, 7, 170, 401-402
  undersea (UUV), 3, 6, 88-89, 95, 115, 234
Utility function, 312, 318
Utility problem, 309
UUV. See Unmanned vehicles, undersea
Value system, 325
VaMoRs, 289-290, 292
VaMP, 289-291
Vector
  action vectors, 144, 181
  based motor control, 38-41, 113-114, 142
  field histogram, 190
  fields, 232
  hypothesis, 40
  maps, 184-185
  polar, 184
  steering, 189
  summation, 98-99, 101, 113-114, 142, 149-150, 215, 347, 371, 378
Viewframes, 193
Vision
  biological, 35, 243-245, 245, 257, 273-274, 393
  fly, 55-56
  spatial, 245
  what and where, 36, 181, 200-201, 243
Vision, computer
  active (See Perception, active)
  animate (See Perception, active)
  depth from motion, 194
  functional, 201-202
  lightweight, 264-265
  model-based, 201
  panoramic, 194, 395-397
  paradigms, 238-240
  problems with, 238
  purposive (See Perception, action-oriented)
  recognition, 203, 274-275, 277
  schema-based, 42, 254-256, 377
  sensor, 250-251
  situated (See Perception, action-oriented)
  stereo, 18, 35, 140, 189, 298, 393
  task-oriented (See Perception, action-oriented)
  theory, 257
  tracking, 251-252, 258, 274-275, 286, 293-297
Visual routines, 256-262, 272
Visual search, 240, 258
Wallaby, 364-365
Walking
  gaits, 57-58, 313, 342
  subsumption, 136-137
WASUBOT, 2
Winner-take-all, 35, 110, 112, 170
Win . -I, 3
World models (see also Representation)
  as a convenience, 229
  construction, 66
  global, 241
  problems with, 107
  reliance on, 210
  symbolic, 21
  three-dimensional, 238
XAVIER, 197-198, 355
Yamabico robots, 362