Deutsche Forschungsgemeinschaft Auditory Worlds: Sensory Analysis and Perception in Animals and Man
Auditory Worlds: Sensory Analysis and Perception in Animals and Man. Deutsche Forschungsgemeinschaft (DFG) Copyright © 2000 WILEY-VCH Verlag GmbH; ISBN: 978-3-527-27587-8
Deutsche Forschungsgemeinschaft
Auditory Worlds: Sensory Analysis and Perception in Animals and Man
Final report of the collaborative research centre 204, "Nachrichtenaufnahme und -verarbeitung im Hörsystem von Vertebraten (Munich)", 1983–1997
Edited by Geoffrey A. Manley, Hugo Fastl, Manfred Kössl, Horst Oeckinghaus and Georg Klump
Collaborative Research Centres
Deutsche Forschungsgemeinschaft Kennedyallee 40, D-53175 Bonn, Federal Republic of Germany Postal address: D-53170 Bonn Phone: (02 28) 8 85-1 Telefax: (02 28) 8 85-27 77 E-Mail: (X.400): S = postmaster; P = dfg; A = d400; C = de E-Mail: (Internet RFC 822): postmaster@dfg.de Internet: http://www.dfg-bonn.de
This book was carefully produced. Nevertheless, authors, editors and publisher do not warrant the information contained therein to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.
Library of Congress Card No. applied for. A catalogue record for this book is available from the British Library. Die Deutsche Bibliothek – CIP Cataloguing-in-Publication Data A catalogue record for this publication is available from Die Deutsche Bibliothek
© WILEY-VCH Verlag GmbH, D-69469 Weinheim (Federal Republic of Germany), 2000 Printed on acid-free and chlorine-free paper. All rights reserved (including those of translation in other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law. Cover Design and Typography: Dieter Hüsken Composition: Hagedorn Kommunikation, D-86519 Viernheim, Germany Printing: Strauss Offsetdruck GmbH, D-69503 Mörlenbach Bookbinding: J. Schäffer GmbH & Co. KG, D-67269 Grünstadt Printed in the Federal Republic of Germany
Contents
Preface
    Geoffrey A. Manley
List of Abbreviations
Introduction
    Geoffrey A. Manley and Georg Klump
1 Design plasticity in the evolution of the amniote hearing organ
    Geoffrey A. Manley
1.1 The origin of the amniote hearing organ, the basilar papilla
1.2 The origin and evolution of the mammalian cochlea
1.3 The origin and evolution of the hearing organ of archosaurs
1.4 The origin and evolution of the hearing organ of lepidosaurs
1.5 General evolutionary trends in the physiology of amniote hearing organs
1.5.1 Functional principles seen in all auditory organs
1.5.2 Parallel trends in the evolution of amniote auditory organs
1.5.3 Specialization of hair-cell populations across the papilla in birds and mammals
2 Comparative anatomy and physiology of hearing organs
2.1 Anatomy of the cochlea in birds
    Franz Peter Fischer, Otto Gleich, Christine Köppl and Geoffrey A. Manley
2.1.1 Anatomy and evolution of the avian basilar papilla
2.1.2 Basic structure of the avian hearing organ
2.1.3 The hair cells
2.1.4 Hair-cell ultrastructure and synapses
2.1.5 Gradients in hair-cell morphology
2.1.5.1 Gradients in the shape of the hair cells
2.1.5.2 Stereovillar bundle shape
2.1.5.3 Stereovillar bundle orientation
2.1.6 Hair-cell innervation
2.2 A special case of congenital hearing deficits: The Waterslager canary
    Otto Gleich
2.3 Micromechanical properties of chicken hair-cell bundles
    Jutta Brix and Geoffrey A. Manley
2.4 Potassium concentration and its development in the chicken cochlea
    Geoffrey A. Manley
2.5 Discharge activity of afferent fibres in the avian hearing organ
    Otto Gleich, Christine Köppl and Geoffrey A. Manley
2.5.1 Spontaneous activity of afferent nerve fibres
2.5.2 Responses to simple tonal stimuli
2.5.3 Frequency selectivity
2.5.4 Excitation patterns on the starling hearing organ
2.5.5 Phase locking to tones
2.5.6 Birds and mammals – similarities and differences in auditory organs
2.6 Function of the cochlear efferent system – a comparative approach
    Alexander Kaiser, Geoffrey A. Manley and Grit Taschenberger
2.6.1 Why are birds valuable for studying hearing-organ efferents?
2.6.2 Comparative physiology of the avian efferent system
2.6.3 Putative efferent influences on otoacoustic emissions in the barn owl
2.6.4 Input to avian cochlear efferents
2.6.5 Speculations on the functional significance of cochlear efferents
2.7 The lagenar macula and its neural connections to the brainstem
    Geoffrey A. Manley and Alexander Kaiser
2.7.1 Efferent connections
2.7.2 Afferent connections
2.8 Anatomy of the cochlea and physiology of auditory afferents in lizards
    Christine Köppl and Geoffrey A. Manley
2.8.1 Basilar-papilla morphology in the bobtail lizard
2.8.2 Characteristics of primary auditory afferent fibres in the bobtail lizard
2.8.3 A model of frequency tuning and tonotopic organization in the bobtail lizard
2.8.4 Basilar-papilla morphology in the Tokay gecko
2.8.5 A model of frequency tuning and tonotopic organization in the Tokay gecko
3 Cochlear frequency maps and their specializations in mammals and birds
3.1 Cochlear specializations in bats
    Marianne Vater
3.1.1 Structure-function correlations in echolocating bats
3.1.2 Cochlear fine structure and immunocytochemistry
3.1.2.1 Receptor surface and tectorial membrane
3.1.2.2 Organization of OHCs and their attachments
3.1.2.3 Cochlear development in the horseshoe bat
3.2 Cochlear maps in birds
    Otto Gleich, Alexander Kaiser, Christine Köppl and Geoffrey A. Manley
3.2.1 Frequency representation along the basilar papilla
3.2.2 Possible changes in function across the papilla’s width
3.2.3 Development of body temperature in chickens – Implications for the development of frequency maps
3.3 The auditory fovea of the barn owl
    Christine Köppl
4 Models of the human auditory system
4.1 Psychoacoustically-based models of the inner ear
    Hugo Fastl for Eberhard Zwicker †
4.1.1 A hydrodynamic model of the human cochlea
4.1.2 An electronic model of the human cochlea
4.1.3 A computer model of the human cochlea
4.2 Linear model of peripheral-ear transduction (PET)
    Ernst Terhardt
4.2.1 Basic definitions and features
4.2.1.1 The ECR filter
4.2.1.2 The CTF filter
4.2.1.3 Temporal behaviour
4.2.2 Setting PET’s parameters
4.2.2.1 The ECR filter
4.2.2.2 The CTF filters
4.2.3 Simulation of the threshold of hearing
4.2.4 Simulation of tuning curves
4.2.5 Scaling of characteristic frequencies
4.2.6 Digital computation
4.2.7 PET and gammatone
4.3 Nonlinear mechanics of the organ of Corti
    Frank Boehnke
5 Active mechanics and otoacoustic emissions in animals and man
5.1 Otoacoustic emissions in lizards
    Geoffrey A. Manley
5.1.1 The origin of lizard SOAE
5.1.2 The influence of changes in head temperature on lizard SOAE
5.1.3 Interactions between SOAE and external tones
5.1.3.1 Level suppression by external tones
5.1.3.2 Facilitation by external tones
5.1.3.3 Shifts in SOAE frequency due to external tones
5.1.4 Distortion-product otoacoustic emissions in lizards
5.2 Otoacoustic emissions in birds
    Geoffrey A. Manley and Grit Taschenberger
5.2.1 Simultaneous-evoked emissions in the starling
5.2.2 Spontaneous otoacoustic emissions in birds (SOAE)
5.2.3 Distortion-product OAE in the chicken, starling and owl
5.3 Otoacoustic emissions and cochlear mechanisms in mammals
    Manfred Kössl
5.3.1 Acoustic distortions and the cochlear amplifier
5.3.2 Acoustic distortions and cochlear tuning
5.3.3 Enhanced auditory filters in bats
5.3.4 Distortion thresholds in mammals
5.4 Otoacoustic Emissions in human test subjects
    Hugo Fastl for Eberhard Zwicker †
5.5 Properties of 2f1-f2 distortion product otoacoustic emissions in humans
    Thomas Janssen
5.5.1 Influence of frequency ratio f2/f1 and level difference L1-L2
5.5.2 Stability of the DPOAE level
5.5.3 Relationship between DPOAE and hearing threshold in normal hearing
5.5.4 Suppression properties
5.6 Developing clinical applications of 2f1-f2 distortion-product otoacoustic emissions
    Thomas Janssen
5.6.1 Monitoring cochlear dysfunction during recovery
5.6.2 Scanning of cochlear dysfunction in hearing loss ears with and without tinnitus
5.6.3 Prediction of hearing threshold
5.6.4 DPOAE as a clinical tool
6 Neural processing in the brain
6.1 Auditory brainstem processing in bats
    Marianne Vater
6.1.1 The cochlear nucleus: Origin of parallel ascending pathways
6.1.1.1 Cytoarchitecture of the cochlear nucleus
6.1.1.2 Inputs, tonotopy and intrinsic connectivity of the CN
6.1.1.3 Chemoarchitecture of the CN
6.1.1.4 Physiology of the cochlear nucleus
6.1.1.5 Central connections of the CN
6.1.2 Superior olivary complex: the first stage of binaural interactions
6.1.2.1 Anatomy and connectivity of the principal nuclei of the SOC
6.1.2.2 Physiology of the SOC
6.1.2.3 Chemoarchitecture of the SOC
6.1.3 Lateral lemniscus
6.1.4 Inferior colliculus
6.2 Serotoninergic innervation of the auditory pathway in birds and bats – implications for acoustical signal processing
    Alexander Kaiser and Birgit Kuhn
6.3 Temporal processing in the lower brainstem
    Benedikt Grothe
6.3.1 The role of neural delays in IID-coding of LSO cells
6.3.2 The role of inhibition in processing periodic stimuli
6.3.3 Temporal vs. spatial processing in the lower auditory brainstem
6.4 The barn owl as a model for high-resolution temporal processing in the auditory system
    Christine Köppl
6.4.1 Does phase locking improve in the nucleus magnocellularis?
6.4.2 Frequency-specific adaptations of neuronal brainstem circuits
6.4.2.1 Neuronal and synaptic morphology
6.4.2.2 Axonal delay lines between N. magnocellularis and N. laminaris
6.5 Cortical physiology, sensorimotor interactions
    Gerd Schuller
6.5.1 Auditory cortex in the horseshoe bat
6.5.1.1 Primary auditory field
6.5.1.2 Anterior dorsal field
6.5.1.3 Posterior dorsal field
6.5.1.4 Rostral dorsal field
6.5.1.5 Dorsal field of the dorsal auditory cortex
6.5.1.6 Ventral auditory field
6.5.2 Afferent and efferent connections of auditory cortical fields
6.5.2.1 Thalamocortical connections
6.5.2.2 Connection with other brain regions
6.5.3 The auditory cortex in the horseshoe bat: Summary and conclusions
6.5.4 Audio-vocal interaction in horseshoe bats
6.5.5 Peripheral vocalization system in the horseshoe bat
6.5.6 Where in the brain can vocalization be elicited?
6.5.6.1 Paralemniscal tegmental area
6.5.6.2 Pretectal area
6.5.7 Functional implications of the acoustical afferent connections to the pretectal area
6.5.8 Possible control of motor actions by efferent connections of the pretectal area
6.5.9 Superior colliculus
6.5.10 Nucleus cuneiformis and adjacent lateral periaqueductal grey, and vocalization
6.5.11 The descending vocalization system in the horseshoe bat: summary and conclusions
6.6 The processing of ‘biologically relevant’ sounds in the auditory pathway of a non-human primate
    Peter Müller-Preuss and Detlev Ploog
6.6.1 The relationship of the auditory pathway to structures involved in sound production: Anatomy and physiology
6.6.2 Processing of species-specific vocalizations that are modulated in amplitude
6.6.2.1 Processing of sounds masked by a preceding stimulus
6.6.2.2 Processing of artificial amplitude-modulated sounds
7 Comparative animal behaviour and psychoacoustics
7.1 The European starling as a model for understanding perceptual mechanisms
    Georg Klump, Ulrike Langemann and Otto Gleich
7.1.1 Auditory sensitivity: absolute thresholds
7.1.2 Frequency selectivity
7.1.3 Frequency discrimination
7.1.4 Temporal processing
7.1.4.1 Temporal resolution of the auditory system
7.1.4.2 Long term integration
7.1.4.3 Duration discrimination
7.1.5 Spectro-temporal integration: comodulation masking release
7.1.6 The processing of signals by the European starling’s auditory system – conclusions
7.2 Mechanisms underlying acoustic motion detection
    Hermann Wagner
7.2.1 The problem underlying acoustic-motion detection
7.2.2 Recent psychophysical and neurological findings on acoustic motion
7.2.3 Physiological correlates of acoustic motion detection in the barn owl
7.3 Comparative echolocation behaviour in bats
    Gerhard Neuweiler
7.3.1 Field studies
7.3.1.1 Horseshoe bats
7.3.1.2 Gleaning bats
7.3.1.3 FM-bats
7.3.2 Insect abundance and prey selectivity
7.3.3 Time windows for echo perception
7.4 Echolocation behaviour and comparative psychoacoustics in bats
    Sabine Schmidt
7.4.1 Texture perception by echolocation
7.4.2 Discrimination of target structure by the gleaning bat, M. lyra
7.4.3 A broadband spectral analysis model for texture perception
7.4.3.1 Outline of the model
7.4.3.2 Implications of the model
7.4.4 Perceptual categories for auditory imaging
7.4.4.1 Prevalence of absolute pitch
7.4.4.2 Collective pitch versus spectral pitches and timbre
7.4.4.3 Spontaneous classification of complex tones – evidence for collective pitch perception in the ultrasonic range
7.4.4.4 Frequency resolution and auditory filter shape in M. lyra
7.4.4.5 Perception of spectral pitches – corroboration of the broadband spectral-analysis model
7.4.4.6 Focused perception of single partials in complex tones
7.4.5 Psychoacoustical frequency and time-domain constants in bats
7.4.5.1 Discrimination performance in the frequency domain
7.4.5.2 Time constants in echolocating bats
8 The human aspect I: Human psychophysics
8.1 The presentation of stimuli in psychoacoustics
    Hugo Fastl
8.2 Masking effects
    Hugo Fastl
8.2.1 Post masking (forward masking)
8.2.2 Binaural Masking-Level Differences
8.3 Basic hearing sensations
    Hugo Fastl
8.3.1 Pitch
8.3.2 Pitch strength
8.3.3 Fluctuation strength
8.4 Loudness and noise evaluation
    Hugo Fastl
8.4.1 Loudness evaluation
8.4.2 Physical measurement of loudness and psychoacoustic annoyance
8.4.3 Loudness of speech
8.4.4 Evaluation of road traffic noise
8.4.5 Evaluation of aircraft noise
8.4.6 Evaluation of railway noise
8.5 Perceived differences and quality judgments of piano sounds
    Miriam N. Valenzuela
8.5.1 Perceived differences between piano sounds
8.5.2 Influence of sharpness on quality judgments of piano sounds
8.6 Identification and segregation of multiple auditory objects
    Uwe Baumann
8.7 The role of accentuation of spectral pitch in auditory information processing
    Claus von Rücker
8.7.1 Spectral pitch as a primary auditory contour
8.7.2 Accentuation as a cue for segregation
8.7.3 Physical signal parameters evoking accentuation
8.7.4 Improved perception of vowels by accentuation
8.7.5 Thresholds for the accentuation of part tones in harmonic sounds
8.7.6 Accentuation of pitch – discussion and outlook
9 Hearing impairment: Evaluation and rehabilitation
    Karin Schorn and Hugo Fastl
9.1 Psychoacoustic tests for clinical use
9.1.1 Amplitude resolution – ∆L-test
9.1.2 Frequency selectivity – psychoacoustical tuning curves
9.1.3 Frequency discrimination – ∆f measurement
9.1.4 Temporal integration
9.1.5 Temporal resolution
9.1.6 Melodic pattern discrimination test
9.2 Objective tests for clinical use
9.2.1 Delayed evoked otoacoustic emissions (DEOAE) for screening
9.2.2 Delayed-evoked and distortion-product otoacoustic emissions for hearing threshold determination
9.2.3 Delayed-evoked and distortion-product otoacoustic emissions as an objective TTS measurement
9.2.4 Evoked-response audiometry (ERA) for hearing threshold determination
9.2.5 Evoked-response audiometry (ERA) for hearing aid fitting
9.3 Rehabilitation with hearing aids, cochlear implants and tactile aids
9.3.1 Psychoacoustic basis of hearing aid fitting
9.3.2 Hearing aid fitting with in-situ measurement
9.3.3 Hearing aid fitting – earmold modifications with vents and horns
9.3.4 Hearing aid fitting with loudness scaling
9.3.5 Hearing aid fitting with speech in noise
9.3.6 Experimental Hearing Aid
9.3.7 Cochlear implantation – Preliminary test, intra- and postoperative control
9.3.8 Speech processing with cochlear implant
9.3.9 Tactile Hearing Aids
10 References (Work of the collaborative research centre 204)
11 References (Work outside the collaborative research centre 204)
12 Appendices
12.1 Institutes involved in the collaborative research centre 204
12.2 Projects supported by the collaborative research centre 204
12.3 Members and coworkers of the collaborative research centre 204
12.4 Doctoral (D) and habilitation (H) theses of the collaborative research centre 204
12.5 Guest scientists of the collaborative research centre 204
12.6 International and industrial cooperations of the collaborative research centre 204
Preface
Geoffrey A. Manley
The purpose of this book is two-fold. Firstly, it is to fulfil the formal requirement of the Deutsche Forschungsgemeinschaft (DFG) that a final report be issued when a collaborative research centre (SFB, Sonderforschungsbereich) comes to a close. The second, more important, purpose of this book is to provide in a succinct form a report of the 15 years of research activity on the hearing system of vertebrates that took place (mostly) in the region of the city of Munich from 1983 to 1997.
The inspiration to attempt to set up a new collaborative research centre in the field of hearing derived from a unique constellation of scientists with sizeable research groups who found themselves together in the Munich area. Prof. Eberhard Zwicker, Head of the Dept. of Electroacoustics at the Technical University in Munich, had already established a large hearing research group, initially founded within the collaborative research centre ‘Cybernetics’, when Prof. Geoffrey Manley became the first Head of Zoology at the same University in 1980. One year later, it became obvious that Prof. Gerhard Neuweiler, then of Frankfurt University, would be moving to Munich to take over a Chair of Zoology at the Ludwig-Maximilians-University, also in Munich. These three ‘founding fathers’ felt strongly that research efforts in the Munich area into problems of hearing could be greatly strengthened through the close, cooperative programme typical of a collaborative research centre. The collaborative research centre 204 “Hearing in vertebrates” was called into being following a meeting of a panel of scientific reviewers in 1982, and initially consisted of 11 research groups from both Universities and one additional group from the Max-Planck-Institute for Psychiatry, a research establishment also in the city of Munich.
The collaborative research centre 204 had a very auspicious beginning.
Compared to the average, it was at the same time unusually coherent and comprehensive, and it was able to build on the organizational experience gained in the expired collaborative research centre ‘Cybernetics’. The efforts of all the research groups involved in this new collaborative research centre were directed to resolving key questions concerning the same sensory system. The scientific workers came from a broad range of backgrounds and their techniques reflected almost every available approach to studying the ear and the ‘hearing brain’. There were engineers with interests in measuring and defining sound and the human perception of it (the groups of Zwicker and Fastl), and human perception of speech and music (Ruske, Terhardt). Others trained in Zoology concentrated their efforts on understanding the structure and function of hearing organs of different vertebrate groups (reptiles, birds and mammals), including specialized animals such as the barn owl and bats (Kössl, Manley, Vater). Some groups worked towards understanding how the brain processes auditory information that is important during sound production and vocalization in animals (Ploog, Schuller) or the acoustic signals relevant to behaviour (Neuweiler). The comparison to perception in hearing-impaired humans was established through a project in the Ear-Nose-Throat Clinic of the Ludwig-Maximilians-University (Schorn).
Within a short period of time, these groups had established a close working relationship that made it possible, for example, to understand the relationship between physiology on the one hand and psychoacoustics (of animals and man) on the other. This cooperation included a concerted effort to find common terminologies, techniques, computer software, stimuli and analytical methods.
Over the 15 years of the collaborative research centre (the maximum possible duration), there was of course a certain degree of flux in the projects that came up for review every three years, due among other things to scientists moving their groups into and out of the Munich area. Thus some areas became less emphasized with time (e.g. speech perception), whereas others enjoyed added emphasis (animal psychoacoustics, clinical studies). Over the years, there was a trend away from human psychoacoustical studies, from initially four research groups down to two, but a concomitant increase in the number of groups carrying out animal studies. New methodologies, such as the measurement of otoacoustic emissions, became established during the tenure of this project and had significant influence on the kind of work carried out in later years.
Eberhard Zwicker, as the most experienced administrator following his tenure as Chair of the collaborative research centre ‘Cybernetics’, became the founding ‘Speaker’ (elected chair of the steering committee) of the collaborative research centre 204. His strong leadership was cut tragically short by his untimely death in November 1990. Following this, Gerhard Neuweiler and later Geoffrey Manley were elected Speaker of the collaborative research centre.
It is of course difficult to quantify the influence of this project on the field of hearing research in general. The collaborative research centre generated 5 to 6 publications per month in international journals and was responsible for a large number of conference presentations and seminars worldwide. In addition, a number of books were authored or edited by members of this project. We believe that the uniquely interdisciplinary nature of this collaborative research centre made each member acutely aware of the breadth of the issues relating to hearing research in general. This influenced the nature of the articles written and produced a desire to place the data from individual projects clearly in the context of the entire field.
On a local scale, each of the projects benefitted greatly from the whole. There was a clear synergistic effect that was evident, for example, in the speed with which technical problems could be solved through advice and collaboration outside the individual group. A number of research projects were initiated and carried through specifically to further the goal of a better integration of physiology and psychoacoustics, and the stimuli, techniques and interpretations were strongly influenced by the desire to produce data of interest to the widest possible audience. Those of us who were members of this collaborative research centre during the entire 15 years are sure that this kind of research organization increased both the quantity and the quality (in terms of the depth and breadth of the issues and the interpretations) of the research carried out.
The role of this book
The Deutsche Forschungsgemeinschaft requires a report at the end of any research project, and in this respect, the collaborative research centre 204 is no exception. The reports produced at the end of each three-year period of support were volumes of up to almost 250 pages. These were accompanied by (almost as voluminous) grant applications for the continuing three-year period. For the closing report, however, the Deutsche Forschungsgemeinschaft permits a collaborative research centre to produce a book instead. Given the choice of presenting a standard report or creating a book, we chose the book format, which will make our report available to our colleagues worldwide and attain the widest possible dissemination of our results and our ideas. This has meant condensing the equivalent of 1200 pages of text and figures into something manageable and publishable. We have met this challenge partly by leaving out reports from groups that collaborated only briefly in the collaborative research centre, and also some minor pieces of work. In other words, we do not intend to be exhaustive. We have also attempted to avoid the style of a dry report, and have instead taken the opportunity to integrate the findings and to give an overview of the results of our research efforts over the past decade and a half. Thus, this book is not a sequence of individual projects’ reports. The chapters do not correspond to projects, but have been jointly written by several researchers in each case.
All the members of the collaborative research centre 204 wish to express their gratitude to the Deutsche Forschungsgemeinschaft for the generous support and the encouragement received during the tenure of this collaborative research centre. We are especially grateful to those staff members of the Deutsche Forschungsgemeinschaft (Drs. Bode, Mai and Rohe) whose quick responses to our numerous enquiries helped the administration of the collaborative research centre run smoothly. To the many colleagues who accepted the (voluntary and honorary) task of periodically reviewing the collaborative research centre, which involved not only a two-day site visit but numerous hours of reading of reports and grant applications, we wish to express our special thanks. We hope that our successes as reported in this volume will reassure them that all the effort was worthwhile.
We also wish to thank various members of the administration of the Universities involved, especially of the ‘Chair’ University, the Technical University of Munich. The Presidents, Chancellors and their administrative assistants were often responsible for providing the help needed to make this collaborative research centre possible. Because of the commitment of the Bavarian Ministry of Science, Culture and Education and of the Universities to excellence in research (which sometimes put them under pressure to provide additional funds and/or staff positions), the members of this collaborative research centre always felt that we had enthusiastic support at the federal, state and University levels. It was in this environment that the collaborative research centre 204 developed into an integrated, hard-working and productive research body. We will all sorely miss the advantages in planning continuity and the intellectual stimulation it brought.
Last, but by no means least, we are united in our wish to dedicate this book to the late Eberhard Zwicker. It was mainly his inspiration, drive and enthusiasm that led to the establishment of this collaborative research centre and that carried it through its early years. His tremendous energy and his example to us all in his rich creativity and productivity inspired many of the successes of this collaborative research centre. His death came as a great shock to us all. The fact that the collaborative research centre 204 continued its productivity and cooperations thereafter is a fitting tribute to the firm groundwork he laid.
List of Abbreviations
5-HAT         5-hydroxytryptamine, serotonin
AI            primary auditory cortex
AII           secondary auditory cortex
AAF           anterior auditory field
AEP           auditory evoked potential
AGC           automatic gain control
ALC           anterior limbic cortex
AM            amplitude modulation
AOP           auditory-object pattern
APN           anterior pretectal region
APR           auditory startle (Preyer) reflex
AVCN          antero-ventral cochlear nucleus
BAPTA         1,2-bis(2-Aminophenoxy)ethane-N,N,N’,N’-tetraacetic acid
BDA           biotinylated dextranamine
BERA          brainstem evoked response audiometry
BM            basilar membrane
BMF           best modulation frequency
BMI           bicuculline methiodide
BMLD          binaural masking-level difference
BP            basilar papilla
CAP           compound action potential
CB            critical bandwidth
CD            compact disk
CF            characteristic frequency (or, for bats, constant frequency part of call)
CI            cochlear implant
CIS           continuous-interleaved sampling
CM            cochlear microphonic potential
CMF           critical modulation frequency
CMR           comodulation masking release
CN            cochlear nucleus
CR            critical-masking ratio, critical ratio
CTF           cochlear transfer function
dB            decibel
DC            Deiter’s cell
DCN           dorsal cochlear nucleus
DEOAE         delayed-evoked otoacoustic emissions
DiI           1,1-dioctadecyl-3,3,3’,3’-tetramethylindocarbocyanine perchlorate
DF            frequency difference
DL            level difference
DT            time difference
DLF, FLD      difference limen of frequency
DMSO          dorso-medial superior olive
DPOAE         distortion-product otoacoustic emission
ECR           ear canal resonance
EM            electron microscope
EP            endocochlear potential of Scala media
ER            endoplasmic reticulum
ERA           evoked-response audiometry
FEM           finite element model
FFR           frequency-following response
FLD, DLF      frequency-difference limen
FM            frequency modulation
FMDL          frequency-modulation difference limen
FRF           frequency resolution factor
FTT           Fourier-t transformation
GABA          gamma-amino-butyric acid
HC            hair cell
HRP           horseradish peroxidase
IC            inferior colliculus
ICo           Nucleus intercollicularis
IHC           inner hair cell of mammals
IID           inter-aural intensity difference
IPC           inner pillar cell
I/O           input-output
ITD           inter-aural time difference
JNDAM         just-noticeable difference in amplitude modulation
K+            potassium ion
KEMAR         Knowles electronic manikin for acoustic research
kHz           kilohertz
LGB           lateral geniculate body
LL            lateral lemniscus
LOC           lateral olivo-cochlear efferent system
LRC           inductor-resistor-capacitor circuit
MAE           motion after-effect
MDS           motion-direction sensitivity
MGBv          ventral division of the medial geniculate body
MLd           Nucleus mesencephalicus lateralis dorsalis
MNTB          medial nucleus of the trapezoid body
MOC           medial olivo-cochlear efferent system
MPDT          melodic pattern discrimination test
N             nucleus
Na+           sodium ion
NA            Nucleus angularis
NCAT          nucleus of the central acoustic tract
NL            Nucleus laminaris
NLL           Nucleus lemnisci lateralis
NM            Nucleus magnocellularis
NOT           nucleus of the optic tract
NPO           olivary pretectal nucleus
OAE           otoacoustic emission
OC            organ of Corti
OHC           outer hair cell of mammals
OPC           outer pillar cell
PAF           posterior auditory field
PET           peripheral ear transduction
PSTH          peri-stimulus-time histogram
PTTP          part-tone time pattern
PVCN          postero-ventral cochlear nucleus
PVF, PVDF     polyvinylidene fluoride
Q10dB, Q40dB  tuning sharpness coefficients at 10 and 40 dB above threshold
RL            reticular lamina
SAM           sinusoidally amplitude-modulated stimuli
SC            supporting cell
SEM           scanning electron microscope
SEOAE         simultaneous-evoked otoacoustic emissions
SFB           Sonderforschungsbereich (Special Research Area)
SFM           sinusoidally frequency-modulated tones
SHC           short hair cell of birds
SI            sparsely-innervated
SISI          short increment sensitivity index
SNR           signal-to-noise ratio
SO, SOC       superior olivary nucleus
SOAE          spontaneous otoacoustic emission
SPL           sound pressure level
ST            Scala tympani
STC           suppression tuning curve
TEM           transmission electron microscope
THC           tall hair cell of birds
TIH           time-interval histogram
TM            tectorial membrane
TMTF          temporal modulation transfer function
TQ            threshold in quiet
TV            tegmentum vasculosum of birds
UCL           uncomfortable listening level
VMSO          ventromedial superior olive
VNLL          ventral nucleus of the lateral lemniscus
WGA-HRP       wheat-germ agglutinin horseradish peroxidase
ZTE           Zwicker-tone exciters
Introduction Geoffrey A. Manley and Georg Klump
What auditory worlds exist?
The question posed by the title of this section implies that the auditory experience of various animals is not the same. The beginning of the clear recognition of this fact can be traced back to the demonstration by Griffin and co-workers during the 1940s that bats use ultrasonic frequencies to orient themselves in space, which put an end to a plethora of theories, including one that postulated that bats orient by ‘remote touch’ (Griffin et al. 1960). The recognition that the sense organs of animals may be quite different to those of man, involving senses unknown to humans, such as electric organs to detect electrical-potential fields, was crucial to understanding animal behaviour and to the establishment of the science of sensory physiology. Of course, the differences in the coding capabilities of sense organs may be less dramatic and more gradual. This is the case with the hearing organs of the various groups of terrestrial vertebrates. Nonetheless, researchers in this field have repeatedly shown that different animals are capable of sensory feats that are beyond our own physiological capacities.
The ‘auditory world’ of a species can be defined as those acoustic signals of the outside world that are coded into information for the central nervous system by the species’ unique type of hearing apparatus and are then transformed into perceptual entities by the species’ unique combination of neural processing steps. There is no evidence that amniotes (mammals, birds and reptiles) normally use any sense organ other than the basilar papilla, or organ of Corti, for the detection of air-borne sound. Only in cases of extremely high sound intensities are there indications that other sense organs may respond to sound, for example the saccular part of the vestibular system (Cazals et al. 1980).
This can, however, hardly be regarded as a true adequate stimulus, and is comparable to the experience of ‘seeing stars’ following a physical blow to the eye!
The auditory world of any animal has, of course, been shaped by its evolutionary history, and the shaping was often strongly influenced by some ‘accidental’ events in the past. For example, when the ancestors of modern reptiles, birds and mammals all developed a middle ear sensitive to air-borne sounds, probably in Triassic times (Clack 1997), it happened that mammals created a middle-ear ossicular chain of three bones. This was coupled with the origin of the unique mammalian secondary jaw joint. Other amniote groups developed a middle ear using functionally only one ossicle. This event almost certainly occurred at a time when these ancestral animals could not hear high frequencies (i.e. above about 10 kHz). As it turned out, however, this new mammalian kind of middle ear was much more suited to the transmission of high frequencies than was and is the single-ossicle middle ear of non-mammals (Manley 1990). Thus this one almost accidental development, the selective pressure for which can be sought in an improvement of the feeding apparatus and of hearing sensitivity in general, opened for mammals a new ‘auditory world’, the detection of what we know as ultrasonic frequencies. Whole groups of mammals (notably bats and some whales) owe their way of life to this development. These organisms represent an extreme development of the auditory system, both of the peripheral hearing organ and of the central brain centres.
The ability of objects around us to influence the sounds we perceive is great and complex. Animals that communicate using sound often make use of the distortions that signals undergo while passing through their environment to judge, for example, the distance of a neighbour. The smaller the object, the higher the frequencies that it will most affect, and the higher the sound frequency, the greater is the information content of echoes thrown back from that object. Thus it is no surprise that flying mammals like bats that catch insects in flight use ultrasonic frequencies to find their prey. These frequencies range in total between about 20 and 180 kHz; human hearing ends, for adults at least, below 16 kHz. Bats flying in their natural environment will, of course, often be confronted with echoes from a variety of objects.
The remarkable differentiation of the vocalization elements and of the structure of the different bat cochleae for optimal reception of the echoes on the one hand, and of their brain centres for computing important aspects of the animal’s surroundings from the echoes on the other, are extremely interesting but complex stories. Bats show very diverse behaviour, depending on the type of prey, and whether their prey is caught in flight (as are most insects) or while sitting on a surface (some insects, frogs) or even in water (fish). Prey location through actively emitting sounds, or echolocation, is distinguished from passive acoustic location of prey by listening to noises the prey animals make. The sensitive hearing required for passive listening is found in some bats and also in some owls, organisms that hunt under low-light conditions. Based on the idea that functional principles may be best understood when studied in their most extreme manifestation, our collaborative research centre has studied the ear, brain and behaviour of different kinds of bats and of the barn owl. One of the aims of these studies was to understand the fundamental principles underlying the function of the mammalian, and in a wider sense, the vertebrate ear.
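The link drawn above between object size and sound frequency can be made concrete with the relation wavelength = speed of sound / frequency. The following is only a minimal sketch: the speed of sound (343 m/s, roughly that of air at 20 °C) and the function name wavelength_mm are assumptions introduced here for illustration and do not come from the work of the collaborative research centre. As a rule of thumb, objects much smaller than the wavelength return only weak echoes.

```python
# Illustrative sketch only. The constant below is an assumed textbook value
# (speed of sound in air at about 20 degrees C), not a figure from this book.
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_mm(frequency_khz: float) -> float:
    """Return the wavelength in air, in millimetres, of a tone of the given frequency."""
    if frequency_khz <= 0:
        raise ValueError("frequency must be positive")
    frequency_hz = frequency_khz * 1000.0
    # wavelength [m] = speed [m/s] / frequency [Hz]; convert metres to millimetres
    return SPEED_OF_SOUND_M_PER_S / frequency_hz * 1000.0


if __name__ == "__main__":
    # 16 kHz: approximate upper limit of adult human hearing quoted in the text;
    # 20 and 180 kHz: approximate limits of the range used by bats.
    for f_khz in (16.0, 20.0, 180.0):
        print(f"{f_khz:6.1f} kHz -> {wavelength_mm(f_khz):6.2f} mm")
```

Under these assumptions the wavelength shrinks from about 21 mm at 16 kHz to about 2 mm at 180 kHz, which is of the order of the size of a small insect.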
The integration of physiology and psychoacoustics
An additional goal of the collaborative research centre 204 was to improve the conceptual integration of the fields of physiology and psychoacoustics. Our approach to this was multifaceted, involving psychophysical studies of human perception (Ch. 8) and parallel studies of the psychoacoustics of hearing in species of animals that could be trained to respond behaviourally to sound (birds, Sect. 7.1, and bats, Sect. 7.4). The bridge to physiology was only possible in these animals, in which neurophysiological data could be compared to psychoacoustical results. One long-term goal of such comparisons is, of course, to explain human perception (see Ch. 8).
Almost at the other extreme to the owls and bats is the ‘auditory world’ of our fellow human beings who have hearing deficits. A very significant portion of our society is made up of people who, through malformation, disease, damage or old age, have a poorly-functioning hearing system. The auditory worlds experienced by these individuals are highly diverse, depending on the etiology of their problem. Since humans primarily communicate using speech, these disabilities can have dramatic effects on social behaviour and psychological well-being. In addition, their consequences are a great economic burden on the affected individuals and on society in general. Up to 20 % of the populations of modern societies suffer from damage to their hearing systems, and the noisy toys of modern children (e.g. toy guns that are sometimes louder than real ones) and the loud concerts of youth (utilizing sound intensities that are far above those allowed in industry and potentially producing permanent hearing damage within a few hours) offer little prospect of improvement in the future. To understand the individual’s difficulties and attempt to ameliorate the problems was the goal of the clinical projects of the collaborative research centre 204.
Some of the improvements in methodologies used by the Munich Ear-Nose-Throat clinics were made possible by theoretical and technical advances in other projects of the collaborative research centre, for example new psychoacoustical tests and more sophisticated test stimuli, which enabled some important advances in differential diagnoses (see Sect. 9.1, 9.2). In addition, simple, portable instruments were developed that made rapid clinical tests possible or easier.
This applies also to the use of otoacoustic emissions in the clinical context (Sect. 5.5, 5.6, 9.2.2). Improvements in hearing aids and the suitability of the different kinds of aids for various patients also benefitted from advances in our understanding of the fundamental principles underlying the processing of acoustic signals in the ear and auditory brain. The foundation of such clinical studies has been and always will be the basic research carried out in human psychoacoustics (Sect. 8.1, 8.3 to 8.7). The data thus obtained are the ‘grist for the modeller’s mill’, and are essential for the continuing refinement of models of the function of the hearing organ and the simulation of human hearing performance (Ch. 4, Sect. 8.3).
As already evident from its title “Hearing in vertebrates”, one of the defining features of the collaborative research centre 204 was the comparative nature of its work. By ‘comparative’ we mean not only the study of hearing in mammals other than humans. We also mean the study of other vertebrates in order to be able to compare the structure and function of their hearing systems with those of mammals. The basis for this comparison is the conviction, based on very solid evidence, that the hearing organ itself has a common origin in all amniotes (Ch. 1). Thus, as in all sense organs that underwent changes over their (sometimes very long) evolutionary history, a great deal is to be learned about the functional meaning of structural changes by comparing the variety of epithelia presented to us by the natural world. Comparative studies thus fulfill two general functions. They enable us to learn more about the world we live in, but can also serve as a very important ‘sounding board’ for theories of the function of the ear of mammals. In these two senses, then, we have actively pursued studies of both birds and lizards, to understand how their hearing organs are structured, how they might work, what role sound plays in their lives and, to a lesser extent, how the brain of these animals processes the coded input from the ear. Since many birds also use acoustic communication, they can serve as comparative objects for understanding communication principles in humans (see below).
Parts of this book thus provide information on the coding abilities of lizard and bird hearing organs. Their ability to encode frequency, for example, is compared to that of mammals. An attempt is made to understand these similarities and differences and to come to conclusions on the underlying functional mechanisms (Ch. 2 and 3). In addition, we describe studies of lizard, bird and mammalian ears using the various kinds of otoacoustic emissions, a non-invasive technique available since about 1980 (Sections 5.1, 5.2). In such studies, the stimulus paradigms can be easily standardized and the same kinds of tests pursued in all groups. Both in the neurophysiological studies and in the otoacoustic emissions studies, we were able to describe a great number of similarities between hearing epithelia, despite their structural differences. There were also a number of interesting differences. It is one of the purposes of this book not only to describe these phenomena in isolation for each group, but also to compare them closely and to attempt to come to conclusions regarding the underlying reasons why ears differ in their function or do not, as the case may be.
Another extremely important area of the comparative work of the collaborative research centre 204 involved the use of behavioural, psychoacoustical techniques.
This methodology is based on the principle, in humans at least, of presenting test persons with very carefully selected and generated acoustical signals, in order to test their ability to discriminate fine differences or to hear particular stimuli. It is one of the finest achievements of this collaborative research centre that we have contributed in a very significant way to the literature in human psychoacoustics, including descriptions of new techniques, new phenomena and unique models. One example of the new techniques pioneered in the collaborative research centre is the development of noise-measurement procedures resulting in values that reflect the impact of noise on the human auditory system more closely than the measurement rules that are commonly applied. These new procedures were documented in various norms (DIN 45631 and ISO 532 B). The groundwork for this improved evaluation of the impact of noise on humans was laid by numerous psychophysical studies conducted in the collaborative research centre on masking and suppression, and on the perception of fluctuation strength and of pitch strength. These studies resulted in models describing human perception of spectral pitch, of temporal partial masking, of perceived average loudness and of psychoacoustic annoyance. Cross-cultural studies in cooperation with Japanese scientists provided data on the generality of the findings. Another area of research in which the collaborative research centre contributed significantly to the measurement of auditory function is the application of background noise in clinical audiometry (e.g. the “Fastl noise”), allowing a better evaluation of the auditory system than measurements obtained only in unnatural, quiet conditions. An example of the new phenomena discovered in the research of this project is the “Zwicker-tone”, an auditory after-image that can be perceived following notched-noise stimulation. This after-image, which is probably due to central processing phenomena (unlike the common visual after-images), has attracted a lot of interest from neurophysiologists.
Whereas it is relatively easy to explain to human volunteers what they are supposed to attend to during a stimulus presentation, and to obtain responses on this basis, the situation in animals is quite different. They have to be instructed non-verbally and the experimental paradigm has to eliminate responses to stimulus features that are not under study. Operant-conditioning techniques provide the tools for comparative psychoacoustic studies in animals. They have been applied in the collaborative research centre 204 with great success, resulting in a large body of comparative data. Using appropriate methods for stimulus presentation and data analysis that are adapted from human psychoacoustic studies, the data obtained from “experienced observers” among the animals often show less variation than results obtained from human observers. This demonstrates the practicability of animal studies, which provided us with critical data for understanding functional principles in vertebrate auditory perception.
Having evolved in an acoustic environment that is very similar to that of humans, the songbird species that was studied extensively in the collaborative research centre, the European starling, has developed mechanisms of auditory analysis that allow this bird a perceptual performance matching human auditory perception. As shown in Chapter 7, this similarity extends to many tasks of auditory signal analysis.
Astonishingly, the starling achieves this “humanlike” performance with a basilar papilla that is only one tenth of the length of the human basilar papilla, and models that were developed to explain human psychophysical performance on the basis of the frequency representation of the inner ear can also be easily fitted to the starling data. Caution is necessary, however, in interpreting the correspondence between the model predictions and the perceptual performance. The correlational evidence in the “auditory generalists” does not allow us to decide whether the observed similarities in the analysis of the auditory worlds result from a comparable spatial representation of frequencies in the inner ear of the different species, reflecting the micromechanics of the inner ear and the travelling wave, or whether they represent the basic tuning properties of individual hair cells, which may depend on additional intrinsic cell properties.
The auditory specialists studied in the collaborative research centre 204, the barn owl and the bats, help us to evaluate the importance of different morphological and physiological features for the auditory performance as measured in psychophysical studies. Barn owls, for example, have an unusual cochlear frequency map in which the frequencies of the octave above 5 kHz occupy more than 60 % of the sensory epithelium (see Sect. 3.2; 3.3). Contrary to predictions of the owl’s psychophysical performance that were based on cochlear-map functions, the large spatial spread of frequencies above 5 kHz does not result in an improved frequency selectivity. The tuning properties of individual hair cells, however, show a better fit to the psychophysical data. It would be parsimonious to assume that in vertebrates in general, the contribution of individual hair cells to the excitation pattern in the auditory system, rather than the spread of frequencies on the cochlear map reflecting the mechanics of the cochlea, is the most important component in determining psychophysical performance. Higher centers in the auditory system may provide some additional, but often less significant, modification of the basic pattern.
In summary, the combination of morphological, neurophysiological and psychophysical evidence from a large number of vertebrate species, including humans, has allowed the collaborative research centre 204 to achieve a level of understanding of auditory processing that would not have been possible without the joint effort of zoologists, medical researchers and engineers. We believe that this book demonstrates the success of this approach for understanding auditory perception in humans and for hearing in vertebrates in general.
1 Design plasticity in the evolution of the amniote hearing organ Geoffrey A. Manley
Although the general aim of the collaborative research centre 204 was scientific research in the field of ‘sensory reception and information processing in the hearing system of vertebrates’, the studies carried out were in fact restricted to land vertebrates and, within those, to the amniotes – reptiles, birds, mammals (including of course humans). There is no question that one of the most important features of the collaborative research centre was just this comparative approach, based on the conviction that it will be difficult or even impossible to fully understand the hearing system of one group of vertebrates, such as mammals and man, in isolation. At this early point in our presentation of the work of the collaborative research centre, it is thus appropriate to briefly describe the evolutionary history of the amniote hearing organ, the diversity and the design plasticity of which offers an ideal opportunity to study structure-function relationships (Manley and Köppl 1998).
1.1 The origin of the amniote hearing organ, the basilar papilla
The hearing organ of amniotic land vertebrates, the basilar papilla, in mammals also called the organ of Corti, was the last of the paired sense organs to arise during evolution. Its origin can be traced roughly to the time of emergence of the vertebrates onto dry land. Following the origin of the reptiles, and thus the amniotes, the basilar papilla underwent major formative evolution over a period of more than 300 million years. Some of the most obvious structural differences between the hearing organs of mammals, birds and reptiles did not evolve until the last period of the Mesozoic era, the Cretaceous. Among the various sensory epithelia found in the inner ear (mostly part of the vestibular-balance system), the basilar papilla was the first that lay wholly or partly over a freely-moving membrane, the basilar membrane. Although there is no direct evidence on what the early basilar papilla looked like, the available evidence suggests that the sensory epithelium of turtles and of the Tuatara can be regarded as being primitive (Miller 1980, Wever 1978). Their papillae are small, containing at the most about one thousand hair cells that are contacted by both afferent and efferent nerve fibres. A well-developed tectorial membrane covers all the hair cells, and their ‘hair’, or stereovillar, bundles are abneurally oriented. How did the development of this hearing organ progress over time during the evolution of the three major lines of amniotes, the mammals, the archosaurs (birds and crocodilians) and the lepidosaurs (snakes and lizards)? Was the evolution of the hearing organ different in different groups of amniotes? What were the functional consequences of the different structural configurations seen in the various groups; in other words, what role did design plasticity play in the adaptation of its structure to the species’ way of life?
1.2 The origin and evolution of the mammalian cochlea
The true mammals arose during the Triassic period, from a group of reptiles that split off very early from the stem reptiles and thus was not closely related to those groups giving rise to modern reptiles and birds (Fig. 1.2.1). The comparative morphological evidence suggests that at the time when the ancestors of all modern amniote groups diverged from the stem group, their hearing organs were quite simple and similar to one another. What these animals could ‘hear’ is controversial, since middle ears as we now know them were not developed until much later in evolution (Clack 1997). The differences in the hearing organs we see today are derived from specializations and adaptations that arose during their long history of independent evolution. The four features typical of the mammalian peripheral hearing system (the three-ossicle middle ear, the usually highly elongated sensory epithelium, the coiling of the epithelium into a cochlea and the specialization of hair-cell rows into two clearly-distinct populations across the epithelium), just like the many other features of the body plan of mammals, did not all arise simultaneously. The first event in the sequence was the origin of the three-ossicle middle ear, being one of the changes defining the origin of true mammals from mammal-like reptiles. The importance of this development for high-frequency hearing in mammals can hardly be over-emphasized. Since middle-ear studies were not carried out during the tenure of the collaborative research centre, however, the underlying principles will not be discussed here. Suffice it to say that the development of this unique kind of middle ear in mammals predestined the system to respond very well to high frequencies and thus opened the way for the extension of the hearing range of mammals. This possibility did not present itself following the parallel development of the single-ossicle middle ear of non-mammals, thus restricting the development of the frequency range during later reptilian and avian evolution (Manley 1990).
Fig. 1.2.1: A simplified phylogenetic tree of the amniotes illustrating the probable times of origin of some features of the different hearing organs. The line leading to the mammals diverged earliest from the stem reptile stock (right branch). The other two lines lead to the modern archosaurs (dashed lines) and lepidosaurs and turtles (dotted lines, left branches). During the Triassic period, a sensitive middle ear system and a pressure-release window were developed in all branches independently (a ‘tympanic’ ear). Following this, hair-cell specialization developed independently, giving rise to low- and high-frequency cell groups in the lizard ancestors, and specialized hair-cell groups across the papillae in the ancestors of mammals on the one hand and, again independently, the bird-crocodilian ancestor on the other. Following the branching-off of the marsupial-placental line of mammals, their cochleae coiled and the number of hair-cell rows was reduced. (Reprinted from: Current Opinion in Neurobiology Vol. 8, G. A. Manley and C. Köppl, Phylogenetic development of the cochlea and its innervation, pp. 468-474 (1998), with kind permission from Elsevier Science Ltd., London.)
The specialization of hair cells into inner and outer hair cells, arranged across the organ of Corti, probably took place next and thus relatively early, and all mammal groups (placentals, marsupials and monotremes) show this feature. Only placentals and marsupials, however, have a coiled hearing organ, the cochlea, whose origin can thus be traced to the middle Cretaceous period after these main mammalian lines diverged from the egg-laying monotremes (Figs. 1.2.1, 1.2.2). Accompanying this event was the loss of the lagenar macula in the true cochlea, this macula being retained by the monotreme mammals (see also Sect. 2.7). Following this event, there was also a reduction in the number of hair-cell rows arranged across the hearing organ of the placental-marsupial line to the typical pattern seen in modern representatives: one row of inner, three rows of outer hair cells. In monotremes, the number of hair-cell rows remained larger (Pickles 1992). As outlined below, these hair-cell types that differentiated across the papilla in well-defined rows assumed two distinct roles in the hearing process. The further evolution of the mammalian organ of Corti manifested itself in an elongation, up to 104 mm in the blue whale (the longest-known hearing organ), and other, sometimes quite subtle, structural changes. Some of the structural changes are confined to distinct regions, producing in some cases specialized frequency zones that improve the analysis of signals of particular importance to individual species. In these particular areas, the distribution of frequencies – the tonotopic pattern – may be significantly different from the normal mammalian pattern, expanding special frequency bands over large areas of the papilla (see below the discussion of cochlear specializations in bats, Sect. 3.1).
1.3 The origin and evolution of the hearing organ of archosaurs
All the other living evolutionary derivatives of the stem reptiles are descendants of diapsid reptiles, and almost all modern representatives belong either to the archosaurs (Crocodilia – crocodiles, alligators, gavials – and Aves, birds) or to the lepidosaurs (lizards and snakes). Archosaurs share a number of characteristic features and thus the basic structure of their papilla is a synapomorphy and almost certainly arose before the birds and Crocodilia diverged late in the Jurassic period (Fig. 1.2.1). The archosaur basilar papilla is also elongated, but wider than in mammals and is covered with an irregular mosaic (rather than rows) of thousands of hair cells that fall into two morphological groups arranged across the basilar papilla. On the inner or neural side of the archosaur papilla, and covering more of the papilla’s width in the apex than in the base, are so-called tall hair cells (THC), that are cylindrical in shape and receive both afferent and efferent innervation (review in: Fischer 1994, Manley and Gleich 1992, 1998). Short hair cells (SHC) occupy the outer or abneural area, and – in birds – they lack afferent innervation. Remarkably, they are only contacted by large efferent endings (Fischer 1994). The origin of the tall and short hair cells took place in the Jurassic, roughly at the same time as the origin of the (quite independently-derived) hair-cell groups of mammals (Figs. 1.2.1, 1.2.2). As outlined below (Sect. 2.1) there is some evidence that the hair-cell types in birds developed – also independently – a similar functional specialization to that seen in mammals (Manley 1995). The avian papilla achieves a total length ranging from about 2 mm in small song birds up to 11 mm in the barn owl (Manley and Gleich 1992, 1998). In birds, there is also one known case of the development of an expanded frequency representation in an auditory fovea (see below, the discussion of the barn owl, Sect. 3.3).
Fig. 1.2.2: Schematic representation of the evolution of specialized hair-cell populations in amniotes. The stem-reptile configuration is assumed to resemble that of modern-day turtles. Three independent lines diverged from the stem reptiles, meaning that the two kinds of hair cells in the three groups lizards, archosaurs and mammals are not homologous, but are synapomorphies of the respective group. The filled black areas represent typical configurations of the papillar hair-cell surface, as seen from above. In lizards, the two hair-cell populations are arranged along the papilla and differentiated according to their frequency range into low-frequency (LFHC) and high-frequency (HFHC) groups. The differentiation in archosaurs and mammals, however, is across the width of the papilla into tall (THC) and short (SHC) hair cells in birds and inner (IHC) and outer (OHC) hair cells in mammals. These are the neural and abneural sides, respectively. The mammal configuration shown here is for a placental.
1.4 The origin and evolution of the hearing organ of lepidosaurs
Within the lepidosaurs, the lizards and snakes form the largest groups that arose, and these two groups diverged from each other in the early Cretaceous. In the lizards, there was a remarkable differentiation of hearing-organ structure, evolving a family-, sub-family-, genus- and in some cases even species-specific anatomy. Thus in contrast to the two other major groups, the mammals and the archosaurs, whose papillae tended during evolution to conform to group-characteristic structural constellations, the lizard papilla appears to have been subject to a large number of natural ‘experiments’. The reason for this difference probably lies in the fact that, in contrast to most mammals and to birds, and with only one exception at the family level, the auditory system of lizards is not used for intra-specific communication, nor is it normally used for hunting prey. Presumably because of this, the functional constraints on its evolution were considerably weaker, and the function of general acoustic signal detection (e.g. warning of danger) could be fulfilled by a variety of structural configurations. The history of this organ may thus be viewed as an example of neutral evolution. Also unlike in other amniotes, in modern lizards, the basilar papilla always has two clearly-recognizable hair-cell areas arranged along the papilla’s length (Fig. 1.2.2), a pattern that presumably arose in the middle of the Cretaceous period (Fig. 1.2.1). One of these areas contains hair cells that respond best to frequencies below 1 kHz and probably is homologous to the whole papilla of stem reptiles. In the other area, the hair cells respond best to frequencies above 1 kHz, generally up to 5 or 6 kHz, the upper limit being species-specific. This second hair-cell area represents a specialization not found in stem reptiles and is a synapomorphy of lizards.
Thus lizards, unlike mammals and archosaurs, show two groups of hair cells that are differentiated with respect to their frequency responses, and are in fact arranged over different areas along the length of the papilla, rather than across the papilla, as in birds and mammals. The most primitive structural pattern (Miller 1980, 1992) is found in families in which the basilar papilla consists of a central, low-frequency area bounded at both ends by two mirror-image high-frequency areas (many teiids). These high-frequency areas contain roughly equally-large groups of hair cells whose stereovillar bundles face each other (‘bidirectional orientation’). From this basic morphological condition, a remarkable array of evolutionary trends over time is discernible in the different lizard families (Fig. 1.4.1), in most families resulting in a pronounced tendency to elongation of the papilla and the loss of one of the two high-frequency regions (Manley 1990, Köppl and Manley 1992). The elongation is strongest in the high-frequency area, but maximal papillar length rarely exceeds 2 mm, much shorter than in most birds and mammals (Manley 1990). The elongation of the papilla is accompanied by an increase in the number of hair cells up to a maximum in lizards of about 2000 (Miller 1992, Miller and Beck 1988). Although the tonotopic organization of frequencies along the papilla in lizards varies between species, depending on the amount of space available, there are no known cases of the specialization of particular regions such as the auditory foveae of some birds and mammals.
Fig. 1.4.1: A schematic representation of the evolution of the basilar papilla of the different lizard families. It is assumed that the ancestors of the lizards had a basilar papilla like that of turtles, with only a low-frequency hair-cell area. Following the origin of two micromechanically-tuned, high-frequency areas at each end of the papilla in ancestral lizards, the various families modified this basic plan by the loss of one or the other high-frequency area and by changing the structure or losing the tectorial membrane. In the high-frequency area(s) in varanids (left), lacertids and in some iguanids (the latter two not shown here), the papilla was also physically divided by a limbic bridge into two, unequally-sized sub-papillae.
1.5 General evolutionary trends in the physiology of amniote hearing organs
1.5.1 Functional principles seen in all auditory organs
There are three common functional principles underlying hearing in all amniotes, and these can be regarded as symplesiomorphic features (Manley and Köppl 1998). Firstly, the electrical properties of the hair-cell membrane can strongly influence the cell’s responses to stimuli. Cell membrane-potential oscillations lead to the existence of the most primitive type of frequency selectivity, the so-called electrical tuning found in some types of hair cells. Secondly, as mechanoreceptors, hair-cell sensory responses can be strongly influenced by the accessory structures around them. Hair cells are generally connected to a tectorial membrane in auditory organs, that, together with the micromechanical properties of the cell’s own stereovillar bundle, provides the basis for a micromechanical resonance system that underlies micromechanical frequency selectivity. Unlike electrical tuning, which is inherently limited to frequencies of a few kHz, micromechanical frequency tuning is not subject to this limit (Wu et al. 1995). Compared to electrical tuning, reliance on micromechanical frequency tuning is a relatively new evolutionary development. We do not know, however, whether micromechanical tuning was independently developed in the different groups. All hair cells will have some degree of micromechanical tuning, if only because they have a stereovillar bundle that has a specific stiffness, etc. Thirdly, hair cells have an ‘active process’ (or ‘cochlear amplifier’) that increases the sensitivity of the hearing organ and is the basis for the generation of, for example, the spontaneous otoacoustic emissions (see Sects. 5.1, 5.2, 5.3). Present data suggest that there might be two types of mechanism underlying this phenomenon, one in mammals and another in non-mammals (Manley and Gallo 1996).
1.5.2 Parallel trends in the evolution of amniote auditory organs
Despite the obvious plasticity in the design of the various amniote hearing organs that has resulted in remarkable differences between the mammalian organ of Corti and the basilar papillae of archosaurs and lepidosaurs, there are common trends observable in the evolution of all of them. These are trends that, in the various groups, developed parallel to each other in time. The elongation of the basilar papilla independently in all groups was a response to the increasing importance of – and opportunity offered by – micromechanical tuning, since the fundamental mechanisms involved are more dependent on the amount of space available than is electrical tuning. In lizard papillae, it was the high-frequency, micromechanically-tuned hair-cell area that became longer and more varied in structure during evolution. The elongation of the papilla in birds reached such dimensions that accommodating the long, relatively straight epithelia became a developmental problem. In the barn owl, for example, the enclosures of the hearing organ meet in the midline of the head, and the long epithelium is contorted to fit into the space available. The coiling of the cochlea of most mammals was an elegant solution for the problem of accommodating highly elongated sensory epithelia (Fig. 1.2.2). This elongation of the auditory epithelium, seen in most representatives of all groups, not only enabled an increase in the upper limit of hearing by providing space to accommodate higher octaves (to a first approximation, the upper frequency limit of hearing in amniotes is a function of the length of the papilla, although it is subject to other, partly group-specific effects, such as the animal’s absolute size; Manley 1973); it was also generally accompanied by an increase in the amount of space devoted to each octave. Thus the space-per-octave available in the smaller papillae of lizards is generally less than that in birds, which itself is generally less than that in mammals (Fig. 1.5.1; Manley 1973, Manley et al. 1988). The space constant (mm/octave along the epithelium) influences the degree to which increased innervation of a narrow frequency region can provide increased input for the parallel processing of information. Thus some species sacrifice a potentially greater breadth of frequency response for a greater degree of detail concerning specific frequency ranges. There are a number of very interesting frequency-distribution patterns in specialist ‘hearers’ such as some bats and owls, that have strongly expanded the representation of behaviourally-important frequency ranges.
These cases are discussed in more detail elsewhere in this book (see Ch. 3).
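The space constant mentioned above can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the mapping is treated as uniform on a logarithmic frequency axis within the region considered, and the barn-owl figures (a papilla of about 11 mm, with the octave above 5 kHz occupying more than 60 % of the epithelium) are the approximate values quoted in this book, so the result is a lower bound rather than a measured value.

```python
import math


def octaves_spanned(f_low_hz, f_high_hz):
    """Number of octaves between two frequencies."""
    return math.log2(f_high_hz / f_low_hz)


def space_per_octave(length_mm, f_low_hz, f_high_hz):
    """Mean space constant (mm/octave) of a stretch of epithelium.

    Assumes, for illustration only, that the frequencies are spread
    evenly on a logarithmic axis within the stretch considered.
    """
    return length_mm / octaves_spanned(f_low_hz, f_high_hz)


# Barn owl, using the approximate figures quoted in the text:
# about 11 mm of papilla, more than 60 % of it devoted to 5-10 kHz.
owl_length_mm = 11.0
fovea_fraction = 0.60  # lower bound stated in the text
fovea_mm_per_octave = space_per_octave(
    owl_length_mm * fovea_fraction, 5000.0, 10000.0
)

# A purely hypothetical small papilla for comparison (not a real species):
# 3 mm covering 0.1-6.4 kHz, i.e. six octaves.
generic_mm_per_octave = space_per_octave(3.0, 100.0, 6400.0)

print(round(fovea_mm_per_octave, 2), generic_mm_per_octave)
```

Under these assumptions the owl’s high-frequency octave receives at least about 6.6 mm, more than ten times the 0.5 mm per octave of the hypothetical evenly mapped papilla, which is the kind of contrast summarized in Fig. 1.5.1.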
1.5.3 Specialization of hair-cell populations across the papilla in birds and mammals
Whereas the two hair-cell types in lizards are responsible for different frequency ranges, and are thus placed at different positions along the papilla, the situation in archosaurs and mammals is quite different. The origin of hair-cell specializations in birds and mammals as manifest in the morphological differences between different hair-cell populations (Manley et al. 1989) is an interesting case of convergent evolution. Since the evolution of these specialized hair-cell populations occurred independently, it is not unexpected that, alongside a number of obvious common features, their detailed configurations show clear differences (Manley et al. 1989). The innervation patterns of the hair-cell groups in both birds and mammals are for example similar, the neural population receiving a prominent afferent and a weak (birds) or no (mammals) efferent innervation, the abneural group receiving a strong efferent and a weak (mammals) or no (birds) afferent innervation. Although the presence of a ‘cochlear amplifier’ is only well established for the mammalian cochlea, the morphology and some physiology suggest that this principle is also realized (perhaps using a different mechanism) in birds. According to this view, both the avian and mammalian cochleae contain ‘normal’ receptor cells lying neurally and receptor cells modified as ‘effector’ cells lying abneurally (Manley 1995). Since hair-cell structure and the innervation patterns change across the width of the hearing epithelium in birds and mammals, it might be expected that there would be physiological differences in the responses of hair cells and their innervating afferent fibres across the width of the organ. The evidence to date suggests that in birds, the most sensitive fibres contact hair cells near the neural or inner edge of the auditory papilla, and the least sensitive afferents contact hair cells near the middle of the epithelium (Manley and Gleich 1998, see Sect. 3.2.2). At least in higher-frequency regions, there are in birds in most cases no afferents to hair cells lying more abneurally than the middle of the papilla; thus no thresholds can be measured for these regions. Comparing mammals to birds, it appears that mammals achieve in single hair cells – the inner hair-cell row – at least partly what birds achieve through cells distributed across the whole hearing epithelium (Fig. 1.5.2). This suggests that the mechanisms underlying the interactions between hair cells, and that also involve the active process, operate a little differently in mammals and birds (Yates et al. 1998). It thus appears that during evolution, the selection pressures working on similar, large hair-cell epithelia in the various groups of amniotes have led to some similar and to some different solutions to the problems of developing sensitive, high-frequency hearing. Many of the solutions found utilized pre-existing abilities of hair cells that, through specialization, could be made more efficient. The end result of these trends was a remarkable variety of hearing organs whose sensitivities and frequency-response ranges differ in characteristic ways between the various modern groups of amniotes.
Fig. 1.5.1: The space along the basilar papilla that is devoted to one octave in frequency in the various species for which frequency maps of the hearing epithelium are available. Each data point represents the value for a particular species, except that for most species, two data points are given, for the high-frequency range (suffix to species’ abbreviation -h) and the low-frequency region (-l). The data points are arranged into taxonomic groups. Species’ key: Reptiles: GS, granite spiny lizard; A, alligator lizard; Pod, Podarcis; T, red-eared turtle; G, Gecko; B, bobtail skink. Birds: ST, starling; Ch, chicken; P, pigeon; E, Emu; O, Barn owl. Mammals: M, mouse; R, rat; GP, guinea pig; Cat, cat; BP, bat (Pteronotus); BR, bat (Rhinolophus).
Fig. 1.5.2: Schematic diagram to illustrate the differences between mammals and birds with respect to correlations between threshold and position of the synapses of afferent nerve fibres. In mammals, large afferent fibres contact the outer face of the inner hair cell and these afferents have a high spontaneous rate and a low response threshold. Thin fibres that contact the inner face of the same inner hair cell have opposite properties. In birds, afferent fibres that contact hair cells near the neural edge of the papilla have low response thresholds, those that contact hair cells near the middle of the papilla have higher thresholds. Hair cells further abneurally generally have no afferent contacts.
2 Comparative anatomy and physiology of hearing organs
2.1 Anatomy of the cochlea in birds Franz Peter Fischer, Otto Gleich, Christine Köppl and Geoffrey A. Manley
2.1.1 Anatomy and evolution of the avian basilar papilla
Through intensive research over more than a decade, a great deal of morphological and physiological data on the avian basilar papilla has become available. The finding that hair cells (HC) in the hearing epithelium of birds can regenerate following damage (e.g. Corwin and Cotanche 1988, Ryals and Rubel 1988) has considerably stimulated efforts in this field, and the chicken has become the ‘standard bird’ in research on avian HC regeneration. The mechanism of hearing in birds is, however, still far less understood than in mammals. In an attempt to establish structure-function correlations, we have carried out a long-term quantitative study of the morphological and innervational patterns of avian basilar papillae and their HC in differently-specialized bird species. These studies had two main aims. Firstly, we wanted to try to understand the evolution of this complex sense organ; our choice of species thus included both primitive and specialized birds. Secondly, we wanted to provide the morphological groundwork for comparative physiological studies of amniote hearing organs. The emu, a member of the palaeognathous Ratitae, is considered to be a rather primitive species (Carroll 1988). Among the neognathous birds, the water-bird assemblage (e.g. ducks and seagulls) is believed to show more primitive features than the land-bird assemblage (Feduccia 1980, Carroll 1988). The land-bird assemblage has two subdivisions, and the pigeon and the chicken belong to the more primitive subdivision, whereas the passerines (or song birds) are advanced land birds. The barn owl is likewise an advanced land bird and an extreme case of specialization: it can catch its prey in absolute darkness by using auditory cues. With the use of the scanning (SEM) and transmission (TEM) electron microscopes, we have studied the hearing organs in nine avian species (Table 2.1.1).
As a counterpart to the morphological research, numerous physiological studies were carried out by our group on the emu, chicken, starling and barn owl (see Sects. 2.2, 2.4, 2.5, 2.6, 3.2, 5.2). References to the literature describing additional data available for other species are to be found in our reviews (Manley 1990, Manley and Gleich 1992, Fischer 1994b, Gleich and Manley 1998).
Table 2.1.1: Species-specific variation of morphological parameters in the avian cochlea. The length measurements are corrected to the living state (assuming 30 % shrinkage in the SEM).
Species        BP length (mm)   HC number   Techniques   Literature
emu            5.5              17 000      SEM, TEM     Köppl and Manley 1997; Gleich and Manley 1998; Fischer 1994b, 1998.
tufted duck    3.6              8200        SEM          Manley et al. 1996.
chicken        4.7–6            11 100      SEM, TEM     Gleich and Manley 1988; Fischer et al. 1994b; Manley 1995; Manley et al. 1996.
pigeon         4–5.2            10 000      SEM          Boord 1969; Gleich and Manley 1988.
budgerigar     2.5              5400        SEM          Manley et al. 1993.
starling       2.9–3.5          5800        SEM, TEM     Gleich and Manley 1988; Fischer et al. 1992.
canary         2.1              3000        SEM          Gleich et al. 1994.
zebra finch    2.1              3600        SEM          Gleich et al. 1994.
barn owl       11–12            17 000      SEM, TEM     Smith et al. 1985; Fischer et al. 1988; Fischer 1994a; Köppl 1997a,b,c; Köppl et al. 1993.
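The values in Table 2.1.1 lend themselves to simple derived quantities. The Python sketch below is a hedged illustration, not part of the original analysis: the dictionary merely transcribes the table, the function names are hypothetical, ranges are reduced to their midpoints for convenience, and the shrinkage correction assumes that ‘30 % shrinkage’ means the fixed specimen measures 70 % of its living length.

```python
# Values transcribed from Table 2.1.1: (min length mm, max length mm), HC number.
TABLE_2_1_1 = {
    "emu": ((5.5, 5.5), 17000),
    "tufted duck": ((3.6, 3.6), 8200),
    "chicken": ((4.7, 6.0), 11100),
    "pigeon": ((4.0, 5.2), 10000),
    "budgerigar": ((2.5, 2.5), 5400),
    "starling": ((2.9, 3.5), 5800),
    "canary": ((2.1, 2.1), 3000),
    "zebra finch": ((2.1, 2.1), 3600),
    "barn owl": ((11.0, 12.0), 17000),
}


def living_length(fixed_mm, shrinkage=0.30):
    """Correct a length measured on fixed tissue to the living state.

    Assumption: a shrinkage of 30 % means fixed = 0.7 * living.
    """
    return fixed_mm / (1.0 - shrinkage)


def hair_cells_per_mm(species):
    """Mean hair-cell count per mm of papilla, using the range midpoint."""
    (shortest, longest), hc_number = TABLE_2_1_1[species]
    return hc_number / ((shortest + longest) / 2.0)


for name in TABLE_2_1_1:
    print(f"{name:12s} {hair_cells_per_mm(name):7.0f} HC/mm")
```

With these assumptions, the barn owl and the emu have the same hair-cell number but very different mean densities, because the owl’s papilla is roughly twice as long; this is consistent with the basal elongation of the owl papilla described in the text.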
2.1.2 Basic structure of the avian hearing organ
In contrast to the coiled mammalian organ of Corti, the avian hearing epithelium (the basilar papilla, BP) is a flat, slightly curved band. Its apex is broad, and its width decreases in most cases steadily towards the base (Fig. 2.1.1). In general, the basilar papilla of birds is shorter than the mammalian hearing epithelium. The shortest BP we have studied are those of the canary and the zebra finch (Table 2.1.1). At the other extreme, the BP of the barn owl is very long and, at close to 11 mm, certainly longer than many mammalian papillae. In this species, the basal part of the papilla has been considerably extended, whereas the apical third corresponds morphologically to an ‘ordinary’ avian BP (Fischer 1994a, see also Sect. 3.2.1). When comparing different bird species, two trends are discernible. Firstly, the bird’s body size roughly correlates with the BP size. Secondly, in specialized birds a particular part of the BP is enlarged; in the case of the barn owl it is the base (Fischer et al. 1988). Similarly, in the pigeon, the apex is disproportionately wider (Gleich and Manley 1988). In the smaller songbirds we have studied so far, the canary and zebra finch, the basilar papilla resembles the basal portion of the basilar papilla of unspecialized birds. Thus these smaller birds have probably reduced the apical, low-frequency part of the papilla (Gleich et al. 1994). The tectorial membrane (TM) covers the entire papilla, and the tallest stereovilli of all HC bundles are firmly attached to it, as are the microvilli of the surrounding supporting cells. Each HC is thus enclosed in an individual pocket of the TM. The TM is generally much thicker than in mammals, but its thickness tapers strongly from the neural side towards the abneural edge of the BP. After normal histological treatment, the avian tectorial membrane resembles Swiss cheese, and the holes in it are often so large as to reveal the hair cells beneath it. When the tissue was treated appropriately, however, we were able to show that the TM’s surface has no holes, but is quite smooth (Runhaar 1989). Since the HC with the most prominent afferent innervation are not located over the basilar membrane but rather over the cartilaginous neural limbus (Manley et al. 1987, Gleich 1989, Fischer 1994b), it is suspected that the TM plays a crucial role in their mechanical stimulation (Manley 1995). This may involve an interaction with the abneural HC over the free basilar membrane and possibly also the hyaline cells, providing the mechanical input to those HC on the neural side that actually transmit information about the acoustic stimulus to the brain. In the pigeon, there is an unusual increase of the TM mass near the apical end, and this is probably related to the apical, low-frequency specialization of that species (Gleich and Manley 1988, Schermuly and Klinke 1990). A vestibular organ, the lagena, surrounds the apical end of the BP. It contains about 5000 HC that are stimulated by calcareous otoliths embedded in gelatinous material. Nerve fibres from the chicken lagena do not respond to sound (Manley et al. 1991, see also Sect. 2.7).
Fig. 2.1.1: Top: Schematic drawing of the typical appearance of the avian basilar papilla. The apical end is partially surrounded by the lagenar macula (right). The neural side is convex, the abneural concave and bordered by the hyaline cells. The dashed line shows the approximate division between the neural area in which hair cells are both afferently and efferently innervated, and the abneural area, in which hair cells are only efferently innervated. The position of this line is species-dependent. Bottom: Diagrammatic representation of epithelia from eight bird species to illustrate the differences in size and shape of the basilar papillae. In each case, the length of the fixed epithelium and the number of hair cells is indicated.
2.1.3 The hair cells
The total number of HC in the avian papilla (several thousand, Table 2.1.1) is comparable to that of mammalian cochleae. In birds, however, there are no distinct HC rows as in mammals; instead, the HC make up nearly the whole surface of the BP and are arranged in a complex mosaic. The number of HC across the papilla varies according to the species and the position along the papilla from about 4 HC basally to about 50 apically (e.g. Fischer et al. 1988, Gleich and Manley 1988, Gleich et al. 1994, Manley et al. 1993, Manley et al. 1996). Each HC is surrounded by a thin sheath of supporting cells (SC); the main body of the SC lies beneath the layer of the HC. Unlike in mammals, the SC do not contain large quantities of stiffening proteins (Fischer et al. 1992) and do not appear to be so specialized. Along the abneural edge of the BP, a zone of hyaline cells is present (Fig. 2.1.2). The hair cells are always fully surrounded by supporting cells, and there are no fluid spaces within the papilla.
Fig. 2.1.2: Light-microscopical transverse section of the chicken basilar papilla at a position 41 % from the basal end. Hair cells are shown filled black, and representative hair cells from left to right in the cross section (at positions 1, 9, 12, 21, 53, 69 and 85 % from the neural border) are reconstructed below from transmission electron microscopy. Afferent nerve terminals are shown filled black, efferent terminals are stippled.
In contrast to mammals, the avian papilla shows a continuous transition between two basic HC forms across and along the papilla (e.g. Gleich et al. 1994, Fischer 1992, 1994a, 1998, see below): relatively tall and columnar HC (Tall Hair Cells, THC) are found neurally and apically, and shorter, cup-shaped HC (Short Hair Cells, SHC) are found abneurally and basally. Most THC are situated over the cartilage-like neural limbus, whereas the SHC lie over the free basilar membrane (Fig. 2.1.2). On the neural side of the BP, afferent and efferent nerve fibres pass from the cochlear ganglion via the habenula perforata into the BP. In more primitive birds, such as the pigeon and the emu, preliminary data indicate that there are fewer nerve fibres than HC; the majority of these fibres are afferent in nature (Tab. 1). On the other hand, in the budgerigar, the canary and the barn owl, there are many more nerve fibres than HC. In the chicken and the starling, the nerve-fibre/HC ratio lies between these extremes. Thus species with advanced hearing organs have, on average, a denser afferent innervation (Gleich and Manley 1998).
2.1.4 Hair-cell ultrastructure and synapses
The ultrastructure of HC in the avian BP resembles that of HC in other vertebrates (Fig. 2.1.3). Subsurface cisternae (as found in many OHC and some IHC of mammals) are, however, not present. Kinocilia may be found at the abneural edge of the bundle of many HC, depending on the animal’s age and species. In cases where the kinocilia are vestigial, a basal body remains. If kinocilia are present, they are more numerous in the apex of the BP (that has a more primitive appearance) than in the base. HC contact their adjacent supporting cells by a zonula occludens and a zonula adhaerens. The bundle of stereovilli on the apical surface of each hair cell inserts into a cuticular plate with actin root filaments. The cuticular plate itself is mainly made up of a network of actin filaments and its shape is more variable than it is in mammalian HC (Brix et al. 1994). The shape of the HC shows systematic gradients across and along the BP. Direct membrane contacts between HC are a common feature in birds. With few exceptions, however, they only occur between THC, and are most frequent in the apical half of the BP. There are different types of contacts, the most intriguing ones being true fusions of two neighbouring HC, with no intervening cell membranes (Fischer et al. 1991). Typically, neural HC have two larger knob-like afferent synapses and one small knob-like efferent synapse. In contrast, abneural HC mostly have only one large, cup-like efferent synapse, except in the apex (Fischer 1994b). The ultrastructure of the synapses is distinctive: Afferent synapses have rather irregular thickenings of the pre- and postsynaptic membrane. The overall area of these thickenings correlates well with the total synaptic contact area (Fischer 1992). Within the HC, numerous presynaptic bodies are found near the afferent synaptic membrane. In contrast to mammals, these are uniformly round in shape, but of varying size (150–400 nm), depending on the species and the position of the HC on the BP (e.g. Fischer 1994a). The presynaptic bodies are surrounded by vesicles that are about 20–30 nm wide. Efferent nerve endings are filled on the presynaptic side with vesicles that are larger (35–80 nm wide). The mitochondria in these efferent fibres are thinner and longer than those of the afferent terminals. Opposite all large and most small efferent endings, a flat subsynaptic cisterna is present within the HC. This cisterna is a derivative of the granular ER, in contrast to the mammalian HC, where the cisternae derive from smooth ER (e.g. Takasaka and Smith 1971, Fischer 1992).
Fig. 2.1.3: Transmission-electron micrographs of hair cells of the barn owl. Left panel: Tall hair cells at 83 % of the papillar length from the basal end. Right panel: A short hair cell at 61 %.
2.1.5 Gradients in hair-cell morphology
The tonotopic organization of the avian BP correlates with a number of gradients in HC morphology along the papilla (Manley et al. 1987, Gleich 1989, Köppl et al. 1993, Chen et al. 1994, Jones and Jones 1995, Köppl and Manley 1997). Two directions of morphological gradients exist in the avian BP, one over its width, and another along its length. Gradients are found in many anatomical features, e.g. in the HC shape, bundle morphology, bundle orientation and innervation pattern. The gradients are similar among different bird species, but their patterns are species-specific (e.g. Gleich and Manley 1988, Manley et al. 1993, Fischer 1994b, Gleich et al. 1994, Tilney and Saunders 1983). In addition, some species also show obvious localized anatomical specializations.
2.1.5.1 Gradients in the shape of the hair cells
There are systematic changes in HC height: In general, apical HC are taller than basal ones, and neural HC are taller than abneural ones. HC situated directly along the neural border are an exception: although the general pattern applies, the height changes in this population are less pronounced along the BP than in their more abneural neighbours. HC of the neural region give rise to the most sensitive afferent nerve-fibre responses, and the tallest HC may not be at the extreme neural edge (Gleich 1989, Manley et al. 1989, Smolders et al. 1995). Abneural and basal HC are nearly completely filled by the nucleus and the cuticular plate. This is not only due to the fact that SHC are smaller, but their cuticular plates have a greater volume and occupy a greater proportion of the HC (Brix et al. 1994, Fischer et al. 1992). In contrast, apical and neural HC contain much more active cytoplasm. The emu has generally quite tall HC (30 µm in the apex, 6 µm in the base), whereas e.g. the starling’s HC are rather short (18 µm in the apex, 4 µm in the base). Birds hearing higher frequencies generally have shorter HC, whereas the emu has a large proportion of its BP devoted to low frequencies (Köppl and Manley 1997, Manley et al. 1997). This suggests a species-specific correlation between HC height and characteristic frequency of the responses of the HC and their afferent fibres (Fischer 1994b). In the base of the barn-owl BP, for example, the HC (especially SHC) are extremely short (3–4 µm), corresponding with the fact that the barn owl has a sensitive hearing range extending about one octave higher than in most other birds. However, the HC in the barn owl’s extreme apex are also very tall (30 µm, Fischer 1994a), suggesting good low-frequency hearing as well.
Takasaka and Smith (1971) first used a simple form factor to classify avian HC: HC that are taller than wide were named THC, and HC that are wider than tall were called SHC. THC are predominant in the apex of the BP, are less specialized than the SHC and are similar to typical HC of more primitive vertebrate groups (Fig. 2.1.3, 2.1.4; Takasaka and Smith 1971, Chandler 1984). The transition from THC to SHC is, however, very gradual in birds and their definition is quite arbitrary. Because of this, we have suggested replacing it with a more functional definition, based on newly-discovered aspects of the innervation pattern, as follows: THC have afferent as well as (in almost all cases) efferent synapses, whereas SHC have only efferent but no afferent synapses (Fischer 1994a,b). In addition to this feature of the innervation pattern, that can be assessed ultrastructurally for any HC, there are also functional differences. There is a difference in ion-channel complements between these HC forms (Fuchs 1992), and acetylcholine inhibits SHC but not THC (Fuchs and Murrow 1992). THC thus fulfil the more classical HC function, i.e. that of receiving sensory stimuli and transducing them for transmission to the brain. SHC are a very specialized HC type, with an as yet unknown function within the auditory papilla that is presumably under efferent control.
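The two definitions described above can be contrasted in a minimal sketch. The `HairCell` record, its field names and the example values are hypothetical and serve only as an illustration; the rules themselves follow the text (form factor: taller than wide versus wider than tall; innervation: presence or absence of afferent synapses).

```python
from dataclasses import dataclass


@dataclass
class HairCell:
    """Hypothetical record of one hair cell (field names are illustrative)."""
    height_um: float
    width_um: float
    afferent_synapses: int
    efferent_synapses: int


def classify_by_form(cell: HairCell) -> str:
    """Form-factor rule: taller than wide is THC, wider than tall is SHC."""
    if cell.height_um > cell.width_um:
        return "THC"
    if cell.width_um > cell.height_um:
        return "SHC"
    # The rule as stated does not cover cells that are exactly as tall as wide.
    return "undefined"


def classify_by_innervation(cell: HairCell) -> str:
    """Innervation rule: any afferent synapse is THC; efferent-only is SHC."""
    if cell.afferent_synapses > 0:
        return "THC"
    if cell.efferent_synapses > 0:
        return "SHC"
    # A cell with no synapses at all is not covered by the definition.
    return "unclassified"


# Made-up example: a squat cell that still carries one afferent synapse.
example = HairCell(height_um=8.0, width_um=9.0,
                   afferent_synapses=1, efferent_synapses=1)
form_label = classify_by_form(example)
innervation_label = classify_by_innervation(example)
```

For such an intermediate cell the two rules disagree, which is the kind of case that makes the purely morphological definition arbitrary along a gradual transition.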
Fig. 2.1.4: Scanning electron micrographs of the apical surface of tall hair cells (left) and short hair cells (right) from the basilar papilla of the Budgerigar, at a position 40 % of the papilla’s length from the base. The neural edge is to the left in each case.
2.1.5.2 Stereovillar bundle shape
As in other vertebrates, the stereovillar bundle of avian basilar-papilla hair cells is composed of rows of stereovilli of increasing height that are arranged in columns from short to tall stereovilli (Pickles et al. 1989). Despite some species-specific variability, the short axis of the bundle (indicating the number of stereovilli in a column) stays rather constant along the papilla at 2–3 µm, whereas the long axis (indicating the number of columns) decreases strongly from the base (6–10 µm) towards the apex (3–6 µm; Fischer et al. 1988, Gleich and Manley 1988, Gleich et al. 1994, Manley et al. 1993, 1996). There is also a height gradient along the papilla. As a consequence of this gradual change of bundle shape, apical bundles are rather round and have tall stereovilli, reminiscent of shaving brushes, whereas basal bundles are elongated and look, with their short stereovilli, more like toothbrushes (Fig. 2.1.5). Working in cooperation with Dr. Jim Pickles of the University of Queensland, Australia, we showed that the stereovillar tip links that he had described in mammalian hair cells also occur in birds and lizards. In each case, the orientation of the tip links was always along the presumed axis of stimulation of the bundles, independently of the orientation of the hair-cell bundle in the epithelium (Pickles et al. 1988, 1989a, b). During development, the existence of tip links in the chicken papilla could be followed back to stage E8. At that time, the stereovilli are all equally short and have connections, the nature of which was not determined in detail, to all their neighbours. During further developmental stages, the stereovilli on one side of the bundle grow taller and the tip links to equally-tall stereovilli of the same row disappear. At stage E16, the stereovillar bundles are fully differentiated and the tip links only connect stereovilli of increasing height that, in birds, build the typical columns of the excitatory-inhibitory axis (Fig. 2.1.5; Pickles et al. 1991a). There are also strong gradients in the number of stereovilli per bundle (Gleich and Manley 1988, Gleich et al. 1994, Fischer et al. 1988, Fischer 1998, Manley et al. 1993, 1996): Apical HC have about 50, basal ones about 200 stereovilli. In most birds (chicken, pigeon, budgerigar, canary, zebra finch, barn owl), there are also markedly more (up to 100 % more) stereovilli on HC on the neural side of the BP, compared with HC on the abneural side at the same longitudinal position. In the canary and zebra finch, this difference is largest and is present along most of the BP, whereas in the chicken and the starling, there is no change in stereovillar number over the BP’s width in the apical half. The height of the stereovilli at a given position on the BP varies with the species. In the starling, the height of the tallest stereovilli varies from 2.7 µm basally to 9.4 µm apically, and in the pigeon it changes from 4.0 µm to 12.7 µm (Gleich and Manley 1988). Within and between species, there is a correlation between stereovillar height and frequency (Fischer 1994b), at least along the longitudinal axis of the BP. In the auditory foveal region of the barn owl, the stereovillar height is rather constant at about 1.4 µm along the entire basal half of the BP (Fischer et al. 1988, Fischer 1994a), a region along which the frequency response changes very slowly (see Sect. 3.2).
Fig. 2.1.5: A stereovillar bundle, as seen in the scanning EM, from a starling hair cell in the neural-basal area, showing the tip links oriented along the stereovillar columns.
2.1.5.3 Stereovillar bundle orientation
A characteristic feature in birds is the systematic change in HC bundle orientation, both across and along the BP. The sensitive axes of stereovillar bundles of neural and abneural HC are oriented perpendicularly to the edge of the BP (kinocilium and tallest stereovilli facing away from the neural edge). HC bundles in the centre of the BP, and mostly in the apical half of the BP, are rotated towards the apex (up to 90°). This pattern is similar in all birds studied so far (Tilney et al. 1987, Gleich and Manley 1988, 1992, Fischer et al. 1988, Jörgensen and Christensen 1989, Manley et al. 1993, 1996, Gleich et al. 1994). The location of maximal rotation and the degree of bundle rotation towards the apex are species-specific (Fig. 2.1.6). Since a bundle shows the highest sensitivity to a deflection towards the kinocilium (Hudspeth and Jacobs 1979), the orientation also presumably affects the direction of maximal sensitivity (Pickles et al. 1989). We do not yet understand the interactions of all these gradients along and across the BP that presumably determine the micromechanics of the respective cochleae.
2.1.6 Hair-cell innervation
Neural and medial HC usually receive about two afferent terminals. The largest area of synaptic contact is found on HC within the neural region. This can be the most neural HC (e.g. barn owl, Fischer 1994a); alternatively, HC up to 20 % of the BP’s width from the neural edge can have the largest number of afferent contacts (e.g. pigeon, Smolders et al. 1995). In each species, a maximum in the number and/or the total afferent synaptic contact area is observed in that part of the BP where the behaviourally most important frequencies (and also the most sensitive hearing range) are encoded (Fischer 1994b). The extreme case is in the auditory fovea of the barn owl, where neural HC have up to 20 afferent terminals (Fischer 1994a, see also Sects. 3.2.1, 6.4). This is reminiscent of mammalian cochleae, where 15–20 afferents typically innervate inner hair cells in the most sensitive part of the cochlea (Lim 1986). Systematic TEM studies revealed that in the chicken, starling, barn owl and also in the primitive emu, HC on the abneural side, and generally in the basal half of the BP, do not have any afferent innervation (for review see Fischer 1994b). There are indications that during the evolution of the avian papilla, this region of hair cells that lacks afferent contacts has become larger: In the rather primitive emu, this zone is quite narrow, whereas in the starling and the barn owl, it is broad (Fig. 2.1.6). The HC without afferents (SHC) must fulfil some (yet unclear) function within the BP itself. Avian HC usually receive one efferent terminal, rarely two or even three terminals. There is a continuous increase in the efferent synaptic contact area from neural THC to abneural SHC, although in the apex, the efferent contact area of SHC may also be small. Whereas abneural HC always have efferent synapses, neural and medial HC may lack efferents, especially in the BP’s apex (Fischer 1992, Zidanic and Fuchs 1996). It thus appears that efferent HC innervation is especially important at higher frequencies and on SHC. In the base of the BP, the differentiation of the HC types is most clearly established: Only THC have afferent innervation, whereas abneural HC lack afferents.
Fig. 2.1.6: Hair-cell stereovillar bundle orientation in the chicken (top) and the barn owl (bottom two drawings). The neural papillar edge is to the top, the base to the right in each case. Not to scale; the dimensions in life are: maximal width of the basilar papilla in both cases 0.24 mm, papillar length is 4.3 mm in the chicken and 11 mm in the barn owl. In the chicken, the thick line in the basal half delineates an abneural region where the hair cells lack afferent-fibre connections. The same delineation line is shown for the barn owl in the lowest diagram. The thin black lines in the chicken and in the upper barn-owl drawing are iso-orientation contours that show the hair-cell areas within which the bundles are rotated by the indicated number of degrees towards the apex.
Morphologically, some authors find a gradual change from small efferents on THC to calyx-like synapses on SHC (Fischer 1992, Ofsie and Cotanche 1996), whereas others see a rather sharp border in the midline of the BP between the two forms of efferent terminals (Zidanic and Fuchs 1996). There are physiological indications that in the avian BP two types of efferents coexist, as in the mammalian cochlea (Code and Carr 1994, Kaiser and Manley 1994, see Sect. 2.6). In most cases, afferent nerve fibres contact neural and medial HC rather directly and exclusively (Gleich 1989, Manley et al. 1989, Fischer 1994b). Non-exclusive afferent fibres are mostly found in the apex of the BP. In the barn owl’s apex, this is extreme: here numerous non-exclusive afferents and many HC-HC fusion contacts join the sensory cells together into presumed functional units (Fischer 1994a). In the emu, afferent fibres contacting more than one hair cell innervated abneural, apical hair-cell areas (Köppl and Manley 1997). In contrast, the efferent nerve fibres to abneural HC run from the habenula perforata to the abneural side of the BP, where they first contact the hyaline cells (that are situated along the abneural border of the BP), before they turn back, branch, and contact several HC. In the chicken, one efferent fibre contacts about 40 HC (Cole and Gummer 1990, Fischer 1992). THC, on the other hand, are usually directly contacted by small efferents. Afferent nerve fibres are mostly myelinated. In the young chicken, about 3 % were unmyelinated (Fischer et al. 1994), whereas in the adult barn owl, virtually all afferent fibres were found to be myelinated (Köppl 1997b). In the efferent nerve-fibre population, on the other hand, nearly 60 % were unmyelinated in the young chicken (Fischer et al. 1994). Axon diameters differ markedly between species. In the chicken, afferents are 0.5–3 µm in diameter; in the barn owl, they are 1.0–5.5 µm (Köppl 1997b). The emu has very large afferents in the apical half of its BP, up to 7 µm in diameter (Fischer 1998). In the barn owl, a clear correlation of afferent axon diameters with the longitudinal position they innervated along the BP was documented, such that axons supplying the 7 kHz-region were largest and diameters decreased towards both lower and higher frequencies (Fig. 2.1.7; Köppl 1997b). It is not yet clear whether this particular trend or any such correlation at all may be typical for birds. A fuller treatment of evolutionary aspects of papillar morphology in birds can be found in Gleich and Manley 1998.
Fig. 2.1.7: The most common diameters of afferent auditory-nerve axons innervating regions of different characteristic frequency in the barn owl basilar papilla. Axon diameters were evaluated at successive locations along the basilar papilla and the modes of the size distributions of axons entering the nerve between the different locations are shown (horizontal grid lines indicate the bins of 0.5 µm that were used to classify axon sizes). The locations along the papilla were converted to frequency according to the known frequency map of the barn owl papilla (Köppl et al. 1993).
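The binning step described in this caption (axon diameters sorted into 0.5 µm classes, with the mode of each size distribution reported) can be illustrated with a minimal sketch. The function name and the sample diameters are hypothetical and are not data from the study; the tie-breaking rule (lowest bin wins) is likewise an assumption made only to keep the result deterministic.

```python
import math
from collections import Counter

BIN_WIDTH_UM = 0.5  # bin size stated in the caption


def modal_bin(diameters_um, bin_width=BIN_WIDTH_UM):
    """Return (lower, upper) edges of the most populated diameter bin."""
    if not diameters_um:
        raise ValueError("no axon diameters supplied")
    # Assign each diameter to the bin whose lower edge is index * bin_width.
    counts = Counter(math.floor(d / bin_width) for d in diameters_um)
    # Highest count wins; on a tie the lower bin is chosen (assumption).
    best_index = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))[0]
    return best_index * bin_width, (best_index + 1) * bin_width


# Made-up diameters (µm) for axons entering the nerve at one location.
sample_diameters = [2.1, 2.3, 2.4, 2.6, 3.1, 1.9]
lower_edge, upper_edge = modal_bin(sample_diameters)
```

Repeating this for each successive location along the papilla, and converting location to frequency with a frequency map, would yield one modal diameter per frequency region.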
2.2 A special case of congenital hearing deficits: The Waterslager canary Otto Gleich
In cooperative work with Prof. R.J. Dooling of the University of Maryland, we studied the well-documented ‘partial deafness’ of one race of canary, the Waterslager, in order to determine whether the hearing loss observed in behavioural studies is of peripheral and/or of central origin. This question is especially interesting in view of the ability of birds to regenerate hair cells lost through noise damage or through treatment with ototoxic drugs. If birds can really fully regenerate hair cells, why do these birds remain hard of hearing? We studied the compound action potential of the cochlea (CAP) and the cochlear microphonics in response to pure tones, and looked at the cochlear epithelia of these birds in the scanning electron microscope. A comparison of CAP and microphonic audiograms with normal canaries showed that Waterslager canaries have systematically elevated thresholds (Gleich et al. 1994a,b, 1995a,b, Fig. 2.2.1). The fact that CAP and cochlear microphonic thresholds were raised relative to those of normal canaries and that these animals have a normal middle ear implies that damage is present at the level of the hair cells and that at least part of the hearing loss is of peripheral origin.
Fig. 2.2.1: Top: A comparison between the thresholds of the mean compound action potential of the auditory nerve of Belgian Waterslager canaries (BWS, thin line) and normal canaries (thick line). Bottom: A comparison of the basilar-papillar surface in (left) a normal canary and (right) a Waterslager. Each of the lower panels is about 20 µm wide. Figures courtesy of O. Gleich.
Scanning EM studies of the Waterslager canaries showed that they had obvious pathological changes to the hair cells (Fig. 2.2.1), and that the degree of pathology varied strongly between individuals. The changes seen were, e.g. a strongly reduced number of hair cells (e.g. 1700 instead of 3000) and various changes to the hair-cell surface and the stereovillar bundle (Gleich et al. 1993, 1994a). The changes in the bundles varied from the loss of a few stereovilli, through extremely thin, thick or elongated stereovilli to fused stereovilli. Some hair cells had several partial bundles; on others, the bundle was completely missing. Hair cells with extremely enlarged apical surfaces were also common, especially in the abneural papillar area. Most of the papillae showed hair cells whose surface was completely covered in microvilli or had, in addition, a short stereovillar bundle. The appearance of these cells strongly resembled that of hair cells regenerating following noise damage or treatment with ototoxic drugs. It became obvious that the Waterslager canary could be an excellent model for a (presumably genetically determined) continuous hair-cell regeneration, a race in which a proportion of the hair cells is always either dying or regenerating, but which never attains a completely recovered sensory epithelium.
2.3 Micromechanical properties of chicken hair-cell bundles Jutta Brix and Geoffrey A. Manley
As avian hair cells are very difficult to study in the living animal, we decided to investigate some of their mechanical properties in isolated pieces of chicken hearing epithelium, in cultured epithelia and in isolated individual hair cells. In order to do this, it was necessary to modify existing culture techniques for this preparation and to develop a method of measuring hair-cell bundle movements of only a few nanometers. Isolated single chicken hair cells and pieces of epithelium without the tectorial membrane, either freshly isolated or in tissue culture, were studied using water-jet stimulation of their stereovillar bundles and current injection through microelectrodes (Brix and Manley 1994). Mechanical responses were measured under enhanced video-microscopic observation or while using a differential photodiode technique sensitive to motion in the nanometer range. When stimulated with a water jet at low displacement amplitudes up to about 200 nm, the displacement of the stereovillar bundle was asymmetrical, indicating a lower stiffness in the excitatory direction. The reverse was true at higher displacement amplitudes. Undamaged bundles showed no mechanical resonances below 1 kHz, suggesting that the normal micromechanical resonances of this system are only obtainable in the presence of the tectorial membrane that provides most of the mass of the resonant system. In damaged bundles, however, such resonances were prominent and were accompanied by splaying of the stereovilli. When subjected to current injection through inserted microelectrodes, hair cells still embedded in the epithelia showed small bundle movements (0.6 nm/mV) whose polarity depended on the polarity of the current. These movements probably resulted from activation of the bundle’s adaptation motors (Brix and Manley 1994).
We also mechanically stimulated hair cells and recorded their extracellular receptor potentials in the freshly-isolated chicken auditory epithelium in order to study the properties of the stereovillar tip links. We were able to measure stable microphonic potentials over a period of more than two hours. Hair-cell transduction was not influenced by adding collagenase to the bathing solution, whereas streptomycin led to a complete loss of the potentials within a few minutes. Since anatomical studies of these hair cells showed that a similar percentage of tip links were visible in bundles that were or were not treated with collagenase, we concluded that the tip-link protein is not collagen (Pickles et al. 1991b).
2.4 Potassium concentration and its development in the chicken cochlea Geoffrey A. Manley
The periotic space of the inner ear of vertebrates contains perilymph, a fluid which has a high sodium concentration and a low potassium concentration. In contrast, the endolymph of the otic spaces has a high concentration of potassium ions and a low concentration of sodium ions. In addition, the endolymphatic space of mammals has been shown to be at a high positive potential compared to the extracellular spaces outside the cochlea, a potential between +80 and +100 mV, that came to be known as the endocochlear potential (EP). The high potassium concentration was also later found in the vestibular system, but was accompanied by only small potential differences. In birds, this EP was shown to be much smaller than in mammals and values up to +20 mV were typical. With regard to the ionic concentrations of the inner-ear spaces in non-mammals, very little information is available. Although there are reports that endolymph contains more potassium than perilymph and perilymph more sodium, technical problems made exact measurements impossible. The combination of a high potassium concentration and high positive potential in mammalian endolymph has been suggested to be a mechanism for amplifying the transduction current in the cochlea and thus to play a very important role in the hearing mechanism. Should the potassium concentration in birds be as high as in mammals, the absence of a high EP would have important consequences for the general applicability of Davis’ theory. It thus became necessary to carry out an exact measurement of the ionic concentrations in the inner ear of birds. In addition, we investigated chicks of different ages, in order to attempt to correlate the ontogeny of the EP and ionic concentrations with the known morphological changes in the TV at different ages (see Runhaar et al. 1991, for references). 
We measured the potassium concentration in the perilymphatic and endolymphatic spaces of the chicken’s cochlea using double-barrelled ion-specific microelectrodes, one barrel of which was filled with a potassium ion exchanger. The potassium concentration in scala media was found to be about 24 times higher than that of the perilymph. Both this ratio and the actual values of 161 and 8.1 mmol/l, respectively, are very similar to data reported for the equivalent spaces of mammalian cochleae. The high potassium concentration in the endolymph is reached at the latest at stage E42, and thus before the tegmentum vasculosum is fully developed (Runhaar et al. 1991). The endocochlear potential of chickens was maximally +13 mV and thus somewhat below previously-published values for birds. The EP of birds is thus very much lower than that of mammals, but higher than values measured in vestibular endolymphatic spaces. Positive potentials in the endolymphatic space of 5 mV or more were already present in animals of stages E41 and E42, and this developmental process is complete at the latest one day after hatching. The sensitivity of the avian EP to hypoxia can be most easily interpreted as an indication of the presence of an electrogenic Na+-K+-pump mechanism in the tegmentum vasculosum (Runhaar et al. 1991).
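To give a feel for the size of the potassium gradient reported in this section, the standard Nernst equation can express the two measured concentrations as an equivalent equilibrium potential. This is only an illustrative sketch and not part of the original measurements: the body temperature of 41 °C is an assumption, the function name is hypothetical, and the result is not a prediction of the endocochlear potential, which depends on active transport.

```python
import math

GAS_CONSTANT = 8.314462618      # J/(mol K)
FARADAY_CONSTANT = 96485.33212  # C/mol


def nernst_potential_mv(conc_a_mm, conc_b_mm, temp_c, valence=1):
    """Nernst equilibrium potential in mV for the ratio conc_a / conc_b."""
    temp_k = temp_c + 273.15
    volts = (GAS_CONSTANT * temp_k) / (valence * FARADAY_CONSTANT) \
        * math.log(conc_a_mm / conc_b_mm)
    return 1000.0 * volts


K_ENDOLYMPH_MM = 161.0      # value reported in the text
K_PERILYMPH_MM = 8.1        # value reported in the text
ASSUMED_BODY_TEMP_C = 41.0  # assumption, not stated in the text

e_k_mv = nernst_potential_mv(K_ENDOLYMPH_MM, K_PERILYMPH_MM,
                             ASSUMED_BODY_TEMP_C)
```

The sign of the result depends on which compartment is taken as the reference; only the magnitude is of interest here.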
2.5 Discharge activity of afferent fibres in the avian hearing organ Otto Gleich, Christine Köppl and Geoffrey A. Manley
In studying the responses of single auditory-nerve fibres of different bird species, our aims were to try to understand the evolution of function of the avian papilla and to learn how it achieves its high frequency selectivity. Parallel to our studies, most other work on the peripheral auditory system of birds was carried out on the pigeon (e.g. Smolders et al. 1995, Gummer et al. 1986, Hill et al. 1989, Schermuly and Klinke 1990, Temchin 1988) and the chicken (Salvi et al. 1992, Warchol and Dallos 1989). Two different mechanisms of frequency selectivity have been recognized in terrestrial vertebrates (see e.g. Manley 1986, 1990), and it is likely that both mechanisms – electrical and micromechanical tuning – co-exist in the avian auditory papilla. Four sets of data suggest that birds have retained electrical tuning, at least in their low-frequency hair cells (review in Manley and Gleich 1992). These are: The presence of preferred intervals in the spontaneous activity of primary auditory-nerve fibres (Manley and Gleich 1984), specific deviations from their expected phase responses (Gleich 1987), the strong temperature sensitivity of frequency tuning (Schermuly and Klinke 1985) and patterns seen in the electrical responses of avian hair cells (Fuchs et al. 1988, 1990). Thus a major part of the tuning selectivity of avian THC may reside in the properties of hair-cell membranes. Wu et al. (1995) have shown that at the high body temperatures of birds, the upper frequency limit of this electrical tuning might extend beyond 4 kHz. The upper limit we found for preferred intervals in the spontaneous activity of 35
primary auditory-nerve fibres in birds is 4.7 kHz in the barn owl (Köppl 1994). In the owl, there is no reason to suspect that these observations were limited by a decrease in the phase-locking ability of the nerve fibres, as the frequency limit of phase locking in this species is considerably above 5 kHz (see Sect. 2.5.5; Köppl 1997c).
In birds, a variety of structures can interact in complex ways to produce the response patterns of the second kind of frequency-selective mechanism, micromechanical tuning. In spite of the intense efforts made, however, we still understand little of the micromechanics underlying frequency-selectivity mechanisms in any kind of papilla. The passive mechanical characteristics of the hair-cell stereovillar bundles (determined e.g. by the number and height of stereovilli) and of the tectorial membrane (whose dimensions vary along the papilla) and/or active motile processes in hair cells all play a role. As discussed above (see Sects. 2.1.2 to 2.1.5), many structures show strong gradients in all birds and are undoubtedly at least partially responsible for the gradient of frequency responses along the hearing epithelium. In addition, our study of rate-intensity responses of primary afferents in the emu (Sect. 2.5.2) and of spontaneous otoacoustic emissions in the barn owl (Sect. 5.2.2) indicated that birds have a cochlear amplifier. We suspect that there are specializations of hair cells for active motion on the one hand (presumably by the short hair cells, at least those that lack an afferent innervation) and of tall hair cells as receptor elements on the other (Manley 1995). In our study of hair cells of the chicken, isolated hair cells displayed bundle movements or shape changes upon current injection (Brix and Manley 1994).
However, due to the small size of the hair cells, it was not possible to investigate whether chicken hair cells are capable of fast movements at auditory frequencies. During the course of the collaborative research centre, we studied the physiology of single auditory-nerve afferent fibres in several species. These are the chicken (Manley et al. 1987, 1989, 1991, 1992), starling (Manley et al. 1985, Gleich and Narins 1988, Klump and Gleich 1991, Gleich 1994, Gleich and Klump 1995), barn owl (Köppl 1997a, c) and emu (Manley et al. 1997, Köppl and Manley 1997, Köppl et al. 1997). The resulting data are briefly summarized below.
2.5.1 Spontaneous activity of afferent nerve fibres
The rates of spontaneous activity of afferent fibres have a monomodal distribution, with maximal values exceeding 100 spikes/s. In the developing emu (Manley et al. 1997) the mean spontaneous rate increased with posthatching age. A smaller change in the same direction was also seen in the chicken (Manley et al. 1992). Spontaneous activity results from the stochastic release of transmitter packets at the afferent synapse and should result in a Poisson-like distribution of the intervals between action potentials (except for the absence of the shortest intervals, due to the refractory period of the neuron). We observed significant deviations from this pattern in all species, with many cells producing an unexpectedly large proportion of inter-spike intervals whose period was related to their most sensitive frequency. This results in peaks (preferred intervals) and valleys in inter-spike interval histograms, as we first described in the starling (Manley 1979, Manley and Gleich 1984, Manley et al. 1985) and subsequently observed in the chicken (Manley et al. 1991), barn owl (Fig. 2.5.1; Köppl 1997a) and the emu (Manley et al. 1997). The reciprocals of these preferred intervals are close to, but not necessarily identical with, the fibre’s CF.
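The expectation against which preferred intervals are judged can be sketched in a few lines of Python. This is an illustration only, not the analysis used in the cited studies: the function names, the rate of 100 per second, the 1 ms dead time and the 0.5 ms bin width are all assumptions chosen for the example. A Poisson process with an absolute dead time yields an interval histogram that is empty below the dead time and decays smoothly thereafter, without the peaks and valleys described above.

```python
import random


def poisson_intervals(rate_hz, dead_time_s, n, seed=0):
    """Draw n inter-spike intervals: an absolute dead time plus an
    exponentially distributed waiting time. rate_hz is the rate of the
    exponential part only (hypothetical parameters)."""
    rng = random.Random(seed)
    return [dead_time_s + rng.expovariate(rate_hz) for _ in range(n)]


def interval_histogram(intervals_s, bin_width_s, n_bins):
    """Count intervals per bin; intervals beyond the last bin are ignored."""
    counts = [0] * n_bins
    for t in intervals_s:
        k = int(t / bin_width_s)
        if k < n_bins:
            counts[k] += 1
    return counts


# Illustrative values only, not taken from the recordings described above.
intervals = poisson_intervals(rate_hz=100.0, dead_time_s=0.001, n=20000)
hist = interval_histogram(intervals, bin_width_s=0.0005, n_bins=40)
mean_interval = sum(intervals) / len(intervals)
```

A measured histogram with preferred intervals would show regularly spaced excess counts superimposed on this smooth decay, with a spacing close to the reciprocal of the fibre's CF.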
Fig. 2.5.1: Time interval histograms from barn owl auditory-nerve fibres, with (bottom) and without (top) preferred intervals. Both of these nerve fibres had a high characteristic frequency, and the preferred intervals are only visible when the time axis is expanded (lower inset).
We have suggested that some of these preferred intervals may reflect spontaneous oscillations of the hair-cell potential related to an electrical hair-cell tuning mechanism (Manley 1979, Manley and Gleich 1984, Manley et al. 1985). Inadvertent stimulation, e.g. body noise, might explain preferred intervals that are only found in the most sensitive units (Klinke et al. 1994), but the thresholds of cells with spontaneous preferred intervals vary greatly and we found preferred intervals in auditory-nerve fibres of the starling with thresholds as high as 80 dB SPL. In the barn owl also, Köppl (1997a) found no relationship between fibre threshold and the presence or absence of preferred intervals.
2.5.2 Responses to simple tonal stimuli
The most common response pattern of avian primary afferent fibres to sound is an increase of the discharge rate (see also Fig. 2.5.3a). Decreases in discharge rate (primary suppression) can, however, occur in response to single-tone stimuli of frequencies outside the excitatory response area. Maximum discharge rates in birds are on average higher than those of mammals for fibres at the same characteristic frequency. We studied rate-level (I/O) functions in the chicken, starling and especially in the emu (Manley et al. 1985, 1991, Köppl et al. 1997). In chickens, the mean slopes of the I/O functions almost doubled between post-hatching days 2 and 21; the maximum discharge rates rose by about 40 % and the dynamic range in dB fell by 20 % (Manley et al. 1991). These changes occurred although most other features of the activity of the auditory-nerve fibres (e.g. the characteristics of the frequency-tuning curves) did not change over the same developmental time period. In higher-CF fibres from the emu (i.e. out of the phase-locking range), we found that the rate-level functions were almost invariably of the sloping-saturation type (Köppl et al. 1997), which can be interpreted to indicate that a compressive nonlinearity is setting in above a certain sound-pressure level, called the breakpoint. The breakpoints in a large sample of fibres correlated very tightly with the fibre’s sensitivity (Fig. 2.5.2b; Yates et al. in press). We interpreted this as indicating that the nonlinearity associated with the cochlear amplifier in the emu is local in nature and not global as in mammals (Köppl et al. 1997). This suggests that the mechanical output of the cochlear amplifier in birds is not fed into the motion of the basilar membrane on a large scale, but retained within the local papillar-tectorial complex.
Fig. 2.5.2: A: Rate-intensity functions for a primary auditory-nerve fibre in the emu. The point labelled A3 is the breakpoint on the curve at the characteristic frequency. B: Correlation between the breakpoint and the sound-pressure level at which different fibres reach half their maximum firing rate. The parameter is fibres grouped according to their characteristic frequency into six frequency ranges. In all ranges, these two parameters are closely correlated. In mammals, all fibres with the same characteristic frequency and in any one individual share the same breakpoint SPL.
2.5.3 Frequency selectivity
Avian auditory-nerve fibres generally respond to only a restricted frequency range with a modulation of their discharge rate, i.e. they are frequency selective. Tuning curves were generally obtained from response matrices in the frequency and intensity domain using a discharge threshold criterion just above the spontaneous rate. From these contours, the characteristic frequency (CF, at the sensitive tip of the tuning curve), the threshold at CF and various measures of frequency selectivity (Q10dB, high- and low-frequency slopes) of the respective neuron can be defined. In addition, areas where neural responses are suppressed below spontaneous rate (i.e. areas of primary suppression) can also be delimited. The examples shown in Figure 2.5.3 illustrate that avian excitatory tuning curves are generally V-shaped and – on a logarithmic frequency scale – roughly symmetrical around the CF (Manley et al. 1985). We quantified several measures related to tuning curves in the starling (Manley et al. 1985, Klump and Gleich 1991, Gleich 1994), the chicken (Manley et al. 1991), the emu (Manley et al. 1997) and the barn owl (Köppl 1997a). In general, the tuning-curve symmetry changes gradually across the hearing range. Low-CF cells tend to have steeper low-frequency flanks than high-frequency flanks, and high-CF cells show the reverse behaviour (Manley et al. 1985, 1991). In the emu, this transition occurs at quite low frequencies, so that the tuning curves show a more pronounced asymmetry similar to that of mammals (Manley et al. 1996). In addition, there is an overall trend in the shape of avian high-CF tuning curves, such that in primitive species like the emu tuning curves show low-frequency tails similar to
Fig. 2.5.3: Sample data from physiological studies of avian primary-auditory afferent fibres, in this case data from the emu. A: Threshold tuning curves for single fibres over a range of characteristic frequencies. B: The shape of the threshold tuning curves described by the relationship between the high-frequency and the low-frequency flank slopes, measured between 3 and 23 dB above the CF threshold. C and D: The frequency-selectivity coefficients Q10dB and Q40dB, respectively, as a function of the characteristic frequency.
those found in mammals, whereas in advanced species such as the barn owl (Köppl 1997a), no flattening on either the low- or high-frequency tuning-curve flanks is evident. The range of best response thresholds at a given CF at lower frequencies in birds can exceed 50 dB (Manley et al. 1985, 1991, 1996). It is important to emphasize that this is not an indication of poor physiological condition, as high- and low-threshold units of the same CF are often encountered successively in a single recording track (Manley et al. 1985, 1997). In the barn owl, the range of thresholds at higher CFs is reduced to maximally 30 dB (Köppl 1997a). The most sensitive thresholds at a given CF are related to the respective threshold of the behavioural audiogram (Manley et al. 1985, Gleich 1994). In most bird species, the lowest thresholds are found between 1 and 3 kHz.
The frequency selectivity of an afferent fibre is quantified using the frequency bandwidth of the tuning curve; narrow curves indicate a high selectivity. The tuning-curve bandwidth is usually determined 10 dB and/or 40 dB above the CF threshold, and a quality factor (Q10dB or Q40dB, Fig. 2.5.3) is calculated as CF divided by the width at x dB above the CF threshold. This factor characterizes the frequency selectivity of the unit independent of its CF. Despite a considerable variability of Q10dB at any given CF, the average frequency selectivity increases with the CF (Köppl 1997a, Manley et al. 1985, 1991, 1996). When comparing afferent fibres in the same frequency range, the frequency selectivity of avian auditory fibres tends to be higher at the equivalent CF than in mammals (Manley et al. 1985, Manley 1990). This difference is even more obvious when comparing the selectivity 40 dB above threshold and reflects the fact that low-frequency tails, which are so typical for mammalian tuning curves, are not normally found in birds. Even in the emu, where such ‘tails’ are regularly observed in higher-CF fibres, the frequency selectivity still remains higher than in mammals (Manley et al. 1997).
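The quality factor defined above (CF divided by the tuning-curve width at x dB above the CF threshold) can be written out as a short Python sketch. The function names and the five-point tuning curve are hypothetical, and the linear interpolation between sampled frequencies is an assumption of this example rather than the procedure of the cited studies.

```python
def crossing(f_a, t_a, f_b, t_b, level_db):
    """Frequency at which a straight line between two tuning-curve
    points (f_a, t_a) and (f_b, t_b) reaches level_db."""
    return f_a + (level_db - t_a) * (f_b - f_a) / (t_b - t_a)


def q_factor(freqs_hz, thresholds_db, x_db=10.0):
    """Return CF / bandwidth, with the bandwidth measured x_db above the
    threshold at CF. Frequencies must be in ascending order."""
    i_cf = min(range(len(freqs_hz)), key=lambda i: thresholds_db[i])
    cf = freqs_hz[i_cf]
    level = thresholds_db[i_cf] + x_db

    low = None  # walk down the low-frequency flank
    for i in range(i_cf, 0, -1):
        if thresholds_db[i - 1] >= level:
            low = crossing(freqs_hz[i], thresholds_db[i],
                           freqs_hz[i - 1], thresholds_db[i - 1], level)
            break

    high = None  # walk up the high-frequency flank
    for i in range(i_cf, len(freqs_hz) - 1):
        if thresholds_db[i + 1] >= level:
            high = crossing(freqs_hz[i], thresholds_db[i],
                            freqs_hz[i + 1], thresholds_db[i + 1], level)
            break

    if low is None or high is None:
        raise ValueError("criterion level not reached on both flanks")
    return cf / (high - low)


# Hypothetical V-shaped tuning curve: CF = 2 kHz, threshold 20 dB SPL.
freqs = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
thresholds = [60.0, 40.0, 20.0, 40.0, 60.0]
q10 = q_factor(freqs, thresholds, x_db=10.0)
q40 = q_factor(freqs, thresholds, x_db=40.0)
```

For this symmetrical example the 10-dB bandwidth is 500 Hz and the 40-dB bandwidth is 2000 Hz, so Q10dB is 4 and Q40dB is 1. A curve with a flat low-frequency tail would widen the 40-dB bandwidth and lower Q40dB, which is why Q40dB separates birds from mammals more clearly than Q10dB.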
2.5.4 Excitation patterns on the starling hearing organ
As an alternative to studying the response characteristics of single fibres to different stimuli, we also reconstructed the pattern of excitation along the auditory papilla of the starling from the responses of a large sample of cells to single test stimuli (Gleich 1994). These patterns can be compared to the many psychophysical data available for the starling. The symmetry of the excitation patterns was independent of the stimulus level and the high-frequency flank of the excitation pattern was steeper than the low-frequency flank. Based on these neural excitation patterns and a critical-band scale derived from the cochlear frequency map (from Gleich 1989), Buus et al. (1995) developed a model of excitation patterns for the starling (see also Sect. 7.1).
2.5.5 Phase locking to tones
Instead of using the rate of the spike discharges, the degree of synchronization of the spikes to the phase of stimulus cycles, known as phase locking, is a further measure of the neuronal response to tonal stimulation. In many auditory-nerve fibres, phase locking begins at sound pressure levels below those eliciting a rate increase. The fibres can thus rearrange their spontaneous discharges in the time domain to follow the cycles of the stimulus tone without raising the discharge rate. In both the starling and the barn owl, the threshold for phase locking was about 10–15 dB below the rate threshold, with the difference diminishing towards the high-frequency limit of phase locking (Gleich and Narins 1988, Köppl 1997c). Phase locking, expressed as vector strength, deteriorates towards higher frequencies, the upper limit of significant phase locking being species-specific. In the starling and the emu, the vector strengths were high up to several hundred Hz, but
decreased steeply for frequencies above 1 to 1.5 kHz, with the limit reached at about 4 kHz (Gleich and Narins 1988, Manley et al. 1997). A prominent exception is the barn owl, where phase locking was reliably present at frequencies up to 9 kHz (Fig. 2.5.4; Köppl 1997c). The physiological basis of this unusual ability in the barn owl remains unknown, but presumably involves specializations of the electrical properties of both the high-frequency hair cells and their afferent fibres (Köppl 1997c). It is well established, however, that phase locking in auditory-nerve fibres is the basis for the computation of interaural time differences in the auditory brainstem and midbrain of the barn owl (e.g. Carr and Konishi 1990).
In the response of each cell, the mean response phase relative to the phase of the acoustic stimulus changes systematically with both frequency and level. With rising level, the mean response phase typically advances at frequencies below the neuron’s CF, and lags at frequencies above the CF (Gleich and Narins 1988, Köppl 1997c). Near the CF, however, the mean response phase is almost independent of sound pressure level, potentially providing a level-independent cue for stimulus timing. At any given sound-pressure level, the mean response phase of a neuron increases nearly linearly with stimulus frequency, corresponding to a constant response delay (Gleich and Narins 1988, Köppl 1997). However, in the starling, such phase-versus-frequency functions, when investigated closely, were found to deviate systematically from a straight line, especially at near-threshold levels.
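Vector strength and mean response phase, as used in this section, can be computed from spike times with the standard circular-statistics definition: each spike is treated as a unit vector at its stimulus phase, and the length of the mean vector is the vector strength. The Python sketch below is a minimal illustration; the function name and the synthetic spike trains are assumptions made for the example and do not reproduce any recorded data.

```python
import math


def vector_strength(spike_times_s, freq_hz):
    """Return (vector strength, mean phase in cycles) of spike times
    relative to a tone of frequency freq_hz."""
    n = len(spike_times_s)
    if n == 0:
        raise ValueError("no spikes")
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    strength = math.hypot(c, s) / n
    mean_phase = (math.atan2(s, c) / (2.0 * math.pi)) % 1.0
    return strength, mean_phase


tone_hz = 1000.0
# Synthetic train: one spike per cycle, always a quarter cycle after onset.
locked = [k / tone_hz + 0.00025 for k in range(200)]
# Synthetic train: spikes spread evenly over four phases of the cycle.
spread = [k * 0.00025 for k in range(400)]

vs_locked, phase_locked = vector_strength(locked, tone_hz)
vs_spread, _ = vector_strength(spread, tone_hz)
```

Perfect locking gives a vector strength of 1 and spikes distributed evenly over the cycle give 0. Because the measure depends only on spike timing, it can rise above its chance level at sound pressure levels that do not yet change the discharge rate.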
Fig. 2.5.4: The maximal vector strength in the phase-locked responses achieved by auditory nerve fibres of different characteristic frequencies in six species of birds as a function of the stimulus frequency. Whereas the degree of phase-locking and the cut-off frequency of most species can be compared to that observed in mammals, the phase-locking ability of barn-owl auditory-nerve fibres is quite exceptional (thick line). The data from the emu, starling and barn owl were collected by this collaborative research centre. For other data, see Köppl 1997c.
42
Using an iterative best-fit procedure while varying the resonant frequency and the quality factor of the filter, we showed that the phase-versus-frequency functions could mathematically be separated into a putative constant delay plus the phase response of a simple LRC-type filter (Gleich 1987). This explained well the residual nonlinearity in the phase-versus-frequency functions of the starling, and the resonant frequencies of the putative LRC filter and the acoustic CF of the individual fibres were significantly correlated. The resonant frequencies were, on average, slightly lower than the measured acoustic CF, however, and this difference resembles the difference found in the same species between the equivalent frequency of the preferred intervals in spontaneous activity and the acoustic CF (Gleich 1987).
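The separation into a constant delay plus a filter phase can be illustrated with a small Python sketch. It is not the procedure of Gleich (1987), whose details are not given here. As assumptions of this example, the filter is taken to be a second-order band-pass resonance with phase lag arctan(Q(f/f0 - f0/f)), the phase is expressed as lag in cycles with no additional phase offset, and the best fit is found by a plain grid search over hypothetical candidate values of f0 and Q.

```python
import math


def lrc_phase_lag_cycles(f_hz, f0_hz, q):
    """Phase lag in cycles of an assumed band-pass LRC resonance."""
    return math.atan(q * (f_hz / f0_hz - f0_hz / f_hz)) / (2.0 * math.pi)


def fit_delay_plus_lrc(freqs_hz, lag_cycles, f0_grid, q_grid):
    """Grid search over f0 and Q. For each pair, the filter phase is
    removed and the delay is the least-squares slope through the origin
    of the remaining phase against frequency."""
    best = None
    sum_ff = sum(f * f for f in freqs_hz)
    for f0 in f0_grid:
        for q in q_grid:
            resid = [p - lrc_phase_lag_cycles(f, f0, q)
                     for f, p in zip(freqs_hz, lag_cycles)]
            delay = sum(f * r for f, r in zip(freqs_hz, resid)) / sum_ff
            err = sum((r - delay * f) ** 2
                      for f, r in zip(freqs_hz, resid))
            if best is None or err < best[0]:
                best = (err, delay, f0, q)
    return best[1], best[2], best[3]


# Synthetic phase data from known, purely illustrative parameters.
true_delay_s, true_f0, true_q = 0.002, 1000.0, 3.0
freqs = [400.0 + 100.0 * k for k in range(13)]        # 400 to 1600 Hz
lags = [f * true_delay_s + lrc_phase_lag_cycles(f, true_f0, true_q)
        for f in freqs]

f0_grid = [600.0 + 50.0 * k for k in range(17)]       # 600 to 1400 Hz
q_grid = [1.0 + 0.5 * k for k in range(11)]           # 1 to 6
delay_s, f0_hz, q = fit_delay_plus_lrc(freqs, lags, f0_grid, q_grid)
```

With noise-free synthetic data the search recovers the parameters used to generate them. With real data the resonant frequency obtained in this way could then be compared with the acoustic CF of the same fibre, as described above.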
2.5.6 Birds and mammals – similarities and differences in auditory organs
Although the anatomical division into different hair-cell types in birds is not as clear-cut as in mammals, structural parallels do exist between the tall and short hair cells of birds and the inner and outer hair cells of mammals, respectively (see Manley et al. 1988b, 1989, Manley 1990, for references). Some indications of this are:
● The relative position of the cell groups in the papillae is the same. Neither IHC nor, in general, THC are found over the free basilar membrane; they are situated within the neural side of the papilla, which overlies the superior cartilaginous plate in birds and the spiral lamina in mammals (Smith 1985).
● The hair cells lying on the neural side of the papilla are regarded as being the less specialized in both groups (Chandler 1984, Takasaka and Smith 1971).
● The THC and IHC receive a much stronger afferent innervation than the SHC and OHC. OHC are innervated non-exclusively by a relatively small percentage of the afferent fibres (mammal 5 to 10 %).
● The efferent innervation of both OHC and SHC is markedly stronger than that to THC or that contacting the afferent fibres of IHC, and the synaptic endings are much larger. Also, in both groups, the innervation density of efferents is higher in the basal than in the apical half of the papilla.
● The ontogenetic development of the afferent and efferent innervation follows very similar patterns in birds and mammals. During early development, there is a reduction of the branching of afferent fibres, so that in the adult bird, fibres innervate either THC or SHC, but not both.
● Both SHC and OHC tend to be the most sensitive to noise damage (Cotanche et al. 1987, Liberman and Kiang 1978, Robertson 1982).
● There is a single population of acoustically-active afferent fibres.
2.6 Function of the cochlear efferent system – a comparative approach Alexander Kaiser, Geoffrey A. Manley and Grit Taschenberger
In addition to the afferent fibres, efferent fibres also exist in a variety of sense organs, with the capability of modulating incoming sensory information. Such a modulation is also a primary hypothesis for the role of efferents innervating the hearing organs of mammals. The mammalian efferent system is, however, still poorly understood despite many years of research (see e.g. Guinan 1996). A knowledge of the function of the cochlear efferent system is important to an understanding of signal transduction, pathology and hereditary ailments. This section summarizes the differences and similarities of the avian and mammalian hearing organs in this respect and, based on our recent data from birds, speculates on a previously unrecognized possible function for the cochlear efferent system.
2.6.1 Why are birds valuable for studying hearing-organ efferents?
Birds have pronounced intra-specific acoustic communication with high demands on sound perception. The problems involved in extracting relevant intraspecific acoustic signals from a noisy environment created an evolutionary pressure for detecting signals of low intensity and during self-vocalization. Resulting from these demands, both birds and mammals have independently evolved specialized hair-cell populations (see Sect. 2.5.6). In both groups, one population of hair cells nearly or completely monopolizes the afferent connections and has little or no direct efferent innervation (inner and tall hair cells), whereas a second population has a prominent efferent and little or no afferent innervation (outer and short hair cells). The evolution of specialized hair-cell populations in birds is even more pronounced, as the basal short hair cells totally lack an afferent innervation (Fischer 1994, see Sect. 2.1.6). The efferent innervation of avian tall hair cells is axo-somatic rather than axo-dendritic as on mammalian inner hair cell afferents. Comparing the neuroactive substances found in efferents, birds, in contrast to mammals, have only one single type (acetylcholine), and this might be an advantage for studying efferent physiology (e.g. Code and Carr 1994).
2.6.2 Comparative physiology of the avian efferent system
We have relatively little knowledge of the physiological properties of efferents in birds. The reasons are (1) the comparatively small amplitude of compound action potentials and (2) the small efferent fibre diameter. Recently, we developed an approach that allowed single-cell recordings from putative cochlear efferent cell bodies in the auditory brainstem of the chicken using stereotactic coordinates obtained from tracing experiments (Kaiser and Manley 1994a). These efferent units were distinguished from others (e.g. afferents) by their low spontaneous activity, broad frequency tuning, long response latency, long modes of their time-interval histograms (TIH) of spontaneous activity and by their temporal response characteristics to sound. In addition, the properties of these units were compared to those of ascending neurons recorded from the cochlear nuclei in the same animals. The most striking finding of this study was the presence of two fundamentally different efferent response types. One type responded to sound with an excitation (and a chopper or primary-like pattern to the TIH). Another, novel type with a higher spontaneous activity responded with a complete suppression of discharge activity during tonal stimulation (Fig. 2.6.1). In mammalian data so far, all the single-cell recordings of cochlear efferents were restricted to medial olivocochlear (MOC) efferents innervating OHC and were physiologically similar to the excitatory chopper-like efferent type that we found in birds (see e.g. Robertson 1984, Liberman and Brown 1986). The properties of the mammalian lateral olivocochlear (LOC) system are still unknown, but there is indirect evidence that LOC efferents, which innervate the afferent dendrites of inner hair cells, might have an excitatory effect on the hearing organ (Liberman 1990).
In silence, the suppression type of efferent we described in the chicken could similarly provide a moderate excitatory input that is reduced during a specific sound. Thus these may be the first signs of a second efferent system. However, as long as we do not know the nature of the neurotransmitters and their action on hair cells of these two systems, this remains purely speculative.
2.6.3 Putative efferent influences on otoacoustic emissions in the barn owl
In the sedated barn owl, both SOAE and DPOAE vary spontaneously in amplitude with time (see Sect. 5.2). One possible cause of this instability in SOAE and DPOAE amplitudes could be efferent influences on hair cells, which may vary as a function of the momentary anaesthetic state and/or of the animal’s state of arousal. We therefore examined whether contralateral sound stimulation (noise and pure tones) could – via efferent fibres – influence SOAE and DPOAE in the barn owl (Taschenberger and Manley 1996 and in preparation). Contralateral tones and noise of frequencies above 7 kHz were used, which are attenuated by >50 dB across the
Fig. 2.6.1: A comparison between the discharge pattern displayed by peri-stimulus time histograms from: A: A cell from the afferent pathway in the cochlear nucleus; B: An efferent fibre that shows suppression during tonal stimulation; C: An efferent fibre that shows an onset response to short tone bursts and a chopper response to the long tone burst presented here. In each panel, the duration of the stimulus is shown at the left above the histogram. By courtesy of A. Kaiser.
owl’s interaural canal (Köppl 1997) and thus will not directly stimulate the ear under investigation when moderate sound levels are used. Effects were seen even when using very low levels of contralateral stimulation, and these effects remained after the middle-ear muscles had been severed. The effects were thus not caused by some sort of middle-ear reflex, but were the result of efferent activity across the brainstem. Both reductions and increases in the amplitudes of SOAE and DPOAE were observed as a result of applying contralateral noise (Fig. 2.6.2). Increases in DPOAE level are especially interesting in view of our demonstration (see above) that the activity of some avian auditory efferents is suppressed in the presence of sounds (Kaiser and Manley 1994). If the spontaneous activity of these efferents normally inhibits hair cells (which we do not know), then the suppression of such efferent fibres during tonal stimulation should lead to a disinhibition at the hair-cell level, and thus to DPOAE level increases. In general, contralateral noise was accompanied by an increase in amplitude of small DPOAE but a decrease in amplitude of large DPOAE. In each case, however, it had a pronounced stabilizing effect on the normally observed drift of DPOAE amplitude with time. This may indicate that such drifts were due to changes in the relative levels of activity in the two kinds of efferents, perhaps due to changes in the depth of sedation of the birds. Contralateral stimulation with pure tones, on the other hand, was only associated with inhibitory effects. This inhibitory effect was strongly frequency selective and was tuned to the same contralateral frequency range as the primary frequencies used in generating the ipsilateral DPOAE. This suggests that tonal suppression is mainly mediated by sharply-tuned efferents and that their contralateral projection is tonotopic, with corresponding afferent and efferent cochlear maps (as in mammals).
Fig. 2.6.2: The effect of contralateral noise (filtered to cut off frequencies below 7 kHz) on two SOAE in the barn owl. The two SOAE peaks were larger in the absence of contralateral noise (thicker continuous line) than during contralateral noise with two different attenuations. In addition, the noise shifted the SOAE centre frequencies.
2.6.4 Input to avian cochlear efferents
In several studies, the origin of avian cochlear efferents has been traced to brainstem areas medial to the superior olive and the facial nuclei. They are arranged on both sides of the brainstem into a more dorsally-lying cell group medial to the dorsal nucleus facialis and a more ventrally-lying cell group medial to the ventral nucleus facialis (for references see Kaiser 1993). These areas (the ‘efferent area’) lie within the crossing trapezoid fibre tract that originates from the first binaural nucleus of the time pathway (Nucleus laminaris, NL) and the first nucleus of the intensity pathway (Nucleus angularis, NA), respectively. The dendrites and somata of the cochlear efferents lie within this mixed trapezoid fibre tract, suggesting that synaptic contacts occur (Whitehead and Morest 1981). In order to investigate possible sources of this input, we studied the projections from this brainstem ‘efferent area’ to other auditory structures (Kaiser and Manley 1995). A projection from small neurons in the Nucleus MLd (‘inferior colliculus’) and the Nucleus ICo (intercollicularis) was found. As labelling of these cells was only successful when using high concentrations of tracer, we assume that these neurons have very thin axonal processes. We also found labelled cell bodies in other auditory structures (NA, NL, SO); however, no labelling was detectable in the Nucleus ovoidalis, the avian auditory thalamus. The basic pattern of labelling of these nuclei and fibres projecting to the SO, NLL and MLd is consistent with previous findings reviewed by Carr (1992).
2.6.5 Speculations on the functional significance of cochlear efferents
From our tracing experiments, it is obvious that at least two types of mesencephalic connections to the cochlear efferent area in the brainstem exist. While a direct connection of these cell bodies to efferent neurons has not been demonstrated, they may be regarded as putative candidates for neuronal input. Such a descending input to cochlear efferents from two different mesencephalic nuclei would have important implications for our understanding of the physiological significance of the cochlear efferent system. While the nucleus MLd is known for the processing of ascending auditory information that is relevant to sound localization, the nucleus ICo has a very different physiological relevance; it is a major motor nucleus for vocalization (Seller 1981). Thus, from our study of efferent connectivities one can speculate that the activation of the efferent system is associated with two different behavioural contexts: sound localization and sound production. The latter aspect is reminiscent of the activation – in this case inhibition – of lateral line efferents during motor (swimming) activity in fish (Highstein and Baker 1985). This might be a general mechanism for suppressing self-generated activation of sensory organs. Whether there is in fact a direct mesencephalic-to-efferent connection needs to be analyzed in further studies.
2.7 The lagenar macula and its neural connections to the brainstem Geoffrey A. Manley and Alexander Kaiser
The lagenar macula is an otolithic organ present in all non-mammalian vertebrates and in monotreme mammals. In birds, it is situated at the apical end of the cochlear duct, separated from, but close to, the hair cells of the auditory basilar papilla. It contains two types of hair cells – type I and type II – which are differentially distributed within and outside a striola, and covered by an otolithic membrane. Morphologically, the lagenar macula thus appears to be a vestibular organ. Our physiological study specifically investigated possible auditory responses of lagenar fibres (Fig. 2.7.1; Manley et al. 1991), but produced no evidence for any auditory function.
Fig. 2.7.1: The lagena of the chicken and the innervation patterns of stained nerve fibres (After Manley et al. 1991a).
Due to the close proximity of the lagenar macula and basilar papilla, however, it has proved notoriously difficult to separate their brainstem connections in anatomical studies. An early classical study in the pigeon (Boord and Rasmussen 1963) reported projections of the lagenar macula to the cochlear nuclei in addition to those to the vestibular brainstem nuclei, suggesting a functional connection between the lagenar macula and the auditory system. In an extensive tracer study, using DiI in the chicken, we attempted to resolve some of the open questions regarding the separation of both afferent and efferent connections of the vestibular lagenar macula and auditory basilar papilla (Kaiser and Manley 1996, Kaiser 1997). In addition, HRP-labelled auditory connections to the cochlear nuclei in the barn owl (Köppl 1994) complemented these chicken data.
In birds, both the vestibular and the auditory efferent neurons arise from the brainstem in close proximity to the facial nucleus (Whitehead and Morest 1981). Whereas a dorsal and a ventral cell group could be identified, these cell groups could not clearly be attributed to efferents innervating the vestibular or the auditory hair cells. There was, however, evidence for an exclusive origin of vestibular efferents from the dorsal cell group (Schwarz et al. 1981, Code 1995), but whether the ventral cell group is a mixture of vestibular and auditory efferents or only a source of auditory efferents was still unclear. It was also not known whether those efferents innervate only vestibular or auditory hair cells and whether the lagenar macula has a common efferent innervation with other hair-cell organs. Knowledge of these efferent collateralization patterns, however, is important for understanding efferent function.
2.7.1 Efferent connections
Although there is evidence for an exclusive origin of vestibular efferents from the dorsal efferent cell group, collateralization of individual efferents to several end-organs cannot be ruled out (e.g. Code 1995). In studies using the tracer DiI, we analyzed the peripheral and central neuronal connections of lagenar fibres of the chicken to help resolve the following important questions: (1) Is there a collateralization of efferent fibres? (2) If yes, which end-organs have a common efferent innervation? More specifically, do the auditory and vestibular end-organs have a common efferent innervation? When DiI was applied to the lagenar macula, all other vestibular end-organs typically also showed labelled fibres, and there were labelled bipolar ganglion cells in the vestibular ganglion, in addition to those in the cochleolagenar ganglion (Kaiser 1997). Conversely, if the application site was the utricular macula or an ampullary crista, all other vestibular end-organs, including the lagenar macula, also showed labelled fibres. In these cases, however, labelled cell bodies remained restricted to the vestibular ganglion. Thus, the lagenar macula and all other vestibular organs appear to receive a common, most likely efferent input that is separated from the efferent input to the basilar papilla. Judging from this peripheral separation of
vestibular and auditory efferents, a central separation of their cell bodies seems very likely and supports the hypothesis of a dorsal-ventral differentiation of cell groups in the brainstem. To address this question, we also looked at the location of the labelled efferent neurons of the lagenar macula in the brainstem. All somata of the few labelled efferents were found medial to the Nucleus facialis dorsalis and therefore belong to the dorsal efferent cell group. The fact that this efferent cell group is known to be the source of vestibular efferents (e.g. Code 1995) further confirms the vestibular nature of the lagenar macula.
2.7.2 Afferent connections
In 1963, Boord and Rasmussen published a study on cochlear brainstem projections in the pigeon, reporting, among other things, a purely lagenar projection area to both cochlear nuclei, the Nucleus magnocellularis and Nucleus angularis. However, this particular conclusion was based on a single experiment that had aimed to selectively lesion the lagenar fibres; according to the authors themselves, this goal was not perfectly achieved. In our hands, the selective application of the tracer DiI to the lagenar macula in the chicken did not reveal any projections to the cochlear nuclei (Kaiser and Manley 1994b, 1996). Labelled terminals in the cochlear nuclei were only found in cases where the apical part of the basilar papilla also showed tracer uptake. Thus we concluded that there is no processing of vestibular information from the lagenar macula in the cochlear nucleus of the chicken. Therefore, the ventrolateral part of the N. magnocellularis and the ventral part of the N. angularis should no longer be referred to as the ‘lagenar part’ of the cochlear nuclei. One major reason why this designation had been accepted so readily is that these areas also show neuronal morphologies which differ from those in the larger, main parts of the cochlear nuclei (e.g. Köppl and Carr 1997). In the N. magnocellularis of the barn owl, the caudolateral part has now clearly been shown to form the low-frequency end of the auditory tonotopic gradient (Köppl 1994, Köppl and Carr 1997) and unpublished observations indicate the same for the N. angularis. There was no region in either nucleus which did not receive auditory afferents. This nicely corroborates the results from the tracing experiments in the chicken, showing that there are no vestibular subregions in the avian cochlear nuclei.
2.8 Anatomy of the cochlea and physiology of auditory afferents in lizards Christine Köppl and Geoffrey A. Manley
During the course of the collaborative research centre 204, two projects on the anatomy and physiology of lizard basilar papillae were carried out partly in cooperation with another group. One of the underlying objectives was to attempt to model the frequency-response characteristics of the lizard papillae based on micromechanical resonance characteristics calculated from anatomical parameters that we had measured in morphological studies. These models would then be tested against the results of our neurophysiological experiments. Initially, we collaborated with the Auditory Laboratory at the University of Western Australia, working with an Australian skink species, the bobtail (Tiliqua rugosa, also known as Trachydosaurus rugosus). In the course of our collaboration, the bobtail skink became one of the few lizard species whose peripheral hearing system is very well known – in terms of its anatomy (Köppl 1988), of the neurophysiology of the primary auditory afferent fibres (Köppl and Manley 1990a,b, Köppl et al. 1990, Manley et al. 1990a, b), of the otoacoustic emissions (Köppl and Manley 1993a,b, 1994, Manley and Köppl 1992, 1994, Manley et al. 1990a, 1993a, see also Sect. 5.1) and in terms of papillar modelling (Manley et al. 1988, 1989). Later, we also studied the tonotopic organization of the papilla of the Tokay gecko, since our modelling work, based on our anatomical findings, had suggested a very unusual pattern of tonotopicity. Recent physiological work with the gecko was carried out in collaboration with Dr. Mike Sneary of San José University.
2.8.1 Basilar-papilla morphology in the bobtail lizard
The Australian bobtail lizard Tiliqua rugosa is a member of the family Scincidae, whose members have an elongated basilar papilla (Miller 1980). The bobtail papilla is 2.1 mm long, contains over 1900 hair cells and is subdivided into two unequally-sized segments along its length: a smaller apical segment, with about 280 hair cells covered by a massive tectorial ‘culmen’, and a much longer basal segment (Fig. 2.8.1; Köppl 1988). The hair bundles of the apical hair cells contain 30–55 stereovilli, with no change in stereovillar number or height along the apical segment. A gradient across the papilla does, however, exist (neural = 49.2, abneural = 41.8). In the basal segment, the stereovilli are most numerous (41) and tallest (14 µm) apically, and least numerous (30) and shortest (5.5 µm) towards the basal end. Hair cells showing both orientation polarities coexist throughout the papilla. All along the basal segment of the papilla, there are about equal populations of abneurally- and neurally-oriented hair cells. This orientation pattern has been called
opposingly-bidirectional (Miller and Beck 1988). The pattern within the apical segment of the papilla is more complex: the orientation changes twice across the papilla, the kinocilia first facing each other (opposingly-bidirectional) and then pointing away from each other (Fig. 2.8.1; divergently-bidirectional, Miller and Beck 1988). This situation differs from the typical lizard pattern, in which the equivalent area is usually unidirectionally oriented (Miller 1992, Manley 1990). The tectorial structure overlying the basal segment consists of two series of about 70–90 units (= sallets) each, reaching to the edges of the hair-cell area from a central, ropelike interconnection. Since the distribution of hair cells within the basal segment is not uniform, an individual sallet connects together a varying number of hair cells. The mass resting on the hair cells is smallest at the basal end of the papilla and increases gradually towards the middle of the basal segment (Köppl and Authier 1995, Fig. 2.8.1).
Fig. 2.8.1: The anatomy of the basilar papilla of the bobtail skink Tiliqua rugosa. In the top photograph is shown a scanning electron microscope view of the entire papilla, as seen from Scala media. The apical, low-frequency end is to the left, the hair cells there are covered by a large culmen. The long high-frequency hair-cell region is covered by sallets. The centre panel shows a scanning EM photograph of part of the salletal region. The tectorial sallets have shrunk during the preparation and are pulled back from the edges of the hair-cell field. At the bottom is a drawing of the hair-cell orientation patterns to be found in the two regions, as indicated by the arrowheads. Across the apical region (left), the hair-cell bundle orientation changes twice, the rest of the papilla is simply bidirectional.
2.8.2 Characteristics of primary auditory afferent fibres in the bobtail lizard
The activity patterns of primary auditory nerve fibres were studied in the bobtail lizard mostly by recording from the eighth nerve as it enters the brain cavity. Primary auditory-nerve fibres have asymmetrical, V-shaped frequency-threshold tuning curves whose characteristic frequencies (CF) range from 0.2 to 4.5 kHz. Fibres with CFs below 0.85 kHz have simple U-shaped tuning curves; higher-CF fibres have tuning curves with obvious sharp tips around CF of up to 46 dB in depth (Fig. 2.8.2). Low-CF and high-CF groups of fibres also differ in several other parameters of the tuning curves, such as in the selectivity coefficients Q10dB and Q40dB and in the course of the tuning curve flanks. We labelled physiologically-characterized primary auditory neurons and traced them to their innervation sites within the basilar papilla. All stained neurons branched within the basilar papilla to innervate between 4 and 14 hair cells. The branching patterns of fibres innervating the apical and basal papillar segments, respectively, show characteristic differences. Apical fibres tend to innervate hair cells with the same morphological polarity and often branch extensively along the segment. Basal fibres, in contrast, typically innervate about equal numbers of hair cells of opposing polarity and are more restricted in their longitudinal branching. The distribution of stained fibre terminals shows that low frequencies (up to a CF of about 0.8 kHz) are processed in the smaller apical segment of the papilla and that medium to high frequencies are arranged systematically along the much longer basal segment; the CF increases towards the basal end. The tonotopic organization of the basal segment is well described by an exponential relationship (Fig. 2.8.2). The apical segment of the papilla shows an unusual tonotopic organization in that the CF appears to increase across the epithelium, from abneural to neural.
A tonotopicity in this direction has not yet been demonstrated in other vertebrates. An irregular spontaneous activity was found in 70 % of primary afferent fibres in the bobtail, with rates between 0.1 and 123.7 spikes/s. About one third of all fibres show a more prominent mode in their inter-spike interval histogram than is to be expected from a quasi-Poisson distribution. Primary fibres mostly responded to sound with a discharge rate increase. Fibres of low characteristic frequency (CF up to 0.65 kHz) show a characteristic change in their response pattern with stimulation frequency; it is suggested that primary suppression plays an important role in shaping the very phasic response to tones at the fibres’ upper frequency range. The response patterns of higher-CF fibres (CF 0.55–4 kHz) are independent of stimulation frequency. About one third of them show a primary-like discharge pattern. The majority, however, respond with a chopper-like discharge pattern and there is evidence that this discharge pattern is due to temporal summation. Primary auditory fibres in the bobtail lizard phase-lock up to a maximal frequency of 1.0 to 1.3 kHz at 30 °C body temperature. The vector strength of the phase histograms falls more rapidly with increasing frequency in fibres of the high-CF group than in those of the low-CF group. The corner frequency of the low-CF group is
0.73 kHz, that of the high-CF group, 0.51 kHz. It is suggested that the membrane time constants of high-CF fibres are longer than those of low-CF fibres.
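The phase-locking measures used above can be illustrated with a minimal sketch. Vector strength is computed in the standard way, as the length of the mean resultant of the spike phases. The conversion from corner frequency to a time constant assumes that the roll-off behaves like a first-order low-pass filter; this is an assumption made for illustration only and is not stated in the original analysis. The function names are hypothetical.

```python
import math


def vector_strength(spike_times_s, stimulus_freq_hz):
    """Vector strength (0 = no phase locking, 1 = perfect locking)."""
    if not spike_times_s:
        raise ValueError("need at least one spike")
    # Phase of each spike within the stimulus cycle, in radians.
    phases = [2.0 * math.pi * stimulus_freq_hz * t for t in spike_times_s]
    sum_cos = sum(math.cos(p) for p in phases)
    sum_sin = sum(math.sin(p) for p in phases)
    # Length of the mean resultant vector of all spike phases.
    return math.hypot(sum_cos, sum_sin) / len(phases)


def time_constant_from_corner(corner_freq_hz):
    """Time constant in seconds, ASSUMING a first-order low-pass roll-off."""
    return 1.0 / (2.0 * math.pi * corner_freq_hz)


# Corner frequencies quoted in the text: 0.73 kHz (low-CF), 0.51 kHz (high-CF).
tau_low_cf = time_constant_from_corner(730.0)
tau_high_cf = time_constant_from_corner(510.0)
print(f"low-CF group:  {tau_low_cf * 1e3:.3f} ms")
print(f"high-CF group: {tau_high_cf * 1e3:.3f} ms")
```

Under this simplifying assumption, the lower corner frequency of the high-CF group corresponds to the longer time constant, in line with the interpretation given in the text.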
Fig. 2.8.2: Top: Features of the threshold tuning curves of the bobtail skink Tiliqua rugosa. Shown are five low-frequency and six high-frequency tuning curves from one individual animal. Bottom: Tonotopic organization of the high-frequency area of the basilar papilla of Tiliqua as calculated from morphological parameters (‘model’), and as measured using stained and characterized single nerve fibres (‘data’). The bar at the bottom represents the morphology of the papilla. The white area on the left is the low-frequency hair-cell area, the checkerboard area represents the high-frequency hair-cell area covered by salletal tectorial structures.
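The selectivity coefficients Q10dB and Q40dB mentioned in Section 2.8.2 are conventionally defined as the characteristic frequency divided by the bandwidth of the tuning curve 10 dB or 40 dB above the threshold at CF. The following sketch computes them from a sampled tuning curve using linear interpolation between points. The function name and the sample curve are hypothetical and are not data from the study.

```python
def q_value(freqs_khz, thresholds_db, level_above_tip_db=10.0):
    """Return CF / bandwidth, measured level_above_tip_db above the tip."""
    # The tip of the tuning curve is the point with the lowest threshold.
    tip_index = min(range(len(thresholds_db)), key=thresholds_db.__getitem__)
    cf = freqs_khz[tip_index]
    criterion = thresholds_db[tip_index] + level_above_tip_db

    def crossing(indices):
        # Walk away from the tip until the curve reaches the criterion
        # level, then interpolate linearly between the two samples.
        prev = tip_index
        for i in indices:
            if thresholds_db[i] >= criterion:
                f0, f1 = freqs_khz[prev], freqs_khz[i]
                t0, t1 = thresholds_db[prev], thresholds_db[i]
                return f0 + (criterion - t0) * (f1 - f0) / (t1 - t0)
            prev = i
        raise ValueError("tuning curve does not reach the criterion level")

    low_edge = crossing(range(tip_index - 1, -1, -1))
    high_edge = crossing(range(tip_index + 1, len(freqs_khz)))
    return cf / (high_edge - low_edge)


# Hypothetical symmetric tuning curve with its tip at 2 kHz.
freqs = [1.0, 1.5, 2.0, 2.5, 3.0]
thresholds = [60.0, 40.0, 20.0, 40.0, 60.0]
print(q_value(freqs, thresholds, 10.0))  # Q10dB
print(q_value(freqs, thresholds, 40.0))  # Q40dB
```

A sharply tipped curve yields a high Q10dB but a much lower Q40dB, which is one way of expressing the difference between the sensitive tip and the broad flanks described for high-CF fibres.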
2.8.3 A model of frequency tuning and tonotopic organization in the bobtail lizard
In the bobtail lizard, we also measured the displacement amplitudes of the basilar membrane (BM) at different locations in response to a broad range of frequencies. BM tuning is very broad, not place-dependent and similar to that of the middle ear. Sharply tuned, sensitive regions of the BM were never found. As noted above, however, the primary auditory-nerve fibres in the bobtail lizard are sharply tuned and their centre frequency reflects the place in the papilla that they innervate, i.e. they are highly place-dependent. Afferent fibres recorded from positions directly adjacent to the papilla in the same animals in which the BM measurements were carried out showed sensitive, sharp tuning that was strictly tonotopically organized. Over the accessible basal two-thirds of the basal segment of the papilla, the characteristic frequencies (CF) of nerve fibres varied systematically from 1.4 to 4.2 kHz. Each neural tuning curve shows a broad tuning at high SPL that is very similar for different fibres. We attribute this broad, insensitive region to the basilar-membrane tuning. Near their respective CF, the neural curves are more sharply tuned, relatively deep and sensitive. This ‘tip’ region seems to be superimposed on the broad, insensitive tuning region and has a different origin (Fig. 2.8.2). We suggested that this second component was the result of a mechanical resonance of local groups of hair cells and the attached portion of tectorial membrane (Manley et al. 1988a). We modelled this tuning as a set of simple resonant high-pass filters consisting of a group of hair cells coupled through their tectorial sallet and connected in series with the broad tuning of the basilar membrane. The tectorial sallets couple local groups of hair-cell stereovillar bundles and play a critical role in determining the characteristics of local mechanical resonant groups.
We assumed that the local resonance frequency depends only on the mass of the sallet and the stiffness of all the stereovillar bundles connected to it. The calculated mass of the sallets varies nonlinearly along the papilla. The stiffness along the papilla, depending on the number of hair cells connected to each sallet, on the number of stereovilli in each bundle and on the height of the bundle, is almost a mirror-image of the change in salletal mass; in the model, these inverse trends work together to produce a smooth gradient in resonance frequencies (Fig. 2.8.2, 2.8.3). The calculated resonance frequency for each hair-cell/salletal unit is shown in Figure 2.8.2, and ranged from 1.08 kHz apically to 6.8 kHz basally. The actual CF range known from our neurophysiological experiments is from 0.85 to 4.5 kHz. Considering the uncertainty in the necessary assumptions and the small number of variables taken into account in the calculation, the agreement between the neural data and the calculated resonance frequencies is extremely good. We thus propose that the sharp tuning of the CF-region in high-frequency hair cells in Tiliqua is due to the mechanical resonance of local hair-cell groups and their sallet, and that this is superimposed on the broad basilar-membrane tuning. We assume that the hair-cell coupling is optimized by the sub-divided, salletal structure of the tectorial membrane of this basal region. This is also consistent with the assumption in our model of a degree of coupling of adjacent sallets to produce the characteristic tips of the tuning curves.
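The dependence of local resonance frequency on salletal mass and bundle stiffness can be sketched with the textbook expression for a simple mass-spring resonator, f = (1/2π)·sqrt(k/m). This is a minimal illustration and not the published model. It assumes that the total stiffness is the sum of identical per-bundle stiffnesses, and the function name and parameters are hypothetical.

```python
import math


def resonance_frequency_hz(sallet_mass_kg, bundle_stiffness_n_per_m, n_hair_cells):
    """Resonance frequency of one sallet coupled to its hair-cell bundles.

    ASSUMPTION: all bundles under a sallet act as parallel springs of
    equal stiffness, so the total stiffness is their sum.
    """
    total_stiffness = bundle_stiffness_n_per_m * n_hair_cells
    return math.sqrt(total_stiffness / sallet_mass_kg) / (2.0 * math.pi)


# Illustration with arbitrary numbers: a four-fold heavier sallet halves
# the resonance frequency, and a four-fold stiffer unit doubles it.
base = resonance_frequency_hz(1.0e-12, 1.0e-3, 10)
heavier = resonance_frequency_hz(4.0e-12, 1.0e-3, 10)
stiffer = resonance_frequency_hz(1.0e-12, 4.0e-3, 10)
print(heavier / base, stiffer / base)
```

Because frequency scales with the square root of stiffness over mass, opposing gradients of mass and stiffness along the papilla reinforce each other, which is the qualitative point made in the text.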
Fig. 2.8.3: Specific features of the fine morphology of the basilar papilla of the bobtail skink Tiliqua rugosa that were used in the calculation of resonance frequencies. These are: (top left) the number and height of the stereovilli, (top right) the salletal area and number of hair cells per sallet and (bottom left) the stiffness of the stereovilli and the mass of the sallets. On the bottom right, we compare the frequency maps (tonotopic organization) calculated from the morphological data (dashed line) with data points obtained by staining single nerve fibres (diamonds) and a linear fit to these neural data (continuous line).
2.8.4 Basilar-papilla morphology in the Tokay gecko
The basilar papilla of the Tokay gecko (Gekko gecko) has an interesting mixture of structural features, partly similar to the bobtail lizard’s and partly very different. Our anatomical study (Köppl and Authier 1994) aimed to quantify structural parameters and provide the anatomical basis for a model of mechanical frequency tuning (Authier and Manley 1994) in a similar manner to that of the bobtail lizard. The basilar papilla of Gekko gecko is a large, highly-organized structure in which three areas of hair cells can be distinguished. The basal third of the papilla contains only abneurally oriented hair cells that are covered by a tectorial meshwork. Physiologically, this area corresponds to the apical area of the bobtail papilla – and, indeed, of all lizards that are not geckos or their relatives. The apical two-thirds of the gecko papilla are subdivided into two populations of hair cells running parallel to each other along the length of the papilla. These two populations are separated into a neural (preaxial) and abneural (postaxial) group that
are both bidirectionally oriented, i.e. the apical two-thirds of the papilla area are ‘doubly bidirectionally oriented’. The basilar papilla of Gekko gecko is 1.92 mm long and contains about 2200 hair cells, 112 in the basal population, 1280 in the preaxial and 810–820 in the postaxial population (Fig. 2.8.4). Cells of both orientations were equally numerous in the preaxial hair-cell area of the apical two-thirds of the papilla. Cell density is higher than in the basal area and, as in the basal population, there was no obvious organization of hair cells into rows or columns. The number of stereovilli varies from 33 to 45, the bundle height between 12.3 µm and 4.5 µm. The preaxial area was covered by a continuous tectorial structure connected to the neural limbus, whose cross-sectional area increased linearly from its basal to its apical end. The postaxial hair-cell population has a chain of tectorial sallets, as in the basal hair-cell area of the bobtail lizard. The hair cells are found in strictly-organized transverse columns with 2 (basally) to 7 cells per column (apically). One tectorial sallet covers one transverse hair-cell column, and there are 170 sallets. The hair cells in the postaxial area were much less densely packed than in the preaxial region. The number of stereovilli per hair cell rose linearly from 32 basally to 48 apically, while bundle height fell from 16 µm to 4.5 µm.
2.8.5 A model of frequency tuning and tonotopic organization in the Tokay gecko
Using the quantitative details of the anatomy of the auditory papilla in the Tokay gecko, we constructed a quantitative model predicting the tonotopic organization of two of the three papillar areas. Assuming that hair-cell bundle stiffness is similar to that of other species, a model of resonance frequencies for the apical areas of the papilla was constructed, taking into account factors such as the number of hair cells per resonant unit, their bundle dimensions, the volume of the tectorial mass, etc. Since the pre- and postaxial areas were covered by independent tectorial systems and separated from each other by a hair-cell-free hiatus, we assumed that these two areas responded to sound independently of each other. The model predicted that the apical pre- and postaxial areas, although anatomically adjacent, respond to different frequency ranges, a phenomenon not yet reported from any vertebrate. Together, the model predicted that these areas respond best to frequencies between 1.1 and 5.3 kHz, very close to the range found physiologically (Eatock et al. 1981, 0.8 to 5 kHz) for the high-frequency range for this species (Fig. 2.8.4). Our recent, unpublished physiological experiments, tracing responses to specific papillar nerve fibres, do not, however, confirm these interesting predictions of the model. Instead, it appears as if Gekko has a simple, exponential tonotopic organization along the high-frequency area of its papilla (Fig. 2.8.4). The assumption of independent frequency tuning in the pre- and postaxial areas of the papilla thus appears to be untenable. Independently of the tonotopic prediction, the model indicates that, compared to free-standing hair-cell bundles, the semi-isolated tectorial structures called sallets not only lower the range of characteristic frequencies but
also increase the frequency selectivity of the attached hair cells. This may be the basis for the fact that frequency selectivity in Gekko is extremely high, especially when compared to lizards having free-standing hair-cell bundles, such as Gerrhonotus (see Manley 1990 for a review).
Fig. 2.8.4: Top: Tonotopicity of the basilar papilla of Gekko gecko. The dotted line is the calculated range of frequency responses in the preaxial area, the dashed line the calculated frequencies of the salletal area. The continuous line is a linear regression (see formula) on the neural data points for frequencies above 1 kHz. The data points shown as open squares are derived from single afferent nerve-fibre stains to the papilla. At the bottom is a schematic diagram showing the outline of the papilla, and the three hair-cell areas in Gekko.
3 Cochlear frequency maps and their specializations in mammals and birds
The first important steps in spectral analysis of complex acoustic signals are performed within the hearing organ. The hearing organ of amniotes processes spectral components of sounds in a place-specific manner: in mammals, high frequencies are represented at the base of the cochlear spiral and lower frequencies at progressively more apical positions (Bekesy 1960). Frequency analysis depends on basal–apical gradients in mechanical properties of the vibrating structures such as the basilar membrane (BM), tectorial membrane (TM) and hair cells. The interpretations of many physiological and psychoacoustical data within species and across species with widely different hearing range and hearing sensitivity critically depend on the availability of accurate cochlear-frequency maps that relate physiological data with anatomical findings. The most accurate cochlear frequency maps have been obtained by injecting single auditory nerve fibres of known best frequency with a marker, mostly horseradish peroxidase (HRP), and tracing their point of origin in the cochlea (e.g. Liberman 1982). Less precise but advantageous because of the higher success rate of labelling auditory nerve fibres, is the use of small extracellular HRP-injections into physiologically characterized regions of the cochlear nucleus, the central termination site of auditory nerve fibres. This technique was first employed in studies of the cochlear frequency map of Doppler-shift-compensating bats (Vater et al. 1985, Kössl and Vater 1985b) and then widely used in other species (e.g. Müller 1991, Müller et al. 1992, Köppl et al. 1993). Hearing at frequencies above 12 kHz is, among higher vertebrates, a distinctly mammalian achievement and specifically exploited in echolocating bats (see Ch. 1). In microchiropteran bats, the low cut-off frequency of sensitive hearing (30 dB SPL) typically ranges between 5–8 kHz, but the upper frequency limit can be as high as 180 kHz. 
In most bats, regions of best hearing coincide with the dominant spectral components of the species’ characteristic ultrasonic echolocation signals (review in Neuweiler 1990). In addition to general cochlear adaptations for processing ultrasound, certain cochlear specializations correlate with the use of specific types of sonar. In species that use short broadband frequency modulated (FM) calls, cochlear tuning is within the standard mammalian range. However, in horseshoe bats and mustached bats that employ narrow-band, Doppler-sensitive sonar based on the long constant-frequency (CF) components of their CF-FM signal, cochlear
tuning is dramatically enhanced. In addition, a narrow frequency band encompassing the second harmonic CF-component (CF2) is vastly over-represented within the auditory pathway (reviews in: Vater 1988a, Kössl and Vater 1995, Vater 1998). Precise correlations between physiological and anatomical data are a prerequisite for understanding the hydromechanical mechanisms that, firstly, are responsible for general adaptations for hearing in the ultrasonic range and that, secondly, give rise to enhanced cochlear tuning and foveal representations. These correlations are established by mapping the frequency representation along the cochlear duct. Cochlear frequency maps obtained by HRP-tracing in bats are shown in Figure 3.1.1 in comparison to non-echolocating mammals. In horseshoe bats, HRP-tracing confirmed the presence of an expanded frequency representation, but the map was found to be shifted to more apical locations than revealed by the swollen nuclei technique (Bruns 1976b), thus establishing quite different functional anatomical correlations.
3.1 Cochlear specializations in bats Marianne Vater
3.1.1 Structure-function correlations in echolocating bats
In both horseshoe bats and mustached bats, a narrow frequency range encompassing the second harmonic CF-component of the echolocation signal is mapped onto the BM of the upper basal turn in an expanded fashion (Vater et al. 1985, Kössl and Vater 1985b). Following the term first introduced by Schuller and Pollak (1979) and Bruns and Schmieszek (1980), this region is called the ‘acoustic fovea’; it comprises about 30 % of BM length. It represents the narrow frequency range of 76–83 kHz in Rhinolophus (second harmonic CF frequency, CF2: 78 kHz) and the frequency range of 59–66 kHz in Pteronotus (CF2: 61 kHz). In these ranges, the maximum frequency expansion amounts to about 40 mm BM-length/octave in both species. In other frequency regions, the mapping coefficients of approximately 2–3 mm/octave conform to values observed in non-echolocating mammals such as the cat and rat (Fig. 3.1.1; Liberman 1982, Müller 1991), but frequency representation in the bat cochlea clearly clips off octaves below 10 kHz. Later studies showed that the phenomenon of expanded cochlear representation of certain frequency bands is not restricted to highly specialized Doppler-compensating bats. It is also found in the cochlea of the FM-bat Tadarida brasiliensis (Vater and Siefer 1995), of the African mole rat (Cryptomys hottentotus; Müller et al. 1992), and of a bird, the barn owl (Köppl et al. 1993, see Sect. 3.3). However, the maximal mapping coefficients of about 5–6 mm per octave are much smaller than those in CF-FM bats. The term ‘auditory fovea’ was used for the specialized cochlear frequency representation in these species, but it should be kept in mind that the analogy as originally proposed involved both sensory and behavioural specializations. In a similar manner to the eye movements that keep the visual image focussed on the fovea as the place of highest resolution in the retina, Doppler-compensating bats adjust the pitch of their vocalizations such that returning echoes are analysed within the cochlear region of best tuning (Schuller and Pollak 1979).
Fig. 3.1.1: Comparison of HRP-cochlear frequency maps in bats and non-echolocating mammals. Rhinolophus after Vater et al. 1985, Pteronotus after Kössl and Vater 1985b, Tadarida after Vater and Siefer 1995, rat after Müller 1991, cat after Liberman 1982, mole rat after Müller et al. 1992.
Comparative evidence also shows that cochlear foveal representations are not necessarily linked with enhanced tuning properties, as might be expected from the expanded frequency representation within the foveal region. There is no evidence for enhanced tuning in the cochlea of Tadarida, the barn owl or the mole rat, and the expanded frequency range in CF-FM bats reaches into frequency bands where the frequency selectivity of auditory nerve fibres is normal (Köppl et al. 1993, Müller et al. 1992, Vater and Siefer 1995). Moreover, in the mustached bat tuning is also enhanced at CF3, where the cochlear frequency map is not specialized (Kössl and Vater 1985). What is the least common denominator for the function of auditory foveae in species with different behavioural and ecological adaptations? Auditory foveae typically represent frequency ranges of special biological importance. In Tadarida these include the terminal frequencies of the FM-signal and the quasi-constant frequency signal emitted for long range detection. In the mole rat these include the frequency range of communication signals, and in the barn owl, the fovea covers the
frequency range used in the passive localization of moving prey (Vater and Siefer 1995, Müller et al. 1992, Köppl et al. 1993). CF-FM bats appear to use their auditory fovea to resolve sharp mechanical tuning to the CF2 signal that is created by additional cochlear specializations (reviews in: Kössl and Vater 1995, Vater 1998). As a consequence of expanded representation in the receptor organ, the representation of certain frequency bands within the central auditory pathway is enlarged, thus creating a hypertrophied neuronal substrate for parallel processing of different features of the frequency range of highest biological importance (Vater and Siefer 1995, Vater 1998). The anatomical substrate for expanded cochlear frequency representations and specialized tuning properties in bats was investigated in detailed measurements of the dimensions of hydromechanically relevant structures such as BM, TM, hair cells and stereovilli from serial semi-thin sections and ultra-thin sections taken from defined cochlear locations. In both CF-FM bats, there are conspicuous specializations of the longitudinal gradients in dimensions of BM and TM within the basal cochlear turn that coincide with specializations of the cochlear frequency map (Fig. 3.1.2; Bruns 1976a,b, Vater et al. 1985, Kössl and Vater 1985b, 1996a,b, Vater and Kössl 1996). Only apical cochlear regions with a normal frequency mapping pattern exhibit morphological gradients that are comparable to those observed in non-specialized species. In both species, auditory nerve fibres that are exceptionally sharply tuned to the CF2 signal range arise within a region that is characterized by a plateau in BM dimensions (BM-thickness and width) and high innervation density. This region is terminated basally by an abrupt increase in BM thickness. In both species, the region of maximal BM thickness is only sparsely innervated (SI-zone; Henson and Henson 1991).
In horseshoe bats, the SI-zone extends up to the basal end of the cochlea and its frequency representation is unknown. In the mustached bat, the SI-zone is located between 20 % and 45 % distance from the base and represents frequencies just above the CF2 signal range (62-70 kHz), with normal tuning properties. Towards more basal locations, i.e. within the representation place of frequencies between 70 and 120 kHz, there is again a decrease in BM thickness and an increase in innervation density. In both CF-FM bats, there are also profound morphological specializations of the TM. Within the region of origin of sharply tuned auditory-nerve fibres, the TM cross-sectional area is increased and its attachment site to the spiral limbus is enlarged (Vater and Kössl 1996, Vater 1997). Within the SI-zone, the TM area and the attachment site are reduced. The specialized gradients in BM and TM morphology and dimensions are superimposed on very shallow gradients of receptor cell dimensions (Vater et al. 1992, Vater and Kössl 1996). The height of OHC-stereovilli stays at values of 0.7 to 0.8 µm throughout the basal cochlear turn and reaches maximal values of 2.2 µm in the apex. Comparative anatomy in mammals possessing cochlear foveae shows that expanded frequency representation is found in cochlear regions with almost constant dimensions in BM width and thickness (Müller et al. 1992, Vater and Siefer 1995). Focal changes in TM- and BM morphology are unique to the cochlea of CF-FM bats where certain frequency ranges are endowed with enhanced tuning properties (e.g. Vater and Kössl 1996).
3 Cochlear frequency maps and their specializations in mammals and birds
Fig. 3.1.2: Cochlear frequency maps and morphological gradients in CF-FM bats. A-D: Rhinolophus; E-H: Pteronotus; B and F: innervation density; C and G: BM-dimensions (filled circles: BM-thickness, open circles: BM-width); D and H: TM-dimensions (B: after Bruns and Schmieszek 1980; F: after Zook and Leake 1989; C, D: after Vater 1997; D, G, H: after Vater and Kössl 1996). SI: sparsely innervated zone; CF2: representation place of second harmonic constant-frequency signal.
Focal changes in BM-dimensions are thought to represent reflection points for mechanical travelling waves (Duifhuis and Vater 1986, Kössl 1994a,b, see also Sect. 5.3.3), leading to frequency-specific summation and cancellation effects and thereby sharpening the frequency response of the passive mechanical system. Observations of region-specific specializations of mass and geometry of the TM in CF-FM bats are of significance for recent cochlear models that view the mammalian TM as a resonant structure that acts as a second filter superimposed on BM mechanics (e.g. Zwislocki and Cefaratti 1989, Allen and Fahey 1993). Data on specialized gradients in TM morphology, together with measurements of otoacoustic distortion products (Kössl and Vater 1996a,b), lead to the conclusion that TM resonance contributes significantly to sharpening the cochlear frequency response of CF-FM bats beyond values typically encountered in mammals.

It is of interest to note that specialized gradients in BM- and TM morphology have developed twice in non-related genera of CF-FM bats: the new world mustached bat (Pteronotus parnellii) and the old world horseshoe bats. The obvious differences in relative dimensions of BM- and TM specializations are likely related to differences in damping of cochlear resonance as evidenced by measurements of cochlear microphonics and otoacoustic emissions (Henson et al. 1985, for further discussion see Sect. 5.3). Judged on an evolutionary time scale, specializations for enhanced cochlear tuning within the genus Pteronotus must have developed quickly, since the cochlea of a closely related species (Pteronotus quadridens) that employs FM-sonar exhibits typically mammalian gradients in BM- and TM morphology that are simply scaled according to the demands of high-frequency processing (Vater 1997).
3.1.2 Cochlear fine structure and immunocytochemistry

In order to provide basic comparative data on the fine structure of cochleae specialized for ultrahigh frequencies, we performed systematic studies with scanning and transmission electron microscopy (SEM, TEM; Vater and Lenoir 1992, Vater et al. 1992, Vater and Siefer 1995). We were especially interested in the organization of the micromechanical system of the receptor surface (hair cell stereovilli and attachment to the TM) and in the organization of the OHCs that, in mammals, are commonly regarded as the site of the ‘cochlear amplifier’ (e.g. Brownell et al. 1985, Zenner et al. 1985). As illustrated in Figures 3.1.3 and 3.1.4, the structural composition of the cochlea of echolocating bats conforms generally to the common mammalian plan but there is a clear trend for miniaturization of OHCs and their stereocilia, and a massive development of supporting cells (pillars and Deiter’s cells). Differences in structural organization among bats with different types of sonar are restricted to specializations of the passive mechanical system (TM and BM): the ultrastructure of the OHCs was found to be highly conserved across species.
Fig. 3.1.3: Schematic illustration of the fine-structural organization of the bat’s organ of Corti: BM, Basilar membrane; IHC, Inner hair cells; IP, Inner pillar cell; L, Limbus; OHC, Outer hair cells; TM, Tectorial membrane; Bc, Böttcher cell; hc, Hensen’s cells; isc, inner sulcus cell; pc, phalangeal cell; OP, outer pillar cell; D, Deiter’s cell; Dp, Deiter’s phalange; Dc, Deiter’s cup; cc, Claudius cell; LS, osseous spiral lamina; SL, spiral ligament.
3.1.2.1 Receptor surface and tectorial membrane

The bat cochlea features the basic mammalian design of a single row of IHCs located medially to three rows of OHCs. General adaptations for high frequency hearing include the wide opening angle of OHC-stereovilli bundles and the small size of OHC-stereovilli (Fig. 3.1.4; Vater and Lenoir 1992, Vater et al. 1992, Vater and Siefer 1995). As a species-characteristic specialization of Rhinolophus, there is an abrupt transition in IHC-organization coinciding with the transition in BM-morphology. OHC-organization, on the other hand, stays constant throughout the specialized basal turn (Vater and Lenoir 1992). The subsurface of the TM contains imprints of the stereovilli of both receptor-cell populations throughout the bat cochlea except for the extreme apical end. This differs from other mammals in which IHC-stereovilli appear either free-standing or only in contact with the TM in the most basal cochlear locations (Lenoir et al. 1987). This suggests that at high frequencies, IHC excitation, like OHC excitation, is proportional to displacement rather than velocity.
TEM-investigations of the protofibril content of the TM of Pteronotus show a more intricate zonation into different subregions than is typical for other mammals (Vater and Kössl 1996) and a higher packing density of thick protofibrils (collagen II) than in non-specialized cochleae or non-specialized frequency regions. Region-specific variation in the size and extent of subregions creates a specialized geometry and, as a likely consequence, specialized vibration patterns (see also Steele 1997).
3.1.2.2 Organization of OHCs and their attachments

The bat OHCs conserve both the typical cylindrical shape of mammalian OHCs (Fig. 3.1.4) and the basic organization of structural components that have been suggested to play a role in fast motility (e.g. Kalinec et al. 1992). There is an intricately-organized subcortical cytoskeleton composed of circumferential and longitudinal filaments linked to the plasma membrane by regularly-arranged pillars (Vater et al. 1992, Vater and Siefer 1995, Pujol et al. 1996). As in other mammals, the longitudinal filaments are composed of spectrin (Kuhn and Vater 1996) and the circumferential filaments are likely formed by actin (e.g. Kalinec et al. 1992). It has been suggested that this system forms a cytoskeletal spring, whose action is important both in the passive and active case of OHC-function. Freeze-fractures through the OHC plasma membrane were difficult to obtain, but showed evidence for the presence of membrane-integral particles (Pujol et al. 1996) typical for the arrangement of ‘motor proteins’ in other mammalian species (Kalinec et al. 1992).

Throughout the bat cochlea, the OHCs exhibit additional organizational features that are typically only found at the very basal end of the cochlea in other mammals. They are of short length (12 µm in the base, up to 22 µm in the apex as compared to values of 20 µm and 80 µm in the guinea pig), the mitochondria are concentrated below the cuticular plate and only rarely located along the lateral wall, the subsurface cisternal system is reduced to a single highly-fenestrated layer below the lateral wall as opposed to multiple layers, and the base of the OHC body is tightly held within a specialized, rigid cup formed by the apical process of the Deiter’s cell (Fig. 3.1.4). The cups are much more extensive than in other mammals, and are tightly packed with cytoskeletal elements shown to be composed of F-actin (Fig. 3.1.4) and microtubules (Pujol et al. 1992, Kuhn and Vater 1996).
Freeze-fracture and high-resolution TEM provided evidence that the junctional region between the OHC base and Deiter’s cup is a specialized type of cell contact previously unknown in the cochlea and similar to paranodal axon-glia junctions of the central nervous system.
Fig. 3.1.4: a, b, Organ of Corti of the mustached bat. a: CF2 region. b: SI-zone. Note pronounced differences in morphology of BM and TM (after Vater and Kössl 1996). c: TEM micrograph of OHCs of the horseshoe bat (after Vater et al. 1996). d: F-actin staining of organ of Corti of the horseshoe bat (after Kuhn and Vater 1995). e: SEM-picture of receptor surface in upper basal turn of horseshoe bat (Vater and Lenoir unpublished).
3.1.2.3 Cochlear development in the horseshoe bat

Frequency mapping in the auditory midbrain (Rübsamen and Schäfer 1990) and in the cochlea of horseshoe bats (Vater and Rübsamen 1992) revealed an age-dependent shift in the frequency responses of the auditory fovea. The onset of hearing as defined by recordings from the central auditory system occurs at postnatal day 3-5 and the first responses, which are insensitive and broadly tuned to frequencies below 50 kHz, arise from the basal cochlear turn. Tuned responses in the frequency range between 10 and 50 kHz arising from the middle and apical cochlear turns were recorded at the end of the first postnatal week. Sharply-tuned responses to high frequencies from the auditory fovea, however, only emerged around postnatal day 10-12 and were matched to the individuals’ CF-signal component which at that time is about 1/3 octave below the adult’s call frequency. While the frequency representation within middle and apical cochlear regions remains stable with age, the foveal responses shift upward in frequency until the adult characteristic is achieved at about the end of the 4th postnatal week.

Systematic investigations of cochlear anatomy with light- and electron microscopy (Vater 1988, Vater et al. 1997) showed that the shift in frequency mapping within the fovea is not accompanied by growth of the cochlear duct: The species-specific specializations of BM dimensions, receptor-cell arrangements and dimensions are adult-like at birth, i.e. prior to onset of hearing. Data further indicate that cellular specializations linked to active OHC function are mature at onset of hearing (Vater et al. 1997). The passive components of the hydromechanical system, however, show subtle maturation after the onset of hearing.
The tympanic cover layer is reduced within a time span of about 3 postnatal weeks, the transient attachment of the TM via marginal pillars is reduced during the first week after the onset of hearing and the cytoskeleton of supporting cells reaches the adult-like composition at about postnatal days 12–16 (Vater et al. 1997, Kuhn and Vater 1995). Comparative analysis of the development of the F-actin pattern with confocal fluorescence microscopy in gerbils supports the notion that subtle stiffness changes in the passive hydromechanical system are involved in the postnatal dynamics of the cochlear frequency map (Kuhn and Vater 1995). However, the fact that the shift in the cochlear frequency map is restricted to basal cochlear locations cannot be fully explained by the gradients seen in morphological maturation of the organ of Corti. Supporting the idea of Henson and Rübsamen (1996), that the maturation of tension fibroblasts of the spiral ligament may be involved in dynamic alterations of the cochlear frequency place code, we found that the maturation of F-actin staining in the spiral ligament of the gerbil closely matches the maturation of the frequency-place map. There is no change in the morphological arrangements in the apical cochlea after the onset of hearing, whereas the time course of incorporation of F-actin filaments into tension fibroblasts of the basal turn continues over the critical period of physiological maturation (Kuhn and Vater 1997).
3.2 Cochlear maps in birds

Otto Gleich, Alexander Kaiser, Christine Köppl and Geoffrey A. Manley
3.2.1 Frequency representation along the basilar papilla

To obtain the frequency map of the basilar papilla at the level of those hair cells that have most of the afferent synapses and thus transmit the auditory information to the brain, we either labelled single, physiologically-characterized auditory afferents with tracers (Gleich 1989, Köppl and Manley 1997, Manley et al. 1987) or made focal HRP injections in the cochlear nuclei at locations of known CF (Köppl et al. 1993). With these techniques, it became feasible to determine the location of the hair cell(s) contacted by physiologically-characterized fibres in the starling, chicken, barn owl and emu. The resulting frequency maps (maps of the tonotopic organization) show an interesting evolutionary trend. Whereas the emu map is exponential throughout, i.e. frequency is mapped logarithmically as a function of distance, and equal distances are allocated to each octave, those for the chicken and starling show a gradual reduction of the space devoted to lower octaves (Fig. 3.2.1). This trend reaches an extreme in the barn owl (Köppl et al. 1993, see below), where the space devoted to each octave increases dramatically towards higher frequencies, the last octave from 5-10 kHz occupying more than half the papilla’s length. This region has been termed an auditory fovea (Köppl et al. 1993). The changes seen between different species suggest evolutionary specializations over time in the more advanced species.

The frequency mapping carried out in chickens also revealed an interesting developmental aspect. In contrast to the assumptions made from the sound-trauma data of Rubel and Ryals (1983), our single-neural maps for 2-day and 3-week old chickens were not statistically different (Manley et al. 1987). This indicated that no shift of the frequency representation in the neural hair-cell area occurs between 2 and 21 days of age, at least for frequencies up to 2 kHz.
These findings have recently been confirmed for the E19 chicken embryo by Jones and Jones (1995). These and other relevant data on the development of peripheral frequency maps in birds and mammals were discussed in detail in a recent review (Manley 1996).
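The property that defines an exponential map, equal distance per octave, can be checked numerically. The following is a minimal sketch and not the fitting procedure used in the studies cited above; the function names and all numerical values (papilla length and frequency limits) are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math

def place_mm(f_hz, f_apex_hz, f_base_hz, length_mm):
    """Distance from the apex at which f_hz is represented, assuming a purely
    exponential map (frequency rises exponentially with distance)."""
    octaves_total = math.log2(f_base_hz / f_apex_hz)
    return length_mm * math.log2(f_hz / f_apex_hz) / octaves_total

def octave_widths_mm(f_apex_hz, f_base_hz, length_mm):
    """Length of papilla devoted to each successive octave, apex to base."""
    widths = []
    f_low = f_apex_hz
    while f_low * 2.0 <= f_base_hz * (1.0 + 1e-9):
        lower = place_mm(f_low, f_apex_hz, f_base_hz, length_mm)
        upper = place_mm(f_low * 2.0, f_apex_hz, f_base_hz, length_mm)
        widths.append(upper - lower)
        f_low *= 2.0
    return widths

# Hypothetical map: 5 octaves (100 Hz to 3.2 kHz) on a 5 mm papilla.
HYPOTHETICAL_MAP = dict(f_apex_hz=100.0, f_base_hz=3200.0, length_mm=5.0)
widths = octave_widths_mm(**HYPOTHETICAL_MAP)
print([round(w, 3) for w in widths])
```

In such a map every octave receives the same length. A foveal map of the kind described for the barn owl would instead show octave widths that grow towards the base, which is why it cannot be described by a single exponential.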
Fig. 3.2.1: Maps of the frequency distribution along the basilar papilla in four species of birds that were studied in this collaborative research centre. In the top panel, the data are plotted as a function of the position along the papilla in percent of the distance from the apex, thus normalizing the total length of the papillae. In the bottom panel, the data are plotted as a function of the distance from the apex of the papilla in millimeters. In this panel, the length differences between the papillae of the different species are obvious. The extreme flattening of the curve in the barn owl at high frequencies denotes the region of the auditory fovea. Only the emu has a frequency distribution that is truly logarithmic with distance, and thus shows a straight line on these plots.
3.2.2 Possible changes in function across the papilla’s width

The avian basilar papilla has many hair cells across its width. Depending on the species, there may be from 5 hair cells across the width basally up to about 50 apically (see also Sect. 2.1; Manley 1990, Manley and Gleich 1992). Whereas apically, hair cells across the entire width are innervated by afferent fibres, basally a more-or-less large proportion of the hair cells (covering up to >50 % of the papilla’s width) is not afferently innervated at all (Fischer 1994b). Thus the avian papilla offers the opportunity of examining data from the localization of stained fibres for correlations with the position of the innervated hair cell(s) across the width of that part of the epithelium that is afferently innervated. Physiologically-characterized and individually-labelled afferent neurons allowed us to investigate correlations between physiological properties and the position of innervated hair cell(s) across the papilla’s width.

In the starling, we (Gleich 1989) first showed that afferent fibres contacting hair cells at the same longitudinal position, but at different distances from the papilla’s neural edge, differ in their physiological properties. In the CF-range between 0.6 and 1.8 kHz, we found a linear correlation between threshold and the position of the innervated hair cell. Fibres contacting hair cells near the neural edge were the most sensitive, and those contacting hair cells progressively nearer the middle of the papilla were less sensitive to the remarkable extent of 6 dB/hair cell (Fig. 3.2.2).
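To give a feeling for the size of a 6 dB/hair cell gradient, the following minimal sketch evaluates a linear threshold gradient and converts the level difference into a sound-pressure ratio. It assumes a strictly linear relation; the threshold at the neural edge is an assumed value for illustration and is not taken from the data.

```python
# Hypothetical illustration of a linear threshold gradient across the papilla.
GRADIENT_DB_PER_HAIR_CELL = 6.0   # gradient reported in the text for the starling

def threshold_db(hair_cell_rank, neural_edge_threshold_db):
    """Threshold of a fibre contacting the hair cell that lies hair_cell_rank
    positions away from the neural edge (rank 0 = hair cell at the edge)."""
    return neural_edge_threshold_db + GRADIENT_DB_PER_HAIR_CELL * hair_cell_rank

def pressure_ratio(delta_db):
    """Sound-pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (delta_db / 20.0)

ASSUMED_EDGE_THRESHOLD_DB = 10.0  # assumed value, not taken from the data
for rank in range(5):
    level = threshold_db(rank, ASSUMED_EDGE_THRESHOLD_DB)
    ratio = pressure_ratio(level - ASSUMED_EDGE_THRESHOLD_DB)
    print(rank, level, round(ratio, 1))
```

Each step of 6 dB corresponds to roughly a doubling of sound pressure, so under this assumption a fibre five hair cells away from the neural edge would need about 30 dB more level than a fibre at the edge.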
Fig. 3.2.2: Change of mean threshold of single auditory-nerve fibres as a function of the position of the hair cell they innervate across the width of the papilla in the starling (data from Gleich, this collaborative research centre) and the pigeon (data from Smolders et al. 1992). In the starling, the data were restricted to the frequency range from 0.6 to 1.8 kHz. Since especially in the starling most abneural hair cells receive no afferent synapses, there are no data for this area. The pigeon data for the equivalent cochlear region have for comparison been truncated to only reach 42 %.
In the emu (Köppl and Manley 1997), we found no clear gradient; instead, afferent fibres at any position across the width of the papilla had relatively low thresholds. However, our data sample in the emu was small. These and new data from the pigeon (Smolders et al. 1995) suggest that functional gradients across the basilar papilla may be found in birds, but that their pattern and prominence are strongly species-specific. This is presumably related to the evolutionary trend of an increasing hair-cell specialization across the papilla as seen anatomically. In both the emu and the barn owl, it appears as if at frequencies above 2-3 kHz, the range of thresholds is much smaller than found at lower frequencies (Köppl 1997a, Manley et al. 1997). Gradients across the papilla may thus be expected to be more prominent in the low- to mid-frequency range.
3.2.3 Development of body temperature in chickens – Implications for the development of frequency maps

An important observation regarding the body temperature in chickens triggered a study of the ontogeny of homeothermic regulation (Kaiser 1992). In a set of neurophysiological experiments from different studies, it became obvious that young chickens cannot maintain their body temperature when removed from a heat source and kept at room temperature. Since such an imperfect control of body temperature would have a strong influence on the frequency representation in non-mammalian vertebrates, the study of body-temperature shifts during development was essential for understanding the results of experiments carried out on the ontogeny of tonotopicity. This knowledge is a necessity for evaluating popular models that propose a developmental frequency shift in vertebrate hearing (Rubel 1984).

In contrast to mammals, a change in body temperature has a strong effect on frequency tuning of primary auditory afferents in birds (and other non-mammals), resulting in a shift of the characteristic frequency of a single nerve fibre of up to 1 octave per 10 °C (Schermuly and Klinke 1985). Therefore, in our study of the development of frequency maps in the chicken we investigated the development of temperature regulation in chickens and estimated from these measurements the resulting shift in frequency representation that would occur during normal ontogeny. Somewhat surprisingly, animals younger than 4 days posthatching were not able to maintain a constant body temperature at all when exposed to a lower ambient temperature. Animals older than 4 days held their body temperature at a constant value, but that value was lower than in adults. The adult body temperature of about 41.5 °C was not maintained until the animals reached the age of 23 days posthatching (Fig. 3.2.3).
These findings have significant implications for the design of developmental hearing studies in chickens and possibly all birds. Depending on the age of the experimental animals and/or the ambient temperature, the actual body temperature may vary and be substantially lower than the mean adult body temperature. Young animals exposed singly or in groups to damaging noises for an hour or more at room temperature would be steadily cooling unless special precautions are taken.
While they are cooling, the tonotopic map would be ‘drifting’ along the papilla, and presumably the site of maximal noise damage would be constantly changing as well. Especially for hatchlings, the resulting frequency shift of tonotopic representation would be of the order shown by Heil and Scheich (1992). Thus, the experimental design becomes an important variable when constructing models of the development of tonotopicity. In future experiments with non-mammalian vertebrates, a complete record of the body temperature or, better still, the head temperature during the experiment is essential. Otherwise, it is impossible to distinguish between different possible causes of any observed developmental shifts in tonotopicity or even to establish whether the shift would have occurred at all if the temperature had been stable.
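The size of a temperature-induced drift can be estimated from the figures given above. The sketch below is an illustration under stated assumptions, not a model from the cited studies: it takes the upper bound of 1 octave per 10 °C as a constant rate, assumes that cooling lowers the characteristic frequency, and uses hypothetical function and parameter names.

```python
ADULT_BODY_TEMP_C = 41.5   # adult chicken body temperature given in the text

def shifted_cf_hz(cf_at_adult_temp_hz, body_temp_c,
                  octaves_per_10_deg=1.0, reference_temp_c=ADULT_BODY_TEMP_C):
    """Characteristic frequency expected at body_temp_c.

    octaves_per_10_deg = 1.0 is the upper bound quoted in the text; treating it
    as a constant rate, and cooling as lowering CF, are assumptions made here.
    """
    octave_shift = octaves_per_10_deg * (body_temp_c - reference_temp_c) / 10.0
    return cf_at_adult_temp_hz * 2.0 ** octave_shift

# A fibre tuned to 2 kHz at 41.5 degrees C, in an animal that has cooled by 5 degrees.
print(round(shifted_cf_hz(2000.0, 36.5)))
```

Under these assumptions a cooling of 5 °C shifts a 2 kHz fibre by half an octave, to about 1.4 kHz, which is large compared with the developmental shifts under discussion.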
Fig. 3.2.3: Development of body temperature in the chicken. The development of the rectal temperature with age (continuous line, left axis) shows that at age P23, the animals have acquired the adult body temperature of 41.5 °C. The dashed line (right axis) indicates the animals are homeothermic after age P4. Their body temperature differs from that of adults up to age P23 (after Kaiser 1992).
3.3 The auditory fovea of the barn owl

Christine Köppl
As discussed above, the barn owl is the only bird species so far where a grossly expanded spatial representation of a small, but behaviourally-relevant frequency band (3-10 kHz) is found (Fig. 3.2.1). A number of unusual morphological features accompany this derived frequency map (Fischer et al. 1988, Smith et al. 1985). There is no indication from either anatomy or physiology, however, that the barn owl might have modified cochlear micromechanics, as is strongly suggested for some bat species (see Sect. 3.1.2). Instead, compared with an average bird (see Sects. 2.1, 3.3.1), the barn owl mainly shows a huge enlargement of the high-frequency, basal part of its basilar papilla and a modified innervation pattern that greatly biases the neural representation towards high frequencies.

The basilar papilla of the barn owl is close to 11 mm long, which is unusual among birds. The apical third corresponds morphologically to a typical avian papilla, whereas the remaining part represents a considerably extended base (Fischer et al. 1988, Fischer 1994a). Corresponding to the expanded frequency representation, most anatomical parameters, especially those generally found to correlate with response frequency (e.g. stereovillar height), are nearly constant along the basal two-thirds. Also, the hair cells, especially the short hair cells, are extremely small (3-4 µm high). The afferent innervation in the expanded base of the barn owl’s papilla shows a remarkable increase in the numbers of fibres contacting individual hair cells. Neural hair cells receive up to 20 afferent terminals, whereas typically (in other birds and, indeed, in the more apical regions in the owl) hair cells receive no more than 1 to 3 afferent terminals (Fischer 1994a). This unusual number leads to a great overrepresentation of high frequencies in the auditory nerve.
Furthermore, the high-frequency afferent fibres have unusually large axonal diameters, which may be an adaptation to the increasing demands for temporal accuracy in phase locking at high frequencies (Köppl 1997).
4 Models of the human auditory system
4.1 Psychoacoustically-based models of the inner ear

Hugo Fastl for Eberhard Zwicker†
Three types of models of the inner ear were developed and realized in the laboratory of E. Zwicker: A hydrodynamic model, a hardware electronic model, as well as several versions of computer models.
4.1.1 A hydrodynamic model of the human cochlea

A single channel hydromechanical model of the human cochlea with nonlinear feedback on the membrane was developed and realized. Figure 4.1.1 gives a schematic cross section of the model with the transducers (Lechner 1993). Both the motors and sensors were realized by PVDF-bending transducers. In the model, a nonlinear feedback circuit is incorporated which is described in more detail for the hardware model. As a membrane, natural rubber with a thickness of 0.3 mm was chosen. For the simulation of the ‘perilymph’, silicone oil turned out to be most suitable, since it does not corrode the transducers and allows an easy filling of the model without bubbles. For motors and sensors, two laminated layers of PVDF foil are used. When working as sensors, the charge on the aluminised faces is proportional to the displacement of the membrane. On the other hand, if external voltages are applied, they exert a bending moment and therefore a force upon the membrane, working as motors. The whole membrane is covered by 256 transducer sections with one sensor- and one motor-transducer each. For a range of 20 critical bands, this corresponds to a resolution of approximately 12 sections per Bark.

When the feedback channels are disconnected, the passive response of the model can be measured. In this case, the basilar membrane level exhibits high-frequency slopes of about 60 dB per octave. With the nonlinear feedback activated, a maximum slope of 140 dB per octave is reached. As in psychoacoustic as well as physiological data, a sharpening of the selectivity patterns shows up with decreasing input level.

Fig. 4.1.1: Cross section of a hydromechanical model of the cochlea with nonlinear feedback using PVDF-bending transducers.
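What the two slope values mean for frequency selectivity can be illustrated with a short calculation. The sketch below assumes an idealized response that falls off as a straight line on a logarithmic frequency axis above the characteristic frequency; this idealization and the function name are assumptions made for illustration, not properties measured in the model.

```python
import math

def level_drop_db(slope_db_per_octave, frequency_ratio):
    """Level drop at frequency_ratio times the characteristic frequency,
    assuming a constant slope on a log-frequency axis (an idealization)."""
    return slope_db_per_octave * math.log2(frequency_ratio)

# Attenuation of a tone 10 % above the characteristic frequency.
for label, slope in (("passive", 60.0), ("with feedback", 140.0)):
    print(label, round(level_drop_db(slope, 1.1), 1), "dB")
```

Under this assumption a tone 10 % above the characteristic frequency is attenuated by about 8 dB in the passive case and by about 19 dB with the feedback active.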
4.1.2 An electronic model of the human cochlea

An electronic hardware model of the cochlea with nonlinear pre-processing and active feedback was developed and realized (Zwicker 1986c). The hardware model consists of 90 sections corresponding to a frequency range from 900 to 8000 Hz. The model is based on the assumption that the outer hair cells act as saturating nonlinear mechanical amplifiers that feed back to the vibration of the basilar membrane. In line with physiological evidence, only inner hair cells transfer information towards higher centres. The basic features of the model can be explained by means of Figure 4.1.2. Figure 4.1.2a shows an electronic circuit that is normally used as an electrical equivalent for the hydromechanics of the inner ear. For the ease of introducing feedback, the dual electrical circuit is used as indicated in Figure 4.1.2b. A simulation of the input at the oval window near places tuned to high frequencies is given on the right side. On the left side, a simulation of the helicotrema is realized. A main feature of Zwicker’s model is displayed in Figure 4.1.2c, simulating the additional nonlinear feedback circuits at each section. From the resonating circuits of the (passive) inner-ear model, a voltage is coupled out, amplified, fed through a saturating nonlinearity, and coupled back to the passive inner ear model.

Fig. 4.1.2: Block diagram of the hardware model of cochlear nonlinear pre-processing with active feedback. a: Electrical circuit normally used to simulate the hydromechanics of the inner ear. b: Dual circuit of (a) with input (right) and load (left). c: Nonlinear feedback circuits representing the influence of outer hair cells.

The model is capable of simulating psychoacoustic masking patterns with their nonlinear level-dependent behaviour, i.e. steep upper slope at low level and flat upper slope at high level. Not only level responses, but also phase responses correspond closely to related physiological data. In addition, the model can describe suppression effects. One main advantage of the hardware model is that it allows meaningful approximations to be found that can be used as starting values for computer models.

Not only masking effects but also otoacoustic emissions can be simulated in the cochlea hardware model with feedback (Zwicker 1986d). Spontaneous emissions, simultaneous evoked emissions as well as delayed evoked emissions were measured in the model, and they closely resemble data from psychoacoustic experiments. In summary, we conclude that the cochlea acts in a way similar to that established in the hardware model.

The hardware model allows, in addition, the quantitative assessment of 2f1-f2-difference tones (Zwicker 1986e). Data obtained with the model suggest that the 2f1-f2-components are produced in a spectral region where the excitations of the two primaries overlap. Further, it is assumed that the interaction products travel along the basilar membrane to a place characteristic for a 2f1-f2-frequency. On their way, of course, they can undergo substantial phase shifts. Therefore, a description of 2f1-f2-nonlinearities as a vector sum can quantitatively describe ‘strange’ dips in cancellation levels measured psychoacoustically.
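How a vector sum can produce dips is easy to show with complex arithmetic. The following minimal sketch is not the hardware model: it simply adds two components of the same frequency as phasors, with levels and phases that are hypothetical values chosen for illustration.

```python
import cmath
import math

def difference_tone_hz(f1_hz, f2_hz):
    """Frequency of the cubic difference tone 2f1-f2 (with f1 < f2)."""
    return 2.0 * f1_hz - f2_hz

def summed_level_db(levels_db, phases_deg):
    """Level of the vector (phasor) sum of components at the same frequency."""
    total = sum(10.0 ** (level / 20.0) * cmath.exp(1j * math.radians(phase))
                for level, phase in zip(levels_db, phases_deg))
    magnitude = abs(total)
    return 20.0 * math.log10(magnitude) if magnitude > 0.0 else float("-inf")

print(difference_tone_hz(1000.0, 1200.0))

# Two hypothetical 40 dB components with different relative phases.
for phase in (0.0, 90.0, 170.0):
    print(phase, round(summed_level_db([40.0, 40.0], [0.0, phase]), 1))
```

Two equal components add to about 46 dB when in phase but to less than 25 dB when they are 170° apart, so a phase shift accumulated along the basilar membrane can by itself produce a pronounced dip.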
4.1.3 A computer model of the human cochlea

A computer model of cochlear pre-processing was developed and implemented (Lumer 1987a) in which the hydromechanics of the cochlea are modelled by a linear transmission line. A feedback circuit with amplifier and saturating nonlinearity is coupled to each element of the transmission line. The nonlinear active model shows distinct frequency selectivity for input signals at low level. However, the selectivity decreases with increasing input level. At and near the characteristic frequency, the input-output function is nonlinear, whereas other frequencies are transferred linearly. This behaviour is comparable to neurophysiologically measured input-output curves of the basilar membrane vibration of mammals. Figure 4.1.3 shows examples for the level-place patterns obtained in the computer model. Sinusoidal inputs at 1585 Hz, 2161 Hz, and 2947 Hz are used for input levels of 0, 40, and 80 dB. The lower panel in Figure 4.1.3 shows the phase-place patterns.
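The level dependence that results from feedback through a saturating nonlinearity can be illustrated with a static, single-section caricature. The sketch below is not the transmission-line model of Lumer (1987a): it ignores frequency and time altogether, and the loop gain, the tanh-shaped saturation and the function names are assumptions chosen for illustration.

```python
import math

def output_with_feedback(x, loop_gain=0.9, saturation=1.0, iterations=500):
    """Solve y = x + loop_gain * sat(y) by fixed-point iteration, where
    sat(y) = saturation * tanh(y / saturation) is an assumed nonlinearity.
    The iteration converges because loop_gain < 1."""
    y = x
    for _ in range(iterations):
        y = x + loop_gain * saturation * math.tanh(y / saturation)
    return y

def gain_db(x, **kwargs):
    """Gain of the section with feedback relative to the input, in dB."""
    return 20.0 * math.log10(output_with_feedback(x, **kwargs) / x)

# Input amplitudes in arbitrary units, from far below to far above saturation.
for x in (1e-3, 1e-1, 1.0, 10.0, 100.0):
    print(x, round(gain_db(x), 2))
```

With a loop gain of 0.9, small inputs are amplified by nearly 20 dB, whereas inputs far above the saturation level pass almost unchanged. This is the same qualitative behaviour as the nonlinear input-output function described above for frequencies at and near the characteristic frequency.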
Fig. 4.1.3: a: Level-place patterns and b: Phase-place patterns of a nonlinear computer model for cochlear pre-processing. Sinusoidal input signals at 1585 Hz, 2161 Hz, and 2947 Hz for levels of 0 dB, 40 dB, and 80 dB.
At low levels, the level-place patterns show substantially-increased selectivity compared to the patterns at higher levels. For the three frequencies considered, the three patterns displayed in Figure 4.1.3a can be obtained by shifting one pattern along the Bark-scale. On the other hand, results plotted in Figure 4.1.3b illustrate that the phase-place patterns are not only shifted, but differ in the position of the phase plateau, which gives the maximum phase lag between input and output signal. Because of the longer travel time from the input to the place of the maximum, the final value of the phase lag is larger for lower frequencies. Sound processing in inner ear models was compared for the nonlinear cochlear pre-processing hardware model as well as the transmission line computer model (Zwicker and Lumer 1985). The interaction of two simultaneously-presented sinusoidal input signals was also investigated in the computer model of cochlear pre-processing (Lumer 1987b). Suppression effects observed in the model are comparable to results of neurophysiological measurements in the mammalian cochlea. Moreover, psychoacoustic data on the ‘additivity of masking’ can be simulated (Lumer 1984). With some simplification it can be stated that the ‘addition’ of two main excitations leads to an increase of 3 dB. The addition of a main excitation plus a slope excitation yields an increase between 3 and 8 dB. If two slopes of masking patterns are ‘added’, the combined masked threshold can be increased by as much as 14 dB. The digital cochlear pre-processing model was also implemented as a realization based on wave-parameters (Zwicker and Peisl 1990). Frequency responses obtained in the wave-parameter model are very similar to the patterns known from the analog model. However, the wave-parameter model allows a significantly increased number of sections and shows more stability than the analog model.
DEOAEs could be simulated in detail in the wave parameter digital model. Characteristic effects of otoacoustic emissions could be simulated in detail both in the analog model as well as in the wave-parameter digital model by means of lateral coupling (Peisl and Zwicker 1989). The application of the cochlear nonlinear pre-processing model in speech recognition (Zwicker 1986f) is illustrated in Figure 4.1.4. The data stream of fluent speech of the order of magnitude of 100 kbits per second is led to the peripheral nonlinear active pre-processing model with feedback connected to the psychoacoustic model for extraction of basic auditory sensations. In this way, a data reduction by a factor of 10 to 10 kbits per second is reached. By creating complex sensations and selecting dominant parameter changes, a further reduction down to a stream of 1 kbit per second is possible. For the recognition procedure, linguistic and phonetic rules provide an important input leading to a sequence of phonetic items with approximately 100 bits per second.
Fig. 4.1.4: Block diagram of a speech recognition system based on peripheral nonlinear active pre-processing with feedback, psychoacoustics, phonetics, and linguistics.
4.2 Linear model of peripheral-ear transduction (PET) Ernst Terhardt A system of linear filters (Peripheral-Ear Transduction, PET) was designed for modelling auditory preprocessing of sound (Terhardt 1997, 1998). The PET system is conceptualized as an extension and complementation of the Fourier-t transformation and algorithm (FTT, cf. Terhardt 1985). While FTT has proven to be quite appropriate and efficient (Terhardt 1986, Heinbach 1988, Schlang 1989, Schlang and Mummert 1990, Baumann 1995, Wartini 1996), the PET system has the advantage that it provides some extra degrees of freedom.
4.2.1 Basic definitions and features As shown in Figure 4.2.1 (left), the system consists of (1) a linear filter (ECR) that accounts for the ear-canal resonances (in fact, it accounts for the first two of them, i.e. at about 3.3 and 10 kHz.); (2) a bank of linear filters each of which accounts for the transfer function from outer ear to a particular place on the cochlear partition, defined as an auditory channel (cochlear transfer function, CTF).
Fig. 4.2.1: Left: The peripheral-ear transduction (PET) system. ECR: ear-canal resonances; CT: cochlear transmission functions. Right: Illustration of the basic type of frequency response (absolute magnitude) of the CTF filters (example with k = 1, η = 2).
4.2.1.1 The ECR filter
The ECR filter consists internally of two chained filters of the type

$$H_r(s) = 1 + \alpha\,\frac{s}{s^2 + 2as + a^2 + \omega_0^2}, \qquad \text{(a)}$$

where s = jω; α is a real constant, a a real damping coefficient, and ω0 is the filter’s eigenfrequency, which in fact is close to the corresponding resonance frequency. The resonance frequencies of the two internal filters are approximately 3.3 and 10 kHz, respectively. For all frequencies, the absolute magnitude of Hr is greater than 1 but approaches this value asymptotically on both sides of the resonance frequency. The resonance frequency is given by

$$\omega_r = \sqrt{a^2 + \omega_0^2}, \qquad \text{(b)}$$

and the maximum of absolute magnitude at resonance frequency is 1 + α/(2a). The resonance bandwidth is Br ≈ a/π. These relationships allow specification of a and α on the basis of the observed resonance bandwidth and height of resonance maximum. The parameter ω0 is obtained by

$$\omega_0 = \pi\sqrt{4 f_r^2 - B_r^2}. \qquad \text{(c)}$$
4.2.1.2 The CTF filter
The typical asymmetric shape of the cochlear filters’ frequency response is dependent on the transmission-line character of basilar-membrane vibrations. Therefore, for modelling cochlear transmission, the most adequate type of filter is one whose frequency response occurs in transmission lines. This is the case for the transfer function

$$H_n(s) = \left(\frac{s_n s_n^*}{(s - s_n)(s - s_n^*)}\right)^{k}. \qquad \text{(d)}$$
Here, the index n denotes a particular cochlear channel and CTF filter, respectively. Each of the CTF filters is designed as a chain of k so-called singular filters of the type which, in Eq. (d), is defined by the expression inside the power function. As an example, in Figure 4.2.1 (right-hand side) the frequency response of a singular filter is shown, i.e. a CTF filter with k = 1. With the expansion

$$s_n = -a_n + j\omega_n; \quad s_n^* = -a_n - j\omega_n \qquad \text{(e)}$$

and for the circular frequency

$$\omega_c = 2\pi f_c = \sqrt{\omega_n^2 - a_n^2} \qquad \text{(f)}$$

the absolute magnitude of Hn takes on the resonance maximum

$$|H_n|_{\max} = \eta^k = \left(\frac{a_n^2 + \omega_n^2}{2 a_n \omega_n}\right)^{k} = \left(\frac{2a_n^2 + \omega_c^2}{2 a_n \sqrt{a_n^2 + \omega_c^2}}\right)^{k}. \qquad \text{(g)}$$
For a given filter, i.e. a particular characteristic frequency fc, the parameter an is determined by the height of the resonance maximum. Resolving (g) for an yields

$$a_n = f_c\,\pi\sqrt{2}\,\sqrt{\sqrt{1 + \frac{1}{\eta^2 - 1}} - 1}, \qquad \text{(h)}$$
where η denotes the height of the resonance maximum of the pertinent singular filter. The ±45°-bandwidth of the singular CTF filter turns out to be B’n = an/π. For any given eigenfrequency ωn, both the height of resonance maximum and the bandwidth are determined by the parameter an. When for a particular characteristic frequency the height of resonance is prescribed, the effective filter bandwidth is automatically prescribed as well. If the maximum is high, bandwidth is low, and vice versa. For a cascaded CTF filter, i.e. a filter that consists of k identical singular filters, the quantitative relationship between resonance height and bandwidth depends on k. By choosing k appropriately, nearly any resonance height can be combined with any bandwidth. The 3-dB bandwidth of a k-th order cascaded filter such as defined by (d) is
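$$B_n = \frac{\sqrt{\omega_n^2 - a_n^2}}{\pi\sqrt{2}}\,\sqrt{1 - \sqrt{1 - \frac{4 a_n^2 \omega_n^2\left(\sqrt[k]{2} - 1\right)}{\left(\omega_n^2 - a_n^2\right)^2}}}. \qquad \text{(i)}$$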
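The relations (f) to (i) can be checked numerically. The following is a minimal sketch, not the authors' implementation; the function names are illustrative, and the example values (fc = 1000 Hz, k = 5, 70 dB cascaded resonance height) are the ones used as an example later in this section.

```python
import math

def damping_from_height(fc_hz, eta):
    """Eq. (h): damping coefficient a_n (1/s) from the singular-filter height eta > 1."""
    return fc_hz * math.pi * math.sqrt(2.0) * math.sqrt(
        math.sqrt(1.0 + 1.0 / (eta ** 2 - 1.0)) - 1.0)

def singular_gain(f_hz, fc_hz, a_n):
    """Magnitude of the singular filter inside Eq. (d), evaluated at s = j*2*pi*f."""
    omega_n = math.sqrt((2.0 * math.pi * fc_hz) ** 2 + a_n ** 2)  # from Eq. (f)
    s = 1j * 2.0 * math.pi * f_hz
    s_n = complex(-a_n, omega_n)                                   # Eq. (e)
    return abs(s_n * s_n.conjugate() / ((s - s_n) * (s - s_n.conjugate())))

def bandwidth_3db(fc_hz, a_n, k):
    """Eq. (i): 3-dB bandwidth in Hz of k cascaded singular filters."""
    wc2 = (2.0 * math.pi * fc_hz) ** 2       # omega_n^2 - a_n^2
    wn2 = wc2 + a_n ** 2                     # omega_n^2
    inner = 1.0 - 4.0 * a_n ** 2 * wn2 * (2.0 ** (1.0 / k) - 1.0) / wc2 ** 2
    return math.sqrt(wc2) / (math.pi * math.sqrt(2.0)) * math.sqrt(1.0 - math.sqrt(inner))

fc, k, height_db = 1000.0, 5, 70.0
eta = 10.0 ** (height_db / (20.0 * k))      # height of one singular filter
a_n = damping_from_height(fc, eta)
peak = singular_gain(fc, fc, a_n)           # should reproduce eta, Eq. (g)
bw = bandwidth_3db(fc, a_n, k)
print(f"a_n = {a_n:.1f} 1/s, peak = {peak:.4f}, eta = {eta:.4f}, B_n = {bw:.1f} Hz")
```

Evaluating the singular filter at fc and comparing the result with η is a direct round-trip test of (g) and (h).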
4.2.1.3 Temporal behaviour
The temporal behaviour of the PET system is essentially determined by that of the CTF filters alone. The latter can be grossly characterized by the effective duration of the impulse responses, which in turn is equivalent to the effective length of the time window of an equivalent Fourier analyzer. An analysis of the impulse response of the CTF filter defined by (d) yields for the product Tƒc (where T denotes the effective duration of the impulse response) the formula

$$T f_c = \frac{(k-1)!\,e^{k-1}}{\pi\sqrt{2}\,(k-1)^{k-1}} \cdot \frac{1}{\sqrt{\sqrt{1 + 1/(\eta^2 - 1)} - 1}}. \qquad \text{(j)}$$
The meaning of Tƒc is the effective duration of the impulse response normalized to the period corresponding to the filter’s characteristic frequency. Note that η denotes the resonance height of the singular filter. When the cascaded filter is intended to have the resonance height g, η must be set to $\eta = \sqrt[k]{g}$. Figure 4.2.2 (left) illustrates the dependencies of Tfc on k and resonance height (in dB). As an example, consider a CTF filter with fc = 1000 Hz, k = 5, and a resonance height of 70 dB. According to Figure 4.2.2 its effective time-window length is about 8 ms.

Fig. 4.2.2: Left: Normalized onset time T of the CTF filters as a function of k, for the dynamic ranges indicated. Right: 3-dB bandwidth of the CTF filters as a function of characteristic frequency fc.
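The worked example (fc = 1000 Hz, k = 5, 70 dB) can be reproduced from the formula for Tƒc. This is a small sketch under the assumption that the resonance height in dB is 20 lg g; the function name is illustrative.

```python
import math

def normalized_duration(k, height_db):
    """T*fc for a k-th order CTF filter whose cascaded resonance height is height_db."""
    g = 10.0 ** (height_db / 20.0)          # cascaded resonance height (linear)
    eta = g ** (1.0 / k)                    # height of one singular filter
    # float(0) ** 0 evaluates to 1.0, so k = 1 is handled as well
    shape = (math.factorial(k - 1) * math.exp(k - 1)
             / (math.pi * math.sqrt(2.0) * float(k - 1) ** (k - 1)))
    return shape / math.sqrt(math.sqrt(1.0 + 1.0 / (eta ** 2 - 1.0)) - 1.0)

fc_hz = 1000.0
t_fc = normalized_duration(k=5, height_db=70.0)
window_ms = 1000.0 * t_fc / fc_hz
print(f"T*fc = {t_fc:.2f}, T = {window_ms:.2f} ms")
```

The result is close to 8 ms, in line with the value read from Figure 4.2.2.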
4.2.2 Setting PET’s parameters Most remarkably, the majority of parameters included in the PET system can be deduced from the absolute threshold of hearing. In addition, the ear’s temporal behaviour and dynamic range provide essential criteria.
4.2.2.1 The ECR filter As the ECR filter is not intended to fully represent outer- and middle-ear transmission, but just the effect of the ear-canal resonances, its parameters can be read from the threshold of hearing. By optimization of the model’s reproduction of the threshold of hearing, the following parameters were found to be appropriate. First resonance: α = 10 000, ω0 = 20 000/s, a = 5000/s; second resonance: α = 20 000, ω0 = 65 000/s, a = 20 000/s.
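As a plausibility check, the listed parameters can be inserted into Eqs. (a) and (b). The sketch below is illustrative only (function and variable names are not from the text); it evaluates the two resonance frequencies, the peak heights 1 + α/(2a), and the combined gain at 1 kHz.

```python
import math

# (alpha, omega0 in rad/s, a in 1/s) for the two resonances, as listed in the text
ECR_PARAMS = [
    (10_000.0, 20_000.0, 5_000.0),
    (20_000.0, 65_000.0, 20_000.0),
]

def ecr_section(f_hz, alpha, omega0, a):
    """Eq. (a) evaluated at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * f_hz
    return 1.0 + alpha * s / (s * s + 2.0 * a * s + a * a + omega0 * omega0)

def ecr_gain(f_hz):
    """Magnitude of the two chained ECR sections."""
    gain = 1.0
    for alpha, omega0, a in ECR_PARAMS:
        gain *= abs(ecr_section(f_hz, alpha, omega0, a))
    return gain

# Eq. (b): resonance frequencies in Hz, and peak heights 1 + alpha / (2a)
resonances_hz = [math.sqrt(a * a + w0 * w0) / (2.0 * math.pi) for _, w0, a in ECR_PARAMS]
peak_heights = [1.0 + alpha / (2.0 * a) for alpha, _, a in ECR_PARAMS]
gain_1khz_db = 20.0 * math.log10(ecr_gain(1000.0))
print(resonances_hz, peak_heights, round(gain_1khz_db, 2))
```

With these values the resonances fall near 3.3 kHz and a little above 10 kHz, and the gain at 1 kHz is a few tenths of a dB.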
4.2.2.2 The CTF filters
From the user’s point of view, the primary parameters to be chosen for the CTF filters are (1) the characteristic frequencies ƒc, such that to each channel indexed by n there pertains one particular ƒc, (2) the damping coefficients an. Regarding the characteristic frequencies, it is reasonable to define that these (1) grow in the same order as the channel number n; (2) grow in steps, the size of which is a constant proportion of bandwidth. The latter definition implies that the degree of overlap of filter response curves is one and the same in the entire CTF system. This in turn implies that the ultimate scaling of characteristic frequencies cannot be carried out until the bandwidths – i.e. the parameters an – of all filters have been determined. When the an are chosen such that the bandwidths are a constant fraction of, say, critical bandwidth (Zwicker 1961a) or equivalent rectangular bandwidth (ERB, cf. Glasberg and Moore 1990), it turns out that the height of the resonance peaks as a function of characteristic frequency takes on approximately the inverse shape of the threshold of hearing, except for the contribution of ear-canal resonances. This implies that aurally adequate bandwidths of the filters are automatically obtained when one sets the an such that the height of resonance maxima equals the difference between the threshold of hearing and a constant absolute SPL – the PET system’s reference level, Lr. Setting the an according to that criterion implies that within any channel, a constant intensity of the filter output is presumed for absolute threshold. Formally, this approach can be taken advantage of as follows. The threshold of hearing excluding the contribution of ear-canal resonances, L*A(ƒ), can be expressed by
$$L_A^*(f)/\mathrm{dB} = 3.64\left(\frac{f}{1000\ \mathrm{Hz}}\right)^{-0.8} + 10^{-3}\left(\frac{f}{1000\ \mathrm{Hz}}\right)^{4} \qquad \text{(l)}$$
(Terhardt 1979). This expression is used to formalize the above relationship between the height of resonance maxima and characteristic frequencies:
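$$\lg \eta(f_c) = \frac{L_r/\mathrm{dB}}{20k} + \frac{0.182}{k}\left[1 - \left(\frac{f_c}{1000\ \mathrm{Hz}}\right)^{-0.8}\right] + \frac{10^{-4}}{2k}\left[1 - \left(\frac{f_c}{1000\ \mathrm{Hz}}\right)^{4}\right]. \qquad \text{(m)}$$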
The reference level Lr, by definition, is identical with the resonance height of the CTF filter with fc = 1000 Hz. For any fc, Eq. (m) yields the pertinent η value, which in turn yields the pertinent an value from (h). The parameter ωn pertinent to fc is obtained from (f). The value of the reference level Lr can be obtained from physiological and psychoacoustical data on the ear’s dynamic range. In particular, it can be assessed by analysis of tuning curves and narrow-band masked thresholds. As an example, in the present PET system Lr = 70 dB was chosen. The number k of cascaded singular filters per channel must be chosen such that the effective time-window length of spectral analysis is aurally adequate, see Figure 4.2.2 (left). In the present PET system, k = 5 was chosen. This way, the filter parameters for any fc are determined, i.e. derived from the threshold of hearing. With (i) then the bandwidths can be calculated. This yields the CTF-filter bandwidths as a function of fc depicted in Figure 4.2.2 (right).
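The chain from characteristic frequency to filter parameters, (m) → (h) → (f), might be sketched as follows. The code is illustrative rather than the original implementation, and the names are hypothetical. The threshold expression uses the coefficient 3.64, which is what the factor 0.182 in Eq. (m) implies (20 × 0.182 = 3.64).

```python
import math

L_R_DB = 70.0   # reference level chosen in the text
K = 5           # number of cascaded singular filters chosen in the text

def threshold_without_ecr_db(f_hz):
    """Threshold of hearing without ear-canal resonances, Eq. (l)."""
    x = f_hz / 1000.0
    return 3.64 * x ** -0.8 + 1e-3 * x ** 4

def lg_eta(fc_hz, l_r_db=L_R_DB, k=K):
    """Eq. (m): decadic logarithm of the singular-filter resonance height."""
    x = fc_hz / 1000.0
    return (l_r_db / (20.0 * k)
            + 0.182 / k * (1.0 - x ** -0.8)
            + 1e-4 / (2.0 * k) * (1.0 - x ** 4))

def resonance_height_db(fc_hz, l_r_db=L_R_DB, k=K):
    """Height of the cascaded (k-th order) resonance maximum in dB."""
    return 20.0 * k * lg_eta(fc_hz, l_r_db, k)

def ctf_parameters(fc_hz, l_r_db=L_R_DB, k=K):
    """Return (eta, a_n, omega_n) for one channel, using Eqs. (m), (h) and (f)."""
    eta = 10.0 ** lg_eta(fc_hz, l_r_db, k)
    a_n = fc_hz * math.pi * math.sqrt(2.0) * math.sqrt(
        math.sqrt(1.0 + 1.0 / (eta ** 2 - 1.0)) - 1.0)
    omega_n = math.sqrt((2.0 * math.pi * fc_hz) ** 2 + a_n ** 2)
    return eta, a_n, omega_n

for fc in (100.0, 1000.0, 4000.0):
    eta, a_n, omega_n = ctf_parameters(fc)
    print(f"fc = {fc:6.0f} Hz: height = {resonance_height_db(fc):5.1f} dB, a_n = {a_n:7.1f} 1/s")
```

At fc = 1000 Hz the cascaded height equals Lr, and at other frequencies it equals Lr minus the threshold difference relative to 1 kHz.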
4.2.3 Simulation of the threshold of hearing
With the above definitions of the CTF filter system, simulation of the threshold of hearing only requires specification of the threshold intensity or SPL in the channels. This value can be most easily obtained from the threshold of hearing for a 1-kHz pure tone. Assuming that this value is 3 dB and taking into account the small contribution of the first ear-canal resonance at 1 kHz, the threshold level inside each of the channels, LnA, is appropriately set to

$$L_{nA} = 3.3\ \mathrm{dB} + L_r. \qquad \text{(n)}$$
By definition, a steady sinusoidal tone reaches the absolute threshold when in any channel the value LnA is attained. For a steady sinusoidal tone with a given frequency f, this of course will occur in the particular channel whose characteristic frequency equals f. For the simulated threshold of hearing, this criterion eventually yields the formula

$$L_A(f)/\mathrm{dB} = L_r/\mathrm{dB} + 3.3 - 20\lg\left|H_{ECR}(f)\right| - 20\lg\eta^k(f_c = f). \qquad \text{(o)}$$
The threshold thus calculated using the particular parameters of the PET system described above is shown in Figure 4.2.3 (left).
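A simulated threshold can be sketched by combining the ECR gain with the resonance height from (m). The code below is an illustration, not the original program; it assumes that both filter gains are subtracted from Lr + 3.3 dB, which is what the channel criterion LnA = 3.3 dB + Lr requires and which returns about 3 dB at 1 kHz.

```python
import math

L_R_DB, K = 70.0, 5
# (alpha, omega0 in rad/s, a in 1/s) of the two ear-canal resonances
ECR_PARAMS = [(10_000.0, 20_000.0, 5_000.0), (20_000.0, 65_000.0, 20_000.0)]

def ecr_gain_db(f_hz):
    """20 lg |H_ECR(f)| for the two chained sections of Eq. (a)."""
    s = 1j * 2.0 * math.pi * f_hz
    gain = 1.0
    for alpha, omega0, a in ECR_PARAMS:
        gain *= abs(1.0 + alpha * s / (s * s + 2.0 * a * s + a * a + omega0 * omega0))
    return 20.0 * math.log10(gain)

def resonance_height_db(fc_hz):
    """20 lg eta^k from Eq. (m)."""
    x = fc_hz / 1000.0
    lg_eta = (L_R_DB / (20.0 * K)
              + 0.182 / K * (1.0 - x ** -0.8)
              + 1e-4 / (2.0 * K) * (1.0 - x ** 4))
    return 20.0 * K * lg_eta

def threshold_db(f_hz):
    """Simulated threshold of hearing: level needed to reach L_nA in the channel fc = f."""
    return L_R_DB + 3.3 - ecr_gain_db(f_hz) - resonance_height_db(f_hz)

for f in (100.0, 1000.0, 3300.0, 10000.0):
    print(f"{f:7.0f} Hz: {threshold_db(f):6.1f} dB SPL")
```

The minimum of the simulated curve lies near the first ear-canal resonance, as expected for a threshold of hearing.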
Fig. 4.2.3: Left: Threshold of hearing, and tuning curves; simulated with the PET system (Lr = 70 dB, k = 5). Right: Characteristic frequencies fc of the CTF channels for N = 256; here, the channels are indexed by ν = n–1.
4.2.4 Simulation of tuning curves
While the absolute threshold is the SPL required to produce the channel-specific threshold level LnA at the characteristic frequencies, a tuning curve can be simulated by calculating the SPL which is required to produce LnA (or any other SPL) in a certain channel (i.e. with fc constant), as a function of a sinusoidal tone’s frequency ƒ. When in (o) ηk (the height of resonance maximum) is replaced by |Hn(ƒ)| (the particular filter’s frequency response), a formula is obtained that describes a simulated tuning curve that is pertinent to a particular channel number n with characteristic frequency fc. When the threshold level LnA is chosen, the tuning curve’s minimum touches the absolute threshold of hearing. The tuning-curve formula for that case is

$$L(f)/\mathrm{dB} = L_r/\mathrm{dB} + 3.3 - 20\lg\left|H_{ECR}(f)\right| - 20\lg\left|H_n(f)\right|. \qquad \text{(p)}$$
The function |Hn(ƒ)| in (p) denotes the absolute magnitude of the CTF-filter’s transfer function (d, e). Figure 4.2.3 (left) depicts a number of tuning curves calculated with (p) and the above PET parameters, namely, for fc = 0.1, 0.2, 0.5, 1, 2, 5, and 10 kHz.
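A tuning curve for one channel might be computed as sketched below. This is an illustrative example with hypothetical function names; as for the threshold, it assumes that the ECR and CTF gains are both subtracted from Lr + 3.3 dB, so that the curve touches the simulated threshold at ƒ = fc.

```python
import math

L_R_DB, K = 70.0, 5
ECR_PARAMS = [(10_000.0, 20_000.0, 5_000.0), (20_000.0, 65_000.0, 20_000.0)]

def ecr_gain_db(f_hz):
    """20 lg |H_ECR(f)| for the two chained sections of Eq. (a)."""
    s = 1j * 2.0 * math.pi * f_hz
    gain = 1.0
    for alpha, omega0, a in ECR_PARAMS:
        gain *= abs(1.0 + alpha * s / (s * s + 2.0 * a * s + a * a + omega0 * omega0))
    return 20.0 * math.log10(gain)

def ctf_gain_db(f_hz, fc_hz):
    """20 lg |H_n(f)| of the k-th order CTF filter with characteristic frequency fc."""
    x = fc_hz / 1000.0
    lg_eta = (L_R_DB / (20.0 * K)
              + 0.182 / K * (1.0 - x ** -0.8)
              + 1e-4 / (2.0 * K) * (1.0 - x ** 4))          # Eq. (m)
    eta = 10.0 ** lg_eta
    a_n = fc_hz * math.pi * math.sqrt(2.0) * math.sqrt(
        math.sqrt(1.0 + 1.0 / (eta ** 2 - 1.0)) - 1.0)       # Eq. (h)
    omega_n = math.sqrt((2.0 * math.pi * fc_hz) ** 2 + a_n ** 2)  # Eq. (f)
    s = 1j * 2.0 * math.pi * f_hz
    s_n = complex(-a_n, omega_n)
    h1 = s_n * s_n.conjugate() / ((s - s_n) * (s - s_n.conjugate()))
    return 20.0 * K * math.log10(abs(h1))                    # k cascaded stages

def tuning_curve_db(f_hz, fc_hz):
    """SPL of a tone at f that produces the threshold level L_nA in channel fc."""
    return L_R_DB + 3.3 - ecr_gain_db(f_hz) - ctf_gain_db(f_hz, fc_hz)

for f in (500.0, 800.0, 1000.0, 1200.0, 1500.0):
    print(f"{f:6.0f} Hz: {tuning_curve_db(f, 1000.0):6.1f} dB SPL")
```

For the 1-kHz channel the curve has its tip at 1 kHz and rises on both sides.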
4.2.5 Scaling of characteristic frequencies
Since using the above definitions and equations, the relationship between characteristic frequency ƒc and CTF-filter bandwidth Bn = B(ƒc) has been established, one can assign numerical values to the ƒc of the channels, according to the criterion that the overlap of filter response curves is constant. To get the frequency scale, one starts by assigning a certain ƒc to the first and to the last channel, respectively. Then the ƒc of the other channels are obtained as follows: Let ƒc(0) be the lowest characteristic frequency, i.e. that of the first channel, ƒc(1) that of the second channel, etc. Then this series of assignments is made:

$$f_c(1) = f_c(0) + \varepsilon B[f_c(0)]; \quad f_c(2) = f_c(1) + \varepsilon B[f_c(1)]; \quad \ldots \quad f_c(N-1) = f_c(N-2) + \varepsilon B[f_c(N-2)]. \qquad \text{(q)}$$
The factor ε determines the degree of overlap of filter response curves. If ε < 1, there indeed is overlap, i.e. in the sense that the spacing of characteristic frequencies is narrower than that of adjacent bandwidths. The most obvious way to determine ε for a given N, fc(0) and fc(N–1) is by repeatedly performing the above series of assignments on the computer with decreasing values of ε, until fc(N–1) (the characteristic frequency of the last channel) is assigned with sufficient approximation the desired value. For example, with N = 256, Lr = 70 dB, k = 5, fc(0) = 25 Hz and fc(N–1) = 15 000 Hz, the appropriate value of ε emerges as ε = 0.2562526. The pertinent distribution of characteristic frequencies is shown in Figure 4.2.3 (right). The value ε ≈ 1/4 found for this sample PET system indicates that there are about four adjacent characteristic frequencies per filter bandwidth.
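The search for ε can be sketched as follows. Instead of stepping ε downwards as described in the text, this illustration uses bisection, which is a substitution made here for brevity; all function names are hypothetical. The bandwidth is computed from (m), (h) and (i). The sketch is not guaranteed to reproduce every digit of the value of ε reported in the text, since that depends on details of the original implementation.

```python
import math

def bandwidth_hz(fc_hz, l_r_db=70.0, k=5):
    """3-dB bandwidth B(fc) of a CTF filter, via Eqs. (m), (h) and (i)."""
    x = fc_hz / 1000.0
    lg_eta = (l_r_db / (20.0 * k)
              + 0.182 / k * (1.0 - x ** -0.8)
              + 1e-4 / (2.0 * k) * (1.0 - x ** 4))
    eta = 10.0 ** lg_eta
    a_n = fc_hz * math.pi * math.sqrt(2.0) * math.sqrt(
        math.sqrt(1.0 + 1.0 / (eta * eta - 1.0)) - 1.0)
    wc2 = (2.0 * math.pi * fc_hz) ** 2
    wn2 = wc2 + a_n * a_n
    inner = 1.0 - 4.0 * a_n * a_n * wn2 * (2.0 ** (1.0 / k) - 1.0) / wc2 ** 2
    return math.sqrt(wc2) / (math.pi * math.sqrt(2.0)) * math.sqrt(1.0 - math.sqrt(inner))

def scale(eps, n, f_first, f_last):
    """Series of assignments (q); stops early once f_last has been exceeded."""
    fcs = [f_first]
    for _ in range(n - 1):
        if fcs[-1] > f_last:          # overshoot: do not evaluate B beyond the range
            break
        fcs.append(fcs[-1] + eps * bandwidth_hz(fcs[-1]))
    return fcs

def find_eps(n, f_first, f_last, lo=0.01, hi=1.0, iterations=60):
    """Bisection on eps so that the last channel lands on f_last."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        fcs = scale(mid, n, f_first, f_last)
        if len(fcs) < n or fcs[-1] > f_last:
            hi = mid                  # steps too large
        else:
            lo = mid                  # steps too small (or just right)
    return lo

N, F_FIRST, F_LAST = 256, 25.0, 15_000.0
eps = find_eps(N, F_FIRST, F_LAST)
fcs = scale(eps, N, F_FIRST, F_LAST)
print(f"eps = {eps:.6f}, fc(N-1) = {fcs[-1]:.2f} Hz")
```

Bisection works here because, for ε < 1, the last characteristic frequency grows monotonically with ε.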
4.2.6 Digital computation Both the ECR- and the CTF-filters can be digitally computed by means of one and the same recursive algorithm. To demonstrate this, consider the impulse responses of the respective filter types. The impulse response of a single ECR filter such as defined by (a) is
$$h_r(t) = \delta(t) + \alpha\,e^{-at}\left(\cos\omega_0 t - \frac{a}{\omega_0}\sin\omega_0 t\right). \qquad \text{(r)}$$
The impulse response of a singular CTF filter such as defined inside the power function of (d) is
$$h_n(t) = \frac{a_n^2 + \omega_n^2}{\omega_n}\,e^{-a_n t}\sin\omega_n t. \qquad \text{(s)}$$
To compute the filter responses, the corresponding convolution integrals must be calculated. This can be achieved recursively, taking advantage of the formula

$$P^\circ_{m+1} = e^{-(a - j\omega)T_x}\,P^\circ_m + p_{m+1}\,\frac{1 - e^{-(a - j\omega)T_x}}{a - j\omega}. \qquad \text{(t)}$$
In (t), P° denotes a discrete, complex spectrum; it equals the FTT spectrum (Terhardt 1985) multiplied by ejωt. Tx denotes the sampling interval, m the sample number, and p the input signal to the filter. When P° according to (t) is split into its real and imaginary parts, one finds the components that are required for calculation of the convolution integrals of the PET system’s two types of filter. For the output samples q of one of the ECR filters, this eventually yields the recursive formula

$$q_{m+1} = p_{m+1} + \alpha X_{m+1} - \frac{\alpha a}{\omega_0}\,Y_{m+1} \qquad \text{(u)}$$
where Xm+1, Ym+1 denote the real and imaginary parts of P°m+1. Likewise, for the output of a singular CTF filter, one obtains
$$q_{m+1} = \frac{a_n^2 + \omega_n^2}{\omega_n}\,Y_{m+1}. \qquad \text{(v)}$$
The output of a k-th order CTF filter is obtained by k-fold concatenation of the algorithm (v).
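The recursion (t) and the output formulas (u) and (v) translate into a few lines of code. The following is a minimal sketch with hypothetical function names, not the original program. It checks the step responses, which should settle at the DC gain of 1 for both filter types; the sampling rate and the CTF parameters are example values.

```python
import cmath

def complex_recursion(p, a, omega, t_x):
    """Eq. (t): complex spectrum P° for the input samples p (list of floats)."""
    c = complex(a, -omega)                 # a - j*omega
    decay = cmath.exp(-c * t_x)
    gain = (1.0 - decay) / c
    state = 0j
    out = []
    for sample in p:
        state = decay * state + sample * gain
        out.append(state)
    return out

def ecr_section(p, alpha, a, omega0, t_x):
    """Eq. (u): output of one ECR filter section."""
    spectrum = complex_recursion(p, a, omega0, t_x)
    return [x + alpha * P.real - alpha * a / omega0 * P.imag
            for x, P in zip(p, spectrum)]

def singular_ctf(p, a_n, omega_n, t_x):
    """Eq. (v): output of one singular CTF filter."""
    scale = (a_n ** 2 + omega_n ** 2) / omega_n
    return [scale * P.imag for P in complex_recursion(p, a_n, omega_n, t_x)]

def ctf(p, a_n, omega_n, t_x, k):
    """k-fold concatenation of the singular CTF filter."""
    for _ in range(k):
        p = singular_ctf(p, a_n, omega_n, t_x)
    return p

t_x = 1.0 / 20_000.0                       # assumed sampling interval (20 kHz)
step = [1.0] * 2000                        # 0.1 s unit step
ctf_out = ctf(step, a_n=636.4, omega_n=6315.3, t_x=t_x, k=5)   # roughly the 1-kHz channel
ecr_out = ecr_section(step, alpha=10_000.0, a=5_000.0, omega0=20_000.0, t_x=t_x)
print(round(ctf_out[-1], 6), round(ecr_out[-1], 6))
```

The update (t) is exact when the input is constant over each sampling interval, which is why the step response of a single stage reaches exactly 1.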
4.2.7 PET and gammatone The so-called gammatone filterbank (Patterson et al. 1992, Slaney 1993, Cooke 1993, cf. de Boer and Kuyper 1968), which has found application in a considerable number of auditory models, is another type of filter having the advantage that it accounts for the transmission-line character of auditory Fourier analysis. There is indeed a relationship between the gammatone and the CTF types of filter. In their first-order versions, the CTF and the gammatone filters are even almost identical. The impulse response that defines the gammatone filter has the form
$$g(t) = t^{k-1}\,e^{-at}\cos(\omega t + \varphi). \qquad \text{(w)}$$
For k = 1, this becomes essentially equivalent to the impulse response of the singular CTF filter (s). However, to adequately model auditory filtering, k >1 is required. For instance, in the present PET system, k = 5 was chosen as being optimal, while the gammatone filter is ordinarily used with k = 4. For k >1 the two types of filter are different. The impulse response of the k-th order CTF filter, the transfer function of which was defined by Eq. (d), has the form
$$h_n(t) = (-1)^{k-1}\,\frac{(a_n^2 + \omega_n^2)^k\,t^k\,e^{-a_n t}}{2^{k-1}(k-1)!\,\omega_n^{k-1}}\left(\frac{d}{d(\omega_n t)}\right)^{k-1}\frac{\sin\omega_n t}{\omega_n t}. \qquad \text{(x)}$$
As one can see, for k >1, (w) and (x) differ in essential respects. Correspondingly, the pertinent transfer functions are different as well. The similarities and differences between the CTF and the gammatone filters can be traced back to the manner in which these filters were defined. While the gammatone filter was defined in the time domain, i.e. by the impulse response (w), the CTF filter was defined in the Fourier-transform domain, i.e. by the transfer function (d). As a consequence, the transfer function of the CTF filter is formally simple while that of the gammatone filter is formally complicated; the reverse applies to the respective impulse responses.
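The first-order equivalence can be illustrated numerically. The sketch below assumes a gammatone phase of φ = −π/2, which turns the cosine in (w) into a sine; this phase, the parameter values and the function names are assumptions made for the illustration. With that choice, the singular CTF impulse response (s) is the gammatone impulse response scaled by a constant.

```python
import math

def gammatone(t, k, a, omega, phi):
    """Eq. (w): gammatone impulse response."""
    return t ** (k - 1) * math.exp(-a * t) * math.cos(omega * t + phi)

def singular_ctf_impulse(t, a_n, omega_n):
    """Eq. (s): impulse response of the singular CTF filter."""
    return (a_n ** 2 + omega_n ** 2) / omega_n * math.exp(-a_n * t) * math.sin(omega_n * t)

a, omega = 636.4, 6315.3                  # example values (roughly the 1-kHz channel)
scale = (a ** 2 + omega ** 2) / omega     # constant factor between the two responses
times = [i * 1e-4 for i in range(50)]     # 0 to 4.9 ms
pairs = [(singular_ctf_impulse(t, a, omega),
          scale * gammatone(t, 1, a, omega, -math.pi / 2.0)) for t in times]
max_diff = max(abs(h - g) for h, g in pairs)
print(f"largest difference: {max_diff:.2e}")
```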
4.3 Nonlinear mechanics of the organ of Corti Frank Boehnke
In addition to the nonlinear transfer function of the outer hair cells (OHC, e.g. Preyer and Gummer 1996) we suggested that the nonlinearities of the Deiters cells (DC) may also be responsible for nonlinear basilar membrane (BM) motion (Boehnke and Arnold 1998). To examine this, we used continuum mechanics to simulate a mechanical system consisting of a short BM section (60 µm long and 306 µm wide). The various parameters used for the different tissues will not be repeated here. The resulting truss was discretized by finite elements and evaluated numerically. Of course, the reduction of the human BM, which is about 32 mm long, to a short segment is a considerable simplification. However, the main objective of this work was to examine nonlinear mechanisms, and nonlinear mechanisms identified in a short segment of the organ of Corti (OC) can be extrapolated to the complete OC. The phalangeal processes (PhP, diameter 1 µm) connect the bases of OHCs (diameter 8 µm) with the heads of the next-but-one OHCs. The surface of the OHC, together with the endings of the PhP of the DC, constitutes the reticular lamina (RL), a stiff layer containing a high concentration of cytokeratin (Arnold and Anniko 1990). A close inspection of the BM shows that it is a mixture of rigid transversal fibres embedded into a ground substance (Iurato 1962, Voldřich 1978); thus we idealized the BM as an orthotropic plate. The back side of the plate (Fig. 4.3.1) is clamped, whereas the opposite side, at the ligamentum spirale, is solely supported with non-zero results of the x-rotational degree of freedom. The reticular lamina (RL) was idealized as an isotropic plate, 40 µm long and 131 µm wide. Inner (IPC) and outer (OPC) pillar cells were idealized as three-dimensional elastic rectangular beams. At the RL, IPC and OPC axes created an angle of 72°. The mechanical behaviour of the OC depends essentially upon the interaction of the OHC with their supporting structures.
Though the basal parts of DC consist of cell bodies of considerable size, the mechanical behaviour is determined by the stiff trunks constituting the support for the OHC. They are covered by three-dimensional beams whose thickness is only 1 µm. Figure 4.3.1 shows the complete assembled system including the beam pairs representing the DC that are considered in the passive nonlinear mechanical system. For the development of a preliminary passive mechanical system we consider the plasma membrane of the OHC to be decisive for the mechanics of the OHC. The wall structures of OHC justify their idealization as straight isotropic pipes described by the theory of thin cylindrical shells. All cells of the organ of Corti were idealized as mechanical components whose mechanical behaviour is governed by the equations of elastomechanics. Because of the three-dimensional finite element model (FEM), six unknowns per node had to be calculated. These unknowns are three displacements (ux, uy, uz) and three rotations (rotx, roty, rotz). The couplings at the interfaces between different components were carried out for all six degrees of freedom. The complete system consists of five different element groups (orthotropic shell, isotropic shell, thick beam, thin beam and pipe) and is discretized by 596 elements in all. The displacement uz of an upper (at the RL) and lower (at the DC) OHC ending (Fig. 4.3.1) was chosen as output variable of the system. We stimulated with two sinusoidal components (f1 = 1 kHz, f2 = 1.2 kHz) of frequency ratio f2/f1 = 1.2. The output variable uz whose spectrum is shown in Fig. 4.3.2 is the z-displacement of the joint between the lower ending of an OHC and the top of the trunk of the connected DC. Signals with frequencies up to 2.66 kHz could be analyzed without aliasing errors. Taking the geometrical nonlinearities of the thin elastic beams representing the PhP of DC into account, additional frequency components arose during sinusoidal stimulation. Comparison of the calculated nonlinear spectrum with measurements of BM displacements (Robles et al. 1991) under two-tone stimulation shows multiple spectral components in both cases (Fig. 4.3.2). Therefore, nonlinear distortions can be generated in the supporting structures of the organ of Corti and not, as is mostly suggested, exclusively in the outer hair cells. The nonlinearity of the Deiters cells with their phalangeal processes limits the displacement of the organ of Corti and, therefore, protects the organ from damage. Nonlinear distortion components appeared in the spectrum of the Deiters cells (Fig. 4.3.2) with a force amplitude of Fz = 0.1 nN. This is equivalent to a sound pressure level of L ≈ 60 dB SPL, if the area of the RL, ARL = 40 µm × 131 µm, is used as reference area. Considering geometrical nonlinearities of the thin beam elements, the force-displacement function shows a saturating behaviour which was found in experiments with increasing sound pressure levels (e.g. Johnstone et al. 1986).

Fig. 4.3.1: Diagrammatic representation of the finite element model of the organ of Corti, including supporting structures of outer hair cells.
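The conversion from force amplitude to sound pressure level, and the frequencies at which low-order distortion products of the two primaries would be expected, can be sketched as follows. The reference pressure of 20 µPa is the usual SPL reference and is an assumption here, as it is not stated in the text. The list of combination frequencies is generic for any nonlinearity and is not the spectrum computed by the finite element model.

```python
import math

# Force amplitude and reference area of the reticular lamina, as given in the text
force_n = 0.1e-9                      # 0.1 nN
area_m2 = 40e-6 * 131e-6              # 40 µm x 131 µm
pressure_pa = force_n / area_m2
spl_db = 20.0 * math.log10(pressure_pa / 20e-6)   # re 20 µPa (assumed reference)

# Combination frequencies m*f1 + n*f2 up to third order within the analysed band
f1, f2, f_max = 1000, 1200, 2660      # Hz
products = sorted({
    m * f1 + n * f2
    for m in range(-3, 4)
    for n in range(-3, 4)
    if 0 < abs(m) + abs(n) <= 3 and 0 < m * f1 + n * f2 <= f_max
})
print(f"p = {pressure_pa * 1000:.1f} mPa, L = {spl_db:.1f} dB SPL")
print("expected components (Hz):", products)
```

The cubic difference tones 2f1 − f2 and 2f2 − f1 fall at 800 Hz and 1400 Hz, well inside the band that could be analysed without aliasing.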
Fig. 4.3.2: Amplitude-spectrum of uz under two-tone stimulation and considering geometrical nonlinearities of the PhP.
5 Active mechanics and otoacoustic emissions in animals and man
5.1 Otoacoustic emissions in lizards Geoffrey A. Manley
The term otoacoustic emissions (OAE) describes the phenomenon in which sound energy can be measured in the external ear canal that was not presented in any stimulus. The finding of such OAE in humans (Kemp 1979) led to a flurry of activity among auditory researchers, since it provided evidence for an active process – presumably in hair cells – that could lie at the heart of a variety of unexplained phenomena in hearing, such as the extraordinary sensitivity and frequency selectivity of the peripheral hearing organ. OAE of non-mammals have been studied within the collaborative research centre since about 1990, following our discovery, during cooperative work with Dr. Graeme Yates and Prof. Brian Johnstone of the University of Western Australia, that lizards have both spontaneous (SOAE) and distortion-product (DPOAE) otoacoustic emissions (Manley and Köppl 1992, 1993, Köppl and Manley 1993, 1994, Köppl et al. 1993). Since then, we have studied these phenomena in a variety of lizards and birds. One main reason was to study the presence of phenomena associated with active cochlear processes in non-mammals. Can OAE in non-mammals tell us anything about the origin and mechanisms of active processes in vertebrate hearing organs? A thorough comparison of SOAE characteristics of mammals and non-mammals would be very useful in establishing those features that are basic to hair-cell function and those that are derived from unique morphologies or physiologies. These questions were recently reviewed by one of us (Köppl 1995). We have studied in more or less detail the characteristics of SOAE in 11 species of lizards from eight different families.
These are: the bobtail lizard, Tiliqua rugosa (a member of the family Scincidae), two gekkonids, the Tokay gecko, Gekko gecko, and the leopard gecko, Eublepharis macularius, a varanid, the steppes monitor Varanus exanthematicus, two teiids, the golden Tegu, Tupinambis teguixin, and the Chile Tegu, Callopistes maculatus, a cordylid, Cordylus tropidosternum, three iguanids, the brown Bahamian anole, Anolis sagrei, the green iguana, Iguana iguana and the basilisk lizard Basiliscus basiliscus, and finally an anguid, the Texas alligator lizard, Gerrhonotus leiocephalus. The reason for choosing a variety of lizard species is that the structure of the lizard hearing organ varies greatly, but systematically, between and even within families. We chose species for which data on the hearing-organ anatomy were available, or were known for closely-related species. In the following, the lizard species will be referred to using only their generic names. Some of the lizard SOAE data are also summarized in Table 5.1.1.
Table 5.1.1: Characteristics of SOAE in eight species of lizards. A and B groups – see text.

| Group | Species (number of specimens) | Frequency range of SOAE, kHz | Average nr. of SOAE per emitting ear | SOAE-amplitude min [dB SPL] | SOAE-amplitude max [dB SPL] | Frequency shift, oct/°C at 28–30 °C |
|---|---|---|---|---|---|---|
| | | 1.3 – 7.7 | 7 | -7.2 | 4.7 | 0.02 |
| A | Gerrhonotus (2) | 0.99 – 4.1 | 6 | -7 | 3.1 | 0.035 |
| A | Gekko (7) | | 12 | -6 | 10.3 | 0.035 |
| A | Eublepharis (9) | 3.31 – 4.0 | 4 | -6.6 | 4.9 | 0.03 |
| A | Tiliqua (18) | 0.93 – 4.61 | 10 | -10 | 9.3 | 0.03 |
| B | Varanus (7) | 0.75 – 3.2 | 5–6 | -6 | 27 | 0.055 |
| B | Tupinambis (6) | 1.4 – 4.6 | 7 | -7 | 15.5 | 0.08 |
| B | Callopistes (8) | 2.09 – 3.73 | 2.2 | -4.5 | 23.3 | 0.1 |
| A | Anolis (6) | 1.94 – 4.45 | | | | |
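The frequency-shift column of Table 5.1.1 is given in octaves per °C. As a small illustration of what such a value means, the sketch below converts a shift into a frequency change. The starting frequency and temperature step are hypothetical examples, not measured data, and the shift is assumed to be constant over the temperature interval.

```python
def shifted_frequency(f_hz, shift_oct_per_degc, delta_t_degc):
    """Frequency after a temperature change, for a constant shift in octaves per °C."""
    return f_hz * 2.0 ** (shift_oct_per_degc * delta_t_degc)

# Hypothetical example: a 3-kHz SOAE peak, 0.03 oct/°C, warming by 2 °C
f_new = shifted_frequency(3000.0, 0.03, 2.0)
print(f"{f_new:.1f} Hz")
```

A shift of 0.03 oct/°C thus moves a 3-kHz peak by roughly 60 Hz per °C.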
Compared to most other classes of vertebrates, an unusually high percentage of lizard ears, mostly 100 %, produces SOAE. There is no reason to doubt that these SOAE are a product of the normal function of the hearing epithelium. Peaks in the averaged ear-canal sound spectra were identified as SOAE when their frequencies were temperature-sensitive and when their peak level was suppressible by external tones. Across all species, SOAE centre frequencies lay between 0.75 and 7.7 kHz (Fig. 5.1.1). The number of emissions per ear varied from 2 to 15, and the number of peaks was roughly correlated with the structure of the hearing organ. In those species having a ‘primitive’ hearing-organ structure (see Manley 1990 for details; such organs have an unspecialized tectorial membrane covering all hair cells), such as Tupinambis, Callopistes and Varanus, only between two and seven SOAE peaks were found in each ear. These peaks were often large, up to 27 dB SPL. Compared to this, the other species tended to have a larger number of SOAE peaks that were of smaller amplitude, up to about 10 dB SPL. This correlated with the presence of a more specialized hearing-organ structure, showing either the complete absence of a tectorial membrane over the relevant hair-cell area or the presence of specialized tectorial structures such as sallets.
Fig. 5.1.1: Sample spectra of lizard SOAE from the ear canal of four individual animals representing four different species. The two species in the upper row are a teiid and a monitor lizard, which have a continuous tectorial membrane over their auditory papillae. In the two lower species, the tectorial membrane is either broken up into sallets, as in the gecko, or entirely absent from the high-frequency hair-cell area, as in the iguanid lizard Anolis.
5.1.1 The origin of lizard SOAE
In those species for which frequency maps of the hearing organ are available (Tiliqua, Gerrhonotus and Gekko, Manley 1990), the range of the centre frequencies of SOAE lies within the ‘high-frequency’ hair-cell area of the basilar papilla. Current evidence indicates that in lizards only hair cells with centre frequencies above about 1 kHz are predominantly micromechanically tuned (Köppl and Manley 1992, Manley 1990). SOAE are thus only generated by micromechanically-tuned hair-cell groups.
5 Active mechanics and otoacoustic emissions in animals and man
In the averaged spectra, SOAE peaks resemble very narrow-band noise. The frequency bandwidths differ considerably between different SOAE and generally overlap with SOAE bandwidths measured for other species, including mammals, although lizard SOAE tend to be wider than most of those measured in, for example, humans (which can be as narrow as 1 Hz). The SOAE bandwidth correlates with the amplitude, however, and most lizard SOAE have small amplitudes.
Is there evidence that SOAE represent the output of actively-oscillating cell groups? In collaboration with Pim van Dijk of the University of Groningen, Netherlands, we examined this question in an analysis of the statistical properties of a number of SOAE from different lizard species and one bird species, the barn owl (see also below). The evidence clearly suggested that the SOAE of these non-mammals have their origin in some kind of active oscillations (van Dijk et al. 1996). It is therefore reasonable to expect studies of SOAE to provide useful and even detailed information about the characteristics of the hearing organ.
Since data from several species indicate that in lizards there is no frequency-selective basilar-membrane travelling wave (see Köppl and Manley 1992, Manley 1990), the origin of SOAE has to be sought at the local level, in terms of motility of particular groups of hair cells and, if they are attached to a tectorial structure, in the driven motility of this structure also. Since studies of isolated lizard papillae indicate that acoustic stimuli activate hair cells through a side-to-side movement of the auditory papilla and its tectorial structure, the origin of SOAE is presumably in such movements generated spontaneously by the hair cells.
Whereas in mammals it is generally assumed that the outer hair cells (OHC) produce rapid length changes that couple back into the motion of the epithelium, motility of the hair-cell body is unlikely in non-mammalian hearing organs (Köppl 1995, Manley 1995). Another possible source of mechanical energy is the hair-cell bundle itself, which has been shown to undergo active movements, including rapid, biphasic twitches during deflection and even spontaneous twitching (see Manley and Gallo 1997 for references). To investigate the possibility that the putatively myosin-driven hair-cell adaptational mechanism could supply the energy of emissions, we studied SOAE in Anolis, a lizard in which the relevant hair-cell area lacks a tectorial structure. We were able to attribute each SOAE peak to a particular group of hair cells by projecting the SOAE frequency limits onto a frequency map of the papilla and, from the known anatomy, estimating the number of hair cells generating each peak. By estimating both the energy in each peak from the sound pressures in the ear canal and the number of myosin molecules in the adaptational motors of the cells involved, we were able to calculate the power output of the putative myosin generators (Fig. 5.1.2; Manley and Gallo 1997). Our study suggested that the power produced would, in this situation, be adequate to provide the energy for SOAE. Whether the mechanism really employs myosin or not will have to await further studies.
Detailed observations of the variation of SOAE frequency and amplitude over time in various species of lizards showed that the wide bandwidth in averaged spectra results from generators whose frequency is unstable and varies very rapidly. The unaveraged signal occurs most often at frequencies near the centre of the averaged spectral peak and less often at frequencies on the flanks. The differences in bandwidth seen between SOAE in the various classes of vertebrates may thus simply be a reflection of the frequency stability of the SOAE generators. By this criterion, the SOAE generators in lizards show poor frequency stability. This idea is compatible with the fact that lizard SOAE may shift their centre frequency a great deal (up to several hundred Hz) under the influence of external tones.
Fig. 5.1.2: Left: The dependence of the power in SOAE peaks of the lizard Anolis on the number of hair cells producing the peaks. The thicker dashed line is a linear regression to all the data; the thinner line is a regression that omits the three points with the highest power, which may be outliers. Right: The sound power emitted per hair cell as a function of the frequency of the SOAE peak. The line is a linear regression on all the data. All correlations are significant.
5.1.2 The influence of changes in head temperature on lizard SOAE
The centre frequency of SOAE can be influenced by the head temperature, all species showing essentially the same behaviour. SOAE amplitude is not strongly influenced by the temperature, except that at very low and very high temperatures, all peaks tend to disappear. The centre frequency of SOAE shifts to higher values when the animals are warmed and to lower frequencies when cooled, but the rate of change of frequency with temperature is somewhat species-specific (Table 5.1.1, Fig. 5.1.3). This specificity is related to the anatomical characteristics of the hearing organ in a similar way to the relationship between the anatomy and the spectral features of SOAE. The change with temperature of the SOAE frequency was two to three times higher in the less specialized papillae of Callopistes, Tupinambis and Varanus, being 100 to 180 Hz/°C, or about 0.055 to 0.1 oct/°C. The average shift in species with more specialized papillae was only about 60 Hz/°C (or 0.015 to 0.06 oct/°C). The shift of centre frequency was generally dependent both on the frequency itself and on the temperature range in question. In general, the shift was greater for SOAE of higher frequency, both absolutely (i.e. when expressed in Hz/°C) and proportionally (when expressed in octaves/°C). The size of these shifts depends on the temperature range such that for each species, the smallest shifts were observed in a temperature range that corresponds, where this parameter is known, to the ecologically optimal range. Thus the temperature sensitivity of the SOAE is lowest at the temperatures at which the animal spends most of its time when active. Within this range also, the largest number of SOAE peaks were recordable. This of course raises interesting questions as to how this range is matched at the cellular level, about which nothing is known.
Fig. 5.1.3: Temperature effects on the centre frequencies of SOAE in two species of lizards, to illustrate that the magnitude of the temperature-related frequency shift can vary between species. Each line (continuous for the teiid lizard Callopistes, dotted for the skink Tiliqua) shows the change in the frequency of a single SOAE peak over the range of temperatures for which its amplitude was large enough to permit a frequency measurement.
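The two ways of expressing the temperature shift (Hz/°C and octaves/°C) are related through the centre frequency of the emission. The following Python sketch only illustrates this arithmetic; the function name and the example emission at 2570 Hz are assumptions chosen for the calculation, not values reported above.

```python
import math

def shift_in_octaves(centre_hz: float, shift_hz_per_degc: float) -> float:
    """Convert an absolute shift (Hz/°C) at a given centre frequency
    into a proportional shift (octaves/°C)."""
    return math.log2((centre_hz + shift_hz_per_degc) / centre_hz)

# Hypothetical example: an emission at 2570 Hz that shifts by 100 Hz/°C.
example = shift_in_octaves(2570.0, 100.0)
print(f"{example:.3f} oct/°C")
```

Under these assumed numbers, a shift of 100 Hz/°C corresponds to roughly 0.055 oct/°C. The same absolute shift at a higher centre frequency gives a smaller proportional shift, which is why the two measures have to be compared separately.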
5.1.3 Interactions between SOAE and external tones
A second manifestation of frequency instability in the generators of lizard SOAE is the relatively high sensitivity of their centre frequency and of their amplitude to the presence of external tones. External tones are able to cause large shifts in the centre frequency of SOAE. SOAE amplitude can be negatively (suppression) or positively (facilitation) influenced. The magnitude of the effects is dependent both on the frequency distance between the external tone and SOAE and the external tone’s level. External tones closer in frequency and of higher level are more effective.
5.1.3.1 Level suppression by external tones
Externally-applied tones influence SOAE in a strongly frequency-selective way. At any given frequency of external tone that is close enough to the centre frequency of the SOAE, increasing the level of the external tone will affect the characteristics of the SOAE. The first effect observed is that of suppression – the level of the SOAE falls. A criterion level of suppression can be determined for all tones that influence the emission, and it becomes clear that the SOAE has the lowest suppression threshold for tones very close to its own frequency. The tone level necessary to reach criterion suppression rises as the tone frequency is moved away in either direction. Thus suppression tuning curves (STC) are V-shaped (Fig. 5.1.4; e.g. Köppl and Manley 1994, Manley et al. 1996) and can be defined for each SOAE at arbitrary criterion levels; in most cases, we used a criterion of 2 dB suppression. In all cases, the most sensitive frequency lies very close to the centre frequencies of the respective SOAE. This is in contrast to findings in mammals, for which the centre frequencies of STC tend to be at a frequency slightly higher than that of the SOAE (e.g. Schloth and Zwicker 1983), reflecting the presence in mammals of a highly non-linear region at a position slightly basal to the centre-frequency position. There is no evidence in non-mammals for such a discrepancy.
The lizard STC bear a remarkable similarity in shape and in frequency selectivity to excitatory tuning curves of primary auditory nerve fibres of the same species (Tiliqua, Gekko and Gerrhonotus, see Table 5.1.2). The similarities extend to details of the respective parameters such as frequency selectivity, measured as the Q10dB, and the slopes of the flanks (Manley and Köppl 1992, Köppl and Manley 1994).
Since this has also been shown for other species, it suggests that STC of SOAE give as good information concerning frequency-selectivity characteristics as do neural tuning curves.
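The Q10dB measure used in this comparison is the centre frequency divided by the bandwidth of the tuning curve 10 dB above its tip. The sketch below shows one way such a value could be read from a sampled V-shaped curve. It is an illustration under stated assumptions: the function name, the interpolation on a logarithmic frequency axis, and the synthetic curve (tip at 2 kHz, flanks of -50 and +100 dB/octave) are not taken from the measurements described here.

```python
import math

def q10db(freqs_hz, thresholds_db):
    """Estimate Q10dB of a V-shaped tuning curve.

    freqs_hz must be ascending; thresholds_db holds the matching criterion
    levels. The two crossing points 10 dB above the tip are found by
    linear interpolation on a log2 frequency axis.
    """
    tip = min(range(len(freqs_hz)), key=lambda i: thresholds_db[i])
    target = thresholds_db[tip] + 10.0

    def crossing(indices):
        # Walk away from the tip until the curve first reaches the target.
        prev = tip
        for i in indices:
            if thresholds_db[i] >= target:
                x0, x1 = math.log2(freqs_hz[prev]), math.log2(freqs_hz[i])
                y0, y1 = thresholds_db[prev], thresholds_db[i]
                return 2.0 ** (x0 + (target - y0) * (x1 - x0) / (y1 - y0))
            prev = i
        raise ValueError("curve does not rise 10 dB above its tip")

    low = crossing(range(tip - 1, -1, -1))
    high = crossing(range(tip + 1, len(freqs_hz)))
    return freqs_hz[tip] / (high - low)

# Synthetic curve (assumed values): tip at 2 kHz and 10 dB SPL.
octave_offsets = (-1.0, -0.5, 0.0, 0.25, 0.5)
freqs = [2000.0 * 2.0 ** o for o in octave_offsets]
levels = [60.0, 35.0, 10.0, 35.0, 60.0]
print(round(q10db(freqs, levels), 2))
```

Because the same function can be applied to a suppression tuning curve and to a neural tuning curve, it gives a like-for-like sharpness value for both kinds of data.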
Table 5.1.2: The average values and the ranges of the parameters measured from the STC of SOAE and primary afferent tuning in Tiliqua.
Parameter                         Primary afferents         SOAE suppression tuning
Range of centre frequencies       0.2 to 4.5 kHz            0.91 to 4.18 kHz
Best thresholds                   6 to 78 dB SPL            9 to 33 dB SPL
Slopes of low-frequency flanks    -10 to -115 dB/octave     -20 to -95 dB/octave
Slopes of high-frequency flanks   10 to 325 dB/octave       55 to 215 dB/octave
Q10dB-values                      0.2 to 9                  1.7 to 21
5.1.3.2 Facilitation by external tones
For particular combinations of frequency and level of external tones, facilitation of SOAE level is seen (e.g. Köppl and Manley 1994), in some cases up to 17 dB. Facilitation is especially large in lizards as compared to other species, and it is possible that the external tone stabilizes the SOAE oscillator, which is manifest in a higher peak level. However, facilitation at least in some species occurs at frequencies and levels of tones that match notches on the flanks of the STC (Fig. 5.1.4). At least in Tiliqua, these notches are also seen in frequency-tuning curves for primary auditory nerve fibres (Köppl and Manley 1994, Manley et al. 1990), suggesting that they reflect some important phenomenon that influences tuning selectivity at the hair-cell level. The simplest explanation (Manley 1997) is that at these frequencies and levels, hair-cell groups and their tectorial structures move in phase. This would produce no relative stereovillar-bundle deflection and thus no activation of the neural afferent pathway, but the motion of the whole organ would produce larger emissions than normal, since they are somehow being driven by the tone. At frequencies where the SOAE is not being driven by the applied tone, we presume that only the tectorial structure shows significant (in that case hair-cell-driven) motion. This concept of the origin of facilitation is compatible with the idea that in lizards, the shape of the excitatory tuning curve is determined by the micromechanical characteristics of the hair-cell stereovillar bundles and the local tectorial structure (Manley et al. 1988, 1989), and that the relative motion of these structures changes (in phase and magnitude) across frequency.
Since we have also seen facilitation in lizards that lack a tectorial membrane over their high-frequency hair-cell area (Gallo 1997), it is possible that the fluids surrounding the hair-cell bundles can to some extent substitute for a tectorial structure.
5.1.3.3 Shifts in SOAE frequency due to external tones
External tones can shift the frequency of an SOAE peak (Fig. 5.1.4). The frequency may shift away from that of the external tone (frequency pushing) or, less often, towards the external tone (frequency pulling). Frequency pulling generally only occurred within a small range of frequencies below the SOAE. In mammals, synchronization to the external tone, a form of frequency pulling, has been described, but there it only occurs in a range of ± 10 Hz of the SOAE (Zwicker and Schloth 1984). The greater range of frequency pulling in lizard SOAE presumably reflects their frequency instability.
Overall, SOAE from non-mammals behave very similarly to those of mammals. The differences are quantitative, rather than qualitative. The most obvious difference lies in the greater frequency stability of mammalian SOAE. The recently-discovered broader-bandwidth SOAE of mammals (Ohyama et al. 1991, 1992, Whitehead et al. 1993), however, have reduced the size of this species discrepancy considerably. The great many similarities between mammalian and non-mammalian SOAE are an important finding. They indicate that the hypothesis that SOAE in all terrestrial vertebrates are generated by the same mechanism(s) is a logical conclusion, based on evolutionary relationships, and cannot be dismissed lightly.
Fig. 5.1.4: Summary of the different effects of applying external tones on an SOAE peak, in this case in the cordylid lizard Cordylus. This diagram combines the STC for 2 dB suppression (thin continuous line), the regions in which facilitation to 2 dB occurs (closed grey loops) and the areas within which a downward frequency shift of the SOAE of 35 Hz occurs (dashed lines). It is typical of other lizards also to find the facilitation areas below dips or plateaus along the STC flanks.
5.1.4 Distortion-product otoacoustic emissions in lizards
SOAE only occur at particular frequencies, and only a few SOAE per ear are found in the less specialized lizard papillae. Nevertheless, it would be useful to apply some of the non-invasive SOAE techniques to some related phenomena. We have done this, using emissions induced by presenting two tones to the ear. Under these conditions, the hearing organ emits at various frequencies that are mathematically related to combinations of the stimulus frequencies presented. These distortion-product otoacoustic emissions (DPOAE) are the result of driving the non-linear transfer function of the hair-cell transducer system with two tones simultaneously and have long been studied psychoacoustically as well as neurophysiologically. Such DPOAE are, for example, the quadratic product f1+f2 (where f1 and f2 are the frequencies of the two tonal stimuli), or the cubic distortion products 2f1-f2, 2f2-f1, 3f2-2f1, etc. In humans, clinical interest relates to their use as ‘probes’ to objectively screen the ears of infants, for example, that are not able to communicate their sensations (see Ch. 9). There is thus great interest in establishing to what extent the characteristics of DPOAE reflect the normal function of the hearing organ.
Since the first study of DPOAE in lizards (Rosowski et al. 1984) was carried out before fast and sensitive measurement systems became available, we extended their studies to other species. The most detailed data are from Tiliqua, collected during collaborative work at the University of Western Australia (Manley and Köppl 1993, Köppl et al. 1993, Manley et al. 1990). Since DPOAE, by their very nature, are fixed in frequency (their frequency being determined only by the frequencies of the primary tones), neither temperature nor external tones can affect this parameter. Apart from this, however, other parameters of measurement of the characteristics of DPOAE behave in a similar way to SOAE.
Thus DPOAE can be suppressed and facilitated (by a third tone) in a frequency-dependent way, and STC have been measured (Gallo 1997, Manley and Köppl 1993). The characteristics of these STC reflect, like the STC for SOAE, the features of the neural tuning (Manley et al. 1992). Due to the nature of their generation, however, their characteristics do differ somewhat from those of SOAE-STC and neural tuning curves. The thresholds of DPOAE-STC are higher, for example, because generating DPOAE requires a higher degree of activation of the hearing organ than SOAE, which of course arise spontaneously. In addition, the two frequencies generating DPOAE must have a finite distance from each other, and together they presumably activate a broader region of the hearing organ (sometimes much broader) than is activated by an SOAE. Thus it is not to be expected that STC of DPOAE suppression (the third tone influences primarily the region of the two primary tones) would be as sharply tuned as STC for SOAE suppression. With these caveats in mind, however, it is clear that the features of data derived from DPOAE generation, suppression and facilitation also reflect the frequency selectivity of the hearing organ.
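Because DPOAE frequencies are fixed entirely by the two primary tones, they can be tabulated before any measurement is made. The short Python sketch below lists the products named in this section. The function name is an assumption, and the primaries of 7.7 and 8.7 kHz are used purely as example values.

```python
def distortion_products(f1_hz: int, f2_hz: int) -> dict:
    """Frequencies of the distortion products named in the text (f1 < f2)."""
    if not 0 < f1_hz < f2_hz:
        raise ValueError("expected 0 < f1 < f2")
    return {
        "f1+f2": f1_hz + f2_hz,            # quadratic sum tone
        "2f1-f2": 2 * f1_hz - f2_hz,       # cubic product below f1
        "2f2-f1": 2 * f2_hz - f1_hz,       # cubic product above f2
        "3f2-2f1": 3 * f2_hz - 2 * f1_hz,  # higher-order product above f2
    }

# Example primaries in Hz (illustrative values only).
products = distortion_products(7700, 8700)
for name, freq in products.items():
    print(f"{name}: {freq} Hz")
```

Working in integer Hz keeps the arithmetic exact, which is convenient when the computed frequencies are later matched against spectral bins.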
5.2 Otoacoustic emissions in birds Geoffrey A. Manley and Grit Taschenberger
In birds, we initially measured simultaneous-evoked emissions (SEOAE) from the starling (Manley, Schulze and Oeckinghaus 1987). This was followed later by measurements of distortion-product emissions (DPOAE) from the starling and chicken (Kettembeil, Manley and Siegl 1995) and from the barn owl (Taschenberger 1995). SOAE were only found in the barn owl (Manley and Taschenberger 1993, 1997). In addition, we found that anaesthetic agents depress emissions in birds (Kettembeil, Manley and Siegl 1995). Barn owls become anaesthetized with comparatively low doses of Ketamine, which may increase the detectability of SOAE in this species.
5.2.1 Simultaneous-evoked emissions in the starling
We reported SEOAE from the starling, which appeared as ripples in the sound-pressure waveform of a low-level swept tone in the external ear canal (Manley et al. 1987) due to destructive and constructive interactions between the added sound and the emissions. From the changes in sound pressure as compared to the expected levels, the emission sound levels could be calculated. The emissions were fairly broad-band, were non-linear in their growth with increasing SPL and were also suppressible by second tones. The most effective suppressor tones in each case were near the emission frequency. Such emissions were measurable mostly in the upper half of the starling’s hearing range, and varied in level between -30 and +2 dB SPL. Suppression tuning curves were measured by adding second tones and searching for a criterion amount of suppression; their shape was strongly reminiscent of neural tuning curves in the same species (Manley et al. 1985).
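How an emission level can be derived from such ripples may be illustrated with a simple interference model. The Python sketch below is not the procedure used in the study; it assumes that the emission adds fully in phase with the stimulus at the ripple peaks and fully out of phase at the troughs, and the function name and example numbers are hypothetical.

```python
import math

def emission_level_db_spl(stimulus_db_spl: float, ripple_db: float) -> float:
    """Emission level implied by a peak-to-trough ripple in ear-canal level.

    ASSUMPTION: peak pressure is S + E and trough pressure is S - E, where
    S is the stimulus pressure and E the emission pressure.
    """
    if ripple_db <= 0:
        raise ValueError("ripple must be positive")
    r = 10.0 ** (ripple_db / 20.0)   # pressure ratio (S + E) / (S - E)
    ratio = (r - 1.0) / (r + 1.0)    # emission relative to stimulus, E / S
    return stimulus_db_spl + 20.0 * math.log10(ratio)

# Hypothetical example: a 1 dB ripple on a 20 dB SPL swept tone.
print(round(emission_level_db_spl(20.0, 1.0), 1))
```

With these assumed numbers the implied emission is about -4.8 dB SPL. Real ripples also depend on the phase rotation of the emission across frequency, which this sketch ignores.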
5.2.2 Spontaneous otoacoustic emissions in birds (SOAE)
We have scanned the sound field in the ear canal of three species of birds, the chicken, starling and barn owl, but have detected SOAE only in the barn owl. There, we observed them in 79 % of ears, with an average of 1.9 per ear (Manley and Taschenberger 1993, Taschenberger and Manley 1997). SOAE had centre frequencies between 2.3 and 10.5 kHz, but almost all (93 %) were higher than 7.5 kHz (Fig. 5.2.1). The SOAE thus originated primarily in the upper quarter of the animal’s hearing range, and derived from the specialized area of the auditory fovea (Köppl et al. 1993, see also Sect. 3.3). The barn owl in fact shows the highest-frequency SOAE ever reported from non-mammals. Their peak amplitudes lay between -5.8 and 10.3 dB SPL, but they were unstable both in amplitude and in frequency during and between recording sessions. The centre frequency of SOAE was temperature dependent, shifting at an average rate of 0.039 octaves/°C (Taschenberger 1995). The smallest shifts were seen near the animal’s normal body temperature.
Using ipsilateral external tones, SOAE could be suppressed. Suppression tuning curves (STC) were V-shaped, with the sensitive tip (down to about 0 dB SPL) near the centre frequency of the SOAE (Fig. 5.2.1). For SOAE with frequencies between 2.5 and 10.5 kHz, the Q10dB-values of 2 dB-iso-suppression tuning curves varied from 1.07 to 10.40. These ranges resemble those for single eighth-nerve afferent responses (Köppl 1997a, see Sect. 2.5). When compared with the cochlear map (Köppl et al. 1993), it is obvious that SOAE in this species originate primarily from the auditory fovea. Both the frequency and the amplitude of SOAE could be influenced by contralateral stimuli (Taschenberger 1995, see also Sect. 2.6.3). Contralateral band-pass noise generally induced a suppression of amplitude and an upward shift of frequency.
Fig. 5.2.1: Left: Three SOAE peaks in one barn owl before (dashed line) and during (continuous line) the presentation of an external tone at 9 kHz, 28 dB SPL. This tone suppresses the amplitude of all three SOAE. Right: Iso-suppression tuning curves for 2 dB suppression of two SOAE in two different barn owl ear canals. The legend indicates the animal number, the degree of suppression, the centre frequency of the unsuppressed SOAE and the tuning sharpness measure Q10dB.
5.2.3 Distortion-product OAE in the chicken, starling and owl
We measured the distortion-product emission 2f1-f2 and, in some cases, 2f2-f1 in the ear canal of both awake and anaesthetized European Starlings and chickens (Kettembeil and Manley 1995) and in the barn owl (Fig. 5.2.2; Taschenberger 1996, Taschenberger et al. 1995). The effect of a third suppressive tone and the behaviour of the DP under deeper anaesthesia (starling and chicken) were also studied. In general, the DPOAE in birds first appeared at relatively high primary-tone levels. The best frequencies of third tones suppressing 2f1-f2 lay near the first primary tone (f1), suggesting DPOAE generation near the f1-place in the papilla. Facilitation via a third tone was also seen often at levels below those eliciting suppression. In the chicken and starling, the DPOAE 2f1-f2 disappeared completely at the onset of deep anaesthesia and recovered to its original magnitude when the anaesthesia was lightened, sometimes with a considerable delay. Control experiments showed that this effect was not a result of hypoxia (Kettembeil et al. 1995).
Fig. 5.2.2: The characteristics of DPOAE in the barn owl. In A are shown three DPOAE that arose as the result of presenting two tones at 7.7 and 8.7 kHz, 60 dB SPL at the eardrum. In B, the optimal ratios are shown as a function of the frequency of f1. In C, two examples of suppression tuning curves (-6 dB) for DPOAE are shown, elicited by primary-tone levels of 55 dB in both cases, and primary-tone frequency ratios of 1.25 (left curve, f1 = 6 kHz) and 1.22 (right curve, f1 = 7 kHz). In D, the suppression of a DPOAE by contralateral tones is shown, specifically the dependence of the DPOAE level on the frequency of the contralateral tone. The parameter is the sound-pressure level of the contralateral tone, in steps of 5 dB from 15 to 50 dB SPL. The dotted curve shows the ipsilateral sound pressure without tones in either ear.
Although the barn owl has such a specialized papilla (see also Sect. 3.2), many features of its DPOAE were found to be very similar to those described for the chicken and starling (Taschenberger 1995). As a measure of frequency selectivity, the Q10dB-value of the STC increased as a function of frequency up to 15.8 and strongly resembled the equivalent increase in frequency selectivity with CF of barn-owl auditory-nerve fibres (Köppl 1997a).
5.3 Otoacoustic emissions and cochlear mechanisms in mammals Manfred Kössl
The auditory sense organ in mammals probably evolved originally in adaptation to a nocturnal habitat, and primitive mammals may have relied on listening to high-frequency noise from prey or predators. Therefore, in contrast to reptiles and birds, the mammalian hearing organ is designed to yield good sensitivity at very high frequencies, up to about 200 kHz in some echolocating mammals. Three middle ear bones and the strict functional division of the receptor cells into inner and outer hair cells can be seen as an adaptation for high-frequency hearing. To achieve good mechanical sensitivity and sharp tuning, low-level sound signals are actively amplified in the mammalian cochlea by the outer hair cells (e.g. Dallos 1992). As a byproduct of the active and nonlinear amplification, otoacoustic emissions (OAEs) are generated by the cochlea and are transmitted through the middle ear to the tympanum, where they can be measured with a microphone. OAEs were first discovered in humans by Kemp (1978) and have since proved to be a robust phenomenon present in all terrestrial vertebrates investigated so far (e.g. Probst et al. 1991, Köppl 1995). There are several classes of OAEs: Spontaneous OAEs can be measured without sound stimulation and seem to be a direct consequence of cellular force generation in the organ of Corti. In humans, they are generally of low level below the threshold of hearing and are found close to local threshold minima (Zwicker and Schloth 1984). They appear at only a few frequencies; in some ears they cannot be detected at all, and they are more frequent in women than in men (Moulin et al. 1993). In small mammals, they are only rarely found. We investigated the occurrence of SOAEs in a number of bat species, gerbils, mole rats and opossums. SOAEs were only found in the mustached bat. In about 10 % of these bats, strong SOAEs of up to 40 dB SPL could be recorded for a short period of time (Kössl and Vater 1985a, Kössl 1994).
These SOAEs are at about 60 kHz (Fig. 5.3.1). At the same frequency, which is just above the dominant echolocation frequency (CF2: second harmonic constant frequency component of the call), peripheral auditory tuning is considerably enhanced. At and basal to the cochlear frequency place of the emissions, there are pronounced structural discontinuities of the basilar membrane (BM) and the tectorial membrane (TM). This suggests that the occurrence of SOAEs is probably linked to structural discontinuities leading to local impedance changes and reflections.
Fig. 5.3.1: Spectrum of a SOAE from the mustached bat, Pteronotus parnellii, in comparison to human SOAEs (adapted from Kössl 1997).
Evoked OAEs (EOAEs) are induced by external tones; they can be measured, for example, through their interference with an external pure-tone sweep (Wilson 1980). Figure 5.3.2 shows EOAEs from the mustached bat (Henson et al. 1985, Kössl and Vater 1985a, Kössl 1994). They can be recorded in each individual at a specific frequency a few hundred Hz above the CF2-call frequency and can convert to SOAEs (see above). Obviously, in this species, the same mechanism that produces the SOAEs is responsible for EOAEs, and the difference between the two emissions is the degree of cochlear damping. The EOAEs are slightly lower in frequency in male than in female bats. Their correlation with the generation of a strong cochlear resonance will be discussed below. Spontaneous and evoked OAEs are emitted due to anomalies in cochlear structure that, at least in the case of the mustached bat, are not pathological but employed for the creation of very sharp tuning and high sensitivity to a biologically important narrow frequency range. However, such emissions, if present at all, do not give information about mechanical processing along the whole length of the cochlear duct. To overcome this limitation, in the last decade, a focus of many researchers has been the measurement of a different type of OAE, the distortion-product OAEs (DPOAEs).
Fig. 5.3.2: Evoked OAE measured as stimulus-frequency OAE (SFOAE). A continuous pure tone was swept upward in frequency. (A) Close to 63 kHz there is a maximum and a minimum in the frequency response measured with a microphone at the ear drum. (B) This interference pattern is accompanied by sudden phase changes; shown are the differences of phase from the 0 dB attenuation data set. The emission saturates at about 70 dB SPL (20 dB attenuation) (adapted from Kössl 1994).
5.3.1 Acoustic distortions and the cochlear amplifier
In vitro, outer hair cells (OHCs) are capable of fast contractions and elongations of their cell body in response to an electric field (Brownell et al. 1985). Thus they could in principle create force which feeds back on the passive motion of the BM and TM and sharpens cochlear frequency tuning (Fig. 5.3.3). An important prerequisite for this proposed action of the ‘cochlear amplifier’ is that in vivo, the OHC motion is induced by the cell’s own receptor potential and is fast enough to cope with sound frequencies beyond 100 kHz (Ashmore 1987). There are several open questions concerning the cochlear amplifier and OHC motility, its proposed cellular correlate: (1) Is OHC motility fast enough? (Kolston 1995 versus Dallos and Evans 1995, Reuter et al. 1994). (2) Even if it were fast enough, how does its driving force, the receptor potential, overcome low-pass filtering and amplitude reduction with increasing frequency in vivo (Russell and Sellick 1978)? (3) Reptiles and birds have sharp tuning and pronounced OAEs without apparent OHC-like motility (Manley et al. 1987, 1996, Köppl and Manley 1993, Köppl 1995, Manley 1995). Are there different mechanisms involved? Or are nonlinear stereovillar properties, which seem to play a decisive role in the cochlear mechanics of non-mammalian tetrapods, also a relevant component of the cochlear amplifier in mammals? Outer hair cell stereovilli in the mouse have nonlinear characteristics comparable to those of non-mammalian hair bundles (Russell et al. 1989, 1992).
Fig. 5.3.3: Cochlear amplifier in mammals. Fast OHC-motility is thought to generate force that feeds back on the passive motion of the organ of Corti. Positive versus negative feedback could sharpen the tip of IHC tuning curves and raise their tails (adapted from Kössl 1997).
Regardless of its cellular realization, a characteristic feature of the cochlear amplifier is its distinctly nonlinear behaviour. Low level input sound is amplified more strongly than loud sound. This nonlinearity is evident in the motion of the BM (see Fig. 5.3.4, upper left inset) and in auditory nerve fibre properties and produces strong frequency distortions. When stimulating with two pure tones of different frequency f1 and f2 with f1
Fig. 5.3.4: Distortion generation in the mammalian cochlea. Due to nonlinear mechanical amplification by OHCs in the zone of overlap of the two stimulus travelling waves, distortions are generated. Right inset shows cubic distortion products from the mustached bat (adapted from Kössl 1997).
Fig. 5.3.5: Effect of salicylate on 2f1-f2 growth functions in the bat Carollia perspicillata (A) and on the time course of 2f1-f2 and f2-f1 in the gerbil (B). From Kössl (1992) and Frank and Kössl (1996).
pure tones at about 70 kHz were lowered during systemic application of salicylate, predominantly at small stimulus levels (Fig. 5.3.5a; Kössl 1992). Comparable data were obtained by other groups (Stypulkowski 1990, Kujawa et al. 1992). However, so far there are no data available concerning effects of salicylate on the hair cell stereovilli. When salicylate was injected iontophoretically into the scala tympani of the gerbil, a reversible decrease of the 2f1-f2 distortion product followed (Fig. 5.3.5b, Frank and Kössl 1996). Interestingly, in this experiment, the acoustic difference tone f2-f1 increased during the period of salicylate action. We suspected that the two distortions describe different aspects of the cochlear amplifier. The cubic distortions (e.g. 2f1-f2) should be related to the slope of the transfer function around the operating point. The steeper the slope, the larger is the gain of the amplifier and also the level of 2f1-f2. The quadratic distortions (e.g. f2-f1) arise from symmetric components of the transfer function and therefore should reflect the symmetry of the transfer function and hence the position of the operating point of the amplifier. This is illustrated in a model calculation (Fig. 5.3.6). The levels of 2f1-f2 and f2-f1 strongly depend on the position of the operating point along a nonlinear Boltzmann function which describes mechano-electrical transduction in hair cells (Corey and Hudspeth 1983). When the operating point is symmetrical (position C), the amplifier works at maximum slope and large 2f1-f2 distortion is generated. When the operating point is shifted towards more asymmetric positions (D), 2f1-f2 decreases and f2-f1 increases.
5.3 Otoacoustic emissions and cochlear mechanisms in mammals
Fig. 5.3.6: Two sinusoidal input stimuli f1 and f2 were processed by a nonlinear Boltzmann function $y = \left[1 + e^{a_2(x_2 - x)}\left(1 + e^{a_1(x_1 - x)}\right)\right]^{-1}$ with $x_1 = x_2 = -0.06$ and $a_1 = 3a_2 = 12.8$. The resulting distortions depend on the position of the operating point (C vs. D). For a continuous shift of the operating point close to the symmetric zero-position (B), 2f1-f2 and f2-f1 behave in opposite ways (adapted from Frank and Kössl 1996).
Therefore, by measuring both distortions, 2f1-f2 and f2-f1, it should be possible to obtain information on the gain and operating state of the cochlear amplifier. We tested this by applying low frequency sound, which is known to bias the position of the BM and the organ of Corti and hence to shift the operating point of hair cells to asymmetrical positions. Figure 5.3.7 shows three cases (Frank and Kössl 1996, 1997). In Figure 5.3.7a, a 5 Hz biasing tone induced a strong reduction of f2-f1 during transitions between condensation and rarefaction (second and fourth zero-crossing of the bias tone, open arrows). Due to a 90° phase shift between velocity and displacement, these transitions should correspond to maxima of BM displacement towards Scala tympani (ST) and hence to a shift of the OHC stereovilli in the hyperpolarizing direction. To produce this pattern, the initial operating point has to lie asymmetrically on the depolarizing side of the transfer function. In this case, movement of the BM towards ST and hyperpolarization will increase the symmetry and therefore decrease the level of f2-f1. In comparison, the pattern shown in Figure 5.3.7c (solid lines) is shifted by 180°. This behaviour, a decrease of f2-f1 during a depolarizing bias (transition between rarefaction and condensation), should follow if the initial operating point is located on the hyperpolarizing side of the transfer function. If the initial operating point is located at a symmetrical position in the centre of the transfer function, both de- and hyperpolarizing bias will result in a movement towards asymmetry, and hence in this case there are maxima of f2-f1 for all zero-crossings of the applied low frequency tone (Fig. 5.3.7b).
We also attempted to shift the operating point by electrically inducing OHC motility. A low frequency bias current was applied in Scala media; positive current should lead, without phase shifts, to a hyperpolarization and hair-cell elongation. The concomitant shift of the operating point leads to an increase of f2-f1 (Fig. 5.3.7c). If hair-cell transduction is interrupted by applying BAPTA in Scala media, which is known to disrupt the tip links of hair cells, both 2f1-f2 and f2-f1 are abolished (Frank 1996). These experiments show that it is possible to determine the operating state of the cochlear amplifier noninvasively by measuring different distortion products.
Fig. 5.3.7: Influence of biasing the cochlear partition by low frequency sound (A, B, C) or intracochlear current (C) on 2f1-f2 and f2-f1 distortions. The bottom diagrams schematically show the derived initial position of the operating point (op) of the cochlear amplifier and its behaviour during biasing. The distances of the operating point to the symmetric position are overemphasized. For further explanation see text (adapted from Frank and Kössl 1996, 1997).
5.3.2 Acoustic distortions and cochlear tuning
At low sound pressure levels, it is generally assumed that two-tone distortions are produced locally within the zone of overlap of the two stimulus travelling waves. It should therefore be possible to determine the tuning of the travelling waves on the BM by applying a third, suppressing stimulus (f3) and determining the level of f3 that, at a given f3 frequency, is sufficient to suppress the distortion. Figure 5.3.8 shows 2f1-f2 suppression tuning curves (STCs) for different species. The curves in general have the typical mammalian shape, with a steeper high frequency slope. Their minimum is in most cases close to the frequency of f2, which indicates that 2f1-f2 is generated in the vicinity of the f2 place on the BM. One notable exception is seen in the opossum for stimulus frequencies close to 60 kHz. Here the STC minimum is clearly below f1, which could be caused by the fact that the stimulus frequencies are at or beyond the tuned frequency representation at the basal end of the cochlea. The Q10dB values of STCs are comparable to those of neuronal recordings. As has also been shown for laboratory mammals and man (Brown and Kemp 1984, Kummer et al. 1995), the STC method is a valuable alternative to invasive tuning measurements. In the mole rat, STC tuning is very broad and Q10dB values are low, in particular for stimulus frequencies between about 0.7 and 1.2 kHz, which is about the range of the auditory fovea of this species (Müller et al. 1992). This emphasizes that an auditory fovea primarily leads to an overrepresentation of neurons that code frequencies of special behavioural importance, without a sharpening of cochlear tuning in the foveal range. Only by additional hydromechanical resonance (see below) can certain bat species achieve extraordinarily sharp tuning in their foveal frequency ranges.
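The Q10dB values mentioned above express the sharpness of a tuning curve as its tip frequency divided by its bandwidth 10 dB above the tip. A minimal sketch of how such a value could be read from a sampled suppression tuning curve follows; the function name, the linear interpolation between sample points and the example curve are assumptions made for illustration and are not taken from the measurements reported here.

```python
def q10db(freqs_khz, thresholds_db):
    """Q10dB = tip frequency / bandwidth measured 10 dB above the tip.

    freqs_khz must be sorted in ascending order; thresholds_db holds the
    suppressor level needed at each frequency. Crossing points are found by
    linear interpolation between neighbouring samples (an assumption).
    """
    tip = min(range(len(freqs_khz)), key=lambda i: thresholds_db[i])
    target = thresholds_db[tip] + 10.0

    def crossing(indices):
        # Walk away from the tip until the curve first reaches `target`.
        prev = tip
        for i in indices:
            if thresholds_db[i] >= target:
                f_a, f_b = freqs_khz[prev], freqs_khz[i]
                t_a, t_b = thresholds_db[prev], thresholds_db[i]
                return f_a + (target - t_a) * (f_b - f_a) / (t_b - t_a)
            prev = i
        raise ValueError("curve does not rise 10 dB above the tip on this side")

    low = crossing(range(tip - 1, -1, -1))
    high = crossing(range(tip + 1, len(freqs_khz)))
    return freqs_khz[tip] / (high - low)


if __name__ == "__main__":
    # Hypothetical V-shaped curve, not measured data.
    freqs = [8.0, 9.0, 10.0, 11.0, 12.0]
    levels = [40.0, 30.0, 20.0, 30.0, 40.0]
    print(q10db(freqs, levels))
```

For a curve sampled on a coarse frequency grid, the interpolation scheme can noticeably affect the result, so the choice made here should be treated as a placeholder.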
Fig. 5.3.8: 2f1-f2 suppression tuning curves in the opossum Monodelphis domestica and the mole rat Cryptomys spec. Symbols indicate f1 and f2 for each curve (adapted from Faulstich et al. 1996, Kössl et al. 1996).
5.3.3 Enhanced auditory filters in bats
The auditory system of bats with long constant-frequency (CF) components in their echolocation calls is known to show extraordinarily sharp tuning to the CF frequencies, with neuronal Q10dB values exceeding 400, in comparison to a maximal Q10dB of about 20 in other mammals. Enhanced tuning is already evident at the level of the cochlear nucleus and auditory nerve (Suga et al. 1975, Suga and Jen 1977, Kössl and Vater 1995). Figure 5.3.9 shows neuronal tuning curves for the mustached bat’s anteroventral cochlear nucleus. Enhanced tuning is found close to the 2nd CF component (CF2) at about 62 kHz and at CF3 (93 kHz). The same is evident from 2f1-f2 STCs (Fig. 5.3.9), which shows that enhanced tuning is produced at the level of cochlear mechanics. The frequency of the strong evoked or spontaneous OAEs (Fig. 5.3.9, open arrow) found in this species corresponds to the tips of the most sensitive and sharply tuned neurons. This indicates that a strong cochlear resonance mechanism is involved in the production of sharp tuning. The same resonator produces the OAEs and also ringing in cochlear microphonic potentials (Suga et al. 1975, Henson et al. 1985, Kössl and Vater 1985a).
Fig. 5.3.9: Neuronal (A) and 2f1-f2 suppression tuning curves (B) in the mustached bat. For further explanation see text (adapted from Kössl and Vater 1990b, Frank and Kössl 1995, Kössl and Frank 1995).
Fig. 5.3.10: Basilar membrane displacement in the mustached bat. A: Magnitude and phase of displacement during stimulation with constant level signals of different frequency. The solid curves describe a simple resonator with a Q of 261 and a damping coefficient of 0.00138. B: Iso-displacement tuning curve, showing the stimulus level sufficient for a displacement of 0.1 nm (after Kössl and Russell 1995).
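The behaviour of a simple resonator such as the one named in the caption can be sketched with the textbook second-order response of a driven, damped oscillator. This is an illustration under stated assumptions, not the fit used for the figure: the normalisation of the response, the resonance frequency of 62 kHz and the test frequencies are chosen here for demonstration, and only the Q of 261 is taken from the caption.

```python
import cmath
import math


def resonator_response(f, f0, q):
    """Complex displacement response of a second-order resonator.

    Normalised so that the response is 1 far below the resonance frequency f0.
    """
    r = f / f0
    return 1.0 / complex(1.0 - r * r, r / q)


if __name__ == "__main__":
    f0_khz = 62.0   # assumed resonance frequency, for illustration only
    q = 261.0       # Q quoted in the caption of Fig. 5.3.10
    for f_khz in (61.0, 61.9, 62.0, 62.1, 63.0):
        h = resonator_response(f_khz, f0_khz, q)
        print(f"{f_khz:6.1f} kHz  gain={abs(h):8.2f}  "
              f"phase={math.degrees(cmath.phase(h)):8.2f} deg")
```

In this model the gain at resonance equals Q, and the phase moves from about 0° well below resonance, through -90° at resonance, to about -180° well above it, which is the half-cycle phase change typical of such a resonator.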
When measuring BM displacement with a laser interferometer, a sharply tuned response that corresponds to the OAE frequency is evident at BM places basal to the cochlear place of about 62 kHz (Fig. 5.3.10). These data imply that at least part of the energy of the cochlear resonance travels along the BM in the reverse direction. The phase of the signal measured on the BM shifts through about 180°, as is typical for simple resonators, which could, for example, be produced by cochlear reflections and standing waves.
What is the source of cochlear resonance in the mustached bat? Even if we measure the resonance on the BM, this does not necessarily imply that it is produced solely by BM properties. Another important structure involved in cochlear tuning is the tectorial membrane (TM), whose movement is very difficult to determine in direct measurements. However, there seems to be an indirect way to derive TM tuning from DPOAE recordings. From psychophysical measurements, it is known that in humans the perceived 2f1-f2 distortion increases steadily when the frequency difference between f2 and f1, expressed as the frequency ratio f2/f1, is lowered, as would be expected from the resulting increase of the zone of overlap of the two primary travelling waves. This simple relationship is not evident in the acoustic 2f1-f2 distortion measured at the tympanum. With decreasing frequency ratio, the level of this DPOAE initially increases, but then decreases again for smaller ratios of f2/f1. A distortion maximum can be defined at a certain ratio, which is called the optimal ratio. To resolve this discrepancy between psychophysical and acoustic measurements, Brown et al. (1992) and Allen and Fahey (1993) suggested that the ratio dependence of the acoustic distortions is due to the action of a second cochlear filter element, most probably the TM. They proposed that the TM at a certain f2 place on the BM is tuned to slightly lower frequencies than the BM and therefore enhances these frequencies, which results in a distortion maximum at the respective distortion frequencies. This implies that if one knows the BM frequency-place map of a species and assumes that 2f1-f2 is generated locally close to a certain f2 place on the BM, one can calculate the putative TM tuning of this place from the optimal ratio f2/f1 that produces maximum distortion.
Figure 5.3.11 (top) shows the optimal ratios for different f2 frequencies in the mustached bat. Close to 62 and 93 kHz, the ratios are very small (1.0005), which, according to this hypothesis, suggests that in the corresponding cochlear regions the characteristic frequencies of BM and TM tuning are nearly identical. Figure 5.3.11 (bottom) shows a 2f1-f2 threshold curve of the mustached bat. For each f2 frequency, f1 was adjusted according to the optimal ratio and both stimuli were increased in level. From the resultant 2f1-f2 growth function, the stimulus level sufficient to induce a small distortion of -10 dB SPL was calculated. The shape of this threshold curve is similar to neuronal threshold measurements (Kössl and Vater 1990b). There is a sequence of threshold maximum and minimum in the range of CF2 and CF3. Exactly at 62 kHz, where the optimal ratio is smallest and the evoked OAEs are found, the threshold curve has a sharp minimum (Fig. 5.3.11, dashed lines).
Fig. 5.3.11: Top: Optimal frequency ratio f2/f1 for different f2 frequencies in the mustached bat (n = 6). Bottom: 2f1-f2 threshold curve determined from distortion growth functions measured at the optimum ratio. In all measurements the level of the f1 stimulus was 10 dB below that of f2 (adapted from Kössl and Vater 1996a,b).
Fig. 5.3.12: BM frequency-place map (solid symbols, Kössl and Vater 1985b, 1990a) and putative TM map derived from distortion product measurements (open symbols). For further explanation see text (adapted from Kössl and Vater 1996a,b).
A putative map of frequency representation of the TM along cochlear length was calculated from the optimal ratio data (Fig. 5.3.12), using as the reference a BM frequency map that was derived from horseradish peroxidase (HRP) labelling of the afferent innervation of IHCs. At the BM place of each f2 frequency, the frequency of the 2f1-f2 distortion at the optimal ratio was plotted (see Allen and Fahey 1993). Compared with the BM map, the putative TM map in the mustached bat is shifted to lower frequencies. At the level of the afferent fibres and IHCs, there is a vast spatial overrepresentation of frequencies around 62 kHz, typical for an acoustic fovea. This foveal representation of 62 kHz is even more pronounced at the level of the TM. The TM appears to be tuned exactly and exclusively to 62 kHz over about 20 % of the cochlear length. The respective region is located just basally to the 62 kHz place on the BM and coincides with a sparse innervation of the organ of Corti (SI-zone). In the SI-zone, frequencies between 62 and 72 kHz are represented on the BM. Since it is known that cochlear resonance evident in EOAEs and microphonic potentials is suppressed most effectively when using a suppressor at a frequency between 62 and 72 kHz, we assume that the resonance is generated here. In the SI-zone there are remarkable anatomical specializations of both BM and TM: the BM is thickened and the mass of the TM (estimated from cross-sectional area measurements) is decreased. Most importantly, the attachment of the TM to its limbal anchoring structure is reduced considerably (Vater and Kössl 1996). We suggest that this configuration leads to a TM resonance. At about 45 % of cochlear length, where both TM and BM are tuned to 62 kHz, the resonant energy may be transferred into the organ of Corti to induce a sensitive displacement of the IHC stereovilli (see schematic illustration of Fig. 5.3.13).
For this mechanism to work, it is necessary that at BM places of e.g. 70 kHz, the TM, which resonates at 62 kHz, does not deflect the hair bundles. As indicated in a model of the behaviour of the specialized TM (Steele 1997), this is indeed the case. The TM appears to resonate not in a radial direction, but vertically, which should not deflect the hair bundles.
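The derivation of the putative TM map described above reduces to simple arithmetic: for each f2, the lower primary is f1 = f2 / (optimal ratio), and the frequency plotted at the BM place of f2 is 2f1-f2. The sketch below illustrates this step; the function names and the data layout are hypothetical, and the only numerical value taken from the text is the optimal ratio of about 1.0005 near 62 kHz.

```python
def cubic_dp_frequency(f1, f2):
    """Frequency of the cubic distortion product 2f1-f2."""
    return 2 * f1 - f2


def putative_tm_frequency(f2, optimal_ratio):
    """2f1-f2 frequency when f1 is set to f2 / optimal_ratio."""
    f1 = f2 / optimal_ratio
    return cubic_dp_frequency(f1, f2)


def putative_tm_map(bm_map, optimal_ratios):
    """Pair each BM place with the distortion frequency at its optimal ratio.

    bm_map: list of (place, f2) tuples from a BM frequency-place map.
    optimal_ratios: dict mapping each f2 to its measured optimal ratio f2/f1.
    Both structures are hypothetical containers chosen for this sketch.
    """
    return [(place, putative_tm_frequency(f2, optimal_ratios[f2]))
            for place, f2 in bm_map]


if __name__ == "__main__":
    # Optimal ratio of about 1.0005 near 62 kHz, as stated in the text.
    print(putative_tm_frequency(62.0, 1.0005))
```

With a ratio this close to 1, the distortion frequency lies only a few tens of Hz below f2, which is why BM and putative TM tuning nearly coincide at that place.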
Fig. 5.3.13: Schematic illustration of generation of cochlear resonance in the mustached bat. In the SI-zone the TM is tuned to 62 kHz and its mass is reduced. This together with the fact that the attachment of the TM to the limbus is reduced may provide the basis for resonant oscillation which could also involve reflections at the borders of the SI-zone where abrupt discontinuities appear in the structure of TM and BM. For further explanation see text (adapted from Kössl 1997).
There is controversy as to the degree to which the optimal ratio measurement can be used for deriving TM tuning. In models, it is possible to generate an optimal distortion ratio by assuming a TM filter (Allen 1980, Allen and Fahey 1993), but alternatively also by phase cancellation mechanisms (Neely and Stover 1997). An observation which at first seems to contradict the TM filter hypothesis is that in the alligator lizard, optimal ratios can be measured for f2 frequencies between 2 and 4 kHz despite the fact that there is no TM present in the respective region of the inner ear (Taschenberger et al. 1995). To assess this challenge for the filter theory, one has to keep in mind that in mammals the proposed TM filter is actually composed of the TM contributing mass and the OHC stereovilli contributing stiffness to the filter (Allen 1980). In the lizard, the mass of the fluid that surrounds the hair bundles may be effective both for displacement of the stereovilli and for a possible mechanical filtering (Freeman and Weiss 1990), with the mechanical properties of the stereovilli being the most important variable. In CF-bats, the shape and length of the stereovillar bundle of OHCs is constant over the basal-most 70 % of the cochlea (Vater and Kössl 1996). Obviously, other parameters such as TM shape are critical for enhanced tuning. TM foveae and correlated structural specializations are only found in the two CF-bat species Pteronotus parnellii and Rhinolophus rouxi, where enhanced tuning occurs. In Pteronotus there are two regions of nearly constant putative TM tuning at 62 and 93 kHz. In the case of 93 kHz, there is no obvious spatial expansion present on the BM, but neuronal tuning is sharpened and the putative TM map exhibits a region of constant tuning (Fig. 5.3.12, CF3).
In Rhinolophus, a TM fovea is found at 78 kHz, coinciding with the sharpest tuning (Faulstich, unpublished) and with morphological specializations of the tectorial membrane (Vater, personal communication).
5.3.4 Distortion thresholds in mammals
As already pointed out, it is possible to derive relative thresholds of cochlear mechanics from distortion measurements. From growth functions of 2f1-f2, the stimulus level needed to produce a constant distortion level (-10 dB SPL) can be interpolated and used as the threshold value. An important prerequisite for such threshold measurements is that f1 is adjusted according to the optimal frequency ratio f2/f1. These threshold measurements are noninvasive, can be conducted within a few hours, and are obtainable from awake animals or animals that are only lightly anaesthetized, i.e. they interfere only minimally with the physiological state of the cochlea. This makes distortion thresholds an ideal means to study evolutionary changes in cochlear processing, particularly in rare animal species, which can be released again after the measurements. Figure 5.3.14 shows examples of distortion threshold curves. As a model for generic mammalian hearing characteristics, the opossum, a marsupial, has good high frequency hearing, unlike low frequency specialists such as the blind mole rat, where maximum sensitivity is found close to its auditory fovea between 0.3 and 2.5 kHz. In bats that use broadband FM echolocation signals (ML, CP), cochlear sensitivity is shifted towards higher frequencies without any obvious specializations of the threshold curve. In CF-bats, the hearing threshold curves are highly specialized. At the frequency of the dominant CF2 component, there is a pronounced threshold maximum. In the case of Pteronotus parnellii (PP), another maximum is correlated with CF3, in Rhinolophus (RR) with CF1. Slightly above the CF2 call frequencies, there are sharp threshold minima that correlate with the frequencies of the Doppler-shifted CF2 echo. CF-bats seem to employ a cochlear filter mechanism that reduces the mechanical response to the call so that they can focus on the echoes.
There is evidence that the same TM resonance that enhances tuning acts, via phase shifts and cancellations, as an absorbing filter for frequencies just below the threshold minimum. The minimum coincides with sharpest tuning and also, in the case of Pteronotus parnellii, with strong EOAEs. Surprisingly, this absorbing filter, which involves massive anatomical changes, seems to have developed within short evolutionary time scales, since a close relative of the mustached bat, Pteronotus quadridens, has a rather flat hearing characteristic (PQ, dotted line in Fig. 5.3.14) and no prominent anatomical specializations (Vater 1997). The evolutionary development of sophisticated cochlear filters that involve both TM and BM specializations is still an unsolved problem.
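The threshold criterion used in this section, the stimulus level at which the 2f1-f2 growth function reaches -10 dB SPL, can be obtained by interpolation between measured points. The following sketch assumes linear interpolation between neighbouring points of the growth function; the function name and the example growth function are hypothetical and are not data from the studies cited.

```python
def distortion_threshold(stimulus_levels, dp_levels, criterion=-10.0):
    """Stimulus level (dB SPL) at which 2f1-f2 first reaches `criterion`.

    stimulus_levels must be ascending. Returns None when no pair of
    neighbouring points brackets the criterion, so that no value is
    extrapolated beyond the measured range.
    """
    pairs = list(zip(stimulus_levels, dp_levels))
    for (l0, d0), (l1, d1) in zip(pairs, pairs[1:]):
        if d0 < criterion <= d1:
            # Linear interpolation between the two bracketing points.
            return l0 + (criterion - d0) * (l1 - l0) / (d1 - d0)
    return None


if __name__ == "__main__":
    # Hypothetical growth function, for illustration only.
    levels = [20.0, 30.0, 40.0, 50.0]
    dp = [-25.0, -15.0, -5.0, 5.0]
    print(distortion_threshold(levels, dp))
```

Repeating this for each f2, with f1 set according to the optimal ratio, would yield one point of a threshold curve per frequency.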
Fig. 5.3.14: 2f1-f2 Iso-threshold curves in different mammals. Data are taken from Faulstich et al. (1996: Opossum); Kössl et al. (1996: mole rat, 1997: Pteronotus quadridens PQ); Kössl (1992: Megaderma lyra, ML, Carollia perspicillata, CP); Kössl (1994: Rhinolophus rouxi, RR); Kössl and Vater (1996b: Pteronotus parnellii, PP).
5.4 Otoacoustic Emissions in human test subjects Hugo Fastl for Eberhard Zwicker†
Spontaneous otoacoustic emissions (SOAEs) were investigated in 119 subjects. At least one SOAE could be found in 75 % of all subjects and in 44 % of all ears. The frequencies of the SOAEs were between 500 Hz and 4.5 kHz, the level between –30 dB and +10 dB. If more than one SOAE was observed in one ear, the distribution of distances between neighbouring SOAEs showed a maximum at 0.4 Bark (Dallmayr 1985a). For SOAEs, a close correlation to the fine structure of the threshold in quiet was verified (Schloth 1983, Zwicker 1987a). SOAEs are observed at minima of the threshold in quiet. On the other hand, we did not find an SOAE at each frequency corresponding to a threshold minimum. Mechanical and acoustical influences on SOAEs were assessed in extensive studies (Schloth and Zwicker 1983). The magnitude of SOAEs can be substantially decreased (10 dB) by eliciting the stapedius reflex. In addition, the frequency of the SOAE increases by a few Hz. The magnitude of the decrease in amplitude of the SOAE as well as the increase in its frequency strongly depends on the individual subject. Similarly, both the magnitude and the frequency of otoacoustic emissions can be influenced by variations of the air pressure in the ear canal. Irrespective of whether the air pressure in the ear canal is increased or decreased, the magnitude of the otoacoustic emission always decreases while the frequency increases.
The effects of additional (suppressing) tones on SOAEs are illustrated in Figure 5.4.1. The upper panel in Figure 5.4.1 shows the temporal course of a suppressor at 1480 Hz, switched on for a period of 60 ms. The lower panel in Figure 5.4.1 shows that as a consequence of the presentation of this (suppressing) tone, the otoacoustic emission decreases by about a factor of 2, i.e. about 6 dB. Closer inspection reveals that the reduction of the SOAE starts after a delay time Td. When the suppressing tone is switched off at t = 60 ms, after a delay time Td, the magnitude of the otoacoustic emission recovers and reaches the original value after about 50 ms. The decrease and increase in the magnitude of the otoacoustic emission corresponds closely to an exponential function with a time constant of 30 ms (dotted). The magnitude of the delay time Td is about 2 ms.
Fig. 5.4.1: Suppression of a SOAE by a tone impulse. a: Temporal pattern of the (suppressing) tone impulse at 1480 Hz with 70 dB (corresponding to 63 mPa); b: Time pattern of a SOAE at 1003 Hz as a reaction to the presentation of the sinusoidal suppressor.
We showed that 90 % of human ears with normal hearing emit sound in response to stimulating tones. These simultaneous evoked otoacoustic emissions (SEOAE) have the same frequency as the stimulus and are found in a frequency range between 1 and 3 kHz (Dallmayr 1987). At low stimulus levels up to about 20 dB above their respective threshold, the level of the SEOAE increases linearly with stimulus level. However, above 20 dB some saturation occurs. With additional tones,
the magnitude of the SEOAEs can be reduced, an effect called suppression. The dependence of suppression on suppressor frequency is very similar to the shape of psychoacoustical tuning curves. For SOAEs, SEOAEs, and for the spectral content of delayed evoked otoacoustic emissions (DEOAEs), a distance of neighbouring extrema of 0.4 Bark was verified (Zwicker 1989). The distance between neighbouring SEOAEs of 0.4 Bark corresponds to a distance of about 0.5 mm along the human basilar membrane (Zwicker 1990a). This characteristic value is assessed by discussing related data from an analog model of nonlinear cochlear preprocessing with active feedback and lateral coupling. The results indicate that the phase characteristic of the travelling wave in the cochlea has a slope of 180° per 0.4 Bark (0.5 mm) near the respective characteristic place. A frequency shift corresponding to a travelling-wave place shift of 0.5 mm results in an SEOAE phase shift of 2 × 180° = 360°. This value is found for neighbouring SEOAE maxima or minima. Simultaneously evoked otoacoustic emissions clearly depend on the acoustic impedance of the probe applied to the sealed ear canal (Zwicker 1990b). Applying different probes, the maxima and minima can be shifted with respect to the frequency axis. However, the distance between neighbouring maxima or minima is independent of the impedance of the probe used. The effects of the acoustical impedance of the probe could also be simulated in an analog model of peripheral pre-processing. The influence of the impedance of probes used for the measurement of otoacoustic emissions was studied by simulating the impedance of both the ear and the probe (Jurzitza and Hemmert 1992). An alternative method for the measurement of SEOAEs was developed and realized.
In contrast to earlier investigations, SEOAEs are measured as fluctuations of the acoustic input impedance of the outer ear canal or the tympanic membrane, respectively. This procedure yields quantitative and comparable data on SEOAEs. The probe was developed and realized with a tiny electrodynamic speaker as the sound-emitting system and a pressure-sensitive electret microphone as receiver. For DEOAEs, we showed that they increase in proportion to the level of the evoking sound and reach saturation after about 30 dB.
In extensive studies, we verified the close relationship between masking-period patterns and delayed evoked otoacoustic emissions (Zwicker 1983a). This relationship is illustrated in Figure 5.4.2. Short test tone bursts at 1.3 kHz (Fig. 5.4.2b) are masked by Gaussian-shaped pressure pulses (Fig. 5.4.2c). The corresponding masking-period pattern is displayed in Figure 5.4.2a. Negative Gaussian-shaped pressure impulses produce a clear increase in masked threshold, whereas positive Gaussian-shaped pressure impulses lead to a double peak in the masking-period pattern at a relative delay time of 3/4. Figure 5.4.2e shows the correlated DEOAEs as a suppression-period pattern, which is obtained by averaging the amplitude of the DEOAE within a time window extending from 9 to 14 ms. When comparing Figure 5.4.2a with Figure 5.4.2e, it can be seen that the masking-period pattern and the suppression pattern for DEOAEs are nearly perfect mirror images. The patterns in Figure 5.4.2d show temporal traces of DEOAEs tilted by 90°. They contain the same information as Figure 5.4.2e, but reveal more details.
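Several numerical relations quoted earlier in this section can be checked with a few helper conversions: the suppressor level of 70 dB corresponding to about 63 mPa, the amplitude factor of 2 corresponding to about 6 dB, and the spacing of 0.4 Bark corresponding to about 0.5 mm. The sketch below assumes that levels are given in dB SPL re 20 µPa, and it uses the widely cited analytical Bark approximation of Zwicker and Terhardt (1980), which is not part of this text. The millimetre-per-Bark factor is simply the ratio stated above; all function names are illustrative.

```python
import math

P_REF_PA = 20e-6          # standard reference pressure for dB SPL in air
MM_PER_BARK = 0.5 / 0.4   # ratio stated in the text: 0.4 Bark ~ 0.5 mm


def spl_to_pascal(level_db_spl):
    """Sound pressure in Pa for a level given in dB SPL."""
    return P_REF_PA * 10.0 ** (level_db_spl / 20.0)


def amplitude_ratio_to_db(ratio):
    """Level difference in dB for an amplitude (pressure) ratio."""
    return 20.0 * math.log10(ratio)


def hz_to_bark(f_hz):
    """Critical-band rate in Bark (Zwicker and Terhardt 1980 approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def bark_to_mm(delta_bark):
    """Distance along the basilar membrane for a Bark interval (text ratio)."""
    return delta_bark * MM_PER_BARK


if __name__ == "__main__":
    print(spl_to_pascal(70.0))          # compare with the 63 mPa in Fig. 5.4.1
    print(amplitude_ratio_to_db(2.0))   # compare with "about 6 dB"
    print(hz_to_bark(1003.0))           # Bark value of the SOAE in Fig. 5.4.1
    print(bark_to_mm(0.4))
```

The difference `hz_to_bark(f_b) - hz_to_bark(f_a)` gives the spacing of two neighbouring emissions in Bark, which can then be compared with the characteristic value of 0.4 Bark.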
Fig. 5.4.2: Masking-period pattern and suppression pattern for tone bursts masked by Gaussian-shaped pressure impulses. a: Masking-period pattern; b: Test tone burst at 1.3 kHz; c: Gaussian-shaped pressure impulses; d: Time patterns of delayed evoked otoacoustic emissions tilted by 90 degrees; e: Suppression pattern.
In view of the correlations between otoacoustic emissions and masking-period patterns, we carried out psychoacoustic experiments on masking-period patterns with masker frequencies between 36 Hz and 324 Hz. For periodic sound-pressure time functions with frequency components below 50 Hz, we measured masking-period patterns as well as suppression-period patterns for DEOAEs (Zwicker and Scherer 1987). Three different time functions were used: an alternating Gaussian impulse, its first integral, and its second integral. In each case, the time pattern of the suppression-period pattern is a mirror image of the time pattern for the masking-period pattern. Large values of the test tone level correspond to a small amplitude of the otoacoustic emission. Both the masking-period pattern and the suppression-period pattern are closely related to the second derivative of the sound-pressure time function. The relationship between masking-period patterns and otoacoustic emissions was also examined for SOAEs (Dallmayr 1985b). When measuring the RMS value (time constant 0.8 ms) of a SOAE at 1510 Hz, the time function of a low frequency (32 Hz) suppressor is reflected in the time pattern of the SOAE. Furthermore, the correlation between DEOAEs and masking-period patterns was extended to results from evoked potentials (Zwicker et al. 1987). In temporal regions of the masking-period pattern where the test tone bursts are totally masked, the amplitude of DEOAEs as well as brainstem potentials is extremely reduced, down to the noise floor of the apparatus. The correlation of masking and suppression of otoacoustic emissions is of great relevance for models of hearing. We were able to verify a close relationship between otoacoustic emissions and the fine structure of the threshold in quiet for all three categories of emissions, namely spontaneous, simultaneously evoked, and delayed evoked emissions (Zwicker and Schloth 1984).
The common features of the different kinds of otoacoustic emissions and psychoacoustic data are illustrated in Figure 5.4.3. The frequency selectivity of the hearing system is expressed in tuning curves (Zwicker 1988b). Suppression-tuning curves were obtained for three different kinds of otoacoustic emissions: spontaneous otoacoustic emissions (SOAE), simultaneous evoked otoacoustic emissions (SEOAE), and delayed evoked otoacoustic emissions (DEOAE). In all cases, the level of a suppressor tone necessary to reduce the respective otoacoustic emission by 3 dB is given as a function of the suppressor frequency.
Fig. 5.4.3: Suppression-tuning curves for different kinds of otoacoustic emissions and psychoacoustic tuning curve. a: Spontaneous otoacoustic emission; b: Simultaneous evoked otoacoustic emission; c: Delayed evoked otoacoustic emission; d: Psychoacoustic tuning curve.
The shapes of the suppression-tuning curves look rather similar, suggesting a common mechanism for the three types of otoacoustic emissions described. The similarity of the suppression-tuning curves to the psychoacoustic tuning curves again stresses the close relationship between masking and otoacoustic emissions. The effects of simultaneous masking can be described quantitatively on the basis of the correlated DEOAEs (Scherer 1988, Zwicker 1983b). On the other hand, temporal masking, i.e. post-masking or pre-masking, is not reflected in the amplitude of the correlated DEOAEs. ‘Addition’ of simultaneous masking produced by two maskers was compared to the ‘addition’ of suppression for DEOAEs (Zwicker and Wesel 1990). The results indicate that the nonlinear effects in the ‘addition’ of masking are strongly correlated with the corresponding effect in suppression. Therefore, not only simultaneous masking but also nonlinear ‘addition’ of masking can be assumed to occur in the inner ear. The interaction of SOAEs and the fine structure of the threshold in quiet was compared to a subject’s ability to detect amplitude modulation at low levels (Zwicker 1986b). The correlation between SOAEs and absolute threshold is also reflected in the magnitude of just-noticeable differences in amplitude modulation (JNDAMs). Most interesting and somewhat unexpected is the fact that the JNDAMs are negatively related to the threshold in quiet: low threshold values correspond to large values of JNDAM and vice versa, if the amplitude modulation is presented at a constant level above threshold. This behaviour is illustrated in Figure 5.4.4. The variations in thresholds are shown in the lower part (TQ), the different JNDAMs in the upper part. For modulation frequencies of 1 Hz, 4 Hz, 16 Hz, and 64 Hz, the threshold values and JNDAMs are mirror images of each other. This holds true for a sensation level of the AM sounds of 17 dB.
Just as distortion products are formed in psychoacoustic experiments when two primaries are presented to the hearing system, distortion products can also be observed in otoacoustic emissions (Zwicker 1990c). This type of otoacoustic emission is called a distortion product otoacoustic emission (DPOAE).
Fig. 5.4.4: Fine structure of the threshold in quiet (TQ) for tones and their just noticeable amplitude modulation at 17 dB sensation level.
Fig. 5.4.5: Spectra obtained for two primaries and psychoacoustic cancellation (a) or emission cancellation (b).
DPOAEs were compared to psychoacoustic measurements of 2f1-f2 distortion products by a cancellation method (Zwicker and Harris 1990). The level and the phase of the 2f1-f2 difference tone were measured as a function of the primary-tone level using the psychoacoustical method of cancellation or the objective method of emission cancellation for four frequency separations at f1 = 1620 Hz. Between the hearing cancellation and the emission cancellation, large differences showed up, which are illustrated in Figure 5.4.5. Spectra are displayed for f1 = 1620 Hz, f2 = 1851 Hz, 2f1-f2 = 1389 Hz. For clarity, the different curves in Figure 5.4.5 are shifted by 20 dB. The lowest trace shows just the two primaries with no distortion product, as measured with the probe loaded by a passive cavity. The next trace shows the spectrum when the otoacoustic emission is cancelled. In the following trace the otoacoustic emission is clearly visible. The top trace shows the configuration for hearing cancellation. A comparison of the two upper traces reveals dramatic differences between the cancellation of a DPOAE and the psychoacoustic cancellation of an audible distortion product. In extreme cases, these differences can amount to 60 dB. Four reasons may contribute to such large differences between results obtained psychoacoustically and by otoacoustic emissions: (a) the frequency-dependent attenuation of the middle-ear transfer function, (b) the frequency-dependent mismatch of the acoustical impedances at the ear drum, (c) the frequency dependence of the sensitivity of the microphone mounted within the probe, (d) the different reaction of active nonlinear cochlear processes to the hearing cancellation tones or the emission cancellation tones. As regards the level and phase of the 2f1-f2 cancellation tone measured psychoacoustically, a representation in vector diagrams enables a quantitative description of the psychoacoustically observed phenomena (Zwicker 1983c).
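The frequency arithmetic behind the spectra of Figure 5.4.5 can be checked with a minimal sketch. The function name is hypothetical and introduced only for illustration; the primary frequencies are the ones quoted in the text.

```python
def cubic_difference_tone(f1_hz: float, f2_hz: float) -> float:
    """Return the 2f1-f2 distortion-product frequency in Hz (requires f2 > f1)."""
    if f2_hz <= f1_hz:
        raise ValueError("f2 must be higher than f1")
    return 2.0 * f1_hz - f2_hz


# Primaries quoted for Figure 5.4.5.
f1, f2 = 1620.0, 1851.0
f_dp = cubic_difference_tone(f1, f2)

print(f_dp)               # 1389.0
print(round(f2 / f1, 3))  # 1.143
```

The result reproduces the 1389 Hz difference tone stated for the figure; the frequency ratio of the two primaries in this example is about 1.14.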
5.5 Properties of 2f1-f2 distortion product otoacoustic emissions in humans Thomas Janssen
We investigated the properties of distortion product otoacoustic emissions (DPOAEs) in humans and their efficacy in diagnosing cochlear hearing disorders. Based on measurements of the nonlinearity of the basilar membrane, a stimulus paradigm was developed that accounts for the unequal primary tone responses at the DPOAE generation site, i.e. the f2 place. With the novel parameter paradigm, DPOAEs can be measured over a wider dynamic range. DPOAEs elicited at near-threshold primary-tone levels strongly correlate with hearing threshold. Thus, hearing threshold can be predicted with a probable error of about 10 dB by using linear estimation models. DPOAEs prove to be a valuable tool for monitoring the integrity of hearing function at the outer hair cell (OHC) level during therapy in patients suffering from sudden hearing loss and acoustic trauma. A peculiar DPOAE growth behaviour found in tinnitus patients indicates a specific outer hair cell impairment that might be a potential correlate of tinnitus (see 5.6).
5.5.1 Influence of frequency ratio f2/f1 and level difference L1-L2
Due to their non-linear transmission characteristics and the corresponding intermodulation distortion, outer hair cells (OHCs) evoke intermodulation vibrations in cochlear micromechanics when stimulated by two tones of neighbouring frequencies. Intermodulation vibrations propagate retrogradely via middle ear and tympanic membrane to the outer ear canal where they can be measured as DPOAEs (Kemp 1979). In humans the 2f1-f2 distortion has the highest amplitude and is therefore primarily used for diagnosing cochlear dysfunction. Level (L1, L2), frequency ratio (f2/f1, f2>f1), and level difference (L1-L2) of the primary tones determine the level of the resulting DPOAE (Ldp) (Gaskill and Brown 1990, Harris et al. 1989, Hauser and Probst 1991, Whitehead et al. 1992, Whitehead et al. 1995a,b).
Fig. 5.5.1: Model of the influence of the frequency ratio f2/f1 and levels L1 and L2 of the primary tones on the DPOAE generation. Shaded area = overlapping region of the primary tone travelling waves, with maxima at x1 and x2.
Intermodulation distortion originates in the cochlear region where the travelling waves of the two primaries overlap (Kemp 1986). Due to the steeper slope of the travelling wave towards the cochlear apex, the maximum interaction site is near f2 (Fig. 5.5.1a). Thus, the OHCs of the f2 place (x2) contribute most to DPOAE generation. Investigations of the suppressibility of the DPOAE confirm this idea (see below, Sect. 5.5.4). The number of OHCs contributing to DPOAE generation depends on the size of the overlapping region, which is determined by the level and the frequency ratio of the primary tones. The size of the overlap region changes when either the level or the frequency ratio of the primary tones is changed. When the primary tone level is decreased under the L1 = L2 condition, the overlap region disappears and no emissions can be elicited (Fig. 5.5.1b). To preserve the overlap region at low primary tone levels, a lower frequency ratio has to be used (Fig. 5.5.1c). However, perhaps due to filtering by the tectorial membrane (Brown et al. 1992), DPOAEs disappear at a very low frequency ratio (f2/f1<1.1). Thus, for eliciting DPOAEs at low primary tone levels, an intermediate frequency ratio and a lower L2 level have to be chosen (Fig. 5.5.1d). The optimum primary-tone level difference for yielding maximum emission level at different stimulus intensities can be derived from basilar membrane displacement measurements (Johnstone et al. 1986, Ruggero et al. 1997) that are extrapolated to humans. Due to the different compressions of the two primaries at the f2 place, the primary tones have to be set with increasing level differences towards threshold (Kummer et al. 1997, 1998b).
The optimized primary tone level paradigm was derived from DPOAE level data measured at 85 different primary tone level combinations (Fig. 5.5.2). It can be clearly seen that maximum emission levels (back of the ‘hill’ in Fig. 5.5.2) are obtained when the low frequency primary tone level L1 is changed by a smaller step than the high frequency primary tone level L2. However, when DPOAEs are elicited at equal primary tone levels, as is usual in clinical DPOAE measurements, emission amplitude drastically diminishes with decreasing stimulus level (see steep slope of the ‘hill’ in Fig. 5.5.2). Therefore, no near-threshold DPOAE measurements can be made when using the L1 = L2 condition. DPOAE measurements on the mutual influence of the frequency ratio and level difference of the primaries (Fig. 5.5.3 and 5.5.4) confirm the idea of non-linear interaction of the two primaries at the DPOAE generation site at f2. The data show that under the L1 = L2 condition, the frequency ratio has to be continuously lowered with decreasing stimulus level (Fig. 5.5.3, right panel). Since DPOAEs disappear at a frequency ratio below 1.1, eliciting DPOAEs under the L1 = L2 condition is more difficult or even impossible at primary tone levels below L1 = L2 = 30 dB SPL. However, at the L1 = 0.4L2 + 39 primary tone level setting, the optimum frequency ratio for obtaining maximum DPOAE levels remains almost constant around f2/f1 = 1.2 and DPOAEs can be reliably measured at near-threshold stimulus level (Fig. 5.5.3, left panel). The loss of DPOAE level dLdp at a fixed frequency ratio (e.g. f2/f1 = 1.2) in comparison to the optimal ratio of each individual is considerably lower under the L1 = 0.4L2 + 39 condition than under the L1 = L2 condition (Fig. 5.5.4). Also, the standard deviation of the optimized frequency ratio is lower in the L1 = 0.4L2 + 39 condition. For clinical application, a parameter setting according to L1 = 0.4L2 + 39 with a constant frequency ratio of 1.2 is therefore recommended (see line in Fig. 5.5.2).
Fig. 5.5.2: Average emission level for 20 normally-hearing ears, measured at 85 different primary tone level combinations (20 ≤ L1 ≤ 70, 5 ≤ L2 ≤ 65 dB SPL) at f2 = 1, 1.5, 2, 3, 4, 6 and 8 kHz. Frequency ratio was set to f2/f1 = 1.2. The line on the back of the ‘hill’ represents L1 = 0.4L2 + 39. These and the following DPOAE measurements were made using Cubedis and Etymotic Research (ER-10C) instrumentation (Mimosa Acoustics, USA). The signal from the ER-10C probe was amplified, sampled at 50 kHz, averaged 200 times into a 20.48 ms (1024-point) buffer (4 seconds total sampling time), and Fourier transformed (48.8 Hz frequency resolution). Only DPOAE readings with signal-to-noise ratios exceeding 6 dB were included in the data analysis. Data were analyzed off-line with MATLAB.
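The recommended parameter setting (L1 = 0.4L2 + 39 with f2/f1 = 1.2) can be written out as a short sketch. The function names and the 5-dB step loop are assumptions made for illustration only; the rule and the ratio are taken from the text.

```python
def primary_levels(l2_db_spl: float) -> tuple[float, float]:
    """Return (L1, L2) in dB SPL following the rule L1 = 0.4 * L2 + 39."""
    l1_db_spl = 0.4 * l2_db_spl + 39.0
    return l1_db_spl, l2_db_spl


def primary_frequencies(f2_hz: float, ratio: float = 1.2) -> tuple[float, float, float]:
    """Return (f1, f2, 2f1-f2) in Hz for a given f2 and frequency ratio f2/f1."""
    if ratio <= 1.0:
        raise ValueError("f2/f1 must be greater than 1")
    f1_hz = f2_hz / ratio
    return f1_hz, f2_hz, 2.0 * f1_hz - f2_hz


# Hypothetical level series: L2 from 65 down to 20 dB SPL in 5-dB steps.
for l2 in range(65, 15, -5):
    l1, _ = primary_levels(l2)
    print(f"L2 = {l2:2d} dB SPL -> L1 = {l1:.1f} dB SPL, L1 - L2 = {l1 - l2:.1f} dB")
```

At L2 = 65 dB SPL the two primaries have equal level, and the level difference L1 - L2 grows to 27 dB at L2 = 20 dB SPL, which is the increasing level difference towards threshold described above.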
Fig. 5.5.3: Average (–), standard deviation (– – –), and individual values (+) of the optimized frequency ratio f2/f1 in 13 normally-hearing subjects that yielded maximum emission levels while changing the primary tone level L2 under the L1 > L2 condition (L1 = 0.4L2 + 39, 53 ≤ L1 ≤ 65 and 35 ≤ L2 ≤ 65) and under the L1 = L2 condition (35 ≤ L1 = L2 ≤ 65). Frequency ratio f2/f1 was changed from 1.04 to 1.4 (step size = 0.03).
Fig. 5.5.4: Upper two panels: loss of emission level dLdp (average of 13 normally-hearing ears) at a constant frequency ratio setting of f2/f1 = 1.19 under the L1 > L2 condition according to L1 = 0.4L2 + 39, at different primary tone level regions, at 61 ≤ L1 ≤ 65, 55 ≤ L2 ≤ 65 (*), 57 ≤ L1 ≤ 59, 45 ≤ L2 ≤ 50 (x), 53 ≤ L1 ≤ 55, 35 ≤ L2 ≤ 40 dB SPL (o) and under the L1 = L2 condition at 59 ≤ L1 = L2 ≤ 65 (*), 45 ≤ L1 = L2 ≤ 57 (x), 35 ≤ L1 = L2 ≤ 43 (o). Lower two panels: average (–) and standard deviation (– – –) of the optimized frequency ratio (opt. f2/f1) yielding maximum emission levels at the L1 > L2 and the L1 = L2 condition.
5.5.2 Stability of the DPOAE level
The stability of the DPOAE level depends essentially on the signal-to-noise ratio SNR (Fig. 5.5.5 and 5.5.6). With normally-hearing test persons, the standard deviation of the DPOAE level from repeated measurements with an unchanged probe is very small (<1 dB) when the SNR is higher than 20 dB. As the SNR decreases below 20 dB, the variation increases rapidly. At an SNR of 6 dB, the extrapolated standard deviation amounts to 2.5 dB (Fig. 5.5.6). For monitoring small DPOAE changes, e.g. as they occur during contralateral stimulation, high SNRs have to be achieved. Because of the time-domain averaging used, the equivalent bandwidth of the resulting Fourier spectrum amounts to 0.25 Hz at a total sampling time of 4 seconds. At that bandwidth, the minimum noise floor is sufficiently low (about -30 dB SPL). Thus, 4 seconds averaging time and a 6 dB SNR are a suitable compromise for measuring DPOAEs in patients.
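The acquisition figures quoted for these measurements (50 kHz sampling, 1024-point buffer, 200 averages) can be reproduced with a few lines of arithmetic. This is a sketch only: the function name is hypothetical, the equivalent bandwidth is taken here as the reciprocal of the total sampling time, and the averaging gain assumes noise that is uncorrelated between sweeps. Both are assumptions, not statements from the text.

```python
import math


def acquisition_summary(sample_rate_hz: float, buffer_points: int, n_averages: int) -> dict:
    """Derive timing and resolution figures for time-domain averaged spectra."""
    buffer_duration_s = buffer_points / sample_rate_hz
    total_time_s = n_averages * buffer_duration_s
    return {
        "buffer_duration_ms": 1000.0 * buffer_duration_s,
        "bin_spacing_hz": sample_rate_hz / buffer_points,
        "total_time_s": total_time_s,
        # Assumption: equivalent bandwidth = 1 / total sampling time.
        "equivalent_bandwidth_hz": 1.0 / total_time_s,
        # Assumption: uncorrelated noise, so averaging N sweeps lowers
        # the noise floor by 10 * log10(N) dB.
        "averaging_gain_db": 10.0 * math.log10(n_averages),
    }


summary = acquisition_summary(50_000, 1024, 200)
print(f"{summary['buffer_duration_ms']:.2f} ms")      # 20.48 ms
print(f"{summary['bin_spacing_hz']:.1f} Hz")          # 48.8 Hz
print(f"{summary['total_time_s']:.3f} s")             # 4.096 s
print(f"{summary['equivalent_bandwidth_hz']:.3f} Hz") # 0.244 Hz
print(f"{summary['averaging_gain_db']:.1f} dB")       # 23.0 dB
```

Under these assumptions the numbers agree with the rounded values given in the text: about 4 seconds total sampling time, 48.8 Hz frequency resolution and roughly 0.25 Hz equivalent bandwidth.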
Fig. 5.5.5: Average of the emission level Ldp (–) and noise floor Lnf (– – –) in a normally-hearing subject for repeated measurements (i = 10) at three different stimulus levels according to L1 = 0.4L2 + 39. Error bars indicate ranges of ±1 standard deviation.
Fig. 5.5.6: ±1 standard deviation (SD) of the emission level Ldp (n = 11 normally-hearing ears) for 10 repeated measurements across signal-to-noise ratio S/N obtained at L1 = 51, L2 = 30 dB SPL at 7 f2 frequencies as in Fig. 5.5.5. Frequency ratio was set to f2/f1 = 1.2.
5.5.3 Relationship between DPOAE and hearing threshold in normal hearing
The DPOAE level and the slope of the DPOAE I/O functions correlate well with the hearing threshold when DPOAEs are elicited with the L1 = 0.4L2 + 39 parameter setting. In a normally-hearing subject, DPOAE ‘audiograms’ show the closest match with the threshold fine structure obtained with the same sound probe and same calibration when elicited by low primary tone levels (r = 0.62, p < 0.001 at L1 = 49 and L2 = 25 dB SPL, upper panel in Fig. 5.5.7). The average DPOAE level and the threshold level from 20 normally-hearing ears share a bimodal shape, with maxima near 1 and 5 kHz (two middle panels in Fig. 5.5.7). However, there is a shift of the DPOAE maxima in relation to the threshold minima towards higher frequencies, which increases with increasing stimulus level. Thus, the highest average correlation coefficient was found at L1 = 49, L2 = 20 dB SPL (r = -0.49 ± 0.18, p < 0.05). The uncommon decrease of DPOAE and hearing threshold level around 2 and 6 kHz can be explained by calibration failures due to standing waves in the outer ear canal (Siegel and Hirohata 1994, Whitehead et al. 1995c). The slope s of the DPOAE I/O function also correlates with the hearing threshold (p < 0.05).
Fig. 5.5.7: Upper panel: Emission level Ldp (–) and noise floor (– – –) in a normally-hearing subject at 10 primary tone level settings according to L1 = 0.4L2 + 39. DPOAE audiograms follow with decreasing stimulus level from L2 = 65 to L2 = 20 dB SPL, in 5-dB steps, from top to bottom. The bold line is the pure-tone threshold Lt at f2 plotted inversely and with reference to the right-hand scale. Lower panels: Average of the pure-tone threshold Lt, DPOAE level Ldp at the same primary tone level setting as in the individual subject (see upper panel), and slope of the DPOAE I/O-functions computed between L2 = 60 and L2 = 40 dB SPL from 20 normally-hearing subjects. Error bars indicate ranges of ±1 SD. For the DPOAE level Ldp, SD ranges are given only for the lowest and highest primary tone levels. Frequency ratio of the primaries was set to f2/f1 = 1.2. Signal-to-noise ratio > 6 dB.
Fig. 5.5.8: Average (–) and ±1 SD (– – –) of the emission level Ldp and slope s of the DPOAE I/O-functions as a function of L2 from 20 normally-hearing ears. Parameter setting see Fig. 5.5.7.
Human DPOAE I/O functions reflect the compressive nonlinearity of the cochlear amplifier known from animal data (Johnstone et al. 1986, Mills and Rubel 1996, Ruggero et al. 1997). In the lower primary-tone level region, there is a steep slope with an average value of 0.8 dB/dB, revealing amplification of low level signals, whereas in the upper region, the course of the I/O function is flat (around 0.1 dB/dB) indicating strong saturation (Fig. 5.5.8).
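The slope of an I/O function in dB/dB can be estimated as sketched below. The text does not say how the slope s was computed, so the least-squares fit is an assumption, and the level values are invented purely to mimic the steep (about 0.8 dB/dB) and flat (about 0.1 dB/dB) regions described above.

```python
def io_slope(l2_levels: list[float], ldp_levels: list[float]) -> float:
    """Least-squares slope (dB/dB) of DPOAE level Ldp as a function of L2."""
    n = len(l2_levels)
    if n < 2 or n != len(ldp_levels):
        raise ValueError("need two equally long lists with at least two points")
    mean_x = sum(l2_levels) / n
    mean_y = sum(ldp_levels) / n
    sxx = sum((x - mean_x) ** 2 for x in l2_levels)
    if sxx == 0:
        raise ValueError("L2 levels must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(l2_levels, ldp_levels))
    return sxy / sxx


# Hypothetical illustration data, not measured values.
low_region = io_slope([20, 25, 30, 35, 40], [-10, -6, -2, 2, 6])
high_region = io_slope([50, 55, 60, 65], [10.0, 10.5, 11.0, 11.5])

print(round(low_region, 2))   # 0.8
print(round(high_region, 2))  # 0.1
```

A slope near 1 dB/dB would correspond to linear growth; values well below 1 indicate compression.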
5.5.4 Suppression properties
Human (Brown and Kemp 1984, Harris et al. 1992, Kummer et al. 1995, Abdala et al. 1996) and animal DPOAE iso-suppression tuning curves (Köppl and Manley 1993, Frank and Kössl 1995) mirror the high frequency selectivity of the inner ear. Human suppression tuning curves have the typical V-shape known from the neural tuning curves in animals, with steep high-frequency and flat low-frequency slopes (Fig. 5.5.9). Their slopes increase with increasing characteristic frequency (CF), from 144 dB/octave (high-frequency slope) and -29 dB/octave (low-frequency slope) at a CF of 1 kHz to 232 dB/octave and -43 dB/octave at a CF of 4 kHz. Their Q10dB-values vary from 1.96 to 7.87, indicating an increasing frequency resolution with increasing CF (Kummer et al. 1995). CF and f2 coincide, supporting the idea that the DPOAE generation site is at the f2 place (see Fig. 5.5.1). However, a secondary source exists at the 2f1-f2 place, elicited by the travelling wave at the f2 place (Kemp and Brown 1983a,b, Kummer et al. 1995, Brown et al. 1996, Heitmann et al. 1997).
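The two descriptors used for the tuning curves can be sketched as follows, using the usual definition of Q10dB as the characteristic frequency divided by the bandwidth of the curve 10 dB above its tip. The function names and all numerical inputs are hypothetical and only illustrate the calculation.

```python
import math


def q10db(cf_hz: float, low_edge_hz: float, high_edge_hz: float) -> float:
    """Q10dB = CF / bandwidth, with the band edges read 10 dB above the tip."""
    bandwidth_hz = high_edge_hz - low_edge_hz
    if bandwidth_hz <= 0:
        raise ValueError("high edge must lie above low edge")
    return cf_hz / bandwidth_hz


def slope_db_per_octave(f_a_hz: float, level_a_db: float,
                        f_b_hz: float, level_b_db: float) -> float:
    """Slope of a tuning-curve flank between two points, in dB per octave."""
    return (level_b_db - level_a_db) / math.log2(f_b_hz / f_a_hz)


# Hypothetical readings from a tuning curve with its tip at 4 kHz.
print(round(q10db(4000.0, 3700.0, 4300.0), 2))          # 6.67
print(slope_db_per_octave(2000.0, 69.0, 4000.0, 40.0))  # -29.0
```

With this sign convention the low-frequency flank has a negative slope and the high-frequency flank a positive one, matching the way the slopes are reported above.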
Fig. 5.5.9: DPOAE iso-suppression tuning curves at f2 = 1, 2, 4, 6 kHz (f2/f1 = 1.19, L1 = 55, L2 = 40 dB SPL), obtained in 8 normally-hearing ears (4-dB criterion). Arrows indicate the respective f2-frequencies. The frequency resolution of the suppressor tone was 25 Hz.
5.6 Developing clinical applications of 2f1-f2 distortion-product otoacoustic emissions Thomas Janssen
5.6.1 Monitoring cochlear dysfunction during recovery
Eliciting DPOAEs with the L1 = 0.4L2 + 39 parameter setting, which accounts for the different compression of the two primaries at the DPOAE generation site at f2, seems to allow a more detailed assessment and detection of changes in OHC function. The L1 = 0.4L2 + 39 paradigm offered obvious advantages over the normally-used L1 = L2 paradigm, by considerably extending the potential stimulus level range towards the hearing threshold (Janssen et al. 1995a,b, Janssen 1996, Kummer et al. 1997b, Kummer et al. 1998a, Janssen et al. 1998). Both the emission level and the slope of the DPOAE I/O-function allow a more detailed examination of the amplification process; thus the OHC function at the f2 place can be inferred from the DPOAE growth. A linearized DPOAE growth also mirrors the loss of compression of the cochlear amplifier in human cochlear-impaired ears. In patients with cochlear hearing loss, we found steepened DPOAE I/O-functions corresponding to the linearization of the basilar membrane displacement seen after selective OHC destruction in animal models (Johnstone et al. 1986, Mills and Rubel 1994). A patient suffering from sudden hearing loss showed a steep DPOAE I/O-function at the beginning of the treatment. During the recovery of hearing, the linearized DPOAE growth became compressive, returning to a normal hearing function (Fig. 5.6.1). This exemplary case demonstrates the high sensitivity of near-threshold DPOAE measurements, since between the 4th and 8th day of therapy, there was a much greater change in DPOAE level at the lower than at the higher primary tone levels. Another patient, who suffered from acoustic trauma associated with tinnitus that appeared when visiting a discotheque, also showed steep DPOAE I/O-functions. In the notch region (4 kHz), the I/O functions measured during trauma and after recovery differed considerably. At L2 = 25 dB SPL, the difference amounted to 20 dB (a in the right panel of Fig. 5.6.2) corresponding to the hearing loss difference during therapy. However, at the stimulus level of 55 dB SPL, the difference was only a tenth of that found at 25 dB SPL (b in the right panel of Fig. 5.6.2), revealing loss of compression. In the tinnitus region (6 kHz) the DPOAE behaved differently. At the higher primary tone levels (above 40 dB), the DPOAE level was higher during trauma, the difference amounting to almost 10 dB (b’ in the left panel of Fig. 5.6.2), whereas at the lower primary tone levels the DPOAE level differed by only about 10 dB (a’ in the left panel of Fig. 5.6.2). Taking into account the hearing loss in the tinnitus region, these high DPOAE levels are quite surprising and could result in wrong conclusions about cochlear integrity. However, they also give a hint of a possible involvement of OHC dysfunction in tinnitus generation.
Fig. 5.6.1: DPOAE I/O-functions (right panel, lines; noise floor –) and hearing loss (left panel) in a sudden hearing loss ear during recovery from the 4th to the 8th day of therapy.
Fig. 5.6.2: DPOAE I/O-functions in an acoustic-trauma ear obtained in the notch (right panel) and tinnitus region (left panel) at 6.055 kHz during trauma (x) and after recovery (o).
5.6.2 Scanning of cochlear dysfunction in hearing loss ears with and without tinnitus
In ears with a chronic cochlear hearing loss, we also observed a linearized growth where the DPOAE level decreased and the slope of the DPOAE I/O-functions steepened with increasing hearing loss (patients B and C in Fig. 5.6.3). Both DPOAE level and the slope of the DPOAE I/O-functions strongly correlate with the hearing threshold obtained with the same sound probe and same frequency resolution (e.g. patient B: r = -0.89, p < 0.001 at L1 = 55, L2 = 40 dB SPL for Ldp and r = 0.65, p < 0.001 for s). The congruent relationship between DPOAE and hearing threshold points to a loss of OHC function decreasing the gain of the cochlear amplifier, diminishing low-level responses and hearing sensitivity. However, as already shown in Figure 5.6.2, in tinnitus ears, a distinctly different DPOAE behaviour may occur (patient A in Fig. 5.6.3). In roughly half of the tinnitus ears examined, high DPOAE levels were found despite severe hearing losses, especially at high primary tone levels (Janssen and Arnold 1995b, Janssen et al. 1998). In these ears, there was a much poorer or even inverse relationship between DPOAE and hearing threshold level around the tinnitus frequency. This divergent relationship between DPOAE and hearing threshold appears to indicate a more complex cochlear impairment pattern, making the interpretation difficult. High emission levels would indicate that OHCs are functioning. However, the linearized growth unambiguously indicates OHC malfunction. Apparently, impaired OHCs produce unusually high distortion levels when stimulated by high-level input signals, as happens in overmodulated technical systems during signal clipping. Thus, not sensitivity but discrimination ability might be affected. In these ears, partial OHC impairment presumably disturbs the electromechanical feedback regulation of the cochlear amplifier.
Our findings therefore support the hypothesis of OHC hyperactivity as a potential tinnitus correlate (Zenner and Ernst 1993, Janssen and Arnold 1995b, Attias et al. 1996, Janssen et al. 1998). In the congruent cochlear hearing-loss ears (like patients B and C in Fig. 5.6.3), we also found a strong correlation between DPOAE and hearing threshold level at high primary tone levels, contradicting the poor correlation in normally-hearing ears at higher stimulus level (left panel in Fig. 5.6.4). One reason for this observation might be the influence of the secondary DPOAE source at the 2f1-f2 place, which disturbs the main source at f2 (Heitmann et al. 1997). However, considering the high correlation in the pathological ears, the secondary source seems to have no effect in cochlear hearing loss. The driving force responsible for OAE generation is assumed to be a byproduct of forces controlling the strength of the travelling wave, which is the primary determinant of hearing threshold (in mammals this is most likely longitudinal contractions of the lateral wall of OHCs). There are, however, many other important factors involved in hearing threshold (middle ear mechanics, cochlear micro- and hydromechanics, inner hair cell function, neural transmission to brain centres, mapping and processing of neural signals, see Kemp 1997). DPOAEs can only evaluate the functioning of hearing at the OHC level. Therefore one cannot accurately predict threshold levels from DPOAE levels. Nevertheless, a rough relationship between DPOAE and hearing threshold can be obtained when plotting moving-average DPOAE level data across hearing threshold in cochlear hearing loss ears of the congruent type (right panel in Fig. 5.6.4). In the region of low hearing losses (Lt < 20 dB SPL), the average emission level slightly decreases by about 0.06 dB/dB when elicited at high primary tone levels. However, at low primary tone levels, the slope is considerably higher (about 0.25 dB/dB at L2 = 25 dB SPL), demonstrating the higher sensitivity of near-threshold DPOAEs. In the region of higher hearing losses (20 < Lt < 40 dB SPL), the emission level drops much more (about 0.5 dB/dB), and almost independently of primary tone level. Thus, using the L1 = 0.4L2 + 39 stimulus paradigm, a 1 dB hearing loss corresponds to a decrease of DPOAE level of 0.5 dB.
Fig. 5.6.3: Data from 3 patients with cochlear hearing impairment, A (divergent type): Sudden hearing loss associated with tinnitus at 6 kHz; B (congruent type): Ménière's disease and tinnitus at 6 kHz; C (congruent type): Hearing loss of undetermined origin without tinnitus. Uppermost panels plot the pure tone threshold Lt, middle panels the DPOAE level Ldp (–) and noise floor (– – –) at primary tone levels decreasing from L2 = 65 to 20 dB SPL according to L1 = 0.4L2 + 39 (in C only down to 40 dB SPL), lower panels plot the slope s of the DPOAE I/O-functions calculated between L2 = 40 and L2 = 60 dB SPL. The shaded area indicates ranges of mean ±1 SD of the normally-hearing subject sample (cf. Fig. 5.5.7), for the DPOAE level only at L2 = 65 dB SPL.
Fig. 5.6.4: Left panel: Mean correlation coefficient r of linear-regression analysis between Lt and Ldp at different primary tone levels (L2, L1 = 0.4L2 + 39) from the normally-hearing subject sample (N) (cf. Fig. 5.5.7) and cochlear impaired ears (IOS). Right panel: 101-point moving averages derived from DPOAE level Ldp data at different primary tone levels from samples N and IOS plotted as a function of threshold level Lt.
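The moving averages plotted in the right panel of Fig. 5.6.4 can be approximated by a sketch like the one below. The text gives only the window length (101 points); sorting the data by threshold level and using a centred window without edge padding are assumptions, and the function names are hypothetical.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Centred moving average; returns len(values) - window + 1 points."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if window > len(values):
        raise ValueError("window is longer than the data")
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def smooth_by_threshold(pairs: list[tuple[float, float]], window: int):
    """Sort (Lt, Ldp) pairs by threshold Lt and smooth both columns."""
    ordered = sorted(pairs)
    lt = moving_average([p[0] for p in ordered], window)
    ldp = moving_average([p[1] for p in ordered], window)
    return list(zip(lt, ldp))


print(moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

For the figure the window would be 101 points; the short window here only keeps the example readable.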
5.6.3 Prediction of hearing threshold
With a view to predicting hearing threshold by means of DPOAE, estimation models can be used that describe the relationship between hearing threshold Lt and the underlying parameters, namely the primary tone level L2 and the slope s of the DPOAE I/O-function according to:
$$L_{t,N,f_2} = \beta_0 + b_N + \beta_1 \cdot L_2 + \beta_2 \cdot s_{N,f_2} + \beta_3 \cdot f_2 + \beta_4 \cdot (f_2)^2 + \beta_5 \cdot (f_2)^3, \quad N = 1, \ldots, 74, \quad f_2 = 0.5, \ldots, 8\ \text{kHz}$$
These relatively simple models provide a fairly good estimate of hearing threshold, with an average probable error of about 10 dB when taking into account both the DPOAE level and the slope of the I/O-functions (Janssen et al. 1997). DPOAE estimation models that apply the emission level only provide hearing thresholds which coincide only poorly with the patient’s threshold (see models L60 and L40 in Fig. 5.6.5), whereas models that take into account both the DPOAE level and the slopes of the I/O function allow a much better estimate, especially the L60+s model. The reason for the better prediction is that the L60+s model takes into account the growth behaviour of the DPOAE.
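Evaluating the estimation model for one ear and one frequency amounts to a single weighted sum, as the following sketch shows. The fitted coefficients are not given in the text, so the values used here are placeholders chosen only to make the arithmetic easy to follow; the function name is hypothetical.

```python
def predict_threshold(beta: tuple[float, float, float, float, float, float],
                      b_n: float, l2: float, s: float, f2_khz: float) -> float:
    """Evaluate Lt = b0 + bN + b1*L2 + b2*s + b3*f2 + b4*f2^2 + b5*f2^3."""
    b0, b1, b2, b3, b4, b5 = beta
    return (b0 + b_n
            + b1 * l2
            + b2 * s
            + b3 * f2_khz
            + b4 * f2_khz ** 2
            + b5 * f2_khz ** 3)


# Placeholder coefficients for illustration only, not fitted values.
example_beta = (1.0, 0.0, 10.0, 1.0, 1.0, 1.0)
print(predict_threshold(example_beta, b_n=2.0, l2=40.0, s=0.5, f2_khz=2.0))  # 22.0
```

In the example the terms are 1 + 2 + 0 + 5 + 2 + 4 + 8 = 22. The term bN is the subject-specific offset and the cubic polynomial in f2 describes the frequency dependence.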
Fig. 5.6.5: Reconstruction of hearing threshold Lt in patient B in Fig. 5.6.3 with high-frequency hearing loss by means of DPOAE-estimation models. L40 and L60 models apply DPOAE data elicited at L1 = 55, L2 = 40 and L1 = 63, L2 = 60, respectively. L40+s and L60+s models take into account both the DPOAE level at the respective primary tone level and the slope of the DPOAE I/O-function. Shaded area: ± 1 SD of pure tone threshold in 20 normally-hearing ears (see Fig. 5.5.7).
5.6.4 DPOAE as a clinical tool
The parameter paradigm L1 = 0.4L2 + 39, which accounts for the different compression of the primaries at the DPOAE generation site at f2, should have advantages over the commonly-used equilevel paradigm with respect to a better evaluation of the functional state of the cochlear amplifier. It will thus enhance the performance of clinical tests using DPOAEs. Since the sensitivity of the DPOAE considerably increases when elicited by near-threshold stimulus levels, DPOAE measurements should be routinely performed within a wide primary tone level range. In order to assess OHC function, not only the DPOAE level but also the DPOAE growth behaviour should be considered. Especially in those tinnitus ears where the DPOAE levels paradoxically increase with increasing hearing loss, the DPOAE level is hardly capable of assessing hearing impairment and may even be a misleading measure. In these ears, the DPOAE slope seems to be the only reliable indicator of cochlear malfunction. The abnormal mechanical distortion manifested in the growth behaviour correlates with a disturbance of the OHC amplification process and thus negatively influences the sensory transduction process in the inner hair cells. Mechanical mismatch of outer and inner hair cells is therefore suggested to be one potential correlate of tinnitus of cochlear origin. The strong correlation between DPOAE and hearing level found in the ears of the congruent type indicates the usefulness of DPOAEs as predictors of hearing threshold in difficult-to-test patients.
6 Neural processing in the brain
6.1 Auditory brainstem processing in bats Marianne Vater
The basic organization of the brainstem auditory pathways follows three principles: 1. A tonotopic arrangement; 2. Parallel processing of different aspects of auditory stimuli within specific nuclei or nuclear groups by virtue of intricately-designed excitatory and inhibitory circuits; 3. Convergence of different ascending pathways at the level of the auditory midbrain (inferior colliculus, IC). These organizational features were investigated with combined physiological and anatomical techniques from a comparative perspective in several species of echolocating bats that differ in orientation-call design and in hunting strategy.
6.1.1 The cochlear nucleus: Origin of parallel ascending pathways
6.1.1.1 Cytoarchitecture of the cochlear nucleus
As in other mammals (review in Cant 1992), the cochlear nucleus (CN) of bats is composed of three major subdivisions, anteroventral, posteroventral and dorsal CN (AVCN, PVCN, DCN) that differ in cellular composition. All subdivisions receive input from the bifurcating auditory nerve fibres, thus a common cochlear input encompassing the complete receptor surface is distributed onto multiple target sites with specific central connectivity. In CF-FM bats, AVCN and PVCN are vastly hypertrophied as compared to the small size of the DCN. The cytoarchitecture of the CN in horseshoe bat and mustached bat reveals a mosaic organization: Common mammalian features are mixed with general adaptations for high frequency hearing, and species-specific specializations (Feng and Vater 1985, Kössl and Vater 1990a).
In both species, the dominant cell type in the AVCN is the small spherical cell that receives input via large end-bulbs of Held. Large spherical cells that are typically found at the rostral-most (low frequency) aspect of AVCN in other mammals, are completely lacking in echolocating bats. This may represent an adaptation to the predominantly high-frequency range of hearing (for review see Covey and Casseday 1995). Similar to other mammals, further cell types of the bat AVCN include stellate cells, multipolar and globular cells. Only in the genus Pteronotus do marginal cells as a subtype of large multipolar neurons occupy the medial region of the AVCN and the dorsal caudal region of the PVCN. In both species, the PVCN is composed of large and small multipolar cells, elongate cells and globular cells. Its most caudal portion contains octopus cells. The whole DCN in Pteronotus, but only the ventral part of the DCN in Rhinolophus, is characterized by a laminar organization into molecular, fusiform and deep layers (for review see Covey and Casseday 1995). This lamination is generally far less pronounced than in other non-primate mammals. Furthermore, the dorsal part of the DCN in Rhinolophus completely lacks a lamination (Feng and Vater 1985).
6.1.1.2 Inputs, tonotopy and intrinsic connectivity of the CN
The receptor surface of the cochlea is represented as systematically-arranged slabs of iso-frequency laminae in all subdivisions of the bat CN, as revealed by tracing labelled auditory nerve fibres from small HRP-injections into physiologically-characterized regions of the CN (Feng and Vater 1985, Kössl and Vater 1990). Due to the systematic bifurcation of single auditory nerve fibres into ascending and descending branches, a single injection into one subdivision labels the corresponding frequency band in the other subdivisions. Figure 6.1.1 shows the tonotopic organization in the mustached bat. As in Rhinolophus, narrow frequency bands encompassing the species-specific CF-component are vastly over-represented in AVCN and PVCN, thus following the foveal over-representation within the cochlea. As a species-specific feature of the horseshoe bat, the representation of the second harmonic CF-signal is confined to the dorsal non-laminated portion of the DCN (Feng and Vater 1985). Unique to Pteronotus is the deviation in tonotopy of the medial marginal cell group from the normal frequency mapping in the AVCN: It predominantly houses the frequency range of the first harmonic FM-component of the echolocation signal that plays an integral role in echo ranging (Fig. 6.1.1; Kössl and Vater 1990). In both species, HRP-injections reveal the presence of tonotopically-organized intrinsic connections between the DCN and the two ventral subdivisions. One pathway arises in the deep layer of the DCN and projects to corresponding isofrequency regions of AVCN and PVCN. The other pathway arises in the ventral CN and projects to the DCN (Feng and Vater 1985, Kössl and Vater 1990).
Fig. 6.1.1: Connectivity and tonotopic organization of the bat CN. Inset: Labelling patterns derived from small HRP-injections into physiologically-characterized regions of the CN (after Vater et al. 1985). A–C: Organization of isofrequency laminae in the CN of the mustached bat. Note the expanded representation of the narrow frequency band encompassing CF2 in all subnuclei and the specialized tonotopic organization of MAM. A: Sagittal plane; B: Horizontal plane; C: Transverse plane (after Kössl and Vater 1990); D: Tuning curves of CN neurons of the mustached bat (after Kössl and Vater 1990).
6.1.1.3 Chemoarchitecture of the CN
The complex anatomical organization of the CN (see above) and the variety of physiological response patterns (see Rhode and Greenberg 1992) make it unlikely that the CN merely represents a relay from the cochlea to higher brainstem centres. Investigations of the chemoarchitecture with immunocytochemical and histochemical techniques show that there is an intricate arrangement of putatively inhibitory and neuromodulatory circuits in the CN (Vater et al. 1992, 1996, Kemmer and Vater 1997). In the CN of the mustached bat, as in other mammals, most of the larger cell types that establish the main central connections of the CN (spherical cells, globular cells, multipolar cells, octopus cells, fusiform cells, giant cells etc.) are typically non-labelled in GABA and glycine immunocytochemistry (see Fig. 6.1.3). Thus the ascending connections of the CN are predominantly excitatory in nature. The population of putatively inhibitory cells typically comprises small cell types whose predominant function appears to be in local or intrinsic inhibitory circuits. Following the typical mammalian pattern, the DCN contains by far the largest contingent of putatively inhibitory neuronal structures, but there are certain specializations that may be linked to the apparent lack of lamination observed in Nissl stains. The molecular and fusiform layers of the DCN contain far fewer putatively inhibitory cells (cartwheel, stellate cells) than the DCN of carnivores and rodents. However, the population of gly+ cells in the deep layer, which represents the origin of the intranuclear pathway to the ventral CN, is more abundant than noted in other species. This feature appears to correlate with the hypertrophy of the ventral CN noted in CF-FM bats. The circuitry of the superficial layers of the DCN is thus evolutionarily plastic, whereas the inhibitory circuitry of the deep DCN is highly conserved (Vater et al. 1996, Kemmer and Vater 1997).
Histochemistry for putatively noradrenergic structures revealed a specialized labelling pattern in the medial marginal cell group in the mustached bat. In contrast to the diffuse innervation of other subregions of the CN, the medial marginal cell group (MA) receives a dense supply of putatively catecholaminergic fibres and thus appears to be a main target of descending inputs from the locus coeruleus and/or lateral tegmentum (Fig. 6.1.2; Kössl et al. 1988).
Fig. 6.1.2: Catecholaminergic innervation of the CN (left) and the influence of microiontophoretic application of noradrenalin on response patterns of CN neurons in the mustached bat (right). After Kössl et al. 1988, Kössl and Vater 1989.
6.1.1.4 Physiology of the cochlear nucleus
In horseshoe bats and mustached bats, the frequency tuning curves of single CN-neurons responsive to the dominant harmonics of the CF-signal range reflect specialized hydromechanical tuning properties of the cochlea. In the mustached bat CN (Fig. 6.1.1), tuning curves are exceedingly narrow in the frequency range between 60 and 90 kHz, with Q10dB values reaching up to 400. A neuronal correlate of the pronounced resonance phenomena in the cochlea of Pteronotus parnellii, as revealed by long-lasting ringing in the CM response to 61 kHz (e.g. Henson et al. 1985), is noted in after-discharges of CN-neurons sharply tuned to this frequency range. Furthermore, in neurons tuned to the CF-signal range, there is an abrupt change in temporal response patterns from tonic discharges to phasic on-off discharge patterns at frequencies slightly below the cochlear resonance frequency. This is not due to frequency-specific inhibition, but represents a correlate of nonlinear cochlear processing (Kössl and Vater 1990). Evidence for differences in inhibitory inputs and/or differences in intrinsic membrane properties of the various cell types is found in the different temporal response patterns of single CN neurons (Feng and Vater 1985, Kössl and Vater 1990). AVCN-neurons typically respond with primary-like discharge activity to pure tone stimulation and are capable of phase-locked firing to high modulation rates of sinusoidal frequency and amplitude modulations (Vater 1982). Many PVCN-neurons and the medial marginal cells respond with one or a few spikes tightly locked to the stimulus onset (Kössl and Vater 1989); DCN cells exhibit complex tuning curves with upper thresholds and side-band inhibition (Feng and Vater 1985). These features are already present shortly after the onset of hearing in the horseshoe bat, indicating that inhibitory circuits of the CN are established early in ontogeny (Vater and Rübsamen 1992). This is supported by immunocytochemical observations in gerbils (Gleich et al. 1997, Gleich and Vater, in press), where putatively GABAergic and glycinergic circuits are already present prior to the onset of hearing. In the CN of Pteronotus, micro-iontophoretic application of noradrenalin enhances temporal precision in the occurrence of the first spike and improves the neuronal signal-to-noise ratio (Kössl and Vater 1989).
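To give a sense of scale for the Q10dB values quoted above, the following minimal sketch uses the standard definition of Q10dB (characteristic frequency divided by the tuning-curve bandwidth measured 10 dB above the threshold at that frequency). The function names are hypothetical and chosen for illustration only; the 61 kHz and Q10dB = 400 figures are taken from the text, but pairing them in a single neuron is an assumption.

```python
def q10db(cf_hz: float, bandwidth_10db_hz: float) -> float:
    """Q10dB: characteristic frequency divided by the bandwidth 10 dB above threshold."""
    if bandwidth_10db_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return cf_hz / bandwidth_10db_hz


def bandwidth_from_q10db(cf_hz: float, q: float) -> float:
    """Invert the definition: bandwidth (Hz) implied by a given Q10dB."""
    if q <= 0:
        raise ValueError("Q10dB must be positive")
    return cf_hz / q


# Assumed example: a neuron tuned to 61 kHz with Q10dB = 400
# would have a 10 dB bandwidth of only 152.5 Hz.
print(bandwidth_from_q10db(61_000.0, 400.0))
```

Under these assumptions the tuning curve would be only about 150 Hz wide at 10 dB above threshold, which illustrates how sharp the tuning in the CF-signal range is.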
6.1.1.5 Central connections of the CN
The CN of horseshoe bats gives rise to multiple ascending pathways and is the target of multiple descending pathways (Vater and Feng 1990). The principal connection scheme in bats follows the basic mammalian plan and is summarized in Figure 6.1.3. The ascending connections are tonotopically organized and include three major target sites: the nuclei of the superior olivary complex (SOC), the nuclei of the lateral lemniscus (LL) and the inferior colliculus (IC). The principal nuclei of the SOC receive predominant input from the AVCN but also from the PVCN. The same is true for the variety of peri-olivary regions, including those giving rise to olivocochlear projections, but they are in addition a target of the DCN. The ventral and intermediate nuclei of the lateral lemniscus receive input mainly from the AVCN, but there is a distinct input to a specialized subregion of the VNLL that appears to be derived from multipolar or octopus cell-like neurons of the caudal PVCN. The IC is the target of projections from all three subdivisions of the CN. Descending inputs to the CN are derived from several peri-olivary nuclei, many of which immunostain for GABA and/or glycine. They thus represent a potential source of extrinsic inhibitory input (Vater and Feng 1990).
Fig. 6.1.3: A: Generalized connectivity scheme of the ascending brainstem auditory pathway of bats (following data from Vater and Feng 1990, Casseday et al. 1988, Vater et al. 1995). B–D: Distribution of GABA and glycine immunoreactivity in neurons of the main nuclei of the ascending auditory pathway derived from postembedding immunocytochemistry (B after Vater et al. 1997, C after Kemmer and Vater 1997, D after Vater 1995).
6.1.2 Superior olivary complex: the first stage of binaural interactions
The principal nuclei of the SOC play an integral role in processing binaural cues (reviews in Irvine 1992, Schwartz 1992). According to the duplex theory of hearing, the lateral superior olive (LSO) processes interaural level differences by virtue of convergence of a direct excitatory input from the ipsilateral AVCN and an indirect inhibitory input from the contralateral ear that is relayed in the medial nucleus of the trapezoid body (MNTB). The medial superior olive (MSO) of non-echolocating mammals is involved in processing small interaural time/phase differences by functioning as a coincidence detector of the activity in two excitatory channels directly derived from the ipsi- and contralateral CN. Contrasting with the prediction of a reduced or absent MSO in mammals with predominantly high-frequency hearing and small interaural distance, all main nuclei of the SOC are well developed in bats (review in Covey and Casseday 1995). Anatomical and physiological studies of the SOC in bats show a highly-conserved LSO/MNTB system but a high variability in the functional organization of cell groups occupying the same relative location as the MSO of non-echolocators (Casseday et al. 1988a,b, Covey et al. 1991, Grothe et al. 1992, Vater 1995, 1996).
6.1.2.1 Anatomy and connectivity of the principal nuclei of the SOC
In all bats, the LSO and MNTB are readily identified in conventional Nissl-stained sections, but there is pronounced variability in the design of the nuclei located in between. For example, in Pteronotus, the single well-delineated nucleus located between LSO and MNTB has been referred to as MSO because its relative location and its ascending projections to the IC are similar to those of the ‘classical’ MSO. In Rhinolophus, two nuclei occupy a similar relative position, and these were termed ventromedial superior olive (VMSO) and dorsal medial superior olive (DMSO) in order to signify that each of them shares some similarity with an MSO, but that they also possess different organizational features (Casseday et al. 1988, Vater and Feng 1990). Small injections of WGA-HRP into physiologically-characterized regions of the SOC of the horseshoe bat (Casseday et al. 1988) and the mustached bat (Covey et al. 1991) show that the connectivity of the LSO/MNTB system is highly conserved in both species. The inputs to the LSO are derived from the ipsilateral CN and the ipsilateral MNTB, and the inputs to the MNTB are derived from the globular cell area of the contralateral CN. The LSO projects bilaterally to the dorsal nucleus of the lateral lemniscus (DNLL) and inferior colliculus (IC) (Vater et al. 1995). The organization of the input connectivity of the MSO in Pteronotus and the VMSO and DMSO in Rhinolophus differs quantitatively from the pattern obtained in other mammals in that the inputs from the ipsilateral side are greatly reduced and the inputs from the contralateral CN dominate (Fig. 6.1.3). Thus the input connectivity argues to some extent against a homology of MSO-like structures in these bats. Additionally, in bats, MSO or MSO-like structures receive a prominent input from the ipsilateral MNTB.
However, consistent with the projection pattern of a typical MSO, the MSO in Pteronotus, and VMSO and DMSO in the horseshoe bat project strictly ipsilaterally to the DNLL and IC. Use of two different anterograde tracers in Pteronotus shows that the central projections of neurons from corresponding isofrequency laminae in LSO and MSO largely overlap at the level of DNLL but are partially segregated and partially overlapping within the IC (Vater et al. 1995).
6.1.2.2 Physiology of the SOC
In horseshoe bats (Casseday et al. 1988) and the mustached bat (Covey et al. 1991), neurons in the LSO respond with tonic or chopper discharge patterns to monaural stimulation of the ipsilateral ear. Increasing stimulus intensity at the contralateral ear causes graded inhibition of the response. As in other mammals, the vast majority of bat LSO-neurons exhibit this classical E/I response property, with the inhibition being mediated by the glycinergic MNTB-input (review in Irvine 1992). Despite differences in the shape of the LSO between Pteronotus and Rhinolophus, the same basic tonotopy is observed, with low frequencies represented laterally and high frequencies found progressively more medially. Recordings in the MSO of Pteronotus show that most units are monaural, being driven by stimulation of the contralateral ear and not influenced in any obvious way by ipsilateral inputs (O/E) (Covey et al. 1991, Grothe et al. 1992). This differs from the E/E responses that are the characteristic binaural response type of the mammalian MSO (review in Irvine 1992). Furthermore, many cells of the mustached bat MSO respond with either phasic on or off discharges to pure tones. The VMSO/DMSO of the horseshoe bat contain a mixed population of cells with O/E and E/E response properties (Casseday et al. 1988). Micro-iontophoresis of putative inhibitory neurotransmitters and their antagonists onto MSO-cells in Pteronotus shows that the specialized response properties are created by convergence of excitatory input with glycinergic input, both being activated by the contralateral ear (Grothe et al. 1992).
6.1.2.3 Chemoarchitecture of the SOC
Since physiological data suggested that the glycinergic input to the MSO derived from the MNTB is more pronounced than in other mammals, post-embedding immunocytochemistry was performed on semithin sections of the SOC of the mustached bat. Additionally, parallel investigations were carried out on the synaptic inventory of the main nuclei of the SOC (Vater 1995). Again, LSO and MNTB appear conservative in their neurotransmitter profile, but the MSO shows an organization very different from that of other mammals. In the mustached bat MSO, putatively glycinergic inputs, as revealed by dense perisomatic puncta labelling and synaptic boutons containing flattened vesicles, are much more abundant than in other mammals. Moreover, the mustached bat MSO contains glycine-immunoreactive and double GABA-glycine labelled projection neurons mixed with non-immunoreactive projection neurons (Fig. 6.1.4), whereas the classical MSO of other mammals is composed of putatively excitatory neurons only. The highly derived organization of the mustached bat’s MSO may have developed from a fusion of peri-olivary structures (superior para-olivary nucleus) and MSO-like structures (Vater 1996). It represents a species-characteristic specialization, since comparative immunocytochemistry in horseshoe bats and big brown bats shows significantly different organizational features (Vater 1996).
Fig. 6.1.4: Left: Binaural response properties and connectivity of the main nuclei of the SOC in horseshoe bats and mustached bats (Rhinolophus after Casseday et al. 1988, Pteronotus after Covey et al. 1991). Right: Influence of microiontophoretically-applied GABA, glycine and their antagonists (bicuculline, strychnine) on temporal response patterns of mustached bat MSO neurons (after Grothe et al. 1992).
6.1.3 Lateral lemniscus
The nuclei of the lateral lemniscus are especially highly developed in echolocating bats (reviews in Covey and Casseday 1995, Schwartz 1992). Of particular interest is the functional organization of a prominent subregion of the VNLL of bats that receives input via large calyciform endings of CN axons (Vater and Feng 1990). Neurons of this ‘columnar region’ of the VNLL (VNLLc) are known to respond with single, latency-constant spikes tightly locked to the stimulus onset (Covey and Casseday 1995). According to our immunocytochemical results, they represent a fast glycinergic relay to the midbrain (Vater and Braun 1994, Vater et al. 1997). Specializations of excitatory somatic and dendritic inputs are likely involved in creating the broad tuning and sensitivity for sweep direction that is highly relevant to the temporal processing of echo information (Vater et al. 1997).
6.1.4 Inferior colliculus
The IC is the target of multiple ascending pathways with different neurotransmitter profiles. As shown by immunocytochemical techniques, it receives putatively GABAergic and glycinergic input via several ascending pathways and contains the substrate for intrinsic GABAergic interactions (Vater et al. 1992). Micro-iontophoretic application of the putative inhibitory neurotransmitters GABA and glycine and their antagonists is a useful technique for distinguishing the response properties that are created locally by interactions of excitatory and inhibitory inputs from those that are already present in the inputs from a variety of lower brainstem nuclei (Vater et al. 1992). We showed that GABAergic and glycinergic inhibition sculpts the temporal response properties of many IC-neurons, and modifies tuning curve shape and binaural responses. Moreover, we showed that information processing within the IC is under the influence of modulatory cholinergic inputs (Habbicht and Vater 1996).
6.2 Serotoninergic innervation of the auditory pathway in birds and bats – implications for acoustical signal processing Alexander Kaiser
In vertebrates, neurons that use serotonin (5-hydroxytryptamine, 5-HT) as a transmitter are located mainly in the raphe nuclei, which project bilaterally to form a fine network innervating many structures of the brain, including the auditory system. The 5-HT system is associated with feeding behaviour, thermoregulation, sexual behaviour, cardiovascular/hypotensive effects, sleep, aggression and attention. A disturbance of the regulation of the 5-HT system can affect hearing through the hyperacusis associated with migraine (Wang et al. 1996). In the rat cochlear nucleus, micro-iontophoretic application of 5-HT was shown to produce a decrease in spontaneous and sound-evoked activity and a long-latency increase of sound-evoked responses (Ebert and Ostwald 1992). 5-HT-immunoreactive terminals are known to be present e.g. in the cochlear nucleus, superior olivary complex, nuclei of the lateral lemniscus and inferior colliculus in the guinea pig, cat and bush baby (Thompson et al. 1994, Thompson et al. 1995). To investigate the putative serotoninergic modulation of acoustic processing, we studied 5-HT-immunoreactive fibres in auditory nuclei of the echolocating bat, Eptesicus fuscus, and the chicken, Gallus domesticus. The parallel neuronal processing of space-specific acoustic information relevant to sound localization is well understood in birds (e.g. Carr 1992, Konishi 1993). The auditory brainstem nuclei of echolocating bats are larger and more highly differentiated than in other vertebrate species (Covey and Casseday 1995), but do not differ in their general organization and, due to adaptations for their biosonar, their inferior colliculus has anatomically identified frequency laminae. Thus, both species are excellent models for comparing 5-HT innervation and the function of auditory nuclei. As previously described (e.g. Thompson et al. 1994), 5-HT-immunoreactive cell bodies were located in the raphe and other nuclei in both chicken and bat.
The density of 5-HT immunoreactive fibres showed typical patterns for different nuclei in both species studied. In the avian nuclei of the ‘time pathway’, the magnocellular cochlear nucleus and laminar nucleus, no 5-HT immunoreactive fibres were present. The first nucleus of the ‘intensity pathway’, the cochlear angular nucleus, however, was moderately densely innervated. This suggests different 5-HT modulation of these parallel auditory brainstem pathways. Within the inferior colliculus of the bat, the distribution of 5-HT terminals was clearly correlated with the expanded 20–30 kHz and the 50+ kHz frequency laminae, both of which are important for echolocation. The lower density of 5-HT fibres in these regions suggests that serotoninergic modulation plays a lesser role there. In both bat and chicken, the auditory thalamus was virtually devoid of 5-HT immunoreactivity, suggesting no modulatory role for 5-HT in these nuclei. In summary, there were clear differences in the amount of 5-HT innervation of different nuclei of the ascending auditory system, and in different regions of the bat inferior colliculus. This suggests that different auditory nuclei in both species and different regions of the bat inferior colliculus receive different amounts of serotoninergic modulation. This may correspond to a greater or lesser role of attention in the processing performed at these different levels (Kaiser and Covey 1996).
6.3 Temporal processing in the lower brainstem Benedikt Grothe
Under natural conditions we do not perceive single isolated sounds, but rather a complex composition of different acoustical events. In order to separate the different sound sources, for instance different voices at a party, the mammalian central auditory system has to compute distinct images of the acoustical environment by dissecting the various kinds of information available in the mixed input. Numerous psychoacoustic experiments have shown that the temporal structure of a sound is a major source of information in performing this task. Additionally, temporal cues that physically exist in the acoustic input or that may be introduced by the auditory system itself are important for localizing sounds in space. Despite its importance, little is known about the neuronal mechanisms of temporal processing. In our project, we investigated three different aspects of temporal processing at the level of the superior olivary complex and the auditory midbrain of bats: (1) the role of neural delays in processing temporal and spatial information, (2) the role of inhibition in periodicity coding, and (3) the inter-dependence of temporal and spatial stimulus parameters.
6.3.1 The role of neural delays in IID-coding of LSO cells
Neurons in the lateral superior olive (LSO) respond selectively to interaural intensity differences (IIDs), the chief cue associated with localizing high-frequency sounds in space (for review see Irvine 1992). LSO cells are excited by sounds that are more intense at the ipsilateral ear and inhibited by sounds that are more intense at the contralateral ear. This response pattern creates IID functions that are characterized by maximal discharge for monaural stimulation at the ipsilateral ear. Increasing sound pressure at the contralateral ear progressively reduces the discharge rate of LSO neurons and, finally, completely inhibits the response. Thus, the main mechanism for processing IIDs in the LSO is a subtraction of the inputs from the two ears. Despite the relatively homogeneous pattern of innervation of LSO cells, IID selectivity (e.g. the point of complete inhibition in the IID function) varies substantially from cell to cell, such that selectivities are distributed over the range of IIDs that would be encountered in nature (Sanes and Rubel 1988). It has been speculated for some time that the relative timing of the excitatory and inhibitory inputs to an LSO cell might shape IID selectivity (Yin et al. 1985, Pollak 1988, Tsuchitani 1988, Irvine et al. 1995, Joris and Yin 1995). For the free-tailed bat, Tadarida brasiliensis mexicana, we showed that, for more than half of the cells, inhibition arrived several hundred µs after excitation when similar intensities were presented at both ears (Park et al. 1996). When interaural time differences (ITDs) were experimentally introduced to compensate for that delay, the point of complete inhibition was shifted (Fig. 6.3.1). Under natural conditions (without prominent ITDs), increasing the intensity at the inhibitory ear shortens the latency of inhibition (Grothe and Park 1995, Park et al. 1996) and brings the timing of the inputs from the two ears into register.
Thus, a neural delay of the inhibitory input helped to define the IID selectivity of these cells, accounting for a significant part of the variation in IID-selectivity among LSO cells.
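The timing argument above can be illustrated with a deliberately simplified latency model. This is a sketch under stated assumptions, not the analysis used in the cited studies: the cell is assumed to fire only during a brief window after excitation arrives, inhibition is assumed to suppress firing completely once it arrives, and the latency of inhibition is assumed to shorten linearly with intensity at the inhibitory ear. All names and parameter values (`neural_delay_us`, `latency_shift_us_per_db`, `window_us`) are hypothetical.

```python
def lso_response(contra_minus_ipsi_db, neural_delay_us=400.0,
                 latency_shift_us_per_db=20.0, advance_us=0.0, window_us=500.0):
    """Normalized response (0..1) of a toy LSO cell.

    The response is the fraction of the excitatory window that elapses
    before the (assumed all-or-none) inhibition arrives.
    """
    # Lag of inhibition behind excitation: the neural delay, shortened by
    # higher intensity at the inhibitory ear and by any electronic advance.
    lag_us = (neural_delay_us
              - latency_shift_us_per_db * contra_minus_ipsi_db
              - advance_us)
    return min(1.0, max(0.0, lag_us / window_us))


def point_of_complete_inhibition(**kwargs):
    """Smallest intensity difference (dB, 1 dB steps) at which the response is zero."""
    for level_db in range(-40, 41):
        if lso_response(level_db, **kwargs) == 0.0:
            return level_db
    return None


print(point_of_complete_inhibition())                  # simultaneous presentation
print(point_of_complete_inhibition(advance_us=300.0))  # inhibitory ear advanced
```

In this toy model, advancing the signal to the inhibitory ear moves the point of complete inhibition toward lower intensities at that ear, which is the qualitative effect described for Fig. 6.3.1. The model ignores the subtractive interaction of input strengths, so it isolates the contribution of timing only.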
Fig. 6.3.1: IID functions from three Tadarida LSO cells illustrating how electronically advancing the signal to the inhibitory ear affected IID selectivity in cells whose inhibitory inputs had a neural delay relative to their excitatory inputs. Each graph shows the IID function of a cell when the stimulus was presented simultaneously at both ears (solid lines) and when the signal to the inhibitory ear was electronically advanced by 300 or 400 µs relative to the signal at the excitatory ear (dashed lines). Less intensity was required at the inhibitory ear when the signal to this ear was electronically advanced (from Park et al. 1996).
6.3.2 The role of inhibition in processing periodic stimuli
In mammals with good low-frequency hearing and a moderate to large inter-ear distance, neurons in the medial superior olive (MSO) are sensitive to interaural time differences (ITDs), i.e. to the relative interaural delay (for review see Irvine 1992). The underlying mechanism is thought to rely on a coincidence of excitatory inputs from the two ears that are phase-locked either to the stimulus frequency or to the stimulus envelope (Jeffress 1948). Extracellular recordings from MSO neurons in several mammals are consistent with this theory (Goldberg and Brown 1969, Yin and Chan 1990, Spitzer and Semple 1995). However, two aspects remain a puzzle. The first concerns the role of the MSO in small mammals that have relatively poor low-frequency hearing and whose heads are small and generate ITDs of less than 30 µs. The second concerns the role of the prominent inhibitory inputs to MSO neurons (Cant and Hyson 1992, Kuwabara and Zook 1992, Grothe and Sanes 1993). In one bat species, the mustached bat, the MSO is a functionally monaural nucleus (Covey et al. 1991) and, therefore, does not compare ITDs. However, single unit recordings and neuropharmacological manipulations have shown that these neurons still process timing information by comparing the excitatory and the inhibitory inputs, both driven by the contralateral ear (Grothe et al. 1992). By means of this comparison, these neurons act as low-pass filters for the modulation rate of sinusoidally amplitude-modulated (SAM) stimuli, with the filter cut-off defined by the time course of the inhibition, particularly its delay and duration. These filter characteristics can be experimentally eliminated by antagonizing the inhibitory transmitter glycine through iontophoretic application of strychnine (Grothe 1994).
Similar filter mechanisms based on the interaction of excitatory and inhibitory inputs have been shown for the gerbil MSO in vitro (Grothe and Sanes 1994) and more recently for the dorsal nucleus of the lateral lemniscus (Yang and Pollak 1997) and the inferior colliculus (Koch and Grothe 1997, see below). In contrast to the mustached bat, the MSO of the Mexican free-tailed bat is a more typical binaural structure (Grothe et al. 1994), with more than half of the neurons receiving excitation and inhibition from both sides. Nevertheless, its neurons also act as SAM filters comparable to those in the mustached bat. Several tests proved that, again, the delay and duration of the inhibitory inputs are crucial for establishing the filter mechanism (Fig. 6.3.2; Grothe et al. 1997a).
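How a delayed inhibitory input can turn a neuron into a low-pass filter for modulation rate can be sketched with a toy event model. The assumptions, all of them hypothetical simplifications, are: each modulation cycle delivers one excitatory event; the same input also triggers feedforward inhibition that starts after a fixed delay and lasts for a fixed duration; an excitatory event falling inside any inhibitory window produces no spike. The parameter values (1 ms delay, 3 ms duration) are arbitrary illustrations, not measured values.

```python
def phase_locked_fraction(mod_rate_hz, inhib_delay_ms=1.0, inhib_duration_ms=3.0,
                          n_cycles=20, inhibition_blocked=False):
    """Fraction of modulation cycles that evoke a spike in a toy MSO neuron."""
    period_ms = 1000.0 / mod_rate_hz
    # One excitatory event per modulation cycle.
    excitation_times = [k * period_ms for k in range(n_cycles)]
    spikes = 0
    for k, t in enumerate(excitation_times):
        suppressed = False
        if not inhibition_blocked:
            # Inhibition is driven by the input itself (feedforward), so every
            # earlier cycle opens an inhibitory window regardless of spiking.
            for t_prev in excitation_times[:k]:
                start = t_prev + inhib_delay_ms
                if start <= t <= start + inhib_duration_ms:
                    suppressed = True
                    break
        if not suppressed:
            spikes += 1
    return spikes / n_cycles


for rate in (50, 100, 200, 400, 800):
    print(rate,
          phase_locked_fraction(rate),
          phase_locked_fraction(rate, inhibition_blocked=True))
```

With these assumed values the model follows every cycle at low modulation rates and responds essentially only to the first cycle once the modulation period becomes shorter than the delay plus duration of the inhibition. Setting `inhibition_blocked=True`, a stand-in for the strychnine experiment, removes the cut-off.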
Fig. 6.3.2: PSTHs of an MSO neuron in Tadarida in response to sinusoidally amplitude-modulated sounds with different modulation rates. Left column: Under control conditions the neuron only phase-locked well to low modulation rates (lower panel) but not to 200 Hz modulation rate (upper panel). Right column: In the presence of strychnine, the neuron exhibited precise phase-locking to 200 Hz modulation frequency and higher (not shown).
Using sinusoidally amplitude-modulated tones, we found that the ITD sensitivities of many MSO cells in the bat were remarkably similar to those reported for larger mammals (Fig. 6.3.3; Grothe and Park 1996, 1998). Our data also indicate an important role for inhibition in sharpening ITD sensitivity and increasing the dynamic range of the ITD-functions. Again, the time course of the inhibitory inputs is a crucial factor for the selectivity of these neurons (Grothe et al. 1997a). Figure 6.3.4 illustrates these effects in ITD-functions before and during blockade of glycinergic inhibition (Grothe 1997a). These in vivo results are consistent with in vitro recordings from gerbil brain slices, in which ITD-coding was significantly affected by the timing of the glycinergic inputs (Grothe and Sanes 1994).
Fig. 6.3.3: ITD sensitivity for a cat MSO neuron (upper curve, adapted from Yin and Chan, their Fig. 3) and a free-tailed bat MSO neuron (lower curve). As illustrated here, ITD functions reported for the cat, as well as those we measured from the bat, showed a correlation of response magnitude with the relative phase difference of the stimulus at the two ears. Hence, to facilitate a direct comparison, ITDs were translated into phase differences on the x-axis of the graphs presented here. In the example shown for the cat, the neuron was tested with a 300 Hz pure tone and sensitivity was related to the relative timing of the 300 cycles/s at each ear. The stimulus presented to the bat was a high-frequency tone that was amplitude modulated at a rate of 200 cycles/s and sensitivity was related to the relative timing of the amplitude modulations. The shaded areas on each graph display the range of ITDs that naturally occurs for these species (from Park and Grothe 1996).
Fig. 6.3.4: ITD-function of an MSO neuron in Tadarida in response to a pure tone with 200 Hz amplitude modulation (dotted line: control; solid line: in the presence of strychnine). Strychnine affected the dynamic range of the ITD-function by diminishing the troughs by more than 50 %. Additionally, the peak in the ITD-function shifted and broadened (after Grothe 1997a).
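The translation of ITDs into phase differences mentioned for Fig. 6.3.3 amounts to multiplying the time difference by the relevant periodicity (tone frequency or modulation rate). The helper below is a hypothetical illustration of that arithmetic; the 200 Hz rate and the 30 µs value are taken from the text, the 2500 µs value is an assumed example.

```python
def itd_to_phase_cycles(itd_us: float, rate_hz: float) -> float:
    """Convert an interaural time difference (µs) into a phase difference in cycles."""
    return itd_us * 1e-6 * rate_hz


# At a 200 Hz modulation rate one cycle lasts 5000 µs:
print(itd_to_phase_cycles(2500.0, 200.0))  # half a cycle
print(itd_to_phase_cycles(30.0, 200.0))    # ITD of the order produced by a small head
```

Under these assumptions, an ITD of 30 µs corresponds to well below one hundredth of a 200 Hz modulation cycle, which shows how small the naturally occurring range is relative to the full ITD function.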
These results show that, in any case, MSO neurons code for temporal aspects of acoustic stimuli and, if receiving binaural inputs, are in principle capable of ITD coding. However, the resolution of the ITD functions and the profound impact of stimulus parameters other than ITDs (such as periodicity) support the hypothesis that ITD coding is not the primary function of MSO neurons in bats and in other small mammals. The possible functions of the binaural inputs are discussed below.
At the level of the auditory midbrain, inhibition seems to be involved in creating filter characteristics for periodic stimuli by mechanisms that may be similar to those in the MSO and the DNLL. Many neurons in the IC of Eptesicus respond poorly or not at all to pure tones of a single frequency, yet are highly selective to the rate and modulation depth of sinusoidally frequency-modulated tones (SFM) (Casseday et al. 1997). The majority of these neurons exhibit band-pass filter characteristics in their modulation transfer functions. In about 40 % of the neurons tested, both the lower and the upper cut-off could be manipulated pharmacologically by iontophoretically applying the GABA antagonist bicuculline or the glycine antagonist strychnine (Fig. 6.3.5; Koch and Grothe 1998). At least for the inhibitory components that are involved in creating the upper cut-off limit, there is evidence that the temporal characteristic (delay and duration) is a crucial factor. Given the high degree of synaptic convergence on IC neurons, it is not surprising that the effects are not as straightforward as those shown for MSO and DNLL neurons. However, inhibition seems to be a crucial factor for all kinds of temporal processing investigated so far. This also holds for the temporal discharge patterns in the IC of horseshoe bats (Vater et al. 1992) and the duration-sensitive neurons described for the IC in Eptesicus (Casseday et al. 1994, Covey et al. 1996).
Fig. 6.3.5: In Tadarida LSO neurons, the modulation transfer functions for amplitude-modulated tones are independent of IIDs (A: Example of modulation transfer functions of an LSO neuron; B: Population statistic). In contrast, in MSO neurons, the filter functions depend on IID (C: Example of modulation transfer functions of an MSO neuron; D: Population statistic). The dashed field illustrates the shift of the 50 % cut-off point when IID was changed from 0 dB to +30 dB.
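The 50 % cut-off point mentioned in the caption can be read from a modulation transfer function by finding the modulation rate, above the peak, at which the response falls to half of its maximum. The sketch below does this by linear interpolation between measured points. It is only one plausible way of computing such a value, not necessarily the procedure used in the original study, and the data in the usage example are invented for illustration.

```python
def upper_cutoff_hz(mod_rates_hz, responses, fraction=0.5):
    """Modulation rate above the peak where the response first drops below
    `fraction` of the peak response (linear interpolation between samples).

    Returns None if the response never falls below the criterion.
    """
    peak = max(responses)
    if peak <= 0:
        return None
    threshold = fraction * peak
    i_peak = responses.index(peak)
    for i in range(i_peak, len(responses) - 1):
        r0, r1 = responses[i], responses[i + 1]
        if r0 >= threshold > r1:
            f0, f1 = mod_rates_hz[i], mod_rates_hz[i + 1]
            return f0 + (r0 - threshold) * (f1 - f0) / (r0 - r1)
    return None


if __name__ == "__main__":
    # Invented low-pass example: spike counts at five modulation rates.
    rates = [50.0, 100.0, 200.0, 400.0, 800.0]
    spikes = [10.0, 10.0, 8.0, 2.0, 0.0]
    print(upper_cutoff_hz(rates, spikes))
```

Comparing the value returned for functions measured at different IIDs would quantify a shift of the kind indicated by the dashed field.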
6.3.3 Temporal vs. spatial processing in the lower auditory brainstem
As mentioned above, MSO neurons in the mustached bat act as low-pass filters for periodic stimuli. Since most of these neurons are monaural, these filter characteristics are stable in the face of changes of stimulus position (Grothe 1994). In contrast, the majority of MSO neurons in the free-tailed bat are innervated from both sides (Grothe et al. 1997a). The monaural input from each side represents an independent filter in itself due to the interaction of excitatory and inhibitory projections from one side. A functionally important finding is that in most neurons, the cut-off of the filter from one side does not match the cut-off from the other side. Consequently, the filter functions obtained under binaural stimulus conditions present a mixture of the two single ones and, therefore, depend on the strength of the binaural inputs. This mixture, in turn, creates a dependence of the filter functions on interaural intensity differences that are defined by the position of a sound source in space (Fig. 6.3.5 A and B). Therefore, the MSO output is an integrated response to the temporal structure of a stimulus as well as its azimuthal position, i.e. IIDs. There are no in-vivo results concerning filter characteristics in the ‘classical’ MSO of larger mammals, but our data confirm an earlier speculation about this interdependence based on data derived from a gerbil brain-slice preparation (Grothe and Sanes 1994). The interdependence found in the MSO contrasts with the effects observed in the LSO under similar conditions (Grothe et al. 1997b). Here, filter characteristics obtained from normalized modulation transfer functions do not change with IIDs, as shown in Figure 6.3.5 C and D. This finding is striking because the contralateral, inhibitory input derives from the same source as that to the MSO, namely from the ipsilateral MNTB (medial nucleus of the trapezoid body).
In order to test whether this kind of interdependence of temporal processing and binaural cues is unique to lower brainstem auditory neurons or is a rather common characteristic of binaural neurons in general, we tested neurons in the auditory midbrain of the big brown bat, Eptesicus fuscus, in two ways. Firstly, we measured receptive fields for single neurons in the inferior colliculus under open-field conditions using different stimulus parameters. The idea was to test the extent to which the spatial selectivity of single neurons depends on stimulus conditions such as intensity, bandwidth and temporal structure. Secondly, we tested filter functions of IC neurons for periodic frequency-modulated tones under different binaural conditions using open-field as well as closed-field approaches. The first experiment revealed a surprising variability of the effects of changing stimulus parameters on width, height and centre of receptive fields of IC neurons (Grothe et al. 1996). In the face of changes in stimulus patterns, fewer than 20 % of the neurons showed stable receptive fields; the remainder exhibited significant shifts as well as changes in size and shape of receptive fields. In part, these effects were due to different response types of one and the same neuron to stimuli with different temporal features. Accordingly, the second approach, testing filter characteristics under different binaural conditions, showed significant changes in a large proportion of IC neurons.
Since many neurons in the IC of Eptesicus have been shown to be highly selective to rate and modulation depth of sinusoidally frequency-modulated tones (SFM), this stimulus type was used for these tests (Koch and Grothe 1997). Modulation transfer functions for the SFM stimuli changed significantly as a function of speaker position in about half of the neurons tested (Fig. 6.3.6).
Together with other recent studies, our experiments provide strong evidence that neural inhibition is involved in multiple aspects of temporal processing in the auditory brainstem of mammals. The interaction of one, two or multiple inhibitory inputs with excitatory inputs defines filter characteristics and thereby creates selectivity of single neurons to temporal aspects of a sound. Therefore, the exact timing of the inhibitory inputs (delay and duration) is crucial for the way inhibitory inputs are involved in creating selectivity for temporal stimulus parameters. Thus, these neural delays are of fundamental significance. However, an open question remains as to whether they are created by axonal length (delay lines), and are therefore introduced presynaptically, or whether they are created postsynaptically by temporal or spatial integration at the level of the synapse, the dendrite or the whole cell.
Fig. 6.3.6: Example of modulation transfer functions of an IC neuron in Eptesicus in response to monaural (solid line) and binaural stimulation (IID = 0; dotted line).
6.4 The barn owl as a model for high-resolution temporal processing in the auditory system Christine Köppl
The barn owl (Tyto alba) has become an important animal model for temporal processing, based on neural phase locking, in sound localization. Owls use minute differences in the microsecond range in the arrival times of an auditory signal at the two ears to determine the azimuthal position of a sound source (e.g. review in Konishi 1993). The underlying neuronal mechanisms are of general significance and are, for example, also found in mammals (Smith et al. 1993, Yin and Chan 1990). Phase-locked responses from the auditory nerve are relayed via the cochlear nucleus magnocellularis (NM) to the binaural nucleus laminaris (NL), whose neurons act as coincidence detectors, being maximally excited when phase-locked spikes arrive simultaneously from both ears (review in Carr 1993). In order to code for a range of interaural time differences, i.e. different azimuthal positions of sound sources, systematic neuronal delays are introduced by varying the lengths and myelination of NM axons. If the introduced delay is compensated by the earlier arrival of an auditory signal at the respective ear, coincidence and thus optimal excitation will occur again in the NL neuron.
Although the neuronal circuits and mechanisms just described are in principle well established, important questions remain. For example, it is controversial whether neuronal phase locking is improved in the NM relative to the input from the auditory nerve. Also, previous interest has focussed almost exclusively on the frequency range between 4 and 9 kHz, which is of primary importance for the owl in precise sound localization. However, this often makes the owl hard to compare with other birds, e.g. the chicken, whose sensitive hearing range does not extend that high, and may thus lead to the interpretation of fundamental differences across frequencies as species differences.
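The principle of delay compensation described above can be illustrated with a strongly simplified Python sketch of a row of coincidence detectors. Each detector receives the input from one side through a longer axonal delay and from the other side through a correspondingly shorter one, and responds maximally when the two inputs arrive in phase. The cosine-shaped response, the 10 µs delay step, the 5 kHz test frequency and all function names are assumptions made for illustration only; they do not represent measured properties of the owl's NL.

```python
import math


def counter_current_delays(n_detectors, step_us):
    """Axonal delay pairs (left, right) in microseconds along a row of detectors.

    The left-side delay grows while the right-side delay shrinks, mimicking
    axons that traverse the nucleus in opposite directions.
    """
    last = n_detectors - 1
    return [(i * step_us, (last - i) * step_us) for i in range(n_detectors)]


def arrival_mismatch_us(itd_us, left_delay_us, right_delay_us, period_us):
    """Right minus left arrival time at the detector, wrapped to +/- half a period.

    itd_us > 0 means that the sound reaches the left ear first by itd_us.
    """
    raw = (itd_us + right_delay_us) - left_delay_us
    return (raw + period_us / 2.0) % period_us - period_us / 2.0


def coincidence_response(itd_us, left_delay_us, right_delay_us, freq_hz):
    """Normalised response (0..1), maximal when both inputs arrive in phase."""
    period_us = 1e6 / freq_hz
    mismatch = arrival_mismatch_us(itd_us, left_delay_us, right_delay_us, period_us)
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * mismatch / period_us))


def best_detector(itd_us, delay_pairs, freq_hz):
    """Index of the detector with the largest response to the given ITD."""
    responses = [coincidence_response(itd_us, left, right, freq_hz)
                 for left, right in delay_pairs]
    return max(range(len(responses)), key=responses.__getitem__)


if __name__ == "__main__":
    pairs = counter_current_delays(11, 10.0)
    for itd in (-60.0, 0.0, 40.0):
        print(itd, "->", best_detector(itd, pairs, 5000.0))
```

In this sketch the position of the most strongly activated detector shifts systematically with ITD, which is the sense in which such an arrangement forms a map of interaural time difference. Because the response depends only on phase, delay ranges exceeding one stimulus period would produce ambiguous peaks.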
6.4.1 Does phase locking improve in the nucleus magnocellularis?
In a recent study of phase locking in the auditory nerve and the NM of the barn owl (Köppl 1997), we directly compared data sets for both neuronal populations from the same individuals. The standard measure of phase locking quality, vector strength, was consistently higher in the auditory nerve than in NM above a CF of about 1 kHz. This was true for pooled data as well as when comparing individual auditory-nerve and NM units recorded in the same owl (an example is shown in Fig. 6.4.1). In an earlier study, Sullivan and Konishi (1984) found nearly the opposite, and concluded that phase locking was improved in the NM. This difference between the two studies is not easily explained; however, Sullivan and Konishi’s auditory-nerve sample was relatively small and other data available for vector strengths in the NM (Carr and Konishi 1990, Peña, personal communication) agree with ours. Comparing all available sets of barn-owl data, we therefore conclude that phase locking at high frequencies definitely does not improve in the NM, but rather that phase locking in the NM is even inferior to that of the auditory nerve above 1 kHz. A decrease of vector strength at high frequencies between the auditory nerve and the NM is also seen in the chicken (Salvi et al. 1992, Warchol and Dallos 1990) and between the auditory nerve and the anteroventral cochlear nucleus (AVCN) in the cat (Joris et al. 1994). A deterioration of phase-locking quality at high frequencies is thus consistently seen in different species, in spite of specializations in synaptic morphology and membrane-channel properties thought to optimize temporal responses in the NM and AVCN, respectively (e.g. Oertel 1985, Raman and Trussell 1992). Thus, it may well reflect an irreducible amount of temporal jitter at the synapses.
Fig. 6.4.1: Phase locking in the auditory nerve was superior to that in N. magnocellularis neurons at high frequencies. This plot shows the saturated vector strength for 3 auditory-nerve fibres (filled circles) and 5 NM neurons (empty squares), within an individual range of frequencies each. All 8 units were recorded along the same electrode track in one owl (C. Köppl, unpublished data).
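Vector strength, the measure of phase-locking quality used in this comparison, is conventionally computed by treating each spike as a unit vector at its stimulus phase and taking the length of the mean vector. A value of 1 indicates that all spikes fall at the same phase, a value near 0 that they are spread evenly over the cycle. The Python sketch below shows this standard computation; the function name and the synthetic spike trains in the usage example are illustrative and do not represent recorded data.

```python
import cmath
import math


def vector_strength(spike_times_s, freq_hz):
    """Length of the mean unit vector of spike phases relative to a tone of freq_hz."""
    if not spike_times_s:
        raise ValueError("at least one spike is required")
    total = sum(cmath.exp(2j * math.pi * freq_hz * t) for t in spike_times_s)
    return abs(total) / len(spike_times_s)


if __name__ == "__main__":
    # Synthetic example: one spike per cycle of a 1 kHz tone, always at the same phase.
    locked = [k / 1000.0 for k in range(50)]
    # Synthetic example: eight spikes spread evenly over one cycle of the same tone.
    spread = [k / 8000.0 for k in range(8)]
    print(round(vector_strength(locked, 1000.0), 3))
    print(round(vector_strength(spread, 1000.0), 3))
```

Temporal jitter of a fixed size in seconds occupies a larger fraction of the stimulus period as frequency rises, so the same synaptic jitter lowers vector strength more at high frequencies than at low ones.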
6.4.2 Frequency-specific adaptations of neuronal brainstem circuits
The two brainstem nuclei NM and NL and the axonal pathways connecting them show many morphological and physiological adaptations for fast temporal processing. However, there is also evidence for clear morphological differences along the tonotopic gradient of both nuclei (e.g. Boord and Rasmussen 1963, Jhaveri and Morest 1982, Takahashi and Konishi 1988), which have received little attention so far. What exactly are those differences and might they indicate functional differences between different frequency bands?
6.4.2.1 Neuronal and synaptic morphology
An early study in the pigeon (Boord and Rasmussen 1963) defined a purely vestibular projection area at the caudolateral extreme of the NM. While it is now clear that the vestibular projection was probably an artefact and this area actually represents the low-frequency end of the tonotopic gradient (see Sect. 2.7.2), its different morphology from the main body of the NM has consistently been noted by many investigators. The most prominent difference is the absence of the otherwise typical, large endbulb-of-Held terminals on NM neurons. Instead, in the caudolateral, low-frequency part, auditory-nerve fibres terminate in many small boutons (Fig. 6.4.2; Boord and Rasmussen 1963, Jhaveri and Morest 1982, Köppl 1994). As discussed in Köppl (1994), this is probably of little functional significance, since phase locking at low frequencies does not depend on the specialized electrical properties of an endbulb-type terminal. In addition to the difference in auditory-nerve terminal shape, we also recently defined two different cell types in the caudolateral, low-frequency NM of the barn owl: a stellate type not found in other parts of the NM and a small version of the principal neuron typical for all of the NM (Köppl and Carr 1997).
Fig. 6.4.2: Auditory-nerve fibre terminals on N. magnocellularis neurons were different at low and high frequencies. The left panel shows HRP-filled endbulbs of Held in a 5 kHz region of the nucleus. In the right panel, typical bouton-like endings from a 0.6 kHz region can be seen (C. Köppl).
In the NL, two distinct neuronal types were found to be unique to the caudolateral, low-frequency region (Köppl and Carr 1997). Interestingly, the practically separate spatial distribution of those two cell types appears to correspond to the target areas of two different projections from the NM. Stellate cells were confined to the dorsal tip of the caudolateral NL, where a convergent projection of all low frequencies (up to about 1.5 kHz) from the NM was consistently found. In contrast, the multipolar cell type was distributed throughout the remaining caudolateral NL, where a tonotopically-arranged projection of the same low-frequency range from the NM was observed (Köppl and Carr 1997). It is still unclear whether different neuronal populations in the NM might be responsible for these two projection areas and, more importantly, what the functional significance of an apparently redundant projection, unique to the low-frequency range, might be.
6.4.2.2 Axonal delay lines between N. magnocellularis and N. laminaris
It is well established for the higher-frequency range (above about 2 kHz) that the projection from the NM to the NL is the anatomical substrate of the delay lines for binaural coincidence detection (review in Carr 1993). The projection is tonotopic and each NM axon splits into two primary collaterals, innervating iso-frequency bands in the ipsi- and contralateral NL, respectively. In the barn owl, the NM axons from both sides run within the NL in a counter-current fashion, contacting NL neurons en passant as they traverse the thick nucleus in opposite directions. Physiologically, this results in many parallel maps of interaural time difference, both within and across iso-frequency bands.
In the caudolateral, low-frequency area of the NL, however, these well-known patterns are modified (Köppl and Carr 1997). Although a tonotopic projection is preserved (see also Sect. 6.4.2.1), the caudolateral NL is not as thick and the incoming ipsi- and contralateral NM axons do not show an orderly counter-current course, but branch in different directions. The morphological basis for axonal delay lines as seen in the higher-frequency range therefore does not exist. As discussed in Köppl and Carr (1997), this is probably a reflection of the inherent temporal scatter of the neural phase-locking code at low frequencies, which renders low frequencies effectively useless for precise sound localization and thus eliminates the need for constructing any neural maps of interaural time difference.
6.5 Cortical physiology, sensorimotor interactions Gerd Schuller
6.5.1 Auditory cortex in the horseshoe bat
The detection and discrimination performance of echolocating bats rests on the highly developed analysis capacities of the auditory system. Auditory processing in the auditory brainstem has been extensively investigated in bats for decades and it has been presumed that much of the behaviourally-relevant echo analysis is performed up to the level of the inferior colliculus (Casseday and Covey 1996). The auditory cortex in bats received increased attention only after O’Neill and Suga (1979) discovered that cortical fields showed highly specialized response patterns to biologically significant stimuli. The mammalian cortex has been divided into a number of physiologically and morphologically distinct subdivisions in a variety of species (Clarey et al. 1992). In many instances, the functional significance of the different fields has remained unclear and the growing evidence for distinctly different functional properties of cortical fields in the bat may contribute to the understanding of mammalian auditory cortical function in general. One group of bats, the so-called CF-FM bats, is categorized acoustically by echolocation calls that consist of long, constant-frequency portions and short initial and final frequency-modulated parts. These bats have been most extensively explored with respect to cortical subdivisions and show so far the most distinct functional regionalization.
This report will review the neurophysiological evidence for functional specialization of subdivisions in acoustically responsive cortical fields in the horseshoe bat, Rhinolophus rouxi, and their afferent and efferent connections. For the functional delimitation of cortical fields, the responses of single and multiple units were recorded with a stereotaxic procedure yielding a positioning precision of about 100 µm (Schuller et al. 1986). The recording coordinates were referred to a standard brain atlas of Rhinolophus rouxi (Radtke-Schuller, unpublished) in order to be able to compare the results from different individuals. Afferent and efferent connections were revealed using anterograde and retrograde transport of wheat germ agglutinin conjugated to horseradish peroxidase (WGA-HRP) or horseradish peroxidase (HRP). Tracer substances were injected either into subdivisions of the acoustically responsive cortical fields with known physiological properties or into other brain areas having connections with the auditory cortex.
Recording of response characteristics of cortical neurons to a wide range of different stimuli, i.e. pure tones, sinusoidally frequency- or amplitude-modulated stimuli, linear frequency modulations, combinations of linear frequency-modulated stimuli, noise as well as spontaneously uttered vocalizations, allowed the delineation of subdivisions of the auditory cortex. The subdivisions thus defined fitted well into cyto- and myelo-architectonic boundaries (Fig. 6.5.1) (Radtke-Schuller and Schuller 1995).
Fig. 6.5.1: Acoustically responsive cortical field with functionally characterized subdivisions of the horseshoe bat Rhinolophus rouxi. On the left is shown a dorsal (top panel) and sagittal (bottom panel) view of the cortical surface with the area depicted in the right panel. The curved cortical surface has been unrolled and the acoustically responsive fields are delimited by heavy lines. The horizontal broken lines correspond to the lines in the left drawing. Abbreviations: adf, anterior auditory field; ddf, dorsal dorsal auditory field; paf, posterior auditory field; pf, primary auditory field; rdf, rostral dorsal auditory field; vf, ventral auditory field.
6.5.1.1 Primary auditory field
The primary auditory field shows a clear tonotopic organization, with high frequencies lying rostrally and a gradient of decreasing frequencies in the caudal direction. Neurons with best frequencies at and a few kHz above the resting frequency of the bat, i.e. the frequency of the constant frequency portion, are over-represented in number and cortical surface area. The proportion of the cortical area per kHz occupied by neurons in this small frequency band compared to that of all other frequencies is 12 to 3. Concurrently, these neurons exhibit extremely narrow tuning characteristics, with Q-values up to several hundred. As in the primary auditory fields of other mammals, neurons here respond consistently to pure tones, with the shortest and most stable latencies among the auditory cortical areas.
6.5.1.2 Anterior dorsal field
Adjacent to the high frequency portion of the primary field, the anterior dorsal field extends to more medial coordinates and also shows a tonotopic gradient to lower frequencies. The transition is continuous and is reminiscent of the reversal of tonotopic gradients between the primary and anterior auditory fields in other mammals. Best frequencies at and above the resting frequency are also over-represented in the anterior dorsal field and show the highest Q-values, i.e. the sharpest tuning in the auditory cortex. A special feature of this area is the concentration of neurons that were activated during spontaneous vocalization.
6.5.1.3 Posterior dorsal field
The posterior dorsal field is the caudal counterpart of the anterior dorsal field and its spectral representation is centred on the frequency-modulated portion of the echolocation call. There is no distinct tonotopic arrangement of best frequencies and no frequencies from the constant frequency portion of the call are represented here. The tuning of many neurons in this area is much narrower than that of any neuron tuned to frequencies in the FM band in the primary field or any other field. This property could be useful for the exact temporal encoding of the start and/or end of successive calls or echoes.
6.5.1.4 Rostral dorsal field
This non-tonotopic area, at the rostral border of the anterior dorsal field, is characterized by the occurrence of neurons that reject even narrow-band signals such as frequency-modulated signals or narrow-band noise and respond preferentially to pure tone stimuli within the entire frequency range. Many neurons respond to combinations of two constant frequency tones with facilitation. The effective frequencies to activate these so-called CF-CF neurons lie near the resting frequency of the bat and the corresponding lower harmonic, although the best frequencies for facilitation are rarely exactly harmonically related. The time delay between the two components is not an important parameter for facilitated responses.
6.5.1.5 Dorsal field of the dorsal auditory cortex
This most dorsal portion of the auditory-responsive cortical area shows several specializations for the processing of linear frequency-modulated stimuli. Many neurons are preferentially activated by combinations of time-delayed, frequency-modulated stimuli (Schuller et al. 1991). The facilitation of the response in these so-called FM/FM neurons is critically dependent first on the frequency ranges of the modulations and second on the time delay between the two components. The first component covers the lower harmonic frequency range of the call (pulse) and the delayed component lies in the frequency range of the frequency-modulated portion of the call (echo). Maximum facilitation of the response is reached most often at ratios slightly deviant from the harmonic ratio. Temporal delay tuning is characteristic for individual combination-sensitive neurons and best delays range between 1 and 10 ms in the horseshoe bat. The neurons are arranged along the rostro-caudal axis in accordance with their best delay, the shortest best delays being located most rostrally. Best delays between 2 and 4 ms are largely over-represented and occupy 56 % of the cortical area where FM/FM neurons are found. The dorsal field of the dorsal auditory cortex is not tonotopically organized.
6.5.1.6 Ventral auditory field
The auditory area ventral to the primary auditory field has not been extensively investigated neurophysiologically and the few recordings showed reduced consistency of auditory responses, which were difficult to assess.
6.5.2 Afferent and efferent connections of auditory cortical fields
6.5.2.1 Thalamocortical connections
The tonotopically organized primary auditory field receives its main input from the central portion of the ventral division of the medial geniculate body (MGBv). The topographic order of this afferent pathway corresponds to the tonotopic order in the source and target areas. The anterior dorsal field, however, which also shows a relatively clear tonotopic order, has the most complex afferent connections; these originate in virtually all subdivisions of the auditory thalamus. It constitutes a region of high convergence of tonotopic, multisensory, and to a lesser extent non-tonotopic afferent components. The anterior dorsal field most closely resembles the anterior auditory field (AAF) in other mammals. The posterior dorsal auditory field in the bat compares best with the posterior auditory fields in the cat (PAF and VPAF), according to its afferent connections from the ventral and anterodorsal partitions of the medial geniculate body and its multisensory nuclei. The location of the ventral auditory field in the bat best corresponds to the secondary auditory cortex (AII) in the cat. Accordingly, analogous afferences originate from border regions of the ventral subdivision and multisensory nuclei of the medial geniculate body. However, there are no afferences from the dorsal medial geniculate body as found in the cat. The multisensory nuclei and the anterodorsal portion of the medial geniculate body are the most important sources of thalamic connections to the rostral dorsal field with its specialization for pure tones.
The dorsal part of the dorsal auditory cortex receives its predominant afferent input from the anterodorsal medial geniculate body, and additional afferences from the multisensory nuclei of the medial geniculate body. Although there is no correspondence to a dorsal dorsal field in the cat, auditory responsive regions dorsal to the primary auditory field have been found in other species (Radtke-Schuller 1997).
6.5.2.2 Connections with other brain regions
The efferent connections of the auditory cortex in the horseshoe bat have been investigated in some detail with respect to the thalamus, the pretectal area, the superior colliculus, the rostral pole of the inferior colliculus and the pontine nuclei. Retrograde transport after tracer injections into these regions has been used to determine these connections (Fig. 6.5.2).
Efferences to thalamus: Only the projections from the dorsal region of the dorsal auditory field back to the auditory thalamus will be briefly described. The cortical efferences end in the same region of the dorsal medial geniculate body from which its inputs derive (reciprocal connections), not in an exact point-to-point relationship, but rather an area-to-area relationship. These connections have been verified by tracer injections into the relevant parts of the medial geniculate body (Fromm 1990).
Efferences to the pretectal area: Tracer injections in the pretectal area retrogradely labeled pyramidal cells that were found exclusively in the ipsilateral dorsal part of dorsal auditory cortex. The rostral part of the pretectal area receives input from neurons located caudally, whereas the more caudal parts receive projections from more rostral levels (Schuller and Radtke-Schuller, submitted). The labeled neurons are located within the part of the dorsal auditory cortex that contains highly specialized neurons responding solely to combinations of frequency-modulated stimuli (FM/FM field) (Schuller et al. 1991).
Fig. 6.5.2: Efferent connections of the acoustic cortex as revealed by retrograde transport after tracer injection into target regions with potential functions in audio-motor control. The representation is schematic and underlines the connections of the dorsal field of the dorsal acoustic cortex. Due to the experimental procedure, the graph is necessarily not a complete representation of efferences of the acoustic cortex in the horseshoe bat.
Labeling covers about the caudal two thirds of the FM/FM field. Other auditory cortical fields (Radtke-Schuller and Schuller 1995), e.g. the primary area, do not project to the injected pretectal area and no labeled neurons were found outside the dorsal part of the dorsal auditory cortex. The cortico-pretectal projections were confirmed by anterograde transport from the dorsal auditory cortex to the pretectal area (Radtke-Schuller et al. unpublished).
Efferences to the superior colliculus: The intermediate and deep layers of the superior colliculus of the bat are strongly developed in comparison with the thin visual layers and only tracer injections into the former layers are considered (Reimer 1989). These layers receive input from widespread cortical areas covering the cingulate area, the very frontal cortical areas and the caudally adjacent, presumably motor and somatosensory cortices. Temporal auditory fields also contribute to the cortical input to the superior colliculus (Reimer 1989, 1991). The topography of the cortical projections to the superior colliculus has not been determined in detail.
Efferences to the rostral pole of the inferior colliculus: An important difference between the rostral pole of the inferior colliculus and the central nucleus is that the former receives descending input from the auditory cortex, whereas the latter, as in all non-primates, gets no or very sparse cortical feedback connections (Prechtl 1995). The cortical neurons projecting ipsilaterally to the rostral pole of the inferior colliculus are located in the dorsal portion of the dorsal auditory cortex, i.e. the FM/FM field.
Efferences to pontine nuclei: Any tracer injection covering the pontine grey nuclei also leads to labeling in cortical auditory regions that are, however, restricted to the dorsal portion of the dorsal auditory field. The projecting neurons are found along the entire rostro-caudal extension of the FM/FM field and are also present in the rostral dorsal field, which contains the combination-sensitive neurons for CF-CF patterns (Schuller et al. 1991). The connections of the FM/FM field with the pontine region have been reciprocally verified.
6.5.3 The auditory cortex in the horseshoe bat: Summary and conclusions
The auditory cortex in the horseshoe bat, Rhinolophus rouxi, has been delineated with different methods, i.e. firstly, with physiological recordings covering a broad range of acoustical stimulus types, secondly, using anterograde transport after tracer injections into the acoustical thalamus, the medial geniculate body, thirdly, by cyto- and myeloarchitectonic evaluation of neocortical fields and fourthly, by establishing the thalamo-cortical connections. In the horseshoe bat, the auditory cortex occupies a very large fraction of the neocortex compared to other mammals. Six regions could be distinguished that, to various degrees, correspond to subdivisions of the auditory cortex in other mammalian species.
Some pronounced cortical properties in the horseshoe bat may not be a specialization, but could also be present, although less prominent, in other species. A salient example of this is the dorsal field of the dorsal auditory cortex, which is not described, e.g. in the cat, although it is recognized in other mammals. Echolocation in the bat lends a very appealing interpretation for the function of the facilitated encoding of specific time delays by neurons in this area, in that the travel time from vocalization pulse to echo reception is a measure of distance to the target. However, the analysis of temporal delays is certainly an important function of auditory analysis common to many mammalian species and not a unique specialization in bats.
The dorsal field of the dorsal auditory cortex seems to be of importance not only for acoustical analysis of specifically important aspects of echolocation, but also for the control of motor or vocal programs. This assumption rests on the observation that the area sends efferences to many of those brain regions in which vocalization or pinna movements can be elicited either electrically or pharmacologically, i.e. to the pretectal area, the intermediate and deep layers of the superior colliculus, the rostral pole of the inferior colliculus and to the pontine nuclei. Cortical input to these regions, except to the superior colliculus, is in addition restricted to the dorsal dorsal acoustical field and no other cortical acoustical areas send projections there. Potentially, the dorsal field of the dorsal auditory cortex can exert parallel control in all these areas. However, as long as the specific function of these individual nuclei in the audio-motor context is not further elucidated, the functional relevance of the dorsal dorsal auditory cortex for audio-motor control remains speculative.
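The relation between echo delay and target distance invoked above is simply that the sound travels to the target and back, so that distance equals half the delay multiplied by the speed of sound. The short Python sketch below applies this to the range of best delays (1 to 10 ms) reported for FM/FM neurons in Sect. 6.5.1.5. The speed of sound of 343 m/s (air at about 20 °C) and the function name are assumptions introduced here for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius


def target_range_m(echo_delay_ms, speed_of_sound_m_s=SPEED_OF_SOUND_M_S):
    """Target distance in metres for a pulse-echo delay given in milliseconds.

    The factor 0.5 accounts for the two-way travel path (bat to target and back).
    """
    return 0.5 * speed_of_sound_m_s * echo_delay_ms * 1e-3


if __name__ == "__main__":
    for delay_ms in (1.0, 2.0, 4.0, 10.0):
        print(f"{delay_ms:4.1f} ms -> {target_range_m(delay_ms):.2f} m")
```

Under this assumption, best delays of 1 to 10 ms correspond to target distances of roughly 0.17 to 1.7 m, and the over-represented delays of 2 to 4 ms to roughly 0.34 to 0.69 m.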
6.5.4 Audio-vocal interaction in horseshoe bats
Insectivorous bats rely almost exclusively on their echolocation system for orientation in space or for catching prey. They are ideally suited for the study of acoustically guided behaviour that relies on mechanisms of the transformation of acoustic information into appropriate motor programs for vocalization, body orientation and flight movements. Orientation towards a sound or echo source involves the auditory analysis of spatial coordinates. The adaptation of echolocation calls to the requirements of specific echolocation tasks modifies the spectral and/or temporal organization of the calls upon auditory analysis of these parameters in the echoes. Two important audio-motor or audio-vocal control systems in bats are the focus of this short review. First, many bat species, especially rhinolophid and mormoopid bats, have highly motile pinnae, and the pinna alignment has an important influence on the directional characteristics of the receiving system (Obrist et al. 1993). Pinna movements provide spatial focusing on targets and are essential for spatial tracking of prey during pursuit. Pinna movements are tightly correlated with the emission of vocalizations and it is supposed that the close temporal coordination of vocalization and pinna orientation is important for spatial performance. Acoustic feedback may have further influence on pinna adjustment. Second, rhinolophid and mormoopid bats, so-called long CF/FM bats with long constant frequency (CF) orientation calls terminated by a short frequency modulated (FM) portion, use a highly sophisticated audio-vocal feedback system that stabilizes the echo carrier frequency to a distinct value. During this so-called Doppler-shift compensation, the Doppler-shift induced by the flight speed of the bat is analyzed in the hearing system and fed back to the motor structures controlling the vocal output. The feedback signal causes an appropriate decrease of the emitted frequency so that, in spite of the Doppler-shift, the echo frequency is stabilized at and just above the bat’s resting frequency. This automatic frequency control system is functionally well defined (Schnitzler 1968, 1970, Schuller et al. 1974, Schuller 1980) but the neuronal implementation of the signal paths used for the transformation of the acoustic frequency error into appropriate motor output commands is still largely unknown. The following report will summarize the contributions to this subject from our studies of the descending vocalization system and of relevant auditory structures of the horseshoe bat’s brain.
Animals and stereotaxic procedure: All neurophysiological and neuroanatomical investigations were carried out in old-world horseshoe bats, Rhinolophus rouxi, or in the neotropical mustached bat, Pteronotus p. parnellii. In both species, a specially-developed stereotaxic procedure (Schuller et al. 1986) was applied that allowed the referral of all positional data with high precision (typically 100 µm) to a common stereotaxic atlas for each species (atlases for Rhinolophus rouxi and Pteronotus p. parnellii by Radtke-Schuller, unpublished). The advantage over a ‘case-oriented’ procedure is evident, as only an evaluation procedure using a common reference allows a high precision of interindividual comparison and pooling of data.
Simulation of Doppler-shift compensation: In experiments with vocalizing and/or Doppler-shift compensating bats, the Doppler-shifted echoes could be simulated by an electronic playback system that maintained the frequency shift between echo and vocalization at any desired level. The shifts could be systematically modulated in time.
Neurophysiological and neuroanatomical procedures: Single or multiple units were recorded with conventional neurophysiological methods using either glass or metal microelectrodes.
Stimulus- or vocalization-related spike sequences could be stored and processed following different criteria. Tracers were primarily used for establishing the connections and projection patterns of neurophysiologically-characterized brain areas, but also to allow the verification of the stereotaxic procedure. Wheat germ agglutinin conjugated to horseradish peroxidase (WGA-HRP) was most commonly used, in addition to fluorescent beads, fluoro-gold and biotinylated dextran amine (BDA) in more recent experiments. Tracers were injected either with pressure or iontophoretically. The functional involvement of brain structures in vocal utterance and movements of the pinnae was tested with electrical micro-stimulation at very low stimulation currents (several µA). The stimulation was applied through metal microelectrodes that could also be used to lesion the tissue electrolytically. In order to avoid the fibre-of-passage problem, vocal and behavioural responses were also induced by injecting drugs (e.g. the glutamate agonist kainic acid).
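The Doppler-shift compensation loop and the playback paradigm described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' apparatus or a model of the bat's neural circuitry: the resting frequency of 76 kHz, the speed of sound, the proportional feedback gain and all function names are hypothetical, and the reference frequency is simplified to the resting frequency itself.

```python
# Illustrative sketch only: all numbers and names below are assumptions.
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def echo_frequency_hz(emitted_hz: float, flight_speed_m_s: float,
                      c: float = SPEED_OF_SOUND_M_S) -> float:
    """Echo frequency heard by a bat flying towards a stationary target.

    The two-way Doppler effect (moving emitter, then moving receiver)
    scales the emitted frequency by (c + v) / (c - v).
    """
    return emitted_hz * (c + flight_speed_m_s) / (c - flight_speed_m_s)


def compensated_emission_hz(reference_hz: float, flight_speed_m_s: float,
                            c: float = SPEED_OF_SOUND_M_S) -> float:
    """Emitted frequency that returns an echo exactly at reference_hz."""
    return reference_hz * (c - flight_speed_m_s) / (c + flight_speed_m_s)


def simulate_playback_compensation(shifts_hz, resting_hz: float,
                                   gain: float = 0.5):
    """Call-by-call feedback loop with electronically shifted echoes.

    Each echo is the emitted frequency plus the imposed shift. The next
    call is lowered by gain * (echo - resting frequency); the emitted
    frequency is never raised above the resting frequency (an assumption
    of this sketch). Returns the list of echo frequencies.
    """
    emitted = resting_hz
    echoes = []
    for shift in shifts_hz:
        echo = emitted + shift
        echoes.append(echo)
        error = echo - resting_hz
        emitted = min(resting_hz, emitted - gain * error)
    return echoes


if __name__ == "__main__":
    resting = 76_000.0  # hypothetical resting frequency in Hz
    echoes = simulate_playback_compensation([2_000.0] * 30, resting)
    print(f"first echo {echoes[0]:.0f} Hz, last echo {echoes[-1]:.1f} Hz")
```

With a constant imposed shift, the echo frequency in this sketch converges geometrically towards the reference while the emitted frequency settles below it; a time-varying list of shifts can be passed in the same way to mimic modulated playback.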
6.5.5 Peripheral vocalization system in the horseshoe bat In the horseshoe bat, Rhinolophus rouxi, the innervation of the larynx has been investigated neurophysiologically and anatomically to determine the location of laryngeal motoneurons in the N. ambiguus and their functional involvement both in the control of emitted frequency and the timing of the emitted echolocation call. Frequency control of the emitted echolocation calls is mediated by the motor branch of the superior laryngeal nerve, which originates in the rostral area of the ventrolateral portion of the motor nucleus, the N. ambiguus (Schuller and Rübsamen 1981, Schweizer et al. 1981). The inferior or recurrent laryngeal nerve is involved in the temporal structuring of the duration, the onset and termination of the echolocation call components (Rübsamen and Schuller 1981). Motoneurons of this branch are located in the dorso-caudal portions of the N. ambiguus (Schweizer et al. 1981). Neurophysiological recordings in the N. ambiguus during vocalization (Rübsamen and Betz 1986) showed a large range of response classes that were correlated either to vocalization or to the respiratory cycle. The activity was either temporally linked to distinct portions of the motor events within the echolocation calls, e.g. to the initial or final frequency portion, or changes in the emitted frequency in the vocalization were encoded by scaled spike activity. Neurons whose discharge correlated with the emitted call frequency had a very similar discharge-to-frequency relationship to the superior laryngeal nerve fibres and were found predominantly in the rostral portions of the N. ambiguus and in the retrofacial nucleus. These neurons were most probably motoneurons sending their fibres to the cricothyroid, the frequency controlling laryngeal muscle. Tracer injections into physiologically characterized portions of the N. 
ambiguus (Rübsamen and Schweizer 1986) revealed a large variety of afferent and efferent connections, which are certainly not exclusively involved in the control of vocalization, given the functional complexity of this brain region in the medullary reticular formation. Projections from the lateral parts of the periaqueductal gray as well as adjacent regions like the N. cuneiformis are considered to be parts of the descending vocalization system. Links to medial portions of the medulla and to the parabrachial nuclei are probably involved in respiratory coordination of vocalizations. The modulatory influence on vocal emissions of further descending projections from the pontine nuclei, the deep and intermediate layers of the superior colliculus, the red nucleus and frontal portions of the cortex is largely unknown.
6.5.6 Where in the brain can vocalization be elicited? Electrical micro-stimulation is a valuable tool to determine brain structures that are involved in eliciting motor reactions or behavioural patterns, although it yields little information about the mechanisms involved in the elicited motor pattern. Systematic screening for eliciting vocalization with electrical stimulation was done in mesencephalic and rostral medullary regions of the horseshoe bat’s brainstem. Three major types of sensitivity to electrical stimulation could be distinguished: 1) vocalizations indistinguishable from spontaneously uttered calls and typically also pinna movements could be elicited at low threshold currents (less than 20 µA) without provoking general limb or body movements, 2) the vocal emission could be elicited, but the stimulation led to distortions in the temporal pattern or in rare cases to alterations of the constant frequency portion of the vocalization, and 3) the vocalizations were not elicited in close correlation to each stimulus burst but started after several repetitions of electrical stimulation, did not synchronize to the stimulation rhythm and persisted for some while after the electrical stimulation had been switched off. In the latter two cases the response-to-stimulus relation was in most cases not systematically assessable and further careful adjustments involving all stimulus parameters and the synchronization to the respiratory cycle is needed. Following the criteria of the first case, i.e. 
a one-to-one relationship between stimulus and vocalization and a relatively constant response latency, four areas, presumably premotor parts of the descending vocal motor system, could be delineated in the brain stem: a) the paralemniscal tegmental area located rostrally and medially to the dorsal nucleus of the lateral lemniscus, b) dorsolateral parts of the mesencephalic reticular formation, corresponding to the deep mesencephalic nucleus in the rat, c) the intermediate and deep layers of the superior colliculus, which are enormously hypertrophied in the bat relative to the visual input layers, and d) the pretectal area at the transition between superior colliculus and medial geniculate body (Fig. 6.5.3). Electrical stimulation in these areas yielded vocalizations which were indistinguishable from spontaneously uttered calls and were in most cases accompanied by temporally coordinated pinna movements. The shortest latencies, around 25 ms, were found in the paralemniscal tegmental area, whereas the latencies in the other regions could range up to 60–100 ms. As long as the stimulus periodicity was not too different from the natural breathing rhythm, no spectral distortions occurred in these regions and the overall frequency of the elicited calls could not be influenced by the electrical stimulation parameters. These areas therefore seem to be more involved in triggering than in spectrally controlling the vocal emissions. The activation of one of the four areas seems to be sufficient to provoke the emission or the synchronization of vocalization, but it is a partly open question whether their activity is also a necessary condition for the emission of echolocation calls. Apart from the deep mesencephalic nucleus, the other three regions have been subject to further neurophysiological and neuroanatomical investigation.
Fig. 6.5.3: Schematic representation of the areas in which vocalization could be elicited. The regions in which vocalization was triggered in a one-to-one relationship, at constant latency and without arousal of the animal are marked by hatching. Grey stippling indicates regions in which the vocalizations were accompanied by arousal of the animal (N. cuneiformis/lateral periaqueductal grey) or showed some distortions (lateral pontine regions). Abbreviations: AP, pretectal area; cun, nucleus cuneiformis; DMN, deep mesencephalic nucleus; Hyp, hypothalamus; IC(rp), inferior colliculus (rostral pole); MGB, medial geniculate body; NLL, nuclei of the lateral lemniscus; NR, nucleus ruber; PAG, periaqueductal grey; pc, cerebral peduncle; PLA, paralemniscal area; pons, pontine nuclei; SC (s,i,d), superior colliculus (superficial, intermediate, deep); SN, substantia nigra.
6.5.6.1 Paralemniscal tegmental area
The application of pharmacological stimulation by unilateral injection of the glutamate agonist kainic acid provokes continuous emission of undistorted vocalizations with very regular, dose-dependent interpulse intervals (Pillat 1993, Pillat and Schuller 1997). Compensation of Doppler-shifts functions without impediment during electrical and pharmacological stimulation. This demonstrates, first, that the stimulation of the neurons and not of fibres-of-passage through this area is responsible for the emission of calls and, second, that unilateral activation of paralemniscal tegmental area neurons has no effect on Doppler compensation behaviour. Recordings of multiple and single neurons in the paralemniscal tegmental area (Metzner 1989, 1993) showed that activity of cells in the rostral and medial part was correlated to vocalization (vocal neurons), acoustical stimulation (auditory neurons) or both (audio-vocal neurons). Most (86 %) of the audio-vocal neurons had best frequencies between 1 kHz below and 6 kHz above the resting frequency, i.e. in the frequency range most relevant for echolocation and Doppler-shift compensation, and displayed a steep rise of activity within a small frequency increase. Vocal neurons were active before the vocal onset (leading by up to 150 ms) or during vocalization (excitation or inhibition) and their discharge rate often showed a dependency on the emitted frequency. Some neurons in this portion of the paralemniscal tegmental area had temporally-tuned response characteristics with a small activity window around 15–25 ms after the vocalization onset. The neurophysiological response characteristics suggested that the paralemniscal tegmental area might have a functional role in modulating the emitted frequency, possibly during Doppler-shift compensation (Metzner 1989, 1993).
Tracer injections in the paralemniscal tegmental area in two different species of CF-FM bats, Rhinolophus rouxi and Pteronotus p. parnellii, showed that this area has numerous connections with non-auditory structures involved in motor networks, whereas the connections to auditory structures turned out to be different in the two species (Metzner 1996, Schuller et al. 1997). A very prominent connection found in both species is the reciprocal connection with the intermediate and deep layers of the superior colliculus. Connections from the paralemniscal tegmental area to the superior colliculus are bilateral, whereas input from the superior colliculus reaches the paralemniscal tegmental area only ipsilaterally. The fact that no pronounced afferences from auditory structures were found in Pteronotus p. parnellii, in contrast to Rhinolophus rouxi, may be due to the restriction of the injection sites to the very rostral portion of the paralemniscal tegmental area, whereas Rhinolophus rouxi also received tracer injections in the more caudal parts of the paralemniscal tegmental area. The differences in connectivity are therefore not considered to be real species differences.
The proposed involvement of the paralemniscal tegmental area in the Doppler-shift compensation pathway was tested with electrolytic lesions of the paralemniscal area (Pillat 1995, Pillat and Schuller 1997). Neither unilateral nor bilateral lesions affected the Doppler-shift compensation system (Fig. 6.5.4), and they had only minor effects on the spontaneous utterance of echolocation calls. The paralemniscal tegmental area can therefore be considered not essential for the Doppler-shift compensation system; however, modulatory influences cannot be ruled out by these experiments.
Fig. 6.5.4: Doppler-shift compensation is not impaired by bilateral lesion of the paralemniscal tegmental area. Compensatory responses to a sinusoidally-modulated frequency shift between 0 and 2 kHz are shown before (top) and after (bottom) bilateral lesion of the paralemniscal tegmental area. The extent of the lesions is shown in the right column.
6.5.6.2 Pretectal area
Part of the pretectal area at the transition between superior colliculus and medial geniculate body has been delimited physiologically with electrical stimulation as a ‘vocal’ area and the afferent and efferent connections have been determined with tracer (WGA-HRP) injections. The region has very prominent afferent connections from the following structures of the auditory pathway: the dorsal field of the auditory cortex, the inferior colliculus (central and rostral pole nucleus) and the nucleus of the central acoustic tract. Input from non-auditory structures originates in the nucleus ruber, the deep mesencephalic nucleus and the lateral nuclei of the cerebellum. Efferent connections project back to thalamic targets (zona incerta and N. reticularis thalami), to the nucleus ruber, the cuneiform nucleus and distinct areas of the pontine gray (Schuller and Radtke-Schuller, in press).
In other mammals, the pretectal area is predominantly involved in visuo-motor functions, although afferences from other sensory modalities have been demonstrated, e.g. somatosensory projections from various sources (Rees and Roberts 1993, Yoshida et al. 1992), especially to the anterior pretectal region (APN). Very little information is available on the influx of auditory information to the pretectal area (Weber et al. 1986, Covey et al. 1987, Herbert et al. 1991, Wenstrup et al. 1994). The afferent and efferent connectivity of the pretectal area found in bats, however, suggests that the pretectum in these animals is functionally important for acoustically-guided behaviour (Fig. 6.5.5). The retinal afferences reaching the pretectum in rufous horseshoe bats are limited to a very narrow superficial shell and terminate in the nucleus of the optic tract (NOT) and the olivary pretectal nucleus (NPO) (Reimer 1989). The remaining large ventromedial portions of the pretectal complex in the horseshoe bat do not receive retinal input. Projections from other visual sources to the pretectum (e.g. superior colliculus, lateral geniculate body (LGB) or visual cortex) have not been investigated in bats. The predominance of acoustically active inputs to the bat’s pretectal area underlines the potential role of the pretectal area in audio-motor coordination in the bat.
6.5.7 Functional implications of the acoustical afferent connections to the pretectal area
Neurons in the nucleus of the central acoustic tract (NCAT) are very narrowly tuned at and a few kHz above the bat’s resting frequency (Schuller et al. 1991) and have a very short transmission delay, as the nucleus receives direct input from the anteroventral cochlear nucleus (AVCN) (Casseday et al. 1989). This connection to the pretectal area could provide for accurate temporal coordination of behaviours such as the timing of repetitive vocalizations or the synchronization of pinna or head movements under the control of outgoing vocalization and returning echoes (Casseday et al. 1989, Huffman and Henson 1990, Schuller et al. 1991b).
The projections from the inferior colliculus to the pretectal area are the strongest of all auditory afferences. The neurons in the central nucleus, from which the pretectal area gets its inputs, have best frequencies at and just above the bat’s resting frequency (Schuller and Pollak 1979), display extremely narrow tuning curves and are focused on the processing of the long constant-frequency portion of the echoes. Very small frequency deviations can be detected by these neurons and can be relayed to pretectal neurons, thus furnishing information important for the detection of Doppler-shifts to an area having efferent connections to vocal control nuclei. Binaural information from the central nucleus of the inferior colliculus could also supply information on sound direction to the pretectum and thus contribute to acoustically-controlled orientation responses.
Neurons of the rostral pole of the inferior colliculus are tuned to frequencies just below the resting frequency, i.e. the frequencies of the final frequency-modulated (FM) portion of the echolocation call (Prechtl 1995). The short final FM sweep marks the termination of a call or an echo and can be of importance for the determination of distance to a target, the timing of emitted vocalization and/or for the control of the call duration. The input from rostral pole neurons to the pretectal area may be involved in the control of pulse repetition rate and duration of calls, which are dramatically increased or shortened, respectively, during the bat’s approach phase.
Cortical afferences to the pretectal area originate in the dorsal field of the dorsal auditory cortex, a physiologically well-defined area (Radtke-Schuller and Schuller 1995, Schuller et al. 1991a). The neurons within this dorsal cortical field are most effectively activated only by combinations of two stimuli consisting of frequency-modulated sweeps at distinct starting frequencies and with a specific relative time delay of several milliseconds (O’Neill and Suga 1979, 1982, Schuller et al. 1991a). Temporal tuning of these neurons, i.e. encoding of target distances, is important in many respects for the control of the bat’s orientation behaviour, for its call organization and arrangements for the final interception of a target (Schuller et al. 1991a) and might be mediated by the cortico-pretectal projection.
Fig. 6.5.5: Afferent acoustic inputs and efferent ‘motor’ outputs of the pretectal area. The drawing depicts schematically the important acoustic afferences to the pretectal area and the main efferent projections, with potential functional importance in audio-motor behaviour. Abbreviations: AP, pretectal area; AVCN, antero-ventral cochlear nucleus; cochl, cochlea; cun, nucleus cuneiformis; IC, inferior colliculus; IP, interpeduncular nucleus; MGB, medial geniculate body; NCAT, nucleus of the central acoustic tract; NLL(v), nuclei of the lateral lemniscus (ventral); PAG, periaqueductal grey; pons, pontine nuclei; SC, superior colliculus.
6.5.8 Possible control of motor actions by efferent connections of the pretectal area
A functional connection of the pretectal area to vocal circuits has been found in the reciprocal connection with the cuneiform nucleus. This nucleus has direct efferent projections to the N. ambiguus, which provides motor output to the laryngeal muscles (Schuller and Radtke-Schuller 1988). Pretectal activity could therefore influence the control of vocal parameters via this relatively direct connection to laryngeal muscles. Pretectal projections to the pontine gray end in close proximity to regions in which electrical stimulation yielded spectrally-distorted vocalizations (Schuller and Radtke-Schuller 1990). Since the pontine nuclei, as part of the cortico-pontine-cerebellar loop, generally have control functions for the fine adjustment of movements and their modulation by sensory modalities, this pathway could provide such modulations. The pretectal area has strong reciprocal connections with the red nucleus, pars compacta. Neurons in this division project contralaterally to brainstem nuclei and spinal levels involved in motor control of forelimbs or facial movements. The pretecto-rubral pathway could therefore constitute an important audio-motor link with influence on the coordination of the bat’s flight and pinna movements during acoustical tracking in echolocation.
6.5.9 Superior colliculus The medium and deep layers of the superior colliculus are susceptible to electrical stimulation for eliciting vocalizations and especially for differentiated triggering of pinna movements. The latter showed a systematic dependency of stimulus location and distinct movements could be elicited at specific locations. Only relatively few neurons in the superior colliculus showed responses to vocalizations, which in most cases resembled the responses to acoustic stimuli mimicking vocalization. About one third of the neurons active during vocalization could not be driven by comparable acoustic stimuli. Vocally-active neurons in the superior colliculus did not discharge prior to the emission of the calls, but always with a latency of some milliseconds after the start of the vocalization. Latency to vocalization was often shorter than that to acoustic stimuli. In the intermediate and deep superior colliculus, the responses to pure tones were tuned in 80 % of neurons to frequencies at and above the resting frequency of the bat, whereas the remaining frequencies were under-represented. The frequency tuning was narrow, with Q10dB-values often above 80 and, in contrast to measurements in cats, in bats narrow-band noise was less effective as a stimulus than pure tones. More than two thirds of the superior colliculus neurons in the horseshoe bat showed a binaural response, which in most cases was characterized by contralateral excitation and ipsilateral inhibition. The inhibition was effective for interaural intensity differences greater than 10 dB on the ipsilateral side, so that signals from frontal directions were processed best, underlining the importance of target reflections from straight ahead for echolocation. The superior colliculus showed a topographic trend for interaural intensity differences with representation of more ipsilateral positions in medial portions and more contralateral positions in lateral parts of the nucleus. 
The superior colliculus seems to be more involved in functional mechanisms for directional encoding and control of orientation in space, e.g. pinna orientation, than in audio-vocal control. Possibly the superior colliculus participates in the temporal coordination of vocal utterances with pinna and orientation movements (Reimer 1989, 1991).
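The Q10dB values quoted for the superior colliculus can be made concrete with a short Python sketch. It uses the conventional definition of Q10dB as best frequency divided by the width of the tuning curve 10 dB above the threshold at best frequency. The best frequency of 76 kHz and the function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the example frequencies are assumptions.


def q10db(best_frequency_hz: float, lower_edge_hz: float,
          upper_edge_hz: float) -> float:
    """Q10dB: best frequency divided by the tuning-curve bandwidth
    measured 10 dB above the threshold at best frequency."""
    bandwidth = upper_edge_hz - lower_edge_hz
    if bandwidth <= 0:
        raise ValueError("upper edge must exceed lower edge")
    return best_frequency_hz / bandwidth


def bandwidth_10db_hz(best_frequency_hz: float, q_value: float) -> float:
    """Bandwidth 10 dB above threshold implied by a given Q10dB value."""
    if q_value <= 0:
        raise ValueError("Q10dB must be positive")
    return best_frequency_hz / q_value


if __name__ == "__main__":
    # A hypothetical neuron with best frequency 76 kHz and Q10dB = 80
    # has a 10 dB bandwidth of less than 1 kHz.
    print(f"{bandwidth_10db_hz(76_000.0, 80.0):.0f} Hz")
```

Under these assumptions, a Q10dB above 80 means that the response area 10 dB above threshold spans little more than 1 % of the best frequency.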
6.5.10 Nucleus cuneiformis and adjacent lateral periaqueductal grey, and vocalization
The cuneiform nucleus, ventral to the superior colliculus and latero-ventral to the periaqueductal grey, and adjacent lateral periaqueductal grey regions also seem to participate in the control of vocalization, but show differences in the response to electrical stimulation as compared to the regions described above. The vocalizations obtained at low stimulation currents (below 20 µA) were normal echolocation calls. However, the one-to-one relationship between stimulation and call was disrupted, the vocalization persisted for some time after a short stimulation and the vocal responses were accompanied by arousal that increased with persisting stimulation.
Tracer experiments using retrograde (Rübsamen and Schweizer 1986) and anterograde (Schuller and Radtke-Schuller 1988) tracing methods revealed that the cuneiform nucleus and possibly adjacent areas of the periaqueductal gray have direct descending access to the N. ambiguus, the motor laryngeal area in the medulla. On the other hand, the cuneiform nucleus area receives input from three of the four regions in which vocalization could be electrically triggered, namely the deep mesencephalic nucleus, the deep layers of the superior colliculus and the pretectal area (Fig. 6.5.6). The cuneiform nucleus and adjacent periaqueductal regions in the bat could be compared functionally with the periaqueductal gray in primates, which is considered to be an important relay station of the descending vocalization system in primates (Jürgens and Ploog 1981). As electrical stimulation of the areas projecting to the cuneiform nucleus provokes only triggering but not spectro-temporal interference with the vocalization, and as the nucleus cuneiformis clearly processes emotive components of vocalization (arousal), this pathway seems to be functionally comparable to the descending vocal system described in primates. It organizes predominantly vocal mechanisms other than the direct control of spectral parameters of vocalization.
The brain sites having direct control of the spectral parameters of the vocalization are still not well defined and may be located in those areas where electrical stimulation leads to distortions of vocal parameters. Areas showing emission of distorted vocalizations were mostly located adjacent to vocal areas in lateral tegmental and in lateral pontine regions (Schuller and Radtke-Schuller 1990). Such spectral and temporal distortion of the calls can be provoked either by a direct influence on the control of motoneuronal output to the larynx or by a temporal mismatch of respiratory and vocal control. In the latter case, the vocalization falls within a period of inhalation and the lack of expiratory volume can provoke the distortion of the calls.
Fig. 6.5.6: The region of the nucleus cuneiformis and adjacent lateral periaqueductal grey mediates, as an obligatory relay station, between areas in which vocalization can be electrically or pharmacologically elicited and the motor nucleus of the larynx, the nucleus ambiguus. Abbreviations: med RF, medial reticular formation.
The area in the lateral pons in which distorted calls and arousal are elicited also includes the nucleus of the central acoustic tract (NCAT). As described above, this nucleus comprises neurons tuned to the bat’s constant frequency portion and shows projections to brain areas probably involved in audio-motor control. As a strong candidate for audio-motor function, this nucleus is currently under investigation.
6.5.11 The descending vocalization system in the horseshoe bat: summary and conclusions The functional organization of the descending vocalization system in the horseshoe bat, Rhinolophus rouxi, has been traced using neurophysiological and neuroanatomical methods. Separate neuronal circuits originating in the rostral and caudo-dorsal portions of the laryngeal motor nucleus, the nucleus ambiguus, accomplish the control of emitted frequency and of the temporal composition of echolocation calls. Although the nucleus ambiguus and surrounding regions receive numerous inputs from premotor levels, the only source of input that has been verified by anterograde tracing is the cuneiform nucleus and the adjacent lateral portion of the periaqueductal gray. These restricted areas seem to be an obligatory relay station for three out of four brainstem areas in which species-specific vocalization can be elicited electrically and/or pharmacologically, i.e. the deep mesencephalic nucleus, the intermediate and deep layers of the superior colliculus and the pretectal area. The neurons of the paralemniscal tegmental area, however, do not project to the cuneiform nucleus or to the nucleus ambiguus directly. As the paralemniscal tegmental area contains many audio-vocally responsive neurons, it was considered to be essential for the Doppler-shift compensation system. Doppler-shift compensation, however, survives bilateral lesions of the area. Various connections from nuclei of the auditory pathway have been demonstrated to all brain areas in which vocalization can be elicited. Potentially all these connections could serve as a link in audio-motor control or feedback systems. It is evident from the recent results that the anatomical and functional substrate for acoustically-guided behaviour in bats is not hierarchically organized, but rather is found in distributed networks in different layers of the brainstem.
6.6 The processing of ‘biologically relevant’ sounds in the auditory pathway of a non-human primate Peter Müller-Preuss and Detlev Ploog
The main goal of this project was the elucidation of the neural mechanisms underlying the processing of biologically meaningful sounds in the auditory structures of a nonhuman primate. A frequently-discussed problem in auditory physiology is the method of acoustic stimulation, i.e. which acoustic signals should be used to test the system. On the one hand, there is the use of simple but clearly-defined artificial sounds, which will also evoke simple reactions. Such sounds will never occur in an individual’s acoustic environment. On the other hand, signals can be selected from the ‘auditory world’ of the species; these sounds can be assumed to be ‘biologically meaningful’. Such sounds are, however, mostly very complex and evoke several kinds of reactions that are difficult to analyze. At the beginning of our work, we used the latter approach. Based on the work of Newman, Funkenstein and Winter, the focus of the research was on the neural mechanisms operative in central stations of the auditory pathway stimulated with species-specific vocalizations, including an individual’s own vocal utterances. Due to the difficulty in interpreting the results obtained in the first experiments, however, we shifted strategy during the course of the project to the use of stimuli that still carried ‘biologically meaningful parameters’ but had a simpler structure. The different steps in the experimental approach are reflected in the various sections of this report.
6.6.1 The relationship of the auditory pathway to structures involved in sound production: Anatomy and physiology
In searching for the neural mechanisms of audio-vocal behaviour, both anatomical and neurophysiological methods were used: tracing methods (H3-Leu, HRP, Fluorescence; Bieser and Müller-Preuss 1988) to evaluate connections between structures involved in audition and phonation, stimulation experiments designed to activate certain structures, and single-unit recordings to analyze the neural response properties both in the auditory pathway and in vocalization structures. A stimulation study was carried out in squirrel monkeys to test the properties of the connection between the anterior limbic cortex (ALC = vocalization system) and the secondary auditory cortex in the superior temporal gyrus (shown by H3-Leu tracing, Müller-Preuss et al. 1980). Activation of neural populations within the ALC can cause a decrease in the spike rate of auditory neurons, suggesting an inhibitory influence by this cortical vocalization area on the auditory system. However, the observed effect was the consequence of an artificial activation of one particular part of the vocalization system.
To clarify whether such an inhibitory influence is exerted on auditory input in general or is indicative of specific mechanisms for the auditory representation of an individual’s own vocal activity, the activity of auditory neurons was studied during vocalization. An analysis of neural responses evoked through vocalization and, for comparative purposes, during playback of vocalization as well as during simultaneous presentation of both vocalization and playback, revealed that, in general, the auditory input itself is not inhibited during vocal activity. This was demonstrated by the fact that the response pattern of most units of the auditory midbrain and of numerous cells of the auditory thalamus and cortex to the particular stimuli was similar (Müller-Preuss 1988, Müller-Preuss and Ploog 1983). An example of such a unit is given in Figure 6.6.1; this type of unit has been considered to be of a purely ‘auditory’ nature.
Fig. 6.6.1: Part A shows the locations of several neurons of the three different types within the midbrain. Part B shows the different response patterns of three representative neurons, evoked during the utterance of a vocalization (V) and during playback (P). Bars indicate the duration of stimuli; stippled parts indicate the variable length of a vocalization. Bin width of PSTH: 3 and 6 ms. Abbreviations: ic, inferior colliculus; pag, periaqueductal grey.
Another type of unit, found mostly within the auditory thalamus and cortex (seldom within the midbrain), had response patterns indicative of interactions between the auditory system and structures involved in sound production. These units reacted with decreased discharge rates or not at all to emitted vocalizations, but exhibited a clear response to the same vocalization as a playback stimulus. It was concluded that an inhibitory effect on auditory neurons is exerted by activated vocalization-producing structures and may represent a neural correlate of a self-monitoring function for self-produced vocal activities. Such a unit has been considered to be of the ‘audio-vocal’ type, and an example from the midbrain is shown in the middle of Figure 6.6.1. A small number of neurons were found in the immediate neighbourhood of the IC, but clearly not belonging to it, that displayed response patterns never seen in recordings within the auditory pathway; their increase in activity began up to 80 ms before the utterance of the vocalization. An example is shown in the lower third of Figure 6.6.1. Such units seem to be related only to structures involved in the production of vocalization and have therefore been defined as being of the ‘vocal’ type.
6.6.2 Processing of species-specific vocalizations that are modulated in amplitude
Previous work in our laboratory showed that species-specific vocalizations evoked complex multiple responses within central auditory neurons that were, in addition, quite variable (Manley and Müller-Preuss 1979). Thus an analysis of the neural activity was burdened by the complex response pattern (as shown by Newman and Wollberg 1973). Other approaches were therefore designed, using as stimuli species-specific vocalizations that, firstly, were modulated only in amplitude and not in frequency (i.e. had reduced stimulus parameters) and, secondly, had been shown by behavioural testing to be biologically meaningful. More specifically, only cackle calls of squirrel monkeys, showing no frequency modulation but definite amplitude changes, were correlated with the activity of neurons of the auditory midbrain (inferior colliculus) and thalamus (medial geniculate). Many neurons of our sample displayed response properties in which the level of the particular amplitude seems to be encoded. However, a relatively large number of neurons in the midbrain and thalamus showed response patterns that also suggest that parts of the auditory pathway are able to process specific changes in the course of the intensity of a call. The study indicates that specific amplitude changes can be processed independently of amplitude levels and that the position and the direction (increment vs. decrement) of a change within a call can be encoded. The results show that call components relevant to intraspecific communication can selectively elicit the neural activity necessary for perception (Müller-Preuss and Maurus 1985, Müller-Preuss 1988).
6.6.2.1 Processing of sounds masked by a preceding stimulus
Due to the temporal nature of acoustic signals, and due to the mechanical and neural properties of the structures encoding them into brain activity, the ‘intelligibility’ of biological sounds will be strongly influenced by masking effects. Every amplitude change, be it natural, as in the examples shown above, or artificial, as within the sounds used in the experiments outlined in the next sections, will influence the detectability of the following sound components. Most psychoacoustic models of masking use activation of the basilar membrane to describe the results of psychophysical masking tests, thus suggesting that neural coding along the ascending auditory pathway is only subject to minor masking. To date, neural data are only available from the auditory periphery. These data give preliminary information on the influence of suppression, inhibition and synaptic delays on neural representation in non-simultaneous masking experiments. In order to shed light on the central representation of acoustic stimuli in relation to masking, the neural activity of units in the auditory midbrain (N = 52) of squirrel monkeys was recorded during a forward-masking paradigm. In a first approach, the effects of noise and tones at a unit’s characteristic frequency (CF) as the masker (M) on the neural activity evoked by a corresponding test-pulse (T) were studied. The strongest masking effects were seen in the range from 2–10 ms after M’s cessation (40–90 % reduction of spike activity). The effects disappear when T follows M by 300 ms. The influence of using noise as M is greater than when using a tonal masker. The effects are weaker when T has a frequency far beyond a neuron’s CF. If M is shortened in duration (from 200 down to 10 ms), the masking effect is diminished.
Therefore, natural signals intended to be transmitted unmasked should follow a preceding acoustic event with a certain delay (more than 50 ms, or occur with an AM frequency not higher than 100 Hz), or they should be emitted with a greater amplitude, or produced at a different frequency (Müller-Preuss 1997). Studies of acoustic communication of squirrel monkeys showed that these neural properties may be reflected in the way in which certain (non-FM) calls are produced; amplitude changes shown to be relevant to communication frequently occur as increments and are placed at the beginning of a call. If decrements are the relevant changes, they are found either at the beginning of a call or, if located at its end, more often occur in calls of short duration. Furthermore, at the neural level the data probably mirror the effects of masking in acoustic communication in general. Psychoacoustic experiments carried out in humans show remarkable similarities: the effects of masking disappear 300 ms after M; the strongest influence is also seen in the range 2–10 ms after M’s cessation, and the influence of noise is also relatively greater than that of tones (Fig. 6.6.2). The upper part of Figure 6.6.2 shows the masking effects on IC neurons; the lower part shows the results of psychoacoustic experiments (Fastl 1979).
Fig. 6.6.2: a: Responses of collicular neurons in the squirrel monkey; b: results of a psychoacoustic experiment using a forward-masking paradigm in humans. In (a), the effects of masking are shown as a function of delay time tv (time between end of masker Lm and beginning of test-tone Lt) and discharge rate. (b) shows the reactions of human subjects, where the increasing threshold of Lt (detectability) as a function of decreasing tv indicates masking effects. For the purpose of comparability, in (a) a transformation of discharge rate from % into dB values is shown on the right ordinate. (Partly adapted from Fastl 1979).
6.6.2.2 Processing of artificial, amplitude-modulated sounds
Feature encoding seems an essential framework for the recognition of complex sounds in the brain. Many studies of auditory perception focus on experiments with relatively simple, artificial sounds that may act as information-bearing parameters in a complex sound. One feature found in complex, biologically-relevant sounds is the temporal patterning of the amplitude modulation of a signal. For non-human primates, this issue was discussed above. In humans, to give a further example, the hearing sensation of rhythm depends primarily on variations in the temporal envelope of a sound (Zwicker and Fastl 1990). In the next sections, we review experiments in which the neural processing of artificial AM-sounds was studied in three central stations of the auditory pathway. Midbrain: In order to describe the response properties of neurons of the inferior colliculus of the midbrain to amplitude-modulated sounds, cross-correlation algorithms and spike-rate counts were applied to translate the neural reactions into modulation-transfer functions. All neurons (N = 542) responded selectively to AM-sounds in so far as all displayed a best modulation frequency (BMF). In addition, most of them had a band-pass-like modulation transfer function, whose centre frequencies were mainly between 8 and 128 Hz (max. 32 Hz). Histological examination of the particular electrode tracks showed that neurons from the central nucleus of the IC display a sharper band-pass function than neurons recorded from the peripheral nuclei (see Fig. 6.6.3). Transfer functions obtained by measuring the spike rate showed less selectivity; a relatively large number of neurons do not change their spike rate as a function of modulation frequency. The encoding of amplitude-modulated sounds thus occurs more via phase-locking of discharges than via changes in spike number.
Changes in modulation depth may be processed in the same way: whereas on average the spike rate remains constant between 100 % and 0 % modulation, there is a drastic reduction in synchronicity. The results show that in a non-human primate, amplitude modulations are encoded selectively in a band-pass function. The midbrain occupies an intermediate position within the pathway; from the periphery up to the cortex, progressively lower amplitude-modulation frequencies are encoded. Such a change in temporal resolution is presumably caused by an increase in synaptic activity over the course of the pathway (Müller-Preuss et al. 1994).
Fig. 6.6.3: Best modulation frequency distribution of collicular neurons, separately shown for tonal stimuli at a certain characteristic frequency (left) and for noise (right); shown separately for units from central areas (above) and from the periphery (below).
Thalamus: The influence of sinusoidal changes in the rate, depth and absolute intensity of modulation of sound intensity on the neural activity was also studied in the medial geniculate body of the auditory thalamus. The strength of the periodicity of neural discharges was again measured by spectral analysis. In all cases, the strength of the neural discharge periodicity was dependent on the modulation frequency of the stimulus: most neurons responded best to one modulation frequency, and only a few displayed a low-pass or multiple-peaked response characteristic. The majority of the neurons responded best to modulation rates between 4 Hz and 64 Hz, with a peak at 32 Hz. Such modulation frequencies are found in parts of the species’ vocal repertoire. Changes in the neural discharge periodicity are half the corresponding changes in stimulus intensity, whereas small variations in modulation depth evoke much greater changes in the neural spike periodicity. Due to the large proportion of tonic neurons, the medial part of the medial geniculate body tends to process slightly higher modulation rates than the other parts of this thalamic region (Preuss and Müller-Preuss 1990).
Cortex: Neural responses to AM-sounds were studied in the auditory cortex and insula. The insula was included in this study because of its presumed role in time-critical aspects of auditory information processing (Kelly 1973). The envelope of the AM-sound is encoded by 78.1 % of all auditory neurons. The remaining 21.9 % displayed simple ON, ON/OFF or OFF responses at the beginning or the end of the stimulus sound. Those neurons with AM-coding were able to encode the AM-sound frequency in two different ways: 1. The spikes followed the amplitude modulation envelopes in a phase-locked manner; 2. The spike rate changed significantly with changing modulation frequencies. As reported in other species, the modulation-transfer functions for rate showed higher modulation frequencies than the phase-locked response. Both AM-coding types exhibited a filter characteristic for AM-sound, in which 46.6 % of all neurons had the same filter characteristic for both the spike discharge and the phase-locked response. The remaining neurons displayed combinations of different filter types. Varying modulation depth was encoded by the neuron’s ability to follow the envelope cycles and not by the non-phase-locked spike rate frequency.
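The distinction drawn above between coding by phase-locking and coding by spike rate can be illustrated with a short sketch. It uses vector strength, a common synchronization index, as a stand-in; the studies summarized here used cross-correlation and spectral analysis instead. The two spike trains are hypothetical and were constructed only to show that two responses with the same spike rate can differ completely in synchronization to the modulation envelope.

```python
import math


def vector_strength(spike_times_s, mod_freq_hz):
    """Synchronization index in [0, 1]; 1 means perfect phase-locking."""
    if not spike_times_s:
        return 0.0
    # Phase of each spike within the modulation cycle, in radians.
    phases = [2.0 * math.pi * mod_freq_hz * t for t in spike_times_s]
    cos_sum = sum(math.cos(p) for p in phases)
    sin_sum = sum(math.sin(p) for p in phases)
    return math.hypot(cos_sum, sin_sum) / len(spike_times_s)


def spike_rate(spike_times_s, duration_s):
    """Mean discharge rate in spikes per second."""
    return len(spike_times_s) / duration_s


MOD_FREQ_HZ = 32.0   # assumed modulation frequency (illustration only)
DURATION_S = 1.0     # assumed stimulus duration (illustration only)

# Hypothetical train 1: one spike per modulation cycle at a fixed phase.
locked = [k / 32.0 for k in range(32)]
# Hypothetical train 2: same spike count, but phases spread evenly over
# four quarter-cycle positions, so the phase vectors cancel.
unlocked = [k / 32.0 + (k % 4) / 128.0 for k in range(32)]

vs_locked = vector_strength(locked, MOD_FREQ_HZ)
vs_unlocked = vector_strength(unlocked, MOD_FREQ_HZ)
rate_locked = spike_rate(locked, DURATION_S)
rate_unlocked = spike_rate(unlocked, DURATION_S)
```

In this constructed case both trains have a rate of 32 spikes/s, yet the first has a vector strength of essentially 1 and the second of essentially 0, which is the kind of dissociation between rate and synchronization measures described in the text.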
The topographical organization of the squirrel monkey’s auditory cortex was established by an anatomical study (Jones and Burton 1976). Using physiological parameters, we have added two new fields. All fields investigated showed a clear functional separation for time-critical information processing. Figure 6.6.4 shows the different BMF distributions. The best temporal resolution was shown by the primary auditory field (AI), T1 and by Pi. The neural data from these fields and the amplitude-modulation frequency range of squirrel monkey calls suggest a similar correlation between vocalization and perception as in human psychophysical data for speech and hearing sensations. The anterior fields in particular failed to follow the AM-envelopes. For the first time in a primate, the insula was tested with different sound parameters ranging from simple tone bursts to AM-sounds. The observed best frequencies covered the same spectrum as AI. As in the auditory fields, most neurons in the insula encoded AM-sound with different filter characteristics. However, the high proportion of neurons unable to encode AM-sound (40.6 %) and the low mean BMF (9.9 Hz) do not suggest that the insula plays a prominent role in temporal information processing (Bieser and Müller-Preuss 1996). Taken together, an important result of this study is the clear evidence that different cortical auditory areas showed significantly different response patterns to changes in intensity of the AM-sound. Areas AI, Pi, and T1 are areas with a temporal resolution for AM-sound that is in the same range as the envelope fluctuations of most squirrel monkey vocalizations. In contrast, the time resolution of the other fields (AL, Rpi, R and Pa) is insufficient to manage such information. Thus in addition to signal representation in the frequency domain (= tonotopicity), the auditory cortex of the squirrel monkey represents information in the time domain, in which areas with high temporal resolution show specifically more band-pass characteristics. It is therefore noteworthy that psychoacoustic studies performed on humans also showed a BP-characteristic for the hearing sensation known as ‘fluctuation strength’ (Fastl 1977), suggesting strong similarities between the hearing sensations in humans and in monkeys.
Fig. 6.6.4: Distribution of best modulation frequencies (synchronization) and rate maxima in different cortical areas. Abbreviations: AI, primary auditory cortex; AL, anterior auditory field; In, Insula; Pa, postauditory field; Pi, parainsular field; R, rostral auditory field; Rpi, rostral parainsular field; T1, secondary auditory cortex.
Taken together, the non-human primate data show, as in other mammals, a diminution of temporal resolution in the auditory cortex compared to the peripheral stations along the auditory pathway (Müller-Preuss et al. 1988). Mean BMFs in the midbrain were between 32 and 64 Hz (Müller-Preuss et al. 1988) and in the medial geniculate body 16–32 Hz (Preuss and Müller-Preuss 1990). In addition, the filter characteristics of neurons for AM-sound tend to be more complex in the cortex than in the midbrain. An additional measure for a neuron’s filter characteristic is the filter quality factor. The filter quality factor of the band-pass filters in the midbrain and cortex was calculated as the bandwidth at 40 % and 70 % below the BMF. All cortical areas had on average narrower filter bandwidths than the midbrain regions. The fact that the temporal resolution capabilities of cortical neurons are decreased with respect to the midbrain neurons seems in opposition to the belief that higher stations of the auditory pathway are responsible for complex sound coding. Alternatively, not all changes in sound intensities are essential for the processing of complex sounds.
In humans, for example, fluent speech in various languages shows amplitude fluctuations between 2 and 8 Hz, which corresponds to the syllables. So the relevant temporal feature is associated with a very low modulation frequency range. The envelope modulations (AM) of squirrel monkey calls are between 4 Hz and 64 Hz (Fastl et al. 1991), which is the same modulation frequency range as the synchronization BMFs in AI, Pi and T1 of the cortex. Thus the neural data give evidence for a similar correlation between vocalization and perception, like the human psychoacoustical data for language and hearing sensation (Fastl 1990). At higher AM-frequencies, the hearing sensation ‘roughness’ is perceived by human listeners (Fastl 1977).
7 Comparative animal behaviour and psychoacoustics
7.1 The European starling as a model for understanding perceptual mechanisms Georg Klump, Ulrike Langemann and Otto Gleich
In birds, acoustic signals are the most important means of communication. The frequency range that birds use for their calls and songs completely overlaps the hearing range of humans (e.g. see Greenewalt 1968), and many songbirds in particular produce signals that can be as appealing to the human auditory system as they probably are to the auditory system of conspecific birds. It seems reasonable to assume that birds and humans evolved in similar auditory worlds and as a result may have developed functionally similar perceptual mechanisms. The bird species that we chose for a comparative study of auditory mechanisms is the European starling (Sturnus vulgaris). Starlings are songbirds with an elaborate vocal repertoire (e.g. see Hartby 1969, Adret-Hausberger and Jenkins 1988, Eens et al. 1989) that they further enlarge by faithfully copying sounds from their acoustic environment (e.g. see Hindmarsh 1984). Because of the range of features that are present in their communication signals and because of their ability to mimic many acoustic signals, starlings may be especially well suited for the study of general mechanisms of auditory perception. As we will show in this section, they may also provide a good model for human perceptual processes. Another reason for choosing starlings for comparative studies of hearing was their known readiness to be trained for auditory-discrimination tasks in standard operant-conditioning procedures using positive reinforcement (e.g. Hulse et al. 1984). Furthermore, a number of studies (e.g. Leppelsack 1974, Manley et al. 1985, Müller and Leppelsack 1985, Rübsamen and Dörrscheidt 1986) have proven that the starling is also well suited for neurophysiological investigations. This offers the possibility of directly comparing their psychophysical performance to the response characteristics of neurons at different levels of the auditory system, as is exemplified below.
Finally, because of the relative ease with which starlings can be trained in auditory-discrimination tasks, they are also suited to studying the variation of auditory processing in relation to learning (e.g. Scharmann et al. 1995, Scharmann 1996). In this section, the emphasis is on the psychophysical study of auditory perception in starlings and some of the physiological correlates. Other species that have also been studied in our laboratory using psychophysical methods are the great tit (Parus major, see Langemann et al. 1998) and the barn owl (Tyto alba, Dyson et al. 1998). Both species have evolved specializations in their auditory processing that can be explained as adaptations to the analysis of environmental acoustic cues.
7.1.1 Auditory sensitivity: absolute thresholds
Compared to most mammals, birds have a relatively restricted frequency range of sensitive hearing. Among birds, even auditory specialists such as the barn owl, which relies predominantly on the analysis of high-frequency sounds for its living, are limited to frequencies below 14 kHz (Dyson et al. 1998), whereas signals above 20 kHz can be perceived by most mammals (see Fay 1988). The European starling’s absolute threshold curve, shown in Figure 7.1.1, exhibits many characteristics that are typical for birds in general (see Dooling 1992, for a review). Bird audiograms have been characterized by Dooling (1992) in terms of seven descriptive parameters that were determined from a 4th-order polynomial fit to the threshold data. For the European starling, this fit is based on data obtained from 13 individuals, each tested at five or more different frequencies. The frequency at which the starling is most sensitive, i.e. the best frequency, obtained with the fit is 3.31 kHz and the best sensitivity at this frequency is 2.9 dB. The starling’s hearing bandwidth 20 dB above the best threshold is 6.04 kHz (i.e. it has a sensitive hearing range from 0.45 to 6.48 kHz, corresponding to 3.9 octaves). Its 500-Hz threshold, i.e. a measure of the low-frequency sensitivity, is 19.3 dB. Extrapolating from the audiogram, the high-frequency cutoff at which the auditory threshold reaches 60 dB is 9.07 kHz. Below the most sensitive frequency region (i.e. for frequencies of 1 kHz and below) the starling’s sensitivity declines at a rate of 10 dB/octave and in the range from 6 to 8 kHz, sensitivity declines at a rate of 83.2 dB/octave. In comparison to an ‘average’ bird (Dooling 1992), the European starling has better low-frequency hearing and, as a result, a larger bandwidth of sensitive hearing.
Its most sensitive frequency, its threshold at this frequency and its high-frequency hearing resemble those of an ‘average’ bird (Dooling 1992, based on data from 23 species).
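As a worked check of the audiogram parameters quoted above, the following sketch recomputes the width of the sensitive hearing range in kHz and in octaves from the rounded band edges given in the text (0.45 and 6.48 kHz), and the threshold change implied by the reported 83.2 dB/octave slope between 6 and 8 kHz. The function names are illustrative only, and the last figure is simple arithmetic on the reported slope, not a value taken from the source.

```python
import math


def bandwidth_octaves(f_low_khz, f_high_khz):
    """Width of a frequency range expressed in octaves."""
    return math.log2(f_high_khz / f_low_khz)


def level_change_db(slope_db_per_octave, f_start_khz, f_end_khz):
    """Threshold change over a frequency range for a constant slope."""
    return slope_db_per_octave * bandwidth_octaves(f_start_khz, f_end_khz)


# Rounded band edges of the range within 20 dB of the best threshold.
f_low, f_high = 0.45, 6.48

width_khz = round(f_high - f_low, 2)                       # 6.03
width_oct = round(bandwidth_octaves(f_low, f_high), 2)     # 3.85
rise_6_to_8 = round(level_change_db(83.2, 6.0, 8.0), 1)    # 34.5

print(width_khz, width_oct, rise_6_to_8)
```

With the rounded edges, the result is 6.03 kHz and about 3.85 octaves, which agrees with the reported 6.04 kHz and 3.9 octaves to within the rounding of the band edges.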
194
7.1 The European starling as a model for understanding perceptual mechanisms
Fig. 7.1.1: Auditory threshold of the European starling measured in quiet. The symbols show data from individual birds (N = 13). All subjects were tested at 0.5, 1, 2, 4 and 8 kHz; some of the subjects were tested at additional frequencies as shown in the figure. The solid line indicates a 4th-order polynomial fit to the data similar to the fit applied by Dooling (1992) for comparing data from different bird species.
7.1.2 Frequency selectivity
In the natural environment, absolute thresholds rarely determine the bird’s limits for detecting a sound signal of interest. Masking of signals by background noise frequently affects the detection, and we need to be able to estimate the amount of masking in relation to the characteristics of the noise background and to the signal characteristics. If we want to know the signal-to-noise ratio necessary for the detection of a tonal signal in a wide-band noise background, the critical-masking ratio (CR) provides the best measure on which we can base the estimate. As has been described in other bird species (e.g. Okanoya and Dooling 1987, 1988), the starling’s CR (i.e. the difference between the level of the tone signal at the masked threshold and the spectrum level of the background noise) is independent of the level of the background noise (for details see Langemann et al. 1995). It increases from a median value of 21.0 dB at 1 kHz to 26.9 dB at 6.3 kHz (the dotted lines in Figure 7.1.2 indicate the CRs that have been determined using a white-noise masker of a spectrum level similar to that employed for the narrow-band maskers used in the band-narrowing experiment described below). The slope of the function describing the relationship between CR and frequency is on average 2.2 dB/octave. The signal-to-noise ratio defined by the CR can be used to compute an estimate of the bandwidth of frequency filters in the starling’s auditory system. This CR-bandwidth is based on the assumption that at the detection threshold, the sound energy of the background noise falling within the frequency limits of the filter is the same as the sound energy of the tone signal presented at the centre frequency of the filter (Fletcher 1940, Scharf 1970). Estimates of the starling’s auditory-filter bandwidth that are based on the CR range from 126 Hz at 1 kHz to 490 Hz at 6.3 kHz. Studies in humans (e.g. Zwicker 1956, summarized in Zwicker and Fastl 1990) have shown, however, that the ratio between the signal energy and the overall energy of the noise within the auditory filter depends on the frequency to which the auditory filter is tuned. Thus the assumption of a specific constant threshold criterion, on which the calculations of the CR-bandwidth are based, may also be violated in the bird (for a detailed discussion see Langemann et al. 1995).
Fig. 7.1.2: Signal-to-noise ratios at the detection thresholds for a tone centered in maskers of a constant spectrum level of 42 dB SPL in relation to the masker bandwidth. Symbols show the median signal-to-noise ratios of five birds. The solid lines indicate a fitted masking function (see Langemann et al. 1995); for each tone frequency the critical bandwidth (CB) and the integration time (2) estimated by the sum-of-least-squares fit are shown. The dotted line shows the critical masking ratio obtained in another experiment.
The band-narrowing procedure provides a method which allows us to determine the bandwidth of auditory filters without depending on assumptions about a specific threshold criterion that is constant across frequencies. Results from such an experiment in the starling are shown in Figure 7.1.2. In a band-narrowing experiment, the signal-to-noise ratio at the detection threshold is determined for maskers that are presented at a constant spectrum level but vary in bandwidth from values well below to values far above the estimated auditory filter bandwidth. The data points are then fitted with a theoretical masking function (see Langemann et al. 1995) that relates the signal-to-noise ratio at the detection threshold to the signal-detection measure (d’) used as the threshold criterion, an estimate of the internal noise in the auditory system (τ), the bandwidth of the background noise (W), the integration time of the auditory system (σ) and the equivalent rectangular bandwidth of the auditory filter (i.e. the width of the critical band WCB). For the data shown in Figure 7.1.2, the fit was obtained using τ and WCB as free parameters. The starling’s critical bandwidth (CB) estimates for each fit (see Fig. 7.1.2) were very similar to the estimates that were based on the CR values at the respective frequencies (the difference at 6.3 kHz can be explained by the characteristics of the sound field in the testing apparatus). This indicates that, in contrast to humans, the assumption that the threshold criterion does not vary with frequency does hold for the starling. Furthermore, the CR-bandwidth and the critical bandwidth were very similar, a result which can be expected for the threshold criterion (d’ = 1.8) that was applied in the experiments with starlings (see Langemann et al. 1995). It is interesting to note that the integration times providing the best fit are similar to the integration times that have been determined in the starling in a different experiment for sounds at the absolute threshold (see below).
Another method that provides an estimate of the bandwidth of auditory filters is based on the perception of modulation of a tone signal. In humans, the differences in the detectability of sinusoidal amplitude modulation (AM) and frequency modulation (FM) are reduced with increasing modulation frequency. The detectability of AM and FM becomes similar if the sidebands of the carrier generated by the modulation are located outside the critical band (i.e. the limits of the auditory filter) which is centred at the carrier frequency (Zwicker 1952, Schorer 1986). This modulation frequency at which AM and FM become equally detectable has been called the critical modulation frequency (CMF), and it is thought to be about half the width of the critical band at the carrier frequency (Schorer 1986). Extending earlier experiments by Zwicker (1952), the critical modulation frequency was estimated by Schorer (1986) with the help of a linear regression between the ratio of the AM and FM thresholds in decibels and the logarithm of the modulation frequency. The CMF was defined as the modulation frequency at which the ordinate value of the regression is zero, i.e. the ratio FM-threshold/AM-threshold equals unity. The results of similar experiments determining the CMF in the European starling are shown in Figure 7.1.3. Thresholds for detecting sinusoidal amplitude modulation (m) and frequency modulation (η) were determined at two carrier frequencies of 1 and 4 kHz and at eight modulation frequencies (5, 10, 20, 40, 80, 160, 320, and 640 Hz). Parallel to results in humans, the threshold for the detection of frequency modulation decreases with increasing modulation frequency (the slopes were -5.09 and -5.23 dB/octave at 1 and 4 kHz, respectively), whereas the threshold for the detection of amplitude modulation stays relatively constant over a wide range of modulation frequencies. The starling’s thresholds for the detection of amplitude and frequency modulation are very similar above the critical modulation frequency, which is 67 and 257 Hz for carrier frequencies of 1 and 4 kHz, respectively. The critical bandwidths that can be estimated by doubling the CMF values (Schorer 1986) are very similar to the critical bandwidths assessed using critical masking ratios or determined applying the band-narrowing procedure.
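The two bandwidth estimates discussed in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the equal-energy criterion stated in the text, under which the critical-ratio bandwidth in Hz is 10^(CR/10), and the rule that the critical bandwidth is about twice the critical modulation frequency. The function names are illustrative only; the input values are those reported above.

```python
def cr_bandwidth_hz(cr_db):
    """Critical-ratio bandwidth under the equal-energy assumption.

    If tone power equals the noise power inside the filter at threshold,
    then CR = 10 * log10(bandwidth), so bandwidth = 10 ** (CR / 10).
    """
    return 10.0 ** (cr_db / 10.0)


def cb_from_cmf_hz(cmf_hz):
    """Critical bandwidth estimated as twice the critical modulation frequency."""
    return 2.0 * cmf_hz


# Median critical masking ratios reported for the starling.
cr_bw_1k = cr_bandwidth_hz(21.0)     # about 126 Hz at 1 kHz
cr_bw_6k3 = cr_bandwidth_hz(26.9)    # about 490 Hz at 6.3 kHz

# Critical modulation frequencies reported for the starling.
cb_1k = cb_from_cmf_hz(67.0)         # 134 Hz at 1 kHz
cb_4k = cb_from_cmf_hz(257.0)        # 514 Hz at 4 kHz

print(round(cr_bw_1k), round(cr_bw_6k3), cb_1k, cb_4k)
```

At 1 kHz, the one carrier frequency for which both measures are quoted, the two estimates (about 126 Hz and 134 Hz) differ by less than 10 %, in line with the statement that the methods give very similar bandwidths.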
7 Comparative animal behaviour and psychoacoustics
Fig. 7.1.3: Thresholds for the detection of sinusoidal amplitude modulation (AM) and frequency modulation (FM) of 800 ms tone signals (50 dB sensation level, median data from four birds). The panels on the left side show the index of modulation at the detection threshold in relation to the modulation frequency. The panels on the right side show the ratio between the modulation index of FM and AM at the detection threshold in dB. The critical modulation frequency (CMF) is determined by the point at which the dashed linear regression line intersects the horizontal line at 0 dB (i.e., where AM and FM thresholds are the same).
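The regression procedure used to locate the CMF can be sketched in a few lines of Python. This is only a minimal illustration, not the analysis code of the original study: the function names are hypothetical, and the FM/AM threshold ratios are synthetic values generated from a straight line. Its slope is taken from the FM slope of -5.09 dB/octave reported at 1 kHz (assuming, for simplicity, a perfectly flat AM threshold), and its zero crossing is set to the reported CMF of 67 Hz.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def critical_modulation_frequency(mod_freqs_hz, ratio_db):
    """Modulation frequency at which the regression of the FM/AM threshold
    ratio (dB) on log10(modulation frequency) crosses 0 dB."""
    xs = [math.log10(f) for f in mod_freqs_hz]
    slope, intercept = fit_line(xs, ratio_db)
    return 10 ** (-intercept / slope)

# SYNTHETIC data (assumption): a line with the reported FM slope,
# converted from dB/octave to dB/decade, crossing 0 dB at 67 Hz.
slope_db_per_decade = -5.09 / math.log10(2)
mod_freqs = [5, 10, 20, 40]
ratios = [slope_db_per_decade * math.log10(f / 67.0) for f in mod_freqs]

cmf = critical_modulation_frequency(mod_freqs, ratios)
critical_bandwidth = 2 * cmf  # doubling rule (Schorer 1986)
```

Applying the doubling rule to the two reported CMF values gives critical-bandwidth estimates of 134 Hz at a carrier frequency of 1 kHz and 514 Hz at 4 kHz.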
The physiological counterparts of the starling’s psychophysical auditory filters are the frequency-tuning curves of neurons in its auditory system (e.g. see Manley et al. 1985 and Sect. 2.5). In the auditory periphery of the starling (i.e. in cochlear-ganglion cells), the neuronal tuning curves show a V-shape with slopes that range from about 40 dB/octave at 0.5 kHz to about 100 dB/octave at 3 kHz (however, a large amount of variation was observed at any single frequency; Manley et al. 1985). As a result, in the auditory periphery the bandwidth of the tuning curves increases with the sound pressure level of the stimulus. The bandwidths of the neurons’ tuning curves 10 dB above their best threshold are about 2.5 times larger than the estimated bandwidths of the psychophysical auditory filters (see the open symbols in Fig. 7.1.4). Given the V-shape of the tuning curves, however, their 4-dB bandwidth, which would approximate the equivalent rectangular filter bandwidth
Fig. 7.1.4: The relationship between psychophysical and physiological measures of frequency selectivity in the starling’s auditory system in relation to the test-tone frequency. The open symbols show the 10-dB bandwidth of auditory-nerve fibres (data from Manley et al. 1985, Gleich and Manley, unpublished); the dotted line is a prediction of the 10-dB bandwidth that is based on the cochlear map (see Buus et al. 1995). The filled symbols show critical masking ratio bandwidths (■) and critical bandwidths (●); the solid line is a prediction of these measures calculated on the basis of the cochlear map function (Buus et al. 1995).
(see Buus et al. 1995), corresponds well to the psychophysical measures of the frequency selectivity. In the starling, the tuning characteristics of peripheral auditory neurons are correlated with the spatial frequency representation on the basilar papilla of the inner ear. The bandwidth of peripheral tuning curves can be predicted by the spatial frequency map of the basilar papilla that has been determined by HRP-labeling of auditory-nerve fibres with known characteristic frequency (Gleich 1989, see also Sect. 3.1). The spatial spread of the frequency range corresponding to the average bandwidth 10 dB above a neuron’s best threshold is 265 µm (Buus et al. 1995), and the 10-dB bandwidth that can be predicted from the cochlear map matches the average neuronal bandwidths over a wide frequency range, as is shown in Figure 7.1.4 by the dashed line. Neurons of more central areas of the starling’s auditory system, e.g. those with response properties that are typical for the projection area of neurons from the auditory thalamus into the forebrain, exhibit tuning curves with 10-dB bandwidths that are not very different from the bandwidths observed in the starling’s auditory periphery (provided central and peripheral units are tuned to the same frequencies; Nieder and Klump, 1999). Compared to peripheral neurons, however, the tuning characteristics of more central neurons become much less dependent on the level of the sound. Excitatory tuning curves in the starling’s forebrain exhibit much steeper
high-frequency and low-frequency slopes than peripheral neurons stimulated with single tones. This increase in the slopes seems to be due to lateral inhibition in the frequency domain (Nieder and Klump, 1999). If forebrain neurons are stimulated using a two-tone paradigm, the slopes of the flanks increase even more. This sharpening of the central neuronal filters makes their bandwidth less dependent on the signal level, which is typical for the starling’s critical ratio data. Similar results have also been reported from studies of neurons in the central auditory system of the cat (e.g. see Ehret 1995). In summary, the starling’s different psychophysical measures of frequency selectivity are remarkably similar to those determined in humans. This result was unexpected, since there are large differences between humans and starlings in the spatial spread of frequencies on their cochlear maps (the critical bandwidth corresponds to a basilar distance of 1–1.3 mm in humans, but only 0.1 mm in the starling; see Buus et al. 1995). The resemblance of the measures of frequency selectivity should provide both species with similar capabilities for the detection of signals in noise.
7.1.3 Frequency discrimination
To explain the frequency-difference limen observed in humans, Zwicker (1956) proposed a model that was based on the evaluation of the change in the pattern of excitation in the auditory system when stimulated with different frequencies. He suggested that a change of the signal frequency would be audible if it resulted in at least a 1 dB change in the excitation level within any critical band. Irrespective of the constant value of increase that is necessary for detecting a change in the excitation level, Zwicker’s model predicts that the difference limen of frequency (DLF) is proportional to the critical bandwidth and thus also reflects a constant distance on the cochlear map. The model that Zwicker proposed for humans has been successfully applied in the European starling (for a detailed discussion of an excitation pattern model for the starling see Buus et al. 1995). In this bird species, the value of the DLF is approximately 1/7 of the size of the critical band (Fig. 7.1.5a). Although this is a much larger value than found in humans, in which the DLF was only 1/27 of the width of the critical band, the corresponding distance on the cochlear map of the starling (15 µm) is much smaller than the corresponding distance on the human cochlear map (about 40–50 µm). Thus, with respect to the space constant of the cochlear map, the starling exhibits a higher resolution. In terms of the Weber fraction for detecting a frequency change, however, the starling is less discriminative than the human (see Langemann and Klump 1992, Moore 1974, Wier et al. 1977).
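The cochlear-map distances quoted above follow from simple arithmetic on the figures given in this chapter: the basilar distance spanned by one critical band (about 0.1 mm in the starling, 1–1.3 mm in humans) multiplied by the fraction of a critical band that the DLF occupies (1/7 and 1/27, respectively). The short Python check below is only a sketch; the function name is hypothetical and the inputs are the rounded values from the text.

```python
def dlf_distance_um(critical_band_mm, dlf_fraction_of_cb):
    """Distance on the cochlear map (in micrometres) that corresponds to the
    DLF, given the length of one critical band (mm) and the DLF expressed
    as a fraction of the critical bandwidth."""
    return critical_band_mm * 1000.0 * dlf_fraction_of_cb

starling_um = dlf_distance_um(0.1, 1 / 7)    # roughly 14 um
human_low_um = dlf_distance_um(1.0, 1 / 27)  # roughly 37 um
human_high_um = dlf_distance_um(1.3, 1 / 27) # roughly 48 um
```

The results (about 14 µm for the starling, about 37 to 48 µm for humans) agree with the rounded values of 15 µm and 40–50 µm given in the text.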
Fig. 7.1.5: Measures describing the European starling’s ability to discriminate between different signal frequencies: A: The frequency-difference limen (frequency increase) for 800-ms tones presented at 50 dB sensation level in relation to the reference-tone frequency; the solid line shows the prediction based on the spatial frequency map of the starling’s basilar papilla (see Buus et al. 1995). B: The frequency-difference limen for 400 ms tones in relation to their separation by a short temporal gap (reference frequency 1 kHz, increasing frequency). C: The frequency-difference limen for 800 ms frequency-modulated tones in relation to the modulation frequency (symmetrical sinusoidal frequency modulation around the reference frequency of 1 kHz). D: The frequency-difference limen for 800 ms frequency-modulated tones in relation to the modulation frequency (asymmetrical sinusoidal frequency modulation above the reference frequency of 1 kHz). E: The frequency-difference limen for 400 ms tones connected by a sinusoidal frequency variation upward from the reference frequency of 1 kHz. For details on B–E see text and Langemann and Klump (1992). F: A comparison of the frequency-difference limen measured for upward (●) and downward (❍) variation of frequency in relation to the reference frequency (for details see Klump 1991).
Studies in humans indicated that not only the reference frequency but also the mode of presentation of the frequency change determines the ability of the auditory system to detect a change (e.g. Fastl 1978, Moore and Glasberg 1989). Traditionally, experiments on frequency discrimination have applied two modes of frequency change. In the first type of experiment determining the DLF, two tones are presented that are separated by a short silent period (e.g. in experiments testing humans, Schorer 1989 used 80 ms; in the starling, the duration of the silent gaps was varied between 8 and 100 ms, see Langemann and Klump 1992). In the second type of experiment, the test-tone frequency is varied in the form of a sinusoidal frequency modulation, i.e. the carrier frequency of the signal is varied symmetrically around the reference frequency according to a sine function. The measure of frequency discrimination that was determined using this mode of presentation was called ‘frequency-modulation difference limen’ (FMDL, Moore and Glasberg 1989, Zwicker and Feldtkeller 1967) to distinguish it from the other measure. An additional mode of frequency change was first used by Schorer (1989a) in testing humans: A frequency variation in which the two tones of differing frequency were linked by a frequency sweep that corresponded to a raised sine function (i.e. this stimulus was similar to the DLF stimulus, but the second tone had no abrupt onset). It is also possible to present a sinusoidal frequency modulation that extends exclusively to frequencies above or below the reference frequency (i.e. this stimulus was asymmetrical with respect to the reference frequency, but otherwise similar to the FMDL stimulus). All four different modes of frequency change have been used in psychophysical experiments with starlings (Fig. 7.1.5 B-E, for details see Langemann and Klump 1992). 
Similar to the results in humans, the smallest frequency change could be detected when presenting two tones that were separated by a temporal gap. In well-trained starling subjects, the DLF can be as low as 1 % of the reference frequency (Fig. 7.1.5b), although usually somewhat higher Weber fractions for frequency-discrimination thresholds were found (Fig. 7.1.5a). Provided the starling subjects are equally well trained as in the DLF experiment, the FMDL for slow rates of sinusoidal modulation is about twice as large as the DLF. The increased difference limen for this mode of presentation could be explained by the form of the frequency change. At a value of the FMDL that is double the DLF, the deviation from the reference frequency is the same (i.e. the starling’s auditory system detects the same amount of change with respect to the reference frequency). The size of the FMDL, however, also depends on the modulation frequency (Fig. 7.1.5c). If the deviation from the reference frequency persists for a relatively long time, i.e. at low modulation frequencies, the FMDL is about constant. At rates of modulation that are so fast that the temporal properties of the auditory system limit the ability to perceive the modulation (see the discussion of temporal modulation transfer functions below), the FMDL increases (e.g. see the peak at 320 Hz in Fig. 7.1.5c). At high modulation frequencies (e.g. 640 Hz in Fig. 7.1.5c), however, there is a sharp drop in the FMDL that can be explained by the use of additional cues in auditory filters that are not tuned to the reference frequency. Frequency modulation of a tone creates a spectrum that has two or more sidebands (depending on the value of the modulation index η). If these sidebands are of sufficient intensity to be detectable by auditory
filters differing from the filter which is tuned to the reference frequency, then the detection of the modulation is possible. Such a mechanism can explain the improvement of the FMDL at a modulation frequency of 640 Hz shown in Figure 7.1.5c and 7.1.5d. The data on the frequency difference limen shown in Figure 7.1.5d were obtained with a sinusoidal frequency modulation that only provided a frequency increase with respect to the reference frequency, but was otherwise similar to the stimulus that was used in measuring the data shown in Figure 7.1.5c (i.e. the frequency change was asymmetrical with respect to the reference frequency and tracked a raised sine function). The relationship between the Weber fraction for detecting the frequency change and the modulation frequency was similar to that obtained for symmetrical modulation, but it was generally smaller than the FMDL for symmetrical modulation. This can be easily explained by the fact that for the asymmetrical frequency modulation, a particular frequency difference will result in a deviation from the reference frequency being double that for symmetrical modulation around the reference frequency. A stimulus paradigm in which no temporal gap separates the tones of differing frequency (i.e. if the two tones are connected by a frequency sweep tracking a raised sine quarter wave) results in frequency-difference limens that are similar to the results obtained with a slow asymmetrical sinusoidal frequency modulation (Fig. 7.1.5e). Our studies using various stimulus paradigms show that the starling’s perception of frequency resembles in many respects the effects found in humans (e.g. Fastl 1978, Moore and Glasberg 1989, Schorer 1989a). A study by Heil et al. (1992) demonstrated preferred responses to the direction of a frequency sweep in forebrain neurons of the chicken that varied in relation to the neuron’s best frequency. 
Neurons with a low best frequency mostly preferred upward sweeps. Neurons with a high best frequency mostly preferred downward sweeps. At intermediate best frequencies, the neurons showed no preference for sweep direction. A psychophysical study of frequency discrimination in the starling (Klump 1991) showed differences in the detection of upward and downward frequency variations that conform to the physiological data from the chicken. Stimuli with a frequency change were composed of two tones of 400 and 300 ms, respectively, that differed in frequency and were linked by a 100-ms frequency sweep that tracked a raised sine function. Corresponding to the preferred direction of neuronal responses in the chicken (Heil et al. 1992), the starlings were better at detecting upward sweeps at low reference frequencies and downward sweeps at high reference frequencies (Fig. 7.1.5f). At intermediate frequencies, no preference was observed. So far, no matching data from studies in humans have been reported.
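The factor-of-two relations between the difference limens for the different modes of presentation discussed in this section can be made concrete with a small Python sketch of the instantaneous-frequency trajectories. It assumes that the difference limen is expressed as the total (peak-to-peak) frequency excursion, which is how the explanation in the text reads; the function names and the example values (1-kHz reference, 20-Hz excursion, 5-Hz modulation) are hypothetical.

```python
import math

def symmetric_fm(t, f_ref, delta_f, f_mod):
    """Sinusoidal FM centred on f_ref; delta_f is the total excursion."""
    return f_ref + 0.5 * delta_f * math.sin(2 * math.pi * f_mod * t)

def asymmetric_fm(t, f_ref, delta_f, f_mod):
    """Raised-sine FM that stays between f_ref and f_ref + delta_f."""
    return f_ref + 0.5 * delta_f * (1 - math.cos(2 * math.pi * f_mod * t))

def max_deviation(trajectory, f_ref, delta_f, f_mod, steps=1000):
    """Largest deviation from the reference frequency over one period."""
    period = 1.0 / f_mod
    return max(
        abs(trajectory(i * period / steps, f_ref, delta_f, f_mod) - f_ref)
        for i in range(steps)
    )

# Hypothetical example values
dev_symmetric = max_deviation(symmetric_fm, 1000.0, 20.0, 5.0)
dev_asymmetric = max_deviation(asymmetric_fm, 1000.0, 20.0, 5.0)
```

For the same total excursion, the symmetric stimulus departs from the reference frequency by only half the excursion, whereas the asymmetric stimulus departs by the full excursion. Under this reading, a symmetric FMDL that is twice the DLF and an asymmetric difference limen that is smaller than the symmetric one both correspond to roughly the same deviation from the reference frequency.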
7.1.4 Temporal processing
7.1.4.1 Temporal resolution of the auditory system
The observation that songbirds producing the same song again and again show a precise temporal patterning within the song elements and little variation between sequential repeats of the song has led some researchers to conclude that this group of vertebrates will exhibit a very high temporal acuity in their auditory system (e.g. see Greenewalt 1968). Psychophysical studies in the European starling, however, using a gap-detection paradigm (Klump and Maier 1989) or measuring temporal modulation transfer functions (Klump and Okanoya 1991) have not indicated an unusually high temporal resolution. As has been demonstrated in humans (e.g. Shailer and Moore 1983), the size of the starling’s minimum-detectable gap depends on the level of the stimulus (Fig. 7.1.6, for details of this study see Klump and Maier 1989). The median size of the starling’s minimum-detectable gap in a noise burst presented at 45 dB SPL is 1.8 ms. At higher levels of the noise, the gap-detection threshold remains constant. Below a level of 45 dB SPL, the gap-detection threshold increases with decreasing level of the noise (the size of the minimum-detectable gap was 4.3 ms at the level of 25 dB SPL, which was the lowest level tested). This range of gap-detection thresholds is quite similar to the range between 2.3 and 4.2 ms that was reported from studies in humans using broadband noise (e.g. Buunen and van Valkenburg 1979, Shailer and Moore 1983). Neurophysiological studies of gap detection in neurons of the starling’s peripheral auditory system (Klump and Gleich 1991) and in primary-like neurons of the starling’s auditory forebrain (Buchfellner et al. 1989) were conducted with the same stimuli as those used in the behavioural study. 
The median minimum-detectable gap of neurons both in the forebrain and in the auditory periphery was 12.8 ms, indicating that the average temporal resolution of single auditory neurons is lower than the temporal resolution found in the behavioural test. Both in the starling’s peripheral and central auditory system, however, a small fraction of the neurons (11 % and 10 % of the peripheral and central neurons, respectively) was able to encode gap sizes of 3.2 ms and below, thus approaching the performance of the entire auditory system that was determined behaviourally. Studying temporal resolution in the human auditory system, Formby and Muir (1988) pointed out that the time constant of the temporal modulation transfer function (TMTF) for broad-band noise carriers, which can be deduced from its high-frequency cutoff, is similar to the size of the minimum-detectable gap. This relationship between the two measures of temporal resolution also holds for the European starling (Klump and Okanoya 1991). In the starling, the 3 dB-down point of the high-frequency cutoff of the TMTF is found at a modulation frequency of 123 Hz, corresponding to a time constant of 1.3 ms. The time constants obtained from the broad-band noise TMTFs in humans are a little larger (2.1 ms and 2.8 ms, when determined from data reported by Formby and Muir 1988, and by Viemeister 1979, respectively) than those measured in the starling, corresponding to the slightly larger size of the human’s minimum-detectable gap (see above).
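The correspondence between the 123-Hz cutoff and the 1.3-ms time constant is what one obtains if the TMTF is treated as a first-order low-pass filter, for which the time constant equals 1/(2π·fc). That the original analysis used exactly this relation is an assumption here, although it reproduces the reported numbers. A minimal Python sketch, with hypothetical function names:

```python
import math

def time_constant_ms(cutoff_hz):
    """Time constant (ms) of a first-order low-pass filter whose
    3 dB-down point lies at cutoff_hz (assumed model of the TMTF)."""
    return 1000.0 / (2 * math.pi * cutoff_hz)

def cutoff_hz(time_constant_in_ms):
    """Inverse relation: 3 dB-down frequency for a given time constant."""
    return 1000.0 / (2 * math.pi * time_constant_in_ms)

starling_tau = time_constant_ms(123.0)  # about 1.3 ms
human_cutoff_a = cutoff_hz(2.1)         # about 76 Hz
human_cutoff_b = cutoff_hz(2.8)         # about 57 Hz
```

Under the same assumption, the human time constants of 2.1 and 2.8 ms correspond to 3 dB-down frequencies of roughly 76 and 57 Hz, i.e. somewhat lower than the starling's 123 Hz.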
Fig. 7.1.6: The size of the minimum-detectable gap in a gated white-noise signal in relation to the spectrum level of the signal (starling data from Klump and Maier 1989, data from zebra finch and budgerigar from Okanoya and Dooling 1990).
The TMTFs of starling and human have more features in common. Viemeister (1979) found in humans that the form of the TMTF changed from a low-pass to a band-pass characteristic if the noise carrier was gated on and off (i.e. presented in pulses that were sinusoidally amplitude modulated) rather than being presented continuously (and then being sinusoidally amplitude modulated for a similar time period as in the pulsed presentation). This change in the shape of the TMTF is also observed in the starling (Klump and Okanoya 1991, see Fig. 7.1.7). The fact that a band-pass characteristic of the TMTF can only be found in the bird when presenting the noise pulses at a high sound-pressure level indicates that the change in the form may be related to differences in the degree of adaptation of the auditory system when stimulated with continuous rather than gated sounds. In the starling, a low-pass TMTF is observed if the auditory system is well adapted to the stimulus either because the gated carrier is presented at a low sound-pressure level or because the carrier signal is presented continuously. A study of the TMTFs of peripheral auditory neurons in the starling (Gleich and Klump 1995, see Fig. 7.1.7) applying the same stimuli as in the psychophysical study suggests that the phasic-tonic response characteristic of the non-adapted peripheral neurons may contribute to the band-pass characteristic observed in the psychophysical TMTFs for gated carriers presented at high levels. The transition in the shape of the TMTF observed in the psychophysical and the neurophysiological
Fig. 7.1.7: Temporal modulation transfer functions obtained for starling auditory-nerve fibres (solid lines, data from Gleich and Klump 1995) and measured in a psychophysical study using the same broad-band noise carriers in the same species (dotted lines, data from Klump and Okanoya 1991). Higher values of the modulation index indicate a lower depth of modulation.
study in relation to the mode of stimulus presentation (gated versus continuous carrier signal) is very similar (Fig. 7.1.7). In essence, the auditory-nerve fibre TMTFs show the same temporal properties as the psychophysical TMTFs, but the peripheral neurons are less sensitive in coding the depth of modulation than the starling’s auditory system as a whole (i.e. the neural TMTF is shifted towards larger depths of modulation). In the starling’s central auditory system, the neuronal modulation coding has been studied using sinusoidally amplitude-modulated tones (Knipschild et al. 1992). Even at this high level in the starling’s auditory pathway, the average coefficients of synchronization of the neural response to amplitude modulation showed a similar dependency on the modulation frequency to the psychophysical detection threshold for amplitude modulation that is measured in the TMTF. Both in humans and in the European starling, the time constant of the psychophysical TMTF is related to the frequency range in which the modulation is presented (Formby and Muir 1988, Klump and Okanoya 1991, Viemeister 1979). If the noise carrier encoding the modulation only provides information at high frequencies (e.g. in the studies in the starling and human, a high-pass noise with a cut-off frequency of 3 or 4 kHz, respectively, was modulated and an unmodulated low-passed noise of the same sound-pressure level and cut-off frequency was added), the TMTF indicates similar temporal resolution to that found when using a white-noise carrier. If the frequency range providing information about the modulation is restricted to low frequencies (e.g. to frequencies below 1.5 or 1 kHz in the starling or to frequencies below 1 kHz in humans), the temporal resolution, indicated by the high-frequency cutoff of the TMTF, is reduced. In the starling, the minimum integration
time derived from the TMTF increases from a value of 1.3 ms for a wide-band carrier to a value of 4 ms for low-pass carriers (Klump and Okanoya 1991). A study of a corresponding relationship between the minimum integration time and the frequency tuning of individual auditory-nerve fibres in the starling only revealed a weak correlation between the fibres’ temporal resolution and the different measures of their frequency selectivity (Gleich and Klump 1995). This may be explained by the multitude of factors (tuning-curve bandwidth, slopes of the high and low-frequency flanks of the tuning curves, level of stimulation above the neuron’s threshold) that have an effect on the neuron’s minimum integration time, so that a single factor is only of limited importance.
7.1.4.2 Long-term integration
Temporal processing in the auditory system has been characterized by additional time constants (for an overview see Green 1985) that describe the summation of sound energy over time periods of hundreds of ms rather than a few ms. In the starling, as in other animals and humans (e.g. see Brown and Maloney 1986, Watson and Gengel 1969), the detection threshold for a brief acoustic signal depends on the duration of the stimulus. Figure 7.1.8 shows the mean integration time constants of European starlings (determined for at least 3 individuals, see Klump and Maier 1990) in relation to the signal frequency. The time constants were determined from psychophysical data on thresholds in quiet for tones varying in duration from 30 to 2000 ms using a sum-of-least-squares fit to a theoretical function suggested by Feldtkeller and Oetinger (1956) and Plomp and Bouman (1959). In the starling’s range of best hearing (see above), the integration time constants were at a maximum, whereas at the low (500 Hz) and high (4000 Hz) frequencies, the temporal summation of signal energy proceeded over shorter times. Data on temporal summation obtained for tones of a frequency of 2.86 kHz in two other bird species (Dooling 1979) fit the results in the starling. Contrary to the results in the starling, however, humans show a monotonic decrease of the integration times for temporal summation with increasing frequency (Watson and Gengel 1969). Some mammals show a frequency dependence of temporal summation that is similar to the pattern found in humans, whereas other mammals show a nonmonotonic frequency dependence as found in the starling (see Klump and Maier 1990). The differences between the various species cannot be explained by the variation in the bandwidth of auditory filters with frequency. 
It is an unresolved issue whether the different frequency dependence of the integration-time constants indicates that more than one neurophysiological mechanism is involved in temporal summation. The slopes of the linear regression lines obtained with a sum-of-least-squares fit to the threshold data in relation to the logarithm of the signal duration (only including signals of a duration shorter than the integration time at the respective frequency) were close to -1 (Klump and Maier 1990). This indicates a complete summation of signal energy in the starling’s auditory system, which is similar to findings in humans (Green et al. 1957).
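The link between an integration time constant and a slope of -1 can be illustrated with the exponential-integrator function commonly attributed to Plomp and Bouman (1959), in which the threshold intensity for a tone of duration t is proportional to 1/(1 - exp(-t/τ)). Whether exactly this form was fitted to the starling data is an assumption, and the time constant of 200 ms used below is a hypothetical value chosen only for illustration. A slope of -1 in log-log coordinates corresponds to a threshold change of -10 dB per tenfold increase in duration.

```python
import math

def threshold_shift_db(duration_ms, tau_ms):
    """Threshold elevation (dB) relative to a very long tone, for an
    exponential integrator with time constant tau_ms (assumed model)."""
    return -10.0 * math.log10(1.0 - math.exp(-duration_ms / tau_ms))

def slope_db_per_decade(t1_ms, t2_ms, tau_ms):
    """Change in threshold per tenfold increase in duration."""
    shift = threshold_shift_db(t2_ms, tau_ms) - threshold_shift_db(t1_ms, tau_ms)
    return shift / math.log10(t2_ms / t1_ms)

TAU_MS = 200.0  # hypothetical integration time constant

short_slope = slope_db_per_decade(1.0, 10.0, TAU_MS)  # close to -10 dB/decade
long_shift = threshold_shift_db(2000.0, TAU_MS)       # close to 0 dB
```

For durations well below the time constant the model approaches -10 dB per decade, i.e. complete energy summation, while for durations much longer than the time constant the threshold no longer improves.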
Fig. 7.1.8: Integration times for temporal summation of signal energy of tones at the detection threshold (starling data from Klump and Maier 1990, budgerigar and field sparrow from Dooling 1979).
7.1.4.3 Duration discrimination
The discrimination of stimulus duration is another task that requires the auditory system to make comparisons over time periods extending over a few hundred milliseconds. Data from the starling and other vertebrates indicate that the discrimination between stimuli of different durations is not based on the perceived change of stimulus intensity in relation to the change of stimulus duration (i.e. it is not based on the same mechanisms as temporal summation). The intensity-difference limen of the starling for detecting an increase in intensity is 2 to 3 dB (Klump and Bauer 1990), and this would correspond to an increase in duration of about 100 %, assuming complete summation of signal energy (see above). However, starlings are able to detect changes in stimulus duration ranging from 10 to 20 % (Fig. 7.1.9), suggesting that a mechanism must be involved that does not depend on energy summation, but encodes duration in a different way. So far, no neurophysiological studies have been undertaken in the starling to investigate the mechanisms that encode the stimulus parameter duration. As shown for other psychophysical measures, the starling’s ability to discriminate duration is similar to that of humans (for a review see Maier and Klump 1990). Also in humans, the difference limen for duration is in the range of 10 to 20 %, and higher values are observed for short reference durations (e.g. Sinnott et al. 1987). Within the frequency range from 0.5 to 4 kHz, in both humans and in starlings the difference limen for duration does not vary greatly.
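The argument against an energy-based cue can be checked numerically. Under complete energy summation, lengthening a tone by a factor k raises its effective level by 10·log10(k) dB, so an intensity-difference limen can be converted into an equivalent change in duration and vice versa. The sketch below uses only this relation; the function names are hypothetical.

```python
import math

def duration_increase_for_db(delta_db):
    """Fractional increase in duration that raises the integrated energy
    by delta_db, assuming complete energy summation."""
    return 10 ** (delta_db / 10.0) - 1.0

def db_for_duration_increase(fraction):
    """Level change (dB) equivalent to lengthening a tone by `fraction`."""
    return 10.0 * math.log10(1.0 + fraction)

increase_at_3_db = duration_increase_for_db(3.0)  # about 1.0, i.e. 100 %
increase_at_2_db = duration_increase_for_db(2.0)  # about 0.58, i.e. 58 %
level_for_20_pct = db_for_duration_increase(0.2)  # about 0.8 dB
level_for_10_pct = db_for_duration_increase(0.1)  # about 0.4 dB
```

A 3-dB limen corresponds to doubling the duration (the "about 100 %" in the text), whereas the observed duration limens of 10 to 20 % would change the integrated energy by less than 1 dB, well below the 2 to 3 dB the starling needs to detect an intensity increase.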
Fig. 7.1.9: The minimum-detectable change in signal duration in relation to the reference-tone duration (for details see Maier and Klump 1990). The solid lines and filled symbols show the difference limen for detecting a decrease in stimulus duration; the dotted lines and open symbols show the difference limen for an increase in stimulus duration.
7.1.5 Spectro-temporal integration: comodulation masking release
A number of studies in humans have indicated that temporal pattern analysis also plays an important role in the context of signal detection in background noise. This is a task that has traditionally been viewed as being solved solely by means of spectral filtering (for reviews on this topic see Moore 1990, 1992, Hall and Grose 1991). Hall et al. (1984) were the first to demonstrate that masking of a signal by background noise can be considerably reduced if the noise exhibits random amplitude fluctuations rather than having a relatively constant envelope. They termed this effect ‘comodulation masking release’ (CMR) since it was most prominent if they coherently modulated (i.e. comodulated) the amplitude of the masker over a wide range of frequencies. It has been suggested that two types of mechanisms of signal analysis in the auditory system contribute to CMR. The first type of mechanism involves the analysis of temporal patterns within a single auditory filter (i.e. the so-called within-channel cues). For example, Schooneveldt and Moore (1989) postulated that the auditory system may detect a signal by observing the change in the modulation pattern of the envelope of the masking noise that is due to the addition of the signal within one critical band. Other authors have suggested that the temporal pattern of forward masking within an auditory filter is a major factor that determines the amount of CMR (e.g. see Gralla 1993b). The alternative type of mechanism emphasizes the comparison of the incoming information in different frequency channels of the auditory system (i.e. the exploitation of across-channel cues). If a masking signal shows correlated amplitude fluctuations in different frequency channels, the addition of the signal to one of the channels will reduce the across-channel correlation; this change might be detected by the auditory system (e.g. see the model proposed by Buus 1985). Alternatively, given correlated amplitude fluctuations in different frequency channels of the auditory system, a low amplitude of the masker in one channel could predict a good time to detect a signal embedded in the masker in another frequency channel (‘dip-listening hypothesis’, see Buus 1985). Amplitude fluctuations are a common characteristic of sounds in the natural environment of birds (e.g. Richards and Wiley 1980) and, as the studies in humans indicated (see above), their exploitation is of considerable importance for signal detection in background noise. Since our studies in the starling demonstrated that its auditory-filter bandwidths are similar to human critical bands and that its temporal pattern perception resembles the human perceptual performance, this species appears to be an ideal model for studying mechanisms underlying CMR. As a first step, CMR in the starling was tested with the stimulus paradigm applied by Hall et al. (1984) and Schooneveldt and Moore (1989). Similar to the results obtained by Schooneveldt and Moore (1989) studying humans, the starling’s CMR improved with an increase in the masker bandwidth (Fig. 7.1.10a, see also Klump and Langemann 1995). As also found in humans by Schooneveldt and Moore (1989), CMR in the starling was already evident if the masker bandwidth was limited to a single critical band (the starling’s critical bandwidth is 233 Hz, see above). 
This is demonstrated in Figure 7.1.10a for maskers of a bandwidth up to 200 Hz and in Figure 7.1.10b by the data showing the masking release obtained with a 200-Hz-wide masking noise. Thus, in the starling as in humans, within-channel cues contribute significantly to CMR. As in humans (e.g. Hall et al. 1984, Schooneveldt and Moore 1989), release from masking in starlings increases further if the masker bandwidth is expanded beyond the critical bandwidth (data for masker bandwidths above 200 Hz in Fig. 7.1.10a and results in Fig. 7.1.10b). This indicates a substantial contribution of across-channel cues to CMR in starlings. The perceptual similarities between starlings and humans are not only of a qualitative nature – the size of the effect is about the same in both species.
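The across-channel cue mentioned in this section, a drop in envelope correlation when a tone is added to one channel of a comodulated masker, can be illustrated with a deliberately simplified Python toy model. It is not the model of Buus (1985) and not a simulation of the starling experiments: each masker sample is represented as a complex Gaussian value shared by both channels (perfect comodulation), the tone is a constant phasor added to one channel only, and all names and parameter values are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def envelope_correlation(signal_amplitude, n_samples=5000, seed=1):
    """Correlation between the envelopes of a flanking channel (masker only)
    and the signal channel (same masker plus a tone of given amplitude)."""
    rng = random.Random(seed)
    flanking, on_frequency = [], []
    for _ in range(n_samples):
        # Shared complex masker sample: identical fluctuation in both channels
        re, im = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        flanking.append(math.hypot(re, im))
        # The tone adds a constant phasor to the on-frequency channel only
        on_frequency.append(math.hypot(re + signal_amplitude, im))
    return pearson(flanking, on_frequency)

r_masker_alone = envelope_correlation(0.0)  # envelopes identical
r_with_signal = envelope_correlation(2.0)   # correlation clearly reduced
```

In this toy model the envelopes are perfectly correlated without the tone and markedly less correlated once the tone is present, which is the kind of change an across-channel comparison could exploit.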
Fig. 7.1.10: Release from masking for the detection of a 2-kHz tone resulting from coherent amplitude modulation across the frequency range of a band-limited noise masker centered on the tone frequency (for details see Klump and Langemann 1995, Langemann 1995). Median values and ranges obtained from five starlings are shown. A: Release from masking in relation to the masker bandwidth. The masker was comodulated by a 50-Hz low-passed noise. B: Release from masking in relation to the cut-off frequency of the low-passed noise modulator (● = data for a masker bandwidth of 1600 Hz, ◆ = data for a masker bandwidth of 200 Hz).
7.1.6 The processing of signals by the European starling’s auditory system – conclusions
Throughout this section it has become clear that the processing of signals by the European starling’s auditory system shows many parallels to processing in the human auditory system. This is underlined by the range of psychophysical tests conducted on this species, which exceeds the range of psychophysical studies in any other non-human vertebrate. Furthermore, a model of auditory processing that originally had been tailored to explain perceptual processes in humans allows a good prediction of the psychoacoustical performance of the starling (Buus et al. 1995). Although it seems surprising that a bird, with an auditory system that is morphologically different from the mammalian auditory system, shows a performance so similar to the perception of acoustic signals by humans, this might well be explained by the evolutionary response to similar demands on the auditory system of both species. Both birds and humans base their acoustic communication on signals with frequencies below 10 kHz. They have evolved their auditory processing mechanisms in the same acoustic environment with respect to background noise, to distortion of signals by frequency-dependent attenuation or to the addition of clutter echoes. The similarities of auditory processing in starlings and humans even extend to such complex tasks as the analysis of auditory scenes (Bregman 1990) that are composed of a number of auditory objects (e.g. Hulse et al. 1997). In summary, the research in the European starling has proven the suitability of this species for studying processing at different levels of the auditory system and it provides ample possibilities for future studies of the mechanisms of auditory analysis.
7.2 Mechanisms underlying acoustic motion detection Hermann Wagner
Motion information provides one of the most important cues for survival, because it helps to break camouflage of a predator or a prey organism and because it allows predictions about the future path of an object. Despite this, it has been unclear whether neural computations exist in the auditory system to analyze motion cues. Recent data on the processing of acoustic motion from several labs, including our own, have yielded some unexpected findings suggesting that the psychophysical, neurological and neurophysiological mechanisms underlying the detection and representation of acoustic motion are quite similar to those underlying the detection and representation in other modalities, especially in vision. I shall first introduce the problem underlying acoustic motion detection, then briefly review the findings of others, before I turn to our own results.
7.2.1 The problem underlying acoustic-motion detection
Sounds are analyzed in separate frequency channels, the signal in each channel being defined by its frequency, its amplitude and its phase. Thus, dynamic auditory cues can be created by varying any of these parameters alone or by varying them in combination. Motion has a direction and can, thus, be represented by a vector. The magnitude of the motion vector is velocity, while the direction will be referred to as motion direction. The direction and velocity of a moving stimulus are usually measured at two different points in space and time. Motion detectors contain three essential computational stages (Poggio and Reichardt 1973): 1) two receptor elements at the input (stage 1), 2) the introduction of a temporal asymmetry (stage 2), and 3) a nonlinear interaction (stage 3) (Fig. 7.2.1a). Such a detector has a preferred direction and a null direction: the response is high with stimulation in the preferred direction and low or absent with stimulation in the null direction. So far, systematic studies on dynamic auditory cues have been rare. The main reason for this reluctance may be the absence of convincing psychophysical evidence for neural systems specialized in the detection of acoustic motion (Middlebrooks and Green 1991).
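The three stages can be made concrete with a minimal correlation-type detector. The sketch below is only one possible instance of the scheme: the temporal asymmetry is implemented as a pure delay and the nonlinear interaction as a multiplication, both of which are assumptions made for illustration, as are the function names and the discrete time steps.

```python
def detector_response(first_input, second_input, delay):
    """Direction-selective detector built from the three stages.

    Stage 1: two receptor inputs, given as lists sampled at equal time steps.
    Stage 2: temporal asymmetry, here a pure delay of `delay` steps applied
             to `first_input` (an assumption of this sketch).
    Stage 3: nonlinear interaction, here a multiplication, summed over time.
    """
    total = 0.0
    for t in range(delay, len(second_input)):
        total += first_input[t - delay] * second_input[t]
    return total


def pulse(n_steps, onset):
    """A unit pulse at time step `onset` in an otherwise silent input."""
    signal = [0.0] * n_steps
    signal[onset] = 1.0
    return signal


n_steps, travel_time = 20, 3
early = pulse(n_steps, 5)                # receptor stimulated first
late = pulse(n_steps, 5 + travel_time)   # receptor stimulated later

# Preferred direction: the delayed input is the one stimulated first.
preferred = detector_response(early, late, travel_time)
# Null direction: the stimulus order at the two receptors is reversed.
null = detector_response(late, early, travel_time)
print(preferred, null)  # 1.0 0.0
```

The detector responds only when the stimulus reaches the delayed input first and the travel time between the two receptors matches the delay, which gives it a preferred and a null direction.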
7.2.2 Recent psychophysical and neurological findings on acoustic motion
The existence of a motion after-effect (MAE) is generally regarded as evidence for a specialized motion-processing system in a modality. Until recently, there was no clear demonstration of an auditory motion after-effect (aMAE). Grantham (1997) demonstrated a stable aMAE by creating a spatially enriched acoustic source. Thus, the auditory system seems to have a stream of information processing dealing with the detection of motion. Support for this conclusion came from a report about a patient who had a specific deficit in determining acoustic motion, while the judgment of the location of a stationary acoustic object was not impaired at all, or much less so (Griffiths et al. 1996).
7.2.3 Physiological correlates of acoustic motion detection in the barn owl
There has long been evidence for auditory neurons that exhibit motion-direction sensitivity (MDS) (Altman 1968). In the lower nuclei of the central auditory pathway, such as the medial superior olive, responses lacked MDS, but in a higher centre, the inferior colliculus, motion-direction-sensitive neurons were found. We systematically investigated the neural algorithms underlying the motion computations. To do so, we constructed an apparatus of 7 loudspeakers arrayed in a half-circle of 1 m radius. The angular separation of the loudspeakers was 30 degrees. Switching on the speakers in sequence created an apparent acoustic motion stimulus in the horizontal plane, which moved either clockwise (cw, see right arrow in Fig. 7.2.1b) or counterclockwise (ccw, see left arrow in Fig. 7.2.1b). Using this apparatus for stimulation, we have recorded from more than 300 neurons that exhibited acoustic motion-direction sensitivity. Examples are shown in Figure 7.2.1b-d. In all these neurons, the response to motion in one direction was significantly higher than the response to motion in the opposite direction. Most of these neurons were located in the inferior colliculus, but we also found motion-direction-sensitive cells in the optic tectum (Wagner and Takahashi 1990, 1992, Wagner et al. 1994, Kautz 1997). Stage 1 of the motion detector (Fig. 7.2.1a) can be thought of as consisting of two spatially restricted receptive fields that are tuned to different locations in space. The temporal asymmetry (stage 2) is realized by means of a low-pass filter and not by a frequency-independent time shift (Wagner and Takahashi 1992). The time constant of the low-pass filter is approximately 20 ms (D. Kautz, Master’s thesis, University of Oregon 1992).
Responses to both stationary stimuli and stimuli moving in the preferred direction were facilitated to the same extent relative to spontaneous activity, while the responses evoked by sounds moving in the null direction were facilitated less or not at all (Fig. 7.2.1c). Thus, stage 3 is realized by an interaction of excitation and inhibition. Finally, we observed that the motion-direction sensitivity was inversely related to the stimulus-driven activity of the cells (Wagner and Takahashi 1992). This can be explained if, after the generation of an initial motion-direction sensitivity, a fourth stage is included. In this step, a direction-independent inhibition reduces activity in both the preferred and the null directions, and by this reduction increases MDS (Wagner and Takahashi 1992). Application of the inhibitory transmitter GABA decreased the overall response of the cells, but in many cases it also increased motion-direction sensitivity (Fig. 7.2.1d). Kautz (1997) demonstrated that the responses of many neurons were influenced by the GABA-A receptor antagonist bicuculline methiodide (BMI). BMI also decreased motion-direction sensitivity in about 30 % of the cells. After iontophoresing BMI, the response to ccw motion, the null-direction stimulation, increased much more than the response to cw motion (Fig. 7.2.1d). In contrast, GABA increased motion-direction sensitivity. In summary, the owl’s acoustic motion detector is a combination of the elements also found in visual-motion detectors (Barlow and Levick 1965, Poggio and Reichardt 1973). Thus it seems that the constraints of moving stimuli have resulted in the same neural algorithms, independent of the modality in which the stimuli occur (Kautz and Wagner 1995).
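The effect of the proposed fourth stage can be checked with simple arithmetic. The sketch below assumes a subtractive inhibition and a contrast-type direction index, (preferred - null) / (preferred + null); neither the index nor the firing rates are taken from the studies cited, they are hypothetical values chosen to show the principle.

```python
def direction_index(preferred_rate, null_rate):
    """Contrast-type index of motion-direction sensitivity (assumed measure)."""
    return (preferred_rate - null_rate) / (preferred_rate + null_rate)


def inhibit(rate, inhibition):
    """Subtract a direction-independent inhibition; rates cannot go below zero."""
    return max(0.0, rate - inhibition)


# Hypothetical responses (spikes/s) after the first three stages.
preferred_rate, null_rate = 100.0, 60.0
inhibition = 40.0  # same amount for both directions

before = direction_index(preferred_rate, null_rate)
after = direction_index(inhibit(preferred_rate, inhibition),
                        inhibit(null_rate, inhibition))
print(before, after)  # 0.25 0.5
```

Because the same amount is removed from both responses, their difference is unchanged while their sum shrinks, so the relative difference grows. This is consistent with the inverse relation between stimulus-driven activity and MDS described above.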
Fig. 7.2.1: Neural correlates of acoustic motion detection in the owl. A: The acoustic motion detector as derived from experiments in the owl. B: Response of a motion-direction-sensitive cell. This cell responded much more to motion in the ccw direction (indicated by the circle on the left with the arrowhead pointing in the ccw direction) than to motion in the cw direction. C: Stage 3 of the acoustic motion detector is realized similarly to the ‘AND NOT’ logical operation proposed by Barlow and Levick (1965), because the response to stationary stimulation (stat.) was as high as the response to stimulation in the preferred direction (pref.), while the response to stimulation in the null direction (null) was reduced. Spontaneous activity (spont. act.) is low in these cells. D: MDS decreased when inhibition was reduced by the GABA antagonist bicuculline methiodide (BMI). CW, clockwise motion; CCW, counter-clockwise motion.
Almost all motion-direction-sensitive neurons in the owl’s inferior colliculus also responded to stationary stimuli. Therefore, the midbrain may contain the neural algorithms creating acoustic MDS, the acoustic motion detector (Fig. 7.2.1a), but it need not contain structures that process motion independently of stationary stimuli. In the barn owl, cells that process stationary stimuli, as well as cells that represent motion direction, are arranged in columns. Within a vertical penetration through the barn owl’s inferior colliculus, most motion-direction-sensitive cells shared the same preferred direction (Wagner et al. 1994). In addition, about 70 % of the cells recorded on the animal’s right side were sensitive to counter-clockwise motion, while some 70 % of the cells recorded on the left side were sensitive to clockwise motion (Wagner 1992). Since the left auditory space is represented in the right midbrain, the cells respond when a target is stationary or when the sound source moves from front to back. Thus, the arrangement of motion-direction-sensitive neurons may help the owl to bring the target into the range of high spatial resolution in the frontal part of its space map.
7.3 Comparative echolocation behaviour in bats Gerhard Neuweiler
There are more than 600 different species of echolocating bats. They live in diverse biotopes and forage on very different resources ranging from flying insects, spiders and caterpillars, lizards and frogs to fruits and nectar and pollen of flowers. Such diverse nightly targets require echolocation capacities adapted to the various foraging strategies and constraints of the specific biotopes. We combined field studies with behavioural experiments to get an idea as to how echo-audition may be specialized to meet the foraging niches of different bat species.
7.3.1 Field studies
We focused our efforts on two species, the horseshoe bat Rhinolophus rouxi and the False Vampire Megaderma lyra.
7.3.1.1 Horseshoe bats
Horseshoe bats consistently emit a constant-frequency (CF) echolocation signal of 10 to 60 ms duration, which starts and ends with a brief frequency-modulated (FM) segment. The complete signal is called an FM/CF/FM echolocation call. In the cochlea of horseshoe bats, we had discovered a magnified representation of the species-specific frequency of CF-echoes (Neuweiler et al. 1980). We named this over-representation within the cochlear tonotopy an auditory fovea. Auditory neurons tuned to foveal frequencies have extremely narrow tuning curves with Q10dB-values up to 600, and respond with high sensitivity to minor frequency modulations of the echo frequency. In a detailed study, we investigated the foraging behaviour in a colony of about 15 000 horseshoe bats on the hill slopes of Sri Lanka (Neuweiler et al. 1987). To our surprise, the horseshoe bats did not forage over open, prairie-like areas or over waterways, but spent the nights within the forest and jungle patches. There, they caught insects on the wing during a first activity period of 30 to 60 min. After an inactive interval of another hour, the horseshoe bats started foraging in a flycatcher style. They alighted on specific, leafless twigs under the canopy and scanned the environment for flying insects by continuously echolocating and turning the body around the leg axis in an almost full circle. Horseshoe bats relentlessly scanned for prey throughout the night and emitted echolocation calls at an average rate of 9.6 ±1.4 /s. Such emission rates result in duty cycles of 44–48 %, which are the highest recorded in echolocating bats. During scanning for prey from vantage points, the bats emitted pure tones without initial FM components, and the last echolocation sounds before take-off were prolonged to about 60 ms duration. The bats made brief catching flights and returned with their prey to their vantage point.
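The relation between emission rate, call duration and duty cycle is simple arithmetic: the duty cycle is the fraction of time filled with sound, i.e. rate times duration. The sketch below uses the rate and duty cycles quoted above; the mean call durations it derives are an inference from those two figures and are not values reported in the text.

```python
def duty_cycle(calls_per_second, call_duration_ms):
    """Fraction of time occupied by calls (0.0 to 1.0)."""
    return calls_per_second * call_duration_ms / 1000.0


def implied_call_duration_ms(duty, calls_per_second):
    """Mean call duration needed to reach a given duty cycle at a given rate."""
    return 1000.0 * duty / calls_per_second


rate = 9.6  # calls per second, as quoted above
for duty in (0.44, 0.48):
    duration = implied_call_duration_ms(duty, rate)
    print(f"duty cycle {duty:.0%} at {rate} calls/s -> {duration:.1f} ms per call")
```

Under these assumptions the implied mean call durations lie between roughly 46 and 50 ms, which is within the 10 to 60 ms range of call durations given for horseshoe bats.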
Bats maintain an individual foraging area of about 20 by 20 m and return to the same few vantage points in subsequent nights. From these field observations, we inferred that horseshoe bats might detect flying prey by acoustic glints imposed onto the pure-tone echo by the beating wings of the prey. This hypothesis was tested in an experimental and comparative study in Rh. rouxi, Hipposideros speoris and H. bicolor (Link et al. 1986). Hipposiderid bats also exclusively emit CF/FM signals, but of shorter duration (4 to 7 ms). All three species only responded to insects that were moving their wings. The bats showed no reaction at all to non-moving insects or those walking on the floor, except for the large-eared H. bicolor, which occasionally attacked insects crawling on the floor. In horseshoe bats, a single wingbeat may trigger a catching flight. From experiments with wingbeat dummies, we calculated that the threshold for catching responses is at a wing speed of the prey of ca. 1 to 2 cm/s. Our behavioural and neurophysiological experiments resulted in an adaptive model of CF-echolocation systems (Fig. 7.3.1): CF-bat species apply pure-tone echolocation and tuned auditory foveae for echo-clutter rejection in dense habitats such as jungles and the foliage of forest trees. The pure tone serves as a carrier for glints caused by beating wings of the potential prey. The glint-bearing echoes are received and analyzed by the narrow foveal frequency filter of the cochlea. The price paid for such a noise-resistant specialization is the inconspicuousness of potential prey that does not beat its wings.
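To get a feeling for how small the frequency modulations at the behavioural threshold might be, the wing speed can be converted into a Doppler shift of the echo. The sketch below assumes that the frequency component of a glint is a pure two-way Doppler shift, a carrier of 75 kHz (a hypothetical value within the foveal range shown in Fig. 7.3.1) and a speed of sound of 343 m/s; none of these assumptions is stated in the text.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def doppler_shift_hz(carrier_hz, reflector_speed_m_s, c=SPEED_OF_SOUND_M_S):
    """Two-way Doppler shift of an echo from a reflector moving towards a
    stationary sender/receiver, valid for speeds much smaller than c."""
    return 2.0 * reflector_speed_m_s * carrier_hz / c


carrier_hz = 75_000.0  # hypothetical CF carrier
for speed_cm_s in (1.0, 2.0):
    shift = doppler_shift_hz(carrier_hz, speed_cm_s / 100.0)
    print(f"wing speed {speed_cm_s} cm/s -> echo shifted by {shift:.1f} Hz")
```

With these numbers the shift is only a few hertz on a carrier of tens of kilohertz, which illustrates why very sharply tuned foveal filters would be useful for this task.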
Fig. 7.3.1: Fluttering target detection with a CF-echolocation system in Rhinolophus rouxi. Top: Field situation of a foraging horseshoe bat. Centre left: The auditory fovea in the cochlea. Centre right: the expanded representation of the auditory fovea (stippled, 72–77 kHz) in the ascending auditory pathway. CN, cochlear nucleus; LL, nuclei of the lateral lemniscus; IC, inferior colliculus; marked area in cerebrum = auditory cortex. Lower left: Audiogram showing the narrow filter tuned to the individual echolocation frequency (arrow). Lower right: Responses of a filter neuron tuned to the pure tone frequency of an echolocation call (top), and to an echo from a wing-beating insect (bottom).
Fig. 7.3.2: Auditory foveae tuned to species-specific and individual echo frequencies within the tonotopy of the inferior colliculus. Within the dorsoventral tonotopy the narrow foveal frequency band is shaded. Figures denote best frequency of neurons in kHz. Note the concordance of emitted echolocation frequency (RF) and frequency band of the fovea not only among species but also within a species (Rhinolophus rouxi).
This interpretation is corroborated by the tonotopy of the inferior colliculus, which contains a huge over-representation of the narrow foveal frequency range (Rübsamen et al. 1988). These foveal neurons comprise nearly half of the collicular volume and are tuned not only to the species-specific CF-frequency range, but to the individual CF-frequency in each specimen (Fig. 7.3.2). In Rh. rouxi, adult males and females emit pure-tone frequencies in separate bands, the males from 73.5 to 77 kHz and the females from 76.5 to 79 kHz. How are individual emitted frequency and cochlear foveal frequency matched to one another? In the second week after birth, young horseshoe bats start to emit echolocation calls, and at the same time an auditory fovea appears in auditory recordings (Rübsamen 1987). Young bats start to emit frequencies around 55 kHz to which their auditory foveae are tuned. During the next five weeks, both emitted and foveal frequency increase in matched concordance to the frequency band of 73 to 79 kHz (Fig. 7.3.3). Lesion experiments suggest that the tuning to higher frequencies during ontogeny occurs independently in the larynx and in the cochlea, and allows for a coarse matching of emitted and foveal frequency (Rübsamen and Schäfer 1990a,b). For fine tuning, however, auditory feedback is mandatory. The lesion results and shifts of foveal centre frequencies caused by anaesthesia indicate that fine tuning between emitted and foveal frequency may be under active control of the echolocating bat.
Fig. 7.3.3: Ontogeny of matched frequency filters in horseshoe bats. Congruent development of the auditory filter frequency and the emitted echolocation frequency in juvenile horseshoe bats (Rhinolophus rouxi) from the 3rd to 5th postnatal week. At every point the frequency tuning of the auditory fovea matches that of the echolocation calls (Rübsamen and Schäfer 1990).
7.3.1.2 Gleaning bats
The False Vampire, which also lives in Sri Lanka and India, shows a completely different echolocation behaviour. This species invariably emits ultra-brief (less than 1 ms), click-like echolocation pulses that cover a frequency range from 110 to 20 kHz (wideband signal). A radio-tracking study in South India (Audet et al. 1991) disclosed that False Vampires spend most of the night in individual night roosts. From there, they take off for search flights lasting from 1 to 15 min. In about one third of the search flights, the bat returns with prey (large insects, frogs, lizards, geckos, sometimes birds and mice). False Vampires fly low over the ground and scan it for moving prey. Behavioural experiments with moving and non-moving baits showed that the bats detect the prey by listening to the faint scratching noises of moving targets (Marimuthu and Neuweiler 1987, Marimuthu et al. 1995). Prey that does not make noises will not elicit catching responses of the bat. In several ways, the ascending auditory pathway in False Vampires is adapted for listening for faint signals (Rübsamen et al. 1988, Neuweiler 1989, Fig. 7.3.4): the large and fused pinnae create a hypersensitivity of -20 dB SPL for frequencies between 10 and 35 kHz; the inferior colliculus features auditory neurons that selectively respond to faint noises (similar neurons were also found in true vampires, Desmodus rotundus); the inferior colliculus also contains many neurons that have an upper threshold, and, therefore, only respond to faint signals up to about 40 dB SPL (Rübsamen et al. 1988).
Fig. 7.3.4: Auditory adaptations to listening to faint noises in the False Vampire, Megaderma lyra. A: Audiograms of bats that glean prey from ground or from foliage. Ma, Macroderma gigas; P, Plecotus auritus; Me, Megaderma lyra. Horizontal bars indicate frequency range of the echolocation calls. The hatched area in the lower graph indicates the amplification provided by the large pinnae of Megaderma lyra. B: Neurons with upper thresholds in the inferior colliculus of Megaderma lyra. The upper thresholds are created through GABAergic inhibition (GABA). Application of the GABA antagonist bicuculline (Bic) abolishes the upper threshold.
Why then do False Vampires echolocate continuously during searching, selecting and seizing prey? There is no unambiguous answer to this important question. The behavioural experiments using digitally simulated target structures that create fine spectral cues in the broad-band echoes of False Vampires suggest that the fine structure of the echo spectrum may give information about the identity of a target (see Sect. 7.4). Another gleaning bat, Antrozous pallidus, was studied by radio-tracking in Big Bend National Park in southern Texas (Krull 1992). This species searches for prey in a similar way to False Vampires. However, it mainly subsists on arthropods, which are taken from substrates or in flight. As in False Vampires, detection is only passive, by listening to prey noises, including the humming of flying insects, even though the bats continuously emit echolocation sounds. The bats adapt the sound structure for cluttered and non-cluttered backgrounds: while foraging close to the ground or within confined spaces, they emit broad-band FM-pulses lasting only 1.6 ±0.5 ms. When they pursue flying insects in open spaces, however, they lower the bandwidth of the signal and double the duration to 3.1 ±0.9 ms. As in False Vampires, the detailed function of echolocation during foraging behaviour is not yet clear in Antrozous pallidus.
7.3.1.3 FM-bats
We also tried to study the foraging behaviour of the classical FM-bat Myotis myotis by radio-tracking in a mountain region of southern Bavaria (Audet 1990). The bats preferably spent the night within dense forests without undergrowth. In contrast to horseshoe bats, they spent most of the time on the wing in search of insects that are often taken from the ground by brief dips and landings. However, due to the difficulty of pursuing the bats within the forests and their shyness, we could not record echolocation signals under natural conditions. We were luckier with a related and rare bat species, Myotis emarginatus, that lives in the same area as Myotis myotis. The study disclosed that this species hunts flying insects in open spaces, and also gleans insects from leaves and other substrates (Krull et al. 1991). During gleaning, the bats often hover in front of the prey before they catch it. M. emarginatus adapts the time-frequency structure of its echolocation sound to the particular situation (Schumm et al. 1991). In confined spaces, the bats emit brief frequency-modulated pulses that cover a frequency range of 110 to 40 kHz. During hovering in front of putative prey, the frequency band is extended maximally to 90 ±14 kHz, and the sound duration is minimized to 0.8 ±0.2 ms. In contrast, while searching for insects in open spaces, the echolocation signals are lengthened to up to 7 ms by expanding the end of the brief FM signal into a pure-tone segment of about 52 kHz (FM/CF-signal; Fig. 7.3.5). Thus, not only may distinct species adapt their echolocation system to specific constraints, but signal variability within a species may also allow for flexible adjustments to the foraging situation. Since M. emarginatus is an endangered species, no physiological studies on audition could be performed.
Fig. 7.3.5: Echolocation signals adapted to different foraging strategies in Myotis emarginatus. Left column sonagrams; right column spectrograms. The figures denote peak frequencies in kHz (Schumm et al. 1991).
It would be most rewarding to study the auditory adaptations to the different echolocation systems in a comparative way. However, such studies have become rare, since many bat species are endangered, and insectivorous species rarely breed in captivity.
7.3.2 Insect abundance and prey selectivity
The specificity of echolocation systems may also select for certain prey types, and prey selectivity in turn may depend on the biomass of insects available in a biotope. To obtain a first estimate of prey availability, we collected nocturnal insects by light-traps over a full annual cycle in Madurai scrub jungle, where we had studied the foraging behaviour of the most common South Indian bat species (Eckrich 1988, Eckrich and Neuweiler 1988). Insect abundance was related to the annual prey intake of five bat species, as studied by faecal pellet analysis (Fig. 7.3.6).
Fig. 7.3.6: Annual cycle of insect orders preyed upon by five bat species (left column, bw = body weight) in scrub jungles of Madurai, South India during the 1985/86 season. Bars indicate percentage of identifiable fragments in fecal pellets (Eckrich 1988).
Moths are the best-suited resources for echolocating bats, since they offer relatively high individual body masses and high population densities. They are also active all through the night, whereas other insect families are only on the wing in the early evening and early morning hours. Indeed, in the scrub jungle of Madurai, the biomass of nocturnal insects was highest from September to January, and moths contributed 35 % of it, followed by beetles (20 %) and heteropteran insects (13 %). During the dry season from March to August, crickets and grasshoppers dominated (up to 60 %) the insect population active at night. In addition, the insect biomass rapidly declined during the dry season and, in July, reached a minimum of only 5 % of that in November. During the dry season, four of the five bat species monitored lived mainly on crickets, and the larger Taphozous species also fed on grasshoppers. Grasshoppers and crickets are not very active but may migrate at night at considerable heights above ground, and in large numbers. Indeed, several times we observed Taphozous bats soaring several hundred meters above ground in the downhill evening winds for tens of minutes (Siefer and Kriner 1991). They occasionally made brief and abrupt turning and dipping flights, which we interpreted as prey-catching manoeuvres. After the onset of the monsoon in October, the spectrum of insects eaten diversified among different species. However, in most species, moths remained a staple food. Some moth species are protected by unpalatable or poisonous substances or by ultrasonic click emissions that cause pursuing bats to turn away. More than 190 moth species occur in the scrub jungle. All hypsid and ctenuchid moths were unpalatable, and of 18 arctiid species examined, 15 were chemically protected and three were protected by click emissions. In contrast, of more than a hundred noctuid species tested, only one was rejected by bats.
Thus, noctuid moths, crickets and grasshoppers are the main insect groups on which the bats in South Indian scrub jungles subsist over a full annual cycle. Apart from obvious size correlations, no specific food selectivity was observed among the different insectivorous bat species.
7.3.3 Time windows for echo perception
Even today, there exists no distinct criterion that differentiates audition in echolocation from listening and auditory communication. However, an echo can only occur when it is preceded by a signal of the same frequency-time structure. In 1985, Roverud and Grinnell reported that in Noctilio albiventris a time window of echo perception might exist that is time-locked to the emitted echolocation signal. In cooperation with Roverud, we studied this question in different bat species and for two different tasks, distance discrimination and fluttering target detection. We tried to confuse the trained bats by playing artificial echoes to them while they had to differentiate target distances or wing-beat frequencies, respectively. In the distance-discrimination task, the performance of all three bat species (Rhinolophus rouxi, Eptesicus fuscus and Noctilio albiventris) fell to chance level under two conditions:
– the artificial signal had the same time-frequency structure as the emitted echolocation signal, and
– the false echoes were played back within a time window up to 28 to 45 ms after onset of sound emission (Roverud 1990a,b).
In horseshoe bats, not only the loud second harmonic of the echolocation signal but also the faint first harmonic disrupts distance differentiation when played back as an artificial echo (Fig. 7.3.7).
Fig. 7.3.7: Time window locked to sound emission for distance discrimination in echolocating bats (Rhinolophus rouxi). Horseshoe bats fail to respond correctly (below dashed line) in a learned distance-discrimination task when artificial echoes (CF/FM signals as indicated above the graph) are played back within a time window of 46 to 70 ms relative to the beginning of sound emission. Identical artificial echo signals arriving later or earlier did not disrupt distance discrimination (Roverud 1990).
When the bats had to discriminate different wing-beat frequencies, the performance was not disrupted by playing back artificial echoes at any time following sound emission (Roverud et al. 1991). Horseshoe bats with their long-lasting CF-signals (30–40 ms) differentiated wing-beat frequencies only 1.5 to 4 Hz lower than a reference wing-beat frequency of 60 Hz. Hipposideros lankadiva, which has a CF-signal of less than 10 ms duration, detected wing-beat differences down to 7.5 to 11 Hz, and the FM-emitting Eptesicus fuscus down to 15 Hz. Apparently, the longer the signal, the better the perception of glint-repetition rates. Whereas repetition rates of glints due to fluttering wings refer to time scales of tens to hundreds of ms, the distance-discrimination task requires a resolution of echo travel times in the µs-range. Perhaps this specific neuronal time analysis is sensitive to jamming, whereas analysis of comparatively slow time intervals is a conventional task of auditory systems and may not be disturbed by signals of 5 to 50 ms duration. In any case, the time windows described cannot be interpreted as a general prerequisite for echolocation.
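The two time scales contrasted above can be related to target distances by the usual echo-ranging relation, distance = c × delay / 2. The sketch below applies it to the 28 to 45 ms window limits and to a 1 µs delay difference; reading the window limits as echo travel times and the speed of sound of 343 m/s are assumptions made for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def target_distance_m(echo_delay_s, c=SPEED_OF_SOUND_M_S):
    """Distance of a reflector whose echo returns after `echo_delay_s`."""
    return c * echo_delay_s / 2.0


def echo_delay_s(distance_m, c=SPEED_OF_SOUND_M_S):
    """Two-way travel time of sound to a reflector at `distance_m`."""
    return 2.0 * distance_m / c


for delay_ms in (28.0, 45.0):
    distance = target_distance_m(delay_ms / 1000.0)
    print(f"echo delay {delay_ms} ms -> target at {distance:.1f} m")

# A delay difference of 1 microsecond corresponds to a fraction of a millimetre.
print(f"1 us -> {target_distance_m(1e-6) * 1000.0:.2f} mm")
```

On this reading, the upper limits of the time window would correspond to targets several metres away, while the µs-range resolution needed for distance discrimination corresponds to sub-millimetre range differences.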
7.4 Echolocation behaviour and comparative psychoacoustics in bats Sabine Schmidt
This section covers three main topics. The question of how echolocating bats use their sonar system to perceive target structure is addressed in the first part. The second part aims at describing the perceptual dimensions available to bats for the analysis of complex spectra, and presents a hypothesis of how these dimensions are used to arrange sonar information. Finally, psychoacoustic time constants and measures for frequency processing in echolocating bats are presented, and possible adaptations to sonar are discussed.
7.4.1 Texture perception by echolocation
Bats can discriminate among objects of different surface structure by echolocation (cf. Schnitzler and Henson 1980). This ability may be of major importance for gleaning bats that have to find their prey in the presence of clutter echoes from vegetation or from the ground. Although gleaning bats usually detect prey by its rustling noises (see Sect. 7.3), this does not imply that only passive acoustic cues are used for prey identification and capture. On the contrary, we were able to show in a recent study (Schmidt et al. 1997) that the gleaning bat, Megaderma lyra, consistently uses echolocation while approaching and catching terrestrial prey. The bats were trained to use a fixed flight path, so that all calls emitted while approaching the prey and when returning to the perch could be recorded with sensitive microphones. In the course of the hunting flight, systematic changes occurred in the call emission pattern and the temporal and spectral structure of the sonar calls. During the obligatory hovering of the bats above their prey, a significant emphasis of higher harmonics rendering the calls very broad-band was obvious. Calls emitted during hovering are thus ideally suited for structure discrimination, as will be clear from the results explained below. In fact, the bats discriminated between different prey types during hovering, and the echolocation activity in, and the duration of, the hovering phase varied systematically with prey type. This suggests that sonar is indeed used for a close-up inspection of prey.
7.4 Echolocation behaviour and comparative psychoacoustics in bats
7.4.2 Discrimination of target structure by the gleaning bat, M. lyra
The ability of M. lyra to set up a well-differentiated acoustic image of its environment has been demonstrated in a two-alternative, forced-choice experiment (Schmidt 1988b). The bats were trained to select the disc with the finest grain (i.e. a grain size below 0.4 mm) from other discs covered with a layer of sand of defined grain size. In this experiment, which was performed in either red (darkroom) light or in total darkness to exclude any visual cues, M. lyra discriminated differences in mean grain size of about 2 mm at a distance of 1.8 m. The echoes reflected from these three-dimensional objects have co-varying complex spatio-temporal and spatio-spectral properties (Schmidt 1988b), which makes it practically impossible to pinpoint the perceptual mechanisms by which the bats solved this discrimination task. A second experiment with acoustically simple, electronically-generated ‘phantom targets’ was therefore devised (Schmidt 1988a, 1992), mimicking the reflection from two parallel planes (Fig. 7.4.1). An example of the bats' performance is given in Figure 7.4.2, depicting the ability of three bats to discriminate a rewarded reference target with a depth of 1.3 mm from test targets of different depths. Two features were remarkable. First, a 75 %-threshold criterion was met for target pairs differing by only about 1 µs in internal delay, equivalent to a depth difference of 0.2 mm. Second, performance was not a monotonic function of the depth of the unrewarded target, but deteriorated for targets with spectral notches harmonically related to the rewarded reference. In further tests where target pairs not containing the reference target were presented for discrimination, the bats still preferred one of them. These results, reflecting the functional mechanisms responsible for target discrimination, were accounted for in a model (Schmidt 1988a, 1992) that is outlined below.
Fig. 7.4.1: Setup for the two-front target discrimination experiment. Three microphones (ML, MC, MR) pick up the sonar calls of the bat at the starting position. The signal from the centre microphone is fed to the phantom target generator, where “echoes” from a target with two reflecting surfaces are electronically simulated and played back through the speaker (S) towards which the bat is oriented (determined by comparing ML and MR). T: arrival time of the “echo's” leading edge relative to call onset. Target properties are defined by the internal delay ti and the reflectivities A1, A2.
Fig. 7.4.2: Performance of three bats (indicated by squares, triangles and circles) discriminating two-front phantom targets from the rewarded reference (ti = 7.77 µs; A1, A2 = 0 dB attenuation) as a function of internal delays (ti), the corresponding distance between two reflective planes (s) and the position of the first notch in the spectral reflectivity of each target (fext, from Schmidt 1988b). Each data point is from at least 30 trials. The solid line is the bats' average performance.
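The relation between internal delay, target depth and spectral notches of such a two-front target can be illustrated numerically. The following is a minimal sketch, not the procedure of the original studies: it assumes a sound speed of 343 m/s in air and two reflections of equal amplitude, and the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value)


def depth_from_delay(delay_s, c=SPEED_OF_SOUND):
    """Distance between the two reflecting planes for a given internal delay.

    The second reflection covers the extra distance twice (there and back),
    hence the factor 1/2.
    """
    return c * delay_s / 2.0


def notch_frequencies(delay_s, f_max_hz):
    """Frequencies up to f_max_hz at which two equal reflections cancel.

    The magnitude spectrum is |1 + exp(-2j*pi*f*delay)| = 2*|cos(pi*f*delay)|,
    which is zero at odd multiples of 1 / (2 * delay).
    """
    first = 1.0 / (2.0 * delay_s)
    notches = []
    k = 0
    while (2 * k + 1) * first <= f_max_hz:
        notches.append((2 * k + 1) * first)
        k += 1
    return notches


reference_delay = 7.77e-6  # s, internal delay of the rewarded reference
print("depth of reference [mm]:", depth_from_delay(reference_delay) * 1e3)
print("depth per microsecond [mm]:", depth_from_delay(1e-6) * 1e3)
print("notches below 200 kHz [kHz]:",
      [f / 1e3 for f in notch_frequencies(reference_delay, 200e3)])
```

Under these assumptions, the reference delay of 7.77 µs corresponds to a depth of about 1.3 mm with a first notch near 64 kHz, and a delay difference of 1 µs corresponds to slightly less than 0.2 mm.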
7.4.3 A broadband spectral analysis model for texture perception
7.4.3.1 Outline of the model
This framework incorporated the following elements: the formation of a dissimilarity measure for targets, based on the extraction of target-specific parameters from the echoes and their comparison with an acquired memory reference; a comparison between the dissimilarity measures of any given target pair; and the decision process as a function of the critical parameter obtained in this comparison. The models tested differ in their assumptions about dissimilarity measure formation. Straightforward time-domain models that assume a monotonic increase of the dissimilarity measure with increasing depth difference between test target and reference fail to account for certain sets of data points (see e.g. Fig. 7.4.3, left column). In contrast, a model based on spectral correlation did provide a unified description of all experimental data (Fig. 7.4.3, right column). This model assumes that the bat analyzes the similarity between spectral components within frequency bands and integrates this information across the broad frequency range covered by the sonar signals. Moreover, the spectral-correlation model was independently corroborated by Mogdans (Mogdans and Schnitzler 1990, Mogdans et al. 1993), who showed that this model also accounts for the performance of Eptesicus fuscus, another bat species using broadband sonar calls, in a different target-structure discrimination experiment. The spectral-correlation model, rather than explaining just one experimental situation, gives a general description of broadband echo analysis, with a number of implications, some of which differ radically from those of other current echolocation models.
Fig. 7.4.3: Perception of two-front targets (modified from Schmidt 1992). In the top graphs, the model parameter functions (dissimilarity measure) are given for a simple time domain model (left) and the spectral correlation model (right). Note that the spectral correlation model function is nonmonotonic for both positive and negative differences in internal delay. The bottom graphs represent the decision behaviour of one bat when assuming the time domain (left) and spectral correlation (right) models. Open symbols and the dashed fit curve refer to experiments in which the reference target was present, closed symbols and the solid fit curve to trials with target pairs not including the reference. Note that the latter decisions must be based on an absolute memory reference of the rewarded target. Systematic deviations of data points from the fit are indicated by arrows.
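The idea of a dissimilarity measure based on spectral correlation can be sketched in a few lines. This is an illustrative toy version only and not the published model: it assumes two reflections of equal amplitude, a hypothetical analysis band of 20 to 110 kHz sampled in 1 kHz steps, and takes one minus the Pearson correlation of the magnitude spectra as the dissimilarity. All names are hypothetical.

```python
import math


def two_front_spectrum(delay_s, freqs_hz):
    """Magnitude spectrum of two equal reflections separated by delay_s."""
    return [2.0 * abs(math.cos(math.pi * f * delay_s)) for f in freqs_hz]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a flat spectrum carries no pattern to correlate
    return sxy / math.sqrt(sxx * syy)


def dissimilarity(test_delay_s, ref_delay_s, freqs_hz):
    """One minus the spectral correlation between test and reference target."""
    return 1.0 - pearson(two_front_spectrum(test_delay_s, freqs_hz),
                         two_front_spectrum(ref_delay_s, freqs_hz))


# Assumed analysis band: 20 kHz to 110 kHz in 1 kHz steps
freqs = [20e3 + 1e3 * i for i in range(91)]
reference_delay = 7.77e-6  # s, rewarded reference target

for test_us in (7.77, 9.0, 12.0, 15.54, 23.31):
    d = dissimilarity(test_us * 1e-6, reference_delay, freqs)
    print(f"test delay {test_us:6.2f} us -> dissimilarity {d:.3f}")
```

Because the measure depends on the overlap of spectral patterns rather than on the delay difference itself, it need not grow monotonically with the depth difference, which is the qualitative property that distinguishes it from a simple time-domain measure.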
7.4.3.2 Implications of the model
First, in contrast to the unified theory of echolocation proposed by Simmons (Simmons et al. 1990, Saillant and Simmons 1993), the spectral-correlation model does not assume that spectral changes in the echoes are transformed to temporal information by the bat's auditory system and integrated in an auditory image of space comparable to a visual space representation (for a discussion of this hypothesis, see Neuweiler and Schmidt 1993). Rather, it is assumed that target fine structure is perceived in terms of independent qualities such as timbre or pitch, irrespective of the target's distance and position in space. Thus, according to the spectral-correlation model, texture perception is based on a general capacity of the mammalian auditory system to analyze complex acoustic stimuli, and not on a specific adaptation to sonar. Second, in order to explain the performance of the bats in trials with target pairs not containing the reference, it is assumed in the model framework that the bats retain a memorized representation of the rewarded target. This would imply that bats have a representation of frequency spectra based on absolute pitch. The auditory categories used by bats to organize complex acoustic stimuli, however, had not been studied before. They are the central issue of the following section. The techniques of human psychoacoustics, acquired in the course of a collaboration project with Zwicker (Schmidt and Zwicker 1991), were adapted to experiments with bats, as described below.
7.4.4 Perceptual categories for auditory imaging
Sonar provides bats with an auditory representation of the environment. The relevant information is contained in the timing as well as in the spectra of the echoes. This chapter addresses the perceptual dimensions available to bats for the analysis of echo spectra and discusses the possible role of these dimensions for structuring the sonar information. In a series of two-alternative, forced-choice behavioural tests (cf. Fig. 7.4.4), we studied the classification of complex tones by M. lyra with respect to their absolute or relative pitch, and the use of collective pitch, spectral pitches and timbre.
7.4.4.1 Prevalence of absolute pitch
The importance of absolute versus relative pitch was established in an experiment in which bats classified the pitch of test tones relative to a constant anchor tone (Sedlmeier 1992). The classification could theoretically be accomplished either by comparing each test tone with the preceding anchor tone and categorizing the tone sequence as rising or falling, i.e. by relative pitch, or by comparison with a memorized representation of the anchor tone, i.e. by absolute pitch. In this task, several lines of evidence indicate that M. lyra referred exclusively to absolute pitch. First, in trials where the frequency of the anchor tone differed from the memorized reference acquired during training, trained bats referred to their memory reference rather than to the physically present anchor tone. Thus bats trained on an anchor tone of 23 kHz classified all test tones strictly with respect to the 23 kHz reference, even when a 22 kHz anchor tone was presented (Fig. 7.4.5a). Second, categorization performance was independent of the presence of the anchor tone (Fig. 7.4.5b). Finally, once a reference pitch had been learned it was memorized for several months. The absolute frequency discrimination around 23 kHz amounted to about 300 Hz, corresponding to a Weber ratio of only 0.013, and was thus comparable to the best discrimination values described for mammals other than man (cf. Fay 1988). In contrast, it proved difficult to train the bats to develop a concept of ‘falling’ versus ‘rising’ two-tone sequences, i.e. relative pitch (Preisler and Schmidt 1995a). Only one out of three animals succeeded in learning this task, and even this bat did not transfer the concept to completely new frequency regions. These experiments clearly show that absolute pitch is a primary, spontaneously-adopted perceptual dimension for M. lyra. A similar preference for absolute pitch has also been reported for various other mammals (May et al. 1989, D'Amato 1988) and songbirds (Hulse and Cynx 1985). Thus, in bats, the primitive ability to perceive absolute pitch may have evolved to an outstanding level. Absolute pitch, combined with a long-lasting memory representation of echo spectra (cf. Fig. 7.4.3 and Schmidt 1992), may form the basis of an acoustical library for the identification of objects.
Fig. 7.4.4: Setup for sound-classification experiments. Stimuli were presented while the bat was at the starting position. In absolute pitch perception experiments, the animal had to classify the pitch of tones relative to a memorized pure tone reference by flying to the appropriate feeding dish. In relative pitch-perception experiments, the bat was trained to indicate the direction of pitch shift in a tone pair. Correct classifications were rewarded with mealworms. For details see Schmidt (1995).
Fig. 7.4.5: Absolute pitch perception by M. lyra (data taken from Sedlmeier 1992). Five bats were initially trained to classify tones relative to a 23 kHz anchor tone. a: Mean categorization performance of these bats for tone pairs consisting of a randomly chosen anchor tone of 23 kHz (filled symbols) or 22 kHz (open symbols) followed by a complex tone as a function of the fundamental frequency of the complex tone. The continuous curve represents the psychometric function obtained in the presence of the 23 kHz anchor tone (marked by the solid vertical line). For comparison, a shifted curve, centred at 22 kHz, is also given (dashed lines). Note that the animals referred to a memorized reference rather than to the anchor tone. In b, the mean categorization performance is compared for complex tones in the presence (filled symbols) and absence (open symbols) of the 23 kHz anchor tone. The same fit curve accounts for both sets of data. This suggests that the bats referred to absolute pitch in both experiments.
7.4.4.2 Collective pitch versus spectral pitches and timbre
In a second series of experiments, we investigated the perceptual dimensions underlying the arrangement of this echo library. Echolocating bats have to extract two types of information from the stream of incoming echoes that are spectrally coloured by reflection from three-dimensional targets. First, they have to relate each echo to its respective call in order to avoid errors in orientation and target identification. Then, object properties have to be derived from the alterations in the echoes. While echo assignment is less critical in species moving in the open air, bats hunting in cluttered surroundings face the situation that the first echoes of a specific call may arrive before later echoes from a previous call, so there should be call markers tolerant to spectral changes. Some bat species alternate calls covering wholly different frequency regions (cf. Obrist 1989), and it can be assumed that the different pitches of these signals help identify the echoes. Like many other species that fly close to vegetation and near the ground, M. lyra uses multi-harmonic sonar calls. It is interesting to note that the most prominent frequency varies from call to call in this species (Schmidt et al. 1997). In humans, multiharmonic tones elicit a pitch perception at the fundamental, also referred to as collective, or virtual, pitch. This pitch is tolerant to a cancellation of single harmonics, and may thus be a promising candidate for a specific call marker. The question is, however: Does fundamental pitch perception exist in M. lyra?
Fig. 7.4.6: Spontaneous classification of complete complex tones. a: Examples of complex tones comprising four harmonics (level of each component: 60 ± 3 dB SPL) with their fundamental below (A) and above (B) a 23-kHz pure tone reference (dashed line). All harmonics were presented at the same level. b: The performance of four M. lyra (% of “High” categorizations) plotted as a function of the fundamental frequency of the complete complex tones. The arrow marks the reference frequency. The dashed lines indicate 25 % and 75 % “High” categorizations.
7.4.4.3 Spontaneous classification of complex tones – evidence for collective pitch perception in the ultrasonic range
In a first experiment to investigate the classification of complex tones (Sedlmeier 1992), bats were first trained to categorize pure tones as higher or lower than a 23 kHz reference tone. Then, complex tones consisting of four harmonics were presented for classification (Fig. 7.4.6a). While two animals spontaneously categorized the complex tones according to the pitch of the fundamental, two others spontaneously classified all complex tones as high, taking the higher harmonics into account (Fig. 7.4.6b). These results suggest that M. lyra may derive collective pitch as well as spectral pitches from complex tones.
In a second experiment, we tested whether M. lyra would also spontaneously refer to the pitch of a missing fundamental (Preisler and Schmidt 1995b, 1998). Three animals were trained to classify complete tones relative to a 33 kHz reference. Then, the classification behaviour was determined for complex tones without the fundamental, and which were ambiguous in pitch (see Fig. 7.4.7a). Such ambiguous tones were presented with a probability of about 25 %. In these trials, all decisions of the bats were rewarded in order to avoid training effects. As in the first experiment, different individuals used different classification criteria (Fig. 7.4.7b-d). It is important to note, however, that each bat stuck to its individual strategy. One bat classified the ambiguous stimuli having missing fundamentals between 5.3 kHz and 28.3 kHz as low (Fig. 7.4.7b), but correctly classified the unambiguous control stimuli as high (D in Fig. 7.4.7a). The nonlinear distortions caused by the complex tones in the cochlea of M. lyra can be expected to be below the auditory threshold (Kössl 1992). Therefore we conclude that M. lyra may spontaneously refer to a collective pitch in the ultrasonic range.
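Why a tone with a missing fundamental is ambiguous relative to the reference can be made explicit with a small sketch. The example tone (fundamental of 20 kHz, harmonics 2 to 4 present) is a hypothetical stimulus chosen to lie within the range described above, and the function names are illustrative only.

```python
def harmonic_complex(f0_hz, harmonic_numbers):
    """Partial frequencies of a complex tone built on the fundamental f0_hz."""
    return [n * f0_hz for n in harmonic_numbers]


def classify(partials_hz, f0_hz, reference_hz):
    """Return the category under two candidate decision criteria.

    First entry: collective pitch (the fundamental, even if absent).
    Second entry: spectral pitch of the lowest physically present partial.
    """
    by_collective = "high" if f0_hz > reference_hz else "low"
    by_lowest_partial = "high" if min(partials_hz) > reference_hz else "low"
    return by_collective, by_lowest_partial


REFERENCE_HZ = 33e3  # reference used in the second experiment

# Hypothetical ambiguous test tone: fundamental of 20 kHz removed
ambiguous = harmonic_complex(20e3, [2, 3, 4])
print(ambiguous, classify(ambiguous, 20e3, REFERENCE_HZ))

# Complete tone with a fundamental above the reference: both criteria agree
complete = harmonic_complex(40e3, [1, 2, 3, 4])
print(complete, classify(complete, 40e3, REFERENCE_HZ))
```

For the ambiguous tone the two criteria disagree (low by collective pitch, high by the lowest partial), so the animal's spontaneous choice reveals which criterion it applies.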
Fig. 7.4.7: Spontaneous classification of incomplete complex tones by M. lyra. a: Examples for low (A) and high (B) complete complex tones, ambiguous incomplete test tones (C) and unambiguous incomplete controls (D, E). Physically present harmonics are represented by solid lines, missing harmonics by dotted lines. The dashed line represents the 33 kHz reference. b-d: Spontaneous categorization performance (% High Categorizations) of three individuals for complex tones presented at a level of 47 ± 3 dB (modified from Preisler and Schmidt 1995b). Open squares represent choices for stimuli of types A and B, downward triangles for type C and upward triangles for types D and E. As in all subsequent figures, each data point is based on at least 30 trials. The reference tone is marked by the vertical arrow. Note that each bat adopts a consistent classification strategy for tones of type C.
In contrast, the two other animals spontaneously classified all ambiguous stimuli (C in Fig. 7.4.7a) as high (Fig. 7.4.7c–d), i.e. they referred to the spectral pitches of the complex tones. Bat 3 (Fig. 7.4.7d) categorized one of the unambiguous control stimuli (E in Fig. 7.4.7a) as high. In a comparable experiment performed at a test tone level of 65 dB SPL, this individual even categorized both control stimuli as high. This suggests that this bat may have used the common timbre of the incomplete tones for categorization. The spontaneous classification experiments show that M. lyra can pay attention to both the collective and spectral pitches of complex tones. Unlike in man (Ritsma 1967) and other mammals (Heffner and Whitfield 1976, Tomlinson and Schwarz 1988), where a collective pitch exists only at relatively low frequencies (below 2 kHz in man), collective pitch in M. lyra extends to the ultrasonic range, with interesting implications for current models of pitch perception (cf. discussion in Preisler and Schmidt 1998). In short, it is highly improbable that a temporal mechanism, as assumed by most current models of pitch perception, is responsible for collective pitch perception in M. lyra. Rather, it seems that purely spectral models (e.g. as proposed by Terhardt 1978 or Wightman 1973) may account for collective pitch formation in this species. It is a prerequisite of these models that the harmonics of the complex tones are spectrally resolved by the inner ear. This brings up the question of whether this condition is met by the auditory system of M. lyra.
7.4.4.4 Frequency resolution and auditory filter shape in M. lyra
Two experiments were performed in order to characterize the width and the shape of the auditory filters in M. lyra, defined by the frequency range within which simultaneously-present spectral components are analyzed together and can mask each other. The width of the auditory filters was assessed by determining masked thresholds for pure tones in broadband noise, i.e. critical ratios (CR; Sedlmeier and Schmidt 1989). The CR widths increased monotonically (steepness: about 6 dB/octave) as a function of test-tone frequency from about 1.5 kHz at 25 kHz to about 13.5 kHz at 90 kHz (Fig. 7.4.8). A direct determination of the critical band (CB) around a centre frequency of 64.5 kHz by the method of band-narrowing yielded a bandwidth identical to the CR-value at that frequency (Schmidt, unpublished). Therefore, in this species, the CR-values can be considered to represent the CB. Auditory filter shape was characterized by determining masked thresholds for pure tones shifted in frequency around a masker of an approximate width of one CB (Schmidt 1993, Schmidt et al. 1995). Data were obtained at several masker levels for maskers with centre frequencies of 23, 30, 40, and 64.5 kHz, i.e. in the region of the first three harmonics of the sonar calls of M. lyra. The filter shapes obtained (cf. Fig. 7.4.9) were symmetrical around the masker centre frequency at all masker levels tested, and their slopes were between 40 and 100 dB/CB. Thus, compared to man (cf. Zwicker and Fastl 1990), the filter slopes in M. lyra are exceedingly steep, and the symmetry of the auditory filters is preserved at high spectrum levels of the masker.
Fig. 7.4.8: Critical ratio function in M. lyra. Mean critical ratios (left ordinate [dB], right ordinate [kHz]) from four bats plotted as a function of test tone frequency [kHz]. Vertical bars indicate the standard error. Masked thresholds were determined at two masker spectrum levels (l = 20 dB, circles; l = 26 dB, diamonds). The data point at 100 kHz was not included in the regression analysis as it is influenced by the high threshold in quiet at this frequency.
Fig. 7.4.9: Auditory filter shape in M. lyra. Masked thresholds [dB SPL] of one bat for pure tone pulses in narrow-band noise of two spectrum levels (l = 0 dB and l = 30 dB) plotted as a function of test-tone frequency [kHz]. Open symbols are masked thresholds, closed symbols thresholds in quiet from the same animal. The error bars give the standard error of the thresholds. The arrow marks the centre frequency of the masker, indicated by the hatched area (-3 dB bandwidth: 7.5 kHz).
Both characteristics may result from a central sharpening of the frequency tuning reported for M. lyra (Rübsamen et al. 1988), and, in combination with the narrow CBs, guarantee the clear spectral separation of the harmonic components of the echoes required for collective pitch formation. Our results in M. lyra may thus be taken as further evidence that spectral theories are sufficient to account for collective pitch formation in the mammalian auditory system.
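The critical ratios discussed in this section are expressed both in dB and as bandwidths in kHz (cf. the two ordinates of Fig. 7.4.8). The conversion between the two can be sketched as follows, assuming the usual definition of the critical ratio as masked threshold minus noise spectrum level and the classical equal-power assumption at threshold. The function names and the example threshold of 52 dB are hypothetical.

```python
import math


def critical_ratio_db(masked_threshold_db, noise_spectrum_level_db):
    """Critical ratio: tone level at masked threshold minus noise spectrum level."""
    return masked_threshold_db - noise_spectrum_level_db


def bandwidth_from_cr(cr_db):
    """Equivalent bandwidth in Hz, assuming tone power equals the noise power
    within the band at threshold."""
    return 10.0 ** (cr_db / 10.0)


def cr_from_bandwidth(bandwidth_hz):
    """Inverse conversion: bandwidth in Hz to critical ratio in dB."""
    return 10.0 * math.log10(bandwidth_hz)


# Bandwidths quoted in the text for 25 kHz and 90 kHz test tones
for bw_hz in (1.5e3, 13.5e3):
    print(f"{bw_hz / 1e3:5.1f} kHz corresponds to {cr_from_bandwidth(bw_hz):.1f} dB")

# Hypothetical measurement: threshold of 52 dB in noise of 20 dB spectrum level
cr = critical_ratio_db(52.0, 20.0)
print(f"CR = {cr:.0f} dB corresponds to {bandwidth_from_cr(cr):.0f} Hz")
```

Under these assumptions, bandwidths of 1.5 kHz and 13.5 kHz correspond to critical ratios of roughly 32 dB and 41 dB respectively.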
7.4.4.5 Perception of spectral pitches – corroboration of the broadband spectral-analysis model
In the spontaneous pitch classification experiments described above, some bats consistently referred to the spectral pitches of the complex tones. These experiments, however, are not conclusive with respect to whether the bats referred to a single component, e.g. the pitch of the lowest partial, or whether they perceived a more complex quality derived from the broadband spectral structure of the test tones. The following two experiments aimed at a description of the perceptual dimensions involved in spectral pitch perception. The first experiment was designed to find out whether and how bats that spontaneously adopt one mode of perception would adjust their decision strategy if systematically rewarded according to a different classification criterion. Bats initially classifying complex tones with a missing fundamental below the reference as high could indeed be trained to categorize such stimuli as if they perceived a collective pitch (Sedlmeier 1992, Krumbholz and Schmidt 1997, 1999). This performance, however, was not due to a change in the initially-adopted classification criterion (see Fig. 7.4.10b). Although the bats associated a sample of missing fundamental training stimuli with a certain feeding dish and transferred their decision behaviour to new test tones of similar structure (cf. choices for training and class 1 stimuli in Fig. 7.4.10b), they failed to categorize less similar (class 2) test stimuli according to their collective pitch. Moreover, inharmonic (class 3) stimuli without defined collective pitch were consistently associated with the ‘low pitch’ feeding dish. It must be concluded that the bats referred to a different classification criterion. The only general criterion differing for the ‘low’ and ‘high’ missing fundamental training stimuli was the pitch of the lowest partial which, however, was not used for classification (Fig. 7.4.10c).
If, on the other hand, we assume in analogy to texture perception by sonar that the bats memorized the broadband spectral structure of the sample of training tones and categorized the test tones according to their similarity with the training sample, the classification behaviour of all three bats is explained for all stimulus classes (Fig. 7.4.10d–f). In this case, the sensory quality related to this broadband spectral analysis, in analogy to the timbre of a vowel (Chistovich and Lublinskaya 1979), will depend on the number and the average frequencies of the partials in the complex tone. It may be conjectured that timbre is indeed a major quality referred to for the classification of complex tones.
Fig. 7.4.10: Broadband perception of spectral pitches (modified from Krumbholz and Schmidt, 1999). a: Two examples with lower or higher collective pitch than the reference are shown for each of three classes of test stimuli. The dash-dotted line is the 23 kHz reference. Physically present harmonics are represented by solid lines, missing components by dotted lines. In class 1 stimuli, only the fundamental is missing, in class 2 the number of missing components is variable. Class 3 are inharmonic stimuli whose components are shifted by 50 % of the fundamental frequency. They are shown with their respective unshifted harmonic complexes. b-d: Classification performance of bat 1 assuming different decision criteria. Filled squares, classification performance for the training stimuli; open circles, class 1, open triangles, class 2 and open stars, class 3 stimuli. Performance for class 3 stimuli is plotted at f0/Fref = 1.0, since random classification is expected when assuming criterion A for stimuli without a defined collective pitch. Systematic deviations from a monotonic decision function are shown by arrows. While systematic deviations are obvious when assuming that the bat uses the collective pitch (A) or the pitch of the lowest partial (B) as decision criteria, all data points can be accounted for by assuming the broadband spectral similarity criterion (C). e, f: This also holds true for bats 2 and 3.
7.4.4.6 Focused perception of single partials in complex tones
The use of timbre, however, does not rule out that the bats are able to focus on the spectral pitches of single partials, as was recently shown (Krumbholz and Schmidt 1998). The frequency-difference limen for the third partial in a complex tone with four partials was unaffected by frequency shifts in one or several of the other harmonics (see Fig. 7.4.11) when the bats had previously been trained to a pitch classification task using pure tones with frequencies close to the third partial of the complex tones. This shows that M. lyra is indeed able to selectively pay attention to the spectral pitch of single signal components.
Fig. 7.4.11: Frequency difference limen for a single partial in complex tones. a: Examples for the four sets of test tones. Animals were initially trained to classify pure tone stimuli (set 1) as lower or higher than a pure tone reference at 64.5 kHz (dash-dotted line). Then, frequency difference limens for the third partial in complex tones were determined. In test tone set 2, only the frequency of the third partial was varied. Partials one, two and four were also coherently shifted (filled head arrows) by a random amount in set 3 (range = vertical bar). In set 4, only one partial was shifted randomly (open headed arrows). b–d: Discrimination performance of three bats for complex tones of set 2 (open squares), set 3 (filled circles) and set 4 (filled triangles). Note that all data points are represented by the fit curve approximated to the data points of set 2. This shows that random shifts of the other partials did not influence the perception of the third partial.
To sum up, we have now established collective pitch, timbre and the spectral pitch of a single component as perceptual dimensions along which the perception of complex tones is organized in M. lyra. It can be conjectured (cf. the discussion relating the perception of quasi-static complex tones to sonar echo analysis in Krumbholz and Schmidt 1999) that the harmonically-structured sonar call echoes are classified according to the same dimensions. In an echolocation context, the collective pitch of the echo may allow the bats to relate each echo to its respective call. The timbre of the call echoes as well as the spectral pitch of single harmonics, both depending on target reflectivity, may on the other hand be used by the bats as a source of information about object structure.
7.4.5 Psychoacoustical frequency and time-domain constants in bats
The quality of the acoustical images obtained by sonar depends critically on the performance of the bats' auditory system. In this chapter, some psychoacoustical measures characterizing frequency and time processing in different bat species are given. Furthermore, the possible effect of the active echolocation situation compared to mere passive listening is considered in an experiment on summation time constants.
7.4.5.1 Discrimination performance in the frequency domain
Comparative studies of frequency resolution and discrimination
Frequency-difference limens and critical ratios of M. lyra were discussed above. Both measures were quite sensitive, but not unlike those of other mammals with good high-frequency hearing. Similar results were also found in two comparative studies with two bat species, Tadarida brasiliensis and Eptesicus fuscus, which use frequency-modulated (FM) sonar calls that typically consist of only two harmonics but are longer than those of M. lyra. Critical ratios obtained from T. brasiliensis (Schmidt et al. 1990) revealed a relatively broad tuning in this species, with a constant CR-width of 10 kHz below 20 kHz and a bandwidth increasing by about 4 dB/octave for frequencies up to 100 kHz. In E. fuscus, frequency-difference limens for a reference frequency of 25 kHz were determined (v. Stebut and Schmidt 1997). The discrimination thresholds for the three individuals tested were between 180 Hz and 600 Hz, corresponding to Weber ratios of 0.007 and 0.024. E. fuscus emits narrowband search calls at about 25 kHz. On account of sharply-tuned neurons found in the inferior colliculus of this species in this frequency range, it had been postulated that these neurons are specialized for detecting the bats' search signals (Casseday and Covey 1992). However, the frequency-discrimination performance of E. fuscus did not reflect any kind of specialization. On the contrary, the average performance was similar to that of M. lyra and other small mammals (cf. Fay 1988).
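The Weber ratios quoted for M. lyra and E. fuscus follow directly from the frequency-difference limens and the reference frequencies. A minimal check, with a hypothetical helper function:

```python
def weber_ratio(delta_f_hz, f_hz):
    """Weber ratio: frequency-difference limen divided by reference frequency."""
    return delta_f_hz / f_hz


# M. lyra: about 300 Hz at 23 kHz
print(round(weber_ratio(300.0, 23e3), 3))   # 0.013

# E. fuscus: 180 Hz and 600 Hz at 25 kHz
print(round(weber_ratio(180.0, 25e3), 3))   # 0.007
print(round(weber_ratio(600.0, 25e3), 3))   # 0.024
```

Expressing the limens as ratios makes values obtained at different reference frequencies directly comparable across species.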
Dynamic frequency discrimination in T. brasiliensis
Echolocating bats hunting flying insects have to analyze target echoes whose frequency characteristics are cyclically modulated by the insects' wingbeats, thus creating complex spectro-temporal patterns. The dynamic aspects of frequency discrimination in T. brasiliensis were studied by determining the discrimination thresholds for sinusoidally frequency-modulated stimuli of different modulation rate from pure tones at 40 kHz, corresponding to the narrow-band search-call frequency in this species (Bartsch and Schmidt 1993). The discrimination performance deteriorated with increasing modulation frequency, from 1.58 kHz (WR = 0.04) at a modulation rate of 10 Hz, to about 3 kHz (WR = 0.07) at a modulation rate of 2 kHz. Frequency modulation thresholds in T. brasiliensis are thus considerably larger than those in the highly-specialized bat species that use constant-frequency (CF) call components. However, they agree well with the frequency-discrimination values obtained for E. fuscus and non-echolocating mammals. Again, the ability to echolocate is not reflected in the modulation thresholds. How, then, do fluttering targets appear to FM-bats? Since the calls of FM-bats are often too short and their repetition rates too low to convey reliable information about wingbeat cycles, it can be assumed that the bats perceive ‘still images’ that differ in consecutive echoes in their lowest frequency and the broad-band spectra. The above two experiments suggest that the dynamic frequency discrimination performance may not be sufficient to permit an analysis based on the lowest frequency of the echo.
7.4.5.2 Time constants in echolocating bats
For echolocating animals that use the stream of call echoes as a primary source of information about the environment, the time constants in the auditory system constitute important perceptual constraints.
On the one hand, an independent processing of successive sonar calls and their echoes presupposes a sufficiently short minimum integration time, so that even the call-echo sequences of very high repetition rate occurring during the final stages of target approach, and multiple echoes of a call from spatially separate objects, can be resolved. On the other hand, the detection of weak echoes of sonar calls with typical durations in the millisecond range critically depends on the ability of the bats’ auditory system to integrate sound energy over the duration of their calls. In addition, an integration time constant adapted to call duration may be essential for processing the call echo as a single, simultaneous event, as is assumed by the broad-band spectral-analysis model discussed above. The integration time constants relevant to these different acoustical situations may be based on different auditory mechanisms. In fact, the time constants determined in man for the resolution of two stimuli, e.g. in gap detection, or for the detection of click pairs, differ by two orders of magnitude from those determined for the detection of stimuli with variable signal duration (de Boer 1985, Zwicker and Fastl 1990, Viemeister and Wakefield 1991). We studied different time constants that are potentially relevant for echo perception in bats, using various adaptations of a two-alternative, forced-choice experimental procedure.
Gap detection and temporal order perception in T. brasiliensis
In a first resolution experiment, gap detection for white-noise pulses and narrow-band noise pulses of different centre frequencies was studied in T. brasiliensis (Nitsche 1993a, 1993b). Gap-detection thresholds amounted to 2 ms for the white-noise stimuli, and increased from 10 ms to about 90 ms for narrow-band stimuli of increasing spectral separation between the stimuli preceding and following the gap. This performance is comparable to the gap-detection thresholds in man and birds (see Sect. 7.1.4) and does not reflect a specialization to echolocation. Moreover, experiments on the perception of temporal order in this species revealed that the animals can perceive frequency-modulated stimuli of up to 6 ms duration, resembling their sonar calls, as a single, simultaneous event (Nitsche 1992, 1994). This corroborates the hypothesis, assumed in the spectral-correlation model, that the bats tend to evaluate a call echo in terms of the average qualities of its frequency components, rather than in terms of their temporal fine structure.
Temporal integration for tone-pip pairs in M. lyra
The minimum integration time found in bats, however, is about one order of magnitude shorter than the gap-detection thresholds (Wiegrebe and Schmidt 1996). In an experiment with M. lyra, auditory thresholds were determined for pairs of short tone pips with centre frequencies in the lower and upper half of the hearing area of this species as a function of the time delay between the pips (Fig. 7.4.12). The time constants determined with this method amounted to only about 220 µs for pip pairs in both frequency ranges. Similarly short time constants have been reported for click-pair integration in the echolocating dolphin, Tursiops truncatus (Au et al. 1988) and for clutter interference in E. fuscus (Simmons et al. 1989).
Yet these time constants, which are short compared to those of about 3 ms for click-pair integration in humans (Viemeister and Wakefield 1991), may again be a consequence of the different hearing ranges in man and echolocating animals, rather than an adaptation to sonar. We point out, however, that the time constant for the detection of click pairs that were coupled to the bats’ sonar emissions in E. fuscus was 2.4 ms, i.e. in the range of the bats’ call durations (Surlykke and Bojesen 1996).
Fig. 7.4.12: Detection thresholds of three M. lyra (shown as triangles, squares and circles) for pairs of 60-µs tone pips with a centre frequency of 72 kHz as a function of the delay between the two tone pips (modified from Wiegrebe and Schmidt 1996). Thresholds of each bat are given relative to the individual threshold in quiet determined for single 60-µs tone pips. The solid line represents the mean threshold differences, the dashed line at -3 dB the threshold difference expected for complete intensity integration for the pip pairs.
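The dashed line at -3 dB in Fig. 7.4.12 corresponds to the threshold change expected when the intensities of two equal pips are summed completely. The Python sketch below works through that arithmetic under the assumption of ideal, loss-free summation of identical pulses; the function name is illustrative only.

```python
import math


def complete_integration_shift_db(n_pulses: int) -> float:
    """Threshold change in dB if the intensity of n equal pulses is fully summed.

    Each pulse may then be lower in level by 10*log10(n) dB and the summed
    intensity still reaches the single-pulse threshold.
    """
    if n_pulses < 1:
        raise ValueError("at least one pulse is required")
    return -10.0 * math.log10(n_pulses)


# A pair of tone pips, as in the experiment with M. lyra.
print(f"expected shift for a pip pair: {complete_integration_shift_db(2):.2f} dB")
```

Measured thresholds that approach this value at short delays and return towards 0 dB at longer delays indicate the delay range over which the two pips are integrated.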
Temporal summation in passive acoustic and echolocation situations
Auditory thresholds improve with increasing signal duration for signals shorter than a critical value, a phenomenon referred to as temporal summation. Temporal summation was studied in two bat species with different echolocation strategies, T. brasiliensis and M. lyra. In contrast to the gleaning bat, M. lyra, which is known to use exclusively very short (below 2 ms), multiharmonic, frequency-modulated sonar calls, T. brasiliensis adapts the frequency range and durations of its calls to the behavioural context. When hunting in open spaces, this species emits narrow-band search calls of up to 50 ms duration. It could be argued that this difference in echolocation behaviour should be reflected in the time constants for temporal summation. In T. brasiliensis, the summation time constants for 40 kHz pure-tone signals, i.e. in the frequency range of the search calls, were determined for two levels of background noise (Schmidt and Thaller 1994). Time constants amounted to 62 ms in the undisturbed condition and only 14 ms in the presence of moderate background noise. The thresholds decreased by more than 10 dB per decade of duration even when taking spectral splatter into account, suggesting a complete intensity summation for low-level background noise and an over-integration of sound intensity for the higher masker level. These features of the temporal-summation function of T. brasiliensis favour the detection of targets even in noisy or cluttered conditions, and could thus be understood as an adaptation to echolocation. In M. lyra, the temporal summation function for 72 kHz pure-tone pulses, i.e. in the frequency range of the (often very prominent) third harmonic of the sonar calls, was similarly steep, but the time constant amounted to only 9 ms (Heinze et al. 1996).
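To illustrate what a summation time constant implies for detection thresholds, the following Python sketch evaluates a simple leaky-integrator model with the time constants quoted above. The exponential form used here is an assumption made for illustration and is not necessarily the integration function fitted in the original studies. Its slope never exceeds 10 dB per decade of duration, so it cannot reproduce the over-integration reported for the higher masker level.

```python
import math


def threshold_shift_db(duration_ms: float, tau_ms: float) -> float:
    """Threshold elevation in dB relative to the long-duration threshold.

    Assumed model: a leaky integrator with time constant tau, for which the
    threshold intensity scales as 1 / (1 - exp(-T / tau)).
    """
    if duration_ms <= 0 or tau_ms <= 0:
        raise ValueError("duration and time constant must be positive")
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return -10.0 * math.log10(-math.expm1(-duration_ms / tau_ms))


# Time constants (ms) as quoted in the text.
time_constants_ms = {
    "T. brasiliensis, 40 kHz, quiet": 62.0,
    "T. brasiliensis, 40 kHz, noise": 14.0,
    "M. lyra, 72 kHz": 9.0,
}
for label, tau_ms in time_constants_ms.items():
    shifts = [threshold_shift_db(t, tau_ms) for t in (1.0, 10.0, 100.0)]
    print(label, [f"{shift:.1f} dB" for shift in shifts])
```

In this model a shorter time constant means that the threshold reaches its long-duration value at shorter signal durations.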
This discrepancy in the time constants determined for the two species, however, must not be interpreted as an adaptation to the typical sonar call durations. In fact, in a recent series of experiments, we found time constants of 60 ms for broad-band noise pulses (Heinze et al. 1996) and 51 ms for stimuli of spectral composition similar to the typical approach-phase calls in M. lyra (Fig. 7.4.13, free-running; Schmidt and Schillinger, unpublished data). These results are compatible with those for the 72-kHz stimuli if we consider the typical shortening of summation-time constants with increasing stimulus frequency (cf. Fay 1988) and assume that the longer time constants of the low-frequency components determine the compound time constant. On the basis of the above data, it must be concluded that the summation-time constants in bats do not differ significantly from those of other, non-echolocating mammals with high-frequency hearing.
Finally, we were able to show that this conclusion also holds true in an echolocation context, i.e. when the test stimuli used to determine the temporal-summation functions are not presented free-running, but coupled to the bats’ own sonar emissions. For test stimuli presented 8.5 ms after a sonar call, corresponding to a target distance of 1.44 m, the summation time constant in M. lyra amounted to 62 ms, i.e. it was practically identical to those obtained with free-running broadband stimuli. Mean auditory thresholds in the active-echolocation and free-running conditions were equal for test-tone durations above 32 ms (cf. Fig. 7.4.13). For shorter stimuli, thresholds in the active echolocation situation deteriorated progressively compared to those in the free-running condition. In a current experiment on forward masking in M. lyra (Siewert and Schmidt, unpublished), detection thresholds for call-like signals were affected by a preceding broadband masker for about 40 ms. This suggests that the threshold differences described above can indeed be attributed to a masking of the test stimuli by the preceding sonar pulses.
Thus, in contrast to the idea of echolocation-specific processing time windows, postulated early on the basis of neurophysiological findings (Suga 1970, Sullivan 1982) and seemingly established in behavioural experiments (Roverud and Grinnell 1985), the psychoacoustical time constants we determined do not indicate specific adaptations to sonar. In view of the similar performance of bats and non-echolocating mammals in psychoacoustic experiments, it seems that echolocation is based less on specific adaptations than on an efficient analysis of echo information using common mechanisms of the mammalian auditory system.
Fig. 7.4.13: Temporal summation for multiharmonic stimuli that are spectrally similar to sonar calls of M. lyra. Mean thresholds [dB SPL] and standard errors of the mean for four bats are given as a function of the equivalent rectangular duration [ms] of the stimuli. Triangles indicate threshold values obtained in an active echolocation paradigm in which a stimulus pulse was presented 5 ms after each sonar call emission from the bat. Circles represent threshold values measured for free-running stimuli. The same stimulus durations were used in both conditions; some data points are slightly shifted to avoid overlap. The solid lines give the best fits for an intensity integration function (modified after Feldtkeller and Oetinger 1956, cf. Schmidt and Thaller 1994).
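The correspondence between the stimulus delay after a sonar call and the simulated target range follows from the two-way travel time of sound. The Python sketch below shows the conversion; the speed of sound of 340 m/s is an assumed round value for air, not a figure taken from the experiments, and the function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # assumed value for air at room temperature


def delay_to_distance_m(delay_s: float,
                        speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Target distance for a given echo delay (sound travels out and back)."""
    return 0.5 * speed_m_per_s * delay_s


def distance_to_delay_s(distance_m: float,
                        speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Echo delay for a target at the given distance."""
    return 2.0 * distance_m / speed_m_per_s


# Delay used for the call-coupled test stimuli in M. lyra.
delay_s = 8.5e-3
print(f"{delay_s * 1e3:.1f} ms delay -> {delay_to_distance_m(delay_s):.3f} m")
```

With this assumed speed of sound, the 8.5 ms delay lands within a centimetre of the 1.44 m quoted in the text; the small remainder depends on the exact value chosen for the speed of sound.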
8 The human aspect I: Human psychophysics
8.1 The presentation of stimuli in psychoacoustics
Hugo Fastl
The main goal of psychoacoustics is to describe quantitatively the relationships between physically well-defined stimuli and the hearing sensations they evoke. For the presentation of stimuli, the transducers used play a crucial role. Headphones available on the market usually do not produce a flat frequency response on a human ear (Fastl and Schorer 1986). Rather, equalizing networks have to be used. Therefore, for several headphones applied in psychoacoustics, free-field equalizers were developed and realized (Fastl and Zwicker 1983, Fastl et al. 1985b, Schorer 1986). We showed that even for the same sound pressure in the ear canal, loudspeaker presentation versus headphone presentation can elicit a different loudness sensation (Fastl et al. 1985). Some of the main results are compiled in Figure 8.1.1. The level difference ∆L measured in the ear canal for the same loudness perception when presenting sounds by headphones versus loudspeaker is given as a function of the test-tone frequency fT. Data for eight normal-hearing subjects are plotted as medians with interquartiles. Filled circles represent data for a closed electrodynamic headphone, squares for an open electrostatic headphone. The data displayed in Figure 8.1.1 reveal that the level measured in the ear canal can differ for the same loudness perception. For frequencies up to about 1 kHz, both a closed electrodynamic headphone and an open electrostatic headphone have to produce about 5 dB more level in the ear canal than a loudspeaker in the anechoic chamber to elicit the same loudness perception. For frequencies around 3 kHz, the same level in the ear canal by headphone or loudspeaker also means the same loudness. At high frequencies (10 kHz), headphones again have to produce 5 dB more sound pressure level in the ear canal than loudspeakers to produce the same loudness perception.
Therefore, the ‘correct’ frequency response of headphones for psychoacoustic experiments can be obtained only by loudness matches (Fastl 1986). When sounds are presented via loudspeakers, frequently only the amplitude response but not the phase response is considered. However, the respective phase may have a dramatic effect on psychoacoustic results, in particular for line spectra (Krump 1996).
Fig. 8.1.1: Presentation of sounds by headphones versus loudspeaker. Difference ∆L of the level measured in the ear canal for presentation via headphones versus loudspeaker for same loudness perception. Filled circles: closed electrodynamic headphone; squares: open electrostatic headphone.
8.2 Masking effects
Hugo Fastl
8.2.1 Post masking (forward masking)
The dependence of post masking on a large variety of stimulus parameters was studied. Post masking clearly depends on the duration of the masker (Zwicker 1984). Interactions in the post masking effect of two maskers can be interpreted in the context of suppression effects (Bechly 1983, Fastl and Bechly 1983). These effects also show up when post masking of a narrow-band noise is compared to post masking of a cubic difference noise at the same frequency (Fastl 1987a). The dependence of post masking on the duration of the test-tone impulse was studied and implemented in a model. The spectral distribution of maskers clearly influences the correlated post masking functions (Gralla 1993a). Some of the results are compiled in Figure 8.2.1. The level LT* of the just-audible test-tone impulse is given as a function of the spectral distribution of the maskers, expressed in critical bands. Figure 8.2.1a shows results for a single-sided extension of the masker bandwidth. Starting from a noise one critical band wide, more critical bands are added either at higher (positive values of the abscissa) or lower frequencies. In all three frequency ranges considered (4.5 Bark, 8.5 Bark, and 17.5 Bark), a spectral broadening of the masker leads to a decrease in the post-masking threshold. This holds for an extension of the masker towards both higher and lower frequencies. Figure 8.2.1b gives the results for the case in which a second single critical-band masker is added to a central critical-band masker at a spectral distance zx – z0. In particular, in cases where the additional critical-band noise is directly adjacent to the central critical-band masker (zx – z0 = ±1 Bark), a clear drop in the post-masked threshold is visible. On the other hand, the addition of a critical-band masker at a distance of ±4 Bark from the central masker hardly influences the post-masked threshold. Figure 8.2.1c shows the results for the case in which the masker is extended symmetrically around a central critical-band masker. Up to a masker bandwidth of 5 Bark, the post-masked threshold drops significantly by up to 15 dB, but stays constant for larger masker bandwidths.
Fig. 8.2.1: Post-masking of pure tones as a function of the spectral distribution of the masker. Starting from a central masking band, the masker is a: Extended towards lower or higher frequencies; b: Combined of two critical band maskers in a distance zx - z0; c: Extended symmetrically around z0 leading to a total bandwidth ∆z. Data are given for z0 = 4.5 Bark, 8.5 Bark, and 17.5 Bark. Temporal and spectral distributions are illustrated by the insets.
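The masker positions in Fig. 8.2.1 are given on the critical-band rate scale in Bark. For readers who prefer frequencies in Hz, the following Python sketch converts between the two scales using the analytical approximation published by Zwicker and Terhardt (1980). That formula is brought in here for illustration and is not quoted in this report; the bisection inverse and the function names are likewise illustrative choices.

```python
import math


def hz_to_bark(frequency_hz: float) -> float:
    """Critical-band rate in Bark (Zwicker and Terhardt 1980 approximation)."""
    f_khz = frequency_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


def bark_to_hz(z_bark: float, lo_hz: float = 0.0, hi_hz: float = 20_000.0) -> float:
    """Invert hz_to_bark by bisection (the mapping increases monotonically)."""
    if not hz_to_bark(lo_hz) <= z_bark <= hz_to_bark(hi_hz):
        raise ValueError("critical-band rate outside the searched range")
    for _ in range(60):
        mid_hz = 0.5 * (lo_hz + hi_hz)
        if hz_to_bark(mid_hz) < z_bark:
            lo_hz = mid_hz
        else:
            hi_hz = mid_hz
    return 0.5 * (lo_hz + hi_hz)


# Centre positions z0 of the maskers used in Fig. 8.2.1.
for z0 in (4.5, 8.5, 17.5):
    print(f"{z0} Bark is roughly {bark_to_hz(z0):.0f} Hz")
```

As a plausibility check, the approximation places 1 kHz close to 8.5 Bark.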
The results suggest that comodulation masking release (CMR) can be traced back to effects of post-masking. A model quantitatively describing the effects of post-masking and related CMR-effects was realized and implemented (Gralla 1993b).
8.2.2 Binaural Masking-Level Differences
In a series of experiments, binaural masking-level differences (BMLDs) were studied as a function of relevant stimulus parameters. For combinations of broadband noise and test-tones at 400 Hz or 800 Hz, the following results were obtained using a Békésy-tracking procedure: The masked threshold clearly depends on the duration of the test-tone. With decreasing test-tone duration, masked thresholds increase. In contrast, however, the magnitude of the BMLD is nearly independent of the test-tone duration. On the other hand, the BMLD increases from about 5 dB to 10 dB when the masker duration is increased from 10 ms to 200 ms (Zwicker and Zwicker 1984a). For non-simultaneous masking, binaural masking-level differences were measured with a broadband masker and test-tone impulses at 400 Hz or 800 Hz (Zwicker and Zwicker 1984b). For post-masking, the magnitude of the BMLD decreases with increasing delay time. This means that the temporal course of the BMLD is similar to the familiar time pattern for post-masking. The same holds true for pre-masking: For test-tone impulses close to the following masker impulse, large values of the BMLD show up, whereas for test-tones presented 20 ms before the start of the masker, a BMLD of only a few dB is found. For gated broadband maskers, the temporal pattern of the BMLD is also similar to the temporal masking pattern. In other words, in temporal regions where masker energy is present, large (12 dB) values of the BMLD show up, whereas in temporal gaps only small BMLDs are found. When the time function of a masker is considered instead of its temporal envelope, a transition from temporal-masking patterns to masking-period patterns is obtained. Using low-frequency (40 Hz) maskers and short (4 ms) test-tone bursts at 800 Hz, almost no BMLD is obtained despite the fact that the resulting masking-period patterns show a variation of more than 20 dB.
As a function of masker bandwidth, BMLDs reach a shallow maximum at a bandwidth around 30 Hz. With increasing masker level, the magnitude of BMLDs increases (Henning and Zwicker 1984). For narrow-band maskers, effects of BMLDs were tested not only for frequencies in a spectral region around the maskers, but also for lower or higher frequencies (Zwicker and Henning 1984). Some of the results are illustrated by Figure 8.2.2. For a masker bandwidth of 31.6 Hz, masking was measured for the situations of masker and signal in phase (M0S0), the situation with the masker reversed in phase (MπS0), as well as the situation with the signal reversed in phase (M0Sπ). Figure 8.2.2a shows the masking patterns for a masking-level of 75 dB, Figure 8.2.2b for a level of 55 dB. The results displayed in Figure 8.2.2 clearly show the classic nonlinearity of masking patterns for the situation M0S0, i.e. steep lower slope and flatter upper slope at high masker levels versus a more symmetrical masking pattern at lower masker level. As compared to the standard in-phase situation M0S0, for test-frequencies centred in the masker, a phase reversal of the masker (MπS0) leads to a drop in threshold by some 15 dB, whereas a phase reversal of the signal (M0Sπ) yields a drop of some 20 dB in threshold. These magnitudes are plotted at the bottom of Figure 8.2.2 as a function of the test-tone frequency. While the magnitude of the BMLD corresponds to the classic values reported in the literature for frequency ranges where masker energy is present, it drops to almost zero at test-frequencies outside the frequency range of the masker. This means that the magnitude of the decrease in the BMLD is not merely a result of the reduced masking effect. This behaviour is not reflected in traditional models of BMLDs such as those of Durlach or Schenkel.
Fig. 8.2.2: Masking patterns and BMLDs for maskers with a bandwidth of 31.6 Hz. Masker level A: 75 dB; B: 55 dB. The lower part shows the magnitude of the BMLD as a function of the test-frequency fT.
BMLDs were also studied for tonal maskers (Henning and Zwicker 1985). Test-tones at 250 Hz were masked by a 250 Hz sinusoidal masker. The duration of the test-tone was varied, and signal and masker were presented in phase or in quadrature. In addition, the just-noticeable degree of amplitude modulation or the just-noticeable modulation-index for frequency modulation were measured as a function of the modulation frequency. Both sets of results suggest that monaural and binaural time constants have similar values (near 100 ms) and that the hearing system does not seem particularly ‘sluggish’ with the paradigms used. For both long-term and short stimuli, an interpretation of the results is relatively easy: It requires no more than the individual just-noticeable difference in level (1 dB) and just-noticeable interaural time-delay (0.1 ms). For the description of binaural masking-level differences, a model was developed based on four factors (Zwicker and Henning 1985). The size of the BMLD can be predicted from:
1. Just-noticeable differences
2. Temporal effects in simultaneous masking
3. Binaural interaction
4. Temporal effects in non-simultaneous masking
This model is verified by a number of examples illustrating that the knowledge of just-noticeable differences for level and interaural delay, and their joint dependence on stimulus parameters, are sufficient to predict the magnitude of BMLDs quantitatively. The model was challenged for 250 Hz with masker-bandwidths of 10 Hz or 30 Hz (Zwicker 1984). The results revealed that the four factors mentioned are sufficient to predict the magnitude of BMLDs.
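As a reminder of how a BMLD is obtained from measured thresholds, the Python sketch below takes the difference between the masked threshold in the in-phase reference condition (M0S0) and the threshold in a phase-reversed condition. The threshold values are hypothetical and were chosen only so that the differences match the drops of about 15 dB and 20 dB described for Figure 8.2.2; they are not data from the experiments.

```python
def bmld_db(threshold_m0s0_db: float, threshold_reversed_db: float) -> float:
    """Binaural masking-level difference: reference minus phase-reversed threshold."""
    return threshold_m0s0_db - threshold_reversed_db


# Hypothetical masked thresholds (dB) for a test tone centred in the masker.
thresholds_db = {
    "M0S0": 70.0,   # masker and signal in phase (reference condition)
    "MpiS0": 55.0,  # masker reversed in phase
    "M0Spi": 50.0,  # signal reversed in phase
}
for condition in ("MpiS0", "M0Spi"):
    difference = bmld_db(thresholds_db["M0S0"], thresholds_db[condition])
    print(f"BMLD for {condition}: {difference:.0f} dB")
```

A BMLD near zero, as found for test frequencies outside the masker band, simply means that both thresholds coincide.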
8.3 Basic hearing sensations
Hugo Fastl
8.3.1 Pitch
Frequency discrimination for pure tones depends on both the psychophysical method used and the presentation of sounds. With regard to the influence of the psychophysical methods, frequency discrimination thresholds obtained with a yes/no-procedure are typically larger by a factor of two than thresholds obtained with an adjustment procedure. This holds for a wide range of tone durations between 500 ms and 10 ms and frequencies between 125 Hz and 4 kHz (Fastl and Hesse 1984). With regard to the influence of the presentation of sounds, frequency discrimination thresholds obtained with pulsed tones are typically smaller by a factor of three than thresholds obtained with modulated tones. Differences in the magnitude of difference limens were studied with pulsed versus modulated tones for both frequency discrimination and amplitude discrimination (Schorer 1989a). A model describing the influences of procedures and sound presentation on the magnitude of the difference limen is based on the classical Zwicker-Maiwald model. However, in order to also predict the difference limen for pulsed tones, the original Zwicker-Maiwald model was extended to include memories for test and comparison sounds as well as comparators (Schorer 1989b). Pitch shifts caused by different presentation levels, as well as by the addition of background noises (Sonntag 1983, Hesse 1987a), were also systematically studied. A model of spectral pitch based on the related masking patterns was developed (Hesse 1987b). This model is illustrated in Figure 8.3.1, which shows the absolute threshold (RHS), the masking pattern of a masker (M), as well as the masking pattern of the test tone (T). The spectral pitch of the test tone can be calculated from the masking patterns using three magnitudes: Magnitude 1 indicates the difference between the masking pattern of the masker and the top of the masking pattern of the test tone. Magnitude 3 shows the distance of the masking pattern of the masker to the absolute threshold. Magnitude 2 is divided: 2a shows the steepness of the slope of the masker 1 Bark below the test tone and magnitude 2b gives the slope of the absolute threshold for the same location. For each magnitude, detailed algorithms are given in the literature (Hesse 1987b).
Fig. 8.3.1: Illustration of the model of spectral pitch. Masking pattern of masker (M), test tone (T), as well as absolute threshold (RHS). Using the magnitudes 1, 2a, 2b, as well as 3, the spectral pitch of a test tone can be calculated from the related masking patterns (for details see text).
The dependence of the pitch of pure tones on sound pressure level can be described quantitatively by using magnitudes 1, 2b, and 3. For the description of the pitch of pure tones, which are partially masked by maskers, magnitude 2a also has to be taken into account. Using the model, the spectral pitch of pure tones could be predicted from the corresponding masking pattern for a large variety of situations described in the literature.
Not only the pitch of pure tones, but also the pitch of low-pass noises can be quantitatively described on the basis of the correlated masking patterns (Schmid and Auer 1996). The pitch of the low-pass noise corresponds to the frequency at a point 3 dB down on the upper slope of the masking pattern and increases with increasing sound pressure level. When switching off sounds with a spectral gap, a faint pure tone may be heard with a pitch corresponding to the frequency region of the gap. This acoustic afterimage was called by our colleagues a ‘Zwicker-tone’. In extensive psychoacoustic experiments, the dependence of the Zwicker-tone on relevant stimulus parameters was studied. By digital synthesis, a large variety of Zwicker-tone exciters (ZTE) were realized (Fastl 1989a). Clear Zwicker-tones are produced for levels of the ZTE around 0 dB spectrum level. They show a duration of about 2 s and a loudness corresponding to 10 dB sensation level. For the production of a Zwicker-tone, a minimum width of the spectral gap of 1 Bark at low frequencies, and 0.5 Bark at high frequencies is necessary. For broader spectral gaps, the pitch of the Zwicker-tone lies approximately 1 Bark above the lower edge of the gap. With increasing level of the ZTE, the pitch of the Zwicker-tone becomes higher. This behaviour can be quantitatively described on the basis of the corresponding masking patterns (Fastl 1989a). Not only band-stop noises, but also low-pass noises can elicit a Zwicker-tone. The minimum bandwidth of the low-pass noise is 1 Bark. Similarly, the flanking bands of spectral gaps must have a bandwidth of at least 1 Bark to produce a faint Zwicker-tone. Pulsed sounds may also elicit a Zwicker-tone. The quality of the Zwicker-tone depends in a complex manner on the on/off-times: Clear Zwicker-tones are obtained for short temporal gaps (10 ms), even for very short sound pulses.
On the other hand, for pauses as long as 6 s, clear Zwicker-tones can be produced if the sound pulses exceed a duration of 6 s. With digital synthesis of signals, a large variety of temporal envelopes can be produced for the same amplitude spectra, using different phases of the components. Starting from the classical phase rule proposed by Schroeder, phase rules were developed and realized for line spectra, producing temporal envelopes of impulses, single sweeps, or sequences of sweeps (Krump 1992). The quality of the Zwicker-tone depends in a rather complex manner on the phase rule applied. With some simplification, it can be stated that on the one hand a sufficient number of impulses within a temporal window of say 200 ms is necessary, and on the other hand the spectral spacing of the components should not exceed several hundred Hz. In other words, the ZTE must have a sufficient ‘density’ both in the temporal as well as in the spectral domain. It was shown for the first time that not only spectral gaps but also spectral enhancements can elicit a Zwicker-tone (Fastl and Krump 1995). The combination of a broadband pedestal plus a pure tone can lead to a Zwicker-tone with a pitch corresponding to a frequency below the frequency of the added pure tone. Noise bands added to a broadband pedestal can also produce Zwicker-tones as long as their bandwidth does not exceed a critical band. A multitude of experiments with dichotic signals confirmed the hypothesis that the Zwicker-tone is a monaural phenomenon (Krump 1994).
A model of the pitch of the Zwicker-tone was developed on the basis of the correlated masking patterns. Some examples are shown in Figure 8.3.2. Figure 8.3.2a shows the dependence of the pitch of the Zwicker-tone on the level LZTE of the Zwicker-tone exciter. With increasing level, the pitch of the Zwicker-tone rises. As illustrated by Figure 8.3.2b, this behaviour can be described on the basis of the correlated masking patterns with the hypothesis that the pitch of the Zwicker-tone corresponds to the minimum in the masking pattern. In cases where no minimum is produced, the pitch of the Zwicker-tone is predicted by the crossing of the lower slope of the masking pattern with the absolute threshold (dashed). Figure 8.3.2c shows the pitch of the Zwicker-tone for different levels LPT of the pure tone added to a broad-band pedestal. As illustrated in Figure 8.3.2d, the pitch of the Zwicker-tone corresponds to the crossing of the lower slope of the masking pattern for the added pure tone with the masking pattern (dotted) of the broadband pedestal. Applying the model, the pitch of the Zwicker-tone could be predicted with an accuracy of about 0.3 Bark for masking patterns taken from the literature. If, instead of average masking patterns, the individual masking patterns of the subjects who performed the pitch matches for the Zwicker-tone are taken as the starting point of the calculations, the accuracy of the predictions improves to about 0.15 Bark (Fastl and Krump 1995).
Fig. 8.3.2: Pitch of the Zwicker-tone and masking patterns. a: Spectral gap: Dependence of the pitch of the Zwicker-tone on the level of the ZTE. b: Correlated masking patterns (solid); threshold of hearing (dashed). c: Spectral enhancement: Dependence of the pitch of the Zwicker-tone on the level LPT of a pure tone added to a broad-band pedestal. d: Correlated masking patterns of pure tones (solid) and broadband pedestal (dotted); dashed: threshold of hearing. Shaded areas and double arrows: Illustrations of spectral distributions.
8.3.2 Pitch strength
With respect to pitch, sounds can be ordered along a scale low-high constituting pitch magnitude. In addition, however, another feature can be assessed describing how definite, salient, or pronounced a pitch sensation is perceived, leading to the magnitude called pitch strength. Pitch strength can be assessed directly by magnitude estimation or indirectly by measuring frequency discrimination, adopting the view that sounds with a strong pitch sensation should produce small difference limens for frequency. With the same group of subjects, the alternative methods of assessing pitch strength were tested for narrow-band noise and complex tones (Wiesmann and Fastl 1991, 1992). Some results are illustrated in Figure 8.3.3. The squares in Figure 8.3.3a represent pitch strength data obtained by magnitude estimation for narrow-band noises of different bandwidth. Frequency discrimination is illustrated by triangles (adjustment procedure) or by filled circles (2AFC-procedure). Starting from a pure tone (SIN), pitch strength is more or less constant up to a bandwidth of 31.6 Hz and decreases for larger bandwidths. In line with the hypothesis that small values of pitch strength lead to high thresholds for frequency discrimination, the data displayed in Figure 8.3.3a indicate that frequency discrimination is worse at larger bandwidths than at smaller bandwidths (note that the scales of pitch strength and frequency discrimination point in opposite directions). Although the data derived from magnitude estimates and from frequency discrimination do not agree completely, the data obtained with both procedures are in qualitative agreement. In contrast, Figure 8.3.3b shows diverging results of pitch strength versus frequency discrimination for complex tones as a function of the number of harmonics. With increasing number of harmonics, pitch strength as obtained by a magnitude estimation procedure (squares) decreases.
On the other hand, frequency discrimination for complex tones with many harmonics is better than for a pure tone, i.e. one harmonic (filled circles). This means that when pitch strength is only inferred from frequency discrimination, misleading data may be obtained. Therefore, for the assessment of pitch strength, we prefer magnitude estimation procedures in our experiments (Fastl 1997c).
For pure tones, pitch strength shows a band-pass characteristic as a function of frequency, with a maximum around 1.5 kHz. Tones at 125 Hz or 10 kHz produce only 1/3 of the pitch strength of tones at mid frequencies. For an increase in level from 20 dB to 80 dB, the pitch strength of pure tones increases by a factor of approximately 2.5. For pure tones shorter than about 200 ms, pitch strength decreases in proportion to the logarithm of tone duration (Fastl 1989b).
Pitch strength of AM-tones decreases with increasing carrier frequency. The results nicely confirm the classic data on the existence region of virtual pitch (Fastl and Wiesmann 1990). For modulation depths larger than 15 dB, IIR-noise produces a pitch strength that increases linearly with the logarithm of the spectral modulation depth. When the level of IIR-noise is increased from 30 dB to 70 dB, pitch strength decreases slightly, by about 10 %. Pitch matches of pure tones with IIR-noise suggest an interpretation of the pitch of IIR-noise as a virtual pitch (Fastl 1988c).
8 The human aspect I: Human psychophysics
Fig. 8.3.3: Pitch strength and frequency discrimination. Pitch strength: Squares; frequency discrimination: triangles (adjustment procedure) or filled circles (2AFC-procedure). All data are normalized relative to the values for a pure tone. a: Noise bands centered at 1 kHz. b: Complex tones with 250 Hz basic frequency.
The pitch strength of inharmonic complex tones was assessed for a basic frequency of 250 Hz. Pitch strength for these sounds shows a saw-tooth-like pattern known from the literature for the pitch of these sounds (Schmid 1994). Results are displayed in Figure 8.3.4. The relative pitch strength is given as a function of the spectral shift of all 9 harmonics of a complex tone with 250 Hz basic frequency. The data displayed in Figure 8.3.4 clearly reveal that large values of pitch strength occur for ∆f = -250 Hz, 0 Hz, and +250 Hz. In these cases, a harmonic complex tone is presented. On the other hand, for ∆f = ±125 Hz, the correlated inharmonic complex tones produce low values of pitch strength. In this case, the pitch of inharmonic complex tones shows a maximum of its ambiguity. These data confirm the hypothesis that sounds that elicit an ambiguous pitch sensation produce only small values of pitch strength.
Fig. 8.3.4: Relative pitch strength of inharmonic complex tones as a function of the spectral shift ∆f. Complex tone with 250 Hz basic frequency and nine harmonics. Anchor sound: pure tone (open symbols) or complex tone with ∆f = -25 Hz (filled symbols).
For harmonic complex tones, correlations between pitch strength and enhancement phenomena were described (Chalupper and Schmid 1997). Enhancements were produced by increasing the level of a partial by amplitude modulation, frequency modulation, or by a frequency shift of a partial within a harmonic complex tone.
The pitch of pure tones partially masked by narrow-band noise, broadband noise, or pure tones was assessed. Pitch strength depends crucially on the level difference between the pure tone and the masking pattern of the masker. Pure tones only 3 dB above the masking pattern produce almost no pitch strength, whereas for tones that lie 20 dB above the masking pattern, pitch strength has values similar to those for unmasked pure tones. Pitch strength of pure tones can be reduced by amplitude modulation at low modulation frequencies. This also holds true for AM-tones partially masked by broadband noise (Schmid 1997).
In addition to the basic research on pitch strength, some practical applications were also studied. The observation of musicians that natural heads are preferred to plastic heads for timpani could be verified by means of the related pitch strength (Fastl and Fleischer 1992). Sounds including tonal components are considered to be particularly annoying; therefore, they usually get a ‘tone penalty’. In psychoacoustic experiments on pitch strength, the magnitude of the penalties could be determined (Seiter et al. 1996) and included in a computer program (Beckenbauer et al. 1996).
8.3.3 Fluctuation strength
The hearing sensation fluctuation strength shows a band-pass characteristic as a function of the modulation frequency with a maximum at about 4 Hz. Typical results are shown in Figure 8.3.5 for amplitude-modulated broadband noise, amplitude-modulated pure tones, frequency-modulated pure tones, as well as unmodulated narrow-band noise. In the latter case, an ‘effective’ modulation frequency was calculated from the bandwidth of the narrow-band noise by the formula f*mod = 0.64 ∆f. Fluctuation strength increases by a factor of about three for an increase in level by 40 dB. Large fluctuation strength is produced by FM-tones with large frequency deviation as well as by amplitude-modulated broadband noise. AM-tones reach medium values of fluctuation strength, whereas narrow-band noises produce only little fluctuation strength (Fastl 1983, Fastl 1992). The band-pass characteristic of fluctuation strength is reflected in a band-pass characteristic of envelope fluctuation for fluent speech as well as easy-listening popular music (Fastl 1984c). Neural correlates of the hearing sensation fluctuation strength were found in the cortex of squirrel monkey (Fastl et al. 1986a, Fastl et al. 1991a, Fastl 1990a).
Fig. 8.3.5: Dependence of fluctuation strength on modulation frequency. a: Amplitude modulated broadband noise (AM BBN) L = 60 dB, ∆f = 16 kHz, d = 40 dB; b: Amplitude modulated pure tone (AM SIN) L = 70 dB, f = 1000 Hz, d = 40 dB; c: Frequency modulated pure tone (FM SIN) L = 70 dB, f = 1500 Hz, ∆f = 700 Hz; d: Narrow-band noise (NBN) L = 70 dB, f = 1000 Hz.
Because of the close correlation between the hearing sensation fluctuation strength and the temporal envelope fluctuation of human speech, the intelligibility of speech in rooms can be predicted on the basis of fluctuation strength (Fastl et al. 1990a).
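The relation between noise bandwidth and ‘effective’ modulation frequency quoted in this section, f*mod = 0.64 ∆f, can be sketched in a few lines. This is a minimal illustration only; the function names are hypothetical and not taken from the cited work, and the only constant used is the factor 0.64 given in the text.

```python
def effective_modulation_frequency(bandwidth_hz: float) -> float:
    """Effective modulation frequency of unmodulated narrow-band noise.

    Implements f*mod = 0.64 * delta_f as quoted in the text.
    """
    return 0.64 * bandwidth_hz


def bandwidth_for_modulation_frequency(f_mod_hz: float) -> float:
    """Inverse relation: noise bandwidth that yields a given f*mod."""
    return f_mod_hz / 0.64


if __name__ == "__main__":
    # Bandwidth whose effective modulation frequency falls at the
    # 4 Hz maximum of fluctuation strength mentioned in the text.
    print(bandwidth_for_modulation_frequency(4.0))
    # Effective modulation frequency of a 10 Hz wide noise band.
    print(effective_modulation_frequency(10.0))
```

Under this relation, a noise band roughly 6 Hz wide has its effective modulation frequency near the 4 Hz maximum of fluctuation strength.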
8.4 Loudness and noise evaluation Hugo Fastl
8.4.1 Loudness evaluation
Equal-loudness contours represent a classical basic feature of loudness evaluation. Contours according to ISO 226 show values that are lower at 400 Hz than at 1 kHz. This behaviour is at variance with new data on equal-loudness contours. In psychoacoustic experiments we showed that at equal loudness, the level of pure tones at 400 Hz is the same as or even higher than at 1 kHz. In extensive studies (Fastl et al. 1990b) it was found that for levels of 30, 50, and 70 dB in a frequency range between 1 kHz and 100 Hz, equal-loudness contours show a continuous increase with no depression at 400 Hz.
While an extremely large number of papers on the loudness of sounds presented in isolation can be found in the literature, partially masked loudness has received significantly less attention. Therefore, partial masking of pure tones by broadband noise as well as partial masking of industrial noise by road traffic noise was studied (Zwicker and Fastl 1986b). It was found that partial masking clearly involves a recruitment phenomenon, i.e. above masked threshold an extremely steep increase in loudness occurs. This means that for levels about 15 dB above masked threshold, a loudness value of the partially masked sound is obtained that corresponds to the loudness of an unmasked test sound (Zwicker 1987b).
For temporal partial masking, the dependence on relevant stimulus parameters was studied and two models were developed (Fastl 1984d). Both models have in common that they are based on areas of the temporal patterns. One model describes temporal partial masking on the basis of temporal masking patterns, the alternative model on the basis of loudness-time patterns. The latter model is illustrated by Figure 8.4.1. Figure 8.4.1a shows the loudness-time pattern for a masker (M) as well as for a tone pulse (PM) partially masked by the following masker. Figure 8.4.1b shows the loudness-time function of a comparison tone (C) that produces the same loudness as the partially masked tone (PM).
Fig. 8.4.1: Model of temporal partial masking. Loudness-time pattern for a masker (M), a temporally partial masked test tone pulse (PM), and a comparison pulse (C). Patterns measured by a loudness meter according to Zwicker and Fastl 1983. The shaded areas illustrate that the partially masked tone pulse (PM) and the comparison pulse (C) produce the same loudness.
Together with colleagues from the United States, we confirmed the usefulness of magnitude estimation procedures for the assessment of loudness. With this method, the loudness of complex sounds (Hellman and Zwicker 1989) as well as some strange behaviour could be explained (Hellman and Zwicker 1987): For example, a decrease of the dB(A)-value can lead to an increase in perceived loudness. Loudness effects at low frequencies were studied (Henning and Zwicker 1990, Widmann and Goossens 1993) and binaural effects of loudness were assessed as a function of phase differences (Zwicker and Henning 1991). An increase in loudness corresponding to an increase in level of 6 to 12 dB could be verified. The dependence of binaural loudness summation on interaural level differences, spectral distribution, as well as temporal distribution was also studied (Zwicker and Zwicker 1991b). In summary, the results showed that current loudness meters have no significant systematic errors because of their single-channel evaluation in contrast to the human two-channel processing.
A long-standing, fruitful cooperation with colleagues from Osaka University, Japan, resulted in a number of cross-cultural studies. In particular, the evaluation of impulsive sounds or road traffic noise is rather similar when carried out using Japanese and German subjects (Fastl et al. 1986b, 1986c, Kuwano et al. 1986, Namba et al. 1987). Figure 8.4.2 illustrates data for 16 Japanese or German subjects. In the lower part, the level LAFmax of road traffic noise is given that produces the same loudness as an impulsive sound with LAFmax = 87 dB(A). Different letters denote data from different subjects. From the 16 individual results of each group, medians and interquartile ranges were calculated for German subjects (open circles) as well as Japanese subjects (filled circles). In the upper part of Figure 8.4.2, the corresponding loudness-time functions are given. The impulsive sound reaches a loudness of about 70 sone. The loudness pattern for three passing vehicles of the same overall loudness is given by dotted lines for German subjects and by solid lines for Japanese subjects. The data shown in Figure 8.4.2 reveal that individual data for Japanese versus German subjects can differ by more than 10 dB. However, on average, Japanese and German subjects judge the loudness of impulsive or road traffic noise rather similarly. The median values differ by less than 2 dB.
Fig. 8.4.2: Loudness comparison of impulsive noise and road traffic noise for Japanese and German subjects. Upper part: Loudness-time patterns for same loudness. Lower part: Individual data for Japanese (JPN) or German (GER) subjects with related medians (filled circle versus open circle).
Also, for combinations of broadband noise with a tonal component, Japanese and German subjects produce rather similar evaluations (Fastl and Yamada 1986). With respect to the suitability of sounds as danger signals, however, some differences, presumably due to different experiences in different cultures, did show up (Kuwano et al. 1997): Bell sounds signal danger more frequently for Japanese subjects than for German subjects. On the other hand, sounds with large fluctuation strength convey the information of danger for both groups of subjects.
With regard to the evaluation of noise immissions, a new method for the subjective assessment of overall loudness was developed and realized in cooperation with colleagues from Osaka University, Japan (Fastl et al. 1988). Several variants of the method were tested and implemented (Fastl 1991a,b) and as a result, the following procedure was proposed: During the presentation of sounds, the subject tracks the instantaneous loudness by adjusting the length of a bar on a computer screen in such a way that at each instant the bar length corresponds to the perceived instantaneous loudness. After a longer period of time, usually 15 minutes, the subject gets a questionnaire and has to scale the overall loudness of the preceding time span by three methods: category scaling, absolute magnitude estimation, and indication of line length. As a rule, the three procedures yield rather similar results, giving a clear indication of the overall loudness of noise immissions. Recently, successful attempts were made to predict the overall loudness on the basis of the instantaneous loudness (Fastl 1997a).
8.4.2 Physical measurement of loudness and psychoacoustic annoyance
For the physical measurement of loudness, algorithms for the calculation of loudness were developed (Zwicker et al. 1984, Zwicker et al. 1991, Fastl 1996c). These algorithms were implemented in hardware, leading from loudness calculation procedures to loudness meters (Zwicker and Fastl 1983, Zwicker et al. 1985). Since the loudness calculation algorithms were published in a German standard (DIN 45 631), several manufacturers now offer loudness analysis systems. Their quality was rated (Fastl and Schmid 1997) and it was found that all systems are within about 5 % of the target values. Calibration signals for testing loudness analysis systems were proposed and realized (Fastl 1993c). In addition, the statistical significance of percentile values of loudness indicated by loudness analysis systems was determined (Stemplinger and Fastl 1997).
The loudness of musical instruments was assessed using a loudness meter (Fastl 1990b). It was shown that in spite of the same sound pressure level of 85 dB(A), the sounds of a flute versus a church organ may differ in loudness by more than a factor of 2. This means that for the recording of musical sounds, level measurements give no reliable indication of the loudness of music. Therefore, in recording studios, indications of the respective loudness in addition to VU-meters could significantly ease the complicated task of a sound engineer.
In several studies, we verified that loudness constitutes the main part of annoyance and shows a high correlation to noisiness (Fastl 1985a, Kuwano et al. 1988, Widmann 1995). In addition to loudness, hearing sensations such as sharpness, fluctuation strength, or roughness may contribute to the psychoacoustic annoyance (Zwicker 1991, Zwicker 1989, Widmann 1993, Widmann 1994). A mixture of the same hearing sensations can be successfully applied to rate sound quality (Fastl 1994a, 1996b). Furthermore, the hypothesis that sounds are particularly annoying if they impede speech intelligibility could be verified in psychoacoustic experiments (Widmann 1991).
8.4.3 Loudness of speech
For many practical applications, the loudness of fluent speech is of relevance. By comparison of speech with a speech-simulating noise, we assessed the loudness of speech in psychoacoustic experiments (Fastl 1990c). Pairs of speech and speech noise were presented. The level of the speech was kept constant, while the level of the speech noise was varied. In one series of experiments, the subjects had to decide whether the speech noise was louder than the speech, in the other series whether the speech noise was softer than the speech. The results from 336 loudness comparisons are illustrated in Figure 8.4.3. Figure 8.4.3a shows the loudness-time function of a sentence (left) in comparison to that of a speech noise with the same perceived loudness (right). From the results displayed in Figure 8.4.3a, it becomes clear that the loudness of speech does not correspond to an average value, but is rather governed by the loudness maxima of the speech sound. More specifically, an analysis of the percentile values of speech (solid) versus speech noise (dotted) as displayed in Figure 8.4.3b reveals that the percentile loudness N7, i.e. the loudness reached or exceeded during 7 % of the measurement time, represents a good physical descriptor for the perceived loudness of speech. Since the percentile distribution for speech is rather flat at high loudness values, in view of the interquartile ranges of the subjective results, values of N5 to N10 can also be taken as physical data for the loudness of speech.
Fig. 8.4.3: Loudness of speech. a: Loudness-time function of a test sentence (left) and of a speech noise (right) at the same perceived loudness. b: Cumulative loudness distribution of test sentence (solid) and of speech noise (dotted) for the condition of the same perceived loudness.
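The percentile loudness used here as a descriptor can be sketched as follows, assuming that a loudness-time function sampled at equal time steps is already available (for instance from a loudness analysis system). The function name and the synthetic input are hypothetical; the sketch only illustrates the usual reading of N_x as the loudness reached or exceeded during x % of the time, and it does not compute loudness itself.

```python
import numpy as np


def percentile_loudness(loudness_sone, exceed_percent: float) -> float:
    """Loudness N_x reached or exceeded during exceed_percent % of the time.

    loudness_sone: loudness-time function in sone, equally spaced samples.
    N_x corresponds to the (100 - x)th percentile of the samples.
    """
    samples = np.asarray(loudness_sone, dtype=float)
    return float(np.percentile(samples, 100.0 - exceed_percent))


if __name__ == "__main__":
    # Hypothetical loudness-time function: a fluctuating, speech-like pattern.
    t = np.linspace(0.0, 3.0, 3001)
    loudness = 10.0 * np.abs(np.sin(2.0 * np.pi * 4.0 * t))
    for x in (5, 7, 10):
        print(f"N{x} = {percentile_loudness(loudness, x):.2f} sone")
    print(f"mean = {loudness.mean():.2f} sone")
```

For a strongly fluctuating pattern of this kind, N5 to N10 lie close to the loudness maxima and well above the mean, which is the behaviour the text describes for speech.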
8.4.4 Evaluation of road traffic noise
Both the emissions and immissions of road traffic noise were studied in psychoacoustic experiments. The noise emissions of passing vehicles can be described on the basis of the correlated loudness according to DIN 45 631 (Fastl 1987b, Widmann 1990). Noise reductions made possible by the enforcement of speed limits were assessed in psychoacoustic experiments. As a general rule, the speed limit within cities is 50 km/h. However, in many streets a reduction of the speed to 30 km/h is enforced. The noise emission of a typical middle class car was evaluated for conditions in which the respective speed limits were obeyed but different gears were used. Some of the results from the literature (e.g. Fastl et al. 1991b) are compiled in Figure 8.4.4. The relative loudness N/Nref is given for 30 km/h in the left part and for 50 km/h in the right part. As indicated on the abscissa, 30 km/h was achieved by using either the second or the third gear, and 50 km/h was achieved with the third or the fourth gear. Subjective evaluations are given by circles and physical evaluations by stars.
Fig. 8.4.4: Noise emission by a typical middle class car at speeds of 30 km/h versus 50 km/h using different gears. Circles: subjective evaluation; stars: physical evaluation.
The data displayed in Figure 8.4.4 indicate a good agreement between subjective and physical evaluation. Within cities, many drivers usually use the third gear. In this case, at a speed of 50 km/h they produce a noise that is louder by a factor of 1.35 than the noise at 30 km/h. This means that, in addition to the impact of speed limits on the severity of accidents, a reduction of the maximum speed from 50 km/h to 30 km/h can also produce a substantial relief from noise. Further, the data displayed in Figure 8.4.4 are also amenable to an alternative interpretation: The noise-sensitive driver within the city would use the fourth gear at 50 km/h, resulting in a noise reduction of about 10 % compared to driving in the third gear. Unfortunately, however, drivers using the second gear at 30 km/h produce almost the same subjectively perceived noise emission as noise-sensitive drivers at 50 km/h in fourth gear. This means that reducing the speed limit within cities from 50 km/h to 30 km/h will lead to relief from the noise only if noise-sensitive drivers are willing to use high gears. The resulting low r.p.m. lead to a reduced loudness that can be successfully measured physically by loudness analysis systems according to DIN 45 631.
With respect to immissions from road traffic noise, a physical magnitude corresponding to the perceived overall loudness is represented by the percentile loudness N4 as measured by a loudness analysis system (Fastl 1991a,b, 1989c). Noise-absorbing road surfaces (‘whispering asphalt’) were evaluated subjectively in psychoacoustic experiments (Fastl and Zwicker 1986) and could also be assessed by the percentile N4 of the physically measured loudness (Fastl 1991c). In a pilot study, we verified that with respect to the evaluation of road traffic noise, field studies and laboratory studies lead to rather similar results (Widmann 1992, Fastl 1993b).
Therefore, future noise immissions can be predicted on the basis of results from psychoacoustic experiments.
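To give a feeling for the loudness ratios quoted for the passing car, they can be translated into approximate differences in loudness level. The sketch below assumes the standard relation between sone and phon, in which loudness doubles for every 10 phon; this holds only above about 40 phon and is not part of the original evaluation. The function name is hypothetical.

```python
import math


def loudness_ratio_to_phon_difference(ratio: float) -> float:
    """Approximate loudness-level difference (phon) for a loudness ratio.

    Assumes N2 / N1 = 2 ** ((L2 - L1) / 10), i.e. a doubling of loudness
    per 10 phon, which is a rule valid only above roughly 40 phon.
    """
    return 10.0 * math.log2(ratio)


if __name__ == "__main__":
    # Third gear: 50 km/h versus 30 km/h, loudness ratio 1.35 (from the text).
    print(f"{loudness_ratio_to_phon_difference(1.35):.1f} phon")
    # Fourth versus third gear at 50 km/h, about 10 % less loudness.
    print(f"{loudness_ratio_to_phon_difference(0.90):.1f} phon")
```

Under this assumption the factor of 1.35 corresponds to a little more than 4 phon, and the 10 % reduction to about 1.5 phon.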
8.4.5 Evaluation of aircraft noise
The loudness of aircraft noise was assessed with respect to both noise emissions and noise immissions. With respect to noise emissions, modern aircraft equipped with engines with a large bypass ratio produce significantly less noise than their predecessors (Fastl et al. 1985c). Noise emissions from aircraft can be measured in line with features of the human hearing system by applying the loudness method of Zwicker. With respect to the loudness ratio of loud versus quieter aircraft, the EPNdB values, which are used for the certification of aircraft, are also relatively close to subjective evaluations (Fastl and Widmann 1990).
Concerning noise immissions from aircraft, questions of ‘trading’ play a crucial role: On the one hand, modern aircraft are quieter than their predecessors; on the other hand, the number of flights is increasing. By applying the Leq-concept, which is enforced for the assessment of aircraft noise in many countries, very unrealistic trading factors show up. Based on Leq-calculations, one single loud old aircraft could be replaced by as many as 100 (hundred!) quieter new aircraft (Fastl 1990d). In contrast to this completely misleading prediction when applying the Leq-concept, the application of loudness leads to a rather reasonable rule of thumb: If a new, quieter aircraft produces half the loudness of its predecessor, twice as many new aircraft can be admitted in the flight schedule to preserve the status quo of the noise climate around airports.
In connection with the efforts to replace old, loud aircraft by new, quieter aircraft, numerous scenarios were studied (Fastl 1993a). Some of the assumptions are illustrated in Figure 8.4.5:
1. All loud aircraft are replaced by quieter aircraft: cf. (a) versus (b).
2. Twice as many quieter aircraft as louder aircraft are used: cf. (b) versus (c).
3. For some period of time, only a certain proportion of the loud aircraft can be replaced by quieter aircraft: cf. (a) versus (e).
Fig. 8.4.5: Loudness-time functions of noise immissions from loud (B) or quieter (C) aircraft.
The results displayed in Figure 8.4.6 compare subjective and physical evaluation for the scenarios displayed in Figure 8.4.5. Circles represent subjective evaluations, and stars physical evaluations. All results shown in Figure 8.4.6 show a good agreement between subjective and physical evaluations. This means that, similar to road traffic noise, noise immissions from aircraft can also be assessed in line with features of the human hearing system when measuring a percentile value of loudness according to DIN 45 631.
Fig. 8.4.6: Subjective (circles) or physical (stars) evaluation of the scenarios for noise immissions from aircraft noise as illustrated in Fig. 8.4.5.
Despite the same Leq, aircraft noise can be rated higher than road traffic noise (Fastl and Hunecke 1995). A similar tendency is reported from field studies. As a result of compilations of data from numerous field studies, our colleagues proposed an ‘aircraft malus’ of 5 dB(A) at medium levels and up to 15 dB(A) at higher levels.
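The trading argument of this section can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 20 dB level difference is an assumption chosen because it reproduces the factor of 100 under equal-energy (Leq) trading, and the conversion from level difference to loudness ratio uses the rule of thumb that loudness doubles per 10 dB at moderate to high levels rather than a calculation according to DIN 45 631.

```python
def leq_trading_factor(level_difference_db: float) -> float:
    """Number of quieter events carrying the same sound energy as one loud event.

    Equal-energy (Leq) reasoning: a level difference of x dB corresponds
    to an energy ratio of 10 ** (x / 10).
    """
    return 10.0 ** (level_difference_db / 10.0)


def loudness_ratio(level_difference_db: float) -> float:
    """Rough loudness ratio for a level difference (assumption: x2 per 10 dB)."""
    return 2.0 ** (level_difference_db / 10.0)


def loudness_trading_factor(ratio_loud_to_quiet: float) -> float:
    """Rule of thumb from the text: half the loudness admits twice as many aircraft."""
    return ratio_loud_to_quiet


if __name__ == "__main__":
    difference_db = 20.0  # assumed level difference between old and new aircraft
    print("Leq-based trading factor:     ", leq_trading_factor(difference_db))
    print("Loudness-based trading factor:",
          loudness_trading_factor(loudness_ratio(difference_db)))
```

With the assumed 20 dB difference, equal-energy trading admits 100 quieter aircraft per loud one, whereas the loudness-based rule of thumb admits only about four.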
8.4.6 Evaluation of railway noise
Noise emissions from different types of railways were also studied in psychoacoustic experiments (Fastl 1997b). Table 8.4.1 gives an overview of the evaluated types of train, their length, as well as their speed. In Figure 8.4.7, the loudness evaluation of the trains A through G is displayed. All data were normalised relative to the value for train C. Circles indicate subjective evaluations (medians and interquartiles), stars represent the physically-measured evaluations according to DIN 45 631.
Table 8.4.1: Trains evaluated.
Train   Type of train     Length [m]   Speed [km/h]
A       Freight train     520          86
B       Passenger train   95           102
C       Express train     228          122
D       ICE               331          250
E       ICE               331          250
F       Freight train     403          100
G       Freight train     175          90
Fig. 8.4.7: Noise emissions for different types of trains listed in Table 8.4.1. Subjective (circles) and physical (stars) loudness evaluations.
The data displayed in Figure 8.4.7 also suggest a good agreement between subjective and physical loudness evaluation for the noise emission from trains. As shown before for noise emissions from cars, trucks, and aircraft, the noise of trains can also be measured physically, in line with subjective noise evaluations, using loudness analysis systems according to DIN 45 631. Also in line with subjective evaluations, it was found that when the speed of a freight train is doubled from 60 km/h to 120 km/h, the perceived loudness increases by about 1/3.
With respect to noise immissions of railway noise, subjective evaluations can be successfully predicted by the percentile loudness N4 measured by a loudness analysis system (Fastl 1997a). In comparison to road traffic noise at the same Leq, railway noise is rated lower, leading to a ‘railway bonus’ (Fastl et al. 1996b,c). While for outdoor situations the ‘railway bonus’ reaches typical values of about 5 dB(A), larger effects can occur indoors, i.e. for the same subjective evaluation, the Leq for railway noise can be more than 5 dB(A) higher than the Leq for road traffic noise.
In extensive studies, combinations of different noise sources were evaluated with respect to noise immissions. Noise immissions for combinations of road traffic noise plus railway noise (Stemplinger and Gottschling 1997), as well as traffic noise plus industrial noise (Stemplinger 1997), can be successfully assessed by physical measurements of percentile loudness. Moreover, subjective evaluations of industrial noise (Stemplinger 1996) or noise in the work place (Stemplinger and Seiter 1995) can be physically predicted by means of percentile values of loudness according to DIN 45 631.
8.5 Perceived differences and quality judgments of piano sounds Miriam N. Valenzuela
A broad knowledge of the physics of both upright and grand pianos has been achieved by many researchers. It is still not clear, however, which attributes of the signal reaching the ear are the information-carrying parameters responsible for the characteristic piano sound. Likewise, the question of how, and how well, subjects distinguish among tones of equal pitch and loudness played on different pianos has yet to be answered. Especially for the design of digital pianos, it would be helpful to know which features are necessary to imitate a given piano sound, rather than having to store large libraries of piano sounds in digital memories. Moreover, for the improvement of acoustical pianos it would be advantageous to have a method to predict with which physical, and specifically mechanical, modifications essential perceptible changes can really be achieved. It is therefore of great importance to investigate first which are the most essential cues used by listeners to judge differences between piano sounds. With more knowledge about the auditory cues used by listeners to distinguish a tone played on different pianos, another question arises: How do these distinctive attributes affect the quality of the piano tone? Quality here refers to the sound quality of the piano tone, i.e. a judgment of whether a piano sound is of high or poor quality. In the following sections, two types of experiments are described through which auditory estimates of differences and sound quality of piano tones were obtained. With regard to sound quality, the influence of sharpness was especially verified, since sharpness turned out to be one of the most important cues used to distinguish between piano sounds.
8.5.1 Perceived differences between piano sounds
The psychoacoustic procedure used for discovering the attributes responsible for perceived differences between single tones of different pianos utilizes comparison experiments. Subjects are presented with two successive sounds and are asked to judge their similarity. The subjects are free to use any criteria they wish, as long as they base their judgments on the properties of the sound. That way, it is not necessary to know beforehand the ways in which the stimuli differ. On the basis of the similarity judgments, a statistical procedure, the so-called ‘multidimensional scaling’ (Kruskal 1964), is used to place the stimuli in a geometric configuration. The distance between each pair of sounds in that configuration represents the perceptual dissimilarity (subjective distance); the greater the dissimilarity, the greater the distance between the sounds. For a well-fitting geometric configuration, as measured by the stress of the statistical procedure, the number of perceptual attributes used by the listeners to judge dissimilarities is in accordance with the number of dimensions of the configuration. Comparing physically measurable differences between the stimuli with their distribution along the axes leads to the interpretation of the different dimensions.
Comparison experiments with single tones played on four approximately equal-sized grand pianos from different manufacturers were conducted with eight test persons who had a good or even extensive musical education (Valenzuela 1995). The listeners ranged in age from 22 to 52 years, and all had normal hearing thresholds. In order to prevent false conclusions from being drawn regarding the quality of different manufacturers, abbreviations for the four pianos are introduced (b, i, s, y). However, it should be obvious that quality judgments for single piano tones may not be extrapolated to the quality of the whole piano, especially taking into account that the quality of an instrument is a highly complex problem, depending on many aspects which themselves are multidimensional.
As representatives of the low, middle and high range of a piano, three loudly struck notes C2 (fundamental frequency approximately 65 Hz), C4 (267 Hz) and C6 (1046 Hz) of each piano had to be judged with respect to their dissimilarities. Only tones having the same pitch, loudness and duration (C2: 3 seconds, C4 and C6: 2 seconds) were compared. A trial consisted of four single sounds of the same pitch, of which at least three were from the same piano. The four tones were presented in two successive pairs; the position of the tone that differed from the other three was randomly selected. All possible pairwise combinations of the piano sounds of the same range were used to compose these trials. Each trial was presented eight times in a random sequence.
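The idea of placing stimuli in a geometric configuration from pairwise dissimilarities can be sketched as follows. Note that this is not the non-metric procedure of Kruskal (1964) used in the study: for brevity, the sketch substitutes classical (metric) multidimensional scaling, which has a closed-form solution. The dissimilarity matrix for the four pianos is hypothetical and serves only to show the data layout.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims: int = 2) -> np.ndarray:
    """Classical (Torgerson) multidimensional scaling.

    dissimilarity: symmetric (n x n) matrix of pairwise dissimilarities.
    Returns an (n x n_dims) array of coordinates whose Euclidean
    distances approximate the given dissimilarities.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centred matrix of squared dissimilarities.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Negative eigenvalues (non-Euclidean data) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    pianos = ["b", "i", "s", "y"]
    # Hypothetical mean dissimilarity ratings on the 0-9 scale.
    ratings = np.array([
        [0.0, 6.0, 5.0, 7.0],
        [6.0, 0.0, 4.0, 5.0],
        [5.0, 4.0, 0.0, 6.0],
        [7.0, 5.0, 6.0, 0.0],
    ])
    for name, xy in zip(pianos, classical_mds(ratings)):
        print(name, np.round(xy, 2))
```

A non-metric variant would use only the rank order of the dissimilarities and iteratively minimize the stress; the classical form is shown here merely because it is compact.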
In order to have a measure of how well listeners could discriminate different single piano sounds, they were first asked to indicate in which pair the differing note was located; secondly, they were asked to rate its degree of dissimilarity from the other sounds – relative to the differences heard in all other trials – on a scale of zero to nine. This scale had the following general ranges: (1) 0, no perceptible difference; (2) 1–3, very similar; (3) 4–6, average level of similarity; and (4) 7–9, very dissimilar.
The discrimination scores, being 97 % to 100 % in the low tonal range and 91 % to 100 % in the middle range, indicate that subjects could distinguish single tones of different pianos from one another very well in the low and middle ranges. In comparison, listeners had much greater difficulty in distinguishing tones of the C6 range, which is indicated by discrimination scores of 41 % to 100 %. Only trials with four identical sounds were judged correctly, with scores of 100 %. However, there were also some comparisons in the high tonal range that could be discriminated very well, with scores of about 80 %. Increased confusion in the high tonal range is probably due to the smaller number of partials in that range. The decreasing discriminability in higher tonal ranges is also evident in the dissimilarity judgments of the subjects. Subjective distance scores, averaging approximately 6 for the low, 5 for the middle and just 2 for the high tonal range, indicate the decreasing perceptibility of differences between single piano sounds for higher tones. The dissimilarity judgments of all listeners were in good agreement for the three tonal ranges examined, as shown by a standard deviation of approximately 1.5.
As previously mentioned, one way to find the attributes responsible for the perceived differences is by treating the dissimilarity judgments with a multidimensional scaling algorithm. This procedure revealed for all three investigated tonal ranges a relationship between the centre of momentum of the spectral energy distribution of each piano sound and the distribution of the sounds along one of the axes of the two-dimensional geometric configuration (dimension I in Fig. 8.5.1a-c, explained below). This result is in complete agreement with the findings of many other researchers, who consistently identified the spectral energy distribution of other instruments and sounds as a main component of the perceptual quality known as timbre (Bismarck 1974a, Gordon and Grey 1978). Timbre is defined as that aspect of a tone that distinguishes one sound from another, given the same loudness and pitch (ANSI 1973). To detect the features used by listeners to judge dissimilarities between piano sounds, a perceptually based measure should be sought instead of a physical measure such as the centre of momentum of the spectral energy distribution. A psychoacoustically based measure of the spectral energy distribution of a sound that takes fundamental processes of hearing into account is sharpness (Bismarck 1974a,b). An experiment was designed to determine whether listeners used the perceived sharpness to judge dissimilarities for each examined range using the original and synthetic piano sounds (Valenzuela 1996). For each tonal range (C2, C4 and C6), a simple synthetic piano sound was created to be as natural-sounding as possible, having the same linearly-decreasing sound level over time for each partial. The partials of the low range (C2) decreased in accordance with the average of the original sounds by 7 dB per second, and those of the middle range (C4) decreased by 10 dB per second.
In the high range (C6), a decrease in two steps was necessary to achieve a natural-sounding synthetic piano sound. According to the average of the original sounds, the partials decreased first by 73 dB per second, and later by 15 dB per second. Especially in this range, a typical ‘attack’ noise had to be superimposed onto the synthetic sound in order to obtain greater realism. The inharmonicity of the synthetic sounds was chosen according to the average inharmonicity of the corresponding original piano sounds. Finally, four synthetic piano sounds for each tonal range, having approximately the same sharpness as the original tones, were created by adjusting the spectral energy of each partial of the corresponding synthetic tone to those of the four original sounds. In this way, for each original piano sound (b, i, s, y) of each range examined (C2, C4, C6), a synthetic piano sound having the same sharpness (B, I, S, Y) was created. The synthetic sounds of a tonal range differed only in their spectral energy distribution and therefore in their sharpness. Listeners had no doubt about the synthetic tones being piano sounds but criticized them as being flat and boring. All pairwise combinations of these eight piano sounds per range, excluding comparisons of the same sounds, were presented in separate sessions for each tonal range to eight or nine normally-hearing test persons of ages 25 to 53 years, who had a good to extensive musical education. Each combination was presented eight times in a random order. After listening to a pair twice, subjects were asked to rate the dissimilarities perceived in this pair – relative to the differences heard in all other trials – on a scale of one (smaller dissimilarity) to nine (larger dissimilarity). In a preceding practice session, listeners could adapt perceived differences to the given rating scale.
Interpreting the dissimilarity judgments of each tonal range by means of a multidimensional scaling algorithm yields the geometric configurations shown in Figure 8.5.1a–c. A relationship between the distribution of sounds along the horizontal axis, namely dimension I in Figure 8.5.1a–c, and the spectral energy distributions of the corresponding stimuli, is obvious for the three tonal ranges. Original and synthetic piano sounds, having the same spectral energy distribution, are located in close proximity with respect to dimension I. The calculated correlations of -0.97 for the low, -0.92 for the middle and -0.86 for the high tonal range between the values of dimension I and the measured sharpness medians of the corresponding piano sounds confirm the relationship between sharpness and the horizontal axis. The negative signs of the correlations simply indicate that sounds on the positive side of axis I are less sharp than those on the negative side, that is, that sharpness decreases with increasing values of dimension I. The vertical axis appears to relate – at least for the middle and the high tonal range – to the temporal characteristics of the sounds, as original and synthetic piano sounds are clearly separated in this dimension II. Further investigations concerning the interpretation of dimension II in the low range brought further insights into which attributes are used by listeners to discriminate between piano sounds. However, these results would need to be discussed separately in greater detail.
These investigations lead to the following conclusions: Subjects are well able to distinguish different single piano tones of the same pitch and loudness. One of the most important auditory cues used to judge differences between piano sounds is their sharpness.
Therefore, a piano sound of any tonal range is to a large extent characterized by its sharpness. Considering that, in particular, the spectral frequencies of partials are invariant parameters of the source signal (as they are not corrupted through transmission), it is evident that the sharpness, which depends on the spectral frequencies, serves as an essential basis for evaluating differences between piano sounds.
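The analysis chain described above (dissimilarity ratings, a two-dimensional configuration, and a correlation between dimension I and sharpness) can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the scaling algorithm, and the reported stress values suggest a non-metric procedure, so classical (Torgerson) scaling is swapped in here for brevity. The function names are hypothetical, and any rating matrix or sharpness values passed in would have to come from the experiment.

```python
import numpy as np

def classical_mds(dissimilarities, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical (Torgerson) scaling, used here as a stand-in for the
    unspecified multidimensional scaling algorithm of the study.
    """
    d = np.asarray(dissimilarities, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities.
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

def pearson_r(x, y):
    """Pearson correlation, e.g. between dimension I and sharpness values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))
```

Note that the sign of each recovered axis is arbitrary, so only the magnitude of a correlation with dimension I is determined by the scaling itself; the negative signs reported above depend on how the axes were oriented in the figure.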
Fig. 8.5.1: Two-dimensional scaling solutions for distance estimates obtained from multidimensional scaling. For interpretations of labels see text. a: Configuration for the low tonal range C2, stress = 0.094; b: for the middle range C4, stress = 0.117; c: and for the high range C6, stress = 0.013.
8.5.2 Influence of sharpness on quality judgments of piano sounds Judging the quality of an instrument depends not only on many objective aspects – which themselves again may be multidimensional – but also on subjective preferences that are not dependent on sound quality, such as the appearance and color of an instrument. Further aspects that may influence quality judgments in general are, for example, the mechanics of the piano, the materials used for construction, the room acoustics that influence the interaction of played notes, the chosen piece and its interpretation by the pianist, specific sound properties, and tuning. Another important thing to note is that there is a significant difference when the person who is judging the quality of a piano is listening or playing. Obviously, when discussing the quality of an instrument, a differentiation between player-specific features, mechanical and feedback aspects, psychological and simply perceptual acoustic properties of the piano must be made. In considering only perceptive properties, there are still many aspects that may lead to different judgments. It is, however, also evident that some prominent basic properties of single notes must be fulfilled in order to make good quality judgments possible when putting the notes together in a musical piece. A piano whose single tones are of poor quality will not have good quality when the tones are combined. Which attributes of a single piano tone, then, are responsible for good quality judgment? It is plausible to assume that quality judgments are dependent only on such audible attributes that are used to distinguish between piano sounds. Therefore, it is reasonable to hypothesize that sharpness plays an important role in classifying the quality of piano tones, as sharpness turned out to be one of the most important cues used to judge differences between piano sounds. 
For an experimental exploration of this hypothesis, listeners must be able to judge the quality of single piano sounds. This of course first needs to be proven. An experiment was conducted with eight normally-hearing test persons, ages 22 to 52 years, having a good to extensive musical education, to first determine whether subjects are able to judge the quality of single piano tones. In separate sessions for three examined tonal ranges (C2, C4, C6), all pairwise combinations of four original piano sounds per range were presented, excluding comparisons of identical sounds. Each combination was presented six times in a random order. After listening twice to a pair, subjects had to decide which of the two sounds was of better quality. The average probability of a subject giving the same answer for a combination that is presented several times amounted to 80 % for the low, 82 % for the middle and 79 % for the high range. This shows that test persons are definitely able to judge the quality of single piano tones, even for the high range. The relative frequency of preference of a piano tone, with respect to its quality, is shown in Figure 8.5.2a for each tonal range. A comparison of the quality judgments in Figure 8.5.2a with the distribution of the corresponding piano tones along dimension I in Figure 8.5.1a–c reveals a relationship between the sharpness and the quality judgment for each tonal range. One way to examine this statement is to create synthetic piano sounds as explained in Section 8.5.1, that differ only in their spectral energy distribution and therefore in their sharpness, and to conduct quality judgments with these sounds. In this way, the influence of sharpness on quality judgments can be investigated. Repeating the above experiment using these synthetic piano sounds yielded the results shown in Figure 8.5.2b. The average probability of a subject’s answer remaining the same for a combination that is presented several times is about the same as computed for the original sounds, namely 83 % for the low, 90 % for the middle and 79 % for the high tonal range. As for the original piano sounds, the quality judgments for the synthetic tones of each tonal range (Fig. 8.5.2b) correspond to their distribution along dimension I in Figure 8.5.1a–c. Taking a closer look at these results reveals that for each range, original and synthetic piano tones on one side of dimension I (Fig. 8.5.1a–c) – corresponding to smaller sharpness values – are preferred with respect to their quality.
The following conclusions can be drawn: The sharpness of single piano sounds is not only a dominant cue for distinguishing between different tones, but also for rating their quality. A tendency of piano sounds with smaller sharpness values to be preferred was shown for each tonal range.
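The two quantities reported for the paired-comparison experiments, the relative frequency of preference and the repeatability of a subject's answers, could be computed along the following lines. The sketch is hedged in two respects: the text does not define how the "average probability of giving the same answer" was calculated, so the share of agreeing pairs of repeated presentations is used here as one plausible reading, and the data layout (tuples of two sound labels and the preferred label) is hypothetical.

```python
from collections import Counter
from itertools import combinations

def preference_frequency(choices):
    """Relative frequency with which each sound was preferred.

    choices: list of (sound_a, sound_b, winner) tuples, one per presentation.
    """
    wins, appearances = Counter(), Counter()
    for a, b, winner in choices:
        appearances[a] += 1
        appearances[b] += 1
        wins[winner] += 1
    return {sound: wins[sound] / appearances[sound] for sound in appearances}

def repeat_consistency(choices):
    """Share of pairs of repeated presentations that received the same answer.

    This definition is an assumption; the source does not spell it out.
    """
    by_combination = {}
    for a, b, winner in choices:
        by_combination.setdefault(frozenset((a, b)), []).append(winner)
    agree = total = 0
    for answers in by_combination.values():
        for first, second in combinations(answers, 2):
            total += 1
            agree += first == second
    return agree / total if total else float("nan")
```

With six presentations of one combination, as in the experiment, each combination contributes 15 pairs of repetitions to the consistency score.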
Fig. 8.5.2: Relative frequency of preference of a piano tone with respect to its sound quality, in comparison with three tones of other pianos, measured for three different tonal ranges. For interpretation of labels, see text in section 8.5.1. a: Single tones of four original pianos b, i, s and y. b: Single tones of four synthetic pianos (B, I, S, Y) which differ only in their spectral energy distribution and therefore in their sharpness.
8.6 Identification and segregation of multiple auditory objects Uwe Baumann
One of the most advanced signal-processing skills of the human auditory system is its ability to direct attentional processes to follow the sound of a selected acoustical source in an environment of multiple, simultaneously-present voices. For the purpose of building robust speech recognition systems or the advancement of hearing aids, it is desirable to implement signal processing strategies that can cope with an auditory environment consisting of a mixture of different sound sources and enhance or extract the desired information coming from one of them. Perception of polyphonic music is an example of this ability. Because of its highly systematic structure, polyphonic music was chosen as a model to investigate human listeners’ strategies for grouping and for obtaining information about musical voices. The procedure outlined in Figure 8.6.1 was implemented on a computer to separate polyphonic music into its original voices (Baumann 1995). A hierarchical combination of ear-related spectral analysis, psycho-acoustical weighting functions, and psychological elements and findings of Gestalt theory was employed in this process. Several independent stages contribute to abstraction and selection of meaningful contours of spectral components. The aim is the formation of components pertinent to auditory objects. The on-going sequence of auditory objects forms an auditory object pattern. Experiments were conducted to obtain thresholds above which listeners can hear out auditory objects separately (Baumann 1994b) and the results of the experiments were incorporated into the procedure.
Fig. 8.6.1: Flow chart of a model for identification and segregation of multiple auditory objects. Characters in circles refer to intermediate results displayed in Fig. 8.6.3 to 8.6.7.
A brief two-voiced music example (score in Fig. 8.6.2) demonstrates the functions of the procedure. The tone sequences were created on a programmable synthesizer. Each tone of the soprano voice consisted of three harmonics, that of the bass voice of six harmonics. After analog-to-digital conversion, a special ear-related spectral analysis (SPECTRAL TRANSFORM) with high time and frequency resolution was applied to the signal. Figure 8.6.3 shows schematically the absolute magnitude of the frequency-time spectrum of this example.
Fig. 8.6.2: Score of the short two-voiced example tune.
Fig. 8.6.3: Pseudo three-dimensional plot of the time-varying spectra derived with an ear-related frequency transformation simulating excitation patterns (Unkrig and Baumann 1993). The underlying time signal was generated with a programmable synthesizer playing the example tune (score in Fig. 8.6.2). The soprano sound consisted of three harmonics, the bass of six harmonics.
The next stage of the process (CONTOUR) includes a peak-picking algorithm and extracts the contours of the power spectra, omitting phase information. This yields a part-tone time pattern (PTTP, the sound level is indicated by line thickness). Compared to the spectral diagram, the PTTP (Fig. 8.6.4 left) is more easily readable, and the two voices are visually identifiable. Part tones closely related in time, frequency, and amplitude were linked together to form part-tone lines, as symbolized in Figure 8.6.4 (right), with different symbols for each line. A splitting of the sixth harmonic of the first bass tone occurs due to the onset of the second soprano note. The block labeled ACCENTUATION takes time-dependent aspects of the spectral pitch pattern into account.
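The CONTOUR stage and the subsequent linking of part tones can be illustrated with a minimal sketch. It is not the implementation of the model: the level floor, the linking tolerance and both function names are hypothetical, spectra are assumed to be given as lists of levels in dB per analysis channel, and the linking below considers proximity in time and frequency only, whereas the procedure described above also takes amplitude into account.

```python
def pick_peaks(levels_db, floor_db=-60.0):
    """Return channel indices of local maxima above floor_db in one frame."""
    peaks = []
    for k in range(1, len(levels_db) - 1):
        if (levels_db[k] > floor_db
                and levels_db[k] > levels_db[k - 1]
                and levels_db[k] >= levels_db[k + 1]):
            peaks.append(k)
    return peaks

def link_part_tones(frames, max_jump=1):
    """Link peaks of successive frames into part-tone lines.

    A peak continues a line if its channel index differs by at most
    max_jump from the line's last peak; otherwise it starts a new line.
    Each line is a list of (frame_index, channel_index) tuples.
    """
    lines = []
    open_lines = []  # lines that were continued in the previous frame
    for t, frame in enumerate(frames):
        peaks = pick_peaks(frame)
        still_open = []
        for line in open_lines:
            last_channel = line[-1][1]
            match = next(
                (p for p in peaks if abs(p - last_channel) <= max_jump), None
            )
            if match is not None:
                peaks.remove(match)
                line.append((t, match))
                still_open.append(line)
        for p in peaks:  # unmatched peaks start new part-tone lines
            new_line = [(t, p)]
            lines.append(new_line)
            still_open.append(new_line)
        open_lines = still_open
    return lines
```

In this sketch a line that finds no continuation in the next frame is closed, which is one way in which the splitting of a harmonic mentioned above could arise.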
Fig. 8.6.4: Left: Part-tone time pattern gained by contouring the time-varying spectra, with phase information lost. Sound level transformed to line thickness. Right: Result of linking part tones that are neighbours in time, amplitude and frequency. Related part tones form part-tone lines; their relationship is indicated with different symbols for each part-tone line.
Part-tone lines with similar onset times were combined in the next processing stage, labeled ONSET INTEGRATION. The result of this algorithm is called the auditory object pattern (AOP), since the AOP resembles the human capability of integrating simultaneously-occurring part-tone lines into a single percept. Figure 8.6.5 (left) displays the AOP of the simple example tune. Five auditory objects were detected and marked with different symbols. Due to the simultaneity of the last two notes, no segregation of the two voices occurred at this stage.
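The idea behind ONSET INTEGRATION can be sketched as a simple grouping of part-tone lines by onset time. The data layout (dictionaries with an `onset_ms` key), the function name and the tolerance value are assumptions made for illustration; the text does not state which tolerance the model uses.

```python
def integrate_onsets(part_tone_lines, tolerance_ms=20.0):
    """Group part-tone lines whose onsets lie close together in time.

    A line joins the current group if its onset is within tolerance_ms of
    the group's earliest onset; otherwise it opens a new group. Each group
    stands for one candidate auditory object.
    """
    groups = []
    for line in sorted(part_tone_lines, key=lambda item: item["onset_ms"]):
        if groups and line["onset_ms"] - groups[-1][0]["onset_ms"] <= tolerance_ms:
            groups[-1].append(line)
        else:
            groups.append([line])
    return groups
```

Two tones that start simultaneously end up in the same group under this rule, which mirrors the observation above that the last two notes of the example are not segregated at this stage.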
Fig. 8.6.5: Left: Part-tone lines with equal onset time and duration are connected to a cluster. The membership of a part-tone line is displayed by the usage of a common symbol. Right: Results of pitch calculation indicated with lines showing the estimated fundamentals. Note that due to the possibility of ambiguities, the last two tones are attached to two fundamentals.
Separation of simultaneous tones is accomplished by calculating the fundamental pitch of every auditory object, accepting ambiguous results whenever indicated (block PITCH). As a result, each of the two last tones of the example tune was attached to two pitch estimates. Figure 8.6.5 (right) shows the outcome of the pitch calculation algorithm for the five auditory objects obtained.
Fig. 8.6.6: Left: Rearranged cluster membership after segregation using the pitch information acquired with the pitch-calculation algorithm. Note that the harmonics of the last two tones have been distributed to soprano and bass. Right: Several harmonics from bass and soprano sound intermingle, due to their fundamental-frequency relationship. These collisions are detected and displayed in this figure.
Whenever ambiguous fundamental-pitch estimates were detected, the next processing step segregated the harmonics belonging to each pitch height. Because this process is able to segregate homophonically sounding voices, it was named HOMOPHONIC SEPARATION. Figure 8.6.6 (left) illustrates the separation of the last two tones. As visible with the last tone of Figure 8.6.6 (left), the third harmonic of the bass voice is intermingled with the fundamental of the soprano tone, and the sixth harmonic of the bass voice with the second harmonic of the soprano tone. If the soprano voice were to be extracted, the resulting bass tone would be missing these two harmonics and its timbre would be distorted. To avoid this, the next processing step (COLLISION DETECTION) searches for these intermingled harmonics and tries to share these collisions between the underlying auditory objects (Fig. 8.6.6 right). The final stage, termed SEQUENTIAL INTEGRATION, combines and links auditory objects to form a musical line or melody which is perceived by a human listener concentrating on hearing the desired voice, for example the bass melody. Figure 8.6.7 shows the extracted and restored voices, soprano voice on the right, bass voice on the left side. Nearly all harmonics have been properly attached to the two voices; only the sixth harmonic of the first bass tone is disrupted by the onset of the second soprano tone. An evaluation of the procedure with several examples of polyphonic music and speech sounds was performed (Baumann 1994a). The quality of the segmentation depended on the complexity of the material. Simple, two-voiced polyphonic music with low reverberation is segregated into single voices with only small changes of timbre, whereas time structure and melody are preserved.
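The COLLISION DETECTION step can be made concrete with a small sketch that lists the harmonics shared by two simultaneous tones. The function name, the relative tolerance and the fundamental frequencies in the usage note are hypothetical; the only property taken from the example above is that the soprano fundamental lies at three times the bass fundamental, with six bass and three soprano harmonics.

```python
def find_collisions(f0_a, n_harm_a, f0_b, n_harm_b, rel_tol=0.01):
    """Return (harmonic of tone a, harmonic of tone b) pairs that coincide.

    Two harmonics are treated as colliding when their frequencies differ
    by no more than rel_tol times the lower of the two frequencies.
    """
    collisions = []
    for i in range(1, n_harm_a + 1):
        for j in range(1, n_harm_b + 1):
            freq_a, freq_b = i * f0_a, j * f0_b
            if abs(freq_a - freq_b) <= rel_tol * min(freq_a, freq_b):
                collisions.append((i, j))
    return collisions
```

For a bass tone at an assumed 110 Hz with six harmonics and a soprano tone at 330 Hz with three harmonics, `find_collisions(110.0, 6, 330.0, 3)` reports the third bass harmonic colliding with the soprano fundamental and the sixth bass harmonic colliding with the second soprano harmonic, the same two collisions described in the text.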
Fig. 8.6.7: Left: Reconstructed bass voice. Interruption within the sixth harmonic of the first tone due to onset of the second soprano tone. Right: Reconstructed soprano voice. All harmonics have been correctly distributed.
8.7 The role of accentuation of spectral pitch in auditory information processing Claus von Rücker
8.7.1 Spectral pitch as a primary auditory contour The robustness of auditory information acquisition under everyday circumstances has been of great interest in research. Taking into consideration the manifold possibilities of interference from other sounds and of corruption in communication channels, this kind of robustness can only be accomplished by a sensory system that takes advantage of such physical signal parameters that (a) carry information about the sound sources and (b) remain essentially unchanged by transmission through a linear system. Only the time-variant spectral frequencies of the Fourier components comply with these requirements. In the auditory system, spectral frequencies are represented as a time-variant pattern of spectral pitches that can be verified in psychoacoustical experiments. Spectral pitch can be regarded as the auditory equivalent of visual contour; therefore, auditory perception depends essentially on the robust and rapid extraction of those primary contours to serve its purpose of collecting information about objects in the environment (Terhardt 1987, 1989, 1992).
8.7.2 Accentuation as a cue for segregation Accentuation is a sensory phenomenon that can be defined as an increase of the perceptual prominence of portions of the time-variant distribution of auditory excitation under certain conditions. These conditions can be specified as being caused by real-world events that raise new acoustic objects. From the point of view of information theory, it is obvious that only temporal changes of auditory excitation can convey information; therefore it is only logical to emphasize any transient portion of the signal. The cues provided by accentuation are evaluated by the subsequent segregation process to enhance its reliability under difficult conditions. During the course of evolutionary development, the auditory system has ‘learned’ to decide whether or not a set of primary contours represents a single object. The conditions for the decision to integrate a primary contour into a common percept are given by the physical properties of our acoustical environment. Typically, a sound emitted by a single source, e.g. a speaker or a musical instrument, consists of a number of time-varying harmonic partials with a common onset, which persist or gradually fade away. Hence, a number of criteria can be deduced for spectral pitches belonging to a single auditory object:
• Common onset time and duration
• Harmonicity
• Continuity in time (smooth connection of breaks)
• Consecutive pitches with similar salience
Spectral pitches that do not meet these criteria are not fused into a single percept and remain segregated. They can be heard out as isolated acoustic objects and often are accentuated.
8.7.3 Physical signal parameters evoking accentuation The physical signal parameters on which accentuation depends can be basically derived from the inversion of the above-mentioned set of criteria. Usually, several conditions for accentuation are given simultaneously within a natural acoustical scene. A number of signals, where the spectral pitch pertinent to a single partial is accentuated, are depicted in Figure 8.7.1. In the left column, temporal cues evoke accentuation of the spectral pitch pertinent to the third partial of a harmonic complex sound: Onset asynchrony (a), abrupt increase in level (b) and amplitude modulation (c). In the right column, purely spectral cues evoke accentuation: Static inharmonicity (g), periodic inharmonicity (h) and increased intensity of the partial, equivalent to a peak in the spectral envelope (i). In the centre column, some examples of spectro-temporal effects are given: Temporal contrast effect for a harmonic tone (d) and for noise (e), highlighting a spectrally-local increase in amplitude, and a sinusoidal tone of about the same frequency as the third partial (f), drawing attention to the third harmonic and thus accentuating its pitch. Due to the variety of possible signal conditions and combinations, this list of examples is meant to illustrate the principle rather than to be complete. In the following two sections, experiments are described that examine the significance of accentuation for speech perception and the thresholds of some signal parameters for the occurrence of accentuation.
Fig. 8.7.1: Temporal (left column), spectro-temporal (centre column) and spectral (right column) cues eliciting accentuation of spectral pitch. Temporal cues: (a) onset asynchrony, (b) abrupt increase in level, (c) amplitude modulation. Spectro-temporal cues: (d) spectral gap or valley in preceding sound, (e) band-stop noise causing a distinct pitch perception within the subsequent white noise, (f) single sinusoidal tone, drawing attention to pitch of third harmonic. Spectral cues: (g) static inharmonicity, (h) periodic inharmonicity caused by frequency modulation, (i) increased intensity of a partial, equivalent to a pronounced peak in the spectral envelope.
8.7.4 Improved perception of vowels by accentuation Speech perception depends to a great extent on the robust perception of vowels and voiced sounds. A temporal contrast effect, usually referred to as ‘auditory enhancement’ (cf. Viemeister 1980), leads to an accentuation of those portions of the signal where a spectrally local increase in amplitude occurs. In contrast to steady-state conditions, such an increase may elicit the perception of prominent spectral pitches which otherwise are much less salient or even imperceptible. Summerfield et al. (1984) showed that a segment of sound that is devoid of peaks in the spectral envelope is perceived as a vowel when it is preceded by a sound with a spectral envelope complementary to that of the vowel itself. Due to the large differences in the spectral envelope of successive segments in fluent speech, it can be assumed that any segment will act as an adaptor which may have considerable influence on the perception of the subsequent segment, e.g. a vowel. We carried out vowel-recognition experiments with synthetic vowels as adaptors and target stimuli (Wartini 1995). The five German vowels /a/, /e/, /i/, /o/ and /u/¹ were digitally synthesized using 50 harmonics of 100 Hz. The physical prominence of formants could be reduced by gradually smoothing out the spectral envelope to a spectrum with a slope of -20 dB/decade. Samples of spectral envelopes at various degrees of smoothing are shown for the vowels /a/ and /u/ in Figure 8.7.2. When presented in isolation, i.e. without an adaptor, the original vowels were recognized in 100 % of the presentations. With the smoothed versions, the recognition rate decreased with a rising degree of smoothing (Fig. 8.7.3).
Fig. 8.7.2: Spectral envelopes of the vowels /a/ and /u/ for four different degrees of smoothing. 0 % means ‘original vowel’, 100 % means ‘smoothed out to a spectrum with a slope of -20 dB/dec’.
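The gradual smoothing of a vowel's spectral envelope can be sketched as an interpolation between the original harmonic levels and a straight line of -20 dB/decade. The interpolation rule is an assumption: the text gives only the two end points (0 % for the original vowel, 100 % for the sloping spectrum), so a linear mix in dB is used here, the slope line is anchored at 0 dB at the first harmonic, and the function name is hypothetical.

```python
import math

def smoothed_envelope_db(original_db, degree, slope_db_per_decade=-20.0):
    """Mix harmonic levels (dB) with a sloping line; degree runs from 0 to 1.

    original_db[k-1] is the level of harmonic k. Since the harmonics are
    integer multiples of the fundamental, log10(k) is the distance of
    harmonic k from the fundamental in decades.
    """
    smoothed = []
    for k, level in enumerate(original_db, start=1):
        slope_level = slope_db_per_decade * math.log10(k)
        smoothed.append((1.0 - degree) * level + degree * slope_level)
    return smoothed
```

For the stimuli described above, `original_db` would hold the levels of the 50 harmonics of 100 Hz, and `degree` would correspond to the percentage of smoothing divided by 100.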
Fig. 8.7.3: Identification of vowels with smoothed spectral envelope. The decline of recognition rate is smaller for the vowels /a/ and /e/, because most subjects tend to interpret a heavily-smoothed spectrum as /a/ or /e/. Results averaged over four subjects, 25 monotic presentations per vowel and per degree of smoothing, at 60 dB SPL and 200 ms duration.
¹ Notation according to the International Phonetic Association.
Table 8.7.1: Mean recognition rates of target vowels, averaged over three degrees of smoothing. Each of the 75 adaptor-vowel combinations was presented 25 times to four subjects at a level of 60 dB SPL.

adaptor    recognition rate in % for smoothed vowels
vowel      /a/      /e/      /i/      /o/      /u/
/a/        21.0     64.3     59.7     37.3     58.0
/e/        94.3      1.3     40.0     34.0     95.7
/i/        77.0     58.7      0.3     64.7     44.0
/o/        94.3     17.3     23.0     16.3     62.7
/u/        78.3     60.3     30.0     36.0     13.7
none       81.7     54.0     47.7     58.7     55.3
In another experiment, the five original vowels of 500 ms duration were used as adaptors, each followed by a pause of 10 ms and a smoothed target vowel of 200 ms duration. The degree of smoothing of the target vowel was chosen such that either a small, medium or great impact on the recognition rate was obtained (40, 60 and 75 % for the vowels /e/, /i/, /o/ and /u/; 40, 90 and 95 % for /a/). The results in Table 8.7.1 show that recognition rates vary greatly for different adaptor vowels, revealing a strong impact of the adaptor on the perception of the target vowel. A close look at the various spectral envelopes of the test-stimulus combinations reveals that recognition rate increased in those cases where the formant structure of the adaptor is complementary to that of the target vowel. To verify the significance of the differences in recognition rates, a chi-square-test was performed on the data. Table 8.7.2 displays those vowel percepts whose identification is significantly improved when preceded by the adaptor vowel named in the column, compared to the adaptor depicted at the beginning of the respective line. The results of the second experiment demonstrate that the identification of synthesized vowels can be heavily influenced by adaptor vowels. The rather long duration (500 ms) and the stationary characteristic of these adaptors permit the transfer of these findings to fluent natural speech only with certain reservations. Nonetheless, we may conclude that with high probability, the time course of the speech signal plays an important role in the robust identification of speech sounds. A positive slope of the signal amplitude within spectral channels of the auditory system definitely produces accentuation of spectral pitch and thus can improve the perception of voiced speech sounds.
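The kind of significance test mentioned above can be illustrated with a 2×2 chi-square statistic comparing the recognition of one target vowel after two different adaptors. Several details are assumptions rather than statements from the text: that a 2×2 Pearson test without continuity correction was used, and that each cell of Table 8.7.1 pools 300 trials (25 presentations × 4 subjects × 3 degrees of smoothing). The counts in the usage note are the table's percentages converted back to rounded counts under that assumption.

```python
# Critical value of the chi-square distribution for alpha = 0.001, 1 degree of freedom.
CRITICAL_0_001_DF1 = 10.828

def chi_square_2x2(correct_1, n_1, correct_2, n_2):
    """Pearson chi-square statistic for two proportions (no continuity correction)."""
    a, b = correct_1, n_1 - correct_1
    c, d = correct_2, n_2 - correct_2
    n = n_1 + n_2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator
```

For the target /a/, adaptor /e/ gives 94.3 % and adaptor /a/ gives 21.0 %, i.e. about 283 and 63 correct responses out of 300. `chi_square_2x2(283, 300, 63, 300)` exceeds `CRITICAL_0_001_DF1` by a wide margin, in line with the corresponding entry of Table 8.7.2.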
Table 8.7.2: Significant (α = 0.001 except two cases) differences between recognition rates of smoothed vowels. The recognition rate of the target vowels was improved when preceded by one of the vowels named in the columns. For example, see first line, second column: The identification of the target vowels /a/ and /u/ was improved when preceded by the adaptor /e/ compared to adaptor /a/.

compared     target vowels with increased recognition rate after adaptor
to adaptor   /a/        /e/        /i/        /o/        /u/        none
/a/          –          /a,u/      /a,o/      /a/        /a/        /a,o/
/e/          /e,i/      –          /e,o/      /e/        /e/        /e,o/
/i/          /i,u/      /a,i,u/    –          /a,i,u/    /i/        /i,u/
/o/          /e,i,u/    /i,o,u/    /e,o/      –          /e,o/      /e,i,o/
/u/          /i,u/      /a,i,u/    /o,u/      /a,u/      –          /i,o,u/
none         /e,i/      /a,u/                 /a/                   –
8.7.5 Thresholds for the accentuation of part tones in harmonic sounds With regard to the design of a computer model for the segregation and integration of acoustical objects in the automatic analysis of music, Baumann (1994, 1995, 1997) examined the perception of melodic patterns by accentuating certain partials within sequences of harmonic sounds. If certain cues are available, a single partial of a harmonic sound can be perceived as an isolated acoustical object. Figure 8.7.4 illustrates this schematically. Depending on the magnitude of changes of level (∆L), frequency (∆f), onset time (∆t) and frequency modulation (M) applied to a spectral component, the thus accentuated partial can be heard out individually. To determine the thresholds of the above mentioned parameters for segregation, the following experiment was performed. The task was to identify a background melody that was produced by accentuated partials of harmonic sounds of a foreground melody. Two foreground melodies consisting of six harmonic complex tones (20 harmonics, 200 ms duration and pauses) were used. Either one or a group of three harmonics were modified such that the accentuated components could elicit the perception of a background melody. Stimuli were diotically presented to five subjects in random order. Each background melody was presented 20 times, preceded by the respective unmodified stimulus. The partials were modified stepwise in one of the following alternatives: Onset time ∆t (-10..30 ms), sound pressure level ∆L (-10..10 dB), relative frequency ∆f (-2..2 %) and relative frequency modulation M (0.5..6 %, modulation rate 5 Hz). The subjects had to indicate the contour of the background melody (choice of: up, down, up-down, down-up, no background contour) by pressing a labelled key on a computer keyboard; no feedback was given.
Fig. 8.7.4: Signal parameters evoking accentuation of a single part tone in harmonic sounds.
Table 8.7.3 shows the results of the melody-recognition experiments. The threshold for recognition (62.5 % correct responses) was determined by interpolating between the measured data points. According to these results, a background melody is perceived if one of the following conditions applies:
• Onset time asynchrony of at least +20 ms (trailing onset) or -4 ms (leading onset).
• Increase in level of at least 3–4 dB.
• Static mistuning of at least 1–1.5 %.
• Periodic mistuning of at least 1.5 %.
Under these circumstances, accentuation of the modified partials can be regarded as a result of failed integration; the perceived pitches of the background melody are not being fused into the same percept as the pitches of the harmonics of the foreground tune.
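The threshold determination described above (finding the 62.5 % point by interpolating between measured data points) can be sketched as follows. This is only an illustration: the function name `interpolate_threshold`, the choice of linear interpolation, and the sample scores are assumptions and are not taken from the original experiments.

```python
def interpolate_threshold(levels, percent_correct, criterion=62.5):
    """Return the parameter value at which the score first reaches `criterion`.

    levels          -- tested parameter values in ascending order (e.g. level change in dB)
    percent_correct -- measured percentage of correct responses for each level
    Linear interpolation is an assumption; the text only states that the
    threshold was interpolated between the measured data points.
    """
    points = list(zip(levels, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            # Straight line between the two data points that bracket the criterion.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None  # criterion not reached within the tested range


# Hypothetical data: level increments in dB and recognition scores in percent.
example_levels = [0, 2, 4, 6]
example_scores = [20.0, 40.0, 65.0, 90.0]
print(interpolate_threshold(example_levels, example_scores))
```

With these made-up scores the criterion is crossed between the 2 dB and 4 dB steps, so the estimated threshold lies between those two values. A return value of `None` corresponds to a condition in which no threshold could be determined.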
Table 8.7.3: Recognition thresholds for the correct identification of a background melody hidden within the harmonics of a foreground melody. Positive sign of ∆t means that the modified partials trail the onset of the harmonic tones of the foreground melody.
                          | 1 harmonic changed            | 3 harmonics changed
parameter                 | ∆t    | ∆L   | ∆f    | M     | ∆t    | ∆L   | ∆f    | M
recognition threshold (+) | 20 ms | 4 dB | 1.5 % | 1.5 % | 25 ms | 3 dB | 1.5 % | 1.5 %
recognition threshold (–) | 4 ms  | X    | 1 %   |       | 4 ms  | X    | 1 %   |
8.7.6 Accentuation of pitch – discussion and outlook
The above considerations and experimental results confirm the importance of accentuation for auditory processing in the context of a perception model that depends on the robust extraction of a pattern of primary contours. Moreover, accentuation effects play an important role for theories of pitch perception. The perceptibility and salience of the pitches of time-variant sounds can be particularly well understood by taking accentuation into account. Only if the spectral pitch pattern reflects all fundamental properties of known psychoacoustic data can one expect to succeed in modelling the formation of virtual pitch. Virtual pitch depends on spectral pitch (Terhardt 1970) and is assigned to sets of primary contours as an acoustic object identifier (cf. Terhardt 1989). Thus, a well-established theory of pitch perception can contribute to solving the problem of sound-source separation. Any model of the auditory system that is to explain how the complex task of analyzing the acoustic environment actually works must account for effects such as accentuation that can be psychoacoustically verified. One has to face the complexity and diversity of known psychoacoustic facts in order to improve the low-level representations used by auditory models and thus to better understand and explain the astonishing performance and reliability of the auditory system.
9 Hearing impairment: Evaluation and rehabilitation
Karin Schorn and Hugo Fastl
Acoustical communication is the fundamental prerequisite for a human society to come into existence. The receiver for acoustical signals is the ear, and acoustical stimuli only lead to adequate hearing sensations when the hearing system is fully functional. From the very beginning, the Special Research Area 204 was also clinically oriented, i.e. some of the research concentrated on developing different tests for the diagnosis of patients with different conductive and sensorineural hearing disorders. Various psychoacoustic tests were developed to differentiate between cochlear and retrocochlear impairment and to determine the causes of a discrimination loss. These tests include amplitude resolution, temporal resolution, temporal integration, frequency resolution (tuning curves) and loudness scaling. Moreover, objective tests such as otoacoustic emissions and acoustic evoked potentials were tested, especially for use with uncooperative children. Furthermore, the complex pattern of speech recognition in noisy surroundings was investigated and rehabilitation with hearing aids optimized, with the intention of improving acoustical communication in hearing-impaired patients. The following sections present the audiological research program in detail.
9.1 Psychoacoustic tests for clinical use
9.1.1 Amplitude resolution – ∆L-test
A fundamental property of hearing is the ability of the auditory system to discriminate intensity differences. The most widely used clinical test for evaluating the capacity of the ear to discriminate intensity differences is the SISI test. In this test, a pure tone is presented to the patient at a specified sensation level and an intensity increment of 1 dB is superimposed upon the steady-state tone at periodic intervals. In the Special Research Area 204, a technique of measuring the discrimination of level differences using pulsed tones was developed. Alternating tone bursts, each of 500 ms duration but of different sound pressure level, were separated by 200 ms pauses. Level discrimination was determined in subjects with normal hearing and in patients with conductive hearing loss, with cochlear hearing loss caused by noise, ototoxic medication, Menière’s disease, sudden deafness or presbyacusis, and with retrocochlear hearing impairment (Fastl and Schorn 1981, 1984, Schorn 1993c, Zwicker and Fastl 1990).
The results show that most of the patients with a conductive or a cochlear hearing impairment indicate level differences that are almost the same as those of normal-hearing subjects. The median value of the just-perceptible level differences, ∆L, for normal-hearing subjects is 1.5 dB between 250 Hz and 8 kHz. The patients with a cochlear hearing loss indicate a level discrimination of about 2 dB in the corresponding frequency range, but the interquartile ranges overlap (Fig. 9.1.1; Schorn 1981). Using the SISI test, patients with a cochlear hearing loss are able to perceive increments of 1 dB.
Fig. 9.1.1: Level difference thresholds ∆L of 1: Normal-hearing subjects and of patients suffering from 2: Conductive hearing loss; 3: Toxic impairment; and 4: Noise-induced hearing loss.
A marked increase in the ∆L value was found in all patients with retrocochlear hearing impairment caused by acoustic neuroma, brain injury, apoplectic stroke, cerebral haemorrhage, and multiple sclerosis (Schorn 1981, Fastl and Schorn 1984, Schorn and Fastl 1984). The median values of ∆L varied between 4.5 and 5 dB, i.e. about 3 dB higher than the normative values, and were in individual cases much higher (Fig. 9.1.2). The pulsed-tone technique for level discrimination developed in the collaborative research centre 204 therefore represents a powerful diagnostic tool for detecting retrocochlear impairments (Schorn and Stecker 1994a, Schorn 1997).
Since this method avoids the problems associated with modulated tones as used in the SISI test, the ∆L-test can also be applied in cases of small hearing loss and produces highly reliable results.
Fig. 9.1.2: Level difference thresholds ∆L of a patient with an acoustic neuroma.
9.1.2 Frequency selectivity – psychoacoustical tuning curves
A number of studies have demonstrated that the discrimination processes involved in the perception of speech and music require the ability to discriminate between frequencies. Based on the classical masking pattern, which describes simultaneous spectral tone-on-tone masking, the method of measuring a psychoacoustical tuning curve was developed. The tuning curve is defined as the level of a masking tone that is necessary to just mask a test tone, as a function of the frequency of the masker. For clinical use, a simplified masking device was developed, measuring in a stepwise fashion only three points below and three points above the test-tone frequency. In most clinical cases, data from both the low-frequency range and the high-frequency region are needed. Therefore, the apparatus provides one test frequency of 500 Hz with masker frequencies of 215, 390, 460, 540, 615 and 740 Hz and a second test frequency of 4000 Hz, masked by 1.72, 3.12, 3.68, 4.32, 4.92, and 5.92 kHz tones (Schorn and Zwicker 1990).
For quantitative comparison, a frequency resolution factor (FRF) was defined. According to the frequency representation along the cochlear partition, it is advisable to express the slopes in terms of dB/Bark. Thus, the frequency selectivity can be characterized quite simply by the sum of the two slopes around the 500 Hz and the 4000 Hz test tone:
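\[
\mathrm{FRF}_{500} = \frac{L_{M,215} - L_{M,460}}{2.5\ \mathrm{Bark}} + \frac{L_{M,740} - L_{M,540}}{1.5\ \mathrm{Bark}}
\]
\[
\mathrm{FRF}_{4000} = \frac{L_{M,1720} - L_{M,3680}}{5\ \mathrm{Bark}} + \frac{L_{M,5920} - L_{M,4320}}{1.7\ \mathrm{Bark}}
\]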
The frequency resolution factor of normal-hearing subjects is 34.6 dB/Bark (FRF500) and 34.4 dB/Bark (FRF4000). A reference value of 34.5 dB/Bark is employed in order to differentiate between normal frequency resolution (FRF = 1) and that of the various patient groups. Psychoacoustical tuning curves were determined in normal-hearing subjects and in different groups of patients with hearing disorders. Conductive hearing loss does not influence frequency selectivity except in cases of otosclerosis. Otosclerotic patients show a flatter slope when an additional sensorineural hearing loss can be established. The tuning curve data of patients with cochlear impairment, due to problems such as noise-induced hearing loss, hearing loss due to toxic effects, sudden deafness, presbyacusis, Menière’s disease, and degenerative hearing loss, indicate that frequency resolution is reduced by 50 %, especially in the range of greater hearing loss (Schorn 1990a, 1993c, 1997). A strong reduction of the frequency selectivity was demonstrated in patients with Menière’s disease (FRF500 = 0.30), even in patients with only a slight hearing loss (Fig. 9.1.3).
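The two FRF definitions can be evaluated directly from the masker levels at the outer and inner masker frequencies. The sketch below is illustrative only: the function names and the example levels are hypothetical, and dividing by the 34.5 dB/Bark reference to obtain a relative FRF (with 1 meaning normal) is an assumption about how values such as FRF500 = 0.30 are to be read.

```python
REFERENCE_FRF_DB_PER_BARK = 34.5  # reference value quoted in the text


def slope_sum(l_low_far, l_low_near, l_high_far, l_high_near,
              low_width_bark, high_width_bark):
    """Sum of the lower and upper tuning-curve slopes in dB/Bark."""
    low_slope = (l_low_far - l_low_near) / low_width_bark
    high_slope = (l_high_far - l_high_near) / high_width_bark
    return low_slope + high_slope


def frf_500(levels):
    """FRF around 500 Hz; `levels` maps masker frequency (Hz) to masker level (dB)."""
    return slope_sum(levels[215], levels[460], levels[740], levels[540], 2.5, 1.5)


def frf_4000(levels):
    """FRF around 4000 Hz; `levels` maps masker frequency (Hz) to masker level (dB)."""
    return slope_sum(levels[1720], levels[3680], levels[5920], levels[4320], 5.0, 1.7)


def relative_frf(frf_db_per_bark):
    """Assumed normalization: FRF divided by the reference value."""
    return frf_db_per_bark / REFERENCE_FRF_DB_PER_BARK


# Hypothetical masker levels (dB) for the 500 Hz test tone.
example_levels_500 = {215: 75.0, 460: 50.0, 540: 48.0, 740: 84.0}
example_frf = frf_500(example_levels_500)
print(example_frf, relative_frf(example_frf))
```

The masker frequencies at 390, 615, 3120 and 4920 Hz are measured by the device but do not enter the two formulas, so they are not needed as inputs here.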
Fig. 9.1.3: Simplified tuning curves in the frequency ranges of 500 Hz and 4 kHz for patients with Menière’s disease (crosses). The frequency resolving power is greatly decreased at both frequencies compared to tuning curves in normal subjects (open circles).
In the presence of background noise, the frequency resolution was reduced by approximately 20 % in normal-hearing subjects. Comparison of frequency resolution between the tests without and with background noise demonstrates that the presentation of background noise further reduces the FRF in the hearing-impaired groups. The measurement of frequency resolution is also important when special medical expert opinions have to be procured (Schorn 1989, Eisenmenger and Schorn 1984).
9.1.3 Frequency discrimination – ∆ƒ measurement
In addition to the measurement of frequency selectivity by means of tuning curves, frequency discrimination can be investigated using the ∆f-test (Zwicker and Fastl 1990). In this test, the patient is presented with two tones of different frequency, each having a duration of 500 ms and an intervening pause of 200 ms. Our experience shows that the test is not suitable for most patients with an ascending or descending hearing threshold, because they are unable to differentiate between difference limens of frequency and of intensity.
9.1.4 Temporal integration
When sound bursts are shortened to less than 200 ms, their level must be raised to remain audible. This threshold elevation for short tones applies in this form only to normal-hearing subjects. For testing temporal integration in patients with hearing impairment, a simple device for clinical use was developed (Fastl 1984e) to measure the hearing threshold at eight different frequencies using tones of 300 ms, 100 ms, 30 ms, and 10 ms duration. For patients with a sensorineural hearing loss, temporal integration often followed a distinctly different course compared to that of subjects with normal hearing. Normal-hearing persons show a threshold increment of about 10 dB when the duration is reduced by a factor of 10, from 100 ms to 10 ms. For hearing-impaired listeners, this holds true only at low frequencies around 500 Hz. At high frequencies around 4 kHz, however, hearing-impaired patients show an increment in level of only about 4 dB over the same range of duration. This result is obtained with the methods of bracketing and adjustment as well as with tracking (Fastl 1987e). In extreme cases, the same threshold may be obtained by hearing-impaired listeners for sounds of 2 ms and 200 ms duration. We were able to show that a change of temporal integration does not always depend on the degree of the hearing loss, nor on the diagnosis of the sensorineural impairment (Schorn and Stecker 1994a, Schorn 1997). It varies so much from patient to patient that the test is not suitable for differential diagnosis. The application of the temporal integration test has already been recommended in hearing aid fitting for calculating the gain to be used at different frequencies (Schorn 1986, Schorn and Stecker 1994b).
In a series of experiments, we examined the hypothesis that the reduced temporal integration of hard-of-hearing persons is due to the elevated thresholds. By presenting appropriately-shaped masking noises, a hearing loss was simulated in normal-hearing persons. While temporal integration was reduced in persons with a hearing loss, the normal-hearing persons with simulated hearing loss showed normal temporal integration (Florentine et al. 1988). Hence, the reduced temporal integration is not only due to elevated threshold values.
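The amount of temporal integration in this section is expressed as the threshold increment when the tone duration is shortened from 100 ms to 10 ms (about 10 dB for normal hearing, about 4 dB at 4 kHz for hearing-impaired listeners). A minimal sketch of that arithmetic follows; the function names and the threshold values are hypothetical and only chosen to reproduce the two increments quoted in the text, and the least-squares slope is an additional illustration rather than a procedure described in the source.

```python
import math


def increment_100_to_10_ms(thresholds_by_duration):
    """Threshold increment (dB) when the tone is shortened from 100 ms to 10 ms."""
    return thresholds_by_duration[10] - thresholds_by_duration[100]


def slope_db_per_decade(thresholds_by_duration):
    """Least-squares slope of threshold (dB) versus log10(duration in ms).

    A negative slope means the threshold rises as the tone gets shorter.
    """
    xs = [math.log10(d) for d in thresholds_by_duration]
    ys = list(thresholds_by_duration.values())
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical thresholds (dB SPL) at the four durations (ms) used by the device.
normal_hearing = {300: 10.0, 100: 12.0, 30: 17.0, 10: 22.0}
impaired_4khz = {300: 50.0, 100: 51.0, 30: 53.0, 10: 55.0}

print(increment_100_to_10_ms(normal_hearing), increment_100_to_10_ms(impaired_4khz))
print(slope_db_per_decade(normal_hearing), slope_db_per_decade(impaired_4khz))
```

A shallower slope (closer to zero) corresponds to reduced temporal integration.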
9.1.5 Temporal resolution
Reduced speech discrimination in patients with inner ear hearing damage may be influenced not only by a poor frequency selective capacity, but also by a reduced temporal resolution. This is because speech is coded in time by the temporal structure of the syllables. A simplified method for measuring temporal resolution was developed, which was found to be very acceptable to all patients (Zwicker and Schorn 1982). It is based on masking-period patterns, but measures only their maximum and minimum using continuous and square-wave amplitude-modulated masking noises. With test-tones of 500 ms duration, the threshold is measured without a masker (THS), with a modulated masker (MOD), or with a continuous masker (CONT). Test tones and filtered masking noises are presented at 500, 1500, and 4000 Hz. For quantitative comparisons, a temporal resolution factor (TRF) is defined (Zwicker 1985b, 1986g) as a ratio between the level differences of the three subjectively-measured values, i.e.
\[
\mathrm{TRF} = \frac{L_{\mathrm{CONT}} - L_{\mathrm{MOD}}}{L_{\mathrm{MOD}} - L_{\mathrm{THS}}}
\]
where $L_{\mathrm{THS}}$, $L_{\mathrm{MOD}}$ and $L_{\mathrm{CONT}}$ are the levels measured without a masker, with the modulated noise, and with the continuous noise, respectively.
For normal-hearing subjects, this factor is about 1. The results show that patients with a sensorineural hearing loss (caused by noise-induced hearing loss, presbyacusis, sudden deafness, ototoxic hearing loss, cervical hearing loss, Menière’s disease, or progressive degenerative hearing loss) have a reduced temporal resolution at least in one frequency range and mostly in all three frequency ranges tested (Fig. 9.1.4; Schorn 1990a, 1993).
Fig. 9.1.4: Temporal resolution diagrams from patients with toxic hearing loss (closed circles and continuous line) compared with the temporal resolution for normal hearing persons (closed circles, dashed line). The TRF of the toxic-impaired ears decreased to 0.3 in the 4 kHz range.
Our data indicate that there is no direct correlation between hearing loss and the reduced temporal resolution factor. The most reduced TRF, down to 0.2, was found in patients with a retrocochlear hearing impairment (Schorn and Zwicker 1985). Nevertheless, no clear diagnostic conclusions can be drawn from an impaired temporal resolution capacity. In each of the groups, individual cases could be observed with a normal as well as with an impaired temporal resolution capacity. As poor speech discrimination does not always correspond with a reduction in temporal resolution, the TRF was determined under background noise conditions. Surprisingly, for normal-hearing persons, the temporal resolution factor was found to increase by a factor of 3 at all frequencies (Fig. 9.1.5).
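The temporal resolution factor defined in this section is a simple ratio of two level differences and can be computed as sketched below. The function name and the example thresholds are hypothetical; the values are chosen only to give a TRF of 1 for the "normal" case and a clearly reduced TRF for the "impaired" case.

```python
def temporal_resolution_factor(l_ths, l_mod, l_cont):
    """TRF = (L_CONT - L_MOD) / (L_MOD - L_THS), all levels in dB."""
    denominator = l_mod - l_ths
    if denominator == 0:
        # The modulated masker produced no masking at all; the ratio is undefined.
        raise ValueError("L_MOD equals L_THS; TRF is undefined")
    return (l_cont - l_mod) / denominator


# Hypothetical test-tone thresholds in dB at one test frequency.
trf_normal = temporal_resolution_factor(l_ths=10.0, l_mod=40.0, l_cont=70.0)
trf_impaired = temporal_resolution_factor(l_ths=50.0, l_mod=65.0, l_cont=70.0)
print(trf_normal, trf_impaired)
```

In the second example the elevated threshold in quiet and the small gap between the modulated and continuous conditions both pull the ratio well below 1.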
Fig. 9.1.5: Medians and interquartile ranges of simplified temporal resolution diagrams at 0.5 and 4 kHz from 153 normal hearing subjects. Lower curves, without background noise, show a TRF of 1. The upper curves, measured with background noise show a TRF of 3 at both frequencies.
This demonstrates a remarkable improvement of temporal resolution performance in normal-hearing persons for noise-disturbed situations. Also, in all patients with sensorineural hearing loss of diverse etiology, the temporal resolution increased under background noise conditions. The improvement is, however, much smaller than that of normal-hearing subjects (Zwicker 1985b, Schorn and Zwicker 1989). The findings indicate that in patients with a sensorineural hearing loss, the temporal resolution should also be studied in the presence of background noise for estimating the result of fitting a hearing aid. After a special device was constructed for testing temporal resolution in free-field conditions (Zwicker 1986g), we showed that hearing aids can influence the temporal resolution without and with background noise in very individual ways (Brügel and Schorn 1993b,c, 1994). This effect is evident even when the insertion gain and the speech intelligibility with background noise yield no differences between the hearing aids.
9.1.6 Melodic pattern discrimination test
Besides the diagnosis of peripheral hearing disorders, the clinical assessment of central auditory function is an extremely interesting and challenging area of audiology. A great deal of basic and clinical research is needed to evaluate special tests. The melodic pattern discrimination test (MPDT) was developed to evaluate central hearing disorders not only in adults, but also in children with learning disabilities (Baumann 1995). Basically, the patient has to discriminate accentuated partial tones in a melody of complex tones (Baumann 1994a,b). A sequence consisting of six harmonic complex tones is used to form a foreground melodic figure. One harmonic in each complex tone is altered in such a way that the modified component, depending on the amount of adjustment, can elicit the simultaneous perception of a second (background) melody. The modification of the harmonic partial tone can be done in three ways: increase of level, detuning or onset-time asynchrony. The task of the patient is to detect the contour of the background melody, which can be ascending, descending or alternating: first three tones ascending, then three tones descending or vice versa. Consequently, the auditory short-term memory as well as the individual time, frequency and intensity resolution can be assessed.
9.2 Objective tests for clinical use
9.2.1 Delayed evoked otoacoustic emissions (DEOAE) for screening
The early detection of hearing disorders in children is strongly recommended, because damage in the auditory periphery can induce irreversible disorders in auditory information processing (Schorn 1990b, Schorn and Stecker 1994a). As currently-used tests, such as the elicitation of reflex responses (APR, startle reflex) and behavioural observations, depend on the vigilance of the child, objective screening measurements are required. A suitable screening test is the measurement of delayed evoked otoacoustic emissions (DEOAE), which can be recorded in almost 100 % of normal-hearing subjects, regardless of age and gender. They are not measurable in ears with a conductive or sensorineural hearing loss of more than 25 dB. A device to measure OAEs was developed, with several features adjusted for small children (Zwicker and Schorn 1990). A small and robust probe was produced that fits comfortably even into the small diameter of a neonate’s ear canal. The stimulus is either a half cycle or one full cycle of a 1.5 kHz oscillation. The stimulus repetition rate was set to 40 Hz. Small electret transducers, normally used as microphones, were integrated as transmitters to produce sound levels 10 or 34 dB above threshold, with minimal distortions. A band-pass filter with cut-off frequencies of 0.7 and 3 kHz and slopes of 12 dB per octave was added to reduce the noise level caused by muscle activity and the blood supply. We used 1000 sample repetitions that were averaged online (Zwicker and Fastl 1990). The averaging process took about 1 min if less than 20 % of the data were rejected (due to artefacts). Two different studies conducted with newborns and children (Munich screening programme) both came to the conclusion that the method of measuring transient-evoked otoacoustic emissions, compared to other tests for detecting a deficit of hearing, is the most useful test for screening.
This is due to the following advantages: emissions are measured quickly, i.e. within a few minutes; the measurements can be performed in the hospital nursery by trained volunteer examiners; the test produces the same results in awake, anaesthetized, sedated, and sleeping patients; and the existence of DEOAE excludes with very high probability a peripheral hearing loss that may influence the development of speech (Schorn 1990a, 1993a,b). The sensitivity amounts to 100 % and the specificity to 91 %, compared with the gold standard of BERA, while behavioural-observation audiometry provides only a sensitivity of 94.1 % and a specificity of 22 % (Arnold and Schorn 1994, Arnold et al. 1995).
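Sensitivity and specificity, as quoted above for the DEOAE screening relative to BERA, follow from the four cells of a comparison between the screening test and the reference test. The sketch below shows the arithmetic; the function names and the counts are hypothetical (they are not the study data) and were picked only so that the result matches the quoted 100 % and 91 %. A "positive" result is assumed here to mean that the test indicates a hearing loss.

```python
def sensitivity(true_positives, false_negatives):
    """Share of ears with hearing loss (per reference test) that the screening flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of normal ears (per reference test) that the screening passes."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, not taken from the cited studies.
tp, fn = 17, 0    # impaired ears: flagged / missed by the screening
tn, fp = 91, 9    # normal ears: passed / wrongly flagged by the screening

print(sensitivity(tp, fn), specificity(tn, fp))
```

For a screening test, high sensitivity matters most, because a missed hearing loss (false negative) delays treatment, whereas a false positive only leads to a follow-up examination.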
9.2.2 Delayed-evoked and distortion-product otoacoustic emissions for hearing threshold determination
For normal-hearing subjects, a close relationship between the fine structure of the hearing threshold and delayed evoked otoacoustic emissions as well as spontaneous otoacoustic emissions (SEOAE) was established (Zwicker and Fastl 1990). In addition, several clinical studies describe the relationship between DEOAE, clinical data and the pure tone audiogram. However, this relationship is complex because the intensity of DEOAE in a given frequency band is also linked to other frequencies of an audiogram. As it is difficult to deduce an audiogram from the power spectrum of DEOAE, correlation coefficients of the pure-tone hearing loss (0.5–6 kHz), the amplitude of DEOAE (1–4 kHz), and also of the distortion product 2f1-f2 (0.46–4 kHz) were computed. In order to study the correlations with the auditory threshold, we fitted a multivariate linear regression model with DEOAE and distortion-product OAE (DPOAE) simultaneously as predictors for the auditory threshold, obtaining 95 % prediction intervals of 19 to 39 dB depending on the frequency under investigation. By restricting the hearing loss to a maximum of 70 dB HL, the 95 % prediction interval of the auditory threshold could be decreased to 18–26 dB (Suckfüll et al. 1996). The results already allow us to use DEOAE and DPOAE, in addition to click-evoked brainstem audiometry, to provide more frequency-specific information on the hearing loss in newborns, which is most important for the ideal fitting of hearing aids.
In extensive studies, the relationship between otoacoustic emissions and tinnitus was studied (Zwicker 1987b). The source for all kinds of emissions seems to be an intact, active cochlea that produces nonlinear feedback, often leading to oscillations. In contrast to tinnitus, no emissions were found in frequency regions with a hearing loss larger than 25 dB.
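The regression approach mentioned in this section, with DEOAE and DPOAE amplitudes as simultaneous predictors of the auditory threshold, can be illustrated with a small two-predictor least-squares fit. This is a sketch under stated assumptions and not the analysis of the cited study: the function names and the data are synthetic, and the interval width uses a large-sample normal approximation (2 × 1.96 × residual standard deviation) that ignores the uncertainty of the fitted coefficients.

```python
import math


def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2 (closed-form solution)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the two predictors are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def approx_interval_width(x1, x2, y, coeffs):
    """Approximate width (dB) of a 95 % prediction interval around the fit."""
    b0, b1, b2 = coeffs
    residuals = [c - (b0 + b1 * a + b2 * b) for a, b, c in zip(x1, x2, y)]
    residual_sd = math.sqrt(sum(r ** 2 for r in residuals) / (len(y) - 3))
    return 2 * 1.96 * residual_sd


# Synthetic example: emission amplitudes (dB SPL) and thresholds (dB HL).
deoae = [12.0, 9.0, 7.0, 4.0, 2.0, 0.0]
dpoae = [10.0, 11.0, 6.0, 5.0, 1.0, 2.0]
threshold = [5.0, 10.0, 22.0, 30.0, 44.0, 48.0]

coefficients = fit_two_predictors(deoae, dpoae, threshold)
print(coefficients, approx_interval_width(deoae, dpoae, threshold, coefficients))
```

A narrower interval means the emissions predict the threshold more precisely, which is the sense in which restricting the analysis to smaller hearing losses improved the prediction in the text.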
9.2.3 Delayed-evoked and distortion-product otoacoustic emissions as an objective TTS measurement
Otoacoustic emissions are an attractive tool for obtaining information about small temporary or permanent threshold shifts, even when the pure tone audiogram is normal. DEOAE diminished temporarily in patients who underwent MR-tomography. This effect was verified with the commonly-used 1.0 Tesla Magnetom (Siemens). The MR imager produces a third-octave noise with a level of 110 dB at around 250 Hz, and even up to 120 dB at lower frequencies (Zwicker et al. 1990). Therefore, ear protection with earplugs is strongly recommended, especially for patients with preexisting sensorineural hearing loss.
Studies conducted with New Zealand white rabbits showed that DPOAE are suitable for establishing the effects of changes in blood viscosity on the function of the cochlea. The purpose of the studies was to evaluate whether haemoconcentration can lead to a diminished cochlear blood flow (CoBF) and may thus play a role in the pathogenesis of sudden deafness (Suckfüll et al. 1997). In a case-control study of patients suffering from sudden deafness, plasma cholesterol, fibrinogen levels, erythrocyte aggregation, and plasma viscosity were determined. Plasma fibrinogen and cholesterol levels were significantly higher in patients with idiopathic sudden hearing loss, leading to significantly-elevated values of erythrocyte aggregation and plasma viscosity (Suckfüll et al. 1997). Most of these patients treated with a single Heparin-induced extracorporeal LDL-precipitation (H.E.L.P.) immediately showed an improvement of auditory threshold, as seen using DEOAE and DPOAE.
9.2.4 Evoked-response audiometry (ERA) for hearing threshold determination
It is well known that brainstem evoked response audiometry (BERA) is the most reliable method for determining the hearing threshold in infants and young children (Schorn 1982). To use the advantages of this objective test method with regard to hearing aid fitting, several criteria must be fulfilled:
1. Good sedation.
2. Sufficient test program including not only clicks as stimulus but also 500 Hz tone impulses, bone-conducted stimuli and measurement with suprathreshold loudness.
3. Sufficient experience in interpreting the various response patterns recorded.
As the amplitude of the auditory evoked potentials is very small (50-400 nV), the myogenically-generated potentials can severely disturb recordings. Therefore, good sedation or general anaesthesia is necessary to obtain a reliable estimate of the individual hearing threshold. Our numerous examinations showed that sedation with Chloralhydrate, Itridal (combination of Dominal, Cyclobarbital-Calcium and Phenozon), Dormicum (Midazolam) or Truxal-juice (Chlorprothixen) is neither sufficient nor as safe as sedation with Propofol (Schorn and Stecker 1988). We prefer an initial dose of 2 mg per kg bodyweight, and a maintenance dose of 10 mg/kg B.W./hour, given per infusor (Schorn 1993b).
It is not sufficient to determine the hearing threshold in a frequency range of 2–4 kHz using click stimuli. The use of evoked electrical potentials in the low frequencies has been the subject of much controversy. We found that the commonly used, Gaussian-modulated 500 Hz tone impulses with alternating polarity do not produce reliable ‘on-effect’-potentials.
The main arguments are: the amplitudes of the potentials are too small; the potentials consist of a single slow wave, which is hard to separate from the noise potentials occurring by chance; and the suprathreshold growth of amplitude is so small that it is hardly possible to estimate the threshold by extrapolation. We developed a more reliable method for determining the threshold in the low frequency range. A constant stimulus polarity was chosen, and instead of the on-effect potentials, frequency-following responses (FFR) were studied. In thousands of infants, we were able to demonstrate that this procedure is extremely reliable, as the FFR-amplitude is distinctly greater than the amplitude of the on-effect potentials (Fig. 9.2.1). The FFR pattern is easy to distinguish from noise potentials, and the suprathreshold amplitude-growth is so evident that a characteristic input/output curve is easy to measure. This curve provides additional information on the recruitment behaviour of the ear, which is essential for fitting a hearing aid.
Fig. 9.2.1: Good correspondence in the evaluation of hearing threshold between the pure-tone audiogram (top panel) and brainstem audiometry using 2 to 4 kHz clicks (bottom left panel) and 0.5 kHz tone bursts (bottom right panel) as stimuli.
Another important prerequisite for determining the hearing threshold in young children is the ability to record responses to bone-conducted stimuli, since a latency shift of the stimulus-related potentials is not necessarily caused by a conductive hearing loss (Fig. 9.2.2). Another reason for these shifts may be a dysmaturity or a maturity delay of the VIIIth nerve (Stecker 1991). We showed that bone-conduction testing is possible when the electromagnetic interference-potential of the bone-conduction vibrator is reduced, and proper calibration is conducted (Schorn and Stecker 1988, Stecker 1990, Schorn and Stecker 1994a).
Fig. 9.2.2: Auditory evoked potentials of a normal hearing infant, using bone-conducted click stimuli.
9.2.5 Evoked-response audiometry (ERA) for hearing aid fitting
Due to the nonlinear behaviour of hearing aids (compression, peak clipping) the AEP stimulus pattern is distorted when a hearing aid serves as a transmitter. In the course of time, several hearing aids were found not to negatively influence the AEP. This makes it possible to objectively test the hearing aid amplification, the peak clipping, and the amplitude compression using the evoked potentials (Fig. 9.2.3). Our studies made it possible to optimize hearing aid fitting in children (Schorn and Stecker 1984).
Fig. 9.2.3: Auditory evoked potentials of a child with a hearing loss of about 60 dB (left panel); AEP after hearing aid fitting (right panel).
9.3 Rehabilitation with hearing aids, cochlear implants and tactile aids
9.3.1 Psychoacoustic basis of hearing aid fitting
In a review paper (Fastl 1996b) the relationship between psychoacoustics and the fitting of hearing aids was discussed. As a basis, the frequency responses of headphones and loudspeakers frequently used in audiology were discussed. The frequency selectivity of hard-of-hearing people can be assessed by means of psychoacoustic tuning curves. A reduced temporal resolution is measured by the temporal resolution factor. The reduced temporal integration in hard-of-hearing people can lead to an overestimation of their hearing loss for short signals. The loudness perception of hard-of-hearing people is governed by recruitment phenomena, as well as by deficits in the spectral integration of sound energy. The hearing sensation ‘sharpness’ plays a crucial role in hearing aid fitting. On the one hand, boosting high frequencies leads to an improved speech intelligibility; on the other hand, the resulting sharpness of the sound image prompts a rejection of the hearing aid by the user. ‘Fluctuation strength’ represents a basic hearing sensation responsible for the ability to follow temporal variations in the envelope of human speech. Psychoacoustic models for the improvement of hearing aid fitting need to take into account the reduced frequency selectivity, the recruitment phenomenon, the reduced temporal processing, as well as the temporal integration in hard-of-hearing people.
9.3.2 Hearing aid fitting with in-situ measurement
The preselection of a hearing device depends mainly on the hearing loss in the pure tone audiogram and on the gain of the hearing aid (Schorn 1983, 1986). In the last several decades, the selective amplification hypotheses have ranged between the mirrored amplification and the one-half rule. We have developed and successfully introduced the Munich method, which takes into account the frequency characteristics, but provides a greater amplification than other methods in the mid- and high-frequency range, even higher than Berger proposed in his formula (Brügel and Schorn 1991c, Schorn and Stecker 1994b). A prerequisite for a good fitting is the objective control by in-situ measurement (Brügel and Schorn 1991b, Schorn and Brügel 1994). Reliable evaluation of the real ear gain can be achieved only with this method, especially with regard to children and babies, as they have an individual ear resonance. We were able to show that the average ear canal resonance of 87 children had its maximum between 3 and 4 kHz (Fig. 9.3.1). It differs significantly from the maximum resonance of adults, which is situated in the 2.7 kHz range (Brügel and Schorn 1990, 1991a).
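The two classical prescription rules named in this section mark the ends of the range: mirrored amplification prescribes a gain equal to the hearing loss at each frequency, while the one-half rule prescribes half of it. The sketch below shows only these two rules; the function names and the audiogram are hypothetical, and the Munich method is not included because its formula is not given in this section.

```python
def mirrored_gain(hearing_loss_db):
    """Mirroring: gain equals the hearing loss at each audiometric frequency."""
    return {freq: float(loss) for freq, loss in hearing_loss_db.items()}


def half_gain(hearing_loss_db):
    """One-half rule: gain equals half the hearing loss at each frequency."""
    return {freq: 0.5 * loss for freq, loss in hearing_loss_db.items()}


# Hypothetical pure-tone audiogram: frequency in Hz -> hearing loss in dB HL.
example_audiogram = {500: 40, 1000: 50, 2000: 60, 4000: 70}

print(mirrored_gain(example_audiogram))
print(half_gain(example_audiogram))
```

Any prescribed target gain of this kind is only a starting point; as the text stresses, the gain actually reached in the individual ear has to be verified by in-situ measurement.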
9 Hearing impairment: Evaluation and rehabilitation
Fig. 9.3.1: Evaluation of real ear resonances of 87 children, tested with in-situ measurement.
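The two classical prescription rules named in Section 9.3.2, mirrored amplification and the one-half rule, can be illustrated with a minimal sketch. The audiogram values and the function names below are hypothetical and serve only as an illustration; the Munich method itself is not reproduced, because its formula is not given in this text.

```python
# Hypothetical pure-tone audiogram: frequency in Hz -> hearing loss in dB.
AUDIOGRAM = {500: 30, 1000: 40, 2000: 50, 4000: 60}


def mirrored_gain(hearing_loss):
    """Mirrored amplification: the prescribed gain equals the hearing loss."""
    return {freq: 1.0 * loss for freq, loss in hearing_loss.items()}


def half_gain(hearing_loss):
    """One-half rule: the prescribed gain is half the hearing loss."""
    return {freq: 0.5 * loss for freq, loss in hearing_loss.items()}


if __name__ == "__main__":
    mirrored = mirrored_gain(AUDIOGRAM)
    halved = half_gain(AUDIOGRAM)
    for freq in sorted(AUDIOGRAM):
        # Columns: frequency, mirrored gain, one-half-rule gain (all in dB).
        print(freq, mirrored[freq], halved[freq])
```

The two functions bracket the range of gains mentioned in the text; any practical prescription would additionally need the in-situ check described above, since the real ear gain depends on the individual ear canal resonance.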
When hearing aid amplification measured with a 2 cm³ coupler or KEMAR was compared with probe-tube measurements, no adequate correspondence was found. The differences between these test methods varied considerably between individuals, even before an earmold modification was applied (Brügel et al. 1990, Brügel and Schorn 1991b, 1992c, Brügel et al. 1992). Our results on 140 patients showed that the 2 cm³ coupler underestimates the amplification in the low frequency range and overestimates it at higher frequencies (Fig. 9.3.2).
Fig. 9.3.2: Comparison of hearing-aid amplification between in-situ measurements (filled squares) and 2 cm³ coupler measurements (filled circles).
Another prerequisite for optimizing hearing aid fitting is the exact calculation of the output level and the compression factor of the device. To achieve this objectively, we introduced in-situ measurement using not only the common input level of 60 dB, but also levels of 80, 90 and 100 dB (Brügel and Schorn 1992c, 1993a). We found differences between the in-situ measurement and the coupler measurement of up to ±30 dB. Furthermore, the highest compression rate was found in the mid- and high-frequency ranges. Consequently, probe tube measurement with high input levels should be used to characterize the real effect of Automatic Gain Control (AGC) and of Peak Clipping (PC).
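How gain and compression can be read off measurements at several input levels is sketched below. The input levels are the ones named in the text; the output levels are hypothetical numbers chosen for illustration, and the definition of the compression ratio as input-level change divided by output-level change is the usual textbook one, assumed here rather than taken from the cited studies.

```python
INPUT_LEVELS_DB = [60, 80, 90, 100]    # input levels named in the text
OUTPUT_LEVELS_DB = [90, 105, 110, 112]  # hypothetical in-situ output levels


def gains(input_db, output_db):
    """Gain in dB at each input level (output level minus input level)."""
    return [out - inp for inp, out in zip(input_db, output_db)]


def compression_ratios(input_db, output_db):
    """Compression ratio between neighbouring input levels.

    Ratio = change of input level / change of output level.
    A ratio of 1 means linear amplification; larger values mean compression.
    """
    ratios = []
    for i in range(1, len(input_db)):
        d_in = input_db[i] - input_db[i - 1]
        d_out = output_db[i] - output_db[i - 1]
        ratios.append(float("inf") if d_out == 0 else d_in / d_out)
    return ratios


if __name__ == "__main__":
    print(gains(INPUT_LEVELS_DB, OUTPUT_LEVELS_DB))
    print(compression_ratios(INPUT_LEVELS_DB, OUTPUT_LEVELS_DB))
```

With these illustrative numbers the gain shrinks as the input level rises, which a measurement at 60 dB alone could not reveal.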
9.3.3 Hearing aid fitting – earmold modifications with vents and horns
In the past 5 years, a new state-of-the-art hearing aid fitting has evolved (Schorn and Stecker 1994b). The prerequisite of a good fitting is the right choice of an earmold with a beneficial earmold modification. The most important earmold modification is venting. Vents have been employed in an effort to make hearing aids more acceptable to the user, and we have done so for more than 1000 of our hearing-aid patients. The drilled hole provides pressure equalization in the ear and relieves the feeling of pressure. Contrary to other workers, we found that the reduction effect in the low-frequency range was only moderate. This holds even for a short vent with a large diameter of up to 2.5 mm (Brügel et al. 1990, Brügel et al. 1992). The median gain-reduction effect amounts to only 3 dB in the 750 Hz frequency range. This effect was contrary to our expectations based on fundamental acoustics. Normally, low-frequency reduction should occur due to the opening. To investigate this behaviour in more detail, we used a special device for our experiments, the AMH (acoustical measuring head) developed by H. Hudde, Bochum. To permit accurate comparison and reproducibility, special earmolds were manufactured. A network model allows an estimation of the sound pressure in front of the eardrum. Figure 9.3.3 shows the sound pressure with a closed vent and with a 2.5 mm vent, referred to the closed earmold at 1000 Hz. In accordance with our previous experiments, the reduction of sound pressure level with 2.5 mm vents is restricted to frequencies below 450 Hz and is therefore negligible for speech discrimination. In the frequency range between 450 Hz and 2 kHz, an increase of up to 5 dB can be seen. Although the insertion of a vent into the earmold improves the listening comfort for the hearing impaired person, it does not enhance the speech intelligibility significantly in quiet surroundings.
Fig. 9.3.3: Sound-pressure level in front of the eardrum obtained with an acoustical measuring head and a special earmold adjusted to a normal hearing subject. Reference: closed earmold at 1 kHz. Continuous line: closed vent; dotted line: vent size 2.5 mm.
Analyzing the effect of earmold venting on speech intelligibility under different background-noise conditions, we showed that venting improves speech discrimination in noisy conditions (Brügel et al. 1991a, Brügel and Schorn 1992b). With few exceptions, the improvement was small with noise according to CCITT Rec. G 227, but it was significant, up to 25 %, with the time-varying background noise proposed by Fastl.
Another significant earmold modification is the insertion of a Libby or Bakke horn to increase the amplification in the high frequency range. All of the 250 hearing aid users tested showed good subjective acceptance with a subjective improvement of speech intelligibility, especially of consonants (Brügel et al. 1991b, Schorn and Brügel 1994). The median acoustic amplification between 2 kHz and 4 kHz improved by 7 dB, and by up to 20 dB in single cases. Speech discrimination with Freiburger monosyllables showed a median improvement of 15 %. It is apparent that, to improve hearing aid fitting, earmold modifications should be used, deliberately and specifically, far more often than at present.
9.3.4 Hearing aid fitting with loudness scaling
The increase of loudness is quite different in patients with cochlear hearing damage, compared to normal-hearing subjects or patients with a conductive hearing loss. Therefore, with regard to hearing aid fitting, it is important to evaluate not only the hearing threshold, but also the uncomfortable level (UCL) and the most comfortable level. In our patients with sensorineural hearing loss, the UCL for narrow-band impulses and sinusoidal tones amounted to 85-90 dB between 500 Hz and 4.0 kHz, while the median UCL for speech increased to 109 dB (Brügel and Schorn 1991d, 1992a, Schorn and Brügel 1994). At the beginning of the collaborative research centre 204, the reliability of otometry, i.e. the ‘Rauschimpulsaudiometrie’, or noise-pulse audiometry, was established to verify the range of comfortable hearing (Schorn 1980). Recently, in a large number of patients, we were able to verify that the best method to estimate the amplification, the peak clipping, the compression factor and the compression kneepoint of a hearing aid is the scaling of the auditory field. This method has been introduced in Germany as ‘category scaling of loudness’ (Würzburger Hörfeld, WHF) and was produced by WESTRA GmbH. This system employs 1/3 octave filtered noise with varying centre frequency and sound-pressure level as well as octave filtered environmental sounds. The patient has to describe his loudness perception by touching a scale labelled with seven verbal categories between inaudible and painful.
Normative data has to be obtained as a prerequisite to measurements in patients with sensorineural hearing loss and to justify hearing aids (Baumann et al. 1996). Our study to gain this normative data was conducted in a typical audiometric room (Arnold et al. 1996). The results in normal-hearing subjects indicate that no significant sex-specific effect exists. Additionally, for higher centre frequencies we observed a remarkable increase of loudness estimation above a certain level of sound pressure (Fig. 9.3.4). The stimulus (band-filtered noise, band-filtered environmental sounds) influences the loudness scaling function. Furthermore, a considerable difference between the normative functions given by the manufacturer of the WHF and our data was observed (Fig. 9.3.4; Baumann et al. 1997).
Fig. 9.3.4: Loudness ratings of 1/3-octave filtered noise bursts with centre frequencies 500, 1000, 2000 and 4000 Hz. Median and interquartile ranges of 97 ratings for each median from 57 normal hearing subjects. Dashed line: WHF-reference functions. Solid line: Polynomial regression for the median values.
The loudness scaling also allows us to check the effectiveness of a fitted hearing device. After proper fitting of a good hearing aid, the patient produces a scaling curve which runs parallel to a normal loudness increase (Fig. 9.3.5 left), while another patient with an incorrectly-fitted aid shows a very narrow dynamic range (Fig. 9.3.5 right).
Fig. 9.3.5: Loudness functions (categorial loudness versus level in dB SPL) for a hearing-impaired listener fitted with a proper hearing aid (left) and with an insufficient hearing device (right).
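The kind of summary used for category loudness scaling, a median and interquartile range per presentation level, can be sketched as follows. The numeric coding of the seven verbal categories as 0 to 6 and all ratings below are assumptions made for illustration; they are not data from the studies cited above.

```python
import statistics

# Hypothetical ratings: level in dB SPL -> category ratings, where the
# seven verbal categories are coded 0 (inaudible) to 6 (painful).
RATINGS = {
    40: [1, 1, 2, 2, 2],
    60: [2, 3, 3, 3, 4],
    80: [4, 5, 5, 5, 6],
}


def summarize(ratings_by_level):
    """Return {level: (median, lower quartile, upper quartile)}."""
    summary = {}
    for level, ratings in sorted(ratings_by_level.items()):
        lower, _, upper = statistics.quantiles(ratings, n=4)
        summary[level] = (statistics.median(ratings), lower, upper)
    return summary


if __name__ == "__main__":
    for level, (median, lower, upper) in summarize(RATINGS).items():
        print(level, median, lower, upper)
```

A loudness function of the type shown in Fig. 9.3.5 would then be obtained by plotting the medians against level; a narrow range of levels between the lowest and highest categories corresponds to the narrow dynamic range described for the poorly fitted device.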
9.3.5 Hearing aid fitting with speech in noise
The ability to understand speech must be considered the most important measurable aspect of human auditory function. In developing discrimination tests, many variables have to be considered, e.g. female or male speaker, trained or untrained speaker, natural or synthetic speech material, open or closed response, number of word alternatives, announcement sentence or not, monosyllables, spondaic words or sentences as items, meaningful or senseless words, sound-pressure level of the recorded words, kind of noise, signal-to-noise ratio, and interpretation strategies. It is to be expected that in the future, special speech tests will be used for different situations such as diagnostics, medical expertise, hearing aid fitting, or cochlear implants, and that various patient conditions will be taken into consideration, e.g. degree of hearing loss, age, vigilance, and psychological stability. We studied three different speech tests with regard to hearing aid fitting, i.e. the Freiburger monosyllabic test, the Marburger sentence test and the newly-developed Rhyme test of Sotscheck (Brügel et al. 1994, Schorn 1997). We showed that the results for speech intelligibility employing the Sotscheck test resemble those of the Marburger sentence test, although the sentence test has much more redundancy. This effect can be explained by the closed-set response system of the rhyme test. In contrast to these results, hearing aid users often demonstrate very poor speech perception in the rhyme test, although the average speech intelligibility with a hearing aid increased to 20 % when using the Freiburger monosyllabic test. With the Sotscheck test, one quarter of our patients had a deterioration of speech intelligibility with a hearing aid, although the subjective acceptance of the aid was satisfactory and the other speech tests showed gratifying results.
This effect can be explained by the attack time of the gain control circuits in the amplifier of the hearing aid.
In hearing impaired persons, speech intelligibility is considerably reduced when background noises are audible in addition to the speech signal. Therefore, a special masking noise for use in audiology was developed and realised (Fastl 1987c). This noise is available on CD (Westra 1992) and is called the ‘Fastl-noise’ by our colleagues. The masking noise simulates, on average, the spectral and temporal envelope of human speech. These features are illustrated in Figure 9.3.6. The upper left panel shows the frequency distribution of a noise according to CCITT Rec. G 227. The shaded area represents the average spectral envelope of human speech according to research by Tarnoczy. In the upper right panel, the loudness-time-function of the CCITT noise is shown. The temporal fluctuation of fluent speech is simulated by a band-pass noise with a maximum at 4 Hz and a slope of 6 dB/oct towards lower frequencies and -12 dB/oct towards higher frequencies. The features of this band-pass are in line with the dependence of the hearing sensation fluctuation strength on modulation frequency. This means that speech and hearing are optimally adapted to each other. A loudness-time-function as displayed in the lower right panel of Figure 9.3.6 results when CCITT noise is amplitude modulated by a band-pass noise as displayed in the lower left panel. As clearly seen in Figure 9.3.6, the noise fluctuates at a rate of about 4 Hz, corresponding to a normal speaking rate of about four syllables per second. Interestingly, a similar correspondence between hearing and vocal activity was also found in the squirrel monkey (Fastl et al. 1991a).
Fig. 9.3.6: Background noise for speech audiometry. a: Spectral distribution of CCITT-noise; b: Corresponding loudness-time-function; c: Spectral distribution of modulator, representing temporal envelope; d: Loudness-time-function of Fastl-noise.
Both the traditional CCITT masking noise and the Fastl-noise from the Westra CD were used with several languages. The intelligibility of speech in noise was studied with both noises (Hautmann and Fastl 1993, Stemplinger et al. 1994, Hojan and Fastl 1996, Stemplinger et al. 1997), and some of the results are compiled in Figure 9.3.7. The percentage (h) of correctly-identified items is given as a function of the signal-to-noise ratio (∆L). Filled symbols give results for the CCITT-noise, open symbols for the Fastl-noise. The panels show results for the following languages: (a) German, (b) Hungarian, (c) Slovenian, (d) Polish. Results displayed in Figure 9.3.7 clearly show that a continuous noise produces larger masking than a fluctuating noise. For 50 % correct, the signal-to-noise ratios for different languages vary between -9.2 and -5.5 dB for the Fastl-noise. For the continuous CCITT-noise, the corresponding numbers are -5.0 dB to -1.5 dB. For normally-hearing persons, on average about 5 dB less signal-to-noise ratio is required for the time-varying noise than for the continuous noise. Hard-of-hearing people, however, frequently cannot profit from the gaps in the temporal envelope of the Fastl-noise. One reason is that their hearing system may show reduced temporal resolution. Another reason is that hearing aids with AGC circuits can significantly degrade temporal envelopes (Fastl 1987d).
Fig. 9.3.7: Speech in noise. Percentage h of correctly-repeated mono-syllables as a function of signal to noise ratio ∆L. Filled symbols: CCITT-noise; open symbols: Fastl-noise. Different languages are a: German; b: Hungarian; c: Polish.
With hearing impaired patients, three types of masking signals, namely speech-simulating noise CCITT Rec. G 227, Döring babble noise, and Fastl-noise
were studied. It could be demonstrated that the Fastl-noise is very well accepted by the listeners, even by patients with hearing disorders with recruitment (Brügel et al. 1991a, Brügel and Schorn 1992b, Schorn and Stecker 1994b, Schorn 1997). Furthermore, the Fastl-noise did not mask, i.e. reduce the speech intelligibility, to such a large extent as the speech-simulating noise and Döring-noise. These results demonstrate that Fastl-noise is suitable for the diagnosis of communication disturbances in noise and for hearing aid fitting with German and other languages (Schorn and Stecker 1994b).
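The modulator band-pass described for the Fastl-noise in this section (maximum at 4 Hz, 6 dB/oct towards lower and -12 dB/oct towards higher modulation frequencies) can be written as a simple level function. The sketch below is an idealised, piecewise-linear reading of that description on a logarithmic frequency axis: it assumes the level rises by 6 dB per octave up to the maximum and falls by 12 dB per octave above it. The function name and the 0 dB reference at the maximum are assumptions, not the published filter design.

```python
import math


def modulator_level_db(f_mod_hz, f_peak_hz=4.0):
    """Relative level (dB re. the maximum) of the idealised modulator band-pass.

    Below the peak the level rises by 6 dB per octave towards the peak;
    above the peak it falls by 12 dB per octave.
    """
    if f_mod_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    octaves = math.log2(f_mod_hz / f_peak_hz)
    if octaves <= 0:
        return 6.0 * octaves
    return -12.0 * octaves


if __name__ == "__main__":
    for f in (1.0, 2.0, 4.0, 8.0, 16.0):
        print(f, modulator_level_db(f))
```

In this idealisation, modulation frequencies one octave below the 4 Hz maximum are attenuated by 6 dB and those one octave above by 12 dB, so the envelope energy is concentrated around the syllable rate mentioned in the text.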
9.3.6 Experimental Hearing Aid
To improve the quality of hearing aids, a series of experiments was performed. Persons with a hearing impairment experience problems particularly in noisy surroundings. Therefore, the improvement of the signal-to-noise ratio for speech was the focus of the research. One possibility of achieving an increased signal-to-noise ratio is to use directional microphones (Beckenbauer 1987). Using miniature electret microphones, substantial spatial selectivity can be obtained. A more sophisticated approach modelled the effects of lateral inhibition in a multi-channel hearing aid (Beckenbauer 1991). The technical realization of the inhibition network is illustrated in Figure 9.3.8. The inhibition is explained for the example of two channels. Starting from top to bottom, each channel is illustrated by a band-pass filter with an output of v1 or v2, respectively. After rectification and lowpass filtering, the voltages v1’ and v2’ are obtained, which represent the temporal envelopes of the filtered signals. The voltages are distributed to three branches. In one branch, they are weighted by a factor g. In the centre branch, they are directly transmitted. In the third branch, they go to a divider. The signals weighted by the factor g are subtracted from the direct signal in the respective neighbouring channel. The resulting voltage is then divided by v1’ or v2’. The following block ensures that no negative control signals are fed to the multipliers. Each band-pass signal (v1 or v2) is multiplied by a control signal resulting in an output of i1 or i2, respectively. For the improvement of the signal-to-noise ratio in hearing aids, a prototype was developed and realized with 20 critical band-wide channels. The lateral inhibition explained by Figure 9.3.8 was extended to a maximum of four neighbouring channels.
When presenting speech material over the multi-channel hearing aid without background noise, a score of 94 % was achieved compared to a score of 98 % for direct presentation without the hearing aid. This means that the inhibition network does not significantly degrade the intelligibility of speech. For situations of speech in noise, the inhibition network yields an improvement that corresponds to an improvement in signal-to-noise ratio of about 3 dB (Beckenbauer 1991).
Fig. 9.3.8: Realisation of lateral inhibition in a multi-channel hearing aid. For details see text.
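The two-channel inhibition described in Section 9.3.6 can be followed step by step in a short sketch. It works on single samples: the band-pass outputs v1, v2 and their envelopes v1’, v2’ are taken as given numbers, the weighted neighbouring envelope is subtracted, the result is divided by the channel's own envelope, negative values are set to zero, and the band-pass signal is multiplied by the resulting control signal. The weight g = 0.5, the sample values and the guard against a zero envelope are assumptions for illustration only.

```python
def control_signal(env_own, env_neighbour, g):
    """Control signal of one channel: (own - g * neighbour) / own, never negative."""
    if env_own <= 0.0:
        # Guard added for this sketch: no envelope, no output.
        return 0.0
    return max(0.0, (env_own - g * env_neighbour) / env_own)


def inhibit_two_channels(v1, v2, env1, env2, g=0.5):
    """Return the outputs (i1, i2) of two mutually inhibiting channels."""
    c1 = control_signal(env1, env2, g)
    c2 = control_signal(env2, env1, g)
    return v1 * c1, v2 * c2


if __name__ == "__main__":
    # Hypothetical sample: channel 1 carries a strong signal, channel 2 a weak one.
    i1, i2 = inhibit_two_channels(v1=1.0, v2=0.2, env1=1.0, env2=0.2)
    print(i1, i2)
```

In this example the strong channel passes almost unchanged while the weak neighbouring channel is suppressed completely, which is the contrast-enhancing behaviour the network is meant to provide. The prototype described in the text extends the scheme to 20 channels and up to four neighbours, which this sketch does not attempt.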
9.3.7 Cochlear implantation – Preliminary test, intra- and postoperative control
Two techniques were employed to assess the suitability of deaf patients for cochlear implantation (CI): electrical stimulation by means of an ear canal electrode and by means of a transtympanic promontory electrode. The results of the ear-canal tests indicate that the patients may be divided into three groups: non-usable dynamics, usable dynamics in the low frequency range, and usable dynamics over a wider frequency range. Comparable results were obtained via the promontory tests (Schorn et al. 1986). Since the promontory test gave a more accurate description of the subjective sensations, we recommend performing this test with those patients who have been preselected on the basis of ear canal stimulation.
The stapedius reflex can be used as an objective test in patients fitted with a CI. It is elicited by a loud acoustic stimulus in normal-hearing subjects, but also in patients with a sensorineural hearing loss. The pattern of this reflex serves to differentiate between cochlear and VIIIth-nerve disorders (Wurzer et al. 1983). The sensorineural diagnosis is based on the relationship of the acoustic-reflex hearing threshold level to the degree of hearing loss. Furthermore, the reflex can be triggered by electrical stimuli. During the operation, the integrity of the VIIIth nerve can be tested through observation of the stapedius reflex. Moreover, the reflex gives information about the function and position of the electrodes. In a study conducted with the MEDEL Combi 40 Implant, 18 patients underwent an intraoperative stapedius reflex test during cochlear implantation. The stapedius reflex was absent in six patients due to defects in the reflex system or to central neurological diseases. Additionally, a correlation between the electrical stapedius reflex threshold and the subjective uncomfortable loudness was demonstrated (Arnold et al. 1997). The intraoperative stapedius reflex test is a valuable diagnostic tool, especially when fitting the CI in small children, but the absence of the reflex cannot definitely prove a retrocochlear defect and a lack of benefit from the CI. Consequently, the electrical stapedius reflex test is applied as an essential prerequisite during the time-consuming period of adjusting the speech processor. Especially in uncooperative or multiply-handicapped children, the level of discomfort can be sufficiently estimated (Arnold et al. 1997).
9.3.8 Speech processing with cochlear implant
Since the introduction of the continuous-interleaved sampling (CIS) strategy with the MEDEL Combi 40 cochlear implant, the interest in the development of new speech-coding strategies has increased. A new method for speech enhancement was developed and tested in patients fitted with two different CI systems, the MEDEL Combi 40 or the COCHLEAR CI22. We demonstrated that for users of the COCHLEAR device, speech intelligibility increased when spectral information was reduced to the six most prominent spectral components. These components have to be carefully selected in order to preserve temporal continuity (Baumann 1997). For speech in noise, CI patients show remarkable differences from normal hearing subjects (Fastl et al. 1998). While normal hearing subjects can profit from the temporal gaps in the Fastl-noise (cf. Fig. 9.3.7), CI patients need the same signal-to-noise ratio for continuous CCITT-noise or time-varying Fastl-noise. Figure 9.3.9 shows the data obtained with a sentence test (HSM-test).
Fig. 9.3.9: With CCITT-noise, the disadvantage of CI patients in speech intelligibility amounts to 16 dB (squares), whereas for the Fastl-noise CI patients need as much as 26 dB better signal-to-noise ratio (circles). This result indicates that temporal processing in CI patients is not yet well understood.
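The reduction of spectral information to the six most prominent components, mentioned in Section 9.3.8, can be illustrated for a single analysis frame. The sketch below simply keeps the six largest values of a hypothetical frame of band levels and sets the rest to zero. It is not the method of Baumann (1997): in particular, it ignores the requirement stated in the text that components be selected so as to preserve temporal continuity, and the frame values and function name are invented for illustration.

```python
def select_prominent(band_levels, n_keep=6):
    """Keep the n_keep largest components of one frame; zero all others."""
    ranked = sorted(range(len(band_levels)),
                    key=lambda k: band_levels[k], reverse=True)
    keep = set(ranked[:n_keep])
    return [level if k in keep else 0.0
            for k, level in enumerate(band_levels)]


if __name__ == "__main__":
    # Hypothetical frame with ten spectral components (arbitrary units).
    frame = [3, 9, 1, 7, 5, 8, 2, 6, 4, 10]
    print(select_prominent(frame))
```

A frame-by-frame selection like this can switch components on and off abruptly from one frame to the next, which is presumably why the text stresses careful selection with regard to temporal continuity.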
9.3.9 Tactile Hearing Aids
With a view to improving speech communication for deaf people, a tactile hearing aid was developed and realized. For the application of tactile stimuli, new ring-shaped electromechanical transducer elements were realized using thin piezoelectric PVDF films (Leysieffer 1985). They have a large dynamic range of 30 dB, low power consumption of 0.1 mW at threshold for 200 Hz, and very low weight (below 2 g) and volume. The mechanical sturdiness and the excellent long-term stability of these transducers allowed the development of a small and wearable tactile hearing aid that could also be used by deaf preschool children. With the new type of PVDF transducers, tactile hearing aids were realized with low power consumption that ran for many hours on one set of batteries. The system developed works on a vocoder principle. The speech signal is first compressed in an AGC circuit and then filtered by a bank of seven filters with centre frequencies between 200 and 2500 Hz. From the filtered signals, the envelope is calculated and effects of lateral inhibition are realized. In addition, information about the fundamental frequency of the speech signal as well as a distinction between voiced and unvoiced speech elements are implemented (Leysieffer 1986). Fig. 9.3.10 gives examples of the spatial distribution of the transducers along the fingers. The upper panels show the vibrations applied for the German vowels ‘a’, ‘e’, ‘i’, ‘o’ and ‘u’ and illustrate that the coding scheme is realized in such a way that rather different spatial patterns of stimulation are obtained for the vowels. The lower panels give the vibration patterns for the consonants ‘s’ and ‘sch’ and illustrate that the distinction between voiced and unvoiced speech sounds is coded by transducer number 9.
Fig. 9.3.10: Spatial stimulation pattern of the tactile hearing aid for the five German vowels (top) as well as unvoiced fricatives (bottom).
In comparison to systems with electrocutaneous stimulation, stimulation with PVDF transducers has the great advantage that the thresholds show good stability, with variations of less than 6 dB between subjects, and a reproducibility of about 2 dB. Therefore, in contrast to tactile aids with electrocutaneous stimulation, it is not necessary to calibrate the PVDF transducer aid before each use. This stability gives the PVDF-based aid an eminent advantage for practical applications. As concerns the recognition rate, the PVDF-based tactile aid allowed 90 % correct recognition of five synthetic vowels after training of only a quarter of an hour. After one hour of training, the recognition rate exceeded 95 %. With natural vowels, about 85 % correct were obtained after one hour of training. This compares very favourably to 60 % for an electrocutaneous hearing aid. After two hours of training, numbers were correctly identified in 80 % of cases (Leysieffer 1987). Simple calculations like 8–9 = –1 were recognized with 80 % accuracy after only four hours of training. With respect to the ability to learn the new ‘tactile language’, the main influences were age and technical background. Best results were obtained by an engineer aged 25, followed by a mechanic aged 33. A female high-school student aged 18 performed significantly worse, while a housewife aged 61 obtained only about 20 % correct after one hour of training. In summary, the PVDF-based tactile hearing aids show great potential for conveying speech information to deaf people.
10 References (work of the collaborative research centre 204)
Arnold, B., Baumann, U., Schilling, V. (1997): The intraoperative Electric Stapedius-Reflex-Test: Possibilities and Limits. Eur. Arch. Otolaryngol., 305.
Arnold, B., Baumann, U., Stemplinger, I., Schorn, K. (1996): Bezugskurven für die kategoriale Lautstärkeskalierung. HNO 21, 195-196.
Arnold, B., Schorn, K. (1994): Sensitivität und Spezifität der evozierten otoakustischen Emissionen bei 1202 Kleinkindern nach den Empfehlungen der EG. Arch. Otorhinolaryngol Suppl. II, 37-38.
Arnold, B., Schorn, K., Stecker, M. (1995): Screeningprogramm zur Selektierung von Hörstörungen Neugeborener im Rahmen der Europäischen Gemeinschaft. Laryngo-Rhino-Otol. 74, 172-178.
Audet, D. (1990): Foraging behaviour and habitat use by a gleaning bat, Myotis myotis. J. Mamm. 71, 420-427.
Audet, D., Krull, D., Marimuthu, G., Sumithran, S., Singh, J.B. (1991): Foraging behaviour of the Indian False Vampire bat, Megaderma lyra. Biotropica 23, 63-67.
Authier, S., Manley, G.A. (1995): A model of frequency tuning in the basilar papilla of the Tokay gecko, Gekko gecko. Hearing Res. 82, 1-13.
Bartsch, E., Schmidt, S. (1993): Psychophysical frequency modulation thresholds in a FM-bat, Tadarida brasiliensis. Hearing Res. 67, 128-138.
Baumann, U. (1994a): Segregation and integration of acoustical objects in automatic analysis of music. In: Deliège, I. (Ed.): Proc. of the 3rd Intern. Conf. for Music Perception and Cognition, Liège, 282-285.
Baumann, U. (1994b): Über die Wahrnehmung von Teiltönen aus harmonisch komplexen Klängen. In: Lenk, A., Hoffmann, R. (Eds.): Fortschritte der Akustik - DAGA 94. Verlag DPG-GmbH, Bad Honnef, 1017-1021.
Baumann, U. (1995): Ein Verfahren zur Erkennung und Trennung multipler akustischer Objekte. Utz Verlag, München.
Baumann, U. (1997): Hörversuche zum Verständnis von konturierter Sprache bei Cochlear Implant Benutzern. In: Wille, P. (Ed.): Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 79-80.
Baumann, U., Stemplinger, I., Arnold, B., Schorn, K. (1996): Kategoriale Lautstärkeskalierung in der klinischen Anwendung. In: Portele, Th., Hess, W. (Eds.): Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 128-129.
Baumann, U., Stemplinger, I., Arnold, B., Schorn, K. (1997): Kategoriale Lautstärkeskalierung in der klinischen Anwendung. Laryngo-Rhino-Otol. 76, 458-465.
Bechly, M. (1983): Zur Abhängigkeit der „suppression“ bei 8 kHz vom Frequenzverhältnis und den Schallpegeln der beiden Maskierer. Acustica 52, 113-115.
Beckenbauer, T. (1987): Möglichkeiten zur Verbesserung des Signal/Störverhältnisses durch gerichtete Schallaufnahmen. In: Fortschritte der Akustik - DAGA 87. Verlag DPG-GmbH, Bad Honnef, 449-452.
Beckenbauer, T. (1991): Technisch realisierte laterale Inhibition und ihre Wirkung auf störbehaftete Sprache. Acustica 75, 1-16.
Beckenbauer, T., Stemplinger, I., Seiter, A. (1996): Basics and use of DIN 45681 ‘Detection of tonal components and determination of a tone adjustment for the noise assessment’. In: Proc. inter-noise’96, Vol. 6, 3271-3276.
Bieser, A., Müller-Preuss, P. (1996): Auditory responsive cortex in the Squirrel monkey: neural responses to amplitude-modulated sounds. EBR 108, 273-284.
Bieser, A., Müller-Preuss, P. (1988): Projections from superior temporal gyrus: convergent connections between structures involved in both audition and phonation. In: Syka, J., Masterton, B. (Eds.): Auditory pathway: Structure and function. Plenum Publishing Co., New York, 323-326.
Böhnke, F., Arnold, W. (1998): Nonlinear Mechanics of the Organ of Corti Caused by Deiters Cells. IEEE Transactions on Biomedical Engineering, in press.
Brix, J., Fischer, F.P., Manley, G.A. (1994): The cuticular plate of the hair cell in relation to morphological gradients of the chicken basilar papilla. Hearing Res. 75, 244-256.
Brix, J., Manley, G.A. (1994): Mechanical and electromechanical properties of the stereovillar bundles of isolated and cultured hair cells of the chicken. Hearing Res. 76, 147-157.
Brügel, F.J., Schorn, K. (1990): Die Gehörgangsresonanz von Kindern und ihre Bedeutung für die Hörgeräteversorgung. Eur. Arch. Otolaryngol. Suppl. II, 165-166.
Brügel, F.J., Schorn, K. (1991a): Ist die Hörgeräteanpassung beim Kind ohne Bestimmung der Gehörgangsresonanz noch vertretbar? Stimme-Sprache-Gehör 15, 136-141.
Brügel, F.J., Schorn, K. (1991b): In-Situ-Messung als notwendiger Bestandteil der Hörgeräteanpassung. Laryngo-Rhino-Otol. 70, 616-619.
Brügel, F.J., Schorn, K. (1991c): Bericht über eine ungewöhnliche Hörgeräteversorgung. Hörakustik 26, 3-8.
Brügel, F.J., Schorn, K. (1991d): Die Bedeutung der verschiedenen Unbehaglichkeitsschwellen für die Hörgeräteanpassung. Eur. Arch. Otolaryngol. Suppl. II, 145.
Brügel, F.J., Schorn, K. (1992a): Verschiedene Unbehaglichkeitsschwellen, ihr Zusammenhang und die Einsatzpunkte in der Praxis. Laryngo-Rhino-Otol. 71, 572-575.
Brügel, F.J., Schorn, K. (1992b): Welchen Einfluß nimmt die Zusatzbohrung auf die Spracherkennung in verschiedenen Störschallen? Hörakustik 27, 4-7.
Brügel, F.J., Schorn, K. (1992c): Die Wirkung von Regelschaltungen im Vergleich: Kuppler Messung - In-Situ-Messung. Eur. Arch. Otolaryngol. Suppl. II, 91.
Brügel, F.J., Schorn, K. (1993a): Die Bedeutung der In-Situ-Messung zur Einschätzung der wirksamen Hörgeräte-Verstärkung bei höheren Schalldruckpegeln. Laryngo-Rhino-Otol. 72, 301-305.
Brügel, F.J., Schorn, K. (1993b): Veränderung der Zeitauflösung durch die Hörgeräteanpassung. Eur. Arch. Otolaryngol. Suppl. II, 140.
Brügel, F.J., Schorn, K. (1993c): Variation of temporal resolution following hearing aid fitting. In: Orhan Sunar (Ed.): Proceedings of ORL, 28-30. Multi Science Publishing Co. Ltd., Essex, UK.
Brügel, F.J., Schorn, K. (1994): Zeitauflösung mit Hörgerät bei Schwerhörigen. Laryngo-Rhino-Otol. 73, 123-127.
Brügel, F.J., Schorn, K., Fastl, F. (1991a): Der Einfluß der Zusatzbohrung im Ohrpaßstück auf die Sprachdiskrimination im Störgeräusch. HNO 39, 356-361.
Brügel, F.J., Schorn, K., Fastl, H. (1994): Sprachaudiometrische Untersuchungen vor und nach Hörgeräteanpassung mit verschiedenen Sprachtests. Eur. Arch. Otolaryngol. Suppl. II, 297.
Brügel, F.J., Schorn, K., Hofer, R. (1992): Welche Verbesserungen sind mit der Zusatzbohrung bei der Hörgeräteanpassung zu erreichen? Laryngo-Rhino-Otol. 71, 79-82.
Brügel, F.J., Schorn, K., Stecker, M. (1990): In-Situ-Messung zur Bestimmung der Wirkung von Zusatzbohrungen im Ohrpaßstück. Laryngo-Rhino-Otol. 69, 337-340.
Brügel, F.J., Schorn, K., Stecker, M. (1991b): Eine Möglichkeit, die Hörgeräteanpassung zu verbessern. Audiol. Akust. 30, 4-14.
Buchfellner, E., Leppelsack, H.-J., Klump, G.M., Häusler, U. (1989): Gap detection in the starling (Sturnus vulgaris). II. Coding of gaps by forebrain neurons. J. Comp. Physiol. 164, 539-549.
Buus, S., Klump, G.M., Gleich, O., Langemann, U. (1995): An excitation-pattern model for the starling (Sturnus vulgaris). J. Acoust. Soc. Am. 98, 112-124.
Casseday, J.H., Covey, E., Vater, M. (1988a): Connections of the superior olivary complex in the rufous horseshoe bat Rhinolophus rouxi. J. Comp. Neurol. 278, 313-330.
Casseday, J.H., Covey, E., Grothe, B. (1997): Neurons specialized for sinusoidal frequency modulation in the inferior colliculus of the big brown bat, Eptesicus fuscus. J. Neurophysiol. 77(3), 1595-1605.
Casseday, J.H., Covey, E., Vater, M. (1988b): Ascending pathways to the inferior colliculus via the superior olivary complex in the rufous horseshoe bat, Rhinolophus rouxi. In: Nachtigall, P.E., Moore, P.W.B. (Eds.): Animal Sonar. NATO ASI Series A: Life Sciences 156, Plenum Press, New York, London, 243-247.
Chalupper, J., Schmid, W. (1997): Akzentuierung und Ausgeprägtheit von Spektraltonhöhen bei harmonischen Komplexen Tönen. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 357-358.
Covey, E., Vater, M., Casseday, J.H. (1991): Binaural properties of single units in the superior olivary complex of the mustached bat. J. Neurophysiol. 66, 1080-1094.
Dallmayr, C. (1985a): Spontane oto-akustische Emissionen: Statistik und Reaktion auf akustische Störtöne. Acustica 59, 67-75.
Dallmayr, C. (1985b): Suppressions-Periodenmuster von spontanen oto-akustischen Emissionen. In: Fortschritte der Akustik - DAGA 85. Verlag DPG-GmbH, Bad Honnef, 479-482.
Dallmayr, C. (1987): Stationary and dynamic properties of simultaneous evoked otoacoustic emissions (SEOAE). Acustica 63, 243-255.
Duifhuis, H., Vater, M. (1986): On the mechanics of the horseshoe bat cochlea. In: Allen, J.W., Hall, J.L., Hubbard, A., Neely, S.T., Tubis, A. (Eds.): Peripheral auditory mechanisms, Springer, Heidelberg, 89-96.
Dyson, M.L., Klump, G.M., Gauger, B. (1998): Absolute hearing thresholds and critical masking ratios in the European barn owl: A comparison with other owls. J. Comp. Physiol. A 182, 695-702.
Eckrich, M. (1988): Untersuchungen zur Räuber-Beutebeziehung zwischen Fledermäusen und Nachtfaltern in Südindien. Dissertation Fakult. Biol. LMU München.
Eckrich, M., Neuweiler, G. (1988): Food habits of the sympatric insectivorous bats Rhinolophus rouxi and Hipposideros lankadiva from Sri Lanka. J. Zool. 215, 729-737.
Eisenmenger, W., Schorn, K. (1984): Untersuchungen zur Funktionsfähigkeit des Gehörs, speziell der Frequenzauflösung unter Alkoholeinfluß. Blutalkohol 21, 250-263.
Fastl, H. (1983): Fluctuation strength of modulated tones and broadband noise. In: Klinke, R., Hartmann, R. (Eds.): Hearing - Physiological Bases and Psychophysics. Springer, Berlin, 282-288.
Fastl, H. (1984a): Dynamic hearing sensations: Facts and models. In: Transactions of the Committee on Hearing Research H-84-13, The Acoust. Soc. of Japan.
Fastl, H. (1984b): Dynamic hearing sensations: Facts and models. J. Acoust. Soc. Jpn. 40, 767-771. (In Japanese).
Fastl, H. (1984c): Schwankungsstärke und zeitliche Hüllkurve von Sprache und Musik. In: Fortschritte der Akustik - DAGA 84. Verlag DPG-GmbH, Bad Honnef, 739-742.
Fastl, H. (1984d): Folgedrosselung von Sinustönen durch Breitbandrauschen: Meßergebnisse und Modellvorstellungen. Acustica 54, 145-153.
Fastl, H. (1984e): An instrument for measuring temporal integration in hearing. Audiol. Acoustics 23, 164-170.
Fastl, H. (1985a): Loudness and annoyance of sounds: Subjective evaluation and data from ISO 532 B. In: Proc. inter-noise’85, Vol. II, 1403-1406.
Fastl, H. (1985b): Auditory adaptation, post masking and temporal resolution. Audiol. Acoustics 24, 144-154, 168-177.
Fastl, H. (1986): Gibt es den Frequenzgang von Kopfhörern? In: NTG-Fachberichte, Hörrundfunk 7, VDE-Verlag GmbH, Berlin, 274-281.
Fastl, H. (1987a): Nachverdeckung von Schmalbandrauschen bzw. kubischen Differenzrauschen. In: Fortschritte der Akustik - DAGA 87. Verlag DPG-GmbH, Bad Honnef, 565-568.
Fastl, H. (1987b): How loud is a passing vehicle? In: Proc. inter-noise’87, Vol. II, 993-996.
Fastl, H. (1987c): A background noise for speech audiometry. Audiol. Acoustics 26, 2-13.
Fastl, H. (1987d): On the influence of AGC hearing aids of different types on the loudness-time pattern of speech. Audiol. Acoustics 26, 42-48.
Fastl, H. (1987e): The influence of different measuring methods and background noises on the temporal integration in noise-induced hearing loss. Audiol. Acoustics 26, 66-82.
Fastl, H. (1988a): Gehörbezogene Lärmmeßverfahren. In: Fortschritte der Akustik - DAGA 88. Verlag DPG-GmbH, Bad Honnef, 111-124.
Fastl, H. (1988b): Noise measurement procedures simulating our hearing system. J. Acoust. Soc. Jpn. (E) 9, 75-80.
Fastl, H. (1988c): Pitch and pitch strength of peaked ripple noise. In: Basic Issues in Hearing, Proc. of the 8th Intern. Symp. on Hearing, Academic Press, London, 370-379.
Fastl, H. (1989a): Zum Zwicker-Ton bei Linienspektren mit spektralen Lücken. Acustica 67, 177-186.
Fastl, H. (1989b): Pitch strength of pure tones. In: Proc. 13. ICA Belgrade, Vol. 3, 11-14.
Fastl, H. (1989c): Average loudness of road traffic noise. In: Proc. inter-noise’89, Vol. II, 815-820.
Fastl, H. (1990a): The hearing sensation roughness and neuronal responses to AM-tones. Hearing Res. 46, 293-296.
Fastl, H. (1990b): Gehörbezogene Lautstärke - Messverfahren in der Musik. Das Orchester 38, 1-6.
Fastl, H. (1990c): Loudness of running speech measured by a loudness meter. Acustica 71, 156-158.
Fastl, H. (1990d): Trading number of operations versus loudness of aircraft. In: Proc. inter-noise’90, Vol. II, 1133-1136.
Fastl, H. (1991a): Beurteilung und Messung der wahrgenommenen äquivalenten Dauerlautheit. Z. für Lärmbekämpfung 38, 98-103.
Fastl, H. (1991b): Evaluation and measurement of perceived average loudness. In: Schick, A. et al. (Eds.): Contributions to psychological acoustics. Bibliotheks- und Informationssystem der Carl-v.-Ossietzky-Universität, Oldenburg, 205-216.
Fastl, H. (1991c): On the reduction of road traffic noise by „whispering asphalt“. In: Proc. Congress, Acoust. Soc. of Japan, Nagano, 681-682.
Fastl, H. (1992): Fluctuation strength of narrow band noise. In: Auditory Physiology and Perception, Advances in the Biosciences, Vol. 83, Pergamon Press Oxford, 331-336.
Fastl, H. (1993a): Loudness evaluation by subjects and by a loudness meter. In: Verrillo, R.T. (Ed.): Sensory Research, Multimodal Perspectives. Lawrence Erlbaum Ass., Hillsdale, New Jersey, 199-210.
Fastl, H. (1993b): Psychoacoustics and noise evaluation. In: Schick, A. et al. (Eds.): Contributions to psychological acoustics. Bibliotheks- und Informationssystem der Carl-v.-Ossietzky-Universität, Oldenburg, 505-520.
Fastl, H. (1993c): Calibration signals for meters of loudness, sharpness, fluctuation strength, and roughness. In: Proc. inter-noise’93, Vol. III, 1257-1260.
Fastl, H. (1994a): Psychoacoustics and noise evaluation. In: Olesen, H.S. (Ed.): NAM’94, Aarhus, Denmark, Danish Technol. Institute, 1-12.
Fastl, H. (1994b): Psychoakustik und Geräuschbeurteilung. In: Vo, Q. et al. (Eds.): Soundengineering, expert-verl., Renningen, 10-33.
Fastl, H. (1996a): The Psychoacoustics of Sound-Quality Evaluation. In: Proc. EAA-Tutorium, Antwerpen, 1-20.
Fastl, H. (1996b): Psychoakustik und Hörgeräteanpassung. In: Zukunft der Hörgeräte, Schriftenreihe der GEERS-Stiftung, Band 11, 133-146.
Fastl, H. (1996c): Masking effects and loudness evaluation. In: Fastl, H. et al. (Eds.): Recent Trends in Hearing Research. Bibliotheks- und Informationssystem der Carl-v.-Ossietzky-Universität, Oldenburg, 29-50.
Fastl, H. (1997a): Gehörgerechte Geräuschbeurteilung. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 57-64.
Fastl, H. (1997b): Psychoacoustic noise evaluation. In: Proceedings of the 31st International Acoustical Conference - High Tatras ‘97, 21-26.
Fastl, H. (1997c): Pitch strength and frequency discrimination for noise bands or complex tones. 11th International Symposium on Hearing, Grantham, UK, August 1st-6th, 1997.
Fastl, H., Bechly, M. (1983): Suppression in simultaneous masking. J. Acoust. Soc. Am. 74, 754-757.
Fastl, H., Fleischer, H. (1992): Über die Ausgeprägtheit der Tonhöhe von Paukenklängen. In: Fortschritte der Akustik - DAGA 92. Verlag DPG-GmbH, Bad Honnef, 237-240.
Fastl, H., Hesse, A. (1984): Frequency discrimination for pure tones at short durations. Acustica 56, 41-47.
Fastl, H., Hesse, A., Schorer, E., Urbas, J., Müller-Preuss, P. (1986a): Searching for neural correlates of the hearing sensation fluctuation strength in the auditory cortex of squirrel monkeys. Hearing Res. 23, 199-203.
Fastl, H., Hunecke, J. (1995): Psychoakustische Experimente zum Fluglärmmalus. In: Fortschritte der Akustik - DAGA 95. Dt. Gesellschaft für Akustik e.V., Oldenburg, 407-410.
Fastl, H., Jaroszewski, A., Schorer, E., Zwicker, E. (1990b): Equal loudness contours between 100 and 1000 Hz for 30, 50, and 70 phon. Acustica 70, 197-201.
Fastl, H., Krump, G. (1995): Pitch of the Zwicker-tone and masking patterns. In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 457-466.
Fastl, H., Kuwano, S., Namba, S. (1996b): Assessing the railway bonus in laboratory studies. J. Acoust. Soc. Jpn. (E) 17, 139-148.
Fastl, H., Kuwano, S., Schick, A. (Eds.) (1996a): Recent Trends in Hearing Research. Bibliotheks- und Informationssystem der Carl-v.-Ossietzky-Universität, Oldenburg.
Fastl, H., Markus, D., Nitsche, V. (1985c): Zur Lautheit und Lästigkeit von Fluglärm. In: Fortschritte der Akustik - DAGA 85. Verlag DPG-GmbH, Bad Honnef, 227-230.
Fastl, H., Namba, S., Kuwano, S. (1985a): Freefield response of the headphone Yamaha HP 1000 and its application in the Japanese Round Robin Test on impulsive sound. Acustica 58, 183-185.
Fastl, H., Namba, S., Kuwano, S. (1986b): Cross-cultural investigations of loudness evaluation for noises. In: A. Schick et al. (Eds.): Contributions to Psychological Acoustics. Kohlrenken, Oldenburg, 354-369.
Fastl, H., Namba, S., Kuwano, S. (1986c): Cross-cultural study on loudness evaluation of road traffic noise and impulsive noise: Actual sounds and simulations. In: Proc. inter-noise’86, Vol. II, 825-830.
Fastl, H., Schmid, W. (1997): Comparison of loudness analysis systems. In: Proc. inter-noise’97, Vol. II, 981-986.
Fastl, H., Schmid, W., Kuwano, S., Namba, S. (1996c): Untersuchungen zum Schienenbonus in Gebäuden. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 208-209.
Fastl, H., Schmid, W., Theile, G., Zwicker, E. (1985b): Schallpegel im Gehörgang für gleichlaute Schalle aus Kopfhörern oder Lautsprechern. In: Fortschritte der Akustik - DAGA 85. Verlag DPG-GmbH, Bad Honnef, 471-474.
Fastl, H., Schorer, E. (1986): Critical bandwidth at low frequencies reconsidered. In: Moore, B.C.J. et al. (Eds.): Auditory Frequency Selectivity. Plenum Press, New York, 311-318.
Fastl, H., Schorn, K. (1981): Discrimination of level differences by hearing-impaired patients. Audiology 20, 488-502.
Fastl, H., Schorn, K. (1984): On the diagnostic relevance of level discrimination. Audiology 23, 140-142.
Fastl, H., Widmann, U. (1990): Subjective and physical evaluation of aircraft noise. Noise Contr. Engng. J. 35, 61-63.
Fastl, H., Widmann, U., Kuwano, S., Namba, S. (1991b): Zur Lärmminderung durch Geschwindigkeitsbeschränkungen. In: Fortschritte der Akustik - DAGA 91. Verlag DPG-GmbH, Bad Honnef, 449-452.
Fastl, H., Widmann, U., Müller-Preuss, P. (1991): Correlations between hearing and vocal activity in man and squirrel monkey. Acustica 73, 35-36.
Fastl, H., Wiesmann, N. (1990): Ausgeprägtheit der virtuellen Tonhöhe von AM- und QFM-Tönen. In: Fortschritte der Akustik - DAGA 90. Verlag DPG-GmbH, Bad Honnef, 759-762.
Fastl, H., Yamada, Y. (1986): Cross-cultural study on loudness and annoyance of broadband noise with a tonal component. In: Schick, A. et al. (Eds.): Contributions to Psychological Acoustics, Kohlrenken, Oldenburg, 341-353.
Fastl, H., Zwicker, E. (1983): A free-field equalizer for TDH 39 earphones. J. Acoust. Soc. Am. 73, 312-314.
Fastl, H., Zwicker, E. (1986): Beurteilung lärmarmer Fahrbahnbeläge mit einem Lautheitsmesser. In: Fortschritte der Akustik - DAGA 86. Verlag DPG-GmbH, Bad Honnef, 223-226.
Fastl, H., Zwicker, E., Fleischer, R. (1990a): Beurteilung der Verbesserung der Sprachverständlichkeit in einem Hörsaal. Acustica 71, 287-292.
Fastl, H., Zwicker, E., Kuwano, S., Namba, S. (1988): Beschreibung von Lärmimmissionen anhand der Lautheit. In: Fortschritte der Akustik - DAGA 89. Verlag DPG-GmbH, Bad Honnef, 751-754.
Faulstich, M., Kössl, M., Reimer, K. (1996): Analysis of nonlinear cochlear mechanics in the marsupial, Monodelphis domestica: ancestral and modern mammalian features. Hearing Res. 94, 47-53.
Feng, A.S., Vater, M. (1985): Functional organization of the cochlear nucleus of rufous horseshoe bats (Rhinolophus rouxi). Frequencies and internal connections are arranged in slabs. J. Comp. Neurol. 235, 529-553.
Fischer, F.P. (1998): Hair-cell morphology and innervation in the basilar papilla of the emu (Dromaius novaehollandiae). Hearing Res. 121, 112-124.
Fischer, F.P. (1992): Quantitative analysis of the innervation of the chicken basilar papilla. Hearing Res. 61, 167-178.
Fischer, F.P. (1994a): General pattern and morphological specializations of the avian cochlea. Scanning Microsc. 8, 351-364.
Fischer, F.P. (1994b): Quantitative TEM analysis of the barn owl basilar papilla. Hearing Res. 73, 1-15.
Fischer, F.P., Brix, J., Singer, I., Miltz, C. (1991): Contacts between hair cells in the avian cochlea. Hearing Res. 53, 281-292.
Fischer, F.P., Eisensamer, B., Manley, G.A. (1994): Cochlear and lagenar ganglia of the chicken. J. Morphol. 220, 71-83.
Fischer, F.P., Köppl, C., Manley, G.A. (1988): The basilar papilla of the barn owl Tyto alba: A quantitative morphological SEM analysis. Hearing Res. 34, 87-102.
Fischer, F.P., Miltz, C., Singer, I., Manley, G.A. (1992): Morphological gradients in the starling basilar papilla. J. Morphol. 213, 225-240.
Florentine, M., Fastl, H., Buus, S. (1988): Temporal integration in normal hearing, cochlear impairment, and impairment simulated by masking. J. Acoust. Soc. Am. 84, 195-203.
Frank, G. (1996): Die akustischen Zweitonverzerrungen f2-f1 und 2f1-f2 als Indikatoren mechanischer Verstärkungsprozesse im Säugerinnenohr am Beispiel der Wüstenrennmaus. Doctoral thesis, Zoologisches Institut, Universität München.
Frank, G., Kössl, M. (1995): The shape of 2f1-f2 suppression tuning curves reflect basilar membrane specializations in the mustached bat. Hearing Res. 83, 151-160.
Frank, G., Kössl, M. (1996): The acoustic two-tone distortions 2f1-f2 and f2-f1 and their possible relation to changes of the gain and operating point of the cochlear amplifier. Hearing Res. 98, 104-115.
Frank, G., Kössl, M. (1997): Electrical and acoustical biassing of the cochlear partition: Effects on the f2-f1 and 2f1-f2 distortions. Hearing Res. (in press).
Fromm, S. (1990): Anatomische und physiologische Charakterisierung des medialen Geniculatums der Hufeisennase Rhinolophus rouxi. Diplomarbeit Ludwig-Maximilians-Universität, München.
Gleich, O. (1989): Auditory primary afferents in the starling: Correlation of function and morphology. Hearing Res. 37, 255-267.
Gleich, O. (1994): Excitation patterns in the starling cochlea: A population study of primary auditory afferents. J. Acoust. Soc. Am. 95, 401-409.
Gleich, O., Dooling, R.J., Manley, G.A. (1994a): Inner-ear abnormalities and their functional consequences in Belgian Waterslager canaries (Serinus canarius). Hearing Res. 79, 123-136.
Gleich, O., Dooling, R.J., Manley, G.A., Klump, G.M., Strutz, J. (1995a): Hinweise für eine kontinuierliche Haarzell-Regeneration bei einem Singvogel mit erblich bedingter, cochleärer Hörstörung. HNO 43, 287-293.
Gleich, O., Kadow, C., Vater, M., Strutz, J. (1997): The gerbil cochlear nucleus: postnatal development and deprivation. In: Syka, J. (Ed.): Acoustical signal processing in the central auditory system. Plenum Publishing Co., New York, 167-174.
Gleich, O., Klump, G.M. (1995): Temporal modulation transfer functions in the European starling (Sturnus vulgaris): II. Responses of auditory-nerve fibres. Hearing Res. 82, 81-92.
Gleich, O., Klump, G.M., Dooling, R.J. (1994b): Hereditary sensorineural hearing loss in a bird. Naturwissenschaften 81, 320-323.
Gleich, O., Klump, G.M., Dooling, R.J. (1995b): Peripheral basis for the auditory deficit in Belgian Waterslager canaries (Serinus canarius). Hearing Res. 82, 100-108.
Gleich, O., Manley, G.A. (1988): Quantitative morphological analysis of the sensory epithelium of the starling and pigeon basilar papilla. Hearing Res. 34, 69-85.
Gleich, O., Manley, G.A. (1998): The Hearing Organ of Birds and Crocodilia. In: Dooling, R., Popper, A.N., Fay, R.R. (Eds.): Comparative Hearing: Birds and Reptiles. Springer Handbook of Auditory Research, in press.
Gleich, O., Manley, G.A., Mandl, A., Dooling, R.J. (1994): Basilar papilla of the canary and zebra finch: A quantitative scanning electron microscopical description. J. Morphol. 221, 1-24.
Gleich, O., Narins, P.M. (1988): The phase response of primary auditory afferents in a songbird (Sturnus vulgaris L.). Hearing Res. 32, 81-92.
Gleich, O., Vater, M. (1998): The postnatal development of GABA- and Glycine-like immunoreactivity in the cochlear nucleus of the mongolian gerbil (Meriones unguiculatus). Cell Tiss. Res., in press.
Gralla, G. (1993a): Wahrnehmungskriterien bei Simultan- und Nachhörschwellenmessungen. Acustica 77, 243-251.
Gralla, G. (1993b): Modelle zur Beschreibung von Wahrnehmungskriterien bei Mithörschwellenmessungen. Acustica 78, 233-245.
Gralla, H.G. (1991): Nachhörschwellen in Abhängigkeit von der spektralen Zusammensetzung des Maskierers. In: Fortschritte der Akustik - DAGA 91. Verlag DPG-GmbH, Bad Honnef, 501-504.
Grothe, B. (1994): Interaction of excitation and inhibition in processing of pure tone and amplitude modulated stimuli in the medial superior olive of the mustached bat. J. Neurophysiol. 71 (2), 706-721.
Grothe, B. (1997?): Evolution der akustischen Kommunikation (Phylogenese). Physik Med 7 (5), 251-256.
Grothe, B. (1997a): Inhibition is essential for temporal processing in the auditory pathway. Verh. Deutsch. Zool. Ges. 90, 284.
Grothe, B., Covey, E., Casseday, J.H. (1996): Spatial tuning of neurons in the inferior colliculus of the big brown bat: effects of sound level, stimulus type and multiple sound sources. J. Comp. Physiol. A 179, 89-102.
Grothe, B., Koch, U., Park, T.J., Pollak, G.D. (1997): Interaction of neuronal filter properties for periodicity coding and binaural stimulus parameter differs in different nuclei of the ascending auditory brainstem. ARO-Abstr. 358.
Grothe, B., Park, T.J. (1995): Time can be traded for intensity in the lower auditory system. Naturwiss. 82, 521-523.
Grothe, B., Park, T.J. (1998): Sensitivity to interaural time differences in the medial superior olive of a small mammal, the Mexican free-tailed bat. J. Neurosci. 18(16), 6608-6622.
Grothe, B., Park, T.J., Schuller, G. (1997): The medial superior olive in the free-tailed bat: Response to pure tones and amplitude modulated tones. J. Neurophysiol. 77(3), 1553-1565.
Grothe, B., Sanes, D.H. (1993): Bilateral inhibition by glycinergic afferents in the medial superior olive. J. Neurophysiol. 69, 1192-1196.
Grothe, B., Sanes, D.H. (1994): Inhibition influences the temporal response properties of gerbil medial superior olivary neurons: An in-vitro study. J. Neurosci. 14 (3), 1701-1709.
Grothe, B., Schweizer, H., Pollak, G.D., Schuller, G., Rosemann, C. (1994): Anatomy and Projection Patterns of the Superior Olivary Complex in the Mexican Free-Tailed Bat, Tadarida brasiliensis mexicana. J. Comp. Neurol. 343, 630-646.
Grothe, B., Vater, M., Casseday, J.H., Covey, E. (1992): Monaural excitation and inhibition in the medial superior olive of the mustached bat: An adaptation to biosonar. Proc. Natl. Acad. Sci. USA 89, 5108-5112.
Habbicht, H., Vater, M. (1996): The effects of microiontophoretically applied acetylcholine in the auditory midbrain of the horseshoe bat. Brain Res. 724, 169-179.
Hautmann, I., Fastl, H. (1993): Zur Verständlichkeit von Einsilbern und Dreisilbern im Störgeräusch. In: Fortschritte der Akustik - DAGA 93. Verlag DPG-GmbH, Bad Honnef, 784-787.
Heinbach, W. (1988): Aurally adequate signal representation: The part-tone-time-pattern. Acustica 67, 113-121.
Heinze, M., Schmidt, S., Wiegrebe, L. (1996): Auditory temporal summation in the bat, Megaderma lyra. In: Elsner, N., Schnitzler, H.-U. (Eds.): Göttingen Neurobiology Report 1996, Proc. of the 24th Göttingen Neurobiol. Conf. II, Thieme, Stuttgart, Abstract No. 235.
Hellman, R., Zwicker, E. (1987): Why can a decrease in dB(A) produce an increase in loudness? J. Acoust. Soc. Am. 82, 1700-1705.
Hellman, R.P., Zwicker, E. (1989): Loudness of two-tone-noise complexes. In: Proc. inter-noise’89, Vol. II, 827-832.
Henning, G.B., Zwicker, E. (1990): The effect of a low-frequency masker on loudness. Hearing Res. 47, 17-23.
Henning, G.B., Zwicker, E. (1984): Effects of the bandwidth and level of noise and of the duration of the signal on binaural masking level differences. Hearing Res. 14, 175-178.
Henning, G.B., Zwicker, E. (1985): Binaural masking-level differences with tonal maskers. Hearing Res. 16, 279-290.
Henson, O.W. jr., Schuller, G., Vater, M. (1985): A comparative study of the physiological properties of the inner ear in Doppler shift compensating bats (Rhinolophus rouxi and Pteronotus parnellii). J. Comp. Physiol. 157, 587-607.
Hesse, A. (1987a): Zur Tonhöhenverschiebung von Sinustönen durch Störgeräusche. Acustica 62, 264-281.
Hesse, A. (1987b): Ein Funktionsschema der Spektraltonhöhe von Sinustönen. Acustica 63, 1-16.
Hojan, E., Fastl, H. (1996): Intelligibility of Polish and German speech for the Polish audience in the presence of noise. Archives of Acoustics 21, 123-130.
Janssen, Th. (1996): Otoakustische Emissionen. In: Lehnhardt E. (Ed.): Praxis der Audiometrie. Thieme Verlag, 7. Auflage, ISBN 3-13-369007-8, 83-112.
Janssen, Th., Arnold, W. (1995): Otoakustische Emissionen und Tinnitus: DPOAE eine Meßmethode zum objektiven Nachweis des auf der Ebene der äußeren Haarzellen entstehenden Tinnitus? Otorhinolaryngol. NOVA 5, 127-141.
Janssen, Th., Kummer, P., Arnold, W. (1995a): Wachstumsverhalten der Distorsionsproduktemissionen bei normaler Hörfunktion. Otorhinolaryngol. NOVA 5, 211-222.
Janssen, Th., Kummer, P., Arnold, W. (1995b): Wachstumsverhalten der Distorsionsproduktemissionen bei kochleären Hörstörungen. Otorhinolaryngol. NOVA 5, 34-46.
Janssen, Th., Kummer, P., Arnold, W. (1998): Growth behavior of the 2f1-f2 distortion product otoacoustic emission in tinnitus. J. Acoust. Soc. Am. 103(6), 3418-3430.
Janssen, Th., Kummer, P., Boege, P., Scholz, M., Arnold, W. (1997): Rekonstruktion von Hörschwellenverläufen mit DPOAE-Schätzmodellen. Audiologische Akustik 36(4), 178-190.
Jurzitza, D., Hemmert, W. (1992): Quantitative measurements of simultaneous evoked otoacoustic emissions. Acustica 77, 93-99.
Kaiser, A. (1992): The ontogeny of homeothermic regulation in post-hatching chicks: Its influence on the development of hearing. Comp. Biochem. Physiol. 103 A, 105-111.
Kaiser, A., Covey, E. (1996): Serotonin-immunoreactive innervation of the auditory pathway of an echolocating bat, Eptesicus fuscus. In: Syka, J. (Ed.): Acoustic signal processing in the central auditory system. Prag, 1-9.
Kaiser, A., Manley, G.A. (1994): Physiology of single putative cochlear efferents in the chicken. J. Neurophysiol. 72, 2966-2979.
Kaiser, A., Manley, G.A. (1996): Brainstem connections of the Macula lagenae in the chicken. J. Comp. Neurol. 374, 108-117.
Kautz, D. (1997): Microiontophoretic studies on the physiological mechanism of auditory motion-direction detection in the inferior colliculus of the barn owl (Tyto alba). Doctoral Thesis, Institut for Biology II, RWTH Aachen.
Kautz, D., Wagner, H. (1994): Micro-iontophoretic studies suggest similar neuronal mechanisms for visual and acoustic motion-direction sensitivity. In: Elsner, N., Breer, H. (Eds.): Göttingen Neurobiology Report 1994. Thieme Verlag, Stuttgart, New York, 391.
Kautz, D., Wagner, H. (1994): Neural mechanisms of auditory motion-direction sensitivity. Am. Soc. Neurosci. Abstr. 20, 320.
Kautz, D., Wagner, H. (1995): Acoustic motion-direction sensitivity in the barn owl: Temporal fine structure of spike trains. In: Elsner, N., Menzel, R. (Eds.): Proceedings of the 23rd Göttingen Neurobiology Conference 1995, Vol. I. Thieme Verlag, Stuttgart, New York, 149.
Kautz, D., Wagner, H., Takahashi, T. (1992): A computer model for acoustic motion-direction sensitivity in the barn owl. In: Elsner, N., Richter, D.W. (Eds.): Rhythmogenesis in neurons and networks. Thieme Verlag, Stuttgart, New York, 727.
Kemmer, M., Vater, M. (1997): The distribution of GABA and glycine immunostaining in the cochlear nucleus of the mustached bat. Cell Tiss. Res. 287, 487-506.
Kettembeil, S., Manley, G.A., Siegl, E. (1995): Distortion-product otoacoustic emissions and their anaesthesia sensitivity in the European starling and the chicken. Hearing Res. 86, 47-62.
Klump, G.M. (1991): The detection of upward and downward frequency sweeps in the European starling (Sturnus vulgaris). Naturwiss. 78, 469-471.
Klump, G.M., Baur, A. (1990): Intensity discrimination in the European starling (Sturnus vulgaris). Naturwiss. 77, 545-548.
Klump, G.M., Gleich, O. (1991): Gap detection in the European starling (Sturnus vulgaris). III: Processing in the peripheral auditory system. J. Comp. Physiol. A 168, 469-476.
Klump, G.M., Langemann, U. (1995): Comodulation masking release in a songbird. Hearing Res. 87, 157-164.
Klump, G.M., Maier, E.H. (1989): Gap detection in the starling (Sturnus vulgaris). I. Psychophysical thresholds. J. Comp. Physiol. 164, 531-538.
Klump, G.M., Maier, E.H. (1990): Temporal summation in the European starling (Sturnus vulgaris). J. Comp. Psychol. 104, 94-100.
Klump, G.M., Okanoya, K. (1991): Temporal modulation transfer functions in the European starling (Sturnus vulgaris): I. Psychophysical modulation detection thresholds. Hearing Res. 52, 1-12.
Koch, U., Grothe, B. (1997): Azimuthal position influences analysis of complex sounds in the mammalian auditory system. Naturwiss. 84, 160-162.
Koch, U., Grothe, B. (1998): GABAergic and Glycinergic Inhibition Sharpens Tuning for Sinusoidal Frequency Modulations in the Inferior Colliculus of the Big Brown Bat. J. Neurophysiol. 80, 71-82.
Koch, U., Grothe, B. (1998): Ipsilateral inhibition influences rate coding for frequency-modulated stimuli. J. Neurophysiol. in press.
Köppl, C. (1988): Morphology of the basilar papilla of the bobtail lizard Tiliqua rugosa. Hearing Res. 35, 209-228.
Köppl, C. (1993): Hair-cell specializations and the auditory fovea in the barn owl cochlea. In: Duifhuis, H., Horst, J.W., van Dijk, P., van Netten, S.M. (Eds.): Biophysics of Hair Cell Sensory Systems. World Scientific Publishing Co., Singapore, 216-222.
Köppl, C. (1994): Auditory nerve terminals in the cochlear nucleus magnocellularis: Differences between low and high frequencies. J. Comp. Neurol. 339, 438-446.
Köppl, C. (1995): Otoacoustic emissions as an indicator for active cochlear mechanics: A primitive property of vertebrate auditory organs. In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 207-218.
Köppl, C. (1997a): Frequency tuning and spontaneous activity in the auditory nerve and cochlear nucleus magnocellularis of the barn owl Tyto alba. J. Neurophysiol. 77, 364-377.
Köppl, C. (1997b): Number and axon calibres of cochlear afferents in the barn owl. Auditory Neurosci. 3, 313-334.
Köppl, C. (1997c): Phase locking to high frequencies in the auditory nerve and cochlear nucleus magnocellularis of the barn owl Tyto alba. J. Neurosci. 17, 3312-3321.
Köppl, C., Authier, S. (1995): Quantitative anatomical basis for a model of micromechanical frequency tuning in the Tokay gecko, Gekko gecko. Hearing Res. 82, 14-25.
Köppl, C., Carr, C.E. (1997): Low-frequency pathway in the barn owl’s auditory brainstem. J. Comp. Neurol. 378, 265-282.
Köppl, C., Gleich, O. (1988): Cobalt labelling of single primary auditory neurones: An alternative to HRP. Hearing Res. 32, 111-116.
Köppl, C., Gleich, O., Manley, G.A. (1993): An auditory fovea in the barn owl cochlea. J. Comp. Physiol. A 171, 695-704.
Köppl, C., Klump, G.M., Taschenberger, G., Dyson, M., Manley, G.A. (1998): The auditory fovea of the barn owl - no correlation with enhanced frequency resolution. In: Palmer, A.R., Rees, A., Summerfield, A.Q., Meddis, R. (Eds.): Psychophysical and physiological advances in hearing. Whurr Publishers Ltd, London, 153-159.
Köppl, C., Manley, G.A. (1990a): Peripheral auditory processing in the bobtail lizard Tiliqua rugosa. II: Tonotopic organization and innervation pattern of the basilar papilla. J. Comp. Physiol. A 167, 101-112.
Köppl, C., Manley, G.A. (1990b): Peripheral auditory processing in the bobtail lizard Tiliqua rugosa. III: Patterns of spontaneous and tone-evoked nerve-fibre activity. J. Comp. Physiol. A 167, 113-127.
Köppl, C., Manley, G.A. (1992): Functional consequences of morphological trends in the evolution of lizard hearing organs. In: Webster, D.B., Fay, R.R., Popper, A.N. (Eds.): The Evolutionary Biology of Hearing, 1st ed., Springer Verlag, New York, 489-510.
Köppl, C., Manley, G.A. (1993a): Distortion-product otoacoustic emissions in the bobtail lizard. II: Suppression tuning characteristics. J. Acoust. Soc. Am. 93, 2834-2844.
Köppl, C., Manley, G.A. (1993b): Spontaneous otoacoustic emissions in the bobtail lizard. I: General characteristics. Hearing Res. 71, 157-169.
Köppl, C., Manley, G.A. (1994): Spontaneous otoacoustic emissions in the bobtail lizard. II: Interactions with external tones. Hearing Res. 72, 159-170.
Köppl, C., Manley, G.A. (1997): Frequency representation in the emu basilar papilla. J. Acoust. Soc. Am. 101, 1574-1584.
Köppl, C., Manley, G.A., Johnstone, B.M. (1990): Peripheral auditory processing in the bobtail lizard Tiliqua rugosa. V: Seasonal effects of anaesthesia. J. Comp. Physiol. A 167, 139-144.
Köppl, C., Yates, G.K., Manley, G.A. (1997): The mechanics of the avian cochlea: Rate-intensity functions of auditory-nerve fibres in the emu. In: Lewis, E.R., Long, G., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.): Diversity in Auditory Mechanics. World Scientific Publishing Co., Singapore, 76-82.
10 References
Kössl, M. (1992): High frequency distortion products from the ears of two bat species, Megaderma lyra and Carollia perspicillata. Hearing Res. 60, 156-164.
Kössl, M. (1994a): Evidence for a mechanical filter in the cochlea of the ‘constant frequency’ bats Rhinolophus rouxi and Pteronotus parnellii. Hearing Res. 72, 73-80.
Kössl, M. (1994b): Otoacoustic emission from the cochlea of the ‘constant frequency’ bats Pteronotus parnellii and Rhinolophus rouxi. Hearing Res. 72, 59-72.
Kössl, M. (1997): Sound emissions from cochlear filters and foveae - Does the auditory sense organ make sense? Naturwiss. 84, 9-16.
Kössl, M., Frank, G. (1995): Acoustic two-tone distortions from the cochlea of echolocating bats. In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 125-135.
Kössl, M., Frank, G., Burda, H., Müller, M. (1996): Acoustic distortion products from the cochlea of the blind African mole rat Cryptomys spec. J. Comp. Physiol. A 178, 427-434.
Kössl, M., Frank, G., Faulstich, M., Russell, I.J. (1997): Acoustic distortion products as indicator of cochlear adaptations in Jamaican mormoopid bats. In: Lewis, E.R., Long, G., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.): Diversity in Auditory Mechanics. World Scientific Publishing Co., Singapore, 42-49.
Kössl, M., Russell, I.J. (1995): Basilar membrane resonance in the cochlea of the mustached bat. Proc. Natl. Acad. Sci. USA 92, 276-279.
Kössl, M., Vater, M. (1985a): Evoked acoustic emissions and cochlear microphonics in the mustache bat, Pteronotus parnellii. Hearing Res. 19, 157-170.
Kössl, M., Vater, M. (1985b): The cochlear frequency map of the mustache bat, Pteronotus parnellii. J. Comp. Physiol. A 157, 687-697.
Kössl, M., Vater, M. (1989): Noradrenalin enhances temporal auditory contrast and neuronal timing precision in the cochlear nucleus of the mustache bat. J. Neurosci. 9, 4169-4178.
Kössl, M., Vater, M. (1990a): Tonotopic organization of the cochlear nucleus of the mustache bat, Pteronotus parnellii. J. Comp. Physiol. A 166, 695-709.
Kössl, M., Vater, M. (1990b): Resonance phenomena in the cochlea of the mustache bat and their contribution to neuronal response characteristics in the cochlear nucleus. J. Comp. Physiol. A 166, 711-720.
Kössl, M., Vater, M. (1995): Cochlear structure and function in bats. In: Popper, A., Fay, R.R. (Eds.): Springer Handbook of Auditory Research, Vol. 11: Hearing by bats. Springer, New York, 191-234.
Kössl, M., Vater, M. (1996): Antibiotics induced hair cell damage and concomitant changes in otoacoustic emissions and vocalization in the mustached bat. 19th ARO, 182.
Kössl, M., Vater, M. (1996a): A tectorial membrane fovea in the cochlea of the mustached bat. Naturwiss. 83, 89-91.
Kössl, M., Vater, M. (1996b): Further studies on the mechanics of the cochlear partition in the mustached bat. II: A second cochlear frequency map derived from measurement of acoustic distortion products. Hearing Res. 94, 78-86.
Kössl, M., Vater, M., Schweizer, H. (1988): Distribution of catecholaminergic fibers in the cochlear nucleus of horseshoe bats and mustache bats. J. Comp. Neurol. 278, 313-329.
Krull, D. (1992): Jagdverhalten und Echoortung bei Antrozous pallidus. Dissertation der Fak. Biologie an der LMU München.
Krull, D., Schumm, A., Metzner, W., Neuweiler, G. (1991): Foraging areas and foraging behavior in the notch-eared bat, Myotis emarginatus (Vespertilionidae). Behav. Ecol. Sociobiol. 28, 247-253.
Krumbholz, K., Schmidt, S. (1997): Harmonicity versus broadband spectral analysis for the classification of complex sounds in the bat, Megaderma lyra. Association for Research in Otolaryngology, Midwinter Meeting, Abstract No. 574.
Krumbholz, K., Schmidt, S. (1998): Focusing on single components in the context of broadband spectral analysis in Megaderma lyra. Association for Research in Otolaryngology, Midwinter Meeting, Abstract No. 558, 140.
Krumbholz, K., Schmidt, S. (1999): Perception of complex tones and its analogy to echo spectral analysis in the bat, Megaderma lyra. J. Acoust. Soc. Am.
Krump, G. (1992): Zum Zwicker-Ton bei Linienspektren unterschiedlicher Phasenlagen. In: Fortschritte der Akustik - DAGA 92. Verlag DPG-GmbH, Bad Honnef, 825-828.
Krump, G. (1994): Zum Zwicker-Ton bei binauraler Anregung. In: Fortschritte der Akustik - DAGA 94. Verlag DPG-GmbH, Bad Honnef, 1005-1008.
Krump, G. (1996): Linienspektren als Testsignale in der Akustik. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 288-289.
Kuhn, B., Vater, M. (1995): The distribution of F-actin, tubulin and fodrin in the organ of Corti of the horseshoe bat and gerbil. Hearing Res. 84, 139-156.
Kuhn, B., Vater, M. (1996): The early postnatal development of F-actin patterns in the organ of Corti of the gerbil (Meriones unguiculatus) and the horseshoe bat (Rhinolophus rouxi). Hearing Res. 99, 47-70.
Kuhn, B., Vater, M. (1997): The development of F-actin tension fibroblasts of the spiral ligament of the gerbil cochlea. Hearing Res. 108, 180-190.
Kummer, P., Janssen, T., Arnold, W. (1998a): The level and growth behavior of the 2f1-f2 distortion product otoacoustic emission and its relationship to the auditory sensitivity in normal hearing and cochlear hearing loss. J. Acoust. Soc. Am. 103(6), 3431-3444.
Kummer, P., Janssen, T., Hulin, P., Arnold, W. (1998b): Primary tone level separation allows 2f1-f2 DPOAE measurements in humans at close-to-threshold levels. Hearing Res. (submitted).
Kummer, P., Janssen, Th., Arnold, W. (1995): Suppression tuning characteristics of the 2f1-f2 distortion product in humans. J. Acoust. Soc. Am. 98, 197-210.
Kummer, P., Janssen, Th., Hulin, P., Arnold, W. (1997): Zur Wahl geeigneter Primärtonpegel bei der Messung von DPOAE. HNO 4, 314.
Kuwano, S., Namba, S., Fastl, H. (1986): Loudness evaluation of various sounds by Japanese and German subjects. In: Proc. inter-noise’86, Vol. II, 835-840.
Kuwano, S., Namba, S., Fastl, H. (1988): On the judgment of loudness, noisiness and annoyance with actual and artificial noises. J. Sound and Vibration 127, 457-465.
Kuwano, S., Namba, S., Fastl, H., Schick, A. (1997): Evaluation of the impression of danger signals - comparison between Japanese and German subjects. In: Schick, A., Klatte, M. (Eds.): 7. Oldenburger Symposium. BIS Oldenburg, pp. 115-128.
Langemann, U. (1995): Die Bedeutung von Störgeräuschen für die akustische Wahrnehmung bei Singvögeln. Doctoral Thesis, Inst. f. Zoologie, Techn. Univ. München.
Langemann, U., Gauger, B., Klump, G.M. (1998): Auditory sensitivity in the great tit: Signal perception in quiet and in the presence of noise. Anim. Behav. 56, 763-769.
Langemann, U., Klump, G.M. (1992): Frequency discrimination in the European starling (Sturnus vulgaris): A comparison of different measures. Hearing Res. 63, 43-51.
Langemann, U., Klump, G.M., Dooling, R.J. (1995): Critical bands and critical-ratio bandwidth in the European starling. Hearing Res. 84, 167-176.
Lechner, T.P. (1993): A hydromechanical model of the cochlea with nonlinear feedback using PVF2 bending transducers. Hearing Res. 66, 202-212.
Leysieffer, H. (1985): Polyvinylidenfluorid als elektromechanischer Wandler für taktile Reizgeber. Acustica 58, 196-206.
Leysieffer, H. (1986): A wearable multi-channel auditory prosthesis with vibrotactile skin stimulation. Audiol. Acoustics 25, 230-251.
Leysieffer, H. (1987): Mehrkanalige Sprachübertragung mit einer vibrotaktilen Sinnesprothese für Gehörlose. In: Fellbaum, K.-H. (Ed.): Workshop Elektronische Kommunikationshilfen, BIG-Tech Berlin’86. Weidler, Berlin, 287-298.
Link, A., Marimuthu, G., Neuweiler, G. (1986): Movement as a specific stimulus for prey catching behavior in rhinolophid and hipposiderid bats. J. Comp. Physiol. A 159, 403-413.
Lumer, G. (1984): Überlagerung von Mithörschwellen an den unteren Flanken schmalbandiger Schalle. Acustica 54, 154-160.
Lumer, G. (1987a): Computer model of cochlear preprocessing (steady state condition) I. Basics and results for one sinusoidal input signal. Acustica 62, 282-290.
Lumer, G. (1987b): Computer model of cochlear preprocessing (steady state condition) II. Two-tone suppression. Acustica 63, 17-25.
Maier, E.H., Klump, G.M. (1990): Duration discrimination in the European starling (Sturnus vulgaris). J. Acoust. Soc. Am. 88, 616-621.
Manley, G.A. (1983a): Auditory nerve fibre activity in mammals. In: Lewis, B. (Ed.): Bioacoustics. Academic Press, London, 207-232.
Manley, G.A. (1983b): Frequency spacing of otoacoustic emissions: a possible explanation. In: Webster, W.R., Aitkin, L.M. (Eds.): Mechanics of hearing. Melbourne, Australia, 36-39.
Manley, G.A. (1983c): The hearing mechanism. In: Hinchcliffe, R. (Ed.): Hearing and Balance in the Elderly. Churchill, Livingstone, Edinburgh, London, Melbourne, New York, 45-73.
Manley, G.A. (1986): The evolution of the mechanisms of frequency selectivity in vertebrates. In: Moore, B.C.J., Patterson, R.D. (Eds.): Auditory Frequency Selectivity. Plenum Press, New York, 63-72.
Manley, G.A. (1990): Peripheral hearing mechanisms in reptiles and birds. 1st ed. Springer Verlag, Berlin, Heidelberg.
Manley, G.A. (1995a): The avian hearing organ: a status report. In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 219-229.
Manley, G.A. (1995b): The lessons of middle-ear function in non-mammals: Improving columellar prostheses. J. R. Soc. Med. 88, 367-368.
Manley, G.A. (1996): Ontogeny of frequency mapping in the peripheral auditory system of birds and mammals: A critical review. Auditory Neurosci. 3, 199-214.
Manley, G.A. (1997): Diversity in hearing organ structure and the characteristics of spontaneous otoacoustic emissions in lizards. In: Lewis, E.R., Long, G., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.): Diversity in auditory mechanics. World Scientific Publishing Co., Singapore, 32-38.
Manley, G.A. (1999): The hearing organs of lizards. In: Dooling, R., Popper, A.N., Fay, R.R. (Eds.): Comparative Hearing: Birds and Reptiles, Springer Handbook of Auditory Research, Springer Verlag, New York, in press.
Manley, G.A., Brix, J., Gleich, O., Kaiser, A., Köppl, C., Yates, G.K. (1988a): New aspects of comparative peripheral auditory physiology. In: Syka, J., Masterton, B. (Eds.): Auditory pathway: Structure and function. Plenum Publishing Co., New York, 3-12.
Manley, G.A., Brix, J., Kaiser, A. (1987a): Developmental stability of the tonotopic organization of the chick’s basilar papilla. Science 237, 655-656.
Manley, G.A., Gallo, L. (1997): Otoacoustic emissions, hair cells and myosin motors. J. Acoust. Soc. Am. 102, 1049-1055.
Manley, G.A., Gallo, L., Köppl, C. (1996a): Spontaneous otoacoustic emissions in two gecko species, Gekko gecko and Eublepharis macularius. J. Acoust. Soc. Am. 99, 1588-1603.
Manley, G.A., Gleich, O. (1984): Avian primary auditory neurones: The relationship between characteristic frequency and preferred intervals. Naturwiss. 71, 592-594.
Manley, G.A., Gleich, O. (1992): Evolution and specialization of function in the avian auditory periphery. In: Webster, D.B., Fay, R.R., Popper, A.N. (Eds.): The Evolutionary Biology of Hearing. 1st ed., Springer Verlag, New York, 561-580.
Manley, G.A., Gleich, O. (1998): The hearing organ of birds and their relatives. In: Dooling, R., Popper, A.N., Fay, R.R. (Eds.): Comparative Hearing: Birds and Reptiles, Springer Handbook of Auditory Research, Springer Verlag, New York, in press.
Manley, G.A., Gleich, O., Brix, J., Kaiser, A. (1988b): Functional parallels between hair-cell populations of birds and mammals. In: Duifhuis, H., Horst, J.W., Witt, H.P. (Eds.): Basic Issues in Hearing. Academic Press, London, 64-71.
Manley, G.A., Gleich, O., Kaiser, A., Brix, J. (1989a): Functional differentiation of sensory cells in the avian auditory periphery. J. Comp. Physiol. A 164, 289-296.
Manley, G.A., Gleich, O., Leppelsack, H.-J., Oeckinghaus, H. (1985): Activity patterns of cochlear ganglion neurones in the starling. J. Comp. Physiol. A 157, 161-181.
Manley, G.A., Haeseler, C., Brix, J. (1991a): Innervation patterns and spontaneous activity of afferent fibres to the lagenar macula and apical basilar papilla of the chick’s cochlea. Hearing Res. 56, 211-226.
Manley, G.A., Kaiser, A., Brix, J., Gleich, O. (1991b): Activity patterns of primary auditory-nerve fibres in chickens: Development of fundamental properties. Hearing Res. 57, 1-15.
Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (1995) (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 1995.
Manley, G.A., Köppl, C. (1992): A quantitative comparison of peripheral tuning measures: primary afferent tuning curves versus suppression tuning curves of spontaneous and distortion-product otoacoustic emissions. In: Cazals, Y., Horner, K., Demany, L. (Eds.): Auditory physiology and perception. 1st ed., Pergamon Press, Oxford, 151-158.
Manley, G.A., Köppl, C. (1994): Spontaneous otoacoustic emissions in the bobtail lizard. III: Temperature effects. Hearing Res. 72, 171-180.
Manley, G.A., Köppl, C. (1998): The phylogenetic development of the cochlea and its innervation. Curr. Opin. in Neurobiol. 8, 468-474.
Manley, G.A., Köppl, C., Johnstone, B.M. (1990a): Components of the 2f1-f2 distortion product in the ear canal of the bobtail lizard. In: Dallos, P., Geisler, C.D., Matthews, J.W., Ruggero, M.A., Steele, C.R. (Eds.): The Mechanics and Biophysics of Hearing. Springer Verlag, New York, 210-218.
Manley, G.A., Köppl, C., Johnstone, B.M. (1990b): Peripheral auditory processing in the bobtail lizard Tiliqua rugosa. I. Frequency tuning of auditory-nerve fibres. J. Comp. Physiol. A 167, 89-99.
Manley, G.A., Köppl, C., Johnstone, B.M. (1993a): Distortion-product otoacoustic emissions in the bobtail lizard. I. General characteristics. J. Acoust. Soc. Am. 93, 2820-2833.
Manley, G.A., Köppl, C., Konishi, M. (1988c): A neural map of interaural intensity differences in the brain stem of the barn owl. J. Neurosci. 8, 2665-2676.
Manley, G.A., Köppl, C., Yates, G.K. (1989b): Micromechanical basis of high-frequency tuning in bobtail lizard. In: Wilson, J.P., Kemp, D.T. (Eds.): Cochlear Mechanisms - Structure, Function and Models. Plenum Press, New York, 143-151.
Manley, G.A., Köppl, C., Yates, G.K. (1997): Activity of primary auditory neurones in the cochlear ganglion of the Emu Dromaius novaehollandiae. I. Spontaneous discharge, frequency tuning and phase locking. J. Acoust. Soc. Am. 101, 1560-1573.
Manley, G.A., Meyer, B., Fischer, F.P., Schwabedissen, G., Gleich, O. (1996b): Surface morphology of basilar papilla of the tufted duck Aythya fuligula, and domestic chicken Gallus gallus domesticus. J. Morphol. 227, 197-212.
Manley, G.A., Müller-Preuss, P. (1978): Response variability of auditory cortex cells in the squirrel monkey to constant acoustic stimuli. Exp. Brain Res. 32, 171-180.
Manley, G.A., Schulze, M., Oeckinghaus, H. (1987b): Otoacoustic emissions in a song bird. Hearing Res. 26, 257-266.
Manley, G.A., Schwabedissen, G., Gleich, O. (1993b): Morphology of the basilar papilla of the budgerigar, Melopsittacus undulatus. J. Morphol. 218, 153-165.
Manley, G.A., Taschenberger, G. (1993): Spontaneous otoacoustic emissions from a bird: a preliminary report. In: Duifhuis, H., Horst, J.W., van Dijk, P., van Netten, S.M. (Eds.): Biophysics of Hair Cell Sensory Systems, World Scientific Publishing Co., Singapore, 33-39.
Manley, G.A., Yates, G.K., Köppl, C. (1988d): Auditory peripheral tuning: evidence for a simple resonance phenomenon in the lizard Tiliqua. Hearing Res. 33, 181-190.
Manley, G.A., Yates, G.K., Köppl, C., Johnstone, B.M. (1990c): Peripheral auditory processing in the bobtail lizard Tiliqua rugosa. IV. Phase locking of auditory-nerve fibres. J. Comp. Physiol. A 167, 129-138.
Marimuthu, G., Habersetzer, J., Leippert, D. (1995): Active acoustic gleaning of water surface by the Indian false vampire bat, Megaderma lyra. Ethology 99, 61-74.
Marimuthu, G., Neuweiler, G. (1987): The use of acoustical cues for prey detection by the Indian False Vampire Bat, Megaderma lyra. J. Comp. Physiol. 160, 509-515.
Metzner, W. (1989): A possible neuronal basis for Dopplershift compensation in echolocating bats. Nature 341, 529-532.
Metzner, W. (1993): An audio-vocal interface in echolocating horseshoe bats. J. Neurosci. 13(5), 1899-1915.
Metzner, W. (1996): Anatomical basis for audio-vocal integration in echolocating horseshoe bats. J. Comp. Neurol. 368, 252-269.
Müller-Preuss, P. (1985): Correlates of forward masking in Squirrel monkeys. In: Syka, J. (1997) (Ed.): Acoustical signal processing in the central auditory system. Plenum Publishing Co., New York, 449-452.
Müller-Preuss, P. (1988): Neural bases of signal detection. In: Todt, D., Goedeking, P., Symmes, D. (Eds.): Primate vocal communication. Springer, New York, 154-161.
Müller-Preuss, P. (1988): Neural correlates of audio-vocal behavior: properties of anterior limbic cortex and related areas. In: Newman, J.D. (Ed.): The physiological control of mammalian vocalization. Plenum Press, New York, 245-262.
Müller-Preuss, P. (1997): Correlates of forward masking in Squirrel monkeys. In: Syka, J. (Ed.): Acoustical signal processing in the central auditory system. Plenum Publishing Co., New York, 449-452.
Müller-Preuss, P., Bieser, A., Preuss, A., Fastl, H. (1988): Neural processing of AM-sounds within central parts of the auditory pathway. In: Syka, J., Masterton, B. (Eds.): Auditory pathway: Structure and function. Plenum Publishing Co., New York, 327-331.
Müller-Preuss, P., Flachskamm, C., Bieser, A. (1994): Neural encoding of amplitude modulation within the auditory midbrain of squirrel monkeys. Hearing Res. 80, 197-208.
Müller-Preuss, P., Maurus, M. (1985): Coding of call components essential for intraspecific communication through auditory neurons in the squirrel monkey. Naturwiss. 72, 437-438.
Müller-Preuss, P., Newman, J.D., Jürgens, U. (1980): Anatomical and physiological evidence for a relationship between the „cingular“ vocalization area and the auditory cortex in the squirrel monkey. Brain Res. 202, 307-315.
Müller-Preuss, P., Ploog, D. (1983): Central control of sound production in mammals. In: Lewis, B. (Ed.): Bioacoustics. Academic Press, London, 125-146.
Namba, S., Kuwano, S., Fastl, H. (1987): Cross-cultural study on the loudness, noisiness, and annoyance of various sounds. In: Proc. inter-noise’87, Vol. II, 1009-1012.
Narins, P.M., Gleich, O. (1986): Phase response of low-frequency cochlear ganglion cells in the starling. In: Moore, B.C.J., Patterson, R.D. (Eds.): Auditory Frequency Selectivity. Plenum Press, New York, 209-215.
Neuweiler, G. (1989): Foraging ecology and audition in echolocating bats. Trends Ecol. Evol. 4, 160-166.
Neuweiler, G. (1990): Auditory adaptations for prey capture in echolocating bats. Physiol. Review 70, 615-641.
Neuweiler, G., Bruns, V., Schuller, G. (1980): Ears adapted for the detection of motion, or how echolocating bats have exploited the capacities of the mammalian auditory system. J. Acoust. Soc. Am. 68, 741-753.
Neuweiler, G., Metzner, W., Heilmann, U., Rübsamen, R., Eckrich, M., Costa, H.H. (1987): Foraging behaviour and echolocation in the rufous horseshoe bat of Sri Lanka. Behav. Ecol. Sociobiol. 20, 53-67.
Neuweiler, G., Schmidt, S. (1993): Audition in echolocating bats. Curr. Opin. in Neurobiol. 3, 563-569.
Nieder, A., Klump, G.M. (1999): Adjustable frequency selectivity of auditory forebrain neurons recorded in a freely moving songbird via radiotelemetry. Hearing Res. 127, 41-54.
Nitsche, V. (1992): Thresholds for auditory temporal order in the bat, Tadarida brasiliensis. In: Elsner, N., Richter, D.W. (Eds.): Rhythmogenesis in Neurons and Networks. - Proc. of the 20th Göttingen Neurobiol. Conf., Georg Thieme, Stuttgart, Abstract No. 261.
Nitsche, V. (1993a): Gap detection in the bat, Tadarida brasiliensis. In: Elsner, N., Heisenberg, M. (Eds.): Proc. of the 21st Göttingen Neurobiol. Conf., Georg Thieme, Stuttgart, Abstract No. 230.
Nitsche, V. (1993b): Detection of temporal gaps in passband noise in the Mexican free-tailed bat, Tadarida brasiliensis. Proc. VIth European Bat Res. Symp., Bat News.
Nitsche, V. (1994): Auditory pattern discrimination in the bat, Tadarida brasiliensis. In: Elsner, N., Breer, H. (Eds.): Proc. of the 22nd Göttingen Neurobiol. Conf., Georg Thieme, Stuttgart, Abstract No. 379.
Obrist, K.M. (1989): Individuelle Variabilität der Echoortung: Vergleichende Freilanduntersuchungen an vier vespertilioniden Fledermausarten Kanadas. Doctoral thesis, Ludwig-Maximilians-Univ., München.
Park, T.J., Grothe, B. (1996): From pattern recognition to sound localization: A by-product of growing larger during evolution. Naturwiss. 83, 30-32.
Park, T.J., Grothe, B., Pollak, G.D., Schuller, G., Koch, U. (1996): Neural delays shape selectivity to interaural intensity differences in the LSO. J. Neurosci. 16(20), 6554-6566.
Park, T.J., Klug, A., Oswald, J.P., Grothe, B. (1998): A novel circuit in the bat’s auditory midbrain recruits neurons into sound localization processing. Naturwiss. 85, 176-179.
Peisl, W., Zwicker, E. (1989): Simulation der Eigenschaften oto-akustischer Emissionen mit Hilfe eines analogen und eines digitalen Innenohrmodells. In: Fortschritte der Akustik - DAGA 89. Verlag DPG-GmbH, Bad Honnef, 419-422.
Pickles, J.O., Brix, J., Comis, S.D., Gleich, O., Köppl, C., Manley, G.A., Osborne, M.P. (1989a): The organization of tip links and stereocilia on hair cells of bird and lizard basilar papillae. Hearing Res. 41, 31-42.
Pickles, J.O., Brix, J., Gleich, O. (1989b): The search for the morphological basis of mechanotransduction in cochlear hair cells. In: Aitkin, L.M., Rowe, M.J. (Eds.): Information Processing in the Mammalian Auditory and Tactile Systems. Alan R. Liss, New York, 29-43.
Pickles, J.O., Brix, J., Manley, G.A. (1990): Influence of collagenase on tip links in hair cells of the chick basilar papilla. Hearing Res. 50, 139-144.
Pickles, J.O., Osborne, M.P., Comis, S.D., Köppl, C., Gleich, O., Brix, J., Manley, G.A. (1989c): Tip-link organization in relation to the structure and orientation of stereovillar bundles. In: Wilson, J.P., Kemp, D.T. (Eds.): Cochlear mechanisms - Structure, Functions and Models. Plenum Press, New York, 37-44.
Pickles, J.O., Von, P.M., Rouse, G.W., Brix, J. (1991): The development of links between stereocilia in hair cells of the chick basilar papilla. Hearing Res. 54, 153-163.
Pickles, J.O., Brix, J., Gleich, O., Köppl, C., Manley, G.A., Osborne, M.P., Comis, S.D. (1988): The fine structure and organization of tip links on hair cell stereovilli. In: Duifhuis, H., Horst, J.W., Witt, H.P. (Eds.): Basic Issues in Hearing (Proceedings of the 8th international symposium on hearing). Academic Press, London, San Diego, 56-63.
Pillat, J. (1993): Pharmacological stimulation of vocalization in a midbrain area of the bat Rhinolophus rouxi. In: Elsner, N., Heisenberg, M. (Eds.): Gene-Brain-Behaviour. Proc. 21st Göttingen Neurobiol. Conf., 260.
Pillat, J. (1994): Lesion of the paralemniscal zone does not suppress Doppler shift compensation in the bat Rhinolophus rouxi. In: Elsner, N., Breer, H. (Eds.): Sensory transduction. Proc. 22nd Göttingen Neurobiol. Conf., 381.
Pillat, J., Schuller, G. (1997): Audio-vocal behavior of Dopplershift compensation in the horseshoe bat survives bilateral lesion of the paralemniscal tegmental area. Exp. Brain Res. 119, 17-26.
Prechtl, H. (1995): Senso-motorische Wechselwirkung im auditorischen Mittelhirn der Hufeisennasen-Fledermaus Rhinolophus rouxi. Doctoral thesis, Ludwig-Maximilians-Universität, München 1995.
Preisler, A., Schmidt, S. (1995a): Relative versus absolute pitch perception in the bat, Megaderma lyra. In: Elsner, N., Menzel, R. (Eds.): Göttingen Neurobiology Report 1995, Proc. of the 23rd Göttingen Neurobiol. Conf. II. Thieme, Stuttgart, Abstract No. 309.
Preisler, A., Schmidt, S. (1995b): Virtual pitch formation in the ultrasonic range. Naturwiss. 82, 45-47.
Preisler, A., Schmidt, S. (1998): Spontaneous classification of complex tones at high and ultrasonic frequencies in the bat, Megaderma lyra. J. Acoust. Soc. Am. 103, 2595-2607.
Preuss, A., Müller-Preuss, P. (1990): Processing of amplitude modulated sounds in the medial geniculate body of squirrel monkeys. Exp. Brain Res. 79, 207-211.
Pujol, R., Forge, A., Pujol, T., Vater, M. (1996): Specialization of the outer hair cell lateral wall and at the junction with the Deiters cells in bats. 19th ARO, p. 60.
Radtke-Schuller, S. (1997): Struktur und Verschaltung des Hörcortex der Hufeisennasenfledermaus Rhinolophus rouxi. Dissertation, Ludwig-Maximilians-Universität, München 1997.
Radtke-Schuller, S., Schuller, G. (1995): Auditory cortex of the rufous horseshoe bat: I. Physiological response properties to acoustic stimuli and vocalizations and topographical distribution of neurons. Eur. J. Neurosci. 7, 570-591.
Reimer, K. (1989): Bedeutung des Colliculus superior bei der Echoortung der Fledermaus Rhinolophus rouxi. Doctoral thesis, Ludwig-Maximilians-Universität, München 1989.
Reimer, K. (1989): Retinofugal projections in the rufous horseshoe bat, Rhinolophus rouxi. Anat. Embryol. 180, 89-98.
Reimer, K. (1991): Auditory properties of the superior colliculus in the horseshoe bat, Rhinolophus rouxi. J. Comp. Physiol. A 169, 719-728.
Reuter, G., Kössl, M., Hemmert, W., Preyer, S., Zimmermann, U., Zenner, H.-P. (1994): Electromotility of outer hair cells from the cochlea of the echolocating bat, Carollia perspicillata. J. Comp. Physiol. A 175, 449-455.
Roverud, R.C. (1990): A gating mechanism for sound pattern recognition is correlated with the temporal structure of echolocation sounds in the rufous horseshoe bat. J. Comp. Physiol. A 166, 243-249.
Roverud, R.C. (1990): Harmonic and frequency structure used for echolocation sound pattern recognition and distance information processing in the rufous horseshoe bat. J. Comp. Physiol. A 166, 251-255.
Roverud, R.C., Grinnell, A.D. (1985): Echolocation sound features processed to provide distance information in the CF/FM bat, Noctilio albiventris: evidence for a gated time window utilizing both CF and FM components. J. Comp. Physiol. 156, 457-469.
Roverud, R.C., Nitsche, V., Neuweiler, G. (1991): Discrimination of wingbeat motion by bats, correlated with echolocation sound pattern. J. Comp. Physiol. A 168, 259-264.
Rübsamen, R. (1987): Ontogenesis of the echolocation system in the rufous horseshoe bat, Rhinolophus rouxi. J. Comp. Physiol. A 161, 899-913.
Rübsamen, R., Betz, M. (1986): Control of the echolocation pulses by neurons of the nucleus ambiguus in the rufous horseshoe bat, Rhinolophus rouxi. I. Single unit recordings in the motor nucleus of the larynx in actively vocalizing bats. J. Comp. Physiol. A 159, 675-687.
Rübsamen, R., Neuweiler, G., Sripathi, K. (1988): Comparative collicular tonotopy in two bat species adapted to movement detection, Hipposideros speoris and Megaderma lyra. J. Comp. Physiol. 163, 271-285.
Rübsamen, R., Schäfer, M. (1990): Audiovocal interactions during development? Vocalisation in deafened young horseshoe bats vs. audition in vocalisation-impaired bats. J. Comp. Physiol. A 167, 771-784.
Rübsamen, R., Schäfer, M. (1990): Ontogenesis of auditory fovea representation in the inferior colliculus of the Sri Lanka rufous horseshoe bat, Rhinolophus rouxi. J. Comp. Physiol. A 167, 757-769.
Rübsamen, R., Schweizer, H. (1986): Control of the echolocation pulses by neurons of the nucleus ambiguus in the rufous horseshoe bat, Rhinolophus rouxi. II. Afferent and efferent connections of the motor nucleus of the laryngeal nerves. J. Comp. Physiol. A 159, 689-699.
Runhaar, G. (1989): The surface morphology of the avian tectorial membrane. Hearing Res. 37, 179-188.
Runhaar, G., Manley, G.A. (1987): Potassium concentration in the inner sulcus is perilymph-like. Hearing Res. 29, 93-103.
Runhaar, G., Schedler, J., Manley, G.A. (1991): The potassium concentration in the cochlear fluids of the embryonic and post-hatching chick. Hearing Res. 56, 227-238.
Russell, I.J., Kössl, M., Richardson, G.P. (1992): Nonlinear mechanical responses of mouse cochlear hair bundles. Proc. R. Soc. Lond. B 250, 217-227.
Russell, I.J., Richardson, G.P., Kössl, M. (1989): The response of cochlear hair cells to tonic displacement of the sensory bundle. Hearing Res. 43, 55-70.
Scharmann, M.G. (1996): Stoffwechselphysiologische und elektrophysiologische Untersuchungen zur Modulation neuronaler Verarbeitung beim Staren (Sturnus vulgaris L.). Doctoral Thesis, Inst. f. Zoologie, TU München.
Scharmann, M.G., Klump, G.M., Ehret, G. (1995): Discrimination training in a GO/NOGO-procedure alters the 2-deoxyglucose pattern in the starling’s forebrain. Brain Res. 682, 83-92.
Scherer, A. (1988): Erklärung der spektralen Verdeckung mit Hilfe von Mithörschwellen- und Suppressionsmustern. Acustica 67, 1-18.
Schlang, M. (1989): An auditory based approach for echo compensation with modulation filtering. In: Proc. Eurospeech, Paris, 661-664.
Schlang, M., Mummert, M. (1990): Die Bedeutung der Fensterfunktion für die Fourier-t-Transformation als gehörgerechte Spektralanalyse. In: Fortschritte der Akustik - DAGA 90. Verlag DPG-GmbH, Bad Honnef, 1043-1046.
Schloth, E. (1983): Relation between spectral composition of spontaneous otoacoustic emissions and fine-structure of threshold in quiet. Acustica 53, 250-256.
Schloth, E., Zwicker, E. (1983): Mechanical and acoustical influences on spontaneous otoacoustic emissions. Hearing Res. 11, 285-293.
Schmid, W. (1994): Zur Tonhöhe inharmonischer komplexer Töne. In: Fortschritte der Akustik - DAGA 94. Verlag DPG-GmbH, Bad Honnef, 1025-1028.
Schmid, W. (1997): Zur Ausgeprägtheit der Tonhöhe gedrosselter und amplitudenmodulierter Sinustöne. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 355-356.
Schmid, W., Auer, W. (1996): Zur Tonhöhenempfindung bei Tiefpaßrauschen. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 344-345.
Schmidt, S. (1988a): Evidence for a spectral basis of texture perception in bat sonar. Nature 331, 617-619.
Schmidt, S. (1988b): Discrimination of target surface structure in the echolocating bat, Megaderma lyra. In: Nachtigall, P.E., Moore, P.W.B. (Eds.): Animal Sonar. NATO ASI Series A: Life Sciences 156, Plenum Press, New York, London, 507-512.
Schmidt, S. (1992): Perception of structured phantom targets in the echolocating bat, Megaderma lyra. J. Acoust. Soc. Am. 91, 2203-2223.
Schmidt, S. (1993): Perspectives and problems of comparative psychoacoustics in echolocating bats. Association for Research in Otolaryngology, Midwinter Meeting, Abstract No. 145.
Schmidt, S. (1995): Psychoacoustic studies in bats. In: Klump, G.M., Dooling, R.J., Fay, R.R., Stebbins, W.C. (Eds.): Methods in Comparative Psychoacoustics 6. Birkhäuser Verlag, Zürich, 123-134.
Schmidt, S., Hanke, S., Pillat, J. (1997): Hunting terrestrial prey by sonar - new evidence for an underestimated strategy. In: Zissler, D. (Ed.): Verh. Dtsch. Zool. Ges. 89.1. Gustav Fischer, Stuttgart, Jena, New York, p. 322.
Schmidt, S., Preisler, A., Sedlmeier, H. (1995): Aspects of pitch perception in the ultrasonic range. In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 374-382.
Schmidt, S., Thaller, J. (1994): Temporal auditory summation in the echolocating bat, Tadarida brasiliensis. Hearing Res. 77, 125-134.
Schmidt, S., Thaller, J., Pötke, M. (1990): Behavioural audiogram and masked thresholds in the free-tailed bat, Tadarida brasiliensis. In: Elsner, N., Roth, G. (Eds.): Brain - Perception - Cognition. Proc. of the 18th Göttingen Neurobiol. Conf., Georg Thieme, Stuttgart, Abstract No. 146.
Schmidt, S., Zwicker, E. (1991): The effect of masker spectral asymmetry on overshoot in simultaneous masking. J. Acoust. Soc. Am. 89, 1324-1330.
Schmidt, U., Schlegel, P., Schweizer, H., Neuweiler, G. (1991): Audition in vampire bats, Desmodus rotundus. J. Comp. Physiol. A 168, 45-51.
Schorer, E. (1986): An active free-field equalizer for TDH-39 earphones [43.66.Yw, 43.88.Si]. J. Acoust. Soc. Am. 80, 1261-1262.
Schorer, E. (1986): Critical modulation frequency based on detection of AM versus FM tones. J. Acoust. Soc. Am. 79, 1054-1057.
Schorer, E. (1989a): Vergleich eben erkennbarer Unterschiede und Variationen der Frequenz und Amplitude von Schallen. Acustica 68, 183-199.
Schorer, E. (1989b): Ein Funktionsschema eben wahrnehmbarer Frequenz- und Amplitudenänderungen. Acustica 68, 268-287.
Schorn, K. (1980): Klinische Erfahrungen mit dem neuen Hörgeräteanpaßsystem bei der Versorgung Hörbehinderter. Eur. Arch. Otolaryngol. 227, 546-553.
Schorn, K. (1981): Difference limen for intensity in patients with sudden deafness and other inner ear disorders. Advanc. Oto-Rhino-Laryngol 27, 100-109.
Schorn, K. (1982): Kann man eine Schwerhörigkeit objektiv nachweisen? Münch. Med. Wschr. 124, 989-991.
Schorn, K. (1986): Adäquate Hörhilfen für Schwerhörige. Dt. Ärzteblatt 83(9), 544-550.
Schorn, K. (1989): Die Erstellung des HNO-ärztlichen Sachverständigen-Gutachtens bei Fahrerfluchtprozessen. Laryngo-Rhino-Otol. 68, 67-71.
Schorn, K. (1990a): Differentialdiagnose von Hörstörungen. In: Naumann, H.H. (Ed.): Differentialdiagnostik in der Hals-Nasen-Ohren-Heilkunde, Thieme, Stuttgart, 56-98.
Schorn, K. (1990b): Hör- und Sprachstörungen bei Morbus Down. In: Murken, J., Dietrich-Reichart, E. (Eds.): Down-Syndrom. R.S. Schulz, Starnberg-Percha, 159-171.
Schorn, K. (1993a): The Munich Screening Programme in Neonates. Br. J. Audiol. 27, 143-148.
Schorn, K. (1993b): Die Erfassung der kindlichen Schwerhörigkeit. Dt. Ärzteblatt 90(14), B 1999-2007.
Schorn, K. (1993c): Differential diagnosis of hearing disorders. In: Naumann, H.H. (Ed.): Differential diagnosis in Otorhinolaryngology. Thieme, New York, 55-96.
Schorn, K. (1997): Stand der audiologischen Diagnostik. In: Theissing, J. (Ed.): Klinik und Therapie in der Hals-Nasen-Ohren-Heilkunde. Kopf- und Hals-Chirurgie im Wandel. Springer, Berlin, pp. 131-172.
Schorn, K. (1993d): Die Bedeutung der Hörgeräteüberprüfung durch den Facharzt für HNO. Laryngo-Rhino-Otol. 62, 552-554.
Schorn, K., Brügel, F.J. (1994): Neue Gesichtspunkte bei der Hörgeräteanpassung. Laryngo-Rhino-Otol. 73, 7-13.
Schorn, K., Fastl, H. (1984): The measurement of level difference thresholds and its importance for the early detection of the acoustic neurinoma. Part 1: Audiol. Akust. 23, 22-27. Part 2: Audiol. Akust. 24, 60-62.
Schorn, K., Seifert, J., Stecker, M., Zollner, M. (1986): Voruntersuchungen gehörloser Patienten zur Cochlear-Implantation. Laryngo-Rhino-Otol. 65, 114-117.
Schorn, K., Stecker, M. (1984): Objektive Überprüfungsverfahren bei der Hörgeräteanpassung. In: Biesalski, P. (Ed.): Pädaudiologie Aktuell. Dr. Hanns Krach, Mainz, 115-119.
Schorn, K., Stecker, M. (1988): ERA in der Pädaudiologie. Laryngo-Rhino-Otol. 67, 78-83.
Schorn, K., Stecker, M. (1994a): Hörprüfungen. In: Naumann, H.H., Helms, J., Herberholdt, C., Kastenbauer, E. (Eds.): Oto-Rhino-Laryngologie in Klinik und Praxis, Band 1: Ohr. Thieme, Stuttgart, 309-368.
Schorn, K., Stecker, M. (1994b): Hörgeräte. In: Naumann, H.H., Helms, J., Herberholdt, C., Kastenbauer, E. (Eds.): Oto-Rhino-Laryngologie in Klinik und Praxis, Band 1: Ohr. Thieme, Stuttgart, 812-840.
Schorn, K., Zwicker, E. (1985): Klinische Untersuchungen zum Zeitauflösungsvermögen des Gehörs bei verschiedenen Hörschädigungen. Audiol. Akust. 26, 170-184.
Schorn, K., Zwicker, E. (1989): Zusammenhänge zwischen gestörtem Frequenz- und gestörtem Zeitauflösungsvermögen bei Innenohrschwerhörigkeiten. Eur. Arch. Otolaryngol.-Suppl. II, 116-118.
Schorn, K., Zwicker, E. (1990): Frequency selectivity and temporal resolution in patients with various inner ear disorders. Audiology 29, 8-20.
10 References Schuller, G., Covey, E., Casseday, J.H. (1991): Auditory Pontine Grey: Connections and response properties in the horseshoe bat. Eur. J. Neurosci. 3, 648-662. Schuller, G., Fischer, S., Schweizer, H. (1997): Significance of the paralemniscal tegmental area for audio-motor control in the mustached bat, Pteronotus p. parnellii: The afferent and efferent connections of the paralemniscal area. Eur. J. Neurosci. 9, 342-355. Schuller, G., O’Neill, W.E., Radtke-Schuller, S. (1991): Facilitation and delay sensitivity of auditory cortex neurons in CF-FM bats, Rhinolophus rouxi and Pteronotus p. parnellii. Eur. J. Neurosci. 3, 1165-1181. Schuller, G., Radtke-Schuller, G. (1998): The pretectal area as potential audio-motor interface: Electrical stimulation and tracer studies in the rufous horseshoe bat, Rhinolophus rouxi. Eur. J. Neurosci. (in Press). Schuller, G., Radtke-Schuller, S. (1988): Midbrain areas as candidates for audio-vocal interface in echolocating bats. In: Nachtigall, P.E. (ed): Animal sonar systems. Helsingœr symposium. Plenum Press, New York. Schuller, G., Radtke-Schuller, S. (1990): Neural control of vocalization in bats: Mapping of brainstem areas with electrical microstimulation eliciting species-specific echolocation calls in the rufous horseshoe bat. Exp. Brain Res. 79, 192-206. Schuller, G., Radtke-Schuller, S., Betz, M. (1986): A stereotaxic method for small animals using experimentally determined reference profiles. J. Neurosci. Meth. 18, 339-350. Schumm, A., Krull, D., Neuweiler, G. (1991): Echolocation in the notch-eared bat, Myotis emarginatus. Behav. Ecol. Sociobiol. 28, 255-261. Sedlmeier, H. (1992): Tonhöhenwahrnehmung beim falschen Vampir Megaderma lyra. Doctoral Thesis, Fakultät für Biologie, Ludwig-Maximilians-Univ. München. Sedlmeier, H., Schmidt, S. (1989): Masked auditory thresholds from the Indian false vampire bat (Megaderma lyra). In: Elsner, N., Singer, W. (Eds.): Dynamics and Plasticity in Neuronal Systems. Proc. 
of the 17th Göttingen Neurobiol. Conf. Georg Thieme, Stuttgart, Abstract No. 292. Seiter, A., Stemplinger, I., Beckenbauer, T. (1996): Untersuchungen zur Tonhaltigkeit von Geräuschen. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 238-239. Siefer, W., Kriner, E. (1991): Soaring bats. Naturwiss. 78, 185. Sonntag, B. (1983): Tonhöhenverschiebung von Sinustönen durch Terzrauschen bei unterschiedlichen Frequenzlagen. Acustica 53, 218. Stecker, M. (1990): Frühe akustisch evozierte Potentiale bei Knochenschallreizung. In: Heinemann, M. (Ed.): Subjektive Audiometrie bei Kindern und akustisch evozierte Potentiale. Renate Groß, Bingen, 63-77. Stecker, M. (1991): Fehldiagnosen durch Verzicht auf Knochenleitungsmessungen bei der BERA. Eur. Arch. Otolaryngol.-Suppl. II, 135-136. Stemplinger, I. (1996): Globale Lautheit von gleichförmigen Industriegeräuschen. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 240-241. Stemplinger, I. (1997): Beurteilung der Globalen Lautheit bei Kombination von Verkehrsgeräuschen mit simulierten Industriegeräuschen. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 353-354. Stemplinger, I., Fastl, H. (1997): Accuracy of loudness percentile versus measurement time. In: Proc. inter-noise’97, Vol. III, 1347-1350. Stemplinger, I., Fastl, H., Schorn, K., Bruegel, F. (1994): Zur Verständlichkeit von Einsilbern in unterschiedlichen Störgeräuschen. In: Fortschritte der Akustik - DAGA 94. Verlag DPG-GmbH, Bad Honnef, 1469-1472. Stemplinger, I., Gottschling, G. (1997): Auswirkungen der Bündelung von Verkehrswegen auf die Beurteilung der Globalen Lautheit. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 401-402. Stemplinger, I., Schiele, M., Meglic, B., Fastl, H. (1997): Einsilberverständlichkeit in unterschiedlichen Störgeräuschen für Deutsch, Ungarisch und Slowenisch. In: Fortschritte der Akustik - DAGA 97. Dt. 
Gesellschaft für Akustik e.V., Oldenburg, 77-78.
Stemplinger, I., Seiter, A. (1995): Beurteilung von Lärm am Arbeitsplatz. In: Fortschritte der Akustik - DAGA 95. Dt. Gesellschaft für Akustik e.V., Oldenburg, 867-870. Suckfüll, M., Schneeweiß, S., Dreher, A., Schorn, K. (1996): Evaluation of TEOAE and DPOAE measurements for the assessment of auditory thresholds in sensorineural hearing loss. Acta Otolaryngol. (Stockh.) 116, 528-533. Suckfüll, M., Thiery, J., Wimmer, C., Jäger, B., Schorn, K., Seidel, D., Kastenbauer, E. (1997): Heparin-induced extracorporal LDL-precipitation (H.E.L.P.) improves the recovery of hearing in patients suffering from sudden idiopathic hearing loss. In: McCafferty, G., Coman, W., Carrol, R. (Eds.): XVI. World Congress of Otorhinolaryngology Head and Neck Surgery. Monduzzi Int. Proceedings Division, 1121-1125. Taschenberger, G., Gallo, L., Manley, G.A. (1995): Filtering of distortion-product otoacoustic emissions in the inner ear of birds and lizards. Hearing Res. 91, 87-92. Taschenberger, G., Manley, G.A. (1997): Spontaneous otoacoustic emissions in the barn owl. Hearing Res. 110, 61-76. Terhardt, E. (1979): Calculating virtual pitch. Hearing Res. 1, 155-182. Terhardt, E. (1985): Fourier transformation of time signals: Conceptual revision. Acustica 57, 242-256. Terhardt, E. (1987a): Gestalt principles and music perception. In: Yost, W.A., Watson, C.S. (Eds.): Auditory Processing of Complex Sounds. Lawrence Erlbaum Associates, Hillsdale, NJ, 157-166. Terhardt, E. (1987b): Psychophysics of auditory signal processing and the role of pitch in speech. In: Schouten, M.E.H. (Ed.): The Psychophysics of Speech Perception. Nijhoff, Dordrecht, 271-283. Terhardt, E. (1989): Warum hören wir Sinustöne? Naturwiss. 76, 496-504. Terhardt, E. (1992): From speech to language: On auditory information processing. In: Schouten, M.E.H. (Ed.): The Auditory Processing of Speech: From Sounds to Words. Mouton de Gruyter, Berlin, 363-380. Terhardt, E. 
(1997): Lineares Modell der peripheren Schallübertragung im Gehör. In: Fortschritte der Akustik - DAGA 97. Dt. Gesellschaft für Akustik e.V., Oldenburg, 367-368. Terhardt, E. (1998): Akustische Kommunikation. Springer, Berlin, Heidelberg. Tschopp, K., Fastl, H. (1991): On the loudness of German speech material used in audiology. Acustica 73, 33-34. Unkrig, A., Baumann, U. (1993): Spektralanalyse und Frequenzkonturierung durch Filter mit asymmetrischen Flanken. In: Fortschritte der Akustik - DAGA 93. Verlag DPG-GmbH, Bad Honnef, 876-879. v. Stebut, B., Schmidt, S. (1997): Psychophysical frequency discrimination ability in the FM-bat, Eptesicus fuscus. In: Elsner, E., Wässle, H. (Eds.): Göttingen Neurobiology Report 1995. Proc. of the 25th Göttingen Neurobiol. Conf. II. Thieme, Stuttgart, Abstract No. 368. Valenzuela, M.N. (1995): Subjektive Beurteilung der Qualität und Ähnlichkeit von Flügelklängen. In: Fortschritte der Akustik - DAGA 95. Dt. Gesellschaft für Akustik e.V., Oldenburg, 587-590. Valenzuela, M.N. (1996): Einfluß der spektralen Energieverteilung auf die subjektive Beurteilung von Flügelklängen. In: Fortschritte der Akustik - DAGA 96. Dt. Gesellschaft für Akustik e.V., Oldenburg, 316-317. van Dijk, P., Manley, G.A., Gallo, L., Pavusa, A., Taschenberger, G. (1996): Statistical properties of spontaneous otoacoustic emissions in one bird and three lizard species. J. Acoust. Soc. Am. 100, 2220-2227. Vater, M. (1987): Narrow-band frequency analysis in bats. In: Fenton, M.B., Racey, P., Rayner, J. (Eds.): Recent advances in the study of bats. Cambridge University Press. 200-226. Vater, M. (1988a): Cochlear anatomy and physiology in bats. In: Nachtigall, P.E., Moore, P.W.B. (Eds.): Animal Sonar. NATO ASI Series A: Life Sciences 156, Plenum Press, New York, London, 225-243. Vater, M. (1988b): Lightmicroscopic observations on cochlear development in horseshoe bats. In: Nachtigall, P.E., Moore, P.W.B. (Eds.): Animal Sonar. 
NATO ASI Series A: Life Sciences 156, Plenum Press, New York, London, 341-347.
Vater, M. (1995): Ultrastructural and immunocytochemical observations on the superior olivary complex of the mustached bat. J. Comp. Neurol. 358, 155-180. Vater, M. (1996): Conservative traits and innovative patterns in design of cochlea and brainstem auditory nuclei in echolocating bats. Neurobiologentagung, Göttingen. Vater, M. (1997): Evolutionary plasticity of cochlear design in echolocating bats. In: Lewis, E.R., Long, G., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.): Diversity in auditory mechanics. World Scientific Publishing Co., Singapore, 49-54. Vater, M. (1998): How the auditory periphery of bats is adapted for echolocation. In: Proc. 10th Int. Bat Res. Conf. Eds. Kunz, T. et al. (in press). Vater, M., Braun, K. (1994): Parvalbumin, Calbindin-D28k- and Calretinin-immunoreactivity in the ascending auditory pathway of horseshoe bats. J. Comp. Neurol. 341, 534-558. Vater, M., Covey, E., Casseday, J.H. (1995): Ascending pathways from the lateral and medial superior olives of the mustached bat converge at midbrain target. J. Comp. Neurol. 351, 632-646. Vater, M., Covey, E., Casseday, J.H. (1997): The columnar region of the ventral nucleus of the lateral lemniscus in the big brown bat: Synaptic organization and structural correlates for feedforward inhibitory function. Cell Tiss. Res. 289, 223-233. Vater, M., Duifhuis, H. (1986): Ultra-high frequency selectivity in the horseshoe bat: Does the bat use an acoustic interference filter? In: Moore, B.C.J., Patterson, R.D. (Eds.): Auditory frequency selectivity. Plenum Press, New York, pp. 23-30. Vater, M., Feng, A.S. (1990): Functional organization of ascending and descending connections of the cochlear nucleus of horseshoe bats. J. Comp. Neurol. 292, 373-395. Vater, M., Feng, A.S., Betz, M. (1985): An HRP-study of the frequency place map of the horseshoe bat cochlea: Morphological correlates of the sharp tuning to a narrow frequency band. J. Comp. Physiol. 157, 671-686. 
Vater, M., Habbicht, H., Kössl, M., Grothe, B. (1992): The functional role of GABA and glycine in monaural and binaural processing in the inferior colliculus of horseshoe bats. J. Comp. Physiol. A 171, 541-553. Vater, M., Kössl, M. (1996): Further studies on the mechanics of the cochlear partition in the mustached bat. I. Ultrastructural observations on the tectorial membrane and hair cells. Hearing Res. 94, 63-77. Vater, M., Kössl, M., Horn, A. (1996): Immuno-cytochemical aspects of the functional organization of the horseshoe bat’s cochlear nucleus. In: Ainsworth, W.A., Evans, E.F., Hackney, C.M. (Eds.): Advances in speech, hearing and language processing. JAI Press Inc., Greenwich, 129-144. Vater, M., Kössl, M., Horn, A.K.E. (1992): GAD- and GABA-immunoreactivity in the ascending auditory pathway of horseshoe and mustached bats. J. Comp. Neurol. 325, 183-206. Vater, M., Lenoir, M. (1992): Ultrastructure of the horseshoe bat’s organ of Corti. I. Scanning electron microscopy. J. Comp. Neurol. 318, 367-379. Vater, M., Lenoir, M., Pujol, R. (1992): Ultrastructure of the horseshoe bat’s organ of Corti. II. Transmission electron microscopy. J. Comp. Neurol. 318, 380-391. Vater, M., Lenoir, M., Pujol, R. (1997): The development of the organ of Corti in horseshoe bats: Scanning and transmission electron microscopy. J. Comp. Neurol. 377, 520-534. Vater, M., Rübsamen, B. (1992): Ontogeny of frequency maps in the peripheral auditory system of horseshoe bats. 3rd Intern. Congress of Neuroethology. Vater, M., Rübsamen, R. (1989): Postnatal development of the cochlea in horseshoe bats. In: Wilson, J.P., Kemp, D.T. (Eds.): Cochlear mechanisms. Structure, function and models. Plenum Press, New York, London, 217-225. Vater, M., Siefer, W. (1995): The cochlea of Tadarida brasiliensis: specialized functional organization in a generalized bat. Hearing Res. 91, 178-195. Wagner, H. (1992): Distribution of acoustic motion-direction sensitive neurons in the barn owl’s brainstem. 
In: Elsner, N., Richter, D.W. (Eds.): Rhythmogenesis in neurons and networks. Thieme Verlag, Stuttgart, New York, 237. Wagner, H., Takahashi, T. (1990): Neurons in the midbrain of the barn owl are sensitive to the direction of apparent acoustic motion. Naturwiss. 77, 439-442.
10 References Wagner, H., Takahashi, T. (1992): The influence of temporal cues on acoustic motion-direction sensitivity of auditory neurons in the owl. J. Neurophysiol. 68, 2063-2076. Wagner, H., Trinath, T., Kautz, D. (1994): Influence of stimulus level on acoustic motion-direction sensitivity in barn owl midbrain neurons. J. Neurophysiol. 71, 1907-1916. Wartini, S. (1996): Zur Rolle der Spektraltonhöhen und ihrer Akzentuierung bei der Wahrnehmung von Sprache. VDI-Verlag, Düsseldorf. Westra (1992): Zahlen und Wörtertest nach DIN 45 621 mit Störgeräusch nach Prof. Dr.-Ing. H. Fastl. Audiometrie Disc Nr. 11, Westra GmbH, Wertingen. Widmann, U. (1990): Beschreibung der Geräuschemission von Kraftfahrzeugen anhand der Lautheit. In: Fortschritte der Akustik - DAGA 90. Verlag DPG-GmbH, Bad Honnef, 401-404. Widmann, U. (1991): Minderung der Sprachverständlichkeit als Maß für die Belästigung. In: Fortschritte der Akustik - DAGA 91. Verlag DPG-GmbH, Bad Honnef, 973-976. Widmann, U. (1992): Meßtechnische Beurteilung und Umfrageergebnisse bei Straßenverkehrslärm. In: Fortschritte der Akustik - DAGA 92. Verlag DPG-GmbH, Bad Honnef, 369-372. Widmann, U. (1993): Untersuchungen zur Schärfe und zur Lästigkeit von Rauschen unterschiedlicher Spektralverteilung. In: Fortschritte der Akustik - DAGA 93. Verlag DPG-GmbH, Bad Honnef, 644-647. Widmann, U. (1994): Zur Lästigkeit von amplitudenmodulierten Breitbandrauschen. In: Fortschritte der Akustik - DAGA 94. Verlag DPG-GmbH, Bad Honnef, 1121-1124. Widmann, U. (1995): Subjektive Beurteilung der Lautheit und der Psychoakustischen Lästigkeit von PKW-Geräuschen. In: Fortschritte der Akustik - DAGA 95. Dt. Gesellschaft für Akustik e.V., Oldenburg, 875-878. Widmann, U., Goossens, S. (1993): Zur Lästigkeit tieffrequenter Schalle: Einflüsse von Lautheit und Zeitstruktur. Acustica 77, 290-292. Wiegrebe, L., Kössl, M., Schmidt, S. (1996): Auditory enhancement at the absolute threshold of hearing and its relations to the Zwicker tone. 
Hearing Res. 100, 171-180. Wiegrebe, L., Schmidt, S. (1996): Temporal integration in the echolocating bat, Megaderma lyra. Hearing Res. 102, 35-42. Wiesmann, N., Fastl, H. (1991): Ausgeprägtheit der Tonhöhe und Frequenzunterschiedsschwellen von Bandpass-Rauschen. In: Fortschritte der Akustik - DAGA 91. Verlag DPG-GmbH, Bad Honnef, 505-508. Wiesmann, N., Fastl, H. (1992): Ausgeprägtheit der virtuellen Tonhöhe und Frequenzunterschiedsschwellen von harmonischen komplexen Tönen. In: Fortschritte der Akustik - DAGA 92. Verlag DPG-GmbH, Bad Honnef, 841-844. Wurzer, H., Chucholowski, M., Schorn, K. (1983): Untersuchungen zum Stapediusreflex. Laryngo-Rhino-Otol. 62, 293-297. Zwicker, E., Harris, F.P. (1990): Psychoacoustical and ear canal cancellation of (2f1-f2)-distortion products. J. Acoust. Soc. Am. 87, 2583-2591. Zwicker, E. (1984b): Dependence of post-masking on masker duration and its relation to temporal effects in loudness. J. Acoust. Soc. Am. 75, 219-223. Zwicker, E. (1985c): What is a meaningful value for quantifying noise reduction? In: Proc. inter-noise’85, Vol. I, 47-56. Zwicker, E. (1987c): Procedure for calculating partially masked loudness based on ISO 532 B. In: Proc. inter-noise’87, Vol. II, 1021-1024. Zwicker, E. (1987d): Meaningful noise measurement and effective noise reduction. Noise Contr. Engng. J. 29, 66-76. Zwicker, E. (1988c): Loudness patterns (ISO 532 B) an excellent guide to noise-reduced design and to expected public reaction. In: J. S. Bolton (Ed.): Proc. of NOISE-CON 88, Noise Contr. Found., New York, 15-26. Zwicker, E. (1989b): On the dependence of unbiased annoyance on loudness. In: Proc. inter-noise’89, Vol. II, 809-814. Zwicker, E. (1991): A proposal for defining and calculating the unbiased annoyance. In: Schick, A. et al. (Eds.): Contributions to psychological acoustics. Bibliotheks- und Informationssystem der Carl-v.-Ossietzky-Universität, Oldenburg, 187-202.
Zwicker, E. (1992): Psychoakustik. Japanese edition, Japan: Nishimura Co., Ltd. 1992. (Translation by Y. Yamada). Zwicker, E. (1983a): Delayed evoked oto-acoustic emissions and their suppression by Gaussian-shaped pressure impulses. Hearing Res. 11, 359-371. Zwicker, E. (1983b): On peripheral processing in human hearing. In: Klinke, R., Hartmann, R. (Eds.): Hearing - Physiological Bases and Psychophysics. Springer Verlag, Berlin, 104-110. Zwicker, E. (1983c): Level and phase of the (2f1-f2)-cancellation tone expressed in vector diagrams. J. Acoust. Soc. Am. 74, 63-66. Zwicker, E. (1984a): Warum gibt es binaurale Mithörschwellendifferenzen? In: Fortschritte der Akustik - DAGA 84. Verlag DPG-GmbH, Bad Honnef, 691-694. Zwicker, E. (1985a): Das Innenohr als aktives schallverarbeitendes und schallaussendendes System. In: Fortschritte der Akustik - DAGA 85. Verlag DPG-GmbH, Bad Honnef, 29-44. Zwicker, E. (1985b): Temporal resolution in background noise. Br. J. of Audiol. 19, 9-12. Zwicker, E. (1986a): Psychophysics of hearing. In: Saenz, A.L., et al. (Eds.): Noise Pollution, Chapter 4, SCOPE, Wiley & Sons Ltd., New York, 146-167. Zwicker, E. (1986b): Spontaneous oto-acoustic emissions, threshold in quiet, and just noticeable amplitude modulation at low levels. In: Moore, B.C.J., et al. (Eds.): Auditory Frequency Selectivity. Plenum Press, New York, 49-56. Zwicker, E. (1986c): A hardware cochlear nonlinear preprocessing model with active feedback. J. Acoust. Soc. Am. 80, 146-153. Zwicker, E. (1986d): „Otoacoustic“ emissions in a nonlinear cochlear hardware model with feedback. J. Acoust. Soc. Am. 80, 154-162. Zwicker, E. (1986e): Suppression and (2f1-f2)-difference tones in a nonlinear cochlear preprocessing model with active feedback. J. Acoust. Soc. Am. 80, 163-176. Zwicker, E. (1986f): Peripheral preprocessing in hearing and psychoacoustics as guidelines for speech recognition. In: (Eds.): Units and their Representation in Speech Recognition. 
Canadian Acoustical Association. Zwicker, E. (1986g): The temporal resolution of hearing - An expedient measuring method for speech intelligibility. Audiol. Acoustics 25, 156-168. Zwicker, E. (1987a): Masking in normal ears - Psychoacoustical facts and physiological correlates. In: Feldmann, H. (Ed.): Proc. III. Intern. Tinnitus Seminar, Münster 1987. Harsch-Verl., Karlsruhe, 214-223. Zwicker, E. (1987b): Objective otoacoustic emissions and their uncorrelation to tinnitus. In: Feldmann, H. (Ed.): Proc. III. Intern. Tinnitus Seminar, Münster 1987. Harsch-Verl., Karlsruhe, 75-81. Zwicker, E. (1988a): The inner ear, a sound processing and a sound emitting system. J. Acoust. Soc. Jpn. (E) 9, 59-74. Zwicker, E. (1988b): Psychophysics and physiology of peripheral processing in hearing. In: Basic Issues in Hearing. Proc. of the 8th Intern. Symp. on Hearing. Academic Press, London, 14-25. Zwicker, E. (1989a): Otoacoustic emissions and cochlear travelling waves. In: Wilson, J.P., Kemp, D.T. (Eds.): Cochlear Mechanisms. Plenum Press, New York, 359-366. Zwicker, E. (1990a): On the frequency separation of simultaneously evoked otoacoustic emissions’ consecutive extrema and its relation to cochlear travelling waves. J. Acoust. Soc. Am. 88, 1639-1641. Zwicker, E. (1990b): On the influence of acoustical probe impedance on evoked otoacoustic emissions. Hearing Res. 47, 185-190. Zwicker, E. (1990c): Otoacoustic emissions in research of inner ear signal processing. In: Hoked, M. (Ed.): Advances in Audiology, Vol. 7. S. Karger AG, Basel, 63-76. Zwicker, E., Deuter, K., Peisl, W. (1985): Loudness meters based on ISO 532 B with large dynamic range. In: Proc. inter-noise’85, Vol II, 1119-1122. Zwicker, E., Fastl, H. (1983): A portable loudness-meter based on ISO 532 B. In: Proc. 11. ICA Paris, Vol. 8, 135-137. Zwicker, E., Fastl, H. (1986a): Sinnvolle Lärmmessung und Lärmgrenzwerte. Z. für Lärmbekämpfung 33, 61-67.
10 References Zwicker, E., Fastl, H. (1986b): Examples for the use of loudness: Transmission loss and addition of noise sources. In: Proc. inter-noise’86, Vol. II, 861-866. Zwicker, E., Fastl, H. (1990): Psychoacoustics. Facts and Models. Springer, Heidelberg, New York 1990. Zwicker, E., Fastl, H., Dallmayr, C. (1984): BASIC-Program for calculating the loudness of sounds from their 1/3-oct band spectra according to ISO 532 B. Acustica 55, 63-67. Zwicker, E., Fastl, H., Widmann, U., Kurakata, K., Kuwano, S., Namba, S. (1991): Program for calculating loudness according to DIN 45631 (ISO 532 B). J. Acoust. Soc. Jpn. (E) 12, 39-42. Zwicker, E., Henning, G.B. (1991): On the effect of interaural phase differences on loudness. Hearing Res. 53, 141-152. Zwicker, E., Henning, G.B. (1984): Binaural masking-level differences with tones masked by noises of various bandwidths and level. Hearing Res. 14, 179-183. Zwicker, E., Henning, G.B. (1985): The four factors leading to binaural masking-level differences. Hearing Res. 19, 29-47. Zwicker, E., Hesse, A. (1984): Temporary threshold shifts after onset and offset of moderately loud low-frequency maskers. J. Acoust. Soc. Am. 75, 545-549. Zwicker, E., Lumer, G. (1985): Evaluating traveling wave characteristics in man by an active nonlinear cochlea preprocessing model. In: Allen, J.W., Hall, J.L., Hubbard, A., Neely, S.T., Tubis, A. (Eds.): Peripheral auditory mechanisms. Springer Verlag, Heidelberg, Berlin, 250-257. Zwicker, E., Manley, G. (1983): The auditory system of mammals and man. In: Hoppe, W., Lohmann, W., Markl, H., Ziegler, H. (Eds.): Biophysics. Springer, Berlin, 671-682. Zwicker, E., Peisl, W. (1990): Cochlear preprocessing in analog models, in digital models and in human inner ear. Hearing Res. 44, 209-216. Zwicker, E., Scherer, A. (1987): Correlation between time functions of sound pressure, masking, and OAE-suppression. J. Acoust. Soc. Am. 81, 1043-1049. Zwicker, E., Schloth, E. 
(1984): Interrelation of different otoacoustic emissions. J. Acoust. Soc. Am. 75, 1148-1154. Zwicker, E., Schorn, K. (1982): Temporal resolution in hard-of-hearing patients. Audiology 21, 474-492. Zwicker, E., Schorn, K. (1990): Delayed evoked otoacoustic emissions - An ideal screening test for excluding hearing impairment in infants. Audiology 29, 241-251. Zwicker, E., Schorn, K., Vogel, T. (1990): Zeitlich begrenzte Schwellenverschiebung nach Kernspin-Tomographie. Laryngo-Rhino-Otol. 69, 412-416. Zwicker, E., Stekker, M., Hind, J. (1987): Relations between masking, otoacoustic emissions, and evoked potentials. Acustica 64, 102-109. Zwicker, E., Wesel, J. (1990): The effect of „addition“ in suppression of delayed evoked otoacoustic emissions and in masking. Acustica 70, 189-196. Zwicker, E., Zollner, M. (1984): Elektroakustik. Springer, Berlin, Hochschultext Zwicker, E., Zollner, M. (1987): Elektroakustik, 2. erw. Auflage. Springer, Berlin, Hochschultext. Zwicker, E., Zwicker, U.T. (1991b): Dependence of binaural loudness summation on interaural level differences, spectral distribution, and temporal distribution. J. Acoust. Soc. Am. 89, 756-764. Zwicker, E., Zwicker, U.T. (1991a): Audio engineering and psychoacoustics: Matching signals to the final receiver, the human hearing system. J. Audio Eng. Soc. 39, 115-126. Zwicker, E., Zwicker, U.T. (1984b): Binaural masking level differences in non-simultaneous masking. Hearing Res. 13, 221-228. Zwicker, U.T., Zwicker, E. (1984a): Binaural masking level difference as a function of masker and testsignal duration. Hearing Res. 13, 215-219.
11 References (work outside the collaborative research centre 204)
Abdala, C., Sininger, Y.S., Ekelid, M., Zeng, F.G. (1996): Distortion product otoacoustic emission suppression tuning curves in human adults and neonates. Hearing Res. 98, 38-53. Adret-Hausberger, M., Jenkins, P.F. (1988): Complex organization of the warbling song in the European starling Sturnus vulgaris. Behaviour 107, 138-156. Allen, J.B., Fahey, P.F. (1993): A second cochlear-frequency map that correlates distortion product and neural measurements. J. Acoust. Soc. Am. 94, 809-816. Altman, J.A. (1968): Are there neurons detecting direction of sound source motion? Exp. Neurol. 22, 13-25. Arnold, W., Anniko, M. (1990): Das Zytokeratinskelett des menschlichen Corti-Organs und seine funktionelle Bedeutung. Laryngo-Rhino-Otol. 69, 24-30. Attias, J., Bresloff, I., Furman, V. (1996): The influence of the efferent auditory system on otoacoustic emission in noise induced tinnitus: Clinical relevance. Acta Otolaryngol. 116, 534-539. Au, W.L., Moore, P.W.B., Pawloski, D.A. (1988): Detection of complex echoes in noise by an echolocating dolphin. J. Acoust. Soc. Am. 83, 662-668. Barlow, H.B., Levick, W.R. (1965): The mechanisms of directionally selective units in rabbit’s retina. J. Physiol. 178, 477-504. Bekesy, G. von (1960): Experiments in hearing. McGraw Hill Book Company, New York, Toronto, London. Bismarck, G. von (1974a): Timbre of steady sounds: A fractional investigation of its verbal attributes. Acustica 30, 146-159. Bismarck, G. von (1974b): Sharpness as an attribute of the timbre of steady sounds. Acustica 30, 159-172. Boord, R.L., Rasmussen, G.L. (1963): Projection of the cochlear and lagenar nerves on the cochlear nuclei of the pigeon. J. Comp. Neurol. 120, 463-473. Bregman, A.S. (1990): Auditory scene analysis: The perceptual organization of sound. MIT Press, Cambridge. Brown, A.M., Gaskill, S.A., Williams, D.M. (1992): Mechanical filtering of sound in the inner ear. Proc. R. Soc. Lond. B 250, 29-34. Brown, A.M., Harris, F.P., Beveridge, H.A. (1996): Two sources of acoustic distortion products from the human cochlea. J. Acoust. Soc. Am. 100, 3260-3267. Brown, A.M., Kemp, D.T. (1984): Suppressibility of the 2f1-f2 stimulated acoustic emissions in gerbil and man. Hearing Res. 13, 29-37. Brown, C.H., Maloney, C.G. (1986): Temporal integration in two species of Old World monkeys: Blue monkeys (Cercopithecus mitis) and grey-cheeked mangabeys (Cercocebus albigena). J. Acoust. Soc. Am. 79, 1058-1064. Brownell, W.E., Bader, C.R., Bertrand, D., de Ribeaupierre, Y. (1985): Evoked mechanical responses of isolated outer hair cells. Science 227, 194-196. Bruns, V. (1976a): Peripheral tuning for fine frequency analysis by the CF-FM bat, Rhinolophus ferrumequinum. I. Mechanical specializations of the cochlea. J. Comp. Physiol. 106, 77-86.
11 References Bruns, V. (1976b): Peripheral tuning for fine frequency analysis by the CF-FM bat, Rhinolophus ferrumequinum. II. Frequency mapping in the cochlea. J. Comp. Physiol. 106, 87-97. Bruns, V., Schmieszek, (1980): Cochlear innervation in the Greater horseshoe bat: Demonstration of an acoustic fovea. Hearing Res. 3, 27-43. Buunen, T.J.F. van Valkenburg, D.A. (1979): Auditory detection of a single gap in noise. J. Acoust. Soc. Am. 65, 534-537. Buus, S. (1985): Release from masking caused by envelope fluctuations. J. Acoust. Soc. Am. 78, 1958-1965. Cant, N.B. (1992): The cochlear nucleus; neuronal types and their synaptic organization. In: Popper, A.N., Fay, R.R. (Eds.): The mammalian auditory pathway: Neuroanatomy. Springer Verlag, New York, 66-116. Cant, N.B., Hyson, R.L. (1992): Projections from the lateral nucleus of the trapezoid body to the medial olivary nucleus in the gerbil. Hearing Res. 58, 26-34. Carr, C.E. (1992): Evolution of the central auditory system in reptiles and birds. In: Webster, D.B., Fay, R.R., Popper, A.N. (Eds.): The Evolutionary Biology of Hearing, 1st ed., Springer Verlag, New York, 511-543. Carr, C.E. (1993): Delay line models of sound localization in the barn owl. Am. Zool. 33, 79-85. Carr, C.E., Konishi, M. (1990): A circuit for detection of interaural time differences in the brain stem of the barn owl. J. Neurosci. 10, 3227-3246. Carroll, R.L. (1988): Vertebrate paleontology and evolution. Freeman, New York. Casseday, J.H., Covey, E. (1992): Frequency tuning properties of neurons in the inferior colliculus of an FM-bat. J. Comp. Neurol. 319, 34-50. Casseday, J.H., Covey, E. (1996): A neuroethological theory of the operation of the inferior colliculus. Brain Behav. Evol. 47(6), 311-336. Casseday, J.H., Ehrlich, D., Covey, E. (1994): Neural tuning for sound duration: Role of inhibitory mechanisms in the inferior colliculus. Science 264 (5160), 847-50. Casseday, J.H., Kobler, J.B., Isbey, S.F., Covey, E. 
(1989): Central acoustic tract in an echolocating bat: An extralemniscal auditory pathway to the thalamus. J. Comp. Neurol. 287, 247-259. Cazals, Y., Aran, J.-M., Erre, J.-P., Guilhaume, A. (1980): Acoustic responses after total destruction of the cochlear receptor: Brainstem and auditory cortex. Science 210, 83-86. Chandler, J.P. (1984): Light and electron microscopic studies of the basilar papilla in the duck, Anas platyrhynchos. I. The hatchling. J. Comp. Neurol. 222, 506-522. Chen, L., Salvi, R. and Shero, M. (1994): Cochlear frequency-place map in adult chickens: Intracellular biocytin labeling. Hearing Res. 81, 130-136. Chistovich, L.A. and Lubinskaya, V.V. (1979): The „center of gravity“ effect in vowel spectra and critical distance between the formants: psychoacoustical study of the perception of vowel-like stimuli. Hearing Res. 1, 185-195. Clack, J.A. (1997): The evolution of tetrapod ears and the fossil record. Brain Behav. Evol. 50, 198-212. Clarey, J.C., Barone, P., Imig, T.J. (1992): Physiology of thalamus and cortex. In: Popper, A.N., Fay, R.R. (Eds.): The mammalian auditory pathway: Neurophysiology. Springer Handbook of Auditory Research, Springer-Verlag, Berlin, Heidelberg, New York, 232-334. Code, R.A. (1995): Efferent neurons to the macula lagena in the embryonic chick. Hearing Res. 82, 26-30. Code, R.A., Carr, C.E. (1994): Choline acetyltransferase-immunoreactive cochlear efferent neurons in the chick auditory brainstem. J. Comp. Neurol. 340, 161-173. Cole, K.S., Gummer, A.W. (1990): A double-label study of efferent projections to the cochlea of the chicken, Gallus domesticus. Exp. Brain Res. 82, 585-588. Cooke, M. (1993): Modelling auditory processing and organisation. Cambridge University Press, Cambridge. Corwin, J.T., Cotanche, D.A. (1988): Regeneration of sensory hair cells after acoustic trauma. Science 240, 1772-1774. Cotanche, D.A., Saunders, J.C., Tilney, L.G. (1987): Hair cell damage produced by acoustic trauma in the chick cochlea. 
Hearing Res. 25, 267-286.
Covey, E., Casseday, J.H. (1995): Lower brainstem auditory pathways. In: Popper, A.N., Fay, R.R. (Eds.): Hearing by Bats. Springer Handbook of Auditory Research. Springer Verlag, New York, 235-295. Covey, E., Hall, W.C., Kobler, J.B. (1987): Subcortical connections of the superior colliculus in the mustached bat, Pteronotus parnellii. J. Comp. Neurol. 263, 179-197. Covey, E., Kauer, J.A., Casseday, J.H. (1996): Whole-cell patch-clamp recording reveals subthreshold sound-evoked postsynaptic currents in the inferior colliculus of awake bats. J. Neurosci. 16, 3009-3018. Covey, E., Vater, M., Casseday, J.H. (1991): Binaural properties of single units in the superior olivary complex of the mustached bat. J. Neurophysiol. 66, 1080-1094. D'Amato, M.R. (1988): A search for tonal pattern perception in cebus monkeys: Why monkeys can't hum a tune. Music Perception 5, 453-480. de Boer, E. (1985): Auditory time constants: A paradox? In: Michelsen, A. (Ed.): Time Resolution in Auditory Systems. Springer, Berlin, 141-158. de Boer, E., Kuyper, P. (1968): Triggered correlation. IEEE Trans. Biomed. Eng. 15, 169-179. Dooling, R.J. (1979): Temporal summation of pure tones in birds. J. Acoust. Soc. Am. 65, 1058-1060. Dooling, R.J. (1992): Hearing in Birds. In: Webster, D.B., Fay, R.R., Popper, A.N. (Eds.): The Evolutionary Biology of Hearing. Springer, New York, 545-559. Eatock, R.A., Manley, G.A., Pawson, L. (1981): Auditory nerve fiber activity in the Tokay Gecko. I. Implications for cochlear processing. J. Comp. Physiol. 142, 203-218. Ebert, U., Ostwald, J. (1992): Serotonin modulates auditory information processing in the cochlear nucleus of the rat. Neurosci. Lett. 145, 51-54. Eens, M., Pinxten, R., Verheyen, R.F. (1989): Temporal and sequential organization of song bouts in the starling. Ardea 77, 75-86. Ehret, G. (1995): Auditory frequency resolution in mammals: From neural representation to perception. 
In: Manley, G.A., Klump, G.M., Köppl, C., Fastl, H., Oeckinghaus, H. (Eds.): Advances in Hearing Research. Proc. 10th internat. Symp. on Hearing. World Scientific Publishing Co., Singapore, 387-397. Fastl, H. (1977): Roughness and temporal masking patterns of sinusoidally amplitude modulated broadband noise. In: Psychophysics and Physiology of Hearing. Evans, E.F., Wilson, J.P. (Eds.): Academic Press, London, 403-414. Fastl, H. (1978): Frequency discrimination for pulsed versus modulated tones. J. Acoust. Soc. Am. 63, 275-277. Fastl, H. (1979): Temporal masking effects: III. Pure tone masker. Acustica 43. Fay, R.R. (1988): Hearing in Vertebrates: A Psychophysics Databook. Hill-Fay Associates, Winnetka, IL. Feduccia, A. (1980): The age of birds. Harvard University Press, Cambridge, Massachusetts, USA. Feldtkeller, R., Oetinger, R. (1956): Die Hörbarkeitsgrenzen von Impulsen verschiedener Dauer. Acustica 6, 489-493. Fletcher, H. (1940): Auditory patterns. Rev. Modern Physics 12, 47-65. Formby, C., Muir, K. (1988): Modulation and gap detection for broadband and filtered noise signals. J. Acoust. Soc. Am. 84, 545-550. Fuchs, P.A. (1992): Ionic currents in cochlear hair cells. Prog. Neurobiol. 39, 493-505. Fuchs, P.A., Evans, M.G., Murrow, B.W. (1990): Calcium currents in hair cells isolated from the cochlea of the chick. J. Physiol. Lond. 429, 553-568. Fuchs, P.A., Murrow, B.W. (1992): Cholinergic inhibition of short (outer) hair cells of the chick’s cochlea. J. Neurosci. 12, 800-809. Fuchs, P.A., Nagai, T., Evans, M.G. (1988): Electrical tuning in hair cells isolated from the chick cochlea. J. Neurosci. 8, 2460-2467. Gaskill, S.A., Brown, A.M. (1990): The behaviour of the acoustic distortion product 2f1-f2, from the human ear and its relation to auditory sensitivity. J. Acoust. Soc. Am. 88, 821-839. Glasberg, B.R., Moore, B.C.J. (1990): Derivation of auditory filter shapes from notched-noise data. Hearing Res. 47, 103-138.
Goldberg, J.M., Brown, P.B. (1969): Response of binaural neurons of the dog superior olivary complex to dichotic tonal stimuli: Some physiological mechanisms of sound localization. J. Neurophysiol. 32, 613-636. Gordon, J.W., Grey, J.M. (1978): Perceptual effects of spectral modifications on musical timbres. J. Acoust. Soc. Am. 63, 1493-1500. Grantham, D.W. (1997): Auditory motion aftereffects in the horizontal plane: The effects of spectral region, spatial sector and spatial richness. Acta Acustica (in press). Green, D.M. (1985): Temporal factors in psychoacoustics. In: Michelsen, A. (Ed.): Time Resolution in Auditory Systems. Springer, New York, Berlin, 122-140. Green, D.M., Birdsall, T.G., Tanner, W.P. (1957): Signal detection as a function of signal intensity and duration. J. Acoust. Soc. Am. 29, 523-531. Greenewalt, C.H. (1968): Bird song: Acoustics and physiology. Smithsonian Institution Press, Washington D.C. Griffin, D.R., Webster, F.A., Michael, C.R. (1960): The echolocation of flying insects by bats. Evolution 8, 141-154. Griffiths, T.D., Rees, A., Witton, C., Shakir, R.A., Henning, G.B., Green, G.G.R. (1996): Evidence for a sound movement area in the human cerebral cortex. Nature 383, 425-427. Guinan, J.J.J. (1996): Physiology of olivocochlear efferents. In: Dallos, P., Popper, A.N., Fay, R.R. (Eds.): The Cochlea. Springer Verlag, New York, 435-502. Gummer, A., Smolders, J.W.T., Klinke, R. (1986): The mechanics of the basilar membrane and middle ear in the pigeon. In: Allen, J.B., Hall, J.L., Hubbard, A., Neely, S.T., Tubis, A. (Eds.): Peripheral auditory mechanisms. Springer Verlag, Berlin, Heidelberg, New York, Tokyo, 81-88. Hall, J.W., Grose, J.H. (1991): Relative contributions of envelope maxima and minima to comodulation masking release. Quart. J. Exp. Psych. A 43, 349-372. Hall, J.W., Haggard, M.P., Fernandes, M.A. (1984): Detection in noise by spectro-temporal pattern analysis. J. Acoust. Soc. Am. 76, 50-56. 
Harris, F.P., Lonsbury-Martin, B.L., Stagner, B.B., Coats, A.C., Martin, G.K. (1989): Acoustic distortion products in humans: Systematic changes in amplitude as a function of f2/f1 ratio. J. Acoust. Soc. Am. 85, 220-229. Harris, F.P., Probst, R., Xu, L. (1992): Suppression of the 2f1-f2 otoacoustic emission in humans. Hearing Res. 64, 133-141. Hartby, E. (1969): The calls of the starling. Dansk. Ornithol. Foren. Tidsskr. 62, 205-230. Hauser, R., Probst, R. (1991): The influence of systematic primary-tone level variation L2-L1 on the acoustic distortion product emission 2f1-f2 in normal human ears. J. Acoust. Soc. Am. 89, 280-286. Heffner, H., Whitfield, I.C. (1976): Perception of the missing fundamental by cats. J. Acoust. Soc. Am. 59, 915-919. Heil, P., Langner, G., Scheich, H. (1992): Processing of frequency-modulated stimuli in the chick auditory cortex analogue: Evidence for topographic representations and possible mechanisms of rate and directional sensitivity. J. Comp. Physiol. A 171, 583-600. Heil, P., Scheich, H. (1992): Postnatal shift of tonotopic organization in the chick auditory cortex analogue. Neuroreport 3, 381-384. Heitmann, J., Waldmann, B., Schnitzler, H.U., Plinkert, P.K., Zenner, H.P. (1997): Suppression growth functions of DPOAE with a suppressor near 2f1-f2 depends on DP fine structure: Evidence for two generation sites for DPOAE. Abstr. 20th Midwinter Meeting ARO 83. Henson, M.M., Henson, O.W.jr. (1991): Specializations for sharp tuning in the mustached bat: the tectorial membrane and the spiral limbus. Hearing Res. 56, 122-132. Henson, M.M., Rübsamen, R. (1996): The postnatal development of tension fibroblasts in the spiral ligament of the horseshoe bat, Rhinolophus rouxi. Audit. Neurosci. 2, 3-13. Herbert, H., Aschoff, A., Ostwald, J. (1991): Topographical projections from the auditory cortex to the inferior colliculus in the rat. J. Comp. Neurol. 304, 103-122. Highstein, S.M., Baker, R. 
(1985): Action of the efferent vestibular system on primary afferents in the toadfish, Opsanus tau. J. Neurophysiol. 54, 370-384.
Hill, K.G., Mo, J., Stange, G. (1989): Excitation and suppression of primary auditory fibres in the pigeon. Hearing Res. 39, 37-48. Hindmarsh, A.M. (1984): Vocal mimicry in starlings. Behaviour 90, 302-324. Hudspeth, A.J., Jacobs, R. (1979): Stereocilia mediate transduction in vertebrate hair cells. Proc. Natl. Acad. Sci. 76, 1506-1509. Huffman, R.F., Henson, O.W. (1990): The descending auditory pathway and acousticomotor systems: Connections with the inferior colliculus. Brain Res. Rev. 15, 295-323. Hulse, S.H., Cynx, J. (1985): Relative pitch perception is constrained by absolute pitch in songbirds (Mimus, Molothrus and Sturnus). J. Comp. Psychol. 99, 176-196. Hulse, S.H., Cynx, J., Humpal, J. (1984): Absolute and relative pitch discrimination in serial pitch perception by songbirds. J. Exp. Psychol. Gen. 113, 38-54. Hulse, S.H., MacDougall-Shackleton, S.A., Wisniewski, A.B. (1997): Auditory scene analysis by songbirds: Stream segregation of bird song by European starlings (Sturnus vulgaris). J. Comp. Psychol. 111, 3-13. Irvine, D.R.F. (1992): Physiology of the auditory brainstem. In: Popper, A.N., Fay, R.R. (Eds.): The mammalian auditory pathway: Neurophysiology. Springer Handbook of auditory research. Springer Verlag, New York, 153-232. Irvine, D.R.F., Park, V.N., Mattingley, (1995): Responses of neurons in the inferior colliculus of the rat to interaural time and intensity differences in transient stimuli: Implications for the latency hypothesis. Hearing Res. 85, 127-141. Iurato, S. (1962): Functional Implications of the Nature and Submicroscopic Structure of the tectorial and basilar Membranes. J. Acoust. Soc. Am. 34, 1386-1395. Jeffress, L.A. (1948): A place theory of sound localization. J. Comp. Physiol. Psychol. 41, 35-39. Jhaveri, S., Morest, D.K. (1982): Neuronal architecture in nucleus magnocellularis of the chicken auditory system with observations on nucleus laminaris: A light and electron microscope study. Neuroscience 7, 809-836. 
Johnstone, B.M., Patuzzi, R., Yates, G. (1986): Basilar membrane measurements and travelling wave. Hearing Res. 22, 147-153. Jones, E.G., Burton, H. (1976): Areal differences in the laminar distribution of thalamic afferents in cortical fields of the insular, parietal and temporal regions of primates, J. Comp. Neurol. 168, 197-248. Jones, S.M., Jones, T.A. (1995): The tonotopic map in the embryonic chicken cochlea. Hearing Res. 82, 149-157. Jörgensen, J.M., Christensen, J.T. (1989): The inner ear of the common rhea (Rhea americana L.). Brain Behav. Evol. 34, 273-280. Joris, P.X., Carney, L.H., Smith, P.H., Yin, T.C.T. (1994): Enhancement of neural synchronization in the anteroventral cochlear nucleus. I. Responses to tones at the characteristic frequency. J. Neurophysiol. 71, 1022-1036. Joris, P.X., Yin, T.C.T. (1995): Envelope coding in the lateral superior olive. I. Sensitivity to interaural time differences. J. Neurophysiol. 73, 1043-1062. Jürgens, U., Ploog, D. (1981): On the neural control of mammalian vocalization. TINS 4(6), 135-137. Kalinec, F., Holley, C.M., Iwasa, K.H., Lim, D.J., Kachar, B. (1992): A membrane-based force generating mechanism in auditory sensory cells. PNAS 89, 8671-8675. Kelly, J.B. (1973): The effects of insular and temporal lesions in cats on two types of auditory pattern discrimination. Brain Res. 62, 71-87. Kemp, D.T. (1979): Evidence of mechanical nonlinearity and frequency selective wave amplification in the cochlea. Arch. Otorhinolaryngol. 22, 437-455. Kemp, D.T. (1986): Otoacoustic emissions, travelling waves and cochlear mechanisms. Hearing Res. 22, 95-104. Kemp, D.T., Brown, A.M. (1983a): A comparison of mechanical nonlinearities in the cochleae of man and gerbil from ear canal measurements. In: Klinke, R., Hartman, R. (Eds.): Hearing - physiological bases and psychophysics. Springer-Verlag, Berlin, Heidelberg, New York, 82-87. Kemp, D.T., Brown, A.M. 
(1983b): An integrated view of cochlear mechanical nonlinearities observable from the ear canal. In: de Boer, E., Viergever, M.A. (Eds.): Mechanics of hearing. Martinus Nijhoff, Den Haag, 75-82.
Klinke, R., Müller, M., Richter, C.-P., Smolders, J. (1994): Preferred intervals in birds and mammals: A filter response to noise. Hearing Res. 74, 238-246. Knipschild, M., Dörrscheidt, G.J., Rübsamen, R. (1992): Setting complex tasks to single units in the avian auditory forebrain. I: Processing of complex artificial stimuli. Hearing Res. 57, 216-230. Konishi, M. (1993): Listening with two ears. Sci. Am. 268, 66-73. Kruskal, J.B. (1964): Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1-27. Kuwabara, N., Zook, J.M. (1992): Projections to the medial superior olive from the medial and lateral nuclei of the trapezoid body in rodents and bats. J. Comp. Neurol. 324, 522-538. Langner, G. (1992): Periodicity coding in the auditory system. Hearing Res. 60, 115-142. Lenoir, M., Puel, J.-L., Pujol, R. (1987): Stereocilia and tectorial membrane development in the rat. A SEM study. Anat. Embryol. 175, 477-487. Leppelsack, H.-J. (1974): Funktionelle Eigenschaften der Hörbahn im Feld L des Neostriatum caudale des Staren (Sturnus vulgaris L., Aves). J. Comp. Physiol. 88, 271-320. Liberman, M.C. (1982): The cochlear frequency map for the cat: labelling auditory nerve fibers of known characteristic frequency. J. Acoust. Soc. Am. 72, 1441-1449. Liberman, M.C. (1990): Effects of chronic cochlear de-efferentation on auditory-nerve response. Hearing Res. 49, 209-224. Liberman, M.C., Brown, M.C. (1986): Physiology and anatomy of single olivocochlear neurons in the cat. Hearing Res. 24, 17-36. Liberman, M.C., Kiang, N.Y.S. (1978): Acoustic trauma in cats. Acta Otolaryngol. (Stockh.) 358. Lim, D.J. (1986): Functional structure of the organ of Corti: A review. Hearing Res. 22, 117-147. May, B., Moody, D.B., Stebbins, W.C. (1989): Categorical perception of conspecific communication sounds by Japanese macaques, Macaca fuscata. J. Acoust. Soc. Am. 85, 837-847. Middlebrooks, J.C., Green, D.M. 
(1991): Sound localization by human listeners. Ann. Rev. Psychol. 42, 135-159. Miller, M.R. (1980): The reptilian cochlear duct. In: Popper, A.N., Fay, R.R. (Eds.): Comparative studies of hearing in vertebrates. Springer Verlag, New York, 169-204. Miller, M.R., Beck, J. (1988): Auditory hair cell innervational patterns in lizards. J. Comp. Neurol. 271, 604-628. Mills, D.M., Rubel, E.D. (1996): Development of the cochlear amplifier. J. Acoust. Soc. Am. 100, 428-441. Mogdans, J., Schnitzler, H.-U. (1990): Range resolution and the possible use of spectral information in the echolocating bat, Eptesicus fuscus. J. Acoust. Soc. Am. 88, 754-757. Mogdans, J., Schnitzler, H.-U., Ostwald, J. (1993): Discrimination of two-wavefront echoes by the big brown bat, Eptesicus fuscus: behavioral experiments and receiver simulations. J. Comp. Physiol. 172, 309-323. Moore, B.C.J. (1974): Relation between the critical bandwidth and the frequency-difference limen. J. Acoust. Soc. Am. 55, 359. Moore, B.C.J. (1990): Co-modulation masking release: Spectro-temporal pattern analysis in hearing. Brit. J. Audiol. 24, 131-137. Moore, B.C.J. (1992): Across-channel processes in auditory masking. J. Acoust. Soc. Jpn. 13, 25-37. Moore, B.C.J., Glasberg, B.R. (1989): Mechanisms underlying the frequency discrimination of pulsed tones and the detection of frequency modulation. J. Acoust. Soc. Am. 86, 1722-1732. Müller, C.M., Leppelsack, H.-J. (1985): Feature extraction and tonotopic organization in the avian auditory forebrain. Exp. Brain Res. 59, 587-599. Müller, M. (1991): Frequency representation in the rat cochlea. Hearing Res. 51, 247-254. Müller, M., Laube, B., Burda, H., Bruns, V. (1992): Structure and function of the cochlea of the African mole rat, Cryptomys hottentotus: Evidence for a low frequency acoustic fovea. J. Comp. Physiol. A 171, 469-476. O’Neill, W.E., Suga, N. (1982): Encoding of target-range and its representation in the auditory cortex of the mustached bat. J. Neurosci. 
2, 17-31.
Obrist, M.K., Fenton, M.B., Eger, J.L., Schlegel, P.A. (1993): What ears do for bats - A comparative study of pinna sound pressure transformation in chiroptera. J. Exp. Biol. 180, 119-152. Oertel, D. (1985): Use of brain slices in the study of the auditory system: Spatial and temporal summation of synaptic inputs in cells in the anteroventral cochlear nucleus of the mouse. J. Acoust. Soc. Am. 78, 328-333. Ofsie, M.S., Cotanche, D.A. (1996): Distribution of nerve fibers in the basilar papilla of normal and sound-damaged chick cochleae. J. Comp. Neurol. 370, 281-294. Ohyama, K., Sato, T., Wada, H., Takasaka, T. (1992): Frequency instability of spontaneous otoacoustic emissions in the guinea pig. Abstr. 15th Midwinter Meeting ARO, St. Petersburg Bch., Fla., 150. Ohyama, K., Wada, H., Kobayashi, T., Takasaka, T. (1991): Spontaneous otoacoustic emissions in the guinea pig. Hearing Res. 56, 111-121. Okanoya, K., Dooling, R.J. (1987): Hearing in passerine and psittacine birds: A comparative study of absolute and masked auditory thresholds. J. Comp. Psychol. 101, 7-15. Okanoya, K., Dooling, R.J. (1988): Hearing in the swamp sparrow, Melospiza georgiana, and in the song sparrow, Melospiza melodia. Anim. Behav. 36, 726-732. Okanoya, K., Dooling, R.J. (1990): Detection of gaps in noise by budgerigars (Melopsittacus undulatus) and zebra finches (Poephila guttata). Hearing Res. 50, 185-192. O’Neill, W.E., Suga, N. (1979): Target range-sensitive neurons in the auditory cortex of the mustached bat. Science 203, 69-73. Patterson, R.D., Holdsworth, J., Allerhand, M. (1992): Auditory models as preprocessors for speech recognition. In: Schouten, M.E.H. (Ed.): The auditory processing of speech: From sounds to words. Mouton de Gruyter, Berlin, 67-83. Plomp, R., Bouman, M.A. (1959): Relation between hearing threshold and duration of tone pulses. J. Acoust. Soc. Am. 31, 749-758. Poggio, T., Reichardt, W. (1973): Considerations on models of movement detection. Kybernetik 13, 223-227. 
Pollak, G.D. (1988): Time is traded for intensity in the bat’s auditory system. Hearing Res. 36, 107-124. Pollak, G.D., Henson, O.W.jr., Novick, A. (1972): Cochlear microphonic audiograms in the pure tone bat, Chilonycteris parnellii. Science 176, 66-68. Preyer, S., Gummer, A.W. (1996): Nonlinearity of mechanoelectrical transduction of outer hair cells as the source of nonlinear basilar-membrane motion and loudness recruitment. Audiology & Neuro-Otology 1, 3-11. Raman, I.M., Trussell, L.O. (1992): The kinetics of the response to glutamate and kainate in neurons of the avian cochlear nucleus. Neuron 9, 173-186. Ratnanather, J.T., Brownell, W.E., Popel, A.S. (1993): Mechanical properties of the outer hair cell. Biophysics of hair cell sensory system. Academisch Ziekenhuis Groningen, 149-155. Rees, H., Roberts, M.H.T. (1993): The anterior pretectal nucleus: A proposed role in sensory processing. Pain 53, 121-135. Rhode, W.S., Greenberg, S. (1992): Physiology of the cochlear nuclei. In: Popper, A.N., Fay, R.R. (Eds.): The mammalian auditory pathway: Neurophysiology. Springer Handbook of auditory research. Springer Verlag, New York, 94-153. Richards, D.G., Wiley, R.H. (1980): Reverberations and amplitude fluctuations in the propagation of sound in a forest: Implications for animal communication. Am. Nat. 115, 381-399. Ritsma, R.J. (1967): Existence region of the tonal residue. I. J. Acoust. Soc. Am. 42, 191-198. Robertson, D. (1982): Effect of acoustic trauma on stereocilia structure and spiral ganglion cell tuning properties in the guinea pig cochlea. Hearing Res. 7, 55-74. Robertson, D. (1984): Horseradish peroxidase injection of physiologically characterized afferent and efferent neurones in the guinea pig spiral ganglion. Hearing Res. 15, 113-121. Robles, L., Ruggero, M.A., Rich, N.C. (1991): Two-tone distortion in the basilar membrane of the cochlea. Nature 349, 413-414. Rosowski, J.J., Peake, W.T., White, J.R. 
(1984): Cochlear nonlinearities inferred from two-tone distortion products in the ear canal of the alligator lizard. Hearing Res. 13, 141-158.
Roverud, R.C., Grinnell, A.D. (1985): Echolocation sound features processed to provide distance information in the CF/FM bat, Noctilio albiventris: Evidence for a gated time window utilizing both CF and FM components. J. Comp. Physiol. 156, 457-469. Rubel, E.W. (1984): Ontogeny of auditory system function. Ann. Rev. Physiol. 46, 213-229. Rubel, E.W., Ryals, B.M. (1983): Development of the place principle: Acoustic trauma. Science 219, 512-514. Rübsamen, R., Schuller, G. (1981): Laryngeal nerve activity during pulse emission in the CF-FM bat Rhinolophus ferrumequinum. II. The recurrent laryngeal nerve. J. Comp. Physiol. 143, 323-327. Rübsamen, R., Dörrscheidt, G.J. (1986): Tonotopic organization of the auditory forebrain in a songbird, the European starling. J. Comp. Physiol. A 158, 639-646. Rübsamen, R., Schäfer, M. (1990): Ontogenesis of auditory fovea representation in the inferior colliculus of the Sri Lankan rufous horseshoe bat, Rhinolophus rouxi. J. Comp. Physiol. 167, 757-769. Ruggero, M.A., Rich, N.C. (1991): Application of a commercially-manufactured Doppler-shift laser velocimeter to the measurement of the basilar-membrane vibration. Hearing Res. 51, 215-230. Ruggero, M.A., Rich, N.C., Recio, A., Narayan, S.S., Robles, L. (1997): Basilar-membrane responses to tones at the base of the chinchilla cochlea. J. Acoust. Soc. Am. 101, 2151-2163. Ryals, B.M., Rubel, E.W. (1988): Hair cell regeneration after acoustic trauma in adult Coturnix quail. Science 240, 1774-1776. Saillant, P.A., Simmons, J.A., Dear, S.P. (1993): A computational model of echo processing and acoustic imaging in frequency-modulated echolocating bats: The spectrogram correlation and transformation receiver. J. Acoust. Soc. Am. 94, 2691-2712. Salvi, R.J., Saunders, S.S., Powers, N.L., Boettcher, F.A. (1992): Discharge patterns of cochlear ganglion neurons in the chicken. J. Comp. Physiol. A 170, 227-241. Sanes, D.H., Rubel, E.W. 
(1988): The ontogeny of inhibition and excitation in the gerbil lateral superior olive. J. Neurosci. 8, 682-700. Scharf, B. (1970): Critical bands. In: Tobias, J.V. (Ed.): Foundations of modern auditory theory. Academic Press, New York, 159-202. Schermuly, L., Klinke, R. (1985): Change of characteristic frequency of pigeon primary auditory afferents with temperature. J. Comp. Physiol. A 156, 209-211. Schermuly, L., Klinke, R. (1990): Origin of infrasound sensitive neurones in the papilla basilaris of the pigeon: An HRP study. Hearing Res. 48, 69-78. Schnitzler, H.-U. (1968): Die Ultraschall-Ortungslaute der Hufeisen-Fledermäuse (Chiroptera-Rhinolophidae) in verschiedenen Orientierungssituationen. Z. Vergl. Physiol. 57, 376-408. Schnitzler, H.-U. (1970): Echoortung bei der Fledermaus Chilonycteris rubiginosa. Z. Vergl. Physiol. 68, 25-38. Schnitzler, H.-U., Henson, W.jr. (1980): Performance of airborne animal sonar systems. In: Busnel, R.G., Fish, J.F. (Eds.): Animal sonar systems. Plenum Press, New York and London, 109-181. Schooneveldt, G.P., Moore, B.C.J. (1989): Comodulation masking release (CMR) as a function of masker bandwidth, modulator bandwidth, and signal duration. J. Acoust. Soc. Am. 85, 273-281. Schuller, G. (1980): Hearing characteristics and Doppler shift compensation in South Indian CF-FM bats. J. Comp. Physiol. A 139, 349-356. Schuller, G., Beuter, K., Schnitzler, H.U. (1974): Response to frequency shifted artificial echoes in the bat Rhinolophus ferrumequinum. J. Comp. Physiol. 89, 275-286. Schuller, G., Pollak, G. (1979): Disproportionate frequency representation in the inferior colliculus of Doppler-compensating greater horseshoe bats: Evidence for an acoustic fovea. J. Comp. Physiol. A 132, 47-54. Schuller, G., Rübsamen, R. (1981): Laryngeal nerve activity during pulse emission in the CF-FM bat, Rhinolophus ferrumequinum. I. Superior laryngeal nerve (External motor branch). J. Comp. Physiol. 143, 317-321.
Schwartz, I.R. (1992): The superior olivary complex and lateral lemniscus nuclei. In: The mammalian auditory pathway: Neuroanatomy. Popper, A.N., Fay, R.R. (Eds.): Springer Handbook of auditory research. Springer Verlag, New York, 117-167. Schwarz, I.E., Schwarz, D.W.F., Frederickson, J.M., Landolt, J.P. (1981): Efferent vestibular neurons: A study employing retrograde tracer methods in the pigeon (Columba livia). J. Comp. Neurol. 196, 1-12. Schweizer, H., Rübsamen, R., Rühle, C. (1981): Localization of brain stem motoneurons innervating the laryngeal muscles in the rufous horseshoe bat, Rhinolophus rouxi. Brain Res. 230, 41-50. Seller, T.J. (1981): Midbrain vocalization centers in birds. Trends Neurosci. 12, 301-303. Shailer, M.J., Moore, B.C.J. (1983): Gap detection as a function of frequency, bandwidth, and level. J. Acoust. Soc. Am. 74, 1169-1174. Siegel, J.H., Hirohata, E.T. (1994): Sound calibration and distortion product otoacoustic emissions at high frequencies. Hearing Res. 80, 146-152. Simmons, J.A., Freeman, E.G., Stevenson, S.B., Chen, L., Wohlgenannt, T.J. (1989): Clutter interference and the integration time of echoes in the echolocating bat, Eptesicus fuscus. J. Acoust. Soc. Am. 86, 1318-1332. Simmons, J.A., Moss, C.F., Ferragamo, M. (1990): Convergence of temporal and spectral information into acoustic images of complex sonar targets perceived by the echolocating bat, Eptesicus fuscus. J. Comp. Physiol. 166, 449-470. Sinnott, J.M., Owren, M.J., Petersen, M.R. (1987): Auditory duration discrimination in Old World monkeys (Macaca, Cercopithecus) and humans. J. Acoust. Soc. Am. 82, 465-478. Slaney, M. (1993): An efficient implementation of the Patterson-Holdsworth auditory filter bank. Apple Computer Inc. Smith, C.A. (1985): Inner ear. In: King, A.S., McLelland, J. (Eds.): Form and function in birds, Vol. 3. Academic Press, London, 273-310. Smith, C.A., Konishi, M., Schuff, N. (1985): Structure of the Barn Owl’s (Tyto alba) inner ear. 
Hearing Res. 17, 237-247. Smith, P.H., Joris, P.X., Yin, T.C.T. (1993): Projections of physiologically characterized spherical bushy cell axons from the cochlear nucleus of the cat: Evidence for delay lines to the medial superior olive. J. Comp. Neurol. 331, 245-260. Smolders, J.W.T., Ding, D., Klinke, R. (1992): Normal tuning curves from primary afferent fibres innervating short and intermediate hair cells in the pigeon ear. In: Cazals, Y., Demany, L., Horner, K. (Eds.): Auditory Physiology and Perception - Advances in Biosciences 83. Pergamon, Oxford. 197-204. Smolders, J.W.T., Ding-Pfennigdorff, D., Klinke, R. (1995): A functional map of the pigeon basilar papilla: Correlation of the properties of single auditory nerve fibres and their peripheral origin. Hearing Res. 92, 151-169. Spitzer, W., Semple, M.N. (1995): Neurons sensitive to interaural phase disparity in the gerbil superior olive: Diverse monaural and temporal response properties. J. Neurophysiol. 73, 1668-1690. Steele, C.R. (1997): Three-dimensional mechanical modeling of the cochlea. In: Lewis, E.R., Long, G., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.): Diversity in Auditory Mechanics. World Scientific Publishing Co., Singapore, 455-462. Suga, N. (1970): Echo ranging neurons in the inferior colliculus of bats. Science 170, 449-452. Sullivan, W.E. (1982): Neural representation of target distance in the auditory cortex of the echolocating bat, Myotis lucifugus. J. Neurophysiol. 58, 1011-1032. Sullivan, W.E., Konishi, M. (1984): Segregation of stimulus phase and intensity coding in the cochlear nucleus of the barn owl. J. Neurosci. 4, 1787-1799. Summerfield, Q., Haggard, M., Foster, J., Gray, S. (1984): Perceiving vowels from uniform spectra: Phonetic exploration of an auditory aftereffect. Perception & Psychophysics 35, 203-213. Surlykke, A., Bojesen, O. (1996): Integration time for short broadband clicks in echolocating FM-bats (Eptesicus fuscus). J. Comp. Physiol. A 178, 235-241.
Takahashi, T.T., Konishi, M. (1988): Projections of the cochlear nuclei and nucleus laminaris to the inferior colliculus of the barn owl. J. Comp. Neurol. 274, 190-211. Takasaka, T., Smith, C.A. (1971): The structure and innervation of the pigeon’s basilar papilla. J. Ultrastructure 35, 20-65. Temchin, A.N. (1988): Unusual discharge patterns of single fibers in the pigeon’s auditory nerve. J. Comp. Physiol. A 163, 99-115. Terhardt, E. (1970): Frequency analysis and periodicity detection in the sensations of roughness and periodicity pitch. In: Plomp, R., Smoorenburg, G.F. (Eds.): Frequency analysis and periodicity detection in hearing. A.W. Sijthoff, Leiden, 278-290. Terhardt, E. (1978): Psychoacoustic evaluation of musical sounds. Percept. Psychophys. 23, 483-492. Thompson, A.M., Moore, K.R., Thompson, G.C. (1995): Distribution and origin of serotoninergic afferents to guinea pig cochlear nucleus. J. Comp. Neurol. 351, 104-116. Thompson, G.C., Thompson, A.M., Garrett, K.M., Britton, B.H. (1994): Serotonin and serotonin receptors in the central auditory system. Otolaryngol. Head Neck Surg. 110, 93-102. Tilney, L.G., Saunders, J.C. (1983): Actin filaments, stereocilia and hair cells of the bird cochlea I. Length, number, width and distribution of stereocilia of each hair cell are related to the position of the hair cell on the cochlea. J. Cell Biol. 96, 807-821. Tilney, M.S., Tilney, L.G., DeRosier, D.J. (1987): The distribution of hair cell bundle lengths and orientations suggests an unexpected pattern of hair cell stimulation in the chick cochlea. Hearing Res. 25, 141-151. Tomlinson, R.W.W., Schwarz, D.W.F. (1988): Perception of the missing fundamental in nonhuman primates. J. Acoust. Soc. Am. 84, 560-565. Tsuchitani, C. (1988): The inhibition of cat lateral superior olive unit excitatory responses to binaural tone bursts. I. The transient chopper response. J. Neurophysiol. 59, 164-183. Vater, M. 
(1982): Single unit responses in cochlear nucleus of horseshoe bats to sinusoidal frequency and amplitude modulated signals. J. Comp. Physiol. A 149, 369-386. Viemeister, N.F. (1979): Temporal modulation transfer function based upon modulation thresholds. J. Acoust. Soc. Am. 66, 1364-1380. Viemeister, N.F. (1980): Adaptation of masking. In: van den Brink, G., Bilsen, F.A. (Eds.): Psychophysical, physiological and behavioural studies in hearing. Delft University Press, Delft, 190-198. Viemeister, N.F., Wakefield, G.H. (1991): Temporal integration and multiple looks. J. Acoust. Soc. Am. 90, 858-865. Voldvrich, L. (1978): Mechanical properties of the basilar membrane. Acta Otolaryngol. 86, 331-335. Wang, W., Timsit-Berthier, M., Schoenen, J. (1996): Intensity dependence of auditory evoked potentials is pronounced in migraine: An indication of cortical potentiation and low serotonergic neurotransmission? Neurology 46, 1404-1409. Warchol, M.E., Dallos, P. (1989): Neural response to very low-frequency sound in the avian cochlear nucleus. J. Comp. Physiol. A 166, 83-95. Warchol, M.E., Dallos, P. (1990): Neural coding in the chick cochlear nucleus. J. Comp. Physiol. A 166, 721-734. Watson, C.S., Gengel, R.W. (1969): Signal duration and signal frequency in relation to auditory sensitivity. J. Acoust. Soc. Am. 46, 989-997. Weber, J.T., Chen, I.L., Hutchins, B. (1986): The pretectal complex of the cat: Cells of origin of projections to the pulvinar nucleus. Brain Res. 397, 389-394. Wenstrup, J.J., Larue, D.T., Winer, J.A. (1994): Projections of the physiologically-defined subdivisions of the inferior colliculus in the mustached bat: Output to the medial geniculate body and extrathalamic targets. J. Comp. Neurol. 346, 207-236. Whitehead, M.C., Morest, D.K. (1981): Dual populations of efferent and afferent cochlear axons in the chicken. Neuroscience 6, 2351-2365. Whitehead, M.L., Kamal, N., Lonsbury-Martin, B.L., Martin, G.K. 
(1993): Spontaneous otoacoustic emissions in different racial groups. Scand. Audiol. 22, 3-10.
346
11 References Whitehead, M.L., Lonsbury-Martin, B.L., Martin, G.K. (1992): Evidence for two discrete sources of 2f1-f2 distortion-product otoacoustic emissions in rabbit. II: Differential physiological vulnerabilty. J. Acoust. Soc. Am. 92, 2662-2682. Whitehead, M.L., McCoy, M.J., Lonsbury-Martin, B.L., Martin, G.K. (1995a): Dependence of distortion-product otoacoustic emissions in primary levels in normal and impaired ears. I. Effects of decreasing L2 below L1. J. Acoust. Soc. Am. 97, 2346-2358. Whitehead, M.L., Stagner, B.B., Lonsbury-Martin, B.L., Martin, G.K. (1995c): Effects of earcanal standing waves on measurements of distortion-product otoacoustic emissions. J. Acoust. Soc. Am. 98, 3200-3214. Whitehead, M.L., Stagner, B.B., McCoy, M.J., Lonsbury-Martin, B.L., Martin, G.K. (1995b): Dependence of distortion-product otoacoustic emissions on primary levels in normal and impaired ears. II. Asymmetry in L1, L2 space. J. Acoust. Soc. Am. 97, 2359-2377. Wier, C.C., Jesteadt, W., Green, D.M. (1977): Frequency discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61, 178-184. Wightman, F.L. (1973): The pattern transformation model of pitch. J. Acoust. Soc. Am. 54, 407-416. Wu, Y.-C., Art, J.J., Goodman, M.B., Fettiplace, R. (1995): A kinetic description of the calciumactivated potassium channel and its application to electrical tuning of hair cells. Prog. Biophys. Mol. Biol. 63, 131-158. Yang, L., Pollak, G.D. (1997): Differential response properties to amplitude modulated signals in the dorsal nucleus of the lateral lemniscus of the mustache bat and the roles of GABAergic inhibition. J. Neurophysiol. 77, 324-340. Yin, T.C.T., Chan, J.C.K. (1990): Interaural time sensitivity in the medial superior olive of the cat. J. Neurophysiol. 64, 465-488. Yin, T.C.T., Hirsch, J.A., Chan, J.C.K. (1985): Response of neurons in the cat’s superior colliculus to acoustic stimuli. II A model of interaural intensity sensitivity. J. Neurophysiol. 
53, 746-758. Yoshida, A., Sessle, B.J., Dostrovsky, J.O., Chiang, C.H. (1992): Trigeminal and dorsal column nuclei projections to the anterior pretectal nucleus in the rat. Brain Res. 590, 81-94. Zenner, H.P., Ernst, A. (1993): Cochlear-motor transduction and signal-transfer tinnitus: Models of three types of cochlear tinnitus. Eur. Arch. Otolanyngol. 249, 447-454. Zenner, H.-P., Zimmermann, U., Schmidt, U. (1985): Reversible contraction of isolated cochlear hair cells. Hearing Res. 18, 127-133. Zidanic, M. Fuchs, P.A. (1996): Synapsin-like immunoreactivity in the chick cochlea: Specific labeling of efferent terminals. Auditory Neurosci. 2, 347-362. Zook, J.M., Leake, P.A. (1989) Connections and frequency representation in the auditory brainstem of the mustached bat, Pteronotus parnellii. J. Comp. Neurol. 290, 243-261. Zwicker, E. (1952): Die Grenzen der Hörbarkeit der Amplitudenmodulation und der Frequenzmodulation eines Tones. Acustica 2, 125-133. Zwicker, E. (1956): Die elementaren Grundlagen zur Bestimmung der Informationskapazität des Gehörs. Acustica 6, 365-381. Zwicker, E. (1961): Subdivision of the audible frequency range into critical bands (Frequenzgruppen). J. Acoust. Soc. Am. 33, 248. Zwicker, E., Feldtkeller, R. (1967): Das Ohr als Nachrichtenempfänger. S. Hirzel Verlag, Stuttgart. Zwicker, E., Flottorp, G., Stevens, S.S. (1957): Critical bandwidth in loudness summation. J. Acoust. Soc. Am. 29, 548-557. Zwislocki, J.J., Cefaratti, L.K. (1989): Tectorial membrane II. Stiffness measurements in vivo. Hearing Res. 42, 211-228.
347
12 Appendices
12.1 Institutes involved in the collaborative research centre 204

Technische Universität München
  Institut für Schaltungstechnik, Lehrstuhl für Elektroakustik (since 1992: Lehrstuhl für Mensch-Maschine-Kommunikation), Arcisstr. 21, 80333 München
  Institut und Lehrstuhl für Zoologie, Lichtenbergstr. 4, 85748 Garching
  Labor für experimentelle Audiologie, Hals-Nasen-Ohrenklinik und Poliklinik, Klinikum rechts der Isar, Ismaningerstr. 22, 81675 München
  Institut für Informationstechnik, Lehrstuhl für Datenverarbeitung, Arcisstr. 21, 80333 München

Ludwig-Maximilians-Universität München
  Zoologisches Institut, Lehrstuhl für Zoologie und vergleichende Anatomie, Luisenstr. 14, 80333 München
  Klinik und Poliklinik für Hals-Nasen-Ohrenkranke, Klinikum Großhadern, Marchioninistr. 15, 81377 München

Universität Regensburg
  Institut für Zoologie, Universitätsstr. 31, 93040 Regensburg

Max-Planck-Institut für Psychiatrie
  Kraepelinstr. 2, 80804 München
12.2 Projects supported by the collaborative research centre 204
Project | Title | Project Leader | From–to
2 | Anatomy and physiology of the peripheral auditory system of birds and lizards | Manley, Geoffrey A. Prof. Ph.D. | 1983–1997
3 | Peripheral processing of acoustic information | Zwicker, Eberhard † Prof. Dr.-Ing. | 1983–1991
4 | Functional organization of the peripheral auditory system of bats | Vater, Marianne Prof. Dr. rer. nat. | 1983–1997
5 | Hearing sensations as basis of acoustic information transfer | Fastl, Hugo Prof. Dr.-Ing. | 1983–1997
6 | Information processing of the impaired hearing | Schorn, Karin Prof. Dr. med. | 1983–1997
7 | Information processing in the auditory pathway of birds | Leppelsack, Hans-Joachim Prof. Dr. rer. nat. | 1983–1991
8 | Psychoacoustics of echolocating bats (cf. 20) | Neuweiler, Gerhard Prof. Dr. rer. nat. | 1983–1994
9 | System theory of echolocation in bats | Türke, Bernhard Dr.-Ing. | 1983–1985
10 | Functional interactions of hearing and motor systems | Schuller, Gerd Prof. Dr. rer. nat.; Grothe, Benedikt PD Dr. rer. nat. | 1983–1997
11 | Interactions between sound-producing and sound-receiving brain structures in primates | Ploog, Detlev Prof. Dr. med.; Müller-Preuss, Peter Dr. rer. nat. | 1983–1991
12 | Complex perceptual processes on speech and music | Terhardt, Ernst Prof. Dr.-Ing. | 1983–1997
13 | Recognition of speech information | Ruske, Günther Dr.-Ing. | 1983–1991
16 | Auditory processing in birds | Klump, Georg PD Dr. rer. nat. | 1988–1997
17 | Processing of frequency in the cochlea of mammals | Kössl, Manfred PD Dr. rer. nat. | 1992–1997
18 | Properties and clinical application of distortion product otoacoustic emissions | Janssen, Thomas PD Dr.-Ing. | 1994–1997
19 | Mechanisms of sound localization in barn owls | Wagner, Hermann Prof. Dr. rer. nat. | 1995–1996
20 | Psychoacoustics of echolocating bats (cf. 8) | Schmidt, Sabine PD Dr. rer. nat. | 1995–1997
12.3 Members and co-workers of the collaborative research centre 204
Name and degree | Institute and Department | Project | From–to
Arnold, Barbara Dr. | HNO-Klinik LMU | 6 | 1993–1997
Arnold, Wolfgang Prof. Dr. | HNO-Klinik TUM | 18 | 1993–1997
Aures, Wilhelm Dr.* | Elektroakustik/MMK TUM | 12 | 1983–1991
Baumann, Martin | Zoologie LMU | 10 | 1987–1991
Baumann, Uwe Dr.* | HNO-Klinik LMU | 6 | 1989–1997
Beckenbauer, Thomas Dr.* | Elektroakustik/MMK TUM | 3 | 1984–1990
Behrend, Oliver* | Zoologie LMU | 10 | 1995–1997
Bergmann, Peter Dr.* | Zoologie TUM | 7 | 1983–1986
Bieser, Armin Dr. | MPI Psychiatrie | 11 | 1985–1991
Boege, P. | HNO-Klinik TUM | 18 | 1996–1997
Böhnke, Frank Dr. | HNO-Klinik TUM | 18 | 1993–1997
Brix, Jutta Dr.* | Zoologie TUM | 2 | 1987–1991
Brügel, Franz Dr. | HNO-Klinik LMU | 6 | 1989–1997
Buchfellner, Elisabeth Dr.* | Zoologie TUM | 7 | 1987–1989
Dallmayr, Christoph Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1986
Datscheweit, Winfried Dr.* | Datenverarbeitung TUM | 13 | 1987–1991
Daxer, Wolfgang Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1985
Eckrich, Michael Dr.* | Zoologie LMU | 8 | 1984–1988
Einsele, Theodor Prof. Dr. | Datenverarbeitung TUM | 3 | 1983–1988
Fastl, Hugo Prof. Dr. | Elektroakustik/MMK TUM | 5 | 1983–1997
Faulstich, Michael Dr.* | Zoologie LMU | 10 | 1995–1997
Fischer, Franz Peter Dr.** | Zoologie TUM | 2 | 1987–1997
Frank, Gerhard Dr.* | Zoologie LMU | 17 | 1993–1996
Gallo, Lothar Dr.* | Zoologie TUM | 2 | 1993–1997
Gleich, Otto Dr.* | Zoologie TUM | 2 | 1984–1997
Gooßens, Sebastian | Elektroakustik/MMK TUM | 3 | 1990–1991
Gralla, Gisbert Dr.* | Elektroakustik/MMK TUM | 3 | 1988–1992
Grothe, Benedikt Dr.*, ** | Zoologie LMU | 10 | 1988–1997
Grubert, Andreas | Elektroakustik/MMK TUM | 12 | 1986–1988
Habbicht, Hartmann* | Zoologie UR | 4 | 1993–1997
Hartl, Ulrike | Zoologie TUM | 2 | 1994–1996
Häusler, Udo Dr.* | Zoologie TUM | 7 | 1984–1989
Hautmann, I. | Elektroakustik/MMK TUM | 5 | 1993–1995
Heinbach, Wolfgang Dr.* | Elektroakustik/MMK TUM | 12 | 1983–1988
Hess, Wolfgang Prof. Dr. | Datenverarbeitung TUM | 13 | 1983–1986
Hesse, Alexander Dr.* | Elektroakustik/MMK TUM | 5 | 1983–1986
Huber, Barbara | Zoologie TUM | 2 | 1983–1985
Indefrey, Helge | Datenverarbeitung TUM | 13 | 1989–1991
Janssen, Thomas Dr. | HNO-Klinik TUM | 18 | 1993–1997
Jurzitza, Dieter Dr.* | Elektroakustik/MMK TUM | 3 | 1988–1992
Kaiser, Alexander Dr.* | Zoologie TUM | 2 | 1987–1995
Kautz, Dirk Dr.* | Zoologie TUM | 19 | 1995–1996
Keller, Helmut Dr.* | Elektroakustik/MMK TUM | 12 | 1984–1988
Kemmer, Michaela* | Zoologie UR | 4 | 1995–1997
Kettembeil, Sibylle | Zoologie TUM | 2 | 1989–1993
Kießlich, Stephan | Zoologie LMU | 10 | 1991–1995
Kleiser, Annette Dr.* | Zoologie LMU | 10 | 1993–1997
Klump, Georg Dr.** | Zoologie TUM | 16 | 1985–1997
Koch, Ursula Dr.* | Zoologie LMU | 10 | 1996–1997
Köhlmann, Michael Dr.* | Elektroakustik/MMK TUM | 12 | 1983–1984
Köppl, Christine Dr.*, ** | Zoologie TUM | 2 | 1987–1997
Kössl, Manfred Dr.*, ** | Zoologie LMU | 4, 17 | 1985–1997
Kriner, Eva Dr.* | Zoologie LMU | 8 | 1990–1991
Krull, Dorothea Dr.* | Zoologie LMU | 8 | 1991–1994
Krumbholz, Kathrin* | Zoologie LMU | 20 | 1995–1997
Krump, Gerhard Dr.* | Elektroakustik/MMK TUM | 5 | 1988–1993
Kuhn, Birgit Dr.* | Zoologie UR, Zoologie TUM | 4, 2 | 1991–1997
Kühne, Roland Dr. | Zoologie TUM | 7 | 1984–1985
Kummer, Peter Dr.* | HNO-Klinik TUM | 18 | 1995–1996
Kunert, Franz | Elektroakustik/MMK TUM | 3 | 1989–1990
Landvogt, Robert | Zoologie TUM | 2 | 1983–1985
Langemann, Ulrike Dr.* | Zoologie TUM | 16 | 1991–1997
Lechner, Thomas Dr.* | Elektroakustik/MMK TUM | 3 | 1987–1991
Leppelsack, Hans-J. Prof. Dr. | Zoologie TUM | 7 | 1983–1989
Leysieffer, Hans Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1987
Lumer, Georg Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1986
Maier, Elke Dr.* | Zoologie TUM | 7 | 1988–1989
Manley, Geoffrey Prof. Ph.D. | Zoologie TUM | 2 | 1983–1997
Manley, Judith Dr. | Zoologie LMU | 8 | 1983–1983
Metzner, Walter Dr.* | Zoologie LMU | 10 | 1985–1989
Molter, Dietfried | Zoologie LMU | 10 | 1995–1997
Müller, Christian | Zoologie TUM | 7 | 1983–1984
Müller-Preuss, Peter Dr. | MPI Psychiatrie | 11 | 1983–1991
Mummert, Markus Dr.* | Elektroakustik/MMK TUM | 12 | 1989–1991
Negri, Barbara Dr. | HNO-Klinik LMU | 6 | 1987–1988
Neuweiler, Gerhard Prof. Dr. | Zoologie LMU | 8 | 1983–1997
Nitsche, Volker Dr.* | Zoologie LMU | 8 | 1990–1994
Obrist, Martin Dr.* | Zoologie LMU | 8 | 1988–1989
Oeckinghaus, Horst Dr. | Zoologie TUM | 2 | 1983–1997
Pavusa, Andrea | Zoologie TUM | 2 | 1994–1996
Peisl, Wolfgang Dr.* | Elektroakustik/MMK TUM | 3 | 1985–1989
Peschl, Ulrich | Elektroakustik/MMK TUM | 5 | 1990–1991
Pfaffelhuber, Klaus Dr.* | Elektroakustik/MMK TUM | 12 | 1988–1993
Pillat, Jürgen Dr.* | Zoologie LMU | 10 | 1991–1995
Ploog, Detlev Prof. Dr. | MPI Psychiatrie | 11 | 1983–1991
Prechtl, Helmut Dr.* | Zoologie LMU | 10 | 1991–1995
Preißler, Annemarie Dr. | Zoologie LMU | 20 | 1993–1997
Radtke-Schuller, S. Dr.* | Zoologie LMU | 10 | 1983–1985
Reimer, Kathrin Dr.* | Zoologie LMU | 10 | 1986–1989
Roverud, Roald Dr. | Elektroakustik/MMK TUM | 8 | 1986–1987
Rübsamen, Rudolf Prof. Dr.* | Zoologie LMU | 10 | 1983
Rücker, Claus von | Elektroakustik/MMK TUM | 12 | 1995–1997
Runhaar, Geert Dr. | Zoologie TUM | 2 | 1984–1988
Ruske, Günther Dr. | Datenverarbeitung TUM | 13 | 1983–1991
Scharmann, Michael Dr.* | Zoologie TUM | 16 | 1990–1996
Scherer, Angelika Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1988
Schlegel, Peter Dr. | Zoologie LMU | 14 | 1986–1987
Schloth, Eberhard Dr.* | Elektroakustik/MMK TUM | 3 | 1983
Schmid, Wolfgang* | Elektroakustik/MMK TUM | 5 | 1993–1996
Schmidt, Sabine Dr.*, ** | Zoologie LMU | 20 | 1984–1997
Schorer, Edwin Dr.* | Elektroakustik/MMK TUM | 5 | 1983–1988
Schorn, Karin Prof. Dr. | HNO-Klinik LMU | 6 | 1983–1997
Schuller, Gerd Prof. Dr. | Zoologie LMU | 10 | 1983–1997
Schweizer, Hermann Dr. | Zoologie LMU | 10 | 1983–1991
Seck, Rainer Dr.* | Datenverarbeitung TUM | 13 | 1983–1987
Sedlmeier, Heinz Dr.* | Zoologie LMU | 8 | 1988–1992
Sonntag, Benno Dr.* | Elektroakustik/MMK TUM | 5 | 1983–1985
Stecker, Matthias Dr. | HNO-Klinik LMU | 6 | 1983–1995
Steel, Karen Dr. | Zoologie TUM | 2 | 1983–1985
Steinhoff, Jochen | HNO-Klinik TUM | 18 | 1993–1997
Stemplinger, Ingeborg* | Elektroakustik/MMK TUM | 5 | 1993–1997
Stiefer, Werner Dr.* | Zoologie LMU | 8 | 1991–1994
Stoll, Gerhard Dr.* | Elektroakustik/MMK TUM | 12 | 1983–1984
Suckfüll, Markus Dr. | HNO-Klinik LMU | 6 | 1994–1997
Taschenberger, Grit Dr.* | Zoologie TUM | 2 | 1992–1996
Terhardt, Ernst Prof. Dr. | Elektroakustik/MMK TUM | 12 | 1983–1997
Türke, Bernhard Dr. | Zoologie LMU | 9 | 1983–1985
Valenzuela, Miriam Dr.* | Elektroakustik/MMK TUM | 12 | 1995–1997
Vater, Marianne Prof. Dr.** | Zoologie UR | 4 | 1983–1997
Wagner, Hermann Prof. Dr. | Zoologie TUM | 19 | 1994–1996
Wartini, Stefan Dr.* | Elektroakustik/MMK TUM | 12 | 1989–1995
Widmann, Ulrich Dr.* | Elektroakustik/MMK TUM | 5 | 1987–1992
Wiegrebe, Lutz Dr.* | Zoologie LMU | 20 | 1993–1996
Witzke, Peter Dr.* | Zoologie LMU | 8 | 1987–1991
Zollner, Manfred Dr.* | Elektroakustik/MMK TUM | 3 | 1983–1985
Zwicker, Eberhard Prof. Dr. † | Elektroakustik/MMK TUM | 3 | 1983–1990
* Doctoral and ** habilitation thesis work carried out under the auspices of the collaborative research centre 204
Datenverarbeitung TUM: Lehrstuhl für Datenverarbeitung, Technische Universität München
Elektroakustik/MMK TUM: Lehrstuhl für Elektroakustik (since 1992 Lehrstuhl für Mensch-Maschine-Kommunikation), Technische Universität München
HNO-Klinik LMU: Klinik und Poliklinik für Hals-Nasen-Ohrenkranke (Klinikum Großhadern), Ludwig-Maximilians-Universität München
HNO-Klinik TUM: Hals-Nasen-Ohrenklinik und Poliklinik (Klinikum rechts der Isar), Technische Universität München
MPI Psychiatrie: Max-Planck-Institut für Psychiatrie, München
Zoologie LMU: Lehrstuhl für Zoologie und vergleichende Anatomie, Ludwig-Maximilians-Universität München
Zoologie TUM: Lehrstuhl für Zoologie, Technische Universität München
Zoologie UR: Institut für Zoologie, Universität Regensburg
12.4 Doctoral (D) and habilitation (H) theses of the collaborative research centre 204
Aures, W.: „Berechnungsverfahren für den Wohlklang beliebiger Schallsignale, ein Beitrag zur gehörbezogenen Schallanalyse.“ (D, 12)
Baumann, U.: „Ein Verfahren zur Erkennung und Trennung multipler akustischer Objekte.“ (D, 12)
Beckenbauer, T.: „Spektrale Inhibition als Mittel zur Sprachverarbeitung.“ (D, 3)
Behrend, O.: „The central acoustic tract and audio-vocal coupling in the horseshoe bat, Rhinolophus rouxi.“ (D, 10)
Brix, J.: „Die Haarzellpopulation der Vogel-Cochlea: Eine In-vitro-Studie der elektrischen und mechanischen Eigenschaften.“ (D, 2)
Brügel, F.J.: „Ein neues Verfahren zur Bestimmung der wirksamen Verstärkung von Hörgeräten.“ (D, 6)
Dallmayr, C.: „Stationäre und dynamische Eigenschaften spontaner und simultanevozierter otoakustischer Emissionen.“ (D, 3)
Datscheweit, W.: „Untersuchungen zur merkmalsbasierten Erkennung von Sprachlauten.“ (D, 13)
Dreher, A.M.: „Hörschwellenabschätzung mittels otoakustischer Emissionen.“ (D, 6)
Eckrich, M.: „Räuber-Beute-Interaktion zwischen echoortenden Fledermäusen und Nachtschmetterlingen.“ (D, 8)
Ernstberger, M.G.: „Schwerhörigkeit und Halswirbelsäule.“ (D, 6)
Faulstich, M.: „On the nature of distortion products in mammalian cochleae.“ (D, 17)
Fischer, F.P.: „Das Innenohr der Vögel.“ (H, 2)
Frank, G.: „Aktive Haarzellmechanismen bei der Wüstenrennmaus.“ (D, 17)
Gallo, L.: „Eine vergleichende Untersuchung von Verzerrungsprodukt-Emissionen bei Reptilien.“ (D, 2)
Geywitz, H.J.: „Automatische Erkennung fließender Sprache mit silbenorientierter Segmentierung und Klassifizierung.“ (D, 13)
Gleich, O.: „Untersuchung der funktionellen Bedeutung der Haarzell-Typen und ihrer Innervationsmuster im Hörorgan des Staren.“ (D, 2)
González-Stein, P.I.V.: „Ergebnisse der Kurztonaudiometrie bei Normalhörenden und Patienten mit verschiedenen Innenohrschwerhörigkeiten.“ (D, 6)
Gräf, W.: „Zusammenhänge zwischen gestörtem Frequenz- und Zeitauflösungsvermögen des Hörorgans.“ (D, 6)
Grothe, B.: „Versuch einer Definition des medianen Kerns des oberen Olivenkomplexes bei der Neuweltfledermaus Pteronotus parnellii.“ (D, 4)
Gralla, G.: „Wahrnehmungskriterien bei Mithörschwellenmessungen und deren Simulation in Rechnermodellen.“ (D, 5)
Häusler, U.: „Das thalamo-telencephale System in der Hörbahn des Staren.“ (D, 7)
Heinbach, W.: „Gehörgerechte Repräsentation von Audiosignalen durch das Teiltonzeitmuster.“ (D, 12)
Hesse, A.: „Modell der Spektraltonhöhe.“ (D, 5)
Hofer, R.M.: „Über die Auswirkungen der Zusatzbohrung eines Hörgerätes auf die angenehme Lautstärke und wirksame Verstärkung.“ (D, 6)
Jaschul, J.: „Untersuchungen zur Sprecheradaption.“ (D, 13)
Jurzitza, D.: „Technische Grundlagen der Messung otoakustischer Emissionen sowie deren Anwendung auf die Untersuchung der nichtlinearen Verzerrungen des Ohres.“ (D, 3)
Kaiser, A.: „Verteilungsmuster und physiologische Wirkung der efferenten Innervation in der Papilla basilaris beim Vogel.“ (D, 2)
Kaiser, W.P.: „Bestimmung der Frequenzcharakteristik von Hörgeräten bei hohen Eingangspegeln unter Anwendung von Sondenmeßsystemen.“ (D, 6)
Kautz, D.: „Mikroiontophoretische Untersuchungen und Computersimulationen zur Arbeitsweise des akustischen Bewegungsdetektors im Colliculus inferior der Schleiereule.“ (D, 19)
Kleiser, A.: „Neuronale Repräsentation bewegter Schallquellen im Mittelhirn von Fledermäusen und ihre Bedeutung für akustisch gesteuerte Verhaltensweisen.“ (D, 10)
Klump, G.M.: „Untersuchungen zur Informationsverarbeitung im Hörsystem des Staren (Sturnus vulgaris, L., Aves).“ (H, 17)
Koch, U.: „The influence of bilateral inhibition on temporal coding in the auditory midbrain.“ (D, 10)
Köhlmann, M.: „Rhythmische Segmentierung von Schallsignalen und ihre Anwendung auf die Analyse von Sprache und Musik.“ (D, 12)
Köppl, C.: „Struktur und Funktion des Hörorgans der Echsen: Ein Vergleich von Lacertidae und Scincidae.“ (D, 2)
Köppl, C.: „Neuronale akustische Verarbeitung bei Vögeln: Innenohr und Beginn der Hörbahn.“ (H, 2)
Kössl, M.: „Frequenzrepräsentation und Frequenzverarbeitung in der Cochlea und im Nucleus cochlearis der Schnurrbartfledermaus Pteronotus parnellii.“ (D, 4)
Kössl, M.: „Frequenzverarbeitung im peripheren Hörsystem von Säugern.“ (H, 17)
Kothny, T.M.: „Tinnitus und Halswirbelsäule.“ (D, 6)
Kriner, E.: „Jagdverhalten von Taphozous-Arten in Madurai/Indien.“ (D, 8)
Krull, D.: „Jagd- und Echoortungsverhalten von Antrozous pallidus.“ (D, 8)
Krumbholz, K.: „Die Rolle der vielkomponentigen Frequenzstruktur von Megaderma lyra’s Orientierungslauten bei der Analyse feiner räumlicher Strukturen durch Echoortung.“ (D, 8)
Krump, G.: „Beschreibung des akustischen Nachtones mit Hilfe von Mithörschwellenmustern.“ (D, 5)
Kuhn, B.: „Das Cytoskelett im adulten Cortiorgan und während der Entwicklung.“ (D, 4)
Kummer, P.: „Suppressionseigenschaften von Verzerrungsprodukt-Emissionen des Menschen.“ (D, 18)
Langemann, U.: „Bedeutung der Verdeckung für die akustische Wahrnehmung bei Singvögeln.“ (D, 16)
Lechner, T.: „Piezoelektrische PVDF-Biegewandler und ihr Einsatz in einer taktilen Hörprothese, bei Schnellemikrofonen und in einem hydromechanischen Cochleamodell.“ (D, 3)
Leysieffer, H.: „Mehrkanalige Übertragung von Sprachinformation durch eine tragbare Hörprothese mit PVDF-Vibrationswandlern.“ (D, 3)
Loretz, C.: „Die Versorgung mit Hörgeräten.“ (D, 6)
Lumer, G.: „Nachbildung nichtlinearer simultaner Verdeckungseffekte bei schmalbandigen Schallen mit einem Rechnermodell.“ (D, 3)
Maier, E.: „Die Entwicklung neuronaler Spezifität bei der Zeitverarbeitung in der Starenhörbahn.“ (D, 7)
Metzner, W.: „Beantwortung von Vokalisation und vokale Beeinflussung der Hörverarbeitung durch Neurone des Mittelhirns (IC, LL und tegmentale Gebiete) bei Rhinolophus rouxi.“ (D, 10)
Mummert, M.: „Sprachcodierung durch Konturierung eines gehörangepaßten Spektrogramms und ihre Anwendung zur Datenreduktion.“ (D, 12)
Nikel, S.: „Das zeitliche Integrationsvermögen des funktionsgestörten Gehörs.“ (D, 6)
Nitsche, V.: „Zeitverarbeitung komplexer akustischer Reize bei Tadarida brasiliensis.“ (D, 8)
Obrist, M.: „Individuelle Ortungslautstrukturen freifliegender Fledermäuse in unterschiedlichen Jagdbiotopen.“ (D, 8)
Peisl, W.: „Beschreibung aktiver nichtlinearer Effekte der peripheren Schallverarbeitung des Gehörs durch ein Rechnermodell.“ (D, 3)
Pfaffelhuber, K.: „Das dynamische Verhalten der Geige an der Anstrichstelle und sein Einfluß auf das Klangsignal.“ (D, 12)
Pillat, J.: „Pharmakologische Vokalisationsauslösung und Einwirkung von Läsionen in der paralemniscalen Zone auf die Doppelkompensation der Hufeisennasen-Fledermaus.“ (D, 10)
Prechtl, H.: „Neurophysiologische Charakterisierung und Verschaltung des rostralen Pols des Colliculus inferior der Hufeisennasen-Fledermaus.“ (D, 10)
Prescher, M.: „Die Beeinflussung des Frequenzselektionsvermögens durch Zusatzrauschen.“ (D, 6)
Radtke-Schuller, S.: „Funktionell-neuroanatomische Untersuchungen im Hörcortex bei der Fledermaus Rhinolophus rouxi.“ (D, 10)
Reimer, K.: „Neuronale Antworten auf akustische Reize und Vokalisation in tiefen Schichten des Colliculus superior bei Hufeisenfledermäusen.“ (D, 10)
Rübsamen, R.: „Ableitungen aus der absteigenden Vokalisationsbahn bei vokalisierenden Fledermäusen (Rhinolophus rouxi).“ (D, 10)
Scharmann, M.: „Histologische und elektrophysiologische Untersuchung zur neuronalen Modulation auditorischer Aufmerksamkeit beim Star.“ (D, 16)
Scherer, A.: „Die Erregung des Gehörs abgeleitet aus Mithörschwellen und aus verzögerten oto-akustischen Emissionen.“ (D, 3)
Schloth, E.: „Akustische Emissionen aus dem Gehörgang des Menschen.“ (D, 3)
Schmidt, S.: „Unterscheidung von Oberflächenstrukturen durch Echoortung.“ (D, 8)
Schmidt, S.: „Untersuchungen zur Psychoakustik und Echoortung der Fledermäuse.“ (H, 20)
Schorer, E.: „Ein Funktionsschema zur Beschreibung wahrnehmbarer Frequenzen.“ (D, 5)
Seck, R.: „Untersuchung zur Wahrnehmung von Sprachlauten.“ (D, 13)
Sedlmeier, H.: „Untersuchungen zum Gestalthören bei echoortenden Fledermäusen.“ (D, 8)
Sosnowski, G.H.: „Die Pegelunterschiedsschwelle von Normalhörenden und Patienten mit verschiedenen Schwerhörigkeitsformen.“ (D, 6)
Stiefer, W.: „Echoortungsverhalten und Hören bei Tadarida spec.“ (D, 8)
Taschenberger, G.: „Die Eigenschaften spontaner und Verzerrungsprodukt-Emissionen bei der Schleiereule.“ (D, 2)
Valenzuela, M.: „Untersuchungen und Berechnungsverfahren zur Klangqualität von Klaviertönen.“ (D, 12)
Vater, M.: „Frequenzverarbeitung im peripheren Hörsystem von Fledermäusen.“ (H, 4)
Vogler, B.: „Freilandbeobachtung und Lautanalysen zur Echoortung der Gattung Pipistrellus in Madurai (Indien).“ (D, 8)
Wartini, S.: „Zur Rolle der Spektraltonhöhen und ihrer Akzentuierung bei der Wahrnehmung von Sprache.“ (D, 12)
Widmann, U.: „Ein Modell der psychoakustischen Lästigkeit von Schallen und seine Anwendung in der Praxis der Lärmbeurteilung.“ (D, 5)
Wiegrebe, L.: „Untersuchungen zur Diskrimination breitbandiger, nicht harmonischer Frequenzprofile durch Megaderma lyra.“ (D, 20)
Wimmer, J.: „Hilfe zur Selbsthilfe. Ein Therapie- und Übungsprogramm für Patienten mit subjektiven Ohrgeräuschen.“ (D, 6)
Witzke, P.: „Mechanismen der Lateralisation bei Megaderma lyra.“ (D, 8)
Zollner, M.: „Ein implantierbares Hörgerät zur elektrischen Reizung des Hörnerven.“ (D, 3)
12.5 Guest scientists of the collaborative research centre 204
Name, Title | Home University | Year(s) of stay
Allen, Jont Dr. | AT & T, New Jersey, USA | 1995
Audet, Doris Dr. | University of Quebec, Canada | 1991
Blum, Joe Prof. | Duke University, Durham, USA | 1994
Buus, Søren Prof. | Northeastern University Boston, USA | 1992
Carr, Catherine Dr. | University of Maryland, USA | 1995
Casseday, John Prof. | Duke University, Durham, USA | 1990, 92, 93, 97
Chandrashekaran, Maroli Prof. | Madurai Kamaraj University, India | 1995
Code, Rebecca Dr. | University of Maryland, USA | 1993
Coro, Francisco Dr. | University of Havana, Cuba | 1997
Costa, Henry Dr. | University of Kelaniya, Sri Lanka | 1992
Covey, Ellen Dr. | Duke University, Durham, USA | 1991, 92, 93, 97
Dooling, Robert Prof. | University of Maryland, USA | 1991, 92, 94
Duifhuis, Diek Prof. | Universiteit Groningen, Netherlands | 1985, 95
Frost, Barrie Prof. | Queen's University Kingston, Canada | 1995
Ganeshina, Olga Dr. | Sechenov Inst. of Evolutionary Physiology and Biochemistry, St. Petersburg, Russia | 1993
Adret-Hausberger, Martine Dr. | Université de Rennes, France | 1990
Irvine, Dexter Prof. | Monash University, Clayton, Australia | 1996
Kapadia, Sarosh Dr. | University of W. Australia, Perth, Australia | 1992
Kevanishvili, Zuriko Prof. | University of Tbilisi, Georgia | 1997
Kubke, Mana Dr. | University of Maryland, USA | 1996
Kubzdela, Henryk Dr. | Polish Academy of Science, Poznań, Poland | 1991
Kuwano, Sonoko Prof. | Osaka University, Japan | 1985, 88, 90, 93, 95, 97
Manabe, Kazuchika Dr. | University of Maryland, USA | 1996
Mazer, James Dr. | California Inst. of Technology, Pasadena, USA | 1995
Meddis, Ray Dr. | Loughborough University of Technology, UK | 1995
Miller, Lee Dr. | Odense University, Denmark | 1995
Narins, Peter Prof. | University of California Los Angeles, USA | 1984, 86
Okanoya, Kazuo Dr. | University of Maryland, USA | 1989
Patuzzi, Robert Dr. | University of W. Australia, Perth, Australia | 1990
Pickles, James O. Dr. | University of Birmingham, UK | 1987
Pytel, Joseph Dr. | Medical University Pécs, Hungary | 1994, 95
Russell, Jan Prof. | University of Sussex, Brighton, UK | 1993, 94, 97
Schofield, Brett Dr. | Duke University, Durham, USA | 1995
Sneary, Michael Dr. | San José State University, California, USA | 1997
Werner, Yehudah Prof. | Hebrew University, Jerusalem, Israel | 1997
Winter, Ian Prof. | University of Cambridge, UK | 1994
Yates, Graeme | University of W. Australia, Perth, Australia | 1987
12.6 International and industrial cooperations of the collaborative research centre 204
● AT&T Bell Labs, New Jersey, USA (Dr. J. Allen, 18)
● Duke University, Durham, USA (Prof. E. Covey, Prof. J. Casseday, 10)
● Northeastern University Boston, USA, Dept. of Electrical and Computer Engineering (Prof. S. Buus, 16)
● Osaka University, Japan (Prof. S. Namba, Prof. S. Kuwano, 5)
● San José State University, USA, Dept. of Biology (Prof. M. Sneary, 2)
● University of California, Los Angeles, USA, Dept. of Physiology (Prof. Narins, 2, 16)
● University Hospital Groningen, The Netherlands (Dr. van Dijk, 2)
● University of Illinois at Chicago, USA (Prof. T.J. Park, 10)
● University of Copenhagen, Denmark, Zoological Institute (Prof. T. Dabelsteen, Prof. P. McGregor, 16)
● University of Madurai, Madurai, India (Dr. Marimuthu, Dr. Sripathi, 10)
● University of Maryland in College Park, USA (Prof. R.J. Dooling, 2, 16)
● University of Odense, Denmark, Biological Institute (Prof. O.N. Larsen, 16)
● Université de Rennes, France (Dr. Adret-Hausberger, 7)
● University of Sussex, Brighton, UK (Prof. J. Russell, 17)
● University of Texas at Austin, USA (Prof. G. Pollak, 10)
● University of Western Australia in Perth, Dept. of Physiology (Dr. G. Yates, 2)